/r/PostgreSQL


The home of the most advanced Open Source database server on the world's largest and most active Front Page of the Internet.


49,866 Subscribers

0

PGSQL 16 - Standby server log file filled with "waiting for WAL to become available at xxx"

Greetings all,

We are running a master/slave PGSQL v16 setup with pgbackrest, where the WAL files are placed in S3. For some reason, the PGSQL log file on the standby is getting filled with the message "waiting for WAL to become available at xxx/yyyyy" every 5 seconds. According to the docs, this is due to the "wal_retrieve_retry_interval" default setting of 5 seconds. We can increase the timeout, but that also seems to affect how often the WAL logs are checked in S3.

We have set up this type of config many times in the past and don't remember these messages being spammed in the log file. From what we can tell, the standby system is running properly, and the WAL files are being applied properly as they come in.

Can someone please let me know how to disable these messages? Or, more importantly, is something wrong with our setup? Happy to provide more context/info if needed.

Thanks.
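For what it's worth, the interval mentioned above can be changed without a restart; a minimal sketch (trade-off, as noted: the standby then also polls S3 for new WAL less often):

```sql
-- Run on the standby. Raises the retry interval from the 5s default,
-- which spaces out the "waiting for WAL" log lines, but also means
-- pgbackrest's archive-get is invoked less frequently.
ALTER SYSTEM SET wal_retrieve_retry_interval = '60s';
SELECT pg_reload_conf();
```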

8 Comments
2024/12/10
11:05 UTC

0

I can't import a database

Basically, I'm doing a project for my college. I first created a database in PostgreSQL version 17.2; however, my friend uses Linux and only has version 16.6 available, so I installed version 16.6. When creating the database with JPA, everything works fine, and the export does too. However, when my friend or even I import this database, it gives an error. I managed to do this process in 17.2 and it works. I've tried importing all types of file formats and it didn't work. I don't know what else to do.

13 Comments
2024/12/09
22:38 UTC

0

Best self-service BI tools for PostgreSQL

Hey, I collected my personal favorite product analytics tools for PostgreSQL in this blog post. If you have any suggestions or feedback, feel free to comment. I hope it helps.

https://medium.com/@pambrus7/top-5-self-service-bi-solutions-for-postgresql-b6959e54ed5f

6 Comments
2024/12/09
16:40 UTC

10

Is it good to learn from documentation

Sorry for this dumb question, but I just need more confirmation. If I can stand reading the documentation, is it good? Is there something fundamental missing that I would need to learn from other sources? I know a little about databases but am definitely not an expert in any other DB.

18 Comments
2024/12/09
14:24 UTC

0

timeseries indexes

I'm working with TimescaleDB. I was wondering if there is a standard index table, or a bunch of them. The indexes would be things like day of week or hour of the day, all depending on timezones. The primary key would be the timestamp. This kind of index table could be linked to any timeseries table.
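Rather than a separate linked index table, one option in plain Postgres (and TimescaleDB) is expression indexes on the timestamp column itself; a sketch, with assumed table and column names:

```sql
-- Hypothetical timeseries table; names are assumptions.
CREATE TABLE readings (
    ts    timestamptz NOT NULL PRIMARY KEY,
    value double precision
);

-- Expression indexes on derived calendar fields. The expression must be
-- immutable, so the timezone is pinned explicitly rather than taken from
-- the session setting.
CREATE INDEX readings_dow_idx
    ON readings (EXTRACT(dow  FROM ts AT TIME ZONE 'UTC'));
CREATE INDEX readings_hour_idx
    ON readings (EXTRACT(hour FROM ts AT TIME ZONE 'UTC'));

-- A query filtering on the exact same expression can use the index:
-- SELECT avg(value) FROM readings
--  WHERE EXTRACT(hour FROM ts AT TIME ZONE 'UTC') = 9;
```

Per-timezone views over the same table could cover the "depending on timezones" part, at the cost of one index set per zone you query by.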

3 Comments
2024/12/09
13:36 UTC

0

Any tips on writing a function that will paginate through many records using offset and num_rows as input parameters?

What the title says

I'm primarily an MSSQL/T-SQL dev and completely new to PGSQL, but I need to replicate an SP that allows pagination and takes the number of records (to return) and offset as input parameters.

Pretty straightforward in T-SQL: SELECT X, Y, Z FROM table OFFSET @offset ROWS FETCH NEXT @num_rows ROWS ONLY.
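A sketch of the PGSQL equivalent (table and column names are made up here). Postgres accepts the standard OFFSET ... FETCH NEXT syntax too, but LIMIT/OFFSET is the idiomatic spelling:

```sql
-- Set-returning SQL function; callers page with
--   SELECT * FROM get_page(40, 20);
CREATE FUNCTION get_page(p_offset integer, p_num_rows integer)
RETURNS TABLE (x integer, y text, z text)
LANGUAGE sql STABLE
AS $$
    SELECT t.x, t.y, t.z
      FROM my_table t
     ORDER BY t.x          -- pagination needs a stable sort order
     LIMIT p_num_rows
    OFFSET p_offset;
$$;
```

Note that, as in SQL Server, large offsets still scan and discard all the skipped rows, so keyset pagination (WHERE x > last_seen ORDER BY x LIMIT n) scales better for deep pages.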

10 Comments
2024/12/09
12:45 UTC

1

Central management of Postgres servers/databases?

Hi, what is your strategy around management of backup jobs/monitoring etc of your Postgres servers?

Traditionally, from my MSSQL background, I had a "central management server" that I used for connecting to all servers and running queries across them.

I'm in the process of setting up backups of our servers. Should I set up the backup jobs from a central server, which connects to the rest and runs the backups? For example, using pgbackrest.

3 Comments
2024/12/09
09:15 UTC

11

How do you test your backups

In my company we want to start testing our backups, but we are kind of confused about it. The idea comes from reading around the web and hearing about the importance of testing your backups.

When a pg_dump succeeds, isn't the successful result enough for us to say that it works? For physical backups, I guess we can test that the backup is working by applying WALs and seeing that no WAL is missing.

So how do you test your backups? Is pg_restore completing without errors enough for testing the backup? Do you also test the data inside? If so, how? And why isn't a successful exit code from the backup enough?
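A common approach is to restore into a scratch database and run smoke-test queries there, since an exit code only proves the archive was readable, not that the data survived intact. A sketch of cheap checks (the specific table name is an assumption):

```sql
-- Per-table row counts: run on both source and restored DB, diff the
-- output. n_live_tup is an estimate, so treat small deltas as noise;
-- use SELECT count(*) per table for an exact comparison.
SELECT schemaname, relname, n_live_tup
  FROM pg_stat_user_tables
 ORDER BY schemaname, relname;

-- A stricter content check for a critical table: hash all rows in a
-- deterministic order and compare the single value across both sides.
SELECT md5(string_agg(t::text, '' ORDER BY id)) FROM my_table t;
```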

15 Comments
2024/12/08
19:34 UTC

0

Question about rounding with null

I am trying to use the following queries to get a column of data, all of which have decimal points except for the null values, to round to 1 decimal place:

(1) SELECT ROUND(NULLIF(GPA, NULL) :: NUMERIC, 1) FROM store_choices;

(2) SELECT ROUND(CAST(GPA AS NUMERIC), 1) FROM store_choices;

However, I get the following errors for both:

(1) ERROR: invalid input syntax for type numeric: "NULL"

SQL state: 22P02

(2) ERROR: invalid input syntax for type numeric: "NULL"

SQL state: 22P02

So the same error.

Do I have to treat the null value differently?
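A real SQL NULL passes through ROUND unchanged, so the error above suggests GPA is a text column that contains the literal four-character string 'NULL' rather than actual NULLs. NULLIF needs to compare against that string; NULLIF(GPA, NULL) never matches anything, because NULL = NULL evaluates to unknown. A sketch under that assumption:

```sql
-- Assumes GPA is text with the string 'NULL' standing in for missing
-- values; NULLIF turns it into a real NULL before the cast.
SELECT ROUND(NULLIF(GPA, 'NULL')::numeric, 1) AS gpa_rounded
  FROM store_choices;
```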

3 Comments
2024/12/08
14:39 UTC

15

7+ million Postgres tables

15 Comments
2024/12/08
12:15 UTC

0

Throw data into PG and "forget"?

Hi PostgreSQL Community,

TLDR: I am a big fan of PostgreSQL and find it truly inspiring. I am wondering if PostgreSQL could someday automatically adapt its own storage, indexing, and optimization strategies, allowing it to handle transactional, analytical, cache-like, time-series, graph, and vector workloads without constant manual tuning. I am not an expert, just a curious enthusiast, and I would love to know if others have thought about or discussed these ideas before.

Introduction
First, I want to thank everyone who has contributed to PostgreSQL. Your work has made it a flexible, reliable, and continually evolving database. Features like advanced indexing, logical replication, and pluggable storage show how PostgreSQL can address a wide range of needs.

A Curious Dream
I am imagining a future where PostgreSQL observes how data is used and then automatically adjusts—or at least makes suggestions for—its internal storage formats, indexing strategies, and caching policies as conditions change. Instead of manually fine-tuning configurations, users might provide a hint at the start, indicating that a table is primarily transactional, analytical, time-series, cache-like, graph-oriented, based on vector data, etc.

Adapting Over Time and Hybrid Approaches
As workloads evolve, PostgreSQL could gradually shift to more suitable formats and indexing choices. For example, some parts of a table might become column-oriented to speed up aggregations, while other parts remain row-oriented for direct lookups. It could blend multiple strategies to handle mixed workloads more efficiently. It might also manage indexes automatically, adding or removing them based on real query patterns. Although these hybrid approaches would likely mean higher storage and compute costs, such trade-offs may be acceptable as hardware continues to become more affordable and abundant.

Why It Matters
As data and workloads grow more complex, reducing the need for manual tuning would be a huge benefit. A self-optimizing PostgreSQL would help developers and DBAs focus on building solutions without juggling multiple specialized databases.

Open Questions
I am not sure if these ideas are entirely new. Perhaps they have been explored or discussed before. If anyone knows of previous discussions, research, or has thoughts on the feasibility of this vision, I would be grateful to learn more.

Closing Thoughts
Again, thank you for reading and for all the hard work that makes PostgreSQL such a wonderful platform. I am just a curious enthusiast who dreams about what PostgreSQL might become, and I would greatly appreciate any insights or guidance you can share.

P.S. I used ChatGPT to help clarify and organize my message. I hope it worked!

15 Comments
2024/12/07
15:42 UTC

10

Text search (exact text, prefix search, approx text, vector search)

I'm wondering what the best practice is these days for implementing various kinds of search on text data from a Postgres database.

This could be:

  • exact search (or prefix / postfix / any exact match, for id-like data, etc)
  • approx text search (for descriptions, names, etc)
  • vector search (for sentiment, or approx search using embeddings, etc)

Are there good ways to do all of this directly inside Postgres, using extensions, or is it better to have an external Meilisearch / Algolia / Elasticsearch / other that is synced to the data?

What setup / extensions / architecture would you recommend?
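All three categories from the list above can be covered in-database; a sketch using stock extensions, with assumed table and column names:

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Assumed schema for illustration.
CREATE TABLE items (id bigint PRIMARY KEY, name text, descr text);

-- Trigram GIN index: accelerates LIKE/ILIKE 'abc%' prefix matches,
-- substring matches, and fuzzy matching via similarity().
CREATE INDEX items_name_trgm ON items USING gin (name gin_trgm_ops);

-- Full-text search over longer descriptions.
CREATE INDEX items_descr_fts
    ON items USING gin (to_tsvector('english', descr));

SELECT id FROM items WHERE name ILIKE 'post%';   -- prefix
SELECT id FROM items WHERE name % 'postgers';    -- trigram fuzzy match
SELECT id FROM items
 WHERE to_tsvector('english', descr)
       @@ plainto_tsquery('english', 'open source');
```

For the embeddings case, the pgvector extension adds a vector column type with ivfflat/hnsw indexes, which keeps everything in one database; an external engine mainly buys you better ranking/faceting at the cost of a sync pipeline.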

19 Comments
2024/12/07
10:40 UTC

2

What are your little helpers, like dbatools for MSSQL, which fill gaps in tooling like automating restore testing and user import/export?

Hi

What are your little helpers, like dbatools for MSSQL, which fill gaps in tooling, such as automating restore testing, user import/export, and the like?

bye

2 Comments
2024/12/07
07:13 UTC

2

Changing the psql \e editor

Apologies in advance if the answer is right in front of me.

Learning PostgreSQL right now and got to the part about the psql command line. Everything was fine and dandy until the part where I used \e. It gave me an option to use Nano or Vi(m). I don't know why, but I picked Nano when I meant to pick Vi(m). I googled and searched a bit and I'm not finding the answer (and I'm not sure I'm asking the right question either) to how to a) swap the editor to Vi or b) better yet, get it back to defaulting to asking me whether I want Nano or Vi.

It's not the end of the world but it's just bugging my brain. Any help to silence this bugging in my head would be deeply appreciated. Thank you.

echo $EDITOR shows as blank.

EDIT: I think I found a solution that works for me. I looked in my home directory and saw that my .selected_editor file had SELECTED_EDITOR= pointed at nano. I changed it to point at vi, and now vi opens with \e. In closing: when \e asked if I wanted to use nano or vi and I chose nano, did that change my .selected_editor file, hence it opening nano? Just out of curiosity, if that's what happened I want to know. Thanks again, folks.

Also EDIT: Thank you for the reply u/threeminutemonta
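For context: psql picks its editor from PSQL_EDITOR, EDITOR, or VISUAL (the first one set); the nano-or-vi prompt and the ~/.selected_editor file most likely come from Debian/Ubuntu's sensible-editor wrapper, which runs when none of those variables is set — so yes, choosing nano at that prompt is presumably what wrote the file. A sketch of the non-wrapper route:

```
-- inside psql, for the current session only:
\setenv PSQL_EDITOR vi

-- or persistently, in your shell startup file (e.g. ~/.bashrc):
export PSQL_EDITOR=vi
```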

4 Comments
2024/12/07
07:07 UTC

2

Missing the Obvious: Streaming Replication, hot standby lags for ANY query??

Have a "simple" setup: a main server and a hot standby via streaming replication. I have to be missing something obvious here. Running, say, pgbench with -s 15, so that updates/inserts are constant and steady. A very simple query run on the hot standby causes the replication to lag, reach 30 seconds, and then the query is canceled with the typical "conflict, might need row versions" message. Emphasis on might. And the connection is reset. I note that when starting the query, lag immediately starts and increases by the second (so ten seconds later, ten seconds of lag). Clearly replication is frozen, period.

Here's my problem: the query, on the hot standby, is: select pg_sleep(40);

WHY is this a problem? What simple setting am I missing? How can the hot standby not act like the main, where it can handle transactions?!

NOTE: I'm aware of possible vacuum issues and the Postgres 'problem query' list; this isn't those. I'm also aware of the settings to change the lag threshold. A 10 minute lag is not acceptable!
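The behavior described above is expected: even pg_sleep(40) holds a snapshot open for 40 seconds, so replay of VACUUM/cleanup records that would remove row versions the snapshot might still need has to wait, and after max_standby_streaming_delay (30s by default) the query is canceled. One setting avoids the conflict at the source instead of trading lag for cancellations; a sketch:

```sql
-- On the standby: report the oldest snapshot back to the primary so it
-- defers cleanup of row versions standby queries still need. Replay no
-- longer has to pause, at the cost of some extra bloat on the primary
-- while long standby queries run.
ALTER SYSTEM SET hot_standby_feedback = on;
SELECT pg_reload_conf();
```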

4 Comments
2024/12/07
04:47 UTC

10

Ep22 of the Talking Postgres podcast just published with guest Affan Dar of Microsoft (cross-post from r/SQL)

The latest episode of Talking Postgres is out, and I hope y'all enjoy it. Many of the recent guests have been open source contributors, so this time I wanted to invite someone who works on Postgres in the cloud. Ep22, Leading engineering for Postgres on Azure, just published with guest Affan Dar, who heads up engineering for Postgres at Microsoft. In this hour-long episode, Affan shared his perspectives on management vs. IC roles, what his job is like, what Microsoft's strategy for Postgres is, the world of Postgres extensions, plus a bit of discussion about AI and pgvector. Affan is thoughtful and candid; I'm so glad he agreed to be on the podcast! (Disclosure: I am the host.)

Let me know what you think. Suggestions for future guests always welcome. The podcast is a TON of fun to produce and I hope you enjoy it as much as I do!

1 Comment
2024/12/06
20:01 UTC

3

Export large tables (Backups)

I have a large log table with about 4 billion records, 600 GB in size. The growth has been significant recently, and pg_dump now takes a day.

Does anyone have experience with this scale that could help?

Also, recommendations on importing would be welcome. (My first time dealing with something at this scale.)

Thank you in advance!
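One way to get parallelism for a single huge table (pg_dump's --jobs only parallelizes across tables, not within one) is to export disjoint key ranges from several sessions at once; a sketch, with assumed table/column/path names:

```sql
-- Server-side export of one range (needs superuser or the
-- pg_write_server_files role); from psql, \copy writes to the client
-- machine instead. Run one range per session in parallel.
COPY (SELECT * FROM logs WHERE id BETWEEN 1 AND 1000000000)
  TO '/backup/logs_part_1.csv' WITH (FORMAT csv);

-- Import side: COPY ... FROM is fastest when indexes and constraints
-- are created after the load, not before.
-- COPY logs FROM '/backup/logs_part_1.csv' WITH (FORMAT csv);
```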

11 Comments
2024/12/06
14:30 UTC

2

Recommendations for decent paid instructor-led training course

Hi, I appreciate that this question has probably been asked many times already, but our company has budget to spend on training before the end of the year and I would like to get up to speed on PostgreSQL.

Could anyone recommend an instructor-led PostgreSQL training course that serves Western Europe?

I have 20 years of SQL Server experience but feel it's about time I learnt how the other half lives.

3 Comments
2024/12/06
09:14 UTC

1

Pgbackrest and Docker

I'm having a hard time setting up Postgres and pgbackrest for backups through Docker. I thought it wouldn't be that hard, since there's a mounted volume so pgbackrest can technically access the data, but then I also ran into problems with WAL archiving.

I'm currently using the postgres:15-alpine image, but do I need to build my own custom image for pgbackrest to do WAL archiving? I also tried using docker-pgbackrest, but I have no idea how to set that up.

On top of that, I'm confused about the difference between postgres-client and the postgres:15-alpine image. The pgbackrest docs say to install a cluster, but can't I just do the same with the official postgres image and conf?
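On the WAL archiving part: archive_command is executed by the Postgres server process itself, so the pgbackrest binary has to exist inside the same container/image as Postgres — which is why a custom image built on postgres:15-alpine is the usual route. A sketch of the server side (the stanza name is an assumption):

```sql
-- archive_mode needs a restart; archive_command is reloadable.
ALTER SYSTEM SET archive_mode = on;
ALTER SYSTEM SET archive_command =
    'pgbackrest --stanza=main archive-push %p';
```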

1 Comment
2024/12/06
08:26 UTC

3

Concurrent inserts cause ExclusiveLock

Hello.
I'm inserting a "large" number of records (tens of millions) into a Postgres table in parallel, using different processes/db connections.

Initially, the table had a UUID PK set by the client, but I observed that the inserts were performed sequentially, as they acquired an ExclusiveLock on the table (I saw the lock in pg_stat_activity).

I tried removing the PK and repeating the migration, but during the execution some INSERT statements randomly acquire the ExclusiveLock and block the other inserts. Sometimes these locks resolve themselves and the inserts return to parallel execution; other times they continue (different PIDs acquiring the lock one at a time).

I'm assuming that you can execute several inserts in parallel. I couldn't find anything specific in the documentation (if anyone has anything to point me to, I'd be grateful), but reading some old answers on Stack Overflow it seemed to me that it was possible. So I don't understand if I'm doing something wrong or if this is the normal operation of Postgres.
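For what it's worth, plain INSERTs only take RowExclusiveLock on the table, and despite the name that mode does not block other inserts; short exclusive waits during bulk loads often turn out to be relation-extension locks (locktype 'extend') taken while the table file grows. A sketch for seeing who is actually blocking whom, assuming Postgres 9.6+:

```sql
-- For each waiting backend, list the PIDs blocking it and its query.
SELECT a.pid                    AS waiting_pid,
       a.query                  AS waiting_query,
       pg_blocking_pids(a.pid)  AS blocked_by
  FROM pg_stat_activity a
 WHERE cardinality(pg_blocking_pids(a.pid)) > 0;

-- And the lock modes involved:
SELECT pid, locktype, relation::regclass AS rel, mode, granted
  FROM pg_locks
 WHERE NOT granted;
```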

8 Comments
2024/12/05
16:39 UTC

8

How to fetch a large amount of data from a table that is frequently updated?

So we have a primary DB and a replica DB, both hosted on AWS, and some of the tables have a huge volume and a high frequency of write and read requests.

When we fetch a limited, smaller amount of data, it works and gives the result, but when the query has a bigger time range and more data to scan, it breaks and doesn't give a result; instead we get an error.

The reason for this is that while we are trying to fetch the data, at the same time someone is inserting new data.

I have stumbled upon a solution that says to use a lock, but that is not possible because we cannot lock the table for even a second, due to the nature of our product.

So what should I do in such a case?
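Worth noting that in Postgres readers never block writers (and vice versa) thanks to MVCC, so a failing big read is more likely hitting a statement timeout, memory limit, or replica conflict than a lock. One common workaround is to fetch in keyset-paginated chunks so each individual query stays small; a sketch with assumed table and column names:

```sql
-- Assumed table: events(id bigint PK, created_at timestamptz, ...).
-- :last_seen_id is a placeholder bind parameter (0 on the first call);
-- repeat the query, feeding in the last id returned, until fewer than
-- LIMIT rows come back. Each chunk is a short, index-friendly query.
SELECT *
  FROM events
 WHERE created_at >= '2024-12-01'
   AND created_at <  '2024-12-02'
   AND id > :last_seen_id
 ORDER BY id
 LIMIT 10000;
```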

16 Comments
2024/12/05
15:50 UTC
