PostgreSQL: database is not accepting commands to avoid wraparound data loss


The problem isn't dead tuples, it's transaction IDs (XIDs), which control row visibility. Each transaction gets a sequential XID, and since XIDs are 32-bit integers, they will eventually wrap around.

See here for more detail: http://www.postgresql.org/docs/9.3/static/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND, but the short version is that all tables need to be VACUUMed (either manually or with autovacuum) at least every 2 billion transactions. The longer you go without vacuuming, the longer it takes.
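
As a quick check (not part of the original answer, but using standard catalog columns), you can see how far each database and table is from the wraparound limit:

    -- Transactions since each database was last fully frozen;
    -- values approaching ~2 billion are the danger zone.
    SELECT datname, age(datfrozenxid) AS xid_age
    FROM pg_database
    ORDER BY xid_age DESC;

    -- The same idea per table, inside the problem database:
    SELECT relname, age(relfrozenxid) AS xid_age
    FROM pg_class
    WHERE relkind = 'r'
    ORDER BY xid_age DESC
    LIMIT 10;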

To fix your current problem you don't need a VACUUM ANALYZE, just a plain VACUUM - I am not sure how big the speed difference is, but it should be faster.

What kind of hardware is this running on, and what's your maintenance_work_mem set to? You may want to raise it (possibly temporarily) to complete the VACUUM faster.
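
For instance, a minimal sketch of the immediate fix (the 1GB value is purely illustrative, and public.my_table is the table named in the question):

    -- Give this session more memory for the vacuum's dead-tuple bookkeeping,
    -- then run a plain VACUUM: no ANALYZE, and definitely not VACUUM FULL.
    SET maintenance_work_mem = '1GB';
    VACUUM public.my_table;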

In the future, you basically just need to VACUUM more: either increase autovacuum frequency (see here: https://dba.stackexchange.com/questions/21068/aggressive-autovacuum-on-postgresql, for example) or even schedule manual VACUUMs with cron. Also look at vacuum_freeze_min_age and related settings.
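
As a rough sketch only, the postgresql.conf knobs involved might look like this (the values are illustrative, not recommendations from this answer, and need tuning for your workload):

    # Wake autovacuum more often and trigger it on a smaller fraction of changes
    autovacuum_naptime = 30s                  # default is 1min
    autovacuum_vacuum_scale_factor = 0.05     # default is 0.2
    # Freeze tuples earlier so anti-wraparound vacuums have less left to do
    vacuum_freeze_min_age = 10000000          # default is 50000000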

What kind of data is it, and what kind of transactions are you running? That's a pretty big table; can it be partitioned (by date, for instance)?

Edit

You may also want to enable log_autovacuum_min_duration (set it to a small value), to see what autovacuum is actually doing when the database is live, and if there are locking issues preventing it from running.
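
For example, in postgresql.conf (0 means every autovacuum run gets logged; this matches the value the question author eventually used):

    # Log every autovacuum action, with duration and pages/tuples processed
    log_autovacuum_min_duration = 0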

Responding to Comments

You don't have to run VACUUM in standalone mode; you can run it now, unless that would interfere too much with your other databases. You just need to do it as a superuser, so system tables are also vacuumed.
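
A sketch of that, assuming you connect to the affected database ("mydb" from the error message) as a superuser:

    -- A bare VACUUM processes every table in the current database,
    -- including the system catalogs, and lets datfrozenxid advance.
    VACUUM;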

Doing a dump/restore seems drastic, and I can't imagine it would be faster than completing the VACUUM.

Switching away from stored procedures will not help: any queries that modify data will generate XIDs; it doesn't matter whether you use transactions explicitly - they're still transactions.

You're on the right track - getting autovacuum to keep up with your inserts/updates is the best solution (logging its activity should help you understand what's going wrong now).

Judging by your table structure, this may be the classic case for table partitioning (http://www.postgresql.org/docs/9.3/static/ddl-partitioning.html) - am I right in thinking that it's all inserts, rather than updates/deletes? If you're always writing to one small partition, you can vacuum it more aggressively (autovacuum can be configured per table), and VACUUM FREEZE the others.
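
A sketch under that assumption, with hypothetical partition names (the per-table storage-parameter syntax and VACUUM FREEZE are standard, but the names and values here are illustrative):

    -- Vacuum the partition that still receives writes more aggressively:
    ALTER TABLE my_table_2014_05 SET (
        autovacuum_vacuum_scale_factor = 0.01,
        autovacuum_vacuum_threshold = 1000
    );
    -- Partitions that no longer change can be frozen once and left alone:
    VACUUM FREEZE my_table_2014_04;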


Comments

  • xardas (almost 2 years ago)

    I got this error on create/delete/update queries:

    ERROR: database is not accepting commands to avoid wraparound data loss in database "mydb"
    HINT: Stop the postmaster and use a standalone backend to vacuum that database.
    You might also need to commit or roll back old prepared transactions.

    So, the database is blocked and it is only possible to perform SELECT queries.

    The database size is 350 GB. One table (my_table) has ~1 billion rows.

    system: "PostgreSQL 9.3.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4), 64-bit"

    Some postgresql.conf settings:

    effective_io_concurrency = 15           # 1-1000; 0 disables prefetching
    autovacuum_vacuum_cost_delay = -1
    
    #vacuum_cost_delay = 0                  # 0-100 milliseconds
    #vacuum_cost_page_hit = 1               # 0-10000 credits
    #vacuum_cost_page_miss = 10             # 0-10000 credits
    #vacuum_cost_page_dirty = 20            # 0-10000 credits
    #vacuum_cost_limit = 200 
    

    I do not use prepared transactions. But basic stored procedures are used (which means automatic transactions, right?) 50 million times per day.

    Currently "autovacuum: VACUUM ANALYZE public.my_table (to prevent wraparound)" is running; it has been going for almost 12 hours.

    As far as I understand, the problem is with non-vacuumed dead tuples, right?

    How do I resolve this problem and prevent it in the future? Please help :)

    The end of the story (~one month later): my big table is now partitioned into thousands of tables. Each small table is vacuumed much faster. The autovacuum configuration was set back closer to the defaults. If needed, it could be made more aggressive again, but so far the database with billions of rows works pretty well.

    So, the problem from this question should not appear again.

    P.S. I'm now looking at Postgres-XL as the next step for data scalability.

  • Denis de Bernardy (about 10 years ago)
    As I understand the docs (whose link you might have borrowed from my comment), auto-vacuuming isn't enough or useful here: what's actually needed is a manual vacuum of all tables every 2bn transactions or so, in order to allow txids to reset. That said, I have never experienced this first-hand, which is why I left a comment instead of an answer. Methinks you should delete your answer (or turn it into a wiki for Craig or Erwin to amend) if you haven't experienced it either.
  • Dmitri (about 10 years ago)
    @Denis: The docs seem pretty clear that autovacuum will clear old XIDs (when working correctly). I don't believe there is a functional difference between a manual VACUUM and autovacuum, though I could be wrong. I appreciate your thoughts on what I should do with my answers.
  • Denis de Bernardy (about 10 years ago)
    As I understand them there's a functional difference between the two, in the sense that auto-vacuum is a partial thing: it sweeps dead rows in a few disk pages, and then stops to avoid locking things for too long; and then does the same again for another few disk pages later on, and so forth, so as to progressively sweep out dead rows. Issuing a vacuum full, in contrast, rewrites the entire table for all intents and purposes, locking it for as long as is needed to do so.
  • xardas (about 10 years ago)
    Dmitri, thank you for such a broad answer. maintenance_work_mem = 512MB; vacuum_freeze_min_age is commented out (#vacuum_freeze_min_age = 50000000), i.e. left at the default. The table structure is something like a history table: id, sub_id, date, param1, param2 ..., with an index on (sub_id, date).
  • xardas (about 10 years ago)
    BTW, I restarted PG (so the vacuum operation was stopped) and received the message: < 2014-05-01 16:57:46.353 > WARNING: database with OID 12345 must be vacuumed within 999671 transactions. I cannot stop PG and start it in standalone mode to run VACUUM (without ANALYZE) manually, because I need the other databases running on this server. So it seems that I should wait until it is finished. What about backing up the DB, dropping it and restoring? Would that help in my situation, since I would like to get my database back as fast as possible?
  • xardas (about 10 years ago)
    Some steps I see as part of a solution for preventing this problem in the future: 1) perform autovacuum more often: autovacuum_naptime = 55, autovacuum_vacuum_threshold = 45; 2) reduce the number of stored procedure calls where possible, because they are transaction-based. Instead, I'll try to use simple queries. This should lower the number of XIDs created and therefore lower the probability of hitting the wraparound problem. Will that help?
  • xardas (about 10 years ago)
    I also made some changes in the config: to see the vacuum logs, log_autovacuum_min_duration = 0; and autovacuum_analyze_threshold = 150 - increasing this value from the default 50 to 150 because the DB has many more insert/update/delete operations than selects. Am I on the right track?
  • Dmitri (about 10 years ago)
    @Denis: VACUUM FULL is different from VACUUM (and also not required to clear XIDs). What I meant was that once the incremental autovacuum completes, the end result is the same as running VACUUM manually.
  • xardas (almost 10 years ago)
    Dmitri, thank you for your "Responding to Comments". I've just read it, and it describes what I actually did after the problem in this question.