PostgreSQL - how to run VACUUM from code outside transaction block?

24,011

Solution 1

After more searching I have discovered the isolation_level property of the psycopg2 connection object. It turns out that changing this to 0 will move you out of a transaction block. Changing the vacuum method of the above class to the following solves it. Note that I also set the isolation level back to what it previously was just in case (seems to be 1 by default).

def vacuum(self):
    old_isolation_level = self.conn.isolation_level
    self.conn.set_isolation_level(0)
    query = "VACUUM FULL"
    self._doQuery(query)
    self.conn.set_isolation_level(old_isolation_level)

This article (near the end on that page) provides a brief explanation of isolation levels in this context.

Solution 2

For anyone else that has tried all the suggestions around this problem without success, you may be suffering the same fate as me: I had 2 (or more) SQL statements in one execute() call. It turns out that Postgres itself resets any autocommit/isolation after the first statement (separated by a ;). I finally came across the solution here: https://github.com/psycopg/psycopg2/issues/1201

So don't do something like this:

cursor.execute("SELECT 1; VACUUM FULL")

Instead do:

cursor.execute("SELECT 1")
cursor.execute("VACUUM FULL")

Solution 3

Additionally, you can also get the messages given by the Vacuum or Analyse using:

>> print conn.notices #conn is the connection object

this command print a list with the log message of queries like Vacuum and Analyse:

INFO:  "usuario": processados 1 de 1 páginas, contendo 7 registros vigentes e 0 registros não vigentes; 7 registros amostrados, 7 registros totais estimados   
INFO:  analisando "public.usuario"

This can be useful to the DBAs ^^

Solution 4

While vacuum full is questionable in current versions of postgresql, forcing a 'vacuum analyze' or 'reindex' after certain massive actions can improve performance, or clean up disk usage. This is postgresql specific, and needs to be cleaned up to do the right thing for other databases.

from django.db import connection
# Much of the proxy is not defined until this is done
force_proxy = connection.cursor()
realconn=connection.connection
old_isolation_level = realconn.isolation_level
realconn.set_isolation_level(0)
cursor = realconn.cursor()
cursor.execute('VACUUM ANALYZE')
realconn.set_isolation_level(old_isolation_level)

Unfortunately the connection proxy provided by django doesn't provide access to set_isolation_level.

Solution 5

Note if you're using Django with South to perform a migration you can use the following code to execute a VACUUM ANALYZE.

def forwards(self, orm):

    db.commit_transaction()
    db.execute("VACUUM ANALYZE <table>")

    #Optionally start another transaction to do some more work...
    db.start_transaction()
Share:
24,011
Wayne Koorts
Author by

Wayne Koorts

Software developer who loves learning new technologies. My (mostly) programming blog can be found here. How to be a nice editor on Stack Exchange.

Updated on July 09, 2022

Comments

  • Wayne Koorts
    Wayne Koorts almost 2 years

    I am using Python with psycopg2 and I'm trying to run a full VACUUM after a daily operation which inserts several thousand rows. The problem is that when I try to run the VACUUM command within my code I get the following error:

    psycopg2.InternalError: VACUUM cannot run inside a transaction block
    

    How do I run this from the code outside a transaction block?

    If it makes a difference, I have a simple DB abstraction class, a subset of which is displayed below for context (not runnable, exception-handling and docstrings omitted and line spanning adjustments made):

    class db(object):
        def __init__(dbname, host, port, user, password):
            self.conn = psycopg2.connect("dbname=%s host=%s port=%s \
                                          user=%s password=%s" \
                                          % (dbname, host, port, user, password))
    
            self.cursor = self.conn.cursor()
    
        def _doQuery(self, query):
            self.cursor.execute(query)
            self.conn.commit()
    
        def vacuum(self):
            query = "VACUUM FULL"
            self._doQuery(query)
    
    • Wayne Koorts
      Wayne Koorts almost 15 years
      @nosklo, Good suggestion, but according to the Postgres docs that is the same as COMMIT.
    • Michael Aquilina
      Michael Aquilina about 9 years
      Are you using SQLAlchemy by any chance? I experienced a similiar issue because setting autocommit=True in SqlAlchemy does not actually turn off transactions. Using set_isolation_level is a work around that accesses the internal methods of the psycopg2 connection.
    • Wayne Koorts
      Wayne Koorts about 9 years
      @MichaelAquilina I believe at the time (this was 6 years ago now) this was for part of a project that wasn't using an ORM.
  • Wayne Koorts
    Wayne Koorts almost 15 years
    Thanks for your detailed answer. It turns out the solution was to do with "isolation levels", see my answer below.
  • rfusca
    rfusca almost 15 years
    Depending on your usage patterns, there are still times it makes sense to vacuum manually imho.
  • Magnus Hagander
    Magnus Hagander almost 15 years
    There are, but there aren't as many anymore. And it should definitely not be VACUUM FULL.
  • winwaed
    winwaed over 13 years
    I'm getting into PostGres and with some large tables. All the books (talking from a 8.* or 9.* perspective) talk about running VACUUM ANALYZE manually after a lot of updates, or automatically with a daemon.
  • Brad Koch
    Brad Koch over 10 years
    "after a daily operation which inserts several thousand rows", utilities like this should definitely VACUUM when they're finished.
  • Milen A. Radev
    Milen A. Radev over 10 years
    @Brad Koch: there's a significant difference between VACUUM and VACUUM FULL.
  • Brad Koch
    Brad Koch over 10 years
    Huge difference, but you do want a VACUUM ANALYZE after a large batch of updates, whether it's triggered by you or autovacuum. "you don't even need to run plain VACUUM manually" on its own is a bit misleading.
  • Nuno André
    Nuno André over 7 years
    Or, avoiding magic numbers: self.conn.set_isolation_level(psycopg2.extensions.ISOLATION_‌​LEVEL_AUTOCOMMIT)
  • Vlax
    Vlax about 4 years
    You need to run cursor.execute('VACUUM FULL VERBOSE') to actually get something in that property.
  • Cerin
    Cerin over 2 years
    What does force_proxy = connection.cursor()? You don't use that variable.