Build a dynamic SQL query with the psycopg2 Python library using proper type conversion


Solution 1

You are trying to pass a table name as a parameter. You probably could've seen this immediately if you'd just looked at the PostgreSQL error log.

The table name you're trying to pass through psycopg2 as a parameter is being escaped, producing a query like:

INSERT INTO E'my_table'(name, url, id, point_geom, poly_geom) VALUES (E'ST_GeomFromText(''POLYGON(( 52.146542 19.050557, 52.148430 19.045527, 52.149525 19.045831, 52.147400 19.050780, 52.147400 19.050780, 52.146542 19.050557))'',4326)');

This isn't what you intended and won't work: a table name is an identifier, so it can't be escaped like a literal. You must use normal Python string interpolation to build the dynamic part of the SQL; parameterized statement placeholders can only be used for actual literal values.

params = ('POLYGON(( 52.146542 19.050557, 52.148430 19.045527, 52.149525 19.045831, 52.147400 19.050780, 52.147400 19.050780, 52.146542 19.050557))',4326)
escaped_name = name.replace('"', '""')  # double any embedded double quotes
curs.execute('INSERT INTO "%s"(name, url, id, point_geom, poly_geom) VALUES (ST_GeomFromText(%%s,%%s));' % escaped_name, params)

See how I've interpolated the name directly to produce the query string:

INSERT INTO "my_table"(name, url, id, point_geom, poly_geom) VALUES (ST_GeomFromText(%s,%s));

(%% gets converted to plain % by % substitution). Then I'm using that query with the string defining the POLYGON and the other argument to ST_GeomFromText as query parameters.
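To make the two-stage substitution concrete, here is a minimal sketch that runs without a database. The single column poly_geom and the table name are only illustrative assumptions; the point is what the string looks like after Python's % operator has run and before psycopg2 sees it.

```python
# Stage 1: Python's % operator fills in the identifier.
# "%s" is replaced by the table name, and each "%%" collapses to a single "%".
template = 'INSERT INTO "%s"(poly_geom) VALUES (ST_GeomFromText(%%s, %%s));'
query = template % "my_table"  # "my_table" is an illustrative name

# Stage 2 (not run here): psycopg2 would fill the remaining %s placeholders,
# e.g. curs.execute(query, (wkt_string, 4326))
print(query)
```

The remaining %s markers are the ones psycopg2 replaces with properly quoted literal values.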

I haven't tested this, but it should give you the right idea and help explain what's wrong.

BE EXTREMELY CAREFUL when doing string interpolation like this; it's an easy avenue for SQL injection. I've done very crude quoting in the code shown above, but I'd use a proper identifier quoting function if your client library offers one.
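As a rough illustration of what identifier quoting involves, here is a hand-rolled helper. The name quote_ident_sketch is my own, not a psycopg2 function, and this is a sketch rather than a vetted implementation. It wraps the name in double quotes and doubles any embedded double quote, which is the standard SQL rule for quoted identifiers. If your psycopg2 is 2.7 or newer, psycopg2.extensions.quote_ident or the sql module is the better choice.

```python
def quote_ident_sketch(name):
    """Quote a SQL identifier by hand (illustrative helper, not a library API)."""
    if "\x00" in name:
        # PostgreSQL identifiers cannot contain NUL bytes
        raise ValueError("identifier contains a NUL byte")
    # Double embedded double quotes, then wrap the whole name in double quotes
    return '"' + name.replace('"', '""') + '"'


print(quote_ident_sketch("my_table"))
print(quote_ident_sketch('odd"name'))
```

Keep in mind that a quoted identifier is case-sensitive in PostgreSQL, so the quoted name has to match the case the table was created with.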

Solution 2

Now that 2.7 is on PyPI, here is an example of a dynamic query.

In this example I'll assume the polygon is a dictionary built from your csv file. The keys could be name, url, id, point_geom, poly_geom as mentioned above, but the exact names don't really matter as long as the table has columns with the same names.
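One way such a dictionary might be produced from the semicolon-separated file in the question is sketched here. The function name polygon_wkt, the sample text, and the two keys are assumptions for illustration; the WKT string is built in plain Python and would later be handed to the database as an ordinary parameter value.

```python
import csv
import io


def polygon_wkt(csv_text, delimiter=";"):
    """Build a WKT POLYGON string from lines of the form 'x;y'."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    # float() validates each coordinate before it is formatted into the string
    points = ["%s %s" % (float(x), float(y)) for x, y in reader]
    return "POLYGON((%s))" % ",".join(points)


# Same content as some.csv in the question
sample = "0.0;0.0\n20.0;0.0\n20.0;20.0\n0.0;20.0\n0.0;0.0\n"

# Hypothetical row: keys must match column names of the target table
polygon = {"name": "square", "poly_geom": polygon_wkt(sample)}
print(polygon["poly_geom"])
```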

There's probably a way to shorten this, but I hope it clarifies the use of the sql module's classes, namely sql.SQL, sql.Identifier, and sql.Placeholder, and how to join a list of them with sql.SQL('..').join(list()).

from psycopg2 import sql
table = 'my_table'
polygon = Polygon.from_file()  # or something; any loader that returns a dict
column_list = list()
value_list = list()

# Convert the dictionary to lists
for column, value in polygon.items():
    column_list.append(sql.Identifier(column))  # Convert to identifiers
    value_list.append(value)

# Build the query, values will be inserted later
query = sql.SQL("INSERT INTO {} ({}) VALUES ({}) ON CONFLICT DO NOTHING").format(
                sql.Identifier(table),
                sql.SQL(', ').join(column_list),  # already sql.Identifier
                sql.SQL(', ').join([sql.Placeholder()] * len(value_list)))

# Execute the query (postgres is an open psycopg2 connection)
with postgres.cursor() as p_cursor:
    # the values are passed separately, as a sequence (a tuple here)
    p_cursor.execute(query, tuple(value_list))

Reference: http://initd.org/psycopg/docs/sql.html
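A more compact variant of the same idea, wrapped in functions, might look like the sketch below. The names split_row and build_insert are my own. The psycopg2 import sits inside build_insert so the helper above it can be tried without the library installed; build_insert itself assumes psycopg2 2.7 or newer.

```python
def split_row(row):
    """Return (columns, values) from a dict, keeping both in the same order."""
    columns = list(row.keys())
    values = tuple(row[column] for column in columns)
    return columns, values


def build_insert(table, row):
    """Compose an INSERT for `table` from the dict `row` (needs psycopg2 >= 2.7)."""
    from psycopg2 import sql

    columns, values = split_row(row)
    query = sql.SQL("INSERT INTO {} ({}) VALUES ({})").format(
        sql.Identifier(table),                                   # quoted table name
        sql.SQL(", ").join(sql.Identifier(c) for c in columns),  # quoted column names
        sql.SQL(", ").join(sql.Placeholder() * len(columns)),    # one %s per value
    )
    return query, values


# Usage, assuming `conn` is an open psycopg2 connection:
#     query, values = build_insert("my_table", {"name": "square", "id": 1})
#     with conn.cursor() as cur:
#         cur.execute(query, values)
```

Only the identifiers are composed into the query; the values still travel separately through execute, so they are adapted by psycopg2 as usual.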

Solution 3

The proper way is to use psycopg2 2.7's new sql module which includes an Identifier object. This allows you to dynamically specify SQL identifiers in a safe way.

Unfortunately 2.7 is not on PyPI yet (2.6.2 as of writing).

Until then, the psycopg2 FAQ covers this under the heading "How can I pass field/table names to a query?" http://initd.org/psycopg/docs/faq.html#problems-with-type-conversions

You can pass SQL identifiers in along with data values to the execute function by using the AsIs function.

Note: this provides NO security. It is as good as using a format string, which is not recommended. The only real advantage of this is you encourage future code to follow the execute + data style. You can also easily search for AsIs in future.

from psycopg2.extensions import AsIs
<snip>
with transaction() as cur:
    # WARNING: not secure
    cur.execute('SELECT * from %(table)s', {'table': AsIs('mytable')})
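Since AsIs does no checking at all, one cheap safeguard is to validate the name against a fixed list before wrapping it. This is a sketch under my own assumptions: ALLOWED_TABLES and checked_table are hypothetical names, and the set contents are just the table names that appear in this thread.

```python
# Hypothetical whitelist of tables the program is allowed to touch
ALLOWED_TABLES = {"mytable", "binome1"}


def checked_table(name):
    """Return `name` unchanged if it is whitelisted, otherwise raise ValueError."""
    if name not in ALLOWED_TABLES:
        raise ValueError("unexpected table name: %r" % (name,))
    return name


# Usage with the snippet above (not run here):
#     cur.execute('SELECT * from %(table)s', {'table': AsIs(checked_table(user_input))})
print(checked_table("mytable"))
```

This only works when the set of valid tables is known in advance; for arbitrary names, proper identifier quoting is still needed.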
Author: reyman64

Updated on July 26, 2022

Comments

  • reyman64
    reyman64 almost 2 years

    I am having trouble designing a good algorithm that follows the psycopg2 library specification described here.

    I want to build a dynamic query equivalent to this string:

    SELECT ST_GeomFromText('POLYGON((0.0 0.0,20.0 0.0,20.0 20.0,0.0 20.0,0.0 0.0))');
    

    As you can see, my POLYGON object contains multiple points, read from a simple csv file, some.csv, which contains:

    0.0;0.0
    20.0;0.0
    20.0;20.0
    0.0;20.0
    0.0;0.0
    

    So I build the query dynamically, depending on the number of lines in the csv.

    Here is my program, which generates the SQL query string to execute:

    import psycopg2
    import csv 
    
    # list of points
    lXy = []
    
    DSN= "dbname='testS' user='postgres' password='postgres' host='localhost'"
    conn = psycopg2.connect(DSN)
    
    curs = conn.cursor()
    
    def genPointText(curs,x,y):
        generatedPoint = "%s %s" % (x,y)
        return generatedPoint
    
    # Read the csv file
    polygonFile = open('some.csv', 'rb')
    readerCSV = csv.reader(polygonFile,delimiter = ';')
    
    for coordinates in readerCSV:
        lXy.append(genPointText(curs,float(coordinates[0]),float(coordinates[1])))
    
    # function of list concatenation by separator
    def convert(myList,separator):
        return separator.join([str(i) for i in myList])
    
    # construct simple query with psycopg
    def genPolygonText(curs,l):
        # http://initd.org/psycopg/docs/usage.html#python-types-adaptation
        generatedPolygon = "POLYGON((%s))" % convert(l, ",")
        return generatedPolygon
    
    def executeWKT(curs,geomObject,srid):
        try:
                # geometry ST_GeomFromText(text WKT, integer srid);
            finalWKT = "SELECT ST_GeomFromText('%s');" % (geomObject) 
            print finalWKT
            curs.execute(finalWKT)
        except psycopg2.ProgrammingError,err:
            print "ERROR = " , err
    
    polygonQuery = genPolygonText(curs,lXy)
    executeWKT(curs,polygonQuery,4326)
    

    As you can see, this works, but it is not the correct approach because of conversion problems between Python objects and PostgreSQL objects.

    In the documentation, I only see examples that feed and convert data for static queries. Do you know an "elegant" way to build a correct string with correct types when the query is built dynamically?

    UPDATE 1 :

    As you can see, when I use the psycopg type conversion on this simple example, I get an error like this:

    query = "ST_GeomFromText('POLYGON(( 52.146542 19.050557, 52.148430 19.045527, 52.149525 19.045831, 52.147400 19.050780, 52.147400 19.050780, 52.146542 19.050557))',4326)"
    name = "my_table"
    
    try:
        curs.execute('INSERT INTO %s(name, url, id, point_geom, poly_geom) VALUES (%s);', (name,query))
    except psycopg2.ProgrammingError,err:
        print "ERROR = " , err
    

    The error is (the message is in French and means "syntax error at or near E'my_table'"):

    ERROR =  ERREUR:  erreur de syntaxe sur ou près de « E'my_table' »
    LINE 1: INSERT INTO E'my_table'(name, poly_geom) VALUES (E'ST_GeomFr...
    

    UPDATE 2 :

    Final code, which works thanks to Stack Overflow users!

    #info lib : http://www.initd.org/psycopg/docs/
    import psycopg2
    # info lib : http://docs.python.org/2/library/csv.html
    import csv 
    
    # list of points
    lXy = []
    
    DSN= "dbname='testS' user='postgres' password='postgres' host='localhost'"
    
    print "Opening connection using dsn:", DSN
    conn = psycopg2.connect(DSN)
    
    curs = conn.cursor()
    
    def genPointText(curs,x,y):
        generatedPoint = "%s %s" % (x,y)
        return generatedPoint
    
    # Read the csv file
    polygonFile = open('some.csv', 'rb')
    readerCSV = csv.reader(polygonFile,delimiter = ';')
    
    for coordinates in readerCSV:
        lXy.append(genPointText(curs,float(coordinates[0]),float(coordinates[1])))
    
    # function of list concatenation by separator
    def convert(myList,separator):
        return separator.join([str(i) for i in myList])
    
    # construct simple query with psycopg
    def genPolygonText(l):
        # http://initd.org/psycopg/docs/usage.html#python-types-adaptation
        generatedPolygon = "POLYGON((%s))" % convert(l, ",")
        return generatedPolygon
    
    def generateInsert(curs,tableName,name,geomObject):
        curs.execute('INSERT INTO binome1(name,geom) VALUES (%s, %s);' , (name,geomObject))
    
    
    def create_db_binome(conn,name):
    
        curs = conn.cursor()
    
        SQL = (
            "CREATE TABLE %s"
            " ("
            " polyname character varying(15),"
            " geom geometry,"
            " id serial NOT NULL,"
            " CONSTRAINT id_key PRIMARY KEY (id)"
            " )" 
            " WITH ("
            " OIDS=FALSE"
            " );"
            " ALTER TABLE %s OWNER TO postgres;"
            ) %(name,name)
        try:
          #print SQL
          curs.execute(SQL)
    
        except psycopg2.ProgrammingError,err:
          conn.rollback()
          dropQuery = "ALTER TABLE %s DROP CONSTRAINT id_key; DROP TABLE %s;" % (name,name)
          curs.execute(dropQuery)
          curs.execute(SQL)
    
        conn.commit()
    
    def insert_geometry(polyname,tablename,geometry):
    
        # Double any embedded double quotes so the name can be wrapped in "..."
        escaped_name = tablename.replace('"','""')
    
        try:
            test = 'INSERT INTO "%s"(polyname, geom) VALUES(%%s, ST_GeomFromText(%%s,%%s))' % (escaped_name)
            # polyname, the WKT geometry and the SRID are passed as query parameters
            curs.execute(test, (polyname, geometry, 4326))
            conn.commit()
        except psycopg2.ProgrammingError,err:
            print "ERROR = " , err
    
    ################
    # PROGRAM MAIN #
    ################
    
    polygonQuery = genPolygonText(lXy)
    srid = 4326
    table = "binome1"
    
    create_db_binome(conn,table)
    insert_geometry("Berlin",table,polygonQuery)
    insert_geometry("Paris",table,polygonQuery)
    
    polygonFile.close()
    conn.close()
    
  • reyman64
    reyman64 over 11 years
    Thanks! I have updated my question with a complete working program, including creation of the database, based on your answer.