Copy (from) csv with heades in postgres with python

19,546

Solution 1

I'd try con.commit() after cur.copy_expert().

Also I would avoid preprocessing and uploading the file row by row as Sam P. pointed out above if the dataset is large. cur.copy_expert() is significantly faster.

conn = psycopg2.connect('postgresql://scott:tiger@localhost:5432/database')
cur = conn.cursor()
copy_sql = """
           COPY table_name FROM stdin WITH CSV HEADER
           DELIMITER as ','
           """
with open(path, 'r') as f:
    cur.copy_expert(sql=copy_sql, file=f)
    conn.commit()
    cur.close()

Solution 2

I would recommend dealing with the csv file in python first. It will be best to structure the data pulled from the csv file into rows/columns (in python this will nested lists, or a list of a tuples) then you can construct & execute SQL commands based on that data iteratively.

Use the csv library to interact with the csv file, take a look at the documentation here: https://docs.python.org/2/library/csv.html. It's very user friendly and will help you with a lot of your problems.

Here's a way to do it without csv (as I can't remember all the functions off the top of my head), however it would be best not to use this approach:

#/usr/bin/env python
# -*- coding: utf-8 -*-
import psycopg2
import os
import glob
DSN = "dbname=gps user=postgres host=localhost"
con = psycopg2.connect(DSN)
cur = con.cursor()

# 'rb' used as I don't know the encoding of your file
# just use r if it's in utf-8 or a known/consistent charset
with open(file,'rb') as open_file:
    my_file = open_file.read().decode('utf-8','ignore')

data = my_file.splitlines()
data = [r.split(delimiter) for r in data]

data = data[1:] # get rid of headers

for r in data:
     # create command
     # cur.execute(command)
Share:
19,546
Pablo Pardo
Author by

Pablo Pardo

Updated on July 28, 2022

Comments

  • Pablo Pardo
    Pablo Pardo almost 2 years

    I'm trying to fill a table from CSV files in a python script.

    The SQL statement, which follows, runs without error:

    COPY registro
    FROM '/home/pablo/Escritorio/puntos/20140227.csv'
    DELIMITER ','
    CSV header;
    

    CSV has headers, and using header parameter, it imports without error.

    The problem comes when I execute it from my python script. The only way I've found not to try to import the headers is with copy_expert() method. I get no error message but the table is still empty after I run the Python script below.

    Any possible clue? Or maybe any other way to copy a table from CSV with headers?

    Thanks.

    #/usr/bin/env python
    # -*- coding: utf-8 -*-
    import psycopg2
    import os
    import glob
    DSN = "dbname=gps user=postgres host=localhost"
    con = psycopg2.connect(DSN)
    cur = con.cursor()
    my_file = open('/home/pablo/Escritorio/puntos/20140227.csv')
    #This is only a test file, not all the directory
    sql = "COPY registro FROM stdin DELIMITER \',\' CSV header;"
    cur.copy_expert(sql, my_file)
    cur.close()
    con.close()
    
  • Waqas Ali
    Waqas Ali over 7 years
    This line helps me while reading data from csv/json and copy into flex in vertica. Thanks -:) my_file = open_file.read().decode('utf-8','ignore')
  • Waqas Ali
    Waqas Ali over 7 years
  • aydow
    aydow about 7 years
    should you not also close the connection?
  • Kemin Zhou
    Kemin Zhou over 6 years
    Does the copy_from(file, table_name, feature=value) method have an option for header=true?
  • Seth
    Seth over 5 years
    @aydow, closing the connection is not a bad idea. I should've included that in my original post outside of the with block