How to make Postgres Copy ignore first line of large txt file

43,081

Use HEADER option with CSV option:

\copy <table_name>  from '/source_file.csv' delimiter ',' CSV HEADER ;

HEADER Specifies that the file contains a header line with the names of each column in the file. On output, the first line contains the column names from the table, and on input, the first line is ignored. This option is allowed only when using CSV format.

Share:
43,081
thiakx
Author by

thiakx

Updated on January 19, 2022

Comments

  • thiakx
    thiakx over 2 years

    I have a fairly large .txt file ~9gb and I will like to load this txt file into postgres. The first row is the header, followed by all the data. If I postgres COPY the data directly, the header will cause an error that data type do not match with my postgres table, so I will need to remove it somehow.

    Sample data: ProjectId,MailId,MailCodeId,prospectid,listid,datemailed,amount,donated,zip,zip4,VectorMajor,VectorMinor,packageid,phase,databaseid,amount2

    15,53568419,89734,219906,15,2011-05-11 00:00:00,0,0,90720,2915,NonProfit,POLICY,230,3,1,0
    
    16,84141863,87936,164657,243,2011-03-10 00:00:00,0,0,48362,2523,NonProfit,POLICY,1507,5,1,0
    
    16,81442028,86632,15181625,243,2011-01-19 00:00:00,0,0,11501,2115,NonProfit,POLICY,1508,2,1,0
    

    While the COPY function for postgres has the "header" setting that can ignore the first row, it only works for csv files:

    copy training from 'C:/testCSV.csv' DELIMITER ',' csv header;
    

    when I try to run the code above on my txt file, it gets an error:

    copy training from 'C:/testTXTFile.txt' DELIMITER ',' csv header
    ERROR:  unquoted newline found in data
    HINT:  Use quoted CSV field to represent newline.
    

    I have tried adding "quote" and "escape" attributes but the command just won't seem to work for txt file:

    copy training from 'C:/testTXTFile.txt' DELIMITER ',' csv header quote as E'"'  escape as E'\\N';
    ERROR:  COPY escape must be a single one-byte character
    

    Alternatively, I thought about running java or create a seperate stagging table to remove the first row...but these solutions are expansive and time consuming. I will need to load 9gb of data just to remove the first row of headers... are there other solutions out there to remove the first row of a txt file easily so that I can load the data into my postgres database?

  • thiakx
    thiakx almost 12 years
    Sorry, my misatke, CSV header does work. The data in my txt file actually had a different format from my csv file that resulted in the error.
  • Admin
    Admin over 2 years
    Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.