Pandas parsing csv error - expected 1 fields found 9
Solution 1
The function pandas.read_csv() gets the number of columns and their names from the first line. By default, it does not account for the possibility that the first lines are comments.
Here, pandas reads the first line, splits it, and finds only one column, instead of splitting line 13, which is the first uncommented line. To solve this, the argument comment can be used.
planets = pd.read_csv("planets.csv", comment='#')
Compared to using skiprows, this allows the same code to load the planets.csv file even if the number of comment lines varies.
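To make the behaviour concrete, here is a minimal sketch that reproduces the error and the fix on an in-memory file. The three-column sample is a hypothetical, shortened stand-in for planets.csv, not the real data, and it assumes the default C parser engine.

```python
import io

import pandas as pd

# Hypothetical miniature of planets.csv: comment lines, then header, then data.
raw = """# This file was produced by the test
# COLUMN pl_hostname: Host Name
#
loc_rowid,pl_hostname,pl_pnum
1,11 Com,1
2,11 UMi,1
"""

# Without comment=, the first line fixes the column count at 1,
# so the first line containing commas raises a ParserError.
error_message = None
try:
    pd.read_csv(io.StringIO(raw))
except pd.errors.ParserError as exc:
    error_message = str(exc)
print(error_message)

# With comment="#", fully commented lines are ignored and the
# first remaining line is used as the header.
planets = pd.read_csv(io.StringIO(raw), comment="#")
print(planets.columns.tolist())

# The same call still works when the number of comment lines changes.
longer = "# extra note\n# another note\n" + raw
planets_longer = pd.read_csv(io.StringIO(longer), comment="#")
print(planets.equals(planets_longer))
```

One caveat: comment="#" also cuts off the rest of any data line that contains a # character, so it is only safe if that character never appears in the data itself.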
Solution 2
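In addition to the above answer, if the only problem is with the lines before line 13, you can skip them. Note that header=None makes pandas treat line 13 as a data row; leave it out if that line should supply the column names.
pd.read_csv("planets.csv", skiprows=12, header=None)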
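The effect of header=None can be seen in a small sketch. The two-column sample below is a hypothetical stand-in with two comment lines, so skiprows is 2 here rather than 12.

```python
import io

import pandas as pd

# Hypothetical stand-in: two comment lines, then the header, then data.
raw = "# note one\n# note two\nloc_rowid,pl_hostname\n1,11 Com\n2,11 UMi\n"

# header=None: the column-name line is read as an ordinary data row
# and the columns are labelled 0, 1, ...
as_data = pd.read_csv(io.StringIO(raw), skiprows=2, header=None)

# Default header: the first line after the skipped rows names the columns.
as_header = pd.read_csv(io.StringIO(raw), skiprows=2)

print(as_data.shape, as_data.columns.tolist())
print(as_header.shape, as_header.columns.tolist())
```

With header=None the frame has one extra row, and its first row holds the strings loc_rowid and pl_hostname, so the id column is no longer numeric.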
Solution 3
Looks like you need skiprows. You can skip all the comments.
Ex:
planets = pd.read_csv("planets.csv", sep=',', skiprows=12)
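If you would rather not hard-code 12, one possible approach is to count the leading comment lines first and pass that number to skiprows. This is only a sketch: count_leading_comments is a hypothetical helper, not part of pandas, and the sample text is a made-up stand-in for the real file.

```python
import io
from itertools import takewhile

import pandas as pd


def count_leading_comments(lines, marker="#"):
    """Count the consecutive lines at the start that begin with `marker`."""
    return sum(1 for _ in takewhile(lambda line: line.startswith(marker), lines))


# Hypothetical stand-in for planets.csv with three comment lines.
raw = "# a\n# b\n#\nloc_rowid,pl_hostname\n1,11 Com\n"

n_skip = count_leading_comments(io.StringIO(raw))
planets = pd.read_csv(io.StringIO(raw), sep=",", skiprows=n_skip)
print(n_skip, planets.columns.tolist())
```

For a file on disk, the same helper can be given an open file object, for example inside `with open("planets.csv") as f:`, before calling pd.read_csv on the path.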
Baalateja Kataru
Updated on June 04, 2022
Comments
-
Baalateja Kataru almost 2 years
I'm trying to parse from a .csv file:
planets = pd.read_csv("planets.csv", sep=',')
But I always end up with this error:
ParserError: Error tokenizing data. C error: Expected 1 fields in line 13, saw 9
This is what the first few lines of my CSV file look like:
# This file was produced by the test
# Tue Apr 3 06:03:27 2018
#
# COLUMN pl_hostname: Host Name
# COLUMN pl_discmethod: Discovery Method
# COLUMN pl_pnum: Number of Planets in System
# COLUMN pl_orbper: Orbital Period [days]
# COLUMN pl_orbsmax: Orbit Semi-Major Axis [AU])
# COLUMN st_dist: Distance [pc]
# COLUMN st_teff: Effective Temperature [K]
# COLUMN st_mass: Stellar Mass [Solar mass]
#
loc_rowid,pl_hostname,pl_discmethod,pl_pnum,pl_orbper,pl_orbsmax,st_dist,st_teff,st_mass
1,11 Com,Radial Velocity,1,326.03000000,1.290000,110.62,4742.00,2.70
2,11 UMi,Radial Velocity,1,516.22000000,1.540000,119.47,4340.00,1.80
3,14 And,Radial Velocity,1,185.84000000,0.830000,76.39,4813.00,2.20
4,14 Her,Radial Velocity,1,1773.40000000,2.770000,18.15,5311.00,0.90
5,16 Cyg B,Radial Velocity,1,798.50000000,1.681000,21.41,5674.00,0.99
6,18 Del,Radial Velocity,1,993.30000000,2.600000,73.10,4979.00,2.30
7,1RXS J160929.1-210524,Imaging,1,,330.000000,145.00,4060.00,0.85
Edit: this is line 13:
loc_rowid,pl_hostname,pl_discmethod,pl_pnum,pl_orbper,pl_orbsmax,st_dist,st_teff,st_mass
Edit: Thanks to @Rakesh, skipping the first 12 lines solved the problem:
planets = pd.read_csv("planets.csv", sep=',', skiprows=12)
-
michaelg about 6 years
You probably need to check line 13.
-
in_user about 6 years
You should post the data in line 13; that would give more of a clue.
-
Baalateja Kataru about 6 years
@NEOmen This is line 13: loc_rowid,pl_hostname,pl_discmethod,pl_pnum,pl_orbper,pl_orbsmax,st_dist,st_teff,st_mass
-
OriolAbril about 6 years
What are lines 1-12 then? Do you want to read the info there?
-
Baalateja Kataru about 6 years
@xg.plt.py Those are just comments that contain info about the data in the file.
-
OriolAbril about 6 years
Do they start with a specific character like #?
-
in_user about 6 years
Can you post the first 20 lines of the CSV file? What you say is your line 13 appears to have the same value as line 1 as well.
-
Baalateja Kataru about 6 years
@xg.plt.py Yeah, they start with #
-
Baalateja Kataru about 6 years
That just skips all the lines in the csv file.
-
Rakesh about 6 years
Can you post the first 20 lines in your question?
-
Baalateja Kataru about 6 years
Again, that just skips all the lines in the csv file.
-
Baalateja Kataru about 6 years
Yeah the updated snippet worked like a charm, thanks :)
-
OriolAbril about 6 years
You are welcome. I cannot remember all the options of read_csv and I have to check the documentation frequently, no problem :).