Pandas parsing csv error - expected 1 fields found 9

10,630

Solution 1

The function pandas.read_csv() takes the number of columns and their names from the first line of the file. By default it does not treat any leading lines as comments.

What is happening is that pandas reads the first line, splits it on the separator, and finds only one column, instead of doing this split on line 13, which is the first line that is not a comment. Every later line is then expected to have 1 field, so line 13 with its 9 fields raises the error. To solve this, the argument comment can be used.

planets = pd.read_csv("planets.csv", comment='#')

Compared to using skiprows, this allows the same code to load the planets.csv file even if the number of comment lines varies.
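As a minimal sketch of the behaviour described above, the snippet below parses a shortened, in-memory copy of the file from the question. Using `io.StringIO` in place of `planets.csv` and cutting the sample down to three comment lines are both assumptions made for illustration. With `comment="#"`, pandas skips the fully commented lines, so the header is taken from the first uncommented line.

```python
import io

import pandas as pd

# Shortened stand-in for planets.csv (illustrative, not the full file).
sample = """\
# This file was produced by the test
# COLUMN pl_hostname:    Host Name
#
loc_rowid,pl_hostname,pl_discmethod,pl_pnum,pl_orbper,pl_orbsmax,st_dist,st_teff,st_mass
1,11 Com,Radial Velocity,1,326.03000000,1.290000,110.62,4742.00,2.70
2,11 UMi,Radial Velocity,1,516.22000000,1.540000,119.47,4340.00,1.80
"""

# Lines that start with "#" are ignored, however many there are.
planets = pd.read_csv(io.StringIO(sample), comment="#")

print(planets.columns.tolist())
print(planets.shape)
```

Keep in mind that comment applies to the whole file, so a `#` inside a data value can cut that line short as well. This approach fits best when `#` only ever marks comment lines.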

Solution 2

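In addition to the above answer, if the only problem is the lines before line 13, you can skip them. Note that with header=None pandas treats line 13 (the column names) as a data row and labels the columns 0 to 8; leave header=None out if line 13 should be used as the header.

pd.read_csv("planets.csv", skiprows=12, header=None)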
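To see what header=None changes when it is combined with skiprows, here is a small sketch on a shortened in-memory sample. The sample has only three comment lines instead of the 12 in the original file, and the name `N_COMMENT_LINES` is introduced here purely for illustration.

```python
import io

import pandas as pd

# Shortened stand-in for planets.csv (illustrative, not the full file).
sample = """\
# This file was produced by the test
# COLUMN pl_hostname:    Host Name
#
loc_rowid,pl_hostname,pl_discmethod,pl_pnum,pl_orbper,pl_orbsmax,st_dist,st_teff,st_mass
1,11 Com,Radial Velocity,1,326.03000000,1.290000,110.62,4742.00,2.70
2,11 UMi,Radial Velocity,1,516.22000000,1.540000,119.47,4340.00,1.80
"""

N_COMMENT_LINES = 3  # 12 in the file from the question

# header=None: the column-name line is read as the first data row.
with_header_none = pd.read_csv(
    io.StringIO(sample), skiprows=N_COMMENT_LINES, header=None
)

# Default header: the first line after the skipped ones gives the names.
default_header = pd.read_csv(io.StringIO(sample), skiprows=N_COMMENT_LINES)

print(with_header_none.columns.tolist())  # integer labels
print(with_header_none.iloc[0, 0])        # header text stored as data
print(default_header.columns.tolist())    # names taken from the file
```

With header=None the names end up inside the data, so every column also holds strings rather than numbers.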

Solution 3

Looks like you need skiprows. You can skip all 12 comment lines at the top of the file.

Ex:

planets = pd.read_csv("planets.csv", sep=',', skiprows=12)
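If hard-coding 12 feels fragile, the number of lines to skip can be counted first. The sketch below is one possible way to do that, not part of the original answer: `count_leading_comment_lines` is a hypothetical helper, and the in-memory sample is a shortened stand-in for the real file.

```python
import io

import pandas as pd


def count_leading_comment_lines(lines, prefix="#"):
    """Count the consecutive lines at the start that begin with prefix."""
    count = 0
    for line in lines:
        if not line.startswith(prefix):
            break
        count += 1
    return count


# Shortened stand-in for planets.csv (illustrative, not the full file).
sample = """\
# This file was produced by the test
# COLUMN pl_hostname:    Host Name
#
loc_rowid,pl_hostname,pl_discmethod,pl_pnum,pl_orbper,pl_orbsmax,st_dist,st_teff,st_mass
1,11 Com,Radial Velocity,1,326.03000000,1.290000,110.62,4742.00,2.70
2,11 UMi,Radial Velocity,1,516.22000000,1.540000,119.47,4340.00,1.80
"""

n_skip = count_leading_comment_lines(sample.splitlines())
planets = pd.read_csv(io.StringIO(sample), sep=",", skiprows=n_skip)

print(n_skip)
print(planets.columns.tolist())
```

For a file on disk, the helper can be given the open file object, since iterating over it yields one line at a time.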

Author by

Baalateja Kataru

Updated on June 04, 2022

Comments

  • Baalateja Kataru
    Baalateja Kataru almost 2 years

    I'm trying to parse from a .csv file:

    planets = pd.read_csv("planets.csv", sep=',')
    

    But I always end up with this error:

    ParserError: Error tokenizing data. C error: Expected 1 fields in line 13, saw 9
    

    This is what the first few lines of my csv file look like:

    # This file was produced by the test
    # Tue Apr  3 06:03:27 2018
    #
    # COLUMN pl_hostname:    Host Name
    # COLUMN pl_discmethod:  Discovery Method
    # COLUMN pl_pnum:        Number of Planets in System
    # COLUMN pl_orbper:      Orbital Period [days]
    # COLUMN pl_orbsmax:     Orbit Semi-Major Axis [AU])
    # COLUMN st_dist:        Distance [pc]
    # COLUMN st_teff:        Effective Temperature [K]
    # COLUMN st_mass:        Stellar Mass [Solar mass] 
    #
    loc_rowid,pl_hostname,pl_discmethod,pl_pnum,pl_orbper,pl_orbsmax,st_dist,st_teff,st_mass
    1,11 Com,Radial Velocity,1,326.03000000,1.290000,110.62,4742.00,2.70
    2,11 UMi,Radial Velocity,1,516.22000000,1.540000,119.47,4340.00,1.80
    3,14 And,Radial Velocity,1,185.84000000,0.830000,76.39,4813.00,2.20
    4,14 Her,Radial Velocity,1,1773.40000000,2.770000,18.15,5311.00,0.90
    5,16 Cyg B,Radial Velocity,1,798.50000000,1.681000,21.41,5674.00,0.99
    6,18 Del,Radial Velocity,1,993.30000000,2.600000,73.10,4979.00,2.30
    7,1RXS J160929.1-210524,Imaging,1,,330.000000,145.00,4060.00,0.85
    

    Edit: this is line 13:

    loc_rowid,pl_hostname,pl_discmethod,pl_pnum,pl_orbper,pl_orbsmax,st_dist,st_teff,st_mass
    

    Edit: Thanks to @Rakesh, skipping the first 12 lines solved the problem:

    planets = pd.read_csv("planets.csv", sep=',', skiprows=12)

    • michaelg
      michaelg about 6 years
      You probably need to check line 13.
    • in_user
      in_user about 6 years
      You should post the data in line 13, that would give more clue
    • Baalateja Kataru
      Baalateja Kataru about 6 years
      @NEOmen This is line 13: loc_rowid,pl_hostname,pl_discmethod,pl_pnum,pl_orbper,pl_orbsmax,st_dist,st_teff,st_mass
    • OriolAbril
      OriolAbril about 6 years
      What are lines 1-12 then? Do you want to read the info there?
    • Baalateja Kataru
      Baalateja Kataru about 6 years
      @xg.plt.py Those are just comments that contain info about the data in the file.
    • OriolAbril
      OriolAbril about 6 years
      Do they start with a specific character like #?
    • in_user
      in_user about 6 years
      Can you post the first 20 lines of the csv file? The line you say is line 13 appears to be the same as line 1.
    • Baalateja Kataru
      Baalateja Kataru about 6 years
      @xg.plt.py Yeah they start with #
  • Baalateja Kataru
    Baalateja Kataru about 6 years
    That just skips all the lines in the csv file.
  • Rakesh
    Rakesh about 6 years
    Can you post the first 20 lines in your question?
  • Baalateja Kataru
    Baalateja Kataru about 6 years
    Again, that just skips all the lines in the csv file.
  • Baalateja Kataru
    Baalateja Kataru about 6 years
    Yeah the updated snippet worked like a charm, thanks :)
  • OriolAbril
    OriolAbril about 6 years
    You are welcome, I cannot remember all the options of read_csv and I have to check the documentation frequently, no problem :).