ParserError with panda read_csv

python python-3.x pandas csv

16,012

If you've managed to get the data in a single column, you can use Series.str.split() to workaround this issue.

Here is an example with some sample data you provided (you can use any string or a regex as a delimiter in split()) :

df[0].str.split(' ', expand=True)

     0      1      2      3      4      5      6      7      8      9   \
0  0.270   0.30   0.30   0.70   0.70   None   None   None   None   None
1  4.988  4.988  4.988  4.988  4.988  4.988  4.988  4.988  4.988  4.988

If you do this, then you might as well create the dataframe with pd.DataFrame(open(...).readlines()) or something like that, since you don't benefit at all from read_csv(), and your file isn't exactly a standard csv file.

# f is a StringIO with some of your sample data to simulate a file
df = pd.DataFrame(line.strip().split(' ') for line in f)

       0      1      2      3      4      5      6      7      8      9   \
0   60381      6   None   None   None   None   None   None   None   None
1       1  0.270   0.30   0.30   0.70   0.70   None   None   None   None
2   4.988  4.988  4.988  4.988  4.988  4.988  4.988  4.988  4.988  4.988
3       2  0.078   0.30   0.30   0.70   0.70   None   None   None   None
4   5.387  5.312  5.338  4.463  4.675  4.275  4.238  3.562  3.175  3.925

Of course, you can also fix the input file by making sure every line contains the same number of columns, which will solve the ParserError issue.

16,012

Author by

lanadaquenada

Updated on June 28, 2022

Comments

lanadaquenada almost 2 years

I'm trying to read a txt file that as different number of columns per row. Here's the beginning of my file:

60381 6
1 0.270 0.30 0.30 0.70 0.70
4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988 4.988
2 0.078 0.30 0.30 0.70 0.70
5.387 5.312 5.338 4.463 4.675 4.275 4.238 3.562 3.175 3.925 4.950 4.762
6 0.241 0.30 0.60 0.70 0.40
3.700 3.200 2.738 2.325 1.250 0.975 1.175 1.950 2.488 3.613 3.987 3.950
7 0.357 0.30 0.60 0.70 0.40
1.212 1.125 1.050 0.950 0.663 0.488 0.425 0.512 0.637 0.900 1.112 1.188
8 0.031 0.30 0.70 0.70 0.30
0.225 0.213 0.200 0.175 0.200 0.213 0.375 0.887 0.975 0.512 0.262 0.262
10 0.022 0.30 0.80 0.70 0.20
0.712 0.700 0.738 0.550 0.513 0.688 0.613 0.600 0.850 0.812 0.800 0.775
60382 5
6 0.197 0.30 0.60 0.70 0.40
3.700 3.200 2.738 2.325 1.250 0.975 1.175 1.950 2.488 3.613 3.987 3.950
7 0.413 0.30 0.60 0.70 0.40
1.212 1.125 1.050 0.950 0.663 0.488 0.425 0.512 0.637 0.900 1.112 1.188
8 0.016 0.30 0.70 0.70 0.30
0.225 0.213 0.200 0.175 0.200 0.213 0.375 0.887 0.975 0.512 0.262 0.262
10 0.111 0.30 0.80 0.70 0.20
0.712 0.700 0.738 0.550 0.513 0.688 0.613 0.600 0.850 0.812 0.800 0.775
11 0.263 0.30 0.50 0.70 0.50
1.812 1.388 1.087 0.825 0.538 0.400 0.338 0.400 0.500 0.925 0.962 1.100

I've tried using pandas read_csv to read it:

import pandas as pd
data = pd.read_csv('./myfile.txt',header=None,sep='\s')

Which gives the following error:

ParserError: Expected 6 fields in line 3, saw 12. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.

So my file doesn't have a multi-char delimiter or quotation marks. I've tried a solution for this I found in this forum, which suggested using:

data = pd.read_csv(open('./myfile.txt','r'), header=None,encoding='utf-8', engine='c')

Although this solves the error above, the array I'm presented with does not use space as a delimiter of columns, and the output has only 1 column:

How should I read the file in order to get a column for each value? I don't mind if there are nan values that fill the rest.

Recents

Why Is PNG file with Drop Shadow in Flutter Web App Grainy?

How to troubleshoot crashes detected by Google Play Store for Flutter app

Cupertino DateTime picker interfering with scroll behaviour

Why does awk -F work for most letters, but not for the letter "t"?

Flutter change focus color and icon color but not works

How to print and connect to printer using flutter desktop via usb?

Critical issues have been reported with the following SDK versions: com.google.android.gms:play-services-safetynet:17.0.0

Flutter Dart - get localized country name from country code

navigatorState is null when using pushNamed Navigation onGenerateRoutes of GetMaterialPage

Android Sdk manager not found- Flutter doctor error

Flutter Laravel Push Notification without using any third party like(firebase,onesignal..etc)

How to change the color of ElevatedButton when entering text in TextField

Engines in Python Pandas read_csv

Pandas parsing csv error - expected 1 fields found 9

How to read html table in pandas and output to dataframe not list

Unable to convert JSON file to CSV using Python

python pandas read_csv delimiter in column data

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte, while reading csv file in pandas

Pandas ParserError EOF character when reading multiple csv files to HDF5

pandas.read_csv from string or package data

Pandas: How to read specific rows from a CSV file

Sort CSV by column name

ParserError with panda read_csv

lanadaquenada

Comments

Recents

Related