How can I fix "Error tokenizing data" on pandas csv reader?
Solution 1
I struggled with this for almost half a day. I opened the CSV in Notepad, noticed that the separator is TAB rather than a comma, and then tried the combination below.
df = pd.read_csv('C:\\myfile.csv',sep='\t', lineterminator='\r')
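If you would rather not open the file by hand, the standard library can guess the separator for you. This is only a rough sketch: the `sample` string is made-up data standing in for the first few KB of your file, and the candidate list passed to `delimiters` is my own assumption about what you might encounter.

```python
import csv

# Hypothetical sample standing in for the start of the real file
sample = "id\tname\tcity\n1\tAlice\tTokyo\n2\tBob\tOsaka\n"

# Ask the sniffer to pick among a few common separators
dialect = csv.Sniffer().sniff(sample, delimiters=",\t;|")
print(repr(dialect.delimiter))
```

You can then pass `dialect.delimiter` as `sep` to `pd.read_csv`. The sniffer is heuristic and can guess wrong on short or irregular samples, so treat the result as a hint.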
Solution 2
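Try df = pd.read_csv(file, header=None, error_bad_lines=False). Note that error_bad_lines was deprecated in pandas 1.3 and removed in 2.0; on newer versions use on_bad_lines='skip' instead.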
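For newer pandas versions, a small sketch of the same idea with `on_bad_lines` (available since pandas 1.3) might look like this. The `raw` string is invented sample data, not the asker's file; its third line has one field too many.

```python
import io

import pandas as pd

# Hypothetical sample: line 3 has three fields, the others have two
raw = "a,b\n1,2\n3,4,5\n6,7\n"

# "skip" silently drops lines with too many fields;
# "warn" would drop them but also emit a warning
df = pd.read_csv(io.StringIO(raw), on_bad_lines="skip")
print(df)
```

Keep in mind that skipped lines are lost, so this hides the problem rather than explaining it.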
Solution 3
The existing answer skips the malformed lines, so they will not be included in your dataframe. If you'd like your dataframe to be as wide as its widest row, you can use the following:
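import pandas as pd
delimiter = ','
# The widest row has (most delimiters + 1) fields; assumes no delimiters inside quoted values
with open(path_name, 'r') as f:
    max_columns = max(line.count(delimiter) for line in f) + 1
df = pd.read_csv(path_name, header=None, skiprows=1, names=list(range(max_columns)))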
Keep skiprows=1 only if the file actually has a header row; otherwise remove it. You can always retrieve the header column names later. You can also identify rows that have more columns populated than the number of column names in the original header.
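To find the rows that are wider than the header, something along these lines could work. It is a sketch using the standard `csv` module on invented sample data (`raw`); for a real file you would open `path_name` instead of using `io.StringIO`.

```python
import csv
import io

# Hypothetical sample: the header has 2 columns, one row has 3 fields
raw = "a,b\n1,2\n3,4,5\n6,7\n"

reader = csv.reader(io.StringIO(raw))
header = next(reader)

# reader.line_num is the 1-based number of the line just read
wide_rows = [(reader.line_num, row) for row in reader if len(row) > len(header)]
print(wide_rows)
```

The line numbers it reports should match the ones in pandas' "Expected N fields in line M" message as long as no quoted field spans multiple lines.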
user9191983
Updated on July 09, 2022
Comments
-
user9191983 almost 2 years
I'm trying to read a csv file with pandas.
This file actually has only one row but it causes an error whenever I try to read it.
Something seems to go wrong on line 8, but I can't find an 8th line since the file clearly has only one row.
Here is what I do:
with codecs.open("path_to_file", "rU", "Shift-JIS", "ignore") as file:
    df = pd.read_csv(file, header=None, sep="\t")
df
Then I get:
ParserError: Error tokenizing data. C error: Expected 1 fields in line 8, saw 3
I don't understand what's really going on, so any advice would be appreciated.
-
user9191983 over 5 years
Thanks so much for your comment, Po Xin. I've tried that and got another error like this:
ParserError: Error tokenizing data. C error: Buffer overflow caught - possible malformed input file.
-
M. Mariscal about 4 years
How can I stop these errors from being shown in the terminal?