Error tokenizing data. C error: out of memory pandas python, large file csv


Solution 1

Try reading the file in chunks and concatenating them at the end:

import pandas as pd

mylist = []

# read the file in 20,000-row chunks instead of all at once
for chunk in pd.read_csv('train_2011_2012_2013.csv', sep=';', chunksize=20000):
    mylist.append(chunk)

# stitch the chunks back into a single DataFrame, then free the list
big_data = pd.concat(mylist, axis=0)
del mylist
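
If the final concat itself still runs out of memory, a variation on the same idea is to process each chunk as it is read instead of keeping them all. A minimal sketch, where the per-chunk work (counting non-null values) is purely an illustrative assumption:

import pandas as pd

totals = None

for chunk in pd.read_csv('train_2011_2012_2013.csv', sep=';', chunksize=20000):
    # illustrative per-chunk work: keep only a running aggregate,
    # so at most one chunk is held in memory at a time
    counts = chunk.count()
    totals = counts if totals is None else totals.add(counts, fill_value=0)

print(totals)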

Solution 2

You may try setting error_bad_lines=False when calling read_csv, i.e.

import pandas as pd

# skip malformed lines instead of raising a tokenizing error
df = pd.read_csv('my_big_file.csv', error_bad_lines=False)
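
Note that error_bad_lines is deprecated in recent pandas releases (1.3+) in favour of on_bad_lines. A minimal equivalent sketch, reusing the same placeholder file name:

import pandas as pd

# newer pandas: skip lines the parser cannot tokenize instead of raising
df = pd.read_csv('my_big_file.csv', on_bad_lines='skip')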

Solution 3

This error can also be caused by the huge chunksize=20000000. Decreasing it fixed the issue in my case. In ℕʘʘḆḽḘ's solution the chunksize is also much smaller, which might be what did the trick.
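
As a concrete illustration, here is the question's original call with only the chunksize reduced; the 200,000-row figure is an arbitrary assumption, not a value from the answer:

import pandas as pd

# 200,000 rows per chunk instead of 20,000,000 keeps each chunk small
tp = pd.read_csv('train_2011_2012_2013.csv', sep=';', iterator=True,
                 chunksize=200000, low_memory=False)
df = pd.concat(tp, ignore_index=True)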

Author: Amal Kostali Targhi

Updated on July 05, 2022

Comments

  • Amal Kostali Targhi, almost 2 years

    I have a large CSV file of 3.5 GB and I want to read it using pandas.

    This is my code:

    import pandas as pd
    tp = pd.read_csv('train_2011_2012_2013.csv', sep=';', iterator=True, chunksize=20000000, low_memory = False)
    df = pd.concat(tp, ignore_index=True)
    

    I get this error:

    pandas/parser.pyx in pandas.parser.TextReader.read (pandas/parser.c:8771)()
    pandas/parser.pyx in pandas.parser.TextReader._read_rows (pandas/parser.c:9731)()
    pandas/parser.pyx in pandas.parser.TextReader._tokenize_rows (pandas/parser.c:9602)()
    pandas/parser.pyx in pandas.parser.raise_parser_error (pandas/parser.c:23325)()
    CParserError: Error tokenizing data. C error: out of memory
    

    My machine has 8 GB of RAM.

  • Amal Kostali Targhi, over 7 years
    Thanks for your help, but big_data = pd.concat(mylist, axis=0) fails with a MemoryError, raised from out = np.empty(out_shape, dtype=dtype) inside pandas' internals.
  • hzitoun, almost 6 years
    Loaded a 3 GB CSV successfully! Thanks!
  • Mark Melgo, about 5 years
    If it is already answered in ℕʘʘḆḽḘ's solution, then just leave this as a comment. There is no need to post it as an answer.
  • Justas, about 5 years
    I wanted to do that but didn't have enough reputation. I just wanted to leave this info for future reference; I couldn't find it when I was googling this error.
  • Kokokoko, over 4 years
    Just came across this. Perfect!
  • Hasnu zama, almost 4 years
    I am reading two big CSV files one after another and this is not working. Any suggestions, please? My CSV is 980 MB.