The pd.read_csv documentation describes specific differences between the 'c' (default) and 'python' engines. The names indicate the language in which each parser is written. Specifically, the docs note:

Where possible pandas uses the C parser (specified as engine='c'), but may fall back to Python if C-unsupported options are specified.
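
The fallback described in that passage can be observed directly. The following is a minimal sketch, not taken from the docs: the sample data and variable names are made up for illustration, and it assumes a pandas version in which a multi-character separator is treated as a regex and triggers a ParserWarning when no engine is specified.

```python
import io
import warnings

import pandas as pd

# Hypothetical sample data using a two-character separator.
data = "a;;b\n1;;2\n3;;4\n"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # No engine is given, so pandas starts from the C parser and
    # falls back to the Python parser for the regex separator.
    df = pd.read_csv(io.StringIO(data), sep=";;")

# True if pandas reported the fallback via a ParserWarning.
fallback_warned = any(
    issubclass(w.category, pd.errors.ParserWarning) for w in caught
)
print(fallback_warned)
print(df)
```

Passing engine='python' explicitly in the same call should avoid the warning, since no fallback is needed.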

Here are the main differences you should note (as of v0.23.4):

  • 'c' is faster, while 'python' is currently more feature-complete.
  • 'python' supports skipfooter, while 'c' does not.
  • 'python' supports flexible sep values other than a single character (including regex), while 'c' does not.
  • 'python' supports sep=None with delim_whitespace=False, which means it can auto-detect a delimiter, while 'c' does not.
  • 'c' supports float_precision, while 'python' does not (or does not need it).
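
As a rough illustration of the list above, the sketch below exercises one option per bullet. The sample strings and variable names are invented for the example, and behaviour may differ in other pandas versions.

```python
import io

import pandas as pd

# skipfooter: drop a trailing summary line (needs the Python engine).
text_with_footer = "x,y\n1,2\n3,4\ntotal rows: 2\n"
df_footer = pd.read_csv(
    io.StringIO(text_with_footer), skipfooter=1, engine="python"
)

# sep=None: let the Python engine sniff the delimiter (here ';').
semicolon_text = "name;score\nann;1\nbob;2\n"
df_sniffed = pd.read_csv(
    io.StringIO(semicolon_text), sep=None, engine="python"
)

# float_precision: choose the C engine's float converter.
float_text = "value\n0.1\n0.25\n"
df_float = pd.read_csv(
    io.StringIO(float_text), engine="c", float_precision="round_trip"
)

print(df_footer)
print(df_sniffed)
print(df_float)
```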

Version notes:

  • dtype supported in 'python' v0.20.0+.
  • delim_whitespace supported in 'python' v0.18.1+.

Note the above may change as features are developed. You should check IO Tools (Text, CSV, HDF5, …) if you see unexpected behaviour in later versions.


Updated on October 01, 2022


    PUNEET AGARWAL about 2 months

    In the documentation for the pd.read_csv() method in pandas, the description of the "sep" parameter mentions engines such as the C engine and the Python engine.

    The documentation link is: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html

    What are these engines? What is the role of each engine? Is there any analogy which can help understand these engines better?

  • seralouk over 3 years
    For a 1.2 GB CSV file, engine='python' is much faster than 'c'. Why is that?
  • jpp over 3 years
    @serafeim, without your CSV file, it's difficult to tell. Perhaps there is specific content or a combination of arguments for which engine='python' is more efficient. Generally, though, 'c' is more efficient while 'python' is more feature-complete.
  • seralouk over 3 years
    Here is the file: filebin.net/fkyil2m5yhvr1dbh. Any tips would be great. 'c' takes forever, whereas 'python' is faster.