csv with different encoding


Solution 1

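The solution for now (Python 2):

import csv

def reencode(file):
    # Decode each line from windows-1250 and re-encode it as UTF-8 for csv.reader
    for line in file:
        yield line.decode('windows-1250').encode('utf-8')

# Open in binary mode so the raw bytes reach reencode() unchanged
csv_reader = csv.reader(reencode(open(filepath, 'rb')), delimiter=";", quotechar='"')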

Solution 2

See the Examples section of the csv module documentation. At the end are classes written for exactly this purpose (UnicodeReader and UnicodeWriter in the Python 2 documentation), which take the encoding as an argument.

Solution 3

Pass a file object opened with codecs.open, specifying the encoding. Encodings cannot be detected automatically, or at least not reliably. To guess the encoding you can use chardet.
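A minimal sketch of guessing by trial instead of with chardet, assuming Python 3 and that the only candidates are UTF-8 and windows-1250. The function name read_csv_guessing_encoding and the candidate list are hypothetical, chosen for illustration. UTF-8 is tried first because windows-1250 accepts almost any byte sequence, so it is only useful as the last fallback; this approach cannot tell iso-8859-2 and windows-1250 apart.

```python
import csv
import io


def read_csv_guessing_encoding(raw_bytes, encodings=("utf-8", "windows-1250")):
    """Decode raw_bytes with the first candidate encoding that works,
    then parse the text as semicolon-delimited CSV.

    Returns a (encoding, rows) tuple. The candidate list is an assumption;
    put the strictest encoding first.
    """
    for encoding in encodings:
        try:
            text = raw_bytes.decode(encoding)
        except UnicodeDecodeError:
            continue  # not valid in this encoding, try the next one
        # newline="" leaves line endings untouched so csv can handle them itself
        reader = csv.reader(io.StringIO(text, newline=""),
                            delimiter=";", quotechar='"')
        return encoding, list(reader)
    raise ValueError("none of the candidate encodings could decode the data")
```

With an uploaded file, the bytes would typically come from something like f.read() on a file opened in binary mode.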

Author by

Tomasz Brzezina

Updated on June 04, 2022

Comments

  • Tomasz Brzezina
    Tomasz Brzezina almost 2 years

    Possible Duplicate:
    Open a file in the proper encoding automatically

    My code:

    import csv
    
    def handle_uploaded_file(f):
        dataReader = csv.reader(f, delimiter=';', quotechar='"')
    
        for row in dataReader:
            do_sth(row)  # placeholder for the per-row processing
    

    The problem is that it works well only if the CSV is UTF-8 encoded. What should I change to handle iso-8859-2 or windows-1250 encoding? (The best solution would be to recognize the encoding automatically, but converting by hand is also acceptable.)

  • Admin
    Admin over 6 years
    This is not the correct answer. From the csv documentation: Since open() is used to open a CSV file for reading, the file will by default be decoded into unicode using the system default encoding (see locale.getpreferredencoding()). To decode a file using a different encoding, use the encoding argument of open:
  • Max Candocia
    Max Candocia over 6 years
    I was able to open the file using "with open(filename, 'r', encoding='latin-1') as f:" and it fixed the encoding errors I was getting. A standard list of encodings can be found here: docs.python.org/3/library/codecs.html#standard-encodings
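The approach from the last two comments might look like the sketch below, assuming Python 3. The function name read_rows and the default encoding are hypothetical; the delimiter and quote character are taken from the question.

```python
import csv


def read_rows(path, encoding="windows-1250"):
    """Read a semicolon-delimited CSV file stored in the given encoding."""
    # newline="" is what the csv module recommends when opening files for it
    with open(path, "r", encoding=encoding, newline="") as f:
        return list(csv.reader(f, delimiter=";", quotechar='"'))
```

For an iso-8859-2 file, the call would be read_rows(path, encoding="iso-8859-2").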