How can I use io.StringIO() with the csv module?


Solution 1

The Python 2.7 csv module doesn't support Unicode input: see the note at the beginning of the documentation.

It seems that you'll have to encode the Unicode strings to byte strings, and use io.BytesIO, instead of io.StringIO.

The examples section of the documentation includes UnicodeReader and UnicodeWriter wrapper classes (thanks @AlexeyKachayev for the pointer).
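As a minimal sketch of that advice: on Python 2, each Unicode cell is encoded to UTF-8 and written to an io.BytesIO, while on Python 3 (where the original program came from) io.StringIO works directly. The helper name `rows_to_csv_text` is hypothetical and not part of the csv module, and the sketch assumes every cell is already a Unicode string.

```python
import csv
import io
import sys

PY2 = sys.version_info[0] == 2


def rows_to_csv_text(rows):
    """Return the rows as one CSV-formatted Unicode string (hypothetical helper)."""
    if PY2:
        # Python 2: csv writes byte strings, so use a byte buffer and
        # encode every Unicode cell before handing it to the writer.
        buf = io.BytesIO()
        writer = csv.writer(buf)
        for row in rows:
            writer.writerow([cell.encode("utf-8") for cell in row])
        return buf.getvalue().decode("utf-8")
    # Python 3: csv works with text, so io.StringIO is the right buffer.
    buf = io.StringIO(newline="")
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


text = rows_to_csv_text([[u"Hello", u"Goodbye"], [u"na\xefve", u"caf\xe9"]])
print(repr(text))
```

Decoding the buffer at the end keeps the caller working with Unicode on both versions; if the bytes are going straight into a file or an HTTP response, the Python 2 branch could return `buf.getvalue()` unchanged instead.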

Solution 2

Please use StringIO.StringIO().

http://docs.python.org/library/io.html#io.StringIO

http://docs.python.org/library/stringio.html

io.StringIO is a class. It handles Unicode. It reflects the preferred Python 3 library structure.

StringIO.StringIO is a class. It handles strings. It reflects the legacy Python 2 library structure.
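For code that has to run on both interpreters, one common pattern (a sketch, not something the linked pages prescribe) is to import whichever class accepts the interpreter's native str type and hand that to csv. The variable names below are illustrative only.

```python
import csv

try:
    from StringIO import StringIO  # Python 2: buffer that accepts byte strings
except ImportError:
    from io import StringIO        # Python 3: buffer that accepts text (str)

output = StringIO()
writer = csv.writer(output)
writer.writerow(["Hello", "Goodbye"])  # native str literals on both versions
csv_text = output.getvalue()
print(repr(csv_text))
```

This only sidesteps the type mismatch for native strings. On Python 2, non-ASCII Unicode cells still need to be encoded before they reach csv.writer.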

Solution 3

I found this when I tried to serve a CSV file via Flask directly, without creating the CSV file on the file system. This works on Python 2.7:

import io
import csv

data = [[u'cell one', u'cell two'], [u'cell three', u'cell four']]

output = io.BytesIO()
writer = csv.writer(output, delimiter=',')
writer.writerows(data)
your_csv_string = output.getvalue()


Solution 4

From the csv documentation:

The csv module doesn’t directly support reading and writing Unicode, but it is 8-bit-clean save for some problems with ASCII NUL characters. So you can write functions or classes that handle the encoding and decoding for you as long as you avoid encodings like UTF-16 that use NULs. UTF-8 is recommended.

You can find examples of UnicodeReader and UnicodeWriter here: http://docs.python.org/2/library/csv.html
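The reading side of that idea can be sketched as follows. This is not the UnicodeReader class from the documentation, only a small illustration of the same encode/decode approach; the function name `read_utf8_csv` is hypothetical, and the input is assumed to be UTF-8 encoded bytes.

```python
import csv
import io
import sys

PY2 = sys.version_info[0] == 2


def read_utf8_csv(data):
    """Parse UTF-8 encoded CSV bytes into rows of Unicode strings (hypothetical helper)."""
    if PY2:
        # Python 2: csv reads byte strings, so decode each cell afterwards.
        reader = csv.reader(io.BytesIO(data))
        return [[cell.decode("utf-8") for cell in row] for row in reader]
    # Python 3: decode the byte stream first, then let csv read text.
    text_stream = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8", newline="")
    return list(csv.reader(text_stream))


rows = read_utf8_csv(u"a,b\r\ncaf\xe9,na\xefve\r\n".encode("utf-8"))
print(rows)
```

UTF-8 fits here for the reason the quoted note gives: it never produces NUL bytes for non-NUL characters, so the Python 2 csv module can pass the encoded data through untouched.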

Solution 5

To use the CSV reader/writer with in-memory files in Python 2.7:

from io import BytesIO
import csv

csv_data = """a,b,c
foo,bar,foo"""

# creates and stores your csv data into a file the csv reader can read (bytes)
memory_file_in = BytesIO(csv_data.encode(encoding='utf-8'))

# classic reader
reader = csv.DictReader(memory_file_in)

# writes a csv file
fieldnames = reader.fieldnames  # here we use the data from the above csv file
memory_file_out = BytesIO()     # create a memory file (bytes)

# classic writer (here we copy the first file in the second file)
writer = csv.DictWriter(memory_file_out, fieldnames)
for row in reader:
    print(row)
    writer.writerow(row)
Author: Tim Pietzcker


Updated on June 15, 2020

Comments

  • Tim Pietzcker almost 4 years

    I tried to backport a Python 3 program to 2.7, and I'm stuck with a strange problem:

    >>> import io
    >>> import csv
    >>> output = io.StringIO()
    >>> output.write("Hello!")            # Fail: io.StringIO expects Unicode
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unicode argument expected, got 'str'
    >>> output.write(u"Hello!")           # This works as expected.
    6L
    >>> writer = csv.writer(output)       # Now let's try this with the csv module:
    >>> csvdata = [u"Hello", u"Goodbye"]  # Look ma, all Unicode! (?)
    >>> writer.writerow(csvdata)          # Sadly, no.
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unicode argument expected, got 'str'
    

    According to the docs, io.StringIO() returns an in-memory stream for Unicode text. It works correctly when I try and feed it a Unicode string manually. Why does it fail in conjunction with the csv module, even if all the strings being written are Unicode strings? Where does the str come from that causes the Exception?

    (I do know that I can use StringIO.StringIO() instead, but I'm wondering what's wrong with io.StringIO() in this scenario)