How to write UTF-8 in a CSV file
Solution 1
It's very simple for Python 3.x (docs).
import csv
with open('output_file_name', 'w', newline='', encoding='utf-8') as csv_file:
    writer = csv.writer(csv_file, delimiter=';')
    # writerow expects a sequence of fields; a bare string would be split into single characters
    writer.writerow(['my_utf8_string'])
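As a slightly fuller sketch of the same approach, the snippet below writes a few rows containing non-ASCII text and reads them back to check the round trip. The file name people.csv and the sample rows are made up for illustration. Each row is passed as a list, since the writer expects a sequence of fields.

```python
import csv
import os
import tempfile

# Hypothetical sample data; any Unicode text should behave the same way.
rows = [
    ['name', 'city'],
    ['Zoë', 'Kraków'],
    ['José', 'São Paulo'],
]

# Illustrative path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'people.csv')

# newline='' lets the csv module control line endings;
# encoding='utf-8' makes the file object encode the text as UTF-8.
with open(path, 'w', newline='', encoding='utf-8') as csv_file:
    writer = csv.writer(csv_file, delimiter=';')
    writer.writerows(rows)

# Read the file back with the same encoding and delimiter.
with open(path, newline='', encoding='utf-8') as csv_file:
    read_back = list(csv.reader(csv_file, delimiter=';'))
```

If the round trip works, read_back should hold the same lists of strings as rows.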
For Python 2.x, look here.
Solution 2
From your shell run:
pip2 install unicodecsv
And (unlike the original question) presuming you're using Python's built-in csv module, turn
import csv
into
import unicodecsv as csv
in your code.
Solution 3
Use this package, it just works: https://github.com/jdunck/python-unicodecsv.
Solution 4
For me, the UnicodeWriter class from the Python 2 CSV module documentation didn't really work, as it breaks the csv.writer.writerow() interface.
For example:
csv_writer = csv.writer(csv_file)
row = ['The meaning', 42]
csv_writer.writerow(row)
works, while:
csv_writer = UnicodeWriter(csv_file)
row = ['The meaning', 42]
csv_writer.writerow(row)
will throw AttributeError: 'int' object has no attribute 'encode'.
As UnicodeWriter obviously expects all column values to be strings, we can convert the values ourselves and just use the default CSV module:
def to_utf8(lst):
    return [unicode(elem).encode('utf-8') for elem in lst]
...
csv_writer.writerow(to_utf8(row))
Or we can even monkey-patch csv_writer to add a write_utf8_row function; the exercise is left to the reader.
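One possible take on that exercise is sketched below. It targets Python 3, which is an assumption, since the answer above is written for Python 2. In Python 3 the file object does the encoding, so the row method only needs to turn every value into text. The class name Utf8RowWriter is hypothetical, and a small wrapper is used instead of patching the writer object because, as far as I can tell, the object returned by csv.writer does not accept new attributes in CPython.

```python
import csv
import io


class Utf8RowWriter:
    """Hypothetical wrapper that adds write_utf8_row to a csv writer."""

    def __init__(self, csv_file, **kwargs):
        self._writer = csv.writer(csv_file, **kwargs)

    def write_utf8_row(self, row):
        # Convert every value to text first, in the spirit of to_utf8.
        # The file object, not this method, performs the UTF-8 encoding.
        self._writer.writerow([str(elem) for elem in row])

    def writerow(self, row):
        # Keep the plain csv.writer interface available as well.
        self._writer.writerow(row)


# Wrap a bytes buffer in a UTF-8 text layer to stand in for a real file.
raw = io.BytesIO()
text_file = io.TextIOWrapper(raw, encoding='utf-8', newline='')

csv_writer = Utf8RowWriter(text_file)
csv_writer.write_utf8_row(['The meaning', 42])
csv_writer.write_utf8_row(['Grüße', 3.5])
text_file.flush()

# Encoded bytes as they would land in the file.
data = raw.getvalue()
```

Keeping writerow on the wrapper means existing calls continue to work, which was the original complaint about UnicodeWriter.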
Solution 5
The examples in the Python documentation show how to write Unicode CSV files: http://docs.python.org/2/library/csv.html#examples
(can't copy the code here because it's protected by copyright)
Martin
Updated on January 30, 2020
Comments
-
Martin over 4 years
I am trying to create a text file in CSV format out of a PyQt4 QTableWidget. I want to write the text with a UTF-8 encoding because it contains special characters. I use the following code:
import codecs
...
myfile = codecs.open(filename, 'w','utf-8')
...
f = result.table.item(i,c).text()
myfile.write(f+";")
It works until the cell contains a special character. I also tried with
myfile = open(filename, 'w')
...
f = unicode(result.table.item(i,c).text(), "utf-8")
But it also stops when a special character appears. I have no idea what I am doing wrong.
-
Mutant over 8 years
Thanks for the link. It was helpful. For my knowledge: even though you have posted the link, you can't copy and paste the code here? (+1 for honoring the copyright)
-
Aaron Digulla over 8 years
@Mutant: Code isn't like scientific papers. Code is protected by copyright. While I'm 99.999% sure that the Python owners wouldn't sue SO for copying their code, I didn't feel like reading their lengthy license to find out whether it's allowed or not. Also, it's good to remind people once in a while that "I can see it on my monitor" != "I can do whatever I want with it" :-)
-
Mutant over 8 years
Thanks for the reminder. Unfortunately, the world we live in has become so (unreasonably) fast and careless, with information flowing faster than one can imagine, that it does require a reminder once in a while about the restrictions that matter. Thanks for that :)
-
Suzana over 8 years
It didn't work just by replacing the import; I also had to add the encoding when creating the writer: writer = csv.writer(out, dialect='excel', encoding='utf-8'), and create the file handle with open(...), not codecs.open(...).
-
Charles Chow almost 8 years
I tried all suggestions on StackOverflow and only this one works for me.
-
alttag over 6 years
The docs link is semi-useful (examples are better), but the "copyright" argument here is overblown and asinine. Python is explicitly open source (v2 v3). The license is clear: "royalty-free, world-wide license to reproduce, analyze, test, perform and/or display publicly, prepare derivative works, distribute ... [etc., etc.]" Even the simple phrase at the top of the page, "GPL-compatible" should give you comfort. Share open source stuff. Even modify it if you want to. It's open source for a reason.
-
Aaron Digulla over 6 years
@alttag Copying or using GPLd code in a project means that all the other code in the same project is now under GPL as well. Since I'm not a copyright lawyer, I don't know what that means with regards to code published on a web site.
-
CKM almost 6 years
What if the content passed to writerow is not UTF-8? Will it work?
-
Vaibhav Vishal almost 5 years
Great, no need for third-party pip installs.
-
khan over 3 years
Much simpler solution for py2.x for those of us still stuck with using it.
-
Ricky Levi about 2 years
I'm not using a file, I'm using sys.stdout, so how can the content be UTF-8 in that case?
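For the sys.stdout case in Python 3, one option (a sketch, not the only way) is to wrap the underlying binary stream in a UTF-8 text layer and hand that to the csv writer. The helper name write_rows_utf8 and the sample rows are hypothetical, and a BytesIO buffer stands in for the real stream so the result can be inspected.

```python
import csv
import io


def write_rows_utf8(binary_stream, rows):
    """Hypothetical helper: write rows as UTF-8 CSV to a binary stream."""
    text_stream = io.TextIOWrapper(binary_stream, encoding='utf-8', newline='')
    csv.writer(text_stream, delimiter=';').writerows(rows)
    text_stream.flush()
    # Detach so the underlying stream stays open after the wrapper goes away.
    text_stream.detach()


# A bytes buffer stands in for a real binary output stream here.
buffer = io.BytesIO()
write_rows_utf8(buffer, [['Zoë', 'Kraków'], ['José', 42]])
encoded = buffer.getvalue()
```

To target standard output, pass sys.stdout.buffer as the stream. On Python 3.7 and later, calling sys.stdout.reconfigure(encoding='utf-8') before creating the csv writer on sys.stdout should be another route.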