How to read Unicode file as Unicode string in Python
The term "Unicode" refers to the standard, not to a particular encoding. Because files store bytes, Unicode text has to be encoded before it is written to disk, and there are several ways of doing so. One of them is "UTF-8"; others include UTF-16 and UTF-32.
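To make the distinction concrete, here is a small sketch (the sample string and variable names are only illustrative) showing the same Unicode text turned into two different byte sequences, each of which decodes back to the same string:

```python
# The same Unicode text, encoded two different ways.
text = "héllo €"

utf8_bytes = text.encode("utf-8")       # 1 to 3 bytes per character here
utf16_bytes = text.encode("utf-16-le")  # 2 bytes per character here

print(len(text), len(utf8_bytes), len(utf16_bytes))

# The byte sequences differ, but both decode to the identical str.
print(utf8_bytes != utf16_bytes)
print(utf8_bytes.decode("utf-8") == utf16_bytes.decode("utf-16-le") == text)
```

This is why Python needs to be told which encoding a file uses: the bytes alone do not say how they should be interpreted.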
For details, you can consult the Python Unicode HOWTO: https://docs.python.org/3/howto/unicode.html
An example taken from that document (in the section "Reading and Writing Unicode Data"):
    with open('unicode.txt', encoding='utf-8') as f:
        for line in f:
            print(repr(line))
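In Python 3, unlike Python 2, every string literal is already a Unicode string, so the "u" prefix is not needed. (Python 3.3 and later still accept the prefix, but it has no effect.) A file opened in text mode therefore already yields Unicode strings; what matters is passing the right encoding to open().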
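If you do not know whether the file is UTF-8 or UTF-16, one possible approach is to try a short list of candidate encodings in order. The sketch below is only an illustration: `read_text` is a hypothetical helper, not a standard library function, and the list of candidates is an assumption that you may need to adjust for your data. The demo writes a temporary UTF-16 file so that the example is self-contained.

```python
import os
import tempfile


def read_text(path, encodings=("utf-8-sig", "utf-16")):
    """Hypothetical helper: return the file contents as str,
    trying each candidate encoding in turn."""
    last_error = None
    for enc in encodings:
        try:
            with open(path, encoding=enc) as f:
                return f.read()
        except UnicodeError as err:
            # This encoding does not fit the bytes; try the next one.
            last_error = err
    raise last_error


# Demo: write a UTF-16 file (with a byte order mark), then read it back.
sample = "naïve café ☕\n"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "unicode.txt")
    with open(path, "w", encoding="utf-16") as f:
        f.write(sample)
    text = read_text(path)

print(type(text), repr(text))
```

"utf-8-sig" is tried first because it reads UTF-8 with or without a leading byte order mark, and UTF-16 data that starts with a byte order mark is not valid UTF-8, so the first attempt fails cleanly and the fallback is used. Guessing is never fully reliable, so when you know the real encoding, pass it to open() directly.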
Author by
Melab
Updated on June 04, 2022
Comments
-
Melab almost 2 years
I have a file that is encoded in Unicode or UTF-8 (I don't know which). When I read the file in Python 3.4, the resulting string is interpreted as an ASCII string. How do I convert it to a Unicode string like u"text"?