How to read Unicode file as Unicode string in Python


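The term "Unicode" refers to the standard, not to a particular encoding. Because files are stored as bytes, Unicode text has to be encoded before it is written, and there are several encodings to choose from. UTF-8 is one of them, so a file is never "Unicode or UTF-8": it is Unicode text stored in some encoding, very often UTF-8.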
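
To make the difference between the standard and an encoding concrete, here is a minimal sketch showing one piece of text turned into two different byte sequences. The sample character is only an illustration and is not taken from the question.

```python
# One Unicode character: U+00E9 (LATIN SMALL LETTER E WITH ACUTE).
text = "\u00e9"

# The same text produces different bytes under different encodings.
utf8_bytes = text.encode("utf-8")        # two bytes: 0xC3 0xA9
utf16_bytes = text.encode("utf-16-le")   # two bytes: 0xE9 0x00

print(utf8_bytes, utf16_bytes)

# Decoding with the matching encoding recovers the original text.
assert utf8_bytes.decode("utf-8") == text
assert utf16_bytes.decode("utf-16-le") == text
```

Reading a file is the reverse step: the bytes on disk only become a Unicode string once they are decoded with the encoding that was used to write them.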

For background, see the Python Unicode HOWTO: https://docs.python.org/3/howto/unicode.html

An example taken from that document (in the section "Reading and Writing Unicode Data"):

# Passing encoding= makes open() decode the file's bytes into str (Unicode) objects.
with open('unicode.txt', encoding='utf-8') as f:
    for line in f:
        print(repr(line))
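
If you do not know which encoding the file uses, one possible approach is to try a short list of candidates in order. The sketch below is only an illustration: `read_text` is a hypothetical helper, the candidate list is an assumption, and the temporary file exists only so the example runs on its own.

```python
import os
import tempfile


def read_text(path, encodings=("utf-8-sig", "utf-16")):
    """Hypothetical helper: return the file's text using the first
    candidate encoding that decodes without an error."""
    for enc in encodings:
        try:
            with open(path, encoding=enc) as f:
                return f.read()
        except UnicodeError:
            # This candidate could not decode the bytes; try the next one.
            continue
    raise ValueError(f"none of {encodings} could decode {path}")


# Demo: write a small UTF-16 file, then read it back without naming its encoding.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w", encoding="utf-16") as f:
        f.write("h\u00e9llo")
    result = read_text(path)

print(repr(result))
```

Keep in mind that a wrong encoding can sometimes decode without raising an error and silently produce garbage, so this kind of guessing is a fallback. If you can find out how the file was written, pass that encoding explicitly.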

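In Python 3, unlike Python 2, every string is a Unicode string, so string literals are not written with a "u" prefix. Each line read from a file opened in text mode is already a Unicode str; there is nothing further to convert.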
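
A short sketch of what this means in practice, assuming Python 3.3 or later (where the "u" prefix is accepted again but has no effect). The sample word is only an illustration.

```python
# In Python 3, a plain string literal is already a Unicode str.
s = "caf\u00e9"
assert isinstance(s, str)

# The u prefix is allowed for compatibility and changes nothing.
assert u"caf\u00e9" == s

# Bytes are a separate type; they appear when a file is opened in binary mode.
raw = s.encode("utf-8")
assert isinstance(raw, bytes)

# Four characters, but five bytes in UTF-8 (the accented letter takes two).
print(len(s), len(raw))

# Decoding the bytes gives back a str equal to the original.
decoded = raw.decode("utf-8")
assert decoded == s
```

If you see a bytes object such as b'...' instead of text, the file was probably opened in binary mode ('rb'); opening it in text mode with an explicit encoding, or calling decode() on the bytes, gives you a str.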

Author by

Melab

Updated on June 04, 2022

Comments

  • Melab, almost 2 years ago

    I have a file that is encoded in Unicode or UTF-8 (I don't know which). When I read the file in Python 3.4, the resulting string is interpreted as an ASCII string. How do I convert it to a Unicode string like u"text"?