String in Python with my Unicode?
Solution 1
There is nothing wrong with your string! You have just confused encode() and decode(). The string is made of meaningful symbols. To turn it into bytes that can be stored in a file or transmitted over the Internet, use encode() with an encoding like UTF-8. Each encoding is a scheme for converting meaningful symbols to flat bytes of output.
When the time comes to do the opposite, to take some raw bytes from a file or a socket and turn them into symbols like letters and numbers, you decode the bytes using the decode() method of bytestrings in Python 3.
>>> str_version = 'នយោបាយ'
>>> str_version.encode('utf-8')
b'\xe1\x9e\x93\xe1\x9e\x99\xe1\x9f\x84\xe1\x9e\x94\xe1\x9e\xb6\xe1\x9e\x99'
See that big long line of bytes? Those are the bytes that UTF-8 uses to represent your string when you need to transmit it over a network or store it in a file. There are many other encodings in use, but UTF-8 seems to be the most popular. Each encoding can turn meaningful symbols like ន and យោ into bytes, the little 8-bit numbers with which computers communicate.
>>> rawbytes = str_version.encode('utf-8')
>>> rawbytes
b'\xe1\x9e\x93\xe1\x9e\x99\xe1\x9f\x84\xe1\x9e\x94\xe1\x9e\xb6\xe1\x9e\x99'
>>> rawbytes.decode('utf-8')
'នយោបាយ'
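To make the "each encoding is a scheme" point concrete, here is a small sketch (the variable names are just for illustration) that encodes the same string with two different encodings and then decodes each result back. It assumes Python 3 and also shows what happens when bytes are decoded with a codec that does not match.

```python
text = 'នយោបាយ'

# The same symbols become different bytes under different encodings.
utf8_bytes = text.encode('utf-8')       # 3 bytes per Khmer code point
utf16_bytes = text.encode('utf-16-le')  # 2 bytes per Khmer code point
print(len(text), len(utf8_bytes), len(utf16_bytes))

# Decoding with the matching encoding gives back the original string.
roundtrip_utf8 = utf8_bytes.decode('utf-8')
roundtrip_utf16 = utf16_bytes.decode('utf-16-le')
print(roundtrip_utf8 == text, roundtrip_utf16 == text)

# Decoding with an encoding that cannot represent these bytes fails loudly.
try:
    utf8_bytes.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError as exc:
    ascii_failed = True
    print('cannot decode as ASCII:', exc.reason)
```

The practical rule is that whoever decodes the bytes has to know which encoding produced them.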
Solution 2
You're reading the 2.x docs. str.decode() (and bytes.encode()) were dropped in 3.x, and str is already a Unicode string; there's no need to decode it.
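A quick way to check this on your own interpreter, assuming Python 3, is to ask each type which methods it has. The sketch below also shows the constructor forms bytes(text, encoding) and str(raw, encoding), which do the same conversions as encode() and decode().

```python
text = 'នយោបាយ'            # str: already Unicode in Python 3
raw = text.encode('utf-8')  # bytes: the encoded form

# str can only be encoded; bytes can only be decoded.
str_has_decode = hasattr(text, 'decode')
bytes_has_encode = hasattr(raw, 'encode')
print(str_has_decode, bytes_has_encode)

# Constructor equivalents of encode() and decode().
same_bytes = bytes(text, 'utf-8') == raw
same_text = str(raw, 'utf-8') == text
print(same_bytes, same_text)
```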
Comments
-
kn3l almost 2 years
Python 3.2 (r32:88445, Feb 20 2011, 21:29:02) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> str_version = 'នយោបាយ'
>>> type(str_version)
<class 'str'>
>>> print (str_version)
នយោបាយ
>>> unicode_version = 'នយោបាយ'.decode('utf-8')
Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    unicode_version = 'នយោបាយ'.decode('utf-8')
AttributeError: 'str' object has no attribute 'decode'
>>>
What is the problem with my Unicode string?
-
kn3l about 13 years
Still not clear. Could you explain it more clearly? Thanks, Brandon Craig Rhodes.
-
Brandon Rhodes about 13 years
I have added another paragraph and some code samples. Do those make it any clearer?
-
kn3l about 13 years
Now it's clear. I understand it now from your example. Thank you so much, @Brandon Craig Rhodes.