Is there any function like iconv in Python?
Solution 1
I don't know PHP, but does this work?
mystring.decode('shift-jis').encode('utf-8')
Also, I assume the CSV content comes from a file. There are a few options for opening a file in Python.
with open(myfile, 'rb') as fin:
would be the first, and you would get the data exactly as it is stored.
with open(myfile, 'r') as fin:
would be the default way of opening a file.
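As a rough illustration of the difference between the two modes, here is a minimal sketch assuming Python 3, where binary mode returns bytes and text mode returns str. The file name sample_sjis.txt and the sample text are made up for the example. Text mode uses a platform-dependent default encoding unless one is passed, so the sketch names the source encoding explicitly.

```python
import os
import tempfile

# Hypothetical sample data: three hiragana characters, stored as Shift-JIS.
sample = "\u3066\u3067\u3068"
path = os.path.join(tempfile.mkdtemp(), "sample_sjis.txt")
with open(path, "wb") as fout:
    fout.write(sample.encode("shift_jis"))

# Binary mode: the data comes back exactly as stored, as a bytes object.
with open(path, "rb") as fin:
    raw = fin.read()

# Text mode: Python decodes while reading. Without an explicit encoding the
# platform default is used, so the source encoding is named here.
with open(path, "r", encoding="shift_jis") as fin:
    text = fin.read()

print(type(raw).__name__, type(text).__name__)  # bytes str
```

With the bytes from binary mode you decode yourself; with text mode the decoding has already happened by the time read() returns.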
Also, I tried this on my computer with a Shift-JIS text file, and the following code worked:
with open("shift.txt", "rb") as fin:
    text = fin.read()
text.decode('shift-jis').encode('utf-8')
The result was the following UTF-8 (without any errors):
' \xe3\x81\xa6 \xe3\x81\xa7 \xe3\x81\xa8'
OK, I have validated my solution :)
The first character is indeed the right one: "\xe3\x81\xa6" is the byte sequence E3 81 A6, which gives the correct result.
You can try yourself at this URL
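Putting the pieces together, a whole-file conversion might look like the sketch below. It assumes Python 3; convert_file, its parameter names, and the demo file names are hypothetical and not part of any library. It reads the entire file into memory, which is fine for small CSV files but may need chunking for very large ones.

```python
import os
import tempfile


def convert_file(src_path, dst_path, src_encoding="shift_jis", dst_encoding="utf-8"):
    """Re-encode a text file (hypothetical helper, not a library function)."""
    # newline="" leaves line endings untouched, which matters for CSV data.
    with open(src_path, "r", encoding=src_encoding, newline="") as fin:
        text = fin.read()
    with open(dst_path, "w", encoding=dst_encoding, newline="") as fout:
        fout.write(text)


# Demo with made-up data: write a tiny Shift-JIS CSV, then convert it.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "in_sjis.csv")
dst = os.path.join(workdir, "out_utf8.csv")
with open(src, "wb") as fout:
    fout.write("\u3066,\u3067,\u3068\r\n".encode("shift_jis"))

convert_file(src, dst)

with open(dst, "rb") as fin:
    converted = fin.read()
print(converted)
```

If the input contains bytes that are not valid Shift-JIS, the read raises UnicodeDecodeError; passing errors="replace" to open() is one way to keep going at the cost of losing those characters.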
Solution 2
For when Python's built-in encodings are insufficient, there's an iconv package on PyPI.
pip install iconv
Unfortunately, the documentation is nonexistent.
There's also iconv_codecs:
pip install iconv_codecs
e.g.:
>>> import iconv_codecs
>>> iconv_codecs.register('ansi_x3.110-1983')
>>> "foo".encode('ansi_x3.110-1983')
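Before installing a third-party package, it may be worth checking whether Python already knows the encoding you need. The sketch below uses the standard codecs.lookup call; codec_available is a hypothetical helper name chosen for this example. Note that it also reports codecs registered by third-party packages, not only the built-in ones.

```python
import codecs


def codec_available(name):
    """Return True if Python can find a codec under this name (hypothetical helper)."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


# Shift-JIS and some of its variants ship with the standard library.
for name in ("shift-jis", "cp932", "shift_jisx0213", "shift_jis_2004"):
    print(name, codec_available(name))
```

For the Shift-JIS to UTF-8 case in the question, the standard library codecs are enough, so no extra package should be needed.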
hugowan
Updated on June 26, 2022
Comments
-
hugowan almost 2 years
I have some CSV files that I need to convert from Shift-JIS to UTF-8.
Here is my code in PHP, which successfully transcodes them to readable text.
$str = utf8_decode($str);
$str = iconv('shift-jis', 'utf-8' . '//TRANSLIT', $str);
echo $str;
My problem is how to do the same thing in Python.
-
jlh over 5 years
This answer works fine for transforming strings between encodings, but iconv can also do more than just that; for example, you can use it to transliterate characters, as asked by the OP. The //TRANSLIT suffix will cause characters that can't be represented by the target encoding to be substituted by something meaningful.
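As far as I know, Python's built-in codecs have no direct equivalent of //TRANSLIT. The closest standard tools are the errors= handlers of encode() and Unicode normalization, sketched below under that assumption. Neither reproduces iconv's transliteration tables, so treat this as an approximation. When the target is UTF-8, as in the question, every Unicode character can be represented, so this only matters for narrower targets such as ASCII.

```python
import unicodedata

text = "caf\u00e9"  # "café" with a precomposed e-acute

# Error handler: characters the target cannot represent become "?"
# instead of raising UnicodeEncodeError.
replaced = text.encode("ascii", errors="replace")

# NFKD splits the accented letter into "e" plus a combining accent;
# errors="ignore" then drops the accent, leaving the base letter.
stripped = unicodedata.normalize("NFKD", text).encode("ascii", errors="ignore")

print(replaced)  # b'caf?'
print(stripped)  # b'cafe'
```

The normalization trick only helps for characters that decompose into an ASCII base letter; anything else is silently dropped by errors="ignore".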