Is there any function like iconv in Python?


Solution 1

I don't know PHP, but does this work?

mystring.decode('shift-jis').encode('utf-8')

Here mystring holds the raw Shift-JIS bytes (a str in Python 2, a bytes object in Python 3).

I also assume the CSV content comes from a file. There are a few options for opening a file in Python.

with open(myfile, 'rb') as fin:

is the first option: binary mode, where you get the data exactly as it is stored.

with open(myfile, 'r') as fin:

is the default way of opening a file (text mode).
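To make the difference between the two modes concrete, here is a minimal Python 3 sketch. The sample file and its contents are made up for illustration, and the explicit encoding argument is an addition of this sketch, not something the answer itself uses.

```python
import os
import tempfile

# Hypothetical sample file, created here only so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "shift.txt")
with open(path, "wb") as fout:
    fout.write("て,で,と\n".encode("shift_jis"))

# Binary mode: the data comes back exactly as stored, as a bytes object.
with open(path, "rb") as fin:
    raw = fin.read()

# Text mode with an explicit encoding: Python decodes while reading.
# newline="" turns off newline translation, which is usually wanted for CSV.
with open(path, "r", encoding="shift_jis", newline="") as fin:
    text = fin.read()

print(type(raw).__name__, type(text).__name__)
```

In binary mode the decoding step is left to you; in text mode it happens inside open(), so the source encoding should be passed explicitly instead of relying on the platform default.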

I also tried this on my computer with a Shift-JIS text file, and the following code worked:

with open("shift.txt", "rb") as fin:
    text = fin.read()  # raw Shift-JIS bytes

text.decode('shift-jis').encode('utf-8')

The result was the following, in UTF-8 (without any errors):

' \xe3\x81\xa6 \xe3\x81\xa7 \xe3\x81\xa8'

OK, I have validated my solution :)

The first character is indeed the right one: "\xe3\x81\xa6" is the byte sequence E3 81 A6, which is the UTF-8 encoding of て (U+3066). So it gives the correct result.
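For whole files, the same conversion can be sketched in Python 3 as below. The helper name transcode_file, the file names, and the sample text are assumptions made for illustration; only the encodings come from the question.

```python
import os
import tempfile


def transcode_file(src_path, dst_path, src_encoding="shift_jis", dst_encoding="utf-8"):
    """Copy src_path to dst_path, re-encoding the text line by line."""
    # newline="" keeps the original line endings untouched, which matters for CSV.
    with open(src_path, "r", encoding=src_encoding, newline="") as fin, \
         open(dst_path, "w", encoding=dst_encoding, newline="") as fout:
        for line in fin:
            fout.write(line)


# Demo on a made-up sample file.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "shift.txt")
dst = os.path.join(workdir, "utf8.txt")
with open(src, "wb") as f:
    f.write(" て で と".encode("shift_jis"))

transcode_file(src, dst)

with open(dst, "rb") as f:
    converted = f.read()
print(converted)  # b' \xe3\x81\xa6 \xe3\x81\xa7 \xe3\x81\xa8'
```

Reading line by line avoids loading a large CSV into memory at once. A decoding error is raised if the input is not valid Shift-JIS, which is usually preferable to silently producing garbage.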



Solution 2

For when Python's built-in encodings are insufficient, there's an iconv package on PyPI.

pip install iconv

Unfortunately, the documentation is nonexistent.

There's also iconv_codecs:

pip install iconv_codecs

e.g.:

>>> import iconv_codecs
>>> iconv_codecs.register('ansi_x3.110-1983')
>>> "foo".encode('ansi_x3.110-1983')
Author by

hugowan

Updated on June 26, 2022

Comments

  • hugowan, almost 2 years ago

    I have some CSV files that I need to convert from Shift-JIS to UTF-8.

    Here is my code in PHP, which successfully transcodes them to readable text.

    $str = utf8_decode($str);
    $str = iconv('shift-jis', 'utf-8'. '//TRANSLIT', $str);
    echo $str;
    

    My problem is how to do the same thing in Python.

  • jlh, over 5 years ago
    This answer works fine for transforming strings between encodings, but iconv can do more than just that. For example, you can use it to transliterate characters, as the OP asked. The //TRANSLIT suffix causes characters that can't be represented in the target encoding to be substituted with something meaningful.
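Python's built-in codecs have no direct counterpart to //TRANSLIT, but the errors argument of str.encode controls what happens to characters the target encoding cannot represent. The sketch below shows the standard handlers, plus a common approximation for accented letters using unicodedata; it is not a real transliteration, and the sample string is made up. When the target is UTF-8, as in the question, every character is representable, so none of this is needed.

```python
import unicodedata

text = "café ✓"  # sample string; ASCII cannot represent 'é' or '✓'

# Each handler decides what happens to characters the target encoding lacks.
ignored = text.encode("ascii", errors="ignore")            # drops them
replaced = text.encode("ascii", errors="replace")          # substitutes '?'
escaped = text.encode("ascii", errors="backslashreplace")  # keeps an escape sequence

# Rough approximation of transliteration: decompose accented letters (NFKD),
# then drop whatever still does not fit, such as the combining accent.
approx = unicodedata.normalize("NFKD", text).encode("ascii", errors="ignore")

print(ignored, replaced, escaped, approx)
```

The NFKD approach only helps for characters that decompose into a base letter plus combining marks; symbols such as ✓ are simply dropped.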
