How to convert a string to UTF8 in Ruby

99,763

Solution 1

Your string seems to have been encoded the wrong way round:

"DÃ©veloppement".encode("iso-8859-1").force_encoding("utf-8")
#=> "Développement"
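
To see why this round trip works, it can be traced step by step. This is only a sketch: it assumes the garbled input is the valid UTF-8 string "DÃ©veloppement", and the helper name repair_mojibake is made up for illustration.

```ruby
# Hypothetical helper: undo one round of "UTF-8 bytes read as ISO-8859-1".
def repair_mojibake(garbled)
  # Step 1: transcode to ISO-8859-1, so Ã (U+00C3) and © (U+00A9) each
  # become a single byte again: 0xC3 0xA9.
  latin1 = garbled.encode("iso-8859-1")
  # Step 2: relabel those bytes as UTF-8 without touching them.
  latin1.force_encoding("utf-8")
end

# Escapes keep the example independent of the source file's encoding.
garbled  = "D\u00C3\u00A9veloppement"   # "DÃ©veloppement"
repaired = repair_mojibake(garbled)

puts repaired                  # Développement
puts repaired.valid_encoding?  # true
```

Step 1 raises Encoding::UndefinedConversionError when the string contains a character that ISO-8859-1 cannot represent, such as U+201C or U+20AC, which matches the errors reported in the comments.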

Solution 2

It seems your string is labeled as UTF-8 but actually contains bytes in some other encoding, probably ISO-8859-1.

Define (force) the correct encoding first, then convert it to UTF-8.

In your example:

puts "D\xE9veloppement".force_encoding('iso-8859-1').encode('utf-8') #-> Développement

An alternative is:

puts "\xC3".force_encoding('iso-8859-1').encode('utf-8') #-> Ã

If the Ã makes no sense, then try another encoding.
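
"Trying another encoding" could be done in a loop, roughly like the sketch below. The helper name decode_candidates and the candidate list are assumptions for illustration; the code only shows what each label would produce, so a human still has to pick the reading that makes sense.

```ruby
# Hypothetical helper: label the raw bytes with each candidate encoding
# and collect the UTF-8 result for every label the bytes are valid under.
def decode_candidates(raw, candidates)
  candidates.each_with_object({}) do |name, results|
    labeled = raw.dup.force_encoding(name)   # relabel a copy, bytes unchanged
    next unless labeled.valid_encoding?      # skip labels the bytes cannot satisfy
    results[name] = labeled.encode("UTF-8")  # transcode to UTF-8
  end
end

# Raw bytes of unknown origin; 0xE9 is "é" in ISO-8859-1.
raw = "D\xE9veloppement".b

decode_candidates(raw, %w[UTF-8 ISO-8859-1 Windows-1252]).each do |name, text|
  puts "#{name}: #{text}"
end
# ISO-8859-1: Développement
# Windows-1252: Développement
```

UTF-8 is skipped here because a lone 0xE9 byte followed by "v" is not a valid UTF-8 sequence.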

Solution 3

"ruby 1.9: invalid byte sequence in UTF-8" described another good approach with less code:

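# Transcode to UTF-16 and back, dropping any bytes that are not valid UTF-8.
file_contents.encode!('UTF-16', 'UTF-8', invalid: :replace, replace: '')
file_contents.encode!('UTF-8', 'UTF-16')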
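
The transcoding trick in Solution 3 is a way to get rid of invalid byte sequences. On Ruby 2.1 and later, String#scrub does that directly. This is a swapped-in alternative rather than what the linked answer uses, and the sample input below is made up.

```ruby
# Hypothetical input: UTF-8 text with a stray 0xC3 byte in the middle.
file_contents = "caf\xC3 au lait"

puts file_contents.valid_encoding?   # false

cleaned = file_contents.scrub("")    # drop the invalid bytes
marked  = file_contents.scrub        # or replace them with U+FFFD

puts cleaned                         # caf au lait
puts cleaned.valid_encoding?         # true
```

Note that this only removes bytes that are invalid; it does not repair text that was decoded with the wrong encoding in the first place.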
Author by

ciembor

Updated on February 28, 2020

Comments

  • ciembor
    ciembor about 4 years

    I'm writing a crawler which uses Hpricot. It downloads a list of strings from some webpage, then I try to write it to the file. Something is wrong with the encoding:

    "\xC3" from ASCII-8BIT to UTF-8
    

    I have items which are rendered on a webpage and printed this way:

    DÃ©veloppement
    

    the str.encoding returns UTF-8, so force_encoding('UTF-8') doesn't help. How may I convert this to readable UTF-8?

  • ciembor
    ciembor almost 11 years
    It works well in most cases, but not always. Sometimes I get errors such as "U+201C from UTF-8 to ISO-8859-1" in "CIDEM / ACC1Ó" or "U+20AC from UTF-8 to ISO-8859-1" in "Citi’s Sustainable Development Investments". Also, some names are converted incorrectly, and I can't seed them into a database because of an "incomplete multibyte character" error.
  • Stefan
    Stefan almost 11 years
    Sorry, this was not meant as a fix. You should fix the problem by setting/detecting the correct encoding when reading the strings into your app.
  • Todd
    Todd about 6 years
    There is also the option of using Encoding::UTF_8 instead of using more memory for the "utf-8" string literal (or any other encoding string).
  • Lucas Andrade
    Lucas Andrade over 5 years
    Works for pdfs created with Wicked PDF gem
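
Stefan's advice about setting the correct encoding when reading might look like the sketch below. It assumes the source data really is ISO-8859-1; the helper name read_latin1_as_utf8 and the temporary file are made up so the example is self-contained. The encoding check uses the Encoding::UTF_8 constant Todd mentions.

```ruby
require "tempfile"

# Hypothetical helper: read a file whose bytes are ISO-8859-1 and get
# back a UTF-8 string. The option format is "external:internal".
def read_latin1_as_utf8(path)
  File.read(path, encoding: "ISO-8859-1:UTF-8")
end

Tempfile.create("latin1") do |f|
  # Write raw ISO-8859-1 bytes (0xE9 is "é") to stand in for real input.
  File.binwrite(f.path, "D\xE9veloppement".b)

  text = read_latin1_as_utf8(f.path)
  puts text                                # Développement
  puts text.encoding == Encoding::UTF_8    # true
end
```

Declaring the encoding at the point of input means the rest of the application only ever sees valid UTF-8 and needs no later repair step.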