How to ensure that Strings are in UTF-8?
Solution 1
Note that when you call text.getBytes() without arguments, you get an array of bytes representing the string in your platform's default encoding. On Windows, for example, that could be some single-byte encoding; on Linux it may already be UTF-8. (Since JDK 18 the default is UTF-8 on every platform unless it is overridden.)
To be correct, specify the exact encoding in the getBytes() call. For Java 7 and later, do this:
import java.nio.charset.StandardCharsets
val bytes = text.getBytes(StandardCharsets.UTF_8)
For Java 6 do this:
import java.nio.charset.Charset
val bytes = text.getBytes(Charset.forName("UTF-8"))
Then bytes will contain the UTF-8-encoded text.
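As a quick check of the point above, here is a minimal sketch that encodes a string to UTF-8 and decodes it back with the same explicit charset. The object name Utf8RoundTrip and its helper methods are hypothetical, chosen only for illustration; the sample string uses a proper right single quotation mark (U+2019), which takes three bytes in UTF-8.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical helper object, for illustration only.
object Utf8RoundTrip {
  // Encode with an explicit charset instead of the platform default.
  def toUtf8(text: String): Array[Byte] =
    text.getBytes(StandardCharsets.UTF_8)

  // Decode with the same explicit charset.
  def fromUtf8(bytes: Array[Byte]): String =
    new String(bytes, StandardCharsets.UTF_8)

  def main(args: Array[String]): Unit = {
    // \u2019 is the right single quotation mark.
    val text = "the survey\u2019s rules"
    val bytes = toUtf8(text)
    println(s"chars = ${text.length}, UTF-8 bytes = ${bytes.length}")
    println(s"round trip ok = ${fromUtf8(bytes) == text}")
  }
}
```

Because both directions name the charset, the result does not depend on the machine the code runs on.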
Solution 2
Alternatively, set the JVM's file.encoding system property to UTF-8 when the JVM starts:
-Dfile.encoding=UTF-8
This makes UTF-8 the JVM's default encoding. With the scala launcher, it could be scala -Dfile.encoding=UTF-8.
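To see which default the JVM actually picked up, a small sketch like the following can be run with and without the flag. The object name ShowDefaultCharset and the describe method are hypothetical; the output depends on the platform and JDK version, so none is shown here.

```scala
import java.nio.charset.Charset

// Hypothetical helper object, for illustration only.
object ShowDefaultCharset {
  // Report the file.encoding property and the charset the JVM uses
  // when no charset is given explicitly (e.g. in text.getBytes()).
  def describe(): String = {
    val prop = System.getProperty("file.encoding")
    val charset = Charset.defaultCharset().name()
    s"file.encoding=$prop, default charset=$charset"
  }

  def main(args: Array[String]): Unit =
    println(describe())
}
```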
YoBre
Updated on February 09, 2020
How can I convert this String, the surveyÂ’s rules, to UTF-8 in Scala? I tried these approaches, but they do not work:
scala> val text = "the surveyÂ’s rules"
text: String = the surveyÂ’s rules

scala> scala.io.Source.fromBytes(text.getBytes(), "UTF-8").mkString
res17: String = the surveyÂ’s rules

scala> new String(text.getBytes(),"UTF8")
res21: String = the surveyÂ’s rules
OK, I resolved it this way. It is not a conversion, just a simple read with a lenient codec:
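import java.io.File
import java.nio.charset.CodingErrorAction
import scala.io.{Codec, Source}

// Decode as US-ASCII and silently drop any byte that cannot be decoded
implicit val codec = Codec("US-ASCII")
  .onMalformedInput(CodingErrorAction.IGNORE)
  .onUnmappableCharacter(CodingErrorAction.IGNORE)

// folderDestination and name are defined elsewhere in the original program
val src = Source.fromFile(new File(folderDestination + name + ".csv"))
val src2 = Source.fromFile(new File(folderDestination + name + ".csv"))
// CSVReader comes from the CSV library in use (its import is not shown in the original)
val reader = CSVReader.open(src.reader())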
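The effect of that lenient codec can be sketched in a self-contained form. The example below is an illustration under assumptions, not the asker's program: the object name LenientRead, the readLenient helper, and the temporary file are all hypothetical. It writes a UTF-8 sample to a temp file, then reads it once with a US-ASCII codec that ignores undecodable bytes and once as UTF-8.

```scala
import java.nio.charset.{CodingErrorAction, StandardCharsets}
import java.nio.file.{Files, Path}
import scala.io.{Codec, Source}

// Hypothetical helper object, for illustration only.
object LenientRead {
  // Read a whole file with the named charset, dropping any bytes
  // that the charset cannot decode instead of throwing.
  def readLenient(path: Path, charsetName: String): String = {
    val codec = Codec(charsetName)
      .onMalformedInput(CodingErrorAction.IGNORE)
      .onUnmappableCharacter(CodingErrorAction.IGNORE)
    val src = Source.fromFile(path.toFile)(codec)
    try src.mkString finally src.close()
  }

  def main(args: Array[String]): Unit = {
    val tmp = Files.createTempFile("sample", ".csv")
    try {
      // \u2019 (right single quotation mark) is three bytes in UTF-8.
      Files.write(tmp, "the survey\u2019s rules".getBytes(StandardCharsets.UTF_8))
      println(readLenient(tmp, "US-ASCII")) // non-ASCII bytes are dropped
      println(readLenient(tmp, "UTF-8"))    // text is preserved
    } finally Files.delete(tmp)
  }
}
```

Reading as US-ASCII with IGNORE hides the problem by discarding the non-ASCII character; passing the file's real encoding to the codec keeps it.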