Java encoding for Japanese characters
Solution 1
Let's see what your code actually does:
// Encode fileName (a UTF-16 String) as Shift_JIS.
// bytes now holds the Shift_JIS byte representation of the String.
final byte[] bytes = fileName.getBytes("Shift_JIS");
// Build a new String by decoding those bytes as ISO8859_1.
// The Shift_JIS bytes are misread as Latin-1, one character per byte.
new String(bytes,"ISO8859_1");
Java strings use UTF-16 for their internal representation. You cannot specify a target encoding when you create a String, because UTF-16 is fixed. What you have to specify is the correct source encoding of the byte array, which here is "Shift_JIS".
fileName is already a valid String, so fileNameX should come out correct without this conversion.
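As a minimal sketch of the point above, the following compares decoding Shift_JIS bytes with the matching charset against decoding them as ISO-8859-1. The class name FileNameDecoding and its helper methods are hypothetical, and the example assumes a JDK that ships the Shift_JIS charset (standard JDK builds do). The kanji part of the file name is written with Unicode escapes so the example does not depend on the source file encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class FileNameDecoding {
    // Assumption: the runtime provides the Shift_JIS charset.
    static final Charset SHIFT_JIS = Charset.forName("Shift_JIS");

    // Decodes bytes that are known to be Shift_JIS into a String.
    static String decodeShiftJis(byte[] raw) {
        return new String(raw, SHIFT_JIS);
    }

    // Reproduces the conversion from the question:
    // Shift_JIS bytes decoded as if they were ISO-8859-1.
    static String misdecode(String fileName) {
        return new String(fileName.getBytes(SHIFT_JIS), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The kanji from the question's file name, followed by ".pdf".
        String fileName = "\u6700\u7D42\u6761\u4EF6.pdf";
        byte[] raw = fileName.getBytes(SHIFT_JIS);

        // Decoding with the matching charset restores the original text.
        System.out.println(decodeShiftJis(raw).equals(fileName));
        // Decoding with the wrong charset yields a different String.
        System.out.println(misdecode(fileName).equals(fileName));
    }
}
```

The charset argument describes the bytes, not the resulting String, which is why the second conversion cannot produce a "Shift_JIS String".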
Solution 2
This is a mapping problem between Shift_JIS and Unicode. Shift_JIS does not contain every Unicode character, so characters it cannot represent become "?".
The following is the result of converting several dash-like characters from Unicode to Shift_JIS.
RESULT UNICODE
[NG] U+2012 (FIGURE DASH)
[NG] U+2013 (EN DASH)
<OK> U+2014 (EM DASH)
[NG] U+2015 (HORIZONTAL BAR)
<OK> U+2212 (MINUS SIGN)
[NG] U+FF0D (FULLWIDTH HYPHEN-MINUS)
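A table like the one above can be reproduced with CharsetEncoder.canEncode. This is only a sketch: the class name ShiftJisCheck is hypothetical, and the output depends on the Shift_JIS mapping table of the JDK in use.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper class, for illustration only.
final class ShiftJisCheck {
    // Returns true if the given code point has a Shift_JIS encoding in this JDK.
    static boolean canEncode(int codePoint) {
        CharsetEncoder encoder = Charset.forName("Shift_JIS").newEncoder();
        return encoder.canEncode(new String(Character.toChars(codePoint)));
    }

    public static void main(String[] args) {
        // The dash-like characters listed in the table.
        int[] dashes = {0x2012, 0x2013, 0x2014, 0x2015, 0x2212, 0xFF0D};
        for (int cp : dashes) {
            String mark = canEncode(cp) ? "<OK>" : "[NG]";
            System.out.printf("%s U+%04X (%s)%n", mark, cp, Character.getName(cp));
        }
    }
}
```

Results can differ for related charsets such as windows-31j, so it is worth running the check against the charset actually used.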
One solution is to replace the characters that cannot be mapped with similar ones that Shift_JIS can encode:
U+2012, U+2013, U+2015 --> U+2014
U+FF0D --> U+2212
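The replacement above could be sketched as a small helper applied to the file name before it is encoded. The class name DashNormalizer and the method name normalize are hypothetical. The mapping is exactly the one listed above and is not a complete list of characters missing from Shift_JIS.

```java
// Hypothetical helper class, for illustration only.
final class DashNormalizer {
    // Replaces dash-like characters that Shift_JIS cannot encode
    // with the look-alikes listed in the answer.
    static String normalize(String s) {
        return s.replace('\u2012', '\u2014')   // FIGURE DASH -> EM DASH
                .replace('\u2013', '\u2014')   // EN DASH -> EM DASH
                .replace('\u2015', '\u2014')   // HORIZONTAL BAR -> EM DASH
                .replace('\uFF0D', '\u2212');  // FULLWIDTH HYPHEN-MINUS -> MINUS SIGN
    }

    public static void main(String[] args) {
        // "S", FULLWIDTH HYPHEN-MINUS, then the kanji from the question.
        String fileName = "S\uFF0D\u6700\u7D42\u6761\u4EF6.pdf";
        String safe = normalize(fileName);
        // After normalization the second character is MINUS SIGN (U+2212).
        System.out.println(Integer.toHexString(safe.charAt(1)));
    }
}
```

Note that this changes the characters in the name, so the normalized name only looks the same as the original.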
Prasanna
Updated on August 23, 2022
I have a file name with Japanese characters. File name: S-最終条件.pdf
In Java, the file name is S-最終条件.pdf.
// Support for Japanese file name
fileNameX = new String(fileName.getBytes("Shift_JIS"),"ISO8859_1");
The output fileNameX comes out as S?最終条件.pdf, and this causes an error. I am trying to stream the file out in PDF format, but the Japanese character "-" is not recognised and an error is thrown while streaming. Please help me solve this issue.
Thanks, Prasanna