Java, UTF-8, and Windows console
Solution 1
Try chcp 65001 && start.bat
The chcp command changes the code page, and 65001 is the Win32 code page identifier for UTF-8 under Windows 7 and up. A code page, or character encoding, specifies how to convert a Unicode code point to a sequence of bytes or back again.
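To make the "code point to bytes and back" idea concrete, here is a minimal sketch in plain Java. The class and method names (CodePageDemo, toHex, misdecode) are made up for illustration and are not part of the answer. It encodes the same characters under two charsets, then decodes UTF-8 bytes with the wrong charset, which is roughly what a console does when its code page does not match the program's output encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how one string maps to different bytes
// under different character encodings.
public class CodePageDemo {

    // Encode the text with the given charset and render the bytes as hex.
    public static String toHex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encode with one charset and decode with another, as a console with a
    // mismatched code page would.
    public static String misdecode(String text, Charset written, Charset read) {
        return new String(text.getBytes(written), read);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the example independent of the source file encoding.
        String sqrt = "\u221A";    // square root sign
        String eAcute = "\u00E9";  // e with acute accent

        System.out.println(toHex(sqrt, StandardCharsets.UTF_8));         // E2 88 9A
        System.out.println(toHex(eAcute, StandardCharsets.UTF_8));       // C3 A9
        System.out.println(toHex(eAcute, StandardCharsets.ISO_8859_1));  // E9

        // UTF-8 bytes read as ISO-8859-1 yield two wrong characters.
        String garbled = misdecode(eAcute, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length());                            // 2
    }
}
```

The last call shows the mismatch: one character written as two UTF-8 bytes comes back as two unrelated characters when the reader assumes a single-byte encoding.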
Solution 2
Java on Windows does NOT support Unicode output by default. I have written a workaround method that calls the native API through the JNA library. The method calls WriteConsoleW for Unicode output on the console.
import com.sun.jna.Native;
import com.sun.jna.Pointer;
import com.sun.jna.ptr.IntByReference;
import com.sun.jna.win32.StdCallLibrary;

/** For unicode output on windows platform
 * @author Sandy_Yin
 *
 */
public class Console {
    // Win32 identifier for the standard output handle (STD_OUTPUT_HANDLE).
    private static final int STD_OUTPUT_HANDLE = -11;

    // Stays null on non-Windows platforms, where kernel32 is not loaded.
    private static Kernel32 INSTANCE = null;

    // JNA mapping of the two kernel32 functions used below.
    public interface Kernel32 extends StdCallLibrary {
        public Pointer GetStdHandle(int nStdHandle);

        public boolean WriteConsoleW(Pointer hConsoleOutput, char[] lpBuffer,
                int nNumberOfCharsToWrite,
                IntByReference lpNumberOfCharsWritten, Pointer lpReserved);
    }

    static {
        String os = System.getProperty("os.name").toLowerCase();
        if (os.startsWith("win")) {
            INSTANCE = (Kernel32) Native
                    .loadLibrary("kernel32", Kernel32.class);
        }
    }

    public static void println(String message) {
        boolean successful = false;
        if (INSTANCE != null) {
            Pointer handle = INSTANCE.GetStdHandle(STD_OUTPUT_HANDLE);
            char[] buffer = message.toCharArray();
            IntByReference lpNumberOfCharsWritten = new IntByReference();
            // Write the UTF-16 characters directly to the console.
            successful = INSTANCE.WriteConsoleW(handle, buffer, buffer.length,
                    lpNumberOfCharsWritten, null);
            if (successful) {
                // The message itself is already written; only add the line break.
                System.out.println();
            }
        }
        // Fall back to the regular stream when not on Windows or when the
        // native call fails.
        if (!successful) {
            System.out.println(message);
        }
    }
}
tofcoder
Updated on July 09, 2022
Comments
-
tofcoder almost 2 years
We are trying to use Java and UTF-8 on Windows. The application writes its logs to the console, and we would like to use UTF-8 for the logs because they are internationalized.
It is possible to configure the JVM so that it generates UTF-8 by passing -Dfile.encoding=UTF-8 as an argument to the JVM. This works fine, but the output on a Windows console is garbled.
We can then set the code page of the console to 65001 (chcp 65001), but in that case the .bat files do not work. When we try to launch our application through our script (named start.bat), absolutely nothing happens. The command simply returns:
C:\Application> chcp 65001
Activated code page: 65001
C:\Application> start.bat
C:\Application>
But without chcp 65001, there is no problem, and the application can be launched.
Any hints about that?
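The -Dfile.encoding=UTF-8 option from the question changes the JVM's default charset. A program can also pick the encoding explicitly for its own output stream, so the bytes it writes do not depend on that default. The following is only a sketch of that idea, not the approach used in the question or in the answers. The class and method names (Utf8Out, utf8Stream, hexOf) are hypothetical, and the console still has to be set to a matching code page (for example with chcp 65001) to display the result correctly.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper: a PrintStream that always encodes text as UTF-8.
public class Utf8Out {

    // Wrap any byte stream in an auto-flushing PrintStream that encodes as UTF-8.
    public static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Helper for inspection: the bytes utf8Stream emits for a message, as hex.
    public static String hexOf(String message) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        PrintStream out = utf8Stream(bytes);
        out.print(message);
        out.flush();
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes.toByteArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Write to the process's standard output with an explicit encoding.
        PrintStream out = utf8Stream(new FileOutputStream(FileDescriptor.out));
        out.println("\u221A125"); // square root sign followed by 125
    }
}
```

Calling hexOf("\u221A1") returns "E2 88 9A 31", the UTF-8 bytes for the square root sign followed by the digit 1, regardless of the platform's default charset.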
-
KarolDepka over 14 years
Seems like a step backwards to stick with (and modify things for) ISO-8859-1 instead of UTF-8. But you probably had your reasons.
-
Hakanai about 11 years
PowerShell still uses the same console, so it is just as old and crap as cmd.exe.
-
Axel Fontaine about 10 years
This must be used in conjunction with -Dfile.encoding=UTF-8 to work correctly.
-
Cj1m over 9 years
@AxelFontaine I tried using -Dfile.encoding=UTF-8, but when using the square root symbol, the last two digits after the symbol would repeat. E.g., instead of √125 the output would be √12525.
-
brady almost 4 years
It started supporting it with Windows 7.