What is the default encoding of the JVM?
Solution 1
The default character set of the JVM is that of the system it's running on. There's no specific value for this and you shouldn't generally depend on the default encoding being any particular value.
It can be accessed at runtime via Charset.defaultCharset(), if that's any use to you, though really you should make a point of always specifying the encoding explicitly when you can do so.
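As a minimal sketch of that advice (the class and method names here are hypothetical, chosen only for illustration), the following prints the current default and then encodes and decodes text with an explicitly named charset, so the result does not depend on the platform. Note that on JDK 18 and later the default is UTF-8 unless overridden, so the printed value may differ from what older JVMs report on the same machine.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of any library.
class DefaultCharsetDemo {

    // Encodes and decodes with an explicit charset, so the platform
    // default never comes into play.
    static String roundTripUtf8(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Whatever the JVM picked up at startup.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // The unicode escape keeps this source file pure ASCII.
        System.out.println(roundTripUtf8("h\u00e9llo").equals("h\u00e9llo"));
    }
}
```

The same principle applies to readers and writers: prefer the constructors and factory methods that take a Charset argument over the ones that silently use the default.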
Solution 2
Note that you can change the default encoding of the JVM using the confusingly-named property file.encoding.
If your application is particularly sensitive to encodings (perhaps through usage of APIs implying default encodings), then you should explicitly set this on JVM startup to a consistent (known) value.
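One way to make such a dependency visible is a startup check that warns when the default is not the value the application expects. This is only a sketch: the EncodingGuard class and its defaultIs method are hypothetical names, and UTF-8 is assumed as the expected value purely for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical startup guard; not part of any library.
class EncodingGuard {

    // True when the JVM's default charset is the one the application expects.
    static boolean defaultIs(Charset expected) {
        return Charset.defaultCharset().equals(expected);
    }

    public static void main(String[] args) {
        // Assumption for this example: the application expects UTF-8.
        if (!defaultIs(StandardCharsets.UTF_8)) {
            System.err.println("Warning: default charset is "
                    + Charset.defaultCharset()
                    + ", expected UTF-8; consider starting the JVM with"
                    + " -Dfile.encoding=UTF-8");
        }
    }
}
```

A check like this does not change the encoding; it only reports a mismatch early, since the property has to be supplied when the JVM is launched.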
Solution 3
There are three "default" encodings:
file.encoding:
System.getProperty("file.encoding")
java.nio.Charset:
Charset.defaultCharset()
And the encoding of the InputStreamReader:
InputStreamReader.getEncoding()
You can read more about it on this page.
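The three values listed above can be printed side by side with a short program. This is a sketch with a hypothetical class name. One thing to note is that getEncoding() is an instance method, so a reader has to be created first (here over an empty in-memory stream, with no charset argument, so that it falls back to the default). It may also report a historical name such as UTF8 rather than the canonical UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

// Hypothetical demo class; not part of any library.
class ThreeDefaults {

    // Builds a reader without naming a charset and asks which encoding it chose.
    // The underlying stream is in memory, so there is nothing to clean up.
    static String readerEncoding() {
        InputStreamReader reader =
                new InputStreamReader(new ByteArrayInputStream(new byte[0]));
        return reader.getEncoding();
    }

    public static void main(String[] args) {
        System.out.println("file.encoding:             "
                + System.getProperty("file.encoding"));
        System.out.println("Charset.defaultCharset():  "
                + Charset.defaultCharset());
        System.out.println("InputStreamReader default: "
                + readerEncoding());
    }
}
```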
Solution 4
To get the default Java settings, just use:
java -XshowSettings
Solution 5
I am sure that this is JVM implementation-specific, but I was able to "influence" my JVM's default file.encoding by executing:
export LC_ALL=en_US.UTF-8
(running java version 1.7.0_80 on Ubuntu 12.04)
Also, if you type "locale" in your Unix console, you should see more info there.
All the credit goes to http://www.philvarner.com/2009/10/24/unicode-in-java-default-charset-part-4/
user67722
Updated on July 05, 2022
Comments
-
user67722 almost 2 years
Is UTF-8 the default encoding in Java?
If not, how can I know which encoding is used by default?
-
sleske about 14 years
Note that file.encoding must be specified on JVM startup (i.e. as the command-line parameter -Dfile.encoding or via JAVA_TOOL_OPTIONS); you can set it at runtime, but it will have no effect. See stackoverflow.com/questions/361975/…
-
Jonas Elfström over 12 years
If you are correct, I find it a bit strange that java.sun.com/javase/technologies/core/basic/intl/… says that it's always UTF-16.
-
JesperE over 12 years
UTF-16 is how text is represented internally in the JVM. The default encoding determines how the JVM interprets bytes read from files (using FileReader, for example).
-
Koray Tugay almost 9 years
So it depends on the encoding the host operating system has?
-
Jeutnarg over 8 years
This answer is correct, but for reference, on Linux it's usually "UTF-8", and on Windows it's usually "cp1252".
-
Gunslinger over 7 years
I have just encountered a Linux installation that reports UTF-8 from locale, but Java says US-ASCII.
-
Artem Novikov about 6 years
Wrong. Check the Charset.defaultCharset() source code. It reads the file.encoding property, otherwise uses UTF-8.
-
Artem Novikov about 6 years
How did you check it? I can't find proof that Java pays any attention to the encoding in the locale string. Only to the file.encoding property.
-
Jules almost 6 years
@ArtemNovikov - yes, but what is the default value of file.encoding? It's initialised in java.lang.System.initProperties based on the value of sprops.encoding, where sprops is a structure returned by the native function GetJavaProperties(), the implementation of which varies according to platform. In the Windows version, for example, it calls GetUserDefaultLCID() and then GetLocaleInfo(lcid, LOCALE_IDEFAULTANSICODEPAGE, ...) to find the user's default ANSI code page and uses that. On Unix platforms, it parses the return value of setlocale(LC_CTYPE, NULL).
-
Rahul almost 6 years
@JesperE: "text is represented internally in the JVM": you mean bytecode?
-
JesperE almost 6 years
@Rahul No, I mean how text is represented in memory in the JVM. Not sure what the bytecode spec says; I was referring to how the JVM stores text in memory at runtime. At least I think so, but my comment was made 6 years ago, so I might misremember.