SQLplus and Oracle, bad UTF-8 display
Solution 1
The question does not mention the actual locale settings. According to Oracle, NLS_LANG
should be set to correspond with those. If your actual locale were not using UTF-8, you would get the replacement character as shown in the question.
Further reading:
How to setup the NLS_LANG Properly for UNIX
To specify the locale behavior of your client Oracle software, you have to set your
NLS_LANG
parameter. It sets the language, territory and also the character set of your client. You need to check the locale environment settings to set yourNLS_LANG
3rd field (character set) in accordance with it. To do this, use the "locale" command like this:
7- Do the Locale and NLS_LANG need to match the database character set?
No, the Locale andNLS_LANG
setting (and if applicable the telnet/ssh config) need to match, but they are ALL technically unrelated to the database character set and they are only relevant for that client environment.
Solution 2
Are you sure the text in the database has the correct characters? In other words, have you verified that it has not been damaged on its way into the database? You could pipe the output of sqlplus
to od -t x1z
and use that to verify whether or not the database is outputting the right encoding. If the encoding is correct, the fault lies elsewhere.
The settings you showed confirm that the database and Oracle client should be successfully set to use UTF-8. Then, once the characters are output by the Oracle client (e.g. sqlplus
), they are handled by the operating system.
It is not clear whether you have used the locale
command to verify the locale settings of the operating system - but I assume you have.
Have you also verified the stty
settings? If these settings are not 8-bit clean, any UTF-8 characters will be damaged in transit. But it does not quite look like this problem either: UTF-8 encoding for ò
is 0xc3 0xb2; if the highest bit is forced to 0, this would become 0x43 0x32, or capital C and number 2 - both valid UTF-8 characters.
The next thing that will process the characters is your terminal emulator: this might commonly be PuTTY if you are using Windows. Very old versions of PuTTY (0.62 and older) used ISO-8859-1 character set by default; newer versions use UTF-8 by default, unless you have specifically chosen otherwise and saved new default settings.
The fact that the single ò
gets mapped to two characters suggests your terminal emulator is not configured for UTF-8, but rather some character set that uses single-byte characters.
It might even be configured to expect straight ASCII, as it seems to replace both bytes of the UTF-8 ò
with a symbol that indicates an unknown character. So either the terminal emulator is not expecting characters outside the 7-bit ASCII set at all, or the font it's using currently does not have the letter shapes (glyphs) corresponding to bytes 0xc3 and 0xb2.
So, you'll need to fix two things in your terminal emulator settings:
First, you should set it to expect the UTF-8 character set instead of whatever it's currently expecting. Once you've done this successfully, the letter ò
should map to just one character, not two. It may or may not be the correct character, but the important thing is that only one character is displayed, not two.
If the displayed character is still not correct, you should verify that the font you're using with the terminal emulator actually has the glyph for ò
.
Related videos on Youtube
elbarna
Updated on September 18, 2022Comments
-
elbarna almost 2 years
I have set
export NLS_LANG=Italian_ITALY.UTF8
db is on UTF-8
SELECT * FROM NLS_DATABASE_PARAMETERS NLS_CHARACTERSET |AL32UTF8
But sqlplus displays bad
ò
à
ì
characters.select titolo from generale where titolo like '%%Roma d%%'; TITOLO ------------------------------------------------------------ Roma drogata: la polizia non pu�� intervenire
must be
select titolo from generale where titolo like '%%Roma d%%'; TITOLO ------------------------------------------------------------ Roma drogata: la polizia non può intervenire
I have tried AIX and Linux client, all on UTF-8.
What do I miss?