SQLplus and Oracle, bad UTF-8 display

6,851

Solution 1

The question does not mention the actual locale settings. According to Oracle, NLS_LANG should be set to correspond with those. If your actual locale were not using UTF-8, you would get the replacement character as shown in the question.

Further reading:

How to setup the NLS_LANG Properly for UNIX

To specify the locale behavior of your client Oracle software, you have to set your NLS_LANG parameter. It sets the language, territory and also the character set of your client. You need to check the locale environment settings to set your NLS_LANG 3rd field (character set) in accordance with it. To do this, use the "locale" command like this:

7- Do the Locale and NLS_LANG need to match the database character set?
No, the Locale and NLS_LANG setting (and if applicable the telnet/ssh config) need to match, but they are ALL technically unrelated to the database character set and they are only relevant for that client environment.

Solution 2

Are you sure the text in the database has the correct characters? In other words, have you verified that it has not been damaged on its way into the database? You could pipe the output of sqlplus to od -t x1z and use that to verify whether or not the database is outputting the right encoding. If the encoding is correct, the fault lies elsewhere.

The settings you showed confirm that the database and Oracle client should be successfully set to use UTF-8. Then, once the characters are output by the Oracle client (e.g. sqlplus), they are handled by the operating system.

It is not clear whether you have used the locale command to verify the locale settings of the operating system - but I assume you have.

Have you also verified the stty settings? If these settings are not 8-bit clean, any UTF-8 characters will be damaged in transit. But it does not quite look like this problem either: UTF-8 encoding for ò is 0xc3 0xb2; if the highest bit is forced to 0, this would become 0x43 0x32, or capital C and number 2 - both valid UTF-8 characters.

The next thing that will process the characters is your terminal emulator: this might commonly be PuTTY if you are using Windows. Very old versions of PuTTY (0.62 and older) used ISO-8859-1 character set by default; newer versions use UTF-8 by default, unless you have specifically chosen otherwise and saved new default settings.

The fact that the single ò gets mapped to two characters suggests your terminal emulator is not configured for UTF-8, but rather some character set that uses single-byte characters.

It might even be configured to expect straight ASCII, as it seems to replace both bytes of the UTF-8 ò with a symbol that indicates an unknown character. So either the terminal emulator is not expecting characters outside the 7-bit ASCII set at all, or the font it's using currently does not have the letter shapes (glyphs) corresponding to bytes 0xc3 and 0xb2.

So, you'll need to fix two things in your terminal emulator settings:

First, you should set it to expect the UTF-8 character set instead of whatever it's currently expecting. Once you've done this successfully, the letter ò should map to just one character, not two. It may or may not be the correct character, but the important thing is that only one character is displayed, not two.

If the displayed character is still not correct, you should verify that the font you're using with the terminal emulator actually has the glyph for ò.

Share:
6,851

Related videos on Youtube

elbarna
Author by

elbarna

Updated on September 18, 2022

Comments

  • elbarna
    elbarna almost 2 years

    I have set

    export NLS_LANG=Italian_ITALY.UTF8
    

    db is on UTF-8

    SELECT * FROM NLS_DATABASE_PARAMETERS
    NLS_CHARACTERSET                                                            |AL32UTF8
    

    But sqlplus displays bad ò à ì characters.

    select titolo from generale where titolo like '%%Roma d%%';
    
    TITOLO
    ------------------------------------------------------------
    Roma drogata: la polizia non pu�� intervenire
    

    must be

    select titolo from generale where titolo like '%%Roma d%%';
    
    TITOLO
    ------------------------------------------------------------
    Roma drogata: la polizia non può intervenire
    

    I have tried AIX and Linux client, all on UTF-8.

    What do I miss?