Are there any disadvantages to using UTF8 in an Oracle database?

Solution 1

You have two choices to make:

  1. Choose your database character set (used by VARCHAR2, CHAR, CLOB datatypes).
  2. Choose your national character set (used by NVARCHAR2, NCHAR, NCLOB datatypes).

As the Oracle documentation says:

Oracle recommends using Unicode for all new system deployments.

National character sets can only be Unicode: UTF-8 or UTF-16 (named UTF8 and AL16UTF16 in Oracle). So choosing the same character set for both would be redundant.

My advice, given that you say your application is in English only:

  • Ask for your database character set to be UTF-8 (AL32UTF8 in Oracle's naming).
  • Ask for your national character set to be UTF-16 (AL16UTF16 in Oracle's naming).

And here is my general advice for your schema definition, table by table and column by column (I use VARCHAR2/NVARCHAR2 as the example here):

  • If a column could contain any character in the world (as with user input), make it NVARCHAR2.
  • If you control what is going to be stored (English only, in your case), make it VARCHAR2.
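
One way to reason about this split, which I have not spelled out above, is storage size. The Python sketch below only compares how many bytes the same text takes in UTF-8 and UTF-16; it does not talk to Oracle, and the sample strings and the `encoded_sizes` helper are made up for illustration.

```python
# Compare encoded sizes of the same text in UTF-8 and UTF-16.
# Sample strings are illustrative only; nothing here queries a database.

def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for the given text."""
    # "utf-16-le" is used so that no byte order mark is counted.
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

samples = {
    "English": "Order confirmed",
    "German": "Bestellung bestätigt",
    "Japanese": "注文確認",
}

for label, text in samples.items():
    utf8_bytes, utf16_bytes = encoded_sizes(text)
    print(f"{label}: {len(text)} chars, "
          f"{utf8_bytes} bytes in UTF-8, {utf16_bytes} bytes in UTF-16")
```

The English sample takes 15 bytes in UTF-8 and 30 in UTF-16, while the Japanese one takes 12 bytes in UTF-8 and 8 in UTF-16. So columns holding mostly English text are smaller in UTF-8, and columns holding arbitrary user input may be smaller in UTF-16, depending on the scripts involved.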

Solution 2

But watch out, as the Oracle documentation warns:

Do not use the character set named UTF8 as the database character set unless required for compatibility with Oracle Database clients and servers in version 8.1.7 and earlier, or unless explicitly requested by your application vendor. Despite having a very similar name, UTF8 is not a proper implementation of the Unicode encoding UTF-8. If the UTF8 character set is used where UTF-8 processing is expected, data loss and security issues may occur. This is especially true for Web related data, such as XML and URL addresses.

Oracle recommends AL32UTF8 as the database character set. AL32UTF8 is Oracle's name for the UTF-8 encoding of the Unicode standard.
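
To make the difference concrete, here is a minimal Python sketch. It assumes, as is commonly described, that Oracle's UTF8 character set stores characters outside the Basic Multilingual Plane the CESU-8 way: as a surrogate pair, with each surrogate encoded in three bytes. The `cesu8_encode` helper is hypothetical and written only for illustration; it is not an Oracle API.

```python
# Sketch: standard UTF-8 versus a CESU-8 style encoding.
# Assumption: Oracle's "UTF8" behaves like CESU-8 for supplementary characters.

def cesu8_encode(text):
    """Encode text CESU-8 style: supplementary characters become
    two 3-byte sequences (one per UTF-16 surrogate)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            # Split the code point into a UTF-16 surrogate pair.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            for surrogate in (high, low):
                # "surrogatepass" lets Python emit the 3-byte form of a surrogate.
                out += chr(surrogate).encode("utf-8", "surrogatepass")
        else:
            out += ch.encode("utf-8")
    return bytes(out)

emoji = "\U0001F600"  # a character outside the Basic Multilingual Plane

standard = emoji.encode("utf-8")
cesu = cesu8_encode(emoji)
print("UTF-8 :", standard.hex(" "), f"({len(standard)} bytes)")
print("CESU-8:", cesu.hex(" "), f"({len(cesu)} bytes)")

# A strict UTF-8 decoder rejects the CESU-8 style bytes.
try:
    cesu.decode("utf-8")
except UnicodeDecodeError as exc:
    print("strict UTF-8 decoder rejects it:", exc.reason)
```

For text limited to the Basic Multilingual Plane the two encodings produce identical bytes, which is why the problem can go unnoticed until an emoji or a rare CJK character shows up.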

Solution 3

Are there any motivations for NOT using UTF8 or another Unicode character set?

Just the one: you have an existing dataset whose current character encoding you can't guarantee.

In which case you probably want to remedy that and still use UTF8.
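
As a rough first check on such a dataset, you can test whether the raw bytes even decode as UTF-8. This is only a sketch: `is_valid_utf8` is a hypothetical helper, and the sample byte strings stand in for values you would export from your own data.

```python
# Sketch: check whether raw bytes are well-formed UTF-8.
# The sample values below are made up for illustration.

def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

samples = [
    "café".encode("utf-8"),    # stored as UTF-8
    "café".encode("latin-1"),  # stored as ISO-8859-1
    b"plain ASCII",            # ASCII is valid in both
]

for raw in samples:
    print(raw, "->", "valid UTF-8" if is_valid_utf8(raw) else "NOT valid UTF-8")
```

Keep in mind that passing this check does not prove the data was written as UTF-8: pure ASCII passes under almost any encoding, and some byte sequences from other character sets happen to be valid UTF-8 as well. A failure, on the other hand, is a clear sign that the data needs converting.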

Solution 4

No, not at all.

Author: Admin

Updated on September 17, 2022

Comments

  • Admin
    Admin over 1 year

    We are ordering a pre-configured Oracle database and they are asking us what character encoding we would like to have. The application (in Java) is in English only, but users are from different parts of the world.

    Are there any motivations for NOT using UTF8 or another Unicode character set?

  • Mac
    Mac over 14 years
    I'll add more links as soon as I can get access to the Oracle docs (site is down for now).
  • Mac
    Mac over 14 years
    Oracle site is up, and reading the documentation made me slightly change my answer...
  • Admin
    Admin over 14 years
    Thank you very much. Fortunately, AL32UTF8 was what they proposed. :-)