PostgreSQL 9.1 installation and database encoding

19,581

Solution 1

The answer provided by Erwin Brandstetter, was helpful, but for whatever reason, it didn't work for me. The reason is that I couldn't initdb to ever run. I kept getting a "bash: command not found" error when trying to run it with the locale. What I ended up doing was:

changing the locale of the OS. For me, this was:

$ update-locale LANG=en_US.UTF-8

Note: then I had to reboot the server. To confirm that it worked, just run:

$ locale

with that set, I stopped and dropped the cluster:

$ su postgres
$ pg_dropcluster --stop 9.1 main

note: main is the default cluster that got created for me (your cluster name might be different)

$ pg_createcluster --start 9.1 main

Again, I just recreated the cluster with the same name (main).

Note: because I picked the same cluster name (main), I had to go back and update my .conf files. For me, this specifically was postgres.conf and pg_hba.conf to re-enable remote access to the box. I'm not going to include how to do that here as there is lots of documentation on the web for this. But, if someone wants to edit this answer later to include it, that would be fine! :)

Solution 2

The relevant option is --locale=locale to the initdb command which initializes your database cluster. If you don't supply it explicitly it defaults to the system locale. (You probably run your Ubuntu on locale 'C'.)

Read more about it in the excellent manual here.

In PostgreSQL you can still sneak in a database with different locale by basing a new database off template0 instead of the default templeate1. I quote the manual here:

The encoding and locale settings must match those of the template database, except when template0 is used as template.

But I'd rather recreate the database cluster with the desired locale. Much cleaner.


Edit: info about available locales

You can only use locales that are provided by the operating system. I quote the manual here:

What locales are available on your system under what names depends on what was provided by the operating system vendor and what was installed. On most Unix systems, the command locale -a will provide a list of available locales. Windows uses more verbose locale names, such as German_Germany or Swedish_Sweden.1252, but the principles are the same.

Look at locale-gen in a Unix-system, if you want to use a locale that has not yet been generated. The important thing to understand is that multiple locales can be installed in your OS, but only one of them can be picked for system parameters like LC_CTYPE, LC_COLLATE, etc. Look at the output of locale versus locale -a in the shell. Usually it is the same for all, set via LC_ALL.

@David: What you did may have solved your problem, but you could have had that easier. Also, be aware that the environment variable LANG only provides the default for all locale settings. If any of them is set to something different, LANG will be overridden. Set LC_ALL to override any existing setting. Here is one of many sites in the web telling you more about that.

To check all current locale settings of your database (cluster), run in your database:

SHOW ALL;

Or more specifically:

SELECT *
FROM   pg_settings
WHERE  name ~~ 'lc%';

Solution 3

Creating the database via initdb with the encoding option (-E or --encoding=) did the trick for me on Mac Os X Snow Leopard:

initdb -D <database_directory> -E UTF8

If you init the database this way it will create the database templates with the correct encoding also.

Refer to the initdb documentation here and the encoding documentation here to choose the correct encoding.

Share:
19,581
David S
Author by

David S

Updated on July 21, 2022

Comments

  • David S
    David S over 1 year

    I just did my first install of PostgreSQL 9.1 on Ubuntu 10.04.

    note: I have done it a few times on Windows with an installer without issues.

    After a bit of effort, I got it set up to connect remotely via pgAdminIII. However, I was really surprised after connecting to the db, that I got a warning about the encoding. The "postgres" database itself was created with "SQL_ASCII" encoding. Every time that I've installed on windows, it created the postgres DB with "UTF8" - which seems like it would be a lot better and would stop the warning message when opening up the database via pgAdminIII.

    Is there something I did wrong? Is there an installation option/param to use to set the default encoding to use?

    And is there anyway to fix this? I've read some things on the web that says that you need to dumb and restore to change the encoding of a database, but I'm not sure this is even possible on the postgres db. Is it?

    Thanks for your help!

  • David S
    David S over 12 years
    Thanks for the reply! Let me try it out and I'll post update with comments, etc. thanks again!
  • daxelrod
    daxelrod over 10 years
    Note that pg_createcluster also has a --locale switch that takes a unix-style locale.
  • daxelrod
    daxelrod over 10 years
    Using your example, if locale -a shows en_US.UTF-8, you can pass that as the --locale. If it doesn't, you will have to generate it, using sudo locale-gen en_US.UTF-8. Afterwards, you can pass --locale=en_US.UTF-8 to createcluster.
  • Adam Silenko
    Adam Silenko almost 8 years
    next time try use full path of initdb, for example: /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data -E WIN1250