locale.getlocale() problems on OSX

Solution 1

Odd. On OS X (Snow Leopard 10.6.1) I get:

$ python
Python 2.6.1 (r261:67515, Jul  7 2009, 23:51:51) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.  
>>> import locale
>>> locale.getlocale()
(None, None)
>>> locale.setlocale(locale.LC_ALL, '')
'en_GB.UTF-8'
>>> locale.getlocale()
('en_GB', 'UTF8')

Edit:

I just found this on the Apple Python mailing list.

Basically, it depends on what is set in your environment at run time (one of LANG, LANGUAGE, LC_ALL). I had LANG=en_GB.UTF-8 in my shell environment.
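To see which of those variables a given session actually has, something like the following might help. It is only a sketch: the helper names `locale_env` and `describe_locale` are made up for this example, and it is written for Python 3 rather than the 2.x interpreters shown above.

```python
import locale
import os

# The variables named above. Others (for example LC_CTYPE) may matter too,
# but they are not covered in this sketch.
LOCALE_VARS = ("LC_ALL", "LANG", "LANGUAGE")


def locale_env(environ=None):
    """Return the locale-related variables that are set and non-empty."""
    environ = os.environ if environ is None else environ
    return {name: environ[name] for name in LOCALE_VARS if environ.get(name)}


def describe_locale():
    """Print the environment, then getlocale() before and after setlocale()."""
    print("environment:", locale_env())
    try:
        print("before:", locale.getlocale())
        # '' asks the C library to pick the locale up from the environment.
        print("setlocale:", locale.setlocale(locale.LC_ALL, ""))
        print("after:", locale.getlocale())
    except (locale.Error, ValueError) as exc:
        # locale.Error: the environment names a locale the system lacks.
        # ValueError: Python could not parse the locale name.
        print("could not apply environment locale:", exc)
```

Calling `describe_locale()` from a terminal and again from a packaged application should show whether the two really see the same environment.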

Solution 2

Looks like you can change locale by changing environment variable LC_ALL.

$ export LC_ALL=C
$ python
Python 2.5.1 (r251:54863, Feb  6 2009, 19:02:12) 
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.getlocale()
(None, None)
>>> locale.setlocale(locale.LC_ALL, "")
'C'
>>> locale.getlocale()
(None, None)    

$ export LC_ALL=en_GB.UTF-8
$ python
Python 2.5.1 (r251:54863, Feb  6 2009, 19:02:12) 
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.getlocale()
(None, None)
>>> locale.setlocale(locale.LC_ALL, "")
'en_GB.UTF-8'
>>> locale.getlocale()
('en_GB', 'UTF8')

Solution 3

Admittedly a horrible hack, but I inserted this:

import platform

# ...

# XXX horrendous OS X invalid locale hack
if platform.system() == 'Darwin':
    import locale
    if locale.getlocale()[0] is None:
        locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')

at an early point in a program of mine. After that I could run my program with an unmodified shell environment on all the OSes relevant to me (my program figures out the language to use later in its processing anyway).

Solution 4

From here: try adding to or editing your ~/.profile or ~/.bash_profile so that it exports your locale settings whenever a new session starts.

export LC_ALL=en_US.UTF-8  
export LANG=en_US.UTF-8

Solution 5

Old question, but this may help others: this is a Python bug that, as of March 2016, is still unresolved in both Python 2 and 3: https://bugs.python.org/issue18378 .

The summary is that Python assumes GNU-like locales and balks at (POSIXly correct) divergences like those in BSD environments (such as OS X). The UTF8 locale exists in BSD but not in Linux, hence the problem.

As for solutions or debugging: the locale environment variables can be set by Terminal.app (see Preferences - Profiles - Advanced - International; similarly for iTerm or whatever). So you may find the locale environment variables set in a terminal window, but NOT set when running a packaged application.

In some cases (like Sphinx under Python 2.7 and 3.5 dying on OS X with "ValueError: unknown locale: UTF-8"), unticking the preference checkbox that sets the locale environment variables is the solution.

But that can cause problems in other programs: if the locale vars are not set, bash 4.3 (from MacPorts) will complain at every prompt with "warning: setlocale: LC_CTYPE: cannot change locale (): No such file or directory" ...

So, given that the bug is in Python, the workaround should probably be done in the Python program (as in @Jacob Oscarson's answer) or in the Python invocation (by setting the locale vars to some adequate value).
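For the "python invocation" route, one option is a small launcher that sets the variables only for the child process. This is a sketch under assumptions: `run_with_locale` is a hypothetical helper, en_US.UTF-8 is just an example value that may not be installed everywhere, and `subprocess.run(..., capture_output=True)` needs Python 3.7 or later.

```python
import os
import subprocess
import sys


def run_with_locale(args, locale_name="en_US.UTF-8"):
    """Run a command with LC_ALL and LANG overridden for that process only."""
    # Copy the current environment so the parent's own settings stay untouched.
    env = dict(os.environ, LC_ALL=locale_name, LANG=locale_name)
    return subprocess.run(args, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # Show what the child interpreter sees; the parent shell is unaffected.
    child_code = "import os; print(os.environ.get('LC_ALL'))"
    result = run_with_locale([sys.executable, "-c", child_code])
    print(result.stdout, end="")
```

Because only the child's environment changes, this should not trigger the bash warning mentioned above.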

Author by

pojo

Updated on June 05, 2022

Comments

  • pojo
    pojo about 2 years

    I need to get the system locale to do a number of things; ultimately I want to translate my app using gettext. I am going to distribute it on both Linux and OSX, but I ran into problems on OSX Snow Leopard:

    $ python
    Python 2.5.2 (r252:60911, Jan  4 2009, 17:40:26) 
    [GCC 4.3.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import locale
    >>> locale.setlocale(locale.LC_ALL, '')
    'sv_SE.UTF-8'
    >>> locale.getlocale()
    ('sv_SE', 'UTF8')
    
    $ python
    Python 2.6.1 (r261:67515, Jul  7 2009, 23:51:51) 
    [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import locale
    >>> locale.setlocale(locale.LC_ALL, '')
    'C'
    >>> locale.getlocale()
    (None, None)
    

    Both systems are set to Swedish. On Linux, the environment variable LANG is already set to "sv_SE.UTF-8". If I pass that variable to python on OSX (running LANG="sv_SE.UTF-8" python instead), the locale is detected nicely. But shouldn't locale.getlocale() be able to fetch whatever language the operating system has? I don't want to force users to set LANG, LC_ALL or any environment variable at all.

    Output of locale command:

    $ locale
    LANG=
    LC_COLLATE="C"
    LC_CTYPE="C"
    LC_MESSAGES="C"
    LC_MONETARY="C"
    LC_NUMERIC="C"
    LC_TIME="C"
    LC_ALL=
    
  • pojo
    pojo over 14 years
    Strange. In the original post I was using iTerm, but if I use Terminal.app I get an error (ValueError: unknown locale: UTF-8). The locale looks weird: 'C/UTF-8/C/C/C/C'. Maybe my system is messed up somehow, but it's a fairly fresh install of Snow Leopard.
  • pojo
    pojo over 14 years
    But I don't see the point of having to set LC_ALL explicitly this way to get my application to detect language properly.
  • mmmmmm
    mmmmmm over 14 years
    See my edit for why the change appears. Your system is not messed up (well, no more than all OSX python). Sorry, I should have added this when I edited.
  • pojo
    pojo over 14 years
    I saw your link now, and from what I can gather, "it can't be done" since OSX doesn't make use of LANG or LC_ALL. I was intrigued by the __CF_USER_TEXT_ENCODING variable, but it seems kind of stupid to parse that. IMO getlocale() should call the appropriate APIs and parse that for you, not rely on some environment variables.
  • hmijail
    hmijail over 8 years
    1) The locale vars might be set by the very terminal emulation program, so this is only hiding the problem. 2) The locale vars are already being set correctly, but python is not understanding them.
  • hmijail
    hmijail over 8 years
    @pojo, and you were rather right: this is a python bug, and looks like they might end up using the native locale APIs instead of the environment vars. Alas, still unfixed. bugs.python.org/issue18378
  • hmijail
    hmijail over 7 years
    +1. It is a horrible hack, but it is necessary given that this is a Python bug. (More details at my answer ;P)