How can I manually determine the CodePage and Locale of the current OS

52,180

Solution 1

chcp will get you the active code page of the current console window (by default, the OEM code page).

systeminfo will display system locale and input locale, among other things.

"Note: This command (systeminfo) is not available in Windows 2000 but you can still query Windows 2000 computer by running this command on Windows XP or Windows 2003 computer and set remote computer to Windows 2000 computer. If the current user logon that execute this command already has privilege on remote machine (for instance, Domain Administrators), you don’t have to use /u and /p."
From here.

Solution 2

Note that a given system has two active code pages of interest, as determined by the legacy setting named language for non-Unicode programs, formerly known as system locale (see the bottom section for background information):

  • the OEM code page for use by legacy console applications,
  • the ANSI code page for use by legacy GUI applications.

Note: There are two more code pages, but they are rarely used anymore and therefore not discussed here: the EBCDIC code page and the (pre-OS X) Mac code page - see the WinAPI docs.

The active OEM code page is most easily obtained via chcp, as shown in Forgotten Semicolon's helpful answer - assuming the console window wasn't configured with a custom code page via the registry and that the code page wasn't explicitly changed in the session with chcp <codePageNum>.

Determining the active ANSI code page is not as simple, but PowerShell can help, also with determining the name and language of the system locale:

In Windows 8+ / Windows Server 2012+: Use the Get-WinSystemLocale cmdlet:

Get-WinSystemLocale | Select-Object Name, DisplayName, 
                        @{ n='OEMCP'; e={ $_.TextInfo.OemCodePage } }, 
                        @{ n='ACP';   e={ $_.TextInfo.AnsiCodePage } }

Caveat: The information returned does not reflect a potential UTF-8 override that may be in place via a new Windows 10 feature (see this SO answer); instead, the information always reflects the code pages originally associated with the active system locale. If you do need to know whether the UTF-8 override is in effect, see the registry-based method below.

On a US-English system, the above yields:

Name  DisplayName             OEMCP  ACP
----  -----------             -----  ---
en-US English (United States)   437 1252

OEMCP is the OEM code page, ACP the ANSI code page.

A registry-based method that also works on older systems down to Windows XP:

# Get the code pages:
Get-ItemProperty HKLM:\SYSTEM\CurrentControlSet\Control\Nls\CodePage | 
     Select-Object OEMCP, ACP

On a US-English system, the above yields:

OEMCP ACP 
----- --- 
437   1252
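
Outside PowerShell, the same two values can also be read through the GetOEMCP() and GetACP() WinAPI functions. What follows is only a minimal Python sketch, assuming a standard CPython install with ctypes; the helper name get_system_code_pages is made up for illustration, and it simply returns None when not run on Windows:

```python
import ctypes
import sys


def get_system_code_pages():
    """Return (OEM, ANSI) code page numbers, or None when not on Windows."""
    if sys.platform != "win32":
        # The WinAPI is unavailable here, so there is nothing to query.
        return None
    kernel32 = ctypes.windll.kernel32
    # GetOEMCP() and GetACP() take no arguments and return the code page IDs.
    return kernel32.GetOEMCP(), kernel32.GetACP()


print(get_system_code_pages())
```

On a Windows machine this prints a pair of integers, OEM first and ANSI second, in the same order as the registry output above.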

If you also want to get the system locale's [friendly] name and LCID (though note that LCIDs are deprecated):

[Globalization.CultureInfo]::GetCultureInfo([int] ('0x' + (
        Get-ItemProperty 'HKLM:\SYSTEM\CurrentControlSet\Control\Nls\Language' Default
      ).Default)
)

On a US-English system, the above yields:

LCID  Name   DisplayName
----  ----   -----------
1033  en-US  English (United States)
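
The '0x' + ... part of the command is needed because the registry stores the LCID as a hexadecimal string. As a small illustration of that conversion step only, here is a Python sketch; the function name is hypothetical and the input "0409" is the value assumed for a US-English system:

```python
def lcid_from_registry_default(value: str) -> int:
    """Convert the hex string stored in the registry into a numeric LCID."""
    return int(value, 16)


# "0409" is the value assumed here for a US-English system.
lcid = lcid_from_registry_default("0409")
print(lcid)  # 1033
```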

Background information:

System locale is the legacy name for what is now more descriptively called language for non-Unicode programs (see NLS terminology), and, as the names suggest:

  • The setting applies only to legacy programs (programs that don't support Unicode).

  • It applies system-wide, irrespective of a given user's locale settings, and administrative privileges are required to change it.

It is important to note that this is a legacy setting, because code pages no longer apply to programs that use Unicode internally and call the Unicode versions of the Windows API.

Notably, it determines the active code pages, i.e., the character encoding used by default:

  • the ANSI code page to use when non-Unicode programs call the non-Unicode (ANSI) versions of the Windows API, such as the ANSI version of the TextOut function; Windows uses this code page to translate the program's strings to and from Unicode, which determines how they render in the GUI.

  • the OEM code page to make active by default in console windows, as reflected by chcp.

    • A console window's active code page determines how keyboard input and output from console applications are interpreted and displayed.
      • This means that even output from Unicode console applications is translated to the active code page, which can result in loss of information; using pseudo code page 65001, which represents the UTF-8 encoding of Unicode, is a solution, but it can cause legacy command-line programs to misinterpret data and even to fail - see this StackOverflow answer for details.
    • Unlike the ANSI code page, you can change the active [OEM] code page on demand for a given console window; e.g., to switch to OEM code page 850, run chcp 850 in cmd.exe, and $OutputEncoding = [console]::InputEncoding = [console]::OutputEncoding = [text.encoding]::GetEncoding(850) in PowerShell.
  • additionally, the now rarely used EBCDIC and Mac code pages.
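
To make the difference between the ANSI and OEM code pages concrete, here is a small Python sketch. It uses Python's built-in cp1252 and cp437 codecs as stand-ins for the ANSI and OEM code pages of a US-English system; it does not query Windows itself, and the variable names are only illustrative:

```python
text = "é"

# The same character is stored as a different byte in each code page.
ansi_bytes = text.encode("cp1252")  # ANSI code page 1252
oem_bytes = text.encode("cp437")    # OEM code page 437
print(ansi_bytes, oem_bytes)        # b'\xe9' b'\x82'

# Bytes written under one code page are misread under the other.
misread = ansi_bytes.decode("cp437")
print(misread)                      # Θ

# Characters missing from the target code page are lost in translation.
lossy = "€".encode("cp437", errors="replace")
print(lossy)                        # b'?'

# UTF-8 (pseudo code page 65001) can represent the character without loss.
print("€".encode("utf-8").decode("utf-8"))  # €
```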

Despite the word locale used in the legacy term and the word language in the current term:

  • The only aspects controlled by the setting are the set of active code pages and the default bitmap fonts, not the other elements of a locale (which are controlled by the user-level locale settings).

  • A given code page is typically shared by many locales and covers multiple languages; e.g., the widely used 1252 code page is used by many Western European languages, including English.
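
As a rough illustration of one code page covering several languages, the following Python sketch checks which sample words can be encoded with Python's cp1252 codec. The sample words and the helper name encodable are chosen purely for this example:

```python
def encodable(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


samples = {
    "English": "naïve café",
    "French": "garçon",
    "German": "Straße",
    "Spanish": "señor",
    "Russian": "привет",  # Cyrillic is not part of code page 1252
}

for language, word in samples.items():
    print(language, encodable(word, "cp1252"))
```

The Western European samples all fit into code page 1252, while the Russian one does not.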

However, when you do change the setting via the Control Panel, you do pick the setting by way of a specific locale.

For a list of all Windows code pages, see https://docs.microsoft.com/en-us/windows/desktop/Intl/code-page-identifiers

Solution 3

The locale can also be seen in msinfo32.


epotter
Author by

epotter

I am the Technical Strategist for Aptera, Inc.

Updated on September 17, 2022

Comments

  • epotter
    epotter almost 2 years

    Is there a way to have a user manually look up the current code page and locale of their Windows OS? Is there a registry setting that stores that information?

    It would also be useful if the technique worked all the way back to Windows 2000.

  • kangalioo
    kangalioo over 5 years
    Be aware that chcp will get you the active OEM code page. As mklement states in his answer, there is always another active code page in use by Windows, the ANSI code page. For more information see mklement's answer.
  • Arioch 'The
    Arioch 'The about 5 years
    GetACP() function - technet.microsoft.com/en-us/dd318070 - that is interesting link, the remark section outright tells this function return value does NOT represent user's selected default input language and GUI language but something entirely different...
  • mklement
    mklement about 5 years
    Indeed, @Arioch'The - that is what I tried to clarify in the background-information section: the system locale (a) determines the code pages (but no other locale settings) system-wide, (b) irrespective of a given user's locale. Note how the linked page states (emphasis added): "Returns the current Windows ANSI code page (ACP) identifier for the operating system". As for the potential AppLocale 3rd-party replacement: I've added a link to the answer.
  • Arioch 'The
    Arioch 'The about 5 years
    Actually, there are per-process (implicitly per-user) functions: GetConsoleCP and GetConsoleOutputCP msdn.microsoft.com/en-us/windows/desktop/ms683162 - but they have their own can of worms: first they probably are bound to OEM cps not ANSI CPs even half of the Windows CLI utils forgets this (another half remembers though :-/ guess which is which... ) and the fact they are two :-D // That really is a mess working in command line for localized Windows editions...
  • Arioch 'The
    Arioch 'The about 5 years
    That GetACP remark/link is I think important as a "word of god" confirmation that MBCS-to-Unicode default conversion is intended to be user-independent and OS-global, not just implementation detail in some of Windows versions.
  • Arioch 'The
    Arioch 'The about 5 years
    active OEM codepage (not ANSI codepage) and only if it was overridden by the user (console chcp command)
  • Arioch 'The
    Arioch 'The about 5 years
    That was because TCP-related utilities of Windows were... inconsistent. I tried chcp into OEM, into ANSI, into UTF-8 - at every setting some utilities started giving meaningful output, but others ceased. I even tried to force them into English by chcp, again some reacted some not. There was no common behavior...
  • Arioch 'The
    Arioch 'The about 5 years
    Small upd. For few days I was looking for GetMACCP() function to accompany GetACP() and GetOEMCP() - and could find no traces of. Seems it was false memory and such a function never existed. However there exist special constants, "virtual" codepages, like CP_ACP=0; CP_OEMCP=2; CP_MACCP=3; - and no similar constant for EBCDIC. There is also LOCALE_IDEFAULTMACCODEPAGE together with LOCALE_IDEFAULT_ANSICODEPAGE and LOCALE_IDEFAULT_CODEPAGE (this last is for OEM codepage despite way too vague and generic name), but again no EBCDIC peer there. Probably just the historic artifact.
  • Arioch 'The
    Arioch 'The about 5 years
    Probably today both pre-UNIX MAC and EBCDIC equally belong in the "only of some historic importance" niche. I, however, am somewhat attached to that MAC CP, because they managed to make yet another variant of marking new lines in plain text files, different from both the UNIX and DOS-Win-OS/2 trees. It was an exotic corner case I memorized.
  • mklement
    mklement about 5 years
    Thanks, @Arioch'The. Re EBCDIC: There is a LOCALE_IDEFAULTEBCDICCODEPAGE locale-info lookup constant - see docs.microsoft.com/en-us/windows/desktop/Intl/….
  • Arioch 'The
    Arioch 'The about 5 years
    Thanks. More topical link - docs.microsoft.com/en-us/windows/desktop/Intl/… - and EBCDIC is marked "Windows 2000" - so before w2k it probably did not exist, and for all the years since then no one bothered to update the headers conversion sources that I used :-D