WideCharToMultiByte() vs. wcstombs()


Solution 1

In a nutshell: the WideCharToMultiByte function exposes the encodings/code pages used for the conversion in the parameter list, while wcstombs does not. This is a major PITA, as the standard does not define what encoding is to be used to produce the wchar_t, while you as a developer certainly need to know what encoding you are converting to/from.

Apart from that, WideCharToMultiByte is of course a Windows API function and is not available on any other platform.

Therefore I would suggest using WideCharToMultiByte without a moment's thought if your application is not specifically written to be portable to non-Windows OSes. Otherwise, you might want to wrestle with wcstombs or (preferably IMHO) look into using a full-featured portable Unicode library such as ICU.
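To make the point about the hidden encoding concrete, here is a minimal sketch of a wcstombs-based conversion. The helper name narrow_with_current_locale is made up for illustration. Note that nothing in the call names the target encoding: it comes from whatever the current C locale (LC_CTYPE) happens to be, and the conversion fails if a character cannot be represented in it.

```cpp
#include <clocale>
#include <cstdlib>
#include <string>

// Hypothetical helper: narrows `in` using the encoding selected by the
// current C locale. Returns false if some character has no representation
// in that encoding. Conversion stops at the first embedded L'\0'.
bool narrow_with_current_locale(const std::wstring& in, std::string& out) {
    // Worst case: MB_CUR_MAX bytes per wide character, plus the terminator.
    std::string buf(in.size() * MB_CUR_MAX + 1, '\0');
    std::size_t n = std::wcstombs(&buf[0], in.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1)) {
        return false;  // unrepresentable character in the current locale
    }
    buf.resize(n);  // n excludes the terminating null byte
    out = buf;
    return true;
}
```

In the default "C" locale this typically handles plain ASCII but rejects a character such as L'\u03A9'; after a call like setlocale(LC_ALL, "") the same input may succeed, depending on the user's environment.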

Solution 2

  • WideCharToMultiByte is a Windows API function that converts UTF-16 text, stored in WCHAR, to one of the Windows-defined multibyte code pages, stored in CHAR (MultiByteToWideChar does the reverse). The code page to use is passed as the first parameter; passing CP_ACP selects the code page of the system's current locale, set in the Control Panel localization tool under "Language to use for Non Unicode Programs". It is accessed by #including <windows.h>, and is available only on Windows.

  • wcstombs is a Standard C Runtime function that converts from the C runtime's current wchar_t* encoding to its current char* encoding. setlocale can be used to set the code page(s) to use.

  • std::codecvt is the C++ Standard Library class template, in <locale>, used for converting strings between various encodings, using a variety of traits-type mechanisms to define the source and destination encodings.

There are other libraries, including iconv and ICU, that also do various Unicode <-> multibyte conversions.
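As a rough sketch of the std::codecvt route, the helper below (narrow_with_codecvt is an illustrative name, not a standard function) pulls the wchar_t-to-char facet out of a std::locale and calls its out() member. Which encoding you get still depends on the locale you pass in.

```cpp
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: narrows `in` with the codecvt facet of `loc`.
// Throws std::runtime_error if the facet cannot convert the input.
std::string narrow_with_codecvt(const std::wstring& in, const std::locale& loc) {
    using Facet = std::codecvt<wchar_t, char, std::mbstate_t>;
    const Facet& cvt = std::use_facet<Facet>(loc);

    std::mbstate_t state{};  // initial conversion state
    // Worst case: max_length() bytes per wide character.
    std::string out(in.size() * cvt.max_length() + 1, '\0');
    const wchar_t* from_next = nullptr;
    char* to_next = nullptr;

    std::codecvt_base::result r = cvt.out(
        state,
        in.data(), in.data() + in.size(), from_next,
        &out[0], &out[0] + out.size(), to_next);
    if (r != std::codecvt_base::ok) {
        throw std::runtime_error("wide-to-narrow conversion failed");
    }
    out.resize(static_cast<std::size_t>(to_next - &out[0]));
    return out;
}
```

Passing std::locale::classic() gives the "C" locale behaviour; passing std::locale("") would use the user's environment instead.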

Solution 3

Like with any other function: use the function that does what you need in your program.

WideCharToMultiByte converts from UTF-16 (used as the Win32 WCHAR representation) to a Win32 code page of your choice.

wcstombs converts from the implementation-defined internal wchar_t representation to the current implementation-defined internal multibyte representation.

So if your program is a native Win32 program that uses lots of Win32 API functions that take and return WCHAR strings, then you need WideCharToMultiByte. If you write functions based on the standard library (not the Win32 API) that work with standard C wchar_t strings, then you need wcstombs.
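The contrast can be sketched in one function. The name wide_to_narrow and the choice of CP_UTF8 are assumptions made for illustration: on Windows the target code page is spelled out in the call, while the non-Windows branch falls back to wcstombs and therefore to the current C locale.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>
#else
#include <cstdlib>
#endif

// Hypothetical helper. Returns an empty string on failure.
std::string wide_to_narrow(const std::wstring& in) {
    if (in.empty()) return std::string();
#ifdef _WIN32
    // First call: ask how many bytes the UTF-8 result needs.
    int len = WideCharToMultiByte(CP_UTF8, 0, in.data(),
                                  static_cast<int>(in.size()),
                                  nullptr, 0, nullptr, nullptr);
    if (len <= 0) return std::string();
    std::string out(static_cast<std::size_t>(len), '\0');
    // Second call: perform the conversion into the sized buffer.
    WideCharToMultiByte(CP_UTF8, 0, in.data(),
                        static_cast<int>(in.size()),
                        &out[0], len, nullptr, nullptr);
    return out;
#else
    // Portable branch: the encoding is whatever the C locale selects.
    std::string out(in.size() * MB_CUR_MAX + 1, '\0');
    std::size_t n = std::wcstombs(&out[0], in.c_str(), out.size());
    if (n == static_cast<std::size_t>(-1)) return std::string();
    out.resize(n);
    return out;
#endif
}
```

The two-call pattern (size query, then conversion) is a common way to use WideCharToMultiByte; the last two arguments are left null because a default character is not allowed with CP_UTF8.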

Solution 4

The main difference is that wcstombs is a standard function, so use that if code needs to run on any platform other than Windows.

Solution 5

wcstombs() is portable, whereas the WideCharToMultiByte() function is win32 only.

When it comes down to it, wcstombs() calls a system-specific function, which on Win32 will most likely be a straight call to WideCharToMultiByte() - however, it might bypass this function completely and just go straight to the internals.
In any case, when both target the same encoding there's no practical difference in the result.


Author: Greenhorn


Updated on April 12, 2020

Comments

  • Greenhorn
    Greenhorn about 4 years

    What is the difference between WideCharToMultiByte() and wcstombs()? When should each one be used?

  • Serge Dundich
    Serge Dundich about 13 years
    "standard does not define what encoding is to be used to produce the wchar_t, while you as a developer certainly need to know what encoding you are converting to/from". It depends on what you are after. WideCharToMultiByte converts from UTF-16 to a Win32 code page of your choice. wcstombs converts from the implementation-defined internal wchar_t representation to the current implementation-defined internal multibyte representation. The developer does not necessarily need to know the implementation-defined encodings.
  • Waihon Yew
    Waihon Yew about 13 years
    @SergeDundich: If you are just passing strings between C library functions then no, it isn't necessary to know the encodings used. In practice, however, you do this to interoperate with external entities (e.g. in the simplest case read/write on a stream). And the external entity certainly does care what encoding you feed it.
  • Serge Dundich
    Serge Dundich about 13 years
    "In practice, however, you do this to interoperate with external entities" Or to convert strings between the input/output of wchar_t-based and char-based functions. "external entity certainly does care what encoding you feed it" True. But sometimes the external entity expects, e.g., a multibyte string represented in the implementation-defined standard way (which may even happen to be user-configurable).
  • Waihon Yew
    Waihon Yew about 13 years
    @SergeDundich: I beg to disagree. How is it possible for the external entity to expect a string encoded in an "implementation-defined way", when no one (including that entity) knows what "implementation-defined" means?
  • Chris Becke
    Chris Becke about 13 years
    The question already had a selected answer; I just thought that someone should perhaps mention (given the question was tagged C++, not C) that C++ has a solution as well.
  • Serge Dundich
    Serge Dundich about 13 years
    "no one (including that entity) knows what "implementation-defined" means" This is not true. The term "implementation-defined" is not the same as "undefined". "Implementation-defined" means clearly defined and documented by the implementation.
