std::wstring length


Solution 1

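std::wstring::size() returns the number of wchar_t elements (code units) in the string. As you correctly noticed, this is not necessarily the number of characters: where wchar_t is 16 bits and holds UTF-16 (as on Windows), a character outside the Basic Multilingual Plane, such as the G clef U+1D11E, occupies two elements (a surrogate pair).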

Unfortunately, the std::basic_string template (and thus its instantiations, such as std::string and std::wstring) is encoding-agnostic. In this sense, it is just a template for a sequence of fixed-size elements (code units), not a string of characters.
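To make the difference between elements and characters concrete, here is a minimal sketch that counts code points rather than elements. The helper name count_code_points_utf16 is hypothetical, and the sketch assumes the wstring holds well-formed UTF-16 (the usual case on Windows); it does not validate the input.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a wide string,
// ASSUMING its elements are well-formed UTF-16 code units.
std::size_t count_code_points_utf16(const std::wstring& s) {
    std::size_t count = 0;
    for (wchar_t unit : s) {
        const unsigned long u = static_cast<unsigned long>(unit);
        // A low (trailing) surrogate, 0xDC00..0xDFFF, only completes the
        // code point started by the preceding high surrogate, so skip it.
        if (u < 0xDC00 || u > 0xDFFF) {
            ++count;
        }
    }
    return count;
}
```

For the G clef string from the question (0xD834, 0xDD1E), size() is 2 while this helper returns 1.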

Solution 2

First, std::wstring is an instantiation of std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >.
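If you want the compiler to confirm this, a small compile-time check along these lines should work. The constant name is made up for illustration.

```cpp
#include <memory>
#include <string>
#include <type_traits>

// Hypothetical constant: true when std::wstring is exactly the
// basic_string instantiation spelled out above.
constexpr bool wstring_is_basic_string_of_wchar_t =
    std::is_same<std::wstring,
                 std::basic_string<wchar_t,
                                   std::char_traits<wchar_t>,
                                   std::allocator<wchar_t>>>::value;

static_assert(wstring_is_basic_string_of_wchar_t,
              "std::wstring should be basic_string<wchar_t, ...>");
```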

Most of the real work is done by char_traits. You can write your own traits class, but the mechanism exists primarily so that the C runtime library can be used with different character sizes.

When you construct from a pointer alone (const Element*), the string reads elements until it reaches the one that char_traits designates as the terminator.
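As a rough illustration, std::char_traits<wchar_t>::length is the function that finds that terminator (the value-initialized element, L'\0'). The wrapper name length_until_terminator is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical wrapper: number of elements before the terminator, as
// char_traits sees it. The terminator is wchar_t(), i.e. L'\0'.
std::size_t length_until_terminator(const wchar_t* p) {
    return std::char_traits<wchar_t>::length(p);
}
```

A wstring constructed from the same pointer reports the same value from size().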

However, you can also construct with a pointer and a length, in which case the string reads exactly the number of elements you tell it to, including any null elements. A basic_string may therefore contain embedded nulls, and length() or size(), which are aliases for the same thing, will tell you how many elements it contains.
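A minimal sketch of the pointer-and-length form follows. The helper name from_pointer_and_length is hypothetical; it only forwards to the standard (pointer, count) constructor.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies exactly n elements starting at p,
// whether or not some of them are null.
std::wstring from_pointer_and_length(const wchar_t* p, std::size_t n) {
    return std::wstring(p, n);
}
```

Given a buffer holding L'a', L'\0', L'b', calling from_pointer_and_length(buffer, 3) yields a string whose size() is 3, whereas std::wstring(buffer) stops at the first null and has size() 1.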

There is no magic in char_traits to decode multi-element characters as one, nor should you try to implement it that way.

Author by

coding1223322

Updated on January 13, 2020

Comments

  • coding1223322
    coding1223322 over 4 years

    What is the result of std::wstring.length() function, the length in wchar_t(s) or the length in symbols? And why?

    TCHAR r2[3];
    r2[0] = 0xD834;  // D834, DD1E - musical G clef
    r2[1] = 0xDD1E;  //
    r2[2] = 0x0000;  // '\0'
    
    std::wstring r = r2;
    
    std::cout << "capacity: " << r.capacity() << std::endl;
    std::cout << "length: "   << r.length()   << std::endl;
    std::cout << "size: "     << r.size()     << std::endl;
    std::cout << "max_size: " << r.max_size() << std::endl;
    
    Output>
    
    capacity: 351
    length: 2
    size: 2
    max_size: 2147483646
    
  • Sanja Melnichuk
    Sanja Melnichuk over 13 years
    size_type string::capacity() const: returns the number of characters a string can hold without reallocation.
  • CashCow
    CashCow over 13 years
    and why exactly was this answer marked down? I gave useful information about exactly what a wstring is and how it gets constructed from a pointer.
  • Eric M
    Eric M almost 5 years
    I for one think it is an excellent answer, particularly your explanation of the use of char_traits' terminator character.