What is the point of STL Character Traits?

Character traits are an extremely important component of the streams and strings libraries because they allow the stream/string classes to separate out the logic of what characters are being stored from the logic of what manipulations should be performed on those characters.

To begin with, the default character traits class, char_traits<T>, is used extensively in the C++ standard. For example, there is no class called std::string. Rather, there's a class template std::basic_string that looks like this:

template <typename charT,
          typename traits = char_traits<charT>,
          typename Allocator = allocator<charT> >
    class basic_string;

Then, std::string is defined as

typedef basic_string<char> string;

Similarly, the standard streams are defined as

template <typename charT, typename traits = char_traits<charT> >
    class basic_istream;

typedef basic_istream<char> istream;
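
As a quick sanity check of the typedefs above, the following sketch asks the compiler to confirm that std::string and std::istream really are instantiations of the class templates with the default traits. The helper name string_uses_default_traits is made up for this illustration; everything else is from the standard headers.

```cpp
#include <istream>
#include <string>
#include <type_traits>

// std::string is basic_string<char> with the default traits filled in.
static_assert(std::is_same<std::string,
                           std::basic_string<char, std::char_traits<char>>>::value,
              "std::string should be basic_string<char, char_traits<char>>");

// Likewise for the narrow input stream.
static_assert(std::is_same<std::istream,
                           std::basic_istream<char, std::char_traits<char>>>::value,
              "std::istream should be basic_istream<char, char_traits<char>>");

// Hypothetical helper: every basic_string exposes its traits as traits_type.
bool string_uses_default_traits() {
    return std::is_same<std::string::traits_type, std::char_traits<char>>::value;
}
```

If either static_assert failed, the file would not compile, so a successful build is the result.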

So why are these classes structured as they are? Why should we be using a weird traits class as a template argument?

The reason is that in some cases we might want to have a string just like std::string, but with some slightly different properties. One classic example of this is if you want to store strings in a way that ignores case. For example, I might want to make a string called CaseInsensitiveString such that I can have

CaseInsensitiveString c1 = "HI!", c2 = "hi!";
if (c1 == c2) {  // Always true
    cout << "Strings are equal." << endl;
}

That is, I can have a string type where two strings that differ only in case compare equal.

Now, suppose that the standard library authors designed strings without using traits. This would mean that I'd have in the standard library an immensely powerful string class that was entirely useless in my situation. I couldn't reuse much of the code for this string class, since comparisons would always work against how I wanted them to work. But by using traits, it's actually possible to reuse the code that drives std::string to get a case-insensitive string.

If you pull up a copy of the C++ ISO standard and look at the definition of how the string's comparison operators work, you'll see that they're all defined in terms of the compare function. This function is in turn defined by calling

traits::compare(this->data(), str.data(), rlen)

where str is the string you're comparing to and rlen is the smaller of the two string lengths. This is actually quite interesting, because it means that the definition of compare directly uses the compare function exported by the traits type specified as a template parameter! Consequently, if we define a new traits class, then define compare so that it compares characters case-insensitively, we can build a string class that behaves just like std::string, but treats things case-insensitively!
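
To make that concrete, here is a rough sketch of what a three-way string comparison built on traits::compare looks like. It is not the code of any particular standard library implementation, and the name compare_sketch is made up. It assumes the usual rule that when the common prefix compares equal, the shorter string orders first.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical free function mirroring how basic_string::compare is specified.
template <typename charT, typename traits>
int compare_sketch(const std::basic_string<charT, traits>& lhs,
                   const std::basic_string<charT, traits>& rhs) {
    // rlen is the smaller of the two lengths.
    const std::size_t rlen = std::min(lhs.size(), rhs.size());

    // All character-level decisions are delegated to the traits type.
    const int result = traits::compare(lhs.data(), rhs.data(), rlen);
    if (result != 0) return result;

    // Common prefix is equal: the shorter string sorts first.
    if (lhs.size() < rhs.size()) return -1;
    if (lhs.size() > rhs.size()) return +1;
    return 0;
}
```

Because the only place characters are inspected is the call to traits::compare, swapping the traits type changes the ordering without touching the rest of the function.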

Here's an example. We inherit from std::char_traits<char> to get the default behavior for all the functions we don't write:

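#include <cctype>   // std::tolower
#include <cstddef>  // std::size_t
#include <string>   // std::char_traits, std::basic_string

class CaseInsensitiveTraits: public std::char_traits<char> {
public:
    // std::tolower is only defined for values representable as unsigned char
    // (or EOF), so convert before calling it.
    static int lower (char c) {
        return std::tolower(static_cast<unsigned char>(c));
    }

    static bool lt (char one, char two) {
        return lower(one) < lower(two);
    }

    static bool eq (char one, char two) {
        return lower(one) == lower(two);
    }

    // Lexicographic comparison of the first `length` characters.
    static int compare (const char* one, const char* two, std::size_t length) {
        for (std::size_t i = 0; i < length; ++i) {
            if (lt(one[i], two[i])) return -1;
            if (lt(two[i], one[i])) return +1;
        }
        return 0;
    }
};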

(Notice I've also defined eq and lt here, which compare characters for equality and less-than, respectively, and then defined compare in terms of lt.)

Now that we have this traits class, we can define CaseInsensitiveString trivially as

typedef std::basic_string<char, CaseInsensitiveTraits> CaseInsensitiveString;

And voila! We now have a string that treats everything case-insensitively!
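
Here is a self-contained sketch of using such a string. The names CiTraits, CiString and to_std_string are made up for this illustration so the block compiles on its own. It also overrides find, on the assumption that you want searching as well as comparison to ignore case.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical names (CiTraits, CiString, to_std_string), used only in this sketch.
struct CiTraits : std::char_traits<char> {
    // Convert to unsigned char first: std::tolower is undefined for negative values.
    static int lower(char c) { return std::tolower(static_cast<unsigned char>(c)); }

    static bool eq(char a, char b) { return lower(a) == lower(b); }
    static bool lt(char a, char b) { return lower(a) < lower(b); }

    static int compare(const char* a, const char* b, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(a[i], b[i])) return -1;
            if (lt(b[i], a[i])) return +1;
        }
        return 0;
    }

    // Character search also goes through the traits, so make it ignore case too.
    static const char* find(const char* s, std::size_t n, char c) {
        for (std::size_t i = 0; i < n; ++i) {
            if (eq(s[i], c)) return s + i;
        }
        return nullptr;
    }
};

typedef std::basic_string<char, CiTraits> CiString;

// Copies the characters into an ordinary std::string, e.g. for printing.
std::string to_std_string(const CiString& s) {
    return std::string(s.data(), s.size());
}
```

One practical wrinkle: std::cout is a basic_ostream<char, char_traits<char>>, and the standard operator<< for strings expects the string's traits to match the stream's. That is why the sketch converts to std::string before printing, for example std::cout << to_std_string(s).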

Of course, there are other reasons besides this for using traits. For example, if you want a string over some other fixed-size character type, you can specialize char_traits for that type and then make strings from it. A simpler case needs no new traits at all: in the Windows API, there's a type TCHAR that is either char or wchar_t depending on which macros are defined during preprocessing, and both of those already have char_traits specializations. You can then make strings out of TCHARs by writing

typedef basic_string<TCHAR> tstring;

And now you have a string of TCHARs.
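
Since TCHAR itself is only available with the Windows headers, here is a portable sketch of the same idea. The macro SKETCH_USE_WIDE_CHARS and the names sketch_tchar, SKETCH_TEXT, sketch_tstring and sketch_length are all made up for this illustration and are not part of any real API.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for TCHAR: pick the character type at preprocessing time.
#ifdef SKETCH_USE_WIDE_CHARS
typedef wchar_t sketch_tchar;
#define SKETCH_TEXT(s) L##s   // wide literal
#else
typedef char sketch_tchar;
#define SKETCH_TEXT(s) s      // narrow literal
#endif

// The default traits argument, char_traits<sketch_tchar>, is picked up automatically.
typedef std::basic_string<sketch_tchar> sketch_tstring;

// Returns the length of a small greeting built from the selected character type.
std::size_t sketch_length() {
    sketch_tstring greeting = SKETCH_TEXT("hello");
    return greeting.size();
}
```

Either way the macro is set, the same source compiles, because basic_string only talks to the character type through its traits.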

In all of these examples, notice that we just defined some traits class (or used one that already existed) and passed it as a parameter to a template type in order to get a string with the behavior we wanted. The whole point is that the basic_string author only has to specify how the traits are used; we can then substitute our own traits for the default to get strings with some nuance or quirk that isn't part of the default string type.

Hope this helps!

EDIT: As @phooji pointed out, this notion of traits is not just used by the STL, nor is it specific to C++. As a completely shameless self-promotion, a while back I wrote an implementation of a ternary search tree (a type of radix tree described here) that uses traits to store strings of any character type, compared however the client wants. It might be an interesting read if you want to see an example of where this is used in practice.

EDIT: In response to your claim that std::string doesn't use traits::length, it turns out that it does in a few places. Most notably, when you construct a std::string out of a char* C-style string, the new length of the string is derived by calling traits::length on that string. It seems that traits::length is used mostly to deal with C-style sequences of characters, which are the "least common denominator" of strings in C++, while std::string is used to work with strings of arbitrary contents.
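
One way to observe this yourself is with a traits class that counts calls to length. This is only a sketch: CountingTraits, CountingString and the two helper functions are made-up names, and it assumes your implementation calls traits::length once per construction from a C-style string, as the standard describes.

```cpp
#include <cstddef>
#include <string>

// Hypothetical traits that record how often length() is called.
struct CountingTraits : std::char_traits<char> {
    static int& length_calls() {
        static int calls = 0;
        return calls;
    }

    static std::size_t length(const char* s) {
        ++length_calls();
        return std::char_traits<char>::length(s);  // scan for the terminating '\0'
    }
};

typedef std::basic_string<char, CountingTraits> CountingString;

// Constructing from a bare pointer has to ask the traits where the string ends.
std::size_t size_from_c_string(const char* text) {
    CountingString s(text);
    return s.size();
}

// With an explicit count, the string may hold any contents, including '\0'.
std::size_t size_from_pointer_and_count(const char* text, std::size_t n) {
    CountingString s(text, n);
    return s.size();
}
```

Passing "ab\0cd" to the first helper stops at the embedded null character, while passing the same literal with a count of 5 to the second keeps all five characters. That is the difference between C-style sequences and strings of arbitrary contents.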

Author by

Matthew Smith

Embedded C++/C Linux programmer. Currently also working with Python and Ember (Javascript based web application framework). Working at a networking devices company that specialises in out-of-band management and recovery.

Updated on June 05, 2022

Comments

  • Matthew Smith
    Matthew Smith almost 2 years

    I notice that in my copy of the SGI STL reference, there is a page about Character Traits but I can't see how these are used? Do they replace the string.h functions? They don't seem to be used by std::string, e.g. the length() method on std::string doesn't make use of the Character Traits length() method. Why do Character Traits exist and are they ever used in practice?

  • phooji
    phooji about 13 years
    Seems like you have done justice to your username :) Perhaps also relevant: many of the boost libraries use concepts and type trait classes, so it isn't just the standard library. Furthermore, similar techniques are used in other languages without the use of templates, see esoteric example: ocaml.janestreet.com/?q=node/11 .
  • Mike DeSimone
    Mike DeSimone about 13 years
    IIRC, this is a good example of policy-based template design, where the traits class is the policy passed to the character-using class.
  • Matthieu M.
    Matthieu M. about 13 years
    nice structure (Ternary Search Tree), however I'd point out that Tries can be "compacted" in various ways: 1/ using ranges of characters to point to a child, rather than single characters (the gain is obvious), 2/ path compression (Patricia Trees) and 3/ buckets at the end of branches (ie, just use a sorted array of strings as long as there are less than K). Combining those (I combined 1 and 3) drastically reduce the memory consumption without impacting the speed performance by more than a constant factor (and in fact, the buckets decrease the number of jumps).
  • dan04
    dan04 about 13 years
    What advantage does a CaseInsensitiveString class have over a CaseInsensitiveCompare function?
  • Xeo
    Xeo over 12 years
    @dan04: Try to get any standard class / algorithm to use your function.
  • Linuxios
    Linuxios almost 12 years
    Wow. + (9.999 * 10^100,000,000,000)
  • user541686
    user541686 almost 12 years
    Aren't char traits a property of the comparison rather than the data? What does it mean for me to have a "case-insensitive" string? What if I want to compare strings with different char traits?
  • templatetypedef
    templatetypedef almost 12 years
    @Mehrdad- The character traits do have some impact on the data; for example, it encodes an integer type wide enough to hold a character plus an EOF sentinel. But you are mostly right that traits encode comparisons. To the best of my knowledge, it's not meaningful or well-defined to compare strings with different traits.
  • Virus721
    Virus721 almost 10 years
    So... to put this in a nutshell, traits are just some kind of interface used by the basic_string class to manipulate various types of characters regardless of what they truly are, right?
  • dom_beau
    dom_beau over 8 years
    @Virus721 Aren't traits rather implementations "plugged into" a given class? What a class receives from a trait is, IMHO, an implementation, not an interface. The trait has its interface, of course.
  • Victor Sergienko
    Victor Sergienko over 7 years
    @Xeo std::map has a Compare template argument. Don't forget std::find_if or std::binary_search too.
  • Justin Time - Reinstate Monica
    Justin Time - Reinstate Monica over 4 years
    It's more that trait types like this are replaceable logic modules than interfaces, @Virus721.
  • milanHrabos
    milanHrabos over 2 years
    So is inheriting from std::char_traits<char> and redefining the compare function an overriding mechanism? Or else, if I don't use my own traits, how does the string get its own compare function?