Benefits of vector<char> over string?

15,342

Solution 1

Yes, vector<char> does have at least one capability that string lacks.

Unlike string, vector<char> is guaranteed to preserve iterators, references, etc. during a swap operation. See: May std::vector make use of small buffer optimization?
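To make the swap guarantee concrete, here is a minimal sketch. The function names are invented for illustration. The string half only reports what the local implementation happens to do, because the standard allows string::swap to invalidate iterators and pointers.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns true if an iterator taken before swap still refers to the same
// element (now owned by the other vector) afterwards.
bool vector_iterator_survives_swap() {
    std::vector<char> a{'a', 'b', 'c'};
    std::vector<char> b{'x', 'y'};
    assert(!a.empty());

    auto it = a.begin();            // refers to 'a' in a's buffer
    const char* before = &*it;

    a.swap(b);                      // the buffers change owners; elements are not moved

    // After the swap the iterator is still valid and refers into b.
    return it == b.begin() && &*it == before && *it == 'a';
}

// Reports whether the characters of a short string ended up at a different
// address after swap. The answer depends on the implementation (for example,
// on whether it stores short strings inside the string object itself).
bool string_data_moved_on_swap() {
    std::string s = "abc";
    std::string t = "xy";
    const void* before = s.data();

    s.swap(t);                      // t now holds "abc"

    // Only addresses are compared; the old pointer is never dereferenced.
    return static_cast<const void*>(t.data()) != before;
}
```

If string_data_moved_on_swap() returns true on your platform, any pointer or iterator into the string taken before the swap would have been left dangling, which is exactly what vector rules out.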

Solution 2

Aside from readability (which should not be underestimated) I can think of a couple of minor performance/memory issues with using std::string over std::vector:

  • Some modern std::string implementations use the small string optimization. If you are storing data that's larger than the string's internal buffer, it becomes a pessimization, reducing the efficiency of copying, moving, and swap [1] and increasing the sizeof() for no benefit.

  • An efficient std::string implementation will always allocate at least 1 more byte than the current size for storing a terminating null (not doing so requires extra logic in operator[] to cope with str[size()]).

I should stress that both of these issues are very minor; the performance cost of them will more than likely be lost in the background noise. But you did ask.


[1] Those operations require branching on size() if the small string optimization is being used, whereas they don't in a good std::vector implementation.
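Both points can be checked on a given platform with a small sketch like the one below. The helper names and the Footprint struct are hypothetical, and the sizes it reports vary between standard library implementations, so treat the numbers as something to measure rather than a fixed result.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Since C++11 a std::string must expose a '\0' at index size(), so its
// buffer has to hold at least size() + 1 characters.
bool string_has_terminator(const std::string& s) {
    return s[s.size()] == '\0' && s.c_str()[s.size()] == '\0';
}

// Size of the container objects themselves, excluding any heap storage.
struct Footprint {
    std::size_t string_bytes;
    std::size_t vector_bytes;
};

Footprint object_footprint() {
    return Footprint{sizeof(std::string), sizeof(std::vector<char>)};
}

// A vector<char> has no terminator: it needs room only for its elements.
bool vector_fits_exactly(std::size_t n) {
    std::vector<char> v(n, 'x');
    assert(v.size() == n);
    return v.capacity() >= n;
}
```

Comparing object_footprint().string_bytes with vector_bytes shows whether the string object is larger on your implementation, which is where the small string optimization usually shows up.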

Solution 3

Beyond readability, and ensuring another maintainer does not confuse the purpose of the std::string, there is not a lot of difference in function. You could of course consider char*/malloc as well, if efficiency is the only consideration.

One potential issue I can think of:

std::string is std::basic_string<char>, so it always stores char. If you later needed to handle another type (e.g. unsigned short) you might need to either:

  • Create your own typedef std::basic_string<unsigned short> (which moves you away from normal std::string handling)
  • Tentatively apply some reinterpret_cast logic in a setter.

With a vector you could simply change the container to a std::vector<unsigned short>.
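As a rough illustration of that last point, a buffer written against std::vector<T> needs nothing beyond the element type to change. Buffer and make_filled are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical alias: the element type is the only thing that varies.
template <typename T>
using Buffer = std::vector<T>;

// Builds a buffer of `count` copies of `value`. This works for any copyable
// element type and needs no character traits.
template <typename T>
Buffer<T> make_filled(std::size_t count, T value) {
    Buffer<T> buf(count, value);
    assert(buf.size() == count);
    return buf;
}
```

Switching from make_filled<char> to make_filled<std::uint16_t> is the whole migration here, whereas the basic_string route depends on a char_traits specialization for the new type being available.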

Author: user541686

Updated on June 18, 2022

Comments

  • user541686
    user541686 almost 2 years

    This question is related to, but not quite the same as, this question.

    Are there any benefits to using std::vector<char> instead of std::string to hold arbitrary binary data, aside from readability-related issues?

    i.e. Are there any tasks which are easier/more efficient/better to perform with a vector compared to a string?

  • user541686
    user541686 almost 12 years
    Would you mind expanding on the last paragraph? What is the 'disaster'?
  • user541686
    user541686 almost 12 years
    Could you expand on the last part? What is the disadvantage of using std::basic_string<unsigned short> compared to std::vector<unsigned short>?
  • user541686
    user541686 almost 12 years
    Very interesting point about small strings, though I'm not yet convinced it's a disadvantage. :) Still, a great answer, thanks! +1
  • go4sri
    go4sri almost 12 years
    Consider an example: you have binary data which has multiple nul characters. If a user calls .length(), he would get some answer - which will in all probability be wrong, and he will never be alerted to the fact that it is binary data and not a string.
  • user541686
    user541686 almost 12 years
    Why is that wrong? It seems like you're saying it would work correctly, except that it might be unreadable (i.e. misleading). That's fine, but that wasn't the point of my question -- I specifically said issues except readability.
  • tinman
    tinman almost 12 years
    @go4sri: Calling length() on a string with nul characters should give you the correct length. The problems arise when users start using c_str() and then wonder why their strings are truncated.
  • go4sri
    go4sri almost 12 years
    @Mehrdad - I do not think this comes under readability, but if you are not concerned with this kind of error, then you can skip it.
  • Bo Persson
    Bo Persson almost 12 years
    One disadvantage is that it might not compile. :-) std::char_traits<unsigned short> is not required by the standard.
  • PlasmaHH
    PlasmaHH almost 12 years
    Where did you get the figures that say most implementations use small strings? It seems to me that libstdc++ does not use it, and in almost every project I have been involved in over the last decade, I have been using libstdc++ ...
  • JoeG
    JoeG almost 12 years
    @PlasmaHH: I've changed it to 'some'.
  • seanhodges
    seanhodges almost 12 years
    @Mehrdad your issues would be mainly portability to other platforms and compatibility with other libraries. You are no longer using a traditional std::string, since the standard defines only char and wchar_t as valid char_traits. Using something else could lead to undefined behaviour if you run a string operation on the contents.
  • Remus Rusanu
    Remus Rusanu almost 12 years
    Many times you need to pass a null-terminated string to legacy APIs. string has string::c_str(), but vector does not. This is also why you need the extra space.
  • user541686
    user541686 about 5 years
    The question asks for benefits of vector<char> over string, not the other way around... weird to see you just quote other answers on that aspect and then post your own response for the reverse direction
  • Matthew D. Scholefield
    Matthew D. Scholefield about 5 years
    Hmm, perhaps this would be better suited for a different question. The reason for my answer was that this was the first result on Google for "vector<char> versus string" so I thought I'd add an answer bringing up something not mentioned.
  • user541686
    user541686 about 5 years
    Oh I see. Yeah, it's unfortunate, since I already had a laundry list of why I'd use string over vector<char>, so that's specifically not the question I needed answered.
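
The exchange above between go4sri and tinman about embedded nul characters can be sketched as follows. The helper names are made up for this example; it only illustrates that size() counts every byte while code that goes through c_str() and strlen stops at the first nul.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Builds a string holding binary data with embedded nul bytes.
std::string make_binary() {
    const char raw[] = {'a', '\0', 'b', '\0', 'c'};
    std::string s(raw, sizeof raw);   // the length-aware constructor keeps the nuls
    assert(s.size() == sizeof raw);
    return s;
}

// What a C-style consumer sees: it stops at the first nul byte.
std::size_t length_via_c_str(const std::string& s) {
    return std::strlen(s.c_str());
}
```

Here make_binary().size() is 5, while length_via_c_str(make_binary()) is 1, which is the truncation tinman describes.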