Resizing a C++ std::vector<char> without initializing data

Solution 1

vector<char> buf;
buf.reserve(N);
int M = read(fd, &buf[0], N);

This code fragment invokes undefined behavior. You can't write beyond size() elements, even if you have reserved the space.

The correct code is like:

vector<char> buf;
buf.resize(N);
int M = read(fd, &buf[0], N);
buf.resize(M);
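
The fragment above does not check the return value of read(), which is -1 on failure. A minimal sketch of the same resize/read/resize pattern with that check added follows; the helper name read_some and the choice to return an empty vector on error are assumptions for illustration, not part of the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#include <sys/types.h>
#include <unistd.h>   // POSIX read()

// Hypothetical helper: reads up to n bytes from fd into a fresh vector.
// Returns an empty vector on error or at end of file.
std::vector<char> read_some(int fd, std::size_t n) {
    std::vector<char> buf(n);                    // size() == n, elements zeroed
    ssize_t got = read(fd, buf.data(), buf.size());
    if (got < 0) got = 0;                        // read() failed: keep nothing
    assert(static_cast<std::size_t>(got) <= n);  // read() never returns more than asked
    buf.resize(static_cast<std::size_t>(got));   // shrinking keeps the first `got` bytes
    return buf;
}
```

Shrinking with resize() only drops the tail, so the bytes that read() stored are preserved. The cost that remains is the zero-fill done when the vector is first sized.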


PS. Your statement "With vectors, one can assume that elements are stored contiguously in memory, allowing the range [&vec[0], &vec[vec.capacity()) to be used as a normal array" isn't true. The allowable range is [vec.data(), vec.data() + vec.size()).

Solution 2

It looks like you can do what you want in C++11 (though I haven't tried this myself). You'll have to define a custom allocator for the vector, then use emplace_back().

First, define

struct do_not_initialize_tag {};

Then define your allocator with this member function:

class my_allocator {
public:
    // Selected by emplace_back(do_not_initialize_tag()); leaves *c untouched.
    void construct(char* c, do_not_initialize_tag) const {
        // do nothing
    }

    // details omitted
    // ...
};

Now you can add elements to your array without initializing them:

std::vector<char, my_allocator> buf;
buf.reserve(N);
for (int i = 0; i != N; ++i)
    buf.emplace_back(do_not_initialize_tag());
int M = read(fd, buf.data(), N);
buf.resize(M);

The efficiency of this depends on the compiler's optimizer. For instance, the loop may increment the size member variable N times.
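
Since the allocator above leaves its details omitted, here is a complete sketch of a related but different technique: an allocator adaptor that turns value-initialization into default-initialization, so that a plain resize(n) leaves the new chars uninitialized and no emplace_back() loop is needed. The names default_init_allocator and raw_buffer are assumptions for illustration, and this is not the tag-based allocator described above.

```cpp
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Allocator adaptor: wraps A and changes only argument-less construction.
template <typename T, typename A = std::allocator<T>>
class default_init_allocator : public A {
    using traits = std::allocator_traits<A>;

public:
    template <typename U>
    struct rebind {
        using other =
            default_init_allocator<U, typename traits::template rebind_alloc<U>>;
    };

    using A::A;  // inherit the base allocator's constructors

    // Used for requests such as resize(n): default-initialize instead of
    // value-initialize, which leaves a char with an indeterminate value.
    template <typename U>
    void construct(U* p) noexcept(std::is_nothrow_default_constructible<U>::value) {
        ::new (static_cast<void*>(p)) U;
    }

    // Every other construction is forwarded to the base allocator unchanged.
    template <typename U, typename... Args>
    void construct(U* p, Args&&... args) {
        traits::construct(static_cast<A&>(*this), p, std::forward<Args>(args)...);
    }
};

// Hypothetical alias for a byte buffer whose resize() does not zero-fill.
using raw_buffer = std::vector<char, default_init_allocator<char>>;
```

With this alias, buf.resize(N); read(fd, buf.data(), N); buf.resize(M); has the same shape as Solution 1. The new elements must be written before they are read, because their values are indeterminate until then.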

Solution 3

A newer question, a duplicate of this one, has an answer that looks like exactly what is asked here. Here is a copy of it (revision 3) for quick reference:

It is a known issue that initialization cannot be turned off, even explicitly, for std::vector.

People normally implement their own pod_vector<> that does not do any initialization of the elements.

Another way is to create a type which is layout-compatible with char, whose constructor does nothing:

struct NoInitChar
{
    char value;
    NoInitChar() {
        // do nothing
        static_assert(sizeof *this == sizeof value, "invalid size");
        static_assert(alignof(NoInitChar) == alignof(char), "invalid alignment");
    }
};

int main() {
    std::vector<NoInitChar> v;
    v.resize(10); // calls NoInitChar() which does not initialize

    // Look ma, no reinterpret_cast<>!
    char* beg = &v.front().value;
    char* end = beg + v.size();
}

Solution 4

Writing to or past the size()th element is undefined behavior.

The following example copies a whole file into a vector the C++ way (there is no need to know the file's size or to preallocate memory in the vector):

#include <algorithm>
#include <fstream>
#include <iterator>
#include <vector>

int main()
{
    typedef std::istream_iterator<char> istream_iterator;
    std::ifstream file("example.txt");
    std::vector<char> input;

    file >> std::noskipws;
    std::copy( istream_iterator(file), 
               istream_iterator(),
               std::back_inserter(input));
}

Solution 5

Your program fragment has entered the realm of undefined behavior.

When buf.empty() is true, buf[0] has undefined behavior, and therefore &buf[0] is also undefined.

This fragment probably does what you want.

vector<char> buf;
buf.resize(N); // preallocate space
int M = read(fd, &buf[0], N);
buf.resize(M); // disallow access to the remainder
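
The same pattern extends to appending until end of file, as in the sketch below. It uses buf.data() rather than &buf[0], grows the vector before each read() so the write always lands inside size(), and trims afterwards. The helper name append_all and the default chunk size are assumptions, and retrying on EINTR is left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#include <sys/types.h>
#include <unistd.h>   // POSIX read()

// Hypothetical helper: appends everything readable from fd to buf,
// one chunk at a time. Returns false if read() reports an error.
bool append_all(int fd, std::vector<char>& buf, std::size_t chunk = 4096) {
    assert(chunk > 0);
    for (;;) {
        const std::size_t old_size = buf.size();
        buf.resize(old_size + chunk);            // make room inside size()
        const ssize_t got = read(fd, buf.data() + old_size, chunk);
        if (got <= 0) {                          // end of file or error
            buf.resize(old_size);                // drop the unused room
            return got == 0;
        }
        buf.resize(old_size + static_cast<std::size_t>(got));
    }
}
```

Data already in the vector is left in place, and each chunk is written directly after it without a temporary buffer.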

user984228
Author by

user984228

Updated on June 05, 2022

Comments

  • user984228
    user984228 almost 2 years

    With vectors, one can assume that elements are stored contiguously in memory, allowing the range [&vec[0], &vec[vec.capacity()) to be used as a normal array. E.g.,

    vector<char> buf;
    buf.reserve(N);
    int M = read(fd, &buf[0], N);
    

    But now the vector doesn't know that it contains M bytes of data, added externally by read(). I know that vector::resize() sets the size, but it also clears the data, so it can't be used to update the size after the read() call.

    Is there a trivial way to read data directly into vectors and update the size after? Yes, I know of the obvious workarounds like using a small array as a temporary read buffer, and using vector::insert() to append that to the end of the vector:

    char tmp[N];
    int M = read(fd, tmp, N);
    buf.insert(buf.end(), tmp, tmp + M);
    

    This works (and it's what I'm doing today), but it just bothers me that there is an extra copy operation there that would not be required if I could put the data directly into the vector.

    So, is there a simple way to modify the vector size when data has been added externally?

    • Praetorian
      Praetorian over 12 years
      Are you sure &buf[0] works in debug mode? For instance, on Visual Studio, in debug mode std::vector::operator[] performs a range check. So that expression will throw if buf is empty.
    • Matthieu M.
      Matthieu M. over 12 years
      @SteveJessop: I just died a little.
    • Steve Jessop
      Steve Jessop
      @user984228: if you're happy to rely on implementation details of GCC (which is a BAD IDEA (TM)), then you'd look at the source for its implementation of vector. You can see where it stores the begin and end pointers and capacity, and if you just overwrite the end pointer, I'm pretty sure that will change the size as you want. Just copy whatever the implementation of resize() does in the case where the capacity is big enough to start with, leaving out the memset/fill/whatever. You'll have to work around some private modifiers, of course, perhaps by hard-coding in the offsets.