A dynamic buffer type in C++?


Solution 1

You want a std::vector:

std::vector<char> myData;

vector will automatically allocate and deallocate its memory for you. Use push_back to add new data (vector will resize for you if required), and the indexing operator [] to retrieve data.

If at any point you can guess how much memory you'll need, I suggest calling reserve so that subsequent push_back calls won't have to reallocate as often.

If you want to read in a chunk of memory and append it to your buffer, easiest would probably be something like:

std::vector<char> myData;
for (;;) {
    const int BufferSize = 1024;
    char rawBuffer[BufferSize];

    const unsigned bytesRead = get_network_data(rawBuffer, sizeof(rawBuffer));
    if (bytesRead == 0) { // bytesRead is unsigned, so it can never be negative
        break;
    }

    myData.insert(myData.end(), rawBuffer, rawBuffer + bytesRead);
}

myData now holds all the data, read chunk by chunk. However, every byte is copied twice: once into rawBuffer and once more into the vector.

Instead, we can try something like this:

std::vector<char> myData;
for (;;) {
    const int BufferSize = 1024;

    const size_t oldSize = myData.size();
    myData.resize(oldSize + BufferSize);

    const unsigned bytesRead = get_network_data(&myData[oldSize], BufferSize);
    myData.resize(oldSize + bytesRead);

    if (bytesRead == 0) {
        break;
    }
}

This reads directly into the buffer, at the cost of occasionally over-allocating and of resize() value-initializing (zeroing) the new elements before they are overwritten.

This can be made smarter by e.g. doubling the vector size for each resize to amortize resizes, as the first solution does implicitly. And of course, you can reserve() a much larger buffer up front if you have a priori knowledge of the probable size of the final buffer, to minimize resizes.

Both are left as an exercise for the reader. :)
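For readers who would rather not do the exercise, here is a minimal sketch of the doubling idea combined with an up-front size guess. FakeSource is a hypothetical stand-in for the socket (it plays the role of get_network_data above), and read_all and expectedSize are names invented for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a socket: hands out its bytes in pieces and
// returns 0 once everything has been delivered.
struct FakeSource {
    std::vector<char> data;
    std::size_t pos = 0;

    std::size_t read(char* dest, std::size_t maxBytes) {
        const std::size_t n = std::min(maxBytes, data.size() - pos);
        if (n > 0) {
            std::memcpy(dest, data.data() + pos, n);
        }
        pos += n;
        return n;
    }
};

// Reads everything from the source directly into the vector, doubling the
// vector's size whenever it is full.
std::vector<char> read_all(FakeSource& source, std::size_t expectedSize = 1024) {
    assert(expectedSize > 0);
    std::vector<char> buffer(expectedSize); // start with the guessed size
    std::size_t used = 0;                   // bytes actually filled so far

    for (;;) {
        if (used == buffer.size()) {
            buffer.resize(buffer.size() * 2); // grow geometrically
        }
        const std::size_t n =
            source.read(buffer.data() + used, buffer.size() - used);
        if (n == 0) {
            break; // source exhausted
        }
        used += n;
    }

    buffer.resize(used); // trim to the bytes actually received
    return buffer;
}
```

Because the vector is only resized when it is completely full, the number of resizes grows with the logarithm of the total size rather than with the number of chunks.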

Finally, if you need to treat your data as a raw-array:

some_c_function(myData.data(), myData.size());

std::vector is guaranteed to be contiguous.
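As a small illustration of that guarantee, the sketch below hands a vector's storage to a C-style function. sum_bytes is a hypothetical placeholder for some_c_function, and sum_buffer is a wrapper name invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C-style function that takes a pointer and a length;
// it adds up the bytes it is given.
unsigned long sum_bytes(const char* data, std::size_t length) {
    assert(data != nullptr || length == 0);
    unsigned long total = 0;
    for (std::size_t i = 0; i < length; ++i) {
        total += static_cast<unsigned char>(data[i]);
    }
    return total;
}

// Passes the vector's contiguous storage straight to the C-style function.
unsigned long sum_buffer(const std::vector<char>& buffer) {
    if (buffer.empty()) {
        return 0; // data() may be null for an empty vector, so skip the call
    }
    return sum_bytes(buffer.data(), buffer.size());
}
```

Any pointer obtained this way is only valid until the vector next reallocates, so take it after the last resize or insert.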

Solution 2

std::string would work for this:

  • It supports embedded nulls.
  • You can append multi-byte chunks of data to it by calling append() on it with a pointer and a length.
  • You can get its contents as a char array by calling data() on it, and the current length by calling size() or length() on it.
  • Freeing the buffer is handled automatically by the destructor, but you can also call clear() on it to erase its contents without destroying it.
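The points above can be sketched as follows. append_chunk and count_zero_bytes are hypothetical helper names used only for illustration; the second one stands in for any API that takes a pointer and a length.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends one received chunk to the buffer. The (pointer, length) overload
// of append() keeps embedded '\0' bytes instead of stopping at them.
void append_chunk(std::string& buffer, const char* chunk, std::size_t length) {
    assert(chunk != nullptr || length == 0);
    buffer.append(chunk, length);
}

// Hypothetical consumer of (pointer, length); it counts the zero bytes.
std::size_t count_zero_bytes(const char* data, std::size_t length) {
    std::size_t zeros = 0;
    for (std::size_t i = 0; i < length; ++i) {
        if (data[i] == '\0') {
            ++zeros;
        }
    }
    return zeros;
}
```

A caller would append each chunk as it arrives, then pass buffer.data() and buffer.size() to the consumer. Avoid constructing or appending from a bare const char* without a length, since those overloads stop at the first null byte.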

Solution 3

std::vector<unsigned char> buffer;

Every push_back adds a new char at the end (reallocating if needed). You can call reserve to minimize the number of allocations if you roughly know how much data to expect.

buffer.reserve(1000000);

If you already have the data in a raw array, you can construct the vector directly from it:

unsigned char buffer[1000];
std::vector<unsigned char> vec(buffer, buffer + 1000);
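A short sketch of both operations, building a vector from an array and appending a further chunk in one call, might look like this. to_vector and append_chunk are names made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds a vector from the first `length` bytes of a raw array, using the
// iterator-range constructor.
std::vector<unsigned char> to_vector(const unsigned char* raw, std::size_t length) {
    assert(raw != nullptr || length == 0);
    return std::vector<unsigned char>(raw, raw + length);
}

// Appends a whole chunk at once instead of calling push_back per byte.
void append_chunk(std::vector<unsigned char>& vec,
                  const unsigned char* raw, std::size_t length) {
    assert(raw != nullptr || length == 0);
    vec.insert(vec.end(), raw, raw + length);
}
```

Calling reserve before a series of append_chunk calls keeps the number of reallocations down in the same way as it does for push_back.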

Solution 4

One more vote for std::vector. Minimal code that skips the extra copy GMan's code does:

std::vector<char> buffer;
static const size_t MaxBytesPerRecv = 1024;
size_t bytesRead;
do
{
    const size_t oldSize = buffer.size();

    buffer.resize(oldSize + MaxBytesPerRecv);
    bytesRead = receive(&buffer[oldSize], MaxBytesPerRecv); // pseudocode: like the Winsock recv() functions, it takes a buffer and the maximum number of bytes to write into it

    buffer.resize(oldSize + bytesRead); // shrink the vector; this is practically a no-op, it only modifies the internal size and no data is moved or freed
} while (bytesRead > 0);

As for calling WinAPI functions - use &buffer[0] (yeah, it's a little bit clumsy, but that's the way it is) to pass to the char* arguments, buffer.size() as length.

A final note: you can use std::string instead of std::vector; there shouldn't be any difference (except that you can write buffer.data() instead of &buffer[0] if your buffer is a string).

Solution 5

I'd take a look at Boost basic_streambuf, which is designed for this kind of purpose. If you can't (or don't want to) use Boost, I'd consider std::basic_streambuf, which is quite similar, but a little more work to use. Either way, you basically derive from that base class and override underflow() to read data from the socket into the buffer. You'll normally attach a std::istream to the buffer, so other code reads from it in about the same way as it would read user input from the keyboard (or wherever).
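A minimal sketch of the standard-library variant, assuming a hypothetical ChunkSource class in place of a real socket: SourceStreambuf derives from std::streambuf and overrides underflow() to refill a fixed-size chunk whenever the get area runs dry. Both class names are invented for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <streambuf>
#include <string>
#include <utility>
#include <vector>

// Hypothetical data source standing in for a socket; read() returns 0 at end.
class ChunkSource {
public:
    explicit ChunkSource(std::string data) : data_(std::move(data)), pos_(0) {}

    std::size_t read(char* dest, std::size_t maxBytes) {
        const std::size_t n = std::min(maxBytes, data_.size() - pos_);
        if (n > 0) {
            std::memcpy(dest, data_.data() + pos_, n);
        }
        pos_ += n;
        return n;
    }

private:
    std::string data_;
    std::size_t pos_;
};

// Stream buffer that pulls data from the source one chunk at a time.
class SourceStreambuf : public std::streambuf {
public:
    explicit SourceStreambuf(ChunkSource& source, std::size_t chunkSize = 1024)
        : source_(source), chunk_(chunkSize) {
        assert(chunkSize > 0);
        // Start with an empty get area so the first read calls underflow().
        char* end = chunk_.data() + chunk_.size();
        setg(end, end, end);
    }

protected:
    int_type underflow() override {
        if (gptr() < egptr()) {
            return traits_type::to_int_type(*gptr()); // data still available
        }
        const std::size_t n = source_.read(chunk_.data(), chunk_.size());
        if (n == 0) {
            return traits_type::eof(); // source exhausted
        }
        setg(chunk_.data(), chunk_.data(), chunk_.data() + n);
        return traits_type::to_int_type(*gptr());
    }

private:
    ChunkSource& source_;
    std::vector<char> chunk_;
};
```

To use it, construct a SourceStreambuf over the source, pass its address to a std::istream, and read from the stream with operator>> or std::getline as usual. Note that this design streams the data through a small window rather than accumulating it, so it fits best when the consumer can process the data incrementally.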

Author by

Vilx-

Just your average everyday programmer. #SOreadytohelp

Updated on February 18, 2021

Comments

  • Vilx-
    Vilx- about 3 years

    I'm not exactly a C++ newbie, but I have had little serious dealings with it in the past, so my knowledge of its facilities is rather sketchy.

    I'm writing a quick proof-of-concept program in C++ and I need a dynamically sizeable buffer of binary data. That is, I'm going to receive data from a network socket and I don't know how much there will be (although not more than a few MB). I could write such a buffer myself, but why bother if the standard library probably has something already? I'm using VS2008, so some Microsoft-specific extension is just fine by me. I only need four operations:

    • Create the buffer
    • Write data to the buffer (binary junk, not zero-terminated)
    • Get the written data as a char array (together with its length)
    • Free the buffer

    What is the name of the class/function set/whatever that I need?

    Added: Several votes go to std::vector. All nice and fine, but I don't want to push several MB of data byte-by-byte. The socket will give data to me in few-KB large chunks, so I'd like to write them all at once. Also, at the end I will need to get the data as a simple char*, because I will need to pass the whole blob along to some Win32 API functions unmodified.

  • Vilx-
    Vilx- over 14 years
    OK, but I don't see a member with which I could add a whole buffer of data. Or do I have to push several MB byte-by-byte? I will read it from the socket in nice few-KB chunks.
  • atzz
    atzz over 14 years
    Vilx -- use myData.insert(myData.end(), bytes_ptr, bytes_ptr + bytes_count)
  • Shariful
    Shariful over 14 years
    Assuming that you have a buffer of known size, vec.insert(vec.end(), buf, buf + length)
  • Vilx-
    Vilx- over 14 years
    I don't see an append() member function on the vector.
  • RobH
    RobH over 14 years
    Vector is required to be contiguous, so it is possible to take the address of an element and memcpy() a block of data into it. Feel free to shudder at the horror of this.
  • Vilx-
    Vilx- over 14 years
    I shudder at the horror of this.
  • GManNickG
    GManNickG over 14 years
    Taking the address of the first element is fairly common. Also, if you're reading network data and want to keep it, you'll have to copy it somewhere, which involves every byte. Some CPUs can copy multiple bytes at once, and your compiler will take advantage of that for you.
  • sbk
    sbk over 14 years
    Why use intermediate buffer? Why not read network data directly into the vector? Resize the vector to its old size +N, receive maximum N bytes to &vector[old_vector_size].
  • Useless
    Useless over 14 years
    +1: if you choose vector, this is the way to do it. I still claim the vector here is just used as a collection of {size, capacity, pointer} and you could just as easily call realloc yourself though ...
  • GManNickG
    GManNickG over 14 years
    I claim C++ is just really some assembly instructions and you should use those. :P
  • Useless
    Useless over 14 years
    Fair enough ;D I just don't think the vector is adding much abstraction or expressiveness here - although this may depend on the user/reader's level of comfort with C memory allocation.
  • Sandeep Datta
    Sandeep Datta over 14 years
    @Useless: Ok then how about hassle free exception safe memory management?
  • Useless
    Useless over 14 years
    OK, good point: I'm used to idiomatic C code for low-level socket programming (and the POSIX sockets API doesn't throw), but it isn't either good style in general or idiomatic C++.
  • Vilx-
    Vilx- over 14 years
    Well, that is basically what I want to do. I just wondered if there wasn't some built-in way for doing that already.
  • Offirmo
    Offirmo about 11 years
    But can we have several \0 inside the buffer ?
  • Wyzard
    Wyzard about 11 years
    Yes, that's what I meant when I said that it supports embedded nulls.
  • Offirmo
    Offirmo about 11 years
    Interesting, std::string is more powerful than I thought. cf. stackoverflow.com/a/5319584/587407 +1
  • enthusiasticgeek
    enthusiasticgeek almost 11 years
    @sbk myData.resize(oldSize + bytesRead); should be buffer.resize(oldSize + bytesRead);...a small typo I think.
  • Martin Sherburn
    Martin Sherburn almost 3 years
    By default resize will zero initialise all elements so the second answer of reading directly into the vector is replacing the cost of copying data with zero initialisation. For better performance see stackoverflow.com/questions/21028299/… to avoid the zero initialisation.
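One known way to avoid that zero initialisation is an allocator adaptor that default-initializes elements instead of value-initializing them. The sketch below is an illustration of that pattern under C++11 or later, not code taken from this thread; default_init_allocator, RawBuffer and grow_for_read are names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Allocator adaptor: construct() with no arguments default-initializes the
// element, which for char means the memory is left untouched (not zeroed).
template <typename T, typename A = std::allocator<T>>
class default_init_allocator : public A {
    using traits = std::allocator_traits<A>;

public:
    template <typename U>
    struct rebind {
        using other =
            default_init_allocator<U, typename traits::template rebind_alloc<U>>;
    };

    using A::A; // inherit the base allocator's constructors

    template <typename U>
    void construct(U* ptr) noexcept(std::is_nothrow_default_constructible<U>::value) {
        ::new (static_cast<void*>(ptr)) U; // default-init, no zeroing
    }

    template <typename U, typename... Args>
    void construct(U* ptr, Args&&... args) {
        // Any construction with arguments is forwarded to the base allocator.
        traits::construct(static_cast<A&>(*this), ptr, std::forward<Args>(args)...);
    }
};

using RawBuffer = std::vector<char, default_init_allocator<char>>;

// Grows the buffer by `extra` bytes (left uninitialized) and returns a
// pointer to the start of the new region, ready to be filled by a read call.
char* grow_for_read(RawBuffer& buffer, std::size_t extra) {
    assert(extra > 0);
    const std::size_t oldSize = buffer.size();
    buffer.resize(oldSize + extra);
    return buffer.data() + oldSize;
}
```

The new bytes hold indeterminate values until something writes to them, so the caller should shrink the buffer to the number of bytes actually received before reading from it. RawBuffer is also a different type from std::vector<char>, which matters if other code expects the plain vector.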