Is there a way to get std::string's buffer

25,281

Solution 1

Use std::vector<char> if you want a real buffer.

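#include <windows.h>
#include <string>
#include <vector>

int main(){
  std::vector<char> buff(MAX_PATH+1);
  // Returns the number of characters written, excluding the terminating NUL
  // (0 on failure, the required size if the buffer is too small).
  DWORD len = ::GetCurrentDirectoryA(static_cast<DWORD>(buff.size()), &buff[0]);
  if (len == 0 || len >= buff.size()) return 1;
  // Copy only the characters written, not the whole zero-filled buffer.
  std::string path(buff.begin(), buff.begin() + len);
}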

Example on Ideone.

Solution 2

While a bit unorthodox, it's perfectly valid to use std::string as a linear memory buffer. The only caveat is that the standard doesn't guarantee this until C++11.

std::string s(64, '\0'); // size the string first: only [0, size()) may be written
char* s_ptr = &s[0];     // get at the buffer

To quote Herb Sutter,

Every std::string implementation I know of is in fact contiguous and null-terminates its buffer. So, although it isn’t formally guaranteed, in practice you can probably get away with calling &str[0] to get a pointer to a contiguous and null-terminated string. (But to be safe, you should still use str.c_str().)

"Probably" is key here. So, while it's not a guarantee, you should be able to rely on the principle that std::string is a linear memory buffer and you should assert facts about this in your test suite, just to be sure.

You can always build your own buffer class but when you're looking to buy, this is what the STL has to offer.
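
As a minimal sketch of this technique (assuming C++11 or later, where contiguous storage is guaranteed), the string is sized first, handed to a C-style function, and then trimmed to the length actually written. `fill_greeting` and `read_greeting` are hypothetical names; `fill_greeting` only stands in for whatever C API fills a caller-supplied buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C API: fills a caller-supplied buffer and
// returns the number of characters written, excluding the terminating NUL.
std::size_t fill_greeting(char* out, std::size_t capacity) {
    static const char text[] = "hello";
    const std::size_t len = sizeof text - 1;
    assert(capacity > len);           // room for the text plus its NUL
    std::memcpy(out, text, len + 1);  // copies the terminating NUL too
    return len;
}

std::string read_greeting() {
    std::string s;
    s.resize(64);  // resize(), not reserve(): elements [0, 64) now exist
    const std::size_t len = fill_greeting(&s[0], s.size());
    s.resize(len); // drop the unused tail so size() matches the content
    return s;
}
```

Calling `read_greeting()` yields a string holding just the written characters. The second `resize` matters: without it the string would keep its 64 characters, most of them NULs.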

Solution 3

Not portably, no. The standard does not guarantee that a std::string has an exclusive linear representation in memory (the old C++03 standard even permits data structures like ropes), so the API gives you no write access to one. An implementation must either be able to convert its internal representation to a linear one (C++03) or expose the linear representation it already has (C++11 requires one), but only for reading, through data() and/or c_str(). Because that access is read-only, the interface still supports copy-on-write.

The usual recommendation for working with C-APIs that modify arrays by accessing through pointers is to use an std::vector, which is guaranteed to have a linear memory-representation exactly for this purpose.

To sum this up: if you want to do this portably and if you want your string to end up in an std::string, you have no choice but to copy the result into the string.
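
A rough sketch of that copy-based approach, under the assumption that the C API reports how many characters it wrote. Here `std::snprintf` (available in `<cstdio>` since C++11) merely stands in for such an API, and `format_id` is a hypothetical name chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Lets a C-style function write into a std::vector<char>, then copies the
// result into a std::string.
std::string format_id(int id) {
    std::vector<char> buf(32);  // contiguous, writable storage
    const int len = std::snprintf(&buf[0], buf.size(), "id=%d", id);
    // snprintf returns the length it wanted to write; check that it fit.
    assert(len >= 0 && static_cast<std::size_t>(len) < buf.size());
    // Copy only the characters written, leaving the NUL and padding behind.
    return std::string(buf.begin(), buf.begin() + len);
}
```

The cost compared with writing into the string directly is one extra copy, which is the price of staying within what the standard guarantees for `std::vector`.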

Solution 4

 std::string str("Hello world");
 LPCSTR sz = str.c_str();

Keep in mind that sz will be invalidated when str is reallocated or goes out of scope. You could do something like this to decouple from the string:

 std::vector<char> buf(str.begin(), str.end()); // not null terminated
 buf.push_back(0); // null terminated

Or, in old-fashioned C style (note that this will not allow strings with embedded null-characters):

 #include <cstdlib> // free
 #include <cstring> // strdup (POSIX; MSVC names it _strdup)

 char* sz = strdup(str.c_str());

 // ... use sz

 free(sz);
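
To illustrate the embedded-null caveat, here is a small sketch comparing the two copies. `to_buffer` and `c_length` are hypothetical helper names; `c_length` only reports how many characters a C-style copy such as strdup would see.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Copies the string into an independent, NUL-terminated buffer.
// Embedded NUL characters are preserved because the copy is range-based.
std::vector<char> to_buffer(const std::string& str) {
    std::vector<char> buf(str.begin(), str.end());
    buf.push_back('\0');
    assert(buf.size() == str.size() + 1);
    return buf;
}

// Length that a C-style copy would see: it stops at the first NUL.
std::size_t c_length(const std::string& str) {
    return std::strlen(str.c_str());
}
```

For a five-character string built as `std::string("ab\0cd", 5)`, `to_buffer` returns six elements (the five characters plus the terminator), while `c_length` reports 2.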

Solution 5

According to this MSDN article, the best approach for what you want to do is to use std::wstring directly. Second best is std::unique_ptr<wchar_t[]>, and third best is std::vector<wchar_t>. Feel free to read the article and draw your own conclusions.

// Get the length of the text string
// (Note: +1 to consider the terminating NUL)
const int bufferLength = ::GetWindowTextLength(hWnd) + 1;
// Allocate string of proper size
std::wstring text;
text.resize(bufferLength);
// Get the text of the specified control
// Note that the address of the internal string buffer
// can be obtained with the &text[0] syntax
::GetWindowText(hWnd, &text[0], bufferLength);
// Resize down the string to avoid bogus double-NUL-terminated strings
text.resize(bufferLength - 1);
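
The same "query the length, resize, fill, trim" pattern can be sketched without the Windows API. This version assumes C++11 and uses `std::snprintf` purely as a stand-in for a length-reporting C function; `describe` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

std::string describe(int width, int height) {
    // Pass 1: with a null buffer and size 0, snprintf only reports the
    // length it would write (excluding the terminating NUL).
    const int needed = std::snprintf(nullptr, 0, "%dx%d", width, height);
    assert(needed >= 0);
    const std::size_t len = static_cast<std::size_t>(needed);

    std::string text;
    text.resize(len + 1);  // +1 so the C function has room for its NUL
    // Pass 2: write into the string's own buffer.
    std::snprintf(&text[0], text.size(), "%dx%d", width, height);
    text.resize(len);      // trim the extra NUL, as in the snippet above
    return text;
}
```

As with the window-text code, the final `resize` keeps the string from carrying a stray NUL as its last character.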
Author: MikMik

Updated on November 14, 2020

Comments

  • MikMik
    MikMik over 3 years

    Is there a way to get the "raw" buffer of a std::string?
    I'm thinking of something similar to CString::GetBuffer(). For example, with CString I would do:

    CString myPath;  
    ::GetCurrentDirectory(MAX_PATH+1, myPath.GetBuffer(MAX_PATH));  
    myPath.ReleaseBuffer();  
    

    So, does std::string have something similar?

  • MikMik
    MikMik over 12 years
    "The C++ Programming Language, 3rd Ed." says about string::data(): "writes the characters of the string into an array and returns a pointer to that array". And about c_str: "The c_str() function is like data(), except that it adds a 0 (zero) at the end [...]". So they return copies (although implementers decide to return a pointer to the buffer)
  • James Johnston
    James Johnston over 12 years
    The first parameter for GetCurrentDirectory is supposed to be the length of the buffer. Your code initializes the buffer length to MAX_PATH but then states that MAX_PATH+1 characters are available. So your code runs the risk of GetCurrentDirectory writing a NULL character past the end of the vector - not good.
  • Xeo
    Xeo over 12 years
    @James: Thanks, I don't really work with the winapi functions. :) Well, easy to fix... Also, isn't the same problem present in the OP's code?
  • James Johnston
    James Johnston over 12 years
    Yeah you're right; I didn't read his code. But the error exists in OP's code as well.
  • sehe
    sehe over 12 years
    @MikMik: under all known implementations it returns the buffer. Also, with the new C++11 standard, this is required (implicitly) due to the complexity requirements - see this explanation
  • sehe
    sehe over 12 years
    which standard do you refer to? c++11 does guarantee that. Again, see this background discussion
  • MikMik
    MikMik over 12 years
    Not really, I think. GetCurrentDirectory states that "the buffer length must include room for a terminating null character" and CString::GetBuffer() says "The minimum size of the character buffer in characters. This value does not include space for a null terminator". So I think I got it right.
  • Xeo
    Xeo over 12 years
    @Mik: Ok, sorry then, like I said, I don't work with the winapi and CString n stuff. :)
  • MikMik
    MikMik over 12 years
    I'm no expert, but "under all known implementations it returns the buffer" it's not "it MUST return the buffer". Now, in C++11 it is required? Good to know, but I don't have C++11 yet. Anyway, if I can't write to the buffer, it is of no help to me right now.
  • ltjax
    ltjax over 12 years
    It kinda depends on how far you take object identity. Yes, the new standard does guarantee that a linear representation of the string (as in char memory-block) exists in memory. No, it does not guarantee that this is this string's exclusive representation. After all, copy-on-write implementations still match the spec, hence data() and c_str() remain const. I'll update my answer to reflect the "exclusivity" of that buffer.
  • sehe
    sehe over 12 years
    COW is more or less obsolete with c++0x move semantics; also it is very inconvenient since C++0x multithreading specification; COW is hardly ever a performance benefit in concurrent programming (due to locking required) and has far too many performance surprises. In fact, I remember reading that C++11 would outlaw COW implementations of std::string (but I still can't find the link)
  • Fred Foo
    Fred Foo over 12 years
    @sehe: I'm not trying to provoke a flame war. I've just read too many posts containing LPCSTR and other Windows-isms without a winapi tag on this website.
  • sehe
    sehe over 12 years
    @larsmans: AFAICT the tagging winapi is not mandatory. Moreover, we are allowed to add it (in fact, I'll start doing so, since now I learned about it)
  • ltjax
    ltjax over 12 years
    I agree that COW implementations are stupid in general, but the specs look like they still allow it. Even though this is off-topic: While COW can be used to emulate move-semantics, it can still be beneficial for long strings that exist in multiple instances - even if just for the memory savings.
  • Fred Foo
    Fred Foo over 12 years
    @sehe: ok. No offence taken, I hope? I only realised later how harsh my words might have seemed. Irony is hard to convey on the internet, so I keep finding out :)
  • Fred Foo
    Fred Foo over 12 years
    Oh and seeing your updated answer: strdup is not available on the Windows platform, IIRC; it's called _strdup there.
  • John Leidegren
    John Leidegren about 11 years
    If you use std::vector as a buffer you'll end up initializing every element in the vector in the most expensive manner possible. This will dwarf any CPU cost throughout your application. Use std::unique_ptr<char[]> or stack allocation. Don't waste CPU on initializing a buffer if you don't need to.
  • Puppy
    Puppy about 11 years
    @John: You're insane. MAX_PATH is only about 256. The cost of initializing such a vector is irrelevant.
  • John Leidegren
    John Leidegren about 11 years
    @DeadMG I'm speaking from experience only, run your program through a profiler and check for yourself but you obviously need more than 1 call for it to be a bottleneck. The reason I bring it up is because if you put that in a library routine which you call often, you're going to waste a lot of CPU. Moreover, since you know MAX_PATH at compile-time, this could be stack allocated, the duration of that buffer is in all likelihood going to be very short. The notion that the vector class represents a real buffer, in any way, is completely fallacious.
  • John Leidegren
    John Leidegren about 11 years
    Please take under advisement that std::vector<char> does not do anything smart, such as memset(..., 0, sizeof ...), which would be fast(er); it does proper initialization of every element.
  • rubenvb
    rubenvb about 11 years
    @JohnLeidegren Ever heard of vector<char> buffer; buffer.reserve(buffer_size);? No memset involved. Contiguous memory. No pointers. No fuss. No lies.
  • Puppy
    Puppy about 11 years
    @JohnLeidegren: The kernel switch to GetCurrentDirectory is probably more expensive. You would need not just more than 1, but a massive number of calls for this to be a bottleneck. Your comment implies that it is likely for it to be a serious problem, whereas in fact it is hideously unlikely for it to be a problem.
  • rubenvb
    rubenvb about 11 years
    I stand corrected. It appears my vector<char> buffer as used above invokes undefined behavior. Then I change my opinion to std::array. If the buffer can't fit on the stack, it ain't worth the name buffer.
  • John Leidegren
    John Leidegren about 11 years
    @DeadMG I'm not arguing that it is faster/slower than a kernel transition. I'm making a point that, as a buffer, it is unnecessarily slow. My opinion is that it's a bad habit. Even small, seemingly insignificant code like this piles up, and eventually does lead to a noticeable performance impact.
  • paulm
    paulm over 10 years
    str.c_str() returns a const pointer so surely that is just as, if not more dangerous
  • John Leidegren
    John Leidegren over 10 years
    I guess the problem here is that it is in the implementation details. I'm not in any way suggesting that it is good practice, but if you wish to work around the limitation, you may do so. Even with, say, the small string optimization in place, you're going to obtain a memory location that is safe to write to, given that you respect the bounds of the array. You may not realize that it has returned a pointer to the stack, but this is not something you need to know. I cannot see a situation in which the returned pointer actually points to a guard page (that would result in a page fault if you wrote to it).
  • Erik Aronesty
    Erik Aronesty over 9 years
    @sehe i would say that concurrent programming is the thing that's fairly useless and that COW is the thing that comes in handy for efficient passing around of strings without pointers and references. do async for your i/o, use message queues to talk between processes, and forget about messy locks!
  • Alexandr Zarubkin
    Alexandr Zarubkin over 5 years
    This should be accepted answer since C++11. See tomhuang.com/2011/10/24/…
  • Alexandr Zarubkin
    Alexandr Zarubkin over 5 years
    As far as I understand, COW implementations are prohibited since C++11, no?
  • Gabriel Staples
    Gabriel Staples almost 2 years
    There's a second huge caveat!: if you are writing into this std::string buffer as though it was a char *, you must pre-allocate the buffer size with s.resize(BUFFER_SIZE), or else it is undefined behavior to write into that buffer. s.reserve(BUFFER_SIZE) does not cut it. See: en.cppreference.com/w/cpp/string/basic_string/operator_at: "If pos > size(), the behavior is undefined." So, you use s.resize() to fill the buffer with null characters first, up to that size. Then, you can write into that buffer like a normal char * up to its size.
  • Gabriel Staples
    Gabriel Staples almost 2 years
    I've added an answer here to explain in detail what I said in my last comment.