Safe C++ std::string to TCHAR * conversion?

11,727

Solution 1

If you do not need to support the string containing UTF-8 (or another multi-byte encoding) then simply use the ANSI version of Windows API:

handle = CreateFileA( filename.c_str(), .......)

You might need to rejig your code for this as you have the CreateFile buried in a function that expects TCHAR. That's not advised these days; it's a pain to splatter T versions of everything all over your code and it has flow-on effects (such as std::tstring that someone suggested - ugh!)

There hasn't been any need to support dual compilation from the same source code since about 1998. Windows API has to support both versions for backward compatibility but your own code does not have to.


If you do want to support the string containing UTF-8 (and this is a better idea than using UTF-16 everywhere) then you will need to convert it to a UTF-16 string in order to call the Windows API.

The usual way to do this is via the Windows API function MultiByteToWideChar which is a bit awkward to use correctly, but you could wrap it up in a function:

std::wstring make_wstring( std::string const &s );

that invokes MultiByteToWideChar to return a UTF-16 string that you can then pass to WinAPI functions by using its .c_str() function.

See this codereview thread for a possible implementation of such a function (although note discussion in the answers)
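As one possible shape for the make_wstring wrapper described above, here is a minimal sketch that assumes the input is UTF-8. It is not the codereview implementation. The error handling (throwing exceptions) and the non-Windows stand-in branch are assumptions added so the snippet compiles anywhere; only the `_WIN32` branch performs the real conversion.

```cpp
#include <climits>
#include <cstddef>
#include <stdexcept>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Sketch of make_wstring(): treats the input as UTF-8 (an assumption).
std::wstring make_wstring(std::string const &s)
{
    if (s.empty())
        return std::wstring();
    // MultiByteToWideChar takes int lengths, so reject oversized input.
    if (s.size() > static_cast<std::size_t>(INT_MAX))
        throw std::length_error("make_wstring: input too long");

#ifdef _WIN32
    int const in_len = static_cast<int>(s.size());
    // First call: ask how many UTF-16 code units the result needs.
    int const out_len = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                            s.data(), in_len, NULL, 0);
    if (out_len <= 0)
        throw std::runtime_error("make_wstring: invalid UTF-8");

    std::wstring w(static_cast<std::size_t>(out_len), L'\0');
    // Second call: convert into the buffer we just sized.
    if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                            s.data(), in_len, &w[0], out_len) != out_len)
        throw std::runtime_error("make_wstring: conversion failed");
    return w;
#else
    // Stand-in for other platforms (illustration only): 7-bit ASCII only.
    std::wstring w;
    w.reserve(s.size());
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        unsigned char const c = static_cast<unsigned char>(s[i]);
        if (c >= 0x80)
            throw std::runtime_error("make_wstring: non-ASCII needs the Windows branch");
        w.push_back(static_cast<wchar_t>(c));
    }
    return w;
#endif
}
```

The returned string's .c_str() can then be passed to a W function such as CreateFileW. Because the input length is passed explicitly, the output is not null-terminated by the API; std::wstring supplies the terminator itself.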

Solution 2

The root of your problem is that you are mixing TCHARs and non-TCHARs. Even if you get it to work on your machine, unless you do it precisely right, it will fail when non-ASCII characters are used in the path.
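To illustrate why element-by-element copying breaks on non-ASCII input, here is a small sketch. The helper name naive_widen is hypothetical; it mimics the std::copy approach from the question by turning each char into one wchar_t.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: widens each char to one wchar_t, with no decoding,
// which is what copying a std::string into a wchar_t buffer amounts to.
std::wstring naive_widen(const std::string &s)
{
    std::wstring out(s.size(), L'\0');
    std::copy(s.begin(), s.end(), out.begin());
    return out;
}
```

For plain ASCII such as "abc" the result matches L"abc". For the UTF-8 bytes "\xC3\xA9" (the letter é) the result has two elements, neither of which is U+00E9, so a path containing that character would not name the intended file.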

If you can use a tstring (not a standard library type; usually a project typedef for std::basic_string<TCHAR>) instead of a regular std::string, then you won't have to worry about format conversions or codepage versus Unicode issues.

If not, you can use conversion functions like MultiByteToWideChar but make sure you understand the encoding used in the source string or it will just make things worse.
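A tstring of the kind mentioned above is typically something a project defines for itself. The following is a rough sketch under that assumption: the tstring typedef and the join_path helper are illustrative names, and the non-Windows stand-ins for TCHAR and _T exist only so the snippet compiles elsewhere.

```cpp
#include <string>

#ifdef _WIN32
#include <tchar.h>   // provides TCHAR and _T(), selected by _UNICODE
#else
// Stand-ins so the sketch compiles off Windows (illustration only).
typedef char TCHAR;
#define _T(x) x
#endif

// A string whose character type follows TCHAR.
typedef std::basic_string<TCHAR> tstring;

// Hypothetical helper: builds a full path without any narrow/wide conversion.
tstring join_path(const tstring &dir, const tstring &name)
{
    return dir + name;
}
```

With this in place, join_path(_T("C:\\Windows\\Media\\"), _T("chimes.wav")).c_str() has the type the TCHAR-based API expects in either build mode. It only helps if every string source (for example FindFirstFile rather than FindFirstFileA) is also TCHAR-based.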

Solution 3

Try this instead:

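std::string filename = fileList->getFullPath(index);

// Note: setFile() must take a const TCHAR* (LPCTSTR) to accept c_str().
#ifndef UNICODE
// TCHAR is char: pass the narrow string straight through.
audioreader.setFile(filename.c_str());
#else
// TCHAR is wchar_t: convert from the ANSI code page to UTF-16.
std::wstring w_filename;
int in_len = static_cast<int>(filename.length());
int len = MultiByteToWideChar(CP_ACP, 0, filename.c_str(), in_len, NULL, 0);
if (len > 0)
{
    w_filename.resize(len);
    MultiByteToWideChar(CP_ACP, 0, filename.c_str(), in_len, &w_filename[0], len);
}
audioreader.setFile(w_filename.c_str());
#endif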

Alternatively:

std::string filename = fileList->getFullPath(index);

#ifndef UNICODE
audioreader.setFile(filename.c_str());
#else
std::wstring_convert<std::codecvt<wchar_t, char, std::mbstate_t>> conv;
std::wstring w_filename = conv.from_bytes(filename);
audioreader.setFile(w_filename.c_str());
#endif
Author by

Force Gaia

A C++ developer currently working on a cross-platform OpenGL mobile app. My area of greatest experience is C++, but I have some knowledge of Java, Python, C# (but I've never touched ASP), and PHP/CSS/JS. I have a keen interest in furthering my knowledge of the web languages and really want to learn where to start with ASP. I touched briefly on VB.Net and Prolog many years ago.

Updated on June 21, 2022

Comments

  • Force Gaia
    Force Gaia almost 2 years

    I am trying to convert a std::string to a TCHAR* for use in CreateFile(). The code I have compiles and works, but Visual Studio 2013 emits a compiler warning:

    warning C4996: 'std::_Copy_impl': Function call with parameters that may be unsafe - this call relies on the caller to check that the passed values are correct. To disable this warning, use -D_SCL_SECURE_NO_WARNINGS. See documentation on how to use Visual C++ 'Checked Iterators'

    I understand why I get the warning, since my code uses std::copy, but I don't want to define _SCL_SECURE_NO_WARNINGS if at all possible, as they have a point: std::copy is unsafe/insecure. As a result, I'd like to find a way that doesn't trigger this warning.

    The code that produces the warning:

    std::string filename = fileList->getFullPath(index);
    TCHAR *t_filename = new TCHAR[filename.size() + 1];
    t_filename[filename.size()] = 0;
    std::copy(filename.begin(), filename.end(), t_filename);
    audioreader.setFile(t_filename);
    

    audioreader.setFile() calls CreateFile() internally, which is why I need to convert the string.

    fileList and audioreader are instances of classes I wrote myself, but I'd rather not change the core implementation of either if at all possible, as that would mean changing a lot of code in other areas of my program, whereas this conversion only happens in that one piece of code. The conversion method I used there came from a solution I found at http://www.cplusplus.com/forum/general/12245/#msg58523

    I've seen something similar in another question (Converting string to tchar in VC++), but I can't quite fathom how to adapt the answer to my case, as the size of the string isn't constant. All other ways I've seen involve a straight (TCHAR *) cast (or something equally unsafe), which, as far as I know from the way TCHAR and other Windows string types are defined, is risky because TCHAR can be a single-byte or a wide character depending on whether UNICODE is defined.

    Does anyone know a safe, reliable way to convert a std::string to a TCHAR* for use in functions like CreateFile()?

    EDIT to address questions in the comments and answers:

    Regarding UNICODE being defined or not: The project in VS2013 is a Win32 console application, with #undef UNICODE at the top of the .cpp file containing main(). What is the difference between UNICODE and _UNICODE? I assume the underscore in what Amadeus was asking is significant.

    Not directly related to the question but may add perspective: This program is not going to be used outside the UK, so ANSI vs UNICODE does not matter for this. This is part of a personal project to create an audio server and client. As a result you may see some bits referencing network communication. The aim of this program is to get me using Xaudio and winsock. The conversion issue purely deals with the loading of the file on the server-side so it can open it and start reading chunks to transmit. I'm testing with .wav files found in c:/windows/media

    Filename encoding: I read the filenames in at runtime by using FindFirstFileA() and FindNextFileA(). The names are retrieved by looking at cFileName in a WIN32_FIND_DATAA structure. They are stored in a vector<string> (wrapped in a unique_ptr, if that matters), but that could be changed. I assume this is what Dan Korn means.

    More info about my classes and functions:

    The following are spread between AudioReader.h, AudioReader.cpp, FileList.h, FileList.cpp and ClientSession.h. The fragment above is in ClientSession.cpp. Note that in most of my files I declare using namespace std;

    shared_ptr<FileList> fileList; //ClientSession.h
    AudioReader audioreader; //ClientSession.h
    
    string _storedpath; //FileList.h
    unique_ptr<vector<string>> _filenames; //FileList.h
    
    //FileList.cpp
    string FileList::getFullPath(int i)
    {
        string ret = "";
        unique_lock<mutex> listLock(listmtx);
        if (static_cast<size_t>(i) < _count)
        {
            ret = _storedpath + _filenames->at(i);
        }
        else
        {
            //rather than go out of bounds, return the last element, as returning an error over the network is difficult at present
            ret = _storedpath + _filenames->at(_count - 1);
        }
        return ret;
    }
    
    unique_ptr<AudioReader_Impl> audioReaderImpl; //AudioReader.h
    
    //AudioReader.cpp
    HRESULT AudioReader::setFile(TCHAR * fileName)
    {
        return audioReaderImpl->setFile(fileName);
    }
    
    HANDLE AudioReader_Impl::fileHandle; //AudioReader.cpp
    //AudioReader.cpp
    HRESULT AudioReader_Impl::setFile(TCHAR * fileName)
    {
        fileHandle = CreateFile(fileName, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
        if (fileHandle == INVALID_HANDLE_VALUE)
        {
            return HRESULT_FROM_WIN32(GetLastError());
        }
        if (SetFilePointer(fileHandle, 0, NULL, FILE_BEGIN) == INVALID_SET_FILE_POINTER)
        {
            return HRESULT_FROM_WIN32(GetLastError());
        }
        return S_OK;
    }
    
    • user541686
      user541686 over 9 years
      This isn't relevant to the warning, but what's the encoding of your string? copy doesn't necessarily work...
    • Dan Korn
      Dan Korn over 9 years
      It's hard to know what you need to do without, at the very least, seeing the declaration of the setfile() function. Anyway, the first parameter to CreateFile() is of type LPCTSTR, a pointer to a const TCHAR. Presumably the setfile function also takes a const pointer. So instead of making a non-const string, you should be able to just use whatever you were assigning to the std::string in the first place. Although, your program will be more portable around the world if you define UNICODE in the project settings and handle wide-character (wchar_t) strings instead of just 8-bit char strings.
    • Amadeus
      Amadeus over 9 years
      Is _UNICODE defined in your code? If not, then TCHAR* is defined as char*, so you could simply use the .c_str() function of std::string. Otherwise, use std::wstring instead and then .c_str() should return the unicode version. It may not be the perfect solution, but it seems the path of least resistance.
    • MrEricSir
      MrEricSir over 9 years
      Why not just call CreateFileA() and save yourself the trouble of re-encoding the string?
    • Force Gaia
      Force Gaia over 9 years
      More details given, I hope it provides the info you need
    • Remy Lebeau
      Remy Lebeau over 9 years
      DO NOT use #undef UNICODE in your code. Set the appropriate setting in the Project Options instead, let the compiler decide whether or not UNICODE should be defined. As for _UNICODE vs UNICODE, the first is used by the C runtime library while the second is used by the Win32 API.
    • Force Gaia
      Force Gaia over 9 years
      What exactly are the implications of that undef? I'd really rather not have to fix huge chunks of my program if removing it breaks things, especially if the implications are relatively minor.
    • StilesCrisis
      StilesCrisis over 9 years
      Where does _storedpath come from? That's your problem. It ought to be a tstring. You don't show its source.
    • Force Gaia
      Force Gaia over 9 years
      _storedpath is a std::string that contains the directory I'm searching, as it's consistent throughout
  • marc
    marc over 9 years
    You may want to include a discussion of ownership of the return value of filename.c_str(). What is going to happen once filename or w_filename goes out of scope? Right, the return value of c_str() is invalid then. This means that we want to create a copy of it, for which the OP controls the lifetime.
  • Remy Lebeau
    Remy Lebeau over 9 years
    I read the OP's question as saying the input value of setFile() is passed as-is to CreateFile(). That implies that ownership remains with the caller.
  • Force Gaia
    Force Gaia over 9 years
    I have just provided a few more details about fileList and audioreader if that alters your answer in any way.
  • Force Gaia
    Force Gaia over 9 years
    Both sound like a good way to go, I'll look into them then get back to you. I even used FindFirstFileA to solve a similar problem when dealing with that, I should have realised that there'd be similar for others. But knowing about std::wstring is also interesting. I may try and support unicode for robustness, but ANSI looks to be similar. Regardless, trying both will be good practice.
  • Remy Lebeau
    Remy Lebeau over 9 years
    Since you are using FindFirstFileA() to get the filename and storing it in a std::string, you should be using CreateFileA() to open the file. Obviously, your code is NOT designed to operate with Unicode (even though the OS and filesystem are based on Unicode), so you should not even be using TCHAR* to begin with. Get rid of it and just use char* instead. If you did upgrade to Unicode, you could then use wchar_t* and std::wstring with FindFirstFileW() and CreateFileW(). The ONLY reason to use TCHAR is if you still need to support Windows 9x/ME. Later versions use Unicode
  • Force Gaia
    Force Gaia over 9 years
    Good to know, I may even do that assuming it won't be a pain to convert.
  • Force Gaia
    Force Gaia over 9 years
    Nice idea, but going on what Matt McNabb mentioned this may just complicate things for me unnecessarily. However good to know these exist
  • Force Gaia
    Force Gaia over 9 years
    This did exactly what I needed: a small change that meant I could continue. I can worry about supporting other character encodings later, once the rest of the logic is in place.
  • StilesCrisis
    StilesCrisis over 9 years
    I think the question that needs answering is, where does the string come from to begin with and why is it not a tstring? If you're doing it right you shouldn't ever need to convert, unless you need to interface with a weird API or read from text files made with different encodings.
  • Force Gaia
    Force Gaia over 9 years
    The strings that were complicating matters were coming from FindFirstFileA(), which I now know is due to the two different versions of the API, and I was specifically using the ANSI one. Remy Lebeau and Matt McNabb helped me understand what the source of the issue was. Not that you didn't try; it's just that I didn't understand what a TCHAR actually was.
  • StilesCrisis
    StilesCrisis over 9 years
    Yeah, you needed to use FindFirstFile (no A or W) to explicitly get the TCHAR version.
  • Clearer
    Clearer about 6 years
    Being a complete novice in Windows programming; could you point to an implementation of make_wstring?