LPCSTR, LPCTSTR and LPTSTR
Solution 1
To answer the first part of your question:
- LPCSTR is a pointer to a const string (LP means Long Pointer)
- LPCTSTR is a pointer to a const TCHAR string (TCHAR being either a wide char or char, depending on whether UNICODE is defined in your project)
- LPTSTR is a pointer to a (non-const) TCHAR string
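As a rough illustration of what the const in these names applies to, here is a minimal sketch. It assumes a build without <windows.h>, so the typedefs are local stand-ins that mirror the Windows definitions; tstr_length, tstr_fill and demo are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in typedefs that mirror the Windows headers (assumption: no <windows.h>).
typedef char    CHAR;
typedef wchar_t WCHAR;
#ifdef UNICODE
typedef WCHAR TCHAR;
#else
typedef CHAR TCHAR;
#endif
typedef const CHAR*  LPCSTR;   // always char, whatever UNICODE says
typedef const TCHAR* LPCTSTR;  // the characters are read-only
typedef TCHAR*       LPTSTR;   // the characters may be modified

// Hypothetical helper: only reads the characters, so LPCTSTR is enough.
std::size_t tstr_length(LPCTSTR s) {
    std::size_t n = 0;
    while (s[n] != TCHAR(0)) ++n;
    return n;
}

// Hypothetical helper: writes through the pointer, so it needs LPTSTR.
void tstr_fill(LPTSTR s, TCHAR c) {
    for (; *s != TCHAR(0); ++s) *s = c;
}

// Hypothetical usage: the const applies to the characters, not to the pointer.
void demo() {
    TCHAR buf[] = { TCHAR('a'), TCHAR('b'), TCHAR(0) };
    LPCTSTR view = buf;       // adding const is an implicit conversion
    view = buf + 1;           // fine: the pointer itself can be re-seated
    // view[0] = TCHAR('x');  // would not compile: the characters are const
    tstr_fill(buf, TCHAR('x'));
    assert(tstr_length(buf) == 2 && view[0] == TCHAR('x'));
}
```

Going from LPTSTR to LPCTSTR needs no cast; going the other way does, which is the situation the second part of the question is about.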
In practice when talking about these in the past, we've left out the "pointer to a" phrase for simplicity, but as mentioned by lightness-races-in-orbit they are all pointers.
This is a great CodeProject article describing C++ strings (see about two-thirds of the way down for a chart comparing the different types).
Solution 2
Quick and dirty:
- LP == Long Pointer. Just think pointer or char*
- C = Const; in this case, I think they mean the character string is const, not the pointer being const.
- STR is string
- The T is for a wide character or char (TCHAR), depending on compiler options.
Bonus Reading
From What does the letter "T" in LPTSTR stand for?: archive
What does the letter "T" in LPTSTR stand for?
October 17th, 2006
The “T” in LPTSTR comes from the “T” in TCHAR. I don’t know for certain, but it seems pretty likely that it stands for “text”. By comparison, the “W” in WCHAR probably comes from the C language standard, where it stands for “wide”.
Solution 3
8-bit AnsiStrings
- char: 8-bit character (underlying C/C++ data type)
- CHAR: alias of char (Windows data type)
- LPSTR: null-terminated string of CHAR (Long Pointer)
- LPCSTR: constant null-terminated string of CHAR (Long Pointer Constant)
16-bit UnicodeStrings
- wchar_t: 16-bit character (underlying C/C++ data type)
- WCHAR: alias of wchar_t (Windows data type)
- LPWSTR: null-terminated string of WCHAR (Long Pointer)
- LPCWSTR: constant null-terminated string of WCHAR (Long Pointer Constant)
depending on UNICODE define
- TCHAR: alias of WCHAR if UNICODE is defined; otherwise CHAR
- LPTSTR: null-terminated string of TCHAR (Long Pointer)
- LPCTSTR: constant null-terminated string of TCHAR (Long Pointer Constant)
So:
Item | 8-bit (Ansi) | 16-bit (Wide) | Varies |
---|---|---|---|
character | CHAR | WCHAR | TCHAR |
string | LPSTR | LPWSTR | LPTSTR |
string (const) | LPCSTR | LPCWSTR | LPCTSTR |
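The "Varies" column can be sketched roughly as follows. This is not the actual header text: the typedefs are local stand-ins so the snippet builds without <windows.h>, and MY_TEXT is a hypothetical macro playing the role of the real TEXT()/_T() macros.

```cpp
#include <cstddef>

// Stand-ins for the fixed-width rows of the table (assumption: no <windows.h>).
typedef char    CHAR;
typedef wchar_t WCHAR;

// The "Varies" column: one name, two possible meanings, chosen at compile time.
#ifdef UNICODE
typedef WCHAR TCHAR;
#define MY_TEXT(s) L##s   // hypothetical stand-in for TEXT()/_T()
#else
typedef CHAR TCHAR;
#define MY_TEXT(s) s
#endif

typedef TCHAR*       LPTSTR;
typedef const TCHAR* LPCTSTR;

// The same source line yields a char or a wchar_t literal depending on UNICODE.
constexpr TCHAR kGreeting[] = MY_TEXT("hello");

// Hypothetical accessor: the array decays to a pointer to const TCHAR.
LPCTSTR greeting() { return kGreeting; }

// Five characters plus the terminator, each one TCHAR wide.
static_assert(sizeof(kGreeting) == 6 * sizeof(TCHAR), "size follows TCHAR");
```

Compiling the same file with and without -DUNICODE (or /DUNICODE) switches every T-type in one go, which is the point of the dual-compilation scheme.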
Bonus Reading
TCHAR → Text Char (archive.is)
Why is the default 8-bit codepage called "ANSI"?
From Unicode and Windows XP
by Cathy Wissink
Program Manager, Windows Globalization
Microsoft Corporation
May 2002
Despite the underlying Unicode support on Windows NT 3.1, code page support continued to be necessary for many of the higher-level applications and components included in the system, explaining the pervasive use of the “A” [ANSI] versions of the Win32 APIs rather than the “W” [“wide” or Unicode] versions. (The term “ANSI” as used to signify Windows code pages is a historical reference, but is nowadays a misnomer that continues to persist in the Windows community. The source of this comes from the fact that the Windows code page 1252 was originally based on an ANSI draft, which became ISO Standard 8859-1. However, in adding code points to the range reserved for control codes in the ISO standard, the Windows code page 1252 and subsequent Windows code pages originally based on the ISO 8859-x series deviated from ISO. To this day, it is not uncommon to have the development community, both within and outside of Microsoft, confuse the 8859-1 code page with Windows 1252, as well as see “ANSI” or “A” used to signify Windows code page support.)
Solution 4
Adding to John and Tim's answer.
Unless you are coding for Win98, there are only two of the 6+ string types you should be using in your application:
- LPWSTR
- LPCWSTR
The rest are meant to support ANSI platforms or dual compilations. Those are not as relevant today as they used to be.
Solution 5
To answer the second part of your question, you need to do things like
LV_DISPINFO dispinfo;
dispinfo.item.pszText = LPTSTR((LPCTSTR)string);
because MS's LVITEM struct has an LPTSTR, i.e. a mutable T-string pointer, not an LPCTSTR. What you are doing is
1) converting string (a CString at a guess) into an LPCTSTR (which in practice means getting the address of its character buffer as a read-only pointer)
2) converting that read-only pointer into a writeable pointer by casting away its const-ness.
Whether or not there is a chance that your ListView call will end up trying to write through that pszText depends on what dispinfo is used for. If it does, this is a potentially very bad thing: after all, you were given a read-only pointer and then decided to treat it as writeable: maybe there is a reason it was read-only!
If it is a CString you are working with, you have the option to use string.GetBuffer() -- that deliberately gives you a writeable LPTSTR. You then have to remember to call ReleaseBuffer() if the string does get changed. Or you can allocate a local temporary buffer and copy the string into there.
99% of the time this will be unnecessary and treating the LPCTSTR as an LPTSTR will work... but one day, when you least expect it...
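The "local temporary buffer" option could look something like the sketch below. It is only an illustration under stated assumptions: it builds without <windows.h> or MFC, ItemStandIn is a hypothetical stand-in for the pszText member of LVITEM, and make_writable_copy and demo are made-up names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in typedefs that mirror the Windows headers (assumption: no <windows.h>).
typedef char    CHAR;
typedef wchar_t WCHAR;
#ifdef UNICODE
typedef WCHAR TCHAR;
#else
typedef CHAR TCHAR;
#endif
typedef TCHAR*       LPTSTR;
typedef const TCHAR* LPCTSTR;

// Hypothetical stand-in for the one LVITEM member that matters here.
struct ItemStandIn {
    LPTSTR pszText;
};

// Copies the text, terminator included, into a buffer the caller owns.
std::vector<TCHAR> make_writable_copy(LPCTSTR text) {
    const std::basic_string<TCHAR> s(text);
    return std::vector<TCHAR>(s.c_str(), s.c_str() + s.size() + 1);
}

// Hypothetical usage: hand out a pointer into the copy instead of casting.
void demo() {
    const TCHAR source[] = { TCHAR('h'), TCHAR('i'), TCHAR(0) };
    std::vector<TCHAR> buffer = make_writable_copy(source);

    ItemStandIn item;
    item.pszText = buffer.data();   // no cast: the buffer really is writeable
    item.pszText[0] = TCHAR('H');   // a write only touches the copy

    assert(buffer[0] == TCHAR('H'));
    assert(source[0] == TCHAR('h'));  // the original text is untouched
}
```

The buffer has to outlive every use of the pointer stored in the struct, so in real code it would need to live at least as long as the call that consumes the item.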
nothingMaster
Updated on December 11, 2021
Comments
-
nothingMaster over 2 years
What is the difference between LPCSTR, LPCTSTR and LPTSTR?
Why do we need to do this to convert a string into a LV_ITEM structure variable pszText:
LV_DISPINFO dispinfo;
dispinfo.item.pszText = LPTSTR((LPCTSTR)string);
-
nothingMaster over 15 years
I quickly scanned that article - seems great, adding it to my bookmarks and will read it as soon as I have time.
-
josesuero over 15 years
T is not for wide character, it is for varying character type. W is for wide (as in WCHAR). If UNICODE is defined, TCHAR == WCHAR, otherwise TCHAR == CHAR. So if UNICODE is not defined, LPCTSTR == LPCSTR.
-
Tim over 15 years
that is why I wrote "depending on compile options"
-
Dzung Nguyen about 14 years
I really love this type of explaining :) . Thanks so much
-
JaredPar almost 14 years
@BlueRaja, I was mainly referring to C based strings in my answer. But for C++ I would avoid std::string because it is still an ASCII based string and prefer std::wstring instead.
-
Pacerier almost 9 years
@jalf, So what does T stand for?
-
josesuero almost 9 years
@Pacerier I'm not sure. "Template" or "Type", possibly?
-
Tim almost 9 years
definitely not template and not type. codeproject.com/Articles/76252/…
-
Lightness Races in Orbit almost 9 years
All wrong. None of these things are strings. They are all pointers. -1
-
StrayPointer almost 9 years
@LightnessRacesinOrbit You are technically correct - although in my experience it is common practice to leave out the "pointer to a...." description for brevity when referring to string types in C++
-
Lightness Races in Orbit almost 9 years
@JohnSibly: In C, yes. In C++, it absolutely shouldn't be!!
-
osvein almost 7 years
You should be using LPTSTR and LPCTSTR unless you are calling the ASCII (*A) or widechar (*W) versions of functions directly. They are aliases of whatever character width you specify when you compile.
-
u8it over 6 years
Notice that that codeproject article was written 15 years ago and, unless it gets updated, contains misleading assumptions about Unicode characters always being 2 bytes. That's entirely wrong. Even UTF16 is variable length... it is much better to say that wide characters are UCS-2 encoded, and that "Unicode" in this context refers to UCS-2.
-
Dan Bechard about 6 years
Shame this answer will never make it to the top because it's so new.. that's really something SO needs to fix. This is the best answer by far.
-
Yoon5oo almost 6 years
This really helps me a lot while I am doing Unicode project at the work. Thanks!
-
harper almost 6 years
You should avoid C style cast and use xxx_cast<>() instead.
-
AAT over 5 years
@harper You are quite right -- but I was quoting the OP, that is the code he was asking about. If I'd written the code myself it would certainly have used xxx_cast<> rather than mixing two different bracket-based casting styles!
-
Margaret Bloom over 5 years
Nice answer. I think it's worth adding that the unicode version uses UTF16, so each 16-bit chunk is not a character but a code-unit. The names are historical (when Unicode === UCS2).
-
plugwash over 5 years
It's a mess, Unicode characters were originally meant to be two bytes, but that turned out not to be enough. So UTF-16 was designed to shoehorn modern unicode into systems that were originally designed for 16 bit unicode. On modern windows a "wide string" is actually a sequence of UTF-16 code units.
-
plugwash over 5 years
Of course the characters outside the basic multilingual plane are pretty rare, so most of the time you can get away with ignoring this detail.
-
Justin Time - Reinstate Monica over 4 years
Hmm... in this case, @LightnessRacesinOrbit, I would add an addendum that it's okay to leave out the "pointer to a..." when referring to C-strings in C++, if-and-only-if referring specifically to (decayed) string literals, or when interfacing/working with code that's either written in C, relies on C types instead of C++ types, and/or has C linkage via extern "C". Apart from that, yeah, it definitely should need either the "pointer" bit, or specific description as a C string.
-
Justin Time - Reinstate Monica over 4 years
...And now that Microsoft is working on making the *A versions of WinAPI compatible with the UTF-8 code page, they're suddenly a lot more relevant. ;P
-
StayOnTarget over 2 years
@IanBoyd that link seems to have died? Do you know what the title of the article was?
-
Pierre over 2 years
In retrospect, it is now evident wchar_t was a mistake. MS should have gone with UTF-8. That's what most of the World is doing. Qt solves this beautifully with QString.
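Several comments above point out that a wide string on Windows is a sequence of UTF-16 code units rather than characters. A small sketch of that distinction follows; it uses the standard char16_t type instead of WCHAR, because wchar_t is only 16 bits wide on Windows, and code_units is a hypothetical helper written for this example.

```cpp
#include <cstddef>

// Hypothetical helper: counts UTF-16 code units up to the terminator.
constexpr std::size_t code_units(const char16_t* s) {
    return *s == u'\0' ? 0 : 1 + code_units(s + 1);
}

// U+00E9 lies inside the Basic Multilingual Plane: one code unit.
static_assert(code_units(u"\u00E9") == 1, "BMP character");

// U+1F600 lies outside the BMP: it is stored as a surrogate pair.
static_assert(code_units(u"\U0001F600") == 2, "surrogate pair");
```

So a length measured in WCHARs matches the number of characters only as long as the text stays inside the Basic Multilingual Plane.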