Why is the default value of the string type null instead of an empty string?
Solution 1
Why is the default value of the string type null instead of an empty string?
Because string is a reference type and the default value for all reference types is null.
It's quite annoying to test all my strings for null before I can safely apply methods like ToUpper(), StartsWith(), etc.
That is consistent with the behaviour of reference types. Before invoking their instance members, one should put a check in place for a null reference.
If the default value of string were the empty string, I would not have to test, and I would feel it to be more consistent with the other value types like int or double for example.
Giving one specific reference type a default value other than null would make it inconsistent with every other reference type.
Additionally, Nullable<String> would make sense.
Nullable<T> works only with value types. Of note is the fact that Nullable<T> was not part of the original .NET platform (it arrived with .NET 2.0), so there would have been a lot of broken code had they changed that rule. (Courtesy @jcolebrand)
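The difference in defaults described above can be seen directly. The following is a minimal sketch, not part of the original answer; the class and helper names are hypothetical and exist only for illustration.

```csharp
using System;

static class DefaultValuesDemo
{
    // Hypothetical helpers (not from the original answer) that return the
    // language-defined default of each kind of type.
    public static string DefaultString() => default(string);   // reference type
    public static int DefaultInt() => default(int);            // value type
    public static int? DefaultNullableInt() => default(int?);  // Nullable<int>

    static void Main()
    {
        Console.WriteLine(DefaultString() == null);       // True
        Console.WriteLine(DefaultInt());                  // 0
        Console.WriteLine(DefaultNullableInt().HasValue); // False
        // Nullable<string> does not compile: T must be a non-nullable value type.
    }
}
```

Only the value type gets a usable default; the reference type and the nullable wrapper both start out with no value.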
Solution 2
Habib is right: because string is a reference type.
But more importantly, you don't have to check for null each time you use it. You probably should throw an ArgumentNullException if someone passes your function a null reference, though.
Here's the thing: the framework would throw a NullReferenceException for you anyway if you tried to call .ToUpper() on a null string. Remember that this can still happen even if you test your arguments for null, since any property or method on the objects passed to your function as parameters may evaluate to null.
That being said, checking for empty strings or nulls is a common thing to do, so they provide String.IsNullOrEmpty() and String.IsNullOrWhiteSpace() for just this purpose.
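As a rough illustration of the advice above, here is a sketch of a method that validates its argument up front. The class name NameFormatter and the method Shout are made up for this example, and ToUpperInvariant is used instead of ToUpper only so the output does not depend on the current culture.

```csharp
using System;

static class NameFormatter
{
    // Hypothetical method illustrating argument validation.
    public static string Shout(string name)
    {
        // Fail fast and name the offending parameter.
        if (name == null)
            throw new ArgumentNullException(nameof(name));

        // Treat empty or whitespace-only input as "nothing to shout".
        if (string.IsNullOrWhiteSpace(name))
            return string.Empty;

        return name.ToUpperInvariant();
    }

    static void Main()
    {
        Console.WriteLine(Shout("marcel"));      // MARCEL
        Console.WriteLine(Shout("   ").Length);  // 0

        try { Shout(null); }
        catch (ArgumentNullException ex) { Console.WriteLine(ex.ParamName); } // name
    }
}
```

The exception carries the parameter name, which is usually easier to diagnose than a NullReferenceException raised somewhere deeper in the call.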
Solution 3
You could write an extension method (for what it's worth):
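    // Extension methods must be declared inside a static class.
    public static string EmptyNull(this string str)
    {
        // Map null to the empty string; any other value passes through unchanged.
        return str ?? "";
    }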
Now this works safely:
    string str = null;
    string upper = str.EmptyNull().ToUpper();
Solution 4
You could also use the following, as of C# 6.0:
    string myString = null;
    string result = myString?.ToUpper();
The string result will be null.
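If an empty string is wanted instead of null, the null-conditional operator can be combined with the null-coalescing operator. This is a small sketch under that assumption; the helper name UpperOrEmpty is hypothetical and not part of the original answer.

```csharp
using System;

static class NullConditionalDemo
{
    // Hypothetical helper: upper-cases the input, mapping null to "".
    public static string UpperOrEmpty(string value) =>
        value?.ToUpperInvariant() ?? "";

    static void Main()
    {
        string myString = null;

        // ?. short-circuits: the call is skipped and the result is null.
        Console.WriteLine(myString?.ToUpperInvariant() == null); // True

        // ?? then substitutes a fallback for the null result.
        Console.WriteLine(UpperOrEmpty(myString).Length);        // 0
        Console.WriteLine(UpperOrEmpty("abc"));                  // ABC
    }
}
```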
Solution 5
Empty strings and nulls are fundamentally different. A null is an absence of a value and an empty string is a value that is empty.
A programming language that assumes a "value" for a variable, in this case an empty string, is doing no better than initializing the string with any other arbitrary value that happens not to cause a null reference problem.
Also, if you pass the handle to that string variable to other parts of the application, that code will have no way of telling whether you intentionally passed a blank value or forgot to populate the variable.
Another occasion where this would be a problem is when the string is a return value from some function. Since string is a reference type and can technically be either null or empty, the function can technically return either one (there is nothing to stop it from doing so). If both an empty string and a null are used to mean "absence of a value", all the code that consumes this function has to do two checks: one for empty and one for null.
In short, it's always good to have only one representation for a single state. For a broader discussion of empty strings and nulls, see the link below.
https://softwareengineering.stackexchange.com/questions/32578/sql-empty-string-vs-null-value
Marcel
Updated on July 10, 2022
Comments
-
Marcel almost 2 years
It's quite annoying to test all my strings for null before I can safely apply methods like ToUpper(), StartsWith(), etc.
If the default value of string were the empty string, I would not have to test, and I would feel it to be more consistent with the other value types like int or double, for example. Additionally, Nullable<String> would make sense.
So why did the designers of C# choose to use null as the default value of strings?
Note: This relates to this question, but is more focused on the why instead of what to do with it.
-
Jon Skeet over 11 years
Do you consider this a problem for other reference types?
-
Marcel over 11 years
@JonSkeet No, but only because I initially, wrongly, thought that strings are value types.
-
T.J. Crowder over 11 years
@Marcel: That's a pretty good reason for wondering about it.
-
Konrad Rudolph over 11 years
@JonSkeet Yes. Oh yes. (But you’re no stranger to the non-nullable reference type discussion …)
-
diegoreymendez over 11 years
I believe you would have a much better time if you used assertions on your strings in places where you expect them NOT to be null (and I also recommend that you conceptually treat null and empty strings as different things). A null value could be the result of an error somewhere, while an empty string should convey a different meaning.
-
JohnCastle over 11 years
Null is starting to be not very well regarded. See here, "Null Reference: the billion dollar mistake" qconlondon.com/london-2009/presentation/…. Or here, by the Google Guava library (in Java but still relevant) code.google.com/p/guava-libraries/wiki/…
-
jcolebrand over 11 years
@JohnCastle I dare you to ask database developers who understand the value of trinary state if you can take their nulls from them. The reason it was no good was because people don't think in trinary; it's either left or right, up or down, yes or no. Relational algebra NEEDS a trinary state.
-
Dave Markle over 11 years
But please don't. The last thing another programmer wants to see is thousands of lines of code peppered with .EmptyNull() everywhere just because the first guy was "scared" of exceptions.
-
Tim Schmelter over 11 years
@DaveMarkle: But obviously it's exactly what OP was looking for. "It's quite annoying to test all my strings for null before I can safely apply methods like ToUpper(), StartsWith() etc"
-
Dave Markle over 11 years
The comment was to the OP, not to you. While your answer is clearly correct, a programmer asking a basic question such as this should be strongly cautioned against actually putting your solution into WIDE practice, as often is their wont. There are a number of tradeoffs you don't discuss in your answer, such as opaqueness, increased complexity, difficulty of refactoring, potential overuse of extension methods, and yes, performance. Sometimes (many times) a correct answer is not the right path, and this is why I commented.
-
Andy over 11 years
You should never throw a NullReferenceException yourself (msdn.microsoft.com/en-us/library/ms173163.aspx); you throw an ArgumentNullException if your method can't accept null refs. Also, NullRefs are typically one of the more difficult exceptions to diagnose when you're fixing issues, so I don't think the recommendation to not check for null is a very good one.
-
Tim Schmelter over 11 years
Many people seem to share your opinion. Note that I've not encouraged replacing all occurrences of string with EmptyNull. It's just a direct answer to OP's requirement. Many programmers know what they are doing or are working on their own (as I am). Btw, here I've found a question which targets this issue: stackoverflow.com/questions/8536740/…
-
Andy over 11 years
@DaveMarkle The last thing another programmer wants to deal with are NullRefExceptions everywhere because proper null checking wasn't done.
-
Admin over 11 years
Isn't the same true of the object keyword? Though admittedly, that's far less used than string.
-
Andy over 11 years
And how exactly do you see this difference, say in a text box? Did the user forget to enter a value in the field, or are they purposefully leaving it blank? Null in a programming language does have a specific meaning: unassigned. We know it doesn't have a value, which is not the same as a database null.
-
Henk Holterman over 11 years
But string has special support in several areas (string literals) so it could have been implemented (easily).
-
Louis Kottmann over 11 years
@Andy "NullRef's are typically one of the most difficult exceptions to diagnose" I strongly disagree, if you log stuff it's really easy to find & fix (just handle the null case).
-
Admin over 11 years
@HenkHolterman One could implement a whole ton of things, but why introduce such a glaring inconsistency?
-
Abbas Gadhia over 11 years
There's not much difference when you use it with a text box. Either way, having one notation to represent the absence of a value in a string is paramount. If I had to pick one, I'd pick null.
-
Kos over 11 years
Throwing ArgumentNullException has the additional benefit of being able to provide the parameter name. During debugging, this saves... err, seconds. But important seconds.
-
Dave Markle over 11 years
@Andy: The solution to not having proper null checking done is to properly check for nulls, not to put a band-aid on a problem.
-
Henk Holterman over 11 years
@delnan - "why" was the question here.
-
user over 11 years
If you're going through the trouble of writing .EmptyNull(), why not simply use (str ?? "") instead where it is needed? That said, I agree with the sentiment expressed in @DaveMarkle's comment: you probably shouldn't. null and String.Empty are conceptually different, and you can't necessarily treat one the same as another.
-
Fabricio Araujo over 11 years
In Delphi, string is a value type and therefore can't be null. It makes life a lot easier in this respect. I find it very annoying that string was made a reference type.
-
Admin over 11 years
@HenkHolterman And "Consistency" is the rebuttal to your point "string could be treated unlike other reference types".
-
Admin over 11 years
Speaking as someone who works in SQL a lot, and has dealt with the headache of Oracle not making a distinction between NULL and zero-length, I am very glad that .NET does. "Empty" is a value, "null" is not.
-
Fabricio Araujo over 11 years
@JonofAllTrades: I disagree. In application code, except when dealing with db code, there's no point in a string being treated as a class. It's a value type and a basic one. Supercat: +1 to you
-
Fabricio Araujo over 11 years
@delnan: Having worked with a language that treats strings as value types, and having worked 2+ years on .NET, I agree with Henk. I see it as a major FLAW in .NET.
-
Admin over 11 years
Database code is a big "except". As long as there are some problem domains where you need to distinguish between "present/known, an empty string" and "not present/unknown/inapplicable", such as databases, then the language needs to support it. Of course now that .NET has Nullable<>, strings could be reimplemented as value types; I can't speak to the costs and benefits of such a choice.
-
Thorarin over 11 years
As int is an alias for System.Int32. What's your point? :)
-
jcolebrand over 11 years
It could be because Nullable was only introduced in the .NET 2.0 Framework, so before then it wasn't available?
-
Marcel over 11 years
@jcolebrand Thanks for your initial edit, I would give it a 1+ if I could. I however compacted your edit a bit with a new edit.
-
Marcel over 11 years
Thanks for mentioning the "first initialisation" thing.
-
Dan Burton over 11 years
How would it be a problem when initializing a large array? Since, as you said, Strings are immutable, all elements of the array would simply be pointers to the same String.Empty. Am I mistaken?
-
Marcel over 11 years
Thanks Dan Burton for pointing out that someone CAN set the initialized value to null on reference types later on. Thinking this through tells me that my original intent in the question leads to no use.
-
supercat over 11 years
@JonofAllTrades: Code that deals with numbers has to have an out-of-band means of distinguishing the default value zero from "undefined". As it is, nullable-handling code that works with strings and numbers has to use one method for nullable strings and another for nullable numbers. Even if a nullable class type string is more efficient than Nullable<string> would be, having to use the "more efficient" method is more burdensome than being able to use the same approach for all nullable database values.
-
supercat over 11 years
Under COM (the Component Object Model), which predated .NET, a string type would either hold a pointer to the string's data, or null to represent the empty string. There are a number of ways .NET could have implemented similar semantics, had they chosen to do so, especially given that String has a number of characteristics that make it a unique type anyway [e.g. it and the two array types are the only types whose allocation size isn't constant].
-
supercat over 11 years
@delnan: One could create a value type which behaved essentially like String, except for (1) the value-type-ish behavior of having a usable default value, and (2) an unfortunate extra layer of boxing indirection any time it was cast to Object. Given that the heap representation of string is unique, having special treatment to avoid extra boxing wouldn't have been much of a stretch (actually, being able to specify non-default boxing behaviors would be a good thing for other types as well).
-
supercat over 11 years
Of course, if one could specify that certain instance methods should be called directly without regard for whether they are invoked on null references (as happens with extension methods), the horribly ugly syntax String.IsNullOrEmpty(myString) could be replaced with myString.IsNullOrEmpty.
-
jcolebrand over 11 years
@Marcel that's fine. I wanted to make sure it was seen to be an addition, and since you were the primary interested party, I'm glad you're the one that made the edit to clean it up a bit. :D I don't need the +1s, just for SE to be a better resource in the future :D
-
jcolebrand over 11 years
@supercat is the value treated differently because there was no Nullable at the beginning or because strings are 90% of why we use computers in the modern age?
-
supercat over 11 years
The default value for any type is going to have all bits set to zero. The only way for the default value of string to be an empty string is to allow all-bits-zero as a representation of an empty string. There are a number of ways this could be accomplished, but I don't think any involve initializing references to String.Empty.
-
supercat over 11 years
@jcolebrand: I wouldn't say 90%. Graphics and audio processing, both of which are primarily numeric, account for a pretty hefty chunk. Most of the situations which would benefit from String being nullable would actually benefit more from being able to use the same logic to handle maybe-valid strings and maybe-valid numeric types, than from having string behave as a reference type which must be handled differently.
-
djv over 11 years
Other answers discussed this point as well. I think people have concluded that it wouldn't make sense to treat the String class as a special case and provide something other than all-bits-zero as an initialization, even if it was something like String.Empty or "".
-
jcolebrand over 11 years
You, sir, have never worked with a number of my previous employers ... ;-)
-
Alessandro Da Rugna over 11 years
@Thorarin @delnan: They're both aliases, but System.Int32 is a Struct, thus having a default value, while System.String is a Class, having a pointer with a default value of null. They're visually presented in the same font/color. Without knowledge, one can think they act the same way (= having a default value). My answer was written with a cognitive psychology (en.wikipedia.org/wiki/Cognitive_psychology) idea behind it :-)
-
supercat over 11 years
@DanV: Changing the initialization behavior of string storage locations would have required also changing the initialization behavior of all structs or classes which have fields of type string. That would represent a pretty big change in the design of .NET, which presently expects to zero-initialize any type without even having to think about what it is, save only for its total size.
-
supercat over 11 years
There's no particular reason string would have to be a reference type. To be sure, the actual characters that make up the string certainly have to be stored on the heap, but given the amount of dedicated support that strings have in the CLR already, it would not be a stretch to have System.String be a value type with a single private field Value of type HeapString. That field would be a reference type, and would default to null, but a String struct whose Value field was null would behave as an empty string. The only disadvantage of this approach would be...
-
supercat over 11 years
...that casting a String to Object would, in the absence of special-case code in the runtime, cause the creation of a boxed String instance on the heap, rather than simply copying a reference to the HeapString.
-
Henk Holterman over 11 years
@supercat - nobody is saying that string should/could be a value type.
-
Dave Markle over 11 years
Not sure what "ugly" means here, but if it means "consistent with everything else in the language and not hard to understand", then I guess it's ugly.
-
supercat over 11 years
Nobody except me. Having string be a "special" value type (with a private reference-type field) would allow most handling to be essentially as efficient as it is now, except for an added null check on methods/properties like .Length etc. so that instances which hold a null reference would not attempt to dereference it but instead behave as appropriate for an empty string. Whether the Framework would be better or worse with string implemented that way, if one wanted default(string) to be an empty string...
-
supercat over 11 years
...having string be a value-type wrapper on a reference-type field would be the approach that required the fewest changes to other parts of .NET [indeed, if one were willing to accept having the conversion from String to Object create an extra boxed item, one could simply have String be an ordinary struct with a field of type Char[] which it never exposed]. I think having a HeapString type would probably be better, but in some ways the value-type string holding a Char[] would be simpler.
-
Nathan Koop over 11 years
@DaveMarkle you may want to include IsNullOrWhiteSpace too msdn.microsoft.com/en-us/library/…
-
Henk Holterman over 11 years
You know, when 1 comment isn't enough, you probably shouldn't post as a comment. The lack of formatting increases the TL;DR factor.
-
Patrick Magee about 11 years
Sometimes it's nice to have clean looking Extension methods like this, not having to slap value ?? "" everywhere.
-
Jeppe Stig Nielsen almost 11 years
Of course you could go even further in this (bad?) direction with public static string ToUpperSafe(this string str) { return str == null ? null : str.ToUpper(); } and so on...
-
Thomas Koelle over 9 years
I am fairly certain it was Anders Hejlsberg who said it in a Channel 9 interview. I know the difference between heap and stack, but the idea with C# is that the casual programmer doesn't need to.
-
sara about 8 years
I really think checking for null everywhere is a source of immense code bloat. It's ugly, it looks hacky, and it's hard to stay consistent. I think (at least in C#-like languages) a good rule is "ban the null keyword in production code, use it like crazy in test code".
-
Stijn Van Antwerpen over 7 years
To be correct: since C# 6.0. The version of the IDE has nothing to do with it, since this is a language feature.
-
Jaja Harris over 7 years
Another option: public string Name { get; set; } = string.Empty;
-
Hunter Nelson about 6 years
What is this called? myString?.ToUpper();
-
russelrillema about 6 years
It's called a Null-Conditional Operator. You can read about it here msdn.microsoft.com/en-us/magazine/dn802602.aspx