Why is the default value of the string type null instead of an empty string?

Solution 1

Why is the default value of the string type null instead of an empty string?

Because string is a reference type and the default value for all reference types is null.

It's quite annoying to test all my strings for null before I can safely apply methods like ToUpper(), StartsWith() etc...

That is consistent with the behaviour of reference types. Before invoking their instance members, one should put a check in place for a null reference.

If the default value of string were the empty string, I would not have to test, and I would feel it to be more consistent with the other value types like int or double for example.

Giving one specific reference type a default value other than null would make it inconsistent with all other reference types.

Additionally Nullable<String> would make sense.

Nullable<T> works only with value types. Note also that Nullable<T> was not part of the original .NET platform (it arrived in .NET 2.0), so changing that rule afterwards would have broken a lot of existing code. (Courtesy @jcolebrand)
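As a minimal sketch of the rule described above (the class name DefaultValuesDemo is made up for illustration), the following prints the default values of a reference type, two value types, and a nullable value type:

```csharp
using System;

public static class DefaultValuesDemo
{
    public static void Main()
    {
        string s = default(string);   // reference type: default is null
        int i = default(int);         // value type: default is 0
        double d = default(double);   // value type: default is 0
        int? n = default(int?);       // Nullable<int>: default has no value

        Console.WriteLine(s == null);   // True
        Console.WriteLine(i);           // 0
        Console.WriteLine(d);           // 0
        Console.WriteLine(n.HasValue);  // False

        // Nullable<T> requires T to be a value type, so this does not compile:
        // Nullable<string> ns = null;
    }
}
```

The commented-out line is the reason Nullable<String> is not available: string is already a reference type and can hold null on its own.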

Solution 2

Habib is right -- because string is a reference type.

But more importantly, you don't have to check for null each time you use it. You probably should throw an ArgumentNullException if someone passes your function a null reference, though.

Here's the thing -- the framework would throw a NullReferenceException for you anyway if you tried to call .ToUpper() on a null string. Remember that this can still happen even if you test your arguments for null, since any property or method on the objects passed to your function as parameters may itself evaluate to null.

That being said, checking for empty strings or nulls is a common thing to do, so they provide String.IsNullOrEmpty() and String.IsNullOrWhiteSpace() for just this purpose.
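A rough sketch of how these pieces might fit together is shown here. The Greeter class and its Greet method are hypothetical names chosen for this example; only ArgumentNullException and String.IsNullOrWhiteSpace() come from the framework.

```csharp
using System;

public static class Greeter
{
    // Hypothetical helper: rejects null up front, treats blank input as "no name".
    public static string Greet(string name)
    {
        if (name == null)
        {
            // nameof (C# 6.0 and later) puts the parameter name into the exception.
            throw new ArgumentNullException(nameof(name));
        }

        if (string.IsNullOrWhiteSpace(name))
        {
            return "Hello, stranger!";
        }

        return "Hello, " + name.ToUpper() + "!";
    }

    public static void Main()
    {
        Console.WriteLine(Greet("Marcel"));  // Hello, MARCEL!
        Console.WriteLine(Greet("   "));     // Hello, stranger!

        try
        {
            Greet(null);
        }
        catch (ArgumentNullException ex)
        {
            Console.WriteLine(ex.ParamName); // name
        }
    }
}
```

Checking the argument at the top of the method means the failure names the offending parameter, rather than surfacing later as a NullReferenceException somewhere inside the method body.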

Solution 3

You could write an extension method (for what it's worth):

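// Extension methods must be declared in a static, non-generic class.
public static class StringExtensions
{
    public static string EmptyNull(this string str)
    {
        return str ?? "";
    }
}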

Now this works safely:

string str = null;
string upper = str.EmptyNull().ToUpper();

Solution 4

You could also use the following, as of C# 6.0:

string myString = null;
string result = myString?.ToUpper();

The string result will be null.
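If an empty string is wanted instead of null, the null-conditional operator can be combined with the null-coalescing operator. The sketch below assumes a hypothetical helper named UpperOrEmpty; it is one possible way to combine the two operators, not something the answer above prescribes.

```csharp
using System;

public static class NullConditionalDemo
{
    // Hypothetical helper: upper-cases the input, mapping null to "".
    public static string UpperOrEmpty(string value)
    {
        // value?.ToUpper() yields null when value is null;
        // ?? then substitutes the empty string.
        return value?.ToUpper() ?? "";
    }

    public static void Main()
    {
        string myString = null;

        string result = myString?.ToUpper();  // null, no exception thrown
        int? length = myString?.Length;       // int? because Length is a value type

        Console.WriteLine(result == null);            // True
        Console.WriteLine(length.HasValue);           // False
        Console.WriteLine(UpperOrEmpty(null) == "");  // True
        Console.WriteLine(UpperOrEmpty("abc"));       // ABC
    }
}
```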

Solution 5

Empty strings and nulls are fundamentally different. A null is an absence of a value and an empty string is a value that is empty.

If the programming language made an assumption about the "value" of a variable, in this case an empty string, that would be no better than initializing the string with any other value that does not cause a null reference problem.

Also, if you pass the handle to that string variable to other parts of the application, that code will have no way of telling whether you intentionally passed a blank value or forgot to populate the variable.

Another occasion where this would be a problem is when the string is the return value of a function. Since string is a reference type and can technically be either null or empty, the function can return either one (there is nothing to stop it from doing so). Because there are then two notions of the "absence of a value", i.e. an empty string and a null, all the code that consumes this function has to do two checks: one for empty and one for null.

In short, it's always good to have only one representation for a single state. For a broader discussion of empty strings and nulls, see the links below.

https://softwareengineering.stackexchange.com/questions/32578/sql-empty-string-vs-null-value

NULL vs Empty when dealing with user input
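To make the distinction between the two states concrete, here is a small sketch in which null means "no value at all" and the empty string means "a value that happens to be blank". The SettingsDemo class, its Find and Describe methods, and the sample keys are all invented for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SettingsDemo
{
    // Hypothetical lookup: null means "key not present",
    // "" means "key present, value intentionally blank".
    public static string Find(IDictionary<string, string> settings, string key)
    {
        string value;
        return settings.TryGetValue(key, out value) ? value : null;
    }

    // Reports which of the three states the caller is looking at.
    public static string Describe(string value)
    {
        if (value == null) return "missing";
        if (value.Length == 0) return "blank";
        return "set to " + value;
    }

    public static void Main()
    {
        var settings = new Dictionary<string, string>
        {
            { "nickname", "" },
            { "city", "Zurich" }
        };

        Console.WriteLine(Describe(Find(settings, "nickname"))); // blank
        Console.WriteLine(Describe(Find(settings, "city")));     // set to Zurich
        Console.WriteLine(Describe(Find(settings, "phone")));    // missing
    }
}
```

If the default for string were the empty string, the "missing" and "blank" cases in this sketch could no longer be told apart by the caller.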


Author: Marcel

I am an experienced software developer for both technical and business software, mainly in C#/.NET. Most professional projects are web applications or web services in the telecommunications field, for large corporate customers. I work and live in Switzerland. In my spare time I build and hack hardware stuff and occasionally, I blog on https://qrys.ch about it.

Updated on July 10, 2022

Comments

  • Marcel
    Marcel almost 2 years

    It's quite annoying to test all my strings for null before I can safely apply methods like ToUpper(), StartsWith() etc...

    If the default value of string were the empty string, I would not have to test, and I would feel it to be more consistent with the other value types like int or double for example. Additionally Nullable<String> would make sense.

    So why did the designers of C# choose to use null as the default value of strings?

    Note: This relates to this question, but is more focused on the why instead of what to do with it.

    • Jon Skeet
      Jon Skeet over 11 years
      Do you consider this a problem for other reference types?
    • Marcel
      Marcel over 11 years
      @JonSkeet No, but only because I initially, wrongly, thought that strings are value types.
    • T.J. Crowder
      T.J. Crowder over 11 years
      @Marcel: That's a pretty good reason for wondering about it.
    • Konrad Rudolph
      Konrad Rudolph over 11 years
      @JonSkeet Yes. Oh yes. (But you’re no stranger to the non-nullable reference type discussion …)
    • diegoreymendez
      diegoreymendez over 11 years
      I believe you would have a much better time if you used assertions on your strings in places where you expect them NOT to be null (and also I recommend that you conceptually treat null and empty strings as different things). A null value could be the result of an error somewhere, while an empty string should convey a different meaning.
    • JohnCastle
      JohnCastle over 11 years
      Null starts to be not very well considered. See here, "Null Reference : the billion dollar mistake" qconlondon.com/london-2009/presentation/…. Or here by the Google Guava library (in Java but still relevant) code.google.com/p/guava-libraries/wiki/…
    • jcolebrand
      jcolebrand over 11 years
      @JohnCastle I dare you to ask database developers who understand the value of trinary state if you can take their nulls from them. The reason it was no good was because people don't think in trinary, it's either left or right, up or down, yes or no. Relational algebra NEEDS a trinary state.
  • Dave Markle
    Dave Markle over 11 years
    But please don't. The last thing another programmer wants to see is thousands of lines of code peppered with .EmptyNull() everywhere just because the first guy was "scared" of exceptions.
  • Tim Schmelter
    Tim Schmelter over 11 years
    @DaveMarkle: But obviously it's exactly what OP was looking for. "It's quite annoying to test all my strings for null before I can safely apply methods like ToUpper(), StartsWith() etc"
  • Dave Markle
    Dave Markle over 11 years
    The comment was to the OP, not to you. While your answer is clearly correct, a programmer asking a basic question such as this should be strongly cautioned against actually putting your solution into WIDE practice, as often is their wont. There are a number of tradeoffs you don't discuss in your answer, such as opaqueness, increased complexity, difficulty of refactoring, potential overuse of extension methods, and yes, performance. Sometimes (many times) a correct answer is not the right path, and this is why I commented.
  • Andy
    Andy over 11 years
    You should never throw a NullReferenceException yourself (msdn.microsoft.com/en-us/library/ms173163.aspx); you throw an ArgumentNullException if your method can't accept null refs. Also, NullRefs are typically one of the more difficult exceptions to diagnose when you're fixing issues, so I don't think the recommendation to not check for null is a very good one.
  • Tim Schmelter
    Tim Schmelter over 11 years
    Many people seem to share your opinion. Note that I have not encouraged replacing all occurrences of string with EmptyNull. It's just a direct answer to OP's requirement. Many programmers know what they are doing or are working on their own (as I am). Btw, here I've found a question which targets this issue: stackoverflow.com/questions/8536740/…
  • Andy
    Andy over 11 years
    @DaveMarkle The last thing another programmer wants to deal with are NullRefExceptions everywhere because proper null checking wasn't done.
  • Admin
    Admin over 11 years
    Isn't the same true of the object keyword? Though admittedly, that's far less used than string.
  • Andy
    Andy over 11 years
    And how exactly do you see this difference, say in a text box? Did the user forget to enter a value in the field, or are they purposefully leaving it blank? Null in a programming language does have a specific meaning; unassigned. We know it doesn't have a value, which is not the same as a database null.
  • Henk Holterman
    Henk Holterman over 11 years
    But string has special support in several areas (string literals) so it could have been implemented (easily).
  • Louis Kottmann
    Louis Kottmann over 11 years
    @Andy "NullRef's are typically one of the most difficult exceptions to diagnose" I strongly disagree, if you log stuff it's really easy to find & fix (just handle the null case).
  • Admin
    Admin over 11 years
    @HenkHolterman One could implement a whole ton of things, but why introduce such a glaring inconsistency?
  • Abbas Gadhia
    Abbas Gadhia over 11 years
    There's not much difference when you use it with a text box. Either way, having one notation to represent the absence of a value in a string is paramount. If I had to pick one, I'd pick null.
  • Kos
    Kos over 11 years
    Throwing ArgumentNullException has the additional benefit of being able to provide the parameter name. During debugging, this saves... err, seconds. But important seconds.
  • Dave Markle
    Dave Markle over 11 years
    @Andy: The solution to not having proper null checking done is to properly check for nulls, not to put a band-aid on a problem.
  • Henk Holterman
    Henk Holterman over 11 years
    @delnan - "why" was the question here.
  • user
    user over 11 years
    If you're going through the trouble of writing .EmptyNull(), why not simply use (str ?? "") instead where it is needed? That said, I agree with the sentiment expressed in @DaveMarkle's comment: you probably shouldn't. null and String.Empty are conceptually different, and you can't necessarily treat one the same as another.
  • Fabricio Araujo
    Fabricio Araujo over 11 years
    In Delphi, string is a value type and therefore can't be null. It makes life a lot easier in this respect - I really find it very annoying that string was made a reference type.
  • Admin
    Admin over 11 years
    @HenkHolterman And "Consistency" is the rebuttal to your point "string could be treated unlike other reference types".
  • Admin
    Admin over 11 years
    Speaking as someone who works in SQL a lot, and has dealt with the headache of Oracle not making a distinction between NULL and zero-length, I am very glad that .NET does. "Empty" is a value, "null" is not.
  • Fabricio Araujo
    Fabricio Araujo over 11 years
    @JonofAllTrades: I disagree. In application code, except when dealing with db code, there's no point in a string being treated as a class. It's a value type and a basic one. Supercat: +1 to you
  • Fabricio Araujo
    Fabricio Araujo over 11 years
    @delnan: Having worked in a language that treats strings as value types, and having worked 2+ years on dotnet, I agree with Henk. I see it as a major FLAW in dotnet.
  • Admin
    Admin over 11 years
    Database code is a big "except". As long as there are some problem domains where you need to distinguish between "present/known, an empty string" and "not present/unknown/inapplicable", such as databases, then the language needs to support it. Of course now that .NET has Nullable<>, strings could be reimplemented as value types; I can't speak to the costs and benefits of such a choice.
  • Thorarin
    Thorarin over 11 years
    As int is an alias for System.Int32. What's your point? :)
  • jcolebrand
    jcolebrand over 11 years
    It could be because Nullable was only introduced in the .NET 2.0 Framework, so before then it wasn't available?
  • Marcel
    Marcel over 11 years
    @jcolebrand Thanks for your initial edit, I would give it a 1+ if I could. I however compacted your edit a bit with a new edit.
  • Marcel
    Marcel over 11 years
    Thanks for mentioning the "first initialisation" thing.
  • Dan Burton
    Dan Burton over 11 years
    How would it be a problem when initializing a large array? Since, as you said, Strings are immutable, all elements of the array would simply be pointers to the same String.Empty. Am I mistaken?
  • Marcel
    Marcel over 11 years
    Thanks Dan Burton for pointing out that someone CAN set the initialized value to null on reference types later on. Thinking this through tells me that my original intent in the question leads to no use.
  • supercat
    supercat over 11 years
    @JonofAllTrades: Code that deals with numbers has to have an out-of-band means of distinguishing the default value zero from "undefined". As it is, nullable-handling code that works with strings and numbers has to use one method for nullable strings and another for nullable numbers. Even if a nullable class type string is more efficient than Nullable<string> would be, having to use the "more efficient" method is more burdensome than being able to use the same approach for all nullable database values.
  • supercat
    supercat over 11 years
    Under COM (the Component Object Model), which predated .net, a string type would either hold a pointer to the string's data, or null to represent the empty string. There are a number of ways .net could have implemented similar semantics, had they chosen to do so, especially given that String has a number of characteristics that make it a unique type anyway [e.g. it and the two array types are the only types whose allocation size isn't constant].
  • supercat
    supercat over 11 years
    @delnan: One could create a value type which behaved essentially like String, except for (1) the value-type-ish behavior of having a usable default value, and (2) an unfortunate extra layer of boxing indirection any time it was cast to Object. Given that the heap representation of string is unique, having special treatment to avoid extra boxing wouldn't have been much of a stretch (actually, being able to specify non-default boxing behaviors would be a good thing for other types as well).
  • supercat
    supercat over 11 years
    Of course, if one could specify that certain instance methods should be called directly without regard for whether they are invoked on null references (as happens with extension methods), the horribly ugly syntax String.IsNullOrEmpty(myString) could be replaced with myString.IsNullOrEmpty.
  • jcolebrand
    jcolebrand over 11 years
    @Marcel that's fine. I wanted to make sure it was seen to be an addition, and since you were the primary interested party, I'm glad you're the one that made the edit to clean it up a bit. :D I don't need the +1s, just for SE to be a better resource in the future :D
  • jcolebrand
    jcolebrand over 11 years
    @supercat is the value treated differently because there was no Nullable at the beginning or because strings are 90% of why we use computers in the modern age?
  • supercat
    supercat over 11 years
    The default value for any type is going to have all bits set to zero. The only way for the default value of string to be an empty string is to allow all-bits-zero as a representation of an empty string. There are a number of ways this could be accomplished, but I don't think any involve initializing references to String.Empty.
  • supercat
    supercat over 11 years
    @jcolebrand: I wouldn't say 90%. Graphics and audio processing, both of which are primarily numeric, account for a pretty hefty chunk. Most of the situations which would benefit from String being nullable would actually benefit more from being able to use the same logic to handle maybe-valid strings and maybe-valid numeric types, than from having string behave as a reference type which must be handled differently.
  • djv
    djv over 11 years
    Other answers discussed this point as well. I think people have concluded that it wouldn't make sense to treat the String class as a special case and provide something other than all-bits-zero as an initialization, even if it was something like String.Empty or "".
  • jcolebrand
    jcolebrand over 11 years
    You, sir, have never worked with a number of my previous employers ... ;-)
  • Alessandro Da Rugna
    Alessandro Da Rugna over 11 years
    @Thorarin @delnan: They're both aliases, but System.Int32 is a struct and thus has a default value, while System.String is a class, so a variable of that type holds a reference whose default value is null. They're visually presented in the same font/color. Without that knowledge, one can think they act the same way (= having a default value). My answer was written with a cognitive psychology idea (en.wikipedia.org/wiki/Cognitive_psychology) behind it :-)
  • supercat
    supercat over 11 years
    @DanV: Changing the initialization behavior of string storage locations would have required also changing the initialization behavior of all structs or classes which have fields of type string. That would represent a pretty big change in the design of .net, which presently expects to zero-initialize any type without even having to think about what it is, save only for its total size.
  • supercat
    supercat over 11 years
    There's no particular reason string would have to be a reference type. To be sure, the actual characters that make up the string certainly have to be stored on the heap, but given the amount of dedicated support that strings have in the CLR already, it would not be a stretch to have System.String be a value type with a single private field Value of type HeapString. That field would be a reference type, and would default to null, but a String struct whose Value field was null would behave as an empty string. The only disadvantage of this approach would be...
  • supercat
    supercat over 11 years
    ...that casting a String to Object would, in the absence of special-case code in the runtime, cause the creation of a boxed String instance on the heap, rather than simply copying a reference to the HeapString.
  • Henk Holterman
    Henk Holterman over 11 years
    @supercat - nobody is saying that string should/could be a value type.
  • Dave Markle
    Dave Markle over 11 years
    Not sure what "ugly" means here, but if it means "consistent with everything else in the language and not hard to understand", then I guess it's ugly.
  • supercat
    supercat over 11 years
    Nobody except me. Having string be a "special" value type (with a private reference-type field) would allow most handling to be essentially as efficient as it is now, except for an added null check on methods/properties like .Length etc. so that instances which hold a null reference would not attempt to dereference it but instead behave as appropriate for an empty string. Whether the Framework would be better or worse with string implemented that way, if one wanted default(string) to be an empty string...
  • supercat
    supercat over 11 years
    ...having string be a value-type wrapper on a reference-type field would be the approach that required the fewest changes to other parts of .net [indeed, if one were willing to accept having conversion from String to Object create an extra boxed item, one could simply have String be an ordinary struct with a field of type Char[] which it never exposed]. I think having a HeapString type would probably be better, but in some ways the value-type string holding a Char[] would be simpler.
  • Nathan Koop
    Nathan Koop over 11 years
    @DaveMarkle you may want to include IsNullOrWhitespace too msdn.microsoft.com/en-us/library/…
  • Henk Holterman
    Henk Holterman over 11 years
    You know, when 1 comment isn't enough, you probably shouldn't post as a comment. The lack of formatting increases the TL;DR factor.
  • Patrick Magee
    Patrick Magee about 11 years
    Sometimes it's nice to have clean looking Extension methods like this, not having to slap value ?? "" everywhere.
  • Jeppe Stig Nielsen
    Jeppe Stig Nielsen almost 11 years
    Of course you could go even further in this (bad?) direction with public static string ToUpperSafe(this string str) { return str == null ? null : str.ToUpper(); } and so on...
  • Thomas Koelle
    Thomas Koelle over 9 years
    I am fairly certain it was Anders Hejlsberg who said it in a Channel 9 interview. I know the difference between heap and stack, but the idea with C# is that the casual programmer doesn't need to.
  • sara
    sara about 8 years
    I really think checking for null everywhere is a source of immense code bloat. it's ugly, and it looks hacky and it's hard to stay consistent. I think (at least in C#-like languages) a good rule is "ban the null keyword in production code, use it like crazy in test code".
  • Stijn Van Antwerpen
    Stijn Van Antwerpen over 7 years
    To be correct, since c# 6.0, the version of the IDE has nothing to do with it since this is a language feature.
  • Jaja Harris
    Jaja Harris over 7 years
    Another option - public string Name { get; set; } = string.Empty;
  • Hunter Nelson
    Hunter Nelson about 6 years
    What is this called? myString?.ToUpper();
  • russelrillema
    russelrillema about 6 years
    It's called a Null-Conditional Operator. You can read about it here msdn.microsoft.com/en-us/magazine/dn802602.aspx