Why is it considered good practice to return an empty collection?

Solution 1

If you return an empty collection (but not necessarily Collections.emptyList()), you avoid surprising downstream consumers of this method with an unintentional NPE.

This is preferable to returning null because:

  • The consumer doesn't have to guard against it
  • The consumer can operate on the collection irrespective of how many elements are in it
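
As a minimal sketch of the two points above (the `Dog` class here is a hypothetical variant of the one in the question, and `totalNameLength` is an illustrative helper, not part of any library), a consumer can loop over the returned list without a null guard, whether it holds zero or many elements:

```java
import java.util.ArrayList;
import java.util.List;

class EmptyCollectionDemo {
    // Illustrative consumer: sums the lengths of all bone names.
    // It needs no null check and behaves the same for 0..n elements.
    static int totalNameLength(List<String> bones) {
        int total = 0;
        for (String bone : bones) {
            total += bone.length();
        }
        return total;
    }

    public static void main(String[] args) {
        Dog dog = new Dog();
        System.out.println(totalNameLength(dog.get())); // prints 0
        dog.get().add("femur");
        System.out.println(totalNameLength(dog.get())); // prints 5
    }
}

// Hypothetical Dog whose accessor never returns null (assumption for this sketch).
class Dog {
    private final List<String> bone = new ArrayList<>();

    List<String> get() {
        return bone;
    }
}
```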

I say not necessarily Collections.emptyList() since, as you point out, you're trading one runtime exception for another in that adding to this list will be unsupported and once again surprise the consumer.
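
To illustrate the trade-off just described, here is a small sketch (the `rejectsAdd` helper is made up for this example) showing that the list from `Collections.emptyList()` rejects `add`, while a fresh `ArrayList` accepts it:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ImmutableEmptyListDemo {
    // Illustrative helper: reports whether the list refuses to accept a new element.
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("femur");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> immutableEmpty = Collections.emptyList();
        List<String> mutableEmpty = new ArrayList<>();

        System.out.println(rejectsAdd(immutableEmpty)); // prints true
        System.out.println(rejectsAdd(mutableEmpty));   // prints false
    }
}
```

Either list is safe to read or iterate; the difference only matters if callers are expected to modify what they get back.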

The ideal solution to this: eager initialization of the field.

private List<String> bone = new ArrayList<>();

The next-best solution: have the method return an Optional and handle the case where the value is absent. Instead of throwing, you could also supply an empty collection here if you so desired.

Dog dog = new Dog();
dog.get().orElseThrow(() -> new IllegalStateException("Dog has no bones??"));
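
For the Optional-based approach, a possible shape is sketched below. It assumes a hypothetical `Dog` whose `get()` returns `Optional<List<String>>`; the `setBone` and `bonesOrEmpty` names are invented for illustration. The fallback uses `orElseGet` so the empty list is only created when the value is absent:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class OptionalBonesDemo {
    // Hypothetical Dog variant: the field may be unset, and the accessor says so in its type.
    static class Dog {
        private List<String> bone;

        Optional<List<String>> get() {
            return Optional.ofNullable(bone);
        }

        void setBone(List<String> bone) {
            this.bone = bone;
        }
    }

    // Falls back to a fresh, mutable empty list instead of throwing.
    static List<String> bonesOrEmpty(Dog dog) {
        return dog.get().orElseGet(ArrayList::new);
    }

    public static void main(String[] args) {
        Dog dog = new Dog();
        System.out.println(bonesOrEmpty(dog).isEmpty()); // prints true

        List<String> bones = new ArrayList<>();
        bones.add("femur");
        dog.setBone(bones);
        System.out.println(bonesOrEmpty(dog)); // prints [femur]
    }
}
```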

Solution 2

Because the alternative to returning an empty collection is generally returning null, and then callers have to guard against a NullPointerException. If you return an empty collection, that class of error is mitigated. In Java 8+ there is also an Optional type, which can serve the same purpose without a Collection.

Solution 3

I don't think I understand why you object to empty collections, but I'll point out in the meantime that I think your code needs improvement. Maybe that's the issue?

Avoid unnecessary null checks in your own code:

public class Dog{

   private List<String> bone = new ArrayList<>();

   public List<String> get(){
       return bone;
   }
}

Or consider not creating a new list each time:

 public class Dog{

   private List<String> bone;

   public List<String> get(){
       if(bone == null){
        return Collections.emptyList();
       }
       return bone;
   }
}

Solution 4

The following answer could be of interest regarding your question: should-functions-return-null-or-an-empty-object.

Summarized:


Returning null is usually the best idea if you intend to indicate that no data is available.

An empty object implies data has been returned, whereas returning null clearly indicates that nothing has been returned.

Additionally, returning a null will result in a null exception if you attempt to access members in the object, which can be useful for highlighting buggy code - attempting to access a member of nothing makes no sense. Accessing members of an empty object will not fail meaning bugs can go undiscovered.


Also from Clean Code:


The problem with using null is that the person using the interface doesn't know if null is a possible outcome, and whether they have to check for it, because there's no not null reference type.


From Martin Fowler's Special Case pattern


Nulls are awkward things in object-oriented programs because they defeat polymorphism. Usually you can invoke foo freely on a variable reference of a given type without worrying about whether the item is the exact type or a sub-class. With a strongly typed language you can even have the compiler check that the call is correct. However, since a variable can contain null, you may run into a runtime error by invoking a message on null, which will get you a nice, friendly stack trace.

If it's possible for a variable to be null, you have to remember to surround it with null test code so you'll do the right thing if a null is present. Often the right thing is same in many contexts, so you end up writing similar code in lots of places - committing the sin of code duplication.

Nulls are a common example of such problems and others crop up regularly. In number systems you have to deal with infinity, which has special rules for things like addition that break the usual invariants of real numbers. One of my earliest experiences in business software was with a utility customer who wasn't fully known, referred to as "occupant." All of these imply altering the usual behavior of the type.

Instead of returning null, or some odd value, return a Special Case that has the same interface as what the caller expects.
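
A rough sketch of the Special Case idea follows, loosely modelled on the "occupant" customer mentioned in the quote. All of the type and method names (`Customer`, `UnknownCustomer`, `CustomerDirectory`, `find`) are assumptions made for this illustration and are not taken from Fowler's text:

```java
import java.util.HashMap;
import java.util.Map;

class SpecialCaseDemo {
    // The interface callers program against.
    interface Customer {
        String name();
        boolean isKnown();
    }

    static final class KnownCustomer implements Customer {
        private final String name;

        KnownCustomer(String name) {
            this.name = name;
        }

        public String name() { return name; }
        public boolean isKnown() { return true; }
    }

    // The special case: same interface, harmless default behavior.
    static final class UnknownCustomer implements Customer {
        public String name() { return "occupant"; }
        public boolean isKnown() { return false; }
    }

    static final class CustomerDirectory {
        private static final Customer UNKNOWN = new UnknownCustomer();
        private final Map<String, Customer> byId = new HashMap<>();

        void register(String id, String name) {
            byId.put(id, new KnownCustomer(name));
        }

        // Never returns null: a missing id yields the special-case object.
        Customer find(String id) {
            return byId.getOrDefault(id, UNKNOWN);
        }
    }

    public static void main(String[] args) {
        CustomerDirectory directory = new CustomerDirectory();
        directory.register("42", "Alice");

        System.out.println(directory.find("42").name());      // prints Alice
        System.out.println(directory.find("missing").name()); // prints occupant
    }
}
```

An empty collection plays the same role for collection-valued results: it has the interface the caller expects and simply contains nothing.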


And finally from Billion Dollar Mistake!


I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W).

My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement.

This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.

In recent years, a number of program analysers like PREfix and PREfast in Microsoft have been used to check references, and give warnings if there is a risk they may be non-null. More recent programming languages like Spec# have introduced declarations for non-null references. This is the solution, which I rejected in 1965.

Tony Hoare


I hope this provides enough reasons why it is considered better to return an empty collection or a special return value instead of null.

Author by

Admin

Updated on June 08, 2022

Comments

  • Admin
    Admin almost 2 years

    I have read several books and seen several blogs discussing how returning an empty collection is better than returning null. I completely understand trying to avoid the check, but I don't understand why returning an empty collection is better than returning null. For example:

    public class Dog{
    
       private List<String> bone;
    
       public List<String> get(){
           return bone;
       }
    
    }
    

    vs

     public class Dog{
    
       private List<String> bone;
    
       public List<String> get(){
           if(bone == null){
            return Collections.emptyList();
           }
           return bone;
       }
    
    }
    

    Example one will throw a NullPointerException and example two will throw an UnsupportedOperationException, but they are both very generic exceptions. What makes one better or worse than the other?

    Also a third option would be to do something like this:

     public class Dog{
    
       private List<String> bone;
    
       public List<String> get(){
           if(bone == null){
            return new ArrayList<String>();
           }
           return bone;
       }
    
    }
    

    but the problem with this is that you're adding unexpected behavior to your code which others may have to maintain.

    I'm really looking for a solution to this predicament. Many people on blogs tend to just say it is better without a detailed explanation as to why. If returning an immutable list is best practice I am OK doing it, but I would like to understand why it is better.

  • Makoto
    Makoto about 8 years
    Your first approach was better. Your second approach surprises one in that all of a sudden, the get method of a class instantiates a variable.
  • Admin
    Admin about 8 years
    Right, but my point is that if you try to add something to an immutable list you are still going to get an UnsupportedOperationException, so you have to guard against that, so I'm wondering where the benefit lies.
  • Elliott Frisch
    Elliott Frisch about 8 years
    Who says an empty collection must be an immutable list? If you are planning on adding items to it, don't return an immutable list.
  • Admin
    Admin about 8 years
    I agree that eager initialization would be better, but then some might argue that initializing a list before it's used is using up memory for no good reason (albeit a very small amount).
  • Makoto
    Makoto about 8 years
    @Adam: I'd gladly trade a little bit of memory here to avoid a stupid and completely irresponsible NPE elsewhere.
  • Admin
    Admin about 8 years
    Thank you for this, makes sense =)
  • Admin
    Admin about 8 years
    But what if a little bit becomes a lot? Say for example you are serializing/deserializing hundreds of thousands of objects (maybe millions) where the lists have been eagerly initialized. How would the strategy change? This is more hypothetical but I'm just curious.
  • Makoto
    Makoto about 8 years
    @Adam: At that time you should consider formally profiling your application. Chances are, if there are hundreds of thousands of objects floating around at once, there's more than one bottleneck (and, one that would prove more fruitful to fix than a simple eager initialization).
  • Makoto
    Makoto about 8 years
    This really isn't your answer, though. You've copied it verbatim from at least two other places.