IEnumerable vs List - What to Use? How do they work?

556,497

Solution 1

IEnumerable describes behavior, while List is an implementation of that behavior. When you use IEnumerable, you give LINQ a chance to defer work until later, possibly optimizing along the way. If you use ToList() you force the results to be reified right away.

Whenever I'm "stacking" LINQ expressions, I use IEnumerable, because by only specifying the behavior I give LINQ a chance to defer evaluation and possibly optimize the program. Remember how LINQ doesn't generate the SQL to query the database until you enumerate it? Consider this:

public IEnumerable<Animals> AllSpotted()
{
    return from a in Zoo.Animals
           where a.coat.HasSpots == true
           select a;
}

public IEnumerable<Animals> Feline(IEnumerable<Animals> sample)
{
    return from a in sample
           where a.race.Family == "Felidae"
           select a;
}

public IEnumerable<Animals> Canine(IEnumerable<Animals> sample)
{
    return from a in sample
           where a.race.Family == "Canidae"
           select a;
}

Now you have a method that selects an initial sample ("AllSpotted"), plus some filters. So now you can do this:

var Leopards = Feline(AllSpotted());
var Hyenas = Canine(AllSpotted());

So is it faster to use List over IEnumerable? Only if you want to prevent a query from being executed more than once. But is it better overall? Well in the above, Leopards and Hyenas get converted into single SQL queries each, and the database only returns the rows that are relevant. But if we had returned a List from AllSpotted(), then it may run slower because the database could return far more data than is actually needed, and we waste cycles doing the filtering in the client.
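
Whether the filters end up in the generated SQL depends on the static type the query is written against, a point the comments on this answer also raise. The sketch below is only an illustration: Animal, FelineEnumerable and FelineQueryable are made-up names, and AsQueryable() over a list stands in for a real database provider. It suggests that a where clause on an IEnumerable<T> binds to LINQ to Objects, while the same clause on an IQueryable<T> extends an expression tree that a provider could translate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Animals entity used above.
public record Animal(string Name, string Family, bool HasSpots);

public static class FilterBinding
{
    // Typed as IEnumerable<T>: 'where' binds to Enumerable.Where (LINQ to Objects).
    public static IEnumerable<Animal> FelineEnumerable(IEnumerable<Animal> sample) =>
        from a in sample where a.Family == "Felidae" select a;

    // Typed as IQueryable<T>: 'where' binds to Queryable.Where and extends the expression tree.
    public static IQueryable<Animal> FelineQueryable(IQueryable<Animal> sample) =>
        from a in sample where a.Family == "Felidae" select a;

    public static void Main()
    {
        // AsQueryable() over a list is only an in-memory stand-in for a database table.
        IQueryable<Animal> zoo = new List<Animal>
        {
            new("Leopard", "Felidae", true),
            new("Dalmatian", "Canidae", true),
            new("Lion", "Felidae", false),
        }.AsQueryable();

        IQueryable<Animal> spotted = zoo.Where(a => a.HasSpots);

        var viaEnumerable = FelineEnumerable(spotted);
        var viaQueryable = FelineQueryable(spotted);

        Console.WriteLine(viaEnumerable is IQueryable<Animal>); // False
        Console.WriteLine(viaQueryable is IQueryable<Animal>);  // True
        Console.WriteLine(viaQueryable.Expression);             // contains both Where calls
    }
}
```

Both versions return the same animals here; the difference is where the filtering could run once a real provider is involved.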

In a program, it may be better to defer converting your query to a list until the very end, so if I'm going to enumerate through Leopards and Hyenas more than once, I'd do this:

List<Animals> Leopards = Feline(AllSpotted()).ToList();
List<Animals> Hyenas = Canine(AllSpotted()).ToList();

Solution 2

There is a very good article on Claudio Bernasconi's TechBlog: When to use IEnumerable, ICollection, IList and List

Here are some basic points about scenarios and functions:

[Two images from the article]

Solution 3

A class that implements IEnumerable allows you to use the foreach syntax.

Basically it has a method to get the next item in the collection. It doesn't need the whole collection to be in memory and doesn't know how many items are in it; foreach just keeps getting the next item until it runs out.

This can be very useful in certain circumstances. For instance, with a massive database table you don't want to copy the entire thing into memory before you start processing the rows.
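
As a rough sketch of that "get the next item" behavior: RowIds and SumFirst below are hypothetical names, and the endless sequence merely stands in for a table too large to load. SumFirst spells out approximately what foreach does with GetEnumerator, MoveNext and Current.

```csharp
using System;
using System.Collections.Generic;

public static class LazyRows
{
    // Hypothetical endless source: yields one value at a time and never builds a collection.
    public static IEnumerable<int> RowIds()
    {
        int id = 0;
        while (true)
        {
            yield return ++id;
        }
    }

    // Roughly what foreach does: keep asking for the next item until we decide to stop.
    public static int SumFirst(IEnumerable<int> source, int howMany)
    {
        int sum = 0, seen = 0;
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            while (seen < howMany && e.MoveNext())
            {
                sum += e.Current;
                seen++;
            }
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumFirst(RowIds(), 3)); // 1 + 2 + 3 = 6
    }
}
```

Because only three items are ever requested, the fact that the source never ends does not matter.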

Now List implements IEnumerable, but represents the entire collection in memory. If you have an IEnumerable and you call .ToList() you create a new list with the contents of the enumeration in memory.

Your LINQ expression returns an enumeration, and by default the expression executes only when you iterate through it with foreach. You can force it to iterate sooner using .ToList().

Here's what I mean:

var things = 
    from item in BigDatabaseCall()
    where ....
    select item;

// this will iterate through the entire linq statement:
int count = things.Count();

// this will stop after iterating the first one, but will execute the linq again
bool hasAnyRecs = things.Any();

// this will execute the linq statement *again*
foreach( var thing in things ) ...

// this will copy the results to a list in memory
var list = things.ToList();

// this won't iterate through again, the list knows how many items are in it
int count2 = list.Count();

// this won't execute the linq statement - we have it copied to the list
foreach( var thing in list ) ...
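
The re-execution described in those comments can be observed with a counter. This is a minimal sketch, assuming an in-memory iterator as a stand-in for BigDatabaseCall(); the Calls counter and the Demo method are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReExecution
{
    // How many times the source has been enumerated from the start.
    public static int Calls;

    // Hypothetical stand-in for BigDatabaseCall(): every enumeration starts the work again.
    public static IEnumerable<int> BigDatabaseCall()
    {
        Calls++;
        for (int i = 1; i <= 5; i++) yield return i;
    }

    // Returns the counter value before any enumeration, after Count() + Any(), and after ToList().
    public static int[] Demo()
    {
        Calls = 0;
        var things = BigDatabaseCall().Where(x => x % 2 == 1); // nothing has run yet
        int before = Calls;

        int count = things.Count();   // runs the query
        bool any = things.Any();      // runs it again, stopping at the first match
        int afterCountAndAny = Calls;

        var list = things.ToList();   // runs it once more and keeps the results
        int count2 = list.Count;      // property read, no enumeration
        foreach (var t in list) { }   // iterates the list, not the query
        int afterList = Calls;

        return new[] { before, afterCountAndAny, afterList };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Demo())); // 0, 2, 3
    }
}
```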

Solution 4

Nobody mentioned one crucial difference, ironically answered on a question closed as a duplicate of this one.

IEnumerable is read-only and List is not.

See Practical difference between List and IEnumerable
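
A small sketch of what "read-only" means here, using made-up names (ReadOnlyView, Demo): the IEnumerable<T> interface exposes no Add or Remove, but it is only a view, so changes made through the underlying List<T> still show up when the view is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReadOnlyView
{
    public static int Demo()
    {
        var list = new List<string> { "Leopard", "Hyena" };
        IEnumerable<string> view = list;   // same object, narrower interface

        // view.Add("Lynx");               // does not compile: IEnumerable<string> has no Add
        list.Add("Lynx");                  // the List<T> reference can still modify the data

        return view.Count();               // the view sees the new item
    }

    public static void Main()
    {
        Console.WriteLine(Demo()); // 3
    }
}
```

Note that this restricts the interface rather than the object: a caller that casts the view back to List<string> could still modify it.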

Solution 5

The most important thing to realize is that, using Linq, the query does not get evaluated immediately. It is only run as part of iterating through the resulting IEnumerable<T> in a foreach - that's what all the weird delegates are doing.

So, the first example evaluates the query immediately by calling ToList and putting the query results in a list.
The second example returns an IEnumerable<T> that contains all the information needed to run the query later on.

In terms of performance, the answer is: it depends. If you need the results to be evaluated at once (say, you're mutating the structures you're querying later on, or you don't want the iteration over the IEnumerable<T> to take a long time), use a list. Otherwise, use an IEnumerable<T>. The default should be to use the on-demand evaluation in the second example, as that generally uses less memory, unless there is a specific reason to store the results in a list.
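
To illustrate the mutation case, here is a minimal sketch with hypothetical names and in-memory data: the deferred query reflects changes made to the source after the query was defined, while the list produced by ToList() is a snapshot.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotVsDeferred
{
    public static (int Deferred, int Snapshot) Demo()
    {
        var prices = new List<int> { 100, 250, 400 };

        IEnumerable<int> expensive = prices.Where(p => p > 200); // deferred: nothing evaluated yet
        List<int> snapshot = expensive.ToList();                  // evaluated now: 250, 400

        prices.Add(900); // mutate the source after the query was defined

        // The deferred query is re-evaluated and sees 900; the snapshot does not.
        return (expensive.Count(), snapshot.Count);
    }

    public static void Main()
    {
        var (deferred, snapshot) = Demo();
        Console.WriteLine($"{deferred} {snapshot}"); // 3 2
    }
}
```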

Author: Axonn

Updated on October 13, 2021

Comments

  • Axonn
    Axonn over 2 years

    I have some doubts over how Enumerators work, and LINQ. Consider these two simple selects:

    List<Animal> sel = (from animal in Animals 
                        join race in Species
                        on animal.SpeciesKey equals race.SpeciesKey
                        select animal).Distinct().ToList();
    

    or

    IEnumerable<Animal> sel = (from animal in Animals 
                               join race in Species
                               on animal.SpeciesKey equals race.SpeciesKey
                               select animal).Distinct();
    

    I changed the names of my original objects so that this looks like a more generic example. The query itself is not that important. What I want to ask is this:

    foreach (Animal animal in sel) { /*do stuff*/ }
    
    1. I noticed that if I use IEnumerable, when I debug and inspect "sel", which in that case is the IEnumerable, it has some interesting members: "inner", "outer", "innerKeySelector" and "outerKeySelector", these last 2 appear to be delegates. The "inner" member does not have "Animal" instances in it, but rather "Species" instances, which was very strange for me. The "outer" member does contain "Animal" instances. I presume that the two delegates determine which goes in and what goes out of it?

    2. I noticed that if I use "Distinct", the "inner" contains 6 items (this is incorrect as only 2 are Distinct), but the "outer" does contain the correct values. Again, probably the delegated methods determine this but this is a bit more than I know about IEnumerable.

    3. Most importantly, which of the two options is the best performance-wise?

    The evil List conversion via .ToList()?

    Or maybe using the enumerator directly?

    If you can, please also explain a bit or throw some links that explain this use of IEnumerable.

  • Steven Sudit
    Steven Sudit over 13 years
    I'm not sure it's safe to say that generating a list means lower performance.
  • Axonn
    Axonn over 13 years
    @ Steven: indeed as thecoop and Chris said, sometimes it may be necessary to use a List. In my case, I have concluded it isn't. @ Daren: what do you mean by "this will create a new list for each element in memory"? Maybe you meant a "list entry"? ::- ).
  • Axonn
    Axonn over 13 years
    Hi and thanks for answering ::- ). This cleared up almost all my doubts. Any idea why the Enumerable is "split" into "inner" and "outer"? This happens when I inspect the element in debug/break mode via mouse. Is this perhaps Visual Studio's contribution? Enumerating on the spot and indicating input and output of the Enum?
  • Axonn
    Axonn over 13 years
    Hi and thanks for answering ::- ). You gave me a very good example of how a case when clearly the IEnumerable case is performance-advantaged. Any idea regarding that other part of my question? Why the Enumerable is "split" into "inner" and "outer"? This happens when I inspect the element in debug/break mode via mouse. Is this perhaps Visual Studio's contribution? Enumerating on the spot and indicating input and output of the Enum?
  • Ajibola
    Ajibola over 13 years
    I think they refer to the two sides of a join. If you do "SELECT * FROM Animals JOIN Species..." then the inner part of the join is Animals, and the outer part is Species.
  • darkAsPitch
    darkAsPitch over 13 years
    @Axonn yes, I meant list entry. fixed.
  • darkAsPitch
    darkAsPitch over 13 years
    @Steven If you plan to iterate over the elements in the IEnumerable, then creating a list first (and iterating over that) means you iterate over the elements twice. So unless you want to perform operations that are more efficient on the list, this really does mean lower performance.
  • thecoop
    thecoop over 13 years
    That's the Join doing its work - inner and outer are the two sides of the join. Generally, don't worry about what's actually in the IEnumerables, as it will be completely different from your actual code. Only worry about the actual output when you iterate over it :)
  • Steven Sudit
    Steven Sudit over 13 years
    Assuming we're just going to iterate over all of the results exactly once, there's no advantage to making a list unless (as you said) the operation benefits from random access. Generating the list always costs us something. My thought is that, if this is LINQ to SQL or if the processing is not trivial, then caching the results in a list allows us to pay once and then iterate over it as often as we like cheaply. As the overhead of list generation is fairly low, it's not hard to come up with cases where the benefits outweigh that cost. I hope this explains my thinking.
  • Steven Sudit
    Steven Sudit over 13 years
    Even worse, I think we're in agreement.
  • jerhewet
    jerhewet almost 13 years
    @Daren -- except when you've been bitten in the ass when the IEnumerable doesn't quite behave like you were expecting it to (and the reason I'm here on Stack Overflow researching it :-)). I was doing a for/each over an XPathSelectElements(), and without adding a .ToList() the subsequent calls to .Remove() failed to remove the selected XElements. Still not sure exactly why that should behave that way -- so off to do more reading!
  • darkAsPitch
    darkAsPitch almost 13 years
    @jerhewet: it is never a good idea to modify a sequence being iterated over. Bad things will happen. Abstractions will leak. Demons will break into our dimension and wreak havoc. So yes, .ToList() helps here ;)
  • PmanAce
    PmanAce about 11 years
    You could also use compiled queries in LINQ to further optimize your code if you wanted. :)
  • Bronek
    Bronek over 10 years
    When I've read the answers about: IEnumerable<T> vs IQueryable<T> I saw the analogical explanation, so that IEnumerable automatically forces the runtime to use LINQ to Objects to query the collection. So I'm confused between these 3 types. stackoverflow.com/questions/2876616/…
  • Lakshay Dulani
    Lakshay Dulani about 10 years
    I am sorry I am a bit confused here! You mean to say that var Leopards = Feline(AllSpotted()); will fetch data from the database in a single go with implementing both the filters ,instead of, getting first All Spotted and then getting Feline ones only??
  • Jim
    Jim almost 10 years
    @Lakshay correct apart from it will execute when .ToList() is added and called.
  • Nate
    Nate over 9 years
    @Bronek The answer you linked is true. IEnumerable<T> will be LINQ-To-Objects after the first part, meaning all spotted would have to be returned to run Feline. On the other hand, an IQueryable<T> will allow the query to be refined, pulling down only Spotted Felines.
  • Hans
    Hans over 9 years
    This answer is very misleading! @Nate's comment explains why. If you're using IEnumerable<T>, the filter is going to happen on the client side no matter what.
  • Jonathan Twite
    Jonathan Twite about 8 years
    It should be pointed out that this article is only for the public-facing parts of your code, not the internal workings. List is an implementation of IList and as such has extra functionality on top of that in IList (e.g. Sort, Find, InsertRange). If you force yourself to use IList over List, you lose these methods that you may require.
  • Pap
    Pap about 8 years
    But what happens if you execute a foreach on an IEnumerable without converting it to a List first? Does it bring the whole collection in memory? Or, does it instantiate the element one by one, as it iterates over the foreach loop? thanks
  • Keith
    Keith about 8 years
    @Pap the latter: it executes again, nothing is automatically cached in memory.
  • Dandré
    Dandré almost 8 years
    Don't forget IReadOnlyCollection<T>
  • Dandré
    Dandré almost 8 years
    Another alternative to IEnumerable<T> is IReadOnlyCollection<T>. The reason I mention this is that IEnumerable<T> can refer to an indefinitely sized sequence while the sequence size of IReadOnlyCollection<T> is definitive. Yes, IReadOnlyCollection<T> is an IEnumerable<T>, but at least you know that when you get an IReadOnlyCollection<T>, it is a collection of some sort that you can iterate over multiple times, while you may not be so sure when you get an IEnumerable<T> (it may have a massive performance impact when re-iterated).
  • nthpixel
    nthpixel over 7 years
    @Hans - So does that mean the AllSpotted query will be run against the DB twice? Or only once and running Canine() against it retrieved from memory?
  • Hans
    Hans over 7 years
    Yes AllSpotted() would be run twice. The bigger problem with this answer is the following statement: "Well in the above, Leopards and Hyenas get converted into single SQL queries each, and the database only returns the rows that are relevant." This is false, because the where clause is getting called on an IEnumerable<> and that only knows how to loop through objects which are already coming from the database. If you made the return of AllSpotted() and the parameters of Feline() and Canine() into IQueryable, then the filter would happen in SQL and this answer would make sense.
  • Bronek
    Bronek over 7 years
    That's because of linq methods (extension) which in this case come from IEnumerable where only create a query but not execute it (behind the scenes the expression trees are used). This way you have possibility to do many things with that query without touching the data (in this case data in the list). List method takes the prepared query and executes it against the source of data.
  • BeemerGuy
    BeemerGuy over 7 years
    Actually, I read all the answers, and yours was the one I up-voted, because it clearly states the difference between the two without specifically talking about LINQ/SQL. It is essential to know all this BEFORE you get to LINQ/SQL. Admire.
  • Neme
    Neme over 7 years
    That is an important difference to explain but your "expected result" isn't really expected. You're saying it like it's some sort of gotcha rather than design.
  • amd
    amd over 7 years
    @Neme, yes, it was my expectation before I understood how IEnumerable works, but it isn't any more now that I know how ;)
  • Jeb50
    Jeb50 over 5 years
    seems like the key diff is 1) whole thing in memory or not. 2) IEnumerable let me use foreach while List will go by say index. Now, if I'd like to know the count/length of thing beforehand, IEnumerable won't help, right?
  • Keith
    Keith over 5 years
    @Jeb50 Not exactly - both List and Array implement IEnumerable. You can think of IEnumerable as a lowest common denominator that works for both in memory collections and large ones that get one item at a time. When you call IEnumerable.Count() you might be calling a fast .Length property or going through the whole collection - the point is that with IEnumerable you don't know. That can be a problem, but if you're just going to foreach it then you don't care - your code will work with an Array or DataReader the same.
  • M Fuat
    M Fuat over 5 years
    getting the next item until it runs out does it mean to send a query to the database for each item?
  • Keith
    Keith over 5 years
    @MFouadKajj I don't know what stack you're using, but it's almost certainly not making a request with each row. The server runs the query and calculates the starting point of the result set, but doesn't get the whole thing. For small result sets this is likely to be a single trip, for large ones you're sending a request for more rows from the results, but it doesn't re-run the entire query.
  • Jason Masters
    Jason Masters about 5 years
    As a follow up, is that because of the Interface aspect or because of the List aspect? i.e. is IList also readonly?
  • CAD bloke
    CAD bloke about 5 years
    IList is not read-only - docs.microsoft.com/en-us/dotnet/api/… IEnumerable is read-only because it lacks any methods to add or remove anything once it is constructed, it is one of the base interfaces which IList extends (see link)
  • jbyrd
    jbyrd over 4 years
    It might be helpful to include a plain array [] here as well.
  • Deepak Mishra
    Deepak Mishra over 4 years
    Wrong and misleading answer. IEnumerable(read only) is useful with yield keyword for deferred execution. It's always better to cache the results to a list or dictionary for repeated execution or random access.
  • Daniel
    Daniel about 4 years
    While it may be frowned upon, thank you for sharing this graphic and article
  • Jazimov
    Jazimov almost 4 years
    For properly using the word 'reify' in a sentence.
  • Shaiju T
    Shaiju T over 3 years
    So using IEnumerable for millions of records will send millions of select SQL request to database ?
  • Keith
    Keith over 3 years
    @shaijut It shouldn't, but it might depend on the specific provider. In Microsoft SQL Server you get a client cursor that keeps the connection open and the client just requests the next record in the set. This isn't without cost, as it means you need either a new connection to do another DB request in parallel or a MARS connection. Too much for a comment really
  • lukkea
    lukkea about 3 years
    While this is an important concept to understand this doesn't actually answer the question.
  • LongChalk
    LongChalk about 2 years
    That's just a matter of usage, hiding a bigger underlying problem - IEnumerable is read only because it (potentially) keeps changing. Consider the houses I have to display, in ascending order of value price (say I have 10 of them). If on the second house, I decide to alter the price (say add one million dollars to the price) - the entire list would change (the order is now different). "one at a time" and "all of them right now" are two different things.