Should I be concerned about .NET dictionary speed?

14,432

Solution 1

  1. You are micro-optimizing. Do you even have working code yet? Remember, "If it doesn't work, it doesn't matter how fast it doesn't work." (Mich Ravera) http://www.codingninja.co.uk/best-programmers-quotes/.

    You have no idea where the bottlenecks will be, and already you're focused on Dictionary. What if the problem is somewhere else?

  2. How do you know how the Dictionary class is implemented? Maybe it already uses an array with hashed keys!

P.S. It's really ".NET Dictionaries", not "C# Dictionaries", because C# is just one of several programming languages that use the framework.

Solution 2

Hello, I will be creating a project that will use dictionary lookups and inserts quite a bit. Is this something to be concerned about?

Yes. It is always wise to consider performance factors up front.

The form that your concern should take is as follows: your concern should be encouraging you to write realistic, user-focused performance specifications. It should be encouraging you to start writing performance tests early, and running them often, so that you can see how every single change to the product affects performance. That way you will be informed immediately when a code change causes a user-affecting change in performance. And it should be encouraging you to run profiles often, so that you are reasoning about performance based on empirical measurements, rather than random guesses and hunches.

Also, if I do benchmarking and such and it is really bad, then what is the best way of replacing dictionary with something else?

The best way to do this is to build a reasonable abstraction layer. If you have a class (or interface) which represents the "insert" and "lookup" abstract data type, then you can replace its internals without changing any of the callers.

Note that adding a layer of abstraction itself has a performance cost. If your profiling shows that the abstraction layer is too expensive, if the extra couple nanoseconds per call is too much, then you might have to get rid of the abstraction layer. Again, this decision will be driven by real-world performance data.

Would using an array with "hashed" keys even be faster? That wouldn't help on insert time though will it?

Neither you nor anyone reading this can possibly know which one is faster until you write it both ways and then benchmark it both ways under real-world conditions. Doing it under "lab" conditions will skew your results; you'll need to understand how things work when the GC is under realistic memory pressure, and so on. You might as well ask us which horse will run faster in next year's Kentucky Derby. If we knew the answer just by looking at the racing form, we'd all be rich already. You can't possibly expect anyone to know which of two entirely hypothetical, unwritten pieces of code will be faster under unspecified conditions!

Solution 3

The Dictionary<TKey, TValue> class is actually implemented as a hash table which makes lookups very fast (close to O(1)). See the API documentation for more information. I doubt you could make a better implementation yourself.

Solution 4

Wait and see if the performance of your application is below expectations
If it is then use a profiler to determine if the Dictionary lookup is the source of the problem
If it is then do some tests with representative data to see if another choice of list would be quicker.

In short - no, in general you shouldn't worry about the performance of implementation details until after you have a problem.

Solution 5

I would do a benchmark of the Dictionary, HashTable (HashSet in .NET), and perhaps a home grown class, and see which works out best under your typical usage conditions.

Normally I would say it's fine (insert StackOverflow's favorite premature ejaculation quote here), but if this is a core peice of the application, Benchmark, Benchmark, Benchmark.

Share:
14,432

Related videos on Youtube

Earlz
Author by

Earlz

Hello there! My name's Jordan Earls, but most people online know me as "earlz". I'm the lead developer and a co-founder of the Qtum project which brings the Ethereum Virtual Machine (ie, the thing that makes Solidity contracts function) to a UTXO based blockchain similar to Bitcoin. I've been programming since I was 13 and am completely self-taught. Low-level code like assembly and pointer arithmetic is the fun stuff for me. I also make music when I have time even though it's usually awful. Most of my personal projects are open source and BSD licensed. The majority of them are at bitbucket with the rest of them being listed on github Also, you can follow me on the twitters @earlzdotnet

Updated on May 11, 2020

Comments

  • Earlz
    Earlz almost 4 years

    I will be creating a project that will use dictionary lookups and inserts quite a bit. Is this something to be concerned about?

    Also, if I do benchmarking and such and it is really bad, then what is the best way of replacing dictionary with something else? Would using an array with "hashed" keys even be faster? That wouldn't help on insert time though will it?

    Also, I don't think I'm micro-optimizing because this really will be a significant part of code on a production server, so if this takes an extra 100ms to complete, then we will be looking for new ways to handle this.

  • Dave Swersky
    Dave Swersky over 14 years
    +1 for calling out premature optimization... you can't optimize code that hasn't been written
  • Cory Charlton
    Cory Charlton over 14 years
    A Dictionary<TKey, TValue> should always outperform a HashTable. .NET HashTables were a bad idea even when Dictionary<TKey, TValue> didn't exist :)
  • Daniel Schaffer
    Daniel Schaffer over 14 years
    Another +1 for premature optimization. I agree so much I wrote a new comment in addition to upvoting the answer and clicking the up arrow on Dave's comment.
  • Gord
    Gord over 14 years
    I was going to add a comment to this effect but I knew someone else would get to it first.
  • andriy
    andriy over 14 years
    "If it doesn't work, it doesn't matter how fast it doesn't work." stackoverflow.com/questions/58640/great-programming-quotes/…
  • Neil N
    Neil N over 14 years
    Should? And I meant HashSet<T> I never liked the old HashTables in .Net
  • Pavel Minaev
    Pavel Minaev over 14 years
    Guess what type KeyedCollection is based on to provide amortized O(1) retrieval by key?..
  • JulianR
    JulianR over 14 years
    Not to be Mr Fuzzy Pants here, but O(1) doesn't mean it's fast, it just means that it's constant time. But that constant time might be very long or very short. That said, O(1) tends to be fast in practice.
  • Joel Mueller
    Joel Mueller over 14 years
    Non-generic equals non-starter, for me. I do wonder why they never made a generic version of this class, though.
  • Paul
    Paul over 14 years
    I agree with, and have used in practice, this approach of using an interface/abstraction up front and then removing it as performance testing indicates. And I'd add not to remove that interface too soon! It can cost you design flexibility down the road which can lead to a lot of refactoring....I've been there too.
  • Josh Davis
    Josh Davis over 14 years
    It's not "premature optimization" if it's something that will influence the design of the application. You don't want to write an application then halfway through realize that your design is wrong and that you have to rewrite it. People are so quick to regurgitate out-of-context quotes...
  • John Saunders
    John Saunders over 14 years
    No, it's premature optimization because the code doesn't work yet! Make it work, make the automated tests work, then if you find you need to change the design because the prformance is inadequate, then you at least have all the automated tests written to prove your faster code works faster. You also don't know ahead of time that this will affect the design.
  • Admin
    Admin over 14 years
    There is video, youtube.com/watch?v=aAb7hSCtvGw, I saw once with Joshua Bloch talking about how in some it does pay to think about performance before it becomes a problem (I think its about halfway though). From the, Great programming quotes CW - "Weeks of coding can save you hours of planning."
  • John Saunders
    John Saunders about 14 years
    @Morbo: Eric Lippert said, "The best way to do this is to build a reasonable abstraction layer." in his answer.
  • LarsTech
    LarsTech over 12 years
    I'll take a guess and say "Super Saver" in the 2010 Kentucky Derby.
  • ErikE
    ErikE about 12 years
    This downvoter didn't like the attitude displayed in #1. (I am tempted, in the style of the movie The Princess Bride, to call you 'pig').
  • John Saunders
    John Saunders about 12 years
    @ErikE: thanks for saying. I don't feel there's an "attitude". The two things are different: C# is not .NET. The exact same dictionaries can be used by any other .NET language. I think it's a good idea to call things what they are.
  • ErikE
    ErikE about 12 years
    I actually agree with you--it's a good idea to call things what they are. I always prefer to be exact! On the other hand, the last 3 sentences are irrelevant to the downvote. Do you mind if, without changing the factual information, I fix the part I find objectionable?
  • John Saunders
    John Saunders about 12 years
    Go ahead. I'd like to see that. I can always change it back.
  • nawfal
    nawfal about 11 years
    This is indeed premature optimization. Performance and design should always be first planned, but if O(1) lookup with a key is what OP wants, then Dictionary<,> is what .NET has to offer and worrying about it before having real world data is too much thinking. I mean in .NET, HahSet<>, List<>, LinkedList<>, Dictionary<,>, KeyedCollection<,> etc do different jobs. If you're planning beforehand the type of access/addition your structure would need, that's what is called designing. But if you find Dictionary<> is the right tool, then worrying abt its performance at this stage is worthless. +1
  • John Thoits
    John Thoits almost 5 years
    You didn't answer the question. For some reason I can comment on an upvote but not a downvote, and I suppose I'm going to get nicked for even being bothered by that.