When using a LINQ Where clause on a Dictionary, how can I return a dictionary of the same type?

23,651

Solution 1

Seems to be a lot of fussing about looking things up in the List. If the list only contains a few elements, then no big deal. If the List contains thousands of elements, you're going to want O(1) lookups into it. HashSet can provide this.

Dictionary<string, string> getValidIds(
  Dictionary<string, string> SalesPersons,
  List<string> ids
)
{
  HashSet<string> idsFast = new HashSet<string>(ids);
  Dictionary<string, string> result = SalesPersons
    .Where(kvp => idsFast.Contains(kvp.Key))
    .ToDictionary(kvp => kvp.Key, kvp => kvp.Value)
  return result;
}

Solution 2

Pretty sure you could just call ToDictionary on the result of the Where call:

Dictionary<string, string> GetValidIds(Dictionary<string, string> salesPersons,
    IList<string> ids)
{
    return salesPersons
        .Where(p => ids.Contains(p.Key))
        .ToDictionary(p => p.Key, p => p.Value);
}

Solution 3

Interestingly, if you enumerate over the dictionary (length N) first and check the list (length M) for inclusion, then you get O(NM) performance.

You could build a HashSet<> of the ids, but that seems redundant since we already have a (pre-hashed) dictionary available.

I would instead iterate over the ids first; since dictionary lookup (by key) is O(1), this gives O(M) performance - this might, however, mean that you don't use LINQ (since TryGetValue won't love LINQ (and introducing a tuple is too much like hard work)...

    Dictionary<string, string> getValidIds(
            IDictionary<string, string> salesPersons,
            IEnumerable<string> ids) {
        var result = new Dictionary<string, string>();
        string value;
        foreach (string key in ids) {
            if (salesPersons.TryGetValue(key, out value)) {
                result.Add(key, value);
            }
        }
        return result;
    }

It doesn't overly concern me that this is more lines than the LINQ version; it removes an O(N) of complexity...


Edit; the following might work (I haven't tested it), but I think it is an abuse of LINQ, and certainly won't scale to PLINQ etc... use with extreme caution!! I also believe the foreach approach simply has fewer overheads, so will be quicker... anyway:

    Dictionary<string, string> getValidIds(
        IDictionary<string, string> salesPersons,
        IEnumerable<string> ids)
    {
        string value = null;
        return  (from key in ids
                where salesPersons.TryGetValue(key, out value) // HACK: v. dodgy
                select new { key, value })
                .ToDictionary(x=>x.key, x=>x.value);
    }
Share:
23,651
user1732401
Author by

user1732401

Using the data you didn't know you gave me to answer questions you didn't know you had

Updated on May 28, 2020

Comments

  • user1732401
    user1732401 almost 4 years

    I'm currently using the following code to achieve this, but is seems there should be something better... suggestions? It just seems to me there should be a way to skip the foreach...

    Dictionary<string,string> getValidIds(Dictionary<string,string> SalesPersons,List<string> ids)
    {
        Dictionary<string,string> o = new Dictionary<string,string>();
        var ie = SalesPersons.Where<KeyValuePair<string, string>>(t => ids.Contains(t.Key));
        foreach (var i in ie)
        {
            o.Add(i.Key, i.Value);
        }
        return o;
    }
    
  • Marc Gravell
    Marc Gravell almost 15 years
    @Matt... well, maybe - but there is a point where O(1)/O(Nlog(N)) *isn't unreasonable, but O(N^2) is a problem - whether that is 1k, 10k or 100k is up to the specific process, of course.