How to retrieve actual item from HashSet<T>?

Solution 1

What you're asking for was added in .NET Core 2.0 and, more recently, in .NET Framework 4.7.2:

In .NET Framework 4.7.2 we have added a few APIs to the standard Collection types that will enable new functionality as follows.
- 'TryGetValue' is added to SortedSet and HashSet to match the Try pattern used in other collection types.

The signature is as follows (found in .NET 4.7.2 and above):

    //
    // Summary:
    //     Searches the set for a given value and returns the equal value it finds, if any.
    //
    // Parameters:
    //   equalValue:
    //     The value to search for.
    //
    //   actualValue:
    //     The value from the set that the search found, or the default value of T when
    //     the search yielded no match.
    //
    // Returns:
    //     A value indicating whether the search was successful.
    public bool TryGetValue(T equalValue, out T actualValue);
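
As a rough sketch of how this covers the question's use case (updating the stored instance in place), the example below looks up an item through a probe object that is equal but not the same reference. The `Customer` and `CustomerIdComparer` types and the `Rename` helper are hypothetical names invented for illustration; only `HashSet<T>.TryGetValue` comes from the framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity type for illustration; not part of the original answer.
public sealed class Customer
{
    public int Id { get; }
    public string Name { get; set; }
    public Customer(int id, string name) { Id = id; Name = name; }
}

// Hypothetical comparer: two customers are equal when their Ids match.
public sealed class CustomerIdComparer : IEqualityComparer<Customer>
{
    public bool Equals(Customer x, Customer y) =>
        ReferenceEquals(x, y) || (x != null && y != null && x.Id == y.Id);

    public int GetHashCode(Customer c) => c.Id.GetHashCode();
}

public static class TryGetValueDemo
{
    // Finds the stored instance by Id and renames it in place.
    // Returns false when no customer with that Id is in the set.
    public static bool Rename(HashSet<Customer> set, int id, string newName)
    {
        var probe = new Customer(id, null);   // equal by Id, but a different reference
        if (!set.TryGetValue(probe, out Customer stored))
            return false;
        stored.Name = newName;                // mutates the instance held by the set
        return true;
    }

    public static void Main()
    {
        var alice = new Customer(1, "Alice");
        var set = new HashSet<Customer>(new CustomerIdComparer()) { alice };

        Console.WriteLine(Rename(set, 1, "Alice Smith")); // True
        Console.WriteLine(alice.Name);                    // Alice Smith
        Console.WriteLine(Rename(set, 2, "Bob"));         // False
    }
}
```

Only `Name` is changed here, and the comparer hashes on `Id` alone, so the stored item's position in the set stays valid after the update.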

P.S.: In case you're interested, there is a related method planned for a future release: HashSet.GetOrAdd(T).

Solution 2

This is actually a huge omission in the set of collections. You would need either a Dictionary of keys only or a HashSet that allows for the retrieval of object references. So many people have asked for it that it is beyond me why it hasn't been fixed.

Without third-party libraries, the best workaround is to use Dictionary<T, T> with keys identical to values, since Dictionary stores its entries in a hash table. Performance-wise it is the same as HashSet, but it wastes memory, of course (the size of a pointer per entry).

Dictionary<T, T> myHashedCollection;
...
// A single TryGetValue lookup both tests for the item and returns the stored instance
if (myHashedCollection.TryGetValue(item, out T existing))
    item = existing; // replace duplicate with the stored instance
else
    myHashedCollection.Add(item, item); // add previously unknown item
...
// work with unique item
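
A self-contained version of this workaround might look like the sketch below. The `CanonicalSet<T>` wrapper and its member names are hypothetical and chosen only for illustration; the underlying calls (`Dictionary<TKey, TValue>.TryGetValue` and `Add`) are standard.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper name; a sketch of the Dictionary<T, T> workaround.
public sealed class CanonicalSet<T>
{
    // Each item is stored as both key and value, so a lookup can return the stored instance.
    private readonly Dictionary<T, T> _items;

    public CanonicalSet(IEqualityComparer<T> comparer = null)
    {
        _items = new Dictionary<T, T>(comparer ?? EqualityComparer<T>.Default);
    }

    public int Count => _items.Count;

    // O(1) on average: returns the stored instance that equals 'equalValue', if any.
    public bool TryGetValue(T equalValue, out T actualValue) =>
        _items.TryGetValue(equalValue, out actualValue);

    // Returns the stored instance equal to 'item'; stores 'item' first when none exists.
    public T GetOrAdd(T item)
    {
        if (_items.TryGetValue(item, out T existing))
            return existing;
        _items.Add(item, item);
        return item;
    }
}

public static class CanonicalSetDemo
{
    public static void Main()
    {
        var set = new CanonicalSet<string>(StringComparer.OrdinalIgnoreCase);
        string first = set.GetOrAdd("Hello");
        string second = set.GetOrAdd("HELLO");   // equal under the comparer, so not added

        Console.WriteLine(second);                        // Hello
        Console.WriteLine(ReferenceEquals(first, second)); // True
        Console.WriteLine(set.Count);                     // 1
    }
}
```

Wrapping the dictionary keeps the "key equals value" invariant in one place, so callers cannot accidentally store a value under a different key.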

Solution 3

This method has been added to .NET Framework 4.7.2 (and .NET Core 2.0 before it); see HashSet<T>.TryGetValue. Citing the source:

/// <summary>
/// Searches the set for a given value and returns the equal value it finds, if any.
/// </summary>
/// <param name="equalValue">The value to search for.
/// </param>
/// <param name="actualValue">
/// The value from the set that the search found, or the default value
/// of <typeparamref name="T"/> when the search yielded no match.</param>
/// <returns>A value indicating whether the search was successful.</returns>
/// <remarks>
/// This can be useful when you want to reuse a previously stored reference instead of 
/// a newly constructed one (so that more sharing of references can occur) or to look up
/// a value that has more complete data than the value you currently have, although their
/// comparer functions indicate they are equal.
/// </remarks>
public bool TryGetValue(T equalValue, out T actualValue)
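
The reference-sharing scenario mentioned in the remarks could be sketched roughly as follows. The `StringPool.Share` helper is a hypothetical name, not a framework API; it assumes a runtime where `HashSet<T>.TryGetValue` is available (.NET Framework 4.7.2 or .NET Core 2.0 and later).

```csharp
using System;
using System.Collections.Generic;

public static class StringPool
{
    // Hypothetical helper: returns the instance already in the pool when an equal
    // string is present; otherwise adds the candidate and returns it.
    public static string Share(HashSet<string> pool, string candidate)
    {
        if (pool.TryGetValue(candidate, out string shared))
            return shared;
        pool.Add(candidate);
        return candidate;
    }

    public static void Main()
    {
        var pool = new HashSet<string>();

        // Two distinct string objects with the same content.
        string a = new string('x', 3);
        string b = new string('x', 3);

        string first = Share(pool, a);
        string second = Share(pool, b);   // b is discarded in favor of the pooled instance

        Console.WriteLine(ReferenceEquals(a, b));          // False
        Console.WriteLine(ReferenceEquals(first, second)); // True
        Console.WriteLine(pool.Count);                     // 1
    }
}
```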

Solution 4

What about using a custom string equality comparer that remembers the last item it matched:

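using System;
using System.Collections.Generic;

class StringEqualityComparer : IEqualityComparer<string>
{
    // Last string that compared equal during a lookup.
    // This is shared mutable state, so the comparer is not thread-safe.
    public string val1;

    public bool Equals(string s1, string s2)
    {
        if (!string.Equals(s1, s2)) return false;
        val1 = s1; // relies on HashSet passing the stored item as the first argument
        return true;
    }

    public int GetHashCode(string s)
    {
        return s.GetHashCode();
    }
}

public static class HashSetExtension
{
    public static bool TryGetValue(this HashSet<string> hs, string value, out string valout)
    {
        // The trick only works when the set was created with a StringEqualityComparer.
        var comparer = hs.Comparer as StringEqualityComparer;
        if (comparer == null)
            throw new InvalidOperationException("The HashSet must use a StringEqualityComparer.");

        if (hs.Contains(value))
        {
            valout = comparer.val1;
            return true;
        }

        valout = null;
        return false;
    }
}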

And then declare the HashSet as:

HashSet<string> hs = new HashSet<string>(new StringEqualityComparer());

Solution 5

Another trick is to use reflection to call HashSet's internal InternalIndexOf method. Keep in mind that the member names are hard-coded, so this will break if they change in a future .NET version.

Note: If you use Mono, you should change the field name from m_slots to _slots.

using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

internal static class HashSetExtensions<T>
{
    public delegate bool GetValue(HashSet<T> source, T equalValue, out T actualValue);

    public static GetValue TryGetValue { get; }

    static HashSetExtensions() {
        var targetExp = Expression.Parameter(typeof(HashSet<T>), "target");
        var itemExp   = Expression.Parameter(typeof(T), "item");
        var actualValueExp = Expression.Parameter(typeof(T).MakeByRefType(), "actualValueExp");

        var indexVar = Expression.Variable(typeof(int), "index");
        // ReSharper disable once AssignNullToNotNullAttribute
        var indexExp = Expression.Call(targetExp, typeof(HashSet<T>).GetMethod("InternalIndexOf", BindingFlags.NonPublic | BindingFlags.Instance), itemExp);

        var truePart = Expression.Block(
            Expression.Assign(
                actualValueExp, Expression.Field(
                    Expression.ArrayAccess(
                        // ReSharper disable once AssignNullToNotNullAttribute
                        Expression.Field(targetExp, typeof(HashSet<T>).GetField("m_slots", BindingFlags.NonPublic | BindingFlags.Instance)), indexVar),
                    "value")),
            Expression.Constant(true));

        var falsePart = Expression.Constant(false);

        var block = Expression.Block(
            new[] { indexVar },
            Expression.Assign(indexVar, indexExp),
            Expression.Condition(
                Expression.GreaterThanOrEqual(indexVar, Expression.Constant(0)),
                truePart,
                falsePart));

        TryGetValue = Expression.Lambda<GetValue>(block, targetExp, itemExp, actualValueExp).Compile();
    }
}

public static class Extensions
{
    public static bool TryGetValue2<T>(this HashSet<T> source, T equalValue,  out T actualValue) {
        if (source.Count > 0) {
            if (HashSetExtensions<T>.TryGetValue(source, equalValue, out actualValue)) {
                return true;
            }
        }
        actualValue = default;
        return false;
    }
}

Test:

var x = new HashSet<int> { 1, 2, 3 };
if (x.TryGetValue2(1, out var value)) {
    Console.WriteLine(value);
}
Author: Francois C

Updated on April 15, 2020

Comments

  • Francois C
    Francois C about 4 years

    I've read this question about why it is not possible, but haven't found a solution to the problem.

    I would like to retrieve an item from a .NET HashSet<T>. I'm looking for a method that would have this signature:

    /// <summary>
    /// Determines if this set contains an item equal to <paramref name="item"/>, 
    /// according to the comparison mechanism that was used when the set was created. 
    /// The set is not changed. If the set does contain an item equal to 
    /// <paramref name="item"/>, then the item from the set is returned.
    /// </summary>
    bool TryGetItem<T>(T item, out T foundItem);
    

    Searching the set for an item with such a method would be O(1). The only way to retrieve an item from a HashSet<T> is to enumerate all items, which is O(n).

    I haven't found any workaround to this problem other than making my own HashSet<T> or using a Dictionary<K, V>. Any other ideas?

    Note:
    I don't want to check if the HashSet<T> contains the item. I want to get the reference to the item that is stored in the HashSet<T> because I need to update it (without replacing it by another instance). The item I would pass to the TryGetItem would be equal (according to the comparison mechanism that I've passed to the constructor) but it would not be the same reference.

  • Ed T
    Ed T over 9 years
    I would suggest that the keys to his dictionary should be whatever he has currently placed in his EqualityComparer for the hashset. I feel it is dirty to use an EqualityComparer when you aren't really saying the items are equal (otherwise you could just use the item you created for the purpose of the comparison). I'd make a class/struct that represents the key. This comes at the cost of more memory of course.
  • Access Denied
    Access Denied almost 8 years
    Since key is stored inside Value I suggest using collection inherited from KeyedCollection instead of Dictionary. msdn.microsoft.com/en-us/library/ms132438(v=vs.110).aspx
  • mp666
    mp666 almost 8 years
    This is all about memory management - returning the actual item that is in the hashset rather than an identical copy. So in the above code we find the string with the same content and then return a reference to it. For strings this is similar to what interning does.
  • Piotr Kula
    Piotr Kula almost 8 years
    This is an interesting approach, but you need to wrap the second part in a try, because searching for something that is not in the list will give you a NullReferenceException. But it's a step in the right direction?
  • Graeme Wicksted
    Graeme Wicksted almost 8 years
    @zumalifeguard @mp666 this is not guaranteed to work as-is. It would require whoever instantiates the HashSet to provide the specific comparer. An optimal solution would be for TryGetValue to pass in a new instance of the specialized StringEqualityComparer (otherwise the as StringEqualityComparer cast could result in null, causing the .val1 access to throw). In doing so, StringEqualityComparer can become a nested private class within HashSetExtension. Further, in case of an overridden equality comparer, the StringEqualityComparer should call into the default.
  • mp666
    mp666 over 7 years
    you need to declare your HashSet as: HashSet<string> valueCash = new HashSet<string>(new StringEqualityComparer())
  • Daniel A.A. Pelsmaeker
    Daniel A.A. Pelsmaeker over 7 years
    Since you're using the Linq extension method Enumerable.Contains, it will enumerate all elements of the set and compare them, losing any benefits the hash implementation of the set provides. Then you might as well just write set.SingleOrDefault(e => set.Comparer.Equals(e, obj)), which has the same behavior and performance characteristics as your solution.
  • Graeme Wicksted
    Graeme Wicksted over 7 years
    @Virtlink Good catch -- You're absolutely right. I'll modify my answer.
  • Daniel A.A. Pelsmaeker
    Daniel A.A. Pelsmaeker over 7 years
    However, if you were to wrap a HashSet that uses your comparator internally, it would work. Like this: Utillib/ExtHashSet
  • Graeme Wicksted
    Graeme Wicksted over 7 years
    @Virtlink thank you! I ended up wrapping HashSet as one option but providing the comparers and an extension method for additional versatility. It's now thread-safe and will not leak memory... but it's quite a bit more code than I had hoped!
  • Graeme Wicksted
    Graeme Wicksted over 7 years
    @Francois Writing the code above was more an exercise of figuring out an "optimal" time/memory solution; however, I don't suggest you go with this method. Using a Dictionary<T,T> with a custom IEqualityComparer is much more straight-forward and future-proof!
  • Niklas Ekman
    Niklas Ekman over 7 years
    LINQ traverses the collection in a foreach loop, i.e. O(n) lookup time. While it is a solution to the problem, it kind of defeats the purpose of using a HashSet in the first place.
  • M.kazem Akhgary
    M.kazem Akhgary almost 7 years
    Dirty hack. I know how it works, but it's a lazy, just-make-it-work kind of solution.
  • nawfal
    nawfal about 6 years
    And for SortedSet as well.
  • j.hull
    j.hull over 5 years
    I am not sure why you got downvotes, because when I applied this logic it worked. I needed to extract values from a structure that started with Dictionary<string,ISet<String>>, where the ISet contained x number of values. The most direct way to get those values was to loop through the dictionary, pulling the key and the ISet value. Then I looped through the ISet to display the individual values. It is not elegant, but it worked.
  • ekalchev
    ekalchev over 2 years
    Because someone would use HashSet to achieve O(1) complexity, and ToList() makes this method O(n).