Storing arrays in Set and avoiding duplicates

32,720

Solution 1

You can't. arrays use the default identity-based Object.hashCode() implementation and there's no way you can override that. Don't use Arrays as keys in a HashMap / HashSet!

Use a Set of Lists instead.

Solution 2

The "better way" is to use collections. Use a List instead of a String[]:

Set<List<String>> boog = //...
boog.add(Arrays.asList("a", "b", "c"));
boog.add(Arrays.asList("a", "b", "c"));
boog.add(Arrays.asList("a", "b", "d"));

System.out.println(boog.size()); // 2

Edit

If you absolutely needed to use arrays as keys, you could build a transparent wrapper around each key and put that in the map. Some libraries help you with that. For example, here's how you could do a Set<String[]> using Trove:

Set<String[]> boog = new TCustomHashSet<String[]>(new ArrayHashingStrategy());

boog.add(new String[]{"a", "b", "c"});
boog.add(new String[]{"a", "b", "c"});
boog.add(new String[]{"a", "b", "d"});

System.out.println(boog.size()); // 2

//...
public class ArrayHashingStrategy extends HashingStrategy<Object[]> {

   public int computeHashCode(Object[] array) {
      return Arrays.hashCode(array);
   }

   public boolean equals(Object[] arr1, Object[] arr2) {
      return Arrays.equals(arr1, arr2);
   }
}        

Solution 3

hashCode() of arrays uses the default implementation, which doesn't take into account the elements, and you can't change that.

You can use a List instead, with a hashCode() calculated based on the hashcodes of its elements. ArrayList (as most implementations) uses such function.


Alternatively (but less preferable, unless you are forced somehow to use arrays), you can use a 'special' HashSet where instead of invoking key.hashCode() invoke Arrays.hashCode(array). To implement that extend HashMap and then use Collections.newSetFromMap(map)

Solution 4

Actually, you can. You can use TreeSet with provided Comparator. In your case it will be something like:

Set<String[]> boog = new TreeSet<>((o1, o2) -> {
    for (int i = 0; i < o1.length; i++){
        int cmp = o1[i].compareTo(o2[i]);
        if (cmp != 0) {
            return cmp;
        }
    }
    return o1.length - o2.length;
});

Under the hood it will looks like alphabetic sorted tree.

Solution 5

You can use TreeSet.

TreeSet uses Comparable.compareTo() or Comparator.compare() instead of hashCode() and equals() to compare elements. It can be specified in the constructor.

public static void main(String[] args) {
    Set<String[]> boog = new TreeSet<>(Arrays::compare);
    boog.add(new String[]{"a", "b", "c"});
    boog.add(new String[]{"a", "b", "c"});
    boog.add(new String[]{"a", "b", "d"});
    for (String[] e : boog)
        System.out.println(Arrays.toString(e));
}

output:

[a, b, c]
[a, b, d]

Arrays::compare is a Comparator that compares arrays in lexicographical order.

Share:
32,720
Booker
Author by

Booker

Updated on November 21, 2021

Comments

  • Booker
    Booker over 2 years
    HashSet<String[]> boog = new HashSet<String[]>();
    boog.add(new String[]{"a", "b", "c"});
    boog.add(new String[]{"a", "b", "c"});
    boog.add(new String[]{"a", "b", "d"});
    

    results in

    [a, b, c]
    [a, b, d]
    [a, b, c]
    

    where [a,b,c] is repeated, so the hash function is not working as expected. How would I go about overriding the Hash method for String arrays. Or for that matter, a generic array? Is there a better way to accomplish what I'm trying to do?

  • Sean Patrick Floyd
    Sean Patrick Floyd over 12 years
    The problem with the latter approach is that HashSet internally uses HashMap, so you will have to provide a replacement for that as well. And the HashMap is in a private field, so you'll have to do it with Reflection. Messy.
  • Bozho
    Bozho over 12 years
    true, then extend HashMap and use Collections.newSetFromMap(..)
  • Daniel Pryden
    Daniel Pryden over 12 years
    +1 for spelling out Set<List<String>> and for using Arrays.asList(), which is almost certainly what the OP wants.
  • Mark Peters
    Mark Peters over 12 years
    Just a note that Arrays.asList provides a perfectly usable wrapper for you already.
  • Jean Logeart
    Jean Logeart over 12 years
    Perfectly true. It should even be preferred. Yet I was just offering an alternative way, or more exactly I showed how he wanted it to work :)
  • jcsahnwaldt Reinstate Monica
    jcsahnwaldt Reinstate Monica almost 2 years
    The suggestion to extend HashMap doesn't work. HashMap has a method hash(key) which is used by many other methods. We can't override hash(key) (because it's package-private in older JDK versions and static in newer versions), and we also can't override some of the methods calling hash(key) (because they are package-private). HashMap isn't designed for extension.
  • jcsahnwaldt Reinstate Monica
    jcsahnwaldt Reinstate Monica almost 2 years