How to select a random key from a HashMap in Java?

44,278

Solution 1

I managed to find a solution without performance loss. I will post it here since it may help other people -- and potentially answer several open questions on this topic (I'll search for these later).

What you need is a second custom Set-like data structure to store the keys -- not a list as some suggested here. Lists-like data structures are to expensive to remove items from. The operations needed are adding/removing elements in constant time (to keep it up-to-date with the HashMap) and a procedure to select the random element. The following class MySet does exactly this

class MySet<A> {
     ArrayList<A> contents = new ArrayList();
     HashMap<A,Integer> indices = new HashMap<A,Integer>();
     Random R = new Random();

     //selects random element in constant time
     A randomKey() {
         return contents.get(R.nextInt(contents.size()));
     }

     //adds new element in constant time
     void add(A a) {
         indices.put(a,contents.size());
         contents.add(a);
     }

     //removes element in constant time
     void remove(A a) {
        int index = indices.get(a);
        contents.set(index,contents.get(contents.size()-1));
        indices.put(contents.get(index),index);
        contents.remove((int)(contents.size()-1));
        indices.remove(a);
     }
}

Solution 2

from top of my head

List<A> keysAsArray = new ArrayList<A>(map.keySet())
Random r = new Random()

then just

map.get(keysAsArray.get(r.nextInt(keysAsArray.size()))

Solution 3

You need access to the underlying entry table.

// defined staticly
Field table = HashMap.class.getDeclaredField("table");
table.setAccessible(true);
Random rand = new Random();

public Entry randomEntry(HashMap map) {
    Entry[] entries = (Entry[]) table.get(map);
    int start = rand.nextInt(entries.length);
    for(int i=0;i<entries.length;i++) {
       int idx = (start + i) % entries.length;
       Entry entry = entries[idx];
       if (entry != null) return entry;
    }
    return null;
}

This still has to traverse the entries to find one which is there so the worst case is O(n) but the typical behaviour is O(1).

Solution 4

Sounds like you should consider either an ancillary List of keys or a real object, not a Map, to store in your list.

Solution 5

As @Alberto Di Gioacchino pointed out, there is a bug in the accepted solution with the removal operation. This is how I fixed it.

class MySet<A> {
     ArrayList<A> contents = new ArrayList();
     HashMap<A,Integer> indices = new HashMap<A,Integer>();
     Random R = new Random();

     //selects random element in constant time
     A randomKey() {
         return contents.get(R.nextInt(contents.size()));
     }

     //adds new element in constant time
     void add(A item) {
         indices.put(item,contents.size());
         contents.add(item);
     }

     //removes element in constant time
     void remove(A item) {
        int index = indices.get(item);
        contents.set(index,contents.get(contents.size()-1));
        indices.put(contents.get(index),index);
        contents.remove(contents.size()-1);
        indices.remove(item);
     }
}
Share:
44,278
user1111929
Author by

user1111929

Updated on May 03, 2021

Comments

  • user1111929
    user1111929 about 3 years

    I'm working with a large ArrayList<HashMap<A,B>>, and I would repeatedly need to select a random key from a random HashMap (and do some stuff with it). Selecting the random HashMap is trivial, but how should I select a random key from within this HashMap?

    Speed is important (as I need to do this 10000 times and the hashmaps are large), so just selecting a random number k in [0,9999] and then doing .next() on the iterator k times, is really not an option. Similarly, converting the HashMap to an array or ArrayList on every random pick is really not an option. Please, read this before replying.

    Technically I feel that this should be possible, since the HashMap stores its keys in an Entry[] internally, and selecting at random from an array is easy, but I can't figure out how to access this Entry[]. So any ideas to access the internal Entry[] are more than welcome. Other solutions (as long as they don't consume linear time in the hashmap size) are also welcome of course.

    Note: heuristics are fine, so if there's a method that excludes 1% of the elements (e.g. because of multi-filled buckets) that's no problem at all.