Which is faster: HashSet.clear() or new HashSet?


Solution 1

clear() may be faster in some cases, but that depends on the size of the set. In practice, the difference is unlikely to matter for the performance of your application. Even in the code immediately around this call, performance will be dominated by other factors such as JIT compilation.

What matters more is design quality, which makes it easy to refactor for performance after you have profiled your code. In most cases, avoiding hard-to-track state changes is important, and creating a new HashSet is better design than reusing an existing one.
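To illustrate the "hard-to-track state changes" point, here is a minimal sketch. The Listener class and the method names are hypothetical and exist only for this example. It assumes some other object holds a reference to the same set, which is the case where clear() and reassignment behave differently.

```java
import java.util.HashSet;
import java.util.Set;

class SharedSetDemo {
    // Hypothetical holder that keeps a reference to whatever set it is given.
    static final class Listener {
        final Set<String> seen;

        Listener(Set<String> seen) {
            this.seen = seen;
        }
    }

    /** Size of the listener's set after the owner calls clear(). */
    static int sizeAfterClear() {
        Set<String> set = new HashSet<>();
        set.add("a");
        set.add("b");
        Listener listener = new Listener(set);
        set.clear();                 // mutates the one shared object
        return listener.seen.size(); // the listener's view is emptied too
    }

    /** Size of the listener's set after the owner assigns a new HashSet. */
    static int sizeAfterReplace() {
        Set<String> set = new HashSet<>();
        set.add("a");
        set.add("b");
        Listener listener = new Listener(set);
        set = new HashSet<>();       // only the owner's reference changes
        return listener.seen.size(); // the listener still holds the old contents
    }

    public static void main(String[] args) {
        System.out.println("after clear():     " + sizeAfterClear());
        System.out.println("after new HashSet: " + sizeAfterReplace());
    }
}
```

If nothing else ever holds a reference to the set, the two options are equivalent from the caller's point of view and the choice comes down to readability.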

Solution 2

I would suggest you do what you believe is clearest and simplest.

The problem with trying to reuse a HashSet is that most of the objects it uses internally are not recycled. If you want that, use Javolution's FastSet: http://javolution.org/target/site/apidocs/javolution/util/FastSet.html

HashSet is not the most efficient set collection, so if you really care about this level of micro-optimisation you are likely to find that a different collection suits your use case better. However, 99% of the time it is just fine and the most obvious choice for a hash set, and I suspect it doesn't matter how you use it.

Solution 3

HashSet.clear() calls map.clear(), which is:

/**
 * Removes all of the mappings from this map.
 * The map will be empty after this call returns.
 */
public void clear() {
    modCount++;
    Entry[] tab = table;
    for (int i = 0; i < tab.length; i++)
        tab[i] = null;
    size = 0;
}

So the cost certainly depends on the size of the Set. But the real answer is to benchmark it in your application's environment.

Note: this answer refers to the OpenJDK implementation.
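As a starting point for such a benchmark, here is a rough sketch using System.nanoTime(). The class name, method names, element counts and round counts are all arbitrary choices for illustration. Hand-rolled timing like this is easily distorted by JIT warm-up and garbage collection, so treat the numbers as indicative only; a harness such as JMH is the more reliable tool for a real measurement.

```java
import java.util.HashSet;
import java.util.Set;

class ClearVsNewTiming {
    /** Fills the set with n distinct strings. */
    static void fill(Set<String> set, int n) {
        for (int i = 0; i < n; i++) {
            set.add("item-" + i);
        }
    }

    /** Times repeated fill + clear() on one reused set; returns nanoseconds. */
    static long timeReuse(int n, int rounds) {
        Set<String> set = new HashSet<>();
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            fill(set, n);
            set.clear();
        }
        return System.nanoTime() - start;
    }

    /** Times repeated fill into a freshly allocated set; returns nanoseconds. */
    static long timeFresh(int n, int rounds) {
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            Set<String> set = new HashSet<>();
            fill(set, n);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 100_000;   // elements per round (arbitrary)
        int rounds = 50;   // measured rounds (arbitrary)

        // Warm-up runs so the JIT has compiled the hot paths before measuring.
        timeReuse(n, 10);
        timeFresh(n, 10);

        System.out.println("reuse + clear(): " + timeReuse(n, rounds) / 1_000_000 + " ms");
        System.out.println("new HashSet:     " + timeFresh(n, rounds) / 1_000_000 + " ms");
    }
}
```

Run it with set sizes and fill patterns that resemble your real workload, since the result can flip depending on how full the set is when it is cleared.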

Solution 4

A HashSet is backed by a HashMap, and the call to clear() reroutes to HashMap.clear(). That is (at least in Java 1.6) implemented as follows:

/**
 * Removes all of the mappings from this map.
 * The map will be empty after this call returns.
 */
public void clear() {
    modCount++;
    Entry[] tab = table;
    for (int i = 0; i < tab.length; i++)
        tab[i] = null;
    size = 0;
}

So I would suppose that a new HashSet will be a lot faster if your old HashSet was very large.

Solution 5

HashSet#clear will iterate over every bucket in the table of the backing HashMap and set it to null.

I would imagine that set = new HashSet<String>() will be faster as the size n of the old set grows.
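To make the cost of that loop concrete, here is a toy model. ToyHashSet is a hypothetical, simplified linear-probing set written for this example, not the JDK class. Its clear() has the same shape as the HashMap.clear() loop quoted in the other answers and additionally returns how many slots it visited. The point it is meant to show: the work is proportional to the length of the bucket array, which stays large once the set has grown, even if few elements are present when clear() is called.

```java
class ToyHashSet {
    private Object[] table = new Object[16]; // bucket array
    private int size;

    /** Adds a key, doubling the table first if the load would exceed 0.75. */
    boolean add(Object key) {
        if ((size + 1) * 4 > table.length * 3) {
            grow();
        }
        if (insert(table, key)) {
            size++;
            return true;
        }
        return false;
    }

    /** Linear-probing insert; returns false if the key is already present. */
    private static boolean insert(Object[] tab, Object key) {
        int i = (key.hashCode() & 0x7fffffff) % tab.length;
        while (tab[i] != null) {
            if (tab[i].equals(key)) {
                return false;
            }
            i = (i + 1) % tab.length;
        }
        tab[i] = key;
        return true;
    }

    /** Doubles the bucket array and re-inserts the existing keys. */
    private void grow() {
        Object[] old = table;
        table = new Object[old.length * 2];
        for (Object k : old) {
            if (k != null) {
                insert(table, k);
            }
        }
    }

    /** Nulls every slot, like the quoted clear(); returns slots visited. */
    int clear() {
        int visited = 0;
        for (int i = 0; i < table.length; i++) {
            table[i] = null;
            visited++;
        }
        size = 0;
        return visited;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        ToyHashSet big = new ToyHashSet();
        for (int i = 0; i < 1000; i++) {
            big.add(i);
        }
        System.out.println("slots visited, 1000 elements: " + big.clear());
        big.add(1); // table keeps its grown length
        System.out.println("slots visited, 1 element:     " + big.clear());
        System.out.println("slots visited, fresh set:     " + new ToyHashSet().clear());
    }
}
```

In this model a set that once held 1000 elements still pays for the full grown table on every later clear(), while a fresh set starts again from the small default table. Whether that outweighs the allocation and regrowth cost of a new set is what a benchmark on your own workload would tell you.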

Author by

oshai

Kotlin & Java developer, new technologies enthusiast. likes beautiful code when I see it.

Updated on June 04, 2022

Comments

  • oshai almost 2 years

    Assuming I have defined a HashSet<String> set,
    what is better in terms of performance:

    set.clear();

    or

    set = new HashSet<String>();

    EDIT: the reason I am checking this is that currently my code has the second option, but I want to change it to the first one to make the Set final.