Performance and Memory allocation comparison between List and Set

64,779

Solution 1

If you don't care about the ordering, and don't delete elements, then it really boils down to whether you need to find elements in this data structure, and how fast you need those lookups to be.

Finding an element by value in a HashSet is O(1). In an ArrayList, it's O(n).

If you are only using the container to store a bunch of unique objects, and iterate over them at the end (in any order), then arguably ArrayList is a better choice since it's simpler and more economical.

Solution 2

HashSet consumes about 5.5 times more memory than ArrayList for the same number of elements (although they're both still linear), and has significantly slower iteration (albeit with the same asymptotics); a quick Google search suggests a 2-3x slowdown for HashSet iteration versus ArrayList.

If you don't care about uniqueness or the performance of contains, then use ArrayList.

Solution 3

If you plan only to add elements and later iterate over them, your best bet is ArrayList as it's closest to the arrays you are replacing. It's more memory efficient than LinkedList or any Set implementation, has fast insertion, iteration, and random access.

Solution 4

If you will compare, searching between List and Set, Set will be better because of the underline Hashing algorithm.

In the case of a list, in worst case scenario, contains will search till the end. In case of Set, because of hashing and bucket, it will search only subset.

Sample use case: Add 1 to 100_000 integer to ArrayList and HashSet. Search each integer in ArrayList and HashSet.

Set will take 9 milliseconds where as List will take 16232 seconds.

private static void compareSetvsList(){
    List<Integer> list = new ArrayList<>() ;
    Set<Integer> set = new HashSet<>() ;

    System.out.println("Setting values in list and set .... ");
    int counter = 100_000  ;

    for(int i =0 ; i< counter ; i++){            
        list.add(i);
        set.add(i);
    }

    System.out.println("Checking time .... ");
    long l1 = System.currentTimeMillis();
    for(int i =0 ; i< counter ; i++) list.contains(i);

    long l2 = System.currentTimeMillis();
    System.out.println(" time taken for list : "+ (l2-l1));

    for(int i =0 ; i< counter ; i++)set.contains(i);

    long l3 = System.currentTimeMillis();
    System.out.println(" time taken for set : "+ (l3-l2));

    //      for 10000   time taken for list : 123        time taken for set : 4
    //      for 100000  time taken for list : 16232          time taken for set : 9
    //      for 1000000 time taken for list : hung       time taken for set : 26

}

Solution 5

Use HashSet if you need to use .contains(T) frequently.

Example:

private static final HashSet<String> KEYWORDS = Stream.of(new String[]{"if", "do", "for", "try", "while", "break", "return"}).collect(Collectors.toCollection(HashSet::new));

public boolean isKeyword(String str) {
     return KEYWORDS.contains(str);
}
Share:
64,779
Priyank Doshi
Author by

Priyank Doshi

I am a software development engineer at Amazon. Language: java c/c++ html , css , jquery Frameworks: Spring core &amp; MVC Hibernate

Updated on July 05, 2022

Comments

  • Priyank Doshi
    Priyank Doshi almost 2 years

    I want to know the comparison between List and Set in terms of performance,memory allocation and usability.

    If i don't have any requirement of keeping the uniqueness in the list of objects, neither required the insertion order to be maintained, Can I use ArrayList and SortedSet/HashSet interchangeably? Will it be good to directly use Collections class instead of even list/set?

    P.S. I also don't have any need for list or set specific functions provided by java. I am using List/Set instead of Array only because they can dynamically grow without extra programming efforts.

  • Pau Kiat Wee
    Pau Kiat Wee almost 12 years
    Finding a element in ArrayList is O(n)? Is not find element in ArrayList is direct access via index?
  • Louis Wasserman
    Louis Wasserman almost 12 years
    I assume by "find element" he means contains.
  • NPE
    NPE almost 12 years
    @PauKiatWee: I mean "find by value". OP has already stated that he doesn't care about the ordering of elements, so accessing by index is clearly irrelevant.
  • Louis Wasserman
    Louis Wasserman almost 12 years
    Well, if you know the number of elements to expect in advance, and allocate the ArrayList accordingly, it's closer to 8 times. But it doesn't sound like we can make that assumption about the OP's situation.
  • Louis Wasserman
    Louis Wasserman almost 12 years
    Basically, the rule is that arrays are 4 bytes per element, and an ArrayList consists of little more than an array (with some room to grow) and a constant number of int fields.
  • Marko Topolnik
    Marko Topolnik almost 12 years
    Nowadays they're more likely to be 8 bytes/element.
  • Louis Wasserman
    Louis Wasserman almost 12 years
    Depends if you're using a 32-bit or a 64-bit VM. That said, HashSet gets hurt worse by 8-byte references than ArrayList does -- adding an extra 4 bytes per reference, based on the linked memory consumption chart, brings ArrayList up to ~12 bytes per element, and HashSet up to ~52 bytes per element.)
  • Jitendra
    Jitendra over 7 years
    Finding an element by value in a HashSet is O(1) How is it?