HashMap vs ArrayList performance am I correct

79,594

Solution 1

Generally, yes, you are correct. There's also a combined data structure, the LinkedHashMap, which offers fast access to arbitrary elements as well as predictable ordering.

However, it's worth noting that ArrayList and HashMap are only two implementations of the List and Map interfaces, respectively. There are other implementations of each that might be more suitable for more specific requirements. For example, a LinkedList might provide higher performance than an ArrayList for certain queueing/dequeueing requirements.

Solution 2

A Map is a map, or "associative array". It has a key->value layout. A List is on the other hand a list, which is an ordered collection of elements.

A more direct comparison would possibly be between Set and List: Both these hold values, where the list is explicitly ordered (you can get element # x), and the set is (typically) not ordered (well, unless it is an SortedSet, in which case iteration order will be ordered by a Comparator).

The two most common implementations for Set and List is HashSet and ArrayList. To check if an element belongs in an arraylist (contains(element)), the implementation iterate over all the elements of it, checking whether one have found the element using the equals() method. To check if an element belongs in a hashset, first the element's hashCode() is calculated, then one goes "directly" to the position where this element should reside, and checks if it is there.

Thus, a significant difference between ArrayList and HashSet is the speed of contains().

On a list, you can ask for element# x, in addition to what you can do on a set, which is add, remove, ask-whether-present (contains), and iterate over all elements.

On a map, you can ask for an element by its key, instead of by its index as you do with a list.

A HashSet is currently implemented simply by a HashMap where the value part of the key->value relationship is not used. This is completely absurd and has no use whatsoever other than wasting at least 4 bytes, on could argue 12, for every and all elements inserted in the HashSet.

Solution 3

I would say that you're generally correct, but not entirely accurate. You use a HashMap for data retrieval, but not always randomly. You use an ArrayList for iteration but you can also use it for lookups via the index.

More generally, you use a Map implementation when you need to efficiently retrieve items by lookup, i.e. retrieving something based on the key - such as dictionaries, caches, repositories, etc.

You use a List implementation when you just want a data structure where you can iterate over your data, usually when you want them in a predetermined and/or predictable order.

In other words, you use Maps as an indexing data structure, and you use Lists as you would usually use arrays.

Solution 4

Don't forget it's also much faster to get to one specific item with a map (if you have the key) than it is from an array (unless you have an index, but a key will always get you the right value whereas having an index may not work if new elements are inserted or older ones removed).

Solution 5

For me, it's more whether I care about the ordering of the items in the collection. If you do care about the order then use the ArrayList. If you don't care about the order (you just want to store a bunch of items) then you can use a HashMap.

Share:
79,594
Ankur
Author by

Ankur

A junior BA have some experience in the financial services industry. I do programming for my own personal projects hence the questions might sound trivial.

Updated on April 21, 2020

Comments

  • Ankur
    Ankur about 4 years

    I currently believe that:

    • When you need a structure from which you will be retrieving items randomly - use a HashMap
    • When you will be retrieving items in order (e.g. using a for loop) - use an ArrayList

    Am I generally correct? Are there situations where this is not correct?

  • brady
    brady over 14 years
    I doubt a linked list would give you better performance. Array access is O(1), just like a single link traversal, but I'd expect locality of reference to be better with an array, giving you a better results.
  • Stephen C
    Stephen C over 14 years
    LinkedList is not significantly better than ArrayList for iterative performance. (And it definitely uses more memory!). But it is significantly better for some kinds of list insertion and deletion operations.
  • brady
    brady over 14 years
    Yes, I agree about insert/delete being better for LinkedList, especially when used as a Deque. My comments were a response to, "for pure iterative performance..."
  • aberrant80
    aberrant80 over 14 years
    Yea, you're right. I confused it with the performance of insertions and deletions.
  • Dave
    Dave almost 12 years
    It's also worth noting that if these collections are to be persisted to a database via JPA (or an ORM like Hibernate), useful alternative implementations like LinkedHashMap cannot be readily used because of the limited fashion in which ORMs reconstitute collections.
  • Jeremy
    Jeremy about 10 years
    HashSets I've used once or twice to maintain a set of elements, the source for which were not guaranteed to be distinct & unique (i.e, there could be duplicate entries from the data stream.) Using a hashset, you are guaranteed to not have duplicates (granted equals and hashcode are implemented correctly.) Presumably this is more optimal than testing in List contains a given entry per every insertion operation.