Most efficient way to see if an ArrayList contains an object in Java

144,937

Solution 1

It depends on how efficient you need things to be. Simply iterating over the list looking for the element which satisfies a certain condition is O(n), but so is ArrayList.Contains if you could implement the Equals method. If you're not doing this in loops or inner loops this approach is probably just fine.

If you really need very efficient look-up speeds at all cost, you'll need to do two things:

  1. Work around the fact that the class is generated: Write an adapter class which can wrap the generated class and which implement equals() based on those two fields (assuming they are public). Don't forget to also implement hashCode() (*)
  2. Wrap each object with that adapter and put it in a HashSet. HashSet.contains() has constant access time, i.e. O(1) instead of O(n).

Of course, building this HashSet still has a O(n) cost. You are only going to gain anything if the cost of building the HashSet is negligible compared to the total cost of all the contains() checks that you need to do. Trying to build a list without duplicates is such a case.


* () Implementing hashCode() is best done by XOR'ing (^ operator) the hashCodes of the same fields you are using for the equals implementation (but multiply by 31 to reduce the chance of the XOR yielding 0)

Solution 2

You could use a Comparator with Java's built-in methods for sorting and binary search. Suppose you have a class like this, where a and b are the fields you want to use for sorting:

class Thing { String a, b, c, d; }

You would define your Comparator:

Comparator<Thing> comparator = new Comparator<Thing>() {
  public int compare(Thing o1, Thing o2) {
    if (o1.a.equals(o2.a)) {
      return o1.b.compareTo(o2.b);
    }
    return o1.a.compareTo(o2.a);
  }
};

Then sort your list:

Collections.sort(list, comparator);

And finally do the binary search:

int i = Collections.binarySearch(list, thingToFind, comparator);

Solution 3

Given your constraints, you're stuck with brute force search (or creating an index if the search will be repeated). Can you elaborate any on how the ArrayList is generated--perhaps there is some wiggle room there.

If all you're looking for is prettier code, consider using the Apache Commons Collections classes, in particular CollectionUtils.find(), for ready-made syntactic sugar:

ArrayList haystack = // ...
final Object needleField1 = // ...
final Object needleField2 = // ...

Object found = CollectionUtils.find(haystack, new Predicate() {
   public boolean evaluate(Object input) {
      return needleField1.equals(input.field1) && 
             needleField2.equals(input.field2);
   }
});

Solution 4

If the list is sorted, you can use a binary search. If not, then there is no better way.

If you're doing this a lot, it would almost certainly be worth your while to sort the list the first time. Since you can't modify the classes, you would have to use a Comparator to do the sorting and searching.

Solution 5

Even if the equals method were comparing those two fields, then logically, it would be just the same code as you doing it manually. OK, it might be "messy", but it's still the correct answer

Share:
144,937
Parrots
Author by

Parrots

Freelance designer and developer, focusing on web and iOS apps. Gamer, runner, pilot, and all around nerd.

Updated on July 05, 2022

Comments

  • Parrots
    Parrots almost 2 years

    I have an ArrayList of objects in Java. The objects have four fields, two of which I'd use to consider the object equal to another. I'm looking for the most efficient way, given those two fields, to see if the array contains that object.

    The wrench is that these classes are generated based on XSD objects, so I can't modify the classes themselves to overwrite the .equals.

    Is there any better way than just looping through and manually comparing the two fields for each object and then breaking when found? That just seems so messy, looking for a better way.

    Edit: the ArrayList comes from a SOAP response that is unmarshalled into objects.

  • oxbow_lakes
    oxbow_lakes about 15 years
    This isn't likely to be any quicker than a manual search as it doesn't sound as if his collection is sorted
  • Parrots
    Parrots about 15 years
    Tragically it's sorted by one of the two fields I don't care about. I could use a custom comparator to sort based on the one field that would help in the case of a binary search, but I have a feeling that wouldn't help much in terms of overall speed :|
  • palantus
    palantus about 15 years
    @Parrots: Is it possible to sort it once and then do all the searches? If so, and if you have a fair number of objects (say 50) in the list, a binary search is definitely going to be faster.
  • palantus
    palantus about 15 years
    On a totally unrelated subject, I wish the HTML sanitizer didn't use a greedy regex. That first link is supposed to be two different links, but the </a> and <a> in the middle got swallowed.
  • Lordn__n
    Lordn__n about 15 years
    Only if searched multiple times.
  • Vincent Floriot
    Vincent Floriot about 15 years
    Binary search will definitely be way faster than a normal linear search. This is assuming you get the entire list and only have to sort it once, otherwise you lose the speed advantage gained by using binary search. With 10,000 elements binary search = 14 comparisons, without = 10000 comparisons
  • OscarRyz
    OscarRyz about 15 years
    What if the underlaying List impleentation wasn't an arraylist but some sort of hashSet? That would be faster isn't?
  • Overflown
    Overflown about 15 years
    This is the path of least resistance. A HashSet takes time that is hard to analyze. This solution is equiv to the STL set
  • palantus
    palantus about 15 years
    @Bombe re. edit: That's what I tried first, but it didn't show up correctly in the preview. The second link still doesn't work, but at least it's not ugly anymore.
  • palantus
    palantus about 15 years
    @Oscar: I'm not sure I understand your comment. How can a list be a HashSet?
  • Jonas Kölker
    Jonas Kölker about 15 years
    "HashSet.contains() has constant access time, i.e. O(1)" -- could you please point to a proof? Doesn't it depend heavily on the hash function? If not, why not just say "Fast in practice"? Otherwise, I think you're spreading misinformation (probably with the best intentions, though :))
  • Wim Coenen
    Wim Coenen about 15 years
    @Jonas Kölker: From the documentation: "This class offers constant time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets."
  • Wim Coenen
    Wim Coenen about 15 years
    Why would a HashSet be harder to analyze? You know the asymptotic running time. You can profile it. What is less analyzable about it?
  • Ar5hv1r
    Ar5hv1r almost 14 years
    @Jonas, while a poor hashCode() implementation will lead to slow access times, any algorithms text (notably the CLR(S) text which many of the Collections data structures are built off - amazon.com/Introduction-Algorithms-Third-Thomas-Cormen/dp/…) will tell you that hash based data structures are O(1) for lookup. It is important to realize that O(1) does not denote one-step lookup, but lookup unrelated to the size of the data structure. Therefore even with poor hashCode()s, the lookup time is O(1). Wim isn't spreading any misinformation, in fact he's spot on.
  • Ar5hv1r
    Ar5hv1r almost 14 years
    Another good answer. I would be inclined to do this before constructing a wrapper class. Especially if you're looking at very large data sets, I suspect this might be more efficient (it certainly is space-wise).
  • Ed Staub
    Ed Staub almost 13 years
    Guava's Iterators.find() is very similar, but supports generics.
  • RobMcZag
    RobMcZag almost 7 years
    This way you are being slower than just using contains, O(N) with N/2 average, as the sort is O(N logN) and you still have the binary search O(log N). This approach is fine if the list is static with repeated searches, so you can sort it once and search many times.