Removing duplicates from one list by comparing with another list
Solution 1
If the lists are unsorted, and are ArrayLists or other similar list implementations with an O(n) contains method, then you should create a HashSet with the items of listB in order to perform the removal. If the items are not put into a set then you will end up with O(n^2) performance.
Easiest way to do what you need is thus:
listA.removeAll(new HashSet(listB));
ArrayList.removeAll(Collection)
will not put the items into a set for you (at least in the JDK 1.6 and 1.7 versions I checked), which is why you need to create the HashSet yourself in the above.
The removeAll method will copy the items you wish to keep into the beginning of the list as it traverses it, avoiding array compacting with each removal, so using it against a passed in HashSet as shown is reasonably optimal and is O(n).
Solution 2
You could add both list elements to a Set
To remove elements on one list from another, try listA.removeAll(listB);
Pankaj Gadge
Updated on August 19, 2022Comments
-
Pankaj Gadge over 1 year
I have a two lists of objects and I would like to remove the instances from one list which is there in other list.
e.g. I have following two lists and suppose each letter represents the object.
List listA = {A, B, C , D, E, F, G, H , I , J}
List listB= {D, G, K, P, Z}
Now, clearly listB has D and G which are there on listA too so I want listA to be like this
listA = {A, B, C , E, F, H , I , J}
Can you guys please suggest what would be the solution for this with O(n) or less than O(n2).
I can iterate over both the lists and remove the duplicate instances by comparing but I want to have something more efficient.
-
Pankaj Gadge about 11 yearsBy adding both elements in a SET , I can remove the duplicate from the list. but I want to remove it from listA completely.
-
Dinesh about 5 yearsthis should be an accepted answer. Elegant and simple.
-
DraxDomax over 3 yearsinteresting! Why does not using HashSet result in O(n^2)? I tried searching "array compacting" but couldn't figure it out from the existing explanations...
-
Trevor Freeman over 3 years@DraxDomax O(n^2) because
removeAll
works by iterating through the list it is called on and invokingcontains
on the passed in collection for every element. If the passed in collection is an ArrayList then it has an O(n) contains method, which is why we end up with O(n^2). Now, this is not entirely accurate since the collection we are checking against may be of limited fixed size (e.g. 3 elements or some such) and not scale with N in the other list, so it really depends on your use case. More accurate would be to say it is O(n * j) where j is the size of the passed in collection. -
DraxDomax over 3 yearsI think if more people took care to think about the implications of writing a loop, software would be a lot more effective for use these days :) That being said, I'd probably not have a job because I am not so good (although I try!). Thanks for the clarification!!!