Removing duplicates from one list by comparing with another list

21,626

Solution 1

If the lists are unsorted, and are ArrayLists or other similar list implementations with an O(n) contains method, then you should create a HashSet with the items of listB in order to perform the removal. If the items are not put into a set then you will end up with O(n^2) performance.

Easiest way to do what you need is thus:

listA.removeAll(new HashSet(listB));

ArrayList.removeAll(Collection) will not put the items into a set for you (at least in the JDK 1.6 and 1.7 versions I checked), which is why you need to create the HashSet yourself in the above.

The removeAll method will copy the items you wish to keep into the beginning of the list as it traverses it, avoiding array compacting with each removal, so using it against a passed in HashSet as shown is reasonably optimal and is O(n).

Solution 2

You could add both list elements to a Set

To remove elements on one list from another, try listA.removeAll(listB);

Share:
21,626
Pankaj Gadge
Author by

Pankaj Gadge

Updated on August 19, 2022

Comments

  • Pankaj Gadge
    Pankaj Gadge over 1 year

    I have a two lists of objects and I would like to remove the instances from one list which is there in other list.

    e.g. I have following two lists and suppose each letter represents the object.

    List listA = {A, B, C , D, E, F, G, H , I , J}

    List listB= {D, G, K, P, Z}

    Now, clearly listB has D and G which are there on listA too so I want listA to be like this

    listA = {A, B, C , E, F, H , I , J}

    Can you guys please suggest what would be the solution for this with O(n) or less than O(n2).

    I can iterate over both the lists and remove the duplicate instances by comparing but I want to have something more efficient.

  • Pankaj Gadge
    Pankaj Gadge about 11 years
    By adding both elements in a SET , I can remove the duplicate from the list. but I want to remove it from listA completely.
  • Dinesh
    Dinesh about 5 years
    this should be an accepted answer. Elegant and simple.
  • DraxDomax
    DraxDomax over 3 years
    interesting! Why does not using HashSet result in O(n^2)? I tried searching "array compacting" but couldn't figure it out from the existing explanations...
  • Trevor Freeman
    Trevor Freeman over 3 years
    @DraxDomax O(n^2) because removeAll works by iterating through the list it is called on and invoking contains on the passed in collection for every element. If the passed in collection is an ArrayList then it has an O(n) contains method, which is why we end up with O(n^2). Now, this is not entirely accurate since the collection we are checking against may be of limited fixed size (e.g. 3 elements or some such) and not scale with N in the other list, so it really depends on your use case. More accurate would be to say it is O(n * j) where j is the size of the passed in collection.
  • DraxDomax
    DraxDomax over 3 years
    I think if more people took care to think about the implications of writing a loop, software would be a lot more effective for use these days :) That being said, I'd probably not have a job because I am not so good (although I try!). Thanks for the clarification!!!