How do I remove repeated elements from ArrayList?
Solution 1
If you don't want duplicates in a Collection
, you should consider why you're using a Collection
that allows duplicates. The easiest way to remove repeated elements is to add the contents to a Set
(which will not allow duplicates) and then add the Set
back to the ArrayList
:
Set<String> set = new HashSet<>(yourList);
yourList.clear();
yourList.addAll(set);
Of course, this destroys the ordering of the elements in the ArrayList
.
Solution 2
Although converting the ArrayList
to a HashSet
effectively removes duplicates, if you need to preserve insertion order, I'd rather suggest you to use this variant
// list is some List of Strings
Set<String> s = new LinkedHashSet<>(list);
Then, if you need to get back a List
reference, you can use again the conversion constructor.
Solution 3
In Java 8:
List<String> deduped = list.stream().distinct().collect(Collectors.toList());
Please note that the hashCode-equals contract for list members should be respected for the filtering to work properly.
Solution 4
Suppose we have a list of String
like:
List<String> strList = new ArrayList<>(5);
// insert up to five items to list.
Then we can remove duplicate elements in multiple ways.
Prior to Java 8
List<String> deDupStringList = new ArrayList<>(new HashSet<>(strList));
Note: If we want to maintain the insertion order then we need to use LinkedHashSet
in place of HashSet
Using Guava
List<String> deDupStringList2 = Lists.newArrayList(Sets.newHashSet(strList));
Using Java 8
List<String> deDupStringList3 = strList.stream().distinct().collect(Collectors.toList());
Note: In case we want to collect the result in a specific list implementation e.g. LinkedList
then we can modify the above example as:
List<String> deDupStringList3 = strList.stream().distinct()
.collect(Collectors.toCollection(LinkedList::new));
We can use parallelStream
also in the above code but it may not give expected performace benefits. Check this question for more.
Solution 5
If you don't want duplicates, use a Set instead of a List
. To convert a List
to a Set
you can use the following code:
// list is some List of Strings
Set<String> s = new HashSet<String>(list);
If really necessary you can use the same construction to convert a Set
back into a List
.
user25778
Updated on July 20, 2022Comments
-
user25778 almost 2 years
I have an
ArrayList<String>
, and I want to remove repeated strings from it. How can I do this? -
Shervin Asgari over 14 yearsWhy use ArrayList in parameter? Why not just List? Will that not work?
-
volley over 14 yearsA List will absolutely work as in-parameter for the first method listed. The method is however optimized for use with a random access list such as ArrayList, so if a LinkedList is passed instead you will get poor performance. For example, setting the n:th element in a LinkedList takes O(n) time, whereas setting the n:th element in a random access list (such as ArrayList) takes O(1) time. Again, though, this is probably overkill... If you need this kind of specialized code it will hopefully be in an isolated situation.
-
volley over 14 yearsSee also LinkedHashSet, if you wish to retain the order.
-
Matt Briançon about 13 yearsDoes LinkedHashSet make any guarantees as to which of several duplicates are kept from the list? For instance, if position 1, 3, and 5 are duplicates in the original list, can we assume that this process will remove 3 and 5? Or maybe remove 1 and 3? Thanks.
-
user2756501 about 13 years@Matt: yes, it does guarantee that. The docs say: "This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order). Note that insertion order is not affected if an element is re-inserted into the set."
-
Chetan about 12 yearsBut this will just create the set without duplicates , I want to know which number was duplicate in O(n) time
-
WowBow about 12 yearsVery interesting. I have a different situation here. I am not trying to sort String but another object called AwardYearSource. This class has an int attribute called year. So I want to remove duplicates based on the year. i.e if there is year 2010 mentioned more than once, I want to remove that AwardYearSource object. How can I do that?
-
volley about 12 yearsChetan, finding the items in O(n) is possible if the set of possible values is small (think Byte or Short); a BitSet or similar can then be used to store and look up already encountered values in O(1) time. But then again - with such a small value set, doing it in O(n log n) might not be a problem anyway since n is low. (This comment is not applicable to original poster, who needs to do this with String.)
-
Ondrej Bozek almost 12 years@Chetan finding all duplicates from ArrayList in O(n), its important to have correctly defined equals method on objects which you have in the list (no problem for numbers):
public Set<Object> findDuplicates(List<Object> list) { Set<Object> items = new HashSet<Object>(); Set<Object> duplicates = new HashSet<Object>(); for (Object item : list) { if (items.contains(item)) { duplicates.add(item); } else { items.add(item); } } return duplicates; }
-
Ondrej Bozek almost 12 years@WowBow For example you can define Wrapper object which holds AwardYearSource. And define this Wrapper objects equals method based on AwardYearSources year field. Then you can use Set with these Wrapper objects.
-
shrini1000 over 11 years@WowBow or implement Comparable/Comparator
-
Kevik almost 11 yearsthis is great, and it gets even better if you change HashSet to LinkedHashSet
-
Jonik over 10 yearsA good practice would be to define variables using the interface types
List
andSet
(instead of implementation typesArrayList
andHashSet
as in your example). -
maaartinus over 10 yearsIt's slow and you might get a ConcurrentModificationException.
-
CarlJohn over 10 years@maaartinus Have you tried that code ?. It won't produce any exceptions.Also it is pretty fast. I tried the code before posting.
-
maaartinus over 10 yearsYou're right, it doesn't as you iterate the array instead of the list. However, it's slow like hell. Try it with a few millions elements. Compare it to
ImmutableSet.copyOf(lst).toList()
. -
ashes999 over 10 yearsYou can clean this up by using
new HashSet(al)
instead of initializing it to empty and callingaddAll
. -
abarnert over 9 yearsWhy would you post a quadratic solution to a question that already has 2-year-old linear and log-linear solutions, that are also simpler?
-
jean d'arme over 8 yearscan I add rules for setting what's duplicate to me? For example: when my
Object
has several values if two of them repeat I consider them as duplicate (other values can be different) and useSet
? -
neo7 over 8 yearsThis implementation return no element in the list because of the last j--
-
Manash Ranjan Dakua over 8 yearsThis implementation work's very fine.there is no issue behind this and for this task i am only use one arraylist.so this answer is completely good.before giving negative feedback you shold also add testcase also so that every one can understand the result.Thanks Manash
-
Maytham Fahmi over 8 years@jonathan-stafford nice and useful peace of code. voteup
-
randers over 8 yearsThis answer lacks two things: 1) It does not use generics, but raw types (
ArrayList<T>
should be used instead ofArrayList
) 2) The explicit iterator creating can be avoided by using afor (T current : l1) { ... }
. Even if you wanted to use anIterator
explicitly,iterador
is misspelled. -
Aniket Paul about 8 yearsanswers the question I was asked in the interview .. How to remove repeated values from an ArrayList without using Sets. Thanx
-
Patrick M almost 8 yearsAnd this implementation runs in quadratic time, compared to the linked hash set implementation running in linear time. (i.e. this takes 10 times longer on a list with 10 elements, 10,000 times longer on a list with 10,000 elements. JDK 6 implementation for ArrayList.contains, JDK8 impl is the same.)
-
Patrick M almost 8 yearsInternally,
indexOf
iterates thelst
using a for loop. -
Neon Warge almost 8 yearsOne reason that I am force to use collection instead of set is due to HTTP API calls. The API returns a POJO of certain objects, I need to act data on it so I really need it not to be duplicated. For some reason, it wasn't part of the API, to return to me a list of non-duplicated data, I cannot wait until it is developed so I must find a way to present a list with no duplicates.
-
r00tandy over 7 yearsAnd the OneLiner would be:
myArrayList = new ArrayList(new HashSet(myArrayList));
(But just do it if you really need the ArrayList before AND after this line!) -
Jarred Allen over 7 years@jeand'arme If you use a TreeSet instead of a HashSet, you can define your own Comparator to use, and the TreeSet will consider two items to be duplicates if the Comparators .compare(e1, e2) returns 0. Note that this will destroy the existing order of the arraylist.
-
StackFlowed over 7 yearsHow do i do this for case insensitive distinct ?
-
Muhammad Adil over 7 yearsSimilarly at the bottom of the thread, I have given an answer where I am using Set for Custom Object. In a case if anyone have custom object like "Contact" or "Student" can use that answer that works fine for me.
-
Andy Turner over 6 yearsNote that there is an
ImmutableSet.asList()
method, returning anImmutableList
, if you need it back as aList
. -
Paul over 6 years@StackFlowed If you don't need to preserve the order of the list you can
addAll
tonew TreeSet<String>(String.CASE_INSENSITIVE_ORDER)
. The first element added will remain in the set so if your list contains "Dog" and "dog" (in that order) theTreeSet
will contain "Dog". If order must be preserved then before the line in the answer putlist.replaceAll(String::toUpperCase);
. -
Tushar Gogna over 6 yearsI liked this solution better.
-
Samir about 6 yearsI am getting this error :incompatible types: List<Object> cannot be converted to List<String>
-
Diablo almost 6 yearsYah, When i typed my previous comments, I was in a impression that
parallel streams
will give better performance always. But it's a myth. I later learned that there are certain scenarios where parallel streams should be used. In this scenario parallel streams will not give any better performance. and yes parallel streams might not give desired results some cases.List<String> deDupStringList3 = stringList.stream().map(String::toLowerCase).distinct().collect(Collectors.toList());
should be the suitable solution in this case -
Giacomo over 5 yearswhat is the time complexity ?
-
ByWaleed about 5 yearsI think this is the best way of removing duplicated in an ArrayList. Definitely recommended. Thank you @Nenad for the answer.
-
TheRealChx101 about 5 yearsThe problem comes when you have to specifically access an element. For instance when binding an object to a list item view in Android, you are given its index. So
Set
cannot be used here. -
Holger almost 5 yearsWhen you use
T element = iter.next();
, you don’t need the unchecked type casts. Or you usefor(T element: list) …
instead of dealing with anIterator
manually. -
Holger almost 5 years“without using any other data structure”, except another
ArrayList
. Which will be very inefficient for larger lists. -
Holger almost 5 years“…and also not be quadratic in run time” Of course, this is quadratic in run time. It’s even worse than other quadratic solutions.
-
Holger almost 5 yearsCalling
indexOf
on anArrayList
within aforEach
is O(n²). -
Holger almost 5 years“if you are concerned about the overhead of HashSet” and are ignorant towards the overhead of
Collectiors.sort
… -
Holger almost 5 yearsNot easier than
Set<String> strSet = new HashSet<>(strList);
-
Holger almost 5 yearsTry
ListIterator<String> i = hashList.listIterator(); i.add("hello"); i.add("hello");
. Alternatively, you can usehashList.add("a"); hashList.add("b"); hashList.replaceAll(x -> "hello");
. And there might be more ways to counteract this approach in the future. The takeaway is that you shouldn’t try to enforce new contracts via subclassing with classes not designed for that. You will find more about this topic when you search for the “Prefer composition over inheritance” OOP rule. -
Laser Infinite over 4 yearsThis is a simple solution in general but how do you remove the duplicates from an Arraylist of int[]?
-
Ajay Mistry about 4 yearsnot working with ArrayList having model instead of string .
-
jvargas about 4 yearsHow can I aproach this when the list is an object list
-
Holger almost 4 yearsAlternative:
Set<Object> set = new HashSet<>(); yourList.removeIf(x -> !set.add(x));
The advantage is that this alternative allows you to decide what to use in theset.add(…)
expression, for your particular notion of “duplicate”. It’s also independent of the list’s actual elements type. It also retains the order, regardless of whether the set maintains the order or not. Also usable with aTreeSet
(e.g. with custom comparator) instead ofHashSet
. -
LiNKeR over 3 yearsthat makes a lot of sense
-
kfir over 2 yearsPerfect - just missing "repeated = false;" in the internal loop after the "if(!repeated) l2.add(c1);" otherwise it return a short list
-
Jwala Kumar about 2 yearsFor those, who have problem in understanding ArrayList<Integer> arrList3 = new ArrayList<Integer>(Arrays.asList(1, 23, 1, 3, 3, 2, 3, 2, 12, 67, 23, 12, 34, 9)); Collection<Integer> set = new HashSet<>(arrList3); arrList3.clear(); arrList3.addAll(set); System.out.println(set);