Is there a no-duplicate List implementation out there?
Solution 1
There's no Java collection in the standard library to do this. LinkedHashSet<E>
preserves ordering similarly to a List
, though, so if you wrap your set in a List
when you want to use it as a List
you'll get the semantics you want.
Alternatively, the Commons Collections (or commons-collections4
, for the generic version) has a List
which does what you want already: SetUniqueList
/ SetUniqueList<E>
.
Solution 2
Here is what I did and it works.
Assuming I have an ArrayList
to work with the first thing I did was created a new LinkedHashSet
.
LinkedHashSet<E> hashSet = new LinkedHashSet<E>()
Then I attempt to add my new element to the LinkedHashSet
. The add method does not alter the LinkedHasSet
and returns false if the new element is a duplicate. So this becomes a condition I can test before adding to the ArrayList
.
if (hashSet.add(E)) arrayList.add(E);
This is a simple and elegant way to prevent duplicates from being added to an array list. If you want you can encapsulate it in and override of the add method in a class that extends the ArrayList
. Just remember to deal with addAll
by looping through the elements and calling the add method.
Solution 3
So here's what I did eventually. I hope this helps someone else.
class NoDuplicatesList<E> extends LinkedList<E> {
@Override
public boolean add(E e) {
if (this.contains(e)) {
return false;
}
else {
return super.add(e);
}
}
@Override
public boolean addAll(Collection<? extends E> collection) {
Collection<E> copy = new LinkedList<E>(collection);
copy.removeAll(this);
return super.addAll(copy);
}
@Override
public boolean addAll(int index, Collection<? extends E> collection) {
Collection<E> copy = new LinkedList<E>(collection);
copy.removeAll(this);
return super.addAll(index, copy);
}
@Override
public void add(int index, E element) {
if (this.contains(element)) {
return;
}
else {
super.add(index, element);
}
}
}
Solution 4
Why not encapsulate a set with a list, sort like:
new ArrayList( new LinkedHashSet() )
This leaves the other implementation for someone who is a real master of Collections ;-)
Solution 5
You should seriously consider dhiller's answer:
- Instead of worrying about adding your objects to a duplicate-less List, add them to a Set (any implementation), which will by nature filter out the duplicates.
- When you need to call the method that requires a List, wrap it in a
new ArrayList(set)
(or anew LinkedList(set)
, whatever).
I think that the solution you posted with the NoDuplicatesList
has some issues, mostly with the contains()
method, plus your class does not handle checking for duplicates in the Collection passed to your addAll()
method.
Comments
-
Yuval over 2 years
I know about
SortedSet
, but in my case I need something that implementsList
, and notSet
. So is there an implementation out there, in the API or elsewhere?It shouldn't be hard to implement myself, but I figured why not ask people here first?
-
Ben over 15 yearsThis constructor copies the contents of the Set into the new List, rather than wrapping it.
-
Yuval over 15 yearsI specifically mentioned that I need a List implementation. Trust me, there's a reason.
-
Yuval over 15 yearsThe Commons class is exactly what I need, but my boss told me to implement it myself eventually. 10x anyway!
-
matt b over 15 yearsIs the reason because you're interacting with an API that is taking a List as a parameter (instead of a Collection)? That's a bit annoying to have to deal with
-
matt b over 15 yearsBe careful - LinkedList.contains() needs to scan the entire list to determine if an object is contained in the List. This means that when you are adding objects to a large List, the entire List is scanned for each add operation (in the worst case). This can end up being SLOW.
-
matt b over 15 yearsAlso, your addAll override doesn't check for duplicates in the collection being passed to addAll().
-
Yuval over 15 yearsI'd love to learn of these contains() issues. As for the addAll(), I create a copy of the given collection and remove all objects already in 'this'. How does that not handle duplicates?
-
Yuval over 15 yearsI was ready to fall back to this idea (which eventually I had to) if no one suggested anything better =8-) See my own answer above.
-
Yuval over 15 yearsActually the API takes a Map<AccountType, Map<AccountType, List<Account>>>, which means holding somewhere in the vicinity of dozens to hundreds of lists... bah.
-
Ben over 15 yearsAh well, nothing like reinventing the wheel! You'll know now if the need comes up again, anyway. collections15 is a pretty useful thing to have kicking around; MultiMaps in particular ease the pain of something one ends up implementing oneself a lot.
-
matt b over 15 yearsAs I mentioned in my comment to your class posting, contains() has to scan the entire list (in the worst case) to find if the object is contained in the list. If you have a list of 1 million items and add 10 it it individually, then (in the worst case) over ten million items are scanned.
-
matt b over 15 yearsAs for addAll(), if the Collection passed to addAll contains duplicates itself, they are not detected. For example: your list {A, B, C, D} parameter list {B, D, E, E, E}. You create a copy of the parameter, and after removeAll it contains {E, E, E}.
-
Yuval over 15 yearsThe addAll() issue is not really relevant to me, as I use NoDuplicatesList throughout the procedure, and addAll() should be receiving another NoDuplicatesList as its parameter. What would you suggest to improve the contains() performance?
-
Yuval almost 15 years@skaffman: he's not actually an idiot, but sometimes he makes moves that are... well, odd. Anyway, I'm not gonna introduce bugs into the product. In today's market, I'm happy with my job and not looking to slam doors and burn bridges, if you get my point.
-
Janning about 12 yearsThis copies a set to a list but you don't have any well-known ordering. But this what the question is all about.
-
emeraldhieu almost 12 yearsI'm quite surprise when SetUniqueList doesn't have parameterized type.
-
Jeffrey Blattman over 11 yearswrapping LinkedHashSet in a list doesn't make it easy or efficient to implement all of the index-based operations on a list. i don't think that's a reasonable approach. and also, there are reason not to pull in the commons things, for example, on a mobile platform where size matters.
-
Ben over 11 yearsJeffrey: On mobile platforms the system will usually remove unused classes, but sure, there's plenty of reasons you might not go down one of these "normal" solutions. There's always some trade-off to be made, and no solution will fix all cases.
-
gyurix over 9 yearsYeah, I think, this is the best solution for it, you can also simply use a normal HashSet, not a Linked, and then you can use your list as you want, you can also deside what to do in some situations, like in adding an element inside a list before a specific index, you can deside that you want to move the duplicated item to this position or not.
-
gyurix over 9 yearsYeah, you should add a bit more methods to it for implementing the List interface.
-
marcolopes about 8 yearsBest solution here... Will post my UniqueList class code
-
Jeancarlo Fontalvo over 7 yearsThis worked for me, in my BFS Graph algorithm. Because I had some nodes that I added to a Queue (LinkedList) just if they weren't already in.
-
naXa stands with Ukraine over 6 yearsWhat is
HashSet#consist()
? -
miroslavign almost 6 yearsJust keep in mind, sorting will NOT work on that "list"
-
TheRealChx101 about 5 years@mattb How would you solve this problem then: On Android, when binding objects to a list item view, we are given the position of the item in the view adapter. Since sets have no index, the only way is to check whether or not the object exists when using lists is to iterate through and look for an existing copy.
-
Al G Johnston about 4 yearsConstructing probability functions with element-probability pairs can involve not have duplicates, although duplicate elements could just be merged.
-
specializt almost 3 yearsthe performance-problems of this solution could be fixed with a simple, additional
Set<Integer>
which stores the hashCodes of the elements (instead of searching through the entire list) - which would require all of the elements to properly implement hashCode(), of course but with helper frameworks like Lombok, this really is no problem ... its kind of trivial, actually. One could even optimize that solution with a red/black-tree for hashCodes ... small memory overhead for large performance gains; welcome to the world of cloud computing ;-) -
Eddie Jamsession about 2 yearsIf we wanted to implement it ourselves, we wouldn't ask