Is there a no-duplicate List implementation out there?

189,617

Solution 1

There's no Java collection in the standard library to do this. LinkedHashSet<E> preserves ordering similarly to a List, though, so if you wrap your set in a List when you want to use it as a List you'll get the semantics you want.

Alternatively, the Commons Collections (or commons-collections4, for the generic version) has a List which does what you want already: SetUniqueList / SetUniqueList<E>.

Solution 2

Here is what I did and it works.

Assuming I have an ArrayList to work with the first thing I did was created a new LinkedHashSet.

LinkedHashSet<E> hashSet = new LinkedHashSet<E>()

Then I attempt to add my new element to the LinkedHashSet. The add method does not alter the LinkedHasSet and returns false if the new element is a duplicate. So this becomes a condition I can test before adding to the ArrayList.

if (hashSet.add(E)) arrayList.add(E);

This is a simple and elegant way to prevent duplicates from being added to an array list. If you want you can encapsulate it in and override of the add method in a class that extends the ArrayList. Just remember to deal with addAll by looping through the elements and calling the add method.

Solution 3

So here's what I did eventually. I hope this helps someone else.

class NoDuplicatesList<E> extends LinkedList<E> {
    @Override
    public boolean add(E e) {
        if (this.contains(e)) {
            return false;
        }
        else {
            return super.add(e);
        }
    }

    @Override
    public boolean addAll(Collection<? extends E> collection) {
        Collection<E> copy = new LinkedList<E>(collection);
        copy.removeAll(this);
        return super.addAll(copy);
    }

    @Override
    public boolean addAll(int index, Collection<? extends E> collection) {
        Collection<E> copy = new LinkedList<E>(collection);
        copy.removeAll(this);
        return super.addAll(index, copy);
    }

    @Override
    public void add(int index, E element) {
        if (this.contains(element)) {
            return;
        }
        else {
            super.add(index, element);
        }
    }
}   

Solution 4

Why not encapsulate a set with a list, sort like:

new ArrayList( new LinkedHashSet() )

This leaves the other implementation for someone who is a real master of Collections ;-)

Solution 5

You should seriously consider dhiller's answer:

  1. Instead of worrying about adding your objects to a duplicate-less List, add them to a Set (any implementation), which will by nature filter out the duplicates.
  2. When you need to call the method that requires a List, wrap it in a new ArrayList(set) (or a new LinkedList(set), whatever).

I think that the solution you posted with the NoDuplicatesList has some issues, mostly with the contains() method, plus your class does not handle checking for duplicates in the Collection passed to your addAll() method.

Share:
189,617
Yuval
Author by

Yuval

Updated on December 03, 2021

Comments

  • Yuval
    Yuval over 2 years

    I know about SortedSet, but in my case I need something that implements List, and not Set. So is there an implementation out there, in the API or elsewhere?

    It shouldn't be hard to implement myself, but I figured why not ask people here first?

  • Ben
    Ben over 15 years
    This constructor copies the contents of the Set into the new List, rather than wrapping it.
  • Yuval
    Yuval over 15 years
    I specifically mentioned that I need a List implementation. Trust me, there's a reason.
  • Yuval
    Yuval over 15 years
    The Commons class is exactly what I need, but my boss told me to implement it myself eventually. 10x anyway!
  • matt b
    matt b over 15 years
    Is the reason because you're interacting with an API that is taking a List as a parameter (instead of a Collection)? That's a bit annoying to have to deal with
  • matt b
    matt b over 15 years
    Be careful - LinkedList.contains() needs to scan the entire list to determine if an object is contained in the List. This means that when you are adding objects to a large List, the entire List is scanned for each add operation (in the worst case). This can end up being SLOW.
  • matt b
    matt b over 15 years
    Also, your addAll override doesn't check for duplicates in the collection being passed to addAll().
  • Yuval
    Yuval over 15 years
    I'd love to learn of these contains() issues. As for the addAll(), I create a copy of the given collection and remove all objects already in 'this'. How does that not handle duplicates?
  • Yuval
    Yuval over 15 years
    I was ready to fall back to this idea (which eventually I had to) if no one suggested anything better =8-) See my own answer above.
  • Yuval
    Yuval over 15 years
    Actually the API takes a Map<AccountType, Map<AccountType, List<Account>>>, which means holding somewhere in the vicinity of dozens to hundreds of lists... bah.
  • Ben
    Ben over 15 years
    Ah well, nothing like reinventing the wheel! You'll know now if the need comes up again, anyway. collections15 is a pretty useful thing to have kicking around; MultiMaps in particular ease the pain of something one ends up implementing oneself a lot.
  • matt b
    matt b over 15 years
    As I mentioned in my comment to your class posting, contains() has to scan the entire list (in the worst case) to find if the object is contained in the list. If you have a list of 1 million items and add 10 it it individually, then (in the worst case) over ten million items are scanned.
  • matt b
    matt b over 15 years
    As for addAll(), if the Collection passed to addAll contains duplicates itself, they are not detected. For example: your list {A, B, C, D} parameter list {B, D, E, E, E}. You create a copy of the parameter, and after removeAll it contains {E, E, E}.
  • Yuval
    Yuval over 15 years
    The addAll() issue is not really relevant to me, as I use NoDuplicatesList throughout the procedure, and addAll() should be receiving another NoDuplicatesList as its parameter. What would you suggest to improve the contains() performance?
  • Yuval
    Yuval almost 15 years
    @skaffman: he's not actually an idiot, but sometimes he makes moves that are... well, odd. Anyway, I'm not gonna introduce bugs into the product. In today's market, I'm happy with my job and not looking to slam doors and burn bridges, if you get my point.
  • Janning
    Janning about 12 years
    This copies a set to a list but you don't have any well-known ordering. But this what the question is all about.
  • emeraldhieu
    emeraldhieu almost 12 years
    I'm quite surprise when SetUniqueList doesn't have parameterized type.
  • Jeffrey Blattman
    Jeffrey Blattman over 11 years
    wrapping LinkedHashSet in a list doesn't make it easy or efficient to implement all of the index-based operations on a list. i don't think that's a reasonable approach. and also, there are reason not to pull in the commons things, for example, on a mobile platform where size matters.
  • Ben
    Ben over 11 years
    Jeffrey: On mobile platforms the system will usually remove unused classes, but sure, there's plenty of reasons you might not go down one of these "normal" solutions. There's always some trade-off to be made, and no solution will fix all cases.
  • gyurix
    gyurix over 9 years
    Yeah, I think, this is the best solution for it, you can also simply use a normal HashSet, not a Linked, and then you can use your list as you want, you can also deside what to do in some situations, like in adding an element inside a list before a specific index, you can deside that you want to move the duplicated item to this position or not.
  • gyurix
    gyurix over 9 years
    Yeah, you should add a bit more methods to it for implementing the List interface.
  • marcolopes
    marcolopes about 8 years
    Best solution here... Will post my UniqueList class code
  • Jeancarlo Fontalvo
    Jeancarlo Fontalvo over 7 years
    This worked for me, in my BFS Graph algorithm. Because I had some nodes that I added to a Queue (LinkedList) just if they weren't already in.
  • naXa stands with Ukraine
    naXa stands with Ukraine over 6 years
    What is HashSet#consist() ?
  • miroslavign
    miroslavign almost 6 years
    Just keep in mind, sorting will NOT work on that "list"
  • TheRealChx101
    TheRealChx101 about 5 years
    @mattb How would you solve this problem then: On Android, when binding objects to a list item view, we are given the position of the item in the view adapter. Since sets have no index, the only way is to check whether or not the object exists when using lists is to iterate through and look for an existing copy.
  • Al G Johnston
    Al G Johnston about 4 years
    Constructing probability functions with element-probability pairs can involve not have duplicates, although duplicate elements could just be merged.
  • specializt
    specializt almost 3 years
    the performance-problems of this solution could be fixed with a simple, additional Set<Integer> which stores the hashCodes of the elements (instead of searching through the entire list) - which would require all of the elements to properly implement hashCode(), of course but with helper frameworks like Lombok, this really is no problem ... its kind of trivial, actually. One could even optimize that solution with a red/black-tree for hashCodes ... small memory overhead for large performance gains; welcome to the world of cloud computing ;-)
  • Eddie Jamsession
    Eddie Jamsession about 2 years
    If we wanted to implement it ourselves, we wouldn't ask