Java: Detect duplicates in ArrayList?

297,148

Solution 1

Simplest: dump the whole collection into a Set (using the Set(Collection) constructor or Set.addAll), then see if the Set has the same size as the ArrayList.

List<Integer> list = ...;
Set<Integer> set = new HashSet<Integer>(list);

if(set.size() < list.size()){
    /* There are duplicates */
}

Update: If I'm understanding your question correctly, you have a 2d array of Block, as in

Block table[][];

and you want to detect if any row of them has duplicates?

In that case, I could do the following, assuming that Block implements "equals" and "hashCode" correctly:

for (Block[] row : table) {
   Set set = new HashSet<Block>(); 
   for (Block cell : row) {
      set.add(cell);
   }
   if (set.size() < 6) { //has duplicate
   }
}

I'm not 100% sure of that for syntax, so it might be safer to write it as

for (int i = 0; i < 6; i++) {
   Set set = new HashSet<Block>(); 
   for (int j = 0; j < 6; j++)
    set.add(table[i][j]);
 ...

Set.add returns a boolean false if the item being added is already in the set, so you could even short circuit and bale out on any add that returns false if all you want to know is whether there are any duplicates.

Solution 2

Improved code, using return value of Set#add instead of comparing the size of list and set.

public static <T> boolean hasDuplicate(Iterable<T> all) {
    Set<T> set = new HashSet<T>();
    // Set#add returns false if the set does not change, which
    // indicates that a duplicate element has been added.
    for (T each: all) if (!set.add(each)) return true;
    return false;
}

Solution 3

With Java 8+ you can use Stream API:

boolean areAllDistinct(List<Block> blocksList) {
    return blocksList.stream().map(Block::getNum).distinct().count() == blockList.size();
}

Solution 4

If you are looking to avoid having duplicates at all, then you should just cut out the middle process of detecting duplicates and use a Set.

Solution 5

Improved code to return the duplicate elements

  • Can find duplicates in a Collection
  • return the set of duplicates
  • Unique Elements can be obtained from the Set

public static <T> List getDuplicate(Collection<T> list) {

    final List<T> duplicatedObjects = new ArrayList<T>();
    Set<T> set = new HashSet<T>() {
    @Override
    public boolean add(T e) {
        if (contains(e)) {
            duplicatedObjects.add(e);
        }
        return super.add(e);
    }
    };
   for (T t : list) {
        set.add(t);
    }
    return duplicatedObjects;
}


public static <T> boolean hasDuplicate(Collection<T> list) {
    if (getDuplicate(list).isEmpty())
        return false;
    return true;
}
Share:
297,148

Related videos on Youtube

Admin
Author by

Admin

Updated on July 08, 2022

Comments

  • Admin
    Admin almost 2 years

    How could I go about detecting (returning true/false) whether an ArrayList contains more than one of the same element in Java?

    Many thanks, Terry

    Edit Forgot to mention that I am not looking to compare "Blocks" with each other but their integer values. Each "block" has an int and this is what makes them different. I find the int of a particular Block by calling a method named "getNum" (e.g. table1[0][2].getNum();

    • Paul Tomblin
      Paul Tomblin about 15 years
      If "Block" is compared by an int, you should probably have hashCode return that same int and have equals compare those ints.
    • dmarquina
      dmarquina almost 5 years
      use Set instead of List
  • jon077
    jon077 about 15 years
    Make sure to implement hashCode/equals as well.
  • Fabian Steeg
    Fabian Steeg about 15 years
    Or even a bit easier: wrap it when creating the set, e.g. new HashSet(list), instead of using addAll.
  • palantus
    palantus about 15 years
    @jon077: That depends on your definition of "duplicate".
  • Admin
    Admin about 15 years
    Would the process of detecting the elements in a 2D array be the same? For example, checking from array[0][0] to array[0][6] (a 'row')..? Many thanks, Terry
  • Admin
    Admin about 15 years
    Each object in the array holds an integer value. By "duplicate", the object would have the same integer value.
  • Paul Tomblin
    Paul Tomblin about 15 years
    Terry, there are probably more efficient ways, but yes, you could add the array to a Set and compare Set.size to array.length.
  • Nikhil
    Nikhil about 15 years
    "Duplicate" has multiple meanings for objects (as opposed to primitives). If you're just using ints, or the Integer class which implements equals(), that shouldn't be an issue.
  • jon077
    jon077 about 15 years
    Make sure to implement hashCode/equals :)
  • palantus
    palantus about 15 years
    @jon077: Not necessarily, as I just said.
  • Admin
    Admin about 15 years
    I'm trying to implement the solution above but it seems to be triggering an error message which says: "myFile.java uses unchecked or unsafe operations" Not quite sure what it meansor what to do!
  • Paul Brinkley
    Paul Brinkley about 15 years
    I think we'd have to see the code you actually wrote. The snippet in the answer isn't intended to compile as written, but what was written, should compile fine, provided you filled in the bits well.
  • Paul Tomblin
    Paul Tomblin about 15 years
    You can't turn a table into a list like that. Look at the Arrays class.
  • Paul Tomblin
    Paul Tomblin about 15 years
    If you're trying to turn the entire 36 Block objects into a single list, you need to loop.
  • Paul Tomblin
    Paul Tomblin about 15 years
    Comments aren't a good place to solve a secondary problem like this. Edit your question and I'll edit the answer, or ask a new question.
  • Admin
    Admin about 15 years
    Ah, I see that a "table" is a different thing entirely. Assume that "table1" is the name given to 2D array of "Block" objects
  • Admin
    Admin about 15 years
    Sorry about the clutter in the comments, I've now edited my question. Thank you for the help so far!
  • Admin
    Admin about 15 years
    Ok, so from what I can tell, the loops add each 'row' to a Set. So am i correct in thinking that now all that remains to do is compare each Set to the size of table1 (and if Set's size is less than the size of table1's size then there are duplicates)?
  • Paul Tomblin
    Paul Tomblin about 15 years
    You said you wanted to find out if there were any duplicates within a row, not over the whole table. That's why I compare the set.size() to 6.
  • Admin
    Admin about 15 years
    Update: Solution implemented and working! Thank you all for your input on this problem and special thanks to Paul for going out of your way to help! Warm Regards, Terry
  • Paul Tomblin
    Paul Tomblin about 15 years
    In this case, n is 6 so I wouldn't waste a lot of time on implementation details, but I'll keep your idea of the special heap sort if I ever need to do something like that.
  • ChaimKut
    ChaimKut almost 12 years
    I don't understand the third paragraph. Mergesort and heapsort are both O(nlog(n)), not O(log(n)) as you write; even if you exit once you identify a duplicate, that still doesn't change your time complexity...
  • Jules Colle
    Jules Colle over 11 years
    That's pretty awesome. you have some invalid code, and maybe it's not the most optimal way, but your approach totally rocks! (and it works great)
  • Paul Jackson
    Paul Jackson about 10 years
    Would it be more efficient to tell the HashSet how much space to allocate: Set<T> set = new HashSet<T>(list.size());? Given a List parameter I think it is more efficient if it is common for the list to not contain duplicates.
  • NIMISHAN
    NIMISHAN over 8 years
    how can i do it to a array list with a given sliding window , example 100 indexes with sliding window size 5? ABCAD should output : ABCAD = 00010 and if the next value is E, then BCADE should output : 00000
  • Paul Tomblin
    Paul Tomblin over 8 years
    @NIMISHAN ask your own question, don't try to hijack this one.
  • ρяσѕρєя K
    ρяσѕρєя K almost 7 years
    Add some explanation with answer for how this answer help OP in fixing current issue
  • mcallahan
    mcallahan over 6 years
    However using a Set does not detect duplicates. It just prevents them. Unless of course you check the result of the add method as noted by @akuhn above.
  • Jay Anderson
    Jay Anderson about 6 years
    @PaulJackson Sizing based on the full list will probably be beneficial. However if the common case is for it to find a duplicate early then the space was wasted. Also even sizing the HashSet to the size of the list will result in resizing when running through the whole list because of the underlying loading factor of the hash structure.
  • Sandeep Chauhan
    Sandeep Chauhan about 6 years
    Unless you experience actual issues with runtime or space I would not finetune your code like that. Premature optimization is best avoided.
  • Christophe Roussy
    Christophe Roussy about 5 years
    Simplest and best answer if you want the duplicates, for performance you may init uniqueSet hint with size of args.
  • Sergiy Dakhniy
    Sergiy Dakhniy over 3 years
    That's too expensive considering algorithmic complexity. Best sorting is O(n*log(n)) while you can find duplicates with O(n) complexity.
  • Jezor
    Jezor over 3 years
    Could be even shorter: return stream.allMatch(new HashSet<>()::add);
  • Petr Dvořák
    Petr Dvořák almost 3 years
    Commenting just to remove confusing info from the internet. The time complexity of the accepted answer is actually O(n) when using insertion to HashSet. The memory complexity is doubled, but still O(n)... you do not need to sort anything unless you have a requirement of minimalistic memory impact...