Remove duplicate strings in a List in Java


Solution 1

contains() is not called because LinkedHashSet.add() is not implemented in terms of it.

If you want add() to perform the case-insensitive check, you will need to override add() as well.

It is not implemented that way because calling contains() first would mean performing two lookups instead of one, which would be slower.
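To illustrate the point about overriding add(), here is a minimal sketch of a set that does the case-insensitive check inside add() itself. The class name CaseFoldingLinkedSet and the helper fold() are hypothetical, not from the question, and the sketch assumes that lower-casing with Locale.ROOT is an acceptable way to compare strings ignoring case.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical class: keeps insertion order and treats strings that differ only by case as equal.
class CaseFoldingLinkedSet extends LinkedHashSet<String> {
    // Lower-cased copies of every element, used only for membership checks.
    private final Set<String> folded = new HashSet<String>();

    // Assumption: Locale.ROOT lower-casing is good enough for "ignore case".
    private static String fold(String s) {
        return s == null ? null : s.toLowerCase(Locale.ROOT);
    }

    @Override
    public boolean add(String s) {
        // A single lookup on the folded key decides whether s is new.
        if (!folded.add(fold(s))) {
            return false;
        }
        return super.add(s);
    }

    @Override
    public boolean contains(Object o) {
        return (o == null || o instanceof String) && folded.contains(fold((String) o));
    }
}
```

addAll() is inherited from AbstractCollection and calls add() for each element, so set.addAll(list) goes through the override. The sketch does not override remove() or clear(), so it is only suitable for the add-then-copy use in the question.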

Solution 2

Try a TreeSet with a case-insensitive comparator:

    Set<String> set = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
    set.addAll(list);
    return new ArrayList<String>(set);

UPDATE: as Tom Anderson mentioned, the TreeSet approach does not preserve the initial order. If that is really an issue, try

    Set<String> set = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
    Iterator<String> i = list.iterator();
    while (i.hasNext()) {
        String s = i.next();
        if (set.contains(s)) {
            i.remove();
        }
        else {
            set.add(s);
        }
    }

prints

[2, 1]
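For reference, here is a self-contained sketch of the same in-place idea. The class and method names are made up for the example, and the input is different from the one that produced the output above. It relies on add() returning false for a duplicate, so each string needs only one lookup instead of contains() followed by add().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class for the in-place approach.
class InPlaceDedupDemo {
    // Removes case-insensitive duplicates from the list itself, keeping each first occurrence.
    static void removeDuplicatesInPlace(List<String> list) {
        Set<String> seen = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
        Iterator<String> it = list.iterator();
        while (it.hasNext()) {
            // add() returns false when an equal (ignoring case) string was already seen.
            if (!seen.add(it.next())) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        // Copy into an ArrayList: the fixed-size list returned by Arrays.asList
        // does not support Iterator.remove().
        List<String> list = new ArrayList<String>(Arrays.asList("b", "A", "a", "B", "c"));
        removeDuplicatesInPlace(list);
        System.out.println(list); // [b, A, c]
    }
}
```

Note that this modifies the caller's list, and that a TreeSet with String.CASE_INSENSITIVE_ORDER does not accept null elements.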

Solution 3

The add() method of LinkedHashSet does not call contains() internally; otherwise your method would have been called as well.

Instead of a LinkedHashSet, why don't you use a SortedSet with a case-insensitive comparator, such as String.CASE_INSENSITIVE_ORDER?

Your code is reduced to

    public static List<String> removeDupList(List<String> list, boolean ignoreCase) {
        Set<String> set = (ignoreCase ? new TreeSet<String>(String.CASE_INSENSITIVE_ORDER) : new LinkedHashSet<String>());
        set.addAll(list);

        List<String> res = new ArrayList<String>(set);
        return res;
    }

If you wish to preserve the order, as @tom anderson pointed out in his comment, you can keep an auxiliary list that records the elements in their original order.

Try adding each element to the TreeSet; if add() returns true, also append it to the auxiliary list, otherwise skip it.

    public static List<String> removeDupList(List<String> list) {
        Set<String> sortedSet = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
        List<String> orderedList = new ArrayList<String>();
        for (String str : list) {
            if (sortedSet.add(str)) { // add() returns true if the string was not already present, false otherwise
                orderedList.add(str);
            }
        }
        return orderedList;
    }
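The two versions above can be combined. The sketch below keeps the ignoreCase flag from the question while preserving the original order in both cases. The class name OrderedDedup is hypothetical, and using a HashSet for the case-sensitive branch is my own choice, not something from the question.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical holder class for the combined method.
class OrderedDedup {
    // Returns a new list without duplicates, keeping the first occurrence of each string.
    static List<String> removeDupList(List<String> list, boolean ignoreCase) {
        // The set is only used to detect duplicates; the order comes from the result list.
        Set<String> seen = ignoreCase
                ? new TreeSet<String>(String.CASE_INSENSITIVE_ORDER)
                : new HashSet<String>();
        List<String> result = new ArrayList<String>();
        for (String s : list) {
            // add() returns true only for strings not seen before.
            if (seen.add(s)) {
                result.add(s);
            }
        }
        return result;
    }
}
```

For the input "b", "A", "a", "B", "c", "b" this should give [b, A, c] with ignoreCase set to true and [b, A, a, B, c] with it set to false. Null elements are not handled in the ignoreCase branch.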

Author: user121196

Updated on September 21, 2022

Comments

  • user121196
    user121196 over 1 year

    Update: I guess HashSet.add(Object obj) does not call contains(). Is there a way to implement what I want (remove duplicate strings, ignoring case, using a Set)?

    Original question: I am trying to remove duplicates from a list of String in Java. However, in the following code CaseInsensitiveSet.contains(Object obj) is not getting called. Why?

    public static List<String> removeDupList(List<String>list, boolean ignoreCase){
        Set<String> set = (ignoreCase?new CaseInsensitiveSet():new LinkedHashSet<String>());
        set.addAll(list);
    
        List<String> res = new Vector<String>(set);
        return res;
    }
    
    
    public class CaseInsensitiveSet  extends LinkedHashSet<String>{
    
        @Override
        public boolean contains(Object obj){
            //this not getting called.
            if(obj instanceof String){
    
                return super.contains(((String)obj).toLowerCase());
            }
            return super.contains(obj);
        }
    
    }
    
    • Andrew Thompson
      Andrew Thompson over 11 years
      Please learn how to use code formatting rather than using <pre> statements.
  • Tom Anderson
    Tom Anderson over 11 years
    If the asker doesn't care about preserving the order in the list, then this is an excellent answer. If he does, sadly, it is merely a good one.