Getting unique values from a list of objects with a List<string> as a property

Solution 1

If I understand correctly, you want a single list of all the unique certifications across all of the employees. This is a job for SelectMany:

var uniqueCerts = empList.SelectMany(e => e.Certifications).Distinct().ToList();
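
To try the one-liner above end to end, here is a minimal sketch that rebuilds the sample data from the question. The trimmed-down Employee class, the EmployeeId values, and the names UniqueCertsDemo, BuildSampleEmployees, and GetUniqueCerts are assumptions made for illustration, not part of the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for the Employee class from the question.
public class Employee
{
    public int EmployeeId { get; set; }
    public List<string> Certifications { get; set; } = new List<string>();
}

public static class UniqueCertsDemo
{
    // Builds the two sample employees described in the question
    // (the EmployeeId values are made up for this sketch).
    public static List<Employee> BuildSampleEmployees()
    {
        return new List<Employee>
        {
            new Employee { EmployeeId = 1, Certifications = new List<string> { "first", "second", "third", "first" } },
            new Employee { EmployeeId = 2, Certifications = new List<string> { "fourth", "fifth", "fifth", "sixth" } },
        };
    }

    // Flattens every employee's list into one sequence, then drops duplicates.
    public static List<string> GetUniqueCerts(IEnumerable<Employee> employees)
    {
        return employees.SelectMany(e => e.Certifications).Distinct().ToList();
    }

    public static void Main()
    {
        List<string> uniqueCerts = GetUniqueCerts(BuildSampleEmployees());
        Console.WriteLine(string.Join(", ", uniqueCerts));
    }
}
```

With this data the result holds six values, since "first" and "fifth" each appear twice in the input. Distinct() uses the default string comparer, so the comparison is case-sensitive.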

Solution 2

You want to use SelectMany, which lets you select sublists, but returns them in a flattened form:

stringList = empList.SelectMany(emp => emp.Certifications).Distinct().ToList();
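
The difference between Select and SelectMany is easiest to see by comparing what each returns. The sketch below uses plain nested lists in place of Employee objects; FlattenDemo and its method names are hypothetical helpers for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FlattenDemo
{
    // Hypothetical sample data: each inner list stands for one employee's Certifications.
    public static List<List<string>> SampleLists()
    {
        return new List<List<string>>
        {
            new List<string> { "first", "second", "third", "first" },
            new List<string> { "fourth", "fifth", "fifth", "sixth" },
        };
    }

    // Select keeps the nesting: one List<string> per source element.
    public static IEnumerable<List<string>> Nested(List<List<string>> lists)
    {
        return lists.Select(l => l);
    }

    // SelectMany concatenates the inner lists into a single sequence of strings.
    public static IEnumerable<string> Flattened(List<List<string>> lists)
    {
        return lists.SelectMany(l => l);
    }

    // Query-syntax equivalent: two from clauses are translated to SelectMany.
    public static IEnumerable<string> FlattenedQuery(List<List<string>> lists)
    {
        return from l in lists
               from cert in l
               select cert;
    }

    public static void Main()
    {
        var lists = SampleLists();
        Console.WriteLine(Nested(lists).Count());               // 2 inner lists
        Console.WriteLine(Flattened(lists).Count());            // 8 strings, duplicates included
        Console.WriteLine(Flattened(lists).Distinct().Count()); // 6 unique strings
    }
}
```

Because Select yields whole lists, calling Distinct() on its result compares list references rather than the strings inside them, which is why the flattening step has to come first.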

Solution 3

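If you want an explicitly typed variable instead of var, declare it as List<string>. You still need SelectMany() here, because Select() would produce a sequence of lists rather than a flat sequence of strings:

List<string> employeeCertifications =
    employeeList.SelectMany(e => e.Certifications).Distinct().ToList();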
Author by

wootscootinboogie

Updated on September 25, 2020

Comments

  • wootscootinboogie
    wootscootinboogie over 3 years

    For illustrative purposes I've got a simple Employee class with several fields and a method to remove multiple occurrences in the Certifications property

    public class Employee
    {
        public int EmployeeId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }

        private List<string> certifications = new List<string>();
        public List<string> Certifications
        {
            get { return certifications; }
            set { certifications = value; }
        }

        // Returns a copy of s without duplicates, printing each remaining value.
        public List<string> RemoveDuplicates(List<string> s)
        {
            List<string> dupesRemoved = s.Distinct().ToList();
            foreach (string str in dupesRemoved)
                Console.WriteLine(str);
            return dupesRemoved;
        }
    }
    

    The RemoveDuplicates method will remove any duplicate strings in the Certifications property of an Employee object. Now consider the case where I have a list of Employee objects.

    Employee e = new Employee();
    List<string> stringList = new List<string>();
    stringList.Add("first");
    stringList.Add("second");
    stringList.Add("third");
    stringList.Add("first");
    e.Certifications = stringList;
    // e.Certifications = e.RemoveDuplicates(e.Certifications); works fine

    Employee e2 = new Employee();
    e2.Certifications.Add("fourth");
    e2.Certifications.Add("fifth");
    e2.Certifications.Add("fifth");
    e2.Certifications.Add("sixth");

    List<Employee> empList = new List<Employee>();
    empList.Add(e);
    empList.Add(e2);
    

    I could use

    foreach (Employee emp in empList)
    {
        emp.Certifications = emp.RemoveDuplicates(emp.Certifications);
    }
    

    to remove the duplicates within each employee's list, but what I want is a single list of ALL unique Certifications from all employees in the list, and I would like to do this in LINQ, something akin to

    stringList = empList.Select(emp => emp.Certifications.Distinct().ToList());
    

    this gives me an error saying

    Cannot implicitly convert type 'System.Collections.Generic.IEnumerable<System.Collections.Generic.List<string>>' to 'System.Collections.Generic.List<string>'. An explicit conversion exists (are you missing a cast?)  
    

    How can I get a list of unique Certifications from a list of Employee objects?

    • paparazzo
      paparazzo almost 11 years
      If EmployeeId is a unique identifier then I recommend you override GetHashCode and Equals using EmployeeId. Also make List<string> certifications a HashSet (not List) to not allow duplicates in Employee.
    • wootscootinboogie
      wootscootinboogie almost 11 years
      This is for illustrative purposes only, but thank you for pointing this out, I will file this one away in the tool set.
  • wootscootinboogie
    wootscootinboogie almost 11 years
    I knew I had to be close, and that it was a one liner :)
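
paparazzo's suggestion in the comments (store Certifications in a HashSet and base equality on EmployeeId) could look roughly like the sketch below. The names EmployeeWithSet, HashSetDemo, and AllCertifications are hypothetical, and the sketch assumes EmployeeId really is a unique identifier that does not change while the object sits in a hash-based collection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch of the comment's suggestion; the class name is hypothetical.
public class EmployeeWithSet : IEquatable<EmployeeWithSet>
{
    public int EmployeeId { get; set; }

    // A HashSet ignores values it already contains, so duplicates never get stored.
    public HashSet<string> Certifications { get; } = new HashSet<string>();

    // Two employees are equal when their EmployeeId values match.
    public bool Equals(EmployeeWithSet other)
    {
        return other != null && other.EmployeeId == EmployeeId;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as EmployeeWithSet);
    }

    public override int GetHashCode()
    {
        return EmployeeId.GetHashCode();
    }
}

public static class HashSetDemo
{
    // Collects the certifications of all employees; the HashSet constructor
    // drops values that appear for more than one employee.
    public static HashSet<string> AllCertifications(IEnumerable<EmployeeWithSet> employees)
    {
        return new HashSet<string>(employees.SelectMany(e => e.Certifications));
    }

    public static void Main()
    {
        var e = new EmployeeWithSet { EmployeeId = 1 };
        e.Certifications.Add("first");
        e.Certifications.Add("first"); // Add returns false; the set is unchanged
        Console.WriteLine(e.Certifications.Count); // 1
    }
}
```

With this design the per-employee RemoveDuplicates step becomes unnecessary, at the cost of losing the insertion order and index access that List<string> provides.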