Java: filter a List so that it only contains objects that have the same attribute as objects in another list

Solution 1

In this case you can't use retainAll(), but I would solve the problem like this:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;


public class DifferentCollections {

    public static void main(String[] args) {


        List<Customer> customers = new ArrayList<>(Arrays.asList(new Customer(1), new Customer(2), new Customer(10)));
        List<License> licenses = new ArrayList<>(Arrays.asList(new License(1), new License(2), new License(30)));

        List<Customer> filteredCustomers = customers.stream().
                filter(c -> customerIdFoundInLicensesList(c, licenses)).
                collect(Collectors.toList());

        System.out.println(filteredCustomers);
    }

    private static boolean customerIdFoundInLicensesList(Customer customer, List<License> licenses) {
        return licenses.stream().
                filter(l -> l.getId().equals(customer.getId())).
                findAny().
                isPresent();
    }
}

class Customer {
    Integer id;

    public Customer(Integer id) {
        this.id = id;
    }

    public Integer getId() {
        return id;
    }

    @Override
    public String toString() {
        return "Customer{" + "id=" + id + '}';
    }
}

class License {
    Integer id;

    public License(Integer id) {
        this.id = id;
    }

    public Integer getId() {
        return id;
    }

    @Override
    public String toString() {
        return "License{" + "id=" + id + '}';
    }
}
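
The lookup helper above can be written more compactly: filter(condition).findAny().isPresent() is equivalent to anyMatch(condition), and Arrays.asList already returns a List, so the ArrayList copy is optional. The following is only a sketch of that variant, assuming all ids are non-null; the class name AnyMatchFilter and the helper names are illustrative, and the nested Customer and License classes are reduced stand-ins for the ones above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class AnyMatchFilter {

    // Reduced stand-ins for the Customer and License classes above.
    static class Customer {
        final Integer id;
        Customer(Integer id) { this.id = id; }
        Integer getId() { return id; }
    }

    static class License {
        final Integer id;
        License(Integer id) { this.id = id; }
        Integer getId() { return id; }
    }

    // Keeps the customers whose id also appears on at least one license.
    // Assumes non-null ids; a null license id would throw here.
    static List<Customer> withMatchingLicense(List<Customer> customers, List<License> licenses) {
        return customers.stream()
                .filter(c -> licenses.stream().anyMatch(l -> l.getId().equals(c.getId())))
                .collect(Collectors.toList());
    }

    // Convenience for printing: the ids of the retained customers.
    static List<Integer> matchingIds(List<Customer> customers, List<License> licenses) {
        return withMatchingLicense(customers, licenses).stream()
                .map(Customer::getId)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Customer> customers = Arrays.asList(new Customer(1), new Customer(2), new Customer(10));
        List<License> licenses = Arrays.asList(new License(1), new License(2), new License(30));
        System.out.println(matchingIds(customers, licenses)); // prints [1, 2]
    }
}
```

This still scans the license list once per customer, so it only tidies the code; it does not change the running time.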

Solution 2

Your task description is not clear, but apparently you want to get all Customer instances for which a License instance with the same id exists.

While it is possible to express this as one stream operation, searching for a match in one list for every element of the other list implies O(n×m) time complexity; in other words, it would perform very badly if you have large lists.

Therefore, it’s better to do it in two operations having O(n+m) time complexity:

List<Customer> cList = Customer.getAllCustomers();
List<License> lList  = License.getAllLicenses();

Set<?> licenseIDs = lList.stream()
    .map(l -> l.id).filter(Objects::nonNull)
    .collect(Collectors.toSet());

List<Customer> cListFiltered = cList.stream()
    .filter(c -> licenseIDs.contains(c.id))
    .collect(Collectors.toList());

if(cListFiltered.isEmpty()) System.out.println("no matches");
else cListFiltered.forEach(System.out::println);

While the exact Set type returned by collect(Collectors.toSet()) is unspecified, you can expect it to have better than linear lookup, which allows using its contains method in the subsequent stream operation. Note that only the first operation has a filter for null values; since that guarantees that the licenseIDs set does not contain null, customers with a null id are rejected implicitly.
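
To make the implicit null handling concrete, here is a small self-contained sketch. It assumes String ids stored in a directly accessible field; the class name NullIdFilter, the helper matchingCustomerIds, and the sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

class NullIdFilter {

    // Hypothetical stand-ins with a directly accessible String id field.
    static class Customer {
        final String id;
        Customer(String id) { this.id = id; }
    }

    static class License {
        final String id;
        License(String id) { this.id = id; }
    }

    // Returns the ids of the customers that survive the two-step filter.
    static List<String> matchingCustomerIds(List<Customer> cList, List<License> lList) {
        // Step 1: collect license ids; nulls are dropped, so the set never holds null.
        Set<String> licenseIDs = lList.stream()
                .map(l -> l.id).filter(Objects::nonNull)
                .collect(Collectors.toSet());

        // Step 2: no null filter here. This relies on contains(null) simply
        // returning false, which holds for the HashSet that current OpenJDK
        // builds return from Collectors.toSet().
        return cList.stream()
                .filter(c -> licenseIDs.contains(c.id))
                .map(c -> c.id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Customer> cList = Arrays.asList(new Customer("A"), new Customer(null), new Customer("C"));
        List<License> lList = Arrays.asList(new License("A"), new License(null), new License("B"));
        System.out.println(matchingCustomerIds(cList, lList)); // prints [A]
    }
}
```

The customer with a null id is dropped even though a license with a null id exists, because that license never made it into the set.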

It’s easy to get the common IDs instead:

Set<?> commonIDs = cList.stream()
    .map(c -> c.id).filter(licenseIDs::contains)
    .collect(Collectors.toSet());

Using commonIDs, you may filter the list of customers, the list of licenses, or both, if you wish.
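
A possible way to apply the common IDs to both lists is sketched below. It assumes String ids in a directly accessible field; the class name CommonIdFilter, the helpers customerIdsIn and licenseIdsIn, and the sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

class CommonIdFilter {

    // Hypothetical stand-ins with a directly accessible String id field.
    static class Customer {
        final String id;
        Customer(String id) { this.id = id; }
    }

    static class License {
        final String id;
        License(String id) { this.id = id; }
    }

    // The ids that occur in both lists (nulls excluded).
    static Set<String> commonIds(List<Customer> cList, List<License> lList) {
        Set<String> licenseIDs = lList.stream()
                .map(l -> l.id).filter(Objects::nonNull)
                .collect(Collectors.toSet());
        return cList.stream()
                .map(c -> c.id).filter(licenseIDs::contains)
                .collect(Collectors.toSet());
    }

    // Ids of the customers whose id is in the given set, in list order.
    static List<String> customerIdsIn(Set<String> ids, List<Customer> cList) {
        return cList.stream()
                .filter(c -> ids.contains(c.id))
                .map(c -> c.id)
                .collect(Collectors.toList());
    }

    // Ids of the licenses whose id is in the given set, in list order.
    static List<String> licenseIdsIn(Set<String> ids, List<License> lList) {
        return lList.stream()
                .filter(l -> ids.contains(l.id))
                .map(l -> l.id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Customer> cList = Arrays.asList(new Customer("A"), new Customer("B"), new Customer("C"));
        List<License> lList = Arrays.asList(
                new License("B"), new License("C"), new License("C"), new License("D"));

        Set<String> commonIDs = commonIds(cList, lList);
        System.out.println(customerIdsIn(commonIDs, cList)); // prints [B, C]
        System.out.println(licenseIdsIn(commonIDs, lList));  // prints [B, C, C]
    }
}
```

Each filtering pass is linear in the size of its list, so the overall cost stays at O(n+m).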

Solution 3

You cannot perform a retainAll operation between two different types (Customer and License). I suggest finding only the ids that are in both collections and then using them as you wish:

List<Long> customerIds = Clist.stream()
    .map(c -> c.id).filter(Objects::nonNull)
    .collect(Collectors.toList());
List<Long> licenseIds = Llist.stream()
    .map(l -> l.id).filter(Objects::nonNull)
    .collect(Collectors.toList());
List<Long> sharedIds = customerIds
    .stream()
    .filter(customerId -> licenseIds.contains(customerId))
    .collect(Collectors.toList());

Obviously, I can't be sure that id is a Long in your case, but this should work for any type.
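
Since retainAll() does work when both collections hold the same element type, the ids themselves can be intersected with it once they have been extracted. This is a minimal sketch of that idea, not part of the original answer: the class name SharedIds and the generic helper sharedIds are hypothetical, and Long ids are used only as an example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

class SharedIds {

    // Returns the non-null ids present in both collections.
    static <T> Set<T> sharedIds(Collection<T> customerIds, Collection<T> licenseIds) {
        Set<T> shared = new HashSet<>(customerIds);
        shared.remove(null);                          // ignore missing ids
        shared.retainAll(new HashSet<>(licenseIds));  // keep only ids the licenses also have
        return shared;
    }

    public static void main(String[] args) {
        Set<Long> shared = sharedIds(
                Arrays.asList(1L, 2L, 10L, null),
                Arrays.asList(1L, 2L, 30L, null));
        // Exactly the ids 1 and 2 remain.
        System.out.println(shared.size() == 2 && shared.contains(1L) && shared.contains(2L)); // prints true
    }
}
```

Wrapping the second collection in a HashSet keeps each membership check at roughly constant time, rather than scanning a list for every id.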

Author by

Elias Johannes

Updated on June 16, 2022

Comments

  • Elias Johannes
    Elias Johannes almost 2 years

    I have two lists containing several objects. I want to filter the objects that contain the same String value at a specific attribute. So let's say listA contains objects with attribute id. Same for listB, although it contains different objects. Some objects from both lists have the same id though. I want to filter these objects and put them in a new list. This is what I have so far:

    List<Customer> Clist = Customer.getAllCustomers();
    List<License> Llist = License.getAllLicenses();

    Predicate<Customer> customerNotNullPredicate = u -> (u.id != null);
    Predicate<License> licenseNotNullPredicate = u -> (u.id != null);

    List<Customer> Clistfiltered1 = Clist.parallelStream().filter(customerNotNullPredicate).collect(Collectors.toList());
    List<License> Llistfiltered1 = Llist.parallelStream().filter(licenseNotNullPredicate).collect(Collectors.toList());
    Clistfiltered1.retainAll(Llistfiltered1);
    try {
        Clistfiltered1.get(0);
    } catch (Exception e){
        System.out.println(e);
    }
    

    Of course, retainAll() leaves the list empty, as the two lists contain objects of different types. How can I use retainAll() on a specific attribute of the objects?

    Thank you a lot in advance.

    • UninformedUser
      UninformedUser almost 7 years
      You can't. Of course you have to find first the ids that both lists have in common. And then, in a second step you can filter both lists.
    • Holger
      Holger almost 7 years
      It is not clear what you want. You are only filtering for non-null IDs, which has nothing to do with being the same ID. Then, what is the expected result, a list of Customer instances or a list of License instances? Well, and try { Clistfiltered1.get(0); } catch (Exception e){ … } … is that your preferred way to test for list.isEmpty()? Seriously?
    • Elias Johannes
      Elias Johannes almost 7 years
      @Holger As this example comes straight out of the application I'm working on, it might be that I left out some information. The non-null filter is there because a lot of DB entries are null at this attribute (I'm working with a predefined DB) and it throws an exception otherwise. The expected result is a list with only the objects whose id matches in both lists. The try/catch block was just for a quick test to see what error I get. I'm still in the process of learning to code and I learn everything from internet resources. I'm sorry that I'm not familiar with best practices.
  • Holger
    Holger almost 7 years
    There is no reason to copy the List returned by Arrays.asList into an ArrayList when all you need is a List. Further, filter(condition).findAny().isPresent() can be simplified to .anyMatch(condition).
  • Holger
    Holger almost 7 years
    As a side note you should not eagerly use parallelStream() everywhere. Not everything benefits from parallel processing. It may even reduce the performance when being used inappropriately.
  • Elias Johannes
    Elias Johannes almost 7 years
    Although you had issues understanding my question, I have to thank you for your help; I appreciate that a lot, Holger. I already expected really bad performance when working with Lists and streams, but in this case it wasn't important, as I only executed this function once to generate DB entries. But I have more List streaming in my app that performs poorly; is it generally better to use Sets instead?
  • fps
    fps almost 7 years
    There's also no need to traverse the list of licenses for each customer.
  • Holger
    Holger almost 7 years
    It depends on the actual operation. Streams are linear operations, so if there’s a direct collection operation promising better than linear performance (like a Set lookup), you should prefer that, especially when the operation will be combined with another linear (or worse) operation.