Java 8 Streams: Map the same object multiple times based on different properties
I think your alternatives 2 and 3 can be rewritten more clearly:
Alternative 2:
    Map<String, Customer> res2 = customers.stream()
        .flatMap(
            c -> Stream.of(c.first, c.last)
                .map(k -> new AbstractMap.SimpleImmutableEntry<>(k, c))
        ).collect(toMap(Map.Entry::getKey, Map.Entry::getValue));
Alternative 3: Your code abuses reduce by mutating the HashMap. To do mutable reduction, use collect:
    Map<String, Customer> res3 = customers.stream()
        .collect(
            HashMap::new,
            (m, c) -> { m.put(c.first, c); m.put(c.last, c); },
            HashMap::putAll
        );
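To illustrate why collect is the safer tool for mutable reduction, here is a minimal sketch that runs the same three-argument collect on both a sequential and a parallel stream and compares the results. The Customer class mirrors the one in the question; the class name CollectDemo and the helper index are assumptions made for this illustration only.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

public class CollectDemo {
    // Same shape as the Customer POJO from the question.
    static class Customer {
        final String first;
        final String last;

        Customer(String first, String last) {
            this.first = first;
            this.last = last;
        }
    }

    // Hypothetical helper: indexes every customer under first and last name.
    // collect() asks the supplier for a fresh HashMap per partial result and
    // merges partial maps with putAll, so no map is shared between threads.
    static Map<String, Customer> index(Stream<Customer> customers) {
        return customers.collect(
                HashMap::new,
                (m, c) -> { m.put(c.first, c); m.put(c.last, c); },
                HashMap::putAll);
    }

    public static void main(String[] args) {
        List<Customer> customers = Arrays.asList(
                new Customer("Johnny", "Puma"),
                new Customer("Super", "Mac"));

        Map<String, Customer> sequential = index(customers.stream());
        Map<String, Customer> parallel = index(customers.parallelStream());

        // Both maps hold the same four keys pointing at the same instances.
        System.out.println(sequential.equals(parallel)); // prints true
    }
}
```

Unlike the reduce version, nothing here depends on a single shared HashMap, which is what makes the parallel run well-defined.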
Note that these two alternatives are not identical. Alternative 2 will throw an IllegalStateException if there are duplicate keys, while Alternative 3 will silently overwrite the earlier entries.
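The difference is easy to reproduce. The following is a minimal sketch, assuming hypothetical sample data in which "Mac" is the last name of one customer and the first name of another; the class name DuplicateKeyDemo and the method names viaToMap and viaCollect are illustrative and not from the question.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

import static java.util.stream.Collectors.toMap;

public class DuplicateKeyDemo {
    static class Customer {
        final String first;
        final String last;

        Customer(String first, String last) {
            this.first = first;
            this.last = last;
        }

        @Override
        public String toString() {
            return "Customer(" + first + " " + last + ")";
        }
    }

    // Alternative 2: the two-argument toMap rejects duplicate keys.
    static Map<String, Customer> viaToMap(List<Customer> customers) {
        return customers.stream()
                .flatMap(c -> Stream.of(c.first, c.last)
                        .map(k -> new AbstractMap.SimpleImmutableEntry<>(k, c)))
                .collect(toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    // Alternative 3: Map.put overwrites, so the last customer seen wins.
    static Map<String, Customer> viaCollect(List<Customer> customers) {
        return customers.stream().collect(
                HashMap::new,
                (m, c) -> { m.put(c.first, c); m.put(c.last, c); },
                HashMap::putAll);
    }

    public static void main(String[] args) {
        // Hypothetical data: the key "Mac" is produced by both customers.
        List<Customer> customers = Arrays.asList(
                new Customer("Super", "Mac"),
                new Customer("Mac", "Donald"));

        System.out.println(viaCollect(customers).get("Mac")); // Customer(Mac Donald)

        try {
            viaToMap(customers);
        } catch (IllegalStateException e) {
            System.out.println("toMap rejected a duplicate key");
        }
    }
}
```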
If overwriting entries in case of duplicate keys is what you want, I would personally prefer Alternative 3. It is immediately clear to me what it does, and it most closely resembles the iterative solution. I would also expect it to perform better, since Alternative 2 allocates an inner stream and two map entries per customer for the flatMap step.
However, Alternative 2 has a huge advantage over Alternative 3 by separating the production of entries from their aggregation. This gives you a great deal of flexibility. For example, if you want to change Alternative 2 to overwrite entries on duplicate keys instead of throwing an exception, you would simply add (a,b) -> b to toMap(...). If you decide you want to collect matching entries into a list, all you would have to do is replace toMap(...) with groupingBy(...), etc.
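Both variations can be sketched on top of one shared entry-producing stage. This is only an illustration under assumptions: the class name EntryAggregationDemo, the helper methods entries, lastWins and grouped, and the sample data are hypothetical. The explicit type witness on map is there so the helper returns a plain Stream of Map.Entry.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.mapping;
import static java.util.stream.Collectors.toList;
import static java.util.stream.Collectors.toMap;

public class EntryAggregationDemo {
    static class Customer {
        final String first;
        final String last;

        Customer(String first, String last) {
            this.first = first;
            this.last = last;
        }

        @Override
        public String toString() {
            return "Customer(" + first + " " + last + ")";
        }
    }

    // Production stage: one (name, customer) entry per property.
    static Stream<Map.Entry<String, Customer>> entries(List<Customer> customers) {
        return customers.stream()
                .flatMap(c -> Stream.of(c.first, c.last)
                        .<Map.Entry<String, Customer>>map(
                                k -> new AbstractMap.SimpleImmutableEntry<>(k, c)));
    }

    // Aggregation variant 1: on duplicate keys keep the later value.
    static Map<String, Customer> lastWins(List<Customer> customers) {
        return entries(customers)
                .collect(toMap(Map.Entry::getKey, Map.Entry::getValue, (a, b) -> b));
    }

    // Aggregation variant 2: collect all customers sharing a key into a list.
    static Map<String, List<Customer>> grouped(List<Customer> customers) {
        return entries(customers)
                .collect(groupingBy(Map.Entry::getKey,
                        mapping(Map.Entry::getValue, toList())));
    }

    public static void main(String[] args) {
        // Hypothetical data: the key "Mac" is produced by both customers.
        List<Customer> customers = Arrays.asList(
                new Customer("Super", "Mac"),
                new Customer("Mac", "Donald"));

        System.out.println(lastWins(customers).get("Mac")); // Customer(Mac Donald)
        System.out.println(grouped(customers).get("Mac"));  // [Customer(Super Mac), Customer(Mac Donald)]
    }
}
```

Only the collector changes between the two variants; the flatMap stage that produces the entries stays untouched.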
wassgren
Updated on July 09, 2022
wassgren almost 2 years
I was presented with an interesting problem by a colleague of mine and I was unable to find a neat and pretty Java 8 solution. The problem is to stream through a list of POJOs and then collect them in a map based on multiple properties, so that each POJO occurs multiple times in the map.
Imagine the following POJO:
    private static class Customer {
        public String first;
        public String last;

        public Customer(String first, String last) {
            this.first = first;
            this.last = last;
        }

        public String toString() {
            return "Customer(" + first + " " + last + ")";
        }
    }
Set it up as a List<Customer>:
    // The list of customers
    List<Customer> customers = Arrays.asList(
        new Customer("Johnny", "Puma"),
        new Customer("Super", "Mac"));
Alternative 1: Use a Map outside of the "stream" (or rather outside forEach).
    // Alt 1: not pretty since the resulting map is "outside" of
    // the stream. If parallel streams are used it must be
    // ConcurrentHashMap
    Map<String, Customer> res1 = new HashMap<>();
    customers.stream().forEach(c -> {
        res1.put(c.first, c);
        res1.put(c.last, c);
    });
Alternative 2: Create map entries and stream them, then flatMap them. IMO it is a bit too verbose and not so easy to read.
    // Alt 2: A bit verbose and "new AbstractMap.SimpleEntry" feels as
    // a "hard" dependency to AbstractMap
    Map<String, Customer> res2 = customers.stream()
        .map(p -> {
            Map.Entry<String, Customer> firstEntry = new AbstractMap.SimpleEntry<>(p.first, p);
            Map.Entry<String, Customer> lastEntry = new AbstractMap.SimpleEntry<>(p.last, p);
            return Stream.of(firstEntry, lastEntry);
        })
        .flatMap(Function.identity())
        .collect(Collectors.toMap(
            Map.Entry::getKey, Map.Entry::getValue));
Alternative 3: This is another one that I came up with. It is the "prettiest" code so far, but it uses the three-arg version of reduce, and the third parameter is a bit dodgy, as found in this question: Purpose of third argument to 'reduce' function in Java 8 functional programming. Furthermore, reduce does not seem like a good fit for this problem since it is mutating, and parallel streams may not work with the approach below.
    // Alt 3: using reduce. Not so pretty
    Map<String, Customer> res3 = customers.stream().reduce(
        new HashMap<>(),
        (m, p) -> {
            m.put(p.first, p);
            m.put(p.last, p);
            return m;
        },
        (m1, m2) -> m2 /* <- NOT USED UNLESS PARALLEL */);
If the resulting maps are printed like this:
    System.out.println(res1);
    System.out.println(res2);
    System.out.println(res3);
The result would be:
{Super=Customer(Super Mac), Johnny=Customer(Johnny Puma), Mac=Customer(Super Mac), Puma=Customer(Johnny Puma)}
{Super=Customer(Super Mac), Johnny=Customer(Johnny Puma), Mac=Customer(Super Mac), Puma=Customer(Johnny Puma)}
{Super=Customer(Super Mac), Johnny=Customer(Johnny Puma), Mac=Customer(Super Mac), Puma=Customer(Johnny Puma)}
So, now to my question: How should I, in a Java 8 orderly fashion, stream through the List<Customer> and then somehow collect it as a Map<String, Customer> where you split the whole thing as two keys (first AND last), i.e. the Customer is mapped twice? I do not want to use any 3rd party libraries, and I do not want to use a map outside of the stream as in alt 1. Are there any other nice alternatives?
The full code can be found on hastebin for simple copy-paste to get the whole thing running.