Merge maps by key
Solution 1
scala.collection.immutable.IntMap has an intersectionWith method that does precisely what you want (I believe):
import scala.collection.immutable.IntMap
val a = IntMap(1 -> "one", 2 -> "two", 3 -> "three", 4 -> "four")
val b = IntMap(1 -> "un", 2 -> "deux", 3 -> "trois")
val merged = a.intersectionWith(b, (_, av, bv: String) => Seq(av, bv))
This gives you IntMap(1 -> List(one, un), 2 -> List(two, deux), 3 -> List(three, trois)). Note that it correctly ignores the key that only occurs in a.
As a side note: I've often found myself wanting the unionWith, intersectionWith, etc. functions from Haskell's Data.Map in Scala. I don't think there's any principled reason that they should only be available on IntMap, instead of in the base collection.Map trait.
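For keys of any type, helpers in the spirit of Data.Map can be sketched on top of the plain immutable Map. The object name MapMerge and both method signatures below are assumptions made for illustration; nothing like this ships in the standard library.

```scala
// Hypothetical helpers, not part of the Scala standard library.
object MapMerge {
  // Keep only the keys present in both maps and combine their values with f.
  def intersectionWith[K, A, B, C](m1: Map[K, A], m2: Map[K, B])(f: (A, B) => C): Map[K, C] =
    m1.flatMap { case (k, a) => m2.get(k).map(b => (k, f(a, b))) }

  // Keep every key from either map; f is applied only when a key is in both,
  // with the value from m1 as its first argument.
  def unionWith[K, V](m1: Map[K, V], m2: Map[K, V])(f: (V, V) => V): Map[K, V] =
    m2.foldLeft(m1) { case (acc, (k, v)) =>
      acc.updated(k, acc.get(k).fold(v)(f(_, v)))
    }
}

val a = Map(1 -> "one", 2 -> "two", 3 -> "three", 4 -> "four")
val b = Map(1 -> "un", 2 -> "deux", 3 -> "trois")

// Key 4 occurs only in a, so it is dropped from the intersection.
val both = MapMerge.intersectionWith(a, b)(Seq(_, _))

// Values are summed only for the shared key "y".
val summed = MapMerge.unionWith(Map("x" -> 1, "y" -> 2), Map("y" -> 10, "z" -> 20))(_ + _)
```

Unlike IntMap's intersectionWith, this version looks up every key of the first map in the second, so it makes no claim about matching IntMap's performance.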
Solution 2
val a = Map(1 -> "one", 2 -> "two", 3 -> "three")
val b = Map(1 -> "un", 2 -> "deux", 3 -> "trois")
val c = a.toList ++ b.toList
val d = c.groupBy(_._1).map{case(k, v) => k -> v.map(_._2).toSeq}
//res0: scala.collection.immutable.Map[Int,Seq[java.lang.String]] =
//Map((2,List(two, deux)), (1,List(one, un)), (3,List(three, trois)))
Solution 3
Scalaz adds a method |+| for any type A for which a Semigroup[A] is available.
If you mapped your Maps so that each value was a single-element sequence, then you could use this quite simply:
scala> a.mapValues(Seq(_)) |+| b.mapValues(Seq(_))
res3: scala.collection.immutable.Map[Int,Seq[java.lang.String]] = Map(1 -> List(one, un), 2 -> List(two, deux), 3 -> List(three, trois))
Solution 4
Starting with Scala 2.13, you can use groupMap, which (as its name suggests) is the equivalent of a groupBy followed by a map on the values:
// val map1 = Map(1 -> "one", 2 -> "two", 3 -> "three")
// val map2 = Map(1 -> "un", 2 -> "deux", 3 -> "trois")
(map1.toSeq ++ map2).groupMap(_._1)(_._2)
// Map(1 -> List("one", "un"), 2 -> List("two", "deux"), 3 -> List("three", "trois"))
This:
- Concatenates the two maps as a sequence of tuples (List((1, "one"), (2, "two"), (3, "three"), (1, "un"), (2, "deux"), (3, "trois"))). For conciseness, map2 is implicitly converted to a Seq to align with map1.toSeq's type, but you could choose to make it explicit by using map2.toSeq.
- groups elements based on their first tuple part (_._1) (the group part of groupMap)
- maps grouped values to their second tuple part (_._2) (the map part of groupMap)
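The steps above can be spelled out one at a time. This is a minimal sketch that assumes Scala 2.13 or later. The extra key 4 in map1 and the names pairs, viaGroupMap and viaGroupBy are additions made for illustration, to show what happens to a key that exists in only one map.

```scala
// Requires Scala 2.13+ (groupMap was introduced in 2.13).
val map1 = Map(1 -> "one", 2 -> "two", 3 -> "three", 4 -> "four") // key 4 added for illustration
val map2 = Map(1 -> "un", 2 -> "deux", 3 -> "trois")

// Step 1: concatenate both maps into a single sequence of (key, value) tuples.
val pairs: Seq[(Int, String)] = map1.toSeq ++ map2.toSeq

// Steps 2 and 3 in one call: group by the key, keep only the values.
val viaGroupMap: Map[Int, Seq[String]] = pairs.groupMap(_._1)(_._2)

// The same result written as a groupBy followed by a map over each group.
val viaGroupBy: Map[Int, Seq[String]] =
  pairs.groupBy(_._1).map { case (k, kvs) => (k, kvs.map(_._2)) }
```

Note that a key found in only one map is kept with a single-element sequence (here 4 -> List("four")), which differs from the intersection behaviour of Solution 1.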
Solution 5
val en = Map(1 -> "one", 2 -> "two", 3 -> "three")
val fr = Map(1 -> "un", 2 -> "deux", 3 -> "trois")
// Keep only the keys present in both maps, pairing up their values.
def innerJoin[K, A, B](m1: Map[K, A], m2: Map[K, B]): Map[K, (A, B)] = {
  m1.flatMap { case (k, a) =>
    m2.get(k).map(b => Map((k, (a, b)))).getOrElse(Map.empty[K, (A, B)])
  }
}
innerJoin(en, fr) // Map(1 -> ("one", "un"), 2 -> ("two", "deux"), 3 -> ("three", "trois")): Map[Int, (String, String)]
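innerJoin drops any key that is missing from either map. If the keys of the first map should survive regardless, a left outer variant could look like the sketch below. The name leftOuterJoin and its signature are hypothetical and are not part of the original answer.

```scala
// Hypothetical companion to innerJoin: keeps every key of m1 and
// pairs it with Some(value) from m2, or None when m2 lacks the key.
def leftOuterJoin[K, A, B](m1: Map[K, A], m2: Map[K, B]): Map[K, (A, Option[B])] =
  m1.map { case (k, a) => (k, (a, m2.get(k))) }

val left = Map(1 -> "one", 2 -> "two", 4 -> "four")
val right = Map(1 -> "un", 2 -> "deux", 3 -> "trois")

// Key 4 is kept with None; key 3 (only in right) is dropped.
val joined = leftOuterJoin(left, right)
```

A right outer join is the same function with the arguments swapped, and a full outer join would iterate over the union of both key sets.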
Submonoid
Updated on July 09, 2022
Comments
-
Submonoid almost 2 years
Say I have two maps:
val a = Map(1 -> "one", 2 -> "two", 3 -> "three")
val b = Map(1 -> "un", 2 -> "deux", 3 -> "trois")
I want to merge these maps by key, applying some function to collect the values. In this particular case I want to collect them into a Seq, giving:
val c = Map(1 -> Seq("one", "un"), 2 -> Seq("two", "deux"), 3 -> Seq("three", "trois"))
It feels like there should be a nice, idiomatic way of doing this.
-
user unknown over 12 years
You should state how to handle elements that exist in only one Map, preferably in the example data for easy testing, to avoid ambiguity.
-
Submonoid over 12 years
Actually, my values in the real case are Sequences, but I want to combine them by building into another sequence, rather than by appending one to the other.
-
Cristiano Fontes over 12 years
Would you mind explaining that _._1 for a complete Scala newbie?
-
Infinity over 12 years
A Map is a collection of Tuple2s. For example, given val tuple: Tuple3[Int, Int, String] = (100, 10, "one"), you can get the string "one" with tuple._3. Tuples are useful, e.g., if you want to return more than one value.
-
Ben James over 12 years
I'm not sure if I understand you, sorry - do you want the values to be nested sequences or not?
-
om-nom-nom over 12 years
And the first part of _._1 (the underscore before the dot) is an anonymous name for the argument. For example: List(1,2,3,4).map(_.toDouble) will convert all of the list members to Double. It is like i in for(i <- List(1,2,3,4)) ...
-
Submonoid over 12 years
Yes, I would want nested sequences, which I could do by wrapping my existing sequences in a Seq, but this feels somewhat like cheating - and in other cases I might want to use a completely different combiner that wouldn't fit into the semigroup structure - giving the size of the intersection of the value sequences, for example.
-
Ben James over 12 years
I see, admittedly this is just a cheat that I would consider using in this particular situation, not a general solution.
-
Joshua Hartman over 12 years
You've implemented a hash join. You could write different methods for each type of join, like left outer, right outer, outer, and inner that would give you the behavior you needed in each circumstance.
-
Luigi Plinge over 12 years
+1, but you can simplify by leaving off the final .toSeq as it doesn't do anything useful
-
Travis Brown over 12 years
Note that IntMap's intersectionWith handles the case of a key only occurring in one map as you specify here.
-
Travis Brown over 12 years
This doesn't correctly handle cases where a key is in one map but not the other, and rebuilding the map also makes it more expensive than intersectionWith, which is linear with the total number of elements.
-
Submonoid over 12 years
unionWith, intersectionWith etc. look exactly like what I'm looking for. Just a shame they're in the wrong language!
-
OliverKK over 8 years
I just tested the scalaz functionality intersectionWith and found that keys which occur only in b are ignored, as are keys which occur only in a.
-
Markus Knecht over 8 years
A shorter, more efficient alternative that does the same would be:
def merge[A,B,C](a : Map[A,B], b : Map[A,B])(c : (B,B) => C) = {
  for((k,v1) <- a; v2 <- b.get(k)) yield (k, c(v1, v2))
}
merge(a,b){Seq(_,_)}
-
mpr about 7 years
What if keys are of type String, or any other type?
-
Travis Brown about 7 years
@mpr Then you'll need to do something like map over the values with List(_) and sum with the monoid instance for maps in Scalaz or Cats (or of course just write your own intersectionWith from scratch).
-
stackexchanger about 5 years
"This doesn't correctly handle cases where a key is in one map but not the other." Doesn't it? I can't see how that fails. Agree with the second point though.
-
gehbiszumeis over 4 years
Please always put your answer in context instead of just pasting code. See here for more details.