How does one do a full join using data.table?
11,655
Solution 1
You actually have it right there. Use merge.data.table
which is exactly what you are doing when you call
merge(a, b, by = "dog", all = TRUE)
since a
is a data.table
, merge(a, b, ...)
calls merge.data.table(a, b, ...)
Solution 2
x= data.table(a=1:5,b=11:15)
y= data.table(a=c(1:4,6),c=c(101:104,106))
setkey(x,a)
setkey(y,a)
unique_keys <- unique(c(x[,a], y[,a]))
y[x[.(unique_keys), on="a"] ] # Full Outer Join
Author by
Paul Murray
Updated on June 17, 2022Comments
-
Paul Murray almost 2 years
In the
data.table
FAQ, thenomatch = NA
parameter is said to be akin to an outer join. However, I haven't been able to getdata.table
to do a full outer join – only right outer joins.For example:
a <- data.table("dog" = c(8:12), "cat" = c(15:19)) dog cat 1: 8 15 2: 9 16 3: 10 17 4: 11 18 5: 12 19 b <- data.table("dog" = 1:10, "bullfrog" = 11:20) dog bullfrog 1: 1 11 2: 2 12 3: 3 13 4: 4 14 5: 5 15 6: 6 16 7: 7 17 8: 8 18 9: 9 19 10: 10 20 setkey(a, dog) setkey(b, dog) a[b, nomatch = NA] dog cat bullfrog 1: 1 NA 11 2: 2 NA 12 3: 3 NA 13 4: 4 NA 14 5: 5 NA 15 6: 6 NA 16 7: 7 NA 17 8: 8 15 18 9: 9 16 19 10: 10 17 20
So,
nomatch = NA
produces a right outer join (which is the default). What if I need a full join? For example:merge(a, b, by = "dog", all = TRUE) # Or with plyr: join(a, b, by = "dog", type = "full") dog cat bullfrog 1: 1 NA 11 2: 2 NA 12 3: 3 NA 13 4: 4 NA 14 5: 5 NA 15 6: 6 NA 16 7: 7 NA 17 8: 8 15 18 9: 9 16 19 10: 10 17 20 11: 11 18 NA 12: 12 19 NA
Is that possible with
data.table
? -
Ameya almost 6 yearsIs there any way we can achieve the same result with the
A[B]
syntax? -
Feng Jiang over 4 yearsBrilliant! To make it easier to understand, I'd like to break the last two lines to three: 1)
unique_keys <- data.table(a = unique(c(x[,a], y[,a])))
2)xx <- x[unique_keys]
3)y[xx]