Non-redundant version of expand.grid
Solution 1
How about using outer
? But this particular function concatenates them into one character string.
outer( c("aa", "ab", "cc"), c("aa", "ab", "cc") , "paste" )
# [,1] [,2] [,3]
#[1,] "aa aa" "aa ab" "aa cc"
#[2,] "ab aa" "ab ab" "ab cc"
#[3,] "cc aa" "cc ab" "cc cc"
You can also use combn
on the unique elements of the two vectors if you don't want the repeating elements (e.g. aa aa
)
vals <- c( c("aa", "ab", "cc"), c("aa", "ab", "cc") )
vals <- unique( vals )
combn( vals , 2 )
# [,1] [,2] [,3]
#[1,] "aa" "aa" "ab"
#[2,] "ab" "cc" "cc"
Solution 2
In base R, you can use this:
expand.grid.unique <- function(x, y, include.equals=FALSE)
{
x <- unique(x)
y <- unique(y)
g <- function(i)
{
z <- setdiff(y, x[seq_len(i-include.equals)])
if(length(z)) cbind(x[i], z, deparse.level=0)
}
do.call(rbind, lapply(seq_along(x), g))
}
Results:
> x <- c("aa", "ab", "cc")
> y <- c("aa", "ab", "cc")
> expand.grid.unique(x, y)
[,1] [,2]
[1,] "aa" "ab"
[2,] "aa" "cc"
[3,] "ab" "cc"
> expand.grid.unique(x, y, include.equals=TRUE)
[,1] [,2]
[1,] "aa" "aa"
[2,] "aa" "ab"
[3,] "aa" "cc"
[4,] "ab" "ab"
[5,] "ab" "cc"
[6,] "cc" "cc"
Solution 3
If the two vectors are the same, there's the combinations
function in the gtools
package:
library(gtools)
combinations(n = 3, r = 2, v = c("aa", "ab", "cc"), repeats.allowed = TRUE)
# [,1] [,2]
# [1,] "aa" "aa"
# [2,] "aa" "ab"
# [3,] "aa" "cc"
# [4,] "ab" "ab"
# [5,] "ab" "cc"
# [6,] "cc" "cc"
And without "aa" "aa"
, etc.
combinations(n = 3, r = 2, v = c("aa", "ab", "cc"), repeats.allowed = FALSE)
Solution 4
The previous answers were lacking a way to get a specific result, namely to keep the self-pairs but remove the ones with different orders. The gtools package has two functions for these purposes, combinations
and permutations
. According to this website:
- When the order doesn't matter, it is a Combination.
- When the order does matter it is a Permutation.
In both cases, we have the decision to make of whether repetitions are allowed or not, and correspondingly, both functions have a repeats.allowed
argument, yielding 4 combinations (deliciously meta!). It's worth going over each of these. I simplified the vector to single letters for ease of understanding.
Permutations with repetition
The most expansive option is to allow both self-relations and differently ordered options:
> permutations(n = 3, r = 2, repeats.allowed = T, v = c("a", "b", "c"))
[,1] [,2]
[1,] "a" "a"
[2,] "a" "b"
[3,] "a" "c"
[4,] "b" "a"
[5,] "b" "b"
[6,] "b" "c"
[7,] "c" "a"
[8,] "c" "b"
[9,] "c" "c"
which gives us 9 options. This value can be found from the simple formula n^r
i.e. 3^2=9
. This is the Cartesian product/join for users familiar with SQL.
There are two ways to limit this: 1) remove self-relations (disallow repetitions), or 2) remove differently ordered options (i.e. combinations).
Combinations with repetitions
If we want to remove differently ordered options, we use:
> combinations(n = 3, r = 2, repeats.allowed = T, v = c("a", "b", "c"))
[,1] [,2]
[1,] "a" "a"
[2,] "a" "b"
[3,] "a" "c"
[4,] "b" "b"
[5,] "b" "c"
[6,] "c" "c"
which gives us 6 options. The formula for this value is (r+n-1)!/(r!*(n-1)!)
i.e. (2+3-1)!/(2!*(3-1)!)=4!/(2*2!)=24/4=6
.
Permutations without repetition
If instead we want to disallow repetitions, we use:
> permutations(n = 3, r = 2, repeats.allowed = F, v = c("a", "b", "c"))
[,1] [,2]
[1,] "a" "b"
[2,] "a" "c"
[3,] "b" "a"
[4,] "b" "c"
[5,] "c" "a"
[6,] "c" "b"
which also gives us 6 options, but different ones! The number of options is the same as above but it's a coincidence. The value can be found from the formula n!/(n-r)!
i.e. (3*2*1)/(3-2)!=6/1!=6
.
Combinations without repetitions
The most limiting is when we want neither self-relations/repetitions or differently ordered options, in which case we use:
> combinations(n = 3, r = 2, repeats.allowed = F, v = c("a", "b", "c"))
[,1] [,2]
[1,] "a" "b"
[2,] "a" "c"
[3,] "b" "c"
which gives us only 3 options. The number of options can be calculated from the rather complex formula n!/(r!(n-r)!)
i.e. 3*2*1/(2*1*(3-2)!)=6/(2*1!)=6/2=3
.
Solution 5
Try:
factors <- c("a", "b", "c")
all.combos <- t(combn(factors,2))
[,1] [,2]
[1,] "a" "b"
[2,] "a" "c"
[3,] "b" "c"
This will not include duplicates of each factor (e.g. "a" "a"), but you can add those on easily if needed.
dup.combos <- cbind(factors,factors)
factors factors
[1,] "a" "a"
[2,] "b" "b"
[3,] "c" "c"
all.combos <- rbind(all.combos,dup.combos)
factors factors
[1,] "a" "b"
[2,] "a" "c"
[3,] "b" "c"
[4,] "a" "a"
[5,] "b" "b"
[6,] "c" "c"
Michele
Recently moved to an IT consultancy position after having spent 6 years studying condensed matter physics.
Updated on June 06, 2022Comments
-
Michele almost 2 years
The
R
functionexpand.grid
returns all possible combination between the elements of supplied parameters. e.g.> expand.grid(c("aa", "ab", "cc"), c("aa", "ab", "cc")) Var1 Var2 1 aa aa 2 ab aa 3 cc aa 4 aa ab 5 ab ab 6 cc ab 7 aa cc 8 ab cc 9 cc cc
Do you know an efficient way to get directly (so without any row comparison after
expand.grid
) only the 'unique' combinations between the supplied vectors? The output will beVar1 Var2 1 aa aa 2 ab aa 3 cc aa 5 ab ab 6 cc ab 9 cc cc
EDIT the combination of each element with itself could be eventually discarded from the answer. I don't actually need it in my program even though (mathematically)
aa aa
would be one (regular) unique combination between one element ofVar1
and another ofvar2
.The solution needs to produce pairs of elements from both vectors (i.e. one from each of the input vectors - so that it could be applied to more than 2 inputs)
-
Michele almost 11 yearsoh no, thanks. I need them separate to pass to other function. the previous was perfect! I actually realised I don't need
aa aa
,bb bb
andcc cc
. so if you re-edit back and I'll accept it :-) -
CoderGuy123 over 7 yearsI note that repeats.allowed also removes the self-pairs such as ("aa", "aa"), which are not actually repeated. There is a missing middle way with this function.
-
CoderGuy123 over 7 yearsI note that it is missing a way to get the self-relationships included, but not the duplicate ones. For instance ("aa", "aa") is not a duplicate for it only appears only. In some cases, one wants the self-relationships too, but not the duplicates.
-
Joseph Wood almost 3 yearsDon't want to offend, but I don't see how this answer is the most upvoted accepted answer and continuing to get upvotes. Both approaches are incorrect. The first proposal with
outer
is nearly identical toexpand.grid
, it is just in a different format. It is also limited to only 2 inputs. For the 2nd approach,combn
can only act on a single vector. If you have vectors with different elements the approach suggested here will give many incorrect results. Continuing... -
Joseph Wood almost 3 yearsThe most extreme example is when we have disjoint vectors E.g.
v1 = 1:5, v2 = 6:10
. The correct solution should be isomorphic toexpand.grid(1:5, 6:10)
, which only has 25 results, howevercombn(1:10, 2)
has 45 results. Again, I don't mean to offend, I just want to help to clarify for future users. -
Joseph Wood almost 3 yearsI have added an answer that addresses these concerns: stackoverflow.com/a/68050873/4408538