dplyr::group_by_ with character string input of several variable names
Solution 1
There is no need for interp here; just use as.formula to convert the strings to formulas:
dots = sapply(y, . %>% {as.formula(paste0('~', .))})
mtcars %>% group_by_(.dots = dots)
The reason your interp approach doesn't work is that the expression gives you back the following:
~list(c("cyl", "gear"))
This is not what you want. You could, of course, sapply interp over y, which would be similar to using as.formula above:
dots1 = sapply(y, . %>% {interp(~var, var = .)})
But, in fact, you can also pass y directly:
mtcars %>% group_by_(.dots = y)
The dplyr vignette on non-standard evaluation goes into more detail and explains the difference between these approaches.
Solution 2
slice_rows() from the purrrlyr package (https://github.com/hadley/purrrlyr) groups a data.frame by taking a vector of column names (strings) or positions (integers):
y <- c("cyl", "gear")
mtcars_grp <- mtcars %>% purrrlyr::slice_rows(y)
class(mtcars_grp)
#> [1] "grouped_df" "tbl_df" "tbl" "data.frame"
group_vars(mtcars_grp)
#> [1] "cyl" "gear"
This is particularly useful now that group_by_() has been deprecated.
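For readers on a recent dplyr (1.0.0 or later is assumed here), grouping by a character vector can also be sketched without the underscore verbs, using across() together with all_of(). The helper name group_by_names is hypothetical and only used for illustration; it is not part of dplyr.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): group a data frame by
# column names supplied as a character vector.
group_by_names <- function(data, vars) {
  # all_of() selects exactly the named columns and errors if one is missing
  data %>% group_by(across(all_of(vars)))
}

y <- c("cyl", "gear")
mtcars_grp <- group_by_names(mtcars, y)
group_vars(mtcars_grp)
#> [1] "cyl" "gear"
```

Because all_of() fails loudly on an unknown column name, a typo in the user-supplied vector surfaces as an error rather than as a silently wrong grouping.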
Comments
-
talat almost 2 years
I'm writing a function where the user is asked to define one or more grouping variables in the function call. The data is then grouped using dplyr and it works as expected if there is only one grouping variable, but I haven't figured out how to do it with multiple grouping variables.
Example:
x <- c("cyl")
y <- c("cyl", "gear")
dots <- list(~cyl, ~gear)
library(dplyr)
library(lazyeval)
mtcars %>% group_by_(x) # groups by cyl
mtcars %>% group_by_(y) # groups only by cyl (not gear)
mtcars %>% group_by_(.dots = dots) # groups by cyl and gear, this is what I want.
I tried to turn y into the same as dots using:
mtcars %>% group_by_(.dots = interp(~var, var = list(y)))
#Error: is.call(expr) || is.name(expr) || is.atomic(expr) is not TRUE
How can I use a user-defined input string of more than one variable name (like y in the example) to group the data using dplyr?
(This question is somewhat related to this one, but it is not answered there.)
-
David Arenburg over 9 years
This is why you should start using data.table :)
as.data.table(mtcars)[, sum(carb), y]
j/k. Good question though.
-
talat over 9 years
I might, some day :) But for now I'll stick with dplyr..