Using dplyr to group_by and conditionally mutate a dataframe by group
@eipi10's answer works. However, I think you should use case_when instead of ifelse. Like ifelse, it is vectorised; it is also easier to extend when there are several conditions, and it may be faster on larger datasets.
foo %>% group_by(A) %>%
mutate(E = case_when(any(B == 1 & C == 2 & D > 0) ~ 1, TRUE ~ 0))
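For anyone who wants to run this end to end, here is a minimal sketch. It assumes foo is a plain data.frame rebuilt from the sample data in the question; the name result is only for illustration. any() collapses each group to a single TRUE/FALSE, and mutate recycles the resulting 1 or 0 across every row of that group.

```r
library(dplyr)

# Sample data from the question: three values of A, six rows each
foo <- data.frame(
  A = rep(1:3, each = 6),
  B = rep(rep(1:3, each = 2), times = 3),
  C = rep(1:2, times = 9),
  D = c(0.25, 0, 0.5, 0, 0.75, 0.25,
        0, 0.5, 0, 0, 0, 0,
        0.5, 0, 0.25, 1, 0, 0.75)
)

# any() returns one logical value per group; case_when maps it to 1 or 0,
# and mutate recycles that single value to all rows of the group
result <- foo %>%
  group_by(A) %>%
  mutate(E = case_when(any(B == 1 & C == 2 & D > 0) ~ 1, TRUE ~ 0)) %>%
  ungroup()

print(result, n = 18)
```

Only the A == 2 group contains a row with B == 1, C == 2 and D > 0 (the row with D = 0.5), so E is 1 for that group and 0 elsewhere.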
Author: ucsbcoding
Updated on June 22, 2022
I'd like to use dplyr functions to group_by and conditionally mutate a df. Given this sample data:
A  B  C  D
1  1  1  0.25
1  1  2  0
1  2  1  0.5
1  2  2  0
1  3  1  0.75
1  3  2  0.25
2  1  1  0
2  1  2  0.5
2  2  1  0
2  2  2  0
2  3  1  0
2  3  2  0
3  1  1  0.5
3  1  2  0
3  2  1  0.25
3  2  2  1
3  3  1  0
3  3  2  0.75
I want to add a new column E that categorizes each value of A by whether any of its rows has B == 1, C == 2, and D > 0. For each unique value of A with at least one row where all of these conditions hold, E = 1; otherwise E = 0. So, the output should look like this:
A  B  C  D     E
1  1  1  0.25  0
1  1  2  0     0
1  2  1  0.5   0
1  2  2  0     0
1  3  1  0.75  0
1  3  2  0.25  0
2  1  1  0     1
2  1  2  0.5   1
2  2  1  0     1
2  2  2  0     1
2  3  1  0     1
2  3  2  0     1
3  1  1  0.5   0
3  1  2  0     0
3  2  1  0.25  0
3  2  2  1     0
3  3  1  0     0
3  3  2  0.75  0
I initially tried this code but the conditionals don't seem to be working right:
foo$E <- foo %>% group_by(A) %>% mutate(E = {if (B == 1 & C == 2 & D > 0) 1 else 0})
Any insights appreciated. Thanks!
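A note on why the attempt above misbehaves, with a small sketch. Inside a grouped mutate, B == 1 & C == 2 & D > 0 is a logical vector with one element per row of the group, but if () expects a single TRUE or FALSE (from R 4.2.0 a longer condition is an error; older versions warn and use only the first element). In addition, foo$E <- ... assigns the whole piped data frame to one column instead of just the new values. The four-row foo and the name fixed below are made up for illustration.

```r
library(dplyr)

# Hypothetical four-row subset, just to show the mechanics
foo <- data.frame(
  A = c(1, 1, 2, 2),
  B = c(1, 1, 1, 2),
  C = c(1, 2, 2, 2),
  D = c(0.25, 0, 0.5, 0)
)

# any() reduces the per-row logical vector to one value per group,
# as.integer() turns TRUE/FALSE into 1/0; assign the whole result,
# not foo$E, because the pipeline already returns the full data frame
fixed <- foo %>%
  group_by(A) %>%
  mutate(E = as.integer(any(B == 1 & C == 2 & D > 0))) %>%
  ungroup()

print(fixed)
```

Here the A == 1 group has no row meeting all three conditions, while the A == 2 group has one (B = 1, C = 2, D = 0.5), so E comes out as 0 for the first two rows and 1 for the last two.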