Using dplyr to group_by and conditionally mutate a dataframe by group

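@eipi10's answer works. However, I would suggest case_when instead of ifelse. Both are vectorised, but case_when is stricter about result types, stays readable as more conditions are added, and may be faster on larger datasets. Because any() reduces each group's test to a single TRUE or FALSE, every row in a group gets the same value of E.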

foo %>% group_by(A) %>%
  mutate(E = case_when(any(B == 1 & C == 2 & D > 0) ~ 1, TRUE ~ 0))
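
For anyone who wants to run this end to end, here is a minimal reproducible sketch. It assumes the sample data from the question is stored in a data frame called `foo` (the name used in the snippet above). The `data.frame()` construction, the `result` variable, and the trailing `ungroup()` are my additions for illustration, not part of the original answer.

```r
library(dplyr)

# Rebuild the question's sample data (18 rows: 3 values of A x 3 of B x 2 of C).
foo <- data.frame(
  A = rep(1:3, each = 6),
  B = rep(rep(1:3, each = 2), times = 3),
  C = rep(1:2, times = 9),
  D = c(0.25, 0, 0.5, 0, 0.75, 0.25,
        0, 0.5, 0, 0, 0, 0,
        0.5, 0, 0.25, 1, 0, 0.75)
)

# any() collapses the row-wise test to one TRUE/FALSE per group of A,
# and mutate() recycles that single value across the group's rows.
result <- foo %>%
  group_by(A) %>%
  mutate(E = case_when(any(B == 1 & C == 2 & D > 0) ~ 1, TRUE ~ 0)) %>%
  ungroup()  # assumption: drop the grouping so later steps see a plain tibble

# One row per group, to check which groups were flagged.
distinct(result, A, E)
```

With this data only the group A == 2 contains a row where B == 1, C == 2 and D > 0, so that is the only group with E = 1. Since the condition is already a single logical per group, `E = as.integer(any(B == 1 & C == 2 & D > 0))` should give the same 0/1 flags (as integers rather than doubles) if you prefer something shorter.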
Author: ucsbcoding

Updated on June 22, 2022

Comments

  • ucsbcoding, about 2 years ago

    I'd like to use dplyr functions to group_by and conditionally mutate a df. Given this sample data:

    A   B   C   D
    1   1   1   0.25
    1   1   2   0
    1   2   1   0.5
    1   2   2   0
    1   3   1   0.75
    1   3   2   0.25
    2   1   1   0
    2   1   2   0.5
    2   2   1   0
    2   2   2   0
    2   3   1   0
    2   3   2   0
    3   1   1   0.5
    3   1   2   0
    3   2   1   0.25
    3   2   2   1
    3   3   1   0
    3   3   2   0.75
    

    I want a new column E that flags each value of A by whether any of its rows satisfies B == 1, C == 2, and D > 0 at the same time. For each unique value of A that has at least one such row, E = 1 on every row of that group; otherwise E = 0. So, the output should look like this:

    A   B   C   D    E
    1   1   1   0.25 0
    1   1   2   0    0
    1   2   1   0.5  0
    1   2   2   0    0
    1   3   1   0.75 0
    1   3   2   0.25 0
    2   1   1   0    1
    2   1   2   0.5  1
    2   2   1   0    1
    2   2   2   0    1
    2   3   1   0    1
    2   3   2   0    1
    3   1   1   0.5  0
    3   1   2   0    0
    3   2   1   0.25 0
    3   2   2   1    0
    3   3   1   0    0
    3   3   2   0.75 0
    

    I initially tried this code but the conditionals don't seem to be working right:

     foo$E <- foo %>% 
        group_by(A) %>% 
        mutate(E = {if (B == 1 & C == 2 & D > 0) 1 else 0})
    

    Any insights appreciated. Thanks!