R ddply with multiple variables


Okay, took me a little bit to figure out what you want, but here is a solution:

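library(plyr)  # provides ddply() and .()

cols.to.sub <- paste0("mm", 1:3)  # the value columns to adjust
df1 <- ddply(
  df, .(ID, variable),
  function(x) {
    # Subtract the phase-3 row from every row of this ID/variable group
    x[cols.to.sub] <- t(t(as.matrix(x[cols.to.sub])) - unlist(x[x$phase == 3, cols.to.sub]))
    x
  }
)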

This produces (first 6 rows):

    ID phase variable mm1 mm2 mm3
1  101     1        A  -2  -2  -2
2  101     2        A  -1  -1  -1
3  101     3        A   0   0   0
4  101     1        B  -2  -2  -2
5  101     2        B  -1  -1  -1
6  101     3        B   0   0   0
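
As a cross-check that does not depend on plyr, the same per-group subtraction can be sketched in base R. This is an alternative illustration, not the ddply approach above; the names `pieces`, `ref`, and `df2` are introduced here only for the example, and the data frame is the one from the question.

```r
# Same example data as in the question
df <- data.frame(ID = rep(101:102, each = 9), phase = rep(1:3, 6),
                 variable = rep(LETTERS[1:3], each = 3, times = 2),
                 mm1 = 1:18, mm2 = 19:36, mm3 = 37:54)
cols.to.sub <- paste0("mm", 1:3)

# Split into one piece per ID/variable combination
pieces <- split(df, list(df$ID, df$variable), drop = TRUE)

# Within each piece, subtract the phase-3 row from every row
pieces <- lapply(pieces, function(x) {
  ref <- unlist(x[x$phase == 3, cols.to.sub])
  x[cols.to.sub] <- t(t(as.matrix(x[cols.to.sub])) - ref)
  x
})

# Reassemble the pieces and restore the ID/variable/phase ordering
df2 <- do.call(rbind, pieces)
df2 <- df2[order(df2$ID, df2$variable, df2$phase), ]
rownames(df2) <- NULL
```

With this data, `df2` should hold -2, -1, 0 in each mm column for phases 1, 2, 3 of every group, matching the ddply result.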

Generally speaking the best way to debug this type of issue is to put a browser() statement inside the function you are passing to ddply, so you can examine the objects at your leisure. Doing so would have revealed that:

  1. The data frame passed to your function includes the ID, phase, and variable columns as well as the mm columns, so your mm columns are not the first three (hence the need to define cols.to.sub)
  2. Even if you address that, you can't subtract data frames that have unequal dimensions, so what I do here is convert to a matrix and then take advantage of vector recycling to subtract the one row from every row of the matrix. I need t (transpose) because vector recycling runs column-wise: transposing first lines the subtracted values up with the right columns, and transposing again restores the original layout.
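
To make the recycling point in item 2 concrete, here is a small sketch on a made-up 3 x 3 matrix standing in for one ID/variable group. The names `m`, `ref`, `naive`, `by_col`, and `by_sweep` are illustrative only; `sweep()` is shown as a base R alternative to the double transpose, not as what the answer's code uses.

```r
# Toy matrix standing in for one group's mm columns (values are made up)
m <- matrix(1:9, nrow = 3, dimnames = list(NULL, paste0("mm", 1:3)))
ref <- m[3, ]  # the row to subtract: one value per column

# Plain subtraction recycles ref DOWN each column, so ref[1], ref[2], ref[3]
# get matched to rows 1, 2, 3 instead of to columns mm1, mm2, mm3
naive <- m - ref

# Transposing first pairs each ref value with its own column;
# transposing back restores the original orientation
by_col <- t(t(m) - ref)

# sweep() expresses the same column-wise subtraction directly
by_sweep <- sweep(m, 2, ref)
```

Here `by_col` and `by_sweep` both have rows of -2, -1, and 0, while `naive` does not (its first column is -2, -4, -6).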
Author: Rosa

Updated on June 27, 2022

Comments

  • Rosa, almost 2 years ago

    Here is a simplified data frame that stands in for my real data set:

    df <- data.frame(ID=rep(101:102,each=9),phase=rep(1:3,6),variable=rep(LETTERS[1:3],each=3,times=2),mm1=c(1:18),mm2=c(19:36),mm3=c(37:54))
    

    I would like to group by ID and variable first and then, within each group, subtract the phase 3 values of mm1, mm2, and mm3 from the values in every phase (phases 1 to 3). That would make mm1 to mm3 all -2 in phase 1, all -1 in phase 2, and all 0 in phase 3.

    R throws the error "Error in Ops.data.frame(x, x[3, ]) : - only defined for equally-sized data frames" when I try:

    df1 <- ddply(df, .(ID, variable), function(x) (x - x[3,]))   
    

    Any advice would be greatly appreciated. The output should look like this:

    ID phase variable mm1 mm2 mm3
    101  1      A     -2  -2  -2
    101  2      A     -1  -1  -1
    101  3      A      0   0   0
    101  1      B     -2  -2  -2
    101  2      B     -1  -1  -1
    101  3      B      0   0   0
    101  1      C     -2  -2  -2
    101  2      C     -1  -1  -1
    101  3      C      0   0   0
    102  1      A     -2  -2  -2
    102  2      A     -1  -1  -1
    102  3      A      0   0   0
    102  1      B     -2  -2  -2
    102  2      B     -1  -1  -1
    102  3      B      0   0   0
    102  1      C     -2  -2  -2
    102  2      C     -1  -1  -1
    102  3      C      0   0   0