Transpose only certain columns in data.frame

The basic idea is to reshape the data into a "long" format first, and then reshape that into a "wide" format.
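
To make the two steps concrete, here is a minimal base R sketch. It assumes `mydf` looks like the data in the question (only groups A and B are reproduced here), with `v4` stored as text; the names `measure`, `long`, `wide`, and `metric` are chosen purely for illustration.

```r
# Subset of the question's data (groups A and B only); v4 is kept as text.
mydf <- data.frame(
  am    = rep(c("2015-10-31", "2015-11-30", "2015-12-31"), each = 2),
  group = rep(c("A", "B"), times = 3),
  v1    = c(693, 524, 618, 249, 229, 539),
  v2    = c(803, 859, 715, 965, 994, 543),
  v3    = c(700, 302, 200, 215, 720, 332),
  v4    = c("17%", "77%", "38%", "54%", "19%", "57%"),
  stringsAsFactors = FALSE
)

# Step 1: "long" format, one row per (am, group, metric) combination.
# All values become character because v4 holds text such as "17%".
measure <- c("v1", "v2", "v3", "v4")
long <- data.frame(
  am     = rep(mydf$am, times = length(measure)),
  group  = rep(mydf$group, times = length(measure)),
  metric = rep(measure, each = nrow(mydf)),
  value  = unlist(lapply(mydf[measure], as.character), use.names = FALSE),
  stringsAsFactors = FALSE
)

# Step 2: "wide" format, one column per distinct value of am.
wide <- reshape(long, direction = "wide",
                idvar = c("group", "metric"), timevar = "am")

# Tidy up: drop the "value." prefix that reshape() adds, then sort the rows.
names(wide) <- sub("^value\\.", "", names(wide))
wide <- wide[order(wide$group, wide$metric), ]
rownames(wide) <- NULL
wide
```

Because the measure columns mix numbers and percentages, every reshaped value ends up as text; convert individual rows back to numeric afterwards if that is needed.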

Here are a few ways to do this:

melt + dcast

library(data.table) ## or library(reshape2), in which case drop as.data.table()
dcast(melt(as.data.table(mydf), id.vars = c("am", "group")), 
      group + variable ~ am, value.var = "value")

recast

(This is basically the same as above, but in one step.)

library(reshape2)
recast(mydf, group + variable ~ am, id.var = c("am", "group"))

gather + spread

library(dplyr)
library(tidyr)

mydf %>%
  gather(key, value, v1:v4) %>%
  spread(am, value)
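
In current versions of tidyr, `gather()` and `spread()` still work but are marked as superseded by `pivot_longer()` and `pivot_wider()`. The following is a sketch of the same reshaping with the newer functions, assuming tidyr 1.1.0 or later and a `mydf` shaped like the question's data (only groups A and B are reproduced here, with `v4` stored as text). The column name `metric` is an illustrative choice.

```r
library(tidyr)

# Subset of the question's data (groups A and B only); v4 is kept as text.
mydf <- data.frame(
  am    = rep(c("2015-10-31", "2015-11-30", "2015-12-31"), each = 2),
  group = rep(c("A", "B"), times = 3),
  v1    = c(693, 524, 618, 249, 229, 539),
  v2    = c(803, 859, 715, 965, 994, 543),
  v3    = c(700, 302, 200, 215, 720, 332),
  v4    = c("17%", "77%", "38%", "54%", "19%", "57%"),
  stringsAsFactors = FALSE
)

# Long format: v1..v4 mix numeric and character columns, so convert
# everything to character; pivot_longer() will not combine them otherwise.
long <- pivot_longer(mydf, cols = v1:v4,
                     names_to = "metric", values_to = "value",
                     values_transform = list(value = as.character))

# Wide format: one column per distinct value of am.
wide <- pivot_wider(long, names_from = am, values_from = value)
wide
```

The explicit `values_transform` is the main difference from `gather()`, which coerces mixed column types on its own.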

reshape

reshape(cbind(mydf[c(1, 2)], stack(mydf[-c(1, 2)])), 
        direction = "wide", idvar = c("group", "ind"), timevar = "am")
Author by

Ken

My interests include, but are not limited to, predictive modeling, missing data, data imputation, cluster analysis, competing risks, and Bayesian statistics.

Updated on June 25, 2022

Comments

  • Ken, over 1 year ago

    Here is the data I have:

               am   group  v1  v2  v3    v4
    1  2015-10-31       A 693 803 700   17%
    2  2015-10-31       B 524 859 302   77%
    3  2015-10-31       C 266 675  86    7%
    4  2015-10-31       D 376 455 650   65%
    5  2015-11-30       A 618 715 200   38%
    6  2015-11-30       B 249 965 215   54%
    7  2015-11-30       C 881 106 184   24%
    8  2015-11-30       D 033 047 492   46%
    9  2015-12-31       A 229 994 720   19%
    10 2015-12-31       B 539 543 332   57%
    11 2015-12-31       C 100 078 590   24%
    12 2015-12-31       D 517 413 716   57%
    

    Question: How can I reshape the data so that

    1. v1-v4 are transposed into rows,
    2. the values in am become the column names, and
    3. the group variable is repeated once for each of v1-v4?

    The result I'd like to produce:

    group metric 2015-10-31 2015-11-30 2015-12-31
        A     v1        693        618        229
        A     v2        803        715        994 
        A     v3        700        200        720
        A     v4        17%        38%        19%
        B     v1        524        249        539
        B     v2        859        965        543 
        B     v3        302        215        332
        B     v4        77%        54%        57%
        ...
    

    What I have tried so far:

    name <- mydata$am
    data <- as.data.frame(t(mydata[, -1]))
    colnames(data) <- name
    

    This doesn't handle the group variable the way I want.

    Thanks for your help.