stacking columns into 1 column in R

53,539

Solution 1

Here's the stack approach:

dat2a <- data.frame(dat[1:2], stack(dat[3:ncol(dat)]))
dat2a
#    ID Time values ind
# 1   1   20      1  U1
# 2   2   20      2  U1
# 3   3   20      2  U1
# 4   1   20      2  U2
# 5   2   20      5  U2
# 6   3   20      5  U2
# 7   1   20      3  U3
# 8   2   20      9  U3
# 9   3   20      6  U3
# 10  1   20      5  U4
# 11  2   20      4  U4
# 12  3   20      4  U4

This is very similar to melt from "reshape2":

library(reshape2)
dat2b <- melt(dat, id.vars=1:2)
dat2b
#    ID Time variable value
# 1   1   20       U1     1
# 2   2   20       U1     2
# 3   3   20       U1     2
# 4   1   20       U2     2
# 5   2   20       U2     5
# 6   3   20       U2     5
# 7   1   20       U3     3
# 8   2   20       U3     9
# 9   3   20       U3     6
# 10  1   20       U4     5
# 11  2   20       U4     4
# 12  3   20       U4     4

And, very similar to @TylerRinker's answer, but not dropping the "times", is to just use sep = "" to help R guess time and variable names.

dat3 <- reshape(dat, direction = "long", idvar=1:2, 
                varying=3:ncol(dat), sep = "", timevar="Measure")
dat3
#        ID Time Measure U
# 1.20.1  1   20       1 1
# 2.20.1  2   20       1 2
# 3.20.1  3   20       1 2
# 1.20.2  1   20       2 2
# 2.20.2  2   20       2 5
# 3.20.2  3   20       2 5
# 1.20.3  1   20       3 3
# 2.20.3  2   20       3 9
# 3.20.3  3   20       3 6
# 1.20.4  1   20       4 5
# 2.20.4  2   20       4 4
# 3.20.4  3   20       4 4

In all three of those, you end up with four columns, not three, like you describe in your desired output. However, as @ndoogan points out, by doing so, you're loosing information about your data. If you're fine with that, you can always drop that column from the resulting data.frame quite easily (for example, dat2a <- dat2a[-4].

Solution 2

With base reshape:

dat <- read.table(text="ID Time U1 U2 U3 U4
1  20    1  2 3  5
2  20    2  5 9  4
3  20    2  5 6  4", header=TRUE)


colnames(dat) <- gsub("([a-zA-Z]*)([0-9])", "\\1.\\2", colnames(dat))
reshape(dat, varying=3:ncol(dat), v.names="U", direction ="long", timevar = "Time", 
    idvar = "ID")

Solution 3

You can also use melt():

library(reshape2)

new_data <- melt(old_data, id.vars=c("ID","Time"),
    value.name = "U")

Then remove the 'variable' column:

new_data$variable <- NULL
Share:
53,539
user2263330
Author by

user2263330

Updated on June 19, 2020

Comments

  • user2263330
    user2263330 almost 4 years

    I have a data frame that looks like:

    ID Time U1 U2 U3 U4 ...
    1  20    1  2 3  5 .. 
    2  20    2  5 9  4 ..
    3  20    2  5 6  4 ..
    .
    .
    
    And I would need to keep it like: 
    
    ID Time  U
    1  20    1
    1  20    2
    1  20    3
    1  20    5
    2  20    2
    2  20    5
    2  20    9
    2  20    4
    3  20    2
    3  20    5
    3  20    6
    3  20    4
    

    I tried with:

    X <- read.table("mydata.txt", header=TRUE, sep=",")
    X_D <- as.data.frame(X)
    X_new <- stack(X_D, select = -c(ID, Time))
    

    But I haven't managed to get the data into that form. Honestly, I have little experience with stacking/transposing, so any help is greatly appreciated!

    • ndoogan
      ndoogan about 11 years
      This is commonly called conversion from wide format to long format. However, the way you've defined it, you lose information about which column a piece of data came from. Incidentally, the package reshape2 covers this sort of transformation.
    • ndoogan
      ndoogan about 11 years
      Additionally, you're more likely to get the answer you really want if you provide a really easy way for answerers to get example data into R to work with. Providing what's in the file isn't simple, but providing dput() output of the original example dataframe object IS!
    • Phil
      Phil over 8 years
      There's also a solution with tidyr::gather()
    • EngrStudent
      EngrStudent over 6 years
      @Phil - can you enumerate it? I'm looking for the tidyr equivalent of the JMP "stack columns" platform.
  • user2263330
    user2263330 about 11 years
    Thanks! It gives me an error though: 'undefined columns selected' here '[.data.frame'(dat, ,[,c("ID", "Time",
  • Ferdinand.kraft
    Ferdinand.kraft about 11 years
    Hi @user2263330, it works for me with dat <- data.frame(ID=1:3, Time=20, U1=1:3, U2=4:6, U3=7:9, U4=10:12). What are the names of your data.frame?
  • ndoogan
    ndoogan about 11 years
    (+1) It could be worth specifying what "base reshape" means in the context of another reshape answer that depends on reshape2 package.
  • Tyler Rinker
    Tyler Rinker about 11 years
    This should do it for anyone wondering: ?reshape
  • A5C1D2H2I1M1N2O1R2T1
    A5C1D2H2I1M1N2O1R2T1 about 11 years
    +1 for matching the output exactly, but I do also worry about loosing information this way.
  • Tyler Rinker
    Tyler Rinker about 11 years
    @AnandaMahto Agreed. In this case I didn't see it as a problem so I dropped it to match the user's wants.
  • Jala015
    Jala015 almost 7 years
    Oops, this will actually order the U rows by rows in the old_data