stacking columns into 1 column in R

r dataframe rows reshape

53,539

Solution 1

Here's the stack approach:

dat2a <- data.frame(dat[1:2], stack(dat[3:ncol(dat)]))
dat2a
#    ID Time values ind
# 1   1   20      1  U1
# 2   2   20      2  U1
# 3   3   20      2  U1
# 4   1   20      2  U2
# 5   2   20      5  U2
# 6   3   20      5  U2
# 7   1   20      3  U3
# 8   2   20      9  U3
# 9   3   20      6  U3
# 10  1   20      5  U4
# 11  2   20      4  U4
# 12  3   20      4  U4

This is very similar to melt from "reshape2":

library(reshape2)
dat2b <- melt(dat, id.vars=1:2)
dat2b
#    ID Time variable value
# 1   1   20       U1     1
# 2   2   20       U1     2
# 3   3   20       U1     2
# 4   1   20       U2     2
# 5   2   20       U2     5
# 6   3   20       U2     5
# 7   1   20       U3     3
# 8   2   20       U3     9
# 9   3   20       U3     6
# 10  1   20       U4     5
# 11  2   20       U4     4
# 12  3   20       U4     4

And, very similar to @TylerRinker's answer, but not dropping the "times", is to just use sep = "" to help R guess time and variable names.

dat3 <- reshape(dat, direction = "long", idvar=1:2, 
                varying=3:ncol(dat), sep = "", timevar="Measure")
dat3
#        ID Time Measure U
# 1.20.1  1   20       1 1
# 2.20.1  2   20       1 2
# 3.20.1  3   20       1 2
# 1.20.2  1   20       2 2
# 2.20.2  2   20       2 5
# 3.20.2  3   20       2 5
# 1.20.3  1   20       3 3
# 2.20.3  2   20       3 9
# 3.20.3  3   20       3 6
# 1.20.4  1   20       4 5
# 2.20.4  2   20       4 4
# 3.20.4  3   20       4 4

In all three of those, you end up with four columns, not three, like you describe in your desired output. However, as @ndoogan points out, by doing so, you're loosing information about your data. If you're fine with that, you can always drop that column from the resulting data.frame quite easily (for example, dat2a <- dat2a[-4].

Solution 2

With base reshape:

dat <- read.table(text="ID Time U1 U2 U3 U4
1  20    1  2 3  5
2  20    2  5 9  4
3  20    2  5 6  4", header=TRUE)


colnames(dat) <- gsub("([a-zA-Z]*)([0-9])", "\\1.\\2", colnames(dat))
reshape(dat, varying=3:ncol(dat), v.names="U", direction ="long", timevar = "Time", 
    idvar = "ID")

Solution 3

You can also use melt():

library(reshape2)

new_data <- melt(old_data, id.vars=c("ID","Time"),
    value.name = "U")

Then remove the 'variable' column:

new_data$variable <- NULL

53,539

Author by

user2263330

Updated on June 19, 2020

Comments

user2263330 almost 4 years
I have a data frame that looks like:
```
ID Time U1 U2 U3 U4 ...
1  20    1  2 3  5 .. 
2  20    2  5 9  4 ..
3  20    2  5 6  4 ..
.
.

And I would need to keep it like: 

ID Time  U
1  20    1
1  20    2
1  20    3
1  20    5
2  20    2
2  20    5
2  20    9
2  20    4
3  20    2
3  20    5
3  20    6
3  20    4
```
I tried with:
```
X <- read.table("mydata.txt", header=TRUE, sep=",")
X_D <- as.data.frame(X)
X_new <- stack(X_D, select = -c(ID, Time))
```
But I haven't managed to get the data into that form. Honestly, I have little experience with stacking/transposing, so any help is greatly appreciated!
- ndoogan about 11 years
  
  This is commonly called conversion from wide format to long format. However, the way you've defined it, you lose information about which column a piece of data came from. Incidentally, the package reshape2 covers this sort of transformation.
- ndoogan about 11 years
  
  Additionally, you're more likely to get the answer you really want if you provide a really easy way for answerers to get example data into R to work with. Providing what's in the file isn't simple, but providing dput() output of the original example dataframe object IS!
- Phil over 8 years
  
  There's also a solution with tidyr::gather()
- EngrStudent over 6 years
  
  @Phil - can you enumerate it? I'm looking for the tidyr equivalent of the JMP "stack columns" platform.
user2263330 about 11 years

Thanks! It gives me an error though: 'undefined columns selected' here '[.data.frame'(dat, ,[,c("ID", "Time",
Ferdinand.kraft about 11 years

Hi @user2263330, it works for me with dat <- data.frame(ID=1:3, Time=20, U1=1:3, U2=4:6, U3=7:9, U4=10:12). What are the names of your data.frame?
ndoogan about 11 years

(+1) It could be worth specifying what "base reshape" means in the context of another reshape answer that depends on reshape2 package.
Tyler Rinker about 11 years

This should do it for anyone wondering: ?reshape
A5C1D2H2I1M1N2O1R2T1 about 11 years

+1 for matching the output exactly, but I do also worry about loosing information this way.
Tyler Rinker about 11 years

@AnandaMahto Agreed. In this case I didn't see it as a problem so I dropped it to match the user's wants.
Jala015 almost 7 years

Oops, this will actually order the U rows by rows in the old_data