R re-arrange dataframe: some rows to columns

Solution 1

Using reshape from base R:

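# Spread each value of cat into its own column, one row per sample
nn <- reshape(d, timevar = "cat", idvar = "sample", direction = "wide")
# Strip the "count." prefix that reshape() adds to the new column names
names(nn) <- sub("^count\\.", "", names(nn))
# sample/cat combinations missing from d come back as NA; set them to 0
nn[is.na(nn)] <- 0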
> nn
  sample k l m n o p q r s  t
1      A 1 0 3 0 5 0 7 0 9  0
2      B 0 2 0 4 0 6 0 8 0 10
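
To see why the renaming and NA-filling steps are needed, here is a minimal, self-contained sketch of what reshape returns before any cleanup. It rebuilds d as defined in the question; wide_raw is a name used here only for illustration.

```r
# Rebuild the example data frame from the question
d <- data.frame(sample = LETTERS[1:2], cat = letters[11:20], count = 1:10)

# Raw wide result, before renaming columns or filling gaps
wide_raw <- reshape(d, timevar = "cat", idvar = "sample", direction = "wide")

# Value columns are named <value column>.<cat value>: count.k, count.l, ...
names(wide_raw)

# sample/cat pairs that never occur in d show up as NA, not 0
is.na(wide_raw[, -1])
```

The prefix comes from the name of the value column, so with a differently named column the pattern to strip would change accordingly.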

Solution 2

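Use dcast from the reshape2 package (load it first with library(reshape2)). Naming value.var explicitly avoids dcast having to guess the value column:

> dcast(d, sample ~ cat, value.var = "count", fill = 0)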
  sample k l m n o p q r s  t
1      A 1 0 3 0 5 0 7 0 9  0
2      B 0 2 0 4 0 6 0 8 0 10

xtabs, from the stats package that ships with base R, is another alternative:

> xtabs(count~sample+cat, d)
      cat
sample  k  l  m  n  o  p  q  r  s  t
     A  1  0  3  0  5  0  7  0  9  0
     B  0  2  0  4  0  6  0  8  0 10

If you prefer the output to be a data.frame, then try:

> as.data.frame.matrix(xtabs(count~sample+cat, d))
  k l m n o p q r s  t
A 1 0 3 0 5 0 7 0 9  0
B 0 2 0 4 0 6 0 8 0 10
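
Note that in the data.frame above, sample ends up as the row names rather than as a column. If a sample column is wanted, as in the desired layout from the question, a minimal base R sketch could look like this (tab and wide are illustrative names, and d is rebuilt from the question so the snippet runs on its own):

```r
# Rebuild the example data frame from the question
d <- data.frame(sample = LETTERS[1:2], cat = letters[11:20], count = 1:10)

# Cross-tabulate: sum of count for each sample/cat pair, 0 where absent
tab <- xtabs(count ~ sample + cat, data = d)

# Convert the table to a data.frame; samples become row names
wide <- as.data.frame.matrix(tab)

# Move the row names into a regular column and reset the row names
wide <- cbind(sample = rownames(wide), wide)
rownames(wide) <- NULL

wide
```

Because xtabs sums count within each sample/cat pair, repeated pairs in d would be added together rather than raising an error, which may or may not be what you want.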
Author by

crs

Updated on August 25, 2021

Comments

  • crs

    I'm not even sure how to title the question properly!

    Suppose I have a dataframe d:

    Current dataframe:

    d <- data.frame(sample = LETTERS[1:2], cat = letters[11:20], count = c(1:10))
    
       sample cat count
    1       A   k     1
    2       B   l     2
    3       A   m     3
    4       B   n     4
    5       A   o     5
    6       B   p     6
    7       A   q     7
    8       B   r     8
    9       A   s     9
    10      B   t    10
    

    and I'm trying to re-arrange things such that each cat value becomes a column of its own, sample remains a column (or becomes the row name), and count will be the values in the new cat columns, with 0 where a sample doesn't have a count for a cat. Like so:

    Desired dataframe layout:

       sample   k   l   m   n   o   p   q   r   s   t
    1       A   1   0   3   0   5   0   7   0   9   0
    2       B   0   2   0   4   0   6   0   8   0  10
    

    What's the best way to go about this?

    This is as far as I've gotten:

    for (i in unique(d$sample)) {
        s <- d[d$sample==i,]
        st <- as.data.frame(t(s[,3]))
        colnames(st) <- s$cat
        rownames(st) <- i
    } 
    

    i.e. looping through the samples in the original data frame, and transposing for each sample subset. So in this case I get

       k m o q s
     A 1 3 5 7 9
    

    and

       l n p r  t
     B 2 4 6 8 10
    

    And this is where I get stuck. I've tried a bunch of things with merge, bind, apply,... but I can't seem to hit on the right thing. Plus, I can't help but wonder if that loop above is a necessary step at all - something with unstack perhaps?

    Needless to say, I'm new to R... If someone can help me out, it would be greatly appreciated!

    PS: The reason I'm trying to re-arrange my dataframe is to make plotting the values easier (i.e. I want to show the actual df in a plot, in table format).

    Thank you!