How to Transpose (t) in the Tidyverse Using Tidyr

11,215

Solution 1

The general idiom in the tidyverse is to gather() your data to the maximal extent, forming a "long" data frame with one measurement per row. Then, spread() can revert this long data frame into whichever "wide" format that you like best. This procedure can effectively transpose the data: just gather() all the identifier columns except the row names, and then spread() the row names.

For example, here is how to effectively transpose mtcars:

require(tidyverse)

mtcars %>% 
    rownames_to_column %>%
    gather(variable, value, -rowname) %>% 
    spread(rowname, value)

Your data does not have "row names" as understood in R, but Code1 effectively serves as a row name because it uniquely identifies each (original) row of your data.

Df1 <- Df %>% 
    group_by(Code1, Code2, Level) %>%
    summarise_all(funs(count = sum(!is.na(.)))) %>%
    gather(column, value, -Code1) %>%
    spread(Code1, value)

UPDATE for tidyr 1.0 or higher (late 2019 onwards)

The new pivot_wider() and pivot_longer() functions are now preferred over the older (but still supported) gather() and spread(). Thus the preferred way to transpose mtcars is probably

require(tidyverse)

mtcars %>% 
    rownames_to_column() %>%
    pivot_longer(-rowname, 'variable', 'value') %>%
    pivot_wider(variable, rowname)

Solution 2

library(tidyr)
library(dplyr)

Df <- Df %>% group_by(Code1, Code2, Level) %>%
    summarise_all(funs(count = sum(!is.na(.)))) %>%
    gather(var, val, 2:ncol(Df)) %>%
    spread(Code1, val)
Share:
11,215

Related videos on Youtube

Mike
Author by

Mike

Updated on June 11, 2022

Comments

  • Mike
    Mike almost 2 years

    Using the sample data (bottom), I want to use the code below to group and summarise the data. After this, I want to transpose, but I'm stuck on how to use tidyr to achieve this?

    For context, I'm attempting to recreate an existing table that was created in Excel using knitr::kable, so the final product of my code below is expected to break tidy principles.

    For example:

    library(tidyverse)
    
    Df <- Df %>% group_by(Code1, Code2, Level) %>%
        summarise_all(funs(count = sum(!is.na(.))))
    

    I can add t(.) using the pipe...

    Df <- Df %>% group_by(Code1, Code2, Level) %>%
        summarise_all(funs(count = sum(!is.na(.)))) %>%
        t(.)
    

    or I can add...

    Df <- as.data.frame(t(Df)
    

    Both of these options allow me to transpose, but I'm wondering if there's a tidyverse method of achieving this using tidyr's gather and spread functions? I want to have more control over the process and also want to remove the "V1","V2", etc, that appear as column names when using transpose (t).

    How can I achieve this using tidyverse?

    Sample Code:

    Code1 <- c("H200","H350","H250","T400","T240","T600")
    Code2 <- c("4A","4A","4A","2B","2B","2B")
    Level <- c(1,2,3,1,2,3)
    Q1 <- c(30,40,40,50,60,80)
    Q2 <- c(50,30,50,40,80,30)
    Q3 <- c(30,45,70,42,81,34)
    
    Df <- data.frame(Code1, Code2, Level, Q1, Q2, Q3)
    
  • mx0
    mx0 over 6 years
    While this code may answer the question, providing additional context regarding how and/or why it solves the problem would improve the answer's long-term value.
  • Earlien
    Earlien almost 3 years
    If your data.frame or tibble has different data types, you may have to explicilty coerce them to a consistent data type first, e.g. using apply(mtcars, 2, as.character). Otherwise, pivot_longer may result in an error. (Works fine with mtcars, but might not in general).