Rearrange a data frame by sorting a column within groups

Solution 1

I think the most straightforward way is to order by lvl first and then by v1 within each level:

d[order(d$lvl, d$v1), ]

which gives

   v1  v2 lvl
3 1.2 3.5   a
1 2.2 4.5   a
2 3.2 2.5   a
5 2.2 7.5   b
6 3.2 6.5   b
4 4.2 5.5   b
8 1.2 1.5   c
7 2.2 2.5   c
9 5.2 3.5   c
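
For anyone who needs the opposite direction within each level, here is a minimal sketch of the same order() idea. It rebuilds d from the question; the names asc and desc are just illustrative, and negating the key assumes v1 is numeric.

```r
# Rebuild the example data frame from the question
d <- data.frame(
  v1  = c(2.2, 3.2, 1.2, 4.2, 2.2, 3.2, 2.2, 1.2, 5.2),
  v2  = c(4.5, 2.5, 3.5, 5.5, 7.5, 6.5, 2.5, 1.5, 3.5),
  lvl = c("a", "a", "a", "b", "b", "b", "c", "c", "c")
)

# order() sorts by its first key and uses later keys to break ties,
# so lvl keeps the groups together and v1 sorts inside each group
asc <- d[order(d$lvl, d$v1), ]

# Negating a numeric key reverses the direction for that key only
desc <- d[order(d$lvl, -d$v1), ]

print(asc)
print(desc)
```

The original row names are kept, which makes it easy to see where each row came from.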

Solution 2

Using dplyr, you can add .by_group = TRUE to arrange() to sort the column within each group. Try:

library(dplyr)
d %>%
  group_by(lvl) %>%
  arrange(v1, .by_group = TRUE)
# output
# A tibble: 9 x 3
# Groups:   lvl [3]
     v1    v2    lvl
  <dbl> <dbl> <fctr>
1   1.2   3.5      a
2   2.2   4.5      a
3   3.2   2.5      a
4   2.2   7.5      b
5   3.2   6.5      b
6   4.2   5.5      b
7   1.2   1.5      c
8   2.2   2.5      c
9   5.2   3.5      c
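
To see what .by_group = TRUE changes, the rough comparison below may help. It assumes a dplyr version that supports .by_group (0.7 or later, as far as I know); the names grouped, plain and base are only illustrative.

```r
library(dplyr)

# Rebuild the example data frame from the question
d <- data.frame(
  v1  = c(2.2, 3.2, 1.2, 4.2, 2.2, 3.2, 2.2, 1.2, 5.2),
  v2  = c(4.5, 2.5, 3.5, 5.5, 7.5, 6.5, 2.5, 1.5, 3.5),
  lvl = c("a", "a", "a", "b", "b", "b", "c", "c", "c")
)

# With .by_group = TRUE the grouping variable is used as the first sort key
grouped <- d %>%
  group_by(lvl) %>%
  arrange(v1, .by_group = TRUE)

# Without it, arrange() ignores the grouping and sorts v1 over the whole table
plain <- d %>%
  group_by(lvl) %>%
  arrange(v1)

# Base R reference from Solution 1
base <- d[order(d$lvl, d$v1), ]

print(grouped)
print(plain)
```

The grouped result has the same row order as the base R version, while the plain one mixes the levels.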

Solution 3

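I believe the same dplyr verbs also work on a data.table (with dtplyr loaded). Listing lvl as the first sort key keeps the rows sorted by v1 within each level:

library(data.table)
library(dplyr)
library(dtplyr)

DT <- as.data.table(d)
# sort by lvl first, then by v1 within each level
DT %>% arrange(lvl, v1)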
Author by

user2783615

Updated on July 18, 2022

Comments

  • user2783615
    user2783615 almost 2 years

    What is a good way to perform the following task?

    I have a data frame, for example:

    v2 <- c(4.5, 2.5, 3.5, 5.5, 7.5, 6.5, 2.5, 1.5, 3.5)
    v1 <- c(2.2, 3.2, 1.2, 4.2, 2.2, 3.2, 2.2, 1.2, 5.2)
    lvl <- c("a","a","a","b","b","b","c","c","c")
    d <- data.frame(v1,v2,lvl)
    
    > d
       v1  v2 lvl
    1 2.2 4.5   a
    2 3.2 2.5   a
    3 1.2 3.5   a
    4 4.2 5.5   b
    5 2.2 7.5   b
    6 3.2 6.5   b
    7 2.2 2.5   c
    8 1.2 1.5   c
    9 5.2 3.5   c
    

    Within each level of d$lvl, I want to sort the data frame by value of d$v1. So I want to get

       v1  v2 lvl
    3 1.2 3.5   a
    1 2.2 4.5   a
    2 3.2 2.5   a
    
    5 2.2 7.5   b
    6 3.2 6.5   b
    4 4.2 5.5   b
    
    8 1.2 1.5   c
    7 2.2 2.5   c
    9 5.2 3.5   c
    
  • user2783615
    user2783615 almost 11 years
    Thanks, Frank! What if I want to extract all the rows with medians of v1 for each level?
  • Frank
    Frank almost 11 years
    This is a solution I've seen @eddi use: require(data.table); DT <- data.table(d); DT[DT[,.I[ceiling(.N/2)],by=lvl]$V1]. You can put the stuff separated by ";" on separate lines. Someone answered your question using data.table before, but I guess they deleted the answer. Anyway, I think it's a great package to use if you find yourself wanting to do things "by group" all the time (as I do). If you have another similar question, you might want to post it as a new question on SO. Space and typesetting are limited in comments. :)
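
    For readers without data.table, a rough base R sketch of the same "middle row per group" idea follows. It is not the code from the comment: it assumes the data frame is sorted by lvl and v1 first, and the names sorted, mid_pos and median_rows are made up for illustration. Like the ceiling(.N/2) trick, it picks the lower of the two middle rows when a group has an even number of rows, so it matches the median only for odd group sizes.

```r
# Rebuild the example data frame from the question
d <- data.frame(
  v1  = c(2.2, 3.2, 1.2, 4.2, 2.2, 3.2, 2.2, 1.2, 5.2),
  v2  = c(4.5, 2.5, 3.5, 5.5, 7.5, 6.5, 2.5, 1.5, 3.5),
  lvl = c("a", "a", "a", "b", "b", "b", "c", "c", "c")
)

# Sort within groups first, so the middle position holds the median of v1
sorted <- d[order(d$lvl, d$v1), ]

# Split the row positions by level and keep position ceiling(n / 2) of each
mid_pos <- vapply(
  split(seq_len(nrow(sorted)), sorted$lvl),
  function(i) i[ceiling(length(i) / 2)],
  integer(1)
)

# One row per level
median_rows <- sorted[unname(mid_pos), ]
print(median_rows)
```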