How to reverse the order of a dataframe in R

49,587

Solution 1

Another tidyverse solution and I think the simplest one is:

df %>% map_df(rev)

or using just purrr::map_df we can do map_df(df, rev).

Solution 2

If you just want to reverse the order of the rows in a dataframe, you can do the following:

df<- df[seq(dim(df)[1],1),]

Solution 3

Just for completeness sake. There is actually no need to call seq here. You can just use the :-R-logic:

### Create some sample data
n=252
sampledata<-data.frame(a=sample(letters,n,replace=TRUE),b=rnorm(n,1,0.7),
                       c=rnorm(n,1,0.6),d=runif(n))

### Compare some different ways to reorder the dataframe
myfun1<-function(df=sampledata){df<-df[seq(nrow(df),1),]}
myfun2<-function(df=sampledata){df<-df[seq(dim(df)[1],1),]}
myfun3<-function(df=sampledata){df<-df[dim(df)[1]:1,]}
myfun4<-function(df=sampledata){df<-df[nrow(df):1,]}

### Microbenchmark the functions


microbenchmark::microbenchmark(myfun1(),myfun2(),myfun3(),myfun4(),times=1000L)
    Unit: microseconds
         expr    min     lq      mean  median      uq      max neval
     myfun1() 63.994 67.686 117.61797 71.3780 87.3765 5818.494  1000
     myfun2() 63.173 67.686  99.29120 70.9680 87.7865 2299.258  1000
     myfun3() 56.610 60.302  92.18913 62.7635 76.9155 3241.522  1000
     myfun4() 56.610 60.302  99.52666 63.1740 77.5310 4440.582  1000

The fastest way in my trial here was to use df<-df[dim(df)[1]:1,]. However using nrow instead of dim is only slightly slower. Making this a question of personal preference.

Using seq here definitely slows the process down.

UPDATE September 2018:

From a speed view there is little reason to use dplyr here. For maybe 90% of users the basic R functionality should suffice. The other 10% need to use dplyr for querying a database or need code translation into another language.

## hmhensen's function
dplyr_fun<-function(df=sampledata){df %>% arrange(rev(rownames(.)))}

microbenchmark::microbenchmark(myfun3(),myfun4(),dplyr_fun(),times=1000L)
Unit: microseconds
        expr    min      lq      mean  median      uq    max neval
    myfun3()   55.8   69.75  132.8178  103.85  139.95 8949.3  1000
    myfun4()   55.9   68.40  115.6418  100.05  135.00 2409.1  1000
 dplyr_fun() 1364.8 1541.15 2173.0717 1786.10 2757.80 8434.8  1000

Solution 4

Yet another tidyverse solution is:

df %>% arrange(desc(row_number()))

Solution 5

Another option is to order the list by the vector you want to sort it by,

> data[order(data$Date), ]
# A tibble: 10 x 4
   Date                priceA priceB priceC
   <dttm>               <dbl>  <dbl>  <dbl>
 1 2016-09-27 00:00:00   46.5   43.6   45.2
 2 2016-09-28 00:00:00   49.2   46.1   47.6
 3 2016-09-29 00:00:00   49.8   46.9   48.4
 4 2016-09-30 00:00:00   50.2   47.4   48.8
 5 2016-10-03 00:00:00   50.9   48.1   49.4
 6 2016-10-04 00:00:00   50.9   48.2   49.3
 7 2016-10-05 00:00:00   51.9   49.1   50.4
 8 2016-10-06 00:00:00   52.5   49.7   51.0
 9 2016-10-07 00:00:00   51.9   49.2   50.4
10 2016-10-10 00:00:00   53.1   50.4   51.9

Then if you are so inclined, you want to flip the order, reverse it,

> data[rev(order(data$Date)), ]
# A tibble: 10 x 4
   Date                priceA priceB priceC
   <dttm>               <dbl>  <dbl>  <dbl>
 1 2016-10-10 00:00:00   53.1   50.4   51.9
 2 2016-10-07 00:00:00   51.9   49.2   50.4
 3 2016-10-06 00:00:00   52.5   49.7   51.0
 4 2016-10-05 00:00:00   51.9   49.1   50.4
 5 2016-10-04 00:00:00   50.9   48.2   49.3
 6 2016-10-03 00:00:00   50.9   48.1   49.4
 7 2016-09-30 00:00:00   50.2   47.4   48.8
 8 2016-09-29 00:00:00   49.8   46.9   48.4
 9 2016-09-28 00:00:00   49.2   46.1   47.6
10 2016-09-27 00:00:00   46.5   43.6   45.2
Share:
49,587
jmn8
Author by

jmn8

NOOB in all languages...

Updated on July 10, 2022

Comments

  • jmn8
    jmn8 almost 2 years

    I've endlessly looked for this and somehow nothing has solved this simple problem.

    I have a dataframe called Prices in which there are 4 columns, one of which is a list of historical dates - the other 3 are lists of prices for products.

    1   10/10/2016  53.14   50.366  51.87
    2   07/10/2016  51.93   49.207  50.38
    3   06/10/2016  52.51   49.655  50.98
    4   05/10/2016  51.86   49.076  50.38
    5   04/10/2016  50.87   48.186  49.3
    6   03/10/2016  50.89   48.075  49.4
    7   30/09/2016  50.19   47.384  48.82
    8   29/09/2016  49.81   46.924  48.4
    9   28/09/2016  49.24   46.062  47.65
    10  27/09/2016  46.52   43.599  45.24
    

    The list is 252 prices long. How can I have my output stored with the latest date at the bottom of the list and the corresponding prices listed with the latest prices at the bottom of the list?

  • 5th
    5th over 6 years
    Nice solution. But you do not need seq here. It will only slow your code down.
  • James Hirschorn
    James Hirschorn over 5 years
    Your dplyr solution will not work as expected, because you are sorting by rownames which are of type character. So, ascending, you would have: "1", "10", "100", "2", etc...
  • James Hirschorn
    James Hirschorn over 5 years
    Should be the selected answer IMO.
  • 5th
    5th over 5 years
    @JamesHirschorn Not my solution. This is the dplyr solution from hmhensen. I upvoted your comment though
  • jan-glx
    jan-glx over 3 years
    all of these solutions break for data.frames with zero rows
  • mccurcio
    mccurcio over 3 years
    This should not be overlooked, due to its simplicity!
  • JJGabe
    JJGabe about 3 years
    Agreed...I would give more up votes if I could
  • banbh
    banbh almost 3 years
    This answer generates an error for matrix columns (which are, admittedly, a bit wierd). For example: tibble(a=1:2, b=matrix(1:6, nrow = 2)) %>% map_df(rev) # error.
  • banbh
    banbh almost 3 years
    Note that this answer works for matrix-valued columns. For example: tibble(a=1:2, b=matrix(1:6, nrow = 2)) %>% arrange(desc(row_number())).