How to reverse the order of a dataframe in R
Solution 1
Another tidyverse
solution and I think the simplest one is:
df %>% map_df(rev)
or using just purrr::map_df
we can do map_df(df, rev)
.
Solution 2
If you just want to reverse the order of the rows in a dataframe, you can do the following:
df<- df[seq(dim(df)[1],1),]
Solution 3
Just for completeness sake. There is actually no need to call seq
here. You can just use the :
-R-logic:
### Create some sample data
n=252
sampledata<-data.frame(a=sample(letters,n,replace=TRUE),b=rnorm(n,1,0.7),
c=rnorm(n,1,0.6),d=runif(n))
### Compare some different ways to reorder the dataframe
myfun1<-function(df=sampledata){df<-df[seq(nrow(df),1),]}
myfun2<-function(df=sampledata){df<-df[seq(dim(df)[1],1),]}
myfun3<-function(df=sampledata){df<-df[dim(df)[1]:1,]}
myfun4<-function(df=sampledata){df<-df[nrow(df):1,]}
### Microbenchmark the functions
microbenchmark::microbenchmark(myfun1(),myfun2(),myfun3(),myfun4(),times=1000L)
Unit: microseconds
expr min lq mean median uq max neval
myfun1() 63.994 67.686 117.61797 71.3780 87.3765 5818.494 1000
myfun2() 63.173 67.686 99.29120 70.9680 87.7865 2299.258 1000
myfun3() 56.610 60.302 92.18913 62.7635 76.9155 3241.522 1000
myfun4() 56.610 60.302 99.52666 63.1740 77.5310 4440.582 1000
The fastest way in my trial here was to use df<-df[dim(df)[1]:1,]
. However using nrow
instead of dim
is only slightly slower. Making this a question of personal preference.
Using seq
here definitely slows the process down.
UPDATE September 2018:
From a speed view there is little reason to use dplyr
here. For maybe 90% of users the basic R functionality should suffice. The other 10% need to use dplyr
for querying a database or need code translation into another language.
## hmhensen's function
dplyr_fun<-function(df=sampledata){df %>% arrange(rev(rownames(.)))}
microbenchmark::microbenchmark(myfun3(),myfun4(),dplyr_fun(),times=1000L)
Unit: microseconds
expr min lq mean median uq max neval
myfun3() 55.8 69.75 132.8178 103.85 139.95 8949.3 1000
myfun4() 55.9 68.40 115.6418 100.05 135.00 2409.1 1000
dplyr_fun() 1364.8 1541.15 2173.0717 1786.10 2757.80 8434.8 1000
Solution 4
Yet another tidyverse
solution is:
df %>% arrange(desc(row_number()))
Solution 5
Another option is to order the list by the vector you want to sort it by,
> data[order(data$Date), ]
# A tibble: 10 x 4
Date priceA priceB priceC
<dttm> <dbl> <dbl> <dbl>
1 2016-09-27 00:00:00 46.5 43.6 45.2
2 2016-09-28 00:00:00 49.2 46.1 47.6
3 2016-09-29 00:00:00 49.8 46.9 48.4
4 2016-09-30 00:00:00 50.2 47.4 48.8
5 2016-10-03 00:00:00 50.9 48.1 49.4
6 2016-10-04 00:00:00 50.9 48.2 49.3
7 2016-10-05 00:00:00 51.9 49.1 50.4
8 2016-10-06 00:00:00 52.5 49.7 51.0
9 2016-10-07 00:00:00 51.9 49.2 50.4
10 2016-10-10 00:00:00 53.1 50.4 51.9
Then if you are so inclined, you want to flip the order, reverse it,
> data[rev(order(data$Date)), ]
# A tibble: 10 x 4
Date priceA priceB priceC
<dttm> <dbl> <dbl> <dbl>
1 2016-10-10 00:00:00 53.1 50.4 51.9
2 2016-10-07 00:00:00 51.9 49.2 50.4
3 2016-10-06 00:00:00 52.5 49.7 51.0
4 2016-10-05 00:00:00 51.9 49.1 50.4
5 2016-10-04 00:00:00 50.9 48.2 49.3
6 2016-10-03 00:00:00 50.9 48.1 49.4
7 2016-09-30 00:00:00 50.2 47.4 48.8
8 2016-09-29 00:00:00 49.8 46.9 48.4
9 2016-09-28 00:00:00 49.2 46.1 47.6
10 2016-09-27 00:00:00 46.5 43.6 45.2
Comments
-
jmn8 almost 2 years
I've endlessly looked for this and somehow nothing has solved this simple problem.
I have a dataframe called Prices in which there are 4 columns, one of which is a list of historical dates - the other 3 are lists of prices for products.
1 10/10/2016 53.14 50.366 51.87 2 07/10/2016 51.93 49.207 50.38 3 06/10/2016 52.51 49.655 50.98 4 05/10/2016 51.86 49.076 50.38 5 04/10/2016 50.87 48.186 49.3 6 03/10/2016 50.89 48.075 49.4 7 30/09/2016 50.19 47.384 48.82 8 29/09/2016 49.81 46.924 48.4 9 28/09/2016 49.24 46.062 47.65 10 27/09/2016 46.52 43.599 45.24
The list is 252 prices long. How can I have my output stored with the latest date at the bottom of the list and the corresponding prices listed with the latest prices at the bottom of the list?
-
5th over 6 yearsNice solution. But you do not need
seq
here. It will only slow your code down. -
James Hirschorn over 5 yearsYour
dplyr
solution will not work as expected, because you are sorting by rownames which are of typecharacter
. So, ascending, you would have: "1", "10", "100", "2", etc... -
James Hirschorn over 5 yearsShould be the selected answer IMO.
-
5th over 5 years@JamesHirschorn Not my solution. This is the dplyr solution from hmhensen. I upvoted your comment though
-
jan-glx over 3 yearsall of these solutions break for data.frames with zero rows
-
mccurcio over 3 yearsThis should not be overlooked, due to its simplicity!
-
JJGabe about 3 yearsAgreed...I would give more up votes if I could
-
banbh almost 3 yearsThis answer generates an error for matrix columns (which are, admittedly, a bit wierd). For example:
tibble(a=1:2, b=matrix(1:6, nrow = 2)) %>% map_df(rev) # error
. -
banbh almost 3 yearsNote that this answer works for matrix-valued columns. For example:
tibble(a=1:2, b=matrix(1:6, nrow = 2)) %>% arrange(desc(row_number()))
.