How does one reorder columns in a data frame?
Solution 1
Your data frame has four columns, like so: df[,c(1,2,3,4)].
Note the first comma means keep all the rows, and the 1,2,3,4 refers to the columns.
To change the order as in the question above, do df2 <- df[,c(1,3,2,4)]
If you want to write the result to a CSV file, do write.csv(df2, file="somedf.csv")
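A minimal sketch of the steps above, using the data frame from the question. The temporary file path and the row.names = FALSE argument are assumptions added for illustration; they are not part of the original answer.

```r
# Example data from the question
df <- data.frame(Time = c(1, 2), In = c(2, 3), Out = c(3, 4), Files = c(4, 5))

# A blank row index keeps every row; the column vector sets the new order
df2 <- df[, c(1, 3, 2, 4)]
names(df2)

# Write the reordered copy to a CSV file (a temporary file here)
out_file <- tempfile(fileext = ".csv")
write.csv(df2, file = out_file, row.names = FALSE)
```

Passing row.names = FALSE only keeps the row numbers out of the file; drop it if you want them written.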
Solution 2
# reorder by column name
data <- data[, c("A", "B", "C")] # leave the row index blank to keep all rows
#reorder by column index
data <- data[, c(1,3,2)] # leave the row index blank to keep all rows
Solution 3
You can also use the subset function:
data <- subset(data, select=c(3,2,1))
It is generally better to use the [] operator as in the other answers, but it may be useful to know that subset lets you filter rows and reorder columns in a single command.
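As a rough sketch of that single-command idea, the call below filters rows and reorders columns at once. The example data comes from the question; the Time > 1 condition is an assumption chosen only for illustration.

```r
data <- data.frame(Time = c(1, 2), In = c(2, 3), Out = c(3, 4), Files = c(4, 5))

# Keep only rows where Time > 1 and, in the same call, reorder the columns
res <- subset(data, Time > 1, select = c(Time, Out, In, Files))
res

# The equivalent with the [] operator
res2 <- data[data$Time > 1, c("Time", "Out", "In", "Files")]
res2
```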
Update:
You can also use the select function from the dplyr package:
data = data %>% select(Time, Out, In, Files)
I am not sure about the efficiency, but thanks to dplyr's syntax this solution should be more flexible, especially if you have a lot of columns. For example, the following will reverse the column order of the mtcars dataset:
mtcars %>% select(carb:mpg)
And the following will reorder only some columns, and discard others:
mtcars %>% select(mpg:disp, hp, wt, gear:qsec, starts_with('carb'))
Read more about dplyr's select syntax.
Solution 4
As mentioned in this comment, the standard suggestions for re-ordering columns in a data.frame are generally cumbersome and error-prone, especially if you have a lot of columns.
This function lets you re-arrange columns by position: specify a variable name and the desired position, and don't worry about the other columns.
##arrange df vars by position
##'vars' must be a named vector, e.g. c("var.name"=1)
arrange.vars <- function(data, vars){
    ##stop if not a data.frame (but should work for matrices as well)
    stopifnot(is.data.frame(data))
    ##sort out inputs
    data.nms <- names(data)
    var.nr <- length(data.nms)
    var.nms <- names(vars)
    var.pos <- vars
    ##sanity checks
    stopifnot( !any(duplicated(var.nms)),
               !any(duplicated(var.pos)) )
    stopifnot( is.character(var.nms),
               is.numeric(var.pos) )
    stopifnot( all(var.nms %in% data.nms) )
    stopifnot( all(var.pos > 0),
               all(var.pos <= var.nr) )
    ##prepare output
    out.vec <- character(var.nr)
    out.vec[var.pos] <- var.nms
    out.vec[-var.pos] <- data.nms[ !(data.nms %in% var.nms) ]
    stopifnot( length(out.vec)==var.nr )
    ##re-arrange vars by position
    data <- data[ , out.vec]
    return(data)
}
Now the OP's request becomes as simple as this:
table <- data.frame(Time=c(1,2), In=c(2,3), Out=c(3,4), Files=c(4,5))
table
## Time In Out Files
##1 1 2 3 4
##2 2 3 4 5
arrange.vars(table, c("Out"=2))
## Time Out In Files
##1 1 3 2 4
##2 2 4 3 5
To additionally swap the Time and Files columns you can do this:
arrange.vars(table, c("Out"=2, "Files"=1, "Time"=4))
## Files Out In Time
##1 4 3 2 1
##2 5 4 3 2
Solution 5
A dplyr solution (part of the tidyverse package set) is to use select:
select(table, "Time", "Out", "In", "Files")
# or
select(table, Time, Out, In, Files)
Catherine
Updated on July 08, 2022
Comments
-
Catherine almost 2 years
How would one change this input (with the sequence: time, in, out, files):
Time In Out Files
1    2  3   4
2    3  4   5
To this output (with the sequence: time, out, in, files)?
Time Out In Files
1    3   2  4
2    4   3  5
Here's the dummy R data:
table <- data.frame(Time=c(1,2), In=c(2,3), Out=c(3,4), Files=c(4,5))
table
## Time In Out Files
##1 1 2 3 4
##2 2 3 4 5
-
Herman Toothrot over 10 years
This is OK when you have a limited number of columns, but what if you have, for example, 50 columns? It would take too much time to type all the column numbers or names. What would be a quicker solution?
-
dalloliogm about 10 years
@user4050: in that case you can use the ":" syntax, e.g. df[,c(1,3,2,4,5:50)].
-
kasterma almost 10 years
To put the columns in idcols at the start: idcols <- c("name", "id2", "start", "duration"); cols <- c(idcols, names(df)[-which(names(df) %in% idcols)]); df <- df[cols]
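A hedged variant of the same idea using setdiff() in place of -which(). The data frame and its column names below are made up for illustration.

```r
# Hypothetical data; only the column names matter here
df <- data.frame(value = c(10, 20), id2 = c("a", "b"), name = c("x", "y"),
                 extra = c(TRUE, FALSE))

# Columns that should come first, in the desired order
idcols <- c("name", "id2")

# setdiff() returns the remaining names in their original order
cols <- c(idcols, setdiff(names(df), idcols))
df <- df[cols]
names(df)
```

One reason to prefer setdiff() here: negative indexing with -which() selects nothing at all when which() finds no match, whereas setdiff() simply returns every name.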
-
MERose over 9 years
There are some reasons not to use subset(); see this question.
-
dalloliogm over 9 years
Thank you. In any case I would now use the select function from the dplyr package, instead of subset.
-
Bram Vanroy over 9 years
Question as a beginner: can you combine ordering by index and by name? E.g. data <- data[c(1,3,"Var1", 2)]?
-
Terry Brown over 9 years
@BramVanroy nope, c(1,3,"Var1", 2) will be read as c("1","3","Var1", "2") because vectors can contain data of only one type, so types are promoted to the most general type present. Because there are no columns with the character names "1", "3", etc., you'll get "undefined columns". list(1,3,"Var1", 2) keeps values without type promotion, but you can't use a list in the above context.
-
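The coercion described in the comment above can be checked directly. The sketch below also shows one possible workaround, translating the name to a position with match(); that workaround and the column names Var0 to Var3 are assumptions for illustration, not something proposed in the thread.

```r
# Hypothetical data frame with four columns
data <- data.frame(Var0 = 1:2, Var1 = 3:4, Var2 = 5:6, Var3 = 7:8)

# Mixing numbers and strings in c() coerces everything to character
idx <- c(1, 3, "Var1", 2)
class(idx)

# Workaround: turn the name into a position so the vector stays numeric
pos <- c(1, 3, match("Var1", names(data)), 4)
reordered <- data[pos]
names(reordered)
```
-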
guyabel about 9 years
When you want to bring a couple of columns to the left-hand side and not drop the others, I find everything() particularly awesome: mtcars %>% select(wt, gear, everything())
-
landroni over 8 years
Why does the mtcars[c(1,3,2)] subsetting work? I would have expected an error relating to incorrect dimensions or similar... Shouldn't it be mtcars[,c(1,3,2)]?
-
petermeissner over 8 years
data.frames are lists under the hood, with the columns as their top-level elements
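A short sketch of what that means in practice, using the built-in mtcars dataset: because the columns are list elements, single-bracket indexing without a comma selects columns.

```r
# A data frame is a list whose elements are the columns
is.list(mtcars)
length(mtcars) == ncol(mtcars)

# List-style and matrix-style indexing pick the same columns in the same order
a <- mtcars[c(1, 3, 2)]
b <- mtcars[, c(1, 3, 2)]
all.equal(a, b)
```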
-
arekolek about 8 years
@user4050: you can also use df[,c(1,3,2,4:ncol(df))] when you don't know how many columns there are.
-
landroni about 8 years
@user4050 This answer proposes a solution that should be more convenient (and less error-prone) when dealing with large numbers of columns. It lets you specify the desired position of chosen variables without worrying about the remaining variables, which are automatically slotted into the remaining positions.
-
CoderGuy123 almost 8 years
Very nice function. I added a modified version of this function to my personal package.
-
Chris almost 8 years
You can also use dput(colnames(df)); it prints the column names as an R character vector. You can then rearrange the names.
-
richiemorrisroe over 7 years
@landroni that is a really good answer. It's a little verbose (I would pre-filter at the repl and use that), and in general, I think that df` > names(.) > grep "some_col_name_pattern >> df(names %in% .)" (untested) is more elegant. But nonetheless, your answer is more general (but more obscure) so thank you for making this answer better :)
-
landroni over 7 years
@richiemorrisroe Thanks for the feedback. I've now slightly simplified the answer, which should make it more readable.
-
Arthur Yip almost 7 years
Here is another way to use the everything() select_helper function to rearrange the columns to the right/end. stackoverflow.com/a/44353144/4663008 github.com/tidyverse/dplyr/issues/2838 Seems like you will need to use 2 select()'s to move some columns to the right end and others to the left.
-
Garini almost 6 years
The best option for me. Even if I had to install it, it is clearly the clearest possibility.
-
Zachary Ryan Smith almost 6 years
!! WARNING !! data.table turns TARGET into an int vector:
TARGET <- TARGET[ , order(colnames(TARGET), decreasing=TRUE)]
To fix that:
TARGET <- as.data.frame(TARGET)
TARGET <- TARGET[ , order(colnames(TARGET), decreasing=TRUE)]
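A small sketch of the behaviour described in this warning, together with an alternative fix. It assumes the data.table package is installed (the block does nothing otherwise), and it uses data.table's own setcolorder() instead of the as.data.frame() conversion from the comment. The three-column TARGET is made up for illustration.

```r
if (requireNamespace("data.table", quietly = TRUE)) {
  # Hypothetical three-column table
  TARGET <- data.table::data.table(b = 1:2, a = 3:4, c = 5:6)

  # Inside a data.table, j is evaluated as an expression, so this returns
  # the integer vector produced by order(), not the reordered columns
  idx <- TARGET[, order(colnames(TARGET), decreasing = TRUE)]

  # setcolorder() reorders the columns in place, by reference
  data.table::setcolorder(TARGET, sort(names(TARGET), decreasing = TRUE))
  names(TARGET)
}
```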
-
Paul Rougieux almost 6 years
Tidyverse (dplyr in fact) also has the option to select groups of columns, for example to move the Species variable to the front: select(iris, Species, everything()). Also note that quotes are not needed.
-
Mike Dolan Fliss over 5 years
Herman - if you've got 50 columns and you want to custom reorder them, use a helper csv file with a new column order, e.g. name_df$new_order (which you could construct by write_csv(data.frame(old_order = names(df)), "name_df.csv")). Then mess with the order outside of R and read it back in. Now you can do df_reordered = df[, name_df$new_order]. Referencing columns by position number doesn't scale well as the number of columns goes up.
-
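The helper-file workflow from the comment above could look roughly like this in base R. The sketch swaps readr's write_csv() for base write.csv(), uses a temporary file, and simulates the manual editing step inside R; all of these are assumptions made so the example is self-contained.

```r
df <- data.frame(Time = c(1, 2), In = c(2, 3), Out = c(3, 4), Files = c(4, 5))

# Write the current column order to a helper file
helper <- tempfile(fileext = ".csv")
write.csv(data.frame(old_order = names(df)), helper, row.names = FALSE)

# Outside R you would add a new_order column by hand; it is simulated here
name_df <- read.csv(helper, stringsAsFactors = FALSE)
name_df$new_order <- c("Time", "Out", "In", "Files")

# Reorder by name using the edited helper table
df_reordered <- df[, name_df$new_order]
names(df_reordered)
```
-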
divibisan about 5 years
It's important to note that this will drop all columns which are not explicitly specified unless you include everything() as in PaulRougieux's comment
-
Mrmoleje almost 5 years
This is really useful - it's going to save me a lot of time when I just want to move one column from the end of a really wide tibble to the beginning
-
Triamus almost 5 years
Please state the library that the function setcolorder comes from.
-
David Tonhofer over 4 years
dplyr's group will also rearrange the variables, so watch out when using that in a chain.
-
Arthur Yip about 4 years
The new function dplyr::relocate is exactly for this. See H 1's answer below.
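A brief sketch of relocate() applied to the data from the question, next to the older select() plus everything() idiom for comparison. It assumes dplyr 1.0.0 or later is installed and does nothing otherwise.

```r
if (requireNamespace("dplyr", quietly = TRUE) &&
    utils::packageVersion("dplyr") >= "1.0.0") {
  df <- data.frame(Time = c(1, 2), In = c(2, 3), Out = c(3, 4), Files = c(4, 5))

  # Move Out directly after Time; the other columns keep their order
  moved <- dplyr::relocate(df, Out, .after = Time)
  names(moved)

  # Older idiom: name the leading columns, then everything() for the rest
  same <- dplyr::select(df, Time, Out, dplyr::everything())
  names(same)
}
```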
-
otteheng over 3 years
As of dplyr version 1.0.0 they added a relocate() function that's intuitive and easy to read. It's especially helpful if you just want to add columns after or before a specific column.
-
Sandy almost 3 years
That's a very neat solution. Thanks!
-
Dominique Paul almost 2 years
This is probably the most flexible and simple solution. Thanks!