Using lapply to change column names of a list of data frames

24,899

Solution 1

You can use setNames() if you want to replace all of the column names:

df1 <- data.frame(A = 1:10, B = 11:20)
df2 <- data.frame(A = 21:30, B = 31:40)

listDF <- list(df1, df2)
new_col_name <- c("C", "D")

lapply(listDF, setNames, nm = new_col_name)
## [[1]]
##     C  D
## 1   1 11
## 2   2 12
## 3   3 13
## 4   4 14
## 5   5 15
## 6   6 16
## 7   7 17
## 8   8 18
## 9   9 19
## 10 10 20

## [[2]]
##     C  D
## 1  21 31
## 2  22 32
## 3  23 33
## 4  24 34
## 5  25 35
## 6  26 36
## 7  27 37
## 8  28 38
## 9  29 39
## 10 30 40

If you need to replace only a subset of the column names, you can use @Jogo's solution:

lapply(listDF, function(df) {
  names(df)[-1] <- new_col_name[seq_len(ncol(df) - 1)]  # first ncol(df) - 1 new names
  df
})

One last point: in R, a:b - 1 and a:(b - 1) are not the same, because : is evaluated before the subtraction.

1:10 - 1
## [1] 0 1 2 3 4 5 6 7 8 9

1:(10 - 1)
## [1] 1 2 3 4 5 6 7 8 9
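
The precedence matters for the indexing in the question, where 1:length(x)-1 is read as (1:length(x)) - 1 and therefore starts at 0. The following is a minimal sketch, assuming a made-up vector todos and a pretend column count n (both are illustrative, not from the original data). It suggests why the unparenthesised form did not raise an error there: an index of 0 selects nothing.

```r
# Hypothetical name vector, standing in for the question's `todos`
todos <- c("col1", "col2", "col3", "col4")
n <- 3  # pretend the data frame has 3 columns

idx_unparenthesised <- 1:n - 1    # parsed as (1:n) - 1, so it starts at 0
idx_parenthesised   <- 1:(n - 1)  # starts at 1

# Indexing with 0 selects nothing, so both forms pick the same names here
picked_a <- todos[idx_unparenthesised]
picked_b <- todos[idx_parenthesised]
print(picked_a)
print(picked_b)

# seq_len() sidesteps the precedence trap and also copes with n = 1,
# where 1:(n - 1) would count down as 1:0
picked_c <- todos[seq_len(n - 1)]
print(picked_c)
```

Relying on the dropped 0 is fragile, so the parenthesised or seq_len() form is probably the safer habit.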

EDIT

If you want to change the column names of the data frames in the global environment from a list, you can use list2env(), but I'm not sure it is the best way to achieve what you want. You also need to make the list a named list, where each name matches the name of the data frame it should replace.

listDF <- list(df1 = df1, df2 = df2)

new_col_name <- c("C", "D")

listDF <- lapply(listDF, function(df) {
  names(df)[-1] <- new_col_name[seq_len(ncol(df) - 1)]  # first ncol(df) - 1 new names
  df
})

list2env(listDF, envir = .GlobalEnv)
str(df1)
## 'data.frame':    10 obs. of  2 variables:
##  $ A: int  1 2 3 4 5 6 7 8 9 10
##  $ C: int  11 12 13 14 15 16 17 18 19 20

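If the data frames already exist as separate objects, the named list does not have to be typed out by hand. The following is a rough sketch, assuming the object names are known as strings: mget() builds a list named after the objects it fetches, and list2env() writes the renamed copies back. The environment env is an assumption made for this illustration so that the example does not touch the global workspace; at the console .GlobalEnv could take its place.

```r
# A dedicated environment (an assumption of this sketch) holds the data frames
env <- new.env()
env$df1 <- data.frame(A = 1:10, B = 11:20)
env$df2 <- data.frame(A = 21:30, B = 31:40)

new_col_name <- c("C", "D")

# mget() returns a list whose names are the object names it was given
listDF <- mget(c("df1", "df2"), envir = env)

# Rename every column except the first, and return the modified copy
listDF <- lapply(listDF, function(df) {
  names(df)[-1] <- new_col_name[seq_len(ncol(df) - 1)]
  df
})

# Write each list element back under its own name
invisible(list2env(listDF, envir = env))

print(names(env$df1))
print(names(env$df2))
```
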
Solution 2

Try this:

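lapply(listDF, function(x) {
  # take only as many new names as there are columns after the first
  names(x)[-1] <- todos[seq_len(length(x) - 1)]
  x  # return the modified copy
})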

You will get a new list containing the renamed data frames. If you want to modify listDF directly:

for (i in seq_along(listDF)) names(listDF[[i]])[-1] <- todos[seq_len(length(listDF[[i]]) - 1)]

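In the question, todos holds far more names than any data frame has columns, so it matters which elements are picked. Below is a minimal sketch of the loop for that situation, using made-up data frames of different widths and a hypothetical five-element todos. The index is built with seq_len() so that exactly as many names are taken as there are columns after the first.

```r
# Illustrative data: two data frames of different widths
df1 <- data.frame(A = 1:3, B = 4:6)
df2 <- data.frame(A = 1:3, B = 4:6, X = 7:9)
listDF <- list(df1, df2)

# Hypothetical stand-in for the question's long `todos` vector
todos <- c("col1", "col2", "col3", "col4", "col5")

for (i in seq_along(listDF)) {
  n_new <- ncol(listDF[[i]]) - 1                   # columns after the first
  names(listDF[[i]])[-1] <- todos[seq_len(n_new)]  # first n_new names only
}

print(lapply(listDF, names))
```

The loop changes the elements of listDF only; df1 and df2 themselves keep their original names.
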
Solution 3

I was not able to get the code in these answers to work, but I found some code on another forum that did. It assigns the new column names to each data frame itself, whereas the other methods created copies of the data frames. For anyone else, here is the code.

# Create some dataframes
df1 <- data.frame(A = 1:10, B = 11:20)
df2 <- data.frame(A = 21:30, B = 31:40)

listDF <- c("df1", "df2")     # Note: a character vector of names, NOT a list
new_col_name <- c("C", "D")   # The new column names

# Assign the new column names to each data frame named in "listDF"
for (df in listDF) {
  df.tmp <- get(df)               # fetch the data frame by name
  names(df.tmp) <- new_col_name   # rename the temporary copy
  assign(df, df.tmp)              # write it back under the same name
}
Author by

user3310782

Updated on July 13, 2022

Comments

  • user3310782
    user3310782 almost 2 years

    I'm trying to use lapply on a list of data frames; but failing at passing the parameters correctly (I think).

    List of data frames:

    df1 <- data.frame(A = 1:10, B= 11:20)
    df2 <- data.frame(A = 21:30, B = 31:40) 
    
    listDF <- list(df1, df2,df3)    #multiple data frames w. way less columns than the length of vector todos
    

    Vector with columns names:

    todos <-c('col1','col2', ......'colN')
    

    I'd like to change the column names using lapply:

    lapply (listDF, function(x) { colnames(x)[2:length(x)] <-todos[1:length(x)-1] }  )
    

    but this doesn't change the names at all. Am I not passing the data frames themselves, but something else? I just want to change names, not to return the result to a new object.

    Thanks in advance, p.

  • Pierre L
    Pierre L over 8 years
    Check the output vs the OP's code. The line colnames(x)[2:length(x)] indicates that the replacement begins at the second column.
  • dickoa
    dickoa over 8 years
    @PierreLafortune Thanks Pierre, you are right. I made some adjustments.
  • user3310782
    user3310782 over 8 years
    Sorry dickoa but lapply only changes the column names inside the list (so df1 and df2 still have the original col. names !!). I've also tried adding the 'x' from Pierre but still that doesn't do the trick. I only use the list to hold a long list of DFs, not that I want changes inside the list itself. Any ideas? thanks
  • user3310782
    user3310782 over 8 years
    Thanks jogo but why doesn't it change the data frame column names ? It only changes the column names inside the list, not in the independent DF.
  • jogo
    jogo over 8 years
    but you can do listDF <- lapply(...)
  • user3310782
    user3310782 over 8 years
    Sure, but if you wish to change the DF column names, is then lapply NOT the way to go?
  • jogo
    jogo over 8 years
    Yes, but a function can never change the values it receives in the call. The function can only return an object. lapply() calls a function for each element of the list.
  • jogo
    jogo over 8 years
    Then you have to use a for loop without an additional function call.
  • dickoa
    dickoa over 8 years
    @user3310782 Look at the updated answer to see if it does what you want. However, I think that a for loop will probably be easier in this case.
  • dickoa
    dickoa over 8 years
    @jogo What do you mean by that? Can you elaborate, please?
  • jogo
    jogo over 8 years
    for (i in 1:length(listDF)) names(listDF[[i]])[-1] <- todos[-length(listDF[[i]])]
  • user3310782
    user3310782 over 8 years
    It seems the edit by dickoa does the trick! I didn't understand why the data frames stayed untouched after applying it. The key is in: listDF <- lapply(listDF, function(df) { names(df)[-1] <- new_col_name[-ncol(df)]; df }) that passes the name of the frame, and in naming the data frames in the list. Thanks all.
  • user3310782
    user3310782 over 8 years
    Thanks jogo, your for loop also seems a good solution, I just thought of lapply first.
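
The point made in the comments, that a function works on its own copy of the argument and can only hand back a result, can be sketched as follows. The helper rename_second is hypothetical and exists only for this illustration.

```r
df1 <- data.frame(A = 1:3, B = 4:6)

# Hypothetical helper: renames the second column of its argument
rename_second <- function(df) {
  names(df)[2] <- "C"  # changes the function's local copy only
  df                   # the modified copy has to be returned explicitly
}

renamed <- rename_second(df1)

print(names(df1))      # the original keeps its names
print(names(renamed))  # the returned copy carries the new name
```

This is presumably also why the function in the question appeared to do nothing: it ended with an assignment rather than with the data frame, and its result was not stored anywhere.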