How do I set column names to lower case for multiple dataframes?

Solution 1

The following should work:

dfList <- lapply(lapply(dfs,get),function(x) {colnames(x) <- tolower(colnames(x));x})

Problems like this generally stem from not having placed all your data frames in a single data structure, which forces you to use something awkward, like get.

Note that in my code, I use lapply and get to first create a single list of data frames, and then alter their colnames.

You should also be aware that your lowercols function is rather un-R-like. R functions are generally not written to return nothing and work purely through side effects. If you try to write functions that way (which is possible) you will probably make your life difficult and run into scoping issues. Note that in my second lapply I explicitly return the modified data frame.
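
As a minimal sketch of the "keep everything in one list" idea, the example below builds the list with mget() rather than lapply(dfs, get) and returns a renamed copy of each data frame via setNames(). The data frames mirror the ones in the question; the names dfList and colNamesByDf are only illustrative, and this is one possible variant rather than the code above.

```r
# Sample data frames mirroring the question's df1..df3
df1 <- data.frame(A = 1:10, B = 2:11)
df2 <- data.frame(a = 3:12, b = 4:13)
df3 <- data.frame(a = 5:14, b = 6:15)

# mget() looks up several objects by name and returns them as a named list
dfList <- mget(c("df1", "df2", "df3"))

# setNames() returns a modified copy, so each call hands back the
# renamed data frame instead of relying on a side effect
dfList <- lapply(dfList, function(x) setNames(x, tolower(names(x))))

# Collect the column names of every data frame for a quick check
colNamesByDf <- lapply(dfList, names)
```

Because the list keeps the original object names, individual data frames remain reachable as dfList$df1, dfList$df2, and so on.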

Solution 2

@joran's answer overlaps mine heavily, both in style and in "you probably want to do this differently" message. However, in the spirit of "give a man a fish and you feed him for a day; give him a sharp stick, and he can poke himself in the eye" ...

Here's a function that does what you want in the way that (you think) you want to do it:

dfnames <- ls(pattern = "df[0-9]+")  ## avoid 'dfnames' itself
lowercolnames <- function(df) {
    x <- get(df)
    colnames(x) <- tolower(colnames(x))
    ## normally I would use parent.frame(), but here we
    ##  have to go back TWO frames if this is used within lapply()
    assign(df,x,sys.frame(-2))
    ## OR (maybe simpler)
    ## assign(df,x,envir=.GlobalEnv)

    NULL
}
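
The third argument of assign() is the environment in which the name gets (re)bound. As far as I understand it, sys.frame(-2) counts back two call frames (from the function, through lapply(), to whoever called lapply()). A sketch that avoids frame counting altogether is to pass the target environment explicitly. The helper lowercolnames_in and the environment env below are hypothetical names introduced only for this illustration.

```r
# A dedicated environment standing in for "wherever the data frames live"
env <- new.env()
assign("df1", data.frame(A = 1:10, B = 2:11), envir = env)
assign("df2", data.frame(a = 3:12, b = 4:13), envir = env)

# Hypothetical variant of lowercolnames() that is told where to look
lowercolnames_in <- function(df, envir) {
    x <- get(df, envir = envir)          # fetch the data frame by name
    colnames(x) <- tolower(colnames(x))  # modify the local copy
    assign(df, x, envir = envir)         # rebind the name to the modified copy
    invisible(NULL)                      # called only for its side effect
}

# Apply to every object in env whose name looks like df<number>
for (nm in ls(env, pattern = "^df[0-9]+$")) lowercolnames_in(nm, env)

colnames(get("df1", envir = env))
```

Passing the environment as an argument makes the side effect visible at the call site, which is the main reason to prefer it over counting frames.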

Here are two alternative functions that lowercase column names and return the result:

lowerCN2 <- function(x) {
    colnames(x) <- tolower(colnames(x))
    x
}

I include plyr::rename here for completeness, although in this case it's actually more trouble than it's worth.

lowerCN3 <- function(x) {
    plyr::rename(x,structure(tolower(colnames(x)),
                             names=colnames(x)))
}

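dflist <- lapply(dfnames,get)      ## collect the data frames into a list
dflist <- lapply(dflist,lowerCN2)  ## base R version
dflist <- lapply(dflist,lowerCN3)  ## OR the plyr version (either one is enough)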
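
Since the stated goal in the question is one big data frame, here is a small base R sketch of the final step, assuming every data frame has the same columns once the names are lowercased. It uses shortened sample data and redefines lowerCN2 so the block runs on its own; combined is an illustrative name.

```r
# Shortened sample data with mismatched column-name case
df1 <- data.frame(A = 1:3, B = 2:4)
df2 <- data.frame(a = 4:6, b = 5:7)

# Same idea as lowerCN2 above: modify a copy and return it
lowerCN2 <- function(x) {
    colnames(x) <- tolower(colnames(x))
    x
}

# Lowercase the names first, then stack all data frames row-wise
dflist <- lapply(list(df1 = df1, df2 = df2), lowerCN2)
combined <- do.call(rbind, dflist)
```

Lowercasing has to happen first because rbind() matches data frame columns by name.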

Solution 3

This doesn't directly answer your question, but it may solve the problem you're trying to solve: you can merge data.frames whose key columns have different names via something like:

df1 <- data.frame("A" = 1:10, "B" = 2:11, x=letters[1:10])
df2 <- data.frame("a" = 3:12, "b" = 4:13, y=LETTERS[1:10])
merge(df1, df2, by.x=c("A","B"), by.y=c("a","b"), all=TRUE)
Author: William Gunn

Science, Scholarly Communication, and Mendeley

Updated on June 13, 2022

Comments

  • William Gunn
    William Gunn almost 2 years

    I have a set of dataframes with the same column headings, except that some of the column names are in upper case and some are in lower case. I want to convert all the column names to lowercase so that I can make one big dataframe of everything.

    I can't seem to get colnames() to work in any loop or apply I write. With:

    #create dfs
    df1<-data.frame("A" = 1:10, "B" = 2:11)
    df2<-data.frame("a" = 3:12, "b" = 4:13)
    df3<-data.frame("a" = 5:14, "b" = 6:15)
    #I have many more dfs in my actual data
    
    #make list of dfs, define lowercasing function, apply across df list
    dfs<-ls(pattern = "df")
    lowercols<-function(df){colnames(get(df))<-tolower(colnames(get(df)))}
    lapply(dfs, lowercols)
    

    I get the following error:

    Error in colnames(get(df)) <- tolower(colnames(get(df))) : 
      could not find function "get<-"
    

    How do I change all my dataframes to have lowercase column names?

  • William Gunn
    William Gunn about 12 years
    Why didn't it occur to me to make a list of the data frames themselves? Of course that's a better solution. I'll accept the answer as soon as I get a chance to try it out.
  • William Gunn
    William Gunn about 12 years
    That works perfectly, and with the data frames in a list, getting all the separate data frames into one big df was as simple as data<-ldply(dfList, rbind.fill). Thanks, I'm so appreciative of the constructive and helpful community here.
  • William Gunn
    William Gunn about 12 years
    Thanks for the clear code showing me how I would do what I thought I wanted to do. I don't understand what the sys.frame(-2) in the assign() is doing, but that's probably because I don't understand assign all that well.
  • William Gunn
    William Gunn about 12 years
    With more than 2 dfs to deal with, merge isn't the answer, but thanks for the tips. I'm sure they'll come in handy in the future.
  • Joshua Ulrich
    Joshua Ulrich about 12 years
    @WilliamGunn: You said, "I want to convert all the column names to lowercase so that I can merge them." I was just pointing out that you don't have to change the column names in order to merge the data.frames. Perhaps you used merge when you meant append/rbind?
  • William Gunn
    William Gunn about 12 years
    I understood what you were answering and thanks! It was confusing how I used the word merge but didn't mean specifically using merge(), which only works on pairs of dataframes. I'll change that.
  • Nikos Alexandris
    Nikos Alexandris almost 8 years
    Is lapply(dfs, get) really necessary? Simply supplying the list of data.frames wouldn't suffice?
  • joran
    joran almost 8 years
    @NikosAlexandris No it isn't, which I talk about in my answer. The OP didn't have the data frames in a list in the first place, hence my discussion of that issue.