Merge multiple data.frames in R with varying row length

14,063

You want to merge the result with df3, i.e.:

merge(df3, merge(df1, df2, by="Year", all.x=TRUE, all.y=TRUE), by = "Year", all.x = TRUE, all.y = TRUE)
#  Year Site3 Site1 Site2
#1 2006  <NA>   2.3  <NA>
#2 2007  <NA>     1   2.7
#3 2008   1.3   3.1   4.1
#4 2009     2   2.9   1.1
#5 2010   3.6   1.4   2.6
#6 2011   1.7     3   3.1

Or if you have your data.frame's in a list, use Reduce to generalize the above:

Reduce(function(x,y) merge(x, y, by = "Year", all.x = TRUE, all.y = TRUE),
       list(df1, df2, df3))
#  Year Site1 Site2 Site3
#1 2006   2.3  <NA>  <NA>
#2 2007     1   2.7  <NA>
#3 2008   3.1   4.1   1.3
#4 2009   2.9   1.1     2
#5 2010   1.4   2.6   3.6
#6 2011     3   3.1   1.7
Share:
14,063
KKL234
Author by

KKL234

Updated on June 27, 2022

Comments

  • KKL234
    KKL234 almost 2 years

    I'm relatively new to R and trying to figure out how to merge multiple data.frames with varying numbers of rows but all with a common column, "Year". I've looked through similar questions, and this question: Merge dataframes, different lengths provided a great answer. However, when I applied it to my own data, I couldn't get it to work with multiple data.frames; I always receive an error message.

    Sample data:

    > df1 <- data.frame(Year=2006:2011, Site1=c("2.3", "1"  , "3.1", "2.9", "1.4", "3"))  
    > df2 <- data.frame(Year=2007:2011, Site2=c("2.7", "4.1", "1.1", "2.6", "3.1"))  
    > df3 <- data.frame(Year=2008:2011, Site3=c("1.3", "2"  , "3.6", "1.7"))  
    

    The goal is to produce a single data.frame where column 1 is the year, column 2 is site 1, column 3 is site 2, and so on. I have ~17 data.frames currently (there will be up to 40), corresponding to 17 sites with variable timelines/number of rows.

    Any help would be appreciated.

    Code I've tried:

    > NewDF <- merge(df1, df2, by="Year", all.x=TRUE, all.y=TRUE)  
    

    This worked great for 2 data.frames, but when I tried to add in another data.frame, I received the error message:

    > NewDF <- merge(list=c(df1, df2, df3), by="Year", all.x=TRUE, all.y=TRUE)  
     Error in as.data.frame(x) : argument "x" is missing, with no default
    
  • KKL234
    KKL234 about 10 years
    Thanks for a quick response! I just tried out both options, and have a couple questions. 1) For option 1, is there a way to easily run the merge with 17 data.frames? It works great for 3, but not sure the most efficient way to apply it to 17. 2) I tried the reduce function, and it worked perfectly for my first 3 data.frames. When I added a 4th, I received an error message: `Reduce(function(x,y) merge(x, y, by = "Year", all.x = TRUE, all.y = TRUE), list(Faye006WWmean, Faye006BWWmean, Faye007WWmean, Grin001WWmean)) Error in match.names(clabs, names(xi)) : names do not match previous names
  • eddi
    eddi about 10 years
    @user2296922 1) that's what second method does 2) something is weird about your 4th data.frame - try merging just that one with any other one and see what that does
  • KKL234
    KKL234 about 10 years
    I just tried reordering the data.frames within the list, and there was no issue. Must have been the way I entered the code the first time. Thanks for the help!