Read and rbind multiple csv files
Solution 1
Find the files (list.files), read them in a loop (lapply), then call do.call with rbind to bind all the files together by rows.
myMergedData <-
  do.call(rbind,
          lapply(list.files(path = "N:/Ring data by cruise", full.names = TRUE), read.csv))

Note the full.names = TRUE: without it, list.files returns bare file names, and read.csv will only find them if that folder happens to be the working directory.
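The path above is a specific network drive, so here is a self-contained sketch of the same pattern; the directory and file contents are made up for illustration:

```r
# Create a throwaway directory with two small csv files to demonstrate.
demo_dir <- file.path(tempdir(), "rbind_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(year = 2012, length = c(10, 12)),
          file.path(demo_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(year = 2013, length = c(11, 13)),
          file.path(demo_dir, "b.csv"), row.names = FALSE)

# List the csv files (full paths), read each one, then stack by rows.
files <- list.files(path = demo_dir, pattern = "\\.csv$", full.names = TRUE)
myMergedData <- do.call(rbind, lapply(files, read.csv))
nrow(myMergedData)  # 4 rows: 2 from each file
```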
Update: there is also the vroom package; according to its documentation it is much faster than data.table::fread and base read.csv. The syntax is neat, too:
library(vroom)
# 'files' is a character vector of csv paths,
# e.g. files <- list.files(path = "N:/Ring data by cruise", full.names = TRUE)
myMergedData <- vroom(files)
Solution 2
If you're looking for speed, then try this:
require(data.table) ## 1.9.2 or 1.9.3
# 'filenames' is a character vector of csv paths,
# e.g. from list.files(..., full.names = TRUE)
ans = rbindlist(lapply(filenames, fread))
Solution 3
Don't have enough rep to comment, but to answer Rafael Santos: you can use the approach here to pass parameters to the lapply call in the answer above: Using lapply and read.csv on multiple files (in R)
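For example, a sketch with read.table standing in for read.csv (the tab-separated demo files are made up so the snippet runs on its own):

```r
# Make up two tiny tab-separated files to read.
demo_dir <- file.path(tempdir(), "lapply_params_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.table(data.frame(x = 1:2), file.path(demo_dir, "a.txt"),
            sep = "\t", row.names = FALSE)
write.table(data.frame(x = 3:4), file.path(demo_dir, "b.txt"),
            sep = "\t", row.names = FALSE)
filenames <- list.files(demo_dir, full.names = TRUE)

# Option 1: wrap read.table in an anonymous function and set options there.
datalist <- lapply(filenames, function(f) {
  read.table(f, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
})

# Option 2: pass the options through lapply's ..., which forwards them
# to the function being applied.
datalist2 <- lapply(filenames, read.table,
                    header = TRUE, sep = "\t", stringsAsFactors = FALSE)

combined <- do.call(rbind, datalist)  # same row-binding step as before
```

Both options produce the same list of data frames; the anonymous function is handy when the extra arguments depend on the file name.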
helen.h
Updated on July 09, 2022
Comments
-
helen.h almost 2 years
I have a series of csv files (one per anum) with the same column headers and different number of rows. Originally I was reading them in and merging them like so;
setwd("N:/Ring data by cruise/Shetland")
LengthHeight2013 <- read.csv("N:/Ring data by cruise/Shetland/R_0113A_S2013_WD.csv", sep = ",", header = TRUE)
LengthHeight2012 <- read.csv("N:/Ring data by cruise/Shetland/R_0212A_S2012_WD.csv", sep = ",", header = TRUE)
LengthHeight2011 <- read.csv("N:/Ring data by cruise/Shetland/R_0211A_S2011_WOD.csv", sep = ",", header = TRUE)
LengthHeight2010 <- read.csv("N:/Ring data by cruise/Shetland/R_0310A_S2010_WOD.csv", sep = ",", header = TRUE)
LengthHeight2009 <- read.csv("N:/Ring data by cruise/Shetland/R_0309A_S2009_WOD.csv", sep = ",", header = TRUE)
LengthHeight <- merge(LengthHeight2013, LengthHeight2012, all = TRUE)
LengthHeight <- merge(LengthHeight, LengthHeight2011, all = TRUE)
LengthHeight <- merge(LengthHeight, LengthHeight2010, all = TRUE)
LengthHeight <- merge(LengthHeight, LengthHeight2009, all = TRUE)
I would like to know if there is a shorter/tidier way to do this, also considering that each time I run the script I might want to look at a different range of years.
I also found this bit of code by Tony Cookson which looks like it would do what I want; however, the data frame it produces for me has only the correct headers and no data rows.
multmerge = function(mypath){
  filenames = list.files(path = mypath, full.names = TRUE)
  datalist = lapply(filenames, function(x){read.csv(file = x, header = TRUE)})
  Reduce(function(x, y) {merge(x, y)}, datalist)
}
mymergeddata = multmerge("C://R//mergeme")
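A likely reason that version comes back with headers only: merge() with no by argument joins on every column the frames share, so files with the same headers but different rows have nothing to match on and the inner join is empty. A quick sketch of the difference:

```r
# Two frames with identical headers but disjoint rows:
a <- data.frame(year = 2012, length = c(10, 12))
b <- data.frame(year = 2013, length = c(11, 13))

# merge() with no 'by' joins on ALL common columns; nothing matches here,
# so the result has 0 rows (headers only).
nrow(merge(a, b))              # 0

# merge(..., all = TRUE) is a full outer join and keeps every row.
nrow(merge(a, b, all = TRUE))  # 4

# rbind() simply stacks the rows, which is what is wanted here.
nrow(rbind(a, b))              # 4
```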
-
Lmm almost 5 years: I have a similar issue and use: zoneB = list.files(pattern = "*.csv"); for (i in 1:length(zoneB)) assign(zoneB[i], read.csv(zoneB[i])) and then ZB = lapply(zoneB, read_csv) %>% bind_rows() ... but I am wondering, is there a way to exclude specific .csv files from the list?
-
Rafael Santos almost 5 years: What if I need to pass parameters to the function in lapply? E.g. instead of read.csv I would like to use read.table and set some options on it.
-
Melania CB over 3 years: Hey zx8754 - thank you for this! Is there a way to keep an id on each document that the folder has?
-
zx8754 over 3 years: @MelaniaCB Certainly possible, please ask a new question with reference to this post, and provide example input and expected output.