How can I concatenate all files in a directory together in one command line operation?
Solution 1
Yes it can, with the unimaginatively named cat command:
$ cat *csv > all.csv
cat does what it says on the bottle: it conCATenates its input and prints to standard output. The command above will give an error if a file called all.csv already exists in the target directory:
$ cat *csv > all.csv
cat: all.csv: input file is output file
On some systems you can ignore that error and the contents of all.csv will simply be overwritten. Apparently, on other systems (e.g. OSX according to the comments below this answer), you cannot ignore the error and this will enter a loop, catting all.csv back into itself until you run out of disk space. To be safe, just delete all.csv, if it exists, before running the command.
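A minimal sketch of that precaution, assuming a POSIX shell. The scratch directory and the sample files (one.csv, two.csv) are made up so the example can be run safely anywhere:

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two small sample files (hypothetical names and contents).
printf 'a,1\n' > one.csv
printf 'b,2\n' > two.csv

# Simulate output left over from an earlier run.
printf 'stale\n' > all.csv

# Remove the old output first, so the *csv glob no longer matches it,
# then concatenate. The glob is expanded before all.csv is recreated.
rm -f all.csv
cat -- *csv > all.csv
```

Writing the result to a name the glob cannot match (for example all.txt, renamed afterwards) should avoid the problem in the same way.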
Solution 2
ls -1 *.csv | while read fn ; do cat "$fn" >> output.csv.file; done
If you want to concatenate them in alphabetical order:
ls -1 *.csv | sort | while read fn ; do cat "$fn" >> output.csv.file; done
If you want to concatenate them by modification time (newest first):
ls -1t *.csv | while read fn ; do cat "$fn" >> output.csv.file; done
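Neither variant gives numeric order: in a plain lexical sort, img_size_10.csv comes before img_size_2.csv. A hedged sketch of one way to get numeric order for names like those in the question, assuming the number is the third "_"-separated field and that no file name contains a newline. The sample files and the output name all_numeric.txt are made up for illustration:

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample files named like those in the question (contents are made up).
for i in 1 2 10; do
    printf 'row from file %s\n' "$i" > "img_size_$i.csv"
done

# Print the names one per line, sort numerically on the third
# "_"-separated field (e.g. "10.csv"), then concatenate in that order.
# The output name does not match the glob, so it is never read back in.
printf '%s\n' img_size_*.csv |
    sort -t_ -k3,3n |
    while IFS= read -r fn; do
        cat -- "$fn"
    done > all_numeric.txt
```

Redirecting once after done, rather than appending inside the loop, also means a second run replaces the output instead of growing it.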
Comments
-
codecowboy over 1 year
I have 1000 csv files in a directory. I would like to concatenate them all together in order. They are named img_size_1.csv to approximately img_size_1000.csv. This answer is close but assumes a list file. Can this be done in a one-liner?
-
terdon over 10 years
I'm voting to close this as a dupe of the linked question. This is exactly the same issue, the OP just did not know how to use globbing/wildcards.
-
Stéphane Chazelas over 10 years
With zsh: < *.csv(n) > all.csv (n for numeric sort)
-
suspectus over 10 years
If the command is carried out more than once (so all.csv will exist), one may not wish to concatenate all.csv with the other .csv files. rm all.csv first?
-
codecowboy over 10 years
@terdon, thanks. Is there any way to affect the order in which the files are added to be sure that they will be processed in numeric order? Or by date?
-
terdon over 10 years
@suspectus no, that is not needed. The > all.csv will truncate all.csv (empty it) before cat is run (the shell sets up redirections before it executes the command). Therefore, all.csv will always be empty and you will not get the repetition you are thinking of.
-
terdon over 10 years
@codecowboy by default, the glob (the *csv) is expanded in alphanumeric order so it should do that already. If not, please edit your question to explain exactly what your file names look like. Do you have both file_N.csv and fileN.csv?
-
suspectus over 10 years
@terdon thanks for the illuminating explanation.
-
suspectus over 10 years
Interestingly, on OS X with bash 3.2 the destination file is not overwritten first. If all.csv exists and you run cat *.csv > all.csv, the operation does not return and keeps adding to all.csv until the disk is full.
-
terdon over 10 years
I fixed the quoting and format issues, but this will break on files whose names contain newlines or backslashes, and it will re-concatenate everything into the output file every time it is run, so you should make sure that output.csv does not exist before running it. Oh, and the sort is unneeded; ls with no options will already sort files alphabetically.
-
Slyx over 10 years
Thanks! The sort just guards against any ls aliasing.
-
Owen about 7 years
Thank you @suspectus for pointing this out. I've also encountered this on other platforms. It is definitely not "safe" to ignore "input file is output file". It may not have caused a problem for the poster, but it invites a race condition and could fill up your hard drive if you are away making coffee. NOT SAFE!
-
terdon about 7 years
@Owen fair enough. I've never seen this behavior, so thanks to you and suspectus for letting me know. I edited the answer accordingly.
-
balupton about 6 years
Ran this:
cat ./**/*json > all.json
and got this error:
bash: /bin/cat: Argument list too long
I guess it doesn't like running on millions of files. Any suggestions?
-
balupton about 6 years
Figured it out: unix.stackexchange.com/a/437084/50703
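For reference, one common workaround for "Argument list too long" (not necessarily the one in the linked answer) is to let find hand the file names to cat in batches instead of expanding one enormous glob. A minimal sketch, assuming the files end in .json and that the traversal order does not matter. The scratch directory and sample files are made up:

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A small nested tree standing in for the real one (hypothetical names).
mkdir -p a/b
printf '{"id": 1}\n' > a/one.json
printf '{"id": 2}\n' > a/b/two.json

# find passes the names to cat in batches that fit the system's
# argument-size limit, so no single command line gets too long.
# all.json is excluded because the redirection creates it inside the tree.
# Files are concatenated in whatever order find visits them.
find . -type f -name '*.json' ! -name all.json -exec cat {} + > all.json
```

If a particular order is needed, the list of names would have to be sorted before it reaches cat, which this sketch does not attempt.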