How can I concatenate all files in a directory together in one command line operation?

Solution 1

Yes, it can, with the unimaginatively named cat command:

$ cat *csv > all.csv

cat does what it says on the bottle: it conCATenates its input and prints it to standard output. The command above will give an error if a file called all.csv already exists in the target directory:

$ cat *csv > all.csv
cat: all.csv: input file is output file

You can usually ignore that error; the contents of all.csv will simply be overwritten. Apparently, on some systems (e.g. OS X, according to the comments below this answer), you cannot ignore it: the command enters a loop, catting all.csv back into itself until you run out of disk space. If so, just delete all.csv, if it exists, before running the command.
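
If you want to sidestep the problem entirely, one option (a sketch, not part of the original answer) is to remove any stale output first, or to write the output under a name the glob cannot match:

$ rm -f all.csv && cat *csv > all.csv
$ cat *.csv > combined.txt   # combined.txt is a placeholder name; it does not match *.csv, so it is never read back in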

Solution 2

ls -1 *.csv | while read fn ; do cat "$fn" >> output.csv.file; done

If you want to concatenate them in alphabetical order:

ls -1 *.csv | sort | while read fn ; do cat "$fn" >> output.csv.file; done

If you want to concatenate them by modification time (newest first):

ls -1t *.csv | while read fn ; do cat "$fn" >> output.csv.file; done
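
Note that for file names like the img_size_1.csv to img_size_1000.csv in the question, plain alphabetical order puts img_size_10.csv before img_size_2.csv. A minimal sketch of a numeric ordering, assuming your sort supports the GNU -V (version sort) option:

ls -1 *.csv | sort -V | while read fn ; do cat "$fn" >> output.csv.file; done

GNU ls can also apply the same version sort directly with ls -v, which avoids the extra sort.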

Comments

  • codecowboy
    codecowboy over 1 year

    I have 1000 csv files in a directory. I would like to concatenate them all together in order. They are named img_size_1.csv to approximately img_size_1000.csv. This answer is close but assumes a list file. Can this be done in a one-liner?

    • terdon
      terdon over 10 years
      I'm voting to close this as a dupe of the linked question. This is exactly the same issue, the OP just did not know how to use globbing/wildcards.
    • Stéphane Chazelas
      Stéphane Chazelas over 10 years
      With zsh: < *.csv(n) > all.csv (n for numeric sort)
  • suspectus
    suspectus over 10 years
    If the command is carried out more than once (so all.csv will exist), one may not wish to concatenate all.csv with the other .csv files. rm all.csv first?
  • codecowboy
    codecowboy over 10 years
    @terdon, thanks. Is there any way to affect the order in which the files are added to be sure that they will be processed in numeric order? Or by date?
  • terdon
    terdon over 10 years
    @suspectus no, that is not needed. The > all.csv will truncate all.csv (empty it) before anything else is run (the shell sets up redirections before it runs the command). Therefore, all.csv will always be empty and you will not get the repetition you are thinking of.
  • terdon
    terdon over 10 years
    @codecowboy by default, the glob (the *csv) is expanded in alphanumeric order, so it should do that already. If not, please edit your question to explain exactly what your file names look like. Do you have both file_N.csv and fileN.csv?
  • suspectus
    suspectus over 10 years
    @terdon thanks for the illuminating explanation.
  • suspectus
    suspectus over 10 years
    Interestingly, on OS X with bash 3.2 the destination file is not overwritten first. If all.csv exists and you run cat *.csv > all.csv, the operation does not return and keeps adding to all.csv until the disk is full.
  • terdon
    terdon over 10 years
    I fixed the quoting and format issues, but this will break on files whose names contain newlines or backslashes, and it will re-concatenate everything into the output file every time it is run, so you should make sure that output.csv does not exist before running it. Oh, and the sort is unneeded; ls with no options already sorts files alphabetically.
  • Slyx
    Slyx over 10 years
    Thanks! The sort just prevents any ls aliasing.
  • Owen
    Owen about 7 years
    Thank you @suspectus for pointing this out. I've also encountered this on other platforms. It is definitely not "safe" to ignore "input file is output file". It may not have caused a problem for the poster, but it invites a race condition and could fill up your hard drive if you are away making coffee. NOT SAFE!
  • terdon
    terdon about 7 years
    @Owen fair enough. I've never seen this behavior, so thanks to you and suspectus for letting me know. I edited the answer accordingly.
  • balupton
    balupton about 6 years
    Ran cat ./**/*json > all.json and got the error bash: /bin/cat: Argument list too long. I guess it doesn't like running on millions of files. Any suggestions? (A workaround sketch follows these comments.)
  • balupton
    balupton about 6 years
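
On the "Argument list too long" error in the comment above: the glob expands to more file names than the kernel accepts on a single command line, so the names have to be handed to cat in batches. A rough sketch using find, which batches the arguments itself via -exec ... + (the all.json output name and the . search root are assumptions, and find does not guarantee any particular file order):

# all.json and the '.' search root are placeholders; -exec ... + keeps each cat invocation under the argument-size limit
find . -type f -name '*.json' ! -name all.json -exec cat {} + > all.json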