Combine multiple text files, plus filenames, into a single text file

Solution 1

I am sure there is something more clever, but here is a PowerShell script that will combine all the files:

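# Collect the input files, excluding the output file so it is never appended to itself
$outfile = "out.txt"
$files = Get-ChildItem *.txt | Where-Object { $_.Name -ne $outfile }

$files | ForEach-Object {
    # Write the file's full path as a heading line, then the file's contents
    $_.FullName | Add-Content $outfile
    Get-Content $_.FullName | Add-Content $outfile
}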

Is it efficient? Not terribly... but it will work in a pinch.

Solution 2

Inspired by the structure of Mitch's script, I've written a version for Unix-based environments, such as GNU/Linux and OS X:

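# The output file is excluded so that it is not read back into itself.
# pandoc gets the filename directly so it can infer the input format from the extension.
find . -regex '.*\.\(docx?\|org\|rtf\|te?xt\)$' ! -name 'target-file.org' | while IFS= read -r file
do
    echo "* $file" >> target-file.org
    pandoc -t org "$file" >> target-file.org
done

(If you don't want to install pandoc, replace pandoc -t org "$file" with cat "$file". That only makes sense for the plain-text files, since .doc, .docx and .rtf contents would be copied in raw.)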

This script will find all files in the current directory and its subdirectories whose extensions match the pattern (.doc, .docx, .org, .rtf, .text, .txt).

For example, if the list includes fileA.text and fileB.rtf in subdirectory subd/, target-file.org will receive lines such as:

* ./subd/fileA.text
<fileA's contents converted to an org file by pandoc>
* ./subd/fileB.rtf
<fileB's contents converted to an org file by pandoc>

I think this will leave target-file.org in a pretty good state for improving from within Emacs, without the script being too complicated. (Especially if you include the pandoc step.)
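
For plain-text files only, the same idea can be sketched as a small shell function that skips pandoc and writes just the filename, rather than the full path, as each heading. This is a rough sketch and not part of the answer as originally posted: the function name combine_with_headings and the output name combined.org are made up for illustration, and it assumes a POSIX shell with basename and tail available.

```shell
#!/bin/sh
# Sketch only: combine_with_headings is a hypothetical helper name.
# Usage: combine_with_headings OUTPUT FILE...
combine_with_headings() {
    out=$1
    shift
    : > "$out"                              # start from an empty output file
    for file in "$@"; do
        [ -f "$file" ] || continue          # skip anything that is not a regular file
        [ "$file" = "$out" ] && continue    # never copy the output into itself
        # Org-style heading containing only the filename, not the path
        printf '* %s\n' "$(basename "$file")" >> "$out"
        cat "$file" >> "$out"
        # If the file does not end with a newline, add one so that
        # the next heading starts on its own line
        [ -z "$(tail -c 1 "$file")" ] || printf '\n' >> "$out"
    done
}
```

After defining the function, something like combine_with_headings combined.org *.txt would write one heading per file followed by that file's contents. Note that, unlike the loop above, it overwrites the output file instead of appending to it.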


Author: Brady Trainor

Updated on September 18, 2022

Comments

  • Brady Trainor over 1 year

    I would like to combine a handful of text files, but with titles (EDIT: filenames). Ideally, something like

    * a filename 
    contents of file
    ... 
    * another filename 
    contents of file 
    ... 
    
    etc... 
    

    I am on Windows (not DOS), but have access to PowerShell, pandoc, Emacs, Cygwin, or anything else you recommend. (Clearly I'm a newb trying out org-mode.)

    I can easily put them all in one folder. But I would like to avoid typing the name of each file. If a bat file is recommended, I have never used one, but am willing to learn.

    • Rajib over 10 years
      Possible duplicate of this.
    • Brady Trainor over 10 years
      @Rajib, I did not understand that that question wanted the titles interspersed with the combined text.
    • Rajib over 10 years
      Ah, now I see you mean filename. Sorry, I misunderstood "Title of file".
    • Brady Trainor over 10 years
      Ah, I will edit to adhere to the nomenclature, but leave the term in question in place for search purposes.
  • Brady Trainor over 10 years
    That works tremendously. I simply navigated to the folder, right-clicked the frame to paste your script, and hit <RET>. I can do a find-and-replace to switch C:\txt to **.
  • Brady Trainor over 10 years
    To clarify my comment for posterity, I navigated from within PowerShell.
  • Mitch over 10 years
    @BradyTrainor, if you only want the filename, but not the path, switch the line which is $_.FullName | Add-Content $outfile to read $_.Name | Add-Content $outfile.
  • Brady Trainor over 10 years
    Thank you Mitch. So that I might continue to stave off learning the code, how might I add a string such as " ** "?
  • Mitch over 10 years
    You can arbitrarily format things using the -f operator. See blogs.technet.com/b/heyscriptingguy/archive/2013/03/12/… for details. To make the filename line be ** filename, you would specify ("** {0}" -f $_.Name) | Add-Content $outfile