How do you combine PDFs in Ruby?

Solution 1

I wrote a Ruby gem to do this: PDF::Merger. It uses iText. Here's how you use it:

pdf = PDF::Merger.new
pdf.add_file "foo.pdf"
pdf.add_file "bar.pdf"
pdf.save_as "combined.pdf"

Solution 2

As of 2013 you can use Prawn to merge PDFs, as long as your Prawn version still supports templates (0.14.0 or earlier). Gist: https://gist.github.com/4512859

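require "prawn"

# Relies on Prawn's :template option, so it needs a Prawn release that still supports templates.
class PdfMerger
  # Writes every page of each PDF in pdf_paths, in order, to destination.
  def merge(pdf_paths, destination)
    # Split off the first PDF without mutating the caller's array.
    first_pdf_path, *remaining_paths = pdf_paths

    Prawn::Document.generate(destination, :template => first_pdf_path) do |pdf|
      remaining_paths.each do |pdf_path|
        # Jump to the last page so new pages are appended at the end.
        pdf.go_to_page(pdf.page_count)

        template_page_count = count_pdf_pages(pdf_path)
        (1..template_page_count).each do |template_page_number|
          pdf.start_new_page(:template => pdf_path, :template_page => template_page_number)
        end
      end
    end
  end

  private

  # Opens the PDF as a template only to read its page count.
  def count_pdf_pages(pdf_file_path)
    pdf = Prawn::Document.new(:template => pdf_file_path)
    pdf.page_count
  end
end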

Solution 3

After a long search for a pure Ruby solution, I ended up writing code from scratch to parse and combine/merge PDF files.

(I feel the current tools are a mess. I wanted something native, but they all seem to have different issues and dependencies. Even Prawn dropped the template support it used to have.)

I posted the gem online and you can find it at GitHub as well.

You can install it with:

gem install combine_pdf

It's very easy to use (with or without saving the PDF data to a file).

For example, here is a "one-liner":

(CombinePDF.load("file1.pdf") << CombinePDF.load("file2.pdf") << CombinePDF.load("file3.pdf")).save("out.pdf")

If you find any issues, please let me know and I will work on a fix.
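The "with or without saving the PDF data to a file" part could look roughly like the sketch below. It assumes the gem's `CombinePDF.new`, `CombinePDF.parse` and `to_pdf` calls as described in its README. The method name `merge_pdf_data` and the `backend:` keyword are made up for illustration; the keyword only exists so the logic can be exercised without the gem installed.

```ruby
# Merges PDF byte strings in memory and returns the combined PDF as a string.
# `backend` is a hypothetical injection point (not part of combine_pdf);
# when it is omitted, the combine_pdf gem is loaded lazily and used.
def merge_pdf_data(pdf_blobs, backend: nil)
  if backend.nil?
    require "combine_pdf" # gem install combine_pdf
    backend = CombinePDF
  end

  merged = backend.new                      # start from an empty PDF
  pdf_blobs.each do |blob|
    merged << backend.parse(blob)           # append all pages of each parsed PDF, in order
  end
  merged.to_pdf                             # raw PDF bytes, nothing written to disk
end
```

With the gem installed, something like `File.binwrite("out.pdf", merge_pdf_data([File.binread("file1.pdf"), File.binread("file2.pdf")]))` should give the same result as the one-liner for two files. The byte strings could just as well come from a download or a database.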

Solution 4

Use ghostscript to combine PDFs:

options = "-q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite"
system "gs #{options} -sOutputFile=result.pdf file1.pdf file2.pdf"
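If the file names come from user input or contain spaces, a variation on the above may be safer. This is a sketch, assuming `gs` is on the PATH: passing the arguments to `system` as a list skips the shell, so nothing needs escaping. The helper names `ghostscript_merge_args` and `merge_with_ghostscript` are invented for this example.

```ruby
# Builds the argument list for a Ghostscript merge (same flags as the answer above).
def ghostscript_merge_args(input_paths, output_path)
  raise ArgumentError, "need at least one input PDF" if input_paths.empty?

  [
    "gs", "-q", "-dNOPAUSE", "-dBATCH", "-sDEVICE=pdfwrite",
    "-sOutputFile=#{output_path}",
    *input_paths
  ]
end

# Runs Ghostscript without going through a shell.
# Returns true on success, false if gs failed or could not be started.
def merge_with_ghostscript(input_paths, output_path)
  # system returns true (exit 0), false (non-zero exit) or nil (command not run)
  system(*ghostscript_merge_args(input_paths, output_path)) == true
end
```

Usage would be along the lines of `merge_with_ghostscript(["file1.pdf", "file 2.pdf"], "result.pdf")`.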

Solution 5

I haven't seen great options in Ruby. I got the best results shelling out to pdftk:

system "pdftk #{file_1} multistamp #{file_2} output #{file_combined}"
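One thing to watch: pdftk's `multistamp` overlays each page of the second file onto the matching page of the first. If the goal is to append the documents one after another, pdftk's `cat` operation is the usual choice. Here is a rough sketch of that, assuming `pdftk` is installed and on the PATH. The helper names `pdftk_cat_args` and `concatenate_with_pdftk` are hypothetical.

```ruby
require "open3"

# Argument list for `pdftk in1.pdf in2.pdf ... cat output out.pdf`.
def pdftk_cat_args(input_paths, output_path)
  ["pdftk", *input_paths, "cat", "output", output_path]
end

# Concatenates the inputs into output_path and returns output_path.
# Raises with pdftk's own messages if the command exits non-zero.
def concatenate_with_pdftk(input_paths, output_path)
  # capture2e runs the command without a shell and merges stdout and stderr
  log, status = Open3.capture2e(*pdftk_cat_args(input_paths, output_path))
  raise "pdftk failed: #{log}" unless status.success?

  output_path
end
```

For example, `concatenate_with_pdftk([file_1, file_2], file_combined)` would append file_2's pages after file_1's.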
Author by

Paul Schreiber

I like hockey, bicycle commuting, good design, discussing intellectual property law and baking chocolate cakes. I don’t have a land line, wear a watch or drink coffee. Here on SO, I usually help with HTML, CSS and JavaScript; Ruby, Rails, WordPress, PHP and Python; and the occasional Objective-C question.

Updated on June 15, 2022

Comments

  • Paul Schreiber
    Paul Schreiber almost 2 years

    This was asked in 2008. Hopefully there's a better answer now.

    How can you combine PDFs in Ruby?

    I'm using the pdf-stamper gem to fill out a form in a PDF. I'd like to take n PDFs, fill out a form in each of them, and save the result as an n-page document.

    Can you do this with a native library like prawn? Can you do this with rjb and iText? pdf-stamper is a wrapper on iText.

    I'd like to avoid using two libraries (i.e. pdftk and iText), if possible.

  • Paul Schreiber
    Paul Schreiber over 13 years
    There's a much simpler way to do this: you can use PdfCopyFields and addDocument. See the gem I made.
  • Mark Storer
    Mark Storer over 13 years
    Granted, but PdfCopyFields won't rename fields... and given the "same name == same value" thing, I thought flattening to be the best route. I'd think field renaming would be right up CopyField's alley, but I don't see anything in the API ref: api.itextpdf.com. PdfStamper can change field names, but won't handle the importing for you. Sadly iText has this sort of "can't walk and chew gum" type problem fairly often, requiring that you create, 'save', and read the same PDF to apply it to some other thing. Not terribly efficient, but it works, and it's hard to beat the price.
  • taelor
    taelor over 12 years
    I'm curious as to the iText License. If you have a Rails Application, do you have to buy a License, or can you use it for free without open sourcing the entire application?
  • Paul Schreiber
    Paul Schreiber over 12 years
    iText <= 4.2 is MPL/LGPL. iText >= 5.0 is Affero GPL. pdf-merger uses 4.2.
  • ajbraus
    ajbraus almost 11 years
    Can I grab a remote pdf from an amazon bucket and merge it with your gem?
  • Paul Schreiber
    Paul Schreiber almost 11 years
    The gem only works on files in the local filesystem. If you have the S3 bucket mounted (say, with S3FS), then sure. Otherwise, no, you'd need to download it first.
  • Hendrik
    Hendrik over 10 years
    Thanks. Huge timesaver. Could replace the previous pdf-merger gem which made use of Java. yuck. This should be the accepted answer.
  • Hendrik
    Hendrik over 10 years
    Check the solution by Evan Closson if you want to avoid installing a JVM just for this gem.
  • barbolo
    barbolo over 10 years
    I have merged thousands of PDFs into one with this script. Thanks!
  • Yarin
    Yarin over 10 years
    Note that Prawn templates don't work with all PDFs. It's a known issue and they've considered dropping support for it altogether. So far though it's still the best Ruby solution.
  • Alec Sanger
    Alec Sanger about 10 years
    Just a note for everyone finding this answer - they have officially dropped templates now. You'll have to go back to version 0.14.0 to get them back.
  • belgoros
    belgoros over 9 years
    It will not work as Prawn dropped template support. See more about that here: github.com/prawnpdf/prawn/issues/376
  • Tim Baas
    Tim Baas over 9 years
    Can I use combine_pdf to merge multiple different sized pdfs into one with multiple pages, so for example merge 8 pdfs to a new pdf with 2 pages?
  • Myst
    Myst over 9 years
    I tried it with different page sizes and it merges the PDF files without an issue. The original page sizes are preserved. I'm not sure what you mean by merging 8 files and getting 2 pages - I assume you meant 2 page sizes...?
  • Tim Baas
    Tim Baas over 9 years
    I mean merging 2 A5 sized PDFs into 1 A4 sized PDF for example.
  • Myst
    Myst over 9 years
    Hi Tim, CombinePDF doesn't support that level of editing. It's only meant to answer the need for simple operations. If you have an idea how to go about implementing such a feature using CombinePDF's codebase, feel free to open a pull request/issue on GitHub and we'll work something out.
  • Tim Baas
    Tim Baas over 9 years
    I see, I need it for a project that's coming up, but I guess working with images and Prawn for example would be easier. But I depend on a third party for the content, so if PDFs are the only possibility then that is definitely an option. Thanks for your reply.
  • maikovich
    maikovich about 7 years
    Doesn't this approach use tons of memory?
  • Arup Rakshit
    Arup Rakshit over 6 years
    Can we also check whether there are empty pages while merging? For example, if the last page of the first PDF has a lot of empty space, could the merged content start being added from there? Is this possible?
  • JellicleCat
    JellicleCat about 3 years
    Before moving ahead with this, check the list of known limitations on the README. The loss of form data was a deal-breaker for me.