VB.Net Merge multiple pdfs into one and export
Solution 1
I have a console application that monitors the subfolders of a designated folder and merges all of the PDFs in each subfolder into a single PDF. I pass in an array of file paths as strings, along with the path of the output file I want.
This is the function I use.
' Requires "Imports System.IO" and "Imports iTextSharp.text.pdf" at the top of the source file.
Public Shared Function MergePdfFiles(ByVal pdfFiles() As String, ByVal outputPath As String) As Boolean
    Dim result As Boolean = False
    Dim pdfCount As Integer = 0 'total input pdf file count
    Dim f As Integer = 0 'index of the current input pdf file
    Dim fileName As String
    Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
    Dim pageCount As Integer = 0
    Dim pdfDoc As iTextSharp.text.Document = Nothing 'the output pdf document
    Dim writer As PdfWriter = Nothing
    Dim cb As PdfContentByte = Nothing
    Dim page As PdfImportedPage = Nothing
    Dim rotation As Integer = 0
    Try
        pdfCount = pdfFiles.Length
        'Nothing is merged (and False is returned) unless there are at least two input files
        If pdfCount > 1 Then
            'Open the 1st item in the array pdfFiles
            fileName = pdfFiles(f)
            reader = New iTextSharp.text.pdf.PdfReader(fileName)
            'Get page count
            pageCount = reader.NumberOfPages
            pdfDoc = New iTextSharp.text.Document(reader.GetPageSizeWithRotation(1), 18, 18, 18, 18)
            'FileMode.Create overwrites any existing output file
            writer = PdfWriter.GetInstance(pdfDoc, New FileStream(outputPath, FileMode.Create))
            pdfDoc.Open()
            'Get the writer's PdfContentByte object
            cb = writer.DirectContent
            'Now loop through the input pdfs
            While f < pdfCount
                'Declare a page counter variable
                Dim i As Integer = 0
                'Loop through the current input pdf's pages starting at page 1
                While i < pageCount
                    i += 1
                    'Get the input page size
                    pdfDoc.SetPageSize(reader.GetPageSizeWithRotation(i))
                    'Create a new page on the output document
                    pdfDoc.NewPage()
                    'Now we get the imported page
                    page = writer.GetImportedPage(reader, i)
                    'Read the imported page's rotation
                    rotation = reader.GetPageRotation(i)
                    'Then add the imported page to the PdfContentByte object as a template based on the page's rotation
                    If rotation = 90 Then
                        cb.AddTemplate(page, 0, -1.0F, 1.0F, 0, 0, reader.GetPageSizeWithRotation(i).Height)
                    ElseIf rotation = 270 Then
                        cb.AddTemplate(page, 0, 1.0F, -1.0F, 0, reader.GetPageSizeWithRotation(i).Width + 60, -30)
                    Else
                        cb.AddTemplate(page, 1.0F, 0, 0, 1.0F, 0, 0)
                    End If
                End While
                'Increment f and read the next input pdf file
                f += 1
                If f < pdfCount Then
                    fileName = pdfFiles(f)
                    reader = New iTextSharp.text.pdf.PdfReader(fileName)
                    pageCount = reader.NumberOfPages
                End If
            End While
            'When all done, we close the document so that the PdfWriter object can write it to the output file
            pdfDoc.Close()
            result = True
        End If
    Catch ex As Exception
        'Note: on failure the output FileStream is not closed here
        Return False
    End Try
    Return result
End Function
Solution 2
The code that was marked correct does not close all of its file streams, so the files stay open within the application and you won't be able to delete unused PDFs within your project.
This is a better solution:
Public Sub MergePDFFiles(ByVal outPutPDF As String)
    Dim StartPath As String = FileArray(0) ' FileArray is a list declared globally
    Dim document = New Document()
    Dim outFile = Path.Combine(outPutPDF) ' outPutPDF is the output path, passed in from another Sub
    Dim writer = New PdfCopy(document, New FileStream(outFile, FileMode.Create))
    Try
        document.Open()
        For Each fileName As String In FileArray
            Dim reader = New PdfReader(Path.Combine(StartPath, fileName))
            For i As Integer = 1 To reader.NumberOfPages
                Dim page = writer.GetImportedPage(reader, i)
                writer.AddPage(page)
            Next i
            reader.Close()
        Next
        writer.Close()
        document.Close()
    Catch ex As Exception
        ' Handle the exception if needed
    Finally
        writer.Close()
        document.Close()
    End Try
End Sub
Vikky
Updated on April 24, 2020
Comments
-
Vikky about 4 years
I have to merge multiple PDFs into a single PDF.
I am using the iTextSharp library. I found some code (from here) and tried to use it. The original code is in C#, and I converted it to VB.NET.
Private Function MergeFiles(ByVal sourceFiles As List(Of Byte())) As Byte()
    Dim mergedPdf As Byte() = Nothing
    Using ms As New MemoryStream()
        Using document As New Document()
            Using copy As New PdfCopy(document, ms)
                document.Open()
                For i As Integer = 0 To sourceFiles.Count - 1
                    Dim reader As New PdfReader(sourceFiles(i))
                    ' loop over the pages in that document
                    Dim n As Integer = reader.NumberOfPages
                    Dim page As Integer = 0
                    While page < n
                        page = page + 1
                        copy.AddPage(copy.GetImportedPage(reader, page))
                    End While
                Next
            End Using
        End Using
        mergedPdf = ms.ToArray()
    End Using
End Function
I am now getting the following error:
An item with the same key has already been added.
I did some debugging and have tracked the problem down to the following line:
copy.AddPage(copy.GetImportedPage(reader, page))
Why is this error happening?
-
AStopher over 8 years
FYI: Possible duplicate of An item with the same key has already been added to dictionary.
-
Vikky over 8 years
It works like a charm, @Sean Wessell. Thanks for such great help.
-
Vikky over 8 years
And again, I am not able to understand how this question relates to the suggested duplicate.
-
Bruno Lowagie over 6 years
Down-voting because you are misleading people into thinking this is the way to merge documents (see this question). As explained in chapter 6 of my book, you are throwing away all interactivity. If the original files contain links, annotations,... they will all be gone after merging. Because of answers like yours, many developers do the wrong thing (and it's so tiring for us having to explain over and over again what they are doing wrong).
-
G_Hosa_Phat about 6 years
@BrunoLowagie - It depends on the requirements of the merge system. If the PDF files do not contain any interactive content, links, annotations, etc., or those elements are unimportant within the context of the merged document, then, if this code succeeds in merging multiple PDF files, it's an acceptable answer. The only thing I would suggest for improving the answer would be to mention the possibility/likelihood of functionality loss using this method.
-
Bruno Lowagie about 6 years
@G_Hosa_Phat If you add the likelihood of functionality loss, also mention that each page is added to the new document as a Form XObject. When this operation is repeated many times (which is the case in some projects I helped debug), you end up with XObjects referring to XObjects referring to XObjects. Too many nested XObjects can cause performance problems and even hit implementation limits of the viewer that make the viewer fail to render the document or even crash.
-
G_Hosa_Phat about 6 years
@BrunoLowagie - Absolutely. The answer should include any possible negative side-effects that might result, although we all know that we don't always get (or know of) such problems listed. I've read a part of the sample chapter you've linked, and, as I'm interested in the subject of how best to merge PDF files, I'll probably be looking more closely into that. However, if the code above works for a "proof of concept" solution, I'll probably start there.
-
mkl about 6 years
a) Closing the Document implicitly closes the PdfWriter, so it does not make sense to close the writer explicitly. b) If you close the Document in the Finally block anyway, it does not make sense to also close it in the Try block. c) If you want to make sure to close everything, why don't you explicitly close the file stream? One can, after all, disable the implicit closing of the file stream in the writer...
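The three points in mkl's comment could be sketched roughly as follows. This is only a minimal sketch, assuming iTextSharp 5.x. The module name PdfMergeSketch, the method name MergeWithCopy, and its parameter names are hypothetical and do not come from any of the answers above. The output FileStream is owned by a Using block so that its handle is released even if merging fails, the Document is closed exactly once, and the PdfCopy is not closed separately.

```vb
Imports System.Collections.Generic
Imports System.IO
Imports iTextSharp.text
Imports iTextSharp.text.pdf

Public Module PdfMergeSketch
    ' Hypothetical helper: copies every page of each input PDF into outputPath.
    Public Sub MergeWithCopy(ByVal inputPaths As IEnumerable(Of String), ByVal outputPath As String)
        ' The Using block disposes the output stream on success and on failure.
        Using outStream As New FileStream(outputPath, FileMode.Create)
            Dim document As New Document()
            Dim copy As New PdfCopy(document, outStream)
            document.Open()
            For Each inputPath As String In inputPaths
                Dim reader As New PdfReader(inputPath)
                Try
                    ' iTextSharp page numbers start at 1.
                    For i As Integer = 1 To reader.NumberOfPages
                        copy.AddPage(copy.GetImportedPage(reader, i))
                    Next
                Finally
                    ' Release the input file even if copying a page fails.
                    reader.Close()
                End Try
            Next
            ' Closing the document also closes the PdfCopy writer (point a),
            ' so neither is closed a second time (point b).
            document.Close()
        End Using
    End Sub
End Module
```

A call might look like PdfMergeSketch.MergeWithCopy(New String() {"a.pdf", "b.pdf"}, "merged.pdf"), where the file names are placeholders. If an exception is thrown part-way through, the output file will be incomplete, but its handle is still released, so the file can be deleted afterwards.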