Create bookmarks into a PDF file via command line
Solution 1
You can also use pdftk. It is also available for OS X.
I'm not going through all the details here and now, because it's been done elsewhere at great length already. Just briefly:
- Create a sample PDF from your original files (without bookmarks).
- Add some bookmarks with Adobe Acrobat (which you seem to have access to).
- Run one of these commands (the first prints the data to stdout, the second writes it to a file):
pdftk my.pdf dump_data output -
pdftk my.pdf dump_data output bookmarks+otherdata.txt
- Study the format of the output.
- Modify the output .txt file by adding all the entries you want, and save it (here as bookmarks.txt).
- Run PDFTK again:
pdftk my.pdf update_info bookmarks.txt output bookmarked.pdf
Additional Information
This is the Bookmark format I noticed after inspecting in Step 4 above.
BookmarkBegin
BookmarkTitle: -- Your Title 1 --
BookmarkLevel: 1
BookmarkPageNumber: 1
BookmarkBegin
BookmarkTitle: -- Your Title 2 --
BookmarkLevel: 1
BookmarkPageNumber: 2
BookmarkBegin
BookmarkTitle: -- Your Title 3 --
...
...
and so on...
Then place the above entries in the appropriate place in the dumped .txt file.
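With about 150 bookmarks, typing these records by hand gets tedious. Below is a minimal sketch, not part of the original answer, that builds the records from a list of (page number, title) pairs. The function name pdftk_bookmark_records and the sample titles are made up for illustration, and all bookmarks are assumed to be top-level (BookmarkLevel 1).

```python
def pdftk_bookmark_records(bookmarks, level=1):
    """Build pdftk-style bookmark records.

    bookmarks: iterable of (page_number, title) pairs, pages counted from 1.
    level:     BookmarkLevel written for every record (assumed flat outline).
    """
    lines = []
    for page_number, title in bookmarks:
        if page_number < 1:
            raise ValueError("page numbers start at 1")
        lines.append("BookmarkBegin")
        lines.append("BookmarkTitle: {}".format(title))
        lines.append("BookmarkLevel: {}".format(level))
        lines.append("BookmarkPageNumber: {}".format(page_number))
    # One record field per line, each terminated by a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Hypothetical sample data: two bookmarks on pages 1 and 2.
    print(pdftk_bookmark_records([(1, "Your Title 1"), (2, "Your Title 2")]), end="")
```

The returned text can be pasted into the dumped data file before running update_info.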
Solution 2
Here is another answer. This one uses Ghostscript to process PDF-to-PDF and the pdfmark PostScript operator to insert the bookmarks.
For some introduction to the pdfmark topic, see also:
- Thomas Merz's PDFmark Primer.
This method involves two steps:
- Create a text file (a PostScript file, really), with a limited set of pdfmark commands, one per line and bookmark you want to add.
- Run a Ghostscript command that processes your current PDF file alongside the text file.
1. The content of the text file should look something like this:
[/Page 1 /View [/XYZ null null null] /Title (This is page 1) /OUT pdfmark
[/Page 2 /View [/XYZ null null null] /Title (Dunno which page this is....) /OUT pdfmark
[/Page 3 /View [/XYZ null null null] /Title (Some other name) /OUT pdfmark
[/Page 4 /View [/XYZ null null null] /Title (File 4) /OUT pdfmark
[/Page 5 /View [/XYZ null null null] /Title (File 5) /OUT pdfmark
[/Page 6 /View [/XYZ null null null] /Title (File 6) /OUT pdfmark
[/Page 7 /View [/XYZ null null null] /Title (File 7) /OUT pdfmark
% more lines for more pages to bookmark...
[/Page 13 /View [/XYZ null null null] /Title (File 13) /OUT pdfmark
[/Page 14 /View [/XYZ null null null] /Title (Bookmark for page 14) /OUT pdfmark
% more lines for more pages to bookmark...
Name this file, for example, addmybookmarks.txt.
2. Now run this command:
gs -o bookmarked.pdf \
-sDEVICE=pdfwrite \
addmybookmarks.txt \
-f original.pdf
The resulting PDF, bookmarked.pdf, now contains the bookmarks.
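The pdfmark lines in step 1 follow a fixed pattern, so they can be generated from a list of (page number, title) pairs. The following is a rough sketch under some assumptions: the function names are hypothetical, titles are assumed to be plain ASCII, and backslashes and parentheses are escaped because they are special inside a PostScript string literal.

```python
def escape_ps_string(text):
    """Escape backslashes and parentheses for a PostScript (...) string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def pdfmark_lines(bookmarks):
    """Return one pdfmark line per (page_number, title) pair."""
    template = "[/Page {page} /View [/XYZ null null null] /Title ({title}) /OUT pdfmark"
    return [
        template.format(page=page, title=escape_ps_string(title))
        for page, title in bookmarks
    ]


if __name__ == "__main__":
    # Hypothetical sample data; write the result to addmybookmarks.txt.
    with open("addmybookmarks.txt", "w") as handle:
        for line in pdfmark_lines([(1, "This is page 1"), (14, "Bookmark for page 14")]):
            handle.write(line + "\n")
```

The file written this way is then passed to the gs command from step 2 unchanged.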
Solution 3
Ok, here is a quick'n'dirty way to do three jobs at once:
- Merge your 400 single-page PDFs.
- Create a document top level ToC (Table of Contents).
- Create a PDF bookmark for each page.
It involves using a LaTeX installation.
You start with an empty LaTeX template like the following one:
\documentclass[]{article}
\usepackage{pdfpages}
\usepackage{hyperref}
\hypersetup{breaklinks=true,
bookmarks=true,
pdfauthor={},
pdftitle={},
colorlinks=true,
citecolor=blue,
urlcolor=blue,
linkcolor=magenta,
pdfborder={0 0 0}}
\begin{document}
{
\hypersetup{linkcolor=black}
\setcounter{tocdepth}{3}
% Comment next line in or out if you want a ToC or not:
\tableofcontents
}
%% Here goes your additional code:
%% 1 line per included PDF!
\end{document}
Now just before the last line of this template, you insert one line per external PDF file you want to include.
- In case you want to generate a ToC, it has to be formatted like this:
\includepdf[pages={<pagenumber>},addtotoc={<pagenumber>,<section>,<level>,<heading>,<label>}]{pdffilename.pdf}
- In case you are sure that each and every included PDF is a 1-page document, it simplifies to this:
\includepdf[addtotoc={<pagenumber>,<section>,<level>,<heading>,<label>}]{pdffilename.pdf}
Here all of the following five parameters for addtotoc are required, in the order given, for the files to appear in the bookmarks and in the ToC. See further below for a specific example:
- <pagenumber>: Number of the page of the inserted document to be linked to. (In your case always "1", because you insert 1-page documents only; you could insert a 5-page document and link to page 3 of the inserted PDF, though.)
- <section>: The LaTeX sectioning name. Could be section, subsection, subsubsection... In your case "section".
- <level>: The level of the LaTeX section. In your case "1".
- <heading>: This is a string. Used for the text of the bookmark.
- <label>: This must be unique for each bookmark. Used in the PDF internally to jump to the correct page when the bookmark is clicked.
To test this quickly, I used Ghostscript to generate 20 1-page PDF documents:
for i in {1..20}; do
gs -o p${i}.pdf -sDEVICE=pdfwrite \
-c "/Helvetica findfont 30 scalefont setfont \
100 600 moveto \
(Page ${i}) show \
showpage";
done
With these test files I could make the lines to insert into the template look like these:
\includepdf[addtotoc={1,section,1,Page 1 (First),p1}]{p1.pdf}
\includepdf[addtotoc={1,section,1,Page 2,p2}]{p2.pdf}
\includepdf[addtotoc={1,section,1,Page 3,p3}]{p3.pdf}
[...]
\includepdf[addtotoc={1,section,1,Page 11 (In the Middle),p11}]{p11.pdf}
[...]
\includepdf[addtotoc={1,section,1,Page 20 (Last),p20}]{p20.pdf}
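For hundreds or thousands of files, lines like these are easier to generate than to type. Here is a small sketch, assuming every included PDF has exactly one page and that headings contain no commas or LaTeX special characters. The function name includepdf_lines is made up for illustration. It also rejects duplicate labels, since each label must be unique.

```python
def includepdf_lines(entries):
    """Return one \\includepdf line per (filename, heading, label) triple.

    Assumes 1-page PDFs, so <pagenumber> is always 1, with sectioning
    name "section" at level 1.
    """
    seen_labels = set()
    lines = []
    for filename, heading, label in entries:
        if label in seen_labels:
            raise ValueError("duplicate label: {}".format(label))
        seen_labels.add(label)
        lines.append(
            "\\includepdf[addtotoc={1,section,1,%s,%s}]{%s}"
            % (heading, label, filename)
        )
    return lines


if __name__ == "__main__":
    # Matches the 20 test files p1.pdf ... p20.pdf generated above.
    entries = [("p%d.pdf" % i, "Page %d" % i, "p%d" % i) for i in range(1, 21)]
    print("\n".join(includepdf_lines(entries)))
```

The printed lines go into the template just before \end{document}.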
Save the template with the inserted lines, then run the following command twice:
pdflatex template.tex
pdflatex template.tex
The resulting file will have the bookmarks; they are visible in Preview.app.
Note: LaTeX is available for OSX via two methods:
I'll add one or two other methods to insert bookmarks on the command line too, later or in the next few days, if I have more time.
For now this one has to do, because I never showed it here on SO, AFAICR.
But I thought because you gave the background "I'm merging 1-page PDFs, and it is slow; now I want to add bookmarks too...", I could show how to do it with one single method.
HINT: One of the other methods will be to use pdftk, which IS available for Mac OS X!
Solution 4
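Here's a Python method for adding bookmarks to the PDF outline (its table of contents). It uses the PyObjC Quartz bindings, so it runs on macOS without any other installations wherever the system Python includes PyObjC.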
#!/usr/bin/python
from Foundation import NSURL, NSString
import Quartz as Quartz
import sys

# You will need to change these filepaths to a local test pdf and an output file.
infile = "/path/to/file.pdf"
outfile = "/path/to/output.pdf"

def getOutline(page, label):
    # Create Destination
    myPage = myPDF.pageAtIndex_(page)
    pageSize = myPage.boundsForBox_(Quartz.kCGPDFMediaBox)
    x = 0
    y = Quartz.CGRectGetMaxY(pageSize)
    pagePoint = Quartz.CGPointMake(x,y)
    myDestination = Quartz.PDFDestination.alloc().initWithPage_atPoint_(myPage, pagePoint)
    myLabel = NSString.stringWithString_(label)
    myOutline = Quartz.PDFOutline.alloc().init()
    myOutline.setLabel_(myLabel)
    myOutline.setDestination_(myDestination)
    return myOutline

pdfURL = NSURL.fileURLWithPath_(infile)
myPDF = Quartz.PDFDocument.alloc().initWithURL_(pdfURL)
if myPDF:
    # Here's where you list your page index (starts at 0) and label.
    outline1 = getOutline(0, 'Page 1')
    outline2 = getOutline(1, 'Page 2')
    outline3 = getOutline(2, 'Page 3')

    # Create a root Outline and add each outline. (Needs a loop.)
    rootOutline = Quartz.PDFOutline.alloc().init()
    rootOutline.insertChild_atIndex_(outline1, 0)
    rootOutline.insertChild_atIndex_(outline2, 1)
    rootOutline.insertChild_atIndex_(outline3, 2)
    myPDF.setOutlineRoot_(rootOutline)
    myPDF.writeToFile_(outfile)
drmariod
Updated on June 25, 2022
Comments
- drmariod almost 2 years
I am searching for a command line tool to add bookmarks to a PDF file.
What I have is a page number and a label. Would love to create a bookmark called label linking to page page number.
Does anyone know a command line tool (preferably OSX) for doing this?
I have PDF files with about 4000 pages and about 150 bookmarks and would love to automate it.
My plan is to use a system call within an R script.
EDIT
I create about 4000 single PDF files with graphs and I am using the OSX system command
/System/Library/Automator/Combine PDF Pages.action/Contents/Resources/join.py
to join the PDFs together. Previously I was using pdfjoin from the pdfjam package, but this was way too slow. In the end, this is how I get my PDF where I add the bookmarks by hand with Adobe Acrobat Professional at the moment.
- Roland almost 9 years
I'm not sure if you have code to produce a PDF or only the PDF file itself. If the former, we'd need much more details.
- drmariod almost 9 years
Thanks @Roland, I added some information.
- hrbrmstr almost 9 years
- drmariod almost 9 years
pdftk is only available for Windows, so it will not fit my needs. Thanks anyways
- Kurt Pfeifle almost 9 years
"pdftk is only available for Windows..." Not true! See my answer. It includes a link to directly download an OSX .pkg installer (from the original pdftk vendor, not from some rubbish third party provider)...
- drmariod almost 9 years
Actually, I really like this solution, a very clear syntax and very easy to script... Thanks
- drmariod almost 9 years
I don't get the addtotoc command running... It says the command is not found, but it can find the pdfpages package... So I don't understand the problem here :-(
- Kurt Pfeifle almost 9 years
@drmariod: Without seeing your code I can't say what's wrong with it. Maybe something very simple which the same eyes that stared at the line(s) while writing it can't recognize any more, but "third party" eyes easily can... Happened to me also, and not just once :)
- Kurt Pfeifle almost 9 years
@drmariod: Actually, I like this solution least of all the three :)
- ihightower over 7 years
I like this solution the best and it is the easiest for me to understand, as I have pdftk already.
- Shamaoke over 2 years
Use dump_data_utf8 and update_info_utf8 in order to properly display characters in scripts other than Latin (e.g. Japanese).