How to crop PDF margins using pdftk and /MediaBox
Solution 1
Use sed to replace every occurrence:
sed 's/MediaBox \[0 0 612 792\]/MediaBox [100 0 512 792]/g' <in.pdf >out.pdf
Or use podofobox (part of the PoDoFo utilities),
which does not need the PDF streams to be uncompressed first (as pdftk does):
podofobox in.pdf out.pdf media 10000 0 51200 79200
As you can see, podofobox takes the MediaBox values multiplied by 100, because it works in a smaller unit. You simply need to append two zeros (00) to the values you read in the MediaBox field.
Solution 2
The string 100 has two more characters in it than 0. When you use a text editor and add characters, the file gets longer and everything after the edit shifts. That's why replacing the 0 with 9 or 2 or any other single digit works fine. While a text editor can theoretically be used to edit a PDF, it's not simple, and you have to respect the internal structure of the file. The xref table is a table near the end of a PDF that tells the reader exactly where each object is located, as a byte offset. It has to be updated whenever the length or location of anything changes.
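To make the length argument concrete, here is a small Python sketch. It uses a toy, uncompressed PDF-like byte string (not a complete PDF), and the helper name object_offsets is made up for illustration. It only shows how byte offsets react to the two kinds of edit; it does not read or write a real xref table.

```python
import re

# Toy, uncompressed PDF-like body with two objects (illustration only).
body = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Page /MediaBox [0 0 612 792] >>\nendobj\n"
    b"2 0 obj\n<< /Type /Catalog >>\nendobj\n"
)

def object_offsets(data):
    """Map object number -> byte offset of its 'N G obj' header line."""
    return {
        int(m.group(1)): m.start()
        for m in re.finditer(rb"(?m)^(\d+) (\d+) obj\b", data)
    }

before = object_offsets(body)

# Same-length edit: 0 -> 9. No byte after the edit moves.
same_length = body.replace(b"[0 0 612 792]", b"[9 0 612 792]")

# Longer edit: 0 -> 100 adds two bytes, so every later object moves by 2.
longer = body.replace(b"[0 0 612 792]", b"[100 0 512 792]")

print("before:     ", before)
print("same length:", object_offsets(same_length))
print("longer:     ", object_offsets(longer))
```

Object 2 starts after the edited MediaBox, so its offset grows by two bytes in the longer variant. Any xref entry that still holds the old offset would then point at the wrong place.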
The reason the manual method above using pdftk doesn't work is that you are adding two bytes in the center of the file. This breaks the xref table. If you manually update all the xrefs, this will work, but it is potentially very tedious. Using sed or any other text editing tool will not solve the problem. podofo does the xref calculation for you.
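As a rough illustration of what "updating all the xrefs" involves, here is a minimal Python sketch. It is not what podofo does internally, and the function name rebuild_xref is made up. It assumes a very simple file: uncompressed, "\n" line endings, a single classic xref table (no xref streams, no incremental updates), and objects numbered 1..N with generation 0. Real PDFs often break these assumptions, which is why a dedicated tool is the safer choice.

```python
import re

def rebuild_xref(data):
    """Recompute the xref table and startxref of a very simple PDF.

    Assumes one classic xref table, "\n" line endings, and objects
    numbered 1..N with generation 0 (illustrative sketch only).
    """
    marker = data.rfind(b"\nxref\n")
    if marker == -1:
        raise ValueError("no classic xref table found")
    xref_pos = marker + 1                        # offset of the 'xref' keyword
    trailer_pos = data.index(b"trailer", xref_pos)
    startxref_pos = data.index(b"startxref", trailer_pos)

    body = data[:xref_pos]                       # header plus all objects
    trailer = data[trailer_pos:startxref_pos]    # trailer dictionary, kept as-is

    # Byte offset of every 'N 0 obj' header in the body.
    offsets = {
        int(m.group(1)): m.start()
        for m in re.finditer(rb"(?m)^(\d+) 0 obj\b", body)
    }
    count = max(offsets) + 1
    if set(offsets) != set(range(1, count)):
        raise ValueError("object numbers are not contiguous from 1")

    # Each entry is exactly 20 bytes: 10-digit offset, 5-digit generation, flag.
    lines = [b"xref\n", b"0 %d\n" % count, b"0000000000 65535 f \n"]
    lines += [b"%010d 00000 n \n" % offsets[n] for n in range(1, count)]

    tail = b"startxref\n" + str(xref_pos).encode("ascii") + b"\n%%EOF\n"
    return body + b"".join(lines) + trailer + tail

# Toy three-object file; its placeholder table is replaced by rebuild_xref.
toy = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n"
    b"3 0 obj\n<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>\nendobj\n"
    b"xref\n0 1\n0000000000 65535 f \n"
    b"trailer\n<< /Size 4 /Root 1 0 R >>\n"
    b"startxref\n0\n%%EOF\n"
)
original = rebuild_xref(toy)

# The manual edit from the question: two extra bytes, offsets now stale.
edited = original.replace(b"/MediaBox [0 0 612 792]", b"/MediaBox [100 0 512 792]")
fixed = rebuild_xref(edited)
```

In this toy file only the startxref value changes, because the edit sits in the last object. In a real document every object after the edit would need a new offset.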
Solution 3
There are better ways to change the margins of a PDF:
- http://code.google.com/p/sopdf/
- http://pybrary.net/pyPdf/
- http://code.activestate.com/recipes/576837-crop-pdf-file-with-pypdf/
- for Ghostscript, see the page "Cropping a PDF using Ghostscript 9.01"
Hope you have found an answer to that since posting :-)
RockScience
Updated on July 17, 2022
Comments
-
RockScience almost 2 years
I used pdftk to uncompress a PDF and then opened it as a text file.
I want to edit the /MediaBox field, which in my case is
/MediaBox [0 0 612 792]
I would like to reduce the margins, for instance
/MediaBox [100 0 512 792]
Unfortunately it doesn't work. I can change the 0 into a 2 or a 9, but I cannot put 100, for instance.
Any idea why?
-
anddam over 10 years
This is not a programming question; it should be moved to another site of the network.
-
RockScience about 11 years
1- What do you mean by "adding two bytes in the center of the file", and what is the xref table? 2- So what do you suggest?
-
James Duvall about 11 years
I recommend doing what @Dingo and Dr Gorb already suggested, which is to use software or code that is designed to manipulate PDFs.
-
Ali over 9 years
I have tried the last one, Ghostscript (9.10), and it didn't work for me. On the other hand, podofobox in the accepted answer does work.