How to crop PDF margins using pdftk and /MediaBox

17,449

Solution 1

Use sed to replace every occurrence of the old MediaBox value:

sed 's/MediaBox \[0 0 612 792]/MediaBox [100 0 512 792]/g' <in.pdf >out.pdf
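Before running a replacement like this, it can help to see which MediaBox values the file actually contains, since the pattern only matches one exact value. A minimal sketch, assuming the page dictionaries are readable as plain text and a grep that supports -a and -o (GNU and BSD grep do); list_mediaboxes is a hypothetical helper name:

```shell
# Hypothetical helper: print each distinct /MediaBox entry found in a file.
# -a treats the (partly binary) PDF as text, -o prints only the matched part.
list_mediaboxes() {
    grep -a -o '/MediaBox *\[[^]]*]' "$1" | sort -u
}

# Usage (in.pdf is the input file from the command above):
# list_mediaboxes in.pdf
```

If the output shows a value other than [0 0 612 792], the sed pattern needs to be adjusted to match it.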

Or use podofobox (part of the podofo utils), which does not need the PDF streams to be uncompressed first (as the pdftk approach does):

podofobox in.pdf out.pdf media 10000 0 51200 79200

As you can see, podofobox takes the box values multiplied by 100 (it works in hundredths of a unit), so you simply append two zeroes (00) to the values you read in the MediaBox field.
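The "add two zeroes" rule can be scripted. A minimal sketch, assuming whole-number box values; to_podofobox_units is a hypothetical helper name and is not part of podofo:

```shell
# Hypothetical helper: multiply each whole-number box value by 100,
# which is the same as appending two zeroes.
to_podofobox_units() {
    out=""
    for value in "$@"; do
        out="${out:+$out }$((value * 100))"
    done
    printf '%s\n' "$out"
}

to_podofobox_units 100 0 512 792
```

The printed values can then be pasted after the box name on the podofobox command line.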

Solution 2

The string 100 has two more characters in it than 0. When you add characters in a text editor, the file gets longer and everything after the edit moves to a later byte position. That's why replacing with 9 or 2 or any other single digit works fine. While a text editor can theoretically be used to edit a pdf, it's not simple and you have to respect the internal structure of the file. The xref table is a table near the end of a pdf that tells the reader the exact byte offset at which each object is located. It has to be updated whenever the length or location of anything is changed.

The reason the manual method above using pdftk doesn't work is that you are adding two bytes in the center of the file. This breaks the xref table. If you manually update all the xrefs, this will work, but it is potentially very tedious. Using sed or any other text editing tool will not solve the problem. podofo does the xref calculation for you.
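The offset shift can be seen on a toy example. This is only a sketch: the file below is a stand-in with PDF-like object markers, not a valid PDF, and offset_of is a hypothetical helper that relies on grep's -b and -o options (available in GNU and BSD grep):

```shell
# Build a toy file that mimics two PDF objects (not a valid PDF).
demo_dir=$(mktemp -d)
printf '%s\n' '1 0 obj' '<< /MediaBox [0 0 612 792] >>' 'endobj' \
              '2 0 obj' '<< >>' 'endobj' > "$demo_dir/before.txt"

# Same kind of edit as in the question: 0 becomes 100, 612 becomes 512.
sed 's/MediaBox \[0 0 612 792]/MediaBox [100 0 512 792]/' \
    "$demo_dir/before.txt" > "$demo_dir/after.txt"

# Hypothetical helper: byte offset of the first match of a pattern in a file.
offset_of() {
    grep -a -b -o "$1" "$2" | head -n 1 | cut -d: -f1
}

before=$(offset_of '2 0 obj' "$demo_dir/before.txt")
after=$(offset_of '2 0 obj' "$demo_dir/after.txt")
shift_bytes=$((after - before))
echo "object 2 moved by $shift_bytes bytes"

rm -rf "$demo_dir"
```

An xref entry that still records the old offset for object 2 would now point two bytes too early, which is the breakage described above.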

Solution 3

there are better ways to change the margin of a PDF:

hope you found an answer to that since posting :-)

Author by

RockScience

Updated on July 17, 2022

Comments

  • RockScience
    RockScience almost 2 years

    I used pdftk to uncompress a PDF and then opened it as a text file.
    I want to edit the /MediaBox field, which is in my case

    /MediaBox [0 0 612 792]
    

    I would like to reduce the margins, for instance

    /MediaBox [100 0 512 792]
    

    Unfortunately it doesn't work. I can change the 0 into a 2 or a 9 but I cannot put 100 for instance.

    Any idea why?

    • anddam
      anddam over 10 years
      this is not a programming question, should be moved to another site of the network
  • RockScience
    RockScience about 11 years
    1- What do you mean by "adding two bytes in the center of the file" and what is the xref table? 2-So what do you suggest?
  • James Duvall
    James Duvall about 11 years
    I recommend doing what @Dingo and Dr Gorb already suggested, which is to use software or code that is designed to manipulate pdfs.
  • Ali
    Ali over 9 years
    I have tried the last one, Ghostscript (9.10) and it didn't work for me. On the other hand, podofobox in the accepted answer does work.