How to crop a multi-page (image/scanned) pdf file (which won't crop with pdfcrop)?
Solution 1
Full credit is due to AlexG, who posted a solution to this problem in passing here. I quote it below for completeness and so that it doesn't get lost.
Relevant to the above question is the trimming option described in the man:
Usage examples:
# default operation
pdfcrop.sh orig.pdf cropped.pdf
pdfcrop.sh -m 10 orig.pdf cropped.pdf
pdfcrop.sh -hires orig.pdf cropped.pdf

# trimming pages
pdfcrop.sh -t "10 20 30 40" orig.pdf trimmed.pdf
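The question asks for trims in cm, whereas the script's help text gives its unit as bp (big points, 1/72 inch). A minimal sketch of the conversion follows; the helper name `cm_to_bp` and the sample values are made up for illustration and are not part of AlexG's script.

```shell
#!/bin/sh
# Hypothetical helper (not part of pdfcrop.sh): convert centimetres to bp.
# 1 inch = 2.54 cm = 72 bp, so bp = cm * 72 / 2.54.
cm_to_bp() {
    awk -v cm="$1" 'BEGIN { printf "%.2f\n", cm * 72 / 2.54 }'
}

# Build a four-value argument for -t from illustrative cm values.
trims="$(cm_to_bp 1) $(cm_to_bp 1.5) $(cm_to_bp 1) $(cm_to_bp 2)"
echo "$trims"

# Then, assuming pdfcrop.sh is on the PATH:
#   pdfcrop.sh -t "$trims" input.pdf output.pdf
```

Rounding to two decimals is an arbitrary choice here; the Perl arithmetic in the script should accept fractional values.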
Content of pdfcrop.sh:

#!/bin/bash

function usage () {
  echo "Usage: `basename $0` [Options] <input.pdf> [<output.pdf>]"
  echo
  echo " * Removes white margins from each page in the file. (Default operation)"
  echo " * Trims page edges by given amounts. (Alternative operation)"
  echo
  echo "If only <input.pdf> is given, it is overwritten with the cropped output."
  echo
  echo "Options:"
  echo
  echo " -m \"<left> [<top> [<right> <bottom>]]\""
  echo " adds extra margins in default operation mode. Unit is bp. A single number"
  echo " is used for all margins, two numbers \"<left> <top>\" are applied to the"
  echo " right and bottom margins alike."
  echo
  echo " -t \"<left> [<top> [<right> <bottom>]]\""
  echo " trims outer page edges by the given amounts. Unit is bp. A single number"
  echo " is used for all trims, two numbers \"<left> <top>\" are applied to the"
  echo " right and bottom trims alike."
  echo
  echo " -hires"
  echo " %%HiResBoundingBox is used in default operation mode."
  echo
  echo " -help"
  echo " prints this message."
}

c=0
mar=(0 0 0 0); tri=(0 0 0 0)
bbtype=BoundingBox

while getopts m:t:h: opt
do
  case $opt in
    m)
      eval mar=($OPTARG)
      [[ -z "${mar[1]}" ]] && mar[1]=${mar[0]}
      [[ -z "${mar[2]}" || -z "${mar[3]}" ]] && mar[2]=${mar[0]} && mar[3]=${mar[1]}
      c=0
      ;;
    t)
      eval tri=($OPTARG)
      [[ -z "${tri[1]}" ]] && tri[1]=${tri[0]}
      [[ -z "${tri[2]}" || -z "${tri[3]}" ]] && tri[2]=${tri[0]} && tri[3]=${tri[1]}
      c=1
      ;;
    h)
      if [[ "$OPTARG" == "ires" ]]
      then
        bbtype=HiResBoundingBox
      else
        usage 1>&2; exit 0
      fi
      ;;
    \?)
      usage 1>&2; exit 1
      ;;
  esac
done
shift $((OPTIND-1))

[[ -z "$1" ]] && echo "`basename $0`: missing filename" 1>&2 && usage 1>&2 && exit 1
input=$1;output=$1;shift;
[[ -n "$1" ]] && output=$1 && shift;

(
  [[ "$c" -eq 0 ]] && gs -dNOPAUSE -q -dBATCH -sDEVICE=bbox "$input" 2>&1 | grep "%%$bbtype"
  pdftk "$input" output - uncompress
) | perl -w -n -s -e '
  BEGIN {@m=split /\s+/, $mar; @t=split /\s+/, $tri;}
  if (/BoundingBox:\s+([\d\.\s]+\d)/) { push @bbox, $1; next;}
  elsif (/\/MediaBox\s+\[([\d\.\s]+\d)\]/) { @mb=split /\s+/, $1; next; }
  elsif (/pdftk_PageNum\s+(\d+)/) {
    $p=$1-1;
    if($c){
      $mb[0]+=$t[0];$mb[1]+=$t[1];$mb[2]-=$t[2];$mb[3]-=$t[3];
      print "/MediaBox [", join(" ", @mb), "]\n";
    } else {
      @bb=split /\s+/, $bbox[$p];
      $bb[0]+=$mb[0];$bb[1]+=$mb[1];$bb[2]+=$mb[0];$bb[3]+=$mb[1];
      $bb[0]-=$m[0];$bb[1]-=$m[1];$bb[2]+=$m[2];$bb[3]+=$m[3];
      print "/MediaBox [", join(" ", @bb), "]\n";
    }
  }
  print;
' -- -mar="${mar[*]}" -tri="${tri[*]}" -c=$c | pdftk - output "$output" compress
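In trimming mode the Perl part of the script adds the first two values to the first two MediaBox entries and subtracts the last two values from the last two entries. The sketch below reproduces only that arithmetic for a single box; the function name `trim_mediabox` and the A4-sized sample box are assumptions for illustration, and no PDF is read or written.

```shell
#!/bin/sh
# Hypothetical illustration of the trim arithmetic used by pdfcrop.sh -t.
# $1: MediaBox as "llx lly urx ury"; $2: the four trim values, all in bp.
trim_mediabox() {
    echo "$1 $2" | awk '{ print $1 + $5, $2 + $6, $3 - $7, $4 - $8 }'
}

# Sample box of 595 x 842 bp (roughly A4), trimmed with "10 20 30 40".
trim_mediabox "0 0 595 842" "10 20 30 40"
```

A MediaBox is written as lower-left x, lower-left y, upper-right x, upper-right y, so on an unrotated page the second value appears to move the lower edge and the fourth the upper edge, which differs from the "top" and "bottom" labels in the help text. It seems worth checking the order on a test page before trimming a whole document.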
Solution 2
You could try briss. It's pretty simple, but does the job. It's a GUI app though.
Download the zip file and extract to a folder of your choice and start it:
java -jar briss-0.9.jar
To install it permanently and system-wide and be able to start it from anywhere with just briss, you would unpack the download in /usr/local/lib/, then create an executable file /usr/local/bin/briss that contains:
#!/bin/sh
java -jar /usr/local/lib/briss-0.9/briss-0.9.jar
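The launcher step can be sketched as a small shell function that writes the wrapper and marks it executable. The function name `install_launcher` is made up for this example, and forwarding `"$@"` to the jar is an addition that the original two-line script does not have; it expands to nothing when no arguments are given.

```shell
#!/bin/sh
# Hypothetical helper: write a launcher script for a jar and make it executable.
# $1: path to the jar; $2: path of the launcher to create.
install_launcher() {
    jar=$1
    launcher=$2
    # "$@" in the generated script forwards any arguments (an addition,
    # not in the original launcher).
    printf '#!/bin/sh\nexec java -jar "%s" "$@"\n' "$jar" > "$launcher"
    chmod +x "$launcher"
}

# Run as root, after unpacking the download in /usr/local/lib/:
#   install_launcher /usr/local/lib/briss-0.9/briss-0.9.jar /usr/local/bin/briss
```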
Solution 3
Krop is the best and easiest option and has a wonderful GUI.
Download deb from the author: http://arminstraub.com/computer/krop
Review: http://www.hecticgeek.com/2013/08/crop-pdf-ubuntu-13-04-krop/
Edit: I have been using krop since 13.10 and noticed that later versions support opening a PDF with krop via right-click. I also switched to the snap version once it became available; it supports right-click too (confirmed on 18.10 through 20.04). The GUI is not as colorful in the snap version, but the functionality is the same:
sudo snap install krop
nutty about natty
Updated on September 18, 2022
Comments
-
nutty about natty over 1 year
Usually, I'm pretty happy using pdfcrop, even though the cropped output usually consumes significantly more disk space. Note that comparable code does exist, which addresses and resolves this issue. However, when I want to crop a scanned (image) pdf file, my impression is that pdfcrop simply fails. I imagine that ImageMagick is capable of doing the trick, possibly by (also) making use of pdftk.
I'm looking for an efficient one-liner of code (a multi-line script would also be ok...) to crop such a pdf file from top, bottom, left and right by x cm each (or, better yet, by a, b, c and d cm individually), going all the way from input.pdf to output.pdf.
ps: the solution needn't involve ImageMagick; I'm happy as long as it works (cleanly, reliably and efficiently)... ;)
-
MrMartin over 7 years
This method fails on certain files, see this bug
-
MrMartin over 7 years
When it fails, this can be resolved by first printing the pdf to file, using a document viewer like Evince.
-
ryanjdillon over 7 years
I really like this one. GUI is nice for lots of irregular crops. Allows cropping different selections from the same page into a multi-page pdf. Great! Thanks!