How to remove a watermark from a PDF using pdftk?
Solution 1
This is a very simple task to perform with sed:
sed -e "s/watermarktextstring/ /g" <input.pdf >unwatermarked.pdf
Afterwards, be sure to repair the resulting PDF:
pdftk unwatermarked.pdf output fixed.pdf && mv fixed.pdf unwatermarked.pdf
All in one command:
sed -e "s/watermarktextstring/ /g" <input.pdf >unwatermarked.pdf && pdftk unwatermarked.pdf output fixed.pdf && mv fixed.pdf unwatermarked.pdf
A text watermark is nothing more than a piece of text between two tags inside the PDF's compressed code.
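One caveat with the commands above: sed treats watermarktextstring as a regular expression, so in a watermark such as an email address or URL each dot matches any character, and a slash ends the pattern early. A minimal sketch of one way around that, assuming a hypothetical helper named escape_sed_pattern (not part of the original answer) and www.example.com/book as a placeholder watermark:

```shell
# Hypothetical helper: escape a literal string so that sed's basic regular
# expressions match it character for character. The characters ] [ \ / . * ^ $
# each get a backslash in front of them.
escape_sed_pattern() {
    printf '%s' "$1" | sed -e 's|[][\\/.*^$]|\\&|g'
}

# Placeholder watermark; the dots and the slash come out escaped.
escape_sed_pattern 'www.example.com/book'
```

The escaped result can then be dropped into the pattern, for example sed -e "s/$(escape_sed_pattern 'www.example.com/book')/ /g" <input.pdf >unwatermarked.pdf.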
Solution 2
Just a small addition to Dingo's answer, as it did not work for me:
I first had to uncompress the PDF document in order to be able to find the watermark and replace it with sed.
The first step is to uncompress the PDF document using pdftk:
pdftk original.pdf output uncompressed.pdf uncompress
Now uncompressed.pdf can be used as in Dingo's answer:
sed -e "s/watermarktextstring/ /" uncompressed.pdf > unwatermarked.pdf
I then repaired and recompressed the document:
pdftk unwatermarked.pdf output fixed.pdf compress
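To see whether uncompressing is needed in the first place, it can help to count how often the watermark text is visible in each file. A minimal sketch, assuming GNU or BSD grep (the -a option is not in POSIX) and a hypothetical helper name count_watermark_lines:

```shell
# Hypothetical helper: print the number of lines in a file that contain
# the literal watermark text.
#   -a  treat the (binary) PDF as text
#   -c  print a count of matching lines instead of the lines themselves
#   -F  take the search text literally, not as a regular expression
count_watermark_lines() {
    # $1 = watermark text, $2 = PDF file
    grep -a -c -F -- "$1" "$2"
}

# Usage (file names as in the steps above):
#   count_watermark_lines 'watermarktextstring' original.pdf
#   count_watermark_lines 'watermarktextstring' uncompressed.pdf
```

A count of 0 for original.pdf together with a non-zero count for uncompressed.pdf suggests the watermark was hidden inside a compressed stream, which is the situation described above.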
hnns
Updated on September 18, 2022
Comments
-
hnns over 1 year
I need to remove some stupid email watermark that extends across all pages of a public domain book. I looked at the pdftk man page and some examples but still cannot figure out how to remove the watermarks. I would appreciate your hints.
-
hnns almost 12 years
Fantastic! It worked like a charm. Please just change the email address to a fictitious one. I don't want the guy who spoiled the book to be targeted by spammers, especially as he is probably the one who made the PDF. Many thanks.
-
Admin almost 12 years
Done! Changed the specific string to a generic string.
-
425nesp over 10 years
Does anyone know how to modify this solution to get rid of a link watermark? I got rid of the text, but there's still a small square left where the text used to be.
-
johndodo over 10 years
You are a life-saver! Thank you!!! :)
-
qed over 10 years
This is really awesome!
-
Alexander Garden about 10 years
I took this process, made it slightly fancier, and wrapped it up in a Python script. It is on GitHub here.
-
8bitjunkie over 8 years
@Alexander Garden It doesn't work: I get TypeError: str() takes at most 1 argument (2 given) when following the usage advice given.
-
Alexander Garden over 8 years
@8bitjunkie Can you open a GitHub issue with a full stack trace?
-
gdecaso about 7 years
I was having issues with this approach because pdftk could not open the unwatermarked.pdf file. What did the trick was to have sed replace watermarktextstring with a string of N space characters, where N is the length of the original watermark. In other words, make sure your uncompressed.pdf and unwatermarked.pdf have the same length.
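The same-length replacement described in this comment can be sketched as follows. A plausible reason it helps is that a PDF's cross-reference table stores byte offsets, which shift when the file length changes. The helper name spaces_like is hypothetical, and the sketch assumes an ASCII watermark (so that character count equals byte count) and a printf that accepts a * field width, as the bash and dash builtins do:

```shell
# Hypothetical helper: print as many spaces as the argument has characters.
spaces_like() {
    printf '%*s' "${#1}" ''
}

watermark='watermarktextstring'
padding=$(spaces_like "$watermark")   # trailing spaces survive $(...)

# Demonstration on a plain line of text: the line keeps its length.
printf '%s\n' "before ${watermark} after" | sed -e "s/$watermark/$padding/g"

# On the real file the same substitution would be:
#   sed -e "s/$watermark/$padding/g" uncompressed.pdf > unwatermarked.pdf
# and 'wc -c uncompressed.pdf unwatermarked.pdf' should then report equal sizes.
```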
-
David Foerster over 6 years
+1 I used the sed command /watermarktextstring/d instead because my watermark string was interlaced with formatting instructions or typographic hints or something like that.
-
Karlo about 6 years
Is this a general solution? What is www.it-ebooks.info?
-
Karlo about 6 years
@Philippe The second command gives an error: "sed: RE error: illegal byte sequence", what should I do?
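This error is typically reported by BSD sed (for example on macOS) when a UTF-8 locale meets the binary bytes of a PDF. A commonly used workaround is to run sed in the C locale, so that every byte counts as one character. A minimal sketch, with strip_watermark_bytes as a hypothetical helper name:

```shell
# Hypothetical helper: run the substitution with LC_ALL=C so that sed does not
# try to decode the PDF's binary data as multibyte text.
strip_watermark_bytes() {
    # $1 = watermark text (used as a sed pattern), $2 = input file
    # The result is written to standard output.
    LC_ALL=C sed -e "s/$1/ /g" "$2"
}

# Usage (file names as in Solution 2):
#   strip_watermark_bytes 'watermarktextstring' uncompressed.pdf > unwatermarked.pdf
```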
-
Cerin over 5 years
pdftk crashed when I ran this.
-
akhan over 5 years
Since qpdf is the default tool on many distros, here is how to uncompress using qpdf.
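A sketch of what the qpdf equivalent of the uncompress step might look like, assuming qpdf's --stream-data=uncompress option; the wrapper name uncompress_with_qpdf is hypothetical, and the file names follow Solution 2:

```shell
# Hypothetical wrapper: write a copy of a PDF with its streams uncompressed,
# so that text such as a watermark becomes visible to sed.
uncompress_with_qpdf() {
    # $1 = input PDF, $2 = output PDF
    qpdf --stream-data=uncompress "$1" "$2"
}

# Usage:
#   uncompress_with_qpdf original.pdf uncompressed.pdf
```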
-
Clain Dsilva over 5 years
@Philippe Any idea how to batch-remove watermarks?
-
Clain Dsilva over 5 years
@Dingo How do I batch process it? I mean multiple files.
-
Dingo over 5 years
Multiple files having the same text string to replace, or different strings for each file?
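For the simpler of the two cases, where every file carries the same text string, the three steps from Solution 2 could be wrapped in a loop. This is only a sketch: the function name batch_remove_watermark and the -clean.pdf output naming are hypothetical, and the watermark is passed to sed as a pattern, so it must not contain an unescaped slash.

```shell
# Sketch: uncompress, substitute, and recompress each PDF in turn.
# Assumes all files share the same watermark text.
batch_remove_watermark() {
    watermark=$1
    shift
    for pdf in "$@"; do
        out="${pdf%.pdf}-clean.pdf"      # hypothetical output naming
        tmpdir=$(mktemp -d) || return 1
        if pdftk "$pdf" output "$tmpdir/uncompressed.pdf" uncompress &&
           sed -e "s/$watermark/ /g" "$tmpdir/uncompressed.pdf" \
               > "$tmpdir/unwatermarked.pdf" &&
           pdftk "$tmpdir/unwatermarked.pdf" output "$out" compress
        then
            printf 'cleaned: %s\n' "$out"
        else
            printf 'failed: %s\n' "$pdf" >&2
        fi
        rm -rf "$tmpdir"
    done
}

# Usage:
#   batch_remove_watermark 'watermarktextstring' *.pdf
```

Files with different strings would need a per-file lookup instead of the single watermark argument.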
-
fccoelho over 5 years
Didn't work to remove a watermark added by Master PDF Editor.
-
graffe almost 4 years
Genius :) Thank you.