How to convert PDF to image?
Solution 1
You can use pdftoppm from the poppler-utils package to convert a PDF to a PNG:
pdftoppm input.pdf outputname -png
This will output each page in the PDF using the format outputname-01.png, with 01 being the index of the page.
Converting a single page or a range of pages of the PDF
pdftoppm input.pdf outputname -png -f {page} -singlefile
Change {page} to the page number. It's indexed at 1, so -f 1 would be the first page.
If you'd like to work on a range of pages, you can also specify a number for the flag -l (last page), so having -f 1 -l 30 would specify the pages from 1 to 30. Leave out -singlefile in that case, because with it only the first page of the range is written.
Note again that .png will be appended to outputname automatically, so there's no need to include the extension. Also, -singlefile removes the -01 suffix mentioned above, since the output is known to have only one file.
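The page options above can be wrapped in a small shell function. This is only a sketch: render_pages and the DRY_RUN variable are names made up for this illustration (they are not part of poppler-utils), and it assumes pdftoppm is on your PATH.

```shell
#!/bin/sh
# Sketch: render pages FIRST..LAST of a PDF to PNG with pdftoppm.
# render_pages and DRY_RUN are hypothetical names, not part of poppler-utils.
# Usage: render_pages INPUT.pdf PREFIX FIRST [LAST]
render_pages() {
    input=$1
    prefix=$2
    first=$3
    last=${4:-$3}          # no LAST given: treat it as a single page

    # Page numbers are 1-based, so reject anything that is not a positive integer.
    for n in "$first" "$last"; do
        case $n in
            ''|*[!0-9]*) echo "page numbers must be positive integers" >&2; return 2 ;;
        esac
    done
    if [ "$first" -lt 1 ] || [ "$last" -lt "$first" ]; then
        echo "invalid page range: $first-$last" >&2
        return 2
    fi

    if [ "$first" -eq "$last" ]; then
        # -singlefile drops the page-number suffix, so the result is PREFIX.png
        set -- pdftoppm -png -f "$first" -singlefile "$input" "$prefix"
    else
        set -- pdftoppm -png -f "$first" -l "$last" "$input" "$prefix"
    fi

    # With DRY_RUN=1 the command is only printed instead of executed.
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For example, render_pages input.pdf outputname 1 30 would render pages 1 to 30, while render_pages input.pdf cover 1 would write a single cover.png. Setting DRY_RUN=1 first lets you check the generated command before running it.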
Specifying the converted image's resolution
The default resolution for this command is 150 DPI. Increasing it will result in both a larger file size and more detail.
To increase the resolution of the converted PDF, add the options -rx {resolution} and -ry {resolution}. For example:
pdftoppm input.pdf outputname -png -rx 300 -ry 300
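To get a feel for what a DPI value means in pixels: PDF page sizes are measured in points (72 points per inch), so the pixel size is roughly points / 72 × DPI. The snippet below is a rough sketch of that arithmetic. px_at_dpi is a made-up helper name, and it rounds to the nearest pixel, so pdftoppm's own result may differ by a pixel.

```shell
#!/bin/sh
# px_at_dpi is a hypothetical helper name used only for this illustration.
# Usage: px_at_dpi POINTS DPI  -> prints the approximate size in pixels
px_at_dpi() {
    # 72 points = 1 inch; adding 36 before dividing rounds to the nearest pixel
    echo $(( ($1 * $2 + 36) / 72 ))
}

# A US Letter page is 612 x 792 points
echo "150 DPI: $(px_at_dpi 612 150) x $(px_at_dpi 792 150) px"
echo "300 DPI: $(px_at_dpi 612 300) x $(px_at_dpi 792 300) px"
```

This prints 1275 x 1650 px for 150 DPI and 2550 x 3300 px for 300 DPI. Doubling the DPI quadruples the number of pixels, which is why the files get noticeably larger.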
Solution 2
- Install imagemagick.
- Using a terminal where the PDF is located:
- For the full document:
convert -density 150 input.pdf -quality 90 output.png
- For a single page (the file name is quoted so that the shell does not treat the brackets as a glob pattern):
convert -density 150 'input.pdf[666]' -quality 90 output.png
- Whereby:
PNG, JPG or (virtually) any other image format can be chosen.
-density xxx will set the DPI to xxx (common are 150 and 300).
-quality xxx will set the compression quality to xxx for PNG, JPG and MIFF file formats (for JPG, 100 gives the highest quality and least compression; PNG output is lossless whatever the value).
[666] will convert only the 667th page to PNG (zero-based numbering, so [0] is the 1st page).
All other options (such as trimming, grayscale, etc.) can be viewed on the ImageMagick website.
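Since the zero-based index is easy to get wrong by one, here is a small sketch that turns a normal (1-based) page number into the bracket form ImageMagick expects. im_page_spec is a made-up helper name, not an ImageMagick command.

```shell
#!/bin/sh
# im_page_spec is a hypothetical helper name, not an ImageMagick command.
# Usage: im_page_spec FILE PAGE  -> prints FILE[PAGE-1]
im_page_spec() {
    # Accept only positive integers without leading zeros.
    case $2 in
        ''|*[!0-9]*|0*) echo "page must be a positive integer" >&2; return 2 ;;
    esac
    printf '%s[%s]\n' "$1" "$(( $2 - 1 ))"
}

# Example: page 667 of input.pdf becomes input.pdf[666]
# convert -density 150 "$(im_page_spec input.pdf 667)" -quality 90 output.png
```

The double quotes around the command substitution keep the shell from interpreting the brackets as a glob pattern.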
Solution 3
IIRC GIMP is capable of opening PDFs, i.e. converting their pages into images. So if you want to edit the images right away, GIMP is your friend.
Solution 4
The currently accepted answer does the job but results in an output which is larger in size and suffers from quality loss.
The method in the answer given here results in an output which is comparable in size to the input and doesn't suffer from quality loss.
TLDR - Use pdfimages: pdfimages -j input.pdf output
Quoting the linked answer:
It's not clear what you mean by "quality loss". That could mean a lot of different things. Could you post some samples to illustrate? Perhaps cut the same section out of the poor quality and good quality versions (as a PNG to avoid further quality loss).
Perhaps you need to use -density to do the conversion at a higher dpi:
convert -density 300 file.pdf page_%04d.jpg
(You can prepend -units PixelsPerInch or -units PixelsPerCentimeter if necessary. My copy defaults to ppi.)
Update: As you pointed out, gscan2pdf (the way you're using it) is just a wrapper for pdfimages (from poppler). pdfimages does not do the same thing that convert does when given a PDF as input.
convert takes the PDF, renders it at some resolution, and uses the resulting bitmap as the source image.
pdfimages looks through the PDF for embedded bitmap images and exports each one to a file. It simply ignores any text or vector drawing commands in the PDF.
As a result, if what you have is a PDF that's just a wrapper around a series of bitmaps, pdfimages will do a much better job of extracting them, because it gets you the raw data at its original size. You probably also want to use the -j option to pdfimages, because a PDF can contain raw JPEG data. By default, pdfimages converts everything to PNM format, and converting JPEG > PPM > JPEG is a lossy process.
So, try
pdfimages -j file.pdf page
You may or may not need to follow that with a convert to .jpg step (depending on what bitmap format the PDF was using).
I tried this command on a PDF that I had made myself from a sequence of JPEG images. The extracted JPEGs were byte-for-byte identical to the source images. You can't get higher quality than that.
Solution 5
If your PDFs are scanned, the images are already stored as part of the PDF. You simply need to extract them with pdfimages:
pdfimages my-file.pdf prefix
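If you are not sure whether a PDF actually contains embedded bitmaps, pdfimages -list prints a table of them without extracting anything. The sketch below counts the rows of that table. count_embedded_images is a made-up helper name, and it assumes the table starts with two header lines (column titles and a dashed rule). Treat the count as a rough hint only: a page that mixes bitmaps with text or vector drawings still needs to be rendered with pdftoppm to get the full page.

```shell
#!/bin/sh
# count_embedded_images is a hypothetical helper name.
# It reads the table printed by `pdfimages -list` on stdin and counts the
# non-empty rows below the two header lines (assumed layout, see above).
count_embedded_images() {
    awk 'NR > 2 && NF > 0 { n++ } END { print n + 0 }'
}

# Usage sketch (requires poppler-utils; file names are placeholders):
#   n=$(pdfimages -list my-file.pdf | count_embedded_images)
#   if [ "$n" -gt 0 ]; then
#       pdfimages -j my-file.pdf prefix      # extract the embedded bitmaps
#   else
#       pdftoppm -png my-file.pdf prefix     # nothing embedded: render the pages
#   fi
```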
Deependra Solanky
Updated on September 18, 2022
Comments
-
Deependra Solanky over 1 year
I have a requirement to convert PDF pages to images. There is a background image with some text in my file, and when I save it as an image only the background image gets saved.
Is there any software available for this, so that the complete page can be converted to an image?
-
Philippe Paré about 7 years
Apparently it's also possible with inkscape: stackoverflow.com/a/15484727/32453
-
user3413723 over 4 years
I don't have 10 rep to post an answer so here is another way, use MuPDF.
mutool convert -o file.png file.pdf
-
Anthony Ebert over 4 years
On bash:
pdftocairo file.pdf -png
-
Barna Kovacs almost 4 years
PDFBox also does it nicely. pdfbox.apache.org
-
Eslam Sameh Ahmed about 3 years
You can use convertpdftojpg.net, which is a secure and fast PDF to JPG converter.
-
Admin almost 2 years
Using GIMP is a great way to do this without using the command line.
-
mweber over 11 years
Thank you so much. Much better quality than with imagemagick or graphicsmagick!
-
zuo over 10 years
pdftoppm is much faster than convert
-
mx7 almost 10 years
Can you explain more about what density is and what it can do?
-
Arjun almost 10 years
@AgentCool It specifies the horizontal and vertical image density (in ppi).
-
aroque over 9 years
With only one PDF in a folder, the specific name of the PDF file is not needed:
pdftoppm -png *.pdf prefix
-
Elijah Lynn over 9 years
The answer as is does work but the resolution is very poor. Therefore not currently an answer that is useful. Maybe if convert has some parameters that can be specified this could change.
-
OHLÁLÁ about 9 years
You can change the density by adding the -density 300 parameter.
-
NoBackingDown over 8 years
This is really much better than imagemagick. Imagemagick actually changed the colors in an unexpected way in my case!
-
Petr R. over 8 years
The image in your answer is broken. Perhaps you should update it.
-
mlc over 8 years
This is good! But it's a bit easier to write -r 300 instead of specifying the x and y resolutions independently when you want to set them to the same value.
-
Jose Gómez about 8 years
This is the perfect solution for scanned PDFs, as with this you can, with one command, extract the original JPGs without further recompression.
-
user2364305 almost 8 years
How would we put them back into a PDF with this tool, to complete the circle?
-
Abbafei over 7 years
pdftohtml (listed at the end of the pdftoppm manpage) worked better for my use-case; thanks for the hint :-)
-
Philippe Paré about 7 years
So can anybody confirm that specifying density makes it "as good" as the other answers here, or not? Also as a note to followers, ImageMagick calls out to "ghostscript" to actually convert from pdf to png, e.g.:
gs -q NOPROMPT ...-sDEVICE=pngalpha -r150x150 -sOutputFile=/var/tmp/Yf%d -f/var/tmp/L -f/var/tmp/Fic1
and if you get "convert: no images defined output.png" it means you don't have ghostscript installed...
-
Forty-Two over 6 years
Is that in the free or paid version? In my version, the option is greyed out. Does that mean I need to pay? Is there a paid version?
-
turdus-merula over 6 years
Also:
pdftocairo -png page.pdf page.png
-
Pavel Vlasov over 6 years
Works fine. To obtain this software you can use brew install poppler on macOS.
-
Michael Hays about 6 years
I had much more success with pdftoppm than with imagemagick.
-
William about 6 years
Is there a way to force max settings aka no compression?
-
mghaoui almost 6 years
This worked fine for me with the -density 300 parameter.
-
frozen-flame over 5 years
Using -density 500 -quality 100 I still get much poorer image quality compared to pdftoppm.
-
Gabriel Staples over 5 years
And to convert back from images to pdf:
convert output-0.png output-1.png output-2.png output.pdf
See: itsfoss.com/convert-multiple-images-pdf-ubuntu-1304
-
Joschua over 5 years
I'm getting this error:
convert-im6.q16: not authorized 'test.pdf' @ error/constitute.c/ReadImage/412.
-
hsandt over 5 years
I get
convert-im6.q16: no images defined `output.png' @ error/convert.c/ConvertImageCommand/3258.
I know @rogerdpack mentioned it already but I have ghostscript installed, I can use gs.
-
HD189733b over 5 years
I made the PDF plot with Python matplotlib or ROOT. When I use pdftoppm or convert to turn the plot into a PNG, the result is placed at the top-right corner and leaves a wide white space. I solved the problem by adding the -cropbox option.
-
Jezor over 5 years
Parsing PDF in imagemagick has been disabled - bugs.archlinux.org/task/59778 - it can be enabled manually by editing the /etc/ImageMagick-7/policy.xml file and removing PDF from <policy domain="coder" rights="none" pattern="{PS,PS2,PS3,EPS,PDF,XPS}" />
-
Martin Thoma over 5 years
You might want to add -background white -alpha off to remove transparency.
-
typeduke almost 5 years
To make it a CBZ (e.g. for reading in an ebook reader like Gnome Books) you can chain commands and use
pdftoppm myfile.pdf myfile -png && zip myfile.cbz myfile-*.png; rm myfile-*.png
This will give a "myfile.cbz" in the same directory as "myfile.pdf" 🙂
-
typeduke almost 5 years
Or, to make it easier to do multiple PDFs, use
FILE=filename-without-extension; pdftoppm "$FILE.pdf" "$FILE" -png && zip "$FILE.cbz" "$FILE"-*.png; rm "$FILE"-*.png
This will give a "filename-without-extension.cbz" in the same directory as "filename-without-extension.pdf".
-
Dan Dascalescu over 4 years
GIMP can indeed open PDFs, each page as one layer. Choosing "Export As" seems to save only the current layer, but you can easily delete the layer after exporting and run "Export As" again.
-
Gabriel Staples over 4 years
pdftoppm works extremely well and supports a bunch of output image formats, including PPM, PNG, JPEG, TIFF. You can also specify the resolution with -r 300 for example, as well as the JPEG compression (quality) level. See my full answer with examples here: askubuntu.com/questions/150100/…
-
durette over 4 years
I found GIMP produces a much higher quality conversion than imagemagick (as of the current respective versions packaged in Ubuntu 19.04)
-
durette over 4 years
As of the current respective versions packaged in Ubuntu 19.04, I found GIMP produces a much higher quality conversion than imagemagick.
-
Deependra Solanky about 4 years
@ElijahLynn I have changed the accepted answer.
-
Zoltán about 4 years
I first skipped this answer, because I didn't want to install extra software - only to find out I already had pdftoppm installed on Ubuntu 18.04.
-
GuyPaddock almost 4 years
This is the incorrect solution for the OP's question if the PDF is a print-ready PDF created by something like Illustrator or Acrobat, since pdfimages extracts only the images from the PDF but does not flatten each entire page and export the full pages to images.
-
GuyPaddock almost 4 yearsThis is the incorrect solution for the OPs question if the PDF is a print-ready PDF created by something like Illustrator or Acrobat, since pdfimages extracts only the images from the PDF but does not flatten each entire page and export the full pages to images.
-
Anmol Singh Jaggi almost 4 years
@GuyPaddock Thanks for pointing it out.
-
Roah over 3 years
Is there any way to set a transparent background in the PNG? The background is white with pdftoppm and transparent with convert, but convert has problems with big PDFs even if I increase the memory limit in policy.xml.
-
Manohar over 3 years
Is there any way to add a password?
-
somethis about 3 years
Unfortunately, I couldn't work out a pragmatic, easy to follow routine with my favorite tool "convert". I'll have to agree with @ElijahLynn and point to solution askubuntu.com/a/50180/11929
-
justanoob about 3 years
@turdus-merula Seemingly cairo is buggier than ppm.
-
Huseyin almost 3 years
Easy and effective method.
-
cipricus over 2 years
Crashes with relatively large documents.
-
cipricus over 2 years
(In case it crashes at some point with a PDF with many pages: print part of the original to PDF before extracting from the output with this tool.)
-
Denilson Sá Maia over 2 years
-cropbox exported the pages as I expected, so try using this option if you don't like your initial results.
-
Avatar about 2 years
Provided by: poppler-utils_0.24.5-2ubuntu4_amd64. Docs: manpages.ubuntu.com/manpages/trusty/man1/pdftocairo.1.html
-
Avatar about 2 years
Sidenote: To install the software on Ubuntu: sudo apt update then sudo apt install poppler-utils
-
Avatar about 2 years
If you want to resize the resulting PNG use e.g. -scale-to 300. This scales the longer side of each page to 300px. Parameter -r is "kind of how blocky it will look, and -scale-to is how big the overall image will be (on one side)." askubuntu.com/a/1179820/238253
-
Avatar about 2 years
See pdftoppm docs/manual: systutorials.com/docs/linux/man/1-pdftoppm
-
SurpriseDog about 2 years
Thanks! That preserved the fonts, unlike inkscape. Afterwards I used convert -trim to get rid of whitespace because -cropbox didn't work for me.