Finding Image resolution in PDF file?

Solution 1

slhck's answer and scruss' comment deserve an update: pdfimages now (at least since version 0.26.5) explicitly lists x-ppi and y-ppi. Here is a sample output:

$ pdfimages -list example.pdf 
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image    2244  2244  cmyk    4   8  image  no       215  0   301   301  418K 2.1%
   2     1 image     900   600  rgb     3   8  image  no       324  0  1524  1525 35.5K 2.2%

On Debian (Wheezy) and Fedora (23), pdfimages is part of the poppler-utils package.
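To spot unnecessarily high-resolution scans, the listing can be filtered with awk. This is a minimal sketch, not part of pdfimages: the function name high_ppi_images is made up for illustration, and the column positions (x-ppi in field 13, y-ppi in field 14, two header lines) are assumed from the sample output above and may differ in other versions.

```shell
# Hypothetical helper: read `pdfimages -list` output on stdin and print the
# images whose x-ppi or y-ppi exceeds a limit (default 300).
# Assumes two header lines and x-ppi/y-ppi in fields 13 and 14.
high_ppi_images() {
    limit="${1:-300}"
    awk -v limit="$limit" 'NR > 2 && ($13 + 0 > limit + 0 || $14 + 0 > limit + 0) {
        printf "page %s image %s: %sx%s ppi\n", $1, $2, $13, $14
    }'
}

# Intended use:
#   pdfimages -list example.pdf | high_ppi_images 600
```

With the sample listing above and a limit of 600, only the second image (1524x1525 ppi) would be reported.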

Solution 2

I know that you don't want to extract the image data, but this is probably the only way to find out the original resolution.


On *nix, if you have ImageMagick's identify and Xpdf installed [1]:

pdfimages -j test.pdf test && for file in $(find . -name "test*.jpg"); do identify "$file"; done

Where test.pdf is your input PDF. The output files are written to test-000.jpg, test-001.jpg, and so on. This gives you the original pixel dimensions of all the images contained in that PDF [2].

Example output for a PDF file that only contains one big image:

./test-000.jpg JPEG 2500x1961 2500x1961+0+0 8-bit DirectClass 1.022MB 0.000u 0:00.000

1) Windows has these too, but the script would be different of course.
2) Note that images don't really carry DPI information. Simply speaking: That's just something used for printing and images don't need an inherent measure of DPI.


What is the optimum resolution for converting a text file into an image PDF: 96dpi, 300dpi or more?

Generally, anything you want to print should be 300dpi or more. Most printers will handle a higher resolution too.

Solution 3

For some reason, the latest version of pdfimages that I can upgrade to on my CentOS system is 3.04.

So I don't have the -list option mentioned in the previous answers. However, the test image that pdfimages creates following slhck's answer contains the desired information!

identify -verbose test-0000.jpg | more

Image: test-0000.jpg  
Format: JPEG (Joint Photographic Experts Group JFIF format)  
Mime type: image/jpeg  
Class: DirectClass  
Geometry: 6600x5100+0+0  
Resolution: 600x600  
Print size: 11x8.5

So the DPI is shown explicitly on the 6th line of the output when identify is run with the -verbose option.

So, slhck's answer can be modified to the following.

pdfimages -j test.pdf test && for file in $(find . -name "test*.jpg"); do identify -verbose "$file" | awk 'NR==6'; done
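Picking line 6 depends on the exact layout of identify's output, which may not be the same in every ImageMagick version. A slightly more robust sketch matches the "Resolution:" label instead. The function name resolution_of is made up for illustration; it only assumes the label appears as the first word of a line, as in the output above.

```shell
# Hypothetical helper: print the value of the "Resolution:" line from
# `identify -verbose` output read on stdin, wherever that line appears.
resolution_of() {
    awk '$1 == "Resolution:" { print $2 }'
}

# Intended use (assumes pdfimages already wrote the test*.jpg files):
#   for file in test*.jpg; do identify -verbose "$file" | resolution_of; done
```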

On another note, I tried running

identify -verbose test.pdf

Format: PDF (Portable Document Format)  
Mime type: application/pdf  
Class: DirectClass  
Geometry: 792x612+0+0  
Resolution: 72x72  
Print size: 11x8.5  

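ImageMagick rasterizes a PDF at its default density of 72 dpi unless told otherwise (for example with -density), so the resolution printed here describes that rendering, not the embedded image, and is misleading for this purpose.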
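The two identify outputs can still be combined. At 72 dpi the 792x612 geometry reported for the PDF equals the page size in points (1/72 inch), so for a full-page scan the image's pixel count times 72, divided by the page size in points, gives the effective resolution. A minimal sketch, with the function name ppi_from_points made up for illustration and the full-page assumption stated loudly:

```shell
# Hypothetical helper: effective ppi of a full-page image, given its pixel
# count along one axis and the page length along the same axis in points.
# Only meaningful when the image covers the whole page.
ppi_from_points() {
    awk -v px="$1" -v pts="$2" 'BEGIN { printf "%.0f\n", px * 72 / pts }'
}

ppi_from_points 6600 792   # image width in pixels, page width in points
ppi_from_points 5100 612   # image height in pixels, page height in points
```

For the numbers in this answer both calls print 600, matching the Resolution line of the extracted JPEG.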

Solution 4

This worked with a PDF generated by a Kyocera MFP. It is probably only valid for full-page images like scans.

  1. Open the PDF with Reader.
  2. Go to File > Properties, Description tab, Page Size. My example said 8.5 x 11.0 in.
  3. Open the PDF with a text editor (Notepad) and look for /Width and /Height.
  4. Divide the image width and height (in pixels) by the page width and height (in inches).

Example:

5100/8.5=600
6600/11.0=600

My PDF was scanned at a 600x600 resolution.

You can skip the first two steps if you know the page size (for example, A4 is 8.27 x 11.69 in).
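The division in the example can be scripted. This is only a sketch of the arithmetic; the function name scan_dpi is made up for illustration, and the pixel and inch values still have to be read from the PDF by hand as described above.

```shell
# Hypothetical helper: pixels divided by inches, rounded to a whole number.
scan_dpi() {
    awk -v px="$1" -v inches="$2" 'BEGIN { printf "%.0f\n", px / inches }'
}

scan_dpi 5100 8.5    # width:  5100 px across 8.5 in
scan_dpi 6600 11.0   # height: 6600 px across 11.0 in
```

Both calls print 600 for the example values.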

Solution 5

A PDF file doesn't have an inherent resolution. Each raster image within it (if any) has its own resolution. I don't know of a simple way to extract a single number for the median or modal resolution of the embedded image XObjects.
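One rough way to get such a number is to take the median of the x-ppi column printed by a poppler build of pdfimages with the -list option. This is a sketch under assumptions, not an established method: the function name median_ppi is made up, and it assumes two header lines and x-ppi in field 13, as in the sample listing in Solution 1.

```shell
# Hypothetical helper: median of the x-ppi column of `pdfimages -list`
# output read on stdin. Assumes two header lines and x-ppi in field 13.
median_ppi() {
    awk 'NR > 2 { print $13 + 0 }' | sort -n | awk '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1                       # no images listed
            if (NR % 2) print v[(NR + 1) / 2]         # odd count: middle value
            else print (v[NR / 2] + v[NR / 2 + 1]) / 2 # even count: mean of the two middle values
        }'
}

# Intended use:
#   pdfimages -list example.pdf | median_ppi
```

The result is only as reliable as the ppi values pdfimages reports, and it treats every listed row the same regardless of image size or type.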

Author by

hk_

Updated on September 18, 2022

Comments

  • hk_
    hk_ over 1 year

    I have a problem with some users creating very large PDFs. On the other hand, I have PDFs sent from our fax machines that are really small in size and perfectly printable. My questions are:

    • Is there any way I can find the resolution (DPI) of the PDF? I searched the internet and could not find any answer. I checked the properties of the file, but this information was not stored there, at least in my case.
    • What is the optimum resolution for converting a text file into an image PDF: 96dpi, 300dpi or more?
    • Fun question: can I resize a PDF that was scanned at a high dpi to a lower dpi?

    I know some answers might not be available as I have already searched the internet and could not find answers.

    Note: My PDFs are entirely images (text converted to images). I am also familiar with PrimoPDF (free), which is something you can experiment with.

  • hk_
    hk_ over 12 years
    By the way, I am not interested in extracting image data from the PDF. I just want to know what the scan resolution was and, if it is unnecessarily high, avoid that in future.
  • user5249203
    user5249203 over 12 years
    @Dave: Actually I meant extract the information about the embedded images not extract the image. But slhck's answer may solve your problem.
  • scruss
    scruss over 10 years
    A version of pdfimages (perhaps more recent than the original question) from the poppler project adds the -list option: pdfimages -list test.pdf. Rather than outputting files, this lists size and image type. Still doesn't explicitly give you resolution, but avoids creating output files.
  • Skippy le Grand Gourou
    Skippy le Grand Gourou over 7 years
    @scruss As of version 0.34.0, pdfimages -list explicitly provides x-ppi and y-ppi, as well as a lot of other information.
  • scruss
    scruss over 7 years
    Indeed it now does, @SkippyleGrandGourou : about five years after the question was asked. pdfimages still doesn't apply that resolution/size to images it extracts, though.
  • Skippy le Grand Gourou
    Skippy le Grand Gourou over 7 years
    @scruss Actually, it seems that the resolution given by pdfimages can be quite off (e.g. when the image is larger than its visible area, in a PDF produced by scribus). (Unfortunately I really don't have time to file a bug report now.)
  • theonlygusti
    theonlygusti over 5 years
    Mine are all empty