PDF to image using Java

Solution 1

You will need a PDF renderer. There are a few more or less good ones available (ICEpdf, pdfrenderer); without one, you will have to rely on external tools. The free PDF renderers also cannot render embedded fonts, so they are only good for creating thumbnails (which may be all you want).

My favorite external tool is Ghostscript, which can convert PDFs to images with a single command line invocation.

The following batch script converts PostScript (and PDF?) files to BMP for us. Treat it as a guide to adapt to your needs (note that gs needs the environment variables in order to work!):

    pushd
    setlocal

    rem Installation directories (adjust to your own setup)
    Set BIN_DIR=C:\Program Files\IKOffice_ACME\bin
    Set GS=C:\Program Files\IKOffice_ACME\gs
    rem Environment variables that Ghostscript needs in order to run
    Set GS_DLL=%GS%\gs8.54\bin\gsdll32.dll
    Set GS_LIB=%GS%\gs8.54\lib;%GS%\gs8.54\Resource;%GS%\fonts
    Set Path=%Path%;%GS%\gs8.54\bin
    Set Path=%Path%;%GS%\gs8.54\lib

    rem %1 = input file, %2 = output file (monochrome BMP at 600x600 dpi)
    call "%GS%\gs8.54\bin\gswin32c.exe" -q -dSAFER -dNOPAUSE -dBATCH -sDEVICE#bmpmono -r600x600 -sOutputFile#%2 -f %1

    endlocal
    popd
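
If you would rather start Ghostscript from Java than from a batch file, the same invocation can be sketched with `ProcessBuilder`. This is only a minimal sketch under assumptions: the class and method names (`GhostscriptCli`, `buildCommand`, `run`) are made up for illustration, the executable name `gs` is a placeholder (on Windows it would be something like `gswin32c.exe`), and it uses the `png16m` device instead of `bmpmono` to get colour PNG output. The `%03d` in the output pattern asks Ghostscript to write one numbered file per page.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

/** Sketch: call the Ghostscript command-line tool from Java (names are illustrative). */
final class GhostscriptCli {

    /**
     * Builds the argument list for rendering a PDF to 24-bit PNG files.
     * gsExecutable is an assumption, e.g. "gs" or the full path to gswin32c.exe.
     */
    static List<String> buildCommand(String gsExecutable, Path pdf, String outputPattern, int dpi) {
        return List.of(
                gsExecutable,
                "-q", "-dSAFER", "-dNOPAUSE", "-dBATCH",
                "-sDEVICE=png16m",               // 24-bit colour PNG output device
                "-r" + dpi,                      // resolution in DPI
                "-sOutputFile=" + outputPattern, // e.g. page-%03d.png, one file per page
                pdf.toString());
    }

    /** Runs the command, forwards Ghostscript's output to the console, returns the exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO()
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: GhostscriptCli <input.pdf>");
            return;
        }
        List<String> command = buildCommand("gs", Path.of(args[0]), "page-%03d.png", 300);
        System.exit(run(command));
    }
}
```

Because `ProcessBuilder` passes the arguments directly to the process without going through a shell, the `%` in the output pattern does not need the escaping it would need inside a batch file.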

UPDATE: PDFBox is now able to render embedded fonts, so there is no need for Ghostscript anymore.

Solution 2

Since version 0.4.0, the Ghost4J library (http://ghost4j.sourceforge.net) provides a SimpleRenderer that does the job in a few lines of code:

  1. Load the PDF file (for a PS file, use the PSDocument class instead):

        PDFDocument document = new PDFDocument();
        document.load(new File("input.pdf"));
    
  2. Create the renderer

        SimpleRenderer renderer = new SimpleRenderer();
    
        // set resolution (in DPI)
        renderer.setResolution(300);
    
  3. Render

        List<Image> images = renderer.render(document);
    

You can then do whatever you want with the image objects. For example, you can write them as PNG files like this:

            for (int i = 0; i < images.size(); i++) {
                ImageIO.write((RenderedImage) images.get(i), "png", new File((i + 1) + ".png"));
            }
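
Since the question also asks about JPEG and GIF, the loop above can be generalised a little. The following is a rough sketch using only the JDK's `ImageIO`; the class name `PageImageWriter` and its method `writeAll` are made up for illustration and are not part of Ghost4J. It checks the boolean returned by `ImageIO.write`, which is `false` when no suitable writer is found, instead of silently ignoring it.

```java
import java.awt.image.RenderedImage;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

/** Sketch: write rendered pages as numbered image files (names are illustrative). */
final class PageImageWriter {

    /**
     * Writes each page as <dir>/<n>.<format>, numbering pages from 1,
     * and returns the files that were written.
     */
    static List<File> writeAll(List<? extends RenderedImage> pages, File dir, String format)
            throws IOException {
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new IOException("Cannot create output directory " + dir);
        }
        List<File> written = new ArrayList<>();
        for (int i = 0; i < pages.size(); i++) {
            File target = new File(dir, (i + 1) + "." + format);
            // ImageIO.write returns false when no appropriate writer is found
            if (!ImageIO.write(pages.get(i), format, target)) {
                throw new IOException("No ImageIO writer for format '" + format + "'");
            }
            written.add(target);
        }
        return written;
    }
}
```

A call could look like `PageImageWriter.writeAll(pages, new File("out"), "png")`. Note that the renderer above returns a `List<Image>`, so the elements would first have to be cast to `RenderedImage`, as in the original loop.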

Note: Ghost4J uses the native Ghostscript C API, so you need to have Ghostscript installed on your machine.

I hope it will help you :)
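
Regarding the note about the native Ghostscript installation: if loading fails with an `UnsatisfiedLinkError`, a small diagnostic like the one below may help to see whether the library file is on any search path at all. This is a sketch under assumptions: the class `NativeLibraryLocator` is made up for illustration, the file name `gsdll32.dll` is the 32-bit Windows name (it differs on other platforms), and it assumes the library is loaded through JNA, which consults the `jna.library.path` system property in addition to the operating system's own search path.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.util.Optional;

/** Sketch: look for a native library file on a search path (names are illustrative). */
final class NativeLibraryLocator {

    /** Searches every directory in pathList (separated by File.pathSeparator) for fileName. */
    static Optional<Path> find(String pathList, String fileName) {
        if (pathList == null || pathList.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : pathList.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            try {
                Path candidate = Path.of(dir, fileName);
                if (Files.isRegularFile(candidate)) {
                    return Optional.of(candidate);
                }
            } catch (InvalidPathException ignored) {
                // skip malformed entries instead of aborting the search
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumption: the default file name is the 32-bit Windows Ghostscript DLL
        String fileName = args.length > 0 ? args[0] : "gsdll32.dll";
        for (String property : new String[] {"jna.library.path", "java.library.path"}) {
            Optional<Path> hit = find(System.getProperty(property), fileName);
            System.out.println(property + ": " + hit.map(Path::toString).orElse("not found"));
        }
    }
}
```

If the file is not found anywhere, adding the Ghostscript `bin` directory to the system path, or passing it with `-Djna.library.path=...` when starting the JVM, is worth trying.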

Solution 3

Apache PDFBox can convert PDFs to JPG, BMP, WBMP, PNG, and GIF.

The library even comes with a command-line utility called PDFToImage to do this.

If you download the source code and look at the PDFToImage class, you should be able to figure out how to use PDFBox to convert PDFs to images from your own Java code.

Solution 4

Take a look at the articles:

1) PdftoImage-Convert PDF to Image by using PdfRenderer library, direct link to source code
2) Java: Generating PDF and Previewing it as an Image – iText and PDF Renderer

Solution 5

jPDFImages is not free; it is a commercial library that converts PDF pages to images in JPEG, TIFF, or PNG format. The output image size is customizable.

Author by

yohan.jayarathna

Updated on January 29, 2020

Comments

  • yohan.jayarathna
    yohan.jayarathna over 4 years

    I want to convert PDF pages into images (PNG, JPEG/JPG or GIF). I want them in full-page size.

    How can this be done using Java? What libraries are available for achieving this?

  • yohan.jayarathna
    yohan.jayarathna over 13 years
    Hi Daniel, thank you for the quick reply. Can I automate Ghostscript using Java? If so, how can I do it? Where can I find a very good Ghostscript tutorial? Thanks again!
  • anergy
    anergy over 13 years
    Maybe have a look at Ghost4J: ghost4j.sourceforge.net/coreapisamples.html
  • mtraut
    mtraut over 13 years
    It's not quite right that "the free renderers can't render embedded fonts" - at least jPodRenderer does so...
  • Daniel
    Daniel over 13 years
    @mtraut: jPodRenderer: Commercial licensing is available for a very moderate flat fee per developer seat of 4.900€... very free :) It is just free for GPLed projects.
  • mtraut
    mtraut over 13 years
    @daniel maybe i'm not up to date - but GPL is still one of the most common free licenses. I simply don't get it why it seems to be silly that commercial use costs money. And this free version is not a crippled subset...
  • Daniel
    Daniel over 13 years
    I know they want to make money, and it is their right to do so, but if it is not LGPL I cannot use it, so it's not really free, like Apache or the like.
  • yohan.jayarathna
    yohan.jayarathna over 13 years
    Hey I am getting an error saying "Exception in thread "main" java.lang.UnsatisfiedLinkError: Unable to load library 'gsdll32': The specified module could not be found." I have already installed Ghostscript latest version. Please help :(
  • zippy1978
    zippy1978 about 13 years
    This means that the Ghostscript library was not found... On which OS are you working? Make sure the .dll / .so is on the system library path.
  • Leigh
    Leigh about 12 years
    Are you affiliated with that product? Please be sure to read the faq's on promotion stackoverflow.com/faq#promotion
  • WelcomeTo
    WelcomeTo over 11 years
    Simply installing Ghostscript did not work for me. I resolved this by dropping gsdll32.dll into the Eclipse project folder.
  • Don Cheadle
    Don Cheadle about 9 years
    it's somewhat inconsistent for images. If there is a "ColorPattern" (not an image but similar.. confusing) in the source PDF, it will not be copied over to the destination image. stackoverflow.com/questions/28589477/…
  • Don Cheadle
    Don Cheadle about 9 years
    but there may be improvements in PDFBox's 2.x release! (hoping)
  • Don Cheadle
    Don Cheadle about 9 years
    is Ghost4J reliable in multi-threaded environments? I felt the documentation was vague
  • Wyetro
    Wyetro over 8 years
    For anyone having an issue, check this out: stackoverflow.com/questions/31996746/…
  • Zaid Amir
    Zaid Amir over 8 years
    I know this is old but I just want to say that suggesting Ghostscript as a free commercial friendly library is incorrect. Ghostscript is licensed under AGPL which is a more strict version of GPL
  • amdev
    amdev over 6 years
    Caused by: java.lang.ClassNotFoundException: com.lowagie.text.pdf.PdfTemplate
  • gordon613
    gordon613 over 6 years
    See pdfbox.apache.org/2.0/migration.html under PDF Rendering for details in how to do this in PDFBox 2.0.0
  • Daniel
    Daniel over 5 years
    @ZaidAmir: I know this has become even older, but Ghostscript can be used if it is used "at arm's length", and if the product is not a tool that mimics Ghostscript's behaviour or functionality. It is a bit problematic, OK, but I understood the license as allowing this kind of use.