Using iText to convert HTML to PDF

Solution 1

I think this is exactly what you are looking for:

http://today.java.net/pub/a/today/2007/06/26/generating-pdfs-with-flying-saucer-and-itext.html

http://code.google.com/p/flying-saucer

Flying Saucer's primary purpose is to render spec-compliant XHTML and CSS 2.1 to the screen as a Swing component. Though it was originally intended for embedding markup into desktop applications (things like the iTunes Music Store), Flying Saucer has been extended to work with iText as well. This makes it very easy to render XHTML to PDFs, as well as to images and to the screen. Flying Saucer requires Java 1.4 or higher.
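
For reference, a minimal sketch of the XHTML-to-PDF path with Flying Saucer's `ITextRenderer`. It assumes the PDF module of Flying Saucer (published as `flying-saucer-pdf` in the releases I know of) is on the classpath and that the input is well-formed XHTML; the class and method names `XhtmlToPdf.render` are made up for this example.

```java
import java.io.OutputStream;
import org.xhtmlrenderer.pdf.ITextRenderer;

final class XhtmlToPdf {
    /** Renders a well-formed XHTML string as a PDF onto the given stream. */
    static void render(String xhtml, OutputStream out) throws Exception {
        ITextRenderer renderer = new ITextRenderer();
        renderer.setDocumentFromString(xhtml); // parsed as XML, so tags must be closed
        renderer.layout();                     // lay the document out into pages
        renderer.createPDF(out);               // write the PDF to the stream
    }
}
```

Calling `XhtmlToPdf.render("<html><body><p>Hello</p></body></html>", new FileOutputStream("out.pdf"))` should be enough for a first test. Ordinary HTML that is not valid XML (unclosed `<br>` tags and so on) has to be cleaned up first.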

Solution 2

Based on your comments above, I ended up using ABCpdf from WebSupergoo. It works really well, and for about $350 it has saved me hours and hours.

Solution 3

The easiest way of doing this is to use pdfHTML. It's an iText 7 add-on that converts HTML5 (+CSS3) into PDF syntax.

The code is pretty straightforward:

    HtmlConverter.convertToPdf(
        "<b>This text should be written in bold.</b>",       // html to be converted
        new PdfWriter(
            new File("C://users/mark/documents/output.pdf")  // destination file
        )
    );

To learn more, go to http://itextpdf.com/itext7/pdfHTML
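
Since the question is about a page at a URL rather than an inline string, here is a sketch of the same call fed from a stream. It assumes the html2pdf artifact is on the classpath and that the page is static HTML (pdfHTML does not execute JavaScript). The class `UrlToPdf` and its method names are made up for this example. The base URI is set so that relative links to images and stylesheets can be resolved.

```java
import com.itextpdf.html2pdf.ConverterProperties;
import com.itextpdf.html2pdf.HtmlConverter;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;

final class UrlToPdf {
    /** Converts HTML read from a stream; baseUri is used to resolve relative resources. */
    static void convert(InputStream html, String baseUri, OutputStream pdf) throws IOException {
        ConverterProperties props = new ConverterProperties();
        props.setBaseUri(baseUri);
        HtmlConverter.convertToPdf(html, pdf, props);
    }

    /** Downloads the page at pageUrl and converts it, using the URL itself as base URI. */
    static void convertUrl(String pageUrl, OutputStream pdf) throws IOException {
        try (InputStream html = URI.create(pageUrl).toURL().openStream()) {
            convert(html, pageUrl, pdf);
        }
    }
}
```

Pages that build their content with scripts will come out mostly empty this way, so this only suits server-rendered HTML.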

Solution 4

Use the iText library.

Here is sample code that works for me:

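    import com.itextpdf.html2pdf.ConverterProperties;
    import com.itextpdf.html2pdf.HtmlConverter;
    import com.itextpdf.html2pdf.resolver.font.DefaultFontProvider;
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.nio.charset.StandardCharsets;

    // filePath, document and url are defined elsewhere in the calling code.
    String htmlFilePath = filePath + ".html";
    String pdfFilePath = filePath + ".pdf";

    // Write the HTML to a UTF-8 file at the given path.
    try (Writer unicodeFileWriter = new OutputStreamWriter(
            new FileOutputStream(htmlFilePath), StandardCharsets.UTF_8)) {
        unicodeFileWriter.write(document.toString());
    }

    ConverterProperties properties = new ConverterProperties();
    properties.setCharset("UTF-8");
    // For Korean, Taiwanese, Chinese and Japanese sites, register system fonts
    // only (no standard PDF fonts, no fonts shipped with pdfHTML).
    if (url.contains(".kr") || url.contains(".tw") || url.contains(".cn") || url.contains(".jp")) {
        properties.setFontProvider(new DefaultFontProvider(false, false, true));
    }

    // Convert the HTML file to a PDF file.
    HtmlConverter.convertToPdf(new File(htmlFilePath), new File(pdfFilePath), properties);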

Maven dependencies:

    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>itext7-core</artifactId>
        <version>7.1.6</version>
        <type>pom</type>
    </dependency>

    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>html2pdf</artifactId>
        <version>2.1.3</version>
    </dependency>
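
One caveat about the `url.contains(".cn")` style of check in the sample: it matches anywhere in the string, so `https://www.cnn.com/` also counts as a `.cn` site. A stricter variant could compare only the last label of the host name. The sketch below uses only the JDK; the class `CjkHosts` and the method `isCjkHost` are hypothetical names, not part of iText.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class CjkHosts {
    // Country-code TLDs taken from the sample's condition.
    private static final Set<String> CJK_TLDS =
            new HashSet<>(Arrays.asList("kr", "tw", "cn", "jp"));

    /** Returns true if the URL's host ends in one of the TLDs above. */
    static boolean isCjkHost(String url) {
        String host;
        try {
            host = URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return false; // not a syntactically valid URI
        }
        if (host == null) {
            return false; // e.g. no scheme, so no host component was parsed
        }
        // Text after the last dot; the whole host if there is no dot.
        String tld = host.substring(host.lastIndexOf('.') + 1).toLowerCase(Locale.ROOT);
        return CJK_TLDS.contains(tld);
    }
}
```

The condition in the sample would then read `if (CjkHosts.isCjkHost(url)) { ... }`. Note that this is still only a heuristic: plenty of CJK content lives on `.com` domains.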

Solution 5

The answer to your question is actually two-fold. First of all you need to specify what you intend to do with the rendered HTML: save it to a new PDF file, or use it within another rendering context (i.e. add it to some other document you are generating).

The former is relatively easily accomplished using the Flying Saucer framework, which can be found here: https://github.com/flyingsaucerproject/flyingsaucer

The latter is actually a much more comprehensive problem that needs to be categorized further. Using iText you won't be able to (trivially, at least) combine iText elements (e.g. Paragraph, Phrase, Chunk and so on) with the generated HTML. You can hack your way out of this by using PdfContentByte's addTemplate method and rendering the HTML into that template.

If, on the other hand, you want to stamp the generated HTML with something like watermarks, dates or the like, you can do this using iText.

So, bottom line: you can't trivially integrate the rendered HTML into other PDF-generating contexts, but you can render HTML directly to a blank PDF document.
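
A comment further down this page points out that pdfHTML can turn HTML into layout elements, which relaxes the limitation described here for iText 7. The comment calls the method `renderElements`; in the pdfHTML releases I know, the entry point is `HtmlConverter.convertToElements`, so check the name against your version. A sketch, assuming iText 7 core and html2pdf on the classpath (the class `HtmlIntoLayout` is a made-up name):

```java
import com.itextpdf.html2pdf.HtmlConverter;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfWriter;
import com.itextpdf.layout.Document;
import com.itextpdf.layout.element.IBlockElement;
import com.itextpdf.layout.element.IElement;
import com.itextpdf.layout.element.Image;
import com.itextpdf.layout.element.Paragraph;
import java.io.IOException;
import java.io.OutputStream;

final class HtmlIntoLayout {
    /** Writes a PDF that mixes a hand-built Paragraph with elements parsed from HTML. */
    static void write(String html, OutputStream out) throws IOException {
        Document document = new Document(new PdfDocument(new PdfWriter(out)));
        document.add(new Paragraph("Heading built with the layout API"));
        for (IElement element : HtmlConverter.convertToElements(html)) {
            if (element instanceof IBlockElement) {
                document.add((IBlockElement) element); // paragraphs, tables, divs, lists
            } else if (element instanceof Image) {
                document.add((Image) element);         // top-level images
            }
        }
        document.close(); // also closes the underlying PdfDocument
    }
}
```

The converted elements are ordinary layout objects, so they can also be placed inside a `Cell` or `Div` that you build yourself.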

Author: diegol

Updated on September 21, 2020

Comments

  • diegol
    diegol almost 4 years

    Does anyone know if it is possible to convert an HTML page (URL) to a PDF using iText?

    If the answer is 'no' then that is OK as well, since I will stop wasting my time trying to work it out and just spend some money on one of the components which I know can :)

    Thanks in advance for your responses!

  • Alex Stoddard
    Alex Stoddard over 13 years
    Link to flying saucer (xhtmlrenderer) should now be: code.google.com/p/flying-saucer
  • user584397
    user584397 about 11 years
    Does anybody know how to improve the image quality in the generated PDF files?
  • ug_
    ug_ over 10 years
    @user584397 use a larger picture and scale it down; the image is embedded in the PDF.
  • Amedee Van Gasse
    Amedee Van Gasse almost 7 years
    HTMLWorker is deprecated. Its successor, XMLWorker, is being sunset. The current state of the art is iText 7 + pdfHTML.
  • Joris Schellekens
    Joris Schellekens over 6 years
    With iText pdfHTML, there is actually a method renderElements which does exactly what you claim is impossible. It renders HTML syntax to iText element blocks like Paragraph, Table, etc.