Converting HTML files to PDF
Solution 1
The Flying Saucer XHTML renderer project has support for outputting XHTML to PDF. Have a look at an example here.
Solution 2
Did you try WKHTMLTOPDF?
It's a simple command-line utility built on WebKit, an open source rendering engine. Both are free.
We've put together a small tutorial here
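Since the question is about doing this from Java, here is a minimal sketch of calling the utility through `ProcessBuilder`. It assumes the binary is installed and reachable on the PATH as `wkhtmltopdf`; the class and method names (`WkhtmltopdfRunner`, `buildCommand`, `convert`) are made up for illustration. It relies only on the basic `wkhtmltopdf <input> <output>` invocation.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class WkhtmltopdfRunner {

    // Builds the argument list for: wkhtmltopdf <input> <output>
    // Assumption: the binary is on the PATH under the name "wkhtmltopdf".
    public static List<String> buildCommand(String input, Path outputPdf) {
        return Arrays.asList("wkhtmltopdf", input, outputPdf.toString());
    }

    // Starts the external process, waits for it to finish and returns its
    // exit code (0 conventionally means success).
    public static int convert(String input, Path outputPdf)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(input, outputPdf))
                .inheritIO() // forward the tool's console output to ours
                .start();
        return process.waitFor();
    }
}
```

A call such as `WkhtmltopdfRunner.convert("page.html", Paths.get("page.pdf"))` would then produce the PDF, where `input` may be a local file or a URL. Each call spawns a separate OS process, so on a busy server it is probably worth limiting how many run at once.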
EDIT (2017):
If I were to build something today, I wouldn't go that route anymore,
but would use http://pdfkit.org/ instead,
probably stripping it of all its Node.js dependencies so it can run in the browser.
Solution 3
Check out iText; it is a pure Java PDF toolkit which has support for reading data from HTML. I used it recently in a project when I needed to pull content from our CMS and export as PDF files, and it was all rather straightforward. The support for CSS and style tags is pretty limited, but it does render tables without any problems (I never managed to set column width though).
Creating a PDF from HTML goes something like this:
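// Assumes `out` is an OutputStream and `html` is a String holding the markup.
Document doc = new Document(PageSize.A4);
PdfWriter.getInstance(doc, out);    // bind the document to the output stream
doc.open();
HTMLWorker hw = new HTMLWorker(doc);
hw.parse(new StringReader(html));   // parse the HTML and add it to the document
doc.close();                        // finish the PDF and flush it to `out`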
Solution 4
If you have the funding, nothing beats Prince XML, as this video shows.
Solution 5
Is there maybe a way to grab the rendered page from the Internet Explorer rendering engine and send it to a PDF-Printer tool automatically?
This is how ActivePDF works, which is good because it means you know what you'll get, and it actually has reasonable styling support.
It is also one of the few packages I found (when looking a few years back) that actually supports the various page-break CSS properties.
Unfortunately, the ActivePDF software is very frustrating: since it has to launch the IE browser in the background for conversions, it can be quite slow, and it is not particularly stable either.
There is a new version currently in beta which is supposed to be much better, but I've not actually had a chance to try it out, so I don't know how much of an improvement it is.
Michael C
Updated on December 01, 2020
Comments
-
Michael C over 3 years
I am trying to put a toolbar in the custom table, after the code below. Whenever I add it, I get an error that the resource is not available. Any suggestions are appreciated. I have implemented the example from https://github.com/bhardwaj-rahul/Copy-ctrl-c-From-Excel-To-Table-SAPUI5/commit/1ef4521dda976ef92b65774beaeca00e2129a5ba, which copies and pastes from Excel to a table.
<c:CopyPasteTable id="tableId" items="{/Data}" class="sapUiSizeCompact">
  <headerToolbar>
    <OverflowToolbar>
      <Button text="{i18n>btnTxtPrintCountSheet}" type="Emphasized" icon="sap-icon://print" iconFirst="true" enabled="true" visible="true" iconDensityAware="false" class="sapUiTinyMargin"/>
      <Button text=" " type="Emphasized" icon="sap-icon://add" iconFirst="true" width="auto" enabled="true" visible="true" press="onAddPress" iconDensityAware="false" class="sapUiTinyMargin"/>
    </OverflowToolbar>
  </headerToolbar>
-
Boghyon Hoffmann over 3 years
You should be getting an error that the framework "Cannot add direct child without default aggregation defined for control …". If that's the issue (which was confirmed by your comment), consider marking this question as a duplicate of stackoverflow.com/q/59654209/5846045.
-
-
panschk about 15 years
Thanks for the helpful answer. I don't think ActivePDF is really suitable because of the price, but it's good to know something like that exists.
-
MGOwen over 14 years
For a straight HTML-page-to-PDF conversion, this is better than anything else I've seen, free or commercial.
-
Julie over 13 years
If you're looking for a cheaper alternative to Prince, try DocRaptor.com. It uses Prince as the engine.
-
Eran Medan about 13 years
It's AGPL, which seems even worse than GPL: you need to be open source even if you just serve the PDF and iText only runs server-side.
-
Eran Medan about 13 years
Does it work on a non-Mac OS?
-
Mic about 13 years
@Eran, we use it on Linux. I think there's a Windows version too.
-
mP. about 13 years
This doesn't sound like a very scalable solution if one needs to convert pages to PDF on the fly, in parallel. If a few requests come through that each trigger a conversion using FF, your server will have lost a few gigabytes of memory just to serve a few converted pages. This would open your server to a DoS attack.
-
Nowaker about 13 years
@Eran, just use the last non-AGPL version (com.lowagie:itext:2.1.7 in Maven).
-
Viccari about 12 years
@Mic Yes, there is a Windows version too.
-
David Hofmann over 11 years
The real problem with Flying Saucer is that it uses iText to render PDF, which is an AGPL v3 licensed lib.
-
Gary - Stand with Ukraine about 11 years
The version of iText used by Flying Saucer is 2.0.8, which was available under the LGPL. Only versions 5 and above are under the more restrictive license. stackoverflow.com/questions/2692000/…
-
user1914292 about 11 years
And if you want something cheaper but with more options, try htm2pdf.co.uk. It uses WebKit and gives real WYSIWYG output.
-
Lucas Meijer almost 11 years
Tested on Windows XP (version 0.9.9) and it works very well. Also, it does not require admin rights on the machine to install.
-
SteveT almost 11 years
I'd say the real problem with Flying Saucer is that it requires a well-formed and valid XML document. It's easy to unwittingly break the PDF rendering by including something like a bare ampersand in your HTML, or some JavaScript code that makes your rendered HTML not strict XHTML. This can be mitigated, though, with automated tests or some process that involves XML validation.
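The kind of automated check mentioned in this comment could look roughly like the sketch below, which uses only the JDK's built-in XML parser. The class name `XhtmlCheck` is hypothetical. It tests well-formedness only, not validity against the XHTML DTD, and it assumes input without a DOCTYPE (with one, the default parser may try to load the referenced DTD).

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class XhtmlCheck {

    // Returns true if the markup parses as well-formed XML, false otherwise.
    public static boolean isWellFormed(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Not well-formed, e.g. a bare '&' or an unclosed tag.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Could not run the XML parser", e);
        }
    }
}
```

Running such a check in a unit test, or right before handing the markup to the renderer, should catch the bare-ampersand case early instead of at PDF generation time.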
-
nafg over 10 years
Better but similar: github.com/ariya/phantomjs/wiki/Screen-Capture (according to we-love-php.blogspot.com/2012/12/… the PDF has real text, not rasterized).
-
Pino over 10 years
HTMLWorker is deprecated in newer versions of iText in favor of XMLWorker; however, CSS support is poor in both cases (see demo.itextsupport.com/xmlworker/itextdoc/…) and was not adequate for my needs. Flying Saucer, by contrast, was perfect.
-
David Hofmann over 9 years
Why can't we use a real browser for that instead of a fork of the (now unmaintained) rendering engine? See stackoverflow.com/q/25574082/39998
-
Mic over 9 years
@DavidHofmann, probably because this question dates back to 2009. When I last checked a few months ago, there was still no comparable solution in JS.
-
IcedDante over 9 years
How would this work in a threaded enterprise environment that would be generating several hundred PDF files a minute?
-
Mic over 9 years
@IcedDante, what makes you think there would be a problem?
-
IcedDante over 9 years
I guess what I am wondering is whether this shell utility creates its own memory space for each invocation, or whether it operates like a utility in headless mode where each thread would be using a shared resource.
-
Mic over 9 years
@IcedDante, we have a PDF load similar to yours, but we queue the conversions in a background job to preserve server performance, and run them one by one. However, if I remember correctly, we ran some tests in the beginning and there were no collisions on concurrent calls.
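The "queue them and run them one by one" approach can be sketched in Java with a single-threaded executor. This is only an illustration of the idea, not the commenter's actual setup; `ConversionQueue` is a hypothetical name, and each submitted job stands in for whatever code performs one conversion.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ConversionQueue implements AutoCloseable {

    // One worker thread: queued jobs run strictly one after another,
    // in submission order, so at most one conversion is active at a time.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Enqueues a conversion job; the caller can wait on the returned Future.
    public <T> Future<T> submit(Callable<T> job) {
        return worker.submit(job);
    }

    // Stops accepting new jobs and waits up to a minute for queued ones.
    @Override
    public void close() throws InterruptedException {
        worker.shutdown();
        worker.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

Request threads would call `submit(...)` and either block on the `Future` or return immediately and deliver the PDF later. The trade-off is latency under load in exchange for a bounded memory and CPU footprint.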
-
Jossef Harush Kadouri over 8 years
I love you for this reference. Great utility.
-
Gray about 8 years
How is this a Java solution? This is a Windows print driver.
-
PhiLho about 8 years
The OP explicitly mentioned Windows. And I suppose there are similar drivers for other systems. The OP only mentioned Java as a possible solution...
-
Vova Rozhkov over 7 years
You may use the LGPL version, which can be found at github.com/albfernandez/itext2
-
user1474090 over 7 years
GrabzIt's HTML to PDF API: grabz.it/html-to-pdf-image-api.aspx. It works in the same way: it renders the HTML in a browser and then creates the PDF, which makes for much more accurate PDF conversions.
-
Cardinal System over 5 years
It's JavaScript, not Java....
-
Mic over 5 years
@CardinalSystem it's neither JS nor Java, just a command-line tool over the WKHTMLTOPDF library, written in C.
-
kommradHomer over 4 years
For many simple cases, I still recommend using a wkhtmltopdf binary.
-
Kenny Cason over 3 years
Can confirm wkhtmltopdf is a great tool, and easy to use. I've been using it for years and still use it frequently.
-
Emmanuel Bourg almost 3 years
HTMLWorker supports only very simple HTML documents, with basic elements and no CSS. It is too limited to be useful. But the more recent iText html2pdf works really well: kb.itextpdf.com/home/it7kb/ebooks/…
-
Daniel almost 3 years
From Java, you can use github.com/wooio/htmltopdf-java, which is a wrapper around wkhtmltopdf.
-
ayan ahmedov about 2 years
@Danielany may I ask whether you have any experience using it in a web server environment? I suspect it won't play nicely with a web server spawning a new process for each client request.
-
Mic about 2 years
@ayanahmedov, yes, we have been doing that for about 13 years now, on an Ubuntu server with nginx.