Convert docx file into PDF with Java
There are a lot of ways to do the conversion. One commonly used method uses POI and docx4j:
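import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Collections;
import java.util.List;
import org.apache.log4j.Level;
import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;
import org.docx4j.convert.out.pdf.viaXSLFO.PdfSettings;
import org.docx4j.fonts.IdentityPlusMapper;
import org.docx4j.fonts.Mapper;
import org.docx4j.fonts.PhysicalFont;
import org.docx4j.fonts.PhysicalFonts;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

// Place the statements below inside a method declared "throws Exception".
WordprocessingMLPackage wordMLPackage;
try (InputStream is = new FileInputStream(new File("your Docx PAth"))) {
    wordMLPackage = WordprocessingMLPackage.load(is);
}
// Reads each section's page dimensions; the result is not used here.
List sections = wordMLPackage.getDocumentModel().getSections();
for (int i = 0; i < sections.size(); i++) {
    wordMLPackage.getDocumentModel().getSections().get(i).getPageDimensions();
}
// Map the document font "Algerian" to a physical font found on this machine.
Mapper fontMapper = new IdentityPlusMapper();
PhysicalFont font = PhysicalFonts.getPhysicalFonts().get("Comic Sans MS"); // set your desired font
if (font != null) { // the map lookup returns null when no such font was found
    fontMapper.getFontMappings().put("Algerian", font);
}
wordMLPackage.setFontMapper(fontMapper);
PdfSettings pdfSettings = new PdfSettings();
org.docx4j.convert.out.pdf.PdfConversion conversion =
        new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);
// To turn off logger
List<Logger> loggers = Collections.<Logger> list(LogManager.getCurrentLoggers());
loggers.add(LogManager.getRootLogger());
for (Logger logger : loggers) {
    logger.setLevel(Level.OFF);
}
// Write the PDF; try-with-resources closes the stream even if conversion fails.
try (OutputStream out = new FileOutputStream(new File("Your OutPut PDF path"))) {
    conversion.output(out, pdfSettings);
}
System.out.println("DONE!!");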
This works perfectly; I have tried it on multiple DOCX files.
Ferguson
Updated on June 13, 2022
Comments
-
Ferguson almost 2 years
I am looking for a "stable" method to convert DOCX files from MS Word into PDF. Until now I have used OpenOffice installed as a listener, but it often hangs. The problem is that we have situations where many users want to convert SXW and DOCX files into PDF at the same time. Is there some other possibility? I tried the examples from this site: https://angelozerr.wordpress.com/2012/12/06/how-to-convert-docxodt-to-pdfhtml-with-java/ but the output is not good (the converted documents have errors and the layout is noticeably changed).
Here is the "source" docx document:
Here is the document converted with docx4j, with some exception text inside the document. The text in the upper right corner is also missing.
This one is the PDF created with OpenOffice as the converter from docx to PDF. Some text is missing in the upper right corner.
Is there some other option to convert docx into pdf with Java?
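For the hanging problem described above, one possible workaround is to cap how many conversions run at once and to kill any conversion that exceeds a time limit. The following is only a minimal sketch under assumptions that are not in the question: it starts a separate LibreOffice process per file through the `soffice --headless --convert-to pdf` command line (instead of the listener setup), it assumes `soffice` is on the PATH, and `BoundedOfficeConverter` is a hypothetical helper name. It needs Java 9 or later.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of any library mentioned in this thread.
final class BoundedOfficeConverter {
    private final Semaphore slots;      // caps simultaneous conversions
    private final long timeoutSeconds;  // after this, the process is treated as hung

    BoundedOfficeConverter(int maxParallel, long timeoutSeconds) {
        this.slots = new Semaphore(maxParallel);
        this.timeoutSeconds = timeoutSeconds;
    }

    // Builds the command line; assumes "soffice" is on the PATH.
    static List<String> command(String inputFile, String outputDir) {
        return List.of("soffice", "--headless", "--convert-to", "pdf",
                "--outdir", outputDir, inputFile);
    }

    // Returns true only if the converter exited with code 0 within the timeout.
    boolean convert(String inputFile, String outputDir)
            throws IOException, InterruptedException {
        slots.acquire(); // blocks while maxParallel conversions are already running
        try {
            Process process = new ProcessBuilder(command(inputFile, outputDir))
                    .redirectErrorStream(true)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .start();
            if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                process.destroyForcibly(); // kill a hung conversion
                return false;
            }
            return process.exitValue() == 0;
        } finally {
            slots.release();
        }
    }
}
```

A caller could share one instance between all request threads, for example `new BoundedOfficeConverter(2, 60).convert("in.docx", "out")`. This does not address layout fidelity; it only keeps one stuck conversion from blocking the others.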
-
Stefan Hegny over 7 years
This is off-topic on SO, since you are asking "to recommend a tool or library". But why not just try to get your OpenOffice setup stable?
-
Davide over 7 years
You can use JODConverter (code.google.com/archive/p/jodconverter) or docx4j (docx4java.org/trac/docx4j).
-
Ferguson over 7 years
JODConverter uses OpenOffice in the background. The problem is that OpenOffice sometimes hangs (crashes) without any reason. I also tried docx4j (see my question).
-
JasonPlutext over 7 years
That's a 4-year-old article you reference there. These days, the recommended way to do it from docx4j is with Plutext's commercial PDF Converter. You can try that online at converter-eval.plutext.com
-
-
Ferguson over 7 years
I tried your method but still get an exception:
WARN org.apache.fop.image.loader.batik.PreloaderSVG .preloadImage line 76 - Batik not in class path
java.lang.NoClassDefFoundError: org/apache/batik/bridge/UserAgent
at org.apache.fop.image.loader.batik.PreloaderSVG.preloadImage(PreloaderSVG.java:69)
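A `NoClassDefFoundError` like the one above usually means a jar is missing from the runtime classpath. As a rough diagnostic, a sketch such as the following can report which classes are loadable before the conversion runs. `ClasspathCheck` is a hypothetical helper name; the two class names are taken from the stack trace above, and which jar provides them depends on your build.

```java
// Hypothetical diagnostic helper; uses only the JDK.
final class ClasspathCheck {

    // Returns true if the named class can be loaded, without initializing it.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names copied from the stack trace in the comment above.
        String[] required = {
            "org.apache.batik.bridge.UserAgent",
            "org.apache.fop.image.loader.batik.PreloaderSVG"
        };
        for (String name : required) {
            System.out.println(name + " -> " + (isPresent(name) ? "found" : "MISSING"));
        }
    }
}
```

Run it with the same classpath as the application; any line marked MISSING points at a dependency to add.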
-
KishanCS over 7 years
import org.apache.log4j.Level;
import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;
import org.docx4j.convert.out.pdf.viaXSLFO.PdfSettings;
import org.docx4j.fonts.IdentityPlusMapper;
import org.docx4j.fonts.Mapper;
import org.docx4j.fonts.PhysicalFont;
import org.docx4j.fonts.PhysicalFonts;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
-
Ferguson over 7 years
I still get the same malformed PDF as with docx4j. Here it is: s5.postimg.org/ptxrxtfyf/screenshot_1540.jpg
-
KishanCS over 7 years
//To turn off logger
List<Logger> loggers = Collections.<Logger> list(LogManager.getCurrentLoggers());
loggers.add(LogManager.getRootLogger());
for (Logger logger : loggers) {
    logger.setLevel(Level.OFF);
}
This turns off those messages.
-
Ferguson over 7 years
I will try to turn off the logging, but the text (upper right corner), footer, etc. are missing in the PDF document...
-
KishanCS over 7 years
Is it an originally created docx or a converted one? Please check.
-
KishanCS over 7 years
If possible, provide the docx file.
-
Ferguson over 7 years
It's a document created in MS Word (Office Professional 2013): s5.postimg.org/63a55ovlz/screenshot_1541.jpg If you can try it, here is my document: drive.google.com/file/d/0B6Z9wNTXyUEeOUtFRVhZeWtnZ3M/…
-
KishanCS over 7 years
Check all dependencies once and rebuild the project. It works like a charm! Thank you.
-
Ferguson over 7 years
Can you please send me a link with all the included libraries? I downloaded the libraries from this site: angelozerr.wordpress.com/2012/12/06/…
-
Ferguson over 7 years
Also, if I download the latest library from docx4java, I can't find the class org.docx4j.convert.out.pdf.PdfConversion.
-
JasonPlutext over 7 years
The code sample in this answer uses docx4j, not POI :-)
-
JasonPlutext over 7 years
In the most recent docx4j, the export via XSL FO is a separate library, so you'd need that jar and its dependencies. Or use our commercial PDF Converter I recommended in my other comment :-)
-
Ferguson over 7 years
Hi JasonPlutext. I have tried your online converter, but in the generated PDF there is no image in the lower left corner: s5.postimg.org/k5w2ko0zr/screenshot_1542.jpg And this is the original document: s5.postimg.org/8utewau4n/screenshot_1543.jpg Any idea?
-
JasonPlutext over 7 years
Would need to see the source docx. Can you email it to me, or drag it to ndoc.it and paste the resulting link here?