Converting DOCX to PDF using OpenXML and PDFCreator in C#

Solution 1

You can use docx4j.NET to convert a docx to XSL-FO and, from there, to PDF or to any of the other output formats supported by Apache FOP.

See this sample.

docx4j.NET is an IKVM'd DLL of docx4j, an ASL v2 licensed open source project.

Solution 2

I think you're trying to do two different things here. OpenXML works with the DOCX file directly, so Word is not used in any way in that case. PDFCreator, by contrast, appears to act as a virtual printer: when Word "prints" to it, it generates a PDF file.

Because you say you want to convert DOCX to PDF on the server, I am assuming you do not want to use Word. So your best shot, if you want all free software, is to use OpenXML to read the file and then call iText to create the PDF. Your code will essentially read the OpenXML content and feed it to iText.

Keep in mind that there are a lot of complexities to this. It is not just a matter of reading a paragraph from OpenXML and writing it to iText. You have to pass iText all paragraph and run properties, as well as any applied styles, lists, etc. The rules for how to indent the first line of a paragraph alone are quite complex.
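To make the read-then-feed idea concrete, here is a minimal sketch. It assumes the Open XML SDK (DocumentFormat.OpenXml) and iTextSharp 5.x are referenced; the class and method names (DocxToPdfSketch, Convert, IsOn) are hypothetical, and the Helvetica 11pt font is an arbitrary choice. It copies only body paragraphs and direct bold/italic run formatting. Styles, lists, tables, images, and indentation are all ignored, which is exactly the hard part described above.

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using iTextSharp.text.pdf;
// Both libraries define Paragraph, Document and Font, so alias the namespaces.
using W = DocumentFormat.OpenXml.Wordprocessing;
using Pdf = iTextSharp.text;

public static class DocxToPdfSketch
{
    // Hypothetical helper: copies paragraph text plus direct bold/italic only.
    public static void Convert(string docxPath, string pdfPath)
    {
        using (WordprocessingDocument source = WordprocessingDocument.Open(docxPath, false))
        using (FileStream output = new FileStream(pdfPath, FileMode.Create))
        {
            var pdf = new Pdf.Document();
            PdfWriter.GetInstance(pdf, output);
            pdf.Open();

            W.Body body = source.MainDocumentPart.Document.Body;
            foreach (W.Paragraph paragraph in body.Elements<W.Paragraph>())
            {
                var target = new Pdf.Paragraph();
                foreach (W.Run run in paragraph.Descendants<W.Run>())
                {
                    // Only formatting set directly on the run is read here;
                    // formatting inherited from styles is NOT resolved.
                    W.RunProperties props = run.RunProperties;
                    int style = Pdf.Font.NORMAL;
                    if (IsOn(props?.Bold)) style |= Pdf.Font.BOLD;
                    if (IsOn(props?.Italic)) style |= Pdf.Font.ITALIC;

                    Pdf.Font font = Pdf.FontFactory.GetFont(Pdf.FontFactory.HELVETICA, 11f, style);
                    target.Add(new Pdf.Chunk(run.InnerText, font));
                }
                pdf.Add(target);
            }

            pdf.Close();
        }
    }

    // A toggle element such as <w:b/> with no w:val attribute means "on".
    private static bool IsOn(W.OnOffType toggle)
    {
        return toggle != null && (toggle.Val == null || toggle.Val.Value);
    }
}
```

A call would look like DocxToPdfSketch.Convert("input.docx", "output.pdf"), with both paths as placeholders. Even this small example has to decide how a bare bold element is interpreted; resolving style inheritance, numbering, and first-line indents is where most of the real work would go.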

If you're open to commercial software, there are a number of products that can easily do this. If so, add that to your question and I'll list them (including my company's).

Author: user1135690

Updated on June 27, 2020

Comments

  • user1135690
    user1135690 almost 4 years

    I need to convert a docx file to PDF on the server. I have seen that PDFCreator can do this (http://sourceforge.net/projects/pdfcreator/).

    I need some suggestions on the following:

    1. Can I use PDFCreator on the server side?
    2. Without creating a Word object, can I convert docx to PDF with OpenXML by using the PDFCreator API?

    Please reply soon.

    • Rup
      Rup over 12 years
      You'll need something to render the docx as an actual document. If you can't use Word, you could try OpenOffice? It's not brilliant for automating, but it can be made to work. Or there are probably plenty of third-party components you can buy to do this. See this old question.
    • user1135690
      user1135690 over 12 years
      I need to use only MS Word. Third-party tools exist, but they cost money. We used wordDocument.ExportAsFixedFormat(...) in our application, but that requires a Word object (which is not recommended by Microsoft). I saw PDFCreator; is it a good approach?
  • Rup
    Rup about 12 years
    Hi - the answer below is an attempt to reply to you; user rosirosss would like to hear your list of commercial products. Thanks!
  • David Thielen
    David Thielen about 12 years
    @Rup - Well, my favorite is www.windward.net (disclaimer - I'm the CTO there). Others are DevExpress (limited but inexpensive), Crystal Reports (expensive and complex), and SQL Server Reporting Services (free, but it is part of SQL Server, which is expensive).
  • Liladhar
    Liladhar almost 9 years
    This is good but very time-consuming. It takes almost 10 minutes to convert.
  • JasonPlutext
    JasonPlutext almost 9 years
    How many pages is your docx? Are you debugging in Visual Studio? That's very slow.