C# - PDF to Word programmatically


Solution 1

We offer a solution called EasyConverter SDK that you may wish to give a try:

http://www.pdfonline.com/easyconverter/sdk/index.htm

If you want to get a quick idea of what the results would look like before trying the evaluation version, you can use the online converter here first:

http://www.pdfonline.com/pdf2word/index.asp

Converting a mostly static format like PDF to Word does involve many considerations. EasyConverter SDK works well for most business documents, while marketing documents, which typically use fancier layouts, are usually more challenging.

Solution 2

This is a "solution" only in the sense of a possible way to do it; you would have to dig into the details yourself:

The PDF file format is quite hard to understand. First of all, it can't really be compared to the Word format. Its format is designed to produce a consistent look on all platforms and printers; Word, by contrast, is a little less strict.

Editing PDF files is also quite hard, because you don't have "text" like in Word; it's more like chunks of letters, each positioned individually.
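To make the "chunks of letters" point concrete, here is a minimal sketch of what a converter has to do just to get plain lines of text back. It is written in Python for brevity and assumes a PDF parser has already produced a list of positioned glyphs. The `Glyph` type, the `rebuild_lines` function, and the tolerance values are all hypothetical and not taken from any real library.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    text: str     # one character (or a short run) as drawn on the page
    x: float      # left edge, in points
    y: float      # baseline, in points (PDF y grows upward)
    width: float  # advance width, in points


def rebuild_lines(glyphs, line_tol=2.0, space_gap=1.5):
    """Group glyphs into lines by baseline, then join each line left to right.

    line_tol and space_gap are guessed thresholds; real documents need tuning.
    """
    lines = []  # list of (baseline_y, [glyphs on that line])
    for g in sorted(glyphs, key=lambda g: -g.y):  # top of page first
        if lines and abs(lines[-1][0] - g.y) <= line_tol:
            lines[-1][1].append(g)
        else:
            lines.append((g.y, [g]))

    result = []
    for _, row in lines:
        row.sort(key=lambda g: g.x)
        parts = [row[0].text]
        for prev, cur in zip(row, row[1:]):
            # A horizontal gap wider than space_gap is treated as a word break.
            if cur.x - (prev.x + prev.width) > space_gap:
                parts.append(" ")
            parts.append(cur.text)
        result.append("".join(parts))
    return result


if __name__ == "__main__":
    page = [
        Glyph("o", 10, 680, 5), Glyph("H", 10, 700, 6), Glyph("y", 25, 700, 5),
        Glyph("k", 15, 680, 5), Glyph("i", 16, 700, 3), Glyph("o", 30, 700, 5),
    ]
    print(rebuild_lines(page))
```

Even this only recovers lines and spaces. Paragraphs, columns, tables, and fonts would each need further guesswork, which is why conversion quality varies so much.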

The only doable solution I see is the following:

  1. Render the PDF to an image. (This requires a PDF rendering library.)
  2. Insert this image into a .doc file. (This requires a .doc writing library.)
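As a rough sketch of the second step only, the following Python code wraps page images into an RTF document, which Word can open. This swaps in RTF for a true binary .doc writer, so it is not the same thing as the .doc format. It assumes each page has already been rendered to PNG bytes by some rendering library; `pages_to_rtf`, `png_size`, and the `dpi` default are hypothetical names and values chosen for illustration.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes):
    """Read pixel width and height from the PNG IHDR chunk."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    return struct.unpack(">II", data[16:24])


def pages_to_rtf(page_pngs, dpi=150):
    """Build an RTF document with one embedded PNG per page.

    dpi is the resolution the pages were rendered at (an assumption here);
    it is used to convert pixels to twips (1440 twips per inch).
    """
    parts = [r"{\rtf1\ansi"]
    for i, png in enumerate(page_pngs):
        w, h = png_size(png)
        w_twips = w * 1440 // dpi
        h_twips = h * 1440 // dpi
        if i:
            parts.append(r"\page")  # start each later image on a new page
        parts.append(
            r"{\pict\pngblip\picw%d\pich%d\picwgoal%d\pichgoal%d "
            % (w, h, w_twips, h_twips)
            + png.hex()  # RTF stores picture data as hexadecimal text
            + "}"
        )
    parts.append("}")
    return "\n".join(parts)
```

The hex encoding doubles the size of every image, which illustrates the file-size problem with this approach.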

I think that's what SautinSoft is doing too, and that's the reason for its poor quality. Images can get quite large if you want good quality (i.e. you lose optimizations such as shared fonts and repeated graphics that PDF files have).

Solution 3

Convert the PDF to SVG and embed the SVG in the Word document.

Author by

Peanut

Updated on February 25, 2020

Comments

  • Peanut about 4 years

    Does anyone know of a good solution for converting PDF files to Word .doc files (not .docx) programmatically? I've tried SautinSoft's solution, but even though it does the job, the quality isn't the best.

  • Uday over 7 years

    Thanks, this is helpful to me.