Best way to export html to Word without having MS Word installed?

Solution 1

If you only have simple HTML pages, as you said, Word can open them directly.

Otherwise, there are some libraries which can do this, but I don't have experience with them.

My last idea: if you are using ASP.NET, set the response's Content-Type header to application/msword so that the page can be saved as a Word document (it won't be a real Word document, only HTML with a .doc extension so that Word is able to open it).
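The header trick is not specific to ASP.NET. Here is a minimal sketch of the same idea in Python, using only the standard library. The function name `build_word_response`, the default file name, and the wrapper markup are my own assumptions for illustration, not part of any framework. It only prepares the headers and the bytes; sending them is up to whatever web stack you use.

```python
from html import escape


def build_word_response(body_html, title="Export", filename="export.doc"):
    """Return (headers, payload) for serving HTML under a Word content type.

    The payload is still plain HTML; only the headers and the .doc file
    name suggest to the client that Word should open it.
    """
    document = (
        "<html><head>"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        f"<title>{escape(title)}</title>"
        "</head><body>"
        f"{body_html}"
        "</body></html>"
    )
    payload = document.encode("utf-8")
    headers = {
        # The content type mentioned in the answer above.
        "Content-Type": "application/msword",
        # Ask the browser to save the response under a .doc name.
        "Content-Disposition": f'attachment; filename="{filename}"',
        "Content-Length": str(len(payload)),
    }
    return headers, payload


headers, payload = build_word_response("<h1>Report</h1><p>Hello</p>")
```

As the answer says, the result is not a real Word file, so anything that parses the binary .doc format (rather than opening it in Word) will not accept it.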

Solution 2

There's a tool called JODConverter which hooks into OpenOffice to expose its file format converters. Versions are available as a web app (it sits in Tomcat) that you post to, and as a command-line tool. I've been firing HTML at it and converting to .doc and PDF successfully. It's in a fairly big project that hasn't gone live yet, but I think I'm going to be using it. http://sourceforge.net/projects/jodconverter/

Solution 3

There is an open source project called HTMLtoWord that allows users to insert fragments of well-formed HTML (XHTML) into a Word document as formatted text.

HTMLtoWord documentation

Solution 4

While it is possible to make a ".doc" Microsoft Word file, it would probably be easier and more portable to make a ".rtf" file.
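To give an idea of why RTF is the easier target: it is a plain-text format, so a small document can be produced with string handling alone. The sketch below is a rough illustration under my own assumptions (the helper names `rtf_escape` and `make_rtf`, the Arial font, and the 12 pt size are all made up for the example). It writes plain paragraphs only and does not translate any HTML markup.

```python
def rtf_escape(text):
    """Escape plain text so it can be embedded in an RTF document."""
    out = []
    for ch in text:
        if ch in "\\{}":
            # Backslash and braces are RTF syntax and must be escaped.
            out.append("\\" + ch)
        elif ch == "\n":
            # Line break inside a paragraph.
            out.append("\\line ")
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Non-ASCII: emit \uN? per UTF-16 code unit, where N is a
            # signed 16-bit number and "?" is the fallback character.
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536
                out.append(f"\\u{unit}?")
    return "".join(out)


def make_rtf(paragraphs):
    """Build a minimal RTF document from a list of plain-text paragraphs."""
    body = "\\par\n".join(rtf_escape(p) for p in paragraphs)
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Arial;}}\\f0\\fs24 "
    return header + body + "}"


rtf_text = make_rtf(["Quarterly report", "Revenue grew by 5 %."])
```

Because every non-ASCII character is escaped, `rtf_text` can be written to a file such as `export.rtf` with ASCII encoding.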

Solution 5

If you are working in Java, you can convert HTML to real docx content with code I released in docx4j 2.8.0. I say "real", because the alternative is to create an HTML altChunk, which relies on Word to do the actual conversion (when the document is first opened).

See the various samples prefixed ConvertInXHTML. The import process expects well-formed XML, so you might have to tidy your HTML first.
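Before handing markup to an importer that expects well-formed XML, it can help to check whether the input parses at all. This is not docx4j code, just a small standalone sketch using Python's standard library; the function name `well_formedness_error` is hypothetical.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(markup):
    """Return None if markup parses as XML, else the parser's error text."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None


# Well-formed XHTML parses cleanly.
ok = well_formedness_error("<p>Hello <b>world</b></p>")

# Typical HTML habits that are not well-formed XML: an unclosed <br>,
# and an HTML entity that no DTD in the input declares.
unclosed = well_formedness_error("<p>Hello<br></p>")
entity = well_formedness_error("<p>a&nbsp;b</p>")
```

If the function returns an error, that is the cue to run the input through a tidying step first.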

Author by

Robert Dean

Updated on March 23, 2020

Comments

  • Robert Dean
    Robert Dean about 4 years

    Is there a way to export a simple HTML page to Word (.doc format, not .docx) without having Microsoft Word installed?

  • Andrew Hancox
    Andrew Hancox over 13 years
    I didn't end up using it; it turned out that it leaked memory too badly for production use.
  • nullnvoid
    nullnvoid over 8 years
    Unfortunately Aspose.Words has an ImportHTML process, but it doesn't support CSS. So you will have to manually recreate all formatting in the resulting doc. This includes table formatting, lists, and text styles.
  • Nathan Prather
    Nathan Prather over 7 years
    This method refers to emailing the html, but applies to ms word too: 4guysfromrolla.com/articles/122006-1.aspx
  • michal krzych
    michal krzych over 6 years
    this does not answer the question