Handle Doc/Docx Templates on a headless server to produce PDFs preferably without using OpenOffice.org

Solution 1

To my knowledge, there is no application that can do this without some dependency on LibreOffice.

However, you don't need to install the whole office suite if you only perform command-line conversions.

You can check whether the tool unoconv meets your needs. It depends on python and python-uno. The latter also pulls in libreoffice-core as a dependency, but not the whole office suite.
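As a minimal sketch of such a command-line conversion, assuming unoconv is installed and on the PATH: the wrapper below checks for the tool and then asks it for PDF output via its -f option. The function name to_pdf_unoconv and the file name letter.docx are made up for illustration.

```shell
#!/bin/sh
# Convert one document to PDF with unoconv, if it is installed.
# to_pdf_unoconv and letter.docx are hypothetical names, not from the answer.
to_pdf_unoconv() {
    input=$1
    if ! command -v unoconv >/dev/null 2>&1; then
        echo "unoconv not found; install it first" >&2
        return 127
    fi
    # -f pdf selects PDF as the output format.
    unoconv -f pdf "$input"
}

# Example call (commented out so sourcing this file has no side effects):
# to_pdf_unoconv letter.docx
```

Checking for the binary first keeps a web application from failing silently when the package is missing on the server.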

Solution 2

AbiWord can convert between any of the formats it knows from the command line, and that includes all those you mention. For example, to convert .odt to .pdf:

abiword --to=pdf filename.odt

To convert .docx to .doc:

abiword --to=doc filename.docx
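For a whole batch of letters, the same kind of call can be looped over a directory. This is only a sketch: the function name convert_dir and the ./letters directory are assumptions for illustration, and it relies on nothing beyond the --to=pdf invocation shown above.

```shell
#!/bin/sh
# Convert every .docx file in a directory to PDF with AbiWord.
# convert_dir and ./letters are hypothetical names, not from the answer.
convert_dir() {
    dir=$1
    for f in "$dir"/*.docx; do
        # Skip the unexpanded pattern when the directory has no .docx files.
        [ -e "$f" ] || continue
        # Same --to=pdf invocation as in the answer, applied to each file.
        abiword --to=pdf "$f" || echo "conversion failed: $f" >&2
    done
}

# Example call (commented out so sourcing this file has no side effects):
# convert_dir ./letters
```

Reporting failures per file instead of aborting lets the remaining letters still be converted.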

(If you want to search the document, just convert it to something plain-text based like HTML, RTF or even TXT and search in there; convert back if need be.)

But what exactly are the obvious reasons not to install OpenOffice so you can use its libraries with, e.g., unoconv?

Solution 3

You could try the AbiWord server-side example given at this link: http://www.advogato.org/person/msevior/diary.html?start=65

Author: Matthew Merryfull

Updated on September 17, 2022

Comments

  • Matthew Merryfull
    Matthew Merryfull over 1 year

    On a production web server I have to produce letters based on a template I got in MS Word binary format. I use PHP, and for the search-and-replace task I found PHPWord, which can handle Docx files, so I converted the template to OpenXML on my local workstation. Unfortunately the output is also Docx.

    The goal is to produce a single PDF for the user to download so she can print out a bunch of letters at once very easily.

    Now I need to find a way to either:

    • Search and replace text in a PDF file
    • Convert Docx to PDF without loss of formatting
    • Edit the original Doc template without loss of formatting and without using COM
    • Convert Docx to Doc without loss of formatting (which seems nearly impossible for the template looks good in word but technically how the formatting is done is a big pile of...) so I could convert it using wvPDF

    What I don't want to use besides OpenOffice.org are web services. I'm aware of PHPLiveDocx but I don't want to depend on an external service for performance, availability, security reasons. Also buying a piece of software isn't an option in this case (can't influence that).

    Since this runs on a public-facing web server, I don't want to pull in OpenOffice.org, not even headless, as it will pull around 160MB of compressed(!) binaries, and best practice is not to load binaries you don't really need on a public-facing server. Using oo.o is a last resort, so I want to make sure I have ruled out any other options there may be.

    The host OS is CentOS 5.5.

    Where can I go from here?

    Regards, luxifer

    • Jorge Castro
      Jorge Castro over 13 years
      Can you edit your question and integrate your updates into the question? It would make it easier to read, see here, thanks: meta.askubuntu.com/questions/908/…
    • frabjous
      frabjous over 13 years
      If you're using CentOS, why is this on AskUbuntu rather than Unix and Linux SE?
    • Matthew Merryfull
      Matthew Merryfull over 13 years
      Because our new web servers will run Ubuntu; plus, in my experience, the Ubuntu community tends to put more effort into finding a good answer...
  • Matthew Merryfull
    Matthew Merryfull over 13 years
    it's using openoffice.org to do the trick, so unfortunately this is not for me :(
  • Matthew Merryfull
    Matthew Merryfull over 13 years
    it depends on python-uno, which depends on openoffice.org-core :(
  • Takkat
    Takkat over 13 years
    that's the black magic of dependencies ;) sorry to hear.
  • Ryan C. Thompson
    Ryan C. Thompson over 13 years
    You haven't explained why you don't want to install openoffice.org. Obviously, you wouldn't want to install the GUI components, but are you so prejudiced against it that you can't allow a headless install?
  • Matthew Merryfull
    Matthew Merryfull over 13 years
    even headless will pull around 160mb of data including java and that's its compressed size! it's just good practice not to pull loads of executable code on a production server if you have an alternative (which I'm seeking :-))... so pulling openoffice is kind of like a last resort for me
  • Matthew Merryfull
    Matthew Merryfull over 13 years
    I'll try that one... obviously abiword will pull GUI dependencies I don't need or want, too, but not as many as openoffice... I think I'll look into custom compiling both of them today to see if they could be stripped down a bit more
  • Matthew Merryfull
    Matthew Merryfull over 13 years
    that looks promising at the first glance... I'll look into that one, too, today
  • Matthew Merryfull
    Matthew Merryfull over 13 years
    just downloaded the latest stable sources and there's no abicommand plugin... it's listed on the plugin matrix page of the abiword wiki though... I have yet to figure out why it's missing and where to get it :-) Until then I'll give it a try without that plugin
  • frabjous
    frabjous over 13 years
    Unless you're really hurting for disk space, I don't see what the big deal is for installing GUI dependencies. You don't need to use them or load them, so they shouldn't slow down your system.
  • Matthew Merryfull
    Matthew Merryfull over 13 years
    still I cannot find the abicommand plugin in the current source... do I have to get it from somewhere else? the abiwiki isn't exactly clear about this
  • Matthew Merryfull
    Matthew Merryfull over 13 years
    I finally gave up on the topic... It just seems impossible. I gave in and am using OpenOffice with Odt files as templates. At least it works, and it wouldn't be economical to hunt for another solution after trying all this... sigh, thanks to all of you for offering solutions!