How to convert PDF to DOCX on Linux

14,986

I managed to do it with soffice. I had to install the libreoffice-pdfimport package. Also, don't forget to pass --infilter="writer_pdf_import".
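For reference, here is a minimal sketch of how that could be wrapped in a shell function. The package name and the writer_pdf_import filter come from the answer above. The function name pdf2docx, the DRY_RUN switch, and the apt-get install line (which assumes a Debian/Ubuntu system) are only illustrative assumptions.

```shell
#!/bin/sh
# Assumption: Debian/Ubuntu. Install the PDF import filter once with:
#   sudo apt-get install libreoffice-pdfimport

# pdf2docx (hypothetical helper): convert one PDF to DOCX with soffice.
#   $1 = input PDF, $2 = output directory (defaults to the current one)
# If DRY_RUN is set to a non-empty value, the command is printed, not run.
pdf2docx() {
    pdf2docx_input=$1
    pdf2docx_outdir=${2:-.}
    # --infilter forces the Writer PDF import filter for the input file;
    # --convert-to picks the output format, --outdir where the result goes.
    ${DRY_RUN:+echo} soffice --headless \
        --infilter="writer_pdf_import" \
        --convert-to docx \
        --outdir "$pdf2docx_outdir" \
        "$pdf2docx_input"
}
```

Calling `pdf2docx file.pdf out` should leave the converted file in out/, while `DRY_RUN=1 pdf2docx file.pdf out` only prints the soffice command so it can be checked first.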



Author: Splinteer

Updated on August 02, 2022

Comments

  • Splinteer
    Splinteer over 1 year

    I am trying to convert PDF files to Word, Excel and PowerPoint. I have already tried a lot of commands like these:

    soffice -env:UserInstallation=file:///$HOME/.libreoffice-headless/ --convert-to docx:"Microsoft Word 2007/2010/2013 XML" file.pdf
    /usr/bin/soffice --headless --invisible --convert-to docx file.pdf
    soffice --infilter="writer_pdf_import" --convert-to doc file.pdf
    
    /usr/bin/libreoffice --headless --invisible --convert-to doc file.pdf
    /usr/bin/soffice --headless --convert-to docx:"Microsoft Word 2007/2010/2013 XML" file.pdf
    
    abiword --to=doc file.pdf
    unoconv -f doc file.pdf
    lowriter --invisible --convert-to doc 'file.pdf'
    

    I always get this error message from soffice/libreoffice/unoconv:

    :1: parser error : Document is empty
    %PDF-1.7
    

    And this one from abiword:

    Unable to init server: Could not connect: Connection refused
    
    ** (abiword:6477): WARNING **: clutter failed 0, get a life.
    Unable to init server: Could not connect: Connection refused
    

    With every command except abiword, I got a doc file with garbage characters inside, but never a proper file.

    I am trying to build a file converter, so I only want a command-line method. I don't want to use someone else's API.

    Thank you

  • Tom G.
    Tom G. almost 4 years
    Thanks, I had been looking for the correct infilter option for PDFs for a long time. May I ask how you found it?
  • Splinteer
    Splinteer almost 4 years
    @TomG. I can't remember now, but I did a lot of searching.
  • Ankur Thakur
    Ankur Thakur almost 4 years
    Thanks a lot. It worked like a charm. I used: libreoffice --invisible --infilter="writer_pdf_import" --convert-to docx:"MS Word 2007 XML" input_file.pdf
  • Tony Tan
    Tony Tan over 3 years
    It converts the PDF into tons of text boxes in order to keep the layout. Is there any way to improve on this?
  • Csaba Tenkes
    Csaba Tenkes almost 3 years
    My problem is the same: the tons of text boxes. Why is it still not possible to convert to a real doc, docx or odf in 2021? Or to have LibreOffice open the file in Writer with normal formatting instead of in Draw?