How to convert PDF to DOCX on Linux
I managed to do it with soffice. I had to install this package first: libreoffice-pdfimport. And don't forget to pass --infilter="writer_pdf_import".
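For anyone wiring this into a converter script, here is a minimal sketch of the working invocation wrapped in a shell function. The names pdf_to_docx and SOFFICE are made up for this example, and --headless and --outdir are additions not mentioned in the answer; the essential parts are still the libreoffice-pdfimport package and the --infilter option.

```shell
#!/bin/sh
# Sketch only: convert one PDF to DOCX with LibreOffice from the command line.
# Assumes the libreoffice-pdfimport package is already installed.
# SOFFICE and pdf_to_docx are hypothetical names chosen for this example.
SOFFICE="${SOFFICE:-soffice}"

pdf_to_docx() {
    input="$1"          # path to the PDF to convert
    outdir="${2:-.}"    # output directory, defaults to the current one

    if [ ! -f "$input" ]; then
        echo "no such file: $input" >&2
        return 1
    fi

    # --infilter forces the Writer PDF import filter, which is the key step.
    "$SOFFICE" --headless --infilter="writer_pdf_import" \
        --convert-to docx --outdir "$outdir" "$input"
}
```

Usage would look like `pdf_to_docx file.pdf out`, or `for f in *.pdf; do pdf_to_docx "$f" out; done` for a whole directory.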
Author: Splinteer
Updated on August 02, 2022
Comments
-
Splinteer over 1 year
I am trying to convert PDF files to Word, Excel and PowerPoint. I have already tried a lot of commands like these:
soffice -env:UserInstallation=file:///$HOME/.libreoffice-headless/ --convert-to docx:"Microsoft Word 2007/2010/2013 XML" file.pdf
/usr/bin/soffice --headless --invisible --convert-to docx file.pdf
soffice --infilter="writer_pdf_import" --convert-to doc file.pdf
/usr/bin/libreoffice --headless --invisible --convert-to doc file.pdf
/usr/bin/soffice --headless --convert-to docx:"Microsoft Word 2007/2010/2013 XML" file.pdf
abiword --to=doc file.pdf
unoconv -f doc file.pdf
lowriter --invisible --convert-to doc 'file.pdf'
I always got this error message from soffice/libreoffice/unoconv:
:1: parser error : Document is empty
%PDF-1.7
And this one from abiword:
Unable to init server: Could not connect: Connection refused
** (abiword:6477): WARNING **: clutter failed 0, get a life.
Unable to init server: Could not connect: Connection refused
With every command except abiword, I got a doc file with garbage characters inside, but never a proper file.
I am building a file converter, so I only want a command-line method. I don't want to use a third-party API.
Thank you
-
Tom G. almost 4 years
Thanks, I was looking for a long time for the correct infilter option for PDFs. May I ask how you found it?
-
Splinteer almost 4 years
@TomG. I can't remember now, but I did a lot of searching.
-
Ankur Thakur almost 4 years
Thanks a lot. It worked like a charm. I used:
libreoffice --invisible --infilter="writer_pdf_import" --convert-to docx:"MS Word 2007 XML" input_file.pdf
-
Tony Tan over 3 years
It converts the PDF into tons of text boxes in order to keep the layout. Is there any way to improve on this?
-
Csaba Tenkes almost 3 years
My problem is the same: tons of text boxes. Why is it not possible to convert to a real doc, docx, or odf in 2021? Or to have LibreOffice open the file in Writer with normal formatting instead of in Draw?