How can I open PDF files in LibreOffice Writer rather than Draw?


Solution 1

Well, apparently you can do so just fine! LibreOffice has three separate input filters for PDFs, which open them in LO Writer, Impress and Draw, respectively. When you open a PDF file, you just have to make sure to scroll down the list of possible filters (the file type list in the Open dialog) and choose the right one.

I've recently opened this LibreOffice bug for having each app default to using its own PDF import filter.
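
The same filter choice can also be made when launching LibreOffice from the command line, via its `--infilter` option. Below is a minimal Python sketch that builds such a command. The filter names (`writer_pdf_import`, `impress_pdf_import`, `draw_pdf_import`) and the `soffice` executable name are assumptions that should be checked against your installed version, and the helper functions are hypothetical names made up for this illustration.

```python
import subprocess
from pathlib import Path

# Assumed internal names of the three PDF import filters; verify them
# against your LibreOffice version before relying on this.
PDF_IMPORT_FILTERS = {
    "writer": "writer_pdf_import",
    "impress": "impress_pdf_import",
    "draw": "draw_pdf_import",
}


def build_open_command(pdf_path, app="writer", soffice="soffice"):
    """Return the argument list that opens pdf_path with the chosen app's PDF filter."""
    if app not in PDF_IMPORT_FILTERS:
        raise ValueError(f"unknown app {app!r}; expected one of {sorted(PDF_IMPORT_FILTERS)}")
    return [soffice, f"--infilter={PDF_IMPORT_FILTERS[app]}", str(Path(pdf_path))]


def open_pdf(pdf_path, app="writer", soffice="soffice"):
    """Launch LibreOffice on the PDF (hypothetical helper; needs soffice on PATH)."""
    return subprocess.run(build_open_command(pdf_path, app, soffice), check=True)
```

Calling `open_pdf("report.pdf")` would then be roughly equivalent to picking the Writer PDF filter by hand in the Open dialog. It only selects the filter; how well the text flows in the resulting Writer document still depends on the PDF.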

Solution 2

LibreOffice does not have the native capability to open or import arbitrary PDF files into a Writer document (and there are no extensions available that add this). As you noted, it can import into Draw and then save as ODT, but the results leave a lot to be desired. Some years ago, there was a separate PDF Importer extension; it has been included by default since around version 4, and it's what now lets you open a PDF in Draw and handles hybrid PDFs.

If you are creating the PDF yourself, you can export it from Writer as a "hybrid" PDF. This embeds an ODT copy of the document within the PDF. In that case, the PDF can be opened and edited in Writer and all of the formatting is preserved. Hybrid PDFs are described here: https://wiki.documentfoundation.org/Faq/Writer/PDF_Hybrid.
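
If you export many documents, the hybrid export can probably be scripted instead of using the export dialog. The following is only a sketch under stated assumptions: that your LibreOffice version accepts JSON filter options after `--convert-to` (I believe this needs a fairly recent release), and that `IsAddStream` is the PDF export property behind the hybrid option. The function name is hypothetical.

```python
import json


def build_hybrid_export_command(odt_path, outdir=".", soffice="soffice"):
    """Return the argument list for a headless ODT -> hybrid PDF conversion."""
    # Assumption: "IsAddStream" is the export property that embeds the
    # ODT source in the PDF (the "hybrid" checkbox in the export dialog).
    options = {"IsAddStream": {"type": "boolean", "value": "true"}}
    convert_to = "pdf:writer_pdf_Export:" + json.dumps(options, separators=(",", ":"))
    return [
        soffice,
        "--headless",
        "--convert-to", convert_to,
        "--outdir", str(outdir),
        str(odt_path),
    ]
```

Passing the returned list to `subprocess.run(..., check=True)` would run the conversion. If the resulting PDF does not reopen in Writer with its formatting intact, the option name or syntax is likely different in your version, and the export dialog is the safer route.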

Otherwise, there are third-party applications and web services that will convert the PDF to ODT, which you can then open in Writer. Here are a few:

  • Online2PDF, CloudConvert, Zamzar, and file-converter-online all convert to and from PDF for a range of formats, including ODT.

  • Convertio also includes the ability to run OCR on a PDF image file to recover the document and formatting.

  • Sejda and Smallpdf don't convert to ODT, but they can convert to MS Office formats, which LibreOffice can open and use.

I have not used all of these services and it's been ages since I used any of them, so I can't advise you on how well any of the services perform. I suspect that different services may do better on documents with different characteristics, so you might want to experiment with several services to see which does the best job on your document.

There are also some downloadable conversion applications, some free, some free to try on a limited basis, and some paid. I haven't used any so I can't offer advice, but the options are easily found with a Google search.


Author: einpoklum

Updated on September 18, 2022

Comments

  • einpoklum
    einpoklum over 1 year

    Some websites offer conversion of PDFs into DOCX or ODT files, and I think Adobe Acrobat (at least the full version) offers export functionality to all sorts of formats. But in LibreOffice, if I open a PDF file, it opens in Draw. Now, Draw is fine sometimes, but not always.

    So, can I somehow open PDF files into an LO Writer document?

    Note: I'm obviously interested in PDFs which can be legitimately perceived as Writer documents, e.g. having been exported from a word processor. Thus opening them as dozens of frames scattered across the page is not what I'm after. That can be achieved by opening in Draw, copying everything and pasting into Writer. I want the text in nice consecutive paragraphs, hopefully with consistent styles (even if synthesized) etc.

    • DrMoishe Pippik
      DrMoishe Pippik over 6 years
      Apparently, there is no direct way to open a PDF document in Writer, nor to save it in ODT format from Draw. However, there are numerous tools for conversion of PDF to ODT documents, both online and as discrete applications. That said, conversion is always "iffy" because PDF is a page description format, losing the line breaks of the original document.
    • einpoklum
      einpoklum over 6 years
      @DrMoishePippik: But often the PDF is the output of a conversion/print-out of a document, which you then want to work on. See my edit of the question. Also, do you suggest I ask on SR.SX?
    • Ravindra Bawane
      Ravindra Bawane over 6 years
      The fact you're using LibreOffice may make this moot or awkward, but Word 2016 can both open and convert PDF files AND save files to ODT.
    • DrMoishe Pippik
      DrMoishe Pippik over 6 years
      In the conversion from ODT to PDF much is (intentionally) lost. The PDF file, for example, may lose all the original CR/LF (paragraph symbols), and add its own line breaks at the end of each line of text in the PDF document as displayed, rather than at the end of a paragraph.
    • einpoklum
      einpoklum over 6 years
      @DrMoishePippik: Most of that can rather easily be recovered, and online tools do this. Also, PDFs can include meta-data so practically none of this stuff is lost (but I'm not sure what LibreOffice saves).
    • DrMoishe Pippik
      DrMoishe Pippik over 6 years
      Actually, it's often not recovered, but synthesized through optical character recognition (OCR) that recreates the actual paragraph format based on the page layout. The extreme case is a PDF document that contains no text, only the images of text. OCR is the only way to recover text from such a file.
    • einpoklum
      einpoklum over 6 years
      @DrMoishePippik: I accept the distinction between recovery and synthesis. However, I'm not talking about scanned images which require OCR, I'm talking about PDFs generated on a computer, often originally being MS-Word or LO Writer documents.
  • einpoklum
    einpoklum over 6 years
    So, you're saying the code behind these tools is proprietary, or at least - not part of the LO codebase?
  • fixer1234
    fixer1234 over 6 years
    @einpoklum, I'm not sure of the reason why that's not a feature, like it is in MS Office. But as far as I can tell, none of the major open source, free office suites offers it. WPS Office includes a PDF to MS Office conversion in their premium product, but not their free version.
  • mirh
    mirh about 5 years
    Just for the record, in addition to embedding pdfimport (which is based on xpdf/Poppler), LO is now also using PDFium.
  • einpoklum
    einpoklum almost 5 years
    @mirh: How is that relevant to the answer though?
  • mirh
    mirh almost 5 years
    It was more relevant to your comment. Setting aside whether they are FOSS (which they certainly are), whether they count as "part of the codebase" or not kind of depends on your definition of it.