Parse a PDF file and write its content to a Word file using Java


Solution 1

For parsing a PDF file in Java, you can use Apache PDFBox: http://incubator.apache.org/pdfbox/

For reading/writing Word (or other Office) file formats in Java, try POI: http://poi.apache.org/

Both are free.
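A minimal sketch of how the two libraries could be combined is shown here. It assumes PDFBox 2.x (where `PDDocument.load` is available; PDFBox 3.x uses `Loader.loadPDF` instead) and the POI `poi-ooxml` artifact on the classpath. The class name `PdfToDocx` and the helper `toParagraphs` are made up for illustration, and only plain text is carried over: fonts, images, and tables are not preserved.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

// Hypothetical example class: extracts plain text from a PDF and writes it to a .docx file.
public class PdfToDocx {

    // Splits extracted text on line breaks and drops blank lines;
    // each remaining line becomes one Word paragraph.
    static List<String> toParagraphs(String text) {
        List<String> paragraphs = new ArrayList<>();
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                paragraphs.add(trimmed);
            }
        }
        return paragraphs;
    }

    // Extracts the text of all pages (assumes the PDFBox 2.x API).
    static String extractText(File pdf) throws IOException {
        try (PDDocument document = PDDocument.load(pdf)) {
            return new PDFTextStripper().getText(document);
        }
    }

    // Writes one paragraph per list entry into a new .docx file using POI's XWPF classes.
    static void writeDocx(List<String> paragraphs, File docx) throws IOException {
        try (XWPFDocument document = new XWPFDocument();
             OutputStream out = new FileOutputStream(docx)) {
            for (String text : paragraphs) {
                XWPFParagraph paragraph = document.createParagraph();
                XWPFRun run = paragraph.createRun();
                run.setText(text);
            }
            document.write(out);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: java PdfToDocx <input.pdf> <output.docx>");
            return;
        }
        String text = extractText(new File(args[0]));
        writeDocx(toParagraphs(text), new File(args[1]));
    }
}
```

Run it as `java PdfToDocx input.pdf output.docx`. Treating every extracted line as its own paragraph is a simplification; real documents usually need some heuristic to rejoin wrapped lines.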

Solution 2

Try the iText Java library:

iText is an ideal library for developers looking to enhance web- and other applications with dynamic PDF document generation and/or manipulation.

It can be used for your parsing step.

As for generating Word documents, the OpenOffice Java API might be able to generate Word-compatible docs (I have no personal experience with this API).
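For the iText parsing step mentioned above, a rough sketch could look like the following. It assumes the iText 5 API (`com.itextpdf.text.pdf.PdfReader` and `com.itextpdf.text.pdf.parser.PdfTextExtractor`); iText 7 and later use different package and class names. The class name `ITextExtract` is hypothetical.

```java
import java.io.IOException;

import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;

// Hypothetical example class: reads the plain text of every page with iText 5.
public class ITextExtract {

    // Returns the text of all pages, one page after another, separated by line breaks.
    static String extractText(String pdfPath) throws IOException {
        PdfReader reader = new PdfReader(pdfPath);
        try {
            StringBuilder text = new StringBuilder();
            // iText page numbers start at 1.
            for (int page = 1; page <= reader.getNumberOfPages(); page++) {
                text.append(PdfTextExtractor.getTextFromPage(reader, page));
                text.append(System.lineSeparator());
            }
            return text.toString();
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java ITextExtract <input.pdf>");
            return;
        }
        System.out.println(extractText(args[0]));
    }
}
```

The returned string can then be handed to whichever library writes the Word file.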

Author by

kedar kamthe


Updated on June 05, 2022

Comments

  • kedar kamthe
    kedar kamthe almost 2 years

    How do I parse a PDF file and write its content to a Word file using Java?

  • JasonPlutext
    JasonPlutext over 13 years
    Alternatively, you can use docx4j to write the docx. Like POI, it's free.