From PDF to String
Solution 1
Use iText. The following snippet, for example, extracts the text of page 3:
PdfTextExtractor parser = new PdfTextExtractor(new PdfReader("C:/Text.pdf"));
String text = parser.getTextFromPage(3);
Solution 2
PDFBox barfs on many newer PDFs, especially those with embedded PNG images.
I was very impressed with PDFTextStream.
Solution 3
JPedal and Multivalent also offer text extraction in Java, or you could access xpdf using Runtime.exec.
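As a rough sketch of the xpdf route, the class below launches xpdf's `pdftotext` tool as an external process and reads its output into one String. It uses `ProcessBuilder` rather than a raw `Runtime.exec` call, since that makes stream handling a little easier. It assumes `pdftotext` is installed and on the PATH. The class name `XpdfTextExtractor` and the helper names are made up for illustration, and `readAllBytes` needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of xpdf or any library.
public class XpdfTextExtractor {

    // Builds the command line. Passing "-" as the output file asks
    // pdftotext to write the text to stdout instead of a file.
    static List<String> buildCommand(Path pdf) {
        return Arrays.asList("pdftotext", "-enc", "UTF-8", pdf.toString(), "-");
    }

    // Runs pdftotext (assumed to be on the PATH) and returns all text as one String.
    static String extractText(Path pdf) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(buildCommand(pdf));
        // Forward the tool's error messages to this process's stderr so
        // an unread error stream cannot block the child process.
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = builder.start();

        byte[] output;
        try (InputStream in = process.getInputStream()) {
            output = in.readAllBytes(); // Java 9+
        }

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("pdftotext exited with code " + exitCode);
        }
        return new String(output, StandardCharsets.UTF_8);
    }

    // Splits the extracted text on whitespace to get an array of words.
    static String[] toWords(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java XpdfTextExtractor <file.pdf>");
            return;
        }
        String text = extractText(Paths.get(args[0]));
        System.out.println(toWords(text).length + " words extracted");
    }
}
```

Calling `extractText` gives the "one long String" form, and `toWords` turns that into an array of Strings. Whether the output is good enough depends on the PDF, since scanned documents without a text layer will yield little or nothing.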
Author: Ankur
A junior BA with some experience in the financial services industry. I program for my own personal projects, so my questions might sound trivial.
Updated on June 09, 2022
What is the easiest way to get the text (words) of a PDF file as one long String or an array of Strings?
I have tried PDFBox, but it is not working for me.