Search text within multiple pdfs and docs

25,122

Solution 1

Recoll is probably the most versatile document search engine you will find on Linux.

It supports a plethora of different formats and is very customizable.

For installation instructions and other pointers please check out this answer. The official documentation is very useful, too.

Solution 2

Install the package pdfgrep

sudo apt-get install pdfgrep

then use the command:

find /path -iname '*.pdf' -exec pdfgrep pattern {} +
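Since the question only asks which files contain the phrase, not where in the file, a small wrapper that prints just the matching file names may be handy. This is a minimal sketch, assuming a pdfgrep version that supports -r (recursive), -i (ignore case) and -l (list matching files only). The function name pdfs_mentioning is made up for illustration.

```shell
# pdfs_mentioning PATTERN [DIR]
# Print the path of every PDF under DIR (default: current directory)
# whose text contains PATTERN, ignoring case.
# Hypothetical helper; it only wraps pdfgrep with three flags:
#   -r  descend into subdirectories
#   -i  case-insensitive match
#   -l  print file names only, not the matching lines
pdfs_mentioning() {
  pdfgrep -r -i -l "$1" "${2:-.}"
}
```

For example, pdfs_mentioning "trace conditioning" ~/notes would print one path per PDF that contains the phrase (~/notes is a placeholder directory).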

Solution 3

Use DocFetcher, a cross-platform desktop search application that runs on Linux and indexes and searches through multiple document types.

Solution 4

I understand that Adobe Reader is proprietary software, but it has a well-designed Search in Files feature (accessible from the Edit->Search menu or by pressing Ctrl+Shift+F).

Simple Search Options

The Simple search window is shown below:

Simple search in Adobe Reader

Set "Where would you like to search?" to "All PDF Documents in" and then select a location from the drop-down menu ("Browse for Location").

Enter the search term in the "What word or phrase would you like to search for?" field and pick the search options you need: Whole words only, Case-Sensitive, Include Bookmarks, Include Comments.

Advanced Search Options

Advanced search is more configurable - see image below:

Advanced Search

The search path is set in "Look In".
The search term goes in "What word or phrase would you like to search for?".
"Return results containing" offers these options: Match Exact word or phrase, Match Any of the words, Match All of the words, Boolean query.

Other options include: Whole words only, Case-Sensitive, Proximity, Stemming, Include Bookmarks, Include Comments, Include Attachments.


Note: you can still install the native Adobe Reader version 9.5.5, as described in another thread.

Solution 5

rga (or ripgrep-all) is a command-line tool that recursively searches all files in a directory for a regex pattern. It runs on Linux, macOS and Windows. It is a wrapper around ripgrep, the line-oriented recursive search program, and adds support for searching many file types such as PDF, DOCX, ODT, EPUB, SQLite databases, movie subtitles embedded in MKV or MP4 files, archives like ZIP or GZ, and more.
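For a folder of mixed chapter notes, a call along these lines could list the files that mention a phrase. This is only a sketch: it assumes rga passes ripgrep's -i (ignore case) and -l (file names only) flags through unchanged, and the function name notes_mentioning is made up for illustration.

```shell
# notes_mentioning PATTERN [DIR]
# Print the path of every file under DIR (default: current directory)
# in which rga finds PATTERN, ignoring case.
# Hypothetical helper; -i and -l are ripgrep flags that rga is
# assumed to forward to ripgrep.
notes_mentioning() {
  rga -i -l "$1" "${2:-.}"
}
```

For example, notes_mentioning "trace conditioning" ~/notes would print the matching file names (~/notes is a placeholder directory).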


Author: Rabbit

Updated on September 18, 2022

Comments

  • Rabbit
    Rabbit over 1 year

    I got a bunch of notes written by other students, but they are from an old textbook that didn't deal with everything in the same order, so I need to search through the notes for every chapter (each individual chapter is in a different .pdf or .doc) for "trace conditioning" for example.

    I used to use Google Desktop for this, I have Launchy now and I told it to search pdfs, but it only searches the titles, not the content.

    Thanks for any help.

    • cremefraiche
      cremefraiche over 9 years
      Once you find a match, what are you trying to do?
    • Jacob Vlijm
      Jacob Vlijm over 9 years
      Do you need to know if a string occurs in a file, or where that is as well?
    • Rabbit
      Rabbit over 9 years
      Once I find which chapter covers the topic I am looking for, I can read the notes on that topic in that chapter, so I just need to know IF and not where. (Please remember when answering that I can't comment on your answers because I haven't got 50 reputation points; I can only comment on my own question.)
    • Alaa Ali
      Alaa Ali over 9 years
      You can comment on answers to your question, we're not that harsh.
    • αғsнιη
      αғsнιη over 9 years
      @Rabbit with your edit summary you blocked me from editing your question to remove "Thanks" ;)
    • Rabbit
      Rabbit over 9 years
      I don't see where it says that I can't say thanks?
  • Rabbit
    Rabbit over 9 years
    Thanks! That works well, can't seem to execute it without using the terminal though. I'd vote you up but.. can't vote yet ;)
  • Alaa Ali
    Alaa Ali over 9 years
    @Rabbit Um, I think you can also vote on answers to your question.
  • Glutanimate
    Glutanimate over 9 years
    @AlaaAli No, the reputation limit applies to the OP also.
  • Rabbit
    Rabbit over 9 years
    Yup, I couldn't. I can now though! :) I just needed 15
  • Sri
    Sri over 9 years
    Down-voted for suggesting wine (which means windows), when Linux solutions exist.
  • Virbhadrasinh Gohil
    Virbhadrasinh Gohil over 9 years
    Sorry bro, but this is what I use when I need it, which is why I gave that suggestion.
  • A Umar Mukthar
    A Umar Mukthar about 9 years
    Running Windows applications under Ubuntu is generally not recommended, as Linux is virtually virus-free. I go with @Sri's idea.
  • A Umar Mukthar
    A Umar Mukthar about 9 years
    Can we configure it with the GNOME search engine?
  • yuranos
    yuranos about 7 years
    Amazing app. So fast!
  • 6005
    6005 about 7 years
    Thank you! This worked. If anyone is wondering, "pattern" is what you would replace with specific text. If the text has spaces in it, you can enclose it in double quotes.
  • lenooh
    lenooh over 6 years
    Don't forget to install antiword in order to search .doc files as well.
  • LondonRob
    LondonRob about 6 years
    If you know where your PDF files are, you can simplify the command to just pdfgrep -r "my expression" where -r searches recursively through directories.
  • Stefan
    Stefan about 3 years
    Not in the software centre for 20.04.
  • Stefan
    Stefan about 3 years
    Worked great for me (on ubuntu 20.04).
  • Tejas Shetty
    Tejas Shetty almost 3 years
    Good to know about it