Storing a scanned (PDF, TIFF, JPEG) file in MongoDB

17,378

Solution 1

You can store the files using MongoDB GridFS, as described in this question, and extract the text from a PDF file using the techniques described in this question. ;)
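To give a feel for what GridFS does with a large scan, here is a minimal plain-Java sketch of the chunking idea: GridFS splits a file into fixed-size chunks (255 kB by default) and keeps them next to a small metadata document. This is only an illustration and does not use the MongoDB driver; `ChunkingSketch`, `split` and `join` are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: mimics how GridFS divides a file into fixed-size chunks.
// It never talks to MongoDB; the class and method names are hypothetical.
final class ChunkingSketch {
    // GridFS uses 255 kB chunks by default.
    static final int DEFAULT_CHUNK_SIZE = 255 * 1024;

    // Split the file content into consecutive chunks of at most chunkSize bytes.
    static List<byte[]> split(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < data.length; offset += chunkSize) {
            int end = Math.min(data.length, offset + chunkSize);
            chunks.add(Arrays.copyOfRange(data, offset, end));
        }
        return chunks;
    }

    // Reassemble the original content by concatenating the chunks in order.
    static byte[] join(List<byte[]> chunks) {
        int total = 0;
        for (byte[] chunk : chunks) {
            total += chunk.length;
        }
        byte[] out = new byte[total];
        int pos = 0;
        for (byte[] chunk : chunks) {
            System.arraycopy(chunk, 0, out, pos, chunk.length);
            pos += chunk.length;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] scan = new byte[600 * 1024]; // stand-in for a 600 kB scanned PDF
        List<byte[]> chunks = split(scan, DEFAULT_CHUNK_SIZE);
        System.out.println(chunks.size() + " chunks");
    }
}
```

With a real driver the splitting and reassembly are handled for you. Note that the chunks hold raw bytes, so searching the content still needs the extracted text stored somewhere searchable.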

HTH

Solution 2

I think you should save the files on the server's file system, and store the file path and the text extracted from the file in MongoDB. It's more efficient to read the files from the server's file system than to load them from MongoDB.

The other option is to save the file as binary data but then you won't be able to search inside the file.
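The file-system approach from this answer could look roughly like the sketch below. It writes the scan to a directory and keeps a small record holding the path and the extracted text. To keep it runnable without a database, an in-memory list stands in for the MongoDB collection and a substring match stands in for a MongoDB text index; `ScanIndexSketch`, `store` and `search` are hypothetical names, and the extracted text is assumed to have been produced already (by PDF text extraction or OCR).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Sketch only: the in-memory list is a stand-in for a MongoDB collection.
final class ScanIndexSketch {
    private final Path storageDir;
    private final List<Map<String, String>> records = new ArrayList<>();

    ScanIndexSketch(Path storageDir) {
        this.storageDir = storageDir;
    }

    // Write the scan to disk and remember its path together with its text.
    Map<String, String> store(String fileName, byte[] content, String extractedText)
            throws IOException {
        Files.createDirectories(storageDir);
        Path target = storageDir.resolve(fileName);
        Files.write(target, content);

        // This map has the shape the MongoDB document would have.
        Map<String, String> record = new LinkedHashMap<>();
        record.put("path", target.toString());
        record.put("text", extractedText);
        records.add(record);
        return record;
    }

    // Case-insensitive substring search; returns the paths of matching files.
    List<String> search(String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Map<String, String> record : records) {
            if (record.get("text").toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(record.get("path"));
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        ScanIndexSketch index = new ScanIndexSketch(Files.createTempDirectory("scans"));
        index.store("invoice.pdf", new byte[] {1, 2, 3}, "Invoice number 42 for ACME");
        System.out.println(index.search("acme")); // prints the stored file's path
    }
}
```

The point of the design is that the database only holds small, searchable records, while the large binary stays on disk and is read directly when a hit is opened.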

Author by

Waqas Rana

Updated on June 07, 2022

Comments

  • Waqas Rana
    Waqas Rana almost 2 years

    I have to store a scanned TIFF (Tag Image File Format) or PDF file in MongoDB in a way that makes it text-searchable: a search for some text should find the files that contain it.

    I am going to use .NET MVC or Java with MongoDB.

    So how can I store this PDF file and then retrieve it from the database?

    Any suggestions will be appreciated.

    Thanks.

  • Waqas Rana
    Waqas Rana over 7 years
    All right. But if I follow the first approach you mentioned above, would I be able to search inside the file? The main purpose is to search inside the file.
  • Pini Cheyni
    Pini Cheyni over 7 years
    If it is a PDF that contains text, you should extract all the text and save it separately. For TIFFs and other images, you will have to run OCR and process them separately to extract the text against which you will run your search queries.