How to implement Tesseract to run with project in Visual Studio 2010


Solution 1

OK, I figured it out, but it works only for the Release, Win32 configuration (not Debug or x64). The Debug configuration produces many linking errors.

So,

1. First, download the prepared library folder (Tesseract + Leptonica) here:

Mirror 1 (Google Drive)

Mirror 2 (MediaFire)


2. Extract tesseract.zip to C:\


3. In Visual Studio, open the project properties and go to C/C++ > General > Additional Include Directories

Insert C:\tesseract\include


4. Under Linker > General > Additional Library Directories

Insert C:\tesseract\lib


5. Under Linker > Input > Additional Dependencies

Add:

liblept168.lib
libtesseract302.lib

Sample code should look like this:

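#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#include <cstdlib>
#include <iostream>
#include <string>

using namespace std;

int main(void){

    tesseract::TessBaseAPI api;
    // Init returns 0 on success. An empty data path leaves the tessdata location at its default.
    if (api.Init("", "eng", tesseract::OEM_DEFAULT) != 0) {
        cerr << "Could not initialize Tesseract." << endl;
        return 1;
    }
    // Page segmentation mode 7 treats the image as a single line of text.
    api.SetPageSegMode(static_cast<tesseract::PageSegMode>(7));
    api.SetOutputName("out");

    cout << "File name:";
    string image;          // std::string avoids overflowing a fixed-size buffer
    getline(cin, image);

    // ProcessPages opens the image file itself, so a separate pixRead call is not needed.
    STRING text_out;
    if (!api.ProcessPages(image.c_str(), NULL, 0, &text_out)) {
        cerr << "Could not process " << image << endl;
        return 1;
    }

    cout << text_out.string();

    system("pause");
    return 0;
}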

For interaction with OpenCV and Mat-type images, look HERE
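
As a rough sketch of how a Mat could be handed to Tesseract without going through a file: TessBaseAPI has a SetImage overload that takes a raw pixel buffer, and GetUTF8Text returns the recognized text as a buffer that the caller must release with delete[]. The helper name recognizeBuffer is made up for this illustration, and the API type is a template parameter only so that the snippet compiles on its own. In a real project it would be tesseract::TessBaseAPI, already initialized with Init.

```cpp
#include <string>

// Runs OCR on a raw 8-bit pixel buffer and returns the recognized text.
// Api is assumed to be tesseract::TessBaseAPI (already initialized);
// recognizeBuffer itself is a hypothetical helper, not part of Tesseract.
template <class Api>
std::string recognizeBuffer(Api& api, const unsigned char* data,
                            int width, int height,
                            int bytesPerPixel, int bytesPerLine) {
    // Hand the pixels to the API. The buffer should stay valid until
    // recognition has finished.
    api.SetImage(data, width, height, bytesPerPixel, bytesPerLine);

    // GetUTF8Text returns a new[]-allocated string owned by the caller.
    char* raw = api.GetUTF8Text();
    std::string text = raw ? raw : "";
    delete[] raw;
    return text;
}

// Assumed call site for an 8-bit cv::Mat named frame (1 or 3 channels):
//   std::string text = recognizeBuffer(api, frame.data, frame.cols, frame.rows,
//                                      frame.channels(),
//                                      static_cast<int>(frame.step));
```

Passing frame.step as the bytes-per-line value keeps the call correct even when the Mat rows are padded or the Mat is a region of interest inside a larger image.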

Solution 2

It has been a long time since the last reply, but this may help others:

  1. I think you must also add "liblept168.lib" and "liblept168d.lib" to Additional Dependencies
  2. Add "liblept168.dll" and "liblept168d.dll" to the destination of your exe.
  3. Add #include to your code.

(This answer should have been a comment on Bruce's answer. Sorry for the confusion.)
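
One possible way to keep the release and debug libraries apart, as a sketch only: MSVC lets a source file request libraries with #pragma comment(lib, ...), and it defines _DEBUG in debug builds. The library names below are the ones mentioned on this page. Whether the debug variants are present in the downloaded package is an assumption to check, and the Additional Library Directories setting is still needed so the linker can find the files.

```cpp
// MSVC-only: request the Tesseract and Leptonica import libraries from source
// instead of listing them under Linker > Input > Additional Dependencies.
#if defined(_MSC_VER)
  #ifdef _DEBUG
    // Debug builds link the "d"-suffixed libraries.
    #pragma comment(lib, "libtesseract302d.lib")
    #pragma comment(lib, "liblept168d.lib")
  #else
    // Release builds link the plain libraries.
    #pragma comment(lib, "libtesseract302.lib")
    #pragma comment(lib, "liblept168.lib")
  #endif
#endif
```

With this in one source file, the release and debug names no longer need to be listed together in Additional Dependencies, which avoids linking both variants into the same build.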

Author by

OpenMinded

Updated on July 30, 2022

Comments

  • OpenMinded
    OpenMinded almost 2 years

    I have a C++ project in Visual Studio 2010 and wish to use OCR. I came across many "tutorials" for Tesseract but sadly, all I got was a headache and wasted time.

    In my project I have an image stored as a Mat. One solution to my problem is to save this Mat as an image (image.jpg for example) and then call Tesseract executable file like this:

    system("tesseract.exe image.jpg out");
    

    Which gets me an output out.txt and then I call

    infile.open ("out.txt");
    

    to read the output from Tesseract.

    It all works like a charm, but it is not an optimal solution. In my project I am processing video, so a save/call .exe/write/read cycle at 10+ FPS is not what I am looking for. I want to integrate Tesseract into my existing code so that I can pass a Mat as an argument and immediately get the result as a String.

    Do you know of any good tutorial (preferably step-by-step) for using Tesseract OCR with Visual Studio 2010? Or do you have your own solution?

  • OpenMinded
    OpenMinded almost 11 years
    Downloaded the libs. In C/C++>General>Additional Include Directories: Added \include folder. In Linker>General>Additional Library Directories: Added \lib folder. In Linker>Input>Additional Dependencies: Added libtesseract302.lib and libtesseract302d.lib. Wrote a simple program and can't build because of linking errors for every method called on object. For example: Error 9 error LNK2019: unresolved external symbol "public: char * __cdecl tesseract::TessBaseAPI::GetUTF8Text(void)" (?GetUTF8Text@TessBaseAPI@tesseract@@QEAAPEADXZ) referenced in function main. What am I missing?
  • Bruce
    Bruce almost 11 years
    Good news: the compilation step is working. Bad news: the linking step is failing. It looks like the linker is not finding the correct library. I would advise using libtesseract302.lib in Release and libtesseract302d.lib in Debug. You can go to Configuration Properties / Linker / Command Line in your Visual Studio project to make sure the command line points to the correct location.
  • OpenMinded
    OpenMinded almost 11 years
    Been using x64 configuration because of OpenCV...so I switched to x86. No more Tesseract linking errors. Now I have similar linking errors but with OpenCV functions. So I threw away OpenCV and tried to build Tesseract only to see if it works. Switched imread(OpenCV) for pixRead(Leptonica?). Apparently, it does not recognize this function pixRead. I think I need leptonica headers? allheaders.h or what? I slowly give up on everything :-/
  • İsmail Kocacan
    İsmail Kocacan almost 9 years
    I set the language data path like this and it worked: api.Init("C:\\tessdata", "eng", tesseract::OEM_DEFAULT);
  • MarcioPorto
    MarcioPorto over 8 years
    When you say "add #include to your code", what exactly must be included?
  • j.doe
    j.doe over 7 years
    I can't find the "tessdata" folder. Should I create it, or should it already be in the downloaded folder? @İsmailKocacan
  • j.doe
    j.doe over 7 years
    I downloaded the folder from your link, but it did not contain tessdata. @OpenMinded
  • OpenMinded
    OpenMinded about 7 years
    The custom "tessdata" language folder was used by @İsmailKocacan, so ask him.
  • pourjour
    pourjour almost 7 years
    You should add: #include <tesseract/baseapi.h> and #include <leptonica/allheaders.h>