Error while installing tesseract-ocr
I had the exact same issue, using Visual Studio 2017 on a Windows 10 machine with Python 3.6 installed. What worked for me was to:
- Download and install the tesseract-ocr executable from https://github.com/UB-Mannheim/tesseract/wiki. (The script below assumes a Windows system with Tesseract installed to the default suggested location, i.e. C:\Program Files (x86)\Tesseract-OCR.) See https://github.com/tesseract-ocr/tesseract/wiki for more information on installing the pre-built binary package on different OS types (including Windows).
- Ensure you have the Python Imaging Library (PIL) or the pillow package installed for opening images. (Installing PIL didn't work in my setup, but pillow did, i.e. pip install pillow.) You need this because pytesseract requires it. See https://pypi.org/project/pytesseract/0.2.5/ for more info.
- Then, to use it in your code, set the tesseract_cmd path as follows:
    from PIL import Image
    import pytesseract

    # Point pytesseract at the Tesseract executable (default install location assumed).
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract'

    # Open the image with Pillow and run OCR on it.
    img = Image.open('path/to/image.png')
    text = pytesseract.image_to_string(img)
    print(text)
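If Tesseract was installed somewhere other than the default location, the hard-coded path above will not match. Here is a minimal sketch of one way to locate the executable, assuming it is either on PATH or in one of a few candidate folders. The helper name find_tesseract and the candidate list are illustrative assumptions, not part of pytesseract:

```python
import os
import shutil

# Candidate locations are assumptions; adjust them to match your installation.
DEFAULT_CANDIDATES = (
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
)


def find_tesseract(name="tesseract", candidates=DEFAULT_CANDIDATES):
    """Return a path to the Tesseract executable, or None if it cannot be found."""
    # First check whether the executable is already on PATH.
    on_path = shutil.which(name)
    if on_path:
        return on_path
    # Otherwise fall back to the candidate install locations.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    print(find_tesseract() or "Tesseract executable not found")
```

If the function returns a path, that value can be assigned to pytesseract.pytesseract.tesseract_cmd instead of the hard-coded string.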
Hope it helps.
Comments
- Harsh Vardhan, almost 2 years:
I want to use pytesseract for OCR, so I installed it. But first I needed to install tesseract-ocr. I am using Windows 8.1. I opened the command line and ran pip install tesseract-ocr. The output of that command is shown below.
I am not able to understand what is happening here. How can I make sense of this output and successfully install tesseract on my PC?
    C:\Users\HarshLaptop>pip install tesseract-ocr
    Collecting tesseract-ocr
      Using cached https://files.pythonhosted.org/packages/e2/0d/dcee3dd0fc4c7bcd18125a98f8ba6d9db7aecaa40770595203e312649587/tesseract-ocr-0.0.1.tar.gz
    Requirement already satisfied: cython in c:\users\harshlaptop\anaconda3\lib\site-packages (from tesseract-ocr) (0.25.2)
    Building wheels for collected packages: tesseract-ocr
      Running setup.py bdist_wheel for tesseract-ocr ... error
      Complete output from command c:\users\harshlaptop\anaconda3\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\HARSHL~1\\AppData\\Local\\Temp\\pip-install-x8nz3uhm\\tesseract-ocr\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d C:\Users\HARSHL~1\AppData\Local\Temp\pip-wheel-sj29zfyo --python-tag cp36:
      running bdist_wheel
      running build
      running build_py
      file tesseract_ocr.py (for module tesseract_ocr) not found
      file tesseract_ocr.py (for module tesseract_ocr) not found
      running build_ext
      building 'tesseract_ocr' extension
      creating build
      creating build\temp.win-amd64-3.6
      creating build\temp.win-amd64-3.6\Release
      C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -Ic:\users\harshlaptop\anaconda3\include -Ic:\users\harshlaptop\anaconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\8.1\include\shared" "-IC:\Program Files (x86)\Windows Kits\8.1\include\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\winrt" /EHsc /Tptesseract_ocr.cpp /Fobuild\temp.win-amd64-3.6\Release\tesseract_ocr.obj
      tesseract_ocr.cpp
      tesseract_ocr.cpp(463): fatal error C1083: Cannot open include file: 'leptonica/allheaders.h': No such file or directory
      error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\BIN\\x86_amd64\\cl.exe' failed with exit status 2

      ----------------------------------------
      Failed building wheel for tesseract-ocr
      Running setup.py clean for tesseract-ocr
    Failed to build tesseract-ocr
    Installing collected packages: tesseract-ocr
      Running setup.py install for tesseract-ocr ... error
        Complete output from command c:\users\harshlaptop\anaconda3\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\HARSHL~1\\AppData\\Local\\Temp\\pip-install-x8nz3uhm\\tesseract-ocr\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\HARSHL~1\AppData\Local\Temp\pip-record-vnlr99lk\install-record.txt --single-version-externally-managed --compile:
        running install
        running build
        running build_py
        file tesseract_ocr.py (for module tesseract_ocr) not found
        file tesseract_ocr.py (for module tesseract_ocr) not found
        running build_ext
        building 'tesseract_ocr' extension
        creating build
        creating build\temp.win-amd64-3.6
        creating build\temp.win-amd64-3.6\Release
        C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -Ic:\users\harshlaptop\anaconda3\include -Ic:\users\harshlaptop\anaconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\8.1\include\shared" "-IC:\Program Files (x86)\Windows Kits\8.1\include\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\winrt" /EHsc /Tptesseract_ocr.cpp /Fobuild\temp.win-amd64-3.6\Release\tesseract_ocr.obj
        tesseract_ocr.cpp
        tesseract_ocr.cpp(463): fatal error C1083: Cannot open include file: 'leptonica/allheaders.h': No such file or directory
        error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\BIN\\x86_amd64\\cl.exe' failed with exit status 2

        ----------------------------------------
    Command "c:\users\harshlaptop\anaconda3\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\HARSHL~1\\AppData\\Local\\Temp\\pip-install-x8nz3uhm\\tesseract-ocr\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\HARSHL~1\AppData\Local\Temp\pip-record-vnlr99lk\install-record.txt --single-version-externally-managed --compile" failed with error code 1 in C:\Users\HARSHL~1\AppData\Local\Temp\pip-install-x8nz3uhm\tesseract-ocr\
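Most of this output is build noise. The line that matters is the compiler's fatal error C1083: the package tries to compile a C++ extension, and the build stops because the header leptonica/allheaders.h cannot be found. Here is a minimal sketch for pulling such lines out of a long log. It assumes the MSVC message format seen above, and the helper name missing_headers is hypothetical:

```python
import re

# Matches the MSVC "cannot open include file" message that appears in the log.
MISSING_INCLUDE = re.compile(
    r"fatal error C1083: Cannot open include file: '([^']+)'"
)


def missing_headers(log_text):
    """Return the distinct header names the compiler could not find, in order."""
    seen = []
    for name in MISSING_INCLUDE.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


if __name__ == "__main__":
    line = (
        "tesseract_ocr.cpp(463): fatal error C1083: Cannot open include file: "
        "'leptonica/allheaders.h': No such file or directory"
    )
    print(missing_headers(line))
```

Run against the output above, this reports only leptonica/allheaders.h, which points to a missing Leptonica development library rather than a problem with Visual Studio itself.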
- Harsh Vardhan, almost 6 years: What is Leptonica?
- A.s.e, almost 6 years: Leptonica is a library and a dependency of tesseract: github.com/DanBloomberg/leptonica
- Harsh Vardhan, almost 6 years: I have Visual Studio installed already. Do I still need Leptonica?
- A.s.e, almost 6 years: Yes, my friend. It is an image-processing library that tesseract uses and depends on.
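To check whether an installed Tesseract binary already comes with Leptonica, one option is to ask the executable for its version report. As far as I know, tesseract --version lists the Leptonica version it was built against, but treat that as an assumption to verify on your machine. Here is a minimal sketch; the helper name tesseract_version_report is hypothetical:

```python
import subprocess


def tesseract_version_report(executable="tesseract"):
    """Return the text printed by `<executable> --version`, or None if it cannot be run."""
    try:
        result = subprocess.run(
            [executable, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except OSError:
        # The executable is missing or cannot be started.
        return None
    # Some versions may print the report to stderr, so combine both streams.
    return (result.stdout + result.stderr).strip()


if __name__ == "__main__":
    report = tesseract_version_report()
    print(report if report is not None else "tesseract could not be run")
```

If the executable is not on PATH, pass its full path as the argument, for example the same path used for tesseract_cmd.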