Use pytesseract OCR to recognize text from an image
Solution 1
Here is my solution:
import pytesseract
from PIL import Image, ImageEnhance, ImageFilter
im = Image.open("temp.jpg") # the second one
im = im.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(2)
im = im.convert('1')
im.save('temp2.jpg')
text = pytesseract.image_to_string(Image.open('temp2.jpg'))
print(text)
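A side note on the last conversion step: as far as I know, Pillow's convert('1') applies Floyd-Steinberg dithering by default, which can leave scattered dots around the glyphs. Below is a minimal sketch of a hard threshold instead. The helper names and the cutoff of 128 are my own assumptions, not part of the answer above, so tune them for your image.

```python
def threshold_table(cutoff=128):
    """Build a 256-entry lookup table: gray levels below cutoff -> 0, others -> 255."""
    return [0 if value < cutoff else 255 for value in range(256)]


def binarize_without_dither(path, cutoff=128):
    """Open an image, convert it to grayscale, and apply a hard threshold.

    Pillow is imported lazily so threshold_table can be used on its own.
    """
    from PIL import Image

    gray = Image.open(path).convert("L")
    # point() with mode="1" maps every gray level through the table and
    # returns a bilevel image; no dithering is involved.
    return gray.point(threshold_table(cutoff), mode="1")
```

The returned image can be passed straight to pytesseract.image_to_string, which also avoids the lossy round trip through a JPEG file.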
Solution 2
Here's a simple approach using OpenCV and Pytesseract OCR. To perform OCR on an image, it's important to preprocess the image. The idea is to obtain a processed image where the text to extract is in black and the background is in white. To do this, we can convert to grayscale, apply a slight Gaussian blur, then apply Otsu's threshold to obtain a binary image. From here, we can apply morphological operations to remove noise. Finally, we invert the image. We perform text extraction using the --psm 6 configuration option to assume a single uniform block of text. Take a look here for more options.
Here's a visualization of the image processing pipeline:
Input image
Convert to grayscale -> Gaussian blur -> Otsu's threshold
Notice how there are tiny specks of noise. To remove them, we can perform morphological operations
Finally we invert the image
Result from Pytesseract OCR
2HHH
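To make the noise-removal step less of a black box, here is a rough pure-Python sketch of a morphological opening (erosion followed by dilation) with a 3x3 square kernel on a grid of 0/1 values, where 1 is foreground. The function names are hypothetical, and pixels outside the grid are treated as background, which may differ from OpenCV's border handling, so treat it as an illustration only.

```python
# Offsets covering a pixel and its 8 neighbours (a 3x3 square kernel).
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _neighbourhood(grid, y, x):
    """Yield the 3x3 neighbourhood of (y, x); out-of-bounds cells count as 0."""
    height, width = len(grid), len(grid[0])
    for dy, dx in OFFSETS:
        ny, nx = y + dy, x + dx
        yield grid[ny][nx] if 0 <= ny < height and 0 <= nx < width else 0


def erode(grid):
    """Keep a pixel only if its whole 3x3 neighbourhood is foreground."""
    return [[int(all(_neighbourhood(grid, y, x))) for x in range(len(grid[0]))]
            for y in range(len(grid))]


def dilate(grid):
    """Set a pixel if any cell of its 3x3 neighbourhood is foreground."""
    return [[int(any(_neighbourhood(grid, y, x))) for x in range(len(grid[0]))]
            for y in range(len(grid))]


def opening(grid):
    """Erode, then dilate: removes specks smaller than the kernel."""
    return dilate(erode(grid))
```

An isolated foreground pixel does not survive the erosion, so it stays gone after the dilation, while a shape at least as large as the kernel is restored.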
Code
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Morph open to remove noise and invert image
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)
invert = 255 - opening
# Perform text extraction
data = pytesseract.image_to_string(invert, lang='eng', config='--psm 6')
print(data)
cv2.imshow('thresh', thresh)
cv2.imshow('opening', opening)
cv2.imshow('invert', invert)
cv2.waitKey()
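For anyone curious what the Otsu flag does, here is a small pure-Python sketch of the idea: pick the gray level that maximizes the between-class variance of the two groups it creates. The function name is hypothetical and tie-breaking may not match OpenCV exactly, so use cv2.threshold in real code.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0-255) that best separates values <= t from values > t.

    pixels is any sequence of integer gray levels in the range 0-255.
    """
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_between = 0, -1.0
    weight_bg = 0  # number of pixels with value <= t
    sum_bg = 0     # sum of those pixel values
    for t in range(256):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        weight_fg = total - weight_bg
        if weight_bg == 0:
            continue  # no pixels in the lower class yet
        if weight_fg == 0:
            break     # every pixel is in the lower class; nothing left to split
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between > best_between:
            best_between, best_t = between, t
    return best_t
```

With a clearly bimodal input, the returned level falls between the two clusters, which is why no manual threshold is passed in the code above.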
Solution 3
Here is a different pytesseract approach that uses only configuration options:
import pytesseract
from PIL import Image
text = pytesseract.image_to_string(Image.open("temp.jpg"), lang='eng',
config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
print(text)
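If you want to experiment with different modes, a tiny helper for assembling the config string can save typos. This is only a sketch: build_config and keep_allowed are hypothetical names of mine. The post-filter is there because, as far as I recall, some Tesseract 4.0 builds ignore tessedit_char_whitelist when the LSTM engine is used, so filtering the output yourself is a cheap fallback.

```python
def build_config(psm=10, oem=3, whitelist=None):
    """Assemble a Tesseract config string like the one used above."""
    parts = [f"--psm {psm}", f"--oem {oem}"]
    if whitelist:
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def keep_allowed(text, allowed="0123456789"):
    """Drop every character of the OCR output that is not in `allowed`."""
    return "".join(ch for ch in text if ch in allowed)
```

Usage would look like pytesseract.image_to_string(image, lang='eng', config=build_config(whitelist='0123456789')), followed by keep_allowed on the result.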
Solution 4
To extract the text directly from the web, you can try the following implementation (making use of the first image):
import io
import requests
import pytesseract
from PIL import Image, ImageFilter, ImageEnhance
response = requests.get('https://i.stack.imgur.com/HWLay.gif')
img = Image.open(io.BytesIO(response.content))
img = img.convert('L')
img = img.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(img)
img = enhancer.enhance(2)
img = img.convert('1')
img.save('image.jpg')
imagetext = pytesseract.image_to_string(img)
print(imagetext)
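When downloading, it can help to confirm that the response really is an image before handing it to PIL, since a failed request may return an HTML error page. Here is a hedged sketch: sniff_image_format and fetch_image are hypothetical helpers, the 10 second timeout is an arbitrary assumption, and only GIF, PNG and JPEG signatures are checked.

```python
def sniff_image_format(data):
    """Guess the image format from the leading magic bytes; None if unknown."""
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "GIF"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "PNG"
    if data.startswith(b"\xff\xd8\xff"):
        return "JPEG"
    return None


def fetch_image(url, timeout=10):
    """Download url and return a PIL image, failing early on bad responses.

    Third-party imports are done lazily so sniff_image_format works alone.
    """
    import io

    import requests
    from PIL import Image

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    if sniff_image_format(response.content) is None:
        raise ValueError("response does not look like a GIF, PNG, or JPEG")
    return Image.open(io.BytesIO(response.content))
```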
Solution 5
Here is a small improvement that removes noise and stray lines whose colour falls within a certain range.
import pytesseract
from PIL import Image, ImageEnhance, ImageFilter
im = Image.open(img) # img is the path of the image
im = im.convert("RGBA")
newimdata = []
datas = im.getdata()
for item in datas:
    if item[0] < 112 or item[1] < 112 or item[2] < 112:
        newimdata.append(item)
    else:
        newimdata.append((255, 255, 255))
im.putdata(newimdata)
im = im.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(2)
im = im.convert('1')
im.save('temp2.jpg')
text = pytesseract.image_to_string(Image.open('temp2.jpg'), config='-c tessedit_char_whitelist=0123456789abcdefghijklmnopqrstuvwxyz --psm 6', lang='eng')
print(text)
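The colour filter in the loop above can be pulled out into a small function that is easy to try on a few pixels first. This is just a sketch of the same rule (keep a pixel if any of its R, G, B channels is below the cutoff, otherwise paint it white); the function name is hypothetical and 112 is the cutoff from the code above.

```python
def whiten_light_pixels(pixels, cutoff=112):
    """Keep pixels with at least one dark channel; turn all others white.

    Works on RGB or RGBA tuples; for RGBA the white pixel is fully opaque.
    """
    cleaned = []
    for pixel in pixels:
        red, green, blue = pixel[:3]
        if red < cutoff or green < cutoff or blue < cutoff:
            cleaned.append(pixel)
        else:
            # Same number of channels as the input pixel, all set to 255.
            cleaned.append((255,) * len(pixel))
    return cleaned
```

With Pillow this would be used as im.putdata(whiten_light_pixels(list(im.getdata()))).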
Smith John
Updated on July 09, 2022
Comments
-
Smith John almost 2 years
I need to use Pytesseract to extract text from this picture:
and the code:
from PIL import Image, ImageEnhance, ImageFilter
import pytesseract
path = 'pic.gif'
img = Image.open(path)
img = img.convert('RGBA')
pix = img.load()
for y in range(img.size[1]):
    for x in range(img.size[0]):
        if pix[x, y][0] < 102 or pix[x, y][1] < 102 or pix[x, y][2] < 102:
            pix[x, y] = (0, 0, 0, 255)
        else:
            pix[x, y] = (255, 255, 255, 255)
img.save('temp.jpg')
text = pytesseract.image_to_string(Image.open('temp.jpg'))
# os.remove('temp.jpg')
print(text)
and the "temp.jpg" is
Not bad, but the result of print is
,2 WW
Not the right text 2HHH, so how can I remove those black dots?
-
MAK over 6 years
Hi, when I use this code I get the error below: "UnicodeEncodeError: 'charmap' codec can't encode characters in position 11-12: character maps to <undefined>". Can you suggest a way to overcome this?
-
Moon Cheesez over 6 years
@MAK You will need to install win-unicode-console on your Windows machine.
-
David about 6 years
Something never worked with the image, can you edit and try again?
-
nishit chittora about 6 years
@David Can you please elaborate? What's not working?
-
David about 6 years
Hmm, I don't remember at the moment, but I'm sure it was not related to the code but probably to an image uploaded here. Did you remove an upload? I don't see it anymore.
-
RAno almost 5 years
I have tried -psm and nothing worked, but after seeing your post I tried --psm and it solved everything. Great!
-
Md. Rezaul Karim over 2 years
This is one of the most accurate and neatly explained answers I have seen on SO! Thanks!
-
Hariharan AR over 2 years
This will not work when the text in the image is not English. When I tried this with Japanese and Arabic, the result was not good.