TESSERACT OCR: TEXT DETECTION

INTRODUCTION.

OCR, also known as Optical Character Recognition, is the conversion of two-dimensional images of text, which may contain typed, printed, or handwritten characters, into machine-encoded text.

OCR software identifies and captures words from images of written or printed characters, and can do so across many different languages.

OCR remains a challenging problem when text occurs in unconstrained environments, such as natural scenes, because of geometric distortions, complex backgrounds, and diverse fonts; reading a license plate from a moving vehicle is a typical example.

TESSERACT OCR.

Tesseract is an open-source text recognition (OCR) engine. It can be used to extract printed text from images and supports a wide variety of languages. Tesseract is compatible with many programming languages and frameworks through wrappers such as Python-Tesseract (pytesseract).
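
As a minimal sketch of the pytesseract API (assuming Tesseract and pytesseract are installed, and sample.png is a hypothetical local image containing text), extracting text takes only a few lines:

# Minimal pytesseract usage ("sample.png" is a placeholder image path).
import pytesseract
from PIL import Image

text = pytesseract.image_to_string(Image.open("sample.png"))
print(text)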

OCR Process flow.

Tesseract includes a neural network subsystem configured as a text line recognizer: the input image is segmented into rectangular line regions, and each line is fed into an LSTM model that outputs the recognized text.
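
Using the pytesseract wrapper (installation is covered below), this line- and word-level structure can be inspected with image_to_data, which returns a bounding box and confidence value for every recognized word. The sketch below again assumes a hypothetical sample.png:

import pytesseract
from PIL import Image

# image_to_data returns word-level results grouped by block_num and line_num.
data = pytesseract.image_to_data(Image.open("sample.png"),
                                 output_type=pytesseract.Output.DICT)
for i, word in enumerate(data["text"]):
    if word.strip():
        print(data["line_num"][i], word,
              data["left"][i], data["top"][i],
              data["width"][i], data["height"][i])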

The Python wrapper for Tesseract is known as pytesseract. After installing Tesseract on the system, pytesseract can be installed using pip:

$ pip install pytesseract

The Tesseract library also ships with a command-line tool called tesseract. This tool can be used to perform OCR on an image, with the output stored in a text file.
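
For example (assuming input.png is a local image), the following command writes the recognized text to output.txt; note that the output base name is given without the .txt extension:

$ tesseract input.png output
$ cat output.txt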

Implementation using pytesseract.

# Import the required packages
import os
import argparse

import cv2
import pytesseract
from PIL import Image

# Construct an argument parser
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image",
                required=True,
                help="path to the input image")
ap.add_argument("-p", "--pre_processor",
                default="thresh",
                help="preprocessing step to apply (thresh or blur)")
args = vars(ap.parse_args())

# Load the input image
images = cv2.imread(args["image"])

# Convert to a grayscale image
gray = cv2.cvtColor(images, cv2.COLOR_BGR2GRAY)


# Apply the chosen preprocessing step (thresh or blur) and keep the result
if args["pre_processor"] == "thresh":
    gray = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
elif args["pre_processor"] == "blur":
    gray = cv2.medianBlur(gray, 3)

# Write the preprocessed image to a temporary file, run OCR on it, then remove the file
filename = "{}.jpg".format(os.getpid())
cv2.imwrite(filename, gray)
text = pytesseract.image_to_string(Image.open(filename))
os.remove(filename)
print(text)

# Output: show the original and preprocessed images
cv2.imshow("Image Input", images)
cv2.imshow("Output In Grayscale", gray)
cv2.waitKey(0)

Follow these steps to read the text from the image:
            • Save the image and the code in the same folder.
            • Open the command prompt from the folder where the image and code are saved.
            • Execute the command shown in the output section.

OUTPUT:

python tesseract.py --image Images/(title).jpg
