This is the first day of my November challenge.

Recognizing digits with OpenCV and Python

This article demonstrates how to use OpenCV and Python to recognize numbers in an image.

In the first part of this tutorial, we will discuss what seven-segment displays are and how we can apply computer vision and image processing operations to recognize the digits they show (no machine learning required!).

Seven-segment displays

You may already be familiar with seven-segment displays, even if you don’t know the specific term. A good example of this display is your classic digital alarm clock:

Each number on the alarm clock is represented by a seven-segment component, as follows:

A total of 128 possible states can be displayed on the seven-segment display:

We’re only interested in 10 of them — the numbers 0 to 9:
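The 128 figure follows from each of the seven segments being independently on or off. A quick sanity check in Python (illustrative only, not part of the tutorial code):

```python
from itertools import product

# each of the 7 segments is either off (0) or on (1),
# so the display has 2**7 possible states
states = list(product([0, 1], repeat=7))
print(len(states))
```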

Our goal is to write OpenCV and Python code to recognize each of these ten numeric states in the image.

Designing the OpenCV digit recognizer

We will use the thermostat image as input:

Steps to identify:

Step 1: Position the LCD on the thermostat. This can be done using edge detection because there is enough contrast between the plastic case and the LCD.

Step 2: Extract the LCD. Given the edge map, I can find contours and look for rectangular ones; the largest rectangular region should correspond to the LCD. A perspective transform will give me a clean extraction of the LCD.

Step 3: Extract the digit regions. Once I have the LCD itself, I can focus on extracting the digits. Since there appears to be good contrast between the digit regions and the LCD background, thresholding and morphological operations should accomplish this.

Step 4: Identify the digits. Recognizing the actual digits with OpenCV will involve dividing each digit ROI into seven segments. From there I can apply pixel counting on the thresholded image to determine whether a given segment is “on” or “off”.

To see how we can accomplish this four-step process of digit recognition with OpenCV and Python, keep reading.

Recognizing digits with computer vision and OpenCV

Let’s continue with this example. Create a new file, name it identify_digits.py, and insert the following code:

# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import imutils
import cv2
# define the dictionary of digit segments so we can identify
# each digit on the thermostat
DIGITS_LOOKUP = {
    (1, 1, 1, 0, 1, 1, 1): 0,
    (0, 0, 1, 0, 0, 1, 0): 1,
    (1, 0, 1, 1, 1, 1, 0): 2,
    (1, 0, 1, 1, 0, 1, 1): 3,
    (0, 1, 1, 1, 0, 1, 0): 4,
    (1, 1, 0, 1, 0, 1, 1): 5,
    (1, 1, 0, 1, 1, 1, 1): 6,
    (1, 0, 1, 0, 0, 1, 0): 7,
    (1, 1, 1, 1, 1, 1, 1): 8,
    (1, 1, 1, 1, 0, 1, 1): 9
}

We start by importing our required Python packages, including imutils, my collection of convenience functions that make working with OpenCV + Python easier. If you haven’t installed imutils yet, take a moment to install the package on your system using pip:

pip install imutils

Next, we define a Python dictionary named DIGITS_LOOKUP. The keys of this table are seven-element tuples, one entry per segment. A 1 in the tuple indicates that the given segment is on, and a 0 indicates that the segment is off. The value is the actual digit itself: 0-9.

Once we have identified the segments in the thermostat display, we can pass the array to our DIGITS_LOOKUP table and get the numeric values. For reference, the dictionary uses the same segment order as in Figure 2 above. Let’s continue with our example:
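As a standalone illustration of the lookup idea, here is the same table used on its own; the segment tuple below is a hypothetical input we might derive from an image:

```python
# same lookup table as in the tutorial code
DIGITS_LOOKUP = {
    (1, 1, 1, 0, 1, 1, 1): 0,
    (0, 0, 1, 0, 0, 1, 0): 1,
    (1, 0, 1, 1, 1, 1, 0): 2,
    (1, 0, 1, 1, 0, 1, 1): 3,
    (0, 1, 1, 1, 0, 1, 0): 4,
    (1, 1, 0, 1, 0, 1, 1): 5,
    (1, 1, 0, 1, 1, 1, 1): 6,
    (1, 0, 1, 0, 0, 1, 0): 7,
    (1, 1, 1, 1, 1, 1, 1): 8,
    (1, 1, 1, 1, 0, 1, 1): 9,
}

# suppose only the two right-hand segments are lit
segments_on = (0, 0, 1, 0, 0, 1, 0)
print(DIGITS_LOOKUP[segments_on])  # -> 1
```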

# load the example image
image = cv2.imread("example.jpg")
# pre-process the image by resizing it, converting it to
# grayscale, blurring it, and computing an edge map
image = imutils.resize(image, height=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 50, 200, 255)

Load our image.

Then we preprocess the image by:

  • Resizing it.
  • Converting it to grayscale.
  • Applying a Gaussian blur with a 5×5 kernel to reduce high-frequency noise.
  • Computing an edge map with the Canny edge detector.

After applying these preprocessing steps, our edge map looks like this:

Notice how the outline of the LCD is clearly visible — this completes Step 1. We can now move on to Step 2, extracting the LCD itself:

# find contours in the edge map, then sort them by their
# size in descending order
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
    cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
displayCnt = None
# loop over the contours
for c in cnts:
    # approximate the contour
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.02 * peri, True)
    # if the contour has four vertices, then we have found
    # the thermostat display
    if len(approx) == 4:
        displayCnt = approx
        break

To find the LCD region, we need to extract the contours (i.e., outlines) of the regions in the edge map.

We then sort the contours by their area in descending order, placing the largest contours at the front of the list.

Given our sorted list of contours, we loop over them one by one and apply contour approximation.

If our approximate contour has four vertices, then we assume we have found the thermostat display. This is a reasonable assumption because the largest rectangular area in our input image should be the LCD itself.

After obtaining four vertices, we can extract the LCD through a four-point perspective transform:

# extract the thermostat display, apply a perspective transform
# to it
warped = four_point_transform(gray, displayCnt.reshape(4, 2))
output = four_point_transform(image, displayCnt.reshape(4, 2))

Applying this perspective transformation gives us a top-down LCD aerial view:

Obtaining this view of the LCD satisfies Step 2. We are now ready to extract the digits from the LCD:

# threshold the warped image, then apply a series of morphological
# operations to cleanup the thresholded image
thresh = cv2.threshold(warped, 0, 255,
    cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (1, 5))
thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)

To get the digits themselves, we need to threshold the warped image so that the dark regions (the digits) stand out against the lighter background of the LCD display:

We then apply a series of morphological operations to clean up the thresholded image:

Now that we have a nicely segmented image, we need to apply contour filtering again, only this time we are looking for the actual numbers:

# find contours in the thresholded image, then initialize the
# digit contours lists
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
    cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
digitCnts = []
# loop over the digit area candidates
for c in cnts:
    # compute the bounding box of the contour
    (x, y, w, h) = cv2.boundingRect(c)
    # if the contour is sufficiently large, it must be a digit
    if w >= 15 and (h >= 30 and h <= 40):
        digitCnts.append(c)

To do this, we find the contours in the thresholded image and initialize the digitCnts list — this list will store the contours of the digits themselves.

Loop over each contour.

For each contour, we compute the bounding box and verify that the width and height are of acceptable size; if so, we add the contour to the digitCnts list.
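The size filter can be sketched on its own with plain (x, y, w, h) tuples standing in for cv2.boundingRect results; the numbers below are made up for illustration:

```python
# hypothetical bounding boxes: one noise blob and two digit-sized boxes
boxes = [(5, 5, 3, 8), (20, 10, 18, 35), (50, 10, 17, 33)]

# keep only boxes with digit-like dimensions, mirroring the tutorial's
# check: w >= 15 and 30 <= h <= 40
digit_boxes = [(x, y, w, h) for (x, y, w, h) in boxes
               if w >= 15 and 30 <= h <= 40]
print(len(digit_boxes))  # -> 2
```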

If we loop over the contours in digitCnts and draw their bounding boxes on the image, the result looks like this:

Sure enough, we’ve found the digits on the LCD! The final step is to actually identify each digit:

# sort the contours from left-to-right, then initialize the
# actual digits themselves
digitCnts = contours.sort_contours(digitCnts,
    method="left-to-right")[0]
digits = []

Here, we simply sort the digit contours from left to right based on their (x, y)-coordinates.

This sorting step is necessary because there is no guarantee that the contours have been sorted from left to right (in the same direction as we read the numbers).
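Under the hood, this kind of left-to-right sorting amounts to ordering contours by the x-coordinate of their bounding boxes, which is roughly what imutils.contours.sort_contours does. A simplified sketch using plain bounding boxes (hypothetical values):

```python
# (x, y, w, h) bounding boxes in arbitrary order
boxes = [(120, 10, 30, 40), (20, 12, 28, 38), (70, 11, 29, 39)]

# sort by the x-coordinate to obtain left-to-right reading order
ordered = sorted(boxes, key=lambda b: b[0])
print([b[0] for b in ordered])  # -> [20, 70, 120]
```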

Here’s the actual number recognition process:

# loop over each of the digits
for c in digitCnts:
    # extract the digit ROI
    (x, y, w, h) = cv2.boundingRect(c)
    roi = thresh[y:y + h, x:x + w]
    # compute the width and height of each of the 7 segments
    # we are going to examine
    (roiH, roiW) = roi.shape
    (dW, dH) = (int(roiW * 0.25), int(roiH * 0.15))
    dHC = int(roiH * 0.05)
    # define the set of 7 segments
    segments = [
        ((0, 0), (w, dH)),                           # top
        ((0, 0), (dW, h // 2)),                      # top-left
        ((w - dW, 0), (w, h // 2)),                  # top-right
        ((0, (h // 2) - dHC), (w, (h // 2) + dHC)),  # center
        ((0, h // 2), (dW, h)),                      # bottom-left
        ((w - dW, h // 2), (w, h)),                  # bottom-right
        ((0, h - dH), (w, h))                        # bottom
    ]
    on = [0] * len(segments)

We iterate over each digit contour. For each of these regions, we compute the bounding box and extract the digit ROI.

I have included GIF animations of each number ROI below:

Given the digit ROI, we now need to locate and extract the seven segments of the digit display.

We compute the approximate width and height of each segment based on the ROI dimensions, then define a list of (x, y)-coordinates corresponding to the seven segments. This list follows the same segment ordering as Figure 2 above. Here is a sample GIF animation that draws a green box over the segment currently being examined:

Finally, we initialize our on list: a value of 1 in this list means that the given segment is turned “on”, and a value of 0 means that it is “off”. Given the (x, y)-coordinates of the seven display segments, identifying whether a segment is on or off is fairly straightforward:

    # loop over the segments
    for (i, ((xA, yA), (xB, yB))) in enumerate(segments):
        # extract the segment ROI, count the total number of
        # thresholded pixels in the segment, and then compute
        # the area of the segment
        segROI = roi[yA:yB, xA:xB]
        total = cv2.countNonZero(segROI)
        area = (xB - xA) * (yB - yA)
        # if the total number of non-zero pixels is greater than
        # 50% of the area, mark the segment as "on"
        if total / float(area) > 0.5:
            on[i] = 1
    # lookup the digit and draw it on the image
    digit = DIGITS_LOOKUP[tuple(on)]
    digits.append(digit)
    cv2.rectangle(output, (x, y), (x + w, y + h), (0, 255, 0), 1)
    cv2.putText(output, str(digit), (x - 10, y - 10),
        cv2.FONT_HERSHEY_SIMPLEX, 0.65, (0, 255, 0), 2)

We’re going to loop through the (x, y) coordinates of each segment.

We extract the segment ROI and then count its non-zero pixels (i.e., the number of pixels in the segment that are “on”).

If the ratio of non-zero pixels to the total segment area is greater than 50%, we assume the segment is “on” and update our on list accordingly. After looping over all seven segments, we can pass the on list to DIGITS_LOOKUP to obtain the digit itself.
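The on/off test for a single segment can be sketched without OpenCV, using a plain list of thresholded pixel values (0 or 255) in place of the segment ROI; cv2.countNonZero plays the role of the count below (the pixel values are made up for illustration):

```python
# hypothetical thresholded pixels for one segment: 7 of 10 are lit
seg_pixels = [255, 255, 0, 255, 0, 255, 255, 0, 255, 255]

total = sum(1 for p in seg_pixels if p > 0)  # non-zero pixel count
area = len(seg_pixels)

# mark the segment "on" if more than 50% of its area is lit
is_on = total / float(area) > 0.5  # 7/10 = 0.7
print(is_on)  # -> True
```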

We then draw a bounding box around the number and display the number on the output image. Finally, our last block of code prints the number to our screen and displays the output image:

# display the digits
print(u"{}{}.{} \u00b0C".format(*digits))
cv2.imshow("Input", image)
cv2.imshow("Output", output)
cv2.waitKey(0)

Note how we correctly recognize numbers on the LCD screen using Python and OpenCV:

Conclusion

In today’s blog post, I demonstrated how to use OpenCV and Python to recognize numbers in images.

This approach is specific to seven-segment displays (the kind you would typically see on a digital alarm clock).

By extracting each of the seven segments and applying basic thresholding and morphological operations, we can determine which segments are “on” and which are “off.”

From there, we can look up on/off segments in Python dictionary data structures to quickly determine the actual number — no machine learning required!

As I mentioned at the beginning of this post, using computer vision to recognize digits in a thermostat image tends to overcomplicate the problem; a thermometer with built-in data logging would be more reliable and require substantially less effort.

I hope you enjoyed today’s post!