This article was originally published in: Walker AI

OpenCV is a cross-platform computer vision and machine learning library distributed under the BSD (open-source) license that runs on Linux, Windows, Android, and macOS. It is lightweight and efficient, consisting of a set of C functions and a small number of C++ classes. It also provides interfaces for Python, Ruby, MATLAB, and other languages, and implements many common algorithms in image processing and computer vision.

OpenCV is widely used in image segmentation, face recognition, object recognition, motion tracking, motion analysis, machine vision and other fields.

The following sections cover the basic operations of OpenCV and an application case.

1. Basic Operations of OpenCV

1.1 Read, display and save operations

import cv2
image = cv2.imread("test.jpg")  # read operation
cv2.imshow("test", image)  # display operation
cv2.waitKey()  # wait for a key press
cv2.imwrite("save.jpg", image)  # save operation

1.2 Change the color space

image = cv2.imread("test.jpg")
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV) # Convert to HSV space
hls = cv2.cvtColor(image, cv2.COLOR_BGR2HLS) # Convert to HLS space
lab = cv2.cvtColor(image, cv2.COLOR_BGR2Lab) # Switch to Lab space
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) # Transform to GRAY space

In the HSV model, a color is described by hue (H), saturation (S), and value (V, brightness). This model is commonly used for green-screen segmentation.

In image detection, converting samples to another color space can serve as data augmentation: for example, convert the training data to HSV space, scale the V (value) channel to brighten or darken the picture, and then convert back to BGR format.

1.3 Geometric transformations — scaling, translation, rotation

A. Scaling
image = cv2.imread("test.jpg")
resize = cv2.resize(image, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_AREA)  # scale width and height to 0.5 times
B. Translation

For image translation we need to create a 2-row, 3-column transformation matrix M, which specifies a horizontal shift of x pixels and a vertical shift of y pixels.

import cv2
import numpy as np
image = cv2.imread("test.jpg")
rows, cols, channels = image.shape
M = np.float32([[1, 0, 100], [0, 1, 50]])  # shift 100 px horizontally and 50 px vertically
res = cv2.warpAffine(image, M, (cols, rows))
C. Rotation

The transformation matrix required for rotation can be obtained with the function cv2.getRotationMatrix2D.

image = cv2.imread('test.jpg')
rows, cols, channels = image.shape
rotate = cv2.getRotationMatrix2D((cols*0.5, rows*0.5), 45, 1)  # arguments: rotation center (x, y), rotation angle in degrees, scale
res = cv2.warpAffine(image, rotate, (cols, rows))

1.4 Smoothing — blurring and filtering

Blurring and filtering operations suppress noise such as salt-and-pepper noise and smooth the image; the trade-off is that some fine detail is lost.

image = cv2.imread('test.jpg')
blur = cv2.blur(image, (5, 5))  # mean filter; the second parameter is the kernel size
median_blur = cv2.medianBlur(image, 5)  # median filter
gaussian_blur = cv2.GaussianBlur(image, (5, 5), 0)  # Gaussian blur; the third parameter is sigmaX

1.5 Dilation and erosion

A. Image morphology operation

Morphological image operations are a family of shape-based image-processing operations, grounded mainly in set theory.

  • Morphology has four basic operations: erosion, dilation, opening, and closing
  • Dilation and erosion are the most common morphological operations in image processing
  • Dilation expands the bright regions of an image ("growing the territory"), so the result has a larger bright area than the original; erosion eats away at the bright regions ("ceding territory"), so the result has a smaller bright area than the original
B. Dilation and erosion

They can be used for a variety of tasks, mainly:

  • Eliminate noise
  • Separate image elements and connect adjacent elements in the image
  • Look for obvious regions of maxima or minima in the image
  • Find the gradient of the image
image = cv2.imread("test.jpg")
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))  # get a structuring element (kernel)
eroded = cv2.erode(image, kernel)  # eroded image
dilated = cv2.dilate(image, kernel)  # dilated image
C. Opening and closing
  • Opening: erosion followed by dilation, used to remove the small specks produced by image noise
  • Closing: dilation followed by erosion, used to join objects that were wrongly split into many small pieces

1.6 Finding and drawing contours

A. Find contours

Contour search is widely used in image detection, for example to find distinct color blocks, stripes, or object edges in an image. The image should be binarized before searching for contours.

# OpenCV 4.x; in OpenCV 3.x findContours returns three values (image, contours, hierarchy)
contours, hierarchy = cv2.findContours(image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)  # arguments: the binary image to search, the contour retrieval mode, the contour approximation method
# contours is the list of contours found; hierarchy describes the nesting relationships between contours
B. Draw contours

After the contours are found, the drawContours function can be used to draw them.

cv2.drawContours(temp, contours, -1, (0, 255, 0), 3)  # arguments: canvas (can be the original image), the list of contours, contour index (-1 draws them all), color, line thickness

2. OpenCV engineering application – color block detection

2.1 Background

Due to missing art resources or program bugs, a game screen can show all kinds of abnormal color blocks, commonly white, purple, and so on. How can we screen out this kind of abnormal frame programmatically and speed up testing?

2.2 Problem Analysis

Observation shows that color-block anomalies are fairly regular rectangular regions with a clearly different color. Based on these characteristics, we can screen out the color blocks without much trouble.

2.3 Program Design

A. Image binarization

Other colors are eliminated by thresholding the R, G, and B channel values to obtain a black-and-white binary image.

import cv2
import numpy as np
image = cv2.imread("test.jpg")
b, g, r = cv2.split(image)  # split the B, G and R channels
b = np.where(b >= 250, 1, 0)  # set B-channel pixels that meet the threshold to 1, the rest to 0
g = np.where(g >= 250, 1, 0)  # set G-channel pixels that meet the threshold to 1, the rest to 0
r = np.where(r >= 250, 1, 0)  # set R-channel pixels that meet the threshold to 1, the rest to 0
gray = b + g + r  # stack the three channels into one channel
gray = np.where(gray == 3, 255, 0).astype(np.uint8)  # pixels that meet the threshold in all three channels become white; the rest become black

B. Find contours

Besides the color block itself, the black-and-white binary image also keeps other image regions whose color is close to the block's, and these need to be removed.

By searching the binary image for contours, we can filter out the scattered small blocks.

contours, hierarchy = cv2.findContours(gray, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)  # get the contours and hierarchy

C. Contour screening

We first reject small blocks by contour perimeter and area; these small blocks are most likely clusters of similarly colored points extracted from normal regions of the frame.

area_limit = 100  # minimum contour area to keep; the threshold is application-specific

def screen_contour(contour):
    contour_area = cv2.contourArea(contour)
    if contour_area > area_limit:
        return True
    return False

pass_contours = []
for contour in contours:
    if screen_contour(contour):
        pass_contours.append(contour)

Since a color block is close to a rectangle, we can compute the bounding width and height of each remaining contour and estimate the area a rectangular block would cover. The closer the contour's actual area is to this estimate, the closer the detected block is to a rectangle. This screens out most irregular blobs.

similar_rate = 0.9  # how close to a rectangle the contour must be; the threshold is application-specific

def screen_contour(contour):
    width = np.max(contour[:, :, 0]) - np.min(contour[:, :, 0])
    height = np.max(contour[:, :, 1]) - np.min(contour[:, :, 1])
    block_area = width * height  # area of the bounding rectangle
    contour_area = cv2.contourArea(contour)
    similarity = contour_area / block_area
    if similarity > similar_rate:
        return True
    return False

pass_contours = []
for contour in contours:
    if screen_contour(contour):
        pass_contours.append(contour)

3. OpenCV and matrices

How does OpenCV do all this?

3.1 Image Reading

By default, an image is read as a matrix of shape (height, width, channels), where the channels are B, G, and R. The color of each pixel is determined jointly by the three channels, whose values give the proportions of blue, green, and red.
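
This layout can be checked directly with a hand-built array, no image file needed:

```python
import numpy as np

# An OpenCV image is just a NumPy array of shape (height, width, channels).
# Build a 2x3 image by hand: one pure-blue pixel, the rest black.
img = np.zeros((2, 3, 3), dtype=np.uint8)
img[0, 0] = (255, 0, 0)  # channel order is B, G, R, so this pixel is blue

height, width, channels = img.shape  # (2, 3, 3)
```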

3.2 Color space conversion

Color space conversion transforms image data from one representation to another. For example, when BGR is converted to HSV, the original three-primary-color representation becomes a hue (H), saturation (S), and value (V) representation, and the meaning of each channel changes.

From an information perspective, a camera converts collected light into digital form (the image matrix). Color space conversion transforms this data from one representation to another, and such transformations can lose information or introduce noise: a camera easily picks up salt-and-pepper noise during capture, discards some frequency bands of the illumination (reducing clarity), and so on. Information can also be lost during color space conversion itself, for example when going from a color image to a grayscale one. Image detection and face recognition in deep learning are both about extracting the information we want from this image data.

3.3 Mathematical meaning of image rotation

When we do image geometry transformation, we need to provide transformation matrix, how does the matrix accomplish these operations?

We know that a matrix represents a mapping between spaces: an n×m matrix (acting on column vectors) represents a linear map from m-dimensional space to n-dimensional space. For example, a 3×3 matrix maps the space of one cube onto another cube whose shape has changed, while a 2×3 matrix maps the cube's space onto its shadow, compressing the cube into a plane.
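
A minimal numerical sketch of the cube-to-shadow example, with a 2×3 matrix acting on column vectors (this projection matrix is just the simplest possible choice):

```python
import numpy as np

# The 8 corners of a unit cube in 3-D, stacked as column vectors.
corners = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)]).T  # shape (3, 8)

# A 2x3 matrix maps 3-D space to 2-D space: this one drops the z axis,
# projecting the cube onto its "shadow" in the x-y plane.
P = np.array([[1, 0, 0],
              [0, 1, 0]])

shadow = P @ corners  # shape (2, 8): every corner is now a 2-D point
```

Corners that differ only in z land on the same 2-D point, which is exactly the information a projection throws away.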

Can a matrix (over the reals) map a cube to a sphere? No: a matrix represents a linear transformation, so it cannot.

When we handle image problems, we need to map information from one space to another. Because of the complexity of the problems, linear mappings are not enough, which is why activation functions are needed in deep learning.
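
A quick numerical check of why linearity is the limitation: composing two linear maps collapses into a single linear map, so stacking linear layers without activations adds no expressive power (the matrix sizes here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))  # second "layer"
B = rng.standard_normal((3, 5))  # first "layer"
x = rng.standard_normal(5)       # input vector

# Applying B then A is identical to applying the single matrix AB.
y_stacked = A @ (B @ x)
y_single = (A @ B) @ x
```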

4. OpenCV and machine learning

From the analysis in the previous section, we saw that image processing is about finding regularities in data by transforming image information from one representation to another. That is exactly what machine learning is good at. Extending this: whether the data are images, text, or audio, whenever we want to find regularities in data, we can use machine learning; from an information perspective, these problems are essentially indistinguishable.

OpenCV integrates many machine learning algorithms, such as k-nearest neighbors (KNN), support vector machines (SVM), decision trees, random forests, boosting, logistic regression, artificial neural networks (ANN), and so on.

There is an illustration from scikit-learn that shows how different machine learning algorithms process data; in a macro sense, they all transform information into different spaces.

5. Video and book recommendations

There are too many interesting things to list. I recommend some videos and books for you to savor.

5.1 Linear Algebra

3Blue1Brown's "Essence of Linear Algebra" video series explains the magic of linear algebra visually through animated graphics, and the channel's other series are excellent as well.

"Programmer's Mathematics 3: Linear Algebra" is also a very good linear algebra textbook; read it carefully and you will benefit a lot.

5.2 Probability theory

"Programmer's Mathematics 2: Probability and Statistics" explains probability and statistics very systematically. If you are not yet comfortable with linear algebra, it is recommended to study linear algebra first.

5.3 OpenCV

Learning OpenCV is a great reference book.


PS: For more technical content, follow the WeChat official account 【xingzhe_ai】 and come discuss with Walker AI!