Markerless augmented reality
In markerless augmented reality, the camera pose is estimated directly from the image by finding correspondences between known points in the environment and their projections in the camera image. In this section, we will see how to extract features from an image and, based on these features and their matches, how the camera pose can ultimately be derived.
Feature detection
A feature can be described as a small patch of an image that is, as far as possible, invariant to image scaling, rotation, and illumination changes. Therefore, the same feature can be detected in different images of the same scene taken from different viewpoints. To sum up, a good feature should be:
- Repeatable (the same feature can be extracted from different images of the same object)
- Distinctive (images with different structures have different features)
OpenCV provides a number of algorithms to detect image features, including: Harris corner detection, Shi-Tomasi corner detection, SIFT (Scale-Invariant Feature Transform), SURF (Speeded-Up Robust Features), FAST (Features from Accelerated Segment Test), BRIEF (Binary Robust Independent Elementary Features), and ORB (Oriented FAST and Rotated BRIEF).
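Most of these detectors and descriptors share the same detect()/compute() interface in OpenCV, so they can be swapped with little code change. A brief sketch, assuming a recent OpenCV build (SIFT requires OpenCV >= 4.4.0, and SURF is only available in the contrib modules, so it is omitted here):
import cv2

fast = cv2.FastFeatureDetector_create()   # keypoint detector only (no descriptor)
brisk = cv2.BRISK_create()                # detector + binary descriptor
sift = cv2.SIFT_create()                  # detector + floating-point descriptor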
Next, we take the ORB algorithm as an example to detect and describe image features. ORB can be thought of as a combination of the FAST keypoint detector and the BRIEF descriptor, with key modifications to improve performance.
First, ORB detects keypoints (500 by default) using the modified FAST-9 algorithm (circle radius = 9, also storing the orientation of each detected keypoint). Once a keypoint has been detected, the next step is to compute a descriptor that captures the information associated with it. ORB uses the modified BRIEF-32 descriptor, so the descriptor of each detected keypoint is a vector of 32 bytes, for example:
[43 106 98 30 127 147 250 72 95 68 244 175 40 200 247 164 254 168 146 197 198 191 46 255 22 94 129 171 95 14 122 207]
As explained above, the first step is to create the ORB detector:
import cv2
orb = cv2.ORB_create()
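cv2.ORB_create() uses reasonable defaults (for example, a maximum of 500 keypoints). If needed, the main parameters can also be set explicitly; the following is a minimal sketch using OpenCV's documented parameters, not something required by the example above:
# Alternative construction with explicit parameters
orb = cv2.ORB_create(nfeatures=500,     # maximum number of keypoints to retain
                     scaleFactor=1.2,   # pyramid decimation ratio
                     nlevels=8,         # number of pyramid levels
                     fastThreshold=20)  # threshold used by the internal FAST detector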
Then, detect the keypoints in the image:
# load image
image = cv2.imread('example.png')
# Detect key points in the image
keypoints = orb.detect(image, None)
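Each detected keypoint is a cv2.KeyPoint object that stores, among other things, its position, size, and the orientation mentioned above. A quick inspection sketch (assuming at least one keypoint was found in example.png):
# Inspect the first detected keypoint
kp = keypoints[0]
print(kp.pt)      # (x, y) position in the image
print(kp.size)    # diameter of the meaningful keypoint neighborhood
print(kp.angle)   # orientation in degrees (-1 if not computed)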
After the keypoints have been detected, the next step is to compute the descriptors of the detected keypoints:
keypoints, descriptors = orb.compute(image, keypoints)
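At this point, descriptors is a NumPy array with one 32-byte row per detected keypoint. A minimal sketch to inspect it (the exact number of rows depends on the image used):
# Each row is the 32-byte binary descriptor of one keypoint
print(descriptors.shape)   # e.g. (500, 32)
print(descriptors.dtype)   # uint8
print(descriptors[0])      # a 32-value row like the example shown above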
Note that you can also call the orb.detectAndCompute(image, None) function to detect the keypoints and compute their descriptors in a single step.
Finally, draw the detected keypoints using the cv2.drawKeypoints() function:
image_keypoints = cv2.drawKeypoints(image, keypoints, None, color=(255, 0, 255), flags=0)
The result is shown below. The image on the right shows the keypoints detected by the ORB keypoint detector (the image has been partially enlarged for better observation):
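Since ORB also stores the size and orientation of each keypoint, these can be visualized as well. A small optional variation (not part of the example above) using OpenCV's rich-keypoints drawing flag:
# Draw circles whose radius reflects keypoint size and whose line shows orientation
image_rich = cv2.drawKeypoints(image, keypoints, None, color=(255, 0, 255),
                               flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)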
Feature matching
Next, you’ll see how to match the detected features. OpenCV provides two matchers:
- Brute-Force (BF) matcher: this matcher takes each descriptor computed for the features detected in the first set and matches it against all the descriptors in the second set, finally returning the closest match.
- Fast Library for Approximate Nearest Neighbors (FLANN) matcher: for large datasets, this matcher is faster than the BF matcher, as it uses optimized algorithms for nearest-neighbor search (see the sketch after this list).
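The rest of this section uses the BF matcher. For reference, the following is a minimal sketch of how a FLANN matcher could be configured for binary descriptors such as ORB, using an LSH index; the parameter values are the ones commonly suggested in the OpenCV documentation, and descriptors_1 and descriptors_2 are assumed to be ORB descriptors computed as in the next code block:
# FLANN with an LSH index, suitable for binary descriptors (ORB, BRIEF, BRISK)
FLANN_INDEX_LSH = 6
index_params = dict(algorithm=FLANN_INDEX_LSH, table_number=6, key_size=12, multi_probe_level=1)
search_params = dict(checks=50)
flann_matcher = cv2.FlannBasedMatcher(index_params, search_params)
# k-nearest-neighbor matching; each element may contain fewer than k matches
knn_matches = flann_matcher.knnMatch(descriptors_1, descriptors_2, k=2)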
Next, we use the BF matcher to see how to match the detected features. The first step is to detect key points and calculate descriptors:
# load image
image_1 = cv2.imread('example_1.png')
image_2 = cv2.imread('example_2.png')
# ORB detector initialization
orb = cv2.ORB_create()
# Use ORB to detect keypoints and compute descriptors
keypoints_1, descriptors_1 = orb.detectAndCompute(image_1, None)
keypoints_2, descriptors_2 = orb.detectAndCompute(image_2, None)
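cv2.imread() returns None when a file cannot be read, so before matching it can be worth checking that both images were actually loaded; a small defensive sketch using the file names from the code above:
# Fail early if either image could not be read
if image_1 is None or image_2 is None:
    raise FileNotFoundError("Could not load 'example_1.png' or 'example_2.png'")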
The next step is to create the BF matcher object with cv2.BFMatcher():
bf_matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
The first parameter of the cv2.BFMatcher() function, normType, sets the distance measure to use and defaults to cv2.NORM_L2. When using the ORB descriptor (or any other binary descriptor, such as BRIEF or BRISK), the distance measure to use is cv2.NORM_HAMMING. The second parameter, crossCheck (which defaults to False), can be set to True so that only consistent matches between the two sets are returned, that is, pairs in which the two features are each other's best match.
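To get a rough feel for the effect of crossCheck, the two variants can be compared directly with the match() method introduced below (a sketch that assumes the descriptors_1 and descriptors_2 arrays computed above):
# Without cross-checking: one best match is returned for every descriptor in the first set
matcher_plain = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=False)
print(len(matcher_plain.match(descriptors_1, descriptors_2)))
# With cross-checking: usually fewer matches, since only mutually best pairs are kept
matcher_cross = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
print(len(matcher_cross.match(descriptors_1, descriptors_2)))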
After the matcher object has been created, use the bf_matcher.match() method to match the computed descriptors:
bf_matches = bf_matcher.match(descriptors_1, descriptors_2)
descriptors_1 and descriptors_2 are the computed descriptors, and the return value is the list of best matches found between the two images. We can sort these matches in ascending order of distance so that the best matches (those with the smallest distance) come first:
bf_matches = sorted(bf_matches, key=lambda x: x.distance)
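Each element of bf_matches is a cv2.DMatch object; the attributes used later (queryIdx, trainIdx, and distance) can be inspected as follows (a minimal sketch):
# Inspect the best (smallest-distance) match
best = bf_matches[0]
print(best.queryIdx)   # index of the matched keypoint in keypoints_1
print(best.trainIdx)   # index of the matched keypoint in keypoints_2
print(best.distance)   # Hamming distance between the two descriptors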
Finally, we can use the cv2.drawMatches() function to draw the matches, showing only the first 20 matches for better visualization:
result = cv2.drawMatches(image_1, keypoints_1, image_2, keypoints_2, bf_matches[:20], None, matchColor=(255, 255, 0), singlePointColor=(255, 0, 255), flags=0)
The cv2.drawMatches() function horizontally concatenates the two images and draws lines from the first image to the second to show the matched feature pairs.
Use feature matching and homography calculations to find objects
Finally, we will use the feature matching described above together with a homography computation to find a known object in a scene. To achieve this, once feature matching has been completed, the next step is to use the cv2.findHomography() function to find the perspective transformation between the positions of the matched keypoints in the two images. In the code below, image_query is the image of the object to find, image_scene is the image in which to look for it, keypoints_1 and keypoints_2 are their respective ORB keypoints, and best_matches is the sorted list of good matches between them, obtained as in the previous section.
OpenCV provides several methods for computing the homography matrix, including RANSAC, Least-Median (LMEDS), and PROSAC (RHO). The following example uses the RANSAC method:
import numpy as np

# Extract the positions of the matched keypoints
pts_src = np.float32([keypoints_1[m.queryIdx].pt for m in best_matches]).reshape(-1, 1, 2)
pts_dst = np.float32([keypoints_2[m.trainIdx].pt for m in best_matches]).reshape(-1, 1, 2)
# Compute the homography matrix
M, mask = cv2.findHomography(pts_src, pts_dst, cv2.RANSAC, 5.0)
pts_src contains the positions of the matched keypoints in the query image, and pts_dst contains the positions of the matched keypoints in the scene image. The fourth parameter, ransacReprojThreshold, sets the maximum reprojection error allowed for a point pair to be treated as an inlier; if the reprojection error is greater than 5.0, the corresponding point pair is treated as an outlier. The function computes and returns the perspective transformation matrix M between the source plane and the destination plane defined by these keypoint positions.
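cv2.findHomography() also returns a mask indicating which of the supplied point pairs RANSAC classified as inliers; a minimal sketch of how it can be inspected:
# mask has one entry per point pair: 1 for inliers, 0 for outliers
matches_mask = mask.ravel().tolist()
print('inliers:', sum(matches_mask), 'of', len(matches_mask))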
Finally, based on the perspective transformation matrix M, the four corners of the object are computed in the scene image and used to draw the bounding box of the matched object. To do this, the four corners of the query image are obtained from its shape and transformed using the cv2.perspectiveTransform() function:
# Get the corner coordinates of the "query" image
h, w = image_query.shape[:2]
pts_corners_src = np.float32([[0, 0], [0, h - 1], [w - 1, h - 1], [w - 1, 0]]).reshape(-1, 1, 2)
# Use the matrix M to perspective-transform the corners of the "query" image into the corners of the detected object in the "scene" image
pts_corners_dst = cv2.perspectiveTransform(pts_corners_src, M)
pts_corners_src contains the four corner coordinates of the query image, and M is the perspective transformation matrix. The output, pts_corners_dst, contains the four corners of the object in the scene image. After that, we can use the cv2.polylines() function to draw the outline of the detected object:
img_obj = cv2.polylines(image_scene, [np.int32(pts_corners_dst)], True, (0, 255, 255), 10)
Finally, use the cv2.drawMatches() function to draw the matches:
img_matching = cv2.drawMatches(image_query, keypoints_1, img_obj, keypoints_2, best_matches, None, matchColor=(255, 255, 0), singlePointColor=(255, 0, 255), flags=0)
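To inspect the result, the final visualization can be displayed on screen or written to disk; a minimal usage sketch:
# Show the final visualization and wait for a key press
cv2.imshow('Feature matching and homography', img_matching)
cv2.waitKey(0)
cv2.destroyAllWindows()
# Alternatively, save it to a file
cv2.imwrite('matching_result.png', img_matching)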