Previous articles in this series:
“Python Image Processing OpenCV (1) : Getting Started”
“Python Image Processing OpenCV (2) : Pixel Processing with Numpy manipulation and Matplotlib Display images”
“Python Image Processing OpenCV (3) : Image Properties, ROI Regions of Interest and Channel Processing”
“Python Image Processing OpenCV (4) : Image Arithmetic and Color Space Modification”
“Python Image Processing OpenCV (5) : Geometric Transformations of Images”
“Python Image Processing OpenCV (6) : Image Threshold Processing”
“Python Image Processing OpenCV (7) : Image Smoothing (Filtering) Processing”
“Python Image Processing OpenCV (8) : Image Erosion and Image Dilation”
“Python Image Processing OpenCV (9) : Morphological Opening, Closing, and Gradient Operations”
“Python Image Processing OpenCV (10) : Top Hat and Black Hat Morphological Operations”
“Python Image Processing OpenCV (11) : Canny Operator Edge Detection”
“Python Image Processing OpenCV (12) : Edge Detection Techniques for Roberts, Prewitt, Sobel, and Laplacian Operators”
“Python Image Processing OpenCV (13) : Edge Detection techniques using Scharr and LOG Operators”
Introduction
In the previous articles, we scaled images with the resize() function, which can either shrink or enlarge an image:
- Shrinking an image: generally use CV_INTER_AREA (area interpolation; cv2.INTER_AREA in the Python API).
- Enlarging an image: generally use CV_INTER_LINEAR (linear interpolation; cv2.INTER_LINEAR in the Python API).
In addition to the resize() function, there is another way to scale an image – the “image pyramid”.
What is an image pyramid?
Before we get to the image pyramid, we need to introduce another concept: scale.
Scale: Literally means size and resolution.
In image processing, we often enlarge or reduce the size of the source image, and then convert it into the target image of the size we need.
The process of enlarging and shrinking an image is called scaling.
The image pyramid is an important tool for multi-scale image processing.
An image pyramid is a multi-scale representation of an image: a conceptually simple but effective structure for interpreting an image at multiple resolutions. It is a collection of images, all derived from the same original, arranged in a pyramid shape with progressively lower resolution. It is obtained by repeated downsampling until some termination condition is reached. If we compare the layers to a pyramid, the higher the level, the smaller the image and the lower the resolution.
In image fusion, the general idea of the pyramid method is to decompose each image involved into a multi-scale sequence of pyramid images, with low-resolution images in the upper layers and high-resolution images in the lower layers. Each layer has 1/4 as many pixels as the layer below it. The layers are numbered 0, 1, 2, …, N. The pyramids of all images are fused layer by layer according to some rule to give a composite pyramid, which is then reconstructed by inverting the pyramid-generation process to give the fused image.
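To make the 1/4 relationship concrete, here is a small sketch that lists the size of each layer. It assumes each layer is produced with the (n + 1) / 2 rounding rule that pyrDown() uses by default; pyramid_sizes is a hypothetical helper written for this illustration, not an OpenCV function.

```python
def pyramid_sizes(width, height, levels):
    """Return the (width, height) of layer 0 and of each layer above it."""
    sizes = [(width, height)]
    for _ in range(levels):
        # Each layer halves both dimensions, rounding up for odd sizes
        width, height = (width + 1) // 2, (height + 1) // 2
        sizes.append((width, height))
    return sizes

for level, (w, h) in enumerate(pyramid_sizes(560, 560, 3)):
    print(level, w, h, w * h)
# 0 560 560 313600
# 1 280 280 78400
# 2 140 140 19600
# 3 70 70 4900
```

For even dimensions the pixel count drops by exactly a factor of 4 per layer; for odd dimensions the rounding makes it slightly more than 1/4.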
Implementation
In general, we talk about two types of image pyramid: the “Gaussian pyramid” and the “Laplacian pyramid”.
The Gaussian Pyramid
A Gaussian pyramid is a series of images sampled downward from the maximum resolution image at the bottom. The lowest image has the highest resolution, and the higher you go, the lower the resolution.
Downsampling in the Gaussian pyramid:
Downsampling repeats two operations: Gaussian smoothing and resampling.
- First, smooth the image by convolving it with a 5 * 5 Gaussian kernel. The 5 * 5 Gaussian kernel is:
  1/256 *
  [ 1   4   6   4   1 ]
  [ 4  16  24  16   4 ]
  [ 6  24  36  24   6 ]
  [ 4  16  24  16   4 ]
  [ 1   4   6   4   1 ]
- Next, resample the image by removing every even-numbered row and column.
- Repeat the two steps above until the target image size is reached.
As the steps above show, each pass produces an image with only 1/4 as many pixels as its input, because every other row and every other column is dropped.
Note: downsampling gradually discards image information. It is an irreversible, lossy process.
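The smoothing and resampling steps can be sketched in plain NumPy as follows. This is an illustration of the idea, not OpenCV's actual implementation: blur5x5 and pyr_down_sketch are hypothetical names, the kernel is the 5 * 5 binomial kernel with weights summing to 1, and mirrored borders are an assumption chosen for simplicity.

```python
import numpy as np

# 1-D binomial weights; their outer product is the 5x5 kernel (sums to 1)
W = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0
KERNEL = np.outer(W, W)

def blur5x5(img):
    """Convolve a 2-D float image with KERNEL, mirroring pixels at the border."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="reflect")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(5):
        for dx in range(5):
            out += KERNEL[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def pyr_down_sketch(img):
    """Smooth, then keep every second row and column (indices 0, 2, 4, ...)."""
    return blur5x5(img.astype(np.float64))[::2, ::2]

img = np.arange(64, dtype=np.float64).reshape(8, 8)
small = pyr_down_sketch(img)
print(small.shape)  # (4, 4)
```

Because the kernel weights sum to 1, a flat (constant) image keeps its brightness after this step; only fine detail is lost.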
Upsampling in the Gaussian pyramid:
- Double the image size in each direction, filling the new rows and columns with zeros.
- Smooth the result with the same 5 * 5 Gaussian kernel (multiplied by 4) to obtain approximate values for the newly added pixels.
Note: upsampling is not the inverse of downsampling. It cannot restore the information that downsampling discarded, so the round trip is lossy.
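The two upsampling steps can be sketched in NumPy in the same spirit. Again this is an illustration rather than OpenCV's implementation: pyr_up_sketch is a hypothetical name, and the mirrored border handling is an assumption. The kernel is scaled by 4 because three out of every four pixels in the enlarged image start out as zeros.

```python
import numpy as np

W = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0
KERNEL_UP = 4.0 * np.outer(W, W)  # x4 compensates for the inserted zeros

def pyr_up_sketch(img):
    """Insert zero rows/columns, then smooth to fill in the new pixels."""
    h, w = img.shape
    big = np.zeros((2 * h, 2 * w), dtype=np.float64)
    big[::2, ::2] = img  # original pixels land on even rows and columns
    padded = np.pad(big, 2, mode="reflect")
    out = np.zeros_like(big)
    for dy in range(5):
        for dx in range(5):
            out += KERNEL_UP[dy, dx] * padded[dy:dy + 2 * h, dx:dx + 2 * w]
    return out

small = np.full((3, 3), 5.0)
large = pyr_up_sketch(small)
print(large.shape)  # (6, 6)
```

With the factor of 4 in place, a flat image stays flat after upsampling, which is a quick sanity check that the brightness is preserved.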
The image obtained this way is the enlarged image. It is blurrier than the original, because image information was lost during downsampling.
Reducing that loss of information during scaling is what leads to the second kind of image pyramid, the “Laplacian pyramid”.
Laplacian Pyramid
The Laplacian pyramid can be thought of as a residual pyramid: it stores the difference between each image and its downsampled-then-upsampled version.
Take an image Gi from the Gaussian pyramid described above. Downsampling gives G(i+1) = Down(Gi), and upsampling G(i+1) gives Up(Down(Gi)). The result Up(Down(Gi)) is not the same as the original Gi.
This is because the information lost in downsampling cannot be recovered by upsampling; the Gaussian pyramid is a lossy sampling method.
If we want to restore the original image completely, we need to keep the difference information during sampling.
This is the core idea of the Laplacian pyramid: after each downsampling step, upsample again to get Up(Down(Gi)), and record the difference between Gi and Up(Down(Gi)).
The following formula is the difference recording process:
Li = Gi - Up(Down(Gi))
OpenCV functions
OpenCV provides two functions for upsampling and downsampling: pyrDown() and pyrUp().
The prototype of pyrDown() looks like this:
def pyrDown(src, dst=None, dstsize=None, borderType=None)
- src: the input image.
- dst: the output image; it has the same type as src and the size given by dstsize.
- dstsize: the size of the output image after downsampling.
- borderType: how image borders are handled.
By default, dstsize is computed as Size((src.cols+1)/2, (src.rows+1)/2). Whatever value is specified, it must satisfy |dstsize.width * 2 - src.cols| ≤ 2 and |dstsize.height * 2 - src.rows| ≤ 2. In other words, downsampling halves the image size: both rows and columns are halved.
The prototype of pyrUp() is as follows:
def pyrUp(src, dst=None, dstsize=None, borderType=None)
- src: the input image.
- dst: the output image; it has the same type as src and the size given by dstsize.
- dstsize: the size of the output image after upsampling. By default it is computed as Size(src.cols*2, src.rows*2).
- borderType: how image borders are handled.
The parameters have the same meaning as for pyrDown() above.
Here are code examples for the Gaussian and Laplacian pyramids:
import cv2 as cv

# Gaussian pyramid
def gaussian_pyramid(image):
    level = 3  # set the number of pyramid layers to 3
    temp = image.copy()  # work on a copy of the input image
    gaussian_images = []  # one entry per downsampled layer
    for i in range(level):
        # pyrDown() applies Gaussian smoothing first, then downsamples
        # (image size is halved in both the row and column directions)
        dst = cv.pyrDown(temp)
        gaussian_images.append(dst)
        cv.imshow("gaussian" + str(i), dst)
        temp = dst.copy()
    return gaussian_images

# Laplacian pyramid
def laplacian_pyramid(image):
    # The Laplacian pyramid is built from the result of the Gaussian pyramid
    gaussian_images = gaussian_pyramid(image)
    level = len(gaussian_images)
    for i in range(level - 1, -1, -1):
        # The layer below gaussian_images[0] is the input image itself
        lower = image if i == 0 else gaussian_images[i - 1]
        # dstsize is (width, height), while shape is (rows, cols)
        expand = cv.pyrUp(gaussian_images[i], dstsize=(lower.shape[1], lower.shape[0]))
        laplacian = cv.subtract(lower, expand)
        # Show the difference image
        cv.imshow("laplacian_down_" + str(i), laplacian)

src = cv.imread('maliao.jpg')
print(src.shape)
# Set to WINDOW_NORMAL for arbitrary scaling
cv.namedWindow('input_image', cv.WINDOW_AUTOSIZE)
cv.imshow('input_image', src)
laplacian_pyramid(src)
cv.waitKey(0)
cv.destroyAllWindows()
One thing to note about this program: my current version of opencv-python is 4.3.0.36. During upsampling, dstsize must satisfy |dstsize.width - src.cols * 2| = dstsize.width mod 2 and |dstsize.height - src.rows * 2| = dstsize.height mod 2.
There is a pitfall here. dstsize is given as (width, height), whereas image.shape[:2] is (rows, cols), that is, (height, width). If image.shape[:2] is passed directly as dstsize, the code only works for square images, and a rectangular image fails with the following error:
error: (-215:Assertion failed) std::abs(dsize.width - ssize.width*2) == dsize.width % 2 && std::abs(dsize.height - ssize.height*2) == dsize.height % 2 in function 'cv::pyrUp_'
This is why the code above builds dstsize as (shape[1], shape[0]). With that order, rectangular images work as well, and there is no need to resize the input to a square first.