Datawhale Learning

Author: Tong Yao, Datawhale Excellent Learner

Summary: this article reviews the principles of nearest neighbor interpolation, bilinear interpolation, and cubic spline interpolation, and uses image scaling as an example to demonstrate them in C++ and Python.

In image processing, a geometric transformation maps one image onto another. Such transformations can be roughly divided into scaling, flipping, affine transformation (translation, rotation), perspective transformation, and remapping.

In a geometric transformation, some pixels cannot be assigned values directly. For example, if an image is magnified by a factor of two, many target pixels have no direct counterpart in the source. The values of these pixels are determined by interpolation, and different interpolation methods give different results.

In an input image, gray values are defined only at integer positions (u, v). However, when a coordinate (x, y) of the output image is mapped back to the original image, it generally lands on a non-integer position, so the gray value there must be computed by an interpolation algorithm. Common interpolation algorithms include nearest neighbor interpolation, bilinear interpolation, and cubic spline interpolation.

Goals of this article:

  • Understand the relationship between interpolation algorithms and common geometric transformations
  • Understand the principles of the interpolation algorithms
  • Master the interpolation APIs of the OpenCV framework

Principles of the interpolation algorithms

Nearest neighbor interpolation algorithm

1. Principle introduction

After a point of the target image is mapped back into the original image, the pixel value at the nearest integer-coordinate point is taken as the output value of that point.

As shown in the figure above, a point of the target image is projected into the original image at point P, and the point nearest to P is Q11. It follows immediately that f(P) = f(Q11).

2. Examples

As shown in the figure:

To enlarge a 3×3 image to 4×4, let f(x, y) denote the original image and h(x, y) the target image. Mapping each target pixel back to the source and rounding gives h(x, y) = f(round(3x/4), round(3y/4)).
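A minimal sketch of this mapping in plain NumPy (illustrative only, not the OpenCV implementation; `nearest_resize` is a hypothetical helper name):

```python
import numpy as np

def nearest_resize(src, out_h, out_w):
    """Nearest-neighbor resize: round each target coordinate back to the source."""
    in_h, in_w = src.shape[:2]
    dst = np.empty((out_h, out_w), dtype=src.dtype)
    for y in range(out_h):
        for x in range(out_w):
            # map back to the source grid and round to the nearest pixel
            sy = min(int(round(y * in_h / out_h)), in_h - 1)
            sx = min(int(round(x * in_w / out_w)), in_w - 1)
            dst[y, x] = src[sy, sx]
    return dst

src = np.arange(9, dtype=np.uint8).reshape(3, 3)
print(nearest_resize(src, 4, 4))
```

Note that several target pixels copy the same source pixel, which is exactly the blocking effect described below.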

3. The shortcomings

With nearest neighbor interpolation, an enlarged image shows severe mosaic (blocking) artifacts, and a reduced image shows serious distortion.

This is the most basic and simplest way to scale an image: the value of each transformed pixel is determined by a single pixel of the original image. In the example above, the pixel at (0, 0.75) is determined solely by (0, 1), which is clearly suboptimal; its value is related not only to (0, 1) but also to (0, 0), with (0, 1) merely having the greater influence. A better result is obtained if several nearby pixels jointly determine a target pixel according to weights. Bilinear interpolation, described next, solves this problem.

Bilinear interpolation algorithm

1. Linear interpolation

Before discussing bilinear interpolation, consider plain linear interpolation: the straight line through two known points is used to estimate the value of an unknown point between them.
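In standard notation, for two known points (x_0, y_0) and (x_1, y_1), the linear interpolation form is:

```latex
y = y_0 + (x - x_0)\,\frac{y_1 - y_0}{x_1 - x_0}
  = \frac{x_1 - x}{x_1 - x_0}\, y_0 + \frac{x - x_0}{x_1 - x_0}\, y_1
```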

As shown below:

Linear interpolation polynomial: the same two-point formula, viewed as a degree-1 polynomial in x.

In fact the formula remains valid even when x does not lie between x0 and x1; in that case the method is called linear extrapolation rather than interpolation.

Error of linear interpolation: linear interpolation is simply Lagrange interpolation with two nodes, and its interpolation remainder is:
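For two nodes this is the standard Lagrange remainder, for some ξ between x_0 and x_1:

```latex
R_1(x) = \frac{f''(\xi)}{2}\,(x - x_0)(x - x_1)
```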

The remainder shows that the error of linear interpolation grows with the magnitude of the second derivative: the greater the curvature of the function, the larger the error of the linear approximation.

For example, in the figure below the original image is on the left. After stretching, the ideal pixel distribution of the output image would be the one indicated by the green arrow, but linear interpolation produces the result indicated by the red arrow.

2. Bilinear interpolation

Bilinear interpolation form:

Bilinear interpolation is the generalization of linear interpolation to two dimensions: a hyperbolic paraboloid is fitted through four known points.

Concretely, two linear interpolations are performed in the x direction, followed by one linear interpolation in the y direction, as shown below:

First, let f(x, y) be a function of two variables and assume we know its values at the four points (x0, y0), (x1, y0), (x0, y1), and (x1, y1). These four points define a rectangle, and we want to interpolate the value of the function at any point inside it.

Firstly, linear interpolation is performed twice in the x direction to obtain:

Then linear interpolation is performed in the y direction to obtain:

Together, this is the result of bilinear interpolation:
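Written out in standard notation, the two x-direction interpolations and the final y-direction step are:

```latex
\begin{aligned}
f(x, y_0) &\approx \frac{x_1 - x}{x_1 - x_0} f(x_0, y_0) + \frac{x - x_0}{x_1 - x_0} f(x_1, y_0) \\
f(x, y_1) &\approx \frac{x_1 - x}{x_1 - x_0} f(x_0, y_1) + \frac{x - x_0}{x_1 - x_0} f(x_1, y_1) \\
f(x, y)   &\approx \frac{y_1 - y}{y_1 - y_0} f(x, y_0) + \frac{y - y_0}{y_1 - y_0} f(x, y_1)
\end{aligned}
```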

If we choose a coordinate system in which the four known points of f are (0,0), (0,1), (1,0), and (1,1), they become the four vertices of a unit square:

  • First, linear interpolation of the two top vertices gives:

  • Linear interpolation of the two bottom vertices gives:

  • Finally, linear interpolation in the vertical direction gives:

  • The simplified form of the interpolation formula is:
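On the unit square, the standard simplified form is:

```latex
f(x, y) \approx f(0,0)\,(1-x)(1-y) + f(1,0)\,x(1-y) + f(0,1)\,(1-x)\,y + f(1,1)\,xy
```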

3. Aligning the geometric centers of the original and target images

When computing the virtual source coordinates that correspond to each target pixel, the general transformation is srcX = dstX * (srcWidth / dstWidth), srcY = dstY * (srcHeight / dstHeight).

Under this transformation, some pixels of the original image never participate in the computation. For example, when a 9∗9 original is reduced to 3∗3, the origins (0,0) of both images sit at the upper-left corner, and the upper-right corner of the target, (0,2), corresponds to (0∗(9/3), 2∗(9/3)) = (0,6) in the original. There are no target pixels further to the right, so the source pixels to the right of (0,6) are never used.

The corresponding relationship between the pixels of the original image and the target image is as follows:

As the picture shows, only the circled red region participates in the computation: the target image samples only the upper-left part of the original, and the elements toward the lower-right corner take no part in the operation.

To align the center of the original image with that of the target image, a different transformation is used: srcX = (dstX + 0.5) * (srcWidth / dstWidth) - 0.5. This amounts to adding a correction term to the original transformation:

0.5 * (srcWidth / dstWidth - 1)

Under this transformation, the center (1,1) of the target image corresponds to the center (4,4) of the original. The geometric centers of the two images coincide, the points of the original image are used fully, the target samples are evenly spaced, and a margin is left on both sides. OpenCV uses this same transformation internally.
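This can be checked with a few lines (`src_coord` is a hypothetical name for the center-aligned backward mapping, not an OpenCV function):

```python
def src_coord(dst, src_size, dst_size):
    """Center-aligned backward mapping used when resizing."""
    scale = src_size / dst_size
    return (dst + 0.5) * scale - 0.5

# 9x9 shrunk to 3x3: the target center (1,1) maps to the source center (4,4)
print(src_coord(1, 9, 3))   # 4.0
```

The three target samples map to source positions 1.0, 4.0, and 7.0, evenly spaced with a margin on both sides.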

4. The calculation process of cv2.resize()

When an image is reduced, every target point can find the four neighboring source points that surround it, so every point can be computed by bilinear interpolation.

When an image is enlarged, points near the border may fall outside the range of the original image after the coordinate transformation; for example, when a 3∗3 original is enlarged to 4∗4.

  • Intermediate points: bilinear interpolation

Every interior point maps to a source location surrounded by four neighbors, so it is handled by bilinear interpolation.

  • Points on the borders (except the vertices): linear interpolation

For example, the point (1,3) of the target image corresponds to (0.625, 2.125) in the original. The maximum valid ordinate of the original is 2, so the four surrounding neighbors of (0.625, 2.125) do not all exist; instead, the two nearest valid points (0,2) and (1,2) are used for one-dimensional linear interpolation to obtain the pixel value of (1,3).

  • Four vertices: nearest neighbor interpolation

For example, the top-right vertex (0,3) of the target maps to (0, 2.125) in the original (after clamping), and the value of the nearest pixel (0,2) is used directly.

Calculation process:

h(x, y) denotes the target image and f(x, y) the original image.

  • Intermediate points: Bilinear interpolation

  • Points on boundaries (except vertices) : linear interpolation

  • Four vertices: nearest neighbor interpolation
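The coordinates quoted above follow from the same center-aligned mapping; a quick sketch to verify them (hypothetical helper, not OpenCV's internal code):

```python
def src_coord(dst, src_size, dst_size):
    # center-aligned backward mapping, as in the previous section
    return (dst + 0.5) * src_size / dst_size - 0.5

# Enlarging 3x3 to 4x4: target (1, 3) maps back to (0.625, 2.125),
# whose second coordinate exceeds the source's maximum index 2.
print(src_coord(1, 3, 4), src_coord(3, 3, 4))  # 0.625 2.125
```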

This can be tested with a short code example:

```python
import cv2
import numpy as np

src = np.array([[56, 23, 15], [65, 32, 78], [12, 45, 62]], dtype=np.uint8)
print(src)
dst = cv2.resize(src, dsize=(4, 4), interpolation=cv2.INTER_LINEAR)
print(dst)
```

Cubic spline interpolation algorithm

Given n+1 points a = x_0 < x_1 < … < x_n = b and their function values f(x_i), i = 0, 1, …, n, determine a cubic polynomial S_i(x) on each interval [x_i, x_{i+1}]:

Each cubic polynomial has four unknown parameters; with n intervals there are n polynomials and thus 4n unknowns in total. Since n unknowns require n conditions to determine a unique solution, 4n conditions are needed to determine the 4n parameters.

Each cubic polynomial satisfies the following conditions: it interpolates the function values at both endpoints of its interval (2n conditions), and the first and second derivatives are continuous at the n − 1 interior nodes (2n − 2 conditions).

That gives 4n − 2 conditions in total, so 2 more are needed; these come from one of three common kinds of boundary conditions (e.g. natural, clamped, or periodic):

With all 4n conditions, the cubic polynomial on each interval is uniquely determined.

For any point in an interval [x_i, x_{i+1}], S_i(x) gives the interpolation result. Cubic spline interpolation has good convergence, stability, and smoothness, making it a very important interpolation tool.

This section focuses on understanding what cubic spline interpolation does; the full derivation is rather tedious, and interested readers can consult the references.
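For readers who do want to see it in action, here is a compact natural-spline sketch in plain NumPy (illustrative only; `natural_cubic_spline` is a hypothetical helper, and this is not the kernel cv2.INTER_CUBIC uses):

```python
import numpy as np

def natural_cubic_spline(xs, ys):
    """Build a natural cubic spline (S'' = 0 at both ends) through the knots.

    Returns a callable that evaluates the spline at a scalar x.
    """
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    n = len(xs) - 1                     # number of intervals
    h = np.diff(xs)                     # interval widths
    # Tridiagonal system for the second derivatives M_i at the knots.
    A = np.zeros((n + 1, n + 1))
    b = np.zeros(n + 1)
    A[0, 0] = A[n, n] = 1.0             # natural boundary: M_0 = M_n = 0
    for i in range(1, n):
        A[i, i - 1] = h[i - 1]
        A[i, i] = 2.0 * (h[i - 1] + h[i])
        A[i, i + 1] = h[i]
        b[i] = 6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
    M = np.linalg.solve(A, b)

    def S(x):
        i = int(np.clip(np.searchsorted(xs, x) - 1, 0, n - 1))
        t0, t1 = xs[i + 1] - x, x - xs[i]
        return (M[i] * t0**3 / (6 * h[i]) + M[i + 1] * t1**3 / (6 * h[i])
                + (ys[i] / h[i] - M[i] * h[i] / 6) * t0
                + (ys[i + 1] / h[i] - M[i + 1] * h[i] / 6) * t1)
    return S

S = natural_cubic_spline([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 0.0, 1.0])
print(S(1.5))   # smooth value between the middle knots
```

The spline passes exactly through every knot while keeping the first and second derivatives continuous across intervals.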

Two ways of mapping

Forward mapping and backward mapping are both processes that turn one image into another through a geometric transformation. Both aim to obtain the pixels of the target image, but they proceed in different ways.

Forward mapping

The essence of an image transformation is to map the coordinates of each pixel to another position through some functional relation.

The forward mapping process can be decomposed into two steps: coordinate transformation + allocation of pixel values

Coordinate transformation in forward mapping: the position of each pixel in the target image is computed from its coordinates in the original image.

For example, suppose a pixel at (x, y) in the original image lands at (x′, y′) in the new image after the transformation. The transformed coordinates are generally non-integer, and non-integer coordinates are meaningless as pixel locations, so the pixel's value is allocated by weight to the four surrounding integer positions. If the transformed coordinates happen to be integers, the value is assigned directly to the corresponding target pixel. Performing this coordinate transformation and value allocation for every pixel of the original image yields the new image.

Thus the value of each pixel of the new image is accumulated from the weighted contributions assigned to it (or is directly equal to the value of an integer-coordinate point). Because of this distribute-and-accumulate nature, forward mapping is sometimes called pixel-transfer mapping.

In forward mapping, the allocation coefficients of each source pixel sum to 1, but the value at a given target pixel is the superposition of many such allocations, so there is no guarantee that the total weight received there equals 1. The weights assigned to each target pixel must therefore be recorded and accumulated, and the accumulated weight used to normalize at the end to obtain a correct result. Moreover, determining the value of any one target pixel requires traversing all pixels of the original image, transforming their coordinates, and allocating their values; this is the drawback of forward mapping.
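The steps above, including the weight accumulation and final normalization, can be sketched as follows (illustrative only; `forward_map_resize` is a hypothetical helper, and real warps use backward mapping instead):

```python
import numpy as np

def forward_map_resize(src, out_h, out_w):
    """Forward mapping: scatter each source pixel into the target with bilinear
    weights, accumulate the weights, and normalize at the end."""
    in_h, in_w = src.shape
    acc = np.zeros((out_h, out_w))
    wsum = np.zeros((out_h, out_w))
    for y in range(in_h):
        for x in range(in_w):
            # transform the source coordinate into target space
            ty = y * out_h / in_h
            tx = x * out_w / in_w
            y0, x0 = int(np.floor(ty)), int(np.floor(tx))
            dy, dx = ty - y0, tx - x0
            # distribute the pixel value to the four surrounding target pixels
            for yy, xx, w in [(y0, x0, (1 - dy) * (1 - dx)), (y0, x0 + 1, (1 - dy) * dx),
                              (y0 + 1, x0, dy * (1 - dx)), (y0 + 1, x0 + 1, dy * dx)]:
                if 0 <= yy < out_h and 0 <= xx < out_w and w > 0:
                    acc[yy, xx] += w * src[y, x]
                    wsum[yy, xx] += w
    # normalize by the accumulated weights; unfilled pixels stay 0 (holes)
    return np.where(wsum > 0, acc / np.maximum(wsum, 1e-12), 0)

print(forward_map_resize(np.full((3, 3), 7.0), 4, 4))
```

Pixels with zero accumulated weight remain unfilled, which is exactly the hole problem forward mapping suffers from.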

Backward mapping

The backward mapping process can be decomposed into two steps: coordinate transformation + interpolation.

Coordinate transformation in backward mapping: the position in the original image is computed, in reverse, from the coordinates of the output image.

The interpolation methods discussed earlier are all examples of backward mapping: the coordinates of the original image are computed from the coordinates of the target image, the contributing source points and their weights are determined, and the pixel value is then obtained by interpolation. The value of a point is obtained in a single computation, without traversing all pixels. Backward mapping is also called the pixel-filling algorithm; it avoids the holes and overlaps that forward mapping can produce.
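A compact sketch of backward mapping with bilinear interpolation (hypothetical helper, not OpenCV's optimized INTER_LINEAR):

```python
import numpy as np

def bilinear_resize(src, out_h, out_w):
    """Backward mapping: for every target pixel, map back to the source with the
    center-aligned transform and bilinearly interpolate its four neighbors."""
    in_h, in_w = src.shape
    dst = np.empty((out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            sy = (y + 0.5) * in_h / out_h - 0.5
            sx = (x + 0.5) * in_w / out_w - 0.5
            # clamp the base neighbor and the fractions at the borders
            y0 = min(max(int(np.floor(sy)), 0), in_h - 2)
            x0 = min(max(int(np.floor(sx)), 0), in_w - 2)
            dy = min(max(sy - y0, 0.0), 1.0)
            dx = min(max(sx - x0, 0.0), 1.0)
            dst[y, x] = ((1 - dy) * (1 - dx) * src[y0, x0] + (1 - dy) * dx * src[y0, x0 + 1]
                         + dy * (1 - dx) * src[y0 + 1, x0] + dy * dx * src[y0 + 1, x0 + 1])
    return dst

print(bilinear_resize(np.full((3, 3), 5.0), 4, 4))  # constant image stays constant
```

Clamping at the borders makes the boundary cases degenerate into linear and nearest neighbor interpolation, matching the cv2.resize behavior described earlier.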

Implementation

C++ implementation

1. Function prototype

```cpp
void cv::resize(InputArray src, OutputArray dst, Size dsize, double fx = 0, double fy = 0, int interpolation = INTER_LINEAR)
```

2. Interpolation

Note: the target image size can be determined either by the parameter dsize or by the parameters fx and fy; they must not all be zero at the same time.

  • Determined by the parameter dsize

The first component of dsize is the number of columns and the second the number of rows, both integers. If dsize is specified, it determines the size of the target image regardless of whether fx and fy are given.

  • Determined by the parameters fx and fy

If dsize is None (Size() in C++), the target size is determined by fx and fy: fx is the scaling factor for the number of columns, fy for the number of rows.

3. Code implementation

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>

using namespace cv;
using namespace std;

int main(int argc, char* argv[])
{
    Mat img = imread("C:/Users/94890/Desktop/picture/luelue.jpg");
    if (img.empty())
    {
        cout << "Unable to read image" << endl;
        return -1;
    }

    int height = img.rows;  // number of rows in the original image
    int width = img.cols;   // number of columns in the original image

    // Shrink the image by (0.2, 0.2); the row and column counts must be integers
    Size dsize = Size(round(0.2 * width), round(0.2 * height));
    Mat shrink;
    // Use bilinear interpolation
    resize(img, shrink, dsize, 0, 0, INTER_LINEAR);

    // Enlarge the image by (1.5, 1.5)
    float fx = 1.5;
    float fy = 1.5;
    Mat enlarge1, enlarge2;
    resize(shrink, enlarge1, Size(), fx, fy, INTER_NEAREST);
    resize(shrink, enlarge2, Size(), fx, fy, INTER_LINEAR);

    // Display
    imshow("src", img);
    imshow("shrink", shrink);
    imshow("INTER_NEAREST", enlarge1);
    imshow("INTER_LINEAR", enlarge2);

    // Save the images
    imwrite("C:/Users/94890/Desktop/picture/shrink2.jpg", shrink);
    imwrite("C:/Users/94890/Desktop/picture/INTER_NEAREST2.jpg", enlarge1);
    imwrite("C:/Users/94890/Desktop/picture/INTER_LINEAR2.jpg", enlarge2);
    waitKey(0);
    return 0;
}
```

The original image:

Reduced to 0.2×, bilinear interpolation

The reduced image enlarged 1.5×, nearest neighbor interpolation

The reduced image enlarged 1.5×, bilinear interpolation

Python implementation

1. Function prototype

```python
# dst is the output image, of the same type as the source image
dst = cv2.resize(src, dsize[, dst[, fx[, fy[, interpolation]]]])
```

2. Interpolation

In general, area interpolation (cv2.INTER_AREA) is used to shrink images, while cubic interpolation (cv2.INTER_CUBIC) or bilinear interpolation (cv2.INTER_LINEAR) is used to enlarge them. Cubic interpolation is slower; bilinear interpolation is fast and gives good results.

3. Code implementation

```python
import cv2

if __name__ == "__main__":
    img = cv2.imread('C:/Users/94890/Desktop/smile.jpg', cv2.IMREAD_UNCHANGED)
    # img.shape: the first value is the number of rows, the second the number of columns
    print('Original shape : ', img.shape)
    width = int(img.shape[1] * 0.3)   # the number of columns must be an integer
    height = int(img.shape[0] * 0.3)  # the number of rows must be an integer
    # dsize: the first value is the number of columns, the second the number of rows
    dsize = (width, height)
    # Resize the image with bilinear interpolation
    resized = cv2.resize(img, dsize, interpolation=cv2.INTER_LINEAR)
    print('Resized shape : ', resized.shape)

    fx = 1.5  # scale the number of columns by 1.5
    fy = 1.5  # scale the number of rows by 1.5
    resized1 = cv2.resize(resized, dsize=None, fx=fx, fy=fy, interpolation=cv2.INTER_NEAREST)  # nearest neighbor
    resized2 = cv2.resize(resized, dsize=None, fx=fx, fy=fy, interpolation=cv2.INTER_LINEAR)   # bilinear
    print('Resized1 shape : ', resized1.shape)

    # Display the images
    cv2.imshow("Resized image", resized)
    cv2.imshow("INTER_NEAREST image", resized1)
    cv2.imshow("INTER_LINEAR image", resized2)
    # Save the images
    cv2.imwrite("C:/Users/94890/Desktop/Resized_image.jpg", resized)
    cv2.imwrite("C:/Users/94890/Desktop/INTER_NEAREST_image.jpg", resized1)
    cv2.imwrite("C:/Users/94890/Desktop/INTER_LINEAR_image.jpg", resized2)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
```

The original image:

Reduced to 0.3×, bilinear interpolation

The reduced image enlarged 1.5×, nearest neighbor interpolation

The reduced image enlarged 1.5×, bilinear interpolation
