In deep learning, we often face insufficient or unbalanced image samples. Based on practical work experience, this article shows how to increase the number of samples quickly and easily with image transformation techniques such as translation, scaling, rotation, and adding noise.

All examples in this article are implemented in Python 3.6 using the OpenCV cross-platform computer vision library. For Python and OpenCV installation and usage, please refer to my earlier materials listed in the references.

1. Picture stitching and translation

1.1. Image movement

Image translation moves all pixel coordinates of an image horizontally or vertically; that is, every pixel moves along the X axis (horizontal) and the Y axis (vertical) by a given offset.

# Shift the image toward the given edge; the output size stays unchanged
def move_img(img_file1,out_file,tunnel,border_position,border_width) :
    print('file1=' + img_file1 )
    img1 = cv2.imread(img_file1, cv2.IMREAD_GRAYSCALE)
    hight,width = img1.shape
    # Initialize an empty (all-zero) image of the same size
    final_matrix = np.zeros((hight,width), np.uint8)  # height * width (y, x), e.g. 20*20
    # change 
    x1=0
    y1=hight
    x2=width
    y2=0   # Image height, coordinates start from top to bottom
    if border_position =='top':
        final_matrix[y2:y1 - border_width, x1:x2] = img1[y2 + border_width:y1, x1:x2]
    # Shift the image left; the blank strip is on the right
    if border_position == 'left':
        final_matrix[y2 :y1, x1:x2 - border_width] = img1[y2:y1, x1 + border_width:x2]

    if border_position == 'right':
        final_matrix[y2 :y1, x1 + border_width:x2] = img1[y2:y1, x1:x2 - border_width]
    # Shift the image down; the blank strip is at the top
    if border_position =='bottom':
        final_matrix[y2 + border_width :y1, x1:x2] = img1[y2:y1 - border_width , x1:x2]
    if border_position =='copy':
        final_matrix[y2 :y1, x1:x2] = img1[y2:y1 , x1:x2]

    cv2.imwrite(out_file, final_matrix) 

    return final_matrix

For example calls, see Section 5.
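
The slicing logic in move_img can also be applied to an array that is already in memory, which is convenient when chaining several transformations. The following is a minimal sketch, not the author's code: the helper name shift_array and its signed dx/dy parameters are assumptions made for illustration. Positive dx moves the content right, positive dy moves it down, and vacated pixels are filled with zeros.

```python
import numpy as np

def shift_array(img, dx, dy):
    """Shift a 2-D image by (dx, dy) pixels, filling vacated pixels with 0.

    Hypothetical helper: positive dx moves content right, positive dy down.
    """
    h, w = img.shape
    out = np.zeros_like(img)
    # If the shift is at least as large as the image, nothing remains visible.
    if abs(dx) >= w or abs(dy) >= h:
        return out
    # Source and destination windows along each axis.
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out

demo = np.arange(1, 13, dtype=np.uint8).reshape(3, 4)
shifted = shift_array(demo, dx=1, dy=-1)   # one pixel right, one pixel up
```

Under these assumptions, shift_array(img, 0, -35) mirrors what move_img does for border_position 'top' with border_width 35, without reading or writing files.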

1.2. Picture stitching

Picture stitching reads the images separately, creates a zero matrix of the target size, and then copies the pixels of each image into the target positions of the new matrix. It is mainly useful for image transition scenarios, such as the common gear-driven digit counter on a dashboard, where a digit in mid-carry shows half of one number and half of the next.

# Stitch a strip from a second image onto one edge of the first; size unchanged
def splicing_img(img_file1,img_file2,out_file,tunnel,border_position,border_width) :
    print('file1=' + img_file1 + ', file2=' + img_file2)
    img1 = cv2.imread(img_file1, cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread(img_file2, cv2.IMREAD_GRAYSCALE)
    # cv2.IMREAD_COLOR: reads a color image; cv2.IMREAD_GRAYSCALE: reads the image in grayscale mode; cv2.IMREAD_UNCHANGED: reads the image including its alpha channel.
    hight,width = img1.shape
    final_matrix = np.zeros((hight,width), np.uint8)  # height * width (y, x), e.g. 20*20
    # change 
    x1=0
    y1=hight
    x2=width
    y2=0   # Image height, coordinates start from top to bottom
    if border_position =='top':
        final_matrix[y2 + border_width:y1, x1:x2] = img1[y2:y1 - border_width, x1:x2]
        final_matrix[y2:border_width, x1:x2] = img2[y2:border_width, x1:x2]
    # Add a strip from img2 on the left
    if border_position == 'left':
        final_matrix[y2 :y1, x1+ border_width:x2] = img1[y2:y1, x1:x2 - border_width]
        final_matrix[y2:y1, x1:border_width] = img2[y2:y1, x1:border_width]        

    if border_position == 'right':
        final_matrix[y2 :y1, x1:x2 - border_width] = img1[y2:y1, x1 + border_width:x2]
        final_matrix[y2:y1, x2-border_width:x2] = img2[y2:y1, x1:border_width]        
    # Add margin or white space at the bottom
    if border_position =='bottom':
        final_matrix[y2 :y1 - border_width, x1:x2] = img1[y2+ border_width:y1 , x1:x2]
        final_matrix[y1 - border_width:y1, x1:x2] = img2[y2:border_width, x1:x2]
    if border_position =='copy':
        final_matrix[y2 :y1, x1:x2] = img1[y2:y1 , x1:x2]

    cv2.imwrite(out_file, final_matrix) 

    return final_matrix
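
For the half-digit case mentioned in this section, one possible in-memory approach is to stack the two digit images vertically and cut a window of the original height out of the stack. This is a sketch under assumptions: the function name roll_between and the offset parameter are hypothetical, and both images are assumed to be grayscale arrays of identical shape.

```python
import numpy as np

def roll_between(img_a, img_b, offset):
    """Return a frame showing the lower part of img_a above the upper part of img_b.

    Hypothetical helper: offset is the number of rows of img_a that have
    already scrolled out of view (0 <= offset <= image height).
    """
    if img_a.shape != img_b.shape:
        raise ValueError("both images must have the same shape")
    h = img_a.shape[0]
    if not 0 <= offset <= h:
        raise ValueError("offset must be between 0 and the image height")
    stacked = np.vstack([img_a, img_b])   # img_a on top, img_b below
    return stacked[offset:offset + h]     # window of the original height

digit_a = np.full((4, 3), 10, dtype=np.uint8)
digit_b = np.full((4, 3), 200, dtype=np.uint8)
half_way = roll_between(digit_a, digit_b, offset=2)   # half of each digit
```

With these assumptions the result corresponds to the 'bottom' branch of splicing_img with border_width equal to offset, and sweeping offset from 0 to the image height yields a whole series of transition samples.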

2. Translation and rotation with affine transformation

2.1. On affine transformation

Affine transformation, also known as affine mapping, is a geometric transformation that maps one vector space into another by a linear transformation followed by a translation. More precisely, an affine transformation between two vector spaces (from the Latin affinis, "related to") consists of a nonsingular linear transformation (a transformation performed using a first-order function) followed by a translation. Affine transformations can be implemented by combining a series of atomic transformations: translation, scale, flip, rotation, and shear.
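
To make the definition concrete, an affine map can be written as y = A·x + b, and two such maps compose into another affine map. The sketch below uses 3×3 homogeneous matrices so that composition becomes a plain matrix product; the helper names to_homogeneous and apply_affine are illustrative assumptions, not OpenCV functions.

```python
import numpy as np

def to_homogeneous(A, b):
    """Pack a 2x2 linear part A and a translation b into a 3x3 matrix."""
    H = np.eye(3)
    H[:2, :2] = A
    H[:2, 2] = b
    return H

def apply_affine(H, point):
    """Apply a 3x3 homogeneous affine matrix to a 2-D point."""
    x, y = point
    return (H @ np.array([x, y, 1.0]))[:2]

scale = to_homogeneous(np.diag([2.0, 2.0]), [0.0, 0.0])   # scale by 2
shift = to_homogeneous(np.eye(2), [5.0, -3.0])            # translate by (5, -3)

# Matrix product order: the rightmost transform is applied first.
scale_then_shift = shift @ scale
result = apply_affine(scale_then_shift, (1.0, 1.0))       # (2*1 + 5, 2*1 - 3)
```

The top two rows of such a 3×3 matrix form the 2×3 matrix that cv2.warpAffine takes as its transformation argument. Swapping the product order (scale @ shift) translates first and scales afterwards, which gives a different result.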

2.2. OpenCV implementation on Python

2.2.1. Rotation

Rotation is implemented with an affine transformation. First, define a rotation matrix with the cv2.getRotationMatrix2D() function. Parameter 1: the center point of the rotation; parameter 2: the rotation angle in degrees (positive values rotate counterclockwise); parameter 3: the scale factor.

# Rotate the image: input file name, output file name, rotation angle
def rotationImg(img_file1,out_file,ra) :
    # Get the image size and calculate the image center point
    img = cv2.imread(img_file1, cv2.IMREAD_GRAYSCALE)
    (h, w) = img.shape[:2]
    center = (w/2, h/2)

    M = cv2.getRotationMatrix2D(center, ra, 1.0)
    rotated = cv2.warpAffine(img, M, (w, h))
    #cv2.imshow("rotated", rotated)
    #cv2.waitKey(0)
    cv2.imwrite(out_file, rotated)
    
    return rotated
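
For reference, the 2×3 matrix returned by cv2.getRotationMatrix2D follows the formula given in the OpenCV documentation, with α = scale·cos θ and β = scale·sin θ. The NumPy sketch below rebuilds that matrix by hand so the effect of the three parameters can be inspected without OpenCV; the function name rotation_matrix_2d is an assumption for illustration.

```python
import math
import numpy as np

def rotation_matrix_2d(center, angle_deg, scale):
    """Build the 2x3 rotation matrix described in the OpenCV documentation.

    Hypothetical stand-in for cv2.getRotationMatrix2D, for inspection only.
    """
    cx, cy = center
    theta = math.radians(angle_deg)
    alpha = scale * math.cos(theta)
    beta = scale * math.sin(theta)
    return np.array([
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ])

M = rotation_matrix_2d(center=(50.0, 40.0), angle_deg=90, scale=1.0)
center_image = M @ np.array([50.0, 40.0, 1.0])      # the center stays fixed
right_of_center = M @ np.array([51.0, 40.0, 1.0])   # ends up one pixel above the center
```

Because image coordinates have y pointing down, a point to the right of the center ends up above it, which is a counterclockwise turn on screen. Note that warping with the original output size (w, h), as rotationImg does, cuts off any corners that rotate outside the frame.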

2.2.2. Translation

An affine transformation is also used to translate the image. First, define the translation matrix M = [[1,0,x],[0,1,y]], where x and y are the number of pixels the image is moved in the horizontal and vertical directions respectively.


$$ M = \begin{bmatrix} 1 & 0 & x \\ 0 & 1 & y \end{bmatrix} $$

# Translate the image with an affine transform. x_off: number of pixels to shift in the x direction; y_off: number of pixels to shift in the y direction. Positive values move the image right and down; negative values move it left and up
def translation_img(img_file1,out_file,x_off,y_off) :
    img = cv2.imread(img_file1, cv2.IMREAD_GRAYSCALE)
    rows,cols = img.shape
    # Define the translation matrix, which needs to be of numpy float32 type
    # Shift x_off on the x axis and y_off on the y axis; M is a 2 by 3 matrix
    M = np.float32([[1,0,x_off],[0,1,y_off]])
    dst = cv2.warpAffine(img,M,(cols,rows))
    
    cv2.imwrite(out_file, dst)

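
As a quick check of the translation matrix M from this section, applying it to a pixel at position (u, v), written in homogeneous form, gives the shifted position. The symbols u and v are introduced here only for illustration.

```latex
\begin{bmatrix} 1 & 0 & x \\ 0 & 1 & y \end{bmatrix}
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
=
\begin{bmatrix} u + x \\ v + y \end{bmatrix}
```

For example, with x = 10 and y = 10, the pixel at (3, 4) moves to (13, 14).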

3. Zooming and cropping

3.1. Picture zooming

cv2.resize(img, (dstWeight,dstHeight)) takes the source image data as its first parameter and the target size (target width, target height) as its second. In practice, the input image size is fixed, so an enlarged image has to be cropped and a shrunken image leaves empty space. (Note: the deviation parameter in this example sets the starting position of the crop within the enlarged image, or the position where the shrunken image is placed, measured from the upper left corner.)

# Zoom: input file name, output file name, target width and height, offset (deviation)
def resizeImg(img_file1,out_file,dstWeight,dstHeight,deviation) :
    img1 = cv2.imread(img_file1, cv2.IMREAD_GRAYSCALE)
    imgshape = img1.shape

    h = imgshape[0]
    w = imgshape[1]
    final_matrix = np.zeros((h,w), np.uint8)
    x1=0
    y1=h
    x2=w
    y2=0   # Image height, coordinates start from top to bottom
    dst = cv2.resize(img1, (dstWeight,dstHeight))
    if h<dstHeight:
        final_matrix[y2 :y1, x1:x2] = dst[y2+deviation:y1+deviation , x1+deviation:x2+deviation]
    else:
        if deviation == 0:
            final_matrix[y2 :dstHeight, x1:dstWeight] = dst[y2 :dstHeight,x1 :dstWeight]
        else:
            final_matrix[y2 + deviation:dstHeight + deviation, x1 + deviation:dstWeight + deviation] = dst[y2 :dstHeight,x1 :dstWeight]
    cv2.imwrite(out_file, final_matrix)
    
    return final_matrix
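
resizeImg decides between cropping and padding by comparing heights only. If width and height can change independently, one option is to compute the overlapping region per axis. The sketch below is a variant under assumptions, not the author's function: fit_to_canvas and its parameters are hypothetical, deviation is assumed to be non-negative, and the function works on an already resized array, so any resize routine (such as cv2.resize) can feed it.

```python
import numpy as np

def _axis_windows(src_len, canvas_len, deviation):
    """Return (source slice, destination slice) along one axis."""
    if src_len >= canvas_len:
        # Enlarged: crop a canvas-sized window starting at `deviation`.
        start = min(deviation, src_len - canvas_len)
        return slice(start, start + canvas_len), slice(0, canvas_len)
    # Shrunken: paste the whole axis at `deviation`, kept inside the canvas.
    start = min(deviation, canvas_len - src_len)
    return slice(0, src_len), slice(start, start + src_len)

def fit_to_canvas(resized, h, w, deviation=0):
    """Crop or zero-pad `resized` to shape (h, w). Hypothetical helper."""
    canvas = np.zeros((h, w), dtype=resized.dtype)
    src_y, dst_y = _axis_windows(resized.shape[0], h, deviation)
    src_x, dst_x = _axis_windows(resized.shape[1], w, deviation)
    canvas[dst_y, dst_x] = resized[src_y, src_x]
    return canvas

big = np.arange(36, dtype=np.uint8).reshape(6, 6)
small = np.ones((2, 2), dtype=np.uint8)
cropped = fit_to_canvas(big, 4, 4, deviation=1)    # rows and columns 1..4 of `big`
padded = fit_to_canvas(small, 4, 4, deviation=1)   # 2x2 block placed at offset (1, 1)
```

Clamping the offset so the window never leaves the array is a design choice of this sketch; raising an error for an out-of-range deviation would be an equally reasonable alternative.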

3.2. Picture cropping

Image processing pipelines generally expect all images to have the same size. Therefore, when cropping here, the output keeps the original size and the unwanted margins are blanked out (set to zero).

# Cut the image
def cut_img(img_file1,out_file,top_off,left_off,right_off,bottom_off) :
    img1 = cv2.imread(img_file1, cv2.IMREAD_GRAYSCALE)
    hight,width = img1.shape    
    x1=0
    y1=hight
    x2=width
    y2=0   # Image height, coordinates start from top to bottom
    
    # Grayscale image, so no channel (tunnel) dimension is needed
    final_matrix = np.zeros((hight,width), np.uint8)  # height * width (y, x), e.g. 20*20
    final_matrix[y2 + top_off:y1 - bottom_off, x1 + left_off:x2 - right_off] = img1[y2 + top_off:y1 - bottom_off, x1 + left_off:x2 - right_off]

    cv2.imwrite(out_file, final_matrix) 

    return final_matrix

4. Add Gaussian noise/salt and pepper noise to the image

In MATLAB, there are built-in functions for adding Gaussian noise and salt and pepper noise. OpenCV for Python has no such direct functions, but they are easy to implement with related functions.

4.1. Add salt and pepper noise

# Add salt and pepper noise
# prob: probability of each noise type, so about 2*prob of all pixels change
def sp_noiseImg(img_file1,out_file,prob) :
    image = cv2.imread(img_file1, cv2.IMREAD_GRAYSCALE)
    output = np.zeros(image.shape,np.uint8)
    thres = 1 - prob 
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            rdn = random.random()
            if rdn < prob:
                # pepper: black pixel
                output[i][j] = 0
            elif rdn > thres:
                # salt: white pixel
                output[i][j] = 255
            else:
                output[i][j] = image[i][j]
    # Save the result
    cv2.imwrite(out_file, output)
    return output

The test in Section 5 uses noise ratios of 0.01, 0.05 and 0.1.
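
The per-pixel loop in sp_noiseImg is easy to follow but slow on large images. The same rule can be expressed with a single array of random numbers and boolean masks. This is a minimal sketch under assumptions: the function name sp_noise_array and the seed parameter are hypothetical, and it takes an in-memory grayscale array instead of a file name.

```python
import numpy as np

def sp_noise_array(image, prob, seed=None):
    """Return a copy of `image` with salt and pepper noise.

    Each pixel turns black with probability `prob` and white with
    probability `prob`, mirroring the thresholds of the loop version.
    Assumes prob < 0.5 so the two masks do not overlap.
    """
    rng = np.random.default_rng(seed)
    rnd = rng.random(image.shape)      # one uniform sample in [0, 1) per pixel
    output = image.copy()
    output[rnd < prob] = 0             # pepper
    output[rnd > 1 - prob] = 255       # salt
    return output

gray = np.full((200, 200), 128, dtype=np.uint8)
noisy = sp_noise_array(gray, prob=0.05, seed=0)
changed_ratio = np.mean(noisy != gray)   # expected to be close to 2 * prob
```

Passing a fixed seed makes the generated samples reproducible, which can help when comparing training runs.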

4.2. Add Gaussian noise

# Add Gaussian noise
# mean: mean of the noise
# var: variance of the noise
def gasuss_noiseImg(img_file1, out_file, mean=0, var=0.001) :
    image = cv2.imread(img_file1, cv2.IMREAD_GRAYSCALE)
    image = np.array(image/255, dtype=float)
    noise = np.random.normal(mean, var ** 0.5, image.shape)
    out = image + noise
    # Clip to the valid [0, 1] range so the uint8 conversion below cannot wrap around
    out = np.clip(out, 0.0, 1.0)
    out = np.uint8(out*255)
    cv2.imwrite(out_file, out)
    
    return out
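
When experimenting with the variance, it can help to work on an array directly and to make the noise reproducible. The following sketch is an assumption, not the author's code: gaussian_noise_array and its seed parameter are hypothetical names. It clips to [0, 1] and rounds before converting back to uint8 so values cannot wrap around or be truncated.

```python
import numpy as np

def gaussian_noise_array(image, mean=0.0, var=0.001, seed=None):
    """Add Gaussian noise to a uint8 grayscale array and return uint8."""
    rng = np.random.default_rng(seed)
    scaled = image.astype(np.float64) / 255.0            # map to [0, 1]
    noise = rng.normal(mean, var ** 0.5, image.shape)    # std = sqrt(variance)
    out = np.clip(scaled + noise, 0.0, 1.0)              # stay in the valid range
    return np.rint(out * 255).astype(np.uint8)           # round, then convert

gray = np.full((100, 100), 128, dtype=np.uint8)
noisy = gaussian_noise_array(gray, var=0.001, seed=0)
```

Since the variance applies to pixel values scaled to [0, 1], var=0.001 corresponds to a standard deviation of roughly 8 gray levels on the 0 to 255 scale.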

5. Code testing

""" Created on @author: Xiaoyw """
#coding: utf-8
import numpy as np
import cv2
import os
import random

# Function definitions are omitted here; see the sections above
if __name__ == '__main__':
    file1 = 'dog.jpg'
    
    move_img(file1,'timg11.jpg',1,'top',35)
    move_img(file1,'timg12.jpg',1,'left',35)
    move_img(file1,'timg13.jpg',1,'right',35)
    move_img(file1,'timg14.jpg',1,'bottom',35)
    cut_img(file1,'dog_cut.jpg',20,10,20,30)
    rotationImg(file1,'dog_ra1.jpg',30)
    rotationImg(file1,'dog_ra1.jpg',60)
    rotationImg(file1,'dog_ra2.jpg',-90)
    sp_noiseImg(file1,'dog_sp_01.jpg',0.01)
    sp_noiseImg(file1,'dog_sp_05.jpg',0.05)
    sp_noiseImg(file1,'dog_sp_10.jpg',0.1)
    resizeImg(file1,'dog_big.jpg',250,280,0)
    resizeImg(file1,'dog_small.jpg',100,200,0)
    splicing_img(file1,file1,'dog2.jpg',1,'right',50)
    translation_img(file1,'timg15.jpg',10,10)
    translation_img(file1,'timg16.jpg',-20,-30)

    pass

References:

[1] "Adding Noise to Images Using Python-OpenCV (Gaussian Noise, Salt and Pepper Noise)", Rogn, March 2019

[2] "Methods and Practices for Drawing CNN Training Images Using Python Matplotlib", CSDN blog, Xiao Yongwei, March 2019

[3] "OpenCV Learning (35): Affine Transform warpAffine", CSDN blog, February 2017

[4] "Affine Transformation", blog "God helps those who help themselves!", Alexander, April 2017