Preface

The Convolutional Neural Network (CNN) is the most commonly used deep learning model for image classification. It is called a convolutional neural network because its hidden layers contain convolutional layers that process two-dimensional (grayscale) or three-dimensional (RGB) image data. Each convolutional layer is made up of multiple filters, and each filter corresponds to a small matrix (usually with 2 or 3 rows and columns). This matrix is slid across the rows and columns of the image with a certain step size; at each position the overlapping values are multiplied element-wise and summed, producing a new set of image data. This is the process of convolving the image.
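
As a minimal illustration of this multiply-and-sum process, here is a toy example (a hypothetical 4×4 "image" and 3×3 filter, stepping one pixel at a time; not from the original article):

import numpy as np

image = np.arange(16).reshape(4, 4)      # a toy 4x4 "grayscale image"
kernel = np.array([[0, 1, 0],
                   [1, 1, 1],
                   [0, 1, 0]])           # a toy 3x3 filter

# Slide the filter over the image with step 1: at each position, multiply the
# covered 3x3 patch by the filter element-wise and sum the products.
out = np.array([[(image[i:i+3, j:j+3] * kernel).sum()
                 for j in range(image.shape[1] - 2)]
                for i in range(image.shape[0] - 2)])
print(out.shape)  # (2, 2): a 4x4 image convolved with a 3x3 filter yields 2x2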

In a convolutional neural network, convolving an image produces a new image (if the step size is greater than 1, the new image is smaller), which is typically used to extract features from the original; the output of each filter is also called a feature map. Image convolution, however, is not only used in deep neural networks to extract features: it was widely used in image processing long before deep learning. For example, the blur, sharpen, emboss, outline, and edge-detection effects in image-processing software such as Photoshop are all implemented through image convolution. The difference is that in deep learning the values of the convolution matrix are learned by the model and follow no fixed pattern, while in image processing the values of the matrix follow a fixed pattern and the matrix is usually 3 rows by 3 columns. This matrix is also called the Image Kernel (in fact, Keras uses "kernel" for the filter-size parameter of its convolutional layers as well).
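
For example, a minimal sketch of how a Keras convolutional layer exposes the kernel size (assuming TensorFlow/Keras is installed; the layer parameters here are arbitrary, not taken from the original article):

from tensorflow.keras import layers

# 32 filters, each a 3x3 kernel whose values are learned during training
conv = layers.Conv2D(filters=32, kernel_size=(3, 3), strides=1, padding='valid')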

The title of this article is "Visualization of Convolutional Neural Networks". The point is that although the parameters of a neural network are a black box and cannot be directly interpreted, the output of a convolutional layer can be visualized. This article explains, from the perspective of image processing, the visual results of convolving an image with different image kernels.

1. Import libraries

from PIL import Image, ImageDraw
import numpy as np
import matplotlib.pyplot as plt
# %matplotlib inline

2. Implementing the Image Kernel operation

2.1 Helper Functions

PIL.Image is used to read either a color image or a grayscale image.

def read_img(imgpath, gray=True):
    img = Image.open(imgpath)
    if gray:
        img = img.convert("L")
    return img

Scale the image to a specified width, keeping the aspect ratio.

def resize_width(img, width):
    w, h = img.size
    height = int(width * h / w)
    return img.resize((width, height))

Crop a square from the center of the image, with side length equal to the image's shorter side.

def crop_center(img):
    w, h = img.size
    c_x, c_y = w/2, h/2
    offset = min(w, h) / 2
    crop_box = (c_x-offset, c_y-offset, c_x+offset, c_y+offset)
    return img.crop(crop_box)

2.2 Applying a specified kernel to an image

img is a PIL Image object; kernel is the matrix used in the convolution; strides is the step size with which the kernel slides over the image; when mean is False each window is summed, and when True it is averaged.

def apply_img_kernel(img, kernel, strides=1, mean=False):
    img = np.asarray(img)

    h, w = img.shape[:2]
    k_h, k_w = kernel.shape[:2]
    # positions of the top-left corner of the sliding window
    x_range = range(0, w - k_w + 1, strides)
    y_range = range(0, h - k_h + 1, strides)

    # element-wise product of window and kernel, then sum (or mean), capped at 255
    if mean:
        prosum = lambda a, b: min((a * b).mean(), 255)
    else:
        prosum = lambda a, b: min((a * b).sum(), 255)

    # slide the kernel over a single channel and collect the results
    cal = lambda channel: np.array([[prosum(channel[i:i+k_h, j:j+k_w], kernel)
                                     for j in x_range]
                                    for i in y_range]).astype(np.uint8)

    if len(img.shape) == 2:
        # grayscale: convolve the single channel directly
        data = cal(img)
        return Image.fromarray(data)
    elif len(img.shape) == 3:
        # RGB: convolve each channel separately, then merge them back into one image
        r, g, b = np.transpose(img, (2, 0, 1))
        _r, _g, _b = cal(r), cal(g), cal(b)
        return Image.merge('RGB', [Image.fromarray(d) for d in [_r, _g, _b]])
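
A quick sanity check on a tiny grayscale image (a hypothetical example, not from the original article). The output width is (w - k_w) // strides + 1, and likewise for the height:

toy = Image.fromarray(np.arange(25, dtype=np.uint8).reshape(5, 5))
out = apply_img_kernel(toy, np.ones((3, 3)), strides=1, mean=True)
print(out.size)  # (3, 3): a 5x5 image leaves 3 valid positions per axis for a 3x3 kernel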

3. Image Kernels with different effects

The kernels below implement lighting, sharpening, blurring, embossing, outlining, and edge detection (reference: IO /ev/image-ke…).

3.1 Lighting

def identity_kernel(iden=1.0):
    return np.array([[0, 0, 0],
                     [0, iden, 0],
                     [0, 0, 0]])
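
A quick check of why iden > 1 brightens the image (a toy example, not from the original article): only the centre weight is non-zero, so each output pixel is simply iden times the corresponding input pixel, capped at 255 by apply_img_kernel.

patch = np.full((3, 3), 100)                 # a uniform grey 3x3 patch
print((patch * identity_kernel(1.6)).sum())  # 160.0, i.e. brighter than 100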

3.2 Sharpening

def sharpen_kernel(inner=5.0,  edge=1.0):
    return np.array([[0,    edge,  0],
                     [edge, inner, edge],
                     [0,    edge,  0]])
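
As an aside (not how the original article calls it), the sharpening kernel most commonly quoted in image-processing texts uses negative edge weights so that the weights sum to 1; it can be produced with the same helper by passing a negative edge value:

classic_sharpen = sharpen_kernel(inner=5.0, edge=-1.0)
# [[ 0. -1.  0.]
#  [-1.  5. -1.]
#  [ 0. -1.  0.]]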

3.3 Blurring

def blur_kernel(inner=0.25,  edge=0.125, corner=0.0625):
    return np.array([[corner, edge,  corner],
                     [edge,   inner, edge],
                     [corner, edge,  corner]])
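
The default weights sum to 1 (0.25 + 4 × 0.125 + 4 × 0.0625 = 1.0), so the overall brightness of the image is preserved; a quick check:

print(blur_kernel().sum())  # 1.0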

3.4 Embossing

def emboss_kernel(diag=2.0, iden=1.0):
    return np.array([[-diag, -iden, 0],
                     [-iden, iden,  iden],
                     [0,     iden,  diag]])

3.5 Outline

def outline_kernel(inner=8.0, outer=1.0):
    return np.array([[outer, outer, outer],
                     [outer, inner, outer],
                     [outer, outer, outer]])

3.6 Edge Detection

def sobel_kernel(direction, base=None, edge=2.0, corner=1.0):
    if base is not None:
        edge = base
        corner = base / 2
    if direction == 'top':
        return np.array([[corner, edge, corner], [0, 0, 0], [-corner, -edge, -corner]])
    elif direction == 'bottom':
        return np.array([[-corner, -edge, -corner], [0, 0, 0], [corner, edge, corner]])
    elif direction == 'left':
        return np.array([[corner, 0, -corner], [edge, 0, -edge], [corner, 0, -corner]])
    elif direction == 'right':
        return np.array([[-corner, 0, corner], [-edge, 0, edge], [-corner, 0, corner]])

    return identity_kernel()
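
Each directional kernel responds to edges on one side of the image. A common follow-up step (not part of the original article) is to combine a horizontal and a vertical response into a single gradient-magnitude edge map; a rough sketch for a grayscale input:

def sobel_magnitude(gray_img, base=0.03):
    # horizontal and vertical responses (note that apply_img_kernel clips each
    # response to the 0..255 uint8 range, so this is only an approximation)
    gx = np.asarray(apply_img_kernel(gray_img, sobel_kernel('left', base)), dtype=float)
    gy = np.asarray(apply_img_kernel(gray_img, sobel_kernel('top', base)), dtype=float)
    mag = np.clip(np.sqrt(gx ** 2 + gy ** 2), 0, 255).astype(np.uint8)
    return Image.fromarray(mag)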

4. Applying the kernels

Read the picture, then scale and crop it. (Photos taken on mobile phones are too large: they take a long time to process and too much space to display.)

Tip: set the gray parameter to True to read a grayscale image. A color image is used for the demonstration below.

img = read_img('yuki.jpeg', gray=False)

img = crop_center(resize_width(img, 320))
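
To experiment with a grayscale image instead, the same calls work with gray=True; apply_img_kernel handles both 2-D and 3-D arrays (when displaying a grayscale result with matplotlib, pass cmap='gray' to imshow):

img_gray = crop_center(resize_width(read_img('yuki.jpeg', gray=True), 320))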

Helper function: display two images side by side to compare the image before and after applying a kernel.

def show2imgs(img1, img2, title=None):
    fig = plt.figure(figsize=(10, 5))

    if title is not None:
        fig.suptitle(title)

    plt.subplot(121)
    plt.axis('off')
    plt.imshow(img1)

    plt.subplot(122)
    plt.axis('off')
    plt.imshow(img2)

    plt.show()

4.1 Lighting

img_identity = apply_img_kernel(img, identity_kernel(1.6))

show2imgs(img, img_identity)

4.2 Sharpening

img_sharpen = apply_img_kernel(img, sharpen_kernel(inner=1.7,  edge=0.08))

show2imgs(img, img_sharpen)

4.3 Blurring

img_blur = apply_img_kernel(img, blur_kernel())

show2imgs(img, img_blur)

4.4 Embossing

img_emboss = apply_img_kernel(img, emboss_kernel())

show2imgs(img, img_emboss)

4.5 Outline

img_outline = apply_img_kernel(img, outline_kernel(inner=8.9, outer=1.29), mean=True)

show2imgs(img, img_outline)

4.6 Edge Detection

img_sobel_top = apply_img_kernel(img, sobel_kernel('top', 0.03))
img_sobel_bottom = apply_img_kernel(img, sobel_kernel('bottom', 0.03))
img_sobel_left = apply_img_kernel(img, sobel_kernel('left', 0.03))
img_sobel_right = apply_img_kernel(img, sobel_kernel('right', 0.03))

show2imgs(img_sobel_top, img_sobel_bottom)
show2imgs(img_sobel_left, img_sobel_right)
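
As an aside (not part of the original code), Pillow's built-in ImageFilter.Kernel can be used to cross-check these results. A minimal sketch with the blur kernel: the two outputs will not be pixel-identical, since Pillow handles image borders differently and keeps the original size, but they should look very similar.

from PIL import ImageFilter

# ImageFilter.Kernel takes the 3x3 weights as a flat sequence; scale=1 keeps
# the weights as-is (by default Pillow divides by the sum of the weights).
pil_blur = img.filter(ImageFilter.Kernel((3, 3), blur_kernel().flatten().tolist(), scale=1))
show2imgs(img_blur, pil_blur)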

5. Open source

This article's code is open source and can be downloaded from GitHub: github.com/kenblikylee…

You can also install it directly with pip and try it out (see the GitHub project page for details):

pip install imgkernel


