Practice in Artificial Intelligence: Notes on TensorFlow (II)

Article directory

  • preface
  • 1. Basic steps of convolutional neural network
    • 1. Computation of convolution by convolutional neural network
    • 2. Selection of receptive field and convolution kernel
    • 3. All zero Padding
    • 4. Tf describes the convolution layer
    • 5. Batch normalization (BN operation)
    • 6. Pooling
    • 7. Dropout
    • 8. Convolutional neural network construction and parameter analysis
  • 2. Classical convolutional network explanation
    • 1, LeNet
    • 2, AlexNet
    • 3, VGGNet
    • 4, InceptionNet
    • 5, ResNet
    • 6. Summary of classical convolutional networks
  • conclusion

preface

The goal of this article: to explain the basic steps of convolutional neural networks and to analyze several classic network architectures; I hope it helps. Download link for the five classic papers: https://pan.baidu.com/s/1rIH1nh28ON6DKM6T9HPXbQ (extraction code: kbd8)


1. Basic steps of convolutional neural network

1. Computation of convolution by convolutional neural network

Concept of convolution:

Convolution can be considered an effective method for extracting image features. Generally, a square convolution kernel slides over the input feature map with a specified stride, traversing every pixel of the input feature map. At each step the kernel overlaps a region of the input feature map; the corresponding elements of that region are multiplied, summed, and added to the bias term to obtain one pixel of the output feature map.

Notes on convolution:

The depth (number of channels) of the input feature map determines the depth of the convolution kernel, and the number of convolution kernels in the current layer determines the depth of that layer's output feature map. If the feature-extraction ability of a layer is insufficient, more convolution kernels can be used in that layer. The three-dimensional convolution kernel realizes spatial sharing of parameters: during the convolution computation the parameters in the kernel are fixed, and they are updated during back propagation.

Example:
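The original example here is an image; as a stand-in, the following minimal sketch (with a made-up 3 × 3 patch and kernel) shows one convolution step: multiply corresponding elements, sum them, and add the bias.

import numpy as np

# One convolution step: the kernel overlaps a patch of the input feature map,
# corresponding elements are multiplied, summed, and the bias is added.
patch = np.array([[1, 0, 2],
                  [3, 1, 0],
                  [0, 2, 1]])
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]])
bias = 1
print(np.sum(patch * kernel) + bias)  # (1 + 3 + 0) - (2 + 0 + 1) + 1 = 2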

2. Selection of receptive field and convolution kernel

I have thought about this question before; see the link below: Why do two layers of 3 × 3 convolution kernels work better than one layer of 5 × 5 convolution kernels?
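A quick parameter comparison for this question (assuming C input and C output channels and ignoring biases): two stacked 3 × 3 layers cover the same 5 × 5 receptive field as a single 5 × 5 layer, but with fewer parameters.

# Two 3x3 convolution layers vs. one 5x5 layer, C channels in and out, biases ignored
C = 64
print(2 * 3 * 3 * C * C)  # 73728 parameters for two stacked 3x3 layers
print(5 * 5 * C * C)      # 102400 parameters for a single 5x5 layer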

3. All zero Padding

In the TensorFlow framework, the parameter padding = 'SAME' or padding = 'VALID' indicates whether all-zero padding is used. Its influence on the output feature size is as follows (for a stride of s): 'SAME' gives an output side length of ⌈input side length / s⌉, while 'VALID' gives ⌈(input side length − kernel size + 1) / s⌉.

For a 5 × 5 × 1 picture convolved with a 3 × 3 kernel at stride 1: when padding = 'SAME', the output side length is 5; when padding = 'VALID', the output side length is 3.
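A small sketch of this example (assuming the 3 × 3 kernel and stride 1 that the numbers above correspond to):

import tensorflow as tf

x = tf.random.normal([1, 5, 5, 1])                        # one 5x5x1 image
same = tf.keras.layers.Conv2D(1, 3, padding='same')(x)    # all-zero padding
valid = tf.keras.layers.Conv2D(1, 3, padding='valid')(x)  # no padding
print(same.shape, valid.shape)  # (1, 5, 5, 1) (1, 3, 3, 1)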

4. Tf describes the convolution layer

tf.keras.layers.Conv2D(
    filters = number of convolution kernels,
    kernel_size = kernel size,       # an integer for a square kernel, or (kernel height h, kernel width w)
    strides = stride,                # an integer, or (vertical stride h, horizontal stride w); default 1
    padding = "same" or "valid",     # "same" = all-zero padding, "valid" (default) = no padding
    activation = "relu", "sigmoid", "tanh" or "softmax",   # leave this out if a BN layer follows
    input_shape = (height, width, channels)                # dimensions of the input feature, can be omitted
)

The call method is as follows:

1. Convolution with 6 convolution kernels of size (5, 5), no all-zero padding, sigmoid as the activation function.
2. A (2, 2) pooling kernel with stride 2, using max pooling.
3. Flatten straightens the output into a one-dimensional array.
4. A fully connected layer with 10 neurons and softmax activation.

import tensorflow as tf
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense

model = tf.keras.models.Sequential([
    Conv2D(6, 5, padding='valid', activation='sigmoid'),
    MaxPool2D(2, 2),
    # Or call it this way:
    # Conv2D(6, (5, 5), padding='valid', activation='sigmoid'),
    # MaxPool2D((2, 2), 2),
    # Or this way:
    # Conv2D(filters=6, kernel_size=(5, 5), padding='valid', activation='sigmoid'),
    # MaxPool2D(pool_size=(2, 2), strides=2),
    Flatten(),
    Dense(10, activation='softmax')])

When building convolutional networks with the TensorFlow framework, the BatchNormalization function is typically used to build a separate BN layer, so batch normalization is not written inside the Conv2D function.

5. Batch normalization (BN operation)

Neural networks are more sensitive to data near 0, but as the number of network layers increases, the feature data gradually deviates from a mean of 0. Standardization pulls the deviated data back. Batch normalization standardizes one batch of data at a time and is commonly applied between the convolution operation and the activation operation.



Batch-normalized output feature map for the k-th convolution kernel:

H'ᵢᵏ = (Hᵢᵏ − μ_batchᵏ) / σ_batchᵏ

where Hᵢᵏ is the i-th pixel of the k-th output feature map before normalization, and μ_batchᵏ, σ_batchᵏ are the mean and standard deviation of that channel over the current batch.

BN operation pulls the original offset feature data back to the mean value of 0, so that the data entered into the activation function is distributed in the linear region of the activation function, so that the slight changes of input data are more obviously reflected in the output, and the distinguishing power of the activation function on the input data is improved.

However, simply normalizing the feature data makes it follow a standard normal distribution concentrated in the linear region of the activation function, which makes the activation function lose its nonlinearity. Therefore, the BN operation introduces two trainable parameters for each convolution kernel: a scaling factor γₖ and an offset factor βₖ, giving Xᵢᵏ = γₖ · H'ᵢᵏ + βₖ.

During back propagation, the scaling factor and offset factor are trained and optimized together with the other trainable parameters. They adjust the width and offset of the feature data distribution, preserving the network's nonlinear expressive power.
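As a minimal sketch of the computation described above (assuming a small epsilon for numerical stability, which the formula omits), batch normalization can be written out by hand as follows; gamma and beta are the two trainable parameters.

import tensorflow as tf

def batch_norm_sketch(h, gamma, beta, eps=1e-3):
    # h: a batch of feature maps with shape (batch, height, width, channels)
    mean = tf.reduce_mean(h, axis=[0, 1, 2], keepdims=True)                   # per-channel batch mean
    var = tf.reduce_mean(tf.square(h - mean), axis=[0, 1, 2], keepdims=True)  # per-channel batch variance
    h_norm = (h - mean) / tf.sqrt(var + eps)   # pull the data back to mean 0, std 1
    return gamma * h_norm + beta               # scaling and offset restore expressive power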

The BN layer sits between the convolution layer and the activation layer. TF provides BatchNormalization() for the BN operation.

model = tf.keras.models.Sequential([
    Conv2D(filters=6, kernel_size=(5, 5), padding='same'),   # convolution layer
    BatchNormalization(),                                     # BN layer
    Activation('relu'),                                       # activation layer
    MaxPool2D(pool_size=(2, 2), strides=2, padding='same'),   # pooling layer
    Dropout(0.2),                                             # dropout layer
])

6. Pooling

Pooling is used to reduce the amount of feature data. Max pooling extracts image texture, while mean (average) pooling retains background features. If a 2 × 2 pooling kernel with stride 2 is applied to the input image, the output image becomes one quarter the size of the input.

tf.keras.layers.MaxPool2D(
    pool_size = pooling kernel size,
    strides = pooling stride,
    padding = 'valid' or 'same'   # 'same' = all-zero padding, 'valid' (default) = no padding
)
tf.keras.layers.AveragePooling2D(
    pool_size = pooling kernel size,
    strides = pooling stride,
    padding = 'valid' or 'same'   # 'same' = all-zero padding, 'valid' (default) = no padding
)

Call examples:

model = tf.keras.models.Sequential([
    Conv2D(filters=6, kernel_size=(5, 5), padding='same'),   # convolution layer
    BatchNormalization(),                                     # BN layer
    Activation('relu'),                                       # activation layer
    MaxPool2D(pool_size=(2, 2), strides=2, padding='same'),   # pooling layer
    Dropout(0.2),                                             # dropout layer
])

7. Dropout

To alleviate overfitting, during training a certain proportion of the hidden-layer neurons is temporarily dropped from the neural network.

When the trained network is used for inference, all neurons are restored to the network.

Dropout functions:

tf.keras.layers.Dropout(dropout probability)
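A quick illustrative check that Dropout only discards neurons during training; at inference time (training=False, the default) all neurons are kept:

import tensorflow as tf

layer = tf.keras.layers.Dropout(0.2)
x = tf.ones([1, 10])
print(layer(x, training=True))   # roughly 20% of the entries are zeroed, the rest scaled by 1/0.8
print(layer(x, training=False))  # identical to the input: all neurons restored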

8. Convolutional neural network construction and parameter analysis

A convolutional neural network extracts features with convolution kernels and then sends them into a fully connected network. Its main modules are Convolution, Batch Normalization (BN), Activation, Pooling and Dropout, i.e. the CBAPD structure, in which convolution acts as the feature extractor.

Here we still follow the standard boilerplate for building a network:

3. Build the required network structure. When the structure is a simple stack of layers, the Sequential model in tf.keras can be used; when the network is no longer a simple sequential structure and contains special structures (such as the skip connections in ResNet), a class must be used to define the network structure.
4. Compile the model. The optimizer (Adam, SGD, RMSprop) and the loss function (cross entropy, mean squared error, etc.) are usually specified in this step (model.compile).
5. Feed the data into the compiled network for training (model.fit), specifying the number of training epochs and the batch_size. Since the number of parameters and the amount of computation are generally large and training takes a long time, resuming from checkpoints and saving model parameters are usually added in this step.
6. Print the detailed information of the model (model.summary), including the network structure and the parameters of each layer.

Here is the flatten function:

The Flatten layer is used to "flatten" the input, i.e. to turn the multidimensional input into one dimension. It is often used in the transition from the convolution layers to the fully connected layers and does not affect the batch size.
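A small shape check (assuming the 32 × 32 × 3 CIFAR10 input used below, which becomes a 16 × 16 × 6 feature map after the convolution and pooling layers of the Baseline network):

import tensorflow as tf

x = tf.random.normal([4, 16, 16, 6])        # a batch of 4 feature maps
print(tf.keras.layers.Flatten()(x).shape)   # (4, 1536): only the shape changes, not the batch size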

Next, we use a class to build a network with 6 convolution kernels of size 5 × 5 and a 2 × 2 pooling kernel with stride 2:

# Use a class to build the network structure, following CBAPD
from tensorflow.keras import Model
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D, Dropout, Flatten, Dense

class Baseline(Model):
    def __init__(self):
        super(Baseline, self).__init__()
        self.c1 = Conv2D(filters=6, kernel_size=(5, 5), padding='same')
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d1 = Dropout(0.2)

        self.flatten = Flatten()
        self.f1 = Dense(128, activation='relu')
        self.d2 = Dropout(0.2)
        self.f2 = Dense(10, activation='softmax')

    # The call function invokes, layer by layer, the structure built in __init__,
    # performing one forward propagation from input to output and returning the inference result
    def call(self, x):
        x = self.c1(x)
        x = self.b1(x)
        x = self.a1(x)
        x = self.p1(x)
        x = self.d1(x)

        x = self.flatten(x)
        x = self.f1(x)
        x = self.d2(x)
        y = self.f2(x)
        return y
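As a sketch of steps 4 to 6 above (assuming the CIFAR10 dataset and the Baseline class just defined; the checkpoint path is only an illustrative choice):

import os
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = Baseline()
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

# breakpoint continuation: reload saved weights if a previous checkpoint exists
ckpt_path = './checkpoint/baseline.ckpt'   # illustrative path
if os.path.exists(ckpt_path + '.index'):
    model.load_weights(ckpt_path)
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=ckpt_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)

model.fit(x_train, y_train, batch_size=32, epochs=5,
          validation_data=(x_test, y_test), callbacks=[cp_callback])
model.summary()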

Let’s analyze the parameters:

Open the weights file:

baseline/conv2d/kernel:0                layer-1 5x5x3 convolution kernels, 6 of them: (5, 5, 3, 6) => 450
baseline/conv2d/bias:0                  bias term b: (6,) => 6
baseline/batch_normalization/gamma:0    BN scaling factor gamma: (6,) => 6
baseline/batch_normalization/beta:0     BN offset factor beta: (6,) => 6
baseline/dense/kernel:0                 first fully connected layer weights: (1536, 128) => 196608
baseline/dense/bias:0                   first fully connected layer bias: (128,) => 128
baseline/dense_1/kernel:0               second fully connected layer weights: (128, 10) => 1280
baseline/dense_1/bias:0                 second fully connected layer bias: (10,) => 10

In total: 450 + 6 + 6 + 6 + 196608 + 128 + 1280 + 10 = 198494 parameters. With these parameters, the forward propagation of the neural network can be reproduced. Notice that the network has a very large number of parameters, and that most of them are concentrated in the fully connected layers, while the convolution layers account for a relatively small proportion. The convolution kernel parameters are nevertheless very important (convolution is the feature extractor, and feature parameters are the focus of image recognition). Therefore, reducing the parameters of the fully connected part may be a good way to optimize the network.
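A quick check of the count above from the layer shapes (5 × 5 × 3 kernels × 6, and a 16 × 16 × 6 pooled feature map that flattens to 1536):

params = 5*5*3*6 + 6 + 6 + 6 + 1536*128 + 128 + 128*10 + 10
print(params)  # 198494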

So far, the basic neural network construction method has been explained. The features of LeNet, AlexNet, VGGNet, InceptionNet and ResNet will be explained one by one, and the network architecture will be recreated using TensorFlow-based code.

2. Classical convolutional network explanation

1, LeNet

Network structure:



TF reproduction of the model:

Here the original model is adjusted: the input image size is changed to 32 × 32 × 3 to adapt to the CIFAR10 dataset, and the large convolution kernels are replaced with smaller ones.

class LeNet5(Model):
    def __init__(self):
        super(LeNet5, self).__init__()
        self.c1 = Conv2D(filters=6, kernel_size=(5, 5), padding='valid', input_shape=(32, 32, 3), activation='sigmoid')
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2)

        self.c2 = Conv2D(filters=16, kernel_size=(5, 5), padding='valid', activation='sigmoid')
        self.p2 = MaxPool2D(pool_size=(2, 2), strides=2)

        self.flatten = Flatten()
        self.f1 = Dense(120, activation='sigmoid')
        self.f2 = Dense(84, activation='sigmoid')
        self.f3 = Dense(10, activation='softmax')

    def call(self, x):
        x = self.c1(x)
        x = self.p1(x)

        x = self.c2(x)
        x = self.p2(x)

        x = self.flatten(x)
        x = self.f1(x)
        x = self.f2(x)
        y = self.f3(x)
        return y


model = LeNet5()

Advantages: shared convolution kernels reduce the number of network parameters. (Related reading: How to understand weight sharing in convolutional neural networks?)

2, AlexNet

Network structure:



In the original paper, training was split into two parts because of insufficient video memory. Here the original model is adjusted: the input image size is changed to 32 × 32 × 3 to adapt to the CIFAR10 dataset, and the large convolution kernels are replaced with small ones.

class AlexNet8(Model):
    def __init__(self):
        super(AlexNet8, self).__init__()
        self.c1 = Conv2D(filters=96, kernel_size=(3, 3))
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.p1 = MaxPool2D(pool_size=(3, 3), strides=2)

        self.c2 = Conv2D(filters=256, kernel_size=(3, 3))
        self.b2 = BatchNormalization()
        self.a2 = Activation('relu')
        self.p2 = MaxPool2D(pool_size=(3, 3), strides=2)

        self.c3 = Conv2D(filters=384, kernel_size=(3, 3), padding='same', activation='relu')
        self.c4 = Conv2D(filters=384, kernel_size=(3, 3), padding='same', activation='relu')
        self.c5 = Conv2D(filters=256, kernel_size=(3, 3), padding='same', activation='relu')

        self.p3 = MaxPool2D(pool_size=(3, 3), strides=2)

        self.flatten = Flatten()
        self.f1 = Dense(2048, activation='relu')
        self.d1 = Dropout(0.5)
        self.f2 = Dense(2048, activation='relu')
        self.d2 = Dropout(0.5)
        self.f3 = Dense(10, activation='softmax')

    def call(self, x):
        x = self.c1(x)
        x = self.b1(x)
        x = self.a1(x)
        x = self.p1(x)

        x = self.c2(x)
        x = self.b2(x)
        x = self.a2(x)
        x = self.p2(x)

        x = self.c3(x)
        x = self.c4(x)
        x = self.c5(x)
        x = self.p3(x)

        x = self.flatten(x)
        x = self.f1(x)
        x = self.d1(x)
        x = self.f2(x)
        x = self.d2(x)
        y = self.f3(x)
        return y


model = AlexNet8()

Advantages: the ReLU activation function speeds up training, and Dropout alleviates overfitting.

3, VGGNet

Network structure:



To adapt to the CIFAR10 dataset, the input image size is adjusted from 224 × 224 × 3 to 32 × 32 × 3.

class VGG16(Model) :
    def __init__(self) :
        super(VGG16, self).__init__()
        self.c1 = Conv2D(filters=64, kernel_size=(3, 3), padding='same')  # convolution layer 1
        self.b1 = BatchNormalization()  # BN layer 1
        self.a1 = Activation('relu')    # activation layer 1
        self.c2 = Conv2D(filters=64, kernel_size=(3, 3), padding='same')
        self.b2 = BatchNormalization()  # BN layer 2
        self.a2 = Activation('relu')    # activation layer 2
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d1 = Dropout(0.2)          # dropout layer

        self.c3 = Conv2D(filters=128, kernel_size=(3, 3), padding='same')
        self.b3 = BatchNormalization()
        self.a3 = Activation('relu')
        self.c4 = Conv2D(filters=128, kernel_size=(3, 3), padding='same')
        self.b4 = BatchNormalization()
        self.a4 = Activation('relu')
        self.p2 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d2 = Dropout(0.2)

        self.c5 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
        self.b5 = BatchNormalization()
        self.a5 = Activation('relu')
        self.c6 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
        self.b6 = BatchNormalization()
        self.a6 = Activation('relu')
        self.c7 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
        self.b7 = BatchNormalization()
        self.a7 = Activation('relu')
        self.p3 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d3 = Dropout(0.2)

        self.c8 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b8 = BatchNormalization()
        self.a8 = Activation('relu')
        self.c9 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b9 = BatchNormalization()
        self.a9 = Activation('relu')
        self.c10 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b10 = BatchNormalization()
        self.a10 = Activation('relu')
        self.p4 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d4 = Dropout(0.2)

        self.c11 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b11 = BatchNormalization()
        self.a11 = Activation('relu')
        self.c12 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b12 = BatchNormalization()
        self.a12 = Activation('relu')
        self.c13 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b13 = BatchNormalization()
        self.a13 = Activation('relu')
        self.p5 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d5 = Dropout(0.2)

        self.flatten = Flatten()
        self.f1 = Dense(512, activation='relu')
        self.d6 = Dropout(0.2)
        self.f2 = Dense(512, activation='relu')
        self.d7 = Dropout(0.2)
        self.f3 = Dense(10, activation='softmax')

    def call(self, x) :
        x = self.c1(x)
        x = self.b1(x)
        x = self.a1(x)
        x = self.c2(x)
        x = self.b2(x)
        x = self.a2(x)
        x = self.p1(x)
        x = self.d1(x)

        x = self.c3(x)
        x = self.b3(x)
        x = self.a3(x)
        x = self.c4(x)
        x = self.b4(x)
        x = self.a4(x)
        x = self.p2(x)
        x = self.d2(x)

        x = self.c5(x)
        x = self.b5(x)
        x = self.a5(x)
        x = self.c6(x)
        x = self.b6(x)
        x = self.a6(x)
        x = self.c7(x)
        x = self.b7(x)
        x = self.a7(x)
        x = self.p3(x)
        x = self.d3(x)

        x = self.c8(x)
        x = self.b8(x)
        x = self.a8(x)
        x = self.c9(x)
        x = self.b9(x)
        x = self.a9(x)
        x = self.c10(x)
        x = self.b10(x)
        x = self.a10(x)
        x = self.p4(x)
        x = self.d4(x)

        x = self.c11(x)
        x = self.b11(x)
        x = self.a11(x)
        x = self.c12(x)
        x = self.b12(x)
        x = self.a12(x)
        x = self.c13(x)
        x = self.b13(x)
        x = self.a13(x)
        x = self.p5(x)
        x = self.d5(x)

        x = self.flatten(x)
        x = self.f1(x)
        x = self.d6(x)
        x = self.f2(x)
        x = self.d7(x)
        y = self.f3(x)
        return y


model = VGG16()

In general, the structure of VGGNet is quite regular. It inherits effective techniques from AlexNet such as the ReLU activation function and the Dropout operation, and it uses only small 3 × 3 convolution kernels, forming the regular CBAPD structure (Convolution, Batch Normalization, Activation, Pooling, Dropout) that is typical of convolutional neural networks. The small convolution kernels reduce the number of parameters while improving recognition accuracy, and the tidy network structure is well suited to parallel acceleration.

4, InceptionNet

Advantages: convolution kernels of different sizes are used within one layer to improve the model's perception ability (all-zero padding keeps the output feature maps the same size so they can be concatenated); 1 × 1 convolution kernels are used to change the number of output feature channels and reduce network parameters.

Basic unit:

class ConvBNRelu(Model):
    def __init__(self, ch, kernelsz=3, strides=1, padding='same'):
        super(ConvBNRelu, self).__init__()
        # ch: number of channels of the output feature map, i.e. the number of convolution kernels
        self.model = tf.keras.models.Sequential([
            Conv2D(ch, kernelsz, strides=strides, padding=padding),
            BatchNormalization(),
            Activation('relu')])

    def call(self, x, training=None):
        x = self.model(x, training=training)
        return x


class InceptionBlk(Model):
    def __init__(self, ch, strides=1):
        super(InceptionBlk, self).__init__()
        self.ch = ch
        self.strides = strides
        self.c1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        self.c2_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        self.c2_2 = ConvBNRelu(ch, kernelsz=3, strides=1)
        self.c3_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        self.c3_2 = ConvBNRelu(ch, kernelsz=5, strides=1)
        self.p4_1 = MaxPool2D(3, strides=1, padding='same')
        self.c4_2 = ConvBNRelu(ch, kernelsz=1, strides=strides)

    def call(self, x):
        x1 = self.c1(x)
        x2_1 = self.c2_1(x)
        x2_2 = self.c2_2(x2_1)
        x3_1 = self.c3_1(x)
        x3_2 = self.c3_2(x3_1)
        x4_1 = self.p4_1(x)
        x4_2 = self.c4_2(x4_1)
        # concatenate the four branches along axis=3 (the channel axis)
        x = tf.concat([x1, x2_2, x3_2, x4_2], axis=3)
        return x


In fact, the basic unit originally looked like this: by combining convolution layers of different kernel sizes and a pooling layer side by side (their outputs are kept the same size, which makes them easy to stack), the network is widened and its adaptability to different scales is enhanced. However, since every convolution kernel operates directly on the output of the previous layer, the number of parameters is large and the computation heavy, so 1 × 1 convolution kernels are added to reduce the feature thickness. Take the 5 × 5 convolution as an example. Suppose the output of the previous layer is 100 × 100 × 128 (H × W × C). Passing it through a 32 × 5 × 5 convolution layer (32 kernels of size 5 × 5, stride 1, all-zero padding) gives an output of 100 × 100 × 32, and that convolution layer has 32 × 5 × 5 × 128 = 102400 parameters. If the input first passes through a 32 × 1 × 1 convolution layer (output 100 × 100 × 32) and then through a 32 × 5 × 5 convolution layer, the output is still 100 × 100 × 32, but the number of parameters becomes 32 × 1 × 1 × 128 + 32 × 5 × 5 × 32 = 29696, only about 30% of the original parameter count. This is the dimension-reduction effect of the 1 × 1 convolution kernel.
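Checking the arithmetic above:

direct = 32 * 5 * 5 * 128                         # a single 5x5 convolution layer: 102400 parameters
bottleneck = 32 * 1 * 1 * 128 + 32 * 5 * 5 * 32   # 1x1 layer followed by 5x5 layer: 29696 parameters
print(direct, bottleneck, bottleneck / direct)    # the ratio is roughly 0.29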



The network architecture composed of the basic modules is as follows:



Code description:

class Inception10(Model):
    def __init__(self, num_blocks, num_classes, init_ch=16, **kwargs):
        super(Inception10, self).__init__(**kwargs)
        self.in_channels = init_ch
        self.out_channels = init_ch
        self.num_blocks = num_blocks
        self.init_ch = init_ch
        self.c1 = ConvBNRelu(init_ch)
        self.blocks = tf.keras.models.Sequential()
        for block_id in range(num_blocks):
            for layer_id in range(2):
                if layer_id == 0:
                    # the first Inception unit of each block halves the feature map size
                    block = InceptionBlk(self.out_channels, strides=2)
                else:
                    block = InceptionBlk(self.out_channels, strides=1)
                self.blocks.add(block)
            # after each block, double the number of output channels
            self.out_channels *= 2
        self.p1 = GlobalAveragePooling2D()
        self.f1 = Dense(num_classes, activation='softmax')

    def call(self, x):
        x = self.c1(x)
        x = self.blocks(x)
        x = self.p1(x)
        y = self.f1(x)
        return y


model = Inception10(num_blocks=2, num_classes=10)

The parameter num_blocks is the number of blocks in InceptionNet; each block consists of two basic units. After each block, the feature map size is halved and the number of channels doubles. num_classes is the number of classes, and init_ch is the initial number of channels, i.e. the initial number of convolution kernels in the InceptionNet basic unit.

InceptionNet adopts "global average pooling + one fully connected layer", whereas VGGNet uses three fully connected layers. Average pooling slides a window over the feature map and takes the mean inside the window as the sampled value. Global average pooling averages over the entire feature map, so each feature map is directly associated with a classification probability, replacing the fully connected layers. It introduces no additional trainable parameters, which reduces the risk of overfitting but makes the network converge more slowly. In summary, InceptionNet widens the network structure by aggregating convolutions of several sizes and reduces the number of parameters with 1 × 1 convolutions.
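A small shape sketch of global average pooling (with a made-up 8 × 8 × 16 feature map): each feature map collapses to a single value, so the classifier can follow directly.

import tensorflow as tf

x = tf.random.normal([1, 8, 8, 16])
print(tf.keras.layers.GlobalAveragePooling2D()(x).shape)  # (1, 16): one value per feature map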

5, ResNet

Advantages: residual skip connections between layers introduce forward information directly into later layers, alleviating gradient vanishing and making it possible to build much deeper neural networks.

Experiments show that for a plain network that is already deep enough, simply increasing the number of layers leads to a higher training error rate:



The core idea of ResNet is to add identity-mapping layers (y = x, output equals input) on top of a shallower network whose accuracy has saturated, so that the depth of the network increases without increasing the error. This lets the number of layers go beyond the previous limits and improves accuracy.

The schematic diagram of this residual structure is as follows:



Note that the addition here is fundamentally different from the addition in InceptionNet. In Inception, the "addition" stacks feature maps along the depth (channel) direction, like a layered cake; in ResNet, the addition is an element-wise numerical sum of the corresponding entries of the feature maps, as the sketch below shows.
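A minimal sketch of the difference (with made-up feature maps of matching shapes):

import tensorflow as tf

a = tf.random.normal([1, 4, 4, 8])
b = tf.random.normal([1, 4, 4, 8])
print(tf.concat([a, b], axis=3).shape)  # (1, 4, 4, 16): Inception stacks feature maps along the channel axis
print((a + b).shape)                    # (1, 4, 4, 8): ResNet adds corresponding elements, shape unchanged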

Code Description:

class ResnetBlock(Model):
    def __init__(self, filters, strides=1, residual_path=False):
        super(ResnetBlock, self).__init__()
        self.filters = filters
        self.strides = strides
        self.residual_path = residual_path
        self.c1 = Conv2D(filters, (3, 3), strides=strides, padding='same', use_bias=False)
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')

        self.c2 = Conv2D(filters, (3, 3), strides=1, padding='same', use_bias=False)
        self.b2 = BatchNormalization()

        # When residual_path is True, downsample the input with a 1x1 convolution
        # so that x and F(x) have the same dimensions
        if residual_path:
            self.down_c1 = Conv2D(filters, (1, 1), strides=strides, padding='same', use_bias=False)
            self.down_b1 = BatchNormalization()
        self.a2 = Activation('relu')

    def call(self, inputs):
        residual = inputs  # residual is the input itself, i.e. residual = x
        # compute F(x) through the convolution, BN and activation layers
        x = self.c1(inputs)
        x = self.b1(x)
        x = self.a1(x)

        x = self.c2(x)
        y = self.b2(x)

        if self.residual_path:
            residual = self.down_c1(inputs)
            residual = self.down_b1(residual)
        # the output is the sum of the two paths, F(x)+x or F(x)+Wx, passed through the activation
        out = self.a2(y + residual)
        return out

Schematic diagram of ResNet18 network structure and its tf construction model:

class ResNet(Model):
    def __init__(self, block_list, initial_filters=64):  # block_list tells how many residual units each block contains
        super(ResNet, self).__init__()
        self.num_blocks = len(block_list)
        self.block_list = block_list
        self.out_filters = initial_filters
        self.c1 = Conv2D(self.out_filters, (3, 3), strides=1, padding='same', use_bias=False,
                         kernel_initializer='he_normal')
        self.b1 = tf.keras.layers.BatchNormalization()
        self.a1 = Activation('relu')
        self.blocks = tf.keras.models.Sequential()
        # build the ResNet structure
        for block_id in range(len(block_list)):           # which resnet block
            for layer_id in range(block_list[block_id]):  # which residual unit inside the block
                if block_id != 0 and layer_id == 0:
                    # downsample at the first unit of every block except the first block
                    # (i.e. sample the feature map at intervals to halve its size)
                    block = ResnetBlock(self.out_filters, strides=2, residual_path=True)
                else:
                    block = ResnetBlock(self.out_filters, residual_path=False)
                self.blocks.add(block)  # add the built residual unit to the ResNet
            self.out_filters *= 2       # the next block uses twice as many convolution kernels
        self.p1 = tf.keras.layers.GlobalAveragePooling2D()
        self.f1 = tf.keras.layers.Dense(10)

    def call(self, inputs):
        x = self.c1(inputs)
        x = self.b1(x)
        x = self.a1(x)
        x = self.blocks(x)
        x = self.p1(x)
        y = self.f1(x)  # no Dropout operation is needed because global average pooling is used
        return y
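As a usage sketch (assuming this is the ResNet18 variant in the figure): four blocks of two residual units each give 8 units × 2 convolution layers + the initial convolution + the final dense layer = 18 layers.

model = ResNet(block_list=[2, 2, 2, 2])  # ResNet18: two residual units per block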

Three-tier residual units are used to build deeper networks:

6. Summary of classical convolutional networks

conclusion

Convolutional Neural Networks — AlexNet, VGG, GoogleNet, ResNet, CNN Analysis and ImageNet Champion Model analysis over the years