This article is based on the official PyTorch tutorial on how to build neural networks. We build a simple neural network using torch.nn, PyTorch's dedicated submodule for neural networks.

The complete tutorial can be run on Codelab.

The torch.nn documentation

Neural networks consist of layers/modules that perform operations on data. torch.nn provides all the building blocks needed to build a neural network.

Every module in PyTorch is a subclass of nn.Module. In the following sections, we will build a neural network that classifies images into 10 categories.

Build a neural network

Start by importing the modules needed to define the network:

import os
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

Loading the training device

We want to be able to train our model on a hardware accelerator such as a GPU. We can use torch.cuda.is_available() to check whether a GPU is available.

device = 'cuda' if torch.cuda.is_available() else 'cpu'
print('Using {} device'.format(device))

Define the class

We define the neural network by subclassing nn.Module and initialize its layers in __init__. Every nn.Module subclass implements the operations on the input data in the forward method.

class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
            nn.ReLU()
        )

    def forward(self, x):
        # Forward propagation
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

The model must be instantiated and moved to the device (the GPU, if available) before it can be used.

model = NeuralNetwork().to(device)

To use the model, we pass it input data; this executes the model's forward method. Calling the model returns the raw predicted values (logits) for each class. Passing them through nn.Softmax gives the prediction probabilities, and the class with the highest probability is taken as the prediction.

X = torch.rand(1, 28, 28, device=device)
logits = model(X)                        # raw predicted values for each class
pred_probab = nn.Softmax(dim=1)(logits)  # convert logits to probabilities
y_pred = pred_probab.argmax(1)           # index of the most likely class
print(f"Predicted class: {y_pred}")

Neural network layer description

Next, let’s break down the network to see what each layer does.

To illustrate this, we take a minibatch of three 28×28 image samples and pass it through the network.

input_image = torch.rand(3, 28, 28)

The nn.Flatten layer

The Flatten layer flattens multidimensional input into one dimension and is often used in the transition from a convolutional layer to a fully connected layer.

The nn.Flatten layer converts each 28×28 image into a contiguous array of 784 (28×28 = 784) pixel values (the batch dimension, 3, is kept).

flatten = nn.Flatten()
flat_image = flatten(input_image)  # (3, 28, 28) -> (3, 784)

The nn.Linear layer

The nn.Linear layer is a module that applies a linear transformation to the input data using its stored weights and biases.

layer1 = nn.Linear(in_features=28*28, out_features=20)
hidden1 = layer1(flat_image)
print(hidden1.size())
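
The weights and biases that nn.Linear applies are stored on the layer itself. As a quick sketch (not part of the original text), we can inspect their shapes for the layer1 defined above:

print(layer1.weight.size())  # torch.Size([20, 784]): one row of weights per output feature
print(layer1.bias.size())    # torch.Size([20]): one bias per output feature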

The nn.ReLU layer

In order to create complex nonlinear mappings between the inputs and outputs of the model, nonlinear activation functions are used. They introduce nonlinearity after linear transformation to help neural networks learn various complex mappings.

In this model, we use nn.ReLU between linear layers, and other activation functions can also be used to introduce nonlinearity.

print(f"Before ReLU: {hidden1}\n\n")
hidden1 = nn.ReLU()(hidden1)
print(f"After ReLU: {hidden1}")

The nn.Sequential layer
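
nn.Sequential is an ordered container of modules: data passes through every module in the order they are declared, which is exactly how linear_relu_stack is built in the model above. As a minimal sketch (not shown in the original text; the name seq_modules and the final nn.Linear(20, 10) layer are illustrative), the layers defined in this section can be chained into a small network:

seq_modules = nn.Sequential(
    flatten,
    layer1,
    nn.ReLU(),
    nn.Linear(20, 10)  # illustrative output layer producing 10 class scores
)
logits = seq_modules(input_image)  # raw scores with shape (3, 10)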

The nn.Softmax layer

The last linear layer of the neural network returns logits, raw values in [-∞, ∞]. Passing these values to the nn.Softmax module scales them to the range [0, 1], representing the model's predicted probability for each class.

The dim parameter indicates the dimension along which the values must sum to 1.

softmax = nn.Softmax(dim=1)
pred_probab = softmax(logits)
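As a quick check of the dim behaviour described above (a sketch, not from the original text), the probabilities along dimension 1 sum to 1 for every sample, and the most likely class can be read off with argmax:

print(pred_probab.sum(dim=1))     # each row sums to 1 (up to floating-point error)
print(pred_probab.argmax(dim=1))  # predicted class index for each sample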

Output model structure

Many layers in a neural network are parameterized, that is, they have associated weights and biases that are optimized iteratively during training.

Subclassing nn.Module automatically tracks all fields defined within the model object, and all parameters can be accessed with the model's parameters() or named_parameters() methods.

We can iterate over each parameter of the model and print its size and a preview of its values.

print("Model structure: ", model, "\n\n")

for name, param in model.named_parameters():
    print(f"Layer: {name} | Size: {param.size()} | Values : {param[:2]} \n")
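As a short follow-up sketch (not part of the original tutorial), parameters() also makes it easy to count the total number of trainable values in the model:

total_params = sum(p.numel() for p in model.parameters())  # sum of element counts over all parameters
print(f"Total parameters: {total_params}")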

The final output is available in the full tutorial.