The multilayer perceptron to be implemented is a two-layer structure: a single hidden layer followed by an output layer.
import torch
from torch import nn
from d2l import torch as d2l
batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
Little explanation is needed here. If you have read my previous Hands-on Deep Learning articles, you know this sets the mini-batch size to 256 and then loads the training and test sets of the Fashion-MNIST dataset.
A UserWarning may appear here; I already covered it in the manual softmax implementation article, so I won't repeat it.
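To see what the loader yields, here is a quick sanity check (my own addition, not part of the original book code):

X, y = next(iter(train_iter))  # grab one mini-batch
print(X.shape)  # torch.Size([256, 1, 28, 28]): 256 grayscale 28x28 images
print(y.shape)  # torch.Size([256]): one class index per image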
num_inputs, num_outputs, num_hiddens = 784, 10, 256
# Weights start from a small random normal; biases start at zero.
W1 = nn.Parameter(torch.randn(
    num_inputs, num_hiddens, requires_grad=True) * 0.01)
b1 = nn.Parameter(torch.zeros(num_hiddens, requires_grad=True))
W2 = nn.Parameter(torch.randn(
    num_hiddens, num_outputs, requires_grad=True) * 0.01)
b2 = nn.Parameter(torch.zeros(num_outputs, requires_grad=True))
params = [W1, b1, W2, b2]
- First, set the sizes of the input layer, hidden layer, and output layer.
- As mentioned before, each image in the dataset is 28×28, so the flattened input vector has 28 × 28 = 784 elements.
- Here we set up a multilayer perceptron with a single hidden layer containing 256 hidden units.
- The output vector has size 10, because the images fall into ten classes.
- Then initialize the weights and biases of each layer.
The nn.Parameter wrapper is optional here; I didn't add it in earlier articles, and the code works either way (a plain-tensor sketch follows below).
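For anyone curious, a minimal sketch of the plain-tensor alternative (my own illustration, not the book's code). The key detail is that requires_grad must be set on the final leaf tensor, after the * 0.01 scaling, so that torch.optim.SGD accepts it:

# Plain-tensor parameters without nn.Parameter (use instead of the
# nn.Parameter block above). requires_grad_ is applied after scaling
# so that each tensor stays a leaf tensor the optimizer can update.
W1 = (torch.randn(num_inputs, num_hiddens) * 0.01).requires_grad_(True)
b1 = torch.zeros(num_hiddens, requires_grad=True)
W2 = (torch.randn(num_hiddens, num_outputs) * 0.01).requires_grad_(True)
b2 = torch.zeros(num_outputs, requires_grad=True)
params = [W1, b1, W2, b2]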
def relu(X):
    a = torch.zeros_like(X)
    return torch.max(X, a)  # element-wise max(X, 0)
The activation function here is ReLU, not sigmoid or the like; for background, see my article Common Activation Functions on Juejin (juejin.cn).
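A quick check (my own example) that this hand-written relu matches PyTorch's built-in version:

x = torch.tensor([-2.0, 0.0, 3.0])
print(relu(x))        # tensor([0., 0., 3.])
print(torch.relu(x))  # tensor([0., 0., 3.]) - same result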
def net(X):
    X = X.reshape((-1, num_inputs))  # flatten each image into a 784-vector
    H = relu(X @ W1 + b1)  # where "@" stands for matrix multiplication
    return H @ W2 + b2
Set up the network:
- First, deal with X: each 28×28 image is reshaped into a length-784 row vector, with -1 letting the batch dimension be inferred.
- The multiplication uses the @ operator, i.e. matrix multiplication; see Various Multiplications in PyTorch on Juejin (juejin.cn). A shape walkthrough follows below.
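To make the shapes concrete, here is a walkthrough for one batch (my own illustration, using the sizes defined above):

# X: (256, 1, 28, 28) -> reshape -> (256, 784)
# (256, 784) @ W1 (784, 256) + b1 -> H: (256, 256) after ReLU
# (256, 256) @ W2 (256, 10) + b2  -> logits: (256, 10)
X, y = next(iter(test_iter))
print(net(X).shape)  # torch.Size([256, 10])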
loss = nn.CrossEntropyLoss()
Here we use the built-in cross-entropy loss directly rather than reinventing the wheel. If you are interested in how to implement cross-entropy loss by hand, see Hands-on Deep Learning 3.6: Softmax Regression from Scratch on Juejin (juejin.cn).
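One point worth stressing: nn.CrossEntropyLoss applies log-softmax internally, which is why net returns raw logits without a softmax, and the labels are integer class indices rather than one-hot vectors. A tiny illustration (my own example):

logits = torch.tensor([[2.0, 0.5, -1.0]])  # one sample, three classes
label = torch.tensor([0])                  # the true class index
print(loss(logits, label))                 # a scalar tensor (mean reduction)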
num_epochs, lr = 10, 0.1
updater = torch.optim.SGD(params, lr=lr)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)
d2l.predict_ch3(net, test_iter)
- Set the number of training epochs and the learning rate.
- Set the optimizer: mini-batch stochastic gradient descent over params.
- d2l.train_ch3 runs the training (a minimal equivalent loop is sketched below).
- d2l.predict_ch3 evaluates the learned model by applying it to some test data.
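If you don't have the d2l helpers at hand, a minimal loop doing roughly what d2l.train_ch3 does might look like this (a sketch under that assumption, not the book's exact implementation, and without its accuracy tracking and plots):

for epoch in range(num_epochs):
    for X, y in train_iter:
        l = loss(net(X), y)  # forward pass: mean cross-entropy on the batch
        updater.zero_grad()  # clear old gradients
        l.backward()         # backpropagate
        updater.step()       # mini-batch SGD update
    print(f'epoch {epoch + 1} finished')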