The multilayer perceptron to be implemented is a two-layer structure: a single hidden layer followed by an output layer.
import torch
from torch import nn
from d2l import torch as d2l
batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
Little explanation is needed here. If you have read my previous Hands-on Deep Learning articles, you know this sets the mini-batch size to 256 and then loads the training and test sets of the Fashion-MNIST dataset.
A UserWarning may appear here; I already covered it in the manual softmax implementation article, so I won't repeat it.
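To see what the loader yields, here is a quick sanity check (my own addition, not part of the original book code):

X, y = next(iter(train_iter))  # grab one mini-batch
print(X.shape)  # torch.Size([256, 1, 28, 28]): 256 grayscale 28x28 images
print(y.shape)  # torch.Size([256]): one class index per image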
num_inputs, num_outputs, num_hiddens = 784, 10, 256
# Weights start from a small random normal; biases start at zero.
W1 = nn.Parameter(torch.randn(
    num_inputs, num_hiddens, requires_grad=True) * 0.01)
b1 = nn.Parameter(torch.zeros(num_hiddens, requires_grad=True))
W2 = nn.Parameter(torch.randn(
    num_hiddens, num_outputs, requires_grad=True) * 0.01)
b2 = nn.Parameter(torch.zeros(num_outputs, requires_grad=True))
params = [W1, b1, W2, b2]
- First, set the sizes of the input layer, hidden layer, and output layer.
- As mentioned before, each image in the dataset is 28×28, so the flattened input vector has 28 × 28 = 784 elements.
- Here we set up a multilayer perceptron with a single hidden layer containing 256 hidden units.
- The output vector has size 10, because the images fall into ten classes.
- Then initialize the weights and biases of each layer.
The nn.Parameter wrapper is optional here; I didn't add it in earlier articles, and the code works either way (a plain-tensor sketch follows below).
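For anyone curious, a minimal sketch of the plain-tensor alternative (my own illustration, not the book's code). The key detail is that requires_grad must be set on the final leaf tensor, after the * 0.01 scaling, so that torch.optim.SGD accepts it:

# Plain-tensor parameters without nn.Parameter (use instead of the
# nn.Parameter block above). requires_grad_ is applied after scaling
# so that each tensor stays a leaf tensor the optimizer can update.
W1 = (torch.randn(num_inputs, num_hiddens) * 0.01).requires_grad_(True)
b1 = torch.zeros(num_hiddens, requires_grad=True)
W2 = (torch.randn(num_hiddens, num_outputs) * 0.01).requires_grad_(True)
b2 = torch.zeros(num_outputs, requires_grad=True)
params = [W1, b1, W2, b2]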
def relu(X):
    a = torch.zeros_like(X)
    return torch.max(X, a)  # element-wise max(X, 0)
The activation function here is ReLU, not sigmoid or the like; for background, see my article Common Activation Functions on Juejin (juejin.cn).
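A quick check (my own example) that this hand-written relu matches PyTorch's built-in version:

x = torch.tensor([-2.0, 0.0, 3.0])
print(relu(x))        # tensor([0., 0., 3.])
print(torch.relu(x))  # tensor([0., 0., 3.]) - same result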
def net(X):
    X = X.reshape((-1, num_inputs))  # flatten each image into a 784-vector
    H = relu(X @ W1 + b1)  # where "@" stands for matrix multiplication
    return H @ W2 + b2
Set up the network:
- First, deal with X: each 28×28 image is reshaped into a length-784 row vector, with -1 letting the batch dimension be inferred.
- The multiplication uses the @ operator, i.e. matrix multiplication; see Various Multiplications in PyTorch on Juejin (juejin.cn). A shape walkthrough follows below.
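To make the shapes concrete, here is a walkthrough for one batch (my own illustration, using the sizes defined above):

# X: (256, 1, 28, 28) -> reshape -> (256, 784)
# (256, 784) @ W1 (784, 256) + b1 -> H: (256, 256) after ReLU
# (256, 256) @ W2 (256, 10) + b2  -> logits: (256, 10)
X, y = next(iter(test_iter))
print(net(X).shape)  # torch.Size([256, 10])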
loss = nn.CrossEntropyLoss()
Here we use the built-in cross-entropy loss directly rather than reinventing the wheel. If you are interested in how to implement cross-entropy loss by hand, see Hands-on Deep Learning 3.6: Softmax Regression from Scratch on Juejin (juejin.cn).
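One point worth stressing: nn.CrossEntropyLoss applies log-softmax internally, which is why net returns raw logits without a softmax, and the labels are integer class indices rather than one-hot vectors. A tiny illustration (my own example):

logits = torch.tensor([[2.0, 0.5, -1.0]])  # one sample, three classes
label = torch.tensor([0])                  # the true class index
print(loss(logits, label))                 # a scalar tensor (mean reduction)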
num_epochs, lr = 10, 0.1
updater = torch.optim.SGD(params, lr=lr)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)
d2l.predict_ch3(net, test_iter)
- Set the number of training epochs and the learning rate.
- Set the optimizer: mini-batch stochastic gradient descent over params.
- d2l.train_ch3 runs the training (a minimal equivalent loop is sketched below).
- d2l.predict_ch3 evaluates the learned model by applying it to some test data.
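If you don't have the d2l helpers at hand, a minimal loop doing roughly what d2l.train_ch3 does might look like this (a sketch under that assumption, not the book's exact implementation, and without its accuracy tracking and plots):

for epoch in range(num_epochs):
    for X, y in train_iter:
        l = loss(net(X), y)  # forward pass: mean cross-entropy on the batch
        updater.zero_grad()  # clear old gradients
        l.backward()         # backpropagate
        updater.step()       # mini-batch SGD update
    print(f'epoch {epoch + 1} finished')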