This is the 14th day of my participation in the November Gwen Challenge. Check out the event details: The Last Gwen Challenge 2021.

import math
import numpy as np
import torch
from torch import nn
from d2l import torch as d2l
max_degree = 20  # Maximum degree of the polynomial
n_train, n_test = 100, 100  # Training and test data set sizes
true_w = np.zeros(max_degree)  # Allocate space for the true weights
true_w[0:4] = np.array([5, 1.2, -3.4, 5.6])

features = np.random.normal(size=(n_train + n_test, 1))
np.random.shuffle(features)
poly_features = np.power(features, np.arange(max_degree).reshape(1, -1))
for i in range(max_degree):
    poly_features[:, i] /= math.gamma(i + 1)  # gamma(n) = (n-1)!
    
# 'labels' shape: (n_train + n_test,)
labels = np.dot(poly_features, true_w)
labels += np.random.normal(scale=0.1, size=labels.shape)

This code is the key to understanding this article!

First, we need to generate an artificial data set by hand. Given x, we use the following third-order polynomial to generate labels for the training and test data:


$$y = 5 + 1.2x - 3.4\frac{x^2}{2!} + 5.6\frac{x^3}{3!} + \epsilon \quad \text{where } \epsilon \sim \mathcal{N}(0,\ 0.1^2)$$

The noise term $\epsilon$ follows a normal distribution with mean 0 and standard deviation 0.1.

That is, we generate a data set that follows $y = 5 + 1.2x - 3.4\frac{x^2}{2!} + 5.6\frac{x^3}{3!}$, and then randomly add Gaussian noise to it.

One more thing: Gaussian noise is random error that follows a Gaussian distribution. I first heard the term in Long Liangqu's class and had no idea what he was talking about.
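
Concretely, the noise is just samples drawn from a normal distribution; a minimal sketch (my illustration, not part of the tutorial code):

import numpy as np

# Draw 5 samples of Gaussian noise with mean 0 and standard deviation 0.1,
# the same distribution used for the labels above
noise = np.random.normal(loc=0.0, scale=0.1, size=5)
print(noise)  # five small values scattered around 0; they vary per run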

  • max_degree: since we are talking about under- and overfitting, we set this to 20, which means we can fit polynomials up to degree 20.
  • true_w has length 20, but our artificial data set is a third-order polynomial, so we only need to assign the first four values.
  • features is the randomly generated x: a vector of n_train + n_test values.
  • np.random.shuffle(features) shuffles x into random order.
  • poly_features contains the powers of features, i.e. x raised to every degree from 0 to 19.
  • The for loop then divides each column of poly_features by the corresponding factorial.
  • labels is the target to fit: the polynomial features multiplied by the true w.
  • The last line adds random Gaussian noise to the labels (a quick sanity check of all this follows below).
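
To sanity-check the generated data (a quick check I added, not from the original post), you can print the first couple of rows before converting to tensors:

print(features[:2])           # raw x values, shape (n_train + n_test, 1)
print(poly_features[:2, :4])  # 1, x, x^2/2!, x^3/3! for the first two samples
print(labels[:2])             # the corresponding noisy targets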
Next, convert the NumPy ndarrays into tensors:
true_w, features, poly_features, labels = [
    torch.tensor(x, dtype=torch.float32)
    for x in [true_w, features, poly_features, labels]]
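A quick check that the conversion worked (my addition; both should print torch.float32):

print(poly_features.dtype, labels.dtype)
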
def train(train_features, test_features, train_labels, test_labels,
          num_epochs=400):
    loss = nn.MSELoss()
    input_shape = train_features.shape[-1]
    # bias=False: no bias term, since the constant is already included in the polynomial features
    net = nn.Sequential(nn.Linear(input_shape, 1, bias=False))
    batch_size = min(10, train_labels.shape[0])
    train_iter = d2l.load_array((train_features, train_labels.reshape(-1, 1)),
                                batch_size)
    test_iter = d2l.load_array((test_features, test_labels.reshape(-1, 1)),
                               batch_size, is_train=False)
    trainer = torch.optim.SGD(net.parameters(), lr=0.01)
    animator = d2l.Animator(xlabel='epoch', ylabel='loss', yscale='log',
                            xlim=[1, num_epochs], ylim=[1e-3, 1e2],
                            legend=['train', 'test'])
    for epoch in range(num_epochs):
        d2l.train_epoch_ch3(net, train_iter, loss, trainer)
        if epoch == 0 or (epoch + 1) % 20 == 0:
            animator.add(epoch + 1, (d2l.evaluate_loss(net, train_iter, loss),
                                     d2l.evaluate_loss(net, test_iter, loss)))
    print('weight:', net[0].weight.data.numpy())
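For reference, d2l.train_epoch_ch3 performs one epoch of the standard training loop. A simplified sketch of the idea (my paraphrase, not the actual d2l implementation, which also tracks metrics):

def train_one_epoch(net, train_iter, loss, trainer):
    # One pass over the training data: forward, loss, backward, update
    for X, y in train_iter:
        l = loss(net(X), y)  # nn.MSELoss averages over the batch by default
        trainer.zero_grad()
        l.backward()
        trainer.step()
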

The Animator is just for visualization, so you can ignore the animator-related code.
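
If you are running the code outside a notebook and the plot is inconvenient, one option (my sketch, not from the original) is to replace the animator.add(...) call with plain prints:

if epoch == 0 or (epoch + 1) % 20 == 0:
    print(f'epoch {epoch + 1}, '
          f'train loss {d2l.evaluate_loss(net, train_iter, loss):f}, '
          f'test loss {d2l.evaluate_loss(net, test_iter, loss):f}')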

The code and final results for the normal third-order polynomial fit, underfitting, and overfitting are as follows:

# Select the first 4 dimensions from the polynomial features, i.e. 1, x, x^2/2!, x^3/3!
# This matches the true model, so the learned weights should come out close to [5, 1.2, -3.4, 5.6]
train(poly_features[:n_train, :4], poly_features[n_train:, :4],
      labels[:n_train], labels[n_train:])

# Select the first 2 dimensions from the polynomial features, i.e. 1, x
# A linear model cannot capture the cubic data, so this underfits
train(poly_features[:n_train, :2], poly_features[n_train:, :2],
      labels[:n_train], labels[n_train:])

# Select all dimensions from the polynomial features
# With all 20 degrees and only 100 training samples, the model overfits
train(poly_features[:n_train, :], poly_features[n_train:, :],
      labels[:n_train], labels[n_train:], num_epochs=1500)


You can read more about Hands-on Deep Learning here: Hands-on Deep Learning – LolitaAnn's Column – Nuggets (juejin.cn)

Notes are still being updated …………