I've written a few basics-only posts recently that were all theory and no examples. Today's topic is a basic network structure, the RNN, followed by a worked example so you can get a feel for what a deep neural network actually does. RNNs can look very profound, but don't panic: there's no heavy theory here, just plain language that everybody can understand.
Note: when reading, I suggest glancing at the table of contents first so you know what I'm going to cover; that also lets you jump straight to the parts you care about and digest them faster.
1. What is RNN
RNN is short for Recurrent Neural Network. Why "recurrent"? That will be explained step by step, so there's no hurry.
RNNs are very effective on sequential data: they can mine the temporal and semantic information hidden in it. Applied to text, an RNN carries context information, so it can "understand" the surrounding words and dig out the relationships between data points when you do analysis.
For example, take the sentence "I don't like beauty" and split it into words: I / don't / like / beauty. In an ordinary fully connected network each word is encoded on its own, with no relationship to the others, and the network just fits the data through a pile of functions. Such a model may well conclude "likes beauty", because it never takes the "don't" in front into account.
An RNN can solve this problem: it records information across the whole sentence and makes a comprehensive judgment before reaching a conclusion.

To sum up: RNNs are good at finding the relationships hidden in sequential data.
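Here is a toy sketch of that point (the word scores are made up and this is not a real model): a model that simply sums up the word values cannot tell two word orders apart, while a recurrence in the style described below can.

vectors = {"I": 1.0, "don't": -2.0, "like": 3.0, "beauty": 4.0}  # made-up values

def bag_of_words(words):              # ignores word order entirely
    return sum(vectors[w] for w in words)

def rnn_like(words, u=1.0, w=0.5):    # hidden value depends on order
    s = 0.0
    for word in words:
        s = vectors[word] * u + s * w
    return s

a = ["I", "don't", "like", "beauty"]
b = ["I", "like", "don't", "beauty"]       # same words, different order
print(bag_of_words(a) == bag_of_words(b))  # True  -> order information lost
print(rnn_like(a) == rnn_like(b))          # False -> order information kept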
2. How an RNN works
2.1 How an RNN differs from a fully connected network. An ordinary fully connected network looks like the figure below: every input attribute is independent of the others, the network fits the data through a large number of parameters and then draws its conclusion (see the earlier post on function fitting for details). You can see that no relationship between the individual inputs is modelled.
An RNN is a recurrent neural network. I want to describe it in the plainest possible terms, but when you look things up later you will run into the diagram below all the time, so I'm including it here so it doesn't throw you next time. The figure isn't that easy to read when you're just getting started, although once you understand RNNs it makes perfect sense.
The left part of the figure is the RNN before it is unrolled. So where is the "recurrence" in this recurrent neural network? It's the loop on the hidden layer: the hidden layer's value from the previous step is fed back in, which is exactly the prevS * w term in the code below.
X is a vector that represents the value of the input layer
U is the weight matrix from the input layer to the hidden layer
W is the weight matrix applied to the previous hidden-layer value (the loop in the figure)
S is a vector that represents the value of the hidden layer
V is the weight matrix from the hidden layer to the output layer
O is a vector that represents the value of the output layer
Expressed as a function:
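In the usual textbook notation (the simplified code below drops the activation functions to keep the idea bare):

S_t = f(X_t * U + S_(t-1) * W)
O_t = g(S_t * V)

where f and g are activation functions (typically tanh for the hidden layer and softmax for the output).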
Let’s do it in code
def getHidenS(x, u, w, prevS):
    # new hidden value = current input * U + previous hidden value * W
    return x * u + prevS * w

def getOutput(s, v):
    # output value = hidden value * V
    return s * v
2.3 Reading the unrolled RNN diagram
The unrolled diagram on the right looks simple, but note that x now carries a time index: x at each time step is the word fed in at that moment. For example:

Take the sentence "I love China" as a word sequence: x at time t-1 is the vector representation of "I", x at time t is the vector representation of "love", and x at time t+1 is the vector representation of "China".

O is the output the network produces for each word it is fed; every time you input a word vector there is a corresponding output. You can then use just one of these outputs or combine several of them, depending on what you need.
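A tiny sketch with PyTorch's nn.RNN (the library used in section 4; the sizes here are toy values, not the article's) showing that you get one output per time step and can keep the last one or all of them:

import torch
from torch import nn

rnn_layer = nn.RNN(input_size=4, hidden_size=8, batch_first=True)  # toy sizes
x = torch.randn(1, 5, 4)      # (batch=1, time_step=5, input_size=4): 5 "words"
out, h = rnn_layer(x)         # out: (1, 5, 8) -- one output vector per word
last_only = out[:, -1, :]     # keep only the last step, e.g. for sentence classification
all_steps = out               # keep every step, e.g. for per-word tagging
print(out.shape, last_only.shape)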
2.4 Key points about an RNN. Shared parameters: the time steps of an RNN share parameters; in other words, the whole RNN uses one and the same set of weights. Whatever time step an input arrives at, the weight matrices are identical, so there is only one set of U, W and V parameters.
Memory: the memory is realized through the hidden layer's value, because the hidden layer at each step keeps the information from the previous step.
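One quick way to see the parameter sharing (again a sketch with PyTorch's nn.RNN): the layer's parameter shapes do not depend on how long the input sequence is, because the same matrices are reused at every time step.

from torch import nn

rnn_layer = nn.RNN(input_size=1, hidden_size=32, batch_first=True)
for name, p in rnn_layer.named_parameters():
    print(name, tuple(p.shape))
# weight_ih_l0 (32, 1)  -> input-to-hidden weights  (the "U")
# weight_hh_l0 (32, 32) -> hidden-to-hidden weights (the "W")
# plus two bias vectors; the shapes stay the same whether you feed in 3 words or 300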
3. RNN in pseudo-code. Encode "I love China" as: I = 1, love = 2, China = 3.
Input: x = [1, 2, 3]

u = 1       # input-layer-to-hidden-layer weight
w = 1       # weight applied to the previous hidden value
v = 2       # hidden-layer-to-output-layer weight
prevS = 1   # previous hidden-layer output value

def getHidenS(x, u, w, prevS):
    return x * u + prevS * w

def getOutput(s, v):
    return s * v

sentence = [1, 2, 3]
for x in sentence:
    prevS = getHidenS(x, u, w, prevS)
    o = getOutput(prevS, v)
    print('Hidden layer value: ' + str(prevS))
    print('Output layer value: ' + str(o))
    print('----------------------')
prevS carries the memory of the previous steps forward, and each step's output can be used when making the final judgment.
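If you trace the loop by hand with the toy values above (including the initial prevS = 1), the hidden values come out as 1*1 + 1*1 = 2, then 2*1 + 2*1 = 4, then 3*1 + 4*1 = 7, and the corresponding outputs are 4, 8 and 14. Each hidden value folds the previous one in, and that is exactly the "memory".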
4. Here’s a quick example
import torch
from torch import nn
import numpy as np
import matplotlib.pyplot as plt

# https://www.cnblogs.com/lokvahkoor/p/12263953.html
# torch.manual_seed(1)  # reproducible

# hyperparameter definitions
TIME_STEP = 10     # rnn time step
INPUT_SIZE = 1     # rnn input size
LR = 0.02          # learning rate
HIDDEN_SIZE = 32   # number of hidden-layer neurons
EPOCH = 100

# generate 100 points on the horizontal axis
steps = np.linspace(0, np.pi * 2, 100, dtype=np.float32)  # float32 for converting to torch FloatTensor
x_np = np.sin(steps)
y_np = np.cos(steps)

# the input is a sine sequence and the target output is the cosine sequence
plt.plot(steps, y_np, 'r-', label='target (cos)')
plt.plot(steps, x_np, 'b-', label='input (sin)')
plt.legend(loc='best')
plt.show()

input("Press Enter to continue:")  # pause so you can look at the plot

class RNN(nn.Module):
    def __init__(self):
        super(RNN, self).__init__()
        self.rnn = nn.RNN(
            input_size=INPUT_SIZE,
            hidden_size=HIDDEN_SIZE,  # number of hidden neurons
            num_layers=1,             # one RNN layer
            batch_first=True,         # input & output have batch size as the first dimension, e.g. (batch, time_step, input_size)
        )
        self.out = nn.Linear(HIDDEN_SIZE, 1)

    def forward(self, x, h_state):
        # x       = (batch, time_step, input_size)
        # h_state = (n_layers, batch, hidden_size)
        # out     = (batch, time_step, hidden_size)
        out, h_state = self.rnn(x, h_state)
        out = out.view(-1, HIDDEN_SIZE)  # (10, 32)
        out = self.out(out)              # (10, 1)
        out = out.unsqueeze(dim=0)       # (1, 10, 1) -> (batch, time_step, 1)
        return out, h_state

rnn = RNN()
print(rnn)
optimizer = torch.optim.Adam(rnn.parameters(), lr=LR)  # the optimizer
loss_func = nn.MSELoss()                               # loss function
h_state = None                                         # hidden-layer value, starts empty

plt.figure(1, figsize=(12, 5))
plt.ion()  # continuously plot

for step in range(EPOCH):
    # new data is generated every step; the overall goal is to fit the cosine curve
    start, end = step * np.pi, (step + 1) * np.pi  # time range
    # use sin to predict cos
    steps = np.linspace(start, end, TIME_STEP, dtype=np.float32,
                        endpoint=False)  # float32 for converting to torch FloatTensor
    x_np = np.sin(steps)
    y_np = np.cos(steps)

    # np.newaxis inserts new dimensions -> (1, 10, 1)
    # shape (batch, time_step, input_size), i.e. one sequence per batch
    x = torch.from_numpy(x_np[np.newaxis, :, np.newaxis])
    y = torch.from_numpy(y_np[np.newaxis, :, np.newaxis])

    prediction, h_state = rnn(x, h_state)  # compute output
    # keep the hidden-layer result of this step and feed it in next time
    h_state = h_state.data  # repack the hidden state, break the connection from the last iteration

    loss = loss_func(prediction, y)  # compute the error
    optimizer.zero_grad()            # clear the previous gradients
    loss.backward()                  # backpropagation
    optimizer.step()                 # update the parameters

    # plotting
    plt.plot(steps, y_np.flatten(), 'r-')
    plt.plot(steps, prediction.data.numpy().flatten(), 'b-')
    plt.draw()
    plt.pause(0.05)

plt.ioff()
plt.show()
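Once training has finished, a quick sanity check could look like this (a sketch that is not in the original post; it reuses the variables defined above): feed in the next sine window and see whether the prediction follows the true cosine.

test_steps = np.linspace(EPOCH * np.pi, (EPOCH + 1) * np.pi, TIME_STEP,
                         dtype=np.float32, endpoint=False)
test_x = torch.from_numpy(np.sin(test_steps)[np.newaxis, :, np.newaxis])
with torch.no_grad():
    test_pred, _ = rnn(test_x, h_state)
print(test_pred.numpy().flatten())  # should roughly follow np.cos(test_steps)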
Take a look at the final fit:
5. Problems with RNNs
A plain RNN runs into vanishing and exploding gradients once sequences get long. For the vanishing gradient: gated variants such as the LSTM have special ways of storing "memories", so memories associated with large gradients in the past are not erased right away the way they are in a simple RNN, which overcomes the vanishing-gradient problem to some extent.
For the exploding gradient: the usual fix is gradient clipping. That is, when a computed gradient exceeds a threshold C (or drops below -C), the gradient is set to C (or -C).
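In PyTorch this is one call placed between loss.backward() and optimizer.step(); a sketch using the rnn and optimizer from the example above (the threshold 1.0 is just an illustrative value):

loss.backward()
# clip every gradient element into [-C, C]; here C = 1.0 as an example
torch.nn.utils.clip_grad_value_(rnn.parameters(), clip_value=1.0)
optimizer.step()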
6. Summary
The key point of an RNN is its memory: it keeps the context information around. But it also has the problems described above, and we will look at how to solve them in a later post.
Original content isn't easy to write; a like to show support is much appreciated and keeps me going.
Click a title below to jump to that post:
1. If you don’t go into the pit again, it will be too late
2. It is too late to enter the pit again. What is the simplest neural network like?
Understanding what a tensor is in three minutes