Deep Learning 004 - Elman Recurrent Neural Network

(Python libraries and versions used in this article: Python 3.6, Numpy 1.14, Scikit-learn 0.19, matplotlib 2.2)

The Elman neural network is the earliest recurrent neural network. It was proposed by Elman in 1990 and is also known as the SRN (Simple Recurrent Network). An SRN takes temporal information into account: the output at the current moment depends not only on the input at the current moment, but also on the inputs at all previous moments. The SRN is one of the simplest RNN structures; compared with a traditional two-layer fully connected feed-forward network, it only adds a temporal feedback connection to the fully connected layer.

To put it simply, the computation of the deep neural networks discussed before can be understood as Yt = F(Xt), while an SRN also feeds the result of the previous moment back into the model as an input, which is equivalent to Yt = F(Xt, Yt-1). Because of this recursion, each result Yt is related not only to its own feature vector Xt but also to the output of the previous moment Yt-1; unrolling the recursion, Yt = F(Xt, F(Xt-1, F(Xt-2, ...))) depends on all of the earlier inputs Xt, Xt-1, Xt-2, ..., so Yt effectively "remembers" the inputs X of the previous N moments.

So how does an SRN achieve this? An SRN is generally divided into four layers: an input layer, a hidden layer, a context layer, and an output layer. Compared with the simple neural network described before, the context layer acts as a temporary variable Var: after the result Yt-1 is obtained at moment t-1, a copy of Yt-1 is also saved into Var; when the result at moment t is calculated, Var is fed in as an additional input variable. Var therefore behaves like a delay operator that provides memory and gives the whole network structure the ability to adapt to time series. As shown below:

The recurrent layer in the figure is just the context layer under a different name. The temporal behaviour of this network structure is not easy to see in this form, so after unrolling it over time it becomes:

The picture is taken from an article on Recurrent Neural Networks.
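To make the role of the context layer concrete, here is a minimal NumPy sketch of a single Elman-style forward step. This is illustrative only: the weight names Wx, Wh, Wy, the tanh hidden activation and the linear output are my own assumptions, not necessarily the exact internals of the NeuroLab model used later.

import numpy as np

def elman_step(x_t, context, Wx, Wh, Wy, bh, by):
    # The hidden layer mixes the current input with the saved copy of the
    # previous hidden state (the delay variable "Var" / context layer)
    h_t = np.tanh(Wx @ x_t + Wh @ context + bh)
    y_t = Wy @ h_t + by              # linear output layer
    return y_t, h_t                  # h_t is written back into the context layer

# Toy dimensions: 1 input, 10 hidden neurons, 1 output
rng = np.random.RandomState(0)
Wx, Wh = rng.randn(10, 1), rng.randn(10, 10)
Wy, bh, by = rng.randn(1, 10), np.zeros(10), np.zeros(1)

context = np.zeros(10)               # the context layer starts out empty
for x_t in [np.array([0.5]), np.array([0.8]), np.array([0.1])]:
    y_t, context = elman_step(x_t, context, Wx, Wh, Wy, bh, by)

Because each step stores the new hidden state back into context, the output at moment t implicitly depends on all the earlier inputs, which is exactly the memory effect described above.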

For more complex recurrent network structures, refer to RNN (Recurrent Neural Networks).

So how do you build and train SRN models?


1. Construct and train the Elman recurrent neural network

1.1 Prepare the data set

This time we generate a sequence of data automatically; it consists of four segments. The data generation function is shown below.

# Prepare the data set
# Generate some sequence data with NumPy; the sequence consists of four segments
import numpy as np

def waveform_dataset(points_num):
    '''Build the waveform data set: four segments, each with points_num points.'''
    stage1 = 1 * np.cos(np.arange(points_num))
    stage2 = 2 * np.cos(np.arange(points_num))
    stage3 = 3 * np.cos(np.arange(points_num))
    stage4 = 4 * np.cos(np.arange(points_num))

    dataset_X = np.array([stage1, stage2, stage3, stage4])  # 4 rows, points_num columns
    dataset_X = dataset_X.reshape(points_num * 4, 1)  # reshape to 4*points_num rows and one column, i.e. the whole sequence

    amp1 = np.ones(points_num)  # amplitude labels of the four segments: 1, 4, 2, 0.5
    amp2 = 4 + np.zeros(points_num)
    amp3 = 2 * np.ones(points_num)
    amp4 = 0.5 + np.zeros(points_num)
    dataset_y = np.array([amp1, amp2, amp3, amp4]).reshape(points_num * 4, 1)
    return dataset_X, dataset_y

Take a look at the distribution of the data set:
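The original article shows a plot of the data here. A possible way to reproduce it is sketched below; the choice of points_num=100 per segment is my assumption, matching the value used for the new data set later on.

import matplotlib.pyplot as plt

dataset_X, dataset_y = waveform_dataset(100)    # 100 points per segment, 400 in total
plt.plot(dataset_X, label='waveform (dataset_X)')
plt.plot(dataset_y, label='amplitude label (dataset_y)')
plt.legend()
plt.title('Waveform data set')
plt.show()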

1.2 Build and train the model

Go directly to the code. The existing function newelm() from the NeuroLab module is used to build the SRN model, which consists of two layers.

# Build and train the model
import neurolab as nl

# Create a two-layer recurrent (Elman) network:
# input range [-2, 2], 10 neurons in the hidden layer, 1 neuron in the output layer
net = nl.net.newelm([[-2, 2]], [10, 1], [nl.trans.TanSig(), nl.trans.PureLin()])
# Initialize the weights and biases of both layers with small random values
net.layers[0].initf = nl.init.InitRand([-0.1, 0.1], 'wb')
net.layers[1].initf = nl.init.InitRand([-0.1, 0.1], 'wb')
net.init()
# Train the network
error = net.train(dataset_X, dataset_y, epochs=3000, show=300, goal=0.01)

The training output is as follows:

Epoch: 300; Error: 0.08632353521527447;
Epoch: 600; Error: 0.07758197978278435;
Epoch: 900; Error: 0.047083147244329486;
Epoch: 1200; Error: 0.03948011155907889;
Epoch: 1500; Error: 0.03808612642771739;
Epoch: 1800; Error: 0.03600983543384789;
Epoch: 2100; Error: 0.04108011778013388;
Epoch: 2400; Error: 0.0388262030539809;
Epoch: 2700; Error: 0.033576743782171244;
Epoch: 3000; Error: 0.03329548827926802;
The maximum number of train epochs is reached
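net.train() also returns the error recorded at each epoch, so the convergence can be inspected directly; a quick sketch (assuming matplotlib is imported as plt):

plt.plot(error)
plt.xlabel('Epoch')
plt.ylabel('Training error')
plt.title('Training error vs. epoch')
plt.show()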

1.3 Use the trained model to predict new samples

First treat the dataset_X used in the training set as if it were new samples, and compare the ground truth with the predicted values.

# Predict new samples with the trained model
import matplotlib.pyplot as plt

predict_y = net.sim(dataset_X)
plt.plot(dataset_y, label='dataset')
plt.plot(predict_y, label='predicted')
plt.legend()
plt.title('Comparison of Truth and Predicted')
plt.show()

Of course, we can also use the waveform_dataset() function to generate some new data and use the trained model to predict it.

# Generate a new data set and predict it
newset_X, newset_y = waveform_dataset(100)
predict_y = net.sim(newset_X)
plt.plot(newset_y, label='dataset')
plt.plot(predict_y, label='predicted')
plt.legend()
plt.title('Comparison of Truth and Predicted')
plt.show()

It can be found that the model can also roughly predict the newly generated sequence data.
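Besides the visual comparison, a quantitative check is easy to add; for example, the mean squared error from scikit-learn (a sketch, not part of the original code):

from sklearn.metrics import mean_squared_error

mse = mean_squared_error(newset_y, predict_y)
print('MSE on the newly generated sequence: {:.4f}'.format(mse))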

######################### Summary #########################

1. NeuroLab already integrates some simple neural network models, such as the Elman recurrent neural network used here, the simplest recurrent network model. For complex or self-defined recurrent neural networks, other, more powerful deep learning frameworks are needed.

2. The Elman recurrent neural network model is the simplest recurrent network structure and can only solve relatively simple sequence-data problems.

###########################################################


Note: the code in this article has been uploaded to (my GitHub); feel free to download it.

References:

1. Classic Examples of Python Machine Learning, by Prateek Joshi, translated by Tao Junjie and Chen Xiaoli.