Foreword

This article first introduces the basic principles of the Extreme Learning Machine (ELM), then implements ELM in Python and applies it to stock price prediction. The source code can be obtained at the end of the article.

1

Basic principles of the extreme learning machine

The Extreme Learning Machine (ELM) is an algorithm proposed by Guang-Bin Huang for training single-hidden-layer feedforward neural networks (SLFNs). Thanks to their simple structure, fast training, and good generalization ability, SLFNs have been widely applied in pattern recognition, signal processing, short-term forecasting, and other fields. Compared with the traditional BP algorithm, which trains an SLFN by gradient descent, ELM offers better generalization and much faster training.

Consider a single-hidden-layer feedforward network. Suppose there are $N$ training samples $(\mathbf{x}_j, \mathbf{t}_j)$, where $\mathbf{x}_j \in \mathbb{R}^n$ is an input and $\mathbf{t}_j \in \mathbb{R}^m$ is its target. For a network with $L$ hidden nodes, the forward propagation process is expressed as:

$$f_L(\mathbf{x}_j) = \sum_{i=1}^{L} \boldsymbol{\beta}_i \, g(\mathbf{w}_i \cdot \mathbf{x}_j + b_i), \qquad j = 1, \dots, N$$

where $g(\cdot)$ is the activation function, $\mathbf{w}_i$ is the weight vector connecting the inputs to the $i$-th hidden node, $b_i$ is the bias of the $i$-th hidden node, and $\boldsymbol{\beta}_i$ is the output weight of the $i$-th hidden node.

The goal of the single-hidden-layer network is to minimize its output error; ideally it can approximate the $N$ samples with zero error, which can be expressed as:

$$\sum_{j=1}^{N} \left\| f_L(\mathbf{x}_j) - \mathbf{t}_j \right\| = 0$$

That is, there exist $\boldsymbol{\beta}_i$, $\mathbf{w}_i$, and $b_i$ such that:

$$\sum_{i=1}^{L} \boldsymbol{\beta}_i \, g(\mathbf{w}_i \cdot \mathbf{x}_j + b_i) = \mathbf{t}_j, \qquad j = 1, \dots, N$$

In matrix form:

$$H\boldsymbol{\beta} = T$$

where $H$ is the output matrix of the hidden-layer nodes, $\boldsymbol{\beta}$ is the output-layer weight matrix, and $T$ is the matrix of expected outputs.
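Written out explicitly, following the notation of the referenced ELM paper, $H$ stacks one row per sample and one column per hidden node:

$$H = \begin{bmatrix} g(\mathbf{w}_1 \cdot \mathbf{x}_1 + b_1) & \cdots & g(\mathbf{w}_L \cdot \mathbf{x}_1 + b_L) \\ \vdots & \ddots & \vdots \\ g(\mathbf{w}_1 \cdot \mathbf{x}_N + b_1) & \cdots & g(\mathbf{w}_L \cdot \mathbf{x}_N + b_L) \end{bmatrix}_{N \times L}$$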

Training the single-hidden-layer network therefore amounts to minimizing the loss function:

$$\min_{\boldsymbol{\beta}} \left\| H\boldsymbol{\beta} - T \right\|$$

In the ELM algorithm, the weights and biases of the hidden layer are set randomly and then fixed, so the hidden-layer output matrix $H$ is fully determined. The training process is thus transformed into solving a linear system, and the output-layer weights can be determined as:

$$\hat{\boldsymbol{\beta}} = H^{\dagger} T$$

where $H^{\dagger}$ is the Moore-Penrose generalized inverse of the matrix $H$. It can be proved that this solution has the minimum norm among all least-squares solutions and is unique.
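As a practical note (this is how the implementation below computes it, not a requirement of ELM itself): when $H^{T}H$ is invertible, the generalized inverse can be written as

$$H^{\dagger} = (H^{T}H)^{-1}H^{T}, \qquad \hat{\boldsymbol{\beta}} = (H^{T}H)^{-1}H^{T}T$$

which is exactly the form the init_train method uses; in practice a small regularization term is often added before inverting to keep this step numerically stable.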

Since ELM is a batch algorithm, all training data must be available before training and testing; the model cannot be updated online as new data arrive. Professor Guang-Bin Huang's team therefore also proposed the online sequential extreme learning machine (OS-ELM), which learns online and updates the network parameters incrementally. OS-ELM keeps the speed and generalization advantages of ELM while continuously updating the model as new data arrive, instead of retraining it from scratch. OS-ELM consists of two phases: in the first, a small number of training samples are used with the ordinary ELM algorithm to compute and initialize the output weights; in the second, online learning begins, and each time a new data sample arrives a recursive formula yields the new output weights, enabling fast online training. For the full derivation of the OS-ELM recursive formula, refer to the relevant papers and online materials; the resulting update equations are summarized below.
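These are the standard OS-ELM recursive updates, and they correspond one-to-one with the seq_train implementation later in this article. Given a newly arrived chunk of data with hidden-layer output $H_{k+1}$ and targets $T_{k+1}$:

$$P_{k+1} = P_k - P_k H_{k+1}^{T} \left( I + H_{k+1} P_k H_{k+1}^{T} \right)^{-1} H_{k+1} P_k$$

$$\boldsymbol{\beta}_{k+1} = \boldsymbol{\beta}_k + P_{k+1} H_{k+1}^{T} \left( T_{k+1} - H_{k+1} \boldsymbol{\beta}_k \right)$$

where the initialization phase sets $P_0 = (H_0^{T} H_0)^{-1}$ and $\boldsymbol{\beta}_0 = P_0 H_0^{T} T_0$.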

2

Environment preparation

Local Environment:

Python 3.7
IDE: PyCharm

Library versions:

numpy 1.18.1
pandas 1.0.3 
sklearn 0.22.2
matplotlib 3.2.1

Then, import all the libraries you need:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error

3

Code implementation

First, we need to define an ELM class and initialize the weight parameters.

class ELM:
    def __init__(self, input_nums, hidden_nums, output_nums):
        self.input_nums = input_nums
        self.hidden_nums = hidden_nums
        self.output_nums = output_nums
        self.is_inited = False


        # Hidden layer weight matrix, drawn uniformly from [-1, 1]
        self.W = np.array([[np.random.uniform(-1, 1) for _ in range(self.hidden_nums)] for _ in range(self.input_nums)])
        # Hidden layer bias, drawn uniformly from [-1, 1]
        self.bias = np.array([np.random.uniform(-1, 1) for _ in range(self.hidden_nums)])
        # Output layer weights
        self.beta = np.zeros(shape=[self.hidden_nums, self.output_nums])
        # P = (H^T H)^{-1}, maintained for the online updates
        self.P = np.zeros(shape=[self.hidden_nums, self.hidden_nums])

Then define the initialization method and the activation function; the sigmoid is used here. In the initialization phase, the output matrix of the hidden layer is computed first, then the output-layer weights. Finally, setting is_inited to True indicates that initialization is complete.

    def init_train(self, x, target):
        # Hidden layer output matrix H
        H = self.activation(np.dot(x, self.W) + self.bias)
        HT = np.transpose(H)
        HTH = np.dot(HT, H)
        # P = (H^T H)^{-1}
        self.P = np.linalg.inv(HTH)
        # beta = P H^T T, the minimum-norm least-squares solution
        pHT = np.dot(self.P, HT)
        self.beta = np.dot(pHT, target)
        self.is_inited = True


    def activation(self, x):
        # Sigmoid activation
        return 1 / (1 + np.exp(-x))

After the model has been initialized, it can be updated online whenever new data arrive, following the OS-ELM update equations given above. The code implementation is as follows, where x and target can be a single sample or a batch of samples.

    def seq_train(self, x, target):
        batch_size = x.shape[0]
        # Hidden layer output for the newly arrived chunk
        H = self.activation(np.dot(x, self.W) + self.bias)
        HT = np.transpose(H)
        I = np.eye(batch_size)
        # P <- P - P H^T (I + H P H^T)^{-1} H P
        Hp = np.dot(H, self.P)
        HpHT = np.dot(Hp, HT)
        temp = np.linalg.inv(I + HpHT)
        pHT = np.dot(self.P, HT)
        self.P -= np.dot(np.dot(pHT, temp), Hp)
        # beta <- beta + P H^T (T - H beta)
        pHT = np.dot(self.P, HT)
        Hbeta = np.dot(H, self.beta)
        self.beta += np.dot(pHT, target - Hbeta)

Finally, the prediction method is simply a forward pass:

    def predict(self, x):
        return np.dot(self.activation(np.dot(x, self.W) + self.bias), self.beta)
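To see how the pieces fit together, here is a hypothetical usage sketch; the shapes, sizes, and random data are made up purely for illustration, and it assumes the ELM class and imports defined above:

# Hypothetical usage sketch: shapes and data are made up for illustration
elm = ELM(input_nums=5, hidden_nums=32, output_nums=1)

x_init = np.random.rand(100, 5)        # initial training batch
y_init = np.random.rand(100, 1)
elm.init_train(x_init, y_init)         # batch initialization (ELM step)

x_new = np.random.rand(10, 5)          # newly arrived samples
y_new = np.random.rand(10, 1)
elm.seq_train(x_new, y_new)            # online update (OS-ELM step)

y_pred = elm.predict(x_new)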

To evaluate the model, we use the closing-price data of the Shanghai Composite Index from 2014 to 2017 for a simple experimental verification. The data must first be normalized, then split into samples with a sliding window, and finally divided into training and test sets. In the experiment we first test ELM without online updates, then ELM with online updates (OS-ELM). The RMSE and fitting results of the two are as follows:

RMSE of ELM 34.04633755067868
RMSE of OS-ELM 29.139883515427595
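For reference, a minimal sketch of this evaluation pipeline might look like the following. The file name sh_index.csv, the column name close, the window size, the hidden-layer size, and the 80/20 split are all assumptions for illustration; the actual data file and parameters are distributed with the source code mentioned at the end of the article.

# Sketch of the evaluation pipeline; file name, window size, and split are assumed
data = pd.read_csv('sh_index.csv')['close'].values.reshape(-1, 1)
scaler = MinMaxScaler()
data = scaler.fit_transform(data)

window = 5  # use the previous 5 closes to predict the next one
X = np.array([data[i:i + window, 0] for i in range(len(data) - window)])
y = data[window:]

split = int(len(X) * 0.8)
x_train, x_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Plain ELM: train once on all training data, no online updates
elm = ELM(input_nums=window, hidden_nums=64, output_nums=1)
elm.init_train(x_train, y_train)
pred = scaler.inverse_transform(elm.predict(x_test))
true = scaler.inverse_transform(y_test)
print('RMSE of ELM', np.sqrt(mean_squared_error(true, pred)))

# OS-ELM: initialize on the training set, then update online through the test set
os_elm = ELM(input_nums=window, hidden_nums=64, output_nums=1)
os_elm.init_train(x_train, y_train)
preds = []
for i in range(len(x_test)):
    preds.append(os_elm.predict(x_test[i:i + 1]))
    os_elm.seq_train(x_test[i:i + 1], y_test[i:i + 1])
preds = scaler.inverse_transform(np.concatenate(preds))
print('RMSE of OS-ELM', np.sqrt(mean_squared_error(true, preds)))

plt.plot(true, label='true')
plt.plot(preds, label='OS-ELM')
plt.legend()
plt.show()

Note that in the OS-ELM loop each test sample is predicted first and only afterwards used to update the model, so no sample leaks into its own prediction.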

The experimental results show that ELM fits the data well, and that the online-updated version achieves an even better result. Although ELM is not especially accurate compared with deep-learning models such as LSTM or TCN, its simple structure and much faster training still give it strong practical value.

4

Conclusion

In this article we introduced the basic principle of ELM and ran experiments on the Shanghai Composite Index, showing that ELM has a measurable effect on stock price prediction. Note that we used only price data as input; to better capture market characteristics, technical indicators could also be used as inputs so the model learns more market information.

References:

Huang, G.-B., Zhu, Q.-Y., & Siew, C.-K. (2006). Extreme learning machine: theory and applications. Neurocomputing, 70(1-3), 489-501.

An easy-to-learn machine learning algorithm: the extreme learning machine (ELM)

An easy-to-learn machine learning algorithm: the online sequential extreme learning machine (OS-ELM)

For the complete code and data, reply "090" in the backend of the Artificial Intelligence Quantitative Laboratory public account. The content of this article is for technical discussion and learning only, and does not constitute any investment advice.
