A brief introduction to the BP neural network prediction algorithm

Note: Section 1.1 summarizes the principle of the BP neural network algorithm with influencing factors as inputs, i.e., the conventional training principle of the BP model (skip it if you already know this material). Section 1.2 introduces the BP neural network prediction model based on historical values.

When a BP neural network is used for prediction, there are two main types of models, distinguished by the input indicators they consider:

1.1 Principle of the BP neural network algorithm with influencing factors as inputs

As shown in Figure 1, when BP is trained with MATLAB's newff function, the network is in most cases a three-layer neural network (input layer, hidden layer, and output layer).

1) Input layer: the input layer is analogous to the human five senses. Just as the senses gather external information, the input layer is the port through which the model receives input data.

2) Hidden layer: this corresponds to the human brain, which analyzes and thinks about the data passed on by the senses. The hidden layer of the neural network maps the data x transmitted by the input layer, which can be written simply as hiddenLayer_output = F(w*x + b), where w and b are the weight and threshold (bias) parameters, F() is the mapping rule, also called the activation function, and hiddenLayer_output is the hidden layer's output for the transmitted data. In other words, the hidden layer maps the input influencing-factor data X to produce a mapped value.

3) Output layer: this can be likened to the human limbs. After the brain has processed the information from the senses (the hidden-layer mapping), it directs the limbs to act (respond externally). Similarly, the output layer of the BP network maps hiddenLayer_output once more: outputLayer_output = w*hiddenLayer_output + b, where w and b are again weight and threshold parameters, and outputLayer_output is the output value of the network (also called the simulated or predicted value), understood as the action the brain executes externally, such as a baby tapping the table.

4) Gradient descent: by computing the deviation between outputLayer_output and the target value y passed into the model, the algorithm adjusts the weights, thresholds, and other parameters accordingly.
You can picture this process as a baby slapping at a table: it misses, adjusts its body according to how far off it was, and with each new swing the arm gets closer and closer until it hits the table.
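The forward mapping and one gradient-descent adjustment described above can be sketched in a few lines of Python/NumPy (the article's own code is MATLAB). The network sizes, input data, target value, and learning rate below are illustrative assumptions, not values from the article.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy 3-layer BP network: 3 inputs -> 4 hidden units -> 1 output (assumed sizes)
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros((4, 1))   # input -> hidden
W2, b2 = rng.normal(size=(1, 4)), np.zeros((1, 1))   # hidden -> output

def forward(x):
    h = sigmoid(W1 @ x + b1)     # hiddenLayer_output = F(w*x + b)
    return h, W2 @ h + b2        # outputLayer_output = w*hiddenLayer_output + b

x = np.array([[0.5], [-0.2], [0.1]])   # influence-factor data X (made up)
y = np.array([[0.7]])                  # target Y (made up)

# One gradient-descent step on the squared error 0.5*(output - y)^2
lr = 0.1
h, out = forward(x)
err = out - y                          # deviation used by gradient descent
dW2, db2 = err @ h.T, err
dh = (W2.T @ err) * h * (1 - h)        # back-propagate through the sigmoid
dW1, db1 = dh @ x.T, dh
W2 -= lr * dW2; b2 -= lr * db2
W1 -= lr * dW1; b1 -= lr * db1

_, out_new = forward(x)                # output has moved closer to the target
```

Repeating the update step over many samples is exactly the "swing again, get closer" loop of the baby analogy.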

Here’s another example to deepen your understanding:

The BP neural network shown in Figure 1 has an input layer, a hidden layer, and an output layer. How does BP use this three-layer structure to make the output value outputLayer_output approach the given Y value ever more closely, so that training yields an accurate model?

The ports strung together in the figure suggest a familiar process: taking the subway. Imagine Figure 1 as a subway line. One day Wang goes home by subway: he boards at the starting station (input), passes through many stations (hiddenLayer), and then finds he is far from his seat (outputLayer corresponds to his current position). Based on the distance between his current position and home (Target), i.e., the Error, Wang goes back along the line and rides again (error back-propagation, using the gradient descent algorithm to update w and b). If he gets it wrong again, the adjustment process is repeated.

From the examples of the baby tapping the table and Wang taking the subway, consider the complete training procedure of BP: data are first fed into the input layer, then mapped by the hidden layer, and the output layer produces the BP simulated value. According to the error between the simulated value and the target value, the parameters are adjusted so that the simulated value keeps approaching the target. In the examples: (1) the infant reacts to external factors (X) and thus "predicts"; the brain keeps adjusting the arm's position until the limb is accurate (Y and Target). (2) Wang boards the train (X), passes through stations (predict), and keeps going back to intermediate stations to adjust his position until he arrives home (Y and Target).

These steps involve the influencing-factor data X and the target data Y (Target). Using X and Y, the BP algorithm finds the rule between X and Y, mapping X so as to approximate Y. That is the role of the BP neural network algorithm. One more word: everything described so far is BP model training; but even if the trained model fits the training data accurately, is the rule the BP network found accurate and reliable? To check, we feed a test set X1 into the trained BP network to obtain the corresponding BP output (predicted) values predict1. By plotting and computing indicators such as MSE, MAPE, and R-squared, we can measure how close predict1 is to Y1 and thus judge whether the model predicts accurately. This is the testing process of the BP model: predicting on new data and verifying the accuracy of the predictions by comparing them with the actual values.
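The three closeness indicators mentioned above can be computed as follows. The values of Y1 and predict1 here are made-up numbers for illustration only.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error."""
    return float(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)

def r_squared(y_true, y_pred):
    """Coefficient of determination R^2."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1 - ss_res / ss_tot)

# Made-up test-set targets Y1 and BP outputs predict1
Y1 = np.array([100.0, 110.0, 120.0, 130.0])
predict1 = np.array([98.0, 112.0, 119.0, 133.0])

print(mse(Y1, predict1))        # -> 4.5
print(r_squared(Y1, predict1))  # -> 0.964
```

The closer MSE and MAPE are to 0, and the closer R-squared is to 1, the better the prediction.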



Fig. 1 Structure diagram of a three-layer BP neural network

1.2 BP neural network based on the influence of historical values

Taking power load forecasting as an example, the two models can be distinguished as follows when predicting the power load over a certain period:

One approach is to predict the load value at time t from the climatic factors at time t, such as air humidity X1, temperature X2, and whether it is a holiday X3. This is the model described in Section 1.1 above.

The other approach holds that the change in power load is related to time: for example, the load values at times t-1, t-2, and t-3 are related to the load at time t, satisfying the formula y(t) = F(y(t-1), y(t-2), y(t-3)). When a BP neural network is used to train this model, the influencing-factor values fed into the network are the historical load values y(t-1), y(t-2), y(t-3); here 3 is called the autoregressive order or delay. The target output given to the network is y(t).
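Assembling the lagged inputs y(t-1), y(t-2), y(t-3) and target y(t) from a load series can be sketched as below. The helper name make_lagged and the load numbers are illustrative assumptions, not from the article.

```python
import numpy as np

def make_lagged(series, delay=3):
    """Hypothetical helper: build inputs [y(t-1), ..., y(t-delay)] with target y(t).

    `delay` is the autoregressive order described in the text."""
    series = np.asarray(series, dtype=float)
    X, y = [], []
    for t in range(delay, len(series)):
        X.append(series[t - delay:t][::-1])   # most recent value first
        y.append(series[t])
    return np.array(X), np.array(y)

# Made-up hourly load values, for illustration only
load = [50, 52, 55, 53, 58, 60, 57]
X, y = make_lagged(load, delay=3)
print(X[0], y[0])   # first sample: inputs [55. 52. 50.], target 53.0
```

Each row of X together with the matching entry of y forms one training pair for the network; increasing `delay` gives the model a longer history to learn from.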

2. Firefly algorithm

The basic idea of the algorithm is as follows: each firefly individual in the population is randomly distributed in the space defined by the objective function. Initially, all fireflies carry the same luciferin value and the same dynamic decision radius. Each firefly then determines its direction of movement according to the signal strength of all neighbors within its dynamic decision radius. The dynamic decision radius changes with the number of firefly individuals within its range, and each firefly's luciferin is updated from iteration to iteration (in the code below, based on the objective value at its current position). The firefly swarm optimization algorithm is memoryless and requires neither global information nor gradient information about the objective function; it computes quickly, has few parameters to tune, and is easy to implement. In each iteration, the algorithm proceeds through firefly deployment (initialization), a luciferin update stage, a movement-probability calculation stage, a position update stage, and a neighborhood-range update stage, described as follows:

1. Firefly deployment (initialization)

2. Luciferin update stage

3. Movement-probability calculation stage

4. Position update stage

5. Neighborhood-range update stage
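The five stages above can be sketched compactly in Python (the article's own listing is MATLAB). The objective J, the parameter values, and the iteration count here are illustrative assumptions, not the blogger's exact program.

```python
import numpy as np

rng = np.random.default_rng(1)

def J(x):
    """Made-up objective to maximize: single peak at the origin."""
    return -np.sum(x ** 2, axis=-1)

n, m = 30, 2
rho, gamma = 0.4, 0.6          # luciferin decay / enhancement rates (assumed)
beta, nt, s, Rs = 0.08, 5, 0.03, 5.0

X = rng.uniform(-3, 3, size=(n, m))   # 1. deployment (initialization)
L = np.full(n, 5.0)                   # initial luciferin
Rd = np.full(n, 3.0)                  # initial dynamic decision range
start_spread = float(np.mean(np.linalg.norm(X, axis=1)))

for _ in range(100):
    L = (1 - rho) * L + gamma * J(X)                       # 2. luciferin update
    for i in range(n):
        d = np.linalg.norm(X - X[i], axis=1)
        nbrs = np.where((d < Rd[i]) & (L > L[i]))[0]       # brighter neighbors in range
        if len(nbrs) == 0:
            continue                                       # no one to move toward
        prob = (L[nbrs] - L[i]) / np.sum(L[nbrs] - L[i])   # 3. movement probability
        j = rng.choice(nbrs, p=prob)
        step = X[j] - X[i]
        X[i] = X[i] + s * step / np.linalg.norm(step)      # 4. position update
        Rd[i] = min(Rs, max(0.0, Rd[i] + beta * (nt - len(nbrs))))  # 5. range update

end_spread = float(np.mean(np.linalg.norm(X, axis=1)))
print(end_spread < start_spread)   # fireflies contract toward the bright region
```

Because brighter fireflies sit closer to the peak of J, repeatedly stepping toward a probabilistically chosen brighter neighbor pulls the swarm into the region of the optimum.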


3. Part of the code

clc; clear all; tic
m = 2;              % dimension
n = 50;             % number of fireflies
a = 3; b = -3;      % search-range bounds
jis = 0;            % counter for detected peaks
L0 = 5;             % initial luciferin value
beta = 0.08;        % dynamic decision-range update rate
nt = 5;             % neighborhood count threshold
s = 0.05;           % step size
gama = 0.6;         % luciferin enhancement rate
p = 0.4;            % luciferin decay rate
t = 2;              % iteration counter initial value
iter_max = 200;     % maximum number of iterations
R0 = 3;             % initial value of the dynamic decision range Rd
Rs = 5;             % sensing range, Rs >= Rd
L = zeros(n, iter_max);
Rd = zeros(n, 1);
P = zeros(n, n);
Nei = cell(n, iter_max);
% Objective function (from the commented-out surface-plot formula at the end)
J1 = @(x) 3*(1-x(1))^2*exp(-(x(1)^2+(x(2)+1)^2)) ...
    - 10*(x(1)/5 - x(1)^3 - x(2)^5)*exp(-(x(1)^2+x(2)^2)) ...
    - (1/3)*exp(-((x(1)+1)^2 + x(2)^2));

% Randomly deploy the fireflies and assign luciferin and decision range
for i = 1:n
    L(i,1) = L0;
    Rd(i) = R0;
    X(i,1:m) = (a-b)*rand(1,m) + b;
    plot(X(i,1), X(i,2), 'sk'); hold on
end

while t < iter_max
    % Luciferin update
    for i = 1:n
        L(i,t) = (1-p)*L(i,t-1) + gama*J1(X(i,1:m));
    end
    % Neighborhood determination
    for i = 1:n
        for j = 1:n
            if (norm(X(j,1:m)-X(i,1:m)) < Rd(i)) && (L(i,t) < L(j,t))
                Nei{i,t} = [j, Nei{i,t}];
            end
        end
    end
    tempsum = zeros(n,1);
    for i = 1:n
        for j = Nei{i,t}
            tempsum(i) = L(j,t) - L(i,t) + tempsum(i);
        end
    end
    % Movement-probability calculation
    for i = 1:n
        for j = Nei{i,t}
            P(i,j) = (L(j,t) - L(i,t))/tempsum(i);
        end
    end
    for i = 1:n
        if isempty(Nei{i,t})
            % No brighter neighbor: stay put, update the decision range only
            Rd(i) = min(Rs, max(0, Rd(i) + beta*(nt - length(Nei{i,t}))));
            plot(X(i,1), X(i,2), '*k'); hold on
        else
            for j = Nei{i,t}
                if P(i,j) == max(P(i,:)) && P(i,j) ~= 0
                    SeJ = j;   % choose the best movement direction
                    % Position update
                    X(i,1:m) = X(i,1:m) + s*(X(SeJ,1:m) - X(i,1:m))/norm(X(SeJ,1:m) - X(i,1:m));
                    % Dynamic decision-range update
                    Rd(i) = min(Rs, max(0, Rd(i) + beta*(nt - length(Nei{i,t}))));
                    plot(X(i,1), X(i,2), '*k'); hold on
                end
            end
            P(i,:) = zeros(1,n);
        end
    end
    if t <= 150
        s = s - 0.0003;   % shrink the step size over the first 150 iterations
    end
    t = t + 1;
end

% Merge fireflies closer than 0.05 and count the peaks found
for i = 1:n
    num = 0;
    for j = i+1:n
        if norm(X(i,1:m) - X(j,1:m)) < 0.05
            num = num + 1;
            if num <= 1
                jis = jis + 1;
                if J1(X(i,1:m)) < J1(X(j,1:m))
                    fuzhu(jis,1:m) = X(j,1:m);
                else
                    fuzhu(jis,1:m) = X(i,1:m);
                end
            elseif J1(X(i,1:m)) < J1(X(j,1:m))
                fuzhu(jis,1:m) = X(j,1:m);
            end
            X(j,1:m) = inf;
        end
    end
end
toc
for i = 1:n
    plot(X(i,1), X(i,2), 'or'); hold on
end
grid on

% Optional: plot the objective surface
% x1 = -3:0.05:3; y1 = x1';
% x = ones(size(y1))*x1; y = y1*ones(size(x1));
% z = 3.*(1-x).^2.*exp(-(x.^2+(y+1).^2)) - 10.*(x./5-x.^3-y.^5).*exp(-(x.^2+y.^2)) - (1/3).*exp(-((x+1).^2+y.^2));
% figure(1); surf(x,y,z)

4. Simulation results

Fig. 2 Convergence curve of the firefly algorithm

The following table shows the test statistics:

Test results          Test-set accuracy    Training-set accuracy
BP neural network     100%                 95%
GWO-BP                100%                 99.8%

5. References (private-message the blogger for the full code)

Forecasting of Water Resources Demand in Ningxia Based on BP Neural Network