Abstract: The model is built on a dynamic LSTM RNN with multiple hidden layers. The loss function takes the maximum loss, with M input dimensions and N output dimensions. The code is implemented with Python 3.6.x and TensorFlow 1.13.x.
1. Introduction
Data-driven user behavior analysis is particularly important for products in new retail, finance, supply chain, online education, banking, securities and other industries. The purpose of user behavior analysis is to promote product iteration, achieve precise marketing, provide customized services, and drive product decisions.
Take the business scenario of a new-retail gas station as an example. Filling up a car costs both money and time, with annoyances such as traffic jams and queues at the pump. We might stop at a gas station about once a week to refuel, and perhaps visit the convenience store to pick up non-fuel items or wash the car.
Behavioral event analysis, as shown in the figure above, analyzes fuel-card transaction data for specific user events against key operational indicators. By tracking or recording user behavior events promptly, you can quickly understand how events trend and how far customers complete them.
Predicting stock prices in the near future also falls under the analysis of time-series trading behavior. Stock trading data is easier for us to obtain, and it offers obvious benchmarks for the predictions. The industry already offers cases that use CNN+LSTM algorithms for behavior analysis and stock price prediction. Because data for businesses such as new retail is scarce, we first study the algorithmic model using stock data.
2. Sequence prediction analysis and an overview of the deep learning LSTM algorithm
In deep learning, the recurrent neural network (RNN) is a neural network suited to processing sequence data. It is widely used in NLP with good results, and it is also widely used in quantitative finance, especially in behavior analysis and stock price prediction. We will not delve into the algorithmic model here, and only introduce the key content from the perspective of application.
2.1. Overview of the recurrent neural network (RNN)
The recurrent neural network (RNN), which appeared in the 1980s, is a simple neural network structure that is applied repeatedly along a time series. It is composed of an input layer, a hidden layer and an output layer. Time series prediction uses the features of an event over a past period to predict its features in the future. This is a relatively complex predictive modeling problem. Unlike regression analysis models, a time series model depends on the order of events: feeding the same values in a different order produces a different output.
RNN learns on sequential data, and in order to remember the data, RNN generates memories of previous events just like humans do.
2.2. Overview of the long short-term memory (LSTM) recurrent neural network
A regular RNN is like a grandpa, sometimes forgetful. Why is that? In a regular RNN, an input has to travel a long path through the time steps before it reaches the last point in time. There we obtain the error, and when the error is passed backwards it is multiplied by the network's own parameter W at every step. If W is a number less than 1, the repeatedly multiplied error is close to zero by the time it reaches the initial time step, so the error essentially disappears. We call this problem the vanishing gradient (gradient dispersion). On the other hand, if W is a number greater than 1, the error ends up growing towards infinity, which is what we call the exploding gradient. This is why ordinary RNNs have no way of recalling distant memories.
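The effect of repeatedly multiplying by W can be checked with a few lines of Python. This is only a scalar sketch: the function name `backpropagated_error`, the weights 0.9 and 1.1, and the 50 time steps are illustrative assumptions, not values from the model in this article.

```python
def backpropagated_error(w, steps, error=1.0):
    """Multiply an error signal by the same weight w once per time step."""
    for _ in range(steps):
        error *= w
    return error


# Hypothetical weights just below and just above 1, over 50 time steps
vanishing = backpropagated_error(0.9, 50)  # shrinks towards zero
exploding = backpropagated_error(1.1, 50)  # keeps growing

print(vanishing)
print(exploding)
```

Even with weights this close to 1, fifty steps are enough for the two results to differ by several orders of magnitude.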
LSTM was born to solve this problem. Compared with an ordinary RNN, LSTM has three more controllers: the input gate, the output gate and the forget gate. The figure below shows an overview of the internal structure of LSTM.
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$
$$\tilde{C}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c)$$
$$C_t = f_t \times C_{t-1} + i_t \times \tilde{C}_t$$
$$h_t = o_t \times \tanh(C_t)$$
Here $f_t$ is the forget gate, $i_t$ is the input gate, $o_t$ is the output gate, $C_t$ is the cell state, $h_t$ is the hidden state, and W and b are the weights and biases respectively.
LSTM has an extra memory line that controls the global state, drawn as the main line in the figure. It is like the main plot of a script, while the ordinary RNN part is a side plot. Consider the input first: if a side plot is important for the final result, the input controller writes it into the main plot according to its importance. Then consider forgetting: if the side plot changes what we thought before, the forget controller forgets part of the earlier main plot and replaces it, in proportion, with the new story. LSTM is like a cure for memory decline, and leads to better results.
For the LSTM model, the output states is a tuple holding $C_t$ and $h_t$, where $h_t$ equals the output at the last time step in outputs (i.e., the last cell).
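To make the gate equations and the (C_t, h_t) state tuple concrete, here is a minimal scalar sketch in plain Python. It assumes a single input feature and a single hidden unit, so the product of W with [h_{t-1}, x_t] reduces to w_h * h_prev + w_x * x_t. The function `lstm_step` and all weights in `params` are made up for illustration and are not taken from the trained model.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step with scalar input and state; p maps gate name -> (w_h, w_x, b)."""
    def gate(name, activation):
        w_h, w_x, b = p[name]
        return activation(w_h * h_prev + w_x * x_t + b)

    f_t = gate('f', sigmoid)        # forget gate
    i_t = gate('i', sigmoid)        # input gate
    o_t = gate('o', sigmoid)        # output gate
    c_tilde = gate('c', math.tanh)  # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t


# Hypothetical weights, chosen only so that the example runs
params = {
    'f': (0.5, 0.5, 1.0),
    'i': (0.5, 0.5, 0.0),
    'o': (0.5, 0.5, 0.0),
    'c': (0.5, 0.5, 0.0),
}

h, c = 0.0, 0.0
outputs = []
for x in [0.1, 0.2, 0.3]:
    h, c = lstm_step(x, h, c, params)
    outputs.append(h)

final_state = (c, h)  # same (C_t, h_t) ordering as the states tuple
print(outputs[-1] == final_state[1])
```

The hidden state in the final tuple is the same value as the last entry of the collected outputs, which is the relationship between states and outputs described above.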
3. Stock price prediction in practice with the LSTM model
3.1. Build LSTM regression model based on TensorFlow 1.13.x
We define the main structure of the LSTM RNN as the class MultiLSTM. The RNN consists of three parts (input_layer, cell, output_layer):
(1) Define the input layer:
with tf.name_scope('inputs'):
    self.xs = tf.placeholder(tf.float32, [None, n_steps, input_size], name='xs')
    self.ys = tf.placeholder(tf.float32, [None, n_steps, output_size], name='ys')
    self.batch_size = tf.placeholder(tf.int32, [], name='batch_size')
    # Probability that a node is kept (not dropped) by dropout
    self.keep_prob = tf.placeholder(tf.float32, name='keep_prob')
The input layer also includes batch_size (the batch size of the training data) and keep_prob (the probability that a node is kept rather than dropped, used to reduce over-fitting).
(2) Define the hidden layer:
Construct multi-hidden layer neural network.
# Define the multilayer LSTM
def add_multi_cell(self):
    cell_list = tf.contrib.rnn.BasicLSTMCell(self.cell_size, forget_bias=1.0, state_is_tuple=True)
    with tf.name_scope('dropout'):
        if self.is_training:
            # Add dropout on the hidden-layer outputs to reduce overfitting
            cell_list = tf.contrib.rnn.DropoutWrapper(cell_list, output_keep_prob=self.keep_prob)
            tf.summary.scalar('dropout_keep_probability', self.keep_prob)
    lstm_cell = [cell_list for _ in range(self.num_layers)]
    lstm_cell = tf.contrib.rnn.MultiRNNCell(lstm_cell, state_is_tuple=True)
    with tf.name_scope('initial_state'):
        self.cell_init_state = lstm_cell.zero_state(self.batch_size, dtype=tf.float32)
    self.cell_outputs, self.cell_final_state = tf.nn.dynamic_rnn(
        lstm_cell, self.l_in_y, initial_state=self.cell_init_state, time_major=False)
When dropout is applied in an RNN, the recurrent connections along the time axis are not dropped. Dropout is only applied when information is passed between the stacked cells at the same time step t.
The time_major parameter of tf.nn.dynamic_rnn(cell, inputs) depends on the layout of inputs:
- if inputs has shape (batches, steps, inputs), time_major=False;
- if inputs has shape (steps, batches, inputs), time_major=True.
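The difference between the two layouts is only the order of the first two axes. The following sketch uses plain nested lists rather than tensors, and the helper `to_time_major` is a hypothetical name introduced here for illustration.

```python
def to_time_major(batch_major):
    """Convert nested lists shaped (batch, steps, features) to (steps, batch, features)."""
    return [list(step) for step in zip(*batch_major)]


# 2 sequences, 3 time steps, 1 feature per step
batch_major = [
    [[1], [2], [3]],
    [[10], [20], [30]],
]

time_major = to_time_major(batch_major)
print(time_major)
```

In the batch-major layout the outer index selects a sequence. In the time-major layout it selects a time step, and each entry holds that step for every sequence in the batch.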
(3) Define the output layer:
The output layer is a fully connected layer without an activation function, that is, the linear map $y_i = W_i x_i + b_i$.
# Define the fully connected output layer
def add_output_layer(self):
    l_out_x = tf.reshape(self.cell_outputs, [-1, self.cell_size], name='2_2D')
    Ws_out = self._weight_variable([self.cell_size, self.output_size])
    bs_out = self._bias_variable([self.output_size, ])
    # shape = (batch * steps, output_size)
    with tf.name_scope('Wx_plus_b'):
        self.pred = tf.matmul(l_out_x, Ws_out) + bs_out
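The reshape followed by a matrix product in the output layer can be followed by hand on a tiny case. This is a plain-Python sketch with nested lists. The helpers `flatten_steps` and `linear` and all the numbers are illustrative assumptions, not part of the model code.

```python
def flatten_steps(outputs):
    """(batch, steps, cell_size) -> (batch * steps, cell_size)."""
    return [vec for seq in outputs for vec in seq]


def linear(rows, weights, bias):
    """rows: (n, cell_size); weights: (cell_size, output_size); bias: (output_size,)."""
    columns = list(zip(*weights))
    return [
        [sum(x * w for x, w in zip(row, col)) + b for col, b in zip(columns, bias)]
        for row in rows
    ]


cell_outputs = [[[1.0, 2.0], [3.0, 4.0]]]  # batch=1, steps=2, cell_size=2
weights = [[1.0], [0.5]]                   # cell_size=2, output_size=1
bias = [0.1]

pred = linear(flatten_steps(cell_outputs), weights, bias)
print(pred)  # one row per (batch, step) pair
```

Because every time step of every sequence becomes one row, the same weights are applied at each step, and the result has batch * steps rows and output_size columns.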
(4) Define the loss function:
The loss function is defined with tf.contrib.legacy_seq2seq.sequence_loss_by_example. Because this is a multi-output model, averaging would affect the result, so I take the maximum.
def compute_cost(self):
    losses = tf.contrib.legacy_seq2seq.sequence_loss_by_example(
        [tf.reshape(self.pred, [-1], name='reshape_pred')],
        [tf.reshape(self.ys, [-1], name='reshape_target')],
        [tf.ones([self.batch_size * self.n_steps * self.output_size], dtype=tf.float32)],
        average_across_timesteps=True,
        softmax_loss_function=self.ms_error,
        name='losses'
    )
    with tf.name_scope('average_cost'):
        # Take the maximum loss value
        self.cost = tf.reduce_max(losses, name='average_cost')
        # Averaged alternative, kept for reference:
        # self.cost = tf.div(
        #     tf.reduce_sum(losses, name='losses_sum'),
        #     self.batch_size_,
        #     name='average_cost')
        tf.summary.scalar('cost', self.cost)
    print('self.cost shape is {}'.format(self.cost.shape))
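The difference between averaging and taking the maximum is easy to see on a small flattened example. The sketch below is plain Python with made-up numbers. It only illustrates the idea of a per-element squared error followed by a reduction, and is not the TensorFlow computation itself.

```python
def squared_errors(pred, target):
    """Per-element squared error, the same quantity ms_error computes."""
    return [(t - p) ** 2 for p, t in zip(pred, target)]


# Flattened predictions and targets (batch_size * n_steps * output_size values)
pred = [0.5, 0.5, 0.5, 0.5]
target = [0.5, 0.5, 0.5, 1.5]

errors = squared_errors(pred, target)
mean_loss = sum(errors) / len(errors)
max_loss = max(errors)

print(mean_loss, max_loss)
```

With one badly predicted element among several good ones, the mean hides most of the error, while the maximum keeps the optimizer focused on the worst element.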
(5) Common problems and precautions when building an LSTM network:
- Many-to-many is the most classical RNN structure: its input and output are sequences of equal length.
- In TensorFlow 1.x, batch_size should be left undetermined and fed at run time, so that testing and applying the model are not tied to a fixed batch_size.
- When computing the losses, for example the mean squared error, note that the number of data points is batch_size * n_steps * output_size.
- Because the data set is small, when timestep is too large relative to the length of the training set, the predictions of the trained model barely fluctuate and look like a straight line. It is therefore advisable to shorten timestep.
At this point, the multi-layer LSTM model has been built. The integrated code is as follows:
class MultiLSTM(object):
    def __init__(self, n_steps, input_size, output_size, cell_size, batch_size, num_layers, is_training):
        self.n_steps = n_steps
        self.input_size = input_size
        self.output_size = output_size
        self.cell_size = cell_size      # number of LSTM units
        self.batch_size_ = batch_size   # batch size passed at construction time
        self.num_layers = num_layers    # number of LSTM layers
        # Whether the model is in training mode
        self.is_training = is_training
        with tf.name_scope('inputs'):
            self.xs = tf.placeholder(tf.float32, [None, n_steps, input_size], name='xs')
            self.ys = tf.placeholder(tf.float32, [None, n_steps, output_size], name='ys')
            self.batch_size = tf.placeholder(tf.int32, [], name='batch_size')
            # Probability that a node is kept (not dropped) by dropout
            self.keep_prob = tf.placeholder(tf.float32, name='keep_prob')
        with tf.variable_scope('in_hidden'):
            self.add_input_layer()
        with tf.variable_scope('Multi_LSTM'):
            self.add_multi_cell()
        with tf.variable_scope('out_hidden'):
            self.add_output_layer()
        with tf.name_scope('cost'):
            self.compute_cost()
        with tf.name_scope('train'):
            # LR is the module-level learning rate defined in the training script
            self.train_op = tf.train.AdamOptimizer(LR).minimize(self.cost)

    def add_input_layer(self):
        l_in_x = tf.reshape(self.xs, [-1, self.input_size], name='2_2D')  # (batch*n_step, in_size)
        Ws_in = self._weight_variable([self.input_size, self.cell_size])
        bs_in = self._bias_variable([self.cell_size, ])
        with tf.name_scope('Wx_plus_b'):
            l_in_y = tf.matmul(l_in_x, Ws_in) + bs_in
        self.l_in_y = tf.reshape(l_in_y, [-1, self.n_steps, self.cell_size], name='2_3D')

    # Define the multilayer LSTM
    def add_multi_cell(self):
        cell_list = tf.contrib.rnn.BasicLSTMCell(self.cell_size, forget_bias=1.0, state_is_tuple=True)
        with tf.name_scope('dropout'):
            if self.is_training:
                # Add dropout on the hidden-layer outputs to reduce overfitting
                cell_list = tf.contrib.rnn.DropoutWrapper(cell_list, output_keep_prob=self.keep_prob)
                tf.summary.scalar('dropout_keep_probability', self.keep_prob)
        lstm_cell = [cell_list for _ in range(self.num_layers)]
        lstm_cell = tf.contrib.rnn.MultiRNNCell(lstm_cell, state_is_tuple=True)
        with tf.name_scope('initial_state'):
            self.cell_init_state = lstm_cell.zero_state(self.batch_size, dtype=tf.float32)
        self.cell_outputs, self.cell_final_state = tf.nn.dynamic_rnn(
            lstm_cell, self.l_in_y, initial_state=self.cell_init_state, time_major=False)

    # Define the fully connected output layer
    def add_output_layer(self):
        l_out_x = tf.reshape(self.cell_outputs, [-1, self.cell_size], name='2_2D')
        Ws_out = self._weight_variable([self.cell_size, self.output_size])
        bs_out = self._bias_variable([self.output_size, ])
        with tf.name_scope('Wx_plus_b'):
            self.pred = tf.matmul(l_out_x, Ws_out) + bs_out

    def compute_cost(self):
        losses = tf.contrib.legacy_seq2seq.sequence_loss_by_example(
            [tf.reshape(self.pred, [-1], name='reshape_pred')],
            [tf.reshape(self.ys, [-1], name='reshape_target')],
            [tf.ones([self.batch_size * self.n_steps * self.output_size], dtype=tf.float32)],
            average_across_timesteps=True,
            softmax_loss_function=self.ms_error,
            name='losses'
        )
        with tf.name_scope('average_cost'):
            # Take the maximum loss value
            self.cost = tf.reduce_max(losses, name='average_cost')
            # Averaged alternative, kept for reference:
            # self.cost = tf.div(
            #     tf.reduce_sum(losses, name='losses_sum'),
            #     self.batch_size_,
            #     name='average_cost')
            tf.summary.scalar('cost', self.cost)
        print('self.cost shape is {}'.format(self.cost.shape))

    @staticmethod
    def ms_error(labels, logits):
        return tf.square(tf.subtract(labels, logits))

    def _weight_variable(self, shape, name='weights'):
        initializer = tf.random_normal_initializer(mean=0., stddev=1.)
        return tf.get_variable(shape=shape, initializer=initializer, name=name)

    def _bias_variable(self, shape, name='biases'):
        initializer = tf.constant_initializer(0.1)
        return tf.get_variable(name=name, shape=shape, initializer=initializer)
Note: Refer to the Python code “LSTM regression” for the code prototype.
When the hidden layers have different numbers of units, the officially recommended approach is:
# The officially recommended way is to build the cells with a list comprehension:
num_units = [128, 64]
cells = [BasicLSTMCell(num_units=n) for n in num_units]
stacked_rnn_cell = MultiRNNCell(cells)
3.2. Stock data sets
Stock data comes from Tushare, a free, open source Python financial data interface package. I’m using the Pro version. The data is more stable and the quality is better. Pro is still an open, free platform.
This case uses the backward-adjusted price data of one stock, together with the SSE Index, the Shenzhen Component Index, the NASDAQ Index, the Dow Jones Index and the Hang Seng Index as inputs (because of limited computer performance, only the SSE and Shenzhen indices are kept as extra inputs). To construct the training data set, the stock's closing price, its trading volume, the SSE Index and the Shenzhen Component Index are split off as outputs (OUTPUT_SIZE = 4). Because the time series in the data set is short, each sample takes 15 consecutive records (TIME_STEPS = 15). Starting from the earliest date, windows of 15 records with all columns are taken as inputs, and the last input window ends 15 records before the last date in the data set (PRED_SIZE = 15).
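The windowing described above can be sketched on a toy series. The function `make_windows` is a hypothetical simplification written for this illustration: it works on a plain list instead of the normalized array, and uses a short series with small `time_steps` and `pred_size` values so that the result is easy to read.

```python
def make_windows(series, time_steps, pred_size):
    """Pair each input window with the target window shifted pred_size steps ahead."""
    inputs = series[:-pred_size]   # drop the tail that has no future target
    targets = series[pred_size:]   # the same series, shifted forward
    n = len(inputs) - time_steps
    seq = [inputs[i:i + time_steps] for i in range(n)]
    res = [targets[i:i + time_steps] for i in range(n)]
    return seq, res


series = list(range(10))
seq, res = make_windows(series, time_steps=3, pred_size=2)

print(seq[0], res[0])
print(len(seq))
```

Each target window covers the same number of steps as its input window but starts pred_size records later, which is how the model learns to output values that lie in the future relative to its input.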
The code that splits the data set is shown below. It also prepares the last 15 records as the input for predicting the next 15 days (PRED_SIZE=15).
def get_train_data():
    df = pd.read_csv('share20210302.csv')
    return df

def get_test_data():
    df = pd.read_csv('share20210302.csv')
    # df = df.iloc[-(TIME_STEPS+1):]
    df = df.iloc[-TIME_STEPS:]
    return df

def get_pred_data(y, z, sc):
    # Re-attach the non-predicted columns so the scaler can invert all columns
    yy = np.concatenate((y, z), axis=1)
    y = sc.inverse_transform(yy)
    return y

# Build the data set
def set_datas(df, train=True, sc=None):
    df['Year'] = df['trade_date'].apply(lambda x: int(str(x)[0:4]))
    df['Month'] = df['trade_date'].apply(lambda x: int(str(x)[4:6]))
    df['Day'] = df['trade_date'].apply(lambda x: int(str(x)[6:8]))
    df['Week'] = df['trade_date'].apply(lambda x: datetime.datetime.strptime(str(x), '%Y%m%d').weekday())
    # NASDAQ and Dow Jones rows need to be shifted down by one record
    # shift_columns = ['open3','high3','close3','low3','change3','pct_chg3','open4','high4','close4','low4','change4','pct_chg4']
    # df[shift_columns] = df[shift_columns].shift(1)
    # Rearrange the columns of the table to make data extraction easier
    # df = df.reindex(columns=col_name)
    df = df.drop('trade_date', axis=1)
    # df = df.reset_index(drop=True)
    col_name = df.columns.tolist()
    # Move the output columns to the front
    col_name.remove('close1')
    col_name.remove('close2')
    col_name.remove('vol0')
    # Drop columns that are not important
    # del_list = ['high3','low3','change3','pct_chg3','high4','low4','change4','pct_chg4','high5','low5','change5','pct_chg5']
    # for name in del_list:
    #     col_name.remove(name)
    col_name.insert(0, 'close1')
    col_name.insert(1, 'close2')
    col_name.insert(2, 'vol0')
    df = df[col_name]
    # sc = MinMaxScaler(feature_range=(0, 1))
    if train:
        sc = MinMaxScaler(feature_range=(0, 1))
        training_set = sc.fit_transform(df)
    else:
        # The test set must be normalized with the scaler fitted on the training set
        training_set = sc.transform(df)

    # Build windows of TIME_STEPS consecutive records
    def get_batch(train_x, train_y):
        data_len = len(train_x) - TIME_STEPS
        seq = []
        res = []
        for i in range(data_len):
            seq.append(train_x[i:i + TIME_STEPS])
            res.append(train_y[i:i + TIME_STEPS])  # target window of TIME_STEPS rows
        seq, res = np.array(seq), np.array(res)
        return seq, res

    if train:
        seq, res = get_batch(training_set[:-PRED_SIZE], training_set[PRED_SIZE:][:, 0:OUTPUT_SIZE])
    else:
        seq, res = training_set, training_set[:, 0:OUTPUT_SIZE]
        seq, res = seq[np.newaxis, :, :], res[np.newaxis, :, :]
    return seq, res, training_set[:, OUTPUT_SIZE:], sc, col_name, df
Note: the training set and the test set must be normalized with the same scaler, that is, call fit_transform on the MinMaxScaler once and call transform afterwards. It is easy to trip up here!
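Why this matters can be shown with a tiny stand-in for a min-max scaler. The class `MinMax` below is a hypothetical single-column simplification written for this illustration. It is not scikit-learn's MinMaxScaler, and the numbers are made up.

```python
class MinMax:
    """Tiny single-column min-max scaler, for illustration only."""

    def fit(self, values):
        self.lo, self.hi = min(values), max(values)
        return self

    def transform(self, values):
        return [(v - self.lo) / (self.hi - self.lo) for v in values]

    def inverse_transform(self, scaled):
        return [s * (self.hi - self.lo) + self.lo for s in scaled]


train = [10.0, 20.0, 30.0]
test = [20.0, 40.0]

scaler = MinMax().fit(train)
test_scaled = scaler.transform(test)               # same scale as the training data
refit_scaled = MinMax().fit(test).transform(test)  # refitted on the test data: a different scale

print(test_scaled, refit_scaled)
```

With the training scaler, the value 20.0 maps to 0.5 in both sets. After refitting on the test data it maps to 0.0, so the model would see the same price as a different input, and inverse_transform would map its predictions back to the wrong range.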
3.3. Model training and parameters
During training, batches of data are obtained through an iterator over tf.data.Dataset.from_tensor_slices. The training log data is written to the logs directory and viewed with the TensorBoard tool.
import tensorflow as tf
import numpy as np
import pandas as pd
import datetime
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt

BATCH_START = 0
TIME_STEPS = 15
BATCH_SIZE = 30
INPUT_SIZE = 25
OUTPUT_SIZE = 4
PRED_SIZE = 15  # Predict the output of 15-day sequence data
CELL_SIZE = 256
NUM_LAYERS = 3
LR = 0.00001
EPOSE = 40000

if __name__ == '__main__':
    model = MultiLSTM(TIME_STEPS, INPUT_SIZE, OUTPUT_SIZE, CELL_SIZE, BATCH_SIZE, NUM_LAYERS, True)
    sess = tf.Session()
    merged = tf.summary.merge_all()
    writer = tf.summary.FileWriter("logs", sess.graph)
    # tf.initialize_all_variables() is no longer valid from
    # 2017-03-02 if using tensorflow >= 0.12
    if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
        init = tf.initialize_all_variables()
    else:
        init = tf.global_variables_initializer()
    sess.run(init)
    # Change to the local directory and run this line to view it in Chrome (http://localhost:6006/):
    # $ tensorboard --logdir logs
    state = 0
    xs = 0
    df = get_train_data()
    train_x, train_y, z, sc, col_name, df = set_datas(df, True)
    # Use from_tensor_slices to slice the data, then batch and repeat to split it
    # into batches and make the sequence of batches repeat indefinitely
    dataset = tf.data.Dataset.from_tensor_slices((train_x, train_y))
    dataset = dataset.batch(BATCH_SIZE).repeat()
    # Fetch data through an iterator and get_next.
    # A one-shot iterator traverses the data only once and needs no explicit sess.run() to initialize it
    # iterator = dataset.make_one_shot_iterator()
    # An initializable iterator can be re-initialized to loop again, but needs an explicit sess.run() call
    iterator = dataset.make_initializable_iterator()
    next_iterator = iterator.get_next()
    losse = []
    for i in range(EPOSE):
        # Explicit initialization, only needed for dataset.make_initializable_iterator().
        # Note: running it on every iteration restarts the dataset from its first batch.
        sess.run(iterator.initializer)
        seq, res = sess.run(next_iterator)
        if i == 0:
            feed_dict = {
                model.xs: seq,
                model.ys: res,
                model.batch_size: BATCH_SIZE,
                model.keep_prob: 0.75,
                # create initial state
            }
        else:
            feed_dict = {
                model.xs: seq,
                model.ys: res,
                model.batch_size: BATCH_SIZE,
                model.keep_prob: 0.75,
                model.cell_init_state: state  # use last state as the initial state for this run
            }
        _, cost, state, pred = sess.run(
            [model.train_op, model.cost, model.cell_final_state, model.pred],
            feed_dict=feed_dict)
        losse.append(cost)
        if i % 20 == 0:
            # print(state)
            print('cost: ', round(cost, 5))
            result = sess.run(merged, feed_dict)
            writer.add_summary(result, i)
    plt.rcParams['font.sans-serif'] = ['SimHei']  # Display Chinese labels
    plt.rcParams['axes.unicode_minus'] = False
    losse = np.array(losse) / max(losse)
    plt.plot(losse, label='Training Loss')
    plt.title('Training Loss')
    plt.legend()
    plt.show()
    # End of training
In a command line window, go to the program directory and run the command tensorboard --logdir logs. The computation graph is as follows:
The training process is as follows, and the cost value reaches 0.013x.
3.4. Prediction results and analysis
Using the last 15 records of the dataset as the input of the forecast for the next 15 days, the output results are as follows:
df = get_test_data()
seq,res,z,sc,col_name,df = set_datas(df,False,sc)
seq = seq.reshape(-1,TIME_STEPS,INPUT_SIZE)
share_close = df['close0'].values
share_vol = df['vol0'].values/10000
share_sh = df['close1'].values
share_sz = df['close2'].values
# The graph is already built, so dropout is effectively disabled by feeding keep_prob=1.0
model.is_training = False
feed_dict = {
    model.xs: seq,
    model.batch_size: 1,
    model.keep_prob: 1.0
}
#pred,state = sess.run([model.pred,model.cell_init_state], feed_dict=feed_dict)
pred = sess.run([model.pred], feed_dict=feed_dict)
#print(pred[0])
y=get_pred_data(pred[0].reshape(TIME_STEPS,OUTPUT_SIZE),z,sc)
df= pd.DataFrame(y,columns=col_name)
df.to_csv('y.csv')
share_close1 = df['close0'].values
share_vol1 = df['vol0'].values/10000
share_sh1 = df['close1'].values
share_sz1 = df['close2'].values
# Merge forecast shift PRED_SIZE
share_close1 = np.concatenate((share_close[:PRED_SIZE],share_close1),axis=0)
share_vol1 = np.concatenate((share_vol[:PRED_SIZE],share_vol1),axis=0)
share_sh1 = np.concatenate((share_sh[:PRED_SIZE],share_sh1),axis=0)
share_sz1 = np.concatenate((share_sz[:PRED_SIZE],share_sz1),axis=0)
plt.plot(share_sh, label='Closing Shanghai Index')
plt.plot(share_sh1, label='Forecast close of Shanghai Stock Index')
plt.plot(share_sz, label='Shenzhen Closing Index')
plt.plot(share_sz1, label='Forecast Closing Shenzhen Index')
plt.plot(share_close, label='Actual closing value')
plt.plot(share_vol, label='Actual volume')
plt.plot(share_vol1, label='Volume forecast')
plt.plot(share_close1, label='Closing forecast')
plt.title('Test Loss')
plt.legend()
plt.show()
The predictions are the 15 consecutive days starting from abscissa 14 in the figure below. The input time series spans a Chinese annual holiday, so the selected data set may not be ideal.
Note: because the first 15 days of data coincide with the input, the overlaid lines look slightly darker there and one line hides the other.
4. Summary
Recently, seeing stocks continue to fall has given me confidence that the model research prepared for behavioral analysis is meaningful.
LSTM prediction models with multiple hidden layers, multiple inputs and outputs, and different dimensions of input and output are still of practical significance in behavior analysis and prediction. If CNN or feature engineering are combined as inputs, better results will be obtained.
As for the number of neurons in the RNN cell, after several rounds of training practice I generally set it to more than 10 times the input size, and larger still when the data set is small. However, this makes training more difficult and the training cycle longer.
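One reason larger cells train more slowly is that the number of weights grows roughly with the square of the cell size. The sketch below counts the parameters of a standard LSTM layer, which has four blocks of weights and biases (three gates plus the candidate state). The helper `lstm_param_count` is a hypothetical name. The sizes 256 and 3 are the CELL_SIZE and NUM_LAYERS used in this article, and the total assumes that every layer has its own weights and receives a 256-dimensional input.

```python
def lstm_param_count(input_size, cell_size):
    """Weights and biases of one standard LSTM layer (four weight blocks)."""
    return 4 * ((input_size + cell_size) * cell_size + cell_size)


per_layer = lstm_param_count(256, 256)
total = 3 * per_layer  # assumes three layers with separate weights

print(per_layer, total)
```

Doubling the cell size roughly quadruples the parameters per layer, so the choice of cell size is a trade-off between model capacity and training time.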
The author's knowledge is limited, so feedback and discussion are welcome.
References:
1. “What is LSTM Recurrent Neural Network”, Don’t bother PYTHON, November 2016
2. “LSTM Training and Test Length (batch_size) Error Solutions”, CSDN Blog, David-Chow, April 2019
3. “tf.nn.dynamic_rnn Details”, Zhihu, Ying Zhong Youwei, August 2018
4. “User Research: How to Do User Behavior Analysis?”, Everyone is a Product Manager, Zhu Xuemin, December 2019
5. “LSTM Multi-variable Time Series Stock Forecasting Based on Keras”, CSDN Blog, Xiao Yongwei, April 2020
6. “The Use of LSTM in TensorFlow 1.4”, 51CTO Blog, mybabe0312, February 2019
7. “Parameter Calculation in LSTM”, Zhihu, Yizhen, December 2018