Original link: http://tecdat.cn/?p=19751


Original source: Tecdat (拓端数据部落) WeChat official account

This example shows how to classify sequence data using a long short-term memory (LSTM) network.

You can use an LSTM network to train a deep neural network to classify sequence data. An LSTM network lets you feed sequence data into the network and make predictions based on the individual time steps of the sequence.

This example uses the Japanese Vowels dataset. It trains an LSTM network to recognize the speaker given time series data representing two Japanese vowels spoken in succession. The training data contains time series from nine speakers. Each sequence has 12 features and varies in length. The dataset contains 270 training observations and 370 test observations.

Load sequence data

Load the Japanese Vowels training data. XTrain is a cell array containing 270 sequences of dimension 12 and varying length. YTrain is a categorical vector of labels "1", "2", ..., "9", corresponding to the nine speakers. Each entry of XTrain is a matrix with 12 rows (one row per feature) and a varying number of columns (one column per time step).
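
The loading code is not shown in the original post; a minimal sketch, assuming the japaneseVowelsTrainData helper function that ships with Deep Learning Toolbox:

% Load the Japanese Vowels training set (Deep Learning Toolbox helper).
[XTrain,YTrain] = japaneseVowelsTrainData;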

XTrain(1:5)
ans=5×1 cell array
    {12x20 double}
    {12x26 double}
    {12x22 double}
    {12x20 double}
    {12x21 double}

Visualize the first time series in a plot. Each line corresponds to one feature.

figure
plot(XTrain{1}')
xlabel("Time Step")
title("Training Sample 1")
numFeatures = size(XTrain{1},1);
legend("Feature " + string(1:numFeatures))

Prepare the data for padding

During training, by default, the software splits the training data into mini-batches and pads the sequences so that they have the same length. Too much padding can have a negative impact on network performance.

To prevent the training process from adding too much padding, you can sort the training data by sequence length and choose a mini-batch size so that the sequences in each mini-batch have similar lengths. The following figure shows the effect of padding the sequences before and after sorting the data.

Get the sequence length of each observation.
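
The code for this step is not shown in the original; a minimal sketch:

numObservations = numel(XTrain);
for i = 1:numObservations
    sequence = XTrain{i};
    sequenceLengths(i) = size(sequence,2);  % one column per time step
end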

Sort the data by sequence length.
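
A minimal sketch of the sort, reordering the labels with the same index so they stay aligned with the sequences:

[sequenceLengths,idx] = sort(sequenceLengths);
XTrain = XTrain(idx);
YTrain = YTrain(idx);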

View the sorted sequence lengths in a bar chart.

figure
bar(sequenceLengths)
ylim([0 30])
xlabel("Sequence")
ylabel("Length")
title("Sorted Data")

Choosing a mini-batch size of 27 divides the training data evenly and reduces the amount of padding in the mini-batches. The following figure illustrates the padding added to the sequences.

Define the LSTM network architecture

Define the LSTM network architecture. Specify the input size as sequences of dimension 12 (the dimension of the input data). Specify a bidirectional LSTM layer with 100 hidden units, and output the last element of the sequence. Finally, specify nine classes by including a fully connected layer of size 9, followed by a softmax layer and a classification layer.

If you have access to the full sequences at prediction time, you can use a bidirectional LSTM layer in the network. A bidirectional LSTM layer learns from the full sequence at each time step. If you do not have access to the full sequence at prediction time, for example, if you are predicting one time step at a time, use an LSTM layer instead.
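
The layer definition that produces the summary below is not included in the original; a sketch matching the architecture described above:

inputSize = 12;
numHiddenUnits = 100;
numClasses = 9;

layers = [ ...
    sequenceInputLayer(inputSize)                    % 12-dimensional sequence input
    bilstmLayer(numHiddenUnits,'OutputMode','last')  % output only the last time step
    fullyConnectedLayer(numClasses)                  % one output per speaker
    softmaxLayer
    classificationLayer]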


layers = 
  5x1 Layer array with layers:

     1   ''   Sequence Input          Sequence input with 12 dimensions
     2   ''   BiLSTM                  BiLSTM with 100 hidden units
     3   ''   Fully Connected         9 fully connected layer
     4   ''   Softmax                 softmax
     5   ''   Classification Output   crossentropyex

Now, specify the training options. Specify the solver as 'adam', the gradient threshold as 1, and the maximum number of epochs as 100. To reduce the amount of padding in the mini-batches, choose a mini-batch size of 27. To pad the data to the same length as the longest sequence, specify the sequence length as 'longest'. To ensure that the data remains sorted by sequence length, specify never to shuffle the data.

Because the mini-batches are small with short sequences, training is better suited to the CPU. Specify 'ExecutionEnvironment' as 'cpu'. To train on a GPU, if available, set 'ExecutionEnvironment' to 'auto' (the default value).
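
The trainingOptions call is omitted in the original; a sketch matching the options described above:

miniBatchSize = 27;

options = trainingOptions('adam', ...
    'ExecutionEnvironment','cpu', ...    % train on the CPU
    'GradientThreshold',1, ...           % clip gradients at 1
    'MaxEpochs',100, ...
    'MiniBatchSize',miniBatchSize, ...   % 27 divides the 270 observations evenly
    'SequenceLength','longest', ...      % pad to the longest sequence in each mini-batch
    'Shuffle','never');                  % keep the length-sorted order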

Train the LSTM network

Train the LSTM network with the specified training options by using trainNetwork.
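
The training call itself is not shown; a one-line sketch using the variables defined above:

% Train the network on the sorted training data.
net = trainNetwork(XTrain,YTrain,layers,options);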

Test the LSTM network

Load the test set and classify the sequences into speakers.

Load the Japanese Vowels test data. XTest is a cell array containing 370 sequences of dimension 12 and varying length. YTest is a categorical vector of labels "1", "2", ..., "9", corresponding to the nine speakers.
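
As with the training set, the loading code is omitted; a minimal sketch, again assuming the japaneseVowelsTestData helper function:

% Load the Japanese Vowels test set (Deep Learning Toolbox helper).
[XTest,YTest] = japaneseVowelsTestData;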

XTest(1:3)
ans=3×1 cell array
    {12x19 double}
    {12x17 double}
    {12x19 double}

The LSTM network was trained using mini-batches of sequences with similar lengths. Make sure the test data is organized in the same way. Sort the test data by sequence length.
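
A sketch of this sorting step, mirroring the training-data preparation:

numObservationsTest = numel(XTest);
for i = 1:numObservationsTest
    sequence = XTest{i};
    sequenceLengthsTest(i) = size(sequence,2);
end
[sequenceLengthsTest,idx] = sort(sequenceLengthsTest);
XTest = XTest(idx);
YTest = YTest(idx);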

Classify the test data. To reduce the amount of padding introduced by the classification process, set the mini-batch size to 27. To apply the same padding as the training data, specify the sequence length as 'longest'.
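
A sketch of the classification call with these settings, using the standard classify function from Deep Learning Toolbox:

miniBatchSize = 27;
YPred = classify(net,XTest, ...
    'MiniBatchSize',miniBatchSize, ...
    'SequenceLength','longest');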

Calculate the classification accuracy of the predictions.

acc = sum(YPred == YTest)./numel(YTest)
acc = 0.9730

Most popular insights

1. Python for NLP: multi-label text classification using a Keras LSTM neural network

2. Using LSTM in Python for time series predictive analysis: predicting power consumption data

3. Python uses LSTM in Keras to solve sequence problems

4. Python uses PyTorch machine learning classification to predict a bank customer churn model

5. R language multivariate Copula GARCH model time series prediction

6. Using GAM (generalized additive model) in R for power load time series analysis

7. ARMA, ARIMA (Box-Jenkins), SARIMA, and ARIMAX models in R for time series forecasting

8. Empirical study on time series estimation with a time-varying VAR model in R

9. Time series analysis using the generalized additive model GAM