The following are reading notes summarizing the technical parts of the original article; please refer to the source: Rambling on time series forecasting – ByteDance articles – Zhihu: zhuanlan.zhihu.com/p/486343380
1. Evaluation metrics
- Scale-insensitive metrics:
  - SMAPE (Symmetric MAPE):
    - MAE expressed in percentage form
    - Bounded above (200%) and below (0%)
  - WMAPE (Weighted MAPE):
    - Weighs errors of different orders of magnitude equally
    - Avoids MAPE's division-by-zero problem
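As a sketch of the two metrics above (plain Python; the function names are illustrative, not from the original article):

```python
def smape(y_true, y_pred):
    # Symmetric MAPE: per-point percentage error with a symmetric
    # denominator, bounded in [0%, 200%]. Points where both actual
    # and prediction are zero contribute zero error.
    total = sum(
        2.0 * abs(p - t) / (abs(t) + abs(p))
        for t, p in zip(y_true, y_pred)
        if abs(t) + abs(p) > 0
    )
    return 100.0 * total / len(y_true)

def wmape(y_true, y_pred):
    # Weighted MAPE: a single global ratio, so series of different
    # magnitudes contribute in proportion to their size, and one
    # zero actual no longer causes a division by zero.
    return 100.0 * sum(abs(t - p) for t, p in zip(y_true, y_pred)) \
        / sum(abs(t) for t in y_true)
```

Note how a perfect forecast gives 0% for both, and a forecast of 0 against a nonzero actual hits SMAPE's 200% upper bound.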
2. Model classification
2.1 Traditional Methods
- MA (moving average: averages the observations of the past N time points as the forecast of the next point):
  - Advantages: fast; works well as a baseline.
  - Disadvantages: cannot do multi-step forecasting well, and lags behind the series.
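A minimal MA baseline sketch (plain Python; the window size and recursive multi-step scheme are illustrative assumptions), which also shows the two weaknesses noted above: multi-step output quickly flattens, and on a trending series the forecast lags the true level:

```python
def ma_forecast(history, n=3, steps=1):
    # Predict the next point as the mean of the last n observations.
    # For multi-step forecasts each prediction is fed back into the
    # window, so the output converges to a flat value.
    window = list(history[-n:])
    preds = []
    for _ in range(steps):
        pred = sum(window) / len(window)
        preds.append(pred)
        window = window[1:] + [pred]
    return preds
```

On a rising series like [1, 2, 3, 4, 5] the one-step forecast is 4.0, below the trend's next value: the lag problem in miniature.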
- ARIMA (autoregressive integrated moving average):
  - Advantages: fits better than MA or AR alone.
  - Disadvantages: parameter search, long running time, and each series must be fit separately.
- Prophet (additive time-series decomposition model):
  - Advantages: handles non-linear trends, seasonality, and external variables; can output probabilistic forecasts.
  - Disadvantages: single-series prediction only; costly at large scale.
- Others: Orbit and NeuralProphet; reportedly not as good.
- Shared disadvantages of traditional methods:
  - They impose assumptions on properties of the series itself and are not optimized end to end;
  - They can only predict a single series at a time, with high performance overhead;
  - They are purely autoregressive, so no covariates can be introduced;
  - Multi-step forecasting works poorly.
2.2 ML methods
- Modeling approach:
  - Convert the time-series problem into a tabular (supervised regression) problem.
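A sketch of that conversion (plain Python; the lag count and horizon are illustrative): slide a window over the series so that each row pairs the most recent lags with the value to predict, yielding a table any regressor can consume:

```python
def to_supervised(series, n_lags=3, horizon=1):
    # Features: the n_lags most recent values.
    # Target: the value `horizon` steps after the window.
    X, y = [], []
    for i in range(n_lags, len(series) - horizon + 1):
        X.append(series[i - n_lags:i])
        y.append(series[i + horizon - 1])
    return X, y
```

For example, [1, 2, 3, 4, 5] with two lags and a one-step horizon becomes rows ([1, 2] → 3), ([2, 3] → 4), ([3, 4] → 5).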
- Feature engineering:
  - New features are constructed according to feature type; the preprocessing part is not discussed here.
- Automatic time-series feature-engineering tool – Tsfresh (according to other write-ups, full automation is not smart: it generates too many features and may blow up memory)
  - Tsfresh official documentation
  - Tsfresh introduction in Chinese
- Other sequence-feature methods are introduced at: zhuanlan.zhihu.com/p/67832773
- Models:
  - GBDT:
    - LightGBM, Fastai (Optuna or FLAML can be used for auto-tuning)
    - Expresses business features better than NNs
  - NN:
    - Representation learning of categorical variables yields better embeddings
    - Flexible loss design
    - Multi-objective learning is more convenient than with tree models
-
2.3 DL methods
- RNN family (vanilla RNN, LSTM, GRU)
- Seq2Seq (RNN combination)
  - Uses RNN components as basic units; the encoder extracts information from the training window, the decoder produces the multi-step output.
  - Evaluation: mediocre results, high compute cost, poor stability; error analysis and model interpretation are difficult.
- WaveNet (dilated causal convolution)
  - Better parallelism than RNNs; uses one-dimensional CNNs for sequence prediction, adds residual and skip connections plus a series of gating mechanisms.
  - Evaluation: not as good as RNNs.
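A toy illustration of the dilated causal convolution idea (plain Python, single filter, no gating, residuals, or learning): each output may only see inputs at or before its own time step, spaced `dilation` apart, so no future value leaks into a prediction:

```python
def dilated_causal_conv(x, weights, dilation=1):
    # y[t] = sum_k weights[k] * x[t - k*dilation], with taps that
    # fall before the start of the series treated as zero padding.
    out = []
    for t in range(len(x)):
        s = 0.0
        for k, w in enumerate(weights):
            j = t - k * dilation
            if j >= 0:
                s += w * x[j]
        out.append(s)
    return out
```

Stacking such layers with dilations 1, 2, 4, … is what gives WaveNet a receptive field that grows exponentially with depth.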
- LSTNet
  - Evaluation: inferior to feature engineering + Fastai.
- DeepAR
  - Seq2Seq, but outputs a probability distribution.
  - Evaluation: hard to converge stably, accuracy fluctuates a lot, and results are no better than GBDT.
- N-Beats
  - Univariate forecasting with some interpretability of season and trend components.
  - Evaluation: mediocre; additional feature variables are hard to add.
- TFT
  - The DL model that can challenge tree models; behaves somewhat like a tree model and has a feature-variable selection network.
  - Evaluation: quite interesting; thanks to its quasi-tree design the results are stable, but the compute cost is still relatively large.
- Conclusion:
  - At large scale, GBDT currently beats DL.
  - DL's strengths are pre-training, transfer learning, and representation learning; generative models are hard to apply to the time-series domain.
2.4 Time-series AutoML
- Library:
  - Auto_TS
- Mainstream approaches:
  - Feature engineering + GBDT
  - A TFT-style architecture makes automated feature engineering over the dataset interface feasible.