Paper Explained: Technical Indicator Optimization with Stock Embedding

Preface

This article summarizes "Individualized Indicator for All: Stock-wise Technical Indicator Optimization with Stock Embedding", published at KDD 2019. Because technical indicators are not equally effective for every stock, the paper proposes a framework for optimizing technical indicators stock by stock. It learns a stock embedding that captures the inherent characteristics of each stock. Experiments verify that the learned embeddings are effective and that the optimized indicators achieve good results. The full reference is given at the end of this article.

Abstract

Technical analysis is one of the most important investment methods in quantitative trading. It attempts to predict stock movements by analyzing the historical price and volume data of financial assets. To cope with the low signal-to-noise ratio and high uncertainty of financial markets, technical analysis typically relies on technical trading indicators: mathematical summaries of historical price and volume data that can form the basis of robust and profitable investment strategies. However, the same technical indicator is observed to be more effective for some stocks than for others, depending on the attributes of the individual stock. This makes indicator-driven stock selection and investment a significant challenge.

To address this problem, the paper designs a technical trading indicator optimization (TTIO) framework that optimizes the original technical indicators using the attributes of individual stocks. To obtain an effective representation of stock attributes, it learns stock embeddings with skip-gram, the model commonly used in word2vec. Stocks with similar attributes are identified from the portfolios chosen by fund managers. Based on the learned stock representations, TTIO then learns a re-scaling network to improve the performance of the indicators. Finally, extensive experiments on real stock market data show that the proposed method obtains effective stock representations for technical indicator optimization, and that the optimized indicators produce more accurate investment signals.

In summary, the main contributions of the paper are:

1. An indicator optimization model is proposed that improves indicator performance by incorporating the attributes of individual stocks.
2. To represent stocks with different attributes, a stock embedding method based on the collective behavior of fund managers is proposed.
3. Experiments are carried out on real stock data, and metrics from real investment strategies are used to evaluate the effectiveness of the new indicator optimization method.

Model overview

As noted earlier, stocks with different attributes respond differently to the same indicator. This motivates adjusting existing technical indicators for each stock according to its attributes. The authors therefore first propose a stock embedding method to represent stock attributes, and then optimize the technical indicators based on the learned embedding of each stock. The overall framework thus consists of two parts: stock embedding and technical indicator optimization.

Stock Embedding

The goal of stock embedding is to obtain an efficient representation that reflects the attributes of each stock, following the rule that stocks with similar attributes should have similar representations. One straightforward way to do this is manual tagging by human experts, but this is unrealistic given the efficiency and robustness required. The paper therefore approaches the problem from a data mining perspective. Specifically, it mines latent stock representations from the historical portfolios of a group of fund managers in the market, on the assumption that stocks held in the same fund manager's portfolio are likely to share similar attributes (this assumption raises problems of its own, which are left for a later discussion). The stock embedding procedure is as follows:

1. A fund usually holds a range of stocks, and each fund manager has their own preferences and expertise in selecting stocks. For example, some experienced fund managers may prefer to stay invested in stocks with relatively stable price series. Stocks that tend to be held by the same group of fund managers are therefore more likely to be similar in nature, so stocks held by the same fund should learn similar embeddings.

2. From the relationship between funds and the stocks they hold, a bipartite graph can be built: one set of nodes represents the stocks, the other set represents the funds that hold them, and the edges represent the investment relationships between funds and stocks. The original article illustrates this graph with a figure.

3. Once the fund-stock bipartite graph is built, a random walk algorithm is used to generate sampled sequences. The probability of moving from a stock node to a fund node is defined by the share of the stock in the fund.

The probability of moving from a fund node to a stock node is defined in a similar way.

Finally, the fund nodes are removed from the generated sequences, which leaves many sampled sequences made up of stock nodes only.

4. Given the sampled sequences of stock nodes, the skip-gram algorithm learns a feature representation for each node that maximizes the conditional probability of its neighboring nodes given that representation. Here, the neighbors of a node are the nodes adjacent to it in the sampled sequences. Using the skip-gram architecture, a neural network is trained to predict the probability that each node appears among the neighbors of a target node. The embedding of each stock is then read from the hidden layer of the trained network.
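Steps 2 to 4 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the holdings data is made up, the transition probabilities are assumed to be proportional to the share of each stock in a fund, and the names `HOLDINGS`, `build_graph`, `weighted_step`, `stock_walk` and `skipgram_pairs` are hypothetical. The sketch stops at the (center, context) pairs that a skip-gram model would be trained on.

```python
import random
from collections import defaultdict

# Hypothetical holdings: fund -> {stock: share of the fund}. Illustrative data only.
HOLDINGS = {
    "fund_A": {"S1": 0.5, "S2": 0.3, "S3": 0.2},
    "fund_B": {"S2": 0.6, "S3": 0.4},
    "fund_C": {"S4": 0.7, "S5": 0.3},
}


def build_graph(holdings):
    """Return both directions of the fund-stock bipartite graph."""
    stock_to_fund = defaultdict(dict)
    for fund, shares in holdings.items():
        for stock, share in shares.items():
            stock_to_fund[stock][fund] = share
    return holdings, dict(stock_to_fund)


def weighted_step(neighbors, rng):
    """Pick a neighbor with probability proportional to its edge weight (assumed)."""
    nodes = list(neighbors)
    return rng.choices(nodes, weights=[neighbors[n] for n in nodes], k=1)[0]


def stock_walk(start, length, fund_to_stock, stock_to_fund, rng):
    """Walk stock -> fund -> stock ..., recording only the stock nodes."""
    walk = [start]
    while len(walk) < length:
        fund = weighted_step(stock_to_fund[walk[-1]], rng)
        walk.append(weighted_step(fund_to_stock[fund], rng))
    return walk


def skipgram_pairs(walk, window):
    """List (center, context) pairs for every position within the window."""
    pairs = []
    for i, center in enumerate(walk):
        lo, hi = max(0, i - window), min(len(walk), i + window + 1)
        pairs.extend((center, walk[j]) for j in range(lo, hi) if j != i)
    return pairs


rng = random.Random(0)
fund_to_stock, stock_to_fund = build_graph(HOLDINGS)
walks = [stock_walk(s, 6, fund_to_stock, stock_to_fund, rng)
         for s in stock_to_fund for _ in range(10)]
pairs = [p for walk in walks for p in skipgram_pairs(walk, window=2)]
print(walks[0], len(pairs))
```

In this toy data, S4 and S5 are held only by fund_C, so walks that start there never reach the other stocks. That separation is the kind of signal the embedding is meant to pick up once the pairs are fed to a skip-gram model.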

Technical Trading Indicator Optimization Model

With an embedding for each stock in hand, the next step is to optimize the technical indicators. To preserve the properties of the original indicator as far as possible, the paper proposes a single-layer re-scaling network, in which the new indicator is simply the original indicator after re-scaling. The simple single-layer design is chosen so that stocks with similar embeddings receive similar scaling scores; a model with many layers would be highly nonlinear and could not guarantee this property. The optimization model is designed as follows:

1. First, a re-scaling network is proposed. It takes the stock embeddings as input and learns a re-scaled version of each technical indicator for each stock. The network has two parts. The first is the scaling weight: a simple network learns, for each technical indicator, a scaling weight from each stock's embedding.

The second part keeps the scaled weights within a fixed range by normalizing the weights across all stocks with a softmax operation.

2. Once the scaling weight for each stock embedding is obtained, the final optimized technical indicator is computed by multiplying the original technical indicator by the weight coefficient.

The overall model takes the Information Coefficient (IC) as its optimization objective and learns its parameters by gradient descent.

3. Finally, considering the dynamic nature of investment, the authors propose a Rotation Learning mechanism to adjust the model parameters over time. Rotation Learning is a form of online learning: the model is updated as sequential data arrives and predicts each upcoming step, rather than fitting a single best predictor in one batch over all of the training data. The original article presents the algorithm as a flow chart.
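The three steps above can be illustrated with a minimal forward-pass sketch. It rests on several assumptions that do not come from the paper: the layer is a plain linear map with hypothetical parameters `w` and `b` fixed by hand rather than learned by gradient descent, IC is taken to be the Pearson correlation between indicator values and next-period returns across stocks, and `rolling_windows` is only one possible reading of a Rotation Learning schedule. All data is made up.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rescale(embeddings, raw_indicator, w, b):
    """One linear layer per stock embedding, softmax across stocks,
    then multiply each raw indicator value by its stock's weight."""
    scores = [sum(wi * ei for wi, ei in zip(w, emb)) + b for emb in embeddings]
    weights = softmax(scores)
    return [wt * x for wt, x in zip(weights, raw_indicator)], weights


def information_coefficient(indicator, future_returns):
    """Pearson correlation across stocks (assumed definition of IC)."""
    n = len(indicator)
    mx, my = sum(indicator) / n, sum(future_returns) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(indicator, future_returns))
    var_x = sum((x - mx) ** 2 for x in indicator)
    var_y = sum((y - my) ** 2 for y in future_returns)
    return cov / math.sqrt(var_x * var_y)


def rolling_windows(n_periods, train_len, step):
    """Yield (train, test) index ranges that slide forward in time."""
    start = 0
    while start + train_len + step <= n_periods:
        split = start + train_len
        yield range(start, split), range(split, split + step)
        start += step


# Hypothetical toy data: 4 stocks with 2-dimensional embeddings.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
raw = [0.2, -0.1, 0.4, 0.3]
future_returns = [0.01, -0.02, 0.03, 0.00]

scaled, weights = rescale(embeddings, raw, w=[1.0, -1.0], b=0.0)
print(information_coefficient(raw, future_returns))
print(information_coefficient(scaled, future_returns))
print(list(rolling_windows(n_periods=10, train_len=6, step=2)))
```

Because there is only one linear layer, the first two stocks, whose embeddings are close, receive close weights. This is the property the single-layer design is meant to protect.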

Experimental verification

Data set: The experiments use seven trading indicators for more than 2,000 Chinese stocks from 2013 to 2016. The fund portfolio data also covers 2013 to 2016. The trading indicators involved are listed in a table in the original article.

To evaluate the model, the paper compares the proposed method with four baselines: Raw (the original trading indicators used directly), Norm (standardized trading indicators), NoEmb (trading indicators re-scaled directly, without stock embeddings), and Complex (re-scaled indicators produced from the embeddings by a two-layer neural network). Based on the indicators from these five schemes, multi-factor and single-factor stock selection strategies are built and compared on information coefficient and return. The information coefficient results are reported in a figure in the original article.

The return results are shown in a figure in the original article: the proposed approach (red line) outperforms the baseline models in both the multi-factor and single-factor strategies.
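To make the return comparison concrete, here is a minimal sketch of a single-factor stock selection strategy. The summary above does not say how the strategies are constructed, so the rule used here is an assumption: in each period, buy the `k` stocks with the highest indicator value in equal weights. The function names and all numbers are hypothetical.

```python
def top_k_return(indicator, next_returns, k):
    """Equal-weight return of the k stocks with the highest indicator values."""
    ranked = sorted(range(len(indicator)), key=lambda i: indicator[i], reverse=True)
    return sum(next_returns[i] for i in ranked[:k]) / k


def cumulative_return(period_returns):
    """Compound a list of per-period returns into one cumulative return."""
    growth = 1.0
    for r in period_returns:
        growth *= 1.0 + r
    return growth - 1.0


# Hypothetical data: 2 rebalancing periods, 4 stocks.
indicator_by_period = [[0.3, 0.1, 0.5, 0.2], [0.4, 0.6, 0.1, 0.2]]
returns_by_period = [[0.02, -0.01, 0.03, 0.00], [0.01, 0.04, -0.02, 0.00]]

period_returns = [top_k_return(ind, ret, k=2)
                  for ind, ret in zip(indicator_by_period, returns_by_period)]
print(period_returns, cumulative_return(period_returns))
```

Running the same loop once per scheme (Raw, Norm, NoEmb, Complex and the proposed method) would give one return curve per scheme, which is the kind of comparison the figure in the original article shows.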

Conclusion

The authors propose a general, interpretable framework for optimizing technical indicators using tacit knowledge mined from external sources. To account for the differing effectiveness of indicators across stocks, the paper mines knowledge from the collective behavior of experienced investors and learns stock embeddings from a data mining perspective. The authors then propose a carefully designed re-scaling network that preserves the original properties of the indicator and assigns similar re-scaling weights to stocks with similar embeddings. However, the indicators generated by the proposed model do not capture differences over time, so the paper adapts them to the real world only roughly, through the Rotation Learning method. Dynamic optimization of technical indicators is left to future work.

Reference:

Zhige Li, Derek Yang, Li Zhao, Jiang Bian, Tao Qin, and Tie-Yan Liu. 2019. Individualized Indicator for All: Stock-wise Technical Indicator Optimization with Stock Embedding. In The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '19), August 4-8, 2019, Anchorage, AK, USA. ACM, New York, NY, USA, 9 pages.
