Foreword

This article is based on the paper "Investment Behaviors Can Tell What Inside: Exploring Stock Intrinsic Properties for Stock Trend Prediction", published at KDD 2019. The paper observes that the investment behavior of professional fund managers reflects their shared beliefs about the intrinsic attributes of stocks, and uses this observation to extract latent representations of those attributes. It then uses these representations to model the dynamic state and trend of the market, derives a dynamic correlation between each stock and the market, and aggregates that correlation with dynamic stock indicators to achieve more accurate stock prediction. Instructions for obtaining the original paper are given at the end of this article.

1

Abstract

Stock trend prediction, the task of forecasting the future price trend of a stock, plays a key role in maximizing the profit of stock investment. In recent years, more and more people have applied machine learning techniques, especially deep learning, in pursuit of better stock predictions. While deep learning has made great strides, human investors still hold the lead thanks to their understanding of the intrinsic properties of stocks. This paper proposes to improve stock trend prediction by extracting and mining these intrinsic attributes. Specifically, the authors find that the investment behavior recorded in mutual fund portfolio data reflects the common beliefs of professional fund managers about the intrinsic properties of stocks, and that it can be used to extract latent representations of stock attributes for prediction. Building on the extracted attributes, the authors further propose to use the stock representations to model the dynamic market state and trend, which yields a dynamic correlation between each stock and the market; this correlation is then aggregated with dynamic stock indicators to achieve more accurate prediction. Finally, extensive experiments on real stock market data demonstrate the effectiveness of the extracted stock attributes.

To sum up, the main contributions of the paper are:

  • Based on the principle that stocks held by the same fund manager may share common attributes, the intrinsic attributes of stocks are mined from mutual fund portfolio data.
  • A new deep learning framework is developed that integrates static intrinsic stock attributes into the dynamic stock prediction task by modeling dynamic market states and trends.
  • The effectiveness of stock trend prediction based on intrinsic stock attributes and the corresponding dynamic market state is demonstrated empirically.

2

Model introduction

The overall framework of the proposed model is shown in the figure below.

Exploring the intrinsic properties of stocks from investment behavior

Inspired by the observation that fund managers differ in their preferences for the intrinsic properties of stocks, the authors propose to learn latent representations of those properties by mining the collective investment behavior of fund managers in mutual fund portfolio data. The underlying assumption is that stocks held in the portfolios of the same fund manager are more likely to share intrinsic attributes. Accordingly, the mutual fund portfolio data can be arranged as a matrix of fund managers by stocks, and matrix factorization can be used to extract a latent vector for each stock, which is taken as the representation of that stock's intrinsic properties. In this formulation, each fund manager has an overall preference vector, each stock has an intrinsic feature vector, and the investment behavior of a manager on a stock, measured by the share invested in that stock, is modeled by the inner product of the two vectors.

Matrix factorization is widely used in recommender systems, text mining, computer vision and other scenarios because it can learn latent representation vectors for interactions between two kinds of entities. In this task, given a set of known investment behaviors, the parameters, i.e. the latent representations of stocks and fund managers, can be estimated by solving an optimization problem that fits the training data.

In addition, some fund managers are in practice biased towards certain stocks as a matter of prior knowledge, so bias terms are also introduced into the estimate of a fund manager's investment behavior on a stock.

The final optimization objective fits these estimates to the observed investment behaviors, with regularization terms added to prevent the model from overfitting.
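The factorization described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: it assumes a squared-error loss, a global mean plus one bias per manager and per stock, L2 regularization, and plain stochastic gradient descent. The function names, the hyperparameters and the toy holdings are all hypothetical.

```python
import numpy as np

def fit_biased_mf(holdings, n_managers, n_stocks, dim=8, lr=0.05, reg=0.02,
                  epochs=200, seed=0):
    """Fit share_ij ~ mu + b_i + c_j + p_i . q_j by SGD with L2 regularization.

    holdings: list of (manager_index, stock_index, share) triples.
    Returns (mu, manager_bias, stock_bias, P, Q); the rows of Q are the
    latent stock vectors.
    """
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_managers, dim))  # manager preference vectors
    Q = rng.normal(scale=0.1, size=(n_stocks, dim))    # stock intrinsic vectors
    b_m = np.zeros(n_managers)
    b_s = np.zeros(n_stocks)
    mu = float(np.mean([share for _, _, share in holdings]))
    for _ in range(epochs):
        for idx in rng.permutation(len(holdings)):
            i, j, share = holdings[idx]
            err = share - (mu + b_m[i] + b_s[j] + P[i] @ Q[j])
            # Gradient steps on the squared error plus the L2 penalty.
            b_m[i] += lr * (err - reg * b_m[i])
            b_s[j] += lr * (err - reg * b_s[j])
            p_old = P[i].copy()
            P[i] += lr * (err * Q[j] - reg * P[i])
            Q[j] += lr * (err * p_old - reg * Q[j])
    return mu, b_m, b_s, P, Q

def predict(model, i, j):
    """Estimated share that manager i invests in stock j."""
    mu, b_m, b_s, P, Q = model
    return mu + b_m[i] + b_s[j] + P[i] @ Q[j]

# Toy data: three managers, three stocks, six observed holdings.
holdings = [(0, 0, 0.2), (0, 1, 0.3), (1, 1, 0.4),
            (1, 2, 0.5), (2, 2, 0.6), (2, 0, 0.4)]
model = fit_biased_mf(holdings, n_managers=3, n_stocks=3)
stock_vectors = model[4]  # one row per stock
```

The rows of `stock_vectors` play the role of the intrinsic stock representations used in the rest of the article.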

It is worth noting that a fund manager's investment behavior depends not only on the intrinsic properties of stocks but also on their dynamic trends: no fund manager wants to invest in a stock with a clear downward trend, even if some of its attributes appeal to him. In practice, fund managers may also diversify into other stocks to reduce the risk of holding only a few. Thus, besides the managers' inherent preferences, semi-annual fund portfolios are also shaped by dynamic stock trends and by risk-averse diversification. However, investment behavior accumulated over a sufficiently long period amplifies managers' long-term preferences and dampens the effects of short-term trends and diversification. By mining mutual fund portfolio data over a sufficiently long period, the intrinsic properties of stocks can therefore be discovered reliably.

Stock prediction based on intrinsic properties

After the intrinsic properties of stocks have been extracted, the next step is to use them for prediction. Predicting the future price trend of a stock can be framed as a typical machine learning problem: either a classification task on the price trend or a regression task on the return. Each stock is mapped into a feature space and then converted to its label by a predictive function, so each stock is treated as an independent sample. In this paper the prediction label is not a rise-or-fall class; instead, the return rate serves as the target score for judging the profitability of a stock, and the objective function is defined on this return-rate target.

In addition, given the strongly time-dependent nature of the stock market, it is natural to use a stock's historical state as a factor in predicting its future trend. Most traditional methods therefore feed dynamic inputs, such as daily prices and various indicators, into time series models such as autoregressive models and Kalman filters, or into technical analysis. In recent years, with the rapid development of deep learning, deep neural networks (DNNs), especially recurrent neural networks, have been applied to stock prediction and have achieved state-of-the-art performance. Without loss of generality, the DNN approach can be abstracted as follows: the dynamic input of each stock at time t is projected into a dynamic stock representation, and predictions are then made from this higher-level representation.

While deep learning has made great strides, human investors remain ahead thanks to their understanding of the intrinsic properties of stocks. It is therefore valuable to bring stock attributes into the current forecasting framework in pursuit of more accurate predictions. A simple approach is to combine the representation of stock attributes with the dynamic representation, so that the prediction for stock j draws on both its dynamic input features and its intrinsic features.
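As a rough illustration of this simple combination, the sketch below concatenates a stock's dynamic representation with its intrinsic representation and scores the result with a linear layer. The concatenation, the linear scoring function and the mean squared error on returns are assumptions made for the example; the article does not reproduce the paper's exact formula, and the names `predict_return` and `mse_loss` are hypothetical.

```python
import numpy as np

def predict_return(dynamic_repr, intrinsic_repr, weights, bias=0.0):
    """Score one stock from the concatenation of its two representations."""
    joint = np.concatenate([dynamic_repr, intrinsic_repr])
    return float(joint @ weights + bias)

def mse_loss(predicted, realized):
    """Mean squared error between predicted and realized returns (assumed loss)."""
    predicted = np.asarray(predicted, dtype=float)
    realized = np.asarray(realized, dtype=float)
    return float(np.mean((predicted - realized) ** 2))

# Toy example: a 2-dimensional dynamic vector and a 1-dimensional intrinsic vector.
score = predict_return(np.array([1.0, 2.0]), np.array([3.0]),
                       weights=np.array([0.5, 0.25, 1.0]))
```

In a real model the dynamic representation would come from a trained network and the weights would be learned jointly with it.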

Intuitively, a market representation should reflect the market's current preferences for various stock attributes. Since the stocks with the highest returns reflect the latest market preferences, the authors propose a daily market representation built from the group of stocks with the highest returns on a given day: the market representation is the average of the representations of the top-ranked stocks. More formally, the market state at time t is computed from the representations of the stocks whose returns rank in the top K at time t.
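The top-K averaging step can be sketched as follows. The array layout (one row of `stock_reprs` per stock, aligned with `returns`) and the function name `market_state` are assumptions made for the example.

```python
import numpy as np

def market_state(stock_reprs, returns, k):
    """Average the representations of the k stocks with the highest return."""
    stock_reprs = np.asarray(stock_reprs, dtype=float)
    returns = np.asarray(returns, dtype=float)
    top_k = np.argsort(-returns)[:k]  # indices of the k largest returns
    return stock_reprs[top_k].mean(axis=0)

# Toy example: four stocks with 2-dimensional intrinsic vectors.
reprs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [3.0, 3.0]])
daily_returns = np.array([0.05, -0.01, 0.03, -0.20])
state = market_state(reprs, daily_returns, k=2)  # mean of stocks 0 and 2
```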

This process is shown in the figure below.

With this market representation, the correlation between the intrinsic attributes of each stock and the current market state can be computed.
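One plausible way to measure this correlation is an inner product, optionally normalized to a cosine similarity. The article does not reproduce the paper's formula, so the choice of similarity here is an assumption, and `market_correlation` is a hypothetical name.

```python
import numpy as np

def market_correlation(stock_reprs, state, normalize=False):
    """Score every stock by the similarity of its vector to the market state."""
    stock_reprs = np.asarray(stock_reprs, dtype=float)
    state = np.asarray(state, dtype=float)
    scores = stock_reprs @ state  # inner product per stock
    if normalize:
        norms = np.linalg.norm(stock_reprs, axis=1) * np.linalg.norm(state)
        # Stocks represented by the zero vector keep a score of 0.
        scores = scores / np.where(norms > 0, norms, 1.0)
    return scores

scores = market_correlation([[1, 0], [0, 1], [2, 0]], [1, 0])
cosines = market_correlation([[1, 0], [0, 1], [2, 0], [0, 0]], [1, 0],
                             normalize=True)
```

With the cosine variant, a stock that was never held by any fund (zero vector) simply receives a neutral score of 0.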

Making forecasts in this way assumes that market conditions stay the same over two consecutive days. Given the limitations of this assumption, it is important to model the future market trend from the history of market states rather than relying only on the previous day's state. To this end, the authors use an LSTM to model the market state dynamically.

After the stock state has been combined with this dynamic representation of the market, the prediction from time t to t+1 can be made. The full pipeline can be reviewed against the overall framework figure at the beginning of this section.
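To make the recurrent step concrete, here is a minimal NumPy sketch of a standard LSTM cell run over a sequence of daily market states. It uses random, untrained weights and is only meant to show the data flow; the parameter layout and the function names are assumptions, and a real model would be trained with an automatic differentiation framework.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm(input_dim, hidden_dim, seed=0):
    """Random (untrained) parameters; the four gates are stacked row-wise."""
    rng = np.random.default_rng(seed)
    scale = 1.0 / np.sqrt(hidden_dim)
    return {
        "W": rng.uniform(-scale, scale, size=(4 * hidden_dim, input_dim)),
        "U": rng.uniform(-scale, scale, size=(4 * hidden_dim, hidden_dim)),
        "b": np.zeros(4 * hidden_dim),
    }

def lstm_step(params, x, h, c):
    """One LSTM step: returns the new hidden state and the new cell state."""
    z = params["W"] @ x + params["U"] @ h + params["b"]
    i, f, g, o = np.split(z, 4)  # input, forget, candidate, output
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

def encode_market_history(params, market_states):
    """Run the LSTM over daily market states; return the final hidden state."""
    hidden_dim = params["U"].shape[1]
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    for x in market_states:
        h, c = lstm_step(params, x, h, c)
    return h

# Five days of 3-dimensional market states, summarized into a 4-dimensional vector.
params = init_lstm(input_dim=3, hidden_dim=4)
trend = encode_market_history(params, np.ones((5, 3)))
```

The final hidden state stands in for the modeled market trend, which would replace the single previous-day market state when scoring stocks.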

3

Experimental verification

For the stock prediction model, the authors collected daily price and trading volume time series for Chinese stocks from 2012 to 2016, more than 2,000 stocks in total, covering the vast majority of the Chinese market. To generate dynamic indicators, they computed a total of 101 trading indicators following previous studies. To extract the intrinsic attributes of stocks, they also collected the semi-annual reports of Chinese mutual fund portfolios from 2012 to 2016. The following table shows the number of funds and stocks remaining after filtering the semi-annual portfolio reports. For trend prediction, stocks that were suspended from trading for more than 2% of the trading sessions are filtered out. Stocks that have never been held by any fund are represented by the zero vector.

The evaluation indicators used are as follows:

To test whether the stock representations learned from mutual fund portfolios capture intrinsic attributes, the authors carry out a qualitative analysis: all stocks are clustered according to their learned representations. The following table shows three example clusters obtained in the second half of 2015. All stocks in the first cluster belong to basic industries, while all stocks in the second cluster are related to light industries, and most stocks in the livestock and agricultural industries fall into the third cluster. These clustering results suggest that the stock representations extracted from mutual fund portfolios do capture certain intrinsic attributes.
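A qualitative check of this kind can be reproduced with any standard clustering method. The sketch below uses plain k-means on toy vectors; the article does not say which clustering algorithm the authors used, so k-means is an assumption, and the data are made up for illustration.

```python
import numpy as np

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (labels, centers)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every center, then nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy stock vectors forming two well-separated groups.
vectors = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
labels, centers = kmeans(vectors, k=2)
```

Inspecting which industries dominate each resulting cluster is then a manual step, as in the table described above.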

The following figure shows the MAP results of the proposed model, computed for each six-month period over the Top-50, Top-100 and Top-200 stocks.

To explore the profitability of the model, the paper forms a portfolio from the 50 stocks with the highest predicted returns. The cumulative returns of the proposed method and of the comparison methods are shown in the figure below. Compared with comparison methods such as LSTM, the proposed method shows significantly stronger profitability.
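The portfolio test can be sketched as a simple backtest. The version below assumes equal weights, daily rebalancing, daily compounding and no transaction costs; none of these details are stated in the article, and the function name and toy numbers are hypothetical.

```python
import numpy as np

def topk_cumulative_return(predicted, realized, k):
    """Hold, each day, an equal-weight portfolio of the k top-scored stocks.

    predicted, realized: arrays of shape (n_days, n_stocks).
    Returns the cumulative return after each day, compounded daily.
    """
    predicted = np.asarray(predicted, dtype=float)
    realized = np.asarray(realized, dtype=float)
    top_k = np.argsort(-predicted, axis=1)[:, :k]  # best k stocks per day
    daily = np.take_along_axis(realized, top_k, axis=1).mean(axis=1)
    return np.cumprod(1.0 + daily) - 1.0

# Toy example: two days, three stocks, top-2 portfolio.
predicted = [[0.9, 0.1, 0.5], [0.2, 0.8, 0.3]]
realized = [[0.10, -0.05, 0.00], [0.02, 0.04, -0.02]]
curve = topk_cumulative_return(predicted, realized, k=2)
```

On the toy data the portfolio earns 5% on the first day and 1% on the second, so the curve ends at about 6.05%.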

4

Conclusion

To improve existing stock prediction models that rely on dynamic inputs, the intrinsic properties of stocks should be taken into account. The paper makes three contributions. First, it is the first to use the intrinsic properties of stocks to help investors choose stocks. Second, it proposes to extract those intrinsic properties from mutual fund portfolios. Third, it builds a new model that uses static stock attributes dynamically, making predictions by measuring the correlation between the market and each stock. In the future, the authors plan to look for intrinsic properties of stocks in other valuable data sources and to extend the market state model. They will also explore further aspects of fund managers' investment behavior to improve stock forecasting models. Interested readers can also refer to another paper from the same research group on a similar topic, which has also been covered on our official account under the title "Interpretation: optimization of individual Stock technical indexes is realized by Stock Embedding".

Bibliography:

[1] Chen, C., Zhao, L., Bian, J., Xing, C., & Liu, T.-Y. (2019). Investment Behaviors Can Tell What Inside: Exploring Stock Intrinsic Properties for Stock Trend Prediction. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM.

To obtain the original paper, follow the “ARTIFICIAL Intelligence Quantization Laboratory” public account and send it the message 082.
