0. Before we start
This paper has been out for a while, but as a classic approach to extracting user interest from sequential user behavior, its technical details are still worth studying.
Personal takeaways:
- User behavior can be treated as a sequential feature (an improvement over DIN)
- GRU/LSTM hidden states capture the interest underlying user behavior
- Attention captures the correlation between the user's interest at each time step and the candidate item, yielding the user's latest interest with respect to that item
Original paper:
Arxiv.org/pdf/1809.03…
Original code:
Github.com/mouna99/die…
1. Background
In product recommendation, users' interests change over time. However, mainstream industrial recommendation algorithms generally treat the user's raw behavior directly as interest and lack a dedicated way to model interest itself.
To model user interest explicitly, Alibaba proposed DIEN (Deep Interest Evolution Network). Building on DIN (Deep Interest Network), the paper makes two main contributions:
- It introduces and models the user's interest sequence
- It models how the user's interest changes over time
Correspondingly, the paper embeds the following two layers into the model:
- The interest extractor layer, which derives an abstract representation of user interest from the user's behavior sequence
- The interest evolving layer, which models how the user's interest changes
The final recommendation is obtained by computing the correlation between the candidate item and the user's recent interest, combined with the user profile and context features.
2. Model architecture
This section describes DIEN's model architecture in detail, focusing on the two key components: the interest extractor layer and the interest evolving layer.
2.1 Overall architecture
DIEN's overall structure is similar to DIN's; the main change is how the user behavior sequence is handled. To model user interest, DIEN organizes the processing of the behavior sequence into three layers: the Behavior Layer, the Interest Extractor Layer, and the Interest Evolving Layer. The behavior layer is the input layer, which feeds in the sequence of the user's interactions with items; the interest extractor layer processes this sequence and models user interest; the interest evolving layer extracts the user's recent interest that is most relevant to the candidate item.
2.2 Interest extractor layer
Since the behavior layer is simply a sequence of the user's actions, it is natural to apply sequence models from natural language processing to it and to use the model's internal hidden state to represent the user's interest.
Here the paper uses a GRU to process the user behavior embeddings. Replacing the GRU with an LSTM or a vanilla RNN would also work, since the task is essentially modeling and predicting sequence data. Compared with an LSTM, a GRU has a simpler structure, fewer parameters, and converges faster; compared with a vanilla RNN, a GRU handles longer sequences and has better memory. The GRU update over the behavior sequence goes as follows.
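Below is a minimal NumPy sketch of one GRU step over the behavior embeddings, following the paper's Equations (2)–(5); the dimensions, weight names, and random parameters are purely illustrative and are not taken from the released code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(i_t, h_prev, W, U, b):
    """One GRU step: i_t is the embedded behavior at step t, h_prev is the
    previous hidden state (the user's interest so far)."""
    u_t = sigmoid(W["u"] @ i_t + U["u"] @ h_prev + b["u"])              # update gate, Eq. (2)
    r_t = sigmoid(W["r"] @ i_t + U["r"] @ h_prev + b["r"])              # reset gate, Eq. (3)
    h_tilde = np.tanh(W["h"] @ i_t + r_t * (U["h"] @ h_prev) + b["h"])  # candidate state, Eq. (4)
    return (1.0 - u_t) * h_prev + u_t * h_tilde                         # new hidden state h_t, Eq. (5)

# Illustrative dimensions and random parameters
rng = np.random.default_rng(0)
emb_dim, hid_dim, T = 8, 4, 5
W = {k: rng.normal(size=(hid_dim, emb_dim)) for k in ("u", "r", "h")}
U = {k: rng.normal(size=(hid_dim, hid_dim)) for k in ("u", "r", "h")}
b = {k: np.zeros(hid_dim) for k in ("u", "r", "h")}

behaviors = rng.normal(size=(T, emb_dim))   # embedded user behavior sequence
h = np.zeros(hid_dim)
interests = []                              # h_t at each step = the user's interest state
for i_t in behaviors:
    h = gru_step(i_t, h, W, U, b)
    interests.append(h)
```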
This is just the standard GRU computation: the gate units summarize the user's history, absorb the current behavior, and produce the hidden state. That hidden state can be read as the user's current interest, i.e. h_t in Equation (5) of the paper.
2.3 Interest evolving layer
The interest evolving layer takes as input the per-step interest states computed by the extractor layer, measures how relevant each of them is to the candidate item, and from that derives how the user's interest toward the item has evolved and what the user's latest interest relevant to that item is. Concretely, the paper applies an attention mechanism over the GRU hidden states to compute the weight a_t of each step's interest with respect to the candidate item.
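The attention weight of each step is a bilinear score between that step's interest state and the candidate item's embedding, normalized with a softmax over the whole sequence (the paper's Equation (6)). A minimal sketch, continuing the variables from the GRU sketch above; `ad_dim` and the projection matrix `W_att` are illustrative:

```python
def attention_weights(hidden_states, e_a, W_att):
    """a_t = exp(h_t W e_a) / sum_j exp(h_j W e_a): relevance of each
    interest state h_t to the candidate item embedding e_a (Eq. (6))."""
    scores = hidden_states @ W_att @ e_a     # (T,) bilinear relevance scores
    scores -= scores.max()                   # numerical stability for the softmax
    weights = np.exp(scores)
    return weights / weights.sum()           # weights in (0, 1) that sum to 1

# Continuing the GRU sketch: stack the per-step interest states h_t
hidden_states = np.stack(interests)          # (T, hid_dim)
ad_dim = 6                                   # illustrative candidate-item embedding size
e_a = rng.normal(size=ad_dim)                # embedding of the candidate item
W_att = rng.normal(size=(hid_dim, ad_dim))   # learned projection matrix
a = attention_weights(hidden_states, e_a, W_att)   # a[t]: weight of interest at step t
```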
In this attention computation, e_a is the embedding of the candidate item. It is matched against the user's interest h_t at each step (the output of the interest extractor layer), and the scores are softmax-normalized into the range 0–1, giving the weight of each step's interest.
With an attention weight over the user's interest at every step, how do we turn those per-step interests into the user's current interest with respect to the candidate item? For this the paper proposes AUGRU (GRU with Attentional Update gate).
AUGRU modifies the GRU update mechanism by injecting attention: the update gate u_t is multiplied by the attention weight a_t, and this scaled gate is then used to compute the hidden state at the current step.
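A minimal sketch of one AUGRU step, again continuing the variables above; the only change from the plain GRU step is that the update gate is scaled by the attention weight a_t (the paper's Equations (7) and (8)), and the parameters here are again illustrative:

```python
def augru_step(i_t, h_prev, a_t, W, U, b):
    """GRU with Attentional Update gate: the scalar attention weight a_t
    scales the update gate, so irrelevant steps barely move the state."""
    u_t = sigmoid(W["u"] @ i_t + U["u"] @ h_prev + b["u"])
    r_t = sigmoid(W["r"] @ i_t + U["r"] @ h_prev + b["r"])
    h_tilde = np.tanh(W["h"] @ i_t + r_t * (U["h"] @ h_prev) + b["h"])
    u_hat = a_t * u_t                                    # Eq. (7): attention scales the update gate
    return (1.0 - u_hat) * h_prev + u_hat * h_tilde      # Eq. (8): evolved interest state

# The evolving layer runs over the extractor layer's interest states, so the
# input dimension equals hid_dim here (parameters below are illustrative).
W2 = {k: rng.normal(size=(hid_dim, hid_dim)) for k in ("u", "r", "h")}
U2 = {k: rng.normal(size=(hid_dim, hid_dim)) for k in ("u", "r", "h")}
b2 = {k: np.zeros(hid_dim) for k in ("u", "r", "h")}

h_final = np.zeros(hid_dim)
for i_t, a_t in zip(hidden_states, a):
    h_final = augru_step(i_t, h_final, a_t, W2, U2, b2)
# h_final: the user's current interest with respect to the candidate item
```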
2.4 Inference
DIEN's inference layer is basically the same as DIN's and follows the current mainstream recommendation-network design: the output of the interest evolving layer (the user's current interest toward the candidate item), the candidate item's embedding, the user profile features, and the context features are concatenated and fed into fully connected layers to produce the final prediction.
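A rough sketch of the prediction head, continuing the variables above; the user-profile and context vectors, the layer sizes, and the plain ReLU are illustrative stand-ins (the paper itself uses PReLU/Dice-style activations in its MLP):

```python
def predict_ctr(h_final, item_emb, user_profile, context, params):
    """Concatenate [evolved interest, candidate item, user profile, context]
    and run a small fully connected network ending in a sigmoid."""
    x = np.concatenate([h_final, item_emb, user_profile, context])
    hidden = np.maximum(params["W1"] @ x + params["b1"], 0.0)   # ReLU hidden layer
    logit = params["w2"] @ hidden + params["b2"]
    return sigmoid(logit)                                       # predicted click probability

# Illustrative feature vectors and parameters
user_profile = rng.normal(size=3)       # stand-in for user profile features
context = rng.normal(size=2)            # stand-in for context features
in_dim = hid_dim + ad_dim + 3 + 2
params = {
    "W1": rng.normal(size=(16, in_dim)), "b1": np.zeros(16),
    "w2": rng.normal(size=16),           "b2": 0.0,
}
p_click = predict_ctr(h_final, e_a, user_profile, context, params)
```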
As for the optimizer, the paper does nothing special; a common choice such as SGD or Adam is fine.
3. Results
Since the work got published, its reported results are naturally better than the mainstream models of the time, so I will not repeat them here. In industry, however, one usually has to weigh the improvement against the maintenance cost: DIEN does bring a measurable improvement over the classic Wide & Deep, but its model complexity is also much higher, so the choice should depend on the situation.
4. Summary
This post introduced the DIEN model and focused on its key design points: the interest extractor layer and the interest evolving layer. The interest extractor layer uses an RNN (a GRU here) to model the user behavior sequence and extract user interest. The interest evolving layer embeds an attention mechanism into the GRU via AUGRU, computing the weight of the user's interest toward the candidate item at each historical step and thereby obtaining the user's current interest in that item.