0. Preface
Following the previous post interpreting the DIEN paper, this post discusses a recent piece of work from Alibaba: DSIN (Deep Session Interest Network). DSIN can be seen as an upgraded version of DIEN: it segments the user behavior sequence more finely, further refines the layers of the network architecture, and improves the model's ability to represent user interests.
Key takeaways:
- A user's behavior sequence can be represented as a sequence of sessions; within one session, the user's interest changes little.
- Self-attention is used to extract the user's interest within each session.
Paper: arxiv.org/pdf/1905.06…
Paper code: github.com/shenweichen…
1. Background
Both DIN and DIEN model user interest by treating each individual item in the user's behavior sequence as a separate point of interest. In real scenarios, however, a user's interest does not change much within a certain period of time (a session). DSIN is therefore proposed to introduce session-level user interest, and it builds interest extraction and interest evolution on top of sessions. The pipeline is: session division -> session interest extraction -> interest interaction -> interest activation (which brings in the item to be recommended), after which the final recommendation is produced.
2. Model architecture
The DSIN model structure is shown in the figure.
As the figure shows, the left side is the general part that also appears in other models; it covers user features, item features, and context features. The difference lies in the session-based user interest modeling on the right, which consists of the following four layers:
- Session Division Layer
- Session Interest Extractor Layer
- Session Interest Interacting Layer
- Session Interest Activating Layer
The four modules are described in detail below.
2.1 Session Division Layer
Sessions are split using a 30-minute interval: whenever two adjacent behaviors in the user behavior sequence are more than 30 minutes apart, a new session starts, so the whole sequence is divided into one session after another. The schematic diagram in the paper shows that within a session the user's interest stays essentially unchanged, so the categories of the items browsed remain the same.
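The division step can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code: the function name `split_sessions`, the `(timestamp, item_id)` tuple layout, and the sample item names are all hypothetical, and it assumes the behaviors are already sorted by time.

```python
from datetime import datetime, timedelta


def split_sessions(behaviors, gap=timedelta(minutes=30)):
    """Split time-ordered (timestamp, item_id) pairs into sessions.

    A new session starts whenever the time since the previous
    behavior is larger than `gap` (30 minutes by default).
    """
    sessions = []
    for ts, item in behaviors:
        # Compare with the timestamp of the last behavior in the last session.
        if sessions and ts - sessions[-1][-1][0] <= gap:
            sessions[-1].append((ts, item))
        else:
            sessions.append([(ts, item)])
    return sessions


# Hypothetical click log: three clicks close together, then two much later.
t0 = datetime(2019, 5, 1, 9, 0)
behaviors = [
    (t0, "trousers_a"),
    (t0 + timedelta(minutes=5), "trousers_b"),
    (t0 + timedelta(minutes=20), "trousers_c"),
    (t0 + timedelta(minutes=90), "ring_a"),
    (t0 + timedelta(minutes=100), "ring_b"),
]
sessions = split_sessions(behaviors)
session_items = [[item for _, item in s] for s in sessions]
```

Here the 70-minute pause between the third and fourth click is the only gap above 30 minutes, so the log falls into two sessions.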
2.2 Session Interest Extractor Layer
Once the user behavior sequence has been divided into sessions, how should the user's interest within a session be modeled? Each session is essentially a subsequence of the user behavior sequence, and self-attention is well suited to capturing the correlation between elements of a sequence. DSIN therefore applies multi-head self-attention to each session. To also capture the order of the sessions, DSIN introduces bias encoding, which is essentially an encoding of position information in the sequence.
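The extractor can be sketched with NumPy as follows. This is a simplified illustration under stated assumptions, not the reference implementation: the bias encoding is written as a sum of a per-session, a per-position, and a per-embedding-unit term; the weights are random stand-ins for learned parameters; the output projection, residual connection, and feed-forward sublayer of a full Transformer block are omitted; and each session is reduced to one interest vector by average pooling. All function and variable names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def bias_encoding(K, T, d, rng):
    """BE[k, t, c] = w_K[k] + w_T[t] + w_C[c] for session k, position t, unit c."""
    w_K = rng.normal(scale=0.1, size=(K, 1, 1))
    w_T = rng.normal(scale=0.1, size=(1, T, 1))
    w_C = rng.normal(scale=0.1, size=(1, 1, d))
    return w_K + w_T + w_C  # broadcasts to shape (K, T, d)


def multi_head_self_attention(Q, W_q, W_k, W_v, n_heads):
    """Self-attention over one session Q of shape (T, d); returns (T, d)."""
    T, d = Q.shape
    d_h = d // n_heads

    def heads(W):
        # Project, then split the last axis into heads: (n_heads, T, d_h).
        return (Q @ W).reshape(T, n_heads, d_h).transpose(1, 0, 2)

    q, k, v = heads(W_q), heads(W_k), heads(W_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_h)  # (n_heads, T, T)
    out = softmax(scores) @ v                          # (n_heads, T, d_h)
    return out.transpose(1, 0, 2).reshape(T, d)        # concatenate the heads


rng = np.random.default_rng(0)
K, T, d, n_heads = 3, 4, 8, 2          # sessions, items per session, embedding size
sessions = rng.normal(size=(K, T, d))  # stand-in item embeddings
sessions = sessions + bias_encoding(K, T, d, rng)
W_q, W_k, W_v = (rng.normal(scale=0.1, size=(d, d)) for _ in range(3))

# One interest vector per session: attend within the session, then average-pool.
interests = np.stack([
    multi_head_self_attention(sessions[k], W_q, W_k, W_v, n_heads).mean(axis=0)
    for k in range(K)
])
```

The same projection weights are shared across sessions in this sketch, so the only thing that distinguishes one session's position from another's is the bias encoding added to the inputs.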
2.3 Session Interest Interacting Layer
The session interest extractor layer models the user's interest within each session. Because the user's interest also changes over time, DSIN adds a session interest interacting layer to learn how interest evolves from session to session, which closely matches DIEN's idea of interest evolution modeling. Unlike DIEN, DSIN uses a bidirectional LSTM (Bi-LSTM) to model the evolution of user interest, as shown in the figure below.
The hidden state at each step combines the forward and backward LSTM states, so it reflects the user's interest in both earlier and later sessions, and the Bi-LSTM is used to learn and capture the trajectory along which user interest changes.
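A compact NumPy sketch of a bidirectional LSTM over session interest vectors is given below. It is only an illustration: the weights are random stand-ins for learned parameters, the names are hypothetical, and the two directions are combined by concatenation, which is an assumption made here for clarity (other merge modes such as summing are equally possible).

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_pass(inputs, W, U, b):
    """Run a one-directional LSTM over inputs (K, d); return hidden states (K, h)."""
    h_dim = U.shape[0]
    h = np.zeros(h_dim)
    c = np.zeros(h_dim)
    states = []
    for x in inputs:
        z = x @ W + h @ U + b          # all four gate pre-activations, shape (4h,)
        i, f, o, g = np.split(z, 4)
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c = f * c + i * g              # update the cell state
        h = o * np.tanh(c)             # new hidden state
        states.append(h)
    return np.stack(states)


def bi_lstm(inputs, fwd, bwd):
    """Concatenate forward and backward hidden states at every step."""
    h_f = lstm_pass(inputs, *fwd)
    h_b = lstm_pass(inputs[::-1], *bwd)[::-1]  # run reversed, then restore order
    return np.concatenate([h_f, h_b], axis=-1)


rng = np.random.default_rng(0)
K, d, h = 3, 8, 5  # sessions, interest size, hidden size per direction


def init_params():
    return (
        rng.normal(scale=0.1, size=(d, 4 * h)),  # input weights
        rng.normal(scale=0.1, size=(h, 4 * h)),  # recurrent weights
        np.zeros(4 * h),                         # biases
    )


fwd, bwd = init_params(), init_params()
interests = rng.normal(size=(K, d))  # stand-in session interest vectors
H = bi_lstm(interests, fwd, bwd)     # shape (K, 2 * h)
```

The forward half of `H[k]` depends only on sessions up to `k`, while the backward half depends on session `k` and everything after it, which is what lets each step see context from both sides.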
2.4 Session Interest Activating Layer
As described above, the LSTM learns the trajectory of user interest across sessions, but the recommendation result depends on how strongly the user's interest in each session influences the preference for the item currently being recommended. DSIN therefore introduces a session interest activating layer. The idea is similar to DIEN: attention is computed between the item to be recommended and each interest vector from both the session interest extractor layer and the session interest interacting layer, which measures how relevant the item is to each session's interest. The attention-weighted results are then concatenated with the item features, user features, and context features and fed into the fully connected layers to obtain the final recommendation probability.
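The activation step can be sketched as a softmax-weighted sum. This is a minimal illustration under assumptions: the score for session k is taken to be the bilinear form interests[k] · W · target, the matrices and embeddings are random stand-ins, and every name here is hypothetical.

```python
import numpy as np


def activate(interests, target, W):
    """Weight each session interest by its relevance to the target item.

    interests: (K, d), target: (d_item,), W: (d, d_item).
    Returns the weighted sum (d,) and the attention weights (K,).
    """
    scores = interests @ W @ target      # one relevance score per session
    scores = scores - scores.max()       # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ interests, weights


rng = np.random.default_rng(0)
K, d, d_item = 3, 8, 6
I = rng.normal(size=(K, d))         # stand-in outputs of the extractor layer
H = rng.normal(size=(K, d))         # stand-in outputs of the interacting layer
x_item = rng.normal(size=d_item)    # stand-in embedding of the candidate item
W_I = rng.normal(scale=0.1, size=(d, d_item))
W_H = rng.normal(scale=0.1, size=(d, d_item))

U_I, a_I = activate(I, x_item, W_I)
U_H, a_H = activate(H, x_item, W_H)

# Stand-in for the concatenated user, item, and context feature embeddings.
other_features = rng.normal(size=10)
mlp_input = np.concatenate([other_features, U_I, U_H])
```

`mlp_input` is what would be passed to the fully connected layers; sessions whose interest is more relevant to the candidate item receive a larger weight in `a_I` and `a_H`.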
3. Results
The paper compares DSIN with mainstream industry recommendation models such as YoutubeNet, Wide&Deep, DIN, and DIEN. Compared with these models, DSIN improves recommendation performance.
4. Summary
The designers of DSIN recognized that a user's interest does not change significantly within a certain period of time, and they added the concept of a session on top of DIEN to design DSIN. This shows that, for any recommendation scenario, algorithm engineers need a deep understanding of the scenario and need to reflect that understanding in their modeling ideas, so that the model is optimized in a clear direction. DSIN also has many design details worth learning from, such as adding a bias encoding for each session to account for item position information within the session. All in all, this paper is well worth a careful read for every engineer working on product recommendation algorithms.