E-commerce websites are one of the most important application areas for personalized recommendation systems. Amazon is an active user and promoter of personalized recommendation: its recommendation system reaches every category of goods on the site and brings in at least 30% of Amazon's sales. Recommendation systems are not limited to e-commerce; they are everywhere: friend recommendations on QQ and Renren; "people you might be interested in" on Sina Weibo; movie recommendations on Youku and Tudou; book recommendations on Douban; restaurant recommendations on Dazhong Dianping; dating recommendations on Shiji Jiayuan; and career recommendations on Tianji.

Classification of recommendation algorithms:

Classification by data use:

  • Collaborative filtering: UserCF, ItemCF, ModelCF
  • Content-based recommendation: based on user content attributes and item content attributes
  • Social filtering: based on users' social network relationships

1. Introduction to the collaborative filtering algorithm

Collaborative filtering starts from users' historical preferences for items, mines the correlations between items or between users, and makes recommendations based on those correlations.

There are two main collaborative filtering algorithms:

  • 1. User-based collaborative filtering algorithm (UserCF)

    UserCF measures how similar users are to each other based on how they rate items, and then recommends to each user the items purchased by similar users.

    Simply put: recommend to a user the items bought by other users with similar interests.

    In the illustrated example, users A and C have similar purchasing interests, so C is a similar user to A and the items purchased by C are recommended to A. This is a user-based recommendation.

  • 2. Item-based collaborative filtering algorithm (ItemCF)

    ItemCF evaluates the similarity between items from users' ratings of different items, and makes recommendations based on that similarity. To put it simply: recommend to the user items that are similar to the items they previously liked. Two factors determine an item's recommendation score:

    • A. The probability of co-occurrence between items
    • B. The user's liking score for the item

    For example, if the user has rated an item as disliked, it will not be recommended even if it co-occurs very often with the user's favorite items.

    In the illustrated example, based on users' purchases, items A and C have the highest probability of co-occurrence, so user C is recommended item C, which is similar to item A.
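To make the user-based variant (UserCF, item 1 above) concrete, here is a minimal sketch. The purchase data, the function names, and the choice of cosine similarity over purchase sets are all assumptions made for illustration; the article does not specify a similarity measure.

```python
from math import sqrt

# Hypothetical purchase history: user -> set of purchased items.
purchases = {
    "A": {"item1", "item3"},
    "B": {"item2"},
    "C": {"item1", "item3", "item4"},
}


def user_similarity(items_u, items_v):
    """Cosine similarity between two users' purchase sets (an assumed measure)."""
    if not items_u or not items_v:
        return 0.0
    return len(items_u & items_v) / sqrt(len(items_u) * len(items_v))


def recommend_user_cf(target, purchases):
    """Score items the target has not bought by the similarity of their buyers."""
    scores = {}
    for other, items in purchases.items():
        if other == target:
            continue
        sim = user_similarity(purchases[target], items)
        if sim == 0.0:
            continue  # users with nothing in common contribute nothing
        for item in items - purchases[target]:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


recommendations = recommend_user_cf("A", purchases)
```

With this toy data, users A and C share two purchases, so the only item C bought that A has not (`item4`) is the one recommended to A.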

Steps:

  • 1. Compute the two-dimensional co-occurrence matrix of all items. It ignores who the users are and only counts the number of times each pair of items co-occurs across all users' shopping baskets.
  • 2. Build a one-dimensional vector of each user's ratings of all items.
  • 3. Multiply the matrix by the vector to obtain a one-dimensional vector: the user's recommendation vector, with one score per item.
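The three steps above can be sketched in plain Python as follows. The ratings, user names, and item names are hypothetical, and counting an item as co-occurring with itself (the diagonal of the matrix) is an assumption of this sketch rather than something the article states.

```python
from itertools import product

# Hypothetical ratings: user -> {item: rating}.
ratings = {
    "u1": {"A": 5.0, "B": 3.0},
    "u2": {"A": 4.0, "C": 2.0},
    "u3": {"A": 3.0, "B": 4.0, "C": 5.0},
}

items = sorted({item for basket in ratings.values() for item in basket})

# Step 1: co-occurrence matrix. Users are ignored; we only count how many
# baskets contain both item i and item j (diagonal included by assumption).
cooccurrence = {i: {j: 0 for j in items} for i in items}
for basket in ratings.values():
    for i, j in product(basket, repeat=2):
        cooccurrence[i][j] += 1

# Step 2: preference vector of one user over all items (0 if unrated).
user = "u1"
preference = {item: ratings[user].get(item, 0.0) for item in items}

# Step 3: matrix-vector product gives one recommendation score per item.
recommendation_vector = {
    i: sum(cooccurrence[i][j] * preference[j] for j in items) for i in items
}
```

For `u1` this yields a score of 21.0 for A, 16.0 for B, and 13.0 for C; since `u1` has already rated A and B, only C is a candidate for recommendation.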

2. Tmall data

item_id,user_id,action,vtime
i161,u2625,click,2014/9/18 15:03
i161,u2626,click,2014/9/23 22:40
i161,u2627,click,2014/9/25 19:09
i161,u2628,click,2014/9/28 21:35
i161,u2629,click,2014/9/27 16:33
i161,u2630,click,2014/9/5 18:45
i161,u2631,click,2014/9/29 16:57
i161,u2632,click,2014/9/24 21:58
i161,u2633,click,2014/9/25 22:41
i161,u2634,click,2014/9/16 13:30
i161,u2635,click,2014/9/20 9:23
i161,u2636,click,2014/9/21 1:00
i161,u2637,click,2014/9/24 22:51
i161,u2638,click,2014/9/27 22:40
i161,u2639,click,2014/9/20 10:25
i161,u2640,click,2014/9/28 20:36
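Records in this format can be loaded into per-user preferences with the standard library. The sketch below is only one possible approach: it embeds three of the rows shown above, and it assumes that the number of actions a user performs on an item can stand in for a preference score, which the article does not specify. The function names are hypothetical.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# A few of the rows shown above, embedded so the example is self-contained.
sample = """item_id,user_id,action,vtime
i161,u2625,click,2014/9/18 15:03
i161,u2626,click,2014/9/23 22:40
i161,u2635,click,2014/9/20 9:23
"""


def load_actions(text):
    """Parse the CSV text into dict rows, converting vtime to a datetime."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["vtime"] = datetime.strptime(row["vtime"], "%Y/%m/%d %H:%M")
        rows.append(row)
    return rows


def count_preferences(rows):
    """Assumed scoring: preference = number of actions by a user on an item."""
    prefs = defaultdict(lambda: defaultdict(int))
    for row in rows:
        prefs[row["user_id"]][row["item_id"]] += 1
    return prefs


rows = load_actions(sample)
prefs = count_preferences(rows)
```

The resulting `prefs` mapping (user to item to count) has the same shape as the rating data that the item-based algorithm consumes.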

3. Item-based collaborative filtering algorithm flow

3.1 Co-occurrence matrix, user rating vector and recommendation vector

The recommendation vector is obtained by multiplying the co-occurrence matrix by the user preference vector.

Why does combining these two things implement item-based collaborative filtering? The item-based character (as opposed to user-based) lies in the co-occurrence matrix: all users' rating records for items are gathered to build a co-occurrence matrix that reflects the degree of association between items.

Why does multiplying the co-occurrence matrix by the user preference vector yield a recommendation vector? Take the third entry of R, 24.5, denoted R3: it is the recommendation score of item 103 for user U. Understanding how this one value is computed is the key to understanding the whole algorithm.


The calculation of R3, that is R103, can be written as follows:

$$R_{103} = \sum_{i} C_{103,i}\, U_i$$

Here Ui is the user's liking score for item i, and C103i is the number of times item i and item 103 appear together. The more often item i and item 103 co-occur, the larger C103i; the more the user likes item i, the larger Ui. The larger these terms are, the larger R103 becomes and the more item 103 deserves to be recommended.

R101, R104, R105 and R107 in the R vector are very large, but they can be ignored: the user has already bought these items, so there is no point recommending them again. It is enough to recommend the highest-scoring item (or the top N items) among those the user has not bought. Among R102, R103 and R106, the largest is R103, so item 103 is the one to recommend.
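This final filtering step can be sketched as below. Only the score 24.5 for item 103 comes from the text; every other score is a made-up placeholder chosen to match the ordering described above, and `top_n_unpurchased` is a hypothetical helper name.

```python
import heapq

# Recommendation scores per item. Only 103 -> 24.5 is taken from the text;
# the other values are illustrative placeholders.
scores = {101: 40.0, 102: 18.5, 103: 24.5, 104: 38.0, 105: 26.0, 106: 16.5, 107: 15.5}

# Items the user has already bought, as described in the text.
purchased = {101, 104, 105, 107}


def top_n_unpurchased(scores, purchased, n=1):
    """Return the n highest-scoring items the user has not bought yet."""
    candidates = {item: s for item, s in scores.items() if item not in purchased}
    return heapq.nlargest(n, candidates.items(), key=lambda kv: kv[1])


best = top_n_unpurchased(scores, purchased, n=1)
```

Here `best` contains item 103 with its score of 24.5; raising `n` returns further unpurchased items in descending score order.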