Simple recommendation models for a simple recommendation system
Recommendation system
Let’s begin with the most basic question: what is a recommendation system? We can answer it from three angles:
- What it can do
- What it needs
- How it does it
Let’s start with the first question: What can a recommendation system do?
A recommendation system can find out, in advance, the connections that will eventually be made between users and items.
The key phrase here is “in advance”: from a huge network, the recommendation system discovers the connections between people and items ahead of time and helps those connections form as early as possible.
The second question is what does it require? Finding connections between people and objects requires that enough connections already exist for the recommendation system to predict future connections.
And the third question, how? There are many ways, and this series will focus on machine learning.
Now that we know what a recommendation system is, let’s look at the next question: When do we need a recommendation system?
The recommendation system is introduced from three aspects above. If we need to predict connections, and there are enough connections, then the recommendation system is necessary.
One thing that is hard to quantify is how to decide whether there are enough connections. First we need to know which factors affect the connections between people and items:
- First, the number of items itself. If the product has only a few items, few enough to be handled by hand, users will generate few connections, because the number of connections is bottlenecked by the number of items. It is not worth building a recommendation system at this stage.
- Second, the numbers of users and items may both be fairly large while the connections between them remain very few, which shows up as low user retention and few return visits. A recommendation system is not needed at this stage either. The priority is to find out why users are leaving, until they contribute the first batch of connections.
We have a simple formula to determine whether a recommendation system is needed:
ratio = (increase in connections) / ((increase in active users) * (increase in active items))
The numerator is the increase in the number of connections, and the denominator is the product of the increase in active users and the increase in active items.
- If the increase in connections is mainly contributed by the increase in active users and items, the value will be small, and the product is not yet a good fit for a recommendation system.
- If the increase in connections has little to do with new active users and items, then connections tend to grow spontaneously, and it is appropriate to add a recommendation system to speed up this process.
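As a rough illustration of the ratio described above, here is a minimal Python sketch. It assumes the denominator is the product of the two increments, and the function name, argument names, and the handling of a zero denominator are all choices made for this example rather than anything prescribed by the article.

```python
def connection_growth_ratio(new_connections, new_active_users, new_active_items):
    """Increase in connections divided by (new active users * new active items).

    Assumption: the denominator is a product of the two increments.
    """
    denominator = new_active_users * new_active_items
    if denominator == 0:
        # No new users or items: any new connections arose spontaneously.
        return float("inf") if new_connections > 0 else 0.0
    return new_connections / denominator


# Growth explained by new users and items: small ratio.
print(connection_growth_ratio(100, 10, 10))
# Growth far beyond what new users and items explain: large ratio.
print(connection_growth_ratio(5000, 10, 10))
```

There is no universal threshold here; the ratio is only meaningful when compared across time periods of the same product.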
Let’s start with the simplest recommendation system: a recommendation model based on popularity.
Popularity model
The most basic recommendation model in a recommendation system is the one based on popularity. Put simply, it recommends to users whatever content is currently attracting the most attention.
One measure of content here is popularity, and there are two factors that affect the popularity of an item:
- Time: During commuting hours, people are more likely to open the headlines than at other times. Naturally, articles attract more attention then, but this does not mean their quality is higher.
- Position: This “position” is not a geographical location, but rather where your items are displayed on a service or website. For example, in most search engine services, the first result is likely to receive significantly more attention than the second or later results.
Therefore, we cannot use an absolute value to measure popularity, but should use a “ratio”, or calculate some kind of “probability”.
One way to do this is to compute the click-through rate (CTR). Modeled mathematically, whether an item is clicked after being displayed can be regarded as a Bernoulli random variable, so estimating CTR becomes estimating the parameter of a Bernoulli distribution. For this we can use maximum likelihood estimation. Assume that the probability of a click is p, and that the item is displayed N times in total and clicked n times. The probability of this outcome is:
P = p^n * (1-p)^(N-n)
Taking the log of both sides gives:
log(P) = n*log(p) + (N-n)*log(1-p)
Setting the derivative with respect to p to zero gives the extremum at p = n/N.
But when n or N is 0, or simply very small, the p obtained by maximum likelihood is not very accurate: an item displayed once and clicked once gets an estimated CTR of 1, and an item that has not been clicked yet gets 0.
So now we have a problem: with so little data, the maximum likelihood estimate doesn’t reflect the real properties of these items very well.
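The maximum likelihood estimate p = n/N and its weakness on small counts can be seen in a few lines of Python. This is only a sketch: the function name `ctr_mle` and the choice to return `None` when there are no displays are assumptions made for illustration.

```python
def ctr_mle(clicks, displays):
    """Maximum likelihood estimate of the click probability: n / N."""
    if displays == 0:
        return None  # n / N is undefined when nothing has been displayed
    return clicks / displays


print(ctr_mle(30, 1000))  # plenty of data: a plausible estimate
print(ctr_mle(1, 1))      # one display, one click: estimate jumps to 1.0
print(ctr_mle(0, 3))      # three displays, no clicks: estimate is 0.0
print(ctr_mle(0, 0))      # no displays at all: no estimate
```

The last three calls are exactly the cases the text describes: the estimate is either extreme or undefined, and says little about the item itself.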
One solution is to use a prior distribution, which leads to the concept of a conjugate prior.
The conjugate prior of the Bernoulli distribution is the Beta distribution. It is applied through Bayes’ formula, whose core form is:
Posterior distribution = Likelihood function * Prior distribution / P(X)
For more on conjugate priors, you can refer to the previous article on topic models, on the mathematical basis of LDA. What matters here is that the Beta distribution gives the prior distribution of p.
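To make the effect of the prior concrete: with a Beta(alpha, beta) prior on p and n clicks in N displays, the posterior is Beta(alpha + n, beta + N - n), whose mean is (n + alpha) / (N + alpha + beta). The sketch below uses that posterior mean as a smoothed CTR. The function name and the default uniform prior (alpha = beta = 1) are assumptions for illustration; in practice the prior would be chosen from data.

```python
def ctr_posterior_mean(clicks, displays, alpha=1.0, beta=1.0):
    """Posterior mean of p under a Beta(alpha, beta) prior.

    alpha and beta act as pseudo-counts of prior clicks and prior non-clicks.
    """
    if clicks > displays:
        raise ValueError("clicks cannot exceed displays")
    return (clicks + alpha) / (displays + alpha + beta)


# Hypothetical prior worth 100 displays at a 3% CTR.
prior = {"alpha": 3.0, "beta": 97.0}

print(ctr_posterior_mean(0, 0, **prior))        # no data: falls back to the prior mean
print(ctr_posterior_mean(1, 1, **prior))        # one lucky click barely moves the estimate
print(ctr_posterior_mean(900, 10000, **prior))  # lots of data: close to n / N
```

With few displays the estimate stays near the prior mean, and as N grows the observed clicks dominate, which is the behavior the maximum likelihood estimate lacked.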
Another solution is to estimate CTR per time period. We can use the CTR of the previous time period as prior knowledge to estimate the CTR of the current time period more accurately.
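One possible way to use the previous period as prior knowledge is to convert its CTR into Beta pseudo-counts and combine them with the current period’s counts. The sketch below is an assumption-laden illustration: the function names and the `strength` parameter (how many displays the previous period is “worth”) are hypothetical and would need tuning.

```python
def prior_from_previous_period(prev_ctr, strength):
    """Turn the previous period's CTR into Beta pseudo-counts (alpha, beta).

    strength is the number of pseudo-displays the prior contributes.
    """
    alpha = prev_ctr * strength
    beta = (1.0 - prev_ctr) * strength
    return alpha, beta


def smoothed_ctr(clicks, displays, prev_ctr, strength=100.0):
    """Current-period CTR, shrunk toward the previous period's CTR."""
    alpha, beta = prior_from_previous_period(prev_ctr, strength)
    return (clicks + alpha) / (displays + alpha + beta)


print(smoothed_ctr(0, 0, prev_ctr=0.05))     # no data yet: previous CTR is used
print(smoothed_ctr(10, 100, prev_ctr=0.05))  # blend of 10% observed and 5% prior
```

A larger `strength` trusts the previous period more; a smaller one lets the current period take over faster.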
Similarity model
After introducing the recommendation model based on popularity, we then look at the recommendation model based on similar information, which is also called collaborative filtering and can be summarized as follows:
Similar users may have similar preferences, and similar items may be preferred by similar people.
So we have to look for similar users or similar items.
The core idea of collaborative filtering is to borrow data. Specifically, when user A has too little data, we can find a similar user B and use B’s data to fill in the picture of user A. In effect we “cluster” user A and user B together and treat them as the same type of user. By moving from modeling a single user to modeling a type of user, we get more data to work with.
There are two main types of collaborative filtering:
- Memory-based collaborative filtering
  - User-based collaborative filtering
  - Item-based collaborative filtering
- Model-based collaborative filtering
Memory-based collaborative filtering focuses on memory: it remembers the items each person has consumed and recommends from them. It can be subdivided into:
- User-based: what do people like you consume?
- Item-based: which items are similar to those you consume?
So memory-based collaborative filtering is all about finding similar users or similar items. Let’s start with similar users.
User-based collaborative filtering
The idea behind user-based collaborative filtering is to group users by their historical behavior, and then recommend items to a user based on the preferences shared within the group.
The core formula of user-based collaborative filtering, written here in its common similarity-weighted form:
P(u, i) = sum_j( sim(u, j) * R(j, i) ) / sum_j( sim(u, j) )
Let’s interpret the formula:
On the left-hand side, P(u, i) represents the predicted rating of user u for item i. The right-hand side is a similarity-weighted average over the other users j: sim(u, j) represents the similarity between user u and user j, and R(j, i) represents user j’s rating of item i.
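The similarity-weighted prediction can be sketched in plain Python as follows. This is a toy illustration, not the code referenced in the article: ratings are stored as a hypothetical `{user: {item: rating}}` dictionary, cosine similarity is one assumed choice of similarity measure, and the prediction is normalized by the sum of similarities.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[k] * b[k] for k in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_user_based(ratings, user, item):
    """Predict user's rating of item as a similarity-weighted average
    of the ratings given to that item by other users."""
    numerator = 0.0
    denominator = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        numerator += sim * other_ratings[item]
        denominator += sim
    return numerator / denominator if denominator > 0 else None


# Toy data: user "u" has not rated item "c" yet.
ratings = {
    "u": {"a": 5, "b": 3},
    "j1": {"a": 5, "b": 3, "c": 4},
    "j2": {"a": 1, "b": 5, "c": 2},
}
print(predict_user_based(ratings, "u", "c"))
```

Because "u" agrees more with "j1" than with "j2", the prediction lands between their two ratings of "c" and closer to that of "j1". The loop over all other users is also where the cost concerns of a production system come from.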
With the formula in hand, there are several things to watch out for when using it in production:
- A user vector is represented by items. If there are many items, the user vector has a very high dimension, and computing vector similarity takes time.
- Because we need to compute the similarity of every pair of users, the complexity is O(m^2) for m users.
- Computing the relationship between every user and every item has complexity O(m * n) for m users and n items.
For how to deal with these problems, see the code on GitHub. Stars are welcome.
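Item-based collaborative filtering
The key formula for item-based collaborative filtering, written in the same similarity-weighted form, is as follows:
P(u, i) = sum_j( sim(i, j) * R(u, j) ) / sum_j( sim(i, j) )
Here sim(i, j) is the similarity between item i and item j, and R(u, j) is user u’s rating of item j, with j ranging over the items user u has already rated.
With the user-based approach as a foundation, this is easy to understand. The key points to remember here are the weaknesses of the user-based approach that motivate the item-based one: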
- In user-based collaborative filtering, the number of users is usually much larger than the number of items, so the computing cost is high.
- In user-based collaborative filtering, users share few common consumption behaviors. Even when they do share items, those tend to be popular items, which does not help much when computing user similarity.
- Users’ preferences change more quickly than items’ characteristics.
To sum up: item-based collaborative filtering finds the items most similar to those already in the user’s history.
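A minimal sketch of the item-based approach, under the same kind of assumptions as any toy example: ratings live in a hypothetical `{user: {item: rating}}` dictionary, items are compared with cosine similarity over the users who rated them, and the prediction is a similarity-weighted average of the user’s own ratings.

```python
import math
from collections import defaultdict


def item_vectors(ratings):
    """Invert {user: {item: rating}} into {item: {user: rating}}."""
    items = defaultdict(dict)
    for user, user_ratings in ratings.items():
        for item, rating in user_ratings.items():
            items[item][user] = rating
    return items


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[k] * b[k] for k in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_item_based(ratings, user, item):
    """Predict user's rating of item from the user's ratings of similar items."""
    items = item_vectors(ratings)
    if item not in items:
        return None
    numerator = 0.0
    denominator = 0.0
    for rated_item, rating in ratings[user].items():
        if rated_item == item:
            continue
        sim = cosine_similarity(items[item], items[rated_item])
        numerator += sim * rating
        denominator += sim
    return numerator / denominator if denominator > 0 else None


ratings = {
    "u": {"a": 5, "b": 1},
    "v": {"a": 4, "b": 1, "c": 5},
    "w": {"a": 5, "b": 2, "c": 4},
}
print(predict_item_based(ratings, "u", "c"))
```

In a real system the item-to-item similarities would typically be computed offline and cached, since item characteristics change slowly; recomputing them per prediction, as this sketch does, is only for brevity.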
Slope One algorithm
The Slope One algorithm was proposed to address the problem that, in item-based collaborative filtering, the model cannot be updated online.
Its main innovations are:
- The number of users who have rated both items is used as the confidence level of the rating difference between the two items
- Models can be updated online in real time
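Both points can be seen in a small sketch of the weighted variant of Slope One. The class and method names below are made up for this example, and it assumes each (user, item) pair is rated only once. For every item pair it keeps a running sum of rating differences and a count of common users; a new rating only touches the pairs involving that user’s items, which is what makes online updates cheap, and the count serves as the confidence weight at prediction time.

```python
from collections import defaultdict


class SlopeOne:
    def __init__(self):
        self.diff_sum = defaultdict(float)  # (i, j) -> sum of (r_i - r_j) over common users
        self.count = defaultdict(int)       # (i, j) -> number of users who rated both
        self.ratings = defaultdict(dict)    # user -> {item: rating}

    def add_rating(self, user, item, rating):
        """Online update: adjust only the item pairs touched by this rating."""
        if item in self.ratings[user]:
            raise ValueError("this sketch assumes each item is rated once per user")
        for other, other_rating in self.ratings[user].items():
            self.diff_sum[(item, other)] += rating - other_rating
            self.count[(item, other)] += 1
            self.diff_sum[(other, item)] += other_rating - rating
            self.count[(other, item)] += 1
        self.ratings[user][item] = rating

    def predict(self, user, item):
        """Average of (user's rating of j + mean difference item-j),
        weighted by the number of users who rated both items."""
        numerator = 0.0
        denominator = 0
        for other, other_rating in self.ratings.get(user, {}).items():
            c = self.count.get((item, other), 0)
            if other == item or c == 0:
                continue
            mean_diff = self.diff_sum[(item, other)] / c
            numerator += (mean_diff + other_rating) * c
            denominator += c
        return numerator / denominator if denominator else None


model = SlopeOne()
for user, item, rating in [
    ("john", "A", 5), ("john", "B", 3), ("john", "C", 2),
    ("mark", "A", 3), ("mark", "B", 4),
    ("lucy", "B", 2), ("lucy", "C", 5),
]:
    model.add_rating(user, item, rating)

print(model.predict("lucy", "A"))
```

In this toy data, A is rated 0.5 higher than B on average by two users and 3 higher than C by one user, so the prediction for "lucy" on A weights the B-based estimate twice as heavily as the C-based one.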
Conclusion
This article introduces two recommendation models:
- Recommendation model based on popularity
- Recommendation model based on similar information and recommendation model based on content characteristics
The model based on popularity is simple and effective, and can solve part of the cold start problem with some prior knowledge, while the recommendation model based on similar information makes full use of the wisdom of the group to solve the problem of sparse users and items through clustering.
Your encouragement is my motivation to continue writing, and I look forward to our common progress.