This article first appeared on my personal WeChat public account: TechFlow. Original writing is not easy; your follow is appreciated.


Today, in the 29th article of our machine learning series, we will talk about the application of SVD to classic recommendation scenarios.

The logic behind recommendations

Have you ever wondered about this: when we shop on Taobao, JD, or a similar e-commerce site, as soon as we land on the home page we see plenty of products on display. These products are often high-quality and attractive, and once we start browsing there may be no end to it. With so many products on the platform, how does it know what we might like? What is the logic behind this?

Behind the scenes, the platform's algorithms do two things: the first is recall, and the second is ranking. Essentially this is similar to what search engines do, except that in search we have query terms as explicit input, while home-page recommendation has no explicit input at all. Products can therefore only be recalled based on features of the user's profile and the user's past behavior on the platform. After products are recalled, a model estimates the probability that the user will click on each one, and the products are ranked by that probability.

Although the recall-then-rank framework has not changed, the recall and ranking algorithms and their logic have kept iterating. Recall models in particular have made great progress, from collaborative filtering in the early days to vector recall, two-tower models, tree models and so on, and their effectiveness has taken a qualitative leap.

Today we are going to focus on collaborative filtering. Although this model is very simple and has all but retired from the historical stage, it is still a classic algorithm worth learning.

The principle of collaborative filtering

The principle of collaborative filtering is very simple. In a nutshell: find similar products and similar people.

Because the number of products and users on the platform can be very large, we cannot simply enumerate every product and predict its click-through rate at recommendation time; the machines simply could not handle that load. So instead we use users' behavior on the platform and let that behavior do the guiding: based on users' behavior, we find users with similar tastes and products that are similar to each other.


So collaborative filtering has two sets of logic, or two approaches. The first is user-based, which means finding users with similar preferences. This is not hard to understand. For example, students are likely to buy stationery and books frequently. Suppose we know that users A and B behave similarly, which suggests they have similar preferences. If A has bought product 1 and rated it highly while B has not bought it, it is likely that B would also like this product, so we can recommend it to B.

The second approach, naturally, is item-based. For example, if you search for and click on a product A, the platform will recommend products B, C and D that are similar to it, usually placed at the bottom of the product details page. If you are looking at a shirt, it might suggest a shirt from another brand, or trousers, or a tie. The logic is essentially the same: these items are the ones most closely related to the shirt.
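To make this concrete, here is a minimal sketch of the user-based flavor (the toy ratings matrix and the recommend_user_based helper are invented for illustration; this is not code from any library): find the most similar other user, then suggest items that neighbor liked but our user has not touched yet.

import numpy as np

# Toy user-item matrix (rows = users, columns = items); ratings are made up,
# 0 means "no interaction yet"
ratings = np.array([
    [5, 4, 0, 1, 0],
    [4, 5, 1, 0, 0],
    [0, 1, 5, 4, 0],
    [1, 0, 4, 5, 3],
])

def cosine(a, b):
    # cosine similarity between two behavior vectors
    return a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))

def recommend_user_based(ratings, user, top_n=2):
    # 1. find the most similar other user
    sims = [cosine(ratings[user], ratings[v]) if v != user else -1.0
            for v in range(ratings.shape[0])]
    neighbor = int(np.argmax(sims))
    # 2. suggest items the neighbor rated but our user has not touched,
    #    highest-rated first
    candidates = [(item, ratings[neighbor, item])
                  for item in range(ratings.shape[1])
                  if ratings[user, item] == 0 and ratings[neighbor, item] > 0]
    candidates.sort(key=lambda pair: -pair[1])
    return [item for item, _ in candidates[:top_n]]

# user 0 behaves like user 1, so user 0 gets user 1's item 2 suggested
print(recommend_user_based(ratings, user=0))

Item-based filtering is the same computation run over columns instead of rows, i.e. the neighbor search operates on ratings.T.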

The next question is: how do we actually obtain these user-to-user and item-to-item correlations?

The answer is simple, and it comes from this matrix:


This matrix records the behavior linking users and items: each row represents one user's behavior, and each column represents how one item was bought or rated across users. So the rows of the matrix give us vectors representing users, and the columns give us vectors representing items. Once we have these vectors, it is easy to compute the similarity between them to find similar users and similar items.

There are many ways to compute the similarity of vectors: the cosine of the angle between two vectors, the Euclidean distance, the Pearson correlation coefficient, and so on.
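As a quick illustration, all three measures are one-liners with numpy (the two behavior vectors below are made up):

import numpy as np

a = np.array([5.0, 4.0, 0.0, 1.0, 0.0])
b = np.array([4.0, 5.0, 1.0, 0.0, 0.0])

# cosine: angle between the two vectors, in [-1, 1]
cos_sim = a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Euclidean distance, often squashed into a (0, 1] similarity score
euc_sim = 1.0 / (1.0 + np.linalg.norm(a - b))

# Pearson correlation: roughly, cosine after mean-centering each vector
pearson = np.corrcoef(a, b)[0, 1]

print(cos_sim, euc_sim, pearson)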

The role of SVD

That is all there is to collaborative filtering, but so far none of it seems to have anything to do with SVD.

The connection becomes clear once you think about scale. For smaller companies or scenarios, the plain matrix is fine. Take movie rating sites: the number of movies is not huge, tens of thousands at best, so the matrix can still be stored. But for an e-commerce company, both items and users number in the hundreds of millions. Such a matrix is obviously enormous, impossible to hold in memory, let alone use for similarity calculations. Moreover, such a matrix is bound to be extremely sparse, with many vacant entries, so it is natural to compress it with SVD.

First we write a helper function that calculates the minimum number of singular values required to cover the percentage we set (measured as a share of the sum of squared singular values):

def select_K(sigma, percentage):
    # the matrix's "energy" is the sum of squared singular values
    square = sigma ** 2
    base = sum(square)
    s = 0
    k = 0
    for i in sigma:
        # accumulate squared singular values until the target share is covered
        s += i ** 2
        k += 1
        if s >= base * percentage:
            return k
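A quick sanity check with made-up singular values: their squares are 100, 25, 1 and 0.01, totalling 126.01, and the first two already cover about 99% of that sum, so a 95% threshold keeps only two of them:

import numpy as np

# made-up singular values for a quick check of select_K
sigma = np.array([10.0, 5.0, 1.0, 0.1])
print(select_K(sigma, 0.95))  # -> 2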

Next, we perform SVD on the original matrix and use this threshold to compress it:

import numpy as np

data = np.mat([[0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 5],
               [0, 0, 0, 3, 0, 4, 0, 0, 0, 0, 3],
               [0, 0, 0, 0, 4, 0, 0, 1, 0, 4, 0],
               [3, 3, 4, 0, 0, 0, 0, 2, 2, 0, 0],
               [5, 4, 5, 0, 0, 0, 0, 5, 5, 0, 0],
               [0, 0, 0, 0, 5, 0, 1, 0, 0, 5, 0],
               [4, 3, 4, 0, 0, 0, 0, 5, 5, 0, 1],
               [0, 0, 0, 4, 0, 4, 0, 0, 0, 0, 4],
               [0, 0, 0, 2, 0, 2, 5, 0, 0, 1, 2],
               [0, 0, 0, 0, 5, 0, 0, 0, 0, 4, 0],
               [1, 0, 0, 0, 0, 0, 0, 1, 2, 0, 0]])

# full SVD of the user-item matrix
u, sigma, v = np.linalg.svd(data)
# keep enough singular values to preserve 95% of the energy
k = select_K(sigma, 0.95)
# rebuild the truncated diagonal matrix of singular values
sigmaK = np.mat(np.eye(k) * sigma[:k])
# project items into the compressed k-dimensional space
itemMat = data.T.dot(u[:, :k]).dot(sigmaK.I)

What we end up with is an item matrix in which each row vector corresponds to one item. The last line works because the truncated SVD gives data ≈ U_k * S_k * V_kᵀ, so the item representation V_k can be recovered as dataᵀ * U_k * S_k⁻¹, which is exactly what data.T.dot(u[:, :k]).dot(sigmaK.I) computes.
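With these compressed vectors in hand, finding the items most similar to a given item is just a nearest-neighbor search over the rows of itemMat. A minimal sketch continuing from the code above (the most_similar_items helper is invented for illustration):

import numpy as np

def most_similar_items(itemMat, item, top_n=3):
    # cosine similarity between the query item's compressed vector
    # and every other item's compressed vector
    target = np.asarray(itemMat[item]).ravel()
    sims = []
    for j in range(itemMat.shape[0]):
        if j == item:
            continue
        other = np.asarray(itemMat[j]).ravel()
        denom = np.linalg.norm(target) * np.linalg.norm(other) + 1e-12
        sims.append((j, float(target.dot(other)) / denom))
    sims.sort(key=lambda pair: -pair[1])
    return sims[:top_n]

# the three items most similar to item 0 in the compressed space
print(most_similar_items(itemMat, item=0))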


This is just a small simulation, but in real applications we can compress hundreds of millions of dimensions, or even more, down to a few hundred or fewer, greatly reducing storage cost. Moreover, SVD can be computed in a distributed, concurrent fashion, so even very large raw datasets can be handled.

Conclusion

That concludes our look at collaborative filtering and its combination with SVD. Although the algorithm is simple and easy to implement, many problems remain unsolved. For example, the user-item matrix is not immutable: new products are listed and new users register all the time. What should we do about these new products and new users, which have no behavior yet?

The other problem is that the algorithm leaves little room for improvement: once it goes live, there is not much more we can do with it. With other models or algorithms, we can keep improving results by iterating on the algorithm and the model, but collaborative filtering offers no such lever. That is why it has gradually been phased out.

That is the end of today's article. If you liked it, please send some support my way (follow, share, like).
