This article is also published on Zhihu: www.zhihu.com/column/c_14…
You are welcome to subscribe to my column.
1. Introduction
Why this column?
Like many people, I count The Beauty of Mathematics by Wu Jun as the book that first opened my eyes to technology. In college it was the first thing that drew me to the charm of mathematics and algorithms, and it was also where I first encountered the concept of recommendation. Since I started working, I have spent most of my time building recommendation-related systems and engineering. As the business has grown, recommendation technology has been changing rapidly and becoming more and more complex. I therefore hope to improve my overall understanding by writing down what I have learned and by exchanging ideas with others.
In addition, there are already many good technical columns on recommendation systems by influential authors online. For example, Wang Zhe’s machine learning notes focus more on algorithms and paper interpretation. I hope to share my understanding from the perspective of engineering implementation.
2. Overview of recommendation systems
2.1 What is a recommendation system and what problems does it solve?
A recommendation system is a personalized information filtering system.
From the perspective of cognitive psychology, humans continuously deepen their understanding of the external world by collecting information and then extracting, processing, and converting it into concepts and knowledge.
Information can take the form of text, images, sound, video, and so on. As human civilization has developed, the way people acquire information has changed as well: tortoise-shell inscriptions and murals in primitive tribal times, scrolls and paper in the Middle Ages, the telegraph and the telephone after the industrial revolution, and the Internet today.
Each technological advance lowers the cost of creating and transmitting information, which naturally leads to a rapid explosion of information.
Search systems, typified by Google and Baidu, emerged to make information acquisition efficient in the face of this overload.
But search is active and assumes a clear intention. Most of the time, people’s attention and focus are vague and uncertain, and individual preferences and interests vary greatly. A recommendation system further improves the efficiency of information acquisition by distributing different information to the people whose preferences it matches. Information no longer flows only one way, with people seeking it out; information can also actively find its way to people.
2.2 Large-scale application of recommendation systems
Today, recommendation systems have penetrated every corner of daily life.
From content apps such as Toutiao and TikTok, to e-commerce platforms such as Taobao and Pinduoduo, and even to Meituan and Dianping, recommendation systems are used on a large scale to increase user stickiness to the platform.
3. Recommendation system architecture
How is a recommendation system built? What modules does it consist of?
[Figure: Recommendation system architecture]
First, the source of the recommended content:
The source of the content usually depends on the business area. For example, in news or video applications, content is generally uploaded to the platform by users (UGC) or by MCN institutions (PGC). The platform reviews the content for safety and quality, and at the same time cracks down on black-market and gray-market activity. After the content is stored, it goes through content understanding according to its type: for example, tags are extracted from image-and-text content, and video content is tagged using multimodal techniques. After parsing, each item has an ID and a set of tags and is stored in the content pool for use by the recommendation system. E-commerce, music, movies, and other scenarios are similar.
From the perspective of engineering implementation, a recommendation system is mainly divided into three parts.
- Online module: decides what content to push to the user
- Offline module: learns user habits to improve the accuracy of the system’s user profiles
- Management platform: experiment management, analysis of recommendation results, etc.
3.1 Online Module
Take YouTube as an example. Every day, millions of pieces of content are uploaded to the platform, reviewed, and stored, so the content pool can reach tens of millions of items.
How do you pick a few items out of a pool that large?
To solve this problem, a recommendation system typically has several stages.
- Indexes and features: several types of indexes are built in advance based on content features.
- Recall stage: when a user request arrives, thousands to tens of thousands of items are pulled from the various indexes.
- Coarse ranking stage: the items are scored for the first time and hundreds to thousands of them are selected. The ranking model at this stage is generally simple and filters out items that are obviously irrelevant to the user’s interests.
- Fine ranking stage: once the candidates are down to the hundreds, a more refined model ranks them according to the user’s profile, preferences, context, and business objectives. Generally, 50-100 items are returned to the engine after fine ranking.
- Re-ranking stage: the engine receives the fine-ranked items and applies a good deal of manual intervention and product logic, such as diversity among items and product strategies like trending items, pinned items, and mixing of multiple content types. Finally, 5-10 items are returned and shown on the client.
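The staged funnel above can be sketched as follows. This is a minimal illustration rather than a production design: the item pool `ITEMS`, the tag-based inverted index, the two scoring functions, and the per-tag cap used for diversity are all hypothetical stand-ins, and the candidate counts are scaled down.

```python
from collections import defaultdict

# Hypothetical content pool: item_id -> (tag, quality score in [0, 1)).
TAGS = ("sports", "music", "tech")
ITEMS = {i: (TAGS[i % 3], (i * 37 % 100) / 100) for i in range(300)}

def build_index(items):
    """Index stage: inverted index from tag to item ids, built ahead of time."""
    index = defaultdict(list)
    for item_id, (tag, _) in items.items():
        index[tag].append(item_id)
    return index

def recall(index, user_tags, limit):
    """Recall stage: pull candidates from every index the user's interests hit."""
    candidates = []
    for tag in user_tags:
        candidates.extend(index.get(tag, []))
    return candidates[:limit]

def coarse_rank(candidates, items, keep):
    """Coarse ranking: a cheap score (quality only) to drop obviously weak items."""
    return sorted(candidates, key=lambda i: items[i][1], reverse=True)[:keep]

def fine_rank(candidates, items, user_weights, keep):
    """Fine ranking: a richer score that also uses the user's tag preferences."""
    def score(i):
        tag, quality = items[i]
        return quality * user_weights.get(tag, 0.0)
    return sorted(candidates, key=score, reverse=True)[:keep]

def rerank(ranked, items, max_per_tag, keep):
    """Re-ranking: apply a product rule (here, a per-tag cap for diversity)."""
    shown_per_tag = defaultdict(int)
    result = []
    for i in ranked:
        tag = items[i][0]
        if shown_per_tag[tag] < max_per_tag:
            shown_per_tag[tag] += 1
            result.append(i)
        if len(result) == keep:
            break
    return result

def recommend(user_weights):
    """Run one request through index lookup, recall, coarse, fine, and re-rank."""
    index = build_index(ITEMS)
    recalled = recall(index, user_weights.keys(), limit=200)
    coarse = coarse_rank(recalled, ITEMS, keep=50)
    fine = fine_rank(coarse, ITEMS, user_weights, keep=30)
    return rerank(fine, ITEMS, max_per_tag=3, keep=5)
```

Each stage sees fewer items than the one before it, which is what lets the later stages afford a more expensive scoring function.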
Depending on the nature of the business, the online pipeline also has many finer-grained modules, such as a de-duplication service that avoids recommending duplicate content to users, and modules for feature preprocessing and feature extraction. These will be covered separately later.
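As a rough idea of what a de-duplication step does, here is a minimal in-memory sketch. The class name `SeenFilter` and its methods are hypothetical, and an exact per-user set is assumed purely for illustration; a real service would need persistent and more compact storage.

```python
from collections import defaultdict

class SeenFilter:
    """Tracks which items each user has already been shown (in memory only)."""

    def __init__(self):
        self._seen = defaultdict(set)  # user_id -> set of exposed item ids

    def filter(self, user_id, candidates):
        """Drop already-exposed items and in-list duplicates, keeping order."""
        seen = self._seen[user_id]
        fresh, in_batch = [], set()
        for item_id in candidates:
            if item_id not in seen and item_id not in in_batch:
                in_batch.add(item_id)
                fresh.append(item_id)
        return fresh

    def mark_exposed(self, user_id, item_ids):
        """Record items once they have actually been shown to the user."""
        self._seen[user_id].update(item_ids)
```

Filtering and marking are kept separate here on the assumption that only items actually exposed on the client should count as seen.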
3.2 Offline Module
How does a recommendation system learn user habits?
After recommended items are delivered to the user, the system records business-related user behavior such as clicks, plays, dwell time, likes, favorites, and add-to-cart actions. The sample platform then processes these logs into training samples.
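The step from behavior logs to training samples can be sketched as a join between exposure logs and action logs. The field names (`request_id`, `item_id`, `features`, `action`) and the choice of `click` and `like` as labels are assumptions made for this example, not the format of any particular sample platform.

```python
def build_samples(impressions, actions):
    """Join exposure logs with behavior logs into labeled training samples.

    impressions: list of dicts with request_id, user_id, item_id, features
    actions:     list of dicts with request_id, item_id, action
    """
    # Group observed actions by the (request, item) pair they belong to.
    acted = {}
    for a in actions:
        key = (a["request_id"], a["item_id"])
        acted.setdefault(key, set()).add(a["action"])

    samples = []
    for imp in impressions:
        observed = acted.get((imp["request_id"], imp["item_id"]), set())
        samples.append({
            "user_id": imp["user_id"],
            "item_id": imp["item_id"],
            "features": imp["features"],  # features as logged at serving time
            "click": int("click" in observed),  # 1 if clicked, else 0
            "like": int("like" in observed),    # 1 if liked, else 0
        })
    return samples
```

An exposed item with no matching action becomes a negative sample, which is why the join starts from the impressions rather than from the actions.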
There are currently two training approaches in the industry. One is batch training, typically run hourly. The other is streaming training, in which samples are fed directly to the deep learning platform as soon as they have been stitched together.
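The difference between the two approaches can be illustrated with a toy model. The sketch below assumes a plain logistic regression trained with SGD; `LogisticModel`, `train_batch`, and `train_streaming` are hypothetical names, and real systems train far larger models on a distributed platform.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class LogisticModel:
    """Tiny logistic regression, used only to contrast the two training modes."""

    def __init__(self, dim, lr=0.1):
        self.w = [0.0] * dim
        self.lr = lr

    def predict(self, x):
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)))

    def update(self, x, y):
        """One SGD step on a single (features, label) sample."""
        error = self.predict(x) - y
        self.w = [wi - self.lr * error * xi for wi, xi in zip(self.w, x)]

def train_batch(model, samples, epochs=20):
    """Batch training: repeated passes over samples accumulated in a window."""
    for _ in range(epochs):
        for x, y in samples:
            model.update(x, y)

def train_streaming(model, sample_stream):
    """Streaming training: update as each stitched sample arrives, single pass."""
    for x, y in sample_stream:
        model.update(x, y)
```

In the batch case the model only changes when a window of samples has been collected, while in the streaming case every new sample moves the weights immediately.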
There are many deep learning frameworks, including the open-source TensorFlow, PyTorch, and MXNet. However, in recommendation scenarios, existing training frameworks may have performance problems (distributed training, very large-scale sparse parameters, etc.), so most large companies combine in-house development with open source.
For example:
- Alibaba’s XDL: based on TensorFlow.
- Baidu’s PaddlePaddle: a self-developed deep learning platform that is not compatible with TensorFlow.
- Tencent’s Numerous: originally fully self-developed, it is gradually moving toward TensorFlow.
- ByteDance’s BytePS: the parameter server (PS) part is self-developed, and the core training framework is compatible with TensorFlow.
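To make the "very large-scale sparse parameters" problem more concrete, here is a toy, single-process stand-in for a parameter server. It is not the API of XDL, PaddlePaddle, Numerous, or BytePS; the class `SparseParameterServer` and its `pull` and `push` methods are hypothetical. The point it tries to show is that a worker only touches the embedding rows for feature ids that appear in its current mini-batch.

```python
import random

class SparseParameterServer:
    """Toy in-process stand-in for a parameter server holding sparse embeddings."""

    def __init__(self, dim, lr=0.1, seed=0):
        self.dim = dim
        self.lr = lr
        self.table = {}  # feature id -> embedding vector, created lazily
        self._rng = random.Random(seed)

    def pull(self, feature_ids):
        """Return copies of the embeddings for only the requested ids."""
        for fid in feature_ids:
            if fid not in self.table:
                self.table[fid] = [
                    self._rng.uniform(-0.01, 0.01) for _ in range(self.dim)
                ]
        return {fid: list(self.table[fid]) for fid in feature_ids}

    def push(self, grads):
        """Apply SGD updates for only the ids that received gradients.

        Assumes every id in grads was pulled earlier, so its row exists.
        """
        for fid, grad in grads.items():
            row = self.table[fid]
            for k in range(self.dim):
                row[k] -= self.lr * grad[k]
```

Because rows are created on first use, the table grows with the number of distinct feature ids actually seen rather than with the size of the full id space.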
After the model is trained, the training platform pushes it to the online service, which serves the recall, coarse ranking, fine ranking, and re-ranking calls. This forms a closed data loop: user behavior data is generated online and, through offline learning, is fed back into the online service.
The design of the relatively complex sample platform and a survey of training frameworks will also be covered in separate articles later.
3.3 Recommendation Management Platform
How can the system modules collaborate and iterate efficiently?
In recommendation scenarios, users’ needs are not particularly clear, and whether the recommendation results are good or bad is a very subjective judgment. Platforms therefore define metrics such as click-through rate and interaction rate to quantify quality. Product, operations, and technical optimizations all require frequent A/B tests, so efficient experiment management, traffic splitting, and the statistical confidence and independence of experiment results become very important.
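Two of these concerns, traffic splitting and confidence in the results, can be sketched briefly. The example assumes hash-based bucketing salted with the experiment name and a two-proportion z-test on click-through rate; the function names and the 100-bucket layout are hypothetical choices for illustration, not a description of any specific platform.

```python
import hashlib
import math

def assign_bucket(user_id, experiment, num_buckets=100):
    """Deterministically map a user to a bucket.

    Salting the hash with the experiment name keeps a user's assignment
    in one experiment unrelated to their assignment in another.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets

def in_treatment(user_id, experiment, treatment_percent):
    """True if the user falls into the first treatment_percent buckets."""
    return assign_bucket(user_id, experiment) < treatment_percent

def ctr_z_test(clicks_a, views_a, clicks_b, views_b):
    """Two-proportion z-test on CTR; returns (z, two-sided p-value)."""
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A small p-value suggests the CTR difference between the two groups is unlikely to be noise, under the usual assumptions of the test (independent exposures and reasonably large samples).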
In addition, a recommendation system is a relatively complex system. Because its data pipelines are complex, recommendation results often need to be explainable, diagnosable, and visualizable, so that the system is as little of a black box as possible.
4. Challenges of recommendation systems
Although the overall architecture of recommendation systems has by now become relatively stable (recall + coarse ranking + fine ranking + mixed ranking), how to organize and connect the modules is still a major problem once system performance, generality, and iteration efficiency are taken into account.
For example:
- At what stage should feature extraction be done: in the feature service or in the sample service? How can feature consistency be ensured?
- Sample reporting: how can feature leakage (feature values from after the exposure ending up in the sample) be avoided?
- How can sample stitching stay efficient when many experiments run in parallel?
- Index selection: cost and efficiency considerations
- Engineering issues under streaming training: real-time features, sample stitching performance, the training platform, and real-time updates of the serving model
5. Writing Plan
In the upcoming articles of this column, I plan to cover six modules, introducing the industry practice and engineering practice of each:
- De-duplication service module
- Feature engineering module
- Index module
- Recall module
- Sample module
- Training framework module
6. Reference materials
www.zhihu.com/column/c_13…
zhuanlan.zhihu.com/p/143816066
www.infoq.cn/article/Y4G… *ePCv1we82rJVC5I
www.woshipm.com/pd/4202594….
www.woshipm.com/pd/4223123….
www.woshipm.com/it/3563857….
zhuanlan.zhihu.com/p/114590897