Dishes are the core element of the takeout transaction process, and understanding dishes is key to matching takeout supply with demand. Today we are publishing three articles at once that systematically introduce the construction and application of the Meituan takeout food knowledge graph. “Iteration and Application of the Meituan Takeout Food Knowledge Graph” introduces the overall system of the takeout knowledge graph, including dish categories, standard dishes, basic food attributes, and food business theme attributes. “Standardization Construction and Application of Takeout Products” focuses on the construction approach, technical solutions, and business applications of standardizing takeout dishes. Because a defining characteristic of the takeout business is that dishes are combined into a single order, “Exploration and Application of Takeout Set Meal Matching” specifically introduces the iteration and application of takeout set meal matching technology. We hope these articles offer some inspiration or help to readers working in related areas.
This is the first article in the takeout food knowledge graph series. It systematically introduces the label system of the Meituan takeout food knowledge graph, including the dish category label, the standard dish name, basic food attributes (ingredients, taste, cooking method, etc.), and food business theme attributes (merchant signature dishes, category classics, etc.). On the technical side, the construction methods of the label system are introduced with examples, such as a classification model based on BERT pre-training. On the application side, the article describes how the food knowledge graph is used in the Meituan takeout business, including dish representation in support of set meal matching and improving the user experience of search and merchant recommendation.
1. Background
Knowledge graphs are designed to describe real-world entities and the relationships between them. In the Meituan takeout business, food items are the basis of the services Meituan provides to users. Building a food knowledge graph helps us provide users with more accurate, richer, and more personalized food services. In addition, the takeout business serves customers who dine at home, while the in-store dining business serves customers who dine at the restaurant. The two businesses overlap considerably in their dishes, so aligning their dish data also gives us a useful lever for comparing online (takeout) and offline (in-store) data.
This article introduces the construction of the takeout food knowledge graph, which is based on mining and analyzing takeout business data (takeout transaction data, item label information entered by merchants, professionally generated descriptions (PGC), user reviews (UGC), item pictures, etc.) and off-site data (encyclopedias, recipes, etc.). A classification system (the dish category label) and a standardization system (the standard dish name label) have been formed for takeout food, and for different types of food a number of basic attribute systems, including taste and ingredients, have been built on top of them. At the same time, drawing on the business characteristics of Meituan takeout, a theme attribute system for food in the takeout business has been constructed, covering merchant signature dishes, merchant main dishes, category classics, and so on. The current label structure of the takeout food knowledge graph is shown in Figure 1:
The takeout food knowledge graph contains labels along the following four dimensions (taking “Kung Pao Chicken” as an example, as shown in Figure 2 below):
- Category label: includes staple food, snacks, dishes, and other categories, organized into a hierarchy of more than 300 subcategories. For example, the category of “Kung Pao Chicken” is “dishes”. The category label is the basic classification information of a food item, and the basic attributes that apply to an item depend on its category. For example, the “dishes” category has “meat/vegetarian” and “cuisine” attributes, while the “drinks” category has no such attribute labels.
- Standard dish name label: mainly captures the standardized item. For example, the standard item for “Kung Pao Chicken (signature, must-order)” is “Kung Pao Chicken”. Because merchants enter items in many different ways, standard dish name labels make it possible to aggregate items that are the same dish.
- Basic attributes: depending on the category of the food item, basic attributes such as ingredients, cuisine, taste, cooking method, and meat/vegetarian are constructed. For example, the cuisine of “Kung Pao Chicken” is “Sichuan cuisine”, its ingredients are “chicken breast” and “peanut”, and its meat/vegetarian label is “meat”. Basic attributes play a key role in understanding items and provide basic data features for business needs such as filtering, display, and representation of items.
- Theme attributes: mainly reflect the business theme of a food item, including its transaction behavior in takeout, its positioning within the merchant, and its reputation in user feedback. For example, a merchant’s “Kung Pao Chicken (signature, must-order)” is that merchant’s “signature dish”.
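One way to picture the four label dimensions is as a single record per dish. The following is a minimal Python sketch of the “Kung Pao Chicken” example; the class name `DishLabels` and all field names are hypothetical illustrations, since the article does not describe the actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class DishLabels:
    """Hypothetical record holding the four label dimensions of one dish."""
    raw_name: str        # item name as entered by the merchant
    category: str        # category label
    standard_name: str   # standard dish name label
    basic_attributes: dict = field(default_factory=dict)
    theme_attributes: list = field(default_factory=list)


# Values taken from the "Kung Pao Chicken" example in the text.
kung_pao = DishLabels(
    raw_name="Kung Pao Chicken (signature, must-order)",
    category="dishes",
    standard_name="Kung Pao Chicken",
    basic_attributes={
        "cuisine": "Sichuan cuisine",
        "ingredients": ["chicken breast", "peanut"],
        "meat_or_vegetarian": "meat",
    },
    theme_attributes=["signature dish"],
)
```

Keeping the basic attributes in a dictionary reflects the point above that the applicable attributes depend on the category: a drink record would simply omit `cuisine` and `meat_or_vegetarian`.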
Dish alignment: the dish data involved comes from dishes listed online for takeout, review-recommended dishes, Meituan merchant set meals, and so on.
2. Needs and challenges
At present, the takeout food knowledge graph has been applied in many Meituan takeout scenarios, such as recommendation, search, set meal matching, and operations analysis. As the business develops, it also places more complex requirements on the construction and iteration of the food knowledge graph, such as:
- As food items become more diverse, the food knowledge graph needs to become more fine-grained and accurate. For example, the category labels of the food knowledge graph were built from scratch into a system of more than 100 categories. With the development of the business, however, some categories clearly have room for further refinement.
- Graph label mining has leaned toward static labels. For food items under the same graph label, descriptions of business-related theme attributes are missing. For example, both “Drunkard Peanuts” and “Kung Pao Chicken” contain “peanut”, but “Drunkard Peanuts” is more representative of peanut dishes than “Kung Pao Chicken”, and a static ingredient label cannot express this.
- The takeout food knowledge graph mainly describes takeout food items, but the same merchant’s food items may also appear in other businesses, such as the in-store offline cashier system. By aligning food items across businesses, the food knowledge graph can give a more complete picture of a merchant’s food items at the entity level, which in turn guides item and merchant operations.
To meet these business requirements, we iterated on and optimized the category labels and basic attributes, and built business-related theme attributes. In addition, we carried out entity alignment between takeout dishes and in-store dishes. Mining theme attributes, that is, mining business-related graph knowledge, is a complex process that has to take into account both the takeout business and the attributes of the items themselves. Aligning takeout dishes with in-store dishes has to balance the diversity of dish expressions against the identity of the underlying dish.
The difficulties in iterating the takeout food knowledge graph are mainly the following:
- There is no ready-made system for mining business-related theme attributes, so the construction process involves a great deal of analysis and system design work.
- The most important part of mining theme attributes is to start from user needs: analyze what users look for in items, reflect it at the graph level, and form the corresponding theme attribute labels. At the same time, merchants’ item information (sales volume, supply, item labels, etc.) changes constantly, and the information may be completely different from one day to the next. Mining business theme attributes therefore requires building a reasonably complete system on the one hand and adapting to dynamically changing business data on the other, which poses major challenges for graph mining and demand matching.
- When merchants enter dishes, they express the same dish in diverse ways, with differences in portion size, taste, ingredients, and so on. Dish alignment has to weigh these diverse expressions, for example deciding whether to ignore portion size, but there is no ready-made alignment standard to refer to.
3. Iteration of the takeout food knowledge graph
Due to space limitations, this article mainly covers the mining of the following graph labels: the dish category label; the classic food labels by category, taste, ingredient, meat/vegetarian, and cooking method; and healthy meals. The data sources involved and the techniques adopted in mining these labels are roughly as shown in the following table:
Label | Technology |
---|---|
Dish category | BERT classification model |
Classic food by category, taste, ingredient, meat/vegetarian, and cooking method | Data statistics, entity recognition, relation recognition, product definition (jointly considering sales volume and supply volume) |
Healthy meals | Classification model + product definition (products that meet certain ingredient, cooking method, and function criteria) |
3.1 Dish category
Mining the dish category label mainly answers the question of what kind of food a dish is. There are two challenges: first, how to establish the category system, and second, how to link items to the corresponding category nodes. At the start of construction, we built a hierarchical category system of more than 100 categories from scratch, based on the characteristics of food items and the specific needs of the business. Some examples are shown in Figure 3 (left). At the same time, a classification model based on CNN+CRF was built to classify food items by category, as shown in Figure 4 (left).
However, as the business developed, the existing categories could no longer support its needs. For example, the original category system did not describe hot dishes in enough detail, so different kinds of hot dishes could not be told apart. We therefore worked with the takeout supply planning department to expand the category system to more than 300 fine-grained categories, with finer divisions and more comprehensive coverage. Some examples are shown in Figure 3 (right).
Finer categories require a more accurate model. For category recognition, the available data include the dish name, the name of the store’s sidebar section, the merchant name, and so on. Most of the available information is text, the text entered by merchants is not standardized, and dishes are named in many different ways. To improve accuracy, we upgraded the original CNN+CRF classification model to a BERT pre-training + fine-tuning model with larger model capacity. The model structure is shown in Figure 4 (right) below.
3.2 Classic food labels by category, taste, ingredient, meat/vegetarian, and cooking method
In building theme attributes, we first consider the sales and supply of items jointly within each basic attribute label and select the best dishes, for example the classic food under a category. During construction, however, we found that if classic dishes of a cuisine are identified by sales volume and supply, the results skew toward the “home-style dishes” of that cuisine. Classic dishes of cuisines are therefore identified separately.
Category classics are food items within a category that have high sales volume and abundant supply, such as classic staple foods and classic snacks. Classic labels for taste, ingredient, cooking method, and so on are defined similarly.
During construction, we found that identifying classics directly at the item level has a drawback: items are updated frequently, so newly listed food items with no sales or low sales are disadvantaged, and sales figures would have to be corrected for how long an item has been online. We therefore identify category classics, taste classics, and so on at the level of standard dishes, and generalize the result to specific food items through their standard dish.
The notion of a “standard dish” borrows from the concept of a “standard product” in other e-commerce businesses. Although most dishes are not produced by a standardized process, we focus on the main common parts and ignore minor differences. For example, “tomato eggs” and “tomato scrambled eggs” are treated as the same standard dish. The standard dishes we have aggregated now number in the hundreds of thousands and cover most food items.
With the help of standard dishes, we aggregate the category, taste, ingredient, meat/vegetarian, and cooking method labels to the standard dish level and compute sales volume and supply at that level, which removes the effect of how long an individual item has been online. In the actual labeling process, for example, we rank the standard dishes within a category by sales volume and supply and label the top N% as category classics. Under the “pasta” category, for instance, both the sales volume and the supply of “tomato and egg noodles” are within the top N%, so “tomato and egg noodles” is considered a classic pasta dish.
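The labeling procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classic_standard_dishes`, the tuple layout of the input, the `top_fraction` parameter (standing in for the unspecified “top N%”), and the rule that a dish must be in the top slice on both sales and supply are all hypothetical choices, not the production logic.

```python
from collections import defaultdict


def classic_standard_dishes(items, top_fraction=0.2):
    """Return {category: set of standard dish names labeled as classic}.

    items: iterable of (category, standard_name, sales, supply) tuples,
    one per merchant item. top_fraction is an assumed stand-in for "top N%".
    """
    # Aggregate item-level sales and supply up to the standard dish level,
    # so a newly listed item inherits the statistics of its standard dish.
    totals = defaultdict(lambda: [0, 0])
    for category, std_name, sales, supply in items:
        totals[(category, std_name)][0] += sales
        totals[(category, std_name)][1] += supply

    by_category = defaultdict(list)
    for (category, std_name), (sales, supply) in totals.items():
        by_category[category].append((std_name, sales, supply))

    result = {}
    for category, dishes in by_category.items():
        k = max(1, int(len(dishes) * top_fraction))
        top_sales = {d[0] for d in sorted(dishes, key=lambda d: -d[1])[:k]}
        top_supply = {d[0] for d in sorted(dishes, key=lambda d: -d[2])[:k]}
        # Assumption: "classic" means top slice on both sales and supply.
        result[category] = top_sales & top_supply
    return result


# Toy data: supply is counted as 1 per merchant item offering the dish.
items = [
    ("pasta", "tomato and egg noodles", 120, 1),
    ("pasta", "tomato and egg noodles", 80, 1),
    ("pasta", "tomato and egg noodles", 0, 1),   # newly listed, no sales yet
    ("pasta", "beef noodles", 90, 1),
    ("pasta", "beef noodles", 10, 1),
    ("pasta", "cold noodles", 30, 1),
    ("pasta", "fried noodles", 20, 1),
    ("pasta", "knife-cut noodles", 5, 1),
]
classics = classic_standard_dishes(items)
```

In this toy data the newly listed item with zero sales still receives the classic label through its standard dish, which is the point of aggregating at the standard dish level.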
3.3 Healthy meals
Healthy meals here mainly refers to low-fat, low-calorie meals, that is, food that is low in calories, low in fat, high in fiber, simple, natural, healthy, and nutritious. Such meals generally consist of fruits and vegetables (such as basil, kale, okra, and avocado), meat rich in high-quality protein (such as salmon, shrimp, shellfish, and chicken), and grains (mainly coarse grains, such as oats, sorghum, and quinoa). Cooking follows the principle of “less oil, less salt, less sugar”, with steaming, boiling, light pan-frying, and cold dressing as the main methods.
The main challenge in identifying healthy meals is the small number of samples. However, because healthy meals are distinctive, merchants generally describe them as such when entering items, for example by stating that the item is “healthy”, “low-calorie”, or “fitness” food. We therefore built a classification model to identify healthy meals. The available data include the item name, the merchant navigation bar, the merchant name, the merchant’s description of the item, and so on. The merchant category and item category were still being iterated on, so that information is not used.
The identification process is as follows:
- Training data construction: because healthy meals make up a relatively small share of items, keywords related to healthy meals are summarized first and matched against the text. The matched items, which are more likely to be healthy meals, are sampled and sent for outsourced annotation. The keywords we summarized include “salad, cereal rice, cereal bowl, low oil, low calorie, no sugar, fat loss, weight loss, light food, light calorie”, among others.
- Model building: items with the same name can be classified differently depending on their ingredients. For example, an item named “signature salad” may not be identified as a healthy meal if cheese is added to it. To take all the item information entered by the merchant into account, the model uses the item name, the merchant name, the navigation bar name, and the merchant’s description of the item. These four kinds of data differ in scale. Item names are relatively short texts, so a structure similar to text-CNN [1] is used for character-level feature extraction. Item descriptions are relatively long texts, so a structure similar to the Transformer [3] is used, with multi-head attention extracting character-level features from the long text. The specific structure is as follows:
  - Two structures are used: multi-head attention (Transformer) and text-CNN. Experiments show that combining the two is more accurate than using either one alone.
  - Character-level features are used in modeling, to avoid errors caused by word segmentation and the impact of out-of-vocabulary words.
- Iterative data augmentation: because keywords were used to construct the samples, the model tends to learn those keywords during training, which leads to missed recalls. We therefore augmented the training data. For example, during evaluation we selected merchants in which healthy meals had been identified and added their missed items to the training data. We also added keywords with distinctive characteristics and expanded the examples. After several rounds of expanding the training samples, healthy meals are identified with high accuracy.
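The keyword-matching step used to build the training data can be sketched as follows. This is a minimal Python illustration: the keyword list is the one quoted above (in translated form), while the record field names (`name`, `nav_bar`, `merchant`, `description`) and the function names are hypothetical.

```python
HEALTHY_KEYWORDS = (
    "salad", "cereal rice", "cereal bowl", "low oil", "low calorie",
    "no sugar", "fat loss", "weight loss", "light food", "light calorie",
)


def item_text(item):
    """Join the text fields of one item record (field names are hypothetical)."""
    fields = ("name", "nav_bar", "merchant", "description")
    return " ".join(item.get(f, "") for f in fields).lower()


def select_annotation_candidates(items, keywords=HEALTHY_KEYWORDS):
    """Split items into keyword-matched candidates and the remainder."""
    candidates, remainder = [], []
    for item in items:
        text = item_text(item)
        if any(keyword in text for keyword in keywords):
            candidates.append(item)
        else:
            remainder.append(item)
    return candidates, remainder


items = [
    {"name": "Signature salad", "description": "with cheese dressing"},
    {"name": "Chicken breast bowl", "description": "low oil, low calorie"},
    {"name": "Kung Pao Chicken", "nav_bar": "hot dishes"},
]
candidates, remainder = select_annotation_candidates(items)
```

A keyword match only makes an item a candidate for annotation. The “signature salad” with cheese is still selected here, and it is the human label, and later the model, that decides whether it counts as a healthy meal.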
3.4 Entity alignment of dishes
Because the same merchant’s dishes may differ slightly across business lines, we designed a complete dish name matching algorithm. A classifier works on features obtained by decomposing dish names, such as pinyin, prefixes and suffixes, substrings, and character order, and is combined with dish category recognition, standard dish name extraction, and dish synonym matching to align dish entities. For example, differently written forms of “charcoal-roasted pigeon” are aligned with one another, as are “Chongqing spicy chicken” and “Chongqing Gele Mountain spicy chicken”, the variants of “eggplant and minced meat over rice”, and the variants of “tomato scrambled eggs”. At present, a menu system has been formed on this basis.
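To make the name matching step more concrete, here is a simplified Python sketch. It is not the classifier described above: hand-written rules and a string similarity threshold stand in for the trained model, pinyin features are omitted, and the synonym table, the threshold of 0.85, and all function names are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Hypothetical synonym table mapping variant names to a canonical form.
SYNONYMS = {"tomato eggs": "tomato scrambled eggs"}


def normalize(name):
    """Drop parenthetical notes, collapse whitespace, apply synonyms."""
    name = re.sub(r"[(（][^)）]*[)）]", "", name)
    name = re.sub(r"\s+", " ", name).strip().lower()
    return SYNONYMS.get(name, name)


def features(a, b):
    """Pairwise features loosely mirroring prefix, suffix, and string cues."""
    na, nb = normalize(a), normalize(b)
    ta, tb = na.split(), nb.split()
    return {
        "exact": na == nb,
        "same_prefix": ta[:1] == tb[:1],
        "same_suffix": ta[-1:] == tb[-1:],
        "contained": set(ta) <= set(tb) or set(tb) <= set(ta),
        "ratio": SequenceMatcher(None, na, nb).ratio(),
    }


def is_same_dish(a, b, threshold=0.85):
    """Rule-based stand-in for the trained classifier (illustrative only)."""
    f = features(a, b)
    return (
        f["exact"]
        or (f["contained"] and f["same_suffix"])
        or f["ratio"] >= threshold
    )
```

With these rules, `is_same_dish("Chongqing spicy chicken", "Chongqing Gele Mountain spicy chicken")` holds because one name’s tokens are contained in the other and both end in the same token, while “Kung Pao Chicken” and “Kung Pao Shrimp” are kept apart. Rules this loose would also merge names such as “fried rice” and “egg fried rice”, which is one reason a learned classifier combined with category and standard dish information is the more robust choice.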
4. Applications
This section illustrates applications of the takeout food knowledge graph with examples, mainly set meal matching and the display of food items.
4.1 Set meal matching: dish representation
To meet users’ need to combine items into a single order, we explored set meal matching technology. The key to set meal matching lies in understanding food items, and the takeout food knowledge graph provides the most comprehensive data foundation for this. Based on the item information and historical order information within the same merchant, we fit the matching relationships between items. Drawing on structures such as the pointer network [2], we built an encoder-decoder model based on multi-head attention [3]. The specific model structure is as follows:
- Encoder: models the merchant’s menu. Because the menu is unordered data, it is modeled with attention. Item information mainly includes the item name, the item’s graph labels, and transaction statistics.
  - Self-attention is computed over the dish name and the item labels to obtain their vector representations, which are then concatenated with the transaction statistics as the preliminary representation of the item.
  - Self-attention is then computed over the preliminary item representations, so that each item is aware of the other items of the same merchant.
- Decoder: learns the matching relationships and predicts the next item likely to be matched, given the items selected so far.
  - At output time, beam search is used to produce multiple matching results.
  - To ensure diversity among the items in an output combination, a coverage mechanism [2] is added.
- After training, the encoder part is split off and scheduled offline to produce item vectors daily.
The specific model structure is shown in the figure below:
The set meal matching model based on the takeout food knowledge graph has improved conversion at multiple entry points (“full-discount helper”, “conversational ordering”, “dish details page”, etc.).
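The decoding step of the set meal model, beam search with a diversity term, can be illustrated with a small pure-Python sketch. Everything here is an assumption made for illustration: `score_next` stands in for the trained decoder, the category-repeat penalty is a crude substitute for the coverage mechanism of [2] rather than an implementation of it, and the menu and scores are toy values.

```python
import math


def beam_search_combos(items, score_next, combo_size=3, beam_width=2,
                       repeat_penalty=2.0):
    """Return up to beam_width combos (lists of item names), best first.

    items: {item name: category}. score_next(selected, candidate) returns a
    score in (0, 1]; it is a placeholder for the trained decoder.
    """
    beams = [([], 0.0)]  # (selected items, accumulated log score)
    for _ in range(combo_size):
        expanded = []
        for selected, logp in beams:
            used_categories = {items[name] for name in selected}
            for cand in items:
                if cand in selected:
                    continue
                step = math.log(score_next(selected, cand))
                # Simplified stand-in for coverage: penalize a candidate
                # whose category is already present in the combo.
                if items[cand] in used_categories:
                    step -= repeat_penalty
                expanded.append((selected + [cand], logp + step))
        # Keep the best partial combos, skipping reorderings of the same set.
        expanded.sort(key=lambda beam: -beam[1])
        beams, seen = [], set()
        for selected, logp in expanded:
            key = frozenset(selected)
            if key in seen:
                continue
            seen.add(key)
            beams.append((selected, logp))
            if len(beams) == beam_width:
                break
    return [selected for selected, _ in beams]


MENU = {
    "beef noodles": "staple",
    "fried rice": "staple",
    "cola": "drink",
    "cucumber salad": "side",
}
TOY_SCORES = {"beef noodles": 0.5, "fried rice": 0.3,
              "cola": 0.15, "cucumber salad": 0.05}


def toy_score(selected, candidate):
    """Toy scorer that ignores context; a real decoder would condition on it."""
    return TOY_SCORES[candidate]


combos = beam_search_combos(MENU, toy_score)
```

Without the penalty, the highest-scoring combination would pair two staples with a drink. With it, the top combination spreads across staple, drink, and side, which is the kind of diversity the coverage mechanism is meant to encourage.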
4.2 Interactive Recommendation
Analysis of takeout users’ needs shows that users want to compare similar items across stores, whereas the usual purchase flow is confined to a single merchant. Interactive recommendation offers a convenient way to compare and decide across stores, and its new interaction mode creates a new opening for recommendation products. During the interaction, food items the user may like are recommended based on the user’s historical preferences and real-time click behavior. As shown in Figure 8 (left) below, when similar food is recommended to users, the standard dish labels in the food knowledge graph provide the main data support.
4.3 Search
As the core traffic entry point of takeout, search carries users’ explicit takeout needs. Users search for dishes by entering keywords, and in practice the keyword may be a specific dish, an ingredient, or a cuisine. The high accuracy and high coverage of the graph labels in the food knowledge graph help improve the user experience of search, as recent experiments also show: some new ingredient, cuisine, and function labels had a positive effect in online search experiments.
5. Plan for the future
5.1 Mining Scenario-based Labels
Food is closely tied to daily life, and Meituan provides food services to tens of millions of users every day. Users’ needs are diverse, however, and the demand for food differs across environments and scenarios. The food knowledge graph currently lacks scenario-related labels, such as graph knowledge for particular solar terms and festivals, for particular weather conditions, and for particular groups of people (muscle gain, weight loss). Next, we will explore the mining of scenario labels.
In terms of mining methods, the data currently mined is mainly text. The fusion of item pictures, descriptions, structured labels, and other information has not been explored deeply enough, and model performance also needs to improve. We will therefore also explore multimodal recognition models.
5.2 Research on graph-based recommendation technology
Meituan takeout recommends food to users based on its understanding of food, in order to better meet users’ needs. The takeout food knowledge graph and takeout business data, which form the data foundation for this, contain billions of nodes and billions of relations. By modeling and analyzing user behaviors such as searching for, clicking on, and purchasing items, we can recommend items that better match users’ needs. For example, by integrating the food knowledge graph with takeout behavior data and performing random walks, related food can be recommended to users. In future work on graph applications, we will further explore recommendation technology based on the food knowledge graph and user behavior.
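The random walk idea mentioned above can be sketched on a toy graph that mixes a user, dishes, and graph labels. This is a hedged illustration, not the production method: the node naming scheme (`user:`, `dish:`, `label:` prefixes), the restart probability, the step count, and the function names are all hypothetical.

```python
import random
from collections import Counter


def build_graph(edges):
    """Build an undirected adjacency list from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    return graph


def recommend_by_random_walk(graph, user, purchased, steps=2000,
                             restart_prob=0.15, top_k=2, seed=0):
    """Random walk with restart from `user`; rank dishes by visit count."""
    rng = random.Random(seed)
    visits = Counter()
    node = user
    for _ in range(steps):
        neighbors = graph.get(node, [])
        # Jump back to the user at dead ends or with probability restart_prob.
        if not neighbors or rng.random() < restart_prob:
            node = user
            continue
        node = rng.choice(neighbors)
        if node.startswith("dish:") and node not in purchased:
            visits[node] += 1
    return [dish for dish, _ in visits.most_common(top_k)]


GRAPH = build_graph([
    ("user:u1", "dish:kung pao chicken"),          # purchase behavior
    ("dish:kung pao chicken", "label:peanut"),     # ingredient label
    ("dish:kung pao chicken", "label:sichuan"),    # cuisine label
    ("dish:drunkard peanuts", "label:peanut"),
    ("dish:mapo tofu", "label:sichuan"),
    ("dish:fruit salad", "label:healthy"),         # not connected to u1
])
recs = recommend_by_random_walk(GRAPH, "user:u1", {"dish:kung pao chicken"})
```

Dishes that share graph labels with the user’s purchase are reachable and get recommended, while the unconnected “fruit salad” is never visited. Behavior edges and label edges live in the same graph, which is the sense in which knowledge graph and behavior data are integrated here.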
6. References
- [1] Kim Y. Convolutional Neural Networks for Sentence Classification[J]. arXiv preprint arXiv:1408.5882, 2014.
- [2] See A, Liu P J, Manning C D. Get to the Point: Summarization with Pointer-Generator Networks[J]. arXiv preprint arXiv:1704.04368, 2017.
- [3] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems. 2017: 5998-6008.
- [4] Hamilton W, Ying Z, Leskovec J. Inductive representation learning on large graphs[C]//Advances in Neural Information Processing Systems. 2017: 1024-1034.
7. Author profile
Yang Lin, Guo Tong, Hai Chao, MAO, and others are from the Meituan takeout technology team.
This article was produced by the Meituan technical team, and the copyright belongs to Meituan. You are welcome to reprint or use the content of this article for non-commercial purposes such as sharing and communication; please credit it as “Content reprinted from Meituan Technical team”. This article may not be reproduced or used commercially without permission. For any commercial use, please send an email to [email protected] for authorization.