Query rewriting expands a user's search terms into better-expressed variants, helping recall more results that match the user's needs. It is a key recall-expansion method for text-based Boolean retrieval systems, and optimizing this module can directly and significantly improve the search experience. This article describes the iterative directions and implementation ideas of the query rewriting project in Meituan's search scenario; we hope it inspires or helps engineers working on recall for search, advertising, and recommendation.

1. Introduction

In search, there are many differences in expression between a user's query and the retrieved Document. Under a text-retrieval framework, missed recall caused by such text mismatch seriously hurts the user experience. The industry has two classes of solutions to this problem: expanding the user's query terms on the query side, i.e. query rewriting, or expanding the Document's keywords on the Document side, i.e. Document tagging. This article focuses on the former: Query Rewriting, also called Query Expansion. Query rewriting expands the original Query into rewrite terms highly relevant to the user's need; multiple rewrite terms are then retrieved together with the original search terms, so that users can find more merchants, goods, and services that meet their needs through better expressions.

Within Meituan's search architecture, query rewriting controls the text in the recall grammar, Named Entity Recognition (NER) [1] controls the retrieval fields in the recall grammar, and intent recognition controls recall relevance as well as the distribution across businesses and product forms; these are the three core query-understanding signals. The query rewriting strategy takes effect on all Meituan search traffic. Beyond expanding the user's search terms, it serves as a basic semantic-understanding signal in the whole Meituan search architecture, affecting user experience in many places such as index expansion, ranking features, and front-end highlighting. It also has a direct and significant impact on recall metrics such as the no-result rate, the number of recalled results, and search click-through rate.

This article shares the iteration experience of the query rewriting task in Meituan's search scenario, in three parts. The first part briefly introduces the challenges of query rewriting in Meituan search; the second part presents practical experience in building the full technology stack for query rewriting; the third part is the summary and outlook. Public sharing on text recall strategy is still rare in the industry, and we hope this article inspires or helps engineers working on recall for search, advertising, and recommendation.

2. Background and challenges

2.1 Usage of query rewriting signal in Meituan search scenario

In the search scenario of Meituan, query rewriting is mainly used to solve the problem of missed recall caused by the following four semantic gaps:

  • Semantic expansion: mainly synonyms and hyponyms, plus common case, numeral, and traditional/simplified character conversions; for example, “haircut”, “hair styling”, “hair art”, and “hairdressing” all express essentially the same need.
  • The gap between user expression and merchant expression: non-literal synonyms. For example, the user's description is colloquial (“learn guitar”) while the merchant's is formal (“guitar training”); or user input does not exactly match the merchant name, such as a shortened form of “Hilton” versus the merchant's full registered hotel name.
  • Scenario expansion: for example, when a user searches “picking strawberries” on Meituan, the supply the user actually expects from the platform is “strawberry garden”.
  • Other missed-recall problems: for example, “house sweeping” corresponding to “domestic cleaning”. In theory, query rewriting can patch any missed-recall problem by adding rewrite terms; even time-sensitive internet memes such as the “winter four-piece set” (candied haws, baked sweet potato, fried chestnuts, and hot milk tea) can be covered by rewriting.

2.2 Difficulties and challenges of the query rewriting signal in the Meituan search scenario

Search tries to maximize user reach efficiency and commercialization metrics under the constraints of the user's search terms and the available supply, and Meituan's search scenario adds a third constraint: locality. A comparison with other industry search scenarios is shown in the figure below:

Comparing search scenarios across the industry shows that most user demands and serving merchants in Meituan search are local, that the life-service domain is highly fragmented, and that local supply is relatively small compared with user demand in any given life-service category.

At the same time, Meituan search aggregates fulfillment results of many forms, including natural results from group buying, takeout, grocery, Meituan Select, and other businesses, as well as recommended results when local businesses return nothing. With limited exposure positions, irrelevant results from one natural-result cluster crowd out the gains of the others, so the relevance problem cannot be solved by ranking alone. This requires query rewriting in Meituan search to be both highly relevant and highly efficient across multiple business scenarios, and at the algorithm level we must solve coverage, accuracy, and multi-business problems. From this requirement, query rewriting faces the following two challenges in concrete algorithm iteration:

① User queries involve complex demand scenarios

  • Rich linguistic ambiguity: short queries increase the chance of ambiguity. For example, in the Meituan scenario “a haircut” can be a merchant name and must not be rewritten to “haircut”; the same Query can mean different things in different cities, e.g. “Gongda” refers to different universities depending on the city.
  • Cognitive relevance: users searching on Meituan naturally carry the cognition of “finding a store”, which requires scenario-correlation knowledge such as “getting glasses fitted” equals “finding an optical store”.
  • Many scenarios: as the business develops, objective demand grows and query rewriting serves more and more scenarios at ever finer granularity; it currently covers in-store dining, in-store comprehensive services, hotel and travel, takeout, goods, advertising, and other business scenarios.

② The platform's supply must account for the characteristics and development stage of supply construction

  • Most Meituan merchants do not do keyword SEO (Search Engine Optimization): missed recall caused by text mismatch is therefore more serious, and the need for rewriting is strong.
  • The display form of merchants blurs the real interaction intent: most merchants provide many dishes, goods, and group-deal services at once; for example, a music training institution often offers courses for several instruments.
  • Business characteristics and development stage matter: Meituan is a platform aggregating services for all aspects of life, and each business has its own requirements for rewriting. Transaction-heavy businesses can accept weakly correlated rewrites, while experience-heavy businesses impose stricter requirements, so a degree of differentiation is needed.

3. Technical selection

Figure 4 summarizes the technical framework of the current query rewriting iteration and the problems it addresses. We have explored each core sub-module in depth: offline candidate mining algorithms, the semantic relation discrimination model, vectorized recall, and online rewrite-term generation. Beyond iterating the signal itself, we also achieved good online gains in how the signal is used, by feeding the layered rewriting signal into ranking, recall relevance, and other linked modules.

Below, we walk through the technical iteration of each module of the query rewriting task, from offline to online.

3.1 Original corpus mining

High-quality data can significantly improve rewriting on head traffic and determines the ceiling of subsequent model performance. For candidate-set generation, common methods in industry and academia include search-log mining, translation-based methods, graph-based computation, and Embedding-based methods; for candidate filtering, there are sentence-relation classification and Embedding similarity calculation. We summarize the pros and cons of each method in the Meituan search scenario, and combine user behavior and semantic information in every mining component. Offline corpus mining is introduced in detail below.

3.1.1 Search log mining candidate corpus

Search log mining is a common synonym acquisition method in the industry. The main directions of mining are as follows:

  • Queries that click the same merchant: build correlations from two queries that click the same Document. This simple assumption can mine a large number of word pairs, but its drawback is equally obvious: co-clicked queries may drift semantically to varying degrees. In the Meituan scenario, many stores provide comprehensive services, and two types of group deals frequently appear under the same merchant, so semantic-drift noise such as “tooth extraction” → “tooth filling” is easy to mine. This method also relies on existing search results and cannot mine rewrites for queries with no results. (A minimal mining sketch follows this list.)
  • Mining from the search Session: a Session is one interaction process within a period of time, “open the App → browse several pages, click and pay for several functions → leave the App”. This method builds correlations from the queries a user enters consecutively during one App visit. Session mining depends less on search results and therefore generalizes better, but the corresponding drawbacks are that the Session time cut-off is hard to determine and the correlation between consecutive search terms is implicit or even absent. Mining must be designed around duration and clicks (for example, within a Session the search terms before the final clicked one received no clicks, presumably because they did not meet the specific need).
  • Word alignment: word alignment borrows the idea of translation. Concretely, the Query and the recalled merchant title (with the non-matching remainder removed) are treated as a parallel corpus, and alignment strategies are designed such as word alignment (identical words), pinyin alignment (identical pinyin), and structural alignment (identical position after word segmentation). The drawback of this method is heavy reliance on existing search results.

  • Merchant/product SEO: in the goods scenario, some merchants do SEO when listing products, e.g. “extended dog leash dog leash dog collar dog-walking teddy golden retriever pet large medium small dog chain”. The drawback of this mining source is relatively heavy noise that is also hard to tell apart (hyponym-type, appositive-type, cheating, and other noise types).
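As referenced above, here is a minimal sketch of click co-occurrence mining. It assumes click logs arrive as (query, clicked_doc_id) pairs; the function name, data layout, and threshold are illustrative, not Meituan's production implementation.

```python
from collections import defaultdict
from itertools import combinations

def mine_cooccurring_pairs(click_log, min_common_docs=2):
    """Return candidate rewrite pairs: queries that click the same Documents."""
    queries_by_doc = defaultdict(set)
    for query, doc_id in click_log:
        queries_by_doc[doc_id].add(query)

    pair_counts = defaultdict(int)  # (q1, q2) -> number of shared clicked docs
    for queries in queries_by_doc.values():
        for q1, q2 in combinations(sorted(queries), 2):
            pair_counts[(q1, q2)] += 1

    # Pairs passing the threshold are still noisy candidates (e.g. the
    # "tooth extraction"/"tooth filling" drift) and must go through the
    # filtering and discrimination models described later.
    return {pair: n for pair, n in pair_counts.items() if n >= min_common_docs}

candidates = mine_cooccurring_pairs([
    ("haircut", "poi_1"), ("hair styling", "poi_1"),
    ("haircut", "poi_2"), ("hair styling", "poi_2"),
])  # {("hair styling", "haircut"): 2}
```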

The simple methods above can mine a large number of related word pairs, but their assumptions and hand-designed rules are simple and their pros and cons are obvious. Several optimized mining methods follow.

3.1.2 Mining based on graph methods

Graph methods such as classic collaborative filtering and Graph Embedding build a graph from user-Document relations to recommend more similar Documents in recommendation scenarios. In search, the user's Query and the platform's Documents can likewise be modeled as a graph through clicks and orders. For Meituan search we made two improvements to graph construction: ① edge weights between Query and Document are Wilson-smoothed based on both the number of clicks and the click-through rate, not the raw click count alone, which improves relevance; ② in the bipartite graph, queries the user rewrote within a Session are also treated as Document nodes, composed together with clicked Document titles, which increases the amount of data mined. (A sketch of the Wilson weighting follows.)
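A minimal sketch of the Wilson smoothing in ①, assuming each Query-Document edge carries raw click and impression counts; the z value and the example numbers are illustrative.

```python
import math

def wilson_lower_bound(clicks, impressions, z=1.96):
    """Lower bound of the Wilson score interval for the click-through rate."""
    if impressions == 0:
        return 0.0
    p = clicks / impressions
    denom = 1 + z * z / impressions
    center = p + z * z / (2 * impressions)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * impressions)) / impressions)
    return (center - margin) / denom

# Edges with little evidence are discounted even when their raw CTR is high,
# so sparse, noisy Query-Document edges contribute less to the graph.
weight_popular = wilson_lower_bound(clicks=900, impressions=3000)  # ~0.28
weight_sparse = wilson_lower_bound(clicks=3, impressions=10)       # ~0.11
```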

We first used the SimRank++ algorithm [2] to verify the feasibility of the two graph-construction optimizations. SimRank++ is a similarity measure over the query-click graph; its idea is that if two users are similar, the items associated with them are also similar, and if two items are similar, so are the users associated with them. Its advantages are that Spark can be used for large-scale global optimization and that edge weights can be adjusted as needed. After the graph-construction optimization, the volume of query rewriting data mined by SimRank++ grew by about 30%, while accuracy rose from 72% to 83%.

We then tried graph neural network (GNN) models with the same idea. DeepWalk [3] uses random walks to construct sentence contexts: a graph of Query relations is built, multiple paths are generated by random walks starting from a node, and the queries on each path form a sentence for training. The advantage of random walks is that relations become transitive: unlike plain co-occurrence, queries related only indirectly can be connected, and a small amount of data can generate enough training data through walks. For example, if in Session 1 a user searches Query1 and then switches to Query2, and in Session 2 a user searches Query2 and then switches to Query3, co-occurrence cannot directly associate Query1 with Query3, while a random walk handles this naturally. For rewrite-term mining, the graph-based method is more efficient and accurate than mining word pairs directly from search logs.
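A minimal DeepWalk-style sketch of the transitivity example above: walks over a weighted Query graph act as sentences for Word2Vec, so Query1 and Query3 end up in shared contexts. The graph content, walk parameters, and dimensions are illustrative.

```python
import random
from gensim.models import Word2Vec

graph = {
    "Query1": {"Query2": 5},
    "Query2": {"Query1": 5, "Query3": 3},
    "Query3": {"Query2": 3},
}

def random_walk(graph, start, length=8):
    """Generate one walk; each step samples a neighbor by edge weight."""
    path = [start]
    for _ in range(length - 1):
        neighbors = graph.get(path[-1])
        if not neighbors:
            break
        nodes, weights = zip(*neighbors.items())
        path.append(random.choices(nodes, weights=weights, k=1)[0])
    return path

# Walks act as "sentences": Query1 and Query3 now co-occur in one context
# even though they never co-occurred within a single Session.
walks = [random_walk(graph, node) for node in graph for _ in range(10)]
model = Word2Vec(sentences=walks, vector_size=64, window=3, min_count=1, sg=1)
similar = model.wv.most_similar("Query1", topn=2)
```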

3.1.3 Mining based on semantic vectors

After Word2vec [4], the Embedding idea spread rapidly from NLP to almost every machine learning field, to the point of “everything can be embedded”: any sequence problem allows nodes to be represented through their context. Embedding's advantages for sparse data representation also benefit later deep-learning exploration. Mapping queries into a low-dimensional semantic space and finding related words by Embedding similarity is a common and easily practiced mining method. Besides simply training large-scale word vectors on user reviews and other corpora (Fig. 7a), we tried the following two context constructions in practice:

  1. Building Doc2Vec from merchants recalled by the Query [5]: the merchants recalled or clicked for a Query serve as its context for training the Query Embedding (Fig. 7b). In the Meituan scenario one merchant provides many services and goods, so without considering the Query's own category intent, this method produces a lot of noise.
  2. Building rewrite sequences from user Sessions [6]: a Session sequence serves as the context for training the Query Embedding (Fig. 7c). The advantage is that it fully exploits the user's own query reformulations, so mining coverage and accuracy are both higher.

After obtaining Embeddings under different context designs, the following basic steps further improve accuracy: ① train Query word vectors with fastText after word segmentation; fastText training considers word-level n-gram features, so a Query vector can be obtained by simply summing or averaging its word vectors, and unseen terms remain representable, solving the OOV (out-of-vocabulary) problem; ② represent each word in the target vocabulary by its word vector; ③ use LSH to retrieve candidate words whose vector cosine similarity exceeds a threshold, or improve accuracy with a supervised DSSM two-tower model [7]; ④ further filter with XGBoost plus feature engineering.
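A minimal sketch of steps ①-③, assuming a segmented query corpus; the corpus, vector dimensions, and the similarity threshold are illustrative, and a brute-force scan stands in for LSH retrieval.

```python
import numpy as np
from gensim.models import FastText

corpus = [["learn", "guitar"], ["guitar", "training"], ["piano", "training"]]
model = FastText(sentences=corpus, vector_size=64, window=3, min_count=1)

def query_vector(tokens):
    """Average the fastText word vectors; char n-grams cover OOV tokens."""
    return np.mean([model.wv[t] for t in tokens], axis=0)

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

# Brute-force scan over the target vocabulary in place of LSH.
query = query_vector(["learn", "guitar"])
scored = [(w, cosine(query, model.wv[w])) for w in model.wv.index_to_key]
related = [(w, s) for w, s in sorted(scored, key=lambda x: -x[1]) if s > 0.8]
```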

BERT [8] has profoundly changed the research and application ecosystem of natural language processing since it was proposed. We tried several BERT Embedding approaches, of which the most effective was obtaining word vectors by fine-tuning Sentence-BERT [9] or SimCSE [10] models.

BERT computes semantic similarity through the downstream sentence-pair task: two sentences are joined by special tokens and classified as a whole. The problem is that pairwise combination causes a lot of redundant computation, so this setup is unsuitable for semantic similarity search or unsupervised clustering. Sentence-BERT instead uses a siamese-network framework, feeding the two sentences into parameter-shared BERT models to obtain a representation vector for each sentence, which can then be used for semantic similarity calculation or unsupervised clustering.

Our practice is essentially the same as the Sentence-BERT idea: we construct supervised rewrite-pair training data as in the left part of the figure below, and perform vector computation over historical search queries of different intent types as in the right part.

Compared with the previous methods, the two-tower BERT captures semantics more strongly, and supervised training with some adjustments to the model structure reduces serious drift cases. Moreover, Sentence-BERT depends on neither statistical features nor parallel data, so it transfers and fine-tunes easily in any business, which is very friendly to cold-start scenarios. On this basis we built an offline retrieval pipeline with Faiss [11] vector search, supporting efficient retrieval over a candidate pool at the hundred-million scale. Rewrite candidates constructed this way reach tens of millions or even hundreds of millions of pairs with high measured accuracy. In recent years, contrastive learning and related methods have kept topping the text-representation leaderboards, and both vector construction and vector interaction remain worth continued exploration. (A retrieval sketch follows.)
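The retrieval sketch referenced above: Sentence-BERT embeddings indexed with Faiss. The checkpoint name, candidate pool, and threshold are illustrative assumptions, not the production setup.

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
pool = ["guitar training", "piano training", "haircut", "hair styling"]

vectors = model.encode(pool, convert_to_numpy=True).astype("float32")
faiss.normalize_L2(vectors)                  # normalized IP == cosine similarity
index = faiss.IndexFlatIP(vectors.shape[1])  # exact search for the sketch
index.add(vectors)

query = model.encode(["learn guitar"], convert_to_numpy=True).astype("float32")
faiss.normalize_L2(query)
scores, ids = index.search(query, 3)
candidates = [(pool[i], float(s)) for i, s in zip(ids[0], scores[0]) if s > 0.6]
```

With an IVF or HNSW index in place of IndexFlatIP, the same flow scales to the hundred-million-level candidate pool mentioned above.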

3.2 Semantic discrimination model

3.2.1 BERT semantic discrimination model

The above methods yield tens of millions of similar word pairs, but many semantic-drift cases remain, with coordinate-term drift the most serious. The root cause is that Embedding's shared-context assumption is too strong, while coordinate terms have very similar contexts, both in the merchant and merchant-category context (one merchant usually provides several services) and in the user's Session reformulation context (a user browses several coordinate concepts within one intent). This makes cases like “cello” → “violin” easy to mine and hard to filter by other dimensions such as user click behavior or intent classification.

The graph method, meanwhile, focuses on correlation and neglects semantic drift, and has few relations on low-traffic Query nodes, producing bad cases such as “electric vehicle registration” → “electric vehicle specialty store”, and its similarity scores have no absolute meaning. To filter these hard cases along the semantic dimension, we introduced BERT's semantic information. BERT solves NLP problems with the pretrain-then-fine-tune paradigm; besides the deeper network, its bidirectional language-model design makes better use of context and helps avoid coordinate-term drift. Below are some explorations of BERT sentence-pair tasks in the query rewriting task.

When BERT first came out, we used MT-BERT [12], pre-trained on Meituan-scenario corpora, with mined data plus a small amount of manually annotated data for two-stage tuning of the sentence-pair task. In practice, however, the model trained on existing mined data discriminated poorly on some cases, such as the aforementioned small-edit-distance pairs “cello” → “violin” and “wine” → “grape”. Constructing high-quality positive and negative examples is therefore the key to approaching BERT's performance ceiling on the query rewriting task.

We first tried co-training, a semi-supervised approach focused on improving model performance with large amounts of unlabeled data when labeled data is scarce. Given the high noise of offline mined data, we explored co-training of NMT (Neural Machine Translation) and MT-BERT to improve data quality and model quality simultaneously. The overall system framework is as follows:

The whole process of collaborative training is as follows:

  • Step 1, the BERT discriminant model produces NMT training data: the MT-BERT model, fine-tuned on the offline-mined parallel corpus, predicts over the full candidate data, and pairs above a threshold are returned to NMT as high-quality positive examples.
  • Step 2, fine-tune NMT: add some manually annotated data to the high-quality positives returned by BERT, train the NMT model on them, and evaluate its metrics.
  • Step 3, NMT produces training data for the discriminant model: randomly sample a number of queries, generate Top-N rewrite pairs with the NMT model, and produce the fine-tuning data for the next-stage BERT discriminant model.
  • Step 4, fine-tune the BERT discriminant model: take the K head pairs as positives and X tail pairs as negatives, and fine-tune the BERT discriminant model.
  • Repeat the steps above until both models converge on the evaluation set. (A loop skeleton is sketched below.)
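A skeleton of the loop sketched above; `bert` and `nmt` are hypothetical trainable objects, and every method name and threshold here is an illustrative pseudo-interface rather than a real API.

```python
def co_train(bert, nmt, mined_pairs, labeled_pairs, sample_queries,
             threshold=0.9, k_head=3, x_tail=3, max_rounds=5):
    for _ in range(max_rounds):
        # Step 1: BERT keeps high-confidence mined pairs as NMT positives.
        positives = [p for p in mined_pairs if bert.score(p) >= threshold]
        # Step 2: fine-tune NMT on BERT positives plus manual annotations.
        nmt.fine_tune(positives + labeled_pairs)
        # Steps 3-4: NMT's best-first rewrites supply BERT's next tuning set:
        # head pairs become positives, tail pairs become negatives.
        head, tail = [], []
        for q in sample_queries:
            ranked = nmt.generate(q, top_n=10)
            head += [(q, r) for r in ranked[:k_head]]
            tail += [(q, r) for r in ranked[-x_tail:]]
        bert.fine_tune(positives=head, negatives=tail)
        if bert.converged() and nmt.converged():  # checked on the eval set
            break
```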

In practice, co-training converged after three iterations; BERT and NMT both improved significantly on the manually constructed benchmark set in the first two rounds, and the final result is clearly better than directly tuning on the mined training samples plus manually annotated data.

Co-training effectively solves cases with high literal similarity, such as “wine” → “grape”, but cases frequent in the noisy data with no obvious literal match yet very similar contexts, such as “Maqin” → “erhu”, still remain. We mine such hard cases with relation extraction: for example, Pattern-based methods mine coordinate-term negatives from sentence patterns like “A, B, C, etc.” mentioned in UGC, and after filtering we obtain high-quality coordinate-term negative examples. After this negative-data optimization, model accuracy improved further.

Finally, the training of the BERT semantic discriminant model has four stages: ① unsupervised: continued pre-training of the BERT model on Meituan-scenario corpora; ② semi-supervised: tuning via co-training on algorithm-mined data; ③ sample-enhanced supervision: tuning on manually mined high-quality negatives; ④ final tuning on manually annotated data. The final model accuracy exceeds 94%, and a large number of semantic-drift cases are solved.

3.2.2 BERT semantic discriminant model for goods

As Meituan's business scenarios grew richer, the share of e-commerce-style search traffic and supply began to rise, and mis-rewriting in the goods domain increased. Analyzing users' queries and rewrite cases showed that the model does not transfer well to the goods domain. Beyond training-data coverage, the main reason is that in goods search the user's rewrite requirement is “the same thing”, demanding higher rewriting accuracy, whereas in merchant search the requirement is only “expressing the same need”. Also, from the Document side, the goods recall field is relatively simple, without the one-merchant-many-services problem of merchant search; with the scenario simplified, the algorithm has more room. We therefore optimized the rewriting discriminant model for the goods domain separately:

  • Training-data construction: the first problem for the goods rewriting model is the lack of training samples. Using rule methods such as SPU co-occurrence and vector recall, continuously supplemented by manual quality inspection and annotation, and mining via category combination, clicks, and UGC-constructed hard negatives, we eventually built millions of high-quality training samples.
  • Data enhancement: Random Negatives, Batch Negatives, and Hard Sample Negatives are mixed during training-time sampling to strengthen the model's ability to recognize mis-rewrites and its robustness.
  • Model-structure optimization: starting from the baseline BERT sentence-pair structure, we tried R-Drop [13] and Child-Tuning [14] to improve expressiveness, lifting overall F1 by 2.5pp. (An R-Drop sketch follows this list.)
  • Graph-vector fusion: we tried building a graph model from search results and strengthening discrimination with actual online results. Entities recognized in the titles of online-recalled goods become nodes composing a graph with the Query, and the objective is to predict the edge type between the Query and each recalled entity. The Graph Embeddings produced by GCN [15] and GAT [16] are fused into the BERT sentence-pair model via vector pooling, finally improving F1 by 1.6pp and resolving ambiguity cases such as “baobao” being rewritten to “doll” and wrongly recalling “toy doll”.
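The R-Drop sketch referenced in the model-structure bullet, in PyTorch. It assumes a HuggingFace-style sentence-pair classifier whose internal dropout makes two forward passes differ; `alpha` and the model interface are illustrative.

```python
import torch
import torch.nn.functional as F

def r_drop_loss(model, input_ids, attention_mask, labels, alpha=4.0):
    # Two forward passes over the same batch see different dropout masks.
    logits1 = model(input_ids, attention_mask=attention_mask).logits
    logits2 = model(input_ids, attention_mask=attention_mask).logits

    ce = 0.5 * (F.cross_entropy(logits1, labels) + F.cross_entropy(logits2, labels))

    # Symmetric KL between the two perturbed predictions regularizes the
    # model toward consistent outputs under dropout.
    p1 = F.log_softmax(logits1, dim=-1)
    p2 = F.log_softmax(logits2, dim=-1)
    kl = 0.5 * (F.kl_div(p1, p2.exp(), reduction="batchmean")
                + F.kl_div(p2, p1.exp(), reduction="batchmean"))
    return ce + alpha * kl
```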

3.3 Online Services

Through the mining methods above plus the discriminant model's accuracy improvements, we obtain roughly ten million high-quality rewrite pairs. However, serving these as an online dictionary generalizes poorly. The following explains how online models further improve the overall effect of query rewriting.

Meituan's online query rewriting has several solutions: ① high-precision dictionary rewriting; ② high-precision model rewriting (statistical translation model + XGBoost ranking model); ③ end-to-end generative rewriting with semantic NMT (neural machine translation) covering long-tail queries; ④ online vectorized retrieval covering merchant-name search traffic.

Dictionary rewriting is the general method; note that synonym substitution must consider context. For example, “people” and “civilian” can be synonyms in isolation, but rewriting “People's Pharmacy” to “Civilian Pharmacy” is a serious drift. Classifying rewrite types in the dictionary and designing strategies that combine entity-recognition information solves most such problems. The next sections introduce the other three online modules adapted for Meituan search queries. (A context-guard sketch follows.)
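A minimal sketch of the context guard described above, assuming an upstream NER signal; the dictionary, tag names, and guard rule are illustrative.

```python
SYNONYMS = {"people": ["civilian"]}

def rewrite_with_context(tokens, ner_tags):
    """Skip synonym substitution for tokens inside a merchant-name entity."""
    rewrites = []
    for i, (token, tag) in enumerate(zip(tokens, ner_tags)):
        if tag == "MERCHANT_NAME":      # e.g. "people" in "People's Pharmacy"
            continue                    # substituting here would drift
        for synonym in SYNONYMS.get(token, []):
            rewrites.append(tokens[:i] + [synonym] + tokens[i + 1:])
    return rewrites

# "people" as a standalone need is safely rewritten ...
rewrite_with_context(["people"], ["O"])                      # [["civilian"]]
# ... but not when NER marks it as part of a merchant name.
rewrite_with_context(["people", "pharmacy"],
                     ["MERCHANT_NAME", "MERCHANT_NAME"])     # []
```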

3.3.1 SMT (Statistical Machine Translation)

The problem with rewriting a whole Query via offline mining is insufficient coverage, but the short terms contained in a Query can still be rewritten, as in the common life-service example “XX is broken” = “repair XX”. Following this thought, query rewriting can be abstracted as a typical machine translation task: let f be the user's search term and e the target rewrite. SMT as a whole can be framed as a noisy-channel model and solved via Bayes' rule:


$$\tilde{e} = \underset{e \in e^*}{\arg\max}\; p(e \mid f) = \underset{e \in e^*}{\arg\max}\; \frac{p(f \mid e)\, p(e)}{p(f)} = \underset{e \in e^*}{\arg\max}\; p(f \mid e)\, p(e)$$

Of the two probabilities, p(f|e) is the translation (transfer) model and p(e) is the language model. The rewriting model thus splits into two parts: word-alignment candidates are obtained from the existing ten-million-scale parallel corpus, and the decoding (translation) model and the language model are trained separately. During construction, we made some customized optimizations for the cases we encountered:

  • Alignment-dictionary filtering: because the parallel corpus is noisy and alignment produces incorrect data, we filter the generated alignment table with the offline-trained BERT semantic discriminant model plus rules (such as the category-distribution cross entropy of the two terms and whether one contains the other).
  • Structured decoding model: SMT decoding uses the BeamSearch algorithm, whose parameters split into two parts: ① the mapping-probability parameters, which add features measuring similarity between two terms on top of the alignment-table score; ② the transition-probability parameters, which add term-cohesion information on top of the language model.
  • Ranking model: the ultimate goal of rewriting is to recall more relevant, high-quality merchants, goods, and services, so after SMT generates many rewrite candidates, two questions remain: semantic relevance and effectiveness. The solution is an XGBoost ranking model whose features cover both semantic relevance and the statistical business effect of the rewrite's recall results. For semantic relevance, features include the click and text features of the original and rewritten terms, the candidate pair's edit distance, text semantic similarity score, and Session transition counts and time intervals. For effectiveness, regional and popularity information is introduced, including exposure, click, and order co-occurrence features at the Document and Document-category level. The ranking model keeps the Top-3 quality rewrites, reducing pressure on the search index and effectively ensuring that the rewrites recall local, high-quality, relevant results. (A noisy-channel scoring sketch follows this list.)
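The scoring sketch referenced above: ranking rewrite candidates e for an input f by log p(f|e) + log p(e), per the Bayes decomposition in 3.3.1. The tables, example terms, and weight are illustrative stand-ins for the trained alignment table and language model.

```python
import math

TRANSLATION = {("is broken", "repair"): 0.4, ("is broken", "sell"): 0.05}
LM = {"repair phone": 0.02, "sell phone": 0.001}  # stand-in language model

def channel_score(f_term, e_term, e_query, lm_weight=1.0):
    """Noisy-channel score: log p(f|e) + lm_weight * log p(e)."""
    p_f_given_e = TRANSLATION.get((f_term, e_term), 1e-9)
    p_e = LM.get(e_query, 1e-9)
    return math.log(p_f_given_e) + lm_weight * math.log(p_e)

# "phone is broken" -> candidate rewrites via the aligned short-term table;
# in production, a beam search expands and scores candidates incrementally.
candidates = [("repair", "repair phone"), ("sell", "sell phone")]
ranked = sorted(candidates,
                key=lambda c: channel_score("is broken", c[0], c[1]),
                reverse=True)
```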

Finally, the overall online framework is similar to the classic Learning to Rewrite framework in the industry [17-18]. After the model went online, the coverage of traffic carrying rewrites grew by nearly 12%, with considerable gains in QV_CTR and other metrics.

3.3.2 NMT (Neural Machine Translation)

After introducing synonym substitution and SMT statistical translation rewriting, the share of online traffic carrying rewrites approached 70%, but coverage of mid- and long-tail queries was still insufficient, for two kinds of reasons:

  1. Mining-efficiency problems introduced by segmentation granularity: both synonym substitution and SMT translation over short terms depend on segmentation results and candidates. However, Chinese synonymy is often a transformation of just one character or word, such as “learn XX” → “train XX”; rewriting at the single-character level easily causes bad cases, yet we cannot mine every “learn XX” instance to lift coverage.
  2. Semantically similar but unalignable complex Query rewriting: when the user enters a natural-language Query such as “where are cheap mobile phones” while the merchant side writes “mobile phone discounts”, the word-fragment candidate approach generalizes too weakly to solve it.

Based on the above problems, a generative rewriting model independent of candidates is needed, and we consider using the deep semantic translation model NMT to solve such problems.

At the end of 2016, Google published its neural machine translation system (GNMT) [19], declaring that neural machine translation had surpassed the IBM SMT paradigm (phrase-based machine translation) dating to 1989; this leap was driven by the Sequence-to-Sequence (Seq2Seq) end-to-end model with the Attention mechanism [20]. In practice, however, NMT-generated rewrites show two problems: rewrites that do not conform to semantics (obscure or not fluent) and rewrites with semantic drift. These make the effective share of new online rewrites low and can even cause serious drift cases. Introducing NMT rewriting therefore had to be combined with the search scenario to address both problems, with the goal of generating rewrites with no intent drift that produce real recall effects. From this analysis, the general aim became guiding NMT to generate higher-quality rewrites by introducing environmental factors, and from that angle we investigated reinforcement learning.

Reinforcement learning is a loop in which an Agent takes Actions that change its State, obtains Rewards, and interacts with the Environment. We wanted to apply this idea by taking the pre-trained NMT rewriting model as the Agent: during the reinforcement learning iteration, the rewrites it generates (Actions) pass through the search system (Environment) to produce final exposure and Reward, guiding NMT to optimize its model parameters (State).

After further investigation, we referred to Google's QA system [21] and Zhihu's work [22], which use reinforcement learning with the search system as the Environment and the rewriting model as the Agent, so that the quality of the main search results is taken into account. However, because ranking in the Meituan scenario is strongly tied to factors such as position and user, we could not use the whole main search as the Environment's feedback mechanism for recalling and up-ranking rewrites; besides, requesting online ranking would cause a series of engineering problems such as slow training. Combined with NMT's actual behavior, we prioritized the semantic similarity of generated rewrites, using main-search recall logs and the BERT semantic discriminant model as the Environment, targeting the overlap of merchant sets recalled through the search system and natural semantic similarity. The final overall framework is shown below:

The algorithm module design and process are described in detail below:

Step 0: pre-train the NMT generator

  • Model: the generator is a pre-trained NMT with the classic Seq2Seq structure. Since users' search terms are short in real search scenarios, the initial Embedding is based on word segmentation, and Attention is introduced between Encoder and Decoder; the Attention mechanism can replace SMT's alignment mechanism and suits the query rewriting task well. We also initialize the Encoder from a fine-tuned BERT, which brings in more semantic information and speeds up model convergence.
  • Pre-training data: pre-training is critical to the whole system. Because we simulate the search system with offline history logs, if the generator produces very rare rewrites during reinforcement learning, the Reward becomes sparse and the system fails to converge or improve. We therefore optimized the data as follows. Since NMT's strength is semantic generalization, we removed proper nouns and aliases such as merchant names and addresses from the offline-mined data. But because reinforcement learning “punishes” rewrites identical to the original word, a side effect was that proper nouns such as product names drifted to other goods; so we added a small number of trade-name identity pairs (rewrite equals original word) to the training data to limit the model's generalization on this type of Query.
  • Model optimization: short-Query translation is prone to over-translation (repeatedly translating some words) and under-translation (dropping words). Over-translation has two causes: noisy or insufficient training corpora; and, since NMT is a sequence model, information lost after an OOV occurrence propagates through later decoding. Besides enriching and cleaning the training data, we introduce the Coverage mechanism: a Coverage Vector is maintained for each source word to indicate how much it has been translated, and during decoding the coverage information is fed into the Attention Model so that untranslated source words receive more attention. Under-translation arises because many one-to-many rewrite pairs in the training corpus prevent NMT from learning accurate alignments; it can be mitigated by introducing edit distance into the reinforcement learning reward.

Step 1: feed the original word and the rewritten word into the environment to compute feedback

From the user's perspective, a good rewrite should preserve the semantics: the merchants newly recalled by the rewrite should be very similar to those recalled by the original word, and the distribution of merchants users click should be fairly consistent. To be effective, the rewrite should also be a popular, fluent search term with rich recall results. From this analysis, highly similar rewrites should receive positive feedback, while dissimilar or incoherent ones should receive negative feedback. The environment therefore has two parts, an offline-search-log simulated search system and a BERT semantic discrimination system, giving feedback on the generated Action from the search-and-user-interaction view and the semantic view respectively.

  • Offline search logs simulate the search system: the offline search log contains the merchant list finally exposed for a search and the user's clicks on that list. Collecting logs over a long period, we assume historical queries are rich enough and store two wide tables: each historical Query's recalled-merchant ID list and clicked-merchant ID list. During reinforcement learning, we look up the historical recall and click ID lists of the original word and of the NMT-generated rewrite in these tables; the process can be seen as retrieving recall and click one-hot vectors representing each search term, and the overlap of the two vectors gives a mathematical expression of recall similarity and user-intent similarity. This seems to give a reasonable environment output, but several problems remain. First, an original word may be missing from the historical queries; we designed a small fixed positive feedback for an NMT rewrite identical to the original word. Second, the rewrite may be absent from the historical queries; since a rewrite should differ from the original word yet not be rare, we give a fixed negative feedback when the rewrite cannot be found. In addition, the similarity computation is smoothed with a Sigmoid to capture larger gradients of change.
  • BERT simulates the semantic discrimination system: why is a semantic discriminator needed alongside the simulated search system? Because a user's click on the merchant list does not necessarily express their intent. On the one hand, as mentioned earlier, one merchant may offer several semantically coordinate services; for example, most dental hospitals provide both “tooth extraction” and “tooth filling”, so the recall and click overlap between those two search terms is large. On the other hand, the existing search system may contain wrong rewrites, especially when the rewrite is a popular search term or a substring of the original word, where clicks may be driven by attractive images or businesses rather than the user's original intent. Introducing semantic discrimination into the reinforcement learning system avoids such blind spots of the search system and user behavior, and also alleviates Reward sparsity to some extent.

Step 2: weight the feedback produced by the environment

The Reward is generated by weighted superposition of the environment's feedback scores followed by normalization. We iterated the weighting design over several rounds based on business scenarios and observed problems, assigning different weights to search-system feedback, user behavior, semantic discrimination, and literal match, and finally normalizing to between 0 and 1 as the final feedback. (A feedback-computation sketch follows.)
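The feedback sketch referenced above, assuming wide tables mapping historical queries to merchant-ID sets; the weights, fixed feedback values, and overlap measure are illustrative assumptions.

```python
import math

def overlap(a, b):
    """Jaccard overlap of two historical merchant-ID sets."""
    return len(a & b) / (len(a | b) or 1)

def smooth(x):
    """Sigmoid smoothing so small similarity changes yield larger gradients."""
    return 1.0 / (1.0 + math.exp(-8 * (x - 0.5)))

def reward(orig, rewrite, recall_tbl, click_tbl, bert_score, literal_sim,
           w=(0.35, 0.25, 0.3, 0.1)):
    if rewrite == orig:
        return 0.1                 # small fixed positive for identity rewrites
    if rewrite not in recall_tbl:
        return 0.0                 # fixed negative: the rewrite is too rare
    recall_sim = smooth(overlap(recall_tbl[orig], recall_tbl[rewrite]))
    click_sim = smooth(overlap(click_tbl.get(orig, set()),
                               click_tbl.get(rewrite, set())))
    parts = (recall_sim, click_sim, bert_score, literal_sim)
    return sum(wi * p for wi, p in zip(w, parts))  # normalized to [0, 1]
```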

Step 3: fold the discriminator's score into the Loss to continue training the generator

Based on the discriminator's score, the Reward is superimposed on the Loss when fine-tuning NMT. We compared several methods here; Google's approach of normalizing by batch-average sentence length and superimposing onto the averaged Loss worked best.

  • Steps 1-3 are repeated over the reinforcement learning Query corpus until the model converges.

Analyzing the post-launch effect, NMT with reinforcement learning can solve semantic-type rewrites, uncommon abbreviations (e.g. “sweet shop” → “French dessert”), shorthand and input errors (e.g. “yoga coach” → “yoga instructor”, “sand bath” → “sauna bath”), and natural-language queries (“where to buy hair dye” → “hair dye”, “which hospital is good for freckle removal” → “freckle-removal hospital”).

Overall, reinforcement learning improves rewrite quality, especially accuracy and effectiveness after introducing search-system and user feedback on relevance: offline evaluation accuracy rose from 69% to 87%. Online, it improves coverage of complex long-tail Query rewriting without introducing BadCases that might hurt user experience, and delivers good gains in QV_CTR and other metrics for Meituan's long-tail search queries.

3.3.3 Online vectorization recall

With the recent academic popularity of vector representation methods such as Sentence-BERT and SimCSE, more and more companies have attempted large-scale applications of vectorized recall, e.g. Facebook [23], Taobao Search [24], and JD [25]. Thanks to the strong expressiveness of pre-trained models, it generalizes better and is more accurate than traditional methods such as DSSM.

Using vector recall in the rewriting scenario has two advantages. First, the Query and the rewrite term are both short, of similar length, and consistent in semantics and type, so parameter-shared twin towers guarantee a certain accuracy. Second, rewrites are retrieved from a candidate pool rather than generated, which controls their validity and constrains their semantic type. Analyzing Meituan search's missed recalls showed that precise merchant-name searches suffer relatively badly; moreover, in Meituan's situation merchants offer rich services and the long Document-side text carries scattered intents. We therefore first tried vectorized recall (hereafter “fuzzy rewriting”) on the no-result cases caused by text mismatch under merchant-name intent, with very good results, described in detail below.

First we delineated such cases: the problem fuzzy rewriting must solve is no-result or missed-recall situations caused by text mismatch or NER segmentation errors when the user has a clear merchant intent, i.e. the intent is clear but the Query expression is fuzzy. For example, searching “Jiujiang heniu braised pork” fails to recall the POI “Jiujiang craft barbecue”, and searching “Ningbo rice small train” fails to recall the POI “Ningbo Train Laisi Theme Park”. Such problems mix many kinds of text variation and are hard to solve within the existing structured recall framework. With the problem boundary fixed, these cases have three characteristics: ① the Query varies freely, but the merchant-name recall pool is limited and fixed; ② both the Query and the merchant name are short texts, well suited to vectorized recall; ③ the approach can escape the limits of the existing Boolean retrieval framework and avoid missed recall caused by naive text matching. We therefore built a fuzzy-rewriting pipeline based on vector recall:

The following two parts introduce the core model of fuzzy rewriting and the online service processing flow.

Model structure: the model adopts a two-tower design with a Query tower Q and a POI tower P. For a given Query and POI, q denotes the input Query text and p the input POI title text, and the model computes:


$$f(q, p) = G(Q(q), P(p))$$

Where Q(q) denotes the BERT tower outputting the Query Embedding, P(p) denotes the BERT tower outputting the POI-title Embedding, and G denotes the scoring function, e.g. Inner Product or L2 Distance.
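A minimal two-tower sketch of f(q, p) = G(Q(q), P(p)), assuming HuggingFace BERT towers; the checkpoint name and the inner-product choice for G are illustrative.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
query_tower = AutoModel.from_pretrained("bert-base-chinese")
poi_tower = AutoModel.from_pretrained("bert-base-chinese")  # params may be shared

def embed(tower, text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    hidden_states = tower(**inputs, output_hidden_states=True).hidden_states
    # Avg-pool the second-to-last layer, per the pooling finding below.
    return hidden_states[-2].mean(dim=1).squeeze(0)

def score(query, poi_title):
    q_vec = embed(query_tower, query)
    p_vec = embed(poi_tower, poi_title)
    return torch.dot(q_vec, p_vec)  # G as Inner Product; L2 Distance also works
```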

Vector Pooling: given BERT's property that layers farther from the downstream task head generalize better, several experiments verified that Avg Pooling over the second-to-last layer's vectors yields higher precision and recall.

Negative Sampling: Facebook's paper Embedding-based Retrieval in Facebook Search [23] emphasizes the negative-sample distribution problem. We sample negatives in three ways: Random Negatives, Batch Negatives, and Hard Sample Negatives, with a set of hyperparameters to adjust their proportions. More Random Negatives raise the quality of recalled merchants but lower relevance; Batch Negatives, added after introducing SimCSE, improve relevance; Hard Sample Negatives use rules to screen merchant names with very similar text, mixing wrong negatives and correct pairs into each training round at a low ratio to further strengthen relevance expression. (A sampling sketch follows.)
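The sampling sketch referenced above, mixing the three negative types per training example; the ratio hyperparameters and data layout are illustrative assumptions.

```python
import random

def build_negatives(query, positives_in_batch, candidate_pool, hard_pool,
                    n_random=3, n_batch=2, n_hard=1):
    negatives = []
    # Random Negatives: sampled from the whole merchant-name pool.
    negatives += random.sample(candidate_pool, n_random)
    # Batch Negatives: other queries' positives within the same batch.
    negatives += random.sample(positives_in_batch,
                               min(n_batch, len(positives_in_batch)))
    # Hard Sample Negatives: rule-screened merchant names textually similar
    # to the query's true POI, mixed in at a low ratio.
    hard = hard_pool.get(query, [])
    negatives += random.sample(hard, min(n_hard, len(hard)))
    return negatives
```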

Loss function: we use a Pointwise binary cross-entropy loss, because with standard merchant-name labels, a model trained to predict the rewritten merchant name as “absolutely correct” outperforms a Pairwise model that only predicts it as “relatively correct”; the comparative experiments bore this out.

Online service construction: as shown in Figure 12, the online service has four parts: a pre-positioned traffic-partition module, online Query-side text vectorization, ANN vector retrieval, and post-rules.

  • Pre-positioned traffic-partition module: this module controls the service-invocation logic of fuzzy rewriting, calling it only when traditional text recall returns no results under merchant-name search intent. This both safeguards the effect and reduces traffic pressure on the downstream TFServing vector-prediction service and the vector retrieval service.
  • Online Query-side text vectorization: the online performance of pre-trained models has always been one of the difficulties of landing large NLP models in search systems. On the model side we tried FasterTransformer and converted the model to FP16 precision for acceleration. Besides the overall service cache, since the Query vector is independent of the city, this module adds another cache layer to further reduce real-time calls.
  • ANN retrieval: vector retrieval uses the Antler vector retrieval engine developed by the Meituan search team. Built on the Faiss library, the service implements IVFFlat, HNSW, and other vector retrieval algorithms, and supports distributed retrieval, real-time indexing, multi-field sharding, vector quantization, scalar filtering, and other capabilities, providing high-performance multi-field retrieval so fuzzy rewriting can search different POI pools in different cities. ANN parameter tuning also contributes about 2pp to the overall effect.
  • Post-rules: simple text-filtering rules such as edit distance and simple term-weight strategies preferentially guarantee the relevance of the core part of the merchant name, further improving the effect of fuzzy rewriting.

After fuzzy rewriting launched, target cases such as “Jiujiang heniu braised pork” failing to recall the POI “Jiujiang craft barbecue” are solved very well; generalization is strong when users' merchant-name searches contain typos, extra words, or missing words, and adding synonyms to the training data also fixes missed recalls caused by partial word substitution and synonym replacement. In online metrics, QV_CTR, the no-result rate, and long-tail BadCases all improved substantially, effectively improving the search experience for this slice of traffic.

Beyond the engineering experience accumulated around vector retrieval, we attribute this project's success to defining a clear problem boundary through problem discovery and classification, making the right technology selection, and limiting the application of the vector recall signal to where its strengths apply while avoiding its weaknesses, ultimately yielding more than expected. Vector retrieval has been tried across the industry in recent years; we believe merchant search traffic, as well as goods and service search traffic, still holds huge mining space, and combined with the many business difficulties of Meituan's merchant and service scenarios, there are many model variants worth trying. We will introduce our exploration of online vectorized retrieval in a follow-up article; stay tuned.

3.4 Platformization of the query rewriting service capability

Through the iterations described above, the query rewriting project has delivered good business benefits across Meituan search's different development periods. As its technical influence expanded, it gradually established cooperation with business parties such as Dianping App search, takeout App search, and search advertising, and accumulated corresponding data and technical capabilities for goods, takeout, and hotel/travel searches on Meituan. Each business also raised some requirements unique to its own development stage and characteristics. We therefore abstracted the core capabilities of query rewriting, with the whole technical framework developing in the following directions:

  • Data refinement: at the data level, several core businesses are distinguished, and the word relations provided cover semantic dimensions such as synonymy, hypernymy/hyponymy, coordinate terms, and irrelevance. Synonyms can serve search recall, recommendation, advertising, and more; irrelevant word pairs can serve as training negatives for relevance or ranking models. Through continuous mining plus model and manual verification, data of different precision levels can be organized and deliver the expected gains as ranking features.
  • Algorithm tooling: provide comprehensive algorithm mining tools for data coverage, iterate model accuracy for semantic discrimination, solve short-text ambiguity problems in combination with application scenarios, and track and explore cutting-edge industry methods.
  • Online service operation and maintenance: the online service supports fast business onboarding, online A/B experiments, manual intervention, and so on.

4. Summary and outlook

This article introduced the exploration and practice of the query rewriting task in the Meituan scenario: combining real business scenarios and user needs, we explored cutting-edge techniques such as semantic discriminant models, semantic retrieval models, and graph models in vertical-domain search recall, and accumulated phrase-association cognition data for the life-service field. For offline data, it covered mining methods based on strategies, statistical translation, graph methods, and Embedding, summarizing each method's motivation, effect, and pros and cons in practice. For online models, combining the structured retrieval characteristics of vertical search, we designed four online solutions: high-precision dictionary rewriting, high-precision model rewriting (SMT statistical translation model plus XGBoost ranking model), a reinforcement-learning-optimized NMT model covering long-tail queries, and vectorized recall for merchant search.

Currently, traffic carrying rewrites accounts for 73% of Meituan App searches and 67% of Dianping App searches. The query rewriting capability and service platform support search within each business channel and the search advertising platform, with good returns. The Query rewriting service cluster's peak QPS (Queries Per Second) has reached 60,000. We will keep investing in R&D to strengthen our technical influence within the company and across the industry.

How to better connect users and services, businesses and commodities on the platform is a problem that needs long-term and multifaceted investment to solve. We may conduct future iterations in the following directions:

  1. Further exploration of vector retrieval: in the life-service scenario, user needs and the services on the platform are vast and will become ever more detailed and diverse. In recent years contrastive learning and related methods keep refreshing the text-representation leaderboards, and the industry keeps exploring and deploying online vector recall. We see much room for growth here and will continue to invest.
  2. Semantic discriminant model exploration: semantic understanding is inseparable from context, especially for short search terms. Similar to the earlier Graph Embedding attempts, multimodal Embedding and other methods may in future better solve semantic discrimination in the search context.
  3. Generative rewriting exploration: reinforcement learning could also try SeqGAN [26], using better generators to solve rewriting for long-tail search terms.
  4. Further refining word-relation capability building: in cooperating with the various businesses, we found every kind of word relation useful; they can serve recall, relevance, ranking, and other modules according to correlation strength. In this respect, Alibaba's AliCoCo [27] in the goods domain is currently well defined. Word relations in the life-service field are even richer, and we hope to build the largest, most detailed, and richest structured word-relation data in the life-service field to better serve users.

5. Author profile

Yang Jian, Zong Yu, Xie Rui, and Wu Wei, all from the NLP team of Meituan Platform's Search and NLP Department.

6. References

  • [1] Wen Lihong, LUO Xingchi et al. Exploration and practice of NER technology in Meituan Search.
  • [2] Antonellis, Ioannis, Hector Garcia-Molina, and Chi-Chao Chang. “Simrank++: query rewriting through link analysis of the click graph (poster).” Proceedings of the 17th international conference on World Wide Web. 2008.
  • [3] Perozzi, Bryan, Rami Al-Rfou, and Steven Skiena. “Deepwalk: Online learning of social representations.” Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. 2014.
  • [4] Mikolov, Tomas, et al. “Efficient estimation of word representations in vector space.” arXiv preprint arXiv:1301.3781 (2013).
  • [5] Grbovic, Mihajlo, et al. “Context-and content-aware embeddings for query rewriting in sponsored search.” Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval. 2015.
  • [6] Djuric, Nemanja, et al. “Hierarchical neural language models for joint representation of streaming documents and their content.” Proceedings of the 24th international conference on world wide web. 2015.
  • [7] Shen, Yelong, et al. “Learning semantic representations using convolutional neural networks for web search.” Proceedings of the 23rd international conference on world wide web. 2014.
  • [8] Devlin, Jacob, et al. “Bert: Pre-training of deep bidirectional transformers for language understanding.” arXiv preprint arXiv:1810.04805 (2018).
  • [9] Reimers, Nils, and Iryna Gurevych. “Sentence-bert: Sentence embeddings using siamese bert-networks.” arXiv preprint arXiv:1908.10084 (2019).
  • [10] Gao, Tianyu, Xingcheng Yao, and Danqi Chen. “SimCSE: Simple Contrastive Learning of Sentence Embeddings.” arXiv preprint arXiv:2104.08821 (2021).
  • [11] Johnson, Jeff, Matthijs Douze, and Hervé Jégou. “Billion-scale similarity search with gpus.” IEEE Transactions on Big Data (2019).
  • [12] Yang Yang, Jia Hao et al. Meituan BERT’s exploration and practice.
  • [13] Liang, Xiaobo, Lijun Wu, Juntao Li, et al. “R-Drop: Regularized Dropout for Neural Networks.” arXiv preprint arXiv:2106.14448 (2021).
  • [14] Xu, Runxin, et al. “Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning.” arXiv preprint arXiv:2109.05687 (2021).
  • [15] Kipf, Thomas N., and Max Welling. “Semi-supervised classification with graph convolutional networks.” arXiv preprint arXiv:1609.02907 (2016).
  • [16] Veličković, Petar, et al. “Graph attention networks.” arXiv preprint arXiv:1710.10903 (2017).
  • [17] Yin, Dawei, et al. “Ranking relevance in yahoo search.” Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016.
  • [18] He, Yunlong, et al. “Learning to rewrite queries.” Proceedings of the 25th ACM International Conference on Information and Knowledge Management. 2016.
  • [19] Wu, Yonghui, et al. “Google’s neural machine translation system: Bridging the gap between human and machine translation.” arXiv preprint arXiv:1609.08144 (2016).
  • [20] Vaswani, Ashish, et al. “Attention is all you need.” Advances in neural information processing systems 30 (2017): 5998-6008.
  • [21] Buck, Christian, et al. “Ask the right questions: Active question reformulation with reinforcement learning.” arXiv preprint arXiv:1705.07830 (2017).
  • [22] The application of Query understanding and semantic recall in Zhihu search.
  • [23] Huang, Jui-Ting, et al. “Embedding-based retrieval in facebook search.” Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2020.
  • [24] Li, Sen, et al. “Embedding-based Product Retrieval in Taobao Search.” arXiv preprint arXiv:2106.09297 (2021).
  • [25] Zhang, Han, et al. “Towards personalized and semantic retrieval: An end-to-end solution for e-commerce search via embedding learning.” Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2020.
  • [26] Yu, Lantao, et al. “Seqgan: Sequence generative adversarial nets with policy gradient.” Proceedings of the AAAI conference on artificial intelligence. Vol. 31. No. 1. 2017.
  • [27] Luo, Xusheng, et al. “AliCoCo: Alibaba e-commerce cognitive concept net.” Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 2020.


This article was produced by the Meituan technical team, and its copyright belongs to Meituan. You are welcome to reprint or use its content for non-commercial purposes such as sharing and communication, with the note “Content reprinted from the Meituan technical team”. It may not be reproduced or used commercially without permission. For any commercial use, please email [email protected] to obtain authorization.