Knowledge-Based Question Answering (KBQA) refers to semantically understanding and analyzing a natural language question, then querying and reasoning over a knowledge base to produce the answer. Across the full chain of Meituan's platform services, users raise a large number of consulting questions before, during, and after a sale. Built on a question answering system, automatic intelligent replies and recommended replies help merchants answer users' questions more efficiently and solve their problems faster. Combining KBQA practice in Meituan scenarios with a paper published at EMNLP 2021, this article introduces the overall design of the KBQA system, the breakthroughs on key difficulties, and our exploration of end-to-end Q&A, in the hope that it will be helpful or inspiring to readers working on related problems.
1 Background and Challenges
A Question Answering (QA) system is an advanced form of information retrieval system that can answer users' questions, posed in natural language, accurately and concisely. The research area arose mainly from people's need to obtain information quickly and accurately, so QA is widely used across business scenarios in industry. Meituan users have a large number of questions to ask merchants before, during, and after a sale, across the full chain of platform services. Building on a question answering system, we help merchants answer users' questions more efficiently and solve their problems faster by means of automatic intelligent replies or recommended replies.
For different questions, Meituan’s intelligent Q&A system contains multiple solutions:
- PairQA: based on information retrieval, returns the answer to the most similar question already answered in the community.
- DocQA: based on reading comprehension, extracts answer spans from merchants' unstructured information and user comments.
- KBQA (Knowledge-Based Question Answering): reasons out answers from the structured information of merchants and commodities, based on knowledge-graph question answering technology.
This article mainly shares our practice and exploration in bringing KBQA technology to production.
Users' questions include a large amount of consulting about basic information and policies related to commodities, merchants, scenic spots, hotels, and so on. KBQA technology can effectively use the information on commodity and merchant detail pages to answer this kind of information-consulting question. After a user enters a question, the KBQA system analyzes and understands it with machine learning algorithms, queries and reasons over the structured information in the knowledge base, and finally returns the exact answer to the user. Compared with PairQA and DocQA, KBQA's answers are mostly based on merchant data and are therefore more reliable. It can also perform multi-hop queries and constraint filtering, so it handles complex online questions better.
In practical application, KBQA system faces various challenges, such as:
- Wide range of business scenarios: Meituan's platform covers more than ten life-service businesses including hotel, travel, and food, and user intent differs across scenarios. For "roughly how much does breakfast cost", for example, a food merchant should answer with the price per person, while a hotel merchant should answer with the breakfast price of the hotel restaurant.
- Constrained questions: users' questions usually carry conditions. For example, "does the Palace Museum offer discounts to students?" requires us to filter the preferential policies associated with the Palace Museum rather than return all of them.
- Multi-hop questions: the question involves a path composed of multiple nodes in the knowledge graph. For "when does the swimming pool of XX Hotel open?", we need to find the hotel, then the swimming pool, then its business hours in the graph, in turn.
The following sections describe how we designed a high-accuracy, low-latency KBQA system that incorporates scene and context information, accurately understands users, and captures their intent to meet the above challenges.
2 Solution
The input of the KBQA system is the user query and the output is the answer; the overall architecture is shown in Figure 1 below. The top layer is the application layer, which includes multiple entry points such as conversation and search. After obtaining the user's query, the KBQA online service computes the result through the query understanding and recall-ranking modules and finally returns the answer text. Besides the online service, the construction and storage of the knowledge graph are equally important. Users care not only about merchants' basic information but also ask about opinions and facilities, such as whether a scenic spot is fun or whether parking at a hotel is convenient. Since there is no official supply of such information, we built an information and opinion extraction pipeline that extracts valuable information from merchants' unstructured introductions and UGC comments, improving users' satisfaction with consultation. We introduce it in detail below.
For the KBQA model, there are currently two mainstream solutions, as shown in Figure 2 below:
- Semantic parsing-based: parse the question in depth and compose the parsing results into an executable logical expression (such as SPARQL), then query the graph database directly for the answer.
- Information extraction-based: first parse the main entity of the question; then query the knowledge graph for the triples associated with the main entity to form subgraph paths (also known as the multi-hop subgraph); finally encode and rank the question against the subgraph paths, and return the highest-scoring path as the answer.
The semantic-parsing method is more interpretable, but it requires annotating a large number of natural-language-to-logical-expression pairs; the information-extraction method is closer to an end-to-end scheme and performs better on complex questions and small samples, but if the subgraph is too large, computation slows down significantly.
Therefore, drawing on the strengths of both approaches, we adopt a combined scheme. As shown in Figure 3 below, the whole process is divided into four steps (a code sketch of the flow follows the list), taking "Does the Palace Museum have student tickets on weekends?" as an example:
- Query understanding: input the original query and output the query understanding result. Syntactic analysis identifies the main entity of the query as "Palace Museum", the business domain as "tourism", and the question type as one-hop.
- Relation recognition: input the query, the domain, the parsing result, and the candidate relations, and output the score of each candidate. In this module, dependency analysis is used to strengthen the question backbone of the query; relations in the tourism domain are recalled and ranked by matching, and the relation in the query is identified as "ticket".
- Subgraph recall: input the main entity and relation resolved by the first two modules and output the matching subgraphs (multiple triples) from the graph. For the example above, all subgraphs in the tourism business data whose main entity is "Palace Museum" and whose relation is "ticket" are recalled.
- Answer ranking: input the query and the subgraph candidates and output a score for each candidate; if the top-1 score exceeds a threshold, it is returned as the answer. Based on the syntactic analysis result, the constraint is identified as "student ticket"; with this constraint, the query-answer pairs are ranked and the satisfying answer is returned.
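To make the flow concrete, here is a minimal Python sketch of the four-step pipeline described above; every function name (parse_query, recognize_relation, recall_subgraphs, rank_answers) and the threshold are hypothetical placeholders for the modules in Figure 3, not our actual service interfaces.

```python
ANSWER_THRESHOLD = 0.8  # illustrative cut-off for the top-1 answer score

def answer(query):
    """Hypothetical end-to-end flow for a query such as
    'Does the Palace Museum have student tickets on weekends?'."""
    # 1. Query understanding: main entity, domain, question type, constraints.
    parse = parse_query(query)                   # e.g. entity="Palace Museum", domain="tourism"

    # 2. Relation recognition: score candidate relations recalled for the domain.
    relation = recognize_relation(query, parse)  # e.g. "ticket"

    # 3. Subgraph recall: all triples matching (main entity, relation, *).
    subgraphs = recall_subgraphs(parse["entity"], relation)

    # 4. Answer ranking: score candidates against the query and its constraints.
    best_answer, score = rank_answers(query, parse["constraints"], subgraphs)
    return best_answer if score >= ANSWER_THRESHOLD else None
```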
The following will introduce our construction and exploration of key modules.
2.1 Query Understanding
Query Understanding is KBQA’s first core module, responsible for fine-grained semantic understanding of sentence components. The two most important modules are:
- Entity recognition and entity linking: output the business-relevant entities in the question and their types, such as merchant name, item, facility, person, time, etc.
- Dependency analysis: taking word segmentation and part-of-speech results as input, identifies the main entity, the information being asked about, and the constraints of the question.
Entity recognition is an important step in syntactic analysis. We first identify entities with a sequence labeling model and then link them to nodes in the database. For this module, we mainly made the following optimizations:
- To improve recognition of OOV (out-of-vocabulary) words, we inject knowledge into the sequence labeling model, using known prior knowledge to assist the discovery of new knowledge.
- To handle entity nesting, the entity recognition module outputs both coarse-grained and fine-grained results, so that subsequent modules can fully understand the query.
- In the long-query Q&A scenario, context information is used to link entities and obtain node IDs.
Finally, the module outputs the types of each important component in the sentence, as shown in Figure 4 below:
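To complement Figure 4, here is a hedged sketch of what the module's multi-granularity output might look like for a sample query; the field names and node IDs are made up for illustration.

```python
# Hypothetical output of entity recognition + linking for
# "When does the swimming pool of XX Hotel open?"
query_understanding = {
    "query": "When does the swimming pool of XX Hotel open?",
    # Coarse granularity keeps the nested mention as one span.
    "coarse_entities": [
        {"mention": "swimming pool of XX Hotel", "type": "facility"},
    ],
    # Fine granularity splits the nested entities and links them to graph nodes.
    "fine_entities": [
        {"mention": "XX Hotel", "type": "merchant", "node_id": "poi_12345"},
        {"mention": "swimming pool", "type": "facility", "node_id": "fac_67890"},
    ],
}
```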
Dependency analysis is a form of syntactic analysis that aims to identify the asymmetric dominance relations between words in a sentence. The output is represented by directed arcs from dependent words (dep) to their head words (head). For the KBQA task, we defined five relation types, as shown in Figure 5 below:
There are two approaches to dependency parsing: transition-based and graph-based. The transition-based approach models the construction of the dependency tree as a sequence of actions (shift, left_arc, right_arc) predicted by the model; unprocessed nodes are pushed onto a stack and assigned relations step by step until a syntactic tree is formed. The graph-based approach instead searches for a maximum spanning tree in the graph, i.e., the globally optimal solution for the sentence's dependency relations. Since the graph-based approach performs a global search and achieves higher accuracy, we adopted the classic model from "Deep Biaffine Attention for Neural Dependency Parsing", whose structure is shown in Figure 6:
In this model, the concatenated word and part-of-speech vectors are encoded by a BiLSTM, and two MLP heads then map each hidden state to lower-dimensional h(arc-head) and h(arc-dep) vectors, removing redundant information. The vectors at all time steps are stacked into matrices H(arc-head) and H(arc-dep); a unit vector is appended to H(arc-dep), and a biaffine transformation with the intermediate matrix U(arc) produces the score matrix S(arc) between dependents and heads, from which the head of each word is determined.
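The following is a minimal PyTorch sketch of the biaffine arc scorer described above; the dimensions and initialization are illustrative assumptions and this is not the cited paper's full parser (which also scores relation labels).

```python
import torch
import torch.nn as nn

class BiaffineArcScorer(nn.Module):
    """Scores every (dependent, head) word pair: scores[b, i, j] is the score of
    word j being the head of word i, following the biaffine attention idea."""

    def __init__(self, lstm_dim, arc_dim=256):
        super().__init__()
        self.mlp_head = nn.Sequential(nn.Linear(lstm_dim, arc_dim), nn.ReLU())
        self.mlp_dep = nn.Sequential(nn.Linear(lstm_dim, arc_dim), nn.ReLU())
        # The extra row in U absorbs the bias term contributed by the appended 1.
        self.U = nn.Parameter(torch.zeros(arc_dim + 1, arc_dim))

    def forward(self, h):
        # h: BiLSTM output over the word+POS embeddings, shape (B, L, lstm_dim)
        h_head = self.mlp_head(h)                          # (B, L, arc_dim)
        h_dep = self.mlp_dep(h)                            # (B, L, arc_dim)
        ones = torch.ones(h_dep.size(0), h_dep.size(1), 1, device=h.device)
        h_dep = torch.cat([h_dep, ones], dim=-1)           # append the unit vector
        # scores[b, i, j] = h_dep[b, i] @ U @ h_head[b, j]
        scores = h_dep @ self.U @ h_head.transpose(1, 2)   # (B, L_dep, L_head)
        return scores

# The predicted head of each word is the argmax over the head dimension:
# heads = scores.argmax(dim=-1)
```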
With the results of dependency analysis, we can better identify relationships and complex problems. The specific features will be introduced in the following sections.
2.2 Relationship Recognition
Relation recognition is another core module of KBQA. Its purpose is to recognize the relation (Predicate) asked about in the user's query, which, combined with the main entity (Subject), determines a unique subgraph and yields the answer (Object).
In practice, since the number of edge relations in the graph keeps growing, we model relation recognition as a text matching task: the input is the user query, query features, and a candidate relation, and the output is the matching score. To address the multi-domain issue mentioned at the beginning, we add domain information to the input features so that domain-related knowledge is stored in the domain representation and the model can judge better. To improve the understanding of complex queries, syntactic information is also incorporated into the input so that the model can better understand constrained and multi-hop questions.
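As an illustration, here is a hedged sketch of how one such matching input might be assembled before it is fed to a cross-encoder; the separator scheme, the backbone string, and the candidate list are illustrative assumptions, not the exact format used online.

```python
def build_match_input(query, domain, syntax_core, relation):
    """Concatenate the query, domain tag, syntactic backbone and one candidate
    relation into a single text pair for an interactive (cross-encoder) matcher.
    The token layout is illustrative only."""
    return "[CLS] {} [SEP] {} | {} [SEP] {} [SEP]".format(
        query, domain, syntax_core, relation
    )

# Candidate relations recalled for the tourism domain (illustrative).
candidates = ["ticket", "opening hours", "discount policy"]
inputs = [
    build_match_input(
        query="Does the Palace Museum have student tickets on weekends?",
        domain="tourism",
        syntax_core="Palace Museum -> have -> student tickets",  # backbone from dependency analysis
        relation=rel,
    )
    for rel in candidates
]
# Each string is tokenized and scored by the matching model; the
# highest-scoring candidate ("ticket") is kept as the recognized relation.
```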
With the emergence of large-scale pre-trained language models, BERT and similar models have achieved SOTA results on matching tasks. The methods commonly used in industry fall into two categories:
- Representation-based: also known as the "two-tower" model; its main idea is to convert each of the two texts into a semantic vector and then compute their similarity in vector space, focusing on building the semantic representation layer.
- Interaction-based: this method focuses on learning the alignment between phrases of the two sentences, comparing the aligned parts, and aggregating the aligned information into the prediction layer. Because the interaction-based model can exploit alignment information between the two texts, it is more accurate and effective, so we adopt it to solve the matching problem in this project.
In order to make full use of BERT’s semantic modeling ability and at the same time consider the online delay requirements of actual business, we made the following three optimizations in terms of inference acceleration, data enhancement and knowledge enhancement:
- Layer pruning: each layer of BERT learns different knowledge; layers near the input learn general syntactic knowledge, while layers near the output learn task-related knowledge. Referring to DistilBERT's layer-reduction idea, we prune the model to keep only the three layers that are most effective for the task (see the sketch after this list). The F1-score of this pruning is 4% higher than simply keeping the first three layers, and experiments showed that the gap between different pruning strategies can reach 7%.
- Pre-fine-tuning on in-domain data: after pruning, the three-layer model degrades considerably because training data is limited. Through our understanding of the business, we found that the data of Meituan's "Ask Everyone" module is highly consistent with our task. After cleaning that data, we treated a title and its related questions as positive pairs and randomly sampled sentence pairs with literal similarity between 0.5 and 0.8 as negative pairs, generating a large amount of weakly supervised text. After pre-fine-tuning on this data, the accuracy of the three-layer model improved by more than 4%, even surpassing the 12-layer model.
- Knowledge enhancement: because users express themselves in diverse ways, accurately identifying user intent requires deep semantic and syntactic information. To further improve performance and fix part of the bad cases, we added domain and syntactic information to the input, blending explicit prior knowledge into BERT. Guided by the attention mechanism combined with the syntactic dependency tree structure, the model captures the dependencies between words more accurately. We validated this on our business data and five large public datasets; compared with the BERT-base model, accuracy improved by 1.5% on average.
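Below is a minimal sketch of the layer-pruning step using the Hugging Face transformers API; the checkpoint name and the kept layer indices are illustrative assumptions (in practice the combination is chosen by comparing F1 on the relation matching task), not our exact production setup.

```python
import torch.nn as nn
from transformers import BertModel

def prune_bert_layers(model, keep):
    """Keep only the selected encoder layers and drop the rest,
    trading a little accuracy for a large cut in inference latency."""
    model.encoder.layer = nn.ModuleList([model.encoder.layer[i] for i in keep])
    model.config.num_hidden_layers = len(keep)
    return model

# Placeholder checkpoint; the production model is our own fine-tuned matcher.
model = BertModel.from_pretrained("bert-base-chinese")
# Illustrative choice of one lower, one middle and one upper layer; which
# three layers work best is task-specific and picked on a validation set.
model = prune_bert_layers(model, keep=[1, 6, 11])
```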
After a series of iterations above, the speed and accuracy of the model have been greatly improved.
2.3 Understanding Complex Questions
In real scenarios, most questions can be classified into the following four categories (green is the answer node), as shown in Figure 8 below:
The hop count of a question is determined by the number of entities involved. Single-hop questions usually involve only a merchant's basic information, such as address, telephone number, business hours, or policies, and can be answered by a single SPO triple in the knowledge graph. Two-hop questions mainly ask about facilities, services, or products within a merchant, such as which floor the hotel gym is on, when breakfast starts, or the price of the airport shuttle service; they require finding the path from the merchant to the main entity (facility/service/product, etc.) and then the main entity's basic information, i.e., two triples, SPX and XPO. Constrained questions add a constraint on the main entity or the answer node, generally a time, a crowd, or an attribute.
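To make the SPO notation concrete, here is a hedged example of the triples behind a one-hop and a two-hop question; the entity names, relation names, and values are made up for illustration.

```python
# One-hop: "What are the Palace Museum's opening hours?"  ->  one (S, P, O) triple
one_hop = [("Palace Museum", "opening_hours", "08:30-17:00")]

# Two-hop: "Which floor is the gym of XX Hotel on?"
#   (S, P, X) finds the facility node, (X, P, O) finds its attribute.
two_hop = [
    ("XX Hotel", "has_facility", "XX Hotel Gym"),   # S P X
    ("XX Hotel Gym", "floor", "3F"),                # X P O
]
```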
Here are some of the improvements we made for different types of complex problems.
2.3.1 Constraints
By mining online logs, constraints are divided into the following categories, as shown in Figure 9 below:
Answering constrained questions involves two key steps: constraint identification and answer ranking.
With the dependency analysis module in the KBQA system, we can identify the constraints the user places on an entity or relation. However, there are many kinds of constraints, and different node types carry different constraint types. Therefore, when constructing the database query, we recall as many candidate nodes under the entity-relation path as possible to guarantee recall, and score the answer constraints in the final ranking module.
To improve efficiency, we first optimized the knowledge storage layer. For storing compound attribute values, Freebase proposes the Compound Value Type (CVT), as shown in Figure 10 below, to solve the storage and query problems of such compound structured data. For example, Happy Valley has different opening hours for different shows; this compound attribute value can be carried by a CVT node.
However, CVT storage increases query complexity and consumes database storage. Take “Happy Valley business hours CVT” as an example:
- Storing this information typically requires multiple CVT nodes, and each CVT involves three stored triples.
- For the question "when does Happy Valley's summer night show start?", the query involves four hops: <entity -> business hours CVT>, <business hours CVT -> season = summer>, <business hours CVT -> session = night show>, and <business hours CVT -> time>. Even for fast industrial graph databases such as Nebula, a query of more than three hops typically takes tens of milliseconds, and much longer in real online use.
- Once an attribute name or value has different but synonymous expressions, an extra step of synonym alignment is needed to ensure the query can match without recall loss.
To solve the above problems, we use a structured key-value form to carry attribute information: anything that can serve as constraint information on the attribute value, such as crowd or time, is put into the Key, and the Value is the answer to be queried. For the example above, we combine the information of all possible constraint dimensions into the Key, as shown in Figure 11 below:
Then, to handle the many possible expressions of constraint values, when no exact match can be found during the actual query, we match each attribute's Key against the constraint information in the question and compute their relevance; the Value of the most relevant Key is returned. The Key can therefore be represented in several forms:
- String form: use text similarity methods to compute the relevance between the Key and the constraint text.
- Text embedding: embed the Key text and compute its similarity with the constraint information; with reasonable training data, this works better than string matching.
- Other embedding algorithms: for example, graph embeddings of virtual nodes, jointly training the constraint text with the corresponding virtual node, etc.
This form of storage is equivalent to storing a single triple, namely <entity -> business hours KV>, and the query is compressed into one hop plus a text-matching ranking step. Text matching based on a semantic model can, to a certain extent, solve the incomplete matches caused by varied text expressions, and after optimizing the semantic model, the matching time is compressed to around a dozen milliseconds.
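Here is a hedged sketch of the key-value storage and the one-hop-plus-matching query it enables; the keys, values, and the character-overlap similarity below are placeholders for illustration (online, a semantic matching model scores the constraint text against each Key instead).

```python
# Business-hours attribute of "Happy Valley" stored as a single KV triple:
# <Happy Valley -> business hours -> {key: value, ...}>  (illustrative values)
business_hours = {
    "summer day session": "09:00-18:00",
    "summer night show": "19:00-22:00",
    "winter day session": "09:30-17:30",
}

def char_overlap(a, b):
    """Placeholder similarity: character-set overlap ratio. Online, a semantic
    matching model scores the constraint text against each Key instead."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / max(len(sa | sb), 1)

def answer_with_constraint(kv, constraint):
    # One hop fetched the whole KV; now rank Keys against the constraint text.
    best_key = max(kv, key=lambda k: char_overlap(k, constraint))
    return kv[best_key]

# "When does Happy Valley's summer night show start?"
print(answer_with_constraint(business_hours, "summer night show"))  # 19:00-22:00
```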
With this optimization for complex conditions, the upstream modules identify the entity, relation, and constraints to form the constraint text, which is then matched against the Key candidates of the recalled subgraph to obtain the final answer.
2.3.2 Multi-hop Questions
Multi-hop questions are naturally suited to KBQA. When a user asks about the facilities, services, products, or other entities of a merchant, we only need to find the merchant in the graph first, then the entity under the merchant, and then its basic information. With an FAQ approach, a standard question would have to be configured for every such complex question, e.g., "where is the gym", "where is the swimming pool", and so on; in KBQA these questions collapse into one pattern, asking for the location of any entity, with only the starting entity varying.
In the KBQA system, we first rely on the dependency analysis module to identify the dependency relations between sentence components, and then use the relation recognition module to determine the hop count and the relation being asked about. The specific process is shown in Figure 12 below:
With the entity types from entity recognition, we can replace the important elements in the sentence with their types, which compresses the number of candidate relation configurations and improves relation recognition accuracy. Once the sentence is fully understood, the system queries the subgraph based on the main entity, relation, and hop count, and feeds it to the answer ranking module for finer-grained constraint recognition and scoring.
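For illustration, here is a hedged sketch of how the subgraph query for a two-hop question might be executed once the main entity, relation path, and hop count are known; the graph client interface is a made-up placeholder, not Nebula's actual API.

```python
def recall_two_hop(graph, entity_id, first_rel, second_rel):
    """Recall candidate subgraphs for a two-hop question such as
    'When does the swimming pool of XX Hotel open?' once query understanding
    yields entity='XX Hotel' and path=['has_facility', 'opening_hours'].

    graph.neighbors(node_id, relation) is a hypothetical one-hop lookup in the
    graph database, used here only to show the traversal shape."""
    subgraphs = []
    for facility in graph.neighbors(entity_id, first_rel):        # hop 1
        for value in graph.neighbors(facility, second_rel):       # hop 2
            subgraphs.append((entity_id, first_rel, facility, second_rel, value))
    return subgraphs  # handed to answer ranking for constraint scoring
```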
2.4 Opinion Question Answering
Besides the basic information queries above, users also ask opinion questions, such as "Is it easy to park at Disneyland?" or "Is the sound insulation of XX Hotel good?". For such subjective questions, corresponding comments can be found from user reviews with FAQ or reading comprehension technology, but that approach returns only one or a few reviews, which may be too subjective to summarize the opinion of the crowd. We therefore propose an opinion question answering scheme that gives the number of positive and negative opinions; for interpretability, we also show review evidence for the majority opinion. The actual presentation in the App is shown in Figure 13 below:
To mine user opinions automatically and in batch, we broke the problem into a two-step solution: opinion discovery and evidence mining, as shown in Figure 14 below.
The first step, opinion discovery, explores the various viewpoints in user reviews. We use a sequence labeling model to find entity and opinion descriptions in sentences, and a dependency analysis model to judge the entity-opinion relations.
The second step, after a certain number of opinions have been found, is to mine deeper evidence from reviews, as shown in Figure 15 below. Although some opinions surface directly in step one, many reviews express them only implicitly. For example, when asked whether pets are allowed, users do not state it explicitly but instead write "dogs are having a great time here", which requires deep semantic understanding of the review sentence to summarize the opinion it carries. In the initial implementation we used a classification model: the user review was fed to an encoder, and a classification head for each opinion judged its polarity. However, as the number of automatically mined opinions grew, to reduce the cost of manual labeling for the classification task we reformulated it as a matching task: input an opinion tag and a user review, and judge how strongly the review supports that opinion. Finally, to optimize speed, we pruned the 12-layer Transformer, which reduced the effect by only 0.8% while tripling the speed, enabling offline mining of opinions at scale.
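For illustration, here is a hedged sketch of the matching formulation using a generic cross-encoder from the transformers library; the checkpoint and the label convention (index 1 = "supports") are placeholders, and our online model is a pruned, fine-tuned variant rather than this off-the-shelf one.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder checkpoint; the production model is our own pruned, fine-tuned matcher.
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-chinese", num_labels=2
)
model.eval()

def supports(opinion_tag, comment):
    """Probability that the comment supports the opinion tag, e.g.
    tag='pets allowed', comment='dogs are having a great time here'."""
    inputs = tokenizer(opinion_tag, comment, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()   # index 1 = "supports"
```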
2.5 Exploration of End-to-end Solutions
Above, we designed different schemes for multi-hop, constrained, and other complex questions. Although they solve the problems to a certain extent, they also increase system complexity. Building on the pre-training ideas of the relation recognition module, we explored more general, end-to-end solutions and published the paper "Large-Scale Relation Learning for Question Answering over Knowledge Bases with Pre-trained Language Models" at this year's EMNLP.
Many KBQA studies focus on graph learning methods, hoping to represent the subgraph better, but they ignore the semantics of the graph nodes themselves. Meanwhile, BERT-style pre-trained models are trained on unstructured text and never see the structured data of the knowledge graph. We aim to eliminate the inconsistency between the two through task-related data, and therefore propose three types of pre-training tasks, as shown in Figure 16 below (a small data-construction sketch follows the list):
- Relation Extraction: based on large-scale open-source relation extraction datasets, we generate a large amount of one-hop ([CLS] s [SEP] h, r, t [SEP]) and two-hop ([CLS] s1, s2 [SEP] h1, r1, t1 (h2), r2, t2 [SEP]) text-pair training data, so that the model learns the correspondence between natural language and structured text.
- Relation Matching: to help the model better capture relational semantics, we generate a large number of text pairs from the relation extraction data; texts expressing the same relation are positive pairs of each other, otherwise negative pairs.
- Relation Reasoning: to give the model some knowledge reasoning ability, we assume that a triple (h, r, t) is missing from the graph and use other indirect relations to infer whether (h, r, t) holds. The input format is: [CLS] h, r, t [SEP] p1 [SEP] ... pn [SEP].
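For illustration, here is a hedged sketch of how the three kinds of pre-training text pairs might be serialized, following the formats listed above; the sentences and triples are made-up examples, not items from the actual training data.

```python
# 1. Relation extraction: a natural-language sentence paired with its triple(s).
one_hop_pair = (
    "The Palace Museum is located in Beijing.",
    "Palace Museum, located_in, Beijing",
)
two_hop_pair = (
    "The gym of XX Hotel is on the third floor.",
    "XX Hotel, has_facility, XX Hotel Gym; XX Hotel Gym, floor, 3F",
)

# 2. Relation matching: two sentences expressing the same relation form a
#    positive pair; sentences with different relations form negative pairs.
matching_positive = (
    "The Palace Museum is located in Beijing.",
    "The Summer Palace sits in the Haidian district.",   # both express located_in
)

# 3. Relation reasoning: decide whether a possibly missing triple (h, r, t)
#    holds, given indirect paths p1..pn, serialized as
#    "[CLS] h, r, t [SEP] p1 [SEP] ... pn [SEP]".
reasoning_example = (
    "XX Hotel, has_facility, swimming pool",
    ["XX Hotel, has_floor, 3F", "3F, contains, swimming pool"],
)
```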
After pre-training on the above tasks, the BERT model's reasoning ability over queries and structured text improves significantly, and it performs better when the KB is incomplete, as shown in Figure 17 below:
3 Application Practice
After more than a year of construction, the KBQA service has been connected to Meituan's travel, hotel, comprehensive local services, and other businesses, helping merchants answer users' questions in time and improving user satisfaction and conversion rates.
3.1 Ask the Hotel
Hotels are a basic need for travelers, but some small and medium-sized merchants have not opened a manned customer service channel and cannot answer users' questions in time. To let users quickly find information on the detail page, the intelligent assistant automatically replies on behalf of hotel merchants that have not enabled customer service, improving the order conversion rate. Users can ask for various information about the hotel and its room types, as shown in Figure 18 below:
3.2 Ticket Promotion
Ticket promotion helps tourism merchants with their core ticket-selling business. During peak hours at scenic spots, buying tickets online is more convenient than queuing, but many users still keep the habit of buying tickets offline. By introducing QR codes and simple interactions, Meituan has made ticket selling more convenient for both merchants and users. At the same time, an "intelligent ticket assistant" is built into the ticket purchase page to solve problems in the purchasing process and help users buy suitable tickets more quickly, as shown in Figure 19 below:
3.3 Merchant Recommended Replies
Beyond the travel scenario, users also have many questions while browsing other local services, such as "Does the barber shop need an appointment?" or "When does the store close?", which they can ask through merchant customer service. But merchants' own manpower is limited, and during peak periods they can easily be overwhelmed. To reduce users' waiting time, our Q&A service provides merchants with a reply recommendation function, as shown in Figure 20 below; KBQA is mainly responsible for answering questions about merchant and group-buying information.
4 Summary and Outlook
KBQA is not only a popular research direction but also a complex system involving entity recognition, syntactic analysis, relation recognition, and many other algorithms. It must deliver high overall accuracy while keeping latency under control, which poses great challenges for both algorithms and engineering. After more than a year of technical exploration, our team not only landed several applications inside Meituan but also won first place on leaderboard A, second place on leaderboard B, and the Technology Innovation Award in the 2020 CCKS KBQA evaluation. We also released part of Meituan's data and co-hosted the 2021 CCKS KBQA evaluation with Peking University.
Returning to the technology itself: although our KBQA system now handles most head questions, long-tail and complex questions remain the bigger challenge. There are many cutting-edge techniques worth exploring, and we plan to pursue the following directions:
- Unsupervised domain transfer: since KBQA covers Meituan's hotel, travel, and comprehensive business scenarios, comprising more than ten sub-domains, we hope to improve the model's few-shot and zero-shot capabilities and reduce the labor cost of data annotation.
- Business knowledge enhancement: in relation recognition, attention from core words to irrelevant words seriously interferes with the model. We will study how to inject prior knowledge into the pre-trained language model to guide and correct the attention process and improve model performance.
- More types of complex questions: besides the constrained and multi-hop questions discussed above, users also ask comparison and multi-relation questions. In the future, we will further optimize the graph construction and query understanding modules to solve users' long-tail questions.
- End-to-end KBQA: for both industry and academia, KBQA is a complex pipeline; how to use pre-trained models and the knowledge they carry to simplify the overall process, or even achieve an end-to-end solution, is a direction we will continue to explore.
We also welcome anyone interested in KBQA to join our team and explore more possibilities together! Resumes can be sent to: [email protected].
About the Authors
Mei Ru, Di Liang, Rui Si, Zhi Hong, Yang Ming, and Wei Wu are all from the Knowledge Graph group of the NLP Center, Search and NLP Department.
This article was produced by the Meituan technical team, and the copyright belongs to Meituan. You are welcome to reprint or use the content of this article for non-commercial purposes such as sharing and communication; please credit "Content reprinted from the Meituan technical team". This article may not be reproduced or used commercially without permission. For any commercial use, please email [email protected] to request authorization.