Navigation is the core user scenario of Amap, a leading mobility solution provider in China. Route planning, the prerequisite for navigation, produces a route tailored to the user's starting point, end point and route strategy settings.
As the necessary first step of route planning, an accurate starting point is very important for the quality of route planning and for the user experience. This article introduces some of Amap's exploration and practice in improving the accuracy of starting point road catching, in particular the introduction of a machine learning model.
What is starting point road catching
First, let us briefly introduce what starting point road catching is. For a route planning request initiated by a user, starting point road catching binds the user's starting point to the road the user is actually on, based on the location information obtained for that user.
As can be seen in the Amap app, users can choose the starting point for route planning in three ways:
1. Manual selection (the user manually marks the location on the map).
2. POI selection (a POI, or point of interest, in a geographic information system is location marking information such as a store, a residential area or a bus station).
3. Automatic positioning (the location is obtained through GPS, base station or network positioning).
However, because GPS, base station and network positioning all have limited accuracy, positioning coordinates are prone to drift, and the position reported by the positioning device may be several meters, dozens of meters or even hundreds of meters away from the road the user is actually on. How to bind the user to the real road accurately with such limited information is the main problem we need to solve.
Why machine learning
Before machine learning was introduced, manual rules were used to rank candidate roads. The core idea was to take distance as the main feature, combine it with other features such as angle and speed, and compute a weighted score that determines the ranking. The weights and thresholds in these manual rules were set from accumulated practical experience.
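As a rough illustration of such a hand-tuned rule, the sketch below scores each candidate road with a weighted sum of distance, angle and speed terms. The weights, the distance threshold, the field names and the way speed is compared with a road speed limit are all hypothetical; they are not Amap's actual rules or values.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    road_id: str
    distance_m: float       # distance from the positioning point to the road
    angle_diff_deg: float   # difference between travel heading and road direction
    speed_limit_kmh: float  # speed limit of the road (assumed to be available)


# Hypothetical hand-tuned weights and threshold, for illustration only.
W_DISTANCE, W_ANGLE, W_SPEED = 0.6, 0.3, 0.1
MAX_DISTANCE_M = 100.0


def rule_score(c: Candidate, user_speed_kmh: float) -> float:
    """Weighted score in [0, 1]; higher means a more plausible road."""
    distance_term = max(0.0, 1.0 - c.distance_m / MAX_DISTANCE_M)
    angle_term = max(0.0, 1.0 - c.angle_diff_deg / 180.0)
    # Assumption: a road is plausible only if the observed speed fits its limit.
    speed_term = 1.0 if user_speed_kmh <= c.speed_limit_kmh else 0.0
    return W_DISTANCE * distance_term + W_ANGLE * angle_term + W_SPEED * speed_term


def rank_by_rules(candidates, user_speed_kmh):
    """Sort candidate roads from most to least plausible."""
    return sorted(candidates, key=lambda c: rule_score(c, user_speed_kmh), reverse=True)


candidates = [
    Candidate("main_road", distance_m=20.0, angle_diff_deg=10.0, speed_limit_kmh=60.0),
    Candidate("side_road", distance_m=8.0, angle_diff_deg=90.0, speed_limit_kmh=30.0),
]
best = rank_by_rules(candidates, user_speed_kmh=50.0)[0]
```

Here the nearer side road loses to the main road because its direction and speed limit fit the observation poorly, which shows how much the outcome depends on the chosen weights.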
As Amap's business has grown and the number of planning requests and scenarios has increased, the limitations of manual rules have become more and more obvious, mainly in the following respects:
• Even when they encode a great deal of experience, manually set thresholds and weights are never perfect, and deviations and blind spots are hard to avoid.
• Policies are costly to maintain: when upstream data is updated, new features cannot be added to the policies quickly.
• Setting rules manually requires a lot of experience, so it is difficult to respond quickly to personnel changes.
In the era of big data and artificial intelligence, it is an inevitable trend to use the power of data to replace part of human work, realize process automation and improve work efficiency.
Therefore, given the current state and problems of the manual rules for starting point road catching, we introduced a machine learning model that automatically learns the relationship between features and road catching results. On the one hand, Amap has a large amount of planning data and actual travel data, which gives it a natural advantage in acquiring training data for machine learning models. On the other hand, a machine learning model has stronger expressive power and can learn complex relationships between features, which improves the accuracy of road catching.
How to realize machine learning
Returning to machine learning itself, this section describes how we built the machine learning model for starting point road catching. Generally speaking, applying machine learning to a practical problem can be divided into the following steps:
• Definition of target problem
• Data acquisition and feature engineering
• Model selection
• Model training and effect evaluation
1. Definition of target problem
Before a machine learning model can be introduced, the problem to be solved needs to be abstracted mathematically.
The starting point road catching problem can be stated as selecting, from the set of roads around the positioning point, the road most likely to be the user's actual location.
The whole process is similar to search ranking, so we adopted a search ranking approach when designing the modeling scheme:
i. Obtain the location information A from the user's route planning request.
ii. Recall the roads within a certain range around the positioning point to form the candidate set B.
iii. Rank the candidate roads; the top-ranked candidate is the output of the model, that is, the road C where the user is actually located.
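The three steps above can be sketched as a small recall-and-rank pipeline. This is a minimal sketch under simplifying assumptions: coordinates are planar and in meters, each road is reduced to a single reference point, and the scoring function is a placeholder (negative distance) standing in for whatever ranking model is used. All names are hypothetical.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

Point = Tuple[float, float]  # planar (x, y) in meters, a simplification


def recall_candidates(location: Point, roads: Dict[str, Point], radius_m: float) -> List[str]:
    """Step ii: recall roads whose reference point lies within radius_m of the location."""
    return [rid for rid, p in roads.items() if math.dist(location, p) <= radius_m]


def catch_road(
    location: Point,
    roads: Dict[str, Point],
    radius_m: float,
    score_fn: Callable[[Point, Point], float],
) -> Optional[str]:
    """Steps ii and iii: build candidate set B, rank it, return the top road C."""
    candidates = recall_candidates(location, roads, radius_m)
    if not candidates:
        return None  # nothing recalled around the positioning point
    ranked = sorted(candidates, key=lambda rid: score_fn(location, roads[rid]), reverse=True)
    return ranked[0]


def nearest_first(location: Point, road_point: Point) -> float:
    """Placeholder scorer: closer roads get higher scores."""
    return -math.dist(location, road_point)


roads = {"r1": (10.0, 0.0), "r2": (0.0, 30.0), "r3": (500.0, 500.0)}
location_a = (0.0, 0.0)  # step i: location A taken from the request
road_c = catch_road(location_a, roads, radius_m=100.0, score_fn=nearest_first)
```

Keeping recall and ranking separate means the scorer can be swapped from a manual rule to a learned model without touching the recall step.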
In the end, we defined starting point road catching as a supervised search ranking problem. Having defined the goal, we turned to data acquisition and feature engineering.
2. Data acquisition and feature engineering
It is often said in engineering practice that data and features determine the upper limit of machine learning, while models and algorithms only approximate that limit. This shows how critical data and features are to the final outcome of a project.
To train the machine learning model for starting point road catching, we need to obtain two types of data from the raw data:
• Truth data, that is, the road the user was actually on when sending the route planning request.
The first problem in applying machine learning to starting point road catching is how to acquire the truth data. When a user initiates a route planning request at location A, limited positioning accuracy means we cannot directly confirm where the user actually is. However, if the user actually traveled near the place where the request was initiated, we can match that travel data to the road network to generate a trajectory, and use the trajectory to find the road that the request's positioning point was actually on.
We mined Amap's navigation request data, combined users' actual travel routes with their route planning information, and obtained a data set with a one-to-one mapping between requests and truth values.
• Feature data. For the starting point road catching model, we extracted three categories of features to build the sample set: features of the positioning point, features of the road itself, and combination features between the positioning point and the road.
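To make the three feature categories concrete, here is a minimal sketch of building one feature row for a (positioning point, candidate road) pair. The specific features (positioning accuracy, speed, road class, point-to-road distance, heading difference) and all field names are assumptions chosen for illustration; the article does not list the actual features. Coordinates are assumed planar, in meters, with x pointing east and y pointing north.

```python
import math


def point_to_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Project p onto the line through a and b, clamped to the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def heading_difference_deg(h1, h2):
    """Smallest absolute difference between two headings, in [0, 180]."""
    diff = abs(h1 - h2) % 360.0
    return min(diff, 360.0 - diff)


def build_features(fix, road):
    """Assemble one feature row for a (positioning point, candidate road) pair."""
    (ax, ay), (bx, by) = road["start"], road["end"]
    # Road heading in degrees, clockwise from north.
    road_heading = math.degrees(math.atan2(bx - ax, by - ay)) % 360.0
    return {
        # Positioning point features (hypothetical examples).
        "accuracy_m": fix["accuracy_m"],
        "speed_kmh": fix["speed_kmh"],
        # Road feature (hypothetical example).
        "road_class": road["road_class"],
        # Combination features between the point and the road.
        "distance_m": point_to_segment_distance(fix["xy"], road["start"], road["end"]),
        "angle_diff_deg": heading_difference_deg(fix["heading_deg"], road_heading),
    }


fix = {"xy": (50.0, 12.0), "heading_deg": 80.0, "accuracy_m": 15.0, "speed_kmh": 30.0}
road = {"start": (0.0, 0.0), "end": (100.0, 0.0), "road_class": 3}
features = build_features(fix, road)
```

In this example the point lies 12 m from an eastbound road and the travel heading differs from the road direction by about 10 degrees.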
3. Model selection
In defining the target problem, we treated starting point road catching as a search ranking problem. Learning-to-rank techniques fall into three main categories: point-wise, pair-wise and list-wise.
Given the characteristics of the starting point road catching service, we adopted the list-wise approach, whose learning-to-rank framework has the following characteristics:
• The input is the set of feature vectors of all candidate roads for the same route planning request (that is, the same query).
• The output is a sequence of scores, one for each feature vector of that request (that is, the same query).
• For the scoring function, we use a tree model.
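A list-wise setup needs the samples grouped by request, so that every list contains exactly the candidate roads of one query. The sketch below shows one common layout: a feature matrix, a label vector and an array of group sizes. Gradient-boosted tree rankers such as LightGBM's LGBMRanker accept group sizes in this form, but the article does not say which tree model or library was used, so the layout and all names here are assumptions.

```python
from collections import defaultdict


def build_listwise_dataset(samples):
    """Group (request_id, feature_vector, label) samples into query lists.

    Returns (X, y, group), where group[i] is the number of consecutive rows
    in X and y that belong to the i-th request.
    """
    by_request = defaultdict(list)
    for request_id, features, label in samples:
        by_request[request_id].append((features, label))

    X, y, group = [], [], []
    for request_id in sorted(by_request):
        rows = by_request[request_id]
        X.extend(f for f, _ in rows)
        y.extend(lbl for _, lbl in rows)
        group.append(len(rows))
    return X, y, group


# Toy samples: [distance_m, angle_diff_deg] per candidate road; label 1 marks the true road.
samples = [
    ("req_1", [12.0, 10.0], 1),
    ("req_2", [5.0, 20.0], 1),
    ("req_1", [30.0, 80.0], 0),
    ("req_2", [9.0, 170.0], 0),
    ("req_2", [40.0, 15.0], 0),
]
X, y, group = build_listwise_dataset(samples)
```

Each request contributes one list, and the ranker scores all rows of that list together.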
We chose NDCG (Normalized Discounted Cumulative Gain) as the model evaluation metric. NDCG measures how well the ranking produced by the model agrees with the true ranking, and is a commonly used metric for evaluating ranking results.
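For reference, NDCG for a single request can be computed as below. This sketch uses the common exponential-gain formulation, with gain 2^rel - 1 and a log2 position discount; the article does not say which variant or cutoff was used, so treat those choices as assumptions.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance labels given in ranked order."""
    return sum((2 ** rel - 1) / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(scores, labels, k=None):
    """NDCG of one query: DCG of the model ordering divided by the ideal DCG."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [labels[i] for i in order][:k]
    ideal = sorted(labels, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


labels = [0, 1, 0]                         # the second candidate is the true road
perfect = ndcg([0.2, 0.9, 0.1], labels)    # true road ranked first
imperfect = ndcg([0.9, 0.2, 0.1], labels)  # true road ranked second
```

With a single relevant road per request, NDCG is 1 when the true road is ranked first and decreases as it moves down the list.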
4. Model training and effect evaluation
We extracted the requests from a certain period of time, obtained the corresponding truth values and feature data as described in step 2, labeled them to build the sample set, divided it into a training set and a test set, trained the model, and checked whether the results met expectations.
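One detail worth illustrating is that, for a ranking problem, the split is best done per request, so that all candidate roads of one request end up on the same side. The sketch below is one way to do this; the split ratio, the seed and the function name are assumptions, not details given in the article.

```python
import random


def split_by_request(request_ids, test_ratio=0.2, seed=0):
    """Split request IDs into disjoint train and test sets.

    Splitting by request (not by row) keeps every candidate road of a
    request in the same set, so each ranking list stays intact.
    """
    unique_ids = sorted(set(request_ids))
    random.Random(seed).shuffle(unique_ids)  # deterministic for a fixed seed
    n_test = max(1, int(len(unique_ids) * test_ratio))
    test_ids = set(unique_ids[:n_test])
    train_ids = set(unique_ids[n_test:])
    return train_ids, test_ids


# One entry per sample row; each request has several candidate roads.
row_request_ids = [f"req_{i}" for i in range(10) for _ in range(3)]
train_ids, test_ids = split_by_request(row_request_ids, test_ratio=0.2)
train_rows = [i for i, rid in enumerate(row_request_ids) if rid in train_ids]
test_rows = [i for i, rid in enumerate(row_request_ids) if rid in test_ids]
```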
To evaluate the model, we applied both the manual rules and the machine learning model to the requests in the test set and compared the results with the truth values to measure accuracy.
The comparison shows that, for randomly selected requests, the model and the manual rules give different results in 10% of cases. Within this 10%, the road catching accuracy of the model is 40% higher than that of the manual rules, which is a significant improvement.
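The two statistics in this comparison, the share of requests where the methods disagree and each method's accuracy on that subset, can be computed as in the following sketch. The function name and the toy data are made up for illustration and do not reproduce the reported results.

```python
def compare_methods(truth, rule_pred, model_pred):
    """Compare two road catching methods on the same requests.

    Each argument maps request_id -> road_id.
    """
    # Requests on which the manual rules and the model disagree.
    diff_ids = [rid for rid in truth if rule_pred[rid] != model_pred[rid]]

    def accuracy(pred, ids):
        return sum(pred[rid] == truth[rid] for rid in ids) / len(ids) if ids else 0.0

    return {
        "difference_rate": len(diff_ids) / len(truth) if truth else 0.0,
        "rule_accuracy_on_diff": accuracy(rule_pred, diff_ids),
        "model_accuracy_on_diff": accuracy(model_pred, diff_ids),
    }


# Made-up toy data, not real results.
truth = {"q1": "a", "q2": "b", "q3": "c", "q4": "d", "q5": "e"}
rule_pred = {"q1": "a", "q2": "b", "q3": "x", "q4": "d", "q5": "y"}
model_pred = {"q1": "a", "q2": "b", "q3": "c", "q4": "d", "q5": "z"}
report = compare_methods(truth, rule_pred, model_pred)
```

Restricting the accuracy comparison to the requests where the two methods disagree isolates the cases in which switching to the model actually changes the outcome.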
Final words
This article has introduced some applications of big data and machine learning in starting point road catching. The successful launch of the project has also confirmed that machine learning can play an important role in improving accuracy and optimizing processes.
In the future, we hope to keep refining the existing model for specific scenarios, find new sources of gain, explore from the perspectives of both data and models, and continue to improve the road catching performance of machine learning.