Road network data is important for many urban applications, such as vehicle navigation and route optimization. The traditional road data acquisition method relies on the acquisition vehicle and consumes a lot of manpower and material resources. With the popularity of GPS devices, huge amounts of track data are generated in cities, enabling us to use track data to generate road networks. This problem has been extensively studied in the last decade, but the precision of many methods is not high, especially for upper and lower roads, parallel roads and so on. Since track data is not evenly distributed throughout the city, is there a way to improve the accuracy of road network projections in areas where vehicles are frequently used?
In this paper, we will introduce RoadRunner: A paper jointly published by Massachusetts Institute of Technology (MIT) and Hamad Bin Khalifa University (HBKU) in ACM SIGSPATIAL 2018. Improving the Precision of Road Network Inference from GPS Trajectories makes the Road Network collocation more accurate without loss of coverage (or recall). In this paper, the road network prediction problem is divided into two stages. Firstly, the RoadRunner algorithm proposed in this paper is used to predict the high accuracy map in the area of high track density, and then combined with the traditional road network prediction method to meet the requirement of recall rate. The core idea of RoadRunner is to use the connectivity of each track to determine whether the intersecting track is on the same road or two parallel roads.
First, background
Extrapolating road networks from tracks is a very challenging problem. The left two columns of Figure 1 show the performance of two traditional algorithms based on probability density estimation (KDE) and K-means clustering in three cities (Los Angeles, Boston and Chicago). The generated map has three problems:
1) The upper road will connect with the lower road;
2) Adjacent roads that do not actually intersect will be connected;
3) Detailed topologies are difficult to identify, such as freeway intersections.
In this paper, RoadRunner is proposed, which builds road network based on trajectory flow in an incremental way. In each iteration, RoadRunner generates road sections by taking into account the set of sub-tracks with the same precursor through a track filter operator. This method is very important for removing the interference of adjacent roads and is robust to GPS noise and road topology. Although The RoadRunner was more accurate, the filtering operation resulted in lost roads in areas with less track. In order to further improve the estimated recall rate of road network, this paper proposes a combination operation to integrate the estimated results of RoadRunner with those of traditional methods. The right-hand column of Figure 1 shows the effect of the method proposed by the text.
Second, the RoadRunner
The algorithm flow of RoadRunner is shown in Figure 2. The input of the algorithm is an initial road network, which can be derived from the existing road network or the road network inferred by some other method. We first add all vertices in the initial road network to queue Q, and the vertices in the queue are called active vertices (lines 2-3). In each iteration, RoadRunner picks an active vertex V from the queue and extracts the outflow direction of the Trace at vertex V through the Trace operation (lines 5-6). For each outflow direction θ, we extend the current network by adding a small section starting from V in the direction θ at a distance of fixed length D (lines 7-11). The Merge operation then attempts to Merge another vertex U of this small segment into the existing network. If the merge fails, we add u to Q for the next iteration (lines 12-14). When Q is empty, the algorithm stops and returns to the current road network.
Figure 2. RoadRunner algorithm framework ▲
It is worth noting that in order to effectively Trace and Merge, in each iteration, RoadRunner only reserved part of the sub-track set related to the current section for road network generation. FIG. 3 shows the satellite map and the corresponding trajectory data distribution at a certain point. Suppose we now want to expand the network from the blue vertices below Figure 3. Since the three highlighted roads are spatially close to each other at that point, and their orientation is nearly the same, if we consider all the trajectory data, we may connect the red road to the green road, or combine the red road and the blue road. But by excluding tracks that are not near the currently expanding segment, we are able to get a much cleaner set of subtracks (covering only the tracks of red roads). We call this path filtering operation the way Path filter. This is done as follows: Given a sequence of circles with a center along a section of road (the radius represents the width of the road), the path filter operator only preserves tracks that pass through these circles in order. For an active vertex, we can calculate a path of length K based on the current road network (ending with the active vertex), and then generate a sequence of circles along this path (the radius of each circle can be dynamically estimated from the trajectory data) to construct the filtering conditions.
Figure 3. Motivation of introducing path filter operator ▲
Next, let’s take a brief look at the implementation of the Trace and Merge operations.
1, Tracing
The purpose of the Trace operation is to extract the main outflow direction of the Trace at vertex V. As shown in figure 4, we want to extract the direction of the trajectory at the blue vertices. We first apply the path filter operator to obtain the sub-track set of the paths before passing through the blue vertex (shown in green), and we find that the tracks are obviously divided into three groups at the intersection. We draw a circle with the blue vertex as the center and D as the radius (as shown in the pink circle), and then generate 72 angles at the vertex for bisecting the circle. Then take the intersection of each Angle with the circle as the center and r as the radius to construct a small circle (as shown in the yellow circle). And then we use path filtering again to filter out the path before and the path T’ through the small circle.
. Finally, the number of tracks of this Angle is recorded as
, where M is a constant used for noise filtering. We calculated the number of tracks from each Angle and stored them in a 72-dimensional vector, which was smoothed by Gauss check, and then detected the local peak value. Figure 4. The following figure visualizes the distribution of smoothed counting values, and the algorithm detects local peaks in three directions.
Figure 4. Example of Trace operation ▲
2, the Merging
When we generate a road section, we need to merge it with the existing road network. But it’s not easy, you have to make sure that you have hierarchies, parallelism, multilevel relationships, and so on. In order to overcome these challenges, in this paper, the two sections are merged only if the future distribution of the track through the sections matches. In Figure 5, we show the future distribution of the trajectories of the blue and green paths. It is obvious that in example (a) the distribution is not identical, but in example (b) the distribution is almost the same.
▲ Figure 5. Example of a Merge operation ▲
The concrete implementation of the Merge operation is shown in Figure 6. For a vertex v to be merged, we calculate the future distribution of the trajectories of the paths passing through v. Then we find the vertex U around V. If the future distribution of the trajectories of the paths passing through U matches that of V, we add the segment (u,v) and return True; Returns False if the distribution of v differs from that of the surrounding vertices.
▲ Figure 6. Merge operation ▲
Three, two stage road network speculation
Although RoadRunner has a high accuracy, many sections of low-frequency access areas will be lost due to the path filtering operator. Therefore, in the second stage, in order to improve the recall rate, the results of RoadRunner need to be combined with those of other road network estimation algorithms.
It is assumed that G1 is the road network speculated by RoadRunner, and G2 generates algorithm output results for other road networks that can capture low-frequency traffic sections. We first delete the sections within the range of Rmerge in G2 and G1 to get G2′, because RoadRunner has successfully speculated that these sections have been completed. And then we put G1 and G2 prime together to get G. However, the segment added from G2′ is not connected to the rest of the segment. In order to connect these roads, for each vertex V of degree 1 in G, we make it connected with the surrounding road segment (u,w) when the following two conditions are satisfied: 1) V to
The distance of (u,w) is less than Rmerge; 2) The trajectory of the path V → P → u or V → p → w exceeds a certain threshold, where P is the projection point of V on the road section (u, W).
4. Experimental results
In this paper, the effectiveness of the proposed method is verified in four cities (Los Angeles, Boston, Chicago and New York). A 4kmx4km area is selected for each city, and the accumulated trajectory data is about 60,000 pieces. OpenStreetMap was used as a real road network for verification.
FIG. 7 shows the curves of error rate and recall rate of different methods under different parameter Settings (the closer to the upper left corner, the better the performance is; the results of different data sets are averaged). Experimental results showed that the RoadRunner+KDE method (RR-2+BE-2) had a 33.6% lower error rate than the KDE-only method (BE-2). The RoadRunner+kMeans method (RR-2+ KharITA-20) had a 60.7% lower error rate than the Kmeans-only method (KharITA-20).
Figure 7. Experimental results ▲
Five, the summary
This paper proposes a two-stage road network prediction framework, which can improve the accuracy without losing the recall rate. The core module in this framework is RoadRunner, which makes use of the connectivity of track data to generate accurate road network and performs well compared with the existing system in the face of complex road conditions.
Recommended reading:
-
How to use ClickHouse for temporal data management and mining?
-
What does a connected vehicle platform with 200,000 commercial vehicles look like?
-
NLP brings “Sci-fi Feeling” Beyond your Imagination – ACL2020 Paper Interpretation (1)
Welcome to [JD Zhilian Cloud] to learn about the developer community
More wonderful technical practice and exclusive dry goods analysis
Welcome to [JD Zhilian cloud developer] public account