This is the first day of my participation in the August Challenge. For details, see:August is more challenging

The minimum spanning tree in the last article is not enough feeling? I don’t know how well you have mastered, have you figured out the principles of prim and Kruskal algorithms? At the interview, if you can’t write it, at least give a general idea, of course, if you are going to graduate school, you will need to deeply understand and remember the entire algorithm code.

What is the shortest path

Today we’re going to look at another classic problem in the application of graphs, which is the shortest path problem. This problem is different from the minimum spanning tree, which requires you to connect all the nodes and take the path with the least weight. And the shortest path is the path that has the least weight from one vertex to another. This path is not necessarily included in the minimum spanning tree, so they are not very closely related.

From this graph, our shortest path from node 1 to node 2 is 2, which is pretty obvious. How about going from node one to node three? Not just the middle path with weight 6, but the path 1->2->3, which adds up to weight 5.

And then if we look at node 3, the shortest path to node 1 should be the path 3->4->1, which is the path with weight 6, not the path with weight 7 of the line in the middle.

Yeah, that’s the shortest path concept. In shortest paths, we usually deal with one-way graphs, but what about in real life? The most typical map-related applications are bidirectional. But that doesn’t stop us from learning. We can think of the nodes in this example as urban train stations. Even the train lines connecting node 1 and node 3 don’t always come and go at the same time. For example, the full journey time of Z2 from Changsha to Beijing is 14 hours and 42 minutes, and that of Z1 back is 14 hours and 10 minutes. So if we can choose the other train, such as a train from changsha to shijiazhuang may only need 8 hours, then only need to 2 hours from shijiazhuang to Beijing, so we choose this line the total time is only need 10 hours (of course, this is just a example, everybody in the high-speed rail or more will definitely select originating station under the condition of the train to sit).

Multi-source shortest path Floyd algorithm

First, let’s talk about a multi-source shortest path algorithm. So what is multiple sources?

So this is the algorithm that’s going to be able to figure out the shortest path from node to node. That’s right. So with this one algorithm, the shortest path from one node to another node is calculated at once. Magic? No, no, no, it’s even more amazing, and you’re gonna say Oh! My God! Is its core code, only five lines!!

function Floyd($graphArr){
    $n = count($graphArr);
    
    for($k = 1;$k< =$n;$k{+ +)// Let k be the node passed by
        for($i = 1;$i< =$n;$i{+ +)for($j = 1;$j< =$n;$j{+ +)// If passing through k makes the path from I to j shorter, then update the path between I and j to the result after passing through k
                if($graphArr[$i] [$j] > $graphArr[$i] [$k] + $graphArr[$k] [$j]) {$graphArr[$i] [$j] = $graphArr[$i] [$k] + $graphArr[$k] [$j]; }}}}for($i = 1;$i< =$n;$i{+ +)for($j = 1;$j< =$n;$j{+ +)echo $graphArr[$i] [$j].' ';
        }
        echoPHP_EOL; }}// Please enter the terminal number: 4
// Please enter the edge number: 8
// Please enter an edge in the format of weight 1 2 2
// Please enter the edge, the format is: 1 3 6
// Please enter an edge in the format of weight 1 4 4
// Please enter an edge in the format of weights 2 3 3
// Please enter an edge in the format of weight 3 1 7
// Please enter an edge in the format of weight 3, 4, 1
// Please enter an edge in the format of weight 4 1 5
// Please enter the edge, the format is: 4 3 12
// 0 2 5 4 
// 9 0 3 4 
// 6 8 0 1 
// 5 7 10 0 
Copy the code

We can verify the result first, which is the last output matrix in the comment. The shortest distance from node 1 to nodes 2, 3 and 4 is 2, 5 and 4. The shortest distance from node 3 to node 1, 2, 4 is 6, 8, 1. In other words, the adjacencies matrix of the original graph becomes the shortest path matrix. Each row represents the shortest distance from each node to the other.

Okay, that’s fine, so what is the code? What is this k? Don’t worry. Let’s take it step by step.

  • If the distance between two points is not the shortest, then there must be another point as the medium to jump from point I to this point and then to point J. Such a path is closer than the direct path from I to J. Let us define this point as point K

  • But we don’t know which node we’re going to go to, and maybe it’s not just one K, maybe we’re going to go through multiple k’s from I to j, and in that case, we’re going to start at k, which is the first level of the loop

  • Under the first level loop, we carry out our normal traversal loop of I and j to obtain the length of I directly to j, namely [I][j]. And in this case, since we have the outermost k, we also know that if I goes from K to j, it’s going to be this distance, which is [I][k] + [k][I]

  • Obviously, if [I] [k] + [k] [I] distance than [I] [j] short, update [j] [I] value for [I] [k] + [k] [I]

  • After the internal I and J cycles are completed, the traversal of the first node as a media jump is also completed, and the weight of each node in the current matrix is already the shortest path after the first node as a media

  • But, that’s not exactly true, maybe the shortest path is through I, K1, k2, j, so the outer k loop continues to iterate through the second node as the medium node

  • We loop until all nodes have done the intermediate medium node once, and then we get a matrix of the shortest paths, which is the output of our test code above

So let’s take node 4 and node 3. We define 4 as I, and node 3 as j.

When initialized, [I][j] is 12, which is fine, and the weight of the edge with the arrow is 12.

[I][j] = 1, [I][k] = 5, [k][j] = 6, [I][j] = 11, [I][j] = 11, [I][j] = 11

Also, if I is 4 and j is 2, their shortest paths are also assigned 7 in the loop k=1. Originally 4 to 2 had no direct edges, now they have a path with the connection of node 1, which is [4][2] = [4][1] + [1][2] = 7.

When k = 2, [I][j] = 11, then [I][k] = 11. [I][k] + [k][j] = 10, [I][j] = 10.

The loop continues, but there is no smaller value than this path, so the shortest path [4][2] ends up being 10.

Do you look faint? Take a pen and draw on a piece of paper or a notebook, and for each step k, draw what the current shortest path matrix looks like. Once you do this, you immediately know the core secret of Floyd’s algorithm.

I have to say, the wisdom of predecessors is really great, but it is predecessors, in fact, Floyd big guy published this algorithm in 1962, but the core idea of this algorithm is the idea of dynamic programming in mathematics. So, algorithms and mathematics can’t be separated from each other, which is not the number-one in mathematics.

Single source shortest path Dijkstra algorithm

After talking about multi-source shortest paths, let’s talk about a well-known single-source shortest path algorithm. While the multiple sources above are great, the time complexity and time efficiency is really bad. Three N times of nested loops is O(N3). If you have a little bit more data you can basically go from Oh! My God! Become Oh! FxxK! . And most of the time, what we need is a fixed shortest path problem from one point to another, which is a single source shortest path problem. At this point, you can use this slightly more efficient algorithm to solve the problem quickly.

// Origin indicates the source point, which is the shortest path from one node to another
function Dijkstra($graphArr.$origin)
{
    $n = count($graphArr);
    $dis = []; // Record the minimum value
    $book = []; // Record whether the node has been accessed
    // Initialize the weights from the source point to each point
    for ($i = 1; $i< =$n; $i{+ +)$dis[$i] = $graphArr[$origin] [$i]; // Default weights from the source point to other points
        $book[$i] = 0; // None of the nodes have been accessed
    }

    $book[$origin] = 1; // The source point itself is marked as accessed

    // The core algorithm
    for ($i = 1; $i< =$n - 1; $i{+ +)$min = INFINITY;
        // Find the node closest to the target node
        for ($j = 1; $j< =$n; $j{+ +)// If the node has not been accessed and the weight of the current node is less than the min value
            if ($book[$j] = =0 && $dis[$j] < $min) {
                $min = $dis[$j]; // min change to the path value of the current node
                $u = $j; // The variable u becomes the current node
            }
            // After iterating through all nodes, u is the nearest vertex
        }
        $book[$u] = 1; // Mark u as accessed
        for ($v = 1; $v< =$n; $v{+ +)// If [u][v] vertex is less than infinity
            if ($graphArr[$u] [$v] < INFINITY) {
                // If dis[v] is greater than dis[u]+g[u][v]
                if ($dis[$v] > $dis[$u] + $graphArr[$u] [$v]) {
                    // Assign current dis[v] to dis[u]+g[u][v]
                    $dis[$v] = $dis[$u] + $graphArr[$u] [$v]; }}}// The closest node completes and continues to the next closest node
    }

    for ($i = 1; $i< =$n; $i{+ +)echo $dis[$i], PHP_EOL; }}// Please enter the terminal number: 4
// Please enter the edge number: 8
// Please enter an edge in the format of weight 1 2 2
// Please enter the edge, the format is: 1 3 6
// Please enter an edge in the format of weight 1 4 4
// Please enter an edge in the format of weights 2 3 3
// Please enter an edge in the format of weight 3 1 7
// Please enter an edge in the format of weight 3, 4, 1
// Please enter an edge in the format of weight 4 1 5
// Please enter the edge, the format is: 4 3 12

// Test the shortest path from the fourth node to the other nodes
Dijkstra($graphArr.4);
/ / 5
/ / 7
/ / 10
/ / 0
Copy the code

That’s a lot of code, but if you look at the core algorithm, it’s just two layers of nested loops, and the time complexity is down to O(N2), which is a huge improvement over Floyd. Of course, its scenario is also limited, that is, only one node one node calculation.

All right, let’s see what Dijkstra is doing. We are still using the simple graph above, and we are still looking at the execution of the algorithm from node 4 to the other nodes.

  • First of all, we initialize the default value from node 4 to all other nodes, so there’s no direct path from node 4 to node 2, so it’s infinity, 5 to node 1, 12 to node 3.

  • Then mark node 4 as accessed, i.e., book[4] = 1

  • We go into the core algorithm, we go through the node from the beginning, and this is labeled I sub, because this is a single source shortest path, so we don’t have to look at ourselves to get to our own shortest path, we just need n minus one loop

  • Start the circulation of the j layer to determine whether the current node has been marked, has not been tagged to see whether its value is the smallest, the last cycle is completed for a starting from node 4 weight minimum path, and will arrive this path node under the tag for u, mark u subscript nodes to have access to this

  • Enter the V loop and determine whether the node from U to V is infinite. If not, then determine whether the weight of the node from U to v plus the original dis[u] is less than the weight recorded in dis[v]. If it is smaller than this, Update dis[v] to add the original dis[u] weight to nodes from U to V

  • The comparison algorithm is repeated in a loop

For node 4, DIS has undergone the following changes:

  • First, by default dis = [5, 9999999, 12, 0]

  • After the first loop, node 1 completes the search, and finds in the loop of V that it can go from node 1 to node 2 and node 3 with smaller values than the original, so dis = [5, 7, 11, 0]

  • After the second loop, node 2 completes the search. This loop finds that the distance from node 2 to node 3 is shorter, so dis = [5, 7, 10, 0]

  • Dis = [5, 7, 10, 0], dis = [5, 7, 10, 0]

Is that clear? If you don’t understand, try it out yourself. Either a breakpoint or a dis and book output in the middle will help you better understand how each step of the algorithm is executed. As you can see from the code, Dijkstra’s time complexity is O(N2), which is much faster than Floyd.

conclusion

So that ends the two most typical applications and algorithms of graphs. Of course, the content of the diagram can be much more than these, the more typical or schedule network diagram algorithm, especially when doing some project management system will be very useful. Of course, the more advanced content is to study graph Theory. This is far beyond my level, and I hope that students with more mathematics related foundation can continue to study. And I, first go to evil fill next mathematics!!

Test code:

Github.com/zhangyue050…

Reference documents:

Data Structures, 2nd Ed., Weimin Yan

Data Structures, 2nd Ed., Yue Chen

“Data structure high score notes” 2020 edition, day attendance entrance examination

The Aha! Algorithm