preface

Hello everyone, this is the topological sorting problem in the Qi Sister Chat Algorithm series.

Topological sort, also known as Topological order, is a bit confusing because Topological sort is not a pure sorting algorithm, but a linear order that can be performed on a graph.

This algorithm sounds lofty, and today’s interviews are also very popular. For example, when I met our company, I had a whole round of topology-based sorting design.

But it’s actually an easy algorithm to understand, and follow my lead, so you’ll never forget her again.

Directed acyclic graph

We just mentioned that topological sort is only for a particular kind of graph, so what kind of graph is it for?

A: Directed Acyclic graph (DAG), Directed acyclic graph That is:

  1. The edges of the graph have to be oriented;
  2. There are no loops in the diagram.

So what is direction?

For example, wechat friends are directed, you add his friends, he may delete you you do not know… Then the friendship is one way.

What is a ring? A ring is something that depends on the direction, if you start at a point and you come back to yourself, that’s a ring.

So the left-hand side is not a ring, the right-hand side is.

So if you have a ring in a graph, like the one on the right, you have to do 3 if you want to do 1, 2 if you want to do 3, and 1 if you want to do 2, it’s an endless loop, and you can’t find the right way to open it, so you can’t find a topological order for it.

Conclusion:

  • If the graph is not DAG, it has no topological order;
  • If it is DAG, then it has at least one topological order;
  • Conversely, if it has a topological order, the graph must be DGA.

So this is a necessary and sufficient condition.

A topological sort

So what does the topological order of such a graph mean?

We use this course schedule from Baidu Baike to illustrate.

Class code Course name Ap courses
C1 Higher mathematics There is no
C2 Fundamentals of programming There is no
C3 Discrete mathematics C1, C2
C4 The data structure C3, C5
C5 Algorithmic language C2
C6 Compile technical C4, C5
C7 The operating system C4, C9
C8 General physics C1
C9 Computer Principles C8

There are nine courses, some of which have a prerequisite requirement that you must have taken “the course required in the far right column” before you can take an “advanced” course.

So what topological sorting means in this example is: it means to solve for a feasible order that will allow me to do all of this.

So how do you do that?

First of all, we can describe it with a graph. The two elements of the graph are vertices and edges, so here it is:

  • Apex: Per course
  • Side: The course at the beginning is the prerequisite of the course at the end

Let me draw this:

This graph is called AOV (Activity On Vertex) network.

  • Vertex: indicates an activity;
  • Edge: indicates the sequence of activities

So an AOV network should be a DAG, or directed acyclic graph, otherwise some activities will not be carried out.

So all of these activities can be arranged in a feasible linear sequence, which is thetaThe topological sequence.

Then the practical significance of this sequence is: according to this sequence, at the beginning of each project, it can ensure that its precursor activities have been completed, so that the whole project can proceed smoothly.

Back to our example:

  1. We can see at a glance that we should learn C1 and C2 first, because there are no requirements for these two courses, so we can learn them when we are freshmen.

  2. In sophomore year, we can learn C3, C5 and C8 in the second line, because the prerequisite courses of these three courses are C1 and C2, and we have learned them all.

  3. In junior year, you can learn the third line OF C4, C9;

  4. In the last year, I’m going to pick C6, C7.

So, we’re done, and we have a topological ordering of this graph.

Note that sometimes the topological order is not unique, as in this case, C1 before C2, or C2 before C1, either is the correct topological order for the graph, but these are two orders.

So ask your interviewer if you want any solution or if you want to list all of them.

So to sum up,

The edges in this diagram represent a dependency, if you want to take the next class, you have to take the previous class first.

It’s the same as in the game. To get an item, you have to do mission A first, then complete mission B, and finally reach the destination.

Algorithm,

In the diagram above, you can easily see the topological order, but as projects get bigger and bigger, dependencies become more complex and require a systematic approach to solving them.

So if we think back to the process of finding our own topological order, why did we look at C1, C2 first?

Because they’re not dependent on anyone else, which means they have an input of 0.

Degree of entry: The degree of entry of a vertex refers to the number of “edges pointing to that vertex”; Out-degree: The out-degree of a vertex is the number of edges that point to other points.

So we’re going to do the points with zero degree of entry first, and that’s going to record the degree of entry for each vertex. Because we can only execute it if its entry is equal to 0.

In our example, we started with C1 and C2 having an input of 0, so we can do these two first.

So the first step in this algorithm is to get the degree of entry for each vertex.

Step0: preprocess to obtain the entry degree of each point

We can use a HashMap to store this information, or an array is more elaborate.

In order to facilitate the presentation in this article, I use the table:

C1 C2 C3 C4 C5 C6 C7 C8 C9
The degree of 0 0 2 2 1 2 2 1 1

Step1

Now that we have this, we can execute these points with zero degree of entry, which is C1, C2.

So let’s put all of these points that can be executed into a container that needs to be executed, and then we can take vertices one by one from that container.

Which data structure the container chooses depends on what we need to do and which data structure serves it.

So you can first put [C1, C2] in a container,

Then think about what we need to do!

The most common thing we do is put points in and take points out, so we need a data structure that is efficient for offer and poll operations, and a queue is sufficient.

(Anything else is fine. Vertices put into the container are all of the same status and are executable regardless of the order in which they came in, but why bother? A simple queue in regular order will suffice.

Then you need to take some points out and execute them.

When we take C1 out for execution, what does that mean?

A: The “edges” for “pointing to other points” that mean “vertex C1” have all disappeared, that is, the degree of C1 has become 0.

So in the picture below, these two edges can disappear.

Now we can update the values of C3 and C8 as follows:

C3 C4 C5 C6 C7 C8 C9
The degree of 1 2 1 2 2 0 1

So what we see here is a crucial step, the degree of entry for C8 becomes zero!

This means that C8 now has no dependencies and can be put in our queue for execution.

So our queue is going to be C2, C8.

Step2

The next time we do C2,

So C2 points to C3, C5 at minus 1,

Update Form:

C3 C4 C5 C6 C7 C9
The degree of 0 2 0 2 2 1

So C3 and C5 are now free of any constraints and can be executed in a queue.

Queue now becomes: [C8, C3, C5]

Step3

So the next step is to do C8,

Corresponding C8 refers to C9 entry -1. Update form:

C4 C6 C7 C9
The degree of 2 2 2 0

So C9 doesn’t have any requirements, and it can be put in the queue.

Queue now becomes: [C3, C5, C9]

Step4

And then C3,

Corresponding C3 refers to the entry of C4 -1. Update table:

C4 C6 C7
The degree of 1 2 2

But C4 doesn’t go to zero, so there’s nothing to queue at this point.

Queue now becomes [C5, C9]

Step5

And then C5,

Then the input of C4 and C6 referred to in C5 is -1. Update the table:

C4 C6 C7
The degree of 0 1 2

Here all C4 dependencies are gone, so we can put C4 in the queue:

queue = [C9, C4]

Step6

And then I do C9,

So C9 means minus 1 for C7.

C6 C7
The degree of 1 1

So C7 doesn’t have a zero input, it can’t join the queue yet,

Queue = [C4]

Step7

And then we go to C4,

So the input of C6 and C7 pointed to by C4 is -1, update the table:

C6 C7
The degree of 0 0

Both C6 and C7 are now 0. Put them in the queue and continue until the queue is empty.

conclusion

Ok, so let’s comb through this algorithm:

Data structure here we can use map to store the entry table, about map students who are not clear can see the previous ARTICLE I wrote HashMap oh ~

Map: <key = Vertex, value = Vertex >

Graph nodes can be represented by the index of the array, and values can be represented by the value of the array, which is much more elaborate than Map.

We then use a normal queue for nodes that can be executed.

So what we do is we put those vertices that have an input of zero in the queue, and then by executing each vertex in the queue, we can make the input of those vertices that depend on that vertex being executed minus one, and if any of those vertices have an input of zero, we can put them in the queue until the queue is empty.

Details Here are a few implementation details:

When we check for new vertices with entry == 0, we don’t need to go through the entire map or array, we just need to check the changes that have just been made.

If the graph is DAG, there may not be a viable solution. An easy way to do this is to compare the number of vertices in the final result to the number of vertices in the graph, or to add a counter, if not, there is no valid solution. So this algorithm can also be used to determine whether a graph is directed acyclic.

In many cases, the condition may be the edge list of the graph, which is also a common way to represent the graph. So this list that I’ve given you represents the edges in the graph. B: Yes, I do. In fact, the graph questions usually do not give you the graph directly, but give you a scene, you need to change it back to a graph.

Time complexity

Note ⚠️ : The time complexity analysis of the graph must be two parameters, many students in the interview is O(n)…

For a graph with v vertices and e edges,

The first step is to preprocess the map or array, and you need to go through all the edges, so O(e);

In the second step, the operation O(v) is performed for the points whose entry degree == 0. If it is a DAG, all the points need to join and leave the team once.

Step three, every time you execute a vertex, you eliminate the edge that it points to, e times in total;

Total: O(v + E)

Spatial complexity

There’s an array of indegree for all points, and then there’s a queue that puts at most all points in it, so it’s O(v).

code

There are two questions on Leetcode about the order of the course, one is 207, asking if you can complete all the courses, that is, whether topological order exists; The other problem is 210, which tells you to return any topology order, and if you can’t do it, return an empty array.

We’re going to write it on 210, which is a little bit more complete and a little bit more frequent.

The input given here is the edge list we just talked about.

Example 1.

Input: 2, [[1,0]] Output: [0,1] Explanation: There are 2 courses in total here, and the prerequisite course of 1 is 0. So the correct order is [0, 1].

Example 2.

Input: 4, [[1, 0], [2, 0], [3, 1], [3, 2]] Output:,1,2,3 [0] or [0,2,1,3] Explanation: this example draw here below

Example 3.

Input: 2, [[1,0],[0,1]] Output: null Explanation: this lesson can not be taught

class Solution {
    public int[] findOrder(int numCourses, int[][] prerequisites) {
        int[] res = new int[numCourses];
        int[] indegree = new int[numCourses];

        // get the indegree for each course
        for(int[] pre : prerequisites) {
            indegree[pre[0]] + +; }// put courses with indegree == 0 to queue
        Queue<Integer> queue = new ArrayDeque<>();
        for(int i = 0; i < numCourses; i++) {
            if(indegree[i] == 0) { queue.offer(i); }}// execute the course
        int i = 0;
        while(! queue.isEmpty()) { Integer curr = queue.poll(); res[i++] = curr;// remove the pre = curr
            for(int[] pre : prerequisites) {
                if(pre[1] == curr) {
                    indegree[pre[0]] -;if(indegree[pre[0= =]]0) {
                        queue.offer(pre[0]); }}}}return i == numCourses ? res : new int[]{};
    }
}

Copy the code

In addition, topological sorting can also be implemented using DFS-depth-first search, which I don’t have space to do here, but refer to GeeksforGeeks for reference.

The practical application

We have already mentioned a use case of course selection system, which is also the most frequently tested topic.

The most important application of topological sorting is the critical path problem, which corresponds to AOE (Activity on Edge) network.

AOE network: vertices represent events, edges represent activities, and weights on edges represent the time required for activities. AOV network: Vertices represent activities and edges represent dependencies between activities.

In AOE network, the path with the maximum length from the starting point to the end point is called critical path, and the activities on the critical path are called critical activities. AOE networks are generally used to analyze the process of a large project, how much time it takes to complete at least, and how much time each activity has to maneuver.

And to see how that works, you can look at the 14 minutes and 46 seconds of this video, and it’s a good example.

This is true for any graph with dependencies between tasks.

For example, when a POM dependency imports jars, have you ever wondered how it imports jars that you didn’t directly import? For example, if you don’t import the AOP JAR, it will automatically appear. This is because some of the packages you import depend on the AOP JAR, then Maven will automatically import it for you.

Other practical applications such as:

  1. Preprocessing of speech recognition system;
  2. Manage dependencies between object files, like I just said about jar package imports;
  3. Network structure processing in deep learning.

Please feel free to comment in the comments section if you have any additional information.

The above is the whole content of this article, topological sorting is very important is also very love to test a class of algorithms, before the interview must be proficient.

If you like this article, please give me a thumbs up and leave a comment. Your support and recognition are the biggest motivation for my creation. See you in the next article!

I’m Xiao Qi, New York programming girl, lifelong learner, every night at 9 o ‘clock in the cloud study room!

See my Github for more dry articles:Github.com/xiaoqi6666/…