0x01 Basic Concepts
The core concept
Photo: a set of nodes connected by edges (or vertex) | some discrete nodes and the number of a collection of edges connecting them, is the abstract model of network structure.
** edge: ** A line segment connecting two nodes.
** weight: the weight of ** edges
** Path: ** A continuous sequence of nodes, with a-b-F representing the path from node A to node F. A path that does not contain duplicate nodes is called a “simple path,” and the paths we’ll talk about in this installment are all simple paths
** Node distance: ** sum of path weights
Nodes and edges are the basic units of a graph
Other concepts
Ring: A non-empty path that repeats only the first and last nodes. A graph with a ring is called a ring graph, and a graph without a ring is called an acyclic graph
Parallel edge: An edge connected to the same node
Adjacent node: Two nodes connected on the same edge
Node degree: the number of adjacent nodes of a node
Connected graph: GRAPH in which paths exist on any two nodes
Directed graph & undirected graph
Directed graphs and undirected graphs are divided according to whether an edge has direction or not
In a graph, either all sides have direction, or all sides have no direction
0x02 Storage structure
Graphs can have multiple storage structures, and there is no one right way to use a graph. The actual way to use a graph depends only on the problem to be solved and the type of graph
1. Adjacency matrix
The rows and columns of a matrix represent nodes, and the values of the matrix represent the weights of the edges. The weight of the unconnected edge is 0. Symmetric matrices for undirected graphs. For many graphs, a large number of values are zero, which is a waste of storage space
2. Association matrix
The rows of a matrix represent nodes and the columns represent edges. It is applicable to the situation where the number of edges is larger than that of nodes, greatly saving space and memory.
3. Adjacency list
The adjacency list consists of a list of adjacent nodes for each node in the graph. The benefit is that you don’t waste storage space.
Compared with the storage mode of matrix, adjacency list can be stored by a variety of data structures, such as number group, linked list, etc., matrix can only be stored by two-dimensional array. The advantage of matrix storage is that it can be optimized for a specific type of graph, whether in storage space or search efficiency.
0x03 Traversal/search
The only difference between traversal and search is that there is no specified end point, there is no difference in the algorithm core
1. Breadth-first search
A search algorithm that preferentially traverses adjacent nodes
There are only two core steps:
-
Access a node, mark the node as visited, and clear it from the node to be accessed.
-
The adjacent nodes of the marked node are nodes to be accessed. If they have been visited or are already nodes to be accessed, they are not marked.
The specific traversal process starting from A is as follows:
-
Select A as the starting node;
-
Access A, and its adjacent nodes are B and C. Add nodes B and C to be accessed. 【B, C】
-
Access B, and its adjacent nodes are A, D, and E. Add nodes D and E to be accessed. 【C, D, E】;
-
Access C, its adjacent nodes are A, D, E. No new node to be accessed is added. 【D, E】;
-
Access D, and its adjacent nodes are B, C and F. Add node F to be accessed. 【E, F】;
-
Access E, and its adjacent nodes are B, C and F. No new node to be accessed is added. [F];
-
Access F, and its adjacent nodes are D and E. No new node to be accessed is added. 【 】.
-
If the node to be accessed is empty, the traversal is complete.
The access sequence is A, B, C, D, E, and F
Depth-first search
A search algorithm for adjacent nodes that traverse adjacent nodes first
The core steps are also two steps :(exactly the same as breadth first)
-
Access a node, mark the node as visited, and clear it from the node to be accessed.
-
The adjacent nodes of the marked node are nodes to be accessed. If they have been visited or are already nodes to be accessed, they are not marked.
The difference between breadth-first search and breadth-first search lies in the different storage modes of nodes to be accessed: 1. Breadth-first: queue-fifO 2. Depth-first: stack – First in, last out
The specific traversal process starting from A is as follows:
-
Select A as the starting node;
-
Access A, and its adjacent nodes are B and C. Add nodes B and C to be accessed. 【C, B】
-
Access C, its adjacent nodes are A, D, E. Add nodes D and E to be accessed. 【E, D, B】;
-
Access E, and its adjacent nodes are B, C and F. Add node F to be accessed. 【F, D, B】;
-
Access F, and its adjacent nodes are D and E. No new node to be accessed is added. 【D, B】;
-
Access D, and its adjacent nodes are B, C and F. No new node to be accessed is added. [B];
-
Access B, and its adjacent nodes are A, D, and F. No new node to be accessed is added. 【 】.
-
If the node to be accessed is empty, the traversal is complete.
The access sequence is A, C, E, F, D, and B
0x04 Two classical algorithm applications
1. Shortest/long path (minimum/maximum distance) of two nodes
Among all paths between two nodes, the path with the least weight overlap is the shortest path
There are many classical algorithms to find the shortest path. Here is only one that is easier to understand:
Bellman-ford algorithm (edge as basic unit)
Algorithm is the core
-
Set the starting weight to 0 and the other nodes to infinity. The weight of a node is the temporary distance of the shortest path from the starting point to the node.
-
Iterate over all edges, updating the weights of nodes on both sides of edges
-
Weight calculation rule: Take the two directions of the edge to calculate the weight of the two nodes respectively, end weight = math.min (original end weight, starting weight + edge weight)
-
In an undirected graph, you actually only need to recalculate the weighted node of the two nodes
-
Repeat step 2 until no more node weights are changed
Algorithm demo
A is the starting point, F is the ending point, find the shortest path from A to F
There are 8 edges, A-B, A-C, B-D, B-E, C-D, C-E, D-F, e-F
First traversal
-
A-B: the weight of B = 0 + 9 = 9, the original weight ∞, the final weight is 9
2. A-c: the weight of C = 0 + 1 = 1, the original weight ∞, the final weight is 1
3. B-d: the weight of D = 9 + 2 = 11, the original weight ∞, the final weight is 11
4. B-e: the weight of E = 9 + 5 = 14, the original weight ∞, the final weight is 14
5. C-d: the weight of D = 1 + 3 = 4, the original weight is 11, the final weight is 4
6. C-e: the weight of E = 1 + 7 = 8, the original weight is 14, the final weight is 8
7. D-F: C weight = 4 + 3 = 7, the original weight ∞, the final weight is 7
8. E-f: the weight of E = 7 + 5 = 12, the original weight is 8, the final weight is 8
After the first iteration, all the infinity is eliminated
The second time through, the final result is as follows:
Compared to the first traversal, only node B’s weight changes from 9 to 6. So the shortest path from A to B is not a-B, at least for now a-C-D-B is shorter.
The third time through, the final result is as follows:
Compared with the second traversal, no node weight changes, the algorithm ends.
The final conclusion is that the shortest path from A to F is a-C-d-f, and the weight is 7. As shown in “Get color # ff6B00” below
Other algorithms
Dijkstra Algorithm, A* (A-STAR) Algorithm, Shortest Path Faster Algorithm (SPFA) Algorithm, Floyd Algorithm, etc. (see recommended books for details)
2. Minimum spanning tree
Construct the minimum cost spanning tree of the connected network. To connect multiple or all nodes in a connected graph with a minimum sum of paths. In simple terms, it means connecting n nodes with n minus 1 edges and making the path shortest
Prim algorithm (node as basic unit)
Algorithm is the core
-
Start with the start node and add it to the minimum spanning tree
-
Through all non-minimum spanning tree nodes, find the nearest adjacency to any node in the spanning tree, and add the adjacency to the minimum spanning tree.
-
If new members have been added to the minimum spanning tree, you need to update the minimum value of the remaining nodes
-
Repeat steps 2 and 3 until the minimum spanning tree is constructed (i.e., the minimum number of spanning tree nodes is reached)
Algorithm demo
- Add node A to the minimum spanning tree
2. Traverse the remaining nodes, find that C is closest to A, and add it to the minimum spanning tree
3. Traverse the remaining nodes, find that D is closest to C, and add it to the minimum spanning tree
4. Traverse the remaining nodes, find that B is closest to D, and add it to the minimum spanning tree
5. Traverse the remaining nodes, find that F is closest to D, and add it to the minimum spanning tree
6. Traverse the remaining nodes, find that E is closest to B or F, and add it to the minimum spanning tree
7. There are no remaining nodes. Since there are two nearest paths with equal distances in Step 6, the figure has two minimum spanning trees
Other algorithms
Kruskal algorithm – to generate the smallest tree in edge units. (See recommended books for details)
A practical example of the 0x05 diagram
1. Database Neo4j based on picture storage
Neo4j Graph Platform – The Leader in Graph Databases
neo4j.com
2. Six degrees of segmentation theory
Any two people in the world who don’t know each other need very few middlemen to establish a connection. Harvard psychology professor Stanley Milgram did a linkage experiment based on this concept in 1967, trying to prove that it took an average of six steps to contact any two people who didn’t know each other
3. Map navigation – path planning, shortest path, subway path planning, travel and flight selection, etc
Graphs are better at studying distances and relationships between nodes
0x06 Algorithm Books Recommended
An introduction to
Book.douban.com/subject/303…
Book.douban.com/subject/258…
The advanced
Book.douban.com/subject/199…
Book.douban.com/subject/266…
Book.douban.com/subject/204…