7.4 Storage structure of the figure
The storage structure of graphs is more complex than that of linear tables and trees. First of all, what we say verbally about “vertex position” or “adjacencies” is only a relative concept. In fact, from the definition of the logical structure of a graph, any vertex can be regarded as the first vertex, and there is no order relationship between the adjacent points of any vertex. For example, the four graphs in Figure 7-4-1 are actually the same graph, but the positions of the vertices are different, resulting in a different appearance.
Figure 7-4-1
Because the structure of a graph is complex, there may be a connection between any two vertices, so the relationship between the elements cannot be expressed by the physical location of the data elements in memory. That is to say, the graph cannot be expressed by a simple sequential storage structure. The multilist approach, where a vertex in a graph is represented by a node in a data field and a vertex in a pointer field, although you can do a graph, in fact, as we’ve already discussed in a tree, this is problematic. If the degree of each vertex differs greatly, designing node structure according to the vertex with the largest degree will cause a lot of waste of memory cells, and designing different vertex structure according to the degree of each vertex will bring inconvenience of operation. So, for a graph, how to implement physical storage for it is a difficult problem, but our predecessors have solved it, now let’s look at the five different storage structures provided by the predecessors.
7.4.1 Adjacency matrix
Consider that a graph consists of two parts: vertices and edges or arcs. It’s difficult to put them together, so it’s natural to think about storing them in two separate structures. Vertices have no size, priority or rank, so storing them in a one-dimensional array is a good choice. Edges or arcing, because they’re vertices to vertices, can’t be done in one dimension, so let’s consider storing them in a two-dimensional array. And so our scheme for the adjacency matrix was born.
The Adjacency Matrix storage of graphs is represented by two arrays. A one-dimensional array stores information about vertices in a graph, and a two-dimensional array (called the adjacencies matrix) stores information about edges or arcs in a graph.
Let the graph G have n vertices, then the adjacencies matrix is an n×n square matrix, defined as:
For an example, the left of Figure 7-4-2 is an undirected graph.
Figure 7-4-2
We can set two arrays, vertex[4]={v0, v1, v2, v3}, and arc[4][4] as a matrix shown on the right of Figure 7-4-2. For a brief explanation, the main diagonal values of the matrix, arc[0][0], arc[1][1], arc[2][2], arc[3][3], are all 0 because there is no edge from the vertex to itself, such as v0 to v0. Arc [0][1]=1 because the edges v0 through v1 exist, and arc[1][3]=0 because the edges v1 through v3 do not exist. And since it’s an undirected graph, the edges from V1 to v3 don’t exist, which means the edges from V3 to v1 don’t exist either. So the array of edges of an undirected graph is a symmetric matrix.
Huh? What is a symmetric matrix? It doesn’t matter if you forget. Review. A symmetric matrix is a matrix of order n whose elements aij=aji, (0≤ I,j≤n). That is, the main diagonal from the upper left corner to the lower right corner of the matrix is the axis, and the elements in the upper right corner and those corresponding to the lower left corner are all equal.
With this matrix, we can easily know the information in the graph.
1. It’s very easy to determine whether any two vertices have edges or not.
2. We want to know the degree of some vertex, which is essentially the sum of the entries in row I (or column I) of this vertex vi in the adjacencies matrix. For example, the degree of vertex V1 is 1+0+1+0=2.
3. If arc[I][j] is 1, it is the adjacencies.
Let’s consider another directed pattern, as shown in figure 7-4-3 on the left.
Figure 7-4-3
The vertex array [4]={v0, v1, v2, v3}, and the arc array [4][4] is a matrix like the one shown in figure 7-4-3. It’s still 0 on the main diagonal. However, since it is a directed graph, this matrix is not symmetric. For example, arc[1][0]=1 when v1 goes to v0, and arc[0][1]=0 when v0 goes to v1, there is no arc.
A directed graph pays attention to the input degree and the output degree. The input degree of vertex v1 is 1, which is exactly the sum of the numbers in column V1. Vertex v1 has an outdegree of 2, which is the sum of the numbers in row V1.
In the same way as the undirected graph, to determine whether there is arc from vertex vi to vj, we only need to find whether arc[I][j] in the matrix is 1. To obtain all adjacencies of vi is to scan the elements in row I of the matrix and find the vertices where arc[I][j] is 1.
In graph terminology, we have the concept of a net, that is, a graph with weights on each edge is called a net. So these weights need to be stored, so how do you deal with this matrix to accommodate that? We have a solution.
Let graph G be a netgraph with n vertices, then the adjacent matrix is an n×n square matrix, and is defined as:
Where wij is the weight over (vi,vj) or <vi,vj>. Infinity represents a value that the computer allows to be greater than all edge weights, i.e. an impossible limit value. Some of you might say, why not 0? The reason is that the weight WIJ is mostly positive, but occasionally it can be 0 or even negative. Therefore, an impossible value must be used to represent non-existence. Figure 7-4-4 shows a directed netgraph on the left and its adjacent matrix on the right.
Figure 7-4-4
So how does the adjacency matrix enable the creation of graphs? Let’s first look at the structure of the adjacency matrix stored in the diagram, as follows.
typedef char VertexType; /* Vertex types should be user-defined */
typedef int EdgeType; /* The weight type on the edge should be user-defined */
#define MAXVEX 100 /* Maximum number of vertices, which should be user-defined */
#define INFINITY 65535 /* Use 65535 to represent ∞ */
typedef struct
{
VertexType vexs[MAXVEX]; /* Vertex table */
EdgeType arc[MAXVEX][MAXVEX]; /* The adjacencies matrix can be viewed as an edge table */
int numVertexes, numEdges; /* The current number of vertices and edges */
}MGraph;
Copy the code
With this structure definition, we construct a graph, which is essentially the process of feeding data to the vertex table and the edge table. Let’s look at the code for creating an undirected netgraph.
It can also be obtained from the code that the time complexity of the creation of an undirected netgraph with n vertices and E edges is O(n+n2+e), where the initialization of the adjacent matrix G.arrc takes O(n2) time.
7.4.2 adjacency table
Adjacencies matrix is a good graph memory structure, but we also found that for the graph with fewer edges relative to vertices, this structure is a great waste of memory space. For example, if we are dealing with a sparse directed graph like Figure 7-4-5, and there are no arcs in the adjacencies matrix other than arc[1][0] weights, all that storage is wasted.
Figure 7-4-5
So let’s consider another way of structuring storage. Recall that when we talked about linear tables, the sequential memory structure had the problem of preallocating memory and wasting storage space, which led to the chained storage structure. In the same way, we can also consider using the opposite side or arc storage chain to avoid the problem of waste of space.
And remember when we talked about storage structures in trees, we talked about a child representation, where you put nodes in an array, and you chain the children of the nodes, so no matter how many children there are, there’s no waste of space. The same idea applies to storing graphs. We call this array-linked List storage method Adjacency List.
The adjacency list is handled like this.
1. Vertices in a graph are stored in a one-dimensional array. Of course, vertices can also be stored in a single linked list, but arrays are easier to read. In addition, for the vertex array, each data element also needs to store a pointer to the first adjacent point, in order to find the edge information of the vertex.
2. All the adjacent points of each vertex vi in the graph constitute a linear table. Because the number of adjacent points is uncertain, it is stored in a single linked list. The undirected graph is called the edge table of vertex vi, and the directed graph is called the edge table of vertex VI as the end of the arc.
For example, Figure 7-4-6 shows the adjacence list structure of an undirected graph.
Figure 7-4-6
From the graph, we know that each node of the vertex table is represented by data and Firstedge fields. Data is the data field, which stores information about vertices, and Firstedge is the pointer field, which points to the first node of the edge table, that is, the first adjacent point of this vertex. The edge table node consists of the adjvex and next fields. Adjvex is the domain of adjacencies and stores the subscripts of the adjacencies of a vertex in the vertex table, while next stores a pointer to the next node in the edge table. For example, v1 vertices are adjacent to v0 and v2, then in the edge table of V1, Adjvex is 0 of v0 and 2 of v2, respectively.
This structure is also very convenient for us to get relevant information about the graph. For example, if we want to know the degree of a vertex, we look up the number of nodes in the edge table of that vertex. To determine whether there is an edge from vertex vi to vj, just test whether there is a subscript J of node Vj of Adjvex in the edge table of vertex vi. If we find all the adjacencies of the vertices, we are actually traversing the edge list of the vertices, and the vertices corresponding to the adjvex domain are the adjacencies.
In the case of directed graphs, the adjacencies list structure is similar. For example, the adjacencies list of the first graph in Figure 7-4-7 is the second graph. But it’s important to note that with a directed graph because of the direction, we store the edge table with vertices as tails, so it’s very easy to get the exit degree of each vertex. Sometimes, however, in order to facilitate the determination of the vertices or of the arc with the vertex as the head of the arc, we may establish an inverse adjacencies list of the directed graph, that is, for each vertex VI a table is established with the link vi as the arc head. This is shown in the third picture of Figure 7-4-7.
Figure 7-4-7
It’s very easy to figure out what the in-degree or out-degree of a vertex is, and it’s very easy to figure out if there’s an arc between two vertices.
For the network graph with weights, one more weight data field can be added to the node definition of the edge table to store weight information, as shown in FIG. 7-4-8.
Figure 7-4-8
With a diagram of these structures, the following code for node definition is easy to understand.
typedef char VertexType; /* typedef int EdgeType; Int adjvex; /* int adjvex; /* int adjvex; /* * * * * * * * * * * * * * * */ struct EdgeNode *next; /* struct EdgeNode *next; */}EdgeNode; Struct VertexNode /* struct VertexNode */ {VertexType data; /* EdgeNode *firstedge; /* EdgeNode *firstedge; VertexNode, AdjList[MAXVEX]; typedef struct { AdjList adjList; int numVertexes,numEdges; */}GraphAdjList;Copy the code
For the creation of the adjacencies list, that’s a logical thing to do. The code for creating an adjacency list for an undirected graph is as follows.
The bold code here applies the head insertion method (3) that we explained in the singly linked list creation. Since an edge corresponds to two vertices in an undirected graph, I and J are inserted at one time in the loop. The time complexity of this algorithm, for n vertices and E edges, is easily O(n+e).
7.4.3 Cross list
I remember seeing an idea that I really liked. It refers to that in the United States, security guards are required to perform security work in such places as malls, supermarkets, docks, warehouses and office buildings through video surveillance at night, as shown in FIG. 7-4-9. Night shift is always expensive, so the personnel cost is high. A guy from My country, who was always video-chatting with his American friends at home, but was always struggling with the time difference between day and night, suddenly had a brilliant idea. He created a company, carries on the customer’s video monitoring task, because the United States is China’s the day of the night, use the Internet, his staff to work during the day can monitor to the actual situation of the warehouse at night, in the event of emergencies such as fire, theft, timely call to the local related personnel processing. By taking advantage of the time difference and personnel costs, the dude made a fortune. This idea lets us know that making full use of existing resources, positive thinking, reverse thinking and integrated thinking can create greater value.
Figure 7-4-9
So for directed graphs, adjacency lists are flawed. If you care about the out-of-degree problem, if you want to know the in-degree, you have to go through the whole graph to know it, whereas the inverse adjacencies list solves the problem where the in-degree is not known. Student: Is it possible to combine the adjacency list with the inverse adjacency list? The answer is yes, by putting them together. This is what we are now going to talk about as a method of storing directed graphs: Orthogonal lists.
We redefine the vertex table node structure as shown in Table 7-4-1.
Table 7-4-1
Where, firstin represents the head pointer of the incoming edge table, pointing to the first node in the incoming edge table of the vertex, and firstOut represents the head pointer of the outgoing edge table, pointing to the first node in the outgoing edge table of the vertex.
The node structure of the redefined edge table is shown in Table 7-4-2.
Table 7-4-2
Tailvex refers to the subscript of the arc starting point in the vertex table, headvex refers to the subscript of the arc ending point in the vertex table, headlink refers to the edge table pointer field, pointing to the next edge with the same end point, taillink refers to the edge table pointer field, pointing to the next edge with the same starting point. In the case of a network, an additional weight field can be added to store weights.
For example, in Figure 7-4-10, the vertices are still stored in a one-dimensional array {v0,v1,v2,v3}, and the solid arrow pointer is exactly the same as the adjacencies list in Figure 7-4-7. In the case of vertex v0, firstout refers to the first node v3 in the outbound table. So the headvex of v0 is 3, and tailvex is the index 0 of v0. Since v0 has only one outbound vertex, both headlink and Taillink are null.
Figure 7-4-10
The point is to explain what the dotted arrow means, which is actually a representation of the inverse adjacencies of this graph. For v0, it has two vertices v1 and v2 on the edge. Therefore, the Firstin of v0 points to the node whose headvex is 0 in the edge table nodes of vertex v1, as (1) in the right figure of FIG. 7-4-10. Then the headlink of the incoming node points to the next incoming vertex v2, ② in the figure. For vertex v1, it has an incoming edge vertex v2, so its firstin points to the node with headvex of 1 in the edge table nodes of vertex v2, ③ in the figure. Vertices v2 and v3 also have an incoming vertex, ④ and ⑤ in the figure.
The advantage of the cross list is that it integrates the adjacence list and the inverse adjacence list together, so that it is easy to find the arc with vi as the tail and the arc with vi as the head, so that it is easy to find the exit and entry degrees of the vertices. In addition to its complex structure, the time complexity of creating the graph algorithm is the same as the adjacencies list. Therefore, in the application of directed graphs, the cross list is a very good data structure model.
7.4.4 Adjacency Multiple List
We’ve talked about optimized storage structures for directed graphs, so any questions about adjacencies for undirected graphs? If we are in the application of the undirected graph, the focus is on the vertices, then the adjacency list is a good choice, but if we pay more attention to the edge of the operation, such as to have visited the side marking, delete operations such as one edge, that means, need to find the edge of the two side table node, it is more troublesome. For example, in Figure 7-4-11, if you want to delete the edge (v0,v2) in the left figure, you need to delete the two shadow nodes in the right table of the adjacent list structure, which is obviously more complicated.
Figure 7-4-11
Therefore, we also imitate the way of cross list, some changes to the structure of edge table nodes, maybe we can avoid the problem just mentioned.
The redefined edge table node structure is shown in Table 7-4-3.
Table 7-4-3
Where ivex and jvex are the indices of two vertices attached to an edge in the vertex table. Ilink points to the next edge of the dependent vertex ivex, and jlink points to the next edge of the dependent vertex Jvex. This is the adjacent multiple table structure.
Let’s look at the process of drawing a structure diagram, understand how it is wired, and understand the principle of adjacencies. As shown in Figure 7-4-12, the left figure tells us that it has 4 vertices and 5 edges. Obviously, we should draw the edge table nodes with 4 vertices and 5 edges first. Since this is an undirected graph, it doesn’t matter if ivex is 0, jvex is 1 or vice versa, but for the sake of drawing, set ivex to the same value as the vertex index on the other side.
Figure 7-4-12
We are connected, as shown in Figure 7-4-13. First of all, ①, ②, ③, ④ of the line is to point the vertex’s firstedge to an edge, and the vertex’s subscript has to be the same as the value of ivex, which makes sense. And then, since the (v0,v1) side of the vertex v0 is adjacent to (v0,v3) and (v0,v2). So the line ⑤⑥ is the goal of pointing to the next edge attached to the vertex v0. Note that the node to which Ilink points must have the same jvex value as its ivex value. In the same way, the line ⑦ refers to the edge (v1,v0), which is equivalent to the vertex v1 pointing to the next edge (v1,v2). V2 has three edges attached to it, so after ③ we have ⑧, ⑨. Line ⑩ is the next edge of vertex v3 after line ④. There are five edges on the left, so there are ten lines on the right, exactly as you would expect.
Figure 7-4-13
At this point, it should be clear to you that the only difference between the adjacencies multilist and the adjacencies multilist is that the same edge is represented by two nodes in the adjacencies multilist, but only by one node in the adjacencies multilist. To remove the edge (v0,v2) on the left, just change the ⑥⑨ link on the right to ∧. Since the various basic operations are implemented similarly to adjacencies, we won’t go through the code here.
7.4.5 Edge set group
An edge set is made up of two one-dimensional arrays. One is to store information about vertices; The other is to store edge information. Each data element of this edge array consists of the begin index (BEGIN), end index (end), and weight of an edge, as shown in Figure 7-4-14. Obviously, the edge-set set is concerned with the set of edges, and to find the degree of a vertex in the edge-set set requires scanning the entire array of edges, which is not very efficient. It is therefore more suitable for edge-by-edge operations than vertex-related operations. The application of edge set sets is described in Kruskal’s algorithm in Section 7.6.2 of this chapter and will not be detailed here.
Figure 7-4-14
The edge array structure defined is shown in Table 7-4-4.
Table 7-4-4
Begin, end, and weight respectively indicate the start index, end index, and weight of storage.