Introduction to the

Graph is a common data structure and representation, and is often used in visual scenes to show the data with associated relationships. When visualizing a graph, it is often necessary to arrange it automatically, and different layout methods are required for different problems and scenarios. This paper mainly introduces the idea of hierarchical layout of graph.

Some commonly used diagram layout methods. Image taken from G6

The hierarchical layout of the diagram

The Directed Acyclic Graph (DAG) is often used to display hierarchical layouts when the data has a certain hierarchical structure or sequence. Common scenarios include flow diagrams, organization architecture diagrams, and state transition diagrams.

The method of hierarchical layout was first elaborated by Kozo Sugiyama in 1981, so it is often called Sugiyama layout. Here we talk about the idea of Sugiyama layout.

purpose

According to the data of the graph, an easy-to-understand hierarchy graph can be drawn automatically.

  • The layout of nodes is Hierarchical
  • Side crossings should be as few as possible – Less Crossing
  • The path of the edge can be as Straight as possible – Straight
  • The path of the edge is as short as possible – Close
  • The layout is as Balanced as possible

Basic rules for drawing

  • How to place the nodes – at each level the nodes are on the same level and do not overlap
  • How do I draw edges — each edge is drawn by a line

In this way, by layering the nodes and placing the nodes of each layer in a reasonable horizontal position, the graph can be drawn.

layout

Based on the above principles, Sugiyama breaks the layout problem of a graph into multiple steps, each of which solves a different subproblem. Sugiyama algorithm is divided into the following four steps.

Step 1: Layer nodes

  • Nodes are divided into different layers according to the direction of the edges
  • If there is an edge that spans multiple layers, add a pseudo node to each layer that passes through to make sure that each edge only connects two adjacent layers
  • As shown in figure (a)

Step 2: Reduce crossover

  • Change the order of nodes at each level to reduce edge crossing
  • Figure (a) -> (b)

Step 3: Compute node coordinates

  • On the basis of ensuring the sequence of nodes in the previous step, adjust the horizontal position of nodes to meet the above Straight, Close, Balanced principles
  • Figure (b) -> (c)

Step 4: Draw

  • Draw the locations of the nodes generated in the previous step and remove the pseudo-nodes and edges.
  • As shown in figure (c)

The specific algorithm for each step

It may seem like a simple 4 steps, but the implementation of each step is not simple, and there are many different algorithms.

The node hierarchy

How do you divide the nodes into different layers? Here are some algorithms.

  • Longest Path algorithm

The first thing that probably comes to mind is the longest path algorithm. That is, the hierarchy of a node is equal to the longest path required to reach it. The advantage of the longest path algorithm is that it is very fast and can be layered by traversing the graph. However, it has obvious problems: nodes are placed at the lowest possible level, leaving a lot of white space at the bottom of the graph, and there may be many long edges. However, because of its high speed, it is often used for hierarchical preprocessing, and then handed over to other algorithms for optimization.

The result of a longest path algorithm, cut from D3-DAG

  • Tight Tree

Compact tree algorithm is an optimization method to reduce the number of long edges. As you can see from the name, “tight”, that is, adjust the layering of nodes so that more sides become “tight edges”.

Main ideas:

  • The premise
  • Each edge needs to be given a minimum length, say 1.
  • Calculate the relaxation delta: the length of the edge – the minimum length. If the relaxation is 0, it is a tight edge.
  • Add all the compact edges and nodes to the compact spanning tree
  • Take the edge with the least slack each time
  • If the source node is already in the compact tree, move the target node up the Delta level
  • If the source node is not in the compacting tree, move the source node down the Delta level
  • Loop until all nodes are contained in the compact tree
  • Network Simplex
  • Purpose: Reduce the total length of edges (and thus the number of pseudo nodes)
  • Ideas:
  • The graph is represented by a compact spanning tree and the hierarchy of nodes is obtained. The goal is to find an optimal spanning tree.
  • Cut value: In the spanning tree, if an edge is removed, the graph is divided into two parts. The cut value is the number of edges that all source parts point to target parts minus the number of edges that target parts point to source parts.
  • In general, the total length of the edge can be reduced by extending the edge with a negative cut value as long as possible
  • On the basis of the compact spanning tree, the cutting value of each edge is calculated, the edge with negative cutting value is removed, and another edge is found to build a better spanning tree until the cutting value of all edges is non-negative. As shown in Figure (a), the solid line is the current spanning tree, and the number on the edge is the cutting value. The cutting value of G ->h is negative. After it is removed from the spanning tree, a better spanning tree of Figure (b) is obtained.

Results obtained by a Network Simplex algorithm, truncated from D3-DAG

Reduce the cross

How to draw a graph is aesthetically and intelligibly subjective, but the standard of minimizing edge crossing is widely accepted.

Since the nodes have been layered, the ordinate has been determined, and each long edge has been divided into short edges between adjacent layers by adding pseudo-nodes, the problem of reducing edge crossing becomes how to sort nodes in the layer.

However, it has been shown that even in the simplest case, dealing only with intersections between two layers, the problem is still NP hard. Therefore, many heuristic algorithms have been proposed to solve this problem, and the classic ones are the barycentric algorithm and the center algorithm.

  • The center of gravity method (Barycenter)
  • The basic principle is that the more vertical edges there are in a hierarchy, the less they cross
  • Sweep layer by layer from top & bottom up, sorting layer I to reduce crossover, assuming the order is fixed in layer I -1 each time
  • The position of each node in layer I is determined by the average value of the positions of all adjacent nodes on layer I-1, which is the center of gravity of the node. In each layer, the center of gravity is ordered from smallest to largest.
  • Center method (Median)
  • Similar to barycentric positioning, except that it uses the position of the median adjacent node to calculate the sorting value.

Calculate the coordinate

According to the principle mentioned above, the main purpose of calculating the coordinates of nodes of each layer is to make the edges as vertical as possible, the layout of nodes as balanced as possible, and the long edges as straight as possible.

Sugiyama proposed a Quadratic Programming algorithm and a priority-based heuristic algorithm (neither of which I understood). Since then, many researchers have proposed different algorithms.

Here is a quick and simple heuristic algorithm developed by Brandes & Kopf used in Dagre that computes coordinates in linear time. (Although this algorithm seems to cause some exceptions, see issue)

  • If possible, align the nodes with their midbit neighbors to reduce the length of the edges and balance the layout.

  • steps
  • Vertical alignment.
  • Try to align each node with its upper-level or lower-level median adjacency.
  • There can be conflicts during alignment – that is, when edges cross or join to the same node. Left-first or right-first alignment is adopted, and subsequent conflicting nodes are ignored. Figure (a) -> (b) is an example of upaligned, left-first.
  • Horizontal compression.
  • Place the aligned nodes at the same horizontal coordinates, as shown in Figure (b) -> (c)
  • Compress in the horizontal direction to make the node as close as possible, as shown in Figure (c) -> (d).
  • The first two steps can be aligned up or down, and left or right precedence when resolving conflicts. Therefore, the first two steps are performed in each of the four directions (upper left, lower left, upper right, and lower right), and the final result is balanced by the four results.

Other consideration

During the actual layout process, there may be more points to consider. Here are some examples for your reference, I will not expand in detail.

  • Incremental layout
  • When adding nodes and edges to the diagram, try to keep the original layout unchanged.
  • Figure of nesting
  • Sometimes the nodes of a graph have nested relationships

A nested graph from ELK

  • The edge of the label
  • There may be some descriptive text on the edge, and the label needs to be taken into account when calculating the position
  • The edge of the illustration
  • In the Sugiyama algorithm described above, edges are drawn straight (long edges are polylines). In real scenarios, edges can be displayed as curves, or Orthogonal horizontal/vertical lines, etc. Different edges require different algorithms to deal with them.
  • Ports of nodes
  • In the above algorithm, nodes are regarded as points and edges are directly connected to nodes. However, in practical applications, some ports are often defined for nodes, that is, edges must be connected to or out of a certain port.

Implement your own layout

Sugiyama layout provides a basic framework for the hierarchical layout of graphs. However, requirements in actual business scenarios may have different focuses, and existing layout methods may not meet the requirements, so we need to implement our own layout.

Because the Sugiyama layout has broken down the layout into different steps and subproblems, it is flexible. It can be adjusted to use different algorithms at different steps to achieve the purpose. For example, in the flow chart below, the relationship between edges is not complicated and there is no long edge. Even if the edges are crossed, it will not have a big impact. The focus of the display may be the centralization effect of the whole graph.

A simple look at a JS implementation – Dagre

Dagre is a JS library that realizes the hierarchical layout. G6 is a direct reference to dagre to achieve the hierarchical layout.

The layout process is clearly written in the runLayout method, and although it looks like there are 27 steps, the overall idea is the same as above: layering, reducing intersections, and calculating coordinates. Dagre also adds preprocessing such as self-looping and de-looping, as well as support for other capabilities mentioned above such as edge label and nested layout.

function runLayout(g, time) { // Make space for the edge label
time("makeSpaceForEdgeLabels".function() { makeSpaceForEdgeLabels(g); }); Function () {removeSelfEdges(g); }); // If there is a loop, remove the loop by reversing some of the looped edges
time("acyclic".function() { acyclic.run(g); }); // The layout of the nested graph
time("nestingGraph.run".function() { nestingGraph.run(g); }); // Step 1
time("rank".function() { rank(util.asNonCompoundGraph(g)); }); // Generate a virtual node in the middle of the edge that represents the location of the label
time("injectEdgeLabelProxies".function() { injectEdgeLabelProxies(g); }); // removeEmptyRanks time("removeEmptyRanks", function() {removeEmptyRanks(g); }); // Remove the virtual root node and virtual edge
time("nestingGraph.cleanup".function() { nestingGraph.cleanup(g); }); // standardize the rank value
time("normalizeRanks".function() { normalizeRanks(g); }); // Find the maximum and minimum levels of the node group
time("assignRankMinMax".function() { assignRankMinMax(g); }); // Remove the label virtual node of the edge
time("removeEdgeLabelProxies".function() { removeEdgeLabelProxies(g); }); // Split the long edge into edges of length 1 and insert the virtual node
time("normalize.run".function() { normalize.run(g); });
time("parentDummyChains".function() { parentDummyChains(g); });
time("addBorderSegments".function() { addBorderSegments(g); }); // Step 2
time("order".function() { order(g); }); // backfill self-loop
time("insertSelfEdges".function() { insertSelfEdges(g); }); // Adjust the coordinate system layout up/down/left/right
time("adjustCoordinateSystem".function() { coordinateSystem.adjust(g); }); // Step 3. Locate the node
time("position".function() { position(g); });
time("positionSelfEdges".function() { positionSelfEdges(g); });
time("removeBorderNodes".function() { removeBorderNodes(g); }); // Remove the long inserted virtual node
time("normalize.undo".function() { normalize.undo(g); });
time("fixupEdgeLabelCoords".function() { fixupEdgeLabelCoords(g); });
time("undoCoordinateSystem".function() { coordinateSystem.undo(g); });
time("translateGraph".function() { translateGraph(g); }); // Compute the intersection of edges and nodes
time("assignNodeIntersects".function() { assignNodeIntersects(g); }); // Inverts the edge to be reversed in the loop removal phase
time("reversePoints".function() { reversePointsForReversedEdges(g); });
time("acyclic.undo".function() { acyclic.undo(g); }); }


Copy the code

conclusion

  • This paper mainly introduces the basic ideas, steps and related algorithms of graph hierarchical layout
  • Hierarchical layout is mainly divided into four steps: node layer, reduce cross, coordinate calculation, drawing. Or 5 steps (including first converting a cyclic graph to an acyclic graph through a loop removal algorithm)
  • Each step is a complex sub-problem, and many heuristic algorithms are proposed to solve related problems. Based on specific business scenarios, you can also use your own algorithm to solve the problem.

References

  • Github – dagre/dagre
  • Github – d3-dag
  • G6
  • Eclipse Layout Kernel
  • Wiki – Layered Graph Drawing
  • Diagram Visual diagram layout
  • Read the Dagre layout algorithm in depth
  • Methods for Visual Understanding of Hierarchical System Structures
  • A Technique for Drawing Directed Graphs
  • Fast and Simple Horizontal Coordinate Assignment
  • Layout of Compound Directed Graphs
  • An Efficient Implementation of Sugiyama’s Algorithm for Layered Graph Drawing
  • Port Constraints in Hierarchical Layout of Data Flow Diagrams
  • Improved Vertical Segment Routing for Sugiyama Layouts