Understand and implement trees

So far, our exploration of data structures has only touched on the linear part. Whether we use arrays, linked lists, stacks, or queues, we’re all linear data structures. We have seen the complexity of linear data structure operations, and most of the time, the complexity of insert and delete can be expressed as O(1). Search is a little bit complicated, order n. The only exception is PHP arrays, which are actually hash tables, and can achieve O(1) complexity if indexes or keys are managed in such a way. To solve this problem, we can use hierarchical data structures instead of linear data structures. Hierarchical data can well solve many problems that linear data structures are difficult to solve.

Whenever we talk about family trees, organizational structures, and network connection graphs, we’re really talking about hierarchical data. Trees are a special abstract data type (ADT). Unlike linked lists, trees are layered.

Definition and characteristics of trees

A tree is a layered collection of nodes or vertices connected by edges. Trees cannot have loops, and only edges exist between nodes and their descendents or children. Two children of the same parent cannot have any edges between them. Each node can have a parent unless it is the top node, also known as the root node. Each tree can have only one root node. Each node can have zero or more child nodes. In the figure below, A is the root node and B, C, and D are the children of A. We can also say that A is the parent of B, C, and D. B, C, and D are called siblings because they come from the same parent node, A.

A node without any children is called a leaf. In the previous figure, K, L, F, G, M, I, and J are leaf nodes. Leaf nodes are also called external nodes or terminal nodes. A node other than the root has at least one child node, called an inner node. Here, B, C, D, E, and H are the internal nodes. Here are some other common terms used to describe the data structure of a tree:

Descendant: This is a node that can be reached from the parent node by repeating the procedure. For example, M is a descendant of C in the previous figure.

Ancestor: This is a node that can be repeated from a child node to a parent node. For example, B is the ancestor of L.

Degree: The total number of children of a particular parent node is called its degree. In our example, A has 3 degrees, B has 1 degree, C has 3 degrees, and D has 2 degrees.

Path: The sequence of nodes and edges from the source node to the destination node is called the path between two nodes. The length of a path is the number of nodes in the path. The path from A to M is a-C-h-M, and the length of the path is 4.

Node height: The height of a node is determined by the number of edges between the node and the deepest node. For example, the height of node B is 2.

Hierarchy: Hierarchy represents the generation of nodes. If the parent node is at level N, its children will be at level N+ 1. Thus, the hierarchy is defined by the number of edges between the node and the root.

A. in the zero layer

B, C and D are layers 1

E, F, G, H, I, J are 2 layers

K, L, M are on the third floor.

Tree height: The height of a tree is defined by the height of its root node. The height of the tree above is 3.

Subtrees: In a tree structure, each child recursively forms subtrees. In other words, a tree consists of many subtrees. For example, B and E, K and L form a subtree, E, K and L form a subtree, and they are identified in each different shadow.

Depth: The depth of a node is determined by the number of edges between the node and the root node. For example, H has a depth of 2 and L has a depth of 3.

Forest: A forest consists of a group or more trees that do not intersect.

Traversal: This represents the process of accessing nodes in a particular order.

Key: Used for searching and represents the value of a node.

Use PHP to implement trees

So far, we’ve looked at the different properties of trees. If we compare trees with real-world examples, we find that organizational structures or family trees can be represented numerically. For an organizational structure, one root node can be the CEO of the company, followed by CXO level employees, followed by other levels of employees. Here, we’re not limiting any degree of a particular node. This means that a node can have multiple children. So, here is a node structure where we can define node properties, its parent, and its children:

class TreeNode { public $data = null; public $children = []; public function __construct(string $data = null) { $this->data = $data; } public function addChildren(TreeNode $treeNode) { $this->children[] = $treeNode; }}Copy the code

We can see that we declared two public properties: data and children. We also have a method to add the child to a specific node. Here, we’re just adding new child nodes to the end of the array. Trees are recursive structures that will help us recursively build and recursively traverse the tree.

Now that we have nodes, let’s build a tree structure that defines the root node of the tree and also traverses the entire tree. Thus, the basic tree structure would look like this:

class Tree { public $root = null; public function __construct(TreeNode $treeNode) { $this->root = $treeNode; } public function traverse(TreeNode $treeNode, int $level = 0) { if ($treeNode) { echo str_repeat('-', $level) . $treeNode->data . PHP_EOL; foreach ($treeNode->children as $child) { $this->traverse($child, $level + 1); }}}}Copy the code

The previous code shows a simple tree class that can store root node references or traverse the tree from any node. In the traversal section, we visit each child node and then immediately recursively call the traversal method to get the children of the current node. We use a level to print a dash (-) at the beginning of the node name so that we can easily understand the child data.

require './TreeNode.php';

$ceo = new TreeNode('ceo');

$tree = new Tree($ceo);

$cfo = new TreeNode('cfo');
$cto = new TreeNode('cto');
$cmo = new TreeNode('cmo');
$coo = new TreeNode('coo');

$ceo->addChildren($cfo);
$ceo->addChildren($cto);
$ceo->addChildren($cmo);
$ceo->addChildren($coo);

$seniorArchitect = new TreeNode("Senior Architect");
$softwareEngineer = new TreeNode("SoftwareEngineer");

$userInterfaceDesigner = new TreeNode("userInterface Designer");
$qualityAssuranceEngineer = new TreeNode("qualityAssurance Engineer");


$cto->addChildren($seniorArchitect);
$seniorArchitect->addChildren($softwareEngineer);

$cto->addChildren($userInterfaceDesigner);
$cto->addChildren($qualityAssuranceEngineer);

$tree->traverse($tree->root);Copy the code

The final output looks something like this, and the full code can be seen here

Different types of trees

There are many different types of trees in the code world, so let’s take a look.

Binary tree

A binary tree is a basic tree structure with a maximum of two children per node.

Binary search tree

A binary search tree (BST) is a special type of binary tree in which nodes are stored in sorted order, i.e. at any given point the node value must be greater than or equal to the value of the left child and less than the value of the right child. Every node must satisfy this property, which is called a binary search tree. Binary search tree algorithm is always better than linear search, and its time is O(n), which we will explain in more detail later.

Self-balanced binary tree

Self-balanced binary search tree or highly balanced binary search tree is a special type of binary search tree that tries to keep the height or level of the tree as small as possible by automatic adjustment. The binary search tree is shown on the left, and the self-balanced binary search tree is shown on the right:

A highly balanced binary tree is always a better choice because it facilitates faster search operations than regular BST. There are different implementations of self-balanced or highly balanced binary search trees. Some common ones are as follows:

  • AA tree
  • AVL tree
  • Red and black tree
  • Scapegoat tree
  • octree
  • 2-3 tree
  • Treap

We’ll cover them in the future, so stay tuned.

More content

PHP Basic Data Structures topic series directory: address. The basic data structures and algorithms are summarized using PHP syntax. There are also some basic knowledge that we tend to ignore in daily PHP development and some practical advice on specification, deployment and optimization in modern PHP development, as well as in-depth research on the characteristics of Javascript language.