preface
Depth First Search (DFS) and Breath First Search (Breath First Search) are two very important algorithms in graph theory. They are widely used in topology sorting, pathfinding (mazes), Search engines, crawlers, etc. It’s also used a lot in Leetcode, a lot of interview questions.
This article will be from the following aspects to tell about the depth first traversal, breadth first traversal, I think we will see the harvest.
- Depth-first traversal, breadth-first traversal brief
- Problem sets drill
- DFS, BFS application in search engine
Depth-first traversal, breadth-first traversal brief
Depth-first traversal
The main idea of depth-first traversal is to start from an unvisited vertex V in the graph, follow a path all the way to the end, then back to the previous node at the end of the path, and start from another path to the end… , recursively repeat this process until all vertices have been traversed. It is characterized by not hitting the south wall and not turning back. It first goes one way and then moves on another way.
Trees are a special case of graphs (connected acyclic graphs are trees), so let’s see how trees can be traversed using depth-first traversal.
1. We traverse from the root node 1, whose adjacent nodes are 2, 3, and 4. We traverse node 2 first, then child node 5 of 2, and then child node 9 of 5.
Has walked exactly 2, above a road (9 is the leaf nodes, and no traverse the nodes), at this time is from 9 backs up a node 5, see whether there is other than the 9 5 nodes, no continue to back to 2, 2, and no other than the 5 nodes, back to 1, 1 have a node other than the 2 3, Therefore, depth-first traversal is carried out from node 3, as follows
3. Similarly, from 10 to 6, there is no child node except 10 in 6. If you go back up again, you find that 3 has child point 7 except 6, so 7 will be traversed at this time
3. Backtracking from 7 to 3 and 1, it is found that node 4 of 1 has not been traversed, so the traversal is carried out along 4 and 8 at this time, and the traversal is completed
The complete nodes are traversed in the following order (represented by the blue numbers on the nodes)
I believe that it is not difficult for you to see the above traversal to find that this is the pre-order traversal of the tree. In fact, whether it is pre-order traversal, middle-order traversal, or post-order traversal, all belong to depth-first traversal.
So how to implement depth-first traversal, there are two forms of recursion and non-recursion, let’s take a binary tree as an example to see how to implement depth-first traversal using recursion and non-recursion respectively.
1, recursive implementation
Recursive implementation is relatively simple, because it is a front-order traversal, so we traverse the current node, the left node, the right node, for left and right nodes, traversal their left and right nodes, and so on, until the leaf node (recursive termination condition), the code is as follows
public class Solution {
private static class Node {
/** * Node value */
public int value;
/** ** left node */
public Node left;
/** ** right node */
public Node right;
public Node(int value, Node left, Node right) {
this.value = value;
this.left = left;
this.right = right; }}public static void dfs(Node treeNode) {
if (treeNode == null) {
return;
}
// Iterate over the node
process(treeNode)
// Iterate over the left node
dfs(treeNode.left);
// Iterate over the right nodedfs(treeNode.right); }}Copy the code
Recursion is expressive and easy to understand, but if the hierarchy is too deep, it can easily lead to stack overflow. So let’s focus on non-recursive implementations
2. Non-recursive implementation
Observing the characteristics of depth-first traversal carefully, for binary trees, since it is sequential traversal (traversing the current node first, then traversing the left node, then traversing the right node), we have the following idea
- For each node, the current node is traversed first, then the right node is pressed, then the left node is pressed (so that the left node is traversed first when the stack is flipped, which meets the depth-first traversal requirements).
- Pop the stack and get the node at the top of the stack. If the node is not empty, repeat step 1. If the node is empty, the traversal is finished.
Let’s use the following binary tree as an example to see how to implement DFS using stacks.
The overall GIF is shown below
The overall idea is still quite clear, use the stack to push the node to be traversed, and then check whether there are any untraversed nodes after the stack, if there are, push the stack, if there is no continuous backtracking (out of the stack), with the idea, it is not difficult to write the following binary tree depth first traversal code:
/** * use stack to implement DFS *@param root
*/
public static void dfsWithStack(Node root) {
if (root == null) {
return;
}
Stack<Node> stack = new Stack<>();
// Push the root node first
stack.push(root);
while(! stack.isEmpty()) { Node treeNode = stack.pop();// Iterate over the node
process(treeNode)
// Press the right node first
if(treeNode.right ! =null) {
stack.push(treeNode.right);
}
// press the left node again
if(treeNode.left ! =null) { stack.push(treeNode.left); }}}Copy the code
As you can see, using a stack to implement depth-first traversal is not complicated code, and you don’t have to worry about stack overflow because of the deep level of recursion.
Breadth-first traversal
Breadth-first traversal refers to starting from an untraversed node of the graph, traversing the adjacent nodes of the node first, and then traversing the adjacent nodes of each adjacent node.
The breadth-first traversal GIF of the tree mentioned above is as follows, and the value of each node is their traversal order. Therefore, breadth-first traversal is also called sequential traversal. It traverses the first layer (node 1), then the second layer (node 2,3,4), the third layer (node 5,6,7,8), and the fourth layer (9,10).
Depth-first traversal uses stacks, while breadth-first traversal uses queues. Let’s take the following binary tree as an example to see how to use queues for depth-first traversal
Diagram below
I believe that seeing the above GIF, it is not difficult to write the following code
/** * use queue to implement BFS *@param root
*/
private static void bfs(Node root) {
if (root == null) {
return;
}
Queue<Node> stack = new LinkedList<>();
stack.add(root);
while(! stack.isEmpty()) { Node node = stack.poll(); System.out.println("value = " + node.value);
Node left = node.left;
if(left ! =null) {
stack.add(left);
}
Node right = node.right;
if(right ! =null) { stack.add(right); }}}Copy the code
Problem sets drill
Let’s take a look at some of the problems in LeetCode that use DFS and BFS to solve problems:
Leetcode 104,111: Given a binary tree, find its maximum/minimum depth.
For example, given a binary tree [3,9,20,null,null,15,7],
3
/ \
9 20
/ \
15 7
Copy the code
It has a minimum depth of 2 and a maximum depth of 3
How to calculate the depth of the left and right subtrees? For each recursive call, the depth is increased by one. It is not difficult to write the following code
/** * leetcode 104: Find the maximum depth of the tree *@param node
* @return* /
public static int getMaxDepth(Node node) {
if (node == null) {
return 0;
}
int leftDepth = getMaxDepth(node.left) + 1;
int rightDepth = getMaxDepth(node.right) + 1;
return Math.max(leftDepth, rightDepth);
}
/** * leetcode 111: Find the minimum depth of tree *@param node
* @return* /
public static int getMinDepth(Node node) {
if (node == null) { return 0; }
if (node.left == null) { return 1 + getMinDepth(node.right); }
if (node.right == null) { return 1 + getMinDepth(node.left); }
int leftDepth = getMinDepth(node.left);
int rightDepth = getMinDepth(node.right);
return 1 + Math.min(leftDepth, rightDepth);
}
Copy the code
Leetcode 102: Given a binary tree, return the value of the node traversed in order. (That is, layer by layer, all nodes are accessed from left to right). Example, given a binary tree: [3,9,20,null,null,15,7]
3
/ \
9 20
/ \
15 7
Copy the code
Returns the result of its hierarchical traversal:
[[3], [9,20], [15,7]Copy the code
Answer: Clearly the problem is a variation of breadth-first traversal, just in the process of breadth-first traversal, each layer of the nodes are added to the same array, the key is to traverse the same layer node before, must first work out how much the same layer node has, or because of BFS is queue, traverse is constantly in the process of the child nodes around the team, If you do not calculate the nodes of the same layer in advance, the left and right nodes of the same layer will be added to the nodes of the same layer, remember this! Diagram below
According to the above GIF ideas, it is not difficult to get the code as follows:
Java code
/** * leetcdoe 102: Binary tree sequence traversal, using BFS *@param root
*/
private static List<List<Integer>> bfsWithBinaryTreeLevelOrderTraversal(Node root) {
if (root == null) {
// If the root node is empty, the binary tree does not exist
return Arrays.asList();
}
// The final sequence traversal result
List<List<Integer>> result = new ArrayList<>();
Queue<Node> queue = new LinkedList<>();
queue.offer(root);
while(! queue.isEmpty()) {// Record each layer
List<Integer> level = new ArrayList<>();
int levelNum = queue.size();
// Traverses the nodes of the current layer
for (int i = 0; i < levelNum; i++) {
Node node = queue.poll();
// The left and right children of the first node join the team. Since levelNum is calculated before joining the team, the left and right children of the first node are not traversed at the current level
if(node.left ! =null) {
queue.add(node.left);
}
if(node.right ! =null) {
queue.add(node.right);
}
level.add(node.value);
}
result.add(level);
}
return result;
}
Copy the code
Python code:
class Solution:
def levelOrder(self, root) :
""" :type root: TreeNode :rtype: List[List[int]] """
res = [] # nested list, save the final result
if root is None:
return res
from collections import deque
que = deque([root]) # queue, save nodes to be processed
while len(que)! =0:
lev = [] # list, which holds the values of the nodes of this layer
thislevel = len(que) # Number of nodes at this layer
whilethislevel! =0:
head = que.popleft() # eject queue head node
# The left and right children of the first node join the team
if head.left is not None:
que.append(head.left)
if head.right is not None:
que.append(head.right)
lev.append(head.val) # Push the value of the first node into this layer
thislevel-=1
res.append(lev)
return res
Copy the code
This is obvious with BFS, but it could also be done with DFS, and it would be a big plus if you could handle it in an interview.
DFS can be implemented recursively by adding a “layer” of variables to the recursive function. As long as the node belongs to this layer, the node is placed in an array of the corresponding layer. The code is as follows:
private static final List<List<Integer>> TRAVERSAL_LIST = new ArrayList<>();
/** * leetcdoe 102: Binary tree sequence traversal using DFS *@param root
* @return* /
private static void dfs(Node root, int level) {
if (root == null) {
return;
}
if (TRAVERSAL_LIST.size() < level + 1) {
TRAVERSAL_LIST.add(new ArrayList<>());
}
List<Integer> levelList = TRAVERSAL_LIST.get(level);
levelList.add(root.value);
// iterate over the left node
dfs(root.left, level + 1);
// iterate over the right node
dfs(root.right, level + 1);
}
Copy the code
DFS, BFS application in search engine
We use search engines like Google and Baidu almost every day. Do you know how these search engines work? There are three simple steps
1, web crawling
Search engine will crawl the web page through the crawler, the page HTML code stored in the database
2. Pretreatment
Indexing procedures to crawl to the page data for text extraction, Chinese word segmentation, (inverted) index processing, for use by ranking procedures
3, ranking
After a user enters a keyword, the ranking program calls the indexed database data, calculates relevance, and then generates a page of search results in a certain format.
Let’s focus on the first step, web scraping.
The general operation of this step is as follows: assign a group of initial web pages to the crawler, and we know that the web page actually contains many hyperlinks. After crawler crawls a web page, it analyzes and extracts all the hyperlinks in the web page, and then crawls and extracts these hyperlinks, and then extracts the hyperlinks… “, so that you can continue to extract pages based on hyperlinks. The following chart
As shown above, you end up with a graph, and the question then becomes how to traverse the graph, obviously in a depth-first or breadth-first way.
If it is breadth-first traversal, first climb the first layer of the start page, then climb the hyperlinks in each page, if it is depth-first traversal, first climb the start page 1, then climb the links in this page… , then climb to the start page 2…
In fact, crawler is used together with depth-first and breadth-first strategies. For example, in the initial web page, if some pages are important (with high weight), depth-first traversal will be performed on this page first, and then breadth-first traversal will be performed on other (with the same weight) initial web pages after traversal.
conclusion
DFS and BFS are two very important algorithms, you must master, in order to explain the convenience of this article, only for trees DFS, BFS, you can try to use graph code, the principle is the same, but graph and tree representation is different, DFS is generally to solve the connectivity problem, BFS is generally used to solve the shortest path problem. Later, we will study together and look up sets, Dijkstra, Prism algorithm, etc. Please look forward to it!