This article is an original work, first published in the wechat public account: [Mr. Sakamoto]. If you need to reprint, please mark “Reprinted in the wechat public account: [Mr. Sakamoto]” at the beginning of the article. Otherwise, it will be held liable.

Actual combat algorithm – multi-fork tree full path traversal

preface

This paper studies how to traverse the full path of a multi-fork tree and output the full path result. This problem can be studied in the Trie tree to view all dictionary values. This paper will conduct detailed simulation and code implementation of this problem, and discuss the advantages and disadvantages of recursive and non-recursive methods and implement them respectively. If readers are not interested in the advantages and disadvantages of these two methods, they can directly jump to the problem construction chapter for reading. The article is longer, and I recommend you to collect it first and then read it.

The article directories

  • Multifork tree full path traversal
    • preface
    • The article directories
    • Recursive and iterative comparison
      • recursive
      • The iteration
      • Disadvantages and advantages of recursion
        • Disadvantages of recursion
        • The advantage of recursion
    • The problem to build
    • Problem solving
      • Recursive method
      • An iterative approach
    • conclusion
    • The resources

Recursive and non-recursive comparison

The problem on zhihu there have been many answers (www.zhihu.com/question/20…

recursive

To decompose a problem into a number of relatively small problems, the recursive exit is returned to the original path, so the relevant intermediate values must be saved, and these intermediate values are pushed to save, when the scale of the problem is large, it takes up a lot of memory.

non-recursive

The execution efficiency is high, the running time increases only because the number of cycles increases, no extra overhead. There’s no increase in space, right

Disadvantages and advantages of recursion

Disadvantages of recursion

  • Recursion is prone to “stack overflow” errors. Recursion is very memory intensive because you need to keep hundreds or thousands of call records at the same time.

  • In terms of efficiency, recursion may have redundant calculations. There are redundant computations using recursion (for example, Fibonacci sequence is the most typical, computations of Fibonacci sequence 6 require computations of 4 and 5, and computations of 5 require computations of 4, which are repeated). Iteration has an absolute advantage in this respect.

The advantage of recursion

Recursion has better code readability and is more than sufficient for operations with less data.

The problem to build

Now there is a multi-fork tree whose nodes are shown in the following figure. It is necessary to give a method to output all paths of leaf nodes.

The final output should be 5, i.e. [RAD, RAC, RBE, RBF, RG]

Problem solving

First we analyze the nodes and build a node class (TreeNode). Then we need to have a MultiTree class (MultiTree), which contains the method of full-path printing. Finally, we need to set up a Main method to test. The final project structure is as follows:

Note: This article uses Lombok annotations, leaving out the implementation of get, set, and related methods. If you have never used Lombok, you can write your own get and set methods. The following sections explain how each class needs to be implemented without affecting the core code.

TreeNode class

The node class contains two fields:

  • Content: Stores the content of the current node
  • Childs: Used to store information about child nodes. The string of HashMap stores the content of child nodes. Childs is implemented using HashMap to facilitate quick lookup of child nodes

This class contains the necessary get, set methods, a no-parameter constructor, and a full-parameter constructor

@Data
@RequiredArgsConstructor
@AllArgsConstructor
public class TreeNode {
    private String content;
    private HashMap<String,TreeNode> childs;
}
Copy the code

MultiTree class

Contains only two fields:

  • Root: indicates the root node
  • PathList: Used to store paths obtained during traversal

Constructor in this class I manually create the tree in the problem build with the following code:

    public MultiTree(a){
        // Create root node
        HashMap rootChilds = new HashMap();
        this.root = new TreeNode("r",rootChilds);

        // Layer 1 child node
        HashMap aChilds = new HashMap();
        TreeNode aNode = new TreeNode("a",aChilds);

        HashMap bChilds = new HashMap();
        TreeNode bNode = new TreeNode("b",bChilds);

        HashMap gChilds = new HashMap();
        TreeNode gNode = new TreeNode("g",gChilds);

        // Layer 2
        HashMap dChilds = new HashMap();
        TreeNode dNode = new TreeNode("d",dChilds);

        HashMap cChilds = new HashMap();
        TreeNode cNode = new TreeNode("c",cChilds);

        HashMap eChilds = new HashMap();
        TreeNode eNode = new TreeNode("e",eChilds);

        HashMap fChilds = new HashMap();
        TreeNode fNode = new TreeNode("f",fChilds);

        // Establish a connection
        rootChilds.put("a",aNode);
        rootChilds.put("b",bNode);
        rootChilds.put("g",gNode);

        aChilds.put("d",dNode);
        aChilds.put("c",cNode);

        bChilds.put("e",eNode);
        bChilds.put("f",fNode);
    }
Copy the code

In this tree, each node has childs, and if it’s a leaf node, then the size in childs is 0, which is an important basis for determining whether a node is a leaf node or not and then we’re going to implement the core algorithm code.

The Main class

public class Main {
    public static void main(String[] args) {
        MultiTree tree = newMultiTree(); List<String> path1 = tree.listAllPathByRecursion(); System.out.println(path1); List<String> path2 = tree.listAllPathByNotRecursion(); System.out.println(path2); }}Copy the code

Recursive method

The listAllPathByRecursion and listPath methods in the MultiTree class need to be perfected

Recursive procedure method: listAllPathByRecursion

The algorithm flow chart is as follows:

The code implementation is as follows:

public void listPath(TreeNode root,String path){

    if(root.getChilds().isEmpty()){// Leaf node
        path = path + root.getContent();
        pathList.add(path); // Save the result in the list
        return;
    }else{ // Non-leaf nodes
        path = path  + root.getContent() + "- >";

        // Perform recursion on child nodes
        HashMap<String, TreeNode> childs = root.getChilds();
        Iterator iterator = childs.entrySet().iterator();
        while(iterator.hasNext()){ Map.Entry entry = (Map.Entry)iterator.next(); TreeNode childNode = (TreeNode) entry.getValue(); listPath(childNode,path); }}}Copy the code

Recursive call method: listAllPathByRecursion

public List<String> listAllPathByRecursion(a){
    // Clear the path container
    this.pathList.clear();
    listPath(this.root,"");
    return this.pathList;
}
Copy the code

Non-recursive methods

The amount of code in a non-recursive method is simply too much compared to the amount of code in a recursive method, and the content is not easy to understand. I don’t know if you can understand the code I wrote, but I have done my best to write relevant comments.

Firstly, two stacks are established, the schematic diagram is as follows. The realization of stack uses Deque, and it is necessary to pay attention to the null pointer situation in the code.

  • Main stack: storage used to process nodes and temporary paths. If the main stack is empty, node processing is complete

  • Secondary stack: stores nodes to be processed. If the secondary stack is empty, the node traversal is complete

Other related variables:

  • PopCount: Stores the number of pop-ups of children of a node. For example, r has three child nodes. If the number of ejects corresponding to R is 3, it indicates that the leaf nodes of R are processed and r can be ejected. Since r pops up, there are no elements in the main stack, so the processing is complete.
  • When the main stack element changes, the curString will also change. For example, in the figure above, the curString is “r->g->”. When the top stack element pops up, the “G ->” must be subtracted. If the top element is a leaf node, the path has been traversed and needs to be added to the path container.

Program flow chart:

The specific implementation code is as follows:

/** * non-recursive methods print all paths */
public List<String> listAllPathByNotRecursion(a){
    // Clear the path container
    this.pathList.clear();
    // main stack, used to calculate the processing path
    Deque<TreeNode> majorStack = new ArrayDeque();
    // Secondary stack, used to store nodes to be processed
    Deque<TreeNode> minorStack = new ArrayDeque();
    minorStack.addLast(this.root);

    HashMap<String,Integer> popCount = new HashMap<>();
    String curString  = "";

    while(! minorStack.isEmpty()){// exit the substack, enter the main stack
        TreeNode minLast = minorStack.pollLast();
        majorStack.addLast(minLast);
        curString+=minLast.getContent()+"- >";
        // Add the children of this node to the stack
        if(! minLast.getChilds().isEmpty()){ HashMap<String, TreeNode> childs = minLast.getChilds(); Iterator iterator = childs.entrySet().iterator();while(iterator.hasNext()){ Map.Entry entry = (Map.Entry)iterator.next(); TreeNode childNode = (TreeNode) entry.getValue(); minorStack.addLast(childNode); }}/ / the main stack
        TreeNode majLast = majorStack.peekLast();
        // Loop condition: the top of the stack is a leaf node or the top of the stack is traversed.
        while(majLast.getChilds().size() ==0|| (popCount.get(majLast.getContent())! =null && popCount.get(majLast.getContent()).equals(majLast.getChilds().size()))){

            TreeNode last = majorStack.pollLast();
            majLast = majorStack.peekLast();

            if(majLast == null) {// The main stack is empty
                return this.pathList;
            }
            if(popCount.get(majLast.getContent())==null) {// Set the number of times the child node is ejected for the first time to 1
                popCount.put(majLast.getContent(),1);
            }else{ // Not the first time to pop up the child node, add 1 to the original base
                popCount.put(majLast.getContent(),popCount.get(majLast.getContent())+1);
            }
            String lastContent = last.getContent();
            if(last.getChilds().isEmpty()){// Add the result to the pathset only if it is a leaf node
                this.pathList.add(curString.substring(0,curString.length()-2));
            }
            // Adjust the current curString, subtract 2 is the minus "->" symbol
            curString = curString.substring(0,curString.length()-lastContent.length()-2); }}return this.pathList;
}
Copy the code

test

Call the Main method in the Main class, and get the result, as expected, the code passes the test

listAllPathByRecursion[r->a->c, r->a->d, r->b->e, r->b->f, r->g]
listAllPathByNotRecursion[r->g, r->b->f, r->b->e, r->a->d, r->a->c]
Copy the code

conclusion

In fact, this article is an intermediate product of my research on “Realization of Sensitive Word Filtering Algorithm based on Trie Tree”. In fact, it should have realized the path traversal problem of multiple forks tree originally, but due to time reasons and the original lack of a good knowledge management system, the code and notes are lost. Today, I take the opportunity to summarize again. I hope this article can help those in need.

The resources

  • What is the difference between recursion and iteration? – Answer by Ye Shiqing – Zhihu
  • How does recursion become non-recursion