preface
The Trie tree is a data structure that is not very common but is very useful and one of the data structures that the blogger recently learned about was doing a few bytes of problems. And will share their learning content with everyone.
define
First, what is a Trie tree? As the name implies, when querying the target, access the nodes of the tree in the same order as the dictionary according to certain criteria and steps. For a simple example, if you look up the word “He” in the English dictionary, you must first find the letter h in the order of A-Z, and then find the letter E in the same order. A dictionary tree is a structure like a dictionary.
The process of finding the word mentioned above is to continuously find the string with the same prefix as the word to be looked up until the last letter of the word to be looked up. Therefore, a dictionary Tree is also called a prefix Tree.
The nature of the
Use hell, hi, new, and NOp as examples to create a dictionary tree
The structural properties of the dictionary tree can be obtained according to the above
- The root node of the Trie tree does not store any characters. Each node except the root node stores only one character
- The string formed on the path from the root node to a node is the corresponding string of the node
- The child nodes of each node store different characters
Construct the dictionary tree according to the above three points.
The construction of a dictionary tree
The construction of a dictionary tree is relatively simple, not much different from the construction of a normal tree. The following will be the dictionary tree insertion, deletion, query operations are analyzed and explained.
The insert
In the absence of a dictionary tree, we need to build a dictionary tree first.
For example, insert hell:
- Check to see if ‘h’ exists in the children of the root node. If no, insert a node with ‘h’. If yes, continue to access ‘h’ node
- Continue to the ‘h’ node, check to see if ‘e’ exists in the ‘h’ node, if not insert a node with ‘e’, and so on
- Until all characters in the string to be inserted are inserted, the last node is marked with a word tag, indicating that a word is reached from the root node to this point
Then insert the word hit, the process is as follows: check = > exists, access/not exist, create a new node and then access = > until the word to be inserted reaches the last character.
The insertion operation of the dictionary tree is relatively simple and does not require much sorting.
Query operation
As stated above, in accordance with the standards of a certain query string, each node to store one character at a time, the root node to child nodes path string is the node of the string, the query target string according to the corresponding character in step by step from the root node access nodes, actually also is the process of matching string prefix.
In the dictionary tree, look for “hell”,
- Select ‘h’ from ‘root’ where ‘h’ is found
- Query ‘e’ from ‘h’ child node
- Query ‘l’ from ‘e’ child node
- Select ‘l’ from ‘l’ child node where ‘l’ has word marker bits, string match successful, return
If you query no in this dictionary
- Select ‘n’ from root; select ‘n’ from root;
- Select ‘o’ from ‘n’; select ‘o’ from ‘n’; ‘o’ from ‘n’
Delete operation
Delete is a bit more complex than insert and query, but it’s simple, as long as the word already exists in the dictionary tree.
The operation of removing a dictionary tree node takes into account whether the last character of the target string is a leaf node in the tree.
Because a word may be a prefix to another word, if it is not a leaf node, we only need to empty the word flag bit of the word, instead of deleting the whole “branch”.
The deleted word is in the prefix
For example, if you want to delete the word “no”
- So let’s start at the root and find the node ‘n’
- Start with ‘n’ and find the node ‘0’
- At the end of string matching, the ‘o’ node has a word flag bit. Since ‘o’ is not a leaf node at this time, the word flag bit of this node can be cleared.
Delete the entire word
For example, if you want to delete the word “hell”, you would do the same as the first deletion, but start with the last node, the ‘l’ node is the leaf node, and then go up.
Delete the word on the branch
For example, if you want to delete “hi”, you can access the leaf node ‘I ‘, delete the leaf node, and go up to ‘h’. Since ‘h’ is still not a leaf node after’ I ‘is deleted, you will not continue to delete the node.
The deleted words contain other words
For example, to delete “nop”, just like the previous ones, access the leaf node ‘p’ to delete, then move up to find that ‘o’ is a leaf node, but ‘o’ has a word marker bit, so there is no further deletion.
There are several delete operations above, and we get the delete criteria:
- If the last character of the word is a non-leaf node, only the word flag bit is cleared
- If the word is on its own, separate from other words, delete the whole branch
- If a non-leaf node is encountered during word deletion, the deletion operation ends
- If the word flag bit of another word is encountered during the word deletion, the deletion is complete
The purpose of the dictionary tree
With so many operations of the dictionary tree, I believe you have a general understanding of the purpose of the dictionary tree, the most important role of the dictionary tree is for == string matching ==, prefix matching (fuzzy search), string lookup (dictionary) and so on.
Prefix matching (fuzzy search)
The blogger typed in just four keywords, and the search list returned a number of options led by trickle
String lookup (dictionary)
Just as the name suggests, it is a simple dictionary, not many examples.
meaning
Dictionary trees are built to make the insertion and lookup of strings efficient by using the space-for-time idea and the common prefixes of strings to reduce invalid string comparisons. Its insertion or search time is O(n), where n is the length of the string.
Dictionary trees have their downsides, of course. When inserted words don’t have many common prefixes, the construction of a dictionary tree becomes complicated and inefficient.
Code implementation (Java + Go)
Java version
import java.util.*;
public class TrieTree {
class TrieNode{
public TrieNode[] subNode;
public int count;// Number of child nodes of this node
public boolean isWord;// Word marker bits
public TrieNode(a) {
this.count = 0;
this.isWord = false;
this.subNode = new TrieNode[26];// Contains only lowercase letters a to Z}}public TrieNode root;
public TrieTree(a) {
root = new TrieNode();
}
public boolean search(String word) {
TrieNode curNode = root;
int index;
for(int i = 0; i < word.length(); i++) {
index = word.charAt(i)-'a';
if(curNode.subNode[index]! =null) {
curNode = curNode.subNode[index];
}else{
return false; }}return curNode.isWord;
}
public void insert(String word) {
if(search(word)) {
System.out.println("The word already exists.");
return;
}
TrieNode node = root;
int index;
for(int i = 0; i < word.length(); i++) {
index = word.charAt(i) - 'a';
if(node.subNode[index]==null) {
node.subNode[index]= new TrieNode();
}
node.count++;
node = node.subNode[index];
}
node.isWord = true;
}
public void delete(String word) {
if(! search(word)) { System.out.println("No such word.");
return;
}
TrieNode node = root;
LinkedList<Integer> indexList = new LinkedList();
LinkedList<TrieNode> nodeList = new LinkedList();
int index;
for(int i = 0; i < word.length(); i++) {
index = word.charAt(i) - 'a';
indexList.add(index);
nodeList.add(node);
node = node.subNode[index];
}
for(int i = word.length() - 1; i >= 0; i--) {
node = nodeList.pollLast();
index = indexList.pollLast();
if(node.subNode[index].subNode == null) {
if(i ! = word.length() -1) {
if(node.subNode[index].isWord == true) {// If the prefix node has a word marker bit, then the deletion is not continued
return;
}
}
node.subNode[index] = null;
node.count--;
}
if(i == word.length()-1) {
if(node.subNode[index].subNode ! =null) {
node.subNode[index].isWord = false;
return; }}}}public static void main(String[] args) {
TrieTree myTrieTree = new TrieTree();
String[] words = {"hello"."face"."hi"."hell"."why"};
// Insert string
for(String word : words)
myTrieTree.insert(word);
// Insert duplicate string
myTrieTree.insert("hello");
// Delete the string
myTrieTree.delete("hell");
// Deduplicates the string
myTrieTree.delete("hell");
myTrieTree.delete("hi");
// Query the string, true if found, false if not found
System.out.println(myTrieTree.search("hello"));
System.out.println(myTrieTree.search("hi"));
System.out.println(myTrieTree.search("hell")); }}Copy the code
Go version
package main
import (
"fmt"
)
type TrieNode struct {
children []*TrieNode
count int
isWord bool
}
func createTrieNode(a) *TrieNode{
return &TrieNode{
children: make([]*TrieNode, 26),
count: 0,
isWord: false,}}type TrieTree struct {
root *TrieNode
}
func createTrieTree(a) *TrieTree{
root := createTrieNode()
return &TrieTree{
root: root,
}
}
type method interface {
search(word string) bool
insert(word string)
delete(word string)}func(tireTree *TrieTree) search(word string) bool{
tmpNode := tireTree.root
for _,w := range word {
index := int(w - 'a')
iftmpNode.children[index] ! =nil{
tmpNode = tmpNode.children[index]
}else{
return false}}return tmpNode.isWord
}
func(tireTree *TrieTree) insert(word string) {
if tireTree.search(word){
fmt.Println("The word already exists!")
}
tmpNode := tireTree.root
for _,w := range word {
index := int(w - 'a')
if tmpNode.children[index] == nil{
tmpNode.children[index] = createTrieNode()
tmpNode.count++
}
tmpNode = tmpNode.children[index]
}
tmpNode.isWord = true
}
func(tireTree *TrieTree) delete(word string) {
if! tireTree.search(word){ fmt.Println("No such word")
}
tmpNode := tireTree.root
var (
indexList []int
nodeList []*TrieNode
)
for _,w := range word {
index := int(w - 'a')
indexList = append(indexList, index)
nodeList = append(nodeList, tmpNode)
tmpNode = tmpNode.children[index]
}
for i := len(indexList)- 1; i >=0; i++ {
index := indexList[i]
tmpNode = nodeList[i]
if tmpNode.children[index].children == nil{
ifi ! =len(indexList) - 1 {
if tmpNode.children[index].isWord {
return
}
}
tmpNode.children[index] = nil
tmpNode.count--
}
if i == len(indexList) - 1{
iftmpNode.children[index].children ! =nil{
tmpNode.children[index].isWord = false
return}}}}func main(a){
words := []string{"hello"."face"."hi"."hell"."why"}
myTrieTree := createTrieTree()
// Insert string
for _, word := range words {
myTrieTree.insert(word)
}
// Insert duplicate string
myTrieTree.insert("hello")
// Delete the string
myTrieTree.delete("hell")
// Deduplicates the string
myTrieTree.delete("hell")
myTrieTree.delete("hi")
// Query the string, true if found, false if not found
fmt.Println(myTrieTree.search("hello"))
fmt.Println(myTrieTree.search("hi"))
fmt.Println(myTrieTree.search("hell"))}Copy the code
conclusion
The dictionary tree is not very difficult, but it is a very useful data structure that can be learned to solve some problems with string matching and common prefixes.
Of course, as we’ve said, dictionary trees have their drawbacks. Because of the space for time, when you have a bunch of words with very few common prefixes, the space requirement is very large.