preface

The Trie dictionary tree shared this time is a branch of the topic on data structure. It is helpful to know Trie data structure, which is a tree data structure, for the construction of algorithm and data structure knowledge system.

My understanding of Trie trees is that strings are concatenated, eliminating unnecessary storage, using the common prefix of the string.

In fact, for its understanding, you understand this sentence can πŸ‘‡

Using the common prefix of the string to reduce the query time, minimize unnecessary string comparison, query efficiency is higher than the hash tree.

If you don’t know what a Trie data structure is, or if you know something about it, but know how to implement a simple Trie tree, then this article is probably for you.

The Trie tree is introduced around the following points: πŸ‘‡

  • The basic concept
  • Basic properties
  • Application scenarios
  • Two sample

Contact πŸ‘‰TianTianUp. If you have any questions, you can contact the author.


The basic concept

First, we need to do some basic understanding of the Trie tree. Trie trees are called dictionary trees, prefix trees, etc., and I’m going to call them dictionary trees.

Take a look at wikipedia’s description of it ⬇️

In computer science, a trie, also known as a prefix tree or dictionary tree, is an ordered tree used to hold associative arrays in which the keys are usually strings. Unlike binary lookup trees, keys are not stored directly in a node, but are determined by the node’s position in the tree. All descendants of a node have the same prefix, which is the string corresponding to the node, and the root node corresponds to the empty string. In general, not all nodes have corresponding values. Only leaf nodes and some internal nodes have corresponding keys with relevant values.

Plain description, in fact, we can see a picture to understand ~, I found a good picture on the Internet, the specific source, here will not supplement, because it can not find the original author ~

Dictionary tree diagram 1

The point here is that, in general, a character should be represented by a dot, so I’m going to use edges to describe characters just for illustration.

As you can see, the dictionary tree uses edges to represent letters, and the path from root to a node in the tree represents a string. For example, 1β†’2β†’6 represents the string ABA.

For example, if the string formed by 1β†’4β†’8 is CA, then if we expand it further, do we have CAA and CAB, then they all go through the paths of 1β†’4β†’8, which means that they have a common prefix, the content of this prefix is CA. Here we say, We know that the dictionary tree uses the prefix of the string to solve the problem.

So what are the specific properties of it, we introduce ~ below


Basic properties

With some understanding of these concepts, let’s look at the basic properties of Trie trees.

According to this, πŸ‘‡ can be roughly divided into three points

  1. The root node contains no characters, and each node except the root node contains only one character.
  2. From the root node to a node, the characters that pass through the path are connected to the string corresponding to the node.
  3. All children of each node contain different strings.

Next, we can analyze it a little bit. We can look at πŸ‘‡ with a graph

This is what we construct by taking six strings: How,hi,her,hello,so,see.

The illustration Trie tree

(The source of the picture is unknown, and there are too many quotes on the Internet.)

The first property:

It can also be seen from the figure that the root node is /, which represents empty content. Other nodes, for example, one level below the root node, have H and S, which represent two characters respectively.

The second property:

From the root node to a node, the characters that pass through the path are connected to the string corresponding to the node.

How is a string, hi is a string, but you wonder why he and hel can’t be strings?

And when you think about it, it means that you’ve been looking at it very carefully, and you’re about to get a handle on it, and indeed, when we look at the diagram, we see that some of the nodes are different colors, and that’s because we’ve booked this dark node to represent the end of a string. Think about it, what does that do?

So in the actual code, how should we go to the convention or make a tag, in fact, just set a tag bit.

Such as the following πŸ‘‡

const TrieNode = function () {
  this.next = Object.create(null)
  this.isEnd = false
};
Copy the code

The current isEnd variable indicates whether the current node is an end string. When isEnd is True, it indicates that the string formed from the root node to this character exists and is a complete string.

The third property:

All children of each node contain different strings.

Obviously, we start at the root node and work our way down, and we see that the nodes below each node are different, so the strings can’t be the same.


Application scenarios

With a little knowledge of the Trie tree, we can look at some of its practical applications.

The reference here is to several points available on the Internet πŸ‘‡

Keyword prompt in the search engine, the engine will automatically pop up the drop-down box matching keywords, this application scenario we should be familiar with.

A drop-down box

So how to use an efficient data structure storage, here is in line with the nature of the dictionary tree, so you can use the dictionary tree to construct specific data, to achieve a more rapid retrieval effect.

String retrieval

Save the information about some known strings (dictionaries) into the trie tree in advance, and search for other unknown strings to see if they occur or how often they occur. Examples can be given to illustrate the situation πŸ‘‡

  • 10 million strings, some of which are duplicate, you need to get rid of all the duplicates and keep the unduplicated strings.
  • Given a list of N words and an article written in all lowercase English, please write down all the new words that are not in the list in the order that they appear first.

Word frequency statistics

Given a long string, count the maximum number of occurrences, for example πŸ‘‡

  • There is a file of 1 GB size, each line contains a word, the size of the word is not more than 16 bytes, the memory limit is 1 MB. Returns the 100 words with the highest frequency.
  • A text file, about 10,000 lines, one word for each line, ask the statistics of the top 10 words most frequently appeared, please give ideas, give time complexity analysis.

The longest public prefix of a string

By now, we should know that Trie trees use common prefixes for multiple strings to save storage space. When we store a large number of strings on a Trie tree, we can quickly get common prefixes for certain strings, so we can use this feature to solve some prefix problems.

If I have to give an example, here is πŸ‘‡

  • Given N lowercase strings and Q questions, which ask what is the longest common prefix length of two strings?

There are many application scenarios, the rest can be explored by yourself, next, we through the actual topic to see, how to construct a dictionary tree ~


Two examples

Next, let’s use two problems as examples to see what problems dictionary trees can solve in practice πŸ‘‡

The longest word in the dictionary is ⭐

Link: longest word in the dictionary

Gives an English dictionary consisting of an array of strings words. Find the longest word that is made up of other words in the words dictionary by adding one letter step by step. If there are more than one possible answer, return the word with the smallest lexicographic order in the answer.

If there is no answer, an empty string is returned.

Example 1:

Input:words = ["w"."wo"."wor"."worl"."world"]
Output:"world"
Explanation:The word"world"Can be made of"w"."wo"."wor", and"worl"Add a letter composition.Copy the code

Example 2:

Input:words = ["a"."banana"."app"."appl"."ap"."apply"."apple"]
Output:"apple"
Explanation:"apply"and"apple"Can be made up of words from a dictionary. but"apple"The lexicographical order of is less than"apply".Copy the code

Tip:


Source: LeetCode Link: https://leetcode-cn.com/problems/longest-word-in-dictionary Copyright belongs to The Collar buckle network. Commercial reprint please contact official authorization, non-commercial reprint please indicate the source.


This problem is to find the longest word, which can be divided into a certain part of the words array. The most violent idea is to enumerate each item, but this kind of time complexity is huge. At this time, can we think about what is common in this problem?

  • That’s right, the prefix is the same, so from that point of view, you can use this prefix tree, and store it
  • Then go through the tree, as long as the tree has only one branch, it means it has a solution, if there are more than two branches, there is no answer.

Complexity analysis




This should make a lot of sense, but I’m going to skip it.

In this case, my solution to construct dictionary tree, of course, there are other solutions, I won’t expand here, can look at my code oh ~

One of the longest string

The code point here is β˜‘οΈ

In fact, as you can see, building a Trie tree takes a lot of space, sort of space for time, so it depends on the actual problem.


Implement Trie (prefix tree) ⭐⭐

Link: Implement Trie (Prefix tree)

Implement a Trie (prefix tree) with insert, search, and startsWith operations.

Example:

Trie trie = new Trie();
trie.insert("apple");
trie.search("apple"); / / returntrue
trie.search("app"); / / returnfalse
trie.startsWith("app"); / / returntrue
trie.insert("app"); trie.search("app"); / / returntrue Copy the code

Description:

  • You can assume that all input is made up of lowercase letters a-z.
  • Ensure that all inputs are non-empty strings.

Source: the power button (LeetCode) link: https://leetcode-cn.com/problems/implement-trie-prefix-tree copyright shall belong to the collar of the network. Commercial reprint please contact official authorization, non-commercial reprint please indicate the source.


This problem is typical of writing Trie trees, so if you’re writing this problem for the first time, if you don’t have any ideas, try looking at someone else’s code to see where the basic pattern is.

Without further ado, you can refer to this code to see how to construct a dictionary tree πŸ‘‡

Leetcode – Implements Trie tree

The code point here is β˜‘οΈ

The rest of the delete operation, and statistics of the frequency of the string, can be implemented by themselves, this is basically not difficult, draw a graph, know how to implement ~


The Trie dictionary tree is a Trie dictionary tree, and the Trie dictionary tree is a Trie tree

The longest word in the dictionary

Implement Trie (Prefix Tree)

Word Search II

Loading question

❀️ thank you

If you find this article helpful:

  1. Click “like” to support it, so that more people can see this content.
  2. Pay attention to the public number front-end UpUp, contact the author, if you encounter problems, welcome to disturb me, we discuss progress together.
  3. If you feel good, you can also read TianTian’s recent article (thanks to Digg friends for their encouragement and support 🌹🌹🌹) :
    • “Algorithms and Data Structures” a brain map showing the beauty of dynamic programming algorithms (370+πŸ‘)
    • Algorithms and Data Structures “The Beauty of DFS and BFS algorithms (240+πŸ‘)
    • “Algorithms and Data Structures” sorting algorithms (220+πŸ‘)
    • Algorithms and Data Structures takes you through the beauty of hashing algorithms (210+πŸ‘)
    • “Algorithms and Data Structures” The Beauty of backtracking algorithms (210+πŸ‘)
    • “Algorithms and Data Structures” The Beauty of divide-and-conquer algorithms (180+πŸ‘)

This article is formatted using MDNICE