“This article has participated in the good article call order activity, click to see: back end, big front end double track submission, 20,000 yuan prize pool for you to challenge!”

Like and follow, no longer get lost, your support means a lot to me!

🔥 Hi, I am Ugly. This article has been collected by GitHub · Android-Notebook. Welcome to grow with Peng Chouchou. (Contact information & group entry on GitHub)


preface

  • In the last article we discussed prefixes and tricks, and prefix sums are an algorithmic way of thinking that is well suited to dealing with interval queries. At the end of the article, I posed a question: Is there a better way to use prefixes and tricks for interval queries of dynamic data?
  • In this article, I’ll introduce a more efficient interval query data structure — the Segment Tree. Please be sure to like and follow if you can help, it really means a lot to me.

directory

Front knowledge

The content of this article will cover the following preconditions/related knowledge, thoughtful I have helped you to prepare, please enjoy ~

  • “Data structure | prefix and skills”
  • “| binary tree data structure”

1. Disadvantages of prefixes and arrays

In the last article, we used the “prefix and + difference” technique to solve 303. Region and retrieval – array immutable. In simple terms, we create a prefix and array that stores “the sum of all the precursor nodes of the element.” Using this prefix and array, we can quickly calculate the sum of the intervals [I, j] : Nums [I,j]=preSum[j+1]−preSum[I]nums[I,j]=preSum[j+1] -presum [I]nums[I,j]=preSum[j+1]−preSum[I]

Reference code:

class NumArray(nums: IntArray) { private val sum = IntArray(nums.size + 1) { 0 } init { for (index in nums.indices) { sum[index + 1] = Sum [index] + nums[index]}} fun sumRange(I: Int, j: Int): Int {return sum[j + 1] -sum [I]Copy the code

In this case, the interval query time complexity is O(1)O(1)O(1), the space complexity is O(n)O(n)O(n), generally good. However, as mentioned in the preface, so far we have only considered scenarios with static data. What if the data were modifiable?

We need to fix prefixes and arrays, for example: The num [2] num [2] num [2] is 10, then preSum [7] 3,…, preSum [7] 3,…, preSum [7] 3,…, all you need to update, the update operation time complexity is O (n) O (n) O (n). One update is not a big deal, but if the updates are frequent, the amortized time complexity of the algorithm deteriorates. In order to solve the dynamic data scenario, the data structure of “line segment tree” appears. The complexity of this data structure is compared with other data structures as follows:

The data structure Building structure Interval update Range queries Spatial complexity
Traversal, without using data structures O(1) O(1) O(n) O(1)
Prefixes and arrays O(n) O(n) O(1) O(n)
Line segment tree O(n) O(lgn) O(lgn) O * n (4) or (2 * n) O
Tree array O(n) O(lgn) O(lgn) O(n)

You can see that prefixes and arrays have the advantage of O(1)O(1)O(1) queries, but are not suitable for dynamic data scenarios, while line trees seem to have learned the middle ground, balancing the time complexity of both “interval query” and “single point update” operations. How does it do that?


2. What is a segment tree?

This is because prefixes and arrays are linear logical structures, and the modification operation must take linear time. In order for the modification operation to be superior to linear time, a nonlinear logical structure must be built.

2.1 Logical definition of line segment tree

The general binary tree node stores a value, while the node in the segment tree stores the aggregate information (such as maximum/minimum/and) on an interval [L,R][L, R][L,R], and the interval of the child node is exactly the same as the interval of the parent node. , for example, the parent node of the interval is [L, R] [L, R] [L, R), then the left child node of the interval is [L + R (L) / 2] [L + R (L) / 2] [L + R (L) / 2], the right child node interval is [+ R (L) / 2, R] [(L + R) / 2, R] [(L + R) / 2, R]. The leaf node is also an interval, but the endpoint of the interval, L==RL ==RL ==R, is a single point interval.

– photo references from www.jianshu.com/p/4d9da6745… – the yo1ooo

From the logical definition of Segment Tree, we can see that a Segment Tree is essentially a balanced binary search Tree, that is to say, it has the properties of both binary search Tree and balanced binary Tree:

  • Binary search tree: the node value in the left subtree of any node is less than the root node value, and the node value in the right subtree is greater than the root node value;

  • Balance Tree: The height difference between the left and right subtrees of any node is not greater than 1.

2.2 Physical implementation of line segment trees

In general, a physical implementation of a binary tree can be based on an array or a linked list. However, since the line segment tree itself is also a balanced binary tree, other layers of the line segment tree are full except for the last node of the binary tree, so the implementation using array is more space-efficient.

So, how do you implement an array-based line segment tree? The root node of the tree can be assigned to bit [0] or bit [1] of the array. There is no obvious difference between the two methods. The main difference is that the formula for calculating subscripts of child nodes/parent nodes is different:

The root node is stored at the end
[ 0 ] [0]
A:

  • On the first [I] [I] [I] on a node, the first [2 ∗ I + 1] [2 * I + 1] [2 ∗ I + 1) is the left node, the first [2 ∗ I + 2], [* 2 + 2] I [2 ∗ I + 2] is right node
  • On the first [I] [I] [I] on a node, the first [(I – 1) / 2] [(I – 1) / 2] [(I – 1) / 2] is the parent node

The root node is stored in bit [1][1][1] (recommended) :

  • Bit [0][0][0] is not stored. The root node is stored in bit [1][1][1]
  • On the first [I] [I] [I] on a node, the first 2 ∗ [I] [I] 2 * [I] 2 ∗ is the left node, the first [2 ∗ I + 1] [2 * I + 1] [2 ∗ I + 1] is right node
  • For the node in bit [I][I][I], bit [I /2][I /2][I /2] is the parent node

General implementation reference code:

class SegmentTree<E>( private val data: Array<E>, private val merge: (e1: E? , e2: E?) -> E ) { private val tree: Array<E? Tree = Array<Any? >(4 * data.size) { null } as Array<E? > buildSegmentTree(0, 0, data.size - 1)} /** * Index of left child node */ fun leftChildIndex(treeIndex: Int) = 2 * treeIndex + 1 /** * rightChildIndex(treeIndex: Int) = 2 * treeIndex + 2 / * * * had * @ param treeIndex current segment tree index * @ param left the right endpoints of the left endpoint * @ param right interval * / private fun BuildSegmentTree (treeIndex: Int, left: Int, right: Int) {// see Section 3} /** * Get (index: Int): E { if (index < 0 || index > data.size) { throw IllegalArgumentException("Index is illegal.") } return data[index] } /** @param left query(left: Int, right: Int): E { if (left < 0 || left >= data.size || right < 0 || right >= data.size || left > right) { throw IllegalArgumentException("Index is illegal."); } /** * update @param index @param value new value */ fun set(index: Int, value: Int) E) { if (index < 0 || index >= data.size) { throw IllegalArgumentException("Index is illegal."); } // See Section 3}}Copy the code

The buildSegmentTree(), Query (), and Update () methods are implemented in the next section. So why do we need to allocate 4n, 4n, 4n, 4n?

todo


3. Basic operations of line segment trees

Understand the logical definition and implementation of line segment tree, this section, I take you step by step to achieve the three basic operations of line segment tree – tree building & interval query & update.

3.1 done

Tree building is to build the data structure of the line segment tree using the original data. We use the top-down construction method. For each node of the line segment tree, we first build its left and right sub-trees, and then build the current node according to the left and right sub-nodes. For leaf nodes (single point interval), only the current node is built.

Reference code:

init { tree = Array<Any? >(4 * data.size) { null } as Array<E? > buildSegmentTree(0, 0, Data. The size - 1)} / * * * had * @ param treeIndex current segment tree index * @ param treeLeft node * @ treeRight param right left endpoint node interval right endpoint * / private fun buildSegmentTree(treeIndex: Int, treeLeft: Int, treeRight: Int) {if (treeLeft == treeRight) {// merge(data[treeLeft], null) return } val mid = (treeLeft + treeRight) ushr 1 val leftChild = leftChildIndex(treeIndex) val rightChild = RightChildIndex (treeIndex) // Build left subtree buildSegmentTree(leftChild, treeLeft, mid) mid + 1, treeRight) tree[treeIndex] = merge(tree[leftChild], tree[rightChild]) }Copy the code

Building complexity analysis:

  • Time complexity: O(n)O(n)O(n)
  • Space complexity: O(4∗n)O(4 * n)O(4∗n) = O(n)O(n)O(n) O(n)O(n)

3.2 Interval Query

Interval query is the result of querying an expected interval. The basic idea is to recursively query the result of subinterval, and then merge the result of subinterval to get the result of expected interval. The logic is as follows:

  • 0, start from the root node (the root node is the entire interval), recursively perform the following steps:
  • 1. If the search range is exactly equal to the range of the node interval, the aggregated data of the node is directly returned;
  • 2, if the search range just falls in the left subtree range, then recursively in the left subtree search;
  • 3, if the search scope just falls in the right subtree range, then recursively in the right subtree search;
  • 4. If the search scope spans two subtrees, then split into two recursive searches, and merge the results after the search is complete.
/** * @param left / @param right / / fun query(left: Int, right: Int): E { if (left < 0 || left >= data.size || right < 0 || right >= data.size || left > right) { throw IllegalArgumentException("Index is illegal."); } return query(0, 0, data.size-1, left, right) Take data length} / * * @ * * * range query param treeIndex index of the current node * @ param dataLeft current node left interval * @ param dataRight current node right * @ the param left interval left endpoint */ Private query(treeIndex: Int, dataLeft: Int, dataRight: Int, left: Int, right: Int): E {if (dataLeft == left && dataRight == right) {return tree[treeIndex]!! } val mid = (dataLeft + dataRight) ushr 1 val leftChild = leftChildIndex(treeIndex) val rightChild = If (right <= mid) {return query(leftChild, dataLeft, mid, left, dataLeft); If (left >= mid + 1) {return query(rightChild, mid + 1, dataRight, left, } val leftResult = query(leftChild, dataLeft, mid, left, mid) val rightResult = query(rightChild, dataLeft, mid) mid + 1, dataRight, mid + 1, right) return merge(leftResult, rightResult) }Copy the code

Query complexity analysis:

  • Time complexity: O(LGN)O(LGN)O(LGN), depending on the height of the tree
  • Space complexity: O(1)O(1)O(1)

3.3 Single point update

Single-point update is to appropriately adjust the structure of the line segment tree after data changes. The basic idea is to recursively modify the results of subintervals, and then update the results of expected current nodes by merging the results of subintervals. The logic is as follows:

  • 0, update the data array, and then update the value from the root node (which is the entire range), recursively:
  • Select * from left where left = right where left = right;
  • 2. If the update node happens to fall within the range of the left subtree, then recursively update the left subtree;
  • 3. If the update node happens to fall within the range of the right subtree, then recursively update the right subtree;
  • 4. After the left and right subtrees are updated, the current node is updated by merging the subtree information.
/** * @param index @param value new value */ fun set(index: Int, value: E) { if (index < 0 || index >= data.size) { throw IllegalArgumentException("Index is illegal."); } private fun set(treeIndex: Int, dataLeft: Int, dataLeft: Int); Int, dataRight: Int, index: Int, value: E) {if (dataLeft == dataRight) {// tree[treeIndex] = value return} Val mid = (dataLeft + dataRight) ushr 1 val leftChild = leftChildIndex(treeIndex) val rightChild = rightChildIndex(treeIndex) if (index <= mid) { set(leftChild, dataLeft, mid, index, value) } else if (index >= mid + 1) { set(rightChild, mid + 1, dataRight, index, value) } tree[treeIndex] = merge(tree[leftChild], tree[rightChild]) }Copy the code

Update complexity analysis:

  • Time complexity: O(LGN)O(LGN)O(LGN), depending on the height of the tree
  • Space complexity: O(1)O(1)O(1)

At this point, our SegmentTree data structure is completed, the complete code is as follows: SegmentTree


4. Typical examples · Area and retrieval – variable array

307. Range and retrieval – array variable “Antithesis”

Given an array nums, you are asked to complete two types of queries, one that requires updating the value corresponding to the array index, and the other that requires returning the sum of the array elements in a range.

class NumArray(nums: IntArray) {
    fun update(index: Int, `val`: Int) {

    }

    fun sumRange(left: Int, right: Int): Int {

    }
}
Copy the code

This is similar to 【题 303】, the difference is whether the array is mutable, belongs to the dynamic data scene. In the last video, we implemented a generic segment tree data structure, so let’s just use it.

Reference code:

class NumArray(nums: IntArray) { private val segmentTree = SegmentTree<Int>(nums.toTypedArray()) { e1: Int? , e2: Int? -> if (null == e1) e2!! else if (null == e2) e1 else e1 + e2 } fun update(index: Int, `val`: Int) { segmentTree.set(index, `val`) } fun sumRange(left: Int, right: Int): Int { return segmentTree.query(left, right) } }Copy the code

Something is done in a few lines of code, and the result is much better than using a prefix tree. But just from a problem perspective, it seems like a pain to write a bunch of SegmentTree code every time you do a problem. Are there other workarounds?


5. Line segment tree solution framework

Defining the SegmentTree data structure is too much effort. In this section, we’ll discuss a generic framework for solving problems that does not require SegmentTree definition. This is a clever solution, although it does not strictly satisfy the definition of a line segment tree (not a binary search tree, but still a balanced tree), but it is simpler to implement.

Reference code:

class NumArray(nums: IntArray) {private val n = nums.size private val tree = IntArray(2 * n) {0} Init {// Build the leaf node for (index in n until 2 * n) {tree[index] = nums[index-n]} // Build the parent node for (index in n) in sequence - 1 downTo 0) { tree[index] = tree[index * 2] + tree[index * 2 + 1] } } fun update(index: Int, `val`: Var treeIndex = index + n tree[treeIndex] = 'val' while (treeIndex > 0) {// 2; Depending on whether the current node is even or odd, Val right = if (0 == treeIndex % 2) val left = if (0 == treeIndex % 2) Else treeIndex - 1 val right = if (0 == treeIndex % 2) treeIndex + 1 else treeIndex tree[treeIndex / 2] = tree[left] + tree[right] treeIndex /= 2 } } fun sumRange(i: Int, j: Int): Int { var sum = 0 var left = i + n var right = j + n while (left <= right) { if (1 == left % 2) { sum += tree[left] left++ } if (0 == right % 2) { sum += tree[right] right-- } left /= 2 right /= 2 } return sum } }Copy the code

The nice thing about this implementation is that you only need 2 by n space instead of 4 by n space and I’m going to explain the code. The code consists of three main parts:

5.1 done

Building a line segment tree requires initializing an array of 2∗n2*n2∗ N space, and constructing the whole tree from bottom up. First, build the leaf node, the leaf node is located in the array interval [n,2n−1][n, 2n-1][n,2n−1], and then build the parent node according to the result of the child node (the node with the subscript of IndexIndexindex, the left child node with the subscript: 2∗index2*index2∗index, subindex of right child node: 2∗index+12*index+12∗index+1). Refer to the following schematic diagram:

5.2 Interval Query

Interval query is the result of querying an expected interval. Compared with the conventional line segment tree, the interval query process of this kind of line segment tree is relatively difficult to understand. The basic idea is to recursively find nodes that represent the interval. The logic is as follows:

  • 1. The initial interval query is equivalent to the merging of several leaf nodes [left,right][left,right][left,right] [left,right][left,right] between the array [n,2n−1][n,2n −1][n,2n −1].

  • 2. For node IndexIndexindex, its left subindex is 2∗index2*index2∗index, and the right subindex is 2∗index+12*index+12∗index+1, which means that all left subindices are even, and all right subindices are odd;

  • 3, 2 left/left / = = 2 left / / = 2 = 2 and right right right / / = 2 = 2 is looking for a parent, if leftleftleft pointer is odd, then leftleftleft pointer node must be a right, In this case, the left/2left/2left/2 node cannot directly represent the leftleftleft pointer node, so the left/2left/2 node can only be added separately. If the rightrightright pointer is even, then the righttrighttrightt pointer node must be a left node, In this case, the right/2right /2right/2 node cannot directly represent the right pointer node, so only add this “single” node;

  • Left == rightLeft == rightLeft ==right == rightLeft ==right == rightLeft ==right And the next lying loop leftLeftleft must be greater than rightrightright, break out of the loop.

5.3 Single Point update

Single-point update is to appropriately adjust the structure of the line segment tree after data changes. The basic idea is to update nodes corresponding to the target position first, and then recursively update the parent node. It is important to determine which two nodes to merge as the parent node based on whether the index of the current node is even or odd.

For example, if the updated node is the “a” node whose index in the line segment tree array is an even number (subindex 6), then its parent node is the “ab” node which needs to be obtained by merging tree[index] + tree[index+1].


6. Summary

  • Prefixes, arrays and line segment trees are applicable to interval query problems. The former reduces the overall performance when data is updated frequently, while the latter balances the time complexity of update and query with the complexity of O(LGN)O(LGN)O(LGN).
  • From the perspective of problem solving, the conventional method of building line segment tree is too complex, so we can use the anti-conventional way of building line segment tree, which will make the code more concise and the space complexity is better.
  • Besides line segment trees, do you know of any similar data structures that excel at interval queries and single point updates?

The resources

  • Segment tree · Problem set — LeetCode
  • Segment tree entry — Wikipedia
  • Segment trees by Yo1ooo
  • Chapter 24 line segment trees — by liweiwei1419

Creation is not easy, your “three company” is the biggest power of Ugly, we see next time!