preface

We often see in various articles that React optimizes diff complexity from O(n^3) to O(n), so how do we get the traditional O(n^3)? How does React optimize it to O(N)?

Traditional Diff algorithm

The traditional diff algorithm is O(n^3) for the minimum number of steps needed to change one tree into another. How to get it?

  • A node in the old number, compared to all the nodes in the new tree (O(n))
  • If the node is not found in the new tree, then the node will be deleted and the new tree will be trawled to find several nodes to fill it (up to O(n^2)).
  • All the nodes in the old tree will do this.
/ / pseudo-code const oldNodes = [' a ', 'b', 'c', 'd', 'e']. const newNodes = ['a', 'c', 'f', 'd', 'e']; for (let i = 0; i< oldNodes.length; I++) {// complexity O(n) for (let j = 0; j< newNodes.length; J++){// complexity O(n^2) // old node in new node if(not found){for (let k = 0; k< newNodes.length; K++) {// Complexity O(n^3) // Find the fill}}}}

There is a paper devoted to the diff algorithm: the diff paper


React diff

Three Strategies by React Diff

  • DOM nodes in Web UI move very little across the hierarchy and can be ignored (tree diff);
  • Two components of the same class will generate similar tree structures, and two components of different classes will generate different tree structures (Component diff).
  • For a group of child nodes at the same level, they can be distinguished by a unique key (Element diff);

tree diff

Based on Strategy 1, React makes a hierarchical comparison and optimization of the tree algorithm. Two trees only compare nodes at the same level.

The React Tree Diff only compares the DOM nodes in the same color according to the first strategy that ignores the inter-hierarchy movement. When a node of a certain level is found to be no longer present, the node and all its child nodes will be completely deleted without further comparison. This completes the DOM tree comparison with only one traversal, which increases the complexity to O(n).

According to the above rules, maintaining a stable DOM structure when developing components will help improve performance. For example, instead of displaying an element, you can control how it is shown and hidden by using the visibility attribute.

component diff

  • The component type has not changed. Continue comparing the Virtual Dom Tree.
  • If the type of the component changes, the component is treated as a dirty component and a new component is created, including all the children, to replace the component.
  • For components of the same type, it is possible that the Virtual DOM will not have any changes, so we can judge whether the component needs to diff by shouldComponentUpdate.

For example, when component D is changed to component G in the figure above, although their structure is very similar, due to the different component types, React diff will create a new component G to replace the whole component D, and E and F will also be recreated without reuse of component D.

element diff

When nodes are at the same level, React provides three ways to manipulate nodes: INSERT_MARKUP (insert), MOVE_EXISTING (move), and REMOVE_NODE (remove).

  • INSERT_MARKUP, a new node in the original set, needs to insert the new node;
  • Existing MOVE_EXISTING, which has a node in both the old and new sets, but is in a different position, can reuse the old node and needs to move the node;
  • REMOVE_NODE. If a node in the old collection does not exist in the new collection, or if an attribute is inconsistent, it needs to be removed.

Based on Element Diff React, it provides an optimization strategy, which is to add unique key to distinguish the same group of child nodes at the same level, which is also the premise of whether the nodes can be reused.

There is no key sample

The old set contains nodes: A, B, C, D, and the updated new set contains nodes: B, A, D, C. At this point, the new and old sets are differentiated by diff comparison and B is found! = A, then create and insert B into the new set, delete the old set A; And so on, create and insert A, D, and C, and delete B, C, and D.

A key example

Let’s look at the diff process in detail

LastIndex represents the maximum index position of the visited node in the old collection, which is a dynamic value initialized to 0.

  • So we go to C, B doesn’t move, lastindex is equal to 1
  • If we go to C and find that the old set does not exist, we create E and place it in the new set, lastIndex = 1
  • So if we go to C, and we don’t have index < lastIndex, C doesn’t move, lastIndex = 2
  • Index < lastIndex, move A to the corresponding position, lastIndex = 2
  • When the diff of all nodes in the new set is completed, it is also necessary to loop through the old set finally to judge whether there is any node that is not in the new set but still exists in the old set. It is found that there is such node D, so the node D is deleted and the diff is completed at this point

The shortcomings of the diff algorithm

Since the index of node D is the largest in the old set, the index of node A, node B and node C is < lastIndex, which leads to the move operation of node A, node B and node C. And this is just an extreme situation. The real world contains many similar situations. Think about the reverse loop.

Why the complexity of the React diff algorithm is O(n)

The reason is that React uses three strategies that limit a lot of situations and don’t do overly complicated calculations. All comparisons are performed at the same level, and it takes only one loop to complete all operations.

Refer to the article

What’s wrong with the React diff key using an array index?

React official explanation

We strongly recommend that you specify an appropriate key every time you build a dynamic list. If you don’t find a key that fits, then you need to consider rearranging your data structure so that you have the right key.

If you don’t specify any key, React will warn you and treat the array index as the default key. But if you want to reorder, add, or delete a list, using the array index as the key is problematic. Explicitly using key={I} to specify a key does eliminate the warning, but it still has the same problems as array indexes, so it’s best not to do so in most cases.

What’s the problem?

When the index of an array is used as the key value, it causes display misalignment. When deleting and inserting arrays, because array subscripts are dynamic, deleting and inserting will eventually be reflected in the last element.