React virtual DOM, Diff algorithm and Key mechanism

Reference article:

Juejin. Cn/post / 684490…

Deep in the React Tech Stack

1. The virtual dom

React converts the real native JS DOM into JavaScript objects. So that’s the Virtual Dom

After each data update, the virtual Dom is recalculated and compared with the virtual Dom generated last time, and the changed part is updated in batches. React provides a componentShouldUpdate lifecycle to allow developers to manually control how to reduce unnecessary virtual DOM comparisons after data changes, improving performance and rendering efficiency.

Native HTML element code:

<div class="title">
      <span>Hello ConardLi</span>
      <ul>
        <li>apple</li>
        <li>orange</li>
      </ul>
</div>

Copy the code

React might be stored as JS code like this:

const VitrualDom = {
  type: 'div'.props: { class: 'title' },
  children: [{type: 'span'.children: 'Hello ConardLi'
    },
    {
      type: 'ul'.children: [{type: 'li'.children: 'apple' },
        { type: 'li'.children: 'orange'}]}]}Copy the code

When we need to create or update elements, React first creates and changes the VitrualDom object, and then renders the VitrualDom object into the real DOM.

When we need to listen for events on the DOM, we first listen for events on VitrualDom, which responds by proxying native DOM events.

Composition of the virtual DOM:

Create virtual elements and components using JSX or React. CreateElement, React. CreateClass, etc. The ReactElementelement object, our component will eventually be rendered as the following structure:

Type: The type of the element, which can be a native HTML type (string) or a custom component (function or class)
Key: The unique identifier of a component, used in the Diff algorithm, as described below
Ref: Used to access native DOM nodes
Props: The props of the component passed in. Chidren is a property in props that stores the child node of the current component. It can be an array (multiple child nodes) or an object (only one child node)
Owner: Indicates the Component to which the Component being built belongs
Self :(non-production environment) specifies which component instance is currently located
_source :(non-production) specifies the file from which the debug code came (fileName) and the number of lines of code (lineNumber)

<div className="title">
      <span>Hello ConardLi</span>
      <ul>
        <li>apple</li>
        <li>orange</li>
      </ul>
</div>

Copy the code

Print out this JSX element to confirm that the virtual DOM is essentially a JS object:

Where, the native element tag used in JSX has type as the tag name. In the case of a function or class component, its Type is the corresponding class or function object

2. The diff algorithm

React requires two virtual DOM trees to be maintained at the same time: one representing the current DOM structure and the other generated when React state changes are about to be re-rendered. React compares the differences between the two trees to determine if and how the DOM structure needs to be modified. This algorithm is called the Diff algorithm.

There are some general solutions to this algorithmic problem, which generate the minimum operand to convert one tree to another. However, even in the most cutting-edge algorithms, the complexity of the algorithm is O(n 3), where n is the number of elements in the tree.

If the algorithm was used in React, the amount of computation required to show 1000 elements would be on the order of a billion. The cost is simply too high. React proposed a set of O(n) heuristic algorithm based on the following two assumptions:

1: Two different types of elements produce different trees;

2. Developers can use key prop to indicate which child elements are stable in different renders.

The React Diff algorithm is roughly executed as follows:

Diff algorithm will perform depth-first traversal of the old and new trees to avoid complete comparison between the two trees, so the algorithm complexity can reach O(n). Then generate a unique flag for each node:

In the traversal process, the old and new trees are compared at each node traversal, andOnly elements of the same level are compared:

That is, only the parts of the graph connected by dotted lines are compared and the differences are recorded.

React Diff algorithm

(1) Tree diff

Tree Diff is used for cross-layer operations on React DOM nodes. Because there are few DOM movement operations across hierarchies, the Tree Diff of the React Diff algorithm does not make an in-depth comparison of such operations, but simply deletes and creates them

As shown in the figure, node A (including its children) is moved to node D entirely. React only considers the position change of nodes at the same level, but only creates and deletes nodes at different levels.

When the root node finds that A has disappeared in the child node, it destroys A directly. When D discovers that A child node A is added, it creates A new child node (including the child node) as its child node. Diff: create A → create B → create C → delete A

It can be seen that when A node is moved across the hierarchy, the imaginary move operation does not occur, but the whole tree with A as the root node is recreated. This is an operation that affects React performance, so it is not recommended to operate across DOM nodes.

For these reasons, maintaining a stable DOM structure when developing components can help improve performance. For example, you can hide or show nodes through CSS without actually removing or adding DOM nodes

(2) Component diff

Component diff is a diff algorithm that compares React components at the same level before and after updates:

If it is a component of the same type, continue to compare the Virtual DOM tree (for example, continue to compare the component props with the child nodes and their properties in the component).
If not, the component is judged to be a dirty Component and all child nodes under the entire component are replaced, that is, the original component is destroyed and a new component is created.
Knowing that it is possible for components of the same type to have no change in their Virtual DOM can save a lot of diff time. As a result, React allows the user to determine if the component needs diff analysis via shouldComponentUpdate()

As shown in the figure, when component D changes to component G, even though the two components have similar structures, React will delete component D and recreate component G and its children instead of comparing their structures once it determines that D and G are different types of components.

Although diff can affect performance when two components are of different types but have similar structures, as the React blog notes, components of different types rarely have similar DOM trees, so this extreme factor is unlikely to have a significant impact on actual development

(3) Element diff

Element Diff is a diff algorithm that specializes in all nodes of the same hierarchy, including element and component nodes. When nodes are at the same level, diff provides three node operations: INSERT_MARKUP, MOVE_EXISTING, and REMOVE_NODE.

We call the set of all nodes at the same level to be compared in the virtual DOM tree as the new set and the old set respectively, and the following strategies are adopted:

INSERT_MARKUP: A type of component or element node of the new set does not exist in the old set, that is, the new node needs to be inserted into the new node.
MOVE_EXISTING: Generatecomponent-children has called receiveComponent, in which case prevChild=nextChild, generatecomponent-children has called receiveComponent, You need to move it around, and you can reuse the old DOM node.
REMOVE_NODE: a component or node type of the old collection that exists in the new collection but has different elements that cannot be reused or updated. Therefore, you need to delete the component or node that is not in the new collection.

As shown in the figure, the old set contains nodes A, B, C and D, while the updated new set contains nodes B, A, D and C (only the position changes, the respective nodes and internal data remain unchanged)The old and new sets are diff differentiated in sequence, found B! = A, create and insert B into new set, delete old set A; Similarly, create and insert A, D, and C, and delete B, C, and D.

React finds that such operations are cumbersome and redundant, because these nodes are the same, but the location sequence changes, which requires complex and inefficient deletion and creation operations. In fact, you only need to move the location of these nodes.

React proposes an optimization strategy to address this problem. It allows developers to add unique keys to subnodes of the same hierarchy. See the key mechanism below

3. The key mechanism

(1) The role of key

When a key attribute is added to a node at the same level, when its position at the current level changes. After the React Diff algorithm compares the old and new nodes, if the old and new nodes with the same key value are found, the algorithm moves the nodes (and then compares and updates the nodes according to the original policy), instead of deleting the old nodes and creating new nodes as the original policy does. This definitely improves React performance and rendering efficiency

(2) Specific execution process of key

First, loop through for (name in nextChildren), judge whether there is the same node in the new set if (prevChild === nextChild) by the unique key, if there is the same node, then move the operation. If (child._mountIndex < lastIndex), the position of the current node in the old set must be compared with that of lastIndex. Otherwise, this operation is not performed.

Example 1: All nodes at the same level only change position:

Begin traversal in the order in the new collection

If B’s lastIndex(similar to a buoy) = 0 in the new set and index = 1 in the old set, index > lastIndex, it is considered that B has no influence on the position of other elements in the set and does not move. And then lastIndex = Max (index, lastIndex) = 1
Index = 0, lastIndex = 1, if index < lastIndex, move A, lastIndex = Max (index, lastIndex) = 1
LastIndex = Max (index, lastIndex) = 3
C and A do the same thing, same as (2), move it, so lastIndex = Max (index, lastIndex) = 3

The move operation in the above conclusion means updating the rendering of the node, while not moving means updating the rendering is not needed

Example 2: All nodes at the same level are added or deleted and their positions change:

In the same case, B doesn’t move, lastIndex is equal to 1
Get E in the new set, find E not in the old set, create E at lastIndex, lastIndex++
If I pick C in the old set, C doesn’t move, lastIndex=2
If I take A from the old set, A moves to the position in the new set, lastIndex=2
After completing diff of all nodes in the new set, the old set is iterated to find nodes (D in this case) that do not exist in the new set and delete node D.

(3) Index is the key

React is often used to dynamically generate multiple child nodes in the current hierarchy by traversing (e.g. Array.map). This is a common tabular data rendering scene.

React does not recommend using the traversal index as the key attribute of nodes in this scenario. For example, all nodes currently traversed are of the same type, but their internal texts are different. In the case that index is used as the key, when we change the order of some elements in the original data list, the texts of the old and new nodes corresponding to the same index are inconsistent in the diff comparison between the old and new sets. Some nodes will need to update the rendered text, whereas if other stable unique identifiers are used as keys, only positional order changes will occur, eliminating the need to update the rendered text, improving performance.

In addition, using index as the key can cause some unexpected display errors:

{this.state.data.map((v,index) = > <Item key={index} v={v} />)}
['a','b','c']=>
<ul>
    <li key="0">a <input type="text"/></li>
    <li key="1">b <input type="text"/></li>
    <li key="2">c <input type="text"/></li>
</ul>

// Array rearrangement -> ['c','b','a'] =>
<ul>
    <li key="0">c <input type="text"/></li>
    <li key="1">b <input type="text"/></li>
    <li key="2">a <input type="text"/></li>
</ul>


Copy the code

In the above example, after the array is reordered, the instances corresponding to the key are not destroyed, but updated. Insert key=0; insert key=0;

The component is rerendered to get a new virtual DOM.
React considers a component to be the same, so only the component can be updated.
And then compare their children, found that the content of the text content is different (from a – > c), and input component did not change, then trigger a component componentWillReceiveProps method, so as to update their text content;
Because components of children in the input component did not change, its components and father is no connection between the incoming President of props, so the input component does not update (namely the componentWillReceiveProps methods will not be executed), causes the user input values will not change.

(4) Disadvantages of key mechanism

As shown in the figure, if the nodes of the new set are updated as D, A, B, and C, only D node moves compared with the old set, while A, B, and C still maintain their original order. Theoretically, DIff should only move D. However, since D is the largest in the old set, The _mountIndex of another node is less than lastIndex. As A result, A, B, and C move to the rear of node D instead of node D.

During development, minimize operations like moving the last node to the head of the list. React rendering performance is affected to some extent when the number of nodes is too large or update operations are too frequent.

(5) Precautions for key use:

If the list subsections are iterated over as pure presentation, and do not involve dynamic changes in the order of the list elements, then using index as the key is fine.
Key is only optimized for diff comparison of nodes at the same level, and nodes across the level have no effect on each other’s key values
In most cases, element nodes that use the key attribute at the same level of traversal have the same node type (e.g. span elements or the same component). If the same key exists in the old and new collections, and the node type is different (for example, from SPAN to div), this is equivalent to completely replacing the old node, deleting the old node, and creating the new node.
If there is a key in the new set that did not exist in the old set. For example, a node whose key was 1 now has a key of 100, but no other node in the old collection has a key of 100 either. In this case, the DIff algorithm destroys and recreates the corresponding node. This can be useful in some scenarios (such as resetting a component)
The toString() operation is performed on keys before comparison. Therefore, do not use values of object type as keys. This will result in nodes with the same key value in the same hierarchy. Nodes or components of the same type with duplicate keys are likely to have internal child element state sharing problems

React virtual DOM, Diff algorithm and Key mechanism

1. The virtual dom

2. The diff algorithm

3. The key mechanism

Related Posts

The browser rendering notes that a qualified beginner or intermediate front end engineer needs to master

Front-end Packaging Path (1) – Fundamentals & Development history & Early introduction to Webpack

React daily knowledge points collection