In this article, we will understand the implementation process of Diff algorithm.

Relevant concepts

Let’s start by looking at some of the concepts involved in React Diff.

React nodes

Assuming that there is a DOM node that triggers an update, there are four types of nodes associated with the node during coordination:

  1. The DOM node itself.

  2. WorkInProgress Fiber, the fiber node (current) in the workInProgress Tree generated during the update process

Fiber. Alternate).

  1. Current Fiber, the DOM node corresponding to the fiber node has been rendered in the page (i.e. WorkInProgress Fiber. Alternate, i.e. WorkInProgress Fiber generated in the last update).

  2. ReactElement, the result of a ClassComponent call to the Render method or FunctionComponent during update. The ReactElement contains information describing the DOM node.

Diff algorithm can be understood as a new ReactElement is generated in the process of updating. After comparing the ReactElement with the corresponding current fiber in the process of F coordination, the workInProgress Fiber is generated.

Double buffering mechanism

The React Fiber dual-cache mechanism is an in-memory build-and-replace technique. This technique is used in render.

There are two Fiber trees in React. One is the current Fiber Tree corresponding to the DOM displayed on the screen, and the other is the workInProgress Fiber Tree constructed in memory when a new update task is triggered.

Fiber nodes in the Current Fiber Tree and workInProgress Fiber Tree are connected using the alternate property.

currentFiber.alternate === workInProgressFiber;
workInProgressFiber.alternate === currentFiber;
Copy the code

The React root node also has the Current attribute. Use the current attribute to switch between the root nodes of different Fiber trees to switch between the Current Fiber Tree and workInProgress Fiber Tree.

In the coordination phase, React uses the DIff algorithm to compare the React Element that generates the update with the nodes in the current Fiber Tree, and finally generates the workInProgress Fiber Tree in memory. The Renderer then renders the update to the page based on the workInProgress Fiber Tree. At the same time, the current attribute of the root node points to the workInProgress Fiber Tree, and the workInProgress Fiber Tree becomes the Current Fiber Tree.

effectTag

Effecttags are used to hold the specific type of DOM operation to be performed. Effecttags are represented in binary:

/ /... // means that the DOM node corresponding to the Fiber node needs to be inserted into the page. export const Placement = /* */ 0b000000000000010; // means that the Fiber node needs to be updated. export const Update = /* */ 0b000000000000100; export const PlacementAndUpdate = /* */ 0b000000000000110; // means that the DOM node corresponding to the Fiber node needs to be removed from the page. export const Deletion = /* */ 0b000000000001000; / /...Copy the code

The bottleneck of the Diff

React coordination:

At some point a node calls the Render () method of React, creating a tree of React elements. The same render() method returns a different tree the next time state or props is updated. React needs to determine how to update the UI efficiently based on the differences between the two trees to keep the current UI in sync with the latest tree.

Some common solutions to this algorithm are to generate a minimum number of operations to convert one tree to another. However, even with the optimal algorithm, the complexity of the algorithm is O(n 3), where n is the number of elements in the tree.

React How to solve Diff bottleneck

React proposed a set of O(n) heuristic algorithms based on the following two assumptions:

  1. Diff only sibling elements;

  2. Two different types of elements produce different trees;

  3. The developer can set the key attribute to tell the render which child elements can be saved in different renders.

React Diff implementation

The React Diff entry function for reconcileChildren is the reconcileChildren’s current === NULL, which separates the current Fiber node from the Current Fiber node for mounting or updating, and then performs different tasks in each case:

export function reconcileChildren( current: Fiber | null, workInProgress: Fiber, nextChildren: any, renderLanes: Lanes) {if (current === null) {workinprogress. child = mountChildProgress (workInProgress, null, nextChildren, renderLanes, ); } else {// For the components of the update workinprogress. child = reconcileChildFibers(workInProgress, current. Child, nextChildren, renderLanes, ); }}Copy the code

MountChildFibers and reconcileChildFibers are generated through ChildReconciler.

export const reconcileChildFibers = ChildReconciler(true);
export const mountChildFibers = ChildReconciler(false);
Copy the code

The difference is that shouldTrackSideEffects has a different value. If shouldTrackSideEffects is true, it will collect effectTag for the generated fiber node, and if shouldTrackSideEffects is true, it will not collect effectTag.

function ChildReconciler(shouldTrackSideEffects) {
  //...

  function reconcileChildrenArray(
    returnFiber: Fiber,
    currentFirstChild: Fiber | null,
    newChildren: Array<*>,
    lanes: Lanes,
  ): Fiber | null {
    //...
  }

  function reconcileChildrenIterator(
    returnFiber: Fiber,
    currentFirstChild: Fiber | null,
    newChildrenIterable: Iterable<*>,
    lanes: Lanes,
  ): Fiber | null {
    //...
  }

  function reconcileSingleTextNode(
    returnFiber: Fiber,
    currentFirstChild: Fiber | null,
    textContent: string,
    lanes: Lanes,
  ): Fiber {
    //...
  }

  function reconcileSingleElement(
    returnFiber: Fiber,
    currentFirstChild: Fiber | null,
    element: ReactElement,
    lanes: Lanes,
  ): Fiber {
    //...
  } 

  function reconcileSinglePortal(
    returnFiber: Fiber,
    currentFirstChild: Fiber | null,
    portal: ReactPortal,
    lanes: Lanes,
  ): Fiber {
    //...
  } 

  // This API will tag the children with the side-effect of the reconciliation
  // itself. They will be added to the side-effect list as we pass through the
  // children and the parent.
  function reconcileChildFibers(
    returnFiber: Fiber,
    currentFirstChild: Fiber | null,
    newChild: any,
    lanes: Lanes,
  ): Fiber | null {
    //...
  } 

  return reconcileChildFibers;
}
Copy the code

ChildReconciler internally defines a number of reconciling-related functions, which include reconcileChildFibers as their return value. That is to say, the ChildReconciler’s internal definition for the React Diff reconcileChildFibers is the real entry function. The reconcileChildFibers call different handlers within the reconcileChildFibers based on the typeof the newChild (the newly generated ReactElement) and the $$typeof attribute.

/ / processing function according to different Diff newChild into function reconcileChildFibers (returnFiber: Fiber, currentFirstChild: Fiber | null, newChild: any, ): Fiber | null { //... if (typeof newChild === 'object' && newChild ! $$typeof) {case REACT_ELEMENT_TYPE: case REACT_ELEMENT_TYPE: case REACT_ELEMENT_TYPE: // reconcileSingleElement is called //... }} the if (typeof newChild = = = 'string' | | typeof newChild = = = 'number') {/ / call reconcileSingleTextNode processing / /... } if (isArray(newChild)) {// Calls the reconcileChildrenArray reconcililechildrenarray... } the if (getIteratorFn (newChild)) {/ / call reconcileChildrenIterator processing / /... } / /... Return deleteRemainingChildren(returnFiber, currentFirstChild); }Copy the code

We can divide Diff into two types based on the number of nodes in newChild:

  1. When newChild has only one node, i.e. newChild of type Object, number, string, we enter a single-node Diff
  2. When newChild has multiple nodes, that is, Array of type, the multi-node Diff is entered

Single-node Diff

Single node means newChild is a single node, but we are not sure how many nodes this node has in the corresponding level of current Fiber Tree. Therefore, we can divide this node into three scenarios according to the number of nodes in the corresponding level of current Fiber Tree:

  1. A single old node exists at the corresponding layer. Procedure
Old: A New: A'Copy the code
  1. Multiple old nodes exist at the corresponding layer
Old: A -> B -> C New: A'Copy the code
  1. There are no old nodes at the corresponding tier
Old: null New: A'Copy the code

We can classify scenario 1 and scenario 2 into one category (i.e. at the current level, current fiber has at least one node in the sibling direction linked list), and scenario 3 is a separate category (there is no current fiber node in the current level).

The current fiber list at sibling direction is not empty

At this point, we need to loop through all current fiber nodes of the list at the sibling direction at the current level, Check whether the stateNode (i.e., DOM node corresponding to the fiber node) of the current layer can be reused (with the same key and type). If so, the Current fiber can be used as a copy to generate workInProgress Fiber. When all current fiber nodes at the current level cannot be reused, a new fiber node is directly generated as the workInProgress Fiber.

The linked list of current fiber at the sibling direction of the current level is empty

Since there is no corresponding current Fiber node, a new fiber node is directly generated as workInProgress Fiber.

reconcileSingleElementConcrete implementation of

function reconcileSingleElement( returnFiber: Fiber, currentFirstChild: Fiber | null, element: ReactElement, lanes: Lanes, ): Fiber { const key = element.key; let child = currentFirstChild; // Loop over the current fiber node while (child! == null) {if (child.key === key) { Check whether current fiber. ElementType is equal to ReactElement. default: {if (child-.elementType === element.type) {// If (child-.elementType === element.type) {// If (child-.elementType === element.type) { Therefore, if the current node has a sibling node, delete effectTag on the sibling node deleteRemainingChildren(returnFiber, child.sibling). // Const existing = useFiber(child, element.props); // Const existing = useFiber(child, element.props); existing.ref = coerceRef(returnFiber, child, element); existing.return = returnFiber; return existing; } break; }} // If the key is the same but the type is different, the node corresponding to the current node is found when it was updated last time, but the type of the two nodes is inconsistent, and the subsequent nodes cannot be reused. Delete the current fiber node and the remaining siblings of the node deleteRemainingChildren(returnFiber, Child); break; } else {// Delete Deletion effectTag on the current current fiber node, delete the current node, and continue to traverse the next sibling node to try to match deleteChild(returnFiber, child). } child = child.sibling; Const created = createFiberFromElement(Element, returnFiber.mode, lanes); created.ref = coerceRef(returnFiber, currentFirstChild, element); created.return = returnFiber; return created; }Copy the code

Pay attention to, When the Deletion effectTag is not matched for the current fiber node, it will be added to the effectList of the parent node at the same time. (Normally, the collection of effectList is performed in completeWork. However, the deleted node will break away from the fiber tree and cannot enter the process of completeWork, so the effectList of the parent node is added in advance in the beginWork stage.

Multi-node Diff

When newChild is multi-node, multi-node Diff is triggered in the following scenarios.

  1. Node updates
// Node attributes changed; / / old < div key = "a" > a < / div > < div key = "b" > b < / div > < div key = "c" > c < / div > / / new < div key = "a" className = "a" > a < / div > < div Key = "b" > b < / div > < div key = "c" > c < / div > / / the node type is changed; / / old < div key = "a" > a < / div > < div key = "b" > b < / div > < div key = "c" > c < / div > / / new < span key = "a" > a < / div > < div key = "b" > b < / div > < div key="c">c</div>Copy the code
  1. The number of nodes changes;
/ / / / delete node < div old key = "a" > a < / div > < div key = "b" > b < / div > < div key = "c" > c < / div > / / new < div key = "a" > a < / div > / / / / the old new node < div Key = "a" > a < / div > < div key = "b" > b < / div > < div key = "c" > c < / div > / / new < div key = "a" > a < / div > < div key = "b" > b < / div > < div key="c">c</div> <div key="d">d</div> <div key="e">e</div>Copy the code
  1. The position of the node changes;
/ / old < div key = "a" > a < / div > < div key = "b" > b < / div > < div key = "c" > c < / div > < div key = "e" > e < / div > / / new < div key = "a" > a < / div > < div key="c">c</div> <div key="b">b</div> <div key="e">e</div>Copy the code

We can see, in the analysis of reconcileChildFibers React to use two functions reconcileChildrenArray (for an array type) and reconcileChildrenIterator (for iterative type) for multi-node diff, We focus on the reconcileChildrenArray.

function reconcileChildrenArray(
  returnFiber: Fiber,
  currentFirstChild: Fiber | null,
  newChildren: Array<*>,
  lanes: Lanes,
): Fiber | null {
  //...
}
Copy the code

The parameter currentFirstChild refers to the first fiber in the current Fiber tree hierarchy. Using fiber.sibling we can get a linked list of all fiber nodes of currentFirstChild in the current hierarchy from left to right. We use oldFiber to represent this linked list.

The newChildren argument is an array containing all ReactElement objects of the current hierarchy.

The reconcileChildrenArray compares the differences between the two sequences and generates a list of Fiber nodes in the current level of the workInProgress Fiber Tree, using resultingFirstChild as the head node. The comparison process goes through two rounds of traversal.

First iteration

  1. Traversal compares newChildren and oldFiber to determine whether DOM nodes are reusable.

  2. If reusable, oldFiber is reused to generate newFiber and added to the sequence with resultingFirstChild as the head node (subsequently represented by newFiberList), and both sequences jump to the next node.

  3. If it is not reusable, it is processed in two cases:

    • If the keys are inconsistent, the first iteration is terminated immediately.
    • If the keys are the same but the types are different, delete the current oldFiber node and generate a new fiber node and add it to the oldFiber nodenewFiberListIn, both sequences jump to the next node to continue traversal.
  4. When any of the two sequences to complete a sequence traversal (i.e. newIdx > = newChildren. Length 1 | | oldFiber. (= = = null), traversing the end of the first round.

Second round of traversal

After the first iteration, there are four results.

oldFiberwithnewChildrenWe’re all done iterating

This round of DIFF is now complete. In this case, which is the best case, only the corresponding update operation in the first round of traversal is required.

oldFiberI’m done, butnewChildrenI haven’t gone through it yet.

Assuming that all nodes in newFiberList can be directly reusable, the current scenario is consistent with the new node scenario we discussed earlier.

/ / / / the old new node < div key = "a" > a < / div > < div key = "b" > b < / div > < div key = "c" > c < / div > / / new < div key = "a" > a < / div > < div key = "b" > b < / div > <div key="c">c</div> <div key="d">d</div> <div key="e">e</div>Copy the code

This means that the remaining newChildren nodes are new, and we need to traverse the remaining newChildren to generate the corresponding workInProgress fiber and mark the upper Placement.

It is only assumed that all nodes added to newFiberList in the first round are reusable for easy understanding, but we have analyzed previously that there are two types of nodes that can be added to newFiberList in the first round:

  1. Nodes that can be reused directly;
  2. Nodes with the same key but different type;

newChildrenI’m done, butoldFiberI haven’t gone through it yet

Again, we assume that all nodes added to newFiberList can be reused directly, this time consistent with the node deletion scenario we discussed earlier.

/ / old < div key = "a" > a < / div > < div key = "b" > b < / div > < div key = "c" > c < / div > / / new < div key = "a" > a < / div >Copy the code

At this point, all the newChildren nodes have found reusable nodes. We need to traverse the remaining oldFiber and mark Deletion.

Neither newChildren nor oldFiber has completed traversal

This means that there is a possibility that a node has changed position in this update. Let’s look at the example of position shifting mentioned earlier.

/ / old < div key = "a" > a < / div > < div key = "b" > b < / div > < div key = "c" > c < / div > < div key = "e" > e < / div > / / new < div key = "a" > a < / div > < div key="c">c</div> <div key="b">b</div> <div key="e">e</div>Copy the code

After the first round of traversal, the node with key value a is multiplexed by oldFiber node to generate newFiber node, which is added to newFiberList. When comparing the second node, the key values of the new node C and the old node B are inconsistent. At this point, we will terminate the first round of traversal and enter the second round of traversal.

Because there may be moving nodes, the index of the node cannot find two corresponding nodes, and we need a way to find old and new node pairs. At this point, the node’s key comes into play. We mentioned earlier:

The developer can set the key attribute to tell the render which child elements can be saved in different renders.

React stores unprocessed oldFiber nodes into a Map with oldFiber as the key and oldFiber as the value.

//existingChildren is a node whose key is OldFiber is value's Map const existingChildren = mapRemainingChildren(returnFiber, oldFiber);Copy the code

It then continues through the newChildren sequence and queries whether there is an oldFiber node with the same key in existingChildren by the key of the newChildren node.

Next we need a second round of traversal to mark which nodes can be reused and which nodes need to be moved. At this point we need a reference index to determine whether the node needs to be moved.

We use oldIndex, the position index of oldFiber in oldFiber linked list corresponding to the right-most node in newFiberList, as the reference index and lastPlacedIndex as the reference index.

In the example analyzed above, lastPlacedIndex is the oldIndex of newFiberList’s right-most node A, which is 0, after the first round of traversal.

Then start the second round of traversal

NewChildren C newChildren C queries oldFiber c through existingChildren and finds that oldIndex of oldFiber C is 2, greater than lastPlacedIndex=0.

According to the definition of lastPlacedIndex, if the current oldIndex is greater than lastPlacedIndex, it means that oldFiber node C is located to the right of oldFiber node corresponding to the rightmost node of newFiberList. Meanwhile, since the newChildren sequence is traversed sequentially, the newFiber node generated by the current newChildren must be to the right of the newFiber node corresponding to the rightmost node of newFiberList. Therefore, the position of node C before and after the update does not occur.

At this point, since oldIndex > lastPlacedIndex on c, we update lastPlacedIndex to oldIndex=2 on C. Meanwhile, add newFiber generated by node C to newFiberList to continue traversal.

NewChildren node B newChildren node C queries oldFiber node C through existingChildren, and finds that oldIndex of oldFiber node C is 1, less than lastPlacedIndex=2. OldFiber node b is located to the left of oldFiber node corresponding to the right-most node of newFiberList. However, the newFiber node generated by newChildren B must be to the right of the newFiber node corresponding to the rightmost node of newFiberList. Therefore, the position of node B will change before and after the update.

At this point, since oldIndex of node B < lastPlacedIndex, lastPlacedIndex remains unchanged, and newFiber generated by node B is marked Placement, indicating that the position of this node will change before and after updating. Finally, add newFiber to newFiberList to continue the walk.

NewChildren d newChildren D queries oldFiber d through existingChildren, and finds that oldIndex of oldFiber D is 3, greater than lastPlacedIndex=3. The processing logic is the same as that of node C.

When diff is complete, we also get the complete newFiberList in this level as the fiber node of the workInProgress fiber tree in this level.

Let me look at another example

// New A->D->B->C ->CCopy the code

As you can see, node D does not move, and the position of the node changes only when it needs to move to the left (in this case, nodes B and C), so we need to minimize the need to move nodes from the back to the front.

reconcileChildrenArrayConcrete implementation of

Finally, let’s look at how reconcileChildrenArray implements this logic.

/* * returnFiber: parent of currentFirstChild * currentFirstChild: parent of currentFirstChild * newChildren: ReconcileChildrenArray (returnFiber: Fiber, currentFirstChild: reconcileChildrenArray) Fiber | null, newChildren: Array<*>, lanes: Lanes, ): Fiber | null { //.. // Select * from * * * * * * * * * * * * * * * * * * * * * In the above analysis, we use newFiberList to represent this sequence let resultingFirstChild: Fiber | null = null; / / previousNewFiber used to subsequent new generated workInProgress fiber connected to the let newFiberList previousNewFiber: fiber | null = null; Let oldFiber = currentFirstChild; // oldFiber = currentFirstChild; // Store newFiberList's index let lastPlacedIndex = 0; // store traversal to newChildren index let newIdx = 0; // Store the current oldFiber next node let nextOldFiber = null; // first pass through for (; oldFiber ! == null && newIdx < newChildren.length; newIdx++) { if (oldFiber.index > newIdx) { nextOldFiber = oldFiber; oldFiber = null; } else { nextOldFiber = oldFiber.sibling; } // Try to generate a new node, if the key is different, return null, the key is the same, and compare whether the type is the same. If the types are consistent, run useFiber(Update logic) CreateXXX (INSERT logic) const newFiber = updateSlot(returnFiber, oldFiber, newChildren[newIdx], lanes,); If (newFiber === NULL) {if (oldFiber === null) {oldFiber = nextOldFiber; } break; } if (shouldTrackSideEffects) {// shouldTrackSideEffects should be true if (oldFiber && newFiber. Alternate === Null) {// if newFiber. Alternate is null, oldFiber deleteChild(oldFiber, oldFiber); LastPlacedIndex = placeChild(newFiber, lastPlacedIndex, newIdx); // Add newFiber to newFiberList if (previousNewFiber === null) {resultingFirstChild = newFiber; } else { previousNewFiber.sibling = newFiber; } previousNewFiber = newFiber; oldFiber = nextOldFiber; } if (newIdx === newchildren.length) { Delete untraversed nodes in oldFiber chain deleteRemainingChildren(returnFiber, oldFiber); / /... // Diff return resultingFirstChild; } if (oldFiber === null) {// oldFiber is traversed, newChildren is added to newFiberList for (; newIdx < newChildren.length; newIdx++) { const newFiber = createChild(returnFiber, newChildren[newIdx], lanes); if (newFiber === null) { continue; } lastPlacedIndex = placeChild(newFiber, lastPlacedIndex, newIdx); if (previousNewFiber === null) { resultingFirstChild = newFiber; } else { previousNewFiber.sibling = newFiber; } previousNewFiber = newFiber; } / /... // Diff return resultingFirstChild; } // newChildren and oldFiber are not traversed, add the rest of oldFiber sequence to a map, In the second traversal process, the corresponding oldFiber const existingChildren = mapRemainingChildren(returnFiber, oldFiber) can be found by key value; // go through for (; newIdx < newChildren.length; newIdx++) { const newFiber = updateFromMap( existingChildren, returnFiber, newIdx, newChildren[newIdx], lanes, ); if (newFiber ! == null) { if (shouldTrackSideEffects) { if (newFiber.alternate ! Delete (newFiber. Key === null? NewIdx:) {// If newFiber is created by reuse, clean up the map corresponding old node existingChildren.delete(newFiber. newFiber.key, ); LastPlacedIndex = placeChild(newFiber, lastPlacedIndex, newIdx); // Add newFiber to newFiberList if (previousNewFiber === null) {resultingFirstChild = newFiber; } else { previousNewFiber.sibling = newFiber; } previousNewFiber = newFiber; } if (shouldTrackSideEffects) {// Delete oldFiber existingChildren.forEach(child => deleteChild(returnFiber, child)); } / /... // Diff return resultingFirstChild; Function placeChild(newFiber: Fiber, lastPlacedIndex: number, newIndex: number, ): number { newFiber.index = newIndex; / /... const current = newFiber.alternate; if (current ! == null) { const oldIndex = current.index; If (oldIndex < lastPlacedIndex) {/ / the node need to move newFiber. The flags | = Placement; return lastPlacedIndex; } else {// This object does not need to be moved, return oldIndex as lastPlacedIndex return oldIndex; }} else {/ / it is inserted into the logical newFiber flags | = Placement; return lastPlacedIndex; }}Copy the code

Explore the React source code series

React Fiber

React Diff