Diff algorithm what is Diff

A DOM node may have up to four nodes associated with it at any one time:

  1. DOMThe node itself.
  2. current Fiber. If theDOMThe nodes are already rendered in the page, socurrent FiberOn behalf of theDOMNode correspondingFiberNode.
  3. workInProgress Fiber. If theDOMNodes will be rendered to pages in this update,workInProgress FiberOn behalf of theDOMNode correspondingFiberNode.
  4. jsxObject. That isclassThe component’srenderMethods orfunctionThe result returned by the component is of typeReactElement.

The Diff algorithm compares 2 and 4 and produces 3.

The React Diff algorithm differs from the common Diff algorithm

The ordinary Diff algorithm compares the two trees completely. Even in the more advanced algorithm, the time complexity of the algorithm is O(n³), where n is the number of elements in the tree. If you use this algorithm, the number of calculations that you need to perform for 1,000 elements is 1,000 cubed, which is on the order of a billion. The amount of calculation is terrifying.

To reduce algorithm complexity, the React diff presets three limits:

  1. Apply only to sibling elementsDiff. If aDOMThe node has crossed the past in two updates, thenReactNo attempt will be made to reuse him.
  2. Two different types of elements produce different trees. If the element is controlled bydivbecomep.ReactWill be destroyeddivAnd its descendant nodes, and create a newpAnd its descendants.
  3. Developers can usekeyThis property indicates which child elements are stable under different renders.

To explain the third point:

/ / before update
<div>
  <p key="key1">p</p>
  <h3 key="key2">h3</h3>
</div>

/ / updated
<div>
  <h3 key="key2">h3</h3>
  <p key="key1">p</p>
</div>
Copy the code

If you do not add key then p changes to h3 according to article 2, then the node will be destroyed and new.

But now that the key has been added, and it still exists after the update, the node can be reused, just by switching the order.

The time complexity of Diff algorithm can be increased to O(n) after this series of processing.

The source code

As we said earlier in reconciling, the Diff algorithm is used in the reconciling phase, when the beginWork reconcileChildFibers are called.

// path: packages/react-reconciler/src/ReactChildFiber.new.js
// Select different diff functions to handle according to the newChild type
function reconcileChildFibers(
  returnFiber: Fiber,
  currentFirstChild: Fiber | null,
  newChild: any,
  lanes: Lanes,
) :Fiber | null {
  const isUnkeyedTopLevelFragment =
    typeof newChild === 'object'&& newChild ! = =null &&
    newChild.type === REACT_FRAGMENT_TYPE &&
    newChild.key === null;
  if (isUnkeyedTopLevelFragment) {
    newChild = newChild.props.children;
  }

  // Handle the object type
  if (typeof newChild === 'object'&& newChild ! = =null) {
    switch (newChild.$$typeof) {
      case REACT_ELEMENT_TYPE:
        return placeSingleChild(
          reconcileSingleElement(
            returnFiber,
            currentFirstChild,
            newChild,
            lanes,
          ),
        );
      case REACT_PORTAL_TYPE:
        return placeSingleChild(
          reconcileSinglePortal(
            returnFiber,
            currentFirstChild,
            newChild,
            lanes,
          ),
        );
      case REACT_LAZY_TYPE:
        if (enableLazyElements) {
          const payload = newChild._payload;
          const init = newChild._init;
          returnreconcileChildFibers( returnFiber, currentFirstChild, init(payload), lanes, ); }}// Array type
    if (isArray(newChild)) {
      return reconcileChildrenArray(
        returnFiber,
        currentFirstChild,
        newChild,
        lanes,
      );
    }

    if (getIteratorFn(newChild)) {
      return reconcileChildrenIterator(
        returnFiber,
        currentFirstChild,
        newChild,
        lanes,
      );
    }

    throwOnInvalidObjectType(returnFiber, newChild);
  }

  // Handle string or numeric types
  if (typeof newChild === 'string' || typeof newChild === 'number') {
    return placeSingleChild(
      reconcileSingleTextNode(
        returnFiber,
        currentFirstChild,
        ' ' + newChild,
        lanes,
      ),
    );
  }

  // Omit some code

  // Delete the node
  return deleteRemainingChildren(returnFiber, currentFirstChild);
}
Copy the code

The function can be summarized as follows:

  1. newChildobjectType to distinguish is notArrayType,ArrayType indicates that the sibling contains multiple nodes, while others indicate that the sibling has only one node.
  2. newChildIf it is a string or a number, it indicates that there is only one node at the same level.

There are some differences in how one node and multiple nodes are handled at the same level. This is also the problem that Diff algorithm deals with.

The new node is the single-node Diff

Take the reconcileSingleElement of the Object for example:

// path: packages/react-reconciler/src/ReactChildFiber.new.js
function reconcileSingleElement(
  returnFiber: Fiber,
  currentFirstChild: Fiber | null,
  element: ReactElement,
  lanes: Lanes,
) :Fiber {
  const key = element.key;
  let child = currentFirstChild;
  // Check whether the corresponding DOM node exists
  while(child ! = =null) {
    // There was a DOM node in the last update
    // Compare keys to see if they are the same
    if (child.key === key) {
      // If the key is the same and the type is the same, return the reusable fiber
      const elementType = element.type;
      if (elementType === REACT_FRAGMENT_TYPE) {
        if (child.tag === Fragment) {
          // Mark the sibling of this fiber as deleted
          deleteRemainingChildren(returnFiber, child.sibling);
          const existing = useFiber(child, element.props.children);
          existing.return = returnFiber;
          // Omit some code
          returnexisting; }}else {
        if (
          child.elementType === elementType ||
          (__DEV__
            ? isCompatibleFamilyForHotReloading(child, element)
            : false) ||
          (enableLazyElements &&
            typeof elementType === 'object'&& elementType ! = =null &&
            elementType.$$typeof === REACT_LAZY_TYPE &&
            resolveLazy(elementType) === child.type)
        ) {
          // Mark the sibling of this fiber as deleted
          deleteRemainingChildren(returnFiber, child.sibling);
          const existing = useFiber(child, element.props);
          existing.ref = coerceRef(returnFiber, child, element);
          existing.return = returnFiber;
          // Omit some code
          returnexisting; }}// Same key but different type
      // Mark this fiber and its sibling fiber as deleted
      deleteRemainingChildren(returnFiber, child);
      break;
    } else {
      // If the key is different, mark the fiber as deleted
      deleteChild(returnFiber, child);
    }
    // Switches the last node to a sibling node
    child = child.sibling;
  }

  // No corresponding DOM node exists, create a new fiber node and return
  if (element.type === REACT_FRAGMENT_TYPE) {
    const created = createFiberFromFragment(
      element.props.children,
      returnFiber.mode,
      lanes,
      element.key,
    );
    created.return = returnFiber;
    return created;
  } else {
    const created = createFiberFromElement(element, returnFiber.mode, lanes);
    created.ref = coerceRef(returnFiber, currentFirstChild, element);
    created.return = returnFiber;
    returncreated; }}Copy the code

Example:

/ / to be updated
<ul>
  <li key="0"></li>
  <li key="1"></li>
  <li key="2"></li>
</ul>

/ / updated
<ul>
  <p key="1"></p>
</ul>
Copy the code

After the update, there is only one P brother, which belongs to the single-node Diff and will go to the above code logic.

The fibers corresponding to the previous three Li are iterated over in the reconcileSingleElement function.

If the key is different, the fiber will be deleted, and its brother fiber that has not been traversed will be processed.

When the key is the same, it will continue to determine whether the type is the same. In this case, this logic will be triggered when the node whose key=”1″ is traversed. In this case, the node and its sibling nodes will be deleted, so that the sibling nodes will be deleted even if they are not traversed.

The new node Diff for multiple nodes

Multi-node Diff goes to if (isArray(newChild)) {… } this logic.

// path: packages/react-reconciler/src/ReactChildFiber.new.js
function reconcileChildrenArray(
  returnFiber: Fiber,
  currentFirstChild: Fiber | null,
  newChildren: Array<*>,
  lanes: Lanes,
) :Fiber | null {
  // Omit some code

  let resultingFirstChild: Fiber | null = null;
  let previousNewFiber: Fiber | null = null;

  let oldFiber = currentFirstChild;
  let lastPlacedIndex = 0;
  let newIdx = 0;
  let nextOldFiber = null;
  // The first round of traversal
  for(; oldFiber ! = =null && newIdx < newChildren.length; newIdx++) {
    if (oldFiber.index > newIdx) {
      nextOldFiber = oldFiber;
      oldFiber = null;
    } else {
      nextOldFiber = oldFiber.sibling;
    }
    const newFiber = updateSlot(
      returnFiber,
      oldFiber,
      newChildren[newIdx],
      lanes,
    );
    if (newFiber === null) {
      if (oldFiber === null) {
        oldFiber = nextOldFiber;
      }
      break;
    }
    if (shouldTrackSideEffects) {
      if (oldFiber && newFiber.alternate === null) {
        deleteChild(returnFiber, oldFiber);
      }
    }
    lastPlacedIndex = placeChild(newFiber, lastPlacedIndex, newIdx);
    if (previousNewFiber === null) {
      resultingFirstChild = newFiber;
    } else {
      previousNewFiber.sibling = newFiber;
    }
    previousNewFiber = newFiber;
    oldFiber = nextOldFiber;
  }

  // newChildren is traversed, oldFiber is not traversed
  if (newIdx === newChildren.length) {
    deleteRemainingChildren(returnFiber, oldFiber);
    return resultingFirstChild;
  }

  // newChildren is not traversed, oldFiber is traversed
  if (oldFiber === null) {
    for (; newIdx < newChildren.length; newIdx++) {
      constnewFiber = createChild(returnFiber, newChildren[newIdx], lanes); 'if (newFiber === null) {
        continue;
      }
      lastPlacedIndex = placeChild(newFiber, lastPlacedIndex, newIdx);
      if (previousNewFiber === null) {
        resultingFirstChild = newFiber;
      } else {
        previousNewFiber.sibling = newFiber;
      }
      previousNewFiber = newFiber;
    }
    return resultingFirstChild;
  }

  // The second iteration: newChildren and oldFiber are not completed
  const existingChildren = mapRemainingChildren(returnFiber, oldFiber);

  for (; newIdx < newChildren.length; newIdx++) {
    const newFiber = updateFromMap(
      existingChildren,
      returnFiber,
      newIdx,
      newChildren[newIdx],
      lanes,
    );
    if(newFiber ! = =null) {
      if (shouldTrackSideEffects) {
        if(newFiber.alternate ! = =null) {
          existingChildren.delete(
            newFiber.key === null ? newIdx : newFiber.key,
          );
        }
      }
      lastPlacedIndex = placeChild(newFiber, lastPlacedIndex, newIdx);
      if (previousNewFiber === null) {
        resultingFirstChild = newFiber;
      } else{ previousNewFiber.sibling = newFiber; } previousNewFiber = newFiber; }}if (shouldTrackSideEffects) {
    existingChildren.forEach(child= > deleteChild(returnFiber, child));
  }

  return resultingFirstChild;
}
Copy the code

Diff is all about finding the differences between old and new nodes, so there are three operations for nodes: add, delete, and update. Updates occur the most frequently of the three operations. So the React Diff algorithm splits traversal into two rounds, the first for updates and the second for non-update operations.

First iteration

  1. First traversalnewChildrenAnd compare thenewChildrenThe first child of andoldFiberThrough thekeytype(updateSlotFunction to determine whether it can be reused.
  2. If you can reuse it thennewIdx++Continue the comparisonnewChildrenThe next child of andoldFiber.sibling, can be reused to continue traversal.
  3. If it cannot be reused, it can be divided into two situations: ①keyDifferent, executebreakImmediately jumped outforLoop, the first round of traversal ends; 2.keyThe sametypeDifferent, executedeleteChildThat will beoldFiberMarked asDELETIONAnd continue traversing.
  4. ifnewChildrenI’m going through it, oroldFiberAfter traversal, the loop breaks out and the first round of traversal ends.

If the traversal is displayed in Step 3, newChildren and oldFiber are not traversal either.

If the traversal in Step 4 is displayed, either newChildren or oldFiber may be traversed, or both may be traversed.

/*** step 3 Jump out ***/
// Before the update
<li key="0">0</li>
<li key="1">1</li>
<li key="2">2</li>            
// After the update
<li key="0">0</li>
<li key="2">1</li>
<li key="1">2</li>

/*** step 4 Jump out ***/
// Before the update
<li key="0" className="a">0</li>
<li key="1" className="b">1</li>
            
// Update case 1 -- both newChildren and oldFiber are traversed
<li key="0" className="aa">0</li>
<li key="1" className="bb">1</li>
            
// Update 2 -- newChildren is not iterated, oldFiber is iterated
// newChildren leave key==="2" untraversed
<li key="0" className="aa">0</li>
<li key="1" className="bb">1</li>
<li key="2" className="cc">2</li>
            
// Update to case 3 -- newChildren traversed, oldFiber traversed
// oldFiber key==="1
<li key="0" className="aa">0</li>
Copy the code

After the first walk, start the second walk.

Second round of traversal

The second round of traversal is divided into four cases according to the results of the first round of traversal.

NewChildren and oldFiber are traversed at the same time

The first iteration completes the component update (which is done by the updateSlot function), and the Diff ends.

NewChildren is not traversed, oldFiber is traversed

The existing DOM nodes are reused, and there are new nodes added, which means that new nodes are inserted in this update. We just need to traverse the rest of the newChildren to mark Placement in turn for the generated workInProgress fiber (placeChild function complete).

NewChildren is traversed, oldFiber is not traversed

This means that there are fewer nodes than the previous update, and some nodes have been deleted. Therefore, it is necessary to traverse the remaining oldFiber and mark Deletion in turn.

Neither newChildren nor oldFiber has been traversed

This means that some nodes have changed positions during this update. This is the essence of the Diff algorithm and the hardest part of it.

Steps:

  1. mapRemainingChildrenFunction will beoldFiberTo generate aMapexistingChildren, using nodeskeyAttributes asMapkey.MapvalueoldFiber.
  2. So let’s go over the restoldFiberThrough theexistingChildrenIn thekeyfindkeyThe sameoldFiber(byupdateFromMapFunction complete.
  3. lastPlacedIndexFor the last reusable node inoldFiberThe position of, witholdIndexsaidoldFiberindexProperty, which is the location index, compareslastPlacedIndexoldIndexThe size of theoldIndex >= lastPlacedIndexThen the node does not need to be moved and willoldIndexAssigned tolastPlacedIndex.oldIndex > lastPlacedIndexThe node needs to be moved.

In summary, we should minimize the need to move nodes from the back to the front, which is not performance-friendly.

For example, abcd -> acdb moves b directly behind; Abcd -> dabc moves ABC to the end.