React data structures used in the React source code

React uses specific data structures to implement functions and its internal workings. The code samples are basically simplified source code, with only the key logic preserved and links to the source code attached. React version 17.0.2.

You need to know before you start

ReactElements: Objects returned by JSX that are recreated after each render and are called elements.
Fiber Nodes: Objects created based on an element. The information carried by the element is merged into Fiber. Unlike elements, Fiber is not recreated after each render. Based on the Fiber architecture, React implements seemingly magical features such as asynchronous interruptible rendering and scheduling multiple tasks with different priorities. More on these features later.
Update: There are only two ways to trigger updates in React: render root and call setState. In the case of setState, update is the update object created by calling setState to record the values to be updated and so on. This object has a lane attribute that is then merged into the fiber.lanes attribute.
When the algorithm is updated, we need to compare the differences between the old tree and the new tree to find the parts that need to be rendered. The source code uses the old Fiber tree to compare with the new element tree.
Lanes: The priority model, part of the Fiber architecture, that allows for scheduling tasks of different priorities. Lane is a property of an Update object and is the priority assigned to the Update object based on the “action type” that produced the update. For example, when a component calls setState in componentDidMount and Click, the lane generated by the click event has a higher priority than that generated by componentDidmMount. For example, when a component calls setState in componentDidMount, the Lane generated by the Click event has a higher priority than that generated by componentDidmMount. The goal is to complete updates generated by user input as quickly as possible. Update. lane is then merged into fiber. Lanes, representing a collection of all lanes that have not been updated on a fiber (more on that later). Each update object generated by a call to setState has a lane that represents the priority of the update. There are many different types of lane. If you are interested in more details, you can see this PR.
Root: indicates the Root fiber node, which contains lanes generated by all fiber nodes. For example, a Root node contains ABfiber. FiberA.

Linked lists and Fiber trees

Like an array, a linked list is a linear structure, but unlike an array, the elements in a linked list are not contiguous in memory. In addition to their own value, the elements in a linked list have a pointer attribute that points to the next element. One advantage of linked lists over traditional arrays is that you don’t need to move other elements when you add or remove them.

Fiber tree is a linked list. Each Fiber object is a linked list node with three Pointers: return, Sibling, and child, which point to the parent node, sibling node, and first child node respectively.

Why design Fiber as a linked list node? React implements asynchronous interruptible rendering based on the Fiber architecture: threads are periodically released (paused) to continue at the compromised pace, ensuring that rendering doesn’t block threads for too long. For example, in the process of rendering 10,000 elements, if you try to enter something in the input box or click on an element, the interface will not React at all because the thread is executing the React source code (update, create fiber, etc.). If you render the 100th element and give up the thread, and then get user input, since the thread is idle, you can respond to the input, feed back the input and continue the rest of the update from the 100th fiber that was interrupted.

Obviously, each render processes the entire path from the node that initiated the update (a deep walk through the tree). Recursion is easy to think of, but recursion can be broken, so there is no way to know the current progress of the task, and it is impossible to recover progress from the break. Andrew Clark explains the Fiber architecture

Therefore, using some data structure to replace the original recursive call stack and breaking down the operation of each node into small units is the first step to achieve interruptibility, and then the recovery progress can be further realized. Using a linked list to represent the unit set in the tree is obviously more intuitive than a multidimensional array.

WorkLoopConcurrent in the source code always uses the wIP pointer to traverse the Fiber tree, while the beginWork and completeUnitOfWork simulate recursion into the stack and off the stack, respectively. BeginWork returns the child, complete returns the sibling, or returns NULL and completes the parent. Using the linked list data structure, you can arbitrarily recover the unfinished work from the interrupt via the WIP pointer.

function workLoopConcurrent() {
  // Loop fiber tree, workInProgress represents processing fiber
  while(workInProgress ! = =null&&! shouldYield()) { performUnitOfWork(workInProgress); }}function performUnitOfWork(unitOfWork: Fiber) :void {
  // performUnitOfWork Performs different logic depending on the element type, such as calling function components, instantiating class components, updating context values, etc.
  const current = unitOfWork.alternate;
  let next;
  // Check for updates, initialization of elements of the corresponding type, reconciliation, etc.
  next = beginWork(current, unitOfWork, subtreeRenderLanes);

  unitOfWork.memoizedProps = unitOfWork.pendingProps;
  if (next === null) {
    // The path is finished (there are no children), and the stack logic is executed.
    completeUnitOfWork(unitOfWork);
  } else{ workInProgress = next; }}Copy the code

Stack and the Context

Stack A linear structure that follows LIFO.

The React stack is used a lot in the react source code. Here’s one that I think is typical: contextAPI application.

The React documentation introduces a feature of context: “A Provider can interact with multiple consumer components. Multiple providers can also be nested, with the inner layer overwriting the outer data.”

const MyContext = React.createContext(null);

const Content = () = > (
  <MyContext.Consumer>
    {value => `The ${value} layer.`}
  </MyContext.Consumer>
);

const Demo = () = > (
  <>
    <MyContext.Provider value="outer">
      <Content />{/* outer */}<MyContext.Provider value="inner">
        <Content />{/* Inner layer overwrites outer layer data, Content is inner */}</MyContext.Provider>
      <Content />{/* outer Content outer */}</MyContext.Provider>
  </>
);
Copy the code

Now that we know about this feature, let’s look at the problem. We said that the order of updating the component tree is deep traversal:

Outer Provider -> outer first Content -> inner Provier -> inner Content -> outer second ContentCopy the code

The problem is that you need to trace the value back to the outer Provider, whose value is outer, after processing a path for the inner Provider. An easy solution would be to associate the Consumer with the Provider instance so that it can get the correct value directly, but doing so would be costly for the Provider to find the corresponding Consumer.

ValueStack, valueCursor, and index stand for contextValue, contextValue value, contextValue index, contextValue index, contextValue index, contextValue index, contextValue index, contextValue index, contextValue index, contextValue index, contextValue index, contextValue index, contextValue index, contextValue index, contextValue index

All context instances use these three global variables.

When traversing the fiber tree, we call pushProvider when we encounter the Provider type Fiber: Add index, change pointer, store the value of the upper context (if there is another Provider) on the stack, update cursor to the value of the current context, Finally, update the value of the context instance.

So when you’re done with all the nodes of that path, that’s when you’re out of the stack, you execute popProvider, and in this case, you process the innerContext and then you go back and process the second Context, and popProvider does some recovery, Ensure that the third pushProvider gets the correct contextValue.

export function pushProvider(context) {
  index++;

  valueStack[index] = valueCursor.current;

  valueCursor.current = value;

  context._currentValue = nextValue;
}

export function popProvider(context) {
  // Cursor refers to the previous contextValue, depending on whether the Provider is nested to point to the upper-layer contextValue or is already set to NULL
  const currentValue = valueCursor.current;
  valueCursor.current = valueStack[index];
  // Remove the value of this context from the stack
  valueStack[index] = null;
  index--;
  // Update the value of the context instance
  context._currentValue = currentValue;
}
Copy the code

Using a stack to keep track of context changes is a lot like using a linked list to keep track of fiber trees, using data structures instead of a recursive call stack so that you can manually control the values in the stack.

Binary heap

A binary heap is a binary tree that satisfies the following two properties.

A complete binary tree in which every level of the tree has left and right child nodes (except for the leaves of the last level), and the leaves of the last level are left child nodes as much as possible. This property of the binary heap makes it ideal for array representation.
A binary heap is either the smallest heap or the largest heap. In the smallest heap, the node at the top of the heap is the smallest. All nodes are less than or equal to each of its children. Maximum heap is the opposite of minimum heap.

Both of the following heaps are legal:

            10                      10
         /      \               /       \  
       20        100          15         30  
      /                      /  \        /  \
    30                     40    50    100   40
Copy the code

Binary heaps can be represented as arrays:

The top element of the heap is at index 0
With I as its own index, (i-1) / 2 points to the parent node
(2 times I) + 1 refers to the left child node
2 times I plus 2 goes to the right child

The operation of the heap

SiftUp () : The upshift operation swaps the node and its parent until the parent is smaller than this value, with a time complexity of O(Logn). If the parent is not smaller than this value, no operation is needed, because the rest of the heap is sufficient.

function siftUp(heap, node, i) {
  let index = i;
  while (index > 0) {
    const parentIndex = (index - 1) > > >1;
    const parent = heap[parentIndex];
    if (compare(parent, node) > 0) {
      // The parent is larger. Swap positions.
      heap[parentIndex] = node;
      heap[index] = parent;
      index = parentIndex;
    } else {
      // The parent is smaller. Exit.
      return; }}}Copy the code

SiftDown () : A move down operation that swaps this node with its left and right children until the child is larger than it, in O(Logn) time.

function siftDown(heap, node, i) {
  let index = i;
  const length = heap.length;
  const halfLength = length >>> 1;
  while (index < halfLength) {
    const leftIndex = (index + 1) * 2 - 1;
    const left = heap[leftIndex];
    const rightIndex = leftIndex + 1;
    const right = heap[rightIndex];

    // If the left or right node is smaller, swap with the smaller of those.
    if (compare(left, node) < 0) {
      if (rightIndex < length && compare(right, left) < 0) {
        heap[index] = right;
        heap[rightIndex] = node;
        index = rightIndex;
      } else{ heap[index] = left; heap[leftIndex] = node; index = leftIndex; }}else if (rightIndex < length && compare(right, node) < 0) {
      heap[index] = right;
      heap[rightIndex] = node;
      index = rightIndex;
    } else {
      // Neither child is smaller. Exit.
      return; }}}Copy the code

Push () : Inserts values into the heap. The time complexity is O(Logn), because siftUp operations are performed on the newly inserted elements to maintain binary heap properties.

export function push(heap: Heap, node: Node) :void {
  const index = heap.length;
  heap.push(node);
  siftUp(heap, node, index);
}
Copy the code

Pop () : Remove the top element and place the last element on the top of the heap in O(Logn) time, because siftDown operation on the top element is required to maintain binary heap characteristics after removing the element.

export function pop(heap: Heap) :Node | null {
  if (heap.length === 0) {
    return null;
  }
  const first = heap[0];
  const last = heap.pop();
  if(last ! == first) { heap[0] = last;
    siftDown(heap, last, 0);
  }
  return first;
}
Copy the code

Scheduler

It is well known that setState has the behavior of batch update. In terms of implementation, setState does not update the update object immediately after it is generated, but stores the task to be executed in a task queue. Therefore, setState does not perform any update, but only records some information to be processed. Place the task to be executed for another update into a task queue.

What actually performs the task is the Schduler module, a single-responsibility scheduler designed as a code base and not coupled to React. The main functions of Scheduler include sorting tasks in the task queue, selecting tasks with the highest priority for asynchronous execution, and calling up unfinished tasks when the thread is idle (as shown in React, workLoop interrupts).

Will React to call the Scheduler scheduleCallback function, will be registered to the Scheduler taskQueue performConcurrentWorkOnRoot function, when the task should be performed after hand over to the Scheduler, React is no longer in charge.

function unstable_scheduleCallback(priorityLevel, callback, options) {
  var currentTime = getCurrentTime();

  var startTime = currentTime;

  var timeout;
  switch (priorityLevel) {
    case ImmediatePriority:
      timeout = IMMEDIATE_PRIORITY_TIMEOUT;
      break;
    case UserBlockingPriority:
      timeout = USER_BLOCKING_PRIORITY_TIMEOUT;
      break;
    case IdlePriority:
      timeout = IDLE_PRIORITY_TIMEOUT;
      break;
    case LowPriority:
      timeout = LOW_PRIORITY_TIMEOUT;
      break;
    case NormalPriority:
    default:
      timeout = NORMAL_PRIORITY_TIMEOUT;
      break;
  }
  If the task expires, the task will be placed at the top of the queue. The expiration time of each priority is different.
  var expirationTime = startTime + timeout;
  var newTask = {
    id: taskIdCounter++,
    callback,
    priorityLevel,
    startTime,
    expirationTime,
    sortIndex: -1};// sortIndex is set based on the expiration time, the smaller the priority (closest to the current time).
  newTask.sortIndex = expirationTime;
  // Operate binary heap
  push(taskQueue, newTask);
  Scheduler and React use the same name on Scheduler and React to try to empty the queue.
  requestHostCallback(flushWork);

  return newTask;
}
Copy the code

function workLoop(hasTimeRemaining, initialTime) {
  let currentTime = initialTime;
  currentTask = peek(taskQueue);
  while(currentTask ! = =null) {
    if( currentTask.expirationTime > currentTime && (! hasTimeRemaining || shouldYieldToHost()) ) {// If it hasn't expired, shouldYieldToHost is also called to let go of the thread
      break;
    }
    const callback = currentTask.callback;
    currentTask.callback = null;
    currentPriorityLevel = currentTask.priorityLevel;
    const didUserCallbackTimeout = currentTask.expirationTime <= currentTime;
    / / the React ` performConcurrentWorkOnRoot ` pass will accept a Scheduler () function checks whether the expired identification, if has expired, the React will launch non-interruptible update

    / / ` performConcurrentWorkOnRoot ` function according to whether the React will have to deal with all the fiber to the Scheduler or null return itself
    // Interrupts require notification to the Scheduler not to remove itself from the task queue and wait for an opportunity to resume the task
    const continuationCallback = callback(didUserCallbackTimeout);
    if (typeof continuationCallback === 'function') {
      // Set the task's callback as the function to return
      currentTask.callback = continuationCallback;
    } else {
      // The task is finished and removed from the queue
      if (currentTask === peek(taskQueue)) {
        pop(taskQueue);
      }
    }
    currentTask = peek(taskQueue);
  }
  if(currentTask ! = =null) {
    // Returns whether there are still tasks in the queue
    return true;
  } else {
    return false; }}Copy the code

Finally, scheduling multiple tasks with different priorities

Earlier we mentioned that a fiber. Lanes attribute can contain multiple lanes of different priorities, which is the most common case: a workLoop gives in (there is unfinished work) and a component generates a new update, in which case Fiber.

If a fiber has multiple lanes, React will need to pick out the highest-priority updates to execute. Now suppose the existing task is set to state as A and the new task is set to state as B, there will be two cases: the existing priority and the new priority:

If A task has A higher priority, continue to execute the task that will be set to state A, and the update portion with state B will be skipped, internally it actually works like this: During each update, existing lanes are compared with new unprocessed lanes(root.pendingLanes). These low-priority lanes(such as state updates) are skipped and wait for the next schedule. React does not merge the two setStates into the same batch. Instead, it executes the high-priority task first.

// Skip updated source code with a lower priority than base, as shown in the 'processUpdateQueue' function
if(! isSubsetOfLanes(renderLanes, updateLane)) {// Priority is insufficient. Skip this update. If this is the first
  // skipped update, the previous update/state is the new base
  // update/state.
  const clone: Update<State> = {
    eventTime: updateEventTime,
    lane: updateLane,

    tag: update.tag,
    payload: update.payload,
    callback: update.callback,

    next: null};if (newLastBaseUpdate === null) {
    // Skipped UPDATE objects are collected into the 'baseUpdate' queue, waiting for the next processing
    newFirstBaseUpdate = newLastBaseUpdate = clone;
    newBaseState = newState;
  } else {
    newLastBaseUpdate = newLastBaseUpdate.next = clone;
  }
  // Update the remaining priority in the queue.
  newLanes = mergeLanes(newLanes, updateLane);
}
Copy the code

New task priority is higher, need to cancel the existing tasks, at a higher priority on the Scheduler to register a new performConcurrentWorkOnRoot, this kind of situation can be understood as a high priority task to jump the queue.

// cancel the source code of low-priority tasks, which is one of the 'ensureRootIsScheduled' functions
const nextLanes = getNextLanes(
  root,
  root === workInProgressRoot ? workInProgressRootRenderLanes : NoLanes,
);
// We use the highest priority lane to represent the priority of the callback.
const newCallbackPriority = getHighestPriorityLane(nextLanes);

// Check if there's an existing task. We may be able to reuse it.
// Get the priority of an existing task
const existingCallbackPriority = root.callbackPriority;
if (existingCallbackPriority === newCallbackPriority) {
  // The priority hasn't changed. We can reuse the existing task. Exit.
  // The new and old tasks have the same priority, so the new update can be processed by reusing existing tasks
  return;
}
// 'getNextLanes' will filter out existing and unprocessed lanes with higher priorities, so existing lanes can only have lower or equal priorities
if(existingCallbackNode ! =null) {
  // Cancel the existing callback. We'll schedule a new one below.
  // Cancel the existing task and wait for the next schedule to complete it
  cancelCallback(existingCallbackNode);
}
Copy the code

React data structures used in the React source code

You need to know before you start

Linked lists and Fiber trees

Stack and the Context

Binary heap

The operation of the heap

Scheduler

Finally, scheduling multiple tasks with different priorities

Related Posts

Mpvue research

CSS Box Model

6 frequently used TypeScript features