How to implement a Virtual DOM algorithm? Issue #13 livoras/blog GitHub
- 1 introduction
- Thinking about front-end application state management
- 3. Virtual DOM algorithm
- 4 Algorithm Implementation
- 4.1 Step 1: Simulate the DOM tree with JS objects
- 4.2 Step 2: Compare the differences between two virtual DOM trees
- 4.3 Step 3: Apply the difference to a real DOM tree
- 5 conclusion
- 6 References
This article will teach you how to implement a basic Virtual DOM algorithm in 300~400 lines of code, and try to explain the idea of Virtual DOM algorithm as clearly as possible. I hope that after reading this article, you can have a deep understanding of Virtual DOM algorithm and provide some new thinking for your existing front-end programming.
The complete code implemented in this article is hosted on Github.
Suppose you need to write an application for a table like the one below, which can be displayed in ascending or descending order depending on the field.
The application seems so simple that you can think of several different ways to write it. The easiest thing to think of is to store data in your JavaScript code like this:
Var sortKey = "new" Var sortType = 1 var data = [{...}, {...}, {..},...] // Table dataCopy the code
Use three fields respectively to store the current sorted field, sorted direction, and table data; Then add click events to the table header: when the user clicks on a specific field, sort the content according to the contents stored in the above several fields, and then operate DOM with JS or jQuery to update the page’s sorting state (the arrows in the table header indicate the current sorting state, which also needs to be updated) and the table content.
As a result, as the application becomes more complex, more fields need to be maintained in JS, and more DOM operations need to be listened for and invoked to update the page in the event response, the application becomes very difficult to maintain. Later, people used the MVC, MVP architecture pattern, hoping to reduce the difficulty of maintaining such complex applications from the way the code is organized. But MVC doesn’t reduce the state that you maintain, it doesn’t reduce the state updates that you need to do to the page (which is DOM in the front end), you still need to do to the DOM, just in a different place.
Since the state changes to manipulate the corresponding DOM element, why not make something that binds the view to the state, so that the view changes automatically when the state changes, so that you don’t have to manually update the page? This is where people came up with the MVVM pattern, where once you declare in the template what state the view component is bound to, the bidirectional binding engine will automatically update the view when the state is updated (see this introduction for MV* mode).
MVVM can greatly reduce the complexity of maintaining state -> views (greatly reducing the view update logic in the code). But this is not the only way, there is a very intuitive way to greatly reduce view update operations: once the state changes, re-render the entire view with the template engine, and replace the old view with a new one. Just like the table above, when the user clicks, the state is still updated in JS, but instead of manually manipulating the DOM, the entire table is rendered by the template engine and the innerHTML is set.
When you hear about this practice, you will immediately realize that this practice can cause a lot of problems. The biggest problem is that this is very slow, because even a small state change reconstructs the entire DOM, which is very cost-effective; And by doing so, the input and textarea will lose focus. The final conclusion will be: There is no problem updating local small views. Backbone does this. However, for large views, such as global application state changes, it is not advisable to update many local views of the page.
But be aware and keep this in mind, because as you’ll see later, Virtual DOM actually does this, with special steps added to avoid changes to the entire DOM tree.
Another point to note is that each of the approaches presented above addresses the same problem: maintaining state and updating views. In a general application, a good solution to this problem reduces almost all of the complexity.
DOM is slow. If we print out all the attributes of a simple div element, you’ll see:
And that’s just the first layer. Real DOM elements are huge because that’s how the standard is designed. And you have to be very careful with them, the slightest touch can cause the page to be rearranged, which is a performance killer.
Native JavaScript objects are faster and simpler to process than DOM objects. The structure and attribute information of the DOM tree can be easily represented as JavaScript objects:
Var element = {tagName: 'ul', // node tag: props: {// DOM, props: {id: 'list'}, children: [// child of this node {tagName: 'li', props: {class: 'item'}, children: ["Item 1"]}, {tagName: 'li', props: {class: 'item'}, children: ["Item 2"]}, {tagName: 'li', props: {class: 'item'}, children: ["Item 3"]}, ] }Copy the code
The corresponding HTML is written as follows:
<ul id='list'>
<li class='item'>Item 1</li>
<li class='item'>Item 2</li>
<li class='item'>Item 3</li>
</ul>
Copy the code
Since the information in the DOM tree can be represented in JavaScript objects, you can, in turn, build a real DOM tree based on the tree structure represented in JavaScript objects.
As mentioned in the previous section, the way the state changes -> re-render the entire view can be slightly modified: use JavaScript objects to represent DOM information and structure, and re-render this JavaScript object structure when the state changes. Of course, this doesn’t really help, because the actual page doesn’t change.
However, you can use the newly rendered object tree to compare with the old tree to record the differences between the two trees. The difference recorded is that we need real DOM operations on the page, then apply them to the real DOM tree, and the page changes. That’s it: the structure of the view is indeed completely new, but it does change only in different ways when you finally manipulate the DOM.
This is known as the Virtual DOM algorithm. There are several steps:
- Using JavaScript object structure to represent the DOM tree structure; Then use this tree to build a real DOM tree and plug it into the document
- When the state changes, a new object tree is rebuilt. Then compare the new tree with the old tree and note the differences between the two trees
- Applying the differences recorded in 2 to the actual DOM tree built in Step 1, the view is updated
The Virtual DOM is essentially a cache between JS and DOM. The analogy is CPU and hard disk, since the hard disk is so slow, we put a cache between them: since DOM is so slow, we put a cache between them, JS and DOM. The CPU (JS) only operates on memory (the Virtual DOM) and writes changes to the hard disk (DOM) at the end.
Representing a DOM node in JavaScript is a simple matter. You just need to record its node type, attributes, and child nodes:
element.js
function Element (tagName, props, children) {
this.tagName = tagName
this.props = props
this.children = children
}
module.exports = function (tagName, props, children) {
return new Element(tagName, props, children)
}
Copy the code
For example, the DOM structure above can be expressed simply:
var el = require('./element')
var ul = el('ul', {id: 'list'}, [
el('li', {class: 'item'}, ['Item 1']),
el('li', {class: 'item'}, ['Item 2']),
el('li', {class: 'item'}, ['Item 3'])
])
Copy the code
Ul is now just a DOM structure represented by a JavaScript object that does not exist on the page. We can build true <ul> from this UL:
Prototype = function () {var el = document.createElement(this.tagName) // Build var props = function () {var el = document.createElement(this.tagname) This. Props for (var propName in props) {// Set the DOM attribute of the object var propValue = props[propName] el.setAttribute(propName, propValue) } var children = this.children || [] children.forEach(function (child) { var childEl = (child instanceof Element) ? Child.render () // If the child node is also a virtual DOM, recursively build the DOM node: Document.createtextnode (child) // If a string, build only the text node el.appendChild(childEl)}) return el}Copy the code
The Render method builds a real DOM node based on the tagName, sets its properties, and recursively builds its own child nodes. So you just need to:
var ulRoot = ul.render()
document.body.appendChild(ulRoot)
Copy the code
The ulRoot above is a real DOM node. Insert it into the document so that you have a real <ul> DOM structure inside the body:
<ul id='list'>
<li class='item'>Item 1</li>
<li class='item'>Item 2</li>
<li class='item'>Item 3</li>
</ul>
Copy the code
The full code is visible at element.js.
As you might expect, comparing the differences between two DOM trees is a core part of the Virtual DOM algorithm, which is also known as the Virtual DOM diff algorithm. The complete diff algorithm for two trees is an O(n^3) time complexity problem. But in the front end, you rarely move DOM elements across hierarchies. So the Virtual DOM only compares elements at the same level:
The upper div will only be compared to divs of the same level, and the second level will only be compared to the second level. So that’s order n.
In real code, a depth-first traverse of the old and new trees would be performed so that each node would have a unique tag:
In depth-first traversal, each time a node is traversed, the node is compared to the new tree. If there is a difference, it is recorded in an object.
// diff function, Function diff (oldTree, newTree) {var patches = 0 var patches = {} dfsWalk(oldTree, newTree); NewTree, index, patches) return patches} function dfsWalk (oldNode, newNode, index, Patches [index] = [...] {// For patches[index] = [...] Function diffChildren(oldNode. Children, newNode. Children, index, patches)} newChildren, index, patches) { var leftNode = null var currentNodeIndex = index oldChildren.forEach(function (child, I) {var newChild = newChildren[I] currentNodeIndex = (leftNode && leftNode.count) // Compute node id? currentNodeIndex + leftNode.count + 1 : CurrentNodeIndex + 1 dfsWalk(Child, newChild, currentNodeIndex, patches) leftNode = child})}Copy the code
For example, if the div above is different from the new div and the current tag is 0, then:
patches[0] = [{difference}, {difference}, ...] // Use an array to store the differences between old and new nodesCopy the code
Similarly, P is a patch [1], ul is a patch [3], and so on.
What are the differences in nodes mentioned above? Operations on the DOM might:
- Replace the original node, such as div with section
- Move, delete, and add child nodes, such as the div child node above, interchanging p and ul
- The attributes of the node are modified
- For text nodes, the text content may change. For example, change the content of text node 2 above to Virtual DOM 2.
So we define several types of differences:
var REPLACE = 0
var REORDER = 1
var PROPS = 2
var TEXT = 3
Copy the code
For node replacement, it’s simple. Check whether the tagnames and values of the old and new nodes are the same. If they are different, they need to be replaced. If div is replaced with section, write:
patches[0] = [{
type: REPALCE,
node: newNode // el('section', props, children)
}]
Copy the code
If you add container to a div, record:
patches[0] = [{
type: REPALCE,
node: newNode // el('section', props, children)
}, {
type: PROPS,
props: {
id: "container"
}
}]
Copy the code
If it is a text node, such as text node 2 above, record it:
patches[2] = [{
type: TEXT,
content: "Virtual DOM2"
}]
Copy the code
What if I reorder the children of my div? For example, the order of p, ul, div is changed to div, P, ul. How do you compare this? If you compare them in the same order, they all get replaced. If the tagName of p is different from that of div, p is replaced by div. Eventually, all three nodes are replaced, making DOM overhead very high. We don’t actually have to replace nodes, we just have to move nodes, we just have to know how to move them.
This involves a comparison algorithm for two lists, which needs to be discussed in a separate section.
Suppose each child node can now be uniquely identified alphabetically:
The old node order:
a b c d e f g h i
Copy the code
The node is now deleted, inserted, and moved. Add j node, delete E node, move H node:
New node order:
a b c h d f g i j
Copy the code
Now that you know the order of old and new, minimize insert and delete (move can be thought of as a combination of delete and insert). Abstract, this problem is actually Edition Distance of string. The most common solution algorithm is Levenshtein Distance, which is solved by dynamic programming and the time complexity is O(M * N). But we don’t really need to achieve the minimum operation, we just need to optimize some of the more common movement cases, sacrifice some DOM operations, so that the algorithm time complexity is linear (O(Max (M, N)). Specific algorithm details are more, here is not tired, interested can refer to the code.
If we can get the operations of the children of a parent node, we can record them:
patches[0] = [{
type: REORDER,
moves: [{remove or insert}, {remove or insert}, ...]
}]
Copy the code
Note, however, that since the tagName is repeatable, you cannot use this for comparison. Therefore, you need to add a unique identification key to the child node. When comparing lists, you can use the key for comparison, so that the nodes in the old DOM tree can be reused.
In this way, we can record the differences of each node by going depth-first through the two trees and comparing the nodes at each level. The complete diff algorithm code can be seen in diff.js.
Because the JavaScript object tree constructed in step 1 has the same information and structure as the actual DOM tree rendered. Therefore, we can also perform depth-first traversal of the DOM tree. During the traversal, we can find the differences of nodes currently traversed from the patches object generated in step 2, and then perform DOM operations.
function patch (node, patches) { var walker = {index: 0} dfsWalk(node, walker, patches) } function dfsWalk (node, walker, Patches) {var currentPatches = patches[walker.index] var len = node.childNodes? node.childNodes.length : 0 for (var i = 0; i < len; I ++) {var child = node.childnodes [I] walker. Index++ dfsWalk(child, walker, Patches)} if (currentPatches) {applyPatches(node, currentPatches) // Perform DOM manipulation on the current node}}Copy the code
ApplyPatches perform DOM operations on the current node based on the differences of different types:
function applyPatches (node, currentPatches) {
currentPatches.forEach(function (currentPatch) {
switch (currentPatch.type) {
case REPLACE:
node.parentNode.replaceChild(currentPatch.node.render(), node)
break
case REORDER:
reorderChildren(node, currentPatch.moves)
break
case PROPS:
setProps(node, currentPatch.props)
break
case TEXT:
node.textContent = currentPatch.content
break
default:
throw new Error('Unknown patch type ' + currentPatch.type)
}
})
}
Copy the code
The complete code can be seen in patch.js.
Virtual DOM algorithm is mainly to achieve the above steps of the three functions: Element, diff, patch. Then you can actually use it:
/ / 1. Build a virtual DOM tree = el var (' div '{' id' : 'container'}, [el (' h1, {style: 'color: blue'}, ['simple virtal dom']), el('p', ['Hello, virtual-dom']), el('ul', [el('li')]) ]) // 2. Through the virtual DOM to build real DOM tree. The var root = render () document. The body. The appendChild (root) / / 3. Create a new virtual DOM var newTree = el('div', {'id': 'container'}, [el('h1', {style: 'color: red'}, ['simple virtal dom']), el('p', ['Hello, virtual-dom']), el('ul', [el('li'), el('li')]) ]) // 4. Var patches = Diff (tree, newTree) // 2. Apply altered patches (root, patches) to real DOM elementsCopy the code
Of course, this is a very crude practice, and in practice you have to deal with event listening and so on; JSX syntax can also be added when generating the virtual DOM. With all this done, you can build a simple ReactJS.
The complete code implemented in this article is stored on Github for study only.