What is the DOM?

The DOM is the browser's internal data structure for representing HTML. It connects Web pages to JavaScript scripts and filters out unsafe content.

Why do you need the DOM?

The browser engine cannot work directly with the byte stream of an HTML file, so the byte stream must be converted into an internal structure that the engine understands. That structure is the DOM.

How is the DOM used?

The DOM provides a structured representation of an HTML document and gives JavaScript an interface for scripting it. Through this interface, JavaScript can access the DOM and change the structure, style, and content of the document.

How is the DOM tree generated?

Converting the HTML byte stream into a DOM structure is the job of the HTML parser, a module within the browser's rendering engine. The parser takes the byte stream of HTML pages and resources retrieved from the network or local disk and turns it into a DOM tree:

  1. After receiving the response header, the network process determines the file type from the Content-Type field in the response header. For example, if the value of Content-Type is text/html, the browser treats the file as an HTML file.
  2. The browser then selects or creates a renderer process for the request.
  3. Once the renderer process is ready, a pipe is established to share data between the network process and the renderer process.
  4. The network process receives data and writes it into the pipe, while the renderer process reads data from the other end of the pipe and feeds it to the HTML parser.
  5. Converting the byte stream into the DOM takes three phases.
  6. In phase 1, the tokenizer converts the byte stream into tokens: tag tokens and text tokens. Tag tokens are further divided into StartTag and EndTag tokens.
  7. Phases 2 and 3 happen together: tokens are parsed into DOM nodes, and the DOM nodes are added to the DOM tree. The HTML parser maintains a token stack that is used to compute parent-child relationships between nodes. StartTag tokens generated in phase 1 are pushed onto this stack in order, and each EndTag token pops its matching StartTag token.

When the parser starts working, it creates an empty DOM structure rooted at document and pushes a StartTag document token to the bottom of the stack. The first StartTag html token produced by the tokenizer is then pushed onto the stack, and an html DOM node is created and added under document. As the tokenizer generates new tokens, they are pushed and popped in turn, and parsing continues until the tokenizer has consumed the entire byte stream.
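Phase 1 can be illustrated with a deliberately simplified tokenizer. This is a sketch, not how a real browser tokenizer works: the function name `tokenize` and the token object shape are assumptions made for illustration, and the regular expression ignores attributes, comments, doctype declarations, and the special rules for script and style content.

```javascript
// Minimal sketch of phase 1 (tokenization). Not spec-compliant.
function tokenize(html) {
  const tokens = [];
  // Matches an end tag, a start tag, or a run of text between tags.
  const pattern = /<\/([a-zA-Z][^\s>]*)\s*>|<([a-zA-Z][^\s/>]*)[^>]*>|([^<]+)/g;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    const [, endName, startName, text] = match;
    if (endName !== undefined) {
      tokens.push({ type: 'EndTag', name: endName.toLowerCase() });
    } else if (startName !== undefined) {
      tokens.push({ type: 'StartTag', name: startName.toLowerCase() });
    } else if (text.trim() !== '') {
      // Whitespace-only text between tags is skipped to keep the output short.
      tokens.push({ type: 'Text', value: text.trim() });
    }
  }
  return tokens;
}

const tokens = tokenize('<html><body><div>Have a meal</div></body></html>');
console.log(tokens.map((t) => `${t.type}:${t.name ?? t.value}`).join(' '));
```

The output lists three StartTag tokens, one Text token, and three EndTag tokens, in document order. These are the tokens the later phases consume.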

An example:


<html>
    <body>
        <div>Have a meal</div>
        <div>Go to bed</div>
    </body>
</html>

The parser handles this document as follows:

  1. The parser creates an empty DOM structure rooted at document and pushes the StartTag document token to the bottom of the stack.
  2. The tokenizer produces StartTag html. The parser creates an html DOM node, mounts it under document, and pushes the StartTag html token.
  3. The tokenizer produces StartTag body. The parser creates a body DOM node, mounts it under html, and pushes the StartTag body token.
  4. The tokenizer produces StartTag div. The parser creates a div DOM node, mounts it under body, and pushes the StartTag div token.
  5. The tokenizer produces the text token of the div. The parser creates a text node and mounts it under the div. A text token is not pushed onto the stack.
  6. The tokenizer produces EndTag div. The parser checks whether the token at the top of the stack is StartTag div and, if so, pops it.
  7. The tokenizer produces the second StartTag div. The parser creates a div DOM node, mounts it under body, and pushes the StartTag div token.
  8. The tokenizer produces the text token of the second div. The parser creates a text node and mounts it under the div. Again, the text token is not pushed onto the stack.
  9. The tokenizer produces EndTag div. The parser checks whether the token at the top of the stack is StartTag div and, if so, pops it.
  10. The tokenizer produces EndTag body. The parser checks whether the token at the top of the stack is StartTag body and, if so, pops it.
  11. The tokenizer produces EndTag html. The parser checks whether the token at the top of the stack is StartTag html and, if so, pops it.
  12. Finally, the document token is popped off the stack.
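
The walkthrough above can be sketched as a small stack-based tree builder. The names `exampleTokens` and `buildTree` and the plain-object node shape are assumptions for illustration. For brevity the stack holds the created nodes rather than the StartTag tokens themselves, and a mismatched end tag simply throws, whereas a real HTML parser recovers from such errors.

```javascript
// Tokens for the example document, hard-coded so the sketch is self-contained.
const exampleTokens = [
  { type: 'StartTag', name: 'html' },
  { type: 'StartTag', name: 'body' },
  { type: 'StartTag', name: 'div' },
  { type: 'Text', value: 'Have a meal' },
  { type: 'EndTag', name: 'div' },
  { type: 'StartTag', name: 'div' },
  { type: 'Text', value: 'Go to bed' },
  { type: 'EndTag', name: 'div' },
  { type: 'EndTag', name: 'body' },
  { type: 'EndTag', name: 'html' },
];

function buildTree(tokens) {
  const root = { name: 'document', children: [] };
  const stack = [root]; // the document entry sits at the bottom of the stack

  for (const token of tokens) {
    const top = stack[stack.length - 1];
    if (token.type === 'StartTag') {
      const node = { name: token.name, children: [] };
      top.children.push(node); // mount under the element on top of the stack
      stack.push(node);
    } else if (token.type === 'Text') {
      // Text nodes are mounted but never pushed onto the stack.
      top.children.push({ name: '#text', value: token.value });
    } else if (token.type === 'EndTag') {
      // Pop only when the end tag matches the top of the stack.
      if (top.name !== token.name) {
        throw new Error(`Unexpected </${token.name}>`);
      }
      stack.pop();
    }
  }
  stack.pop(); // finally the document entry leaves the stack
  return root;
}

const dom = buildTree(exampleTokens);
console.log(JSON.stringify(dom, null, 2));
```

Because the element on top of the stack is always the innermost open element, every new node can find its parent without searching the tree.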

If a JavaScript script is inserted between the two divs, the process changes slightly.

<html>
    <body>
        <div>Have a meal</div>
        <script>
            // Only the first div has been parsed when this script runs.
            let div1 = document.getElementsByTagName('div')[0]
            div1.innerText = 'Play games'
        </script>
        <script type="text/javascript" src="foo.js"></script>
        <div>Go to bed</div>
    </body>
</html>
  1. When the tokenizer reaches the first script tag, the HTML parser pauses and the JavaScript engine steps in to execute the script inside the tag.
  2. When the tokenizer reaches the second script tag, the JavaScript file must be downloaded first. Downloading a JavaScript file blocks DOM parsing, and the download often takes time that depends on the network environment, the size of the file, and so on. Several measures reduce the cost:
     • After the rendering engine receives the byte stream, it starts a preparsing thread that scans the HTML for JavaScript, CSS, and other referenced files and begins downloading them ahead of time.
     • Use a CDN to speed up loading of JavaScript files.
     • Reduce the size of JavaScript files.
     • If the JavaScript file contains no code that manipulates the DOM, load it asynchronously by marking the script tag with async or defer. A script marked async executes as soon as it has loaded. A script marked defer executes after the document has been parsed, just before the DOMContentLoaded event.
  3. After the script has executed, the HTML parser resumes and continues until the final DOM is generated.
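
The pause-and-resume behavior described above can be mimicked with a small simulation. This is only a sketch: `parseWithScripts` and the item shapes are hypothetical, a flat array stands in for the DOM tree, and the download delay of an external script is not modeled. It shows why the inline script can see the first div but not the second.

```javascript
// Each item is either an element to add to the DOM or a script that runs
// while the parser is paused.
function parseWithScripts(items) {
  const dom = []; // flat list standing in for the DOM tree
  const log = [];
  for (const item of items) {
    if (item.kind === 'element') {
      dom.push({ tag: item.tag, text: item.text });
      log.push(`parsed <${item.tag}>`);
    } else if (item.kind === 'script') {
      log.push('parser paused');
      item.run(dom); // the script sees only what has been parsed so far
      log.push('parser resumed');
    }
  }
  return { dom, log };
}

let divsSeenByScript = 0;
const result = parseWithScripts([
  { kind: 'element', tag: 'div', text: 'Have a meal' },
  {
    kind: 'script',
    run(dom) {
      const divs = dom.filter((node) => node.tag === 'div');
      divsSeenByScript = divs.length;
      divs[0].text = 'Play games'; // mirrors div1.innerText = 'Play games'
    },
  },
  { kind: 'element', tag: 'div', text: 'Go to bed' },
]);

console.log(divsSeenByScript); // 1
console.log(result.dom.map((node) => node.text)); // [ 'Play games', 'Go to bed' ]
```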

If the page references external CSS files or embeds CSS through a style tag, the rendering engine must also convert that content into the CSSOM. Because JavaScript can modify the CSSOM, the CSSOM must be ready before a script executes. As a result, CSS also blocks DOM generation in some cases.

For a page like the one below, the HTML preparser recognizes that a CSS file and a JavaScript file need to be downloaded and requests both at the same time. The two downloads overlap, so the total download time is determined by the slower of the two. Whichever file arrives first, the engine waits until the CSS file has been downloaded and the CSSOM generated, then executes the JavaScript script, and finally continues building the DOM.

<html>
    <head>
        <link href="theme.css" rel="stylesheet">
    </head>
    <body>
        <div>Have a meal</div>
        <script>
            // Only the first div has been parsed when this script runs.
            let div1 = document.getElementsByTagName('div')[0]
            div1.innerText = 'Play games'
        </script>
        <script type="text/javascript" src="foo.js"></script>
        <div>Go to bed</div>
    </body>
</html>
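
The timing rule for overlapping CSS and JavaScript downloads can be written out as simple arithmetic. The helper `scriptTimeline` and all of the numbers are hypothetical, chosen only to illustrate the rule. Times are in milliseconds, measured from the moment both download requests are issued.

```javascript
// Hypothetical helper modeling when an external script may start executing.
function scriptTimeline({ cssDownloadMs, jsDownloadMs, cssomBuildMs, jsExecMs }) {
  const cssomReady = cssDownloadMs + cssomBuildMs;
  // The script needs both its own bytes and a finished CSSOM.
  const scriptStart = Math.max(jsDownloadMs, cssomReady);
  const parsingResumes = scriptStart + jsExecMs;
  return { cssomReady, scriptStart, parsingResumes };
}

// The CSS file is slower: the script has already arrived but waits for the CSSOM.
const slowCss = scriptTimeline({ cssDownloadMs: 300, jsDownloadMs: 100, cssomBuildMs: 20, jsExecMs: 10 });
console.log(slowCss); // { cssomReady: 320, scriptStart: 320, parsingResumes: 330 }

// The JavaScript file is slower: the CSSOM is ready before the script arrives.
const slowJs = scriptTimeline({ cssDownloadMs: 100, jsDownloadMs: 300, cssomBuildMs: 20, jsExecMs: 10 });
console.log(slowJs); // { cssomReady: 120, scriptStart: 300, parsingResumes: 310 }
```

In both cases DOM construction resumes only after the slower of the two resources has been handled, which is why a slow stylesheet can delay parsing even when the script itself downloads quickly.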