Preface

When we explore how browsers work, we tend to run a series of examples and infer the underlying design from them. For a long time I explored the browser black box this way, but that only verified the examples themselves rather than establishing what the browser actually does. In pursuit of the truth (and to clear up misinformation), I decided to write this series of articles and explore the rules browsers follow from the very beginning.

The HTML Standard is the HTML specification, maintained today by the WHATWG (the W3C previously published its own version). The core of the browsers we use, such as Blink in Chrome (a fork of WebKit), is an implementation of the HTML specification. All page behavior (excluding browser UI such as bookmarks) is described by the HTML specification and implemented by the browser engine. Note that the JavaScript specification (ECMAScript) is maintained by Ecma International, not by the W3C. On the scripting side, the HTML specification defines the objects exposed to JavaScript, such as document and window, commonly known as the DOM and BOM. We will explore how the HTML specification defines this behavior.

What you can learn by reading this article

  • A rough look at each stage of the HTML Parser and what it does.
  • The relationship between CSS/JavaScript resource loading and the HTML Parser.
  • When events such as DOMContentLoaded, window.onload, and readystatechange fire.

Some points to note:

  • This article uses Chrome as the test browser
  • References are the HTML Standard and the Chromium design documents

Because of the above, the examples may behave differently in browsers that use a different engine.

How does the browser parse HTML

To be sure, browsers can parse not only HTML but also XML documents, most image formats, PDF files, and so on; for now, let’s focus on how HTML is parsed.

DOM and HTML Parser

Before we dive into how browsers parse HTML, let’s define the DOM:

  • An abstract representation of the page used inside the browser (the browser draws the page from a render tree built from the DOM).
  • The interface exposed to JavaScript for manipulating the page.

In this article, the process of generating the DOM from HTML text is called parsing (the work of the HTML Parser), and the process of generating the layout tree from the DOM tree plus CSS is called rendering.

The DOM is to the browser what the Virtual DOM is to the front-end developer. The DOM is not the real view (the interface we see), but the browser can compute the underlying drawing instructions from the DOM tree and CSS.

The HTML Parser produces the DOM, not the actual interface, and changes to the DOM do not immediately cause drawing (similar to how calling setState in React does not update the view immediately).

So when the HTML Parser is blocked, it means that no new nodes (e.g. HTMLElement) are being created; it does not necessarily mean that the browser cannot draw the page. (Why couldn’t it draw the parts of the DOM and CSS that already exist?)

Let’s take the mystery out of the browser step by step.

The HTML Parser execution process

When we open a web page, the browser makes a request and eventually presents the result of that request to the user. As developers, our concerns are: what does the browser prepare during this process, and how is the response handled? With these two questions in mind, we move on.

So how does the browser do this? When a browser first requests a resource, it initializes a separate context, called the Browsing Context, which contains:

  • A Document without any Element (the type of the document cannot even be known until the response arrives)
  • A window object corresponding to the document
  • A JavaScript runtime environment that holds the results of running scripts
  • A global this bound to window.

When the requested resource is an HTML file (the browser identifies it by Content-Type), the browser creates an HTML Parser for the current Document and passes the response to it for parsing. This is the parsing process specified in the HTML Standard:

Byte Stream Decoder & Input Stream Preprocessor

We get the byte stream from the file system or an HTTP response. The browser then tries to decode it, selecting a decoder according to the following:

  • The character encoding given in the HTTP Content-Type header field.
  • <meta charset="utf-8" />
  • If neither is present, the browser uses the encoding sniffing algorithm to decide the character encoding
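The precedence in the list above can be sketched as a small function. This is a simplified model of the three steps as this article lists them, not the full algorithm in the specification; `chooseEncoding`, its parameters, and the `sniff` callback (with its placeholder return value) are hypothetical names introduced only for illustration.

```javascript
// Simplified model of decoder selection, following the three steps above.
// Every name here is an illustrative assumption, not a browser API.
function chooseEncoding({
  contentType = '',
  metaCharset = '',
  sniff = () => 'windows-1252', // placeholder result for the sniffing step
} = {}) {
  // 1. charset parameter of the HTTP Content-Type header,
  //    e.g. "text/html; charset=utf-8"
  const fromHeader = /charset\s*=\s*"?([^\s;"]+)/i.exec(contentType);
  if (fromHeader) return fromHeader[1].toLowerCase();

  // 2. value of <meta charset="..."> found in the document
  if (metaCharset.trim() !== '') return metaCharset.trim().toLowerCase();

  // 3. neither is present: fall back to sniffing (stubbed out here)
  return sniff();
}
```

For example, `chooseEncoding({ contentType: 'text/html; charset=UTF-8', metaCharset: 'gbk' })` picks the header value, because the header is consulted before the meta tag in this model.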

In addition, the specification recommends writing HTML documents in an ASCII-compatible encoding such as UTF-8 (which VS Code uses by default when saving documents), because the browser relies on ASCII-compatible bytes to detect the meta tag and obtain the document’s encoding.

A document parsed with the wrong encoding:

At this point the browser can finally decode the byte stream properly, but before the characters are processed to build the DOM there is one more preprocessing step, the Input Stream Preprocessor, which simply normalizes newlines. For example, Windows uses CRLF as its newline while Unix-like systems use LF.
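As a minimal sketch of that normalization step, the function below rewrites every CRLF pair and every lone CR to LF. The name `normalizeNewlines` is hypothetical, and the real preprocessor works on a stream rather than on a complete string.

```javascript
// Sketch of newline normalization: CRLF and lone CR both become LF.
function normalizeNewlines(input) {
  // "\r\n?" matches CR optionally followed by LF, so a CRLF pair is
  // replaced once instead of producing two newlines.
  return input.replace(/\r\n?/g, '\n');
}
```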

In addition, the Script Execution step flows back into the Input Stream Preprocessor through calls to document.write (which is one reason script execution blocks parsing). During script execution, the content passed to document.write is injected into the input stream and becomes the next thing to be parsed. Let’s look at the result:


      
<!DOCTYPE html>
<html lang="en">
<body>
  <p>
    parser first
  </p>
  <script>
    // The written markup is parsed right after this script element
    document.write('<p>parser second</p>')
  </script>
  <p>
    parser third
  </p>
</body>
</html>

You can see that the output of the document.write call appears before the element that follows the script.
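The ordering can be modelled without a browser. The sketch below treats the input stream as an array of chunks and a script as a function that receives a `write` callback; whatever it writes is spliced in at the insertion point, directly after the script. `parseWithDocumentWrite` and the chunk format are assumptions made for this illustration, not how an engine stores its input stream.

```javascript
// Toy model of document.write: written text is inserted into the input
// stream right after the running script, so it is parsed next.
function parseWithDocumentWrite(chunks) {
  const stream = [...chunks];
  const output = [];
  for (let position = 0; position < stream.length; position++) {
    const chunk = stream[position];
    if (typeof chunk === 'function') {
      // The insertion point sits immediately after the current script.
      let insertionPoint = position + 1;
      const write = (text) => {
        stream.splice(insertionPoint, 0, text);
        insertionPoint += 1; // later writes go after earlier ones
      };
      chunk(write); // "execute" the script; parsing waits until it returns
    } else {
      output.push(chunk);
    }
  }
  return output;
}

const parsed = parseWithDocumentWrite([
  '<p>parser first</p>',
  (write) => write('<p>parser second</p>'),
  '<p>parser third</p>',
]);
console.log(parsed.join('\n'));
```

In this model the three paragraphs come out in the order first, second, third, matching the result described above.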

Tokenizer & Tree Construction

Tokenizer is a common term in the compiler world; it refers to tokenization, that is, splitting the input into tokens. Let’s consider how many kinds of content an HTML document contains:

  • Comments
  • HTML tags
  • Text content to be displayed
  • Inline style and script code
  • HTML character references such as &nbsp;
  • And more that I haven’t thought of!

We receive one long, unstructured string. To make the HTML easier to parse, we need to cut this string into a series of substrings, label them, and pass them to Tree Construction one by one. That is what the Tokenizer does.
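To make "cut the string into labelled substrings" concrete, here is a deliberately tiny regex-based tokenizer. It is only a sketch: the tokenizer in the specification is a state machine with many more states, and this version ignores attributes, character references, and script or style content. The function name `tokenize` and the token shapes are assumptions made for this example.

```javascript
// Toy tokenizer: labels comments, end tags, start tags, and text.
function tokenize(html) {
  const tokens = [];
  // Alternatives: 1 comment | 2 end tag | 3 start tag (4 attributes) | 5 text
  const pattern =
    /<!--([\s\S]*?)-->|<\/([a-zA-Z][^\s>]*)\s*>|<([a-zA-Z][^\s/>]*)([^>]*)>|([^<]+)/g;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    const [, comment, endTag, startTag, , text] = match;
    if (comment !== undefined) {
      tokens.push({ type: 'comment', data: comment.trim() });
    } else if (endTag !== undefined) {
      tokens.push({ type: 'endTag', name: endTag.toLowerCase() });
    } else if (startTag !== undefined) {
      tokens.push({ type: 'startTag', name: startTag.toLowerCase() });
    } else if (text.trim() !== '') {
      tokens.push({ type: 'text', data: text.trim() }); // skip whitespace-only text
    }
  }
  return tokens;
}

console.log(tokenize('<p>hi</p><!-- note -->'));
```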

Tree Construction has a series of insertion modes that ensure each node is inserted in an appropriate position. If a node appears in an illegal position, it causes a parse error. A parse error does not necessarily cause the Parser to terminate; the specification defines a number of error-correcting mechanisms.

<!-- A series of insertion modes ensures that parsing HTML into the DOM produces the following structure -->
<!DOCTYPE html>
<html lang="en">
<head></head>
<body></body>
</html>


      
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- A node type that is not allowed in head. The error-correcting mechanism closes head implicitly and moves this node into body, so document.querySelector('head #invalid') will be null -->
  <span id="invalid">2</span>
</head>
<body></body>
</html>

Tree Construction eventually produces nodes and inserts them into the DOM, which can then be manipulated with JavaScript.
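A rough way to picture tree construction with error recovery is a stack of open elements. The sketch below is an assumption-laden toy (the names `buildTree`, `openElements`, and the token shapes are invented here, and there are no insertion modes or void elements): start tags push a node, end tags pop back to the matching open element, and a stray end tag is recorded as an error and ignored instead of aborting.

```javascript
// Toy tree builder using a stack of open elements.
function buildTree(tokens) {
  const root = { name: '#document', children: [] };
  const openElements = [root];
  const errors = [];
  for (const token of tokens) {
    const current = openElements[openElements.length - 1];
    if (token.type === 'startTag') {
      const node = { name: token.name, children: [] };
      current.children.push(node);
      openElements.push(node);
    } else if (token.type === 'endTag') {
      const index = openElements.map((n) => n.name).lastIndexOf(token.name);
      if (index < 1) {
        // No matching open element: report it and keep parsing.
        errors.push(`stray </${token.name}> ignored`);
      } else {
        // Pop the matching element and everything opened inside it.
        openElements.length = index;
      }
    } else if (token.type === 'text') {
      current.children.push({ name: '#text', data: token.data });
    }
  }
  return { root, errors };
}

const { root, errors } = buildTree([
  { type: 'startTag', name: 'div' },
  { type: 'startTag', name: 'p' },
  { type: 'text', data: 'hi' },
  { type: 'endTag', name: 'div' }, // also closes the still-open <p>
  { type: 'endTag', name: 'span' }, // never opened
]);
console.log(JSON.stringify(root), errors);
```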

And that’s it? Now the fun begins!

Next, let’s talk about how styles and scripts behave during HTML Parser execution.

Styles, scripts, and HTML Parser

The most common oversimplification is that style loading does not block HTML parsing, while scripts do.

This statement is too vague. Styles can be inline or external, scripts can be inline or external too, and external scripts are further distinguished by the defer and async attributes.

Load and parse styles

Parsing styles is not the work of the HTML Parser. The HTML Parser only needs to insert the link or style tags into the DOM, so the loading and parsing of the style resources referenced by style and link tags are carried out in parallel.

Parallel here means returning control to the main thread (the HTML Parser thread), creating a child thread (or some other parallel mechanism, e.g. a fiber), and running the subsequent tasks (e.g. downloading and parsing styles) there.

So far, the conclusion is that loading and parsing styles does not block HTML parsing. In most cases this is true, but we need to mention a global variable used during parsing: the script-blocking style sheet counter.

This variable will change in the following scenario:

  • counter++ when a script-blocking style sheet starts being parsed
  • counter-- when a script-blocking style sheet has finished being parsed
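The two rules above amount to a very small piece of bookkeeping. The class below is a hypothetical sketch of it (the class and method names are invented here; a real engine keeps this state on the document):

```javascript
// Sketch of the script-blocking style sheet counter described above.
class ScriptBlockingStyleSheetCounter {
  #count = 0;

  // Called when a script-blocking style sheet starts being processed.
  styleSheetStarted() {
    this.#count += 1;
  }

  // Called when that style sheet has finished (never drops below zero).
  styleSheetFinished() {
    if (this.#count > 0) this.#count -= 1;
  }

  // Scripts that respect the counter may only run while this is false.
  get blocksScripts() {
    return this.#count > 0;
  }
}

const counter = new ScriptBlockingStyleSheetCounter();
counter.styleSheetStarted();
console.log(counter.blocksScripts); // true while the style sheet is pending
counter.styleSheetFinished();
console.log(counter.blocksScripts); // false once it is done
```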

What is a script-blocking style sheet?

  1. A link tag that has an href, has the type text/css, and whose media value is empty or matches the current media query
  2. A style tag that uses the @import syntax to import external style resources
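The two criteria can be written down as a predicate. This is only a sketch of the rules as this article states them: the plain-object element shape (`tag`, `href`, `type`, `media`, `cssText`) and the `mediaMatches` callback are assumptions for illustration, and in a browser you would read these from real elements and use window.matchMedia for the media check.

```javascript
// Sketch: does this element count as a script-blocking style sheet,
// according to the two criteria listed above?
function isScriptBlockingStyleSheet(element, mediaMatches = () => false) {
  if (element.tag === 'link') {
    // Criterion 1: href present, type text/css, media empty or matching.
    const mediaOk = !element.media || mediaMatches(element.media);
    return Boolean(element.href) && element.type === 'text/css' && mediaOk;
  }
  if (element.tag === 'style') {
    // Criterion 2: an inline style sheet that pulls in external CSS.
    return /@import\b/.test(element.cssText || '');
  }
  return false;
}
```

The default `mediaMatches` treats every media query as not matching, so callers have to opt in to a matching rule explicitly.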
<!-- script-blocking style sheets -->
<link rel="stylesheet" type="text/css" href="./index.css"/>
<style>
@import './index2.css';
</style>

For now, just remember that this variable affects script execution and rendering. Next, let’s look at how scripts are loaded.

Load and execute JavaScript

This is a popular script loading flow chart. It basically covers most of the effects of script loading on HTML Parser, but some details are missing. Let’s see how the specification defines it.

We call scripts without the defer or async attribute pending parsing-blocking scripts, but note that:

  • Scripts whose type is module behave as if defer were set by default
  • Scripts inserted dynamically with JavaScript are not included

How does a pending parsing-blocking script load and execute?

  1. When the HTML Parser encounters this type of script, it exits the current Parser task and freezes all tasks from the HTML Parser (the effect of freezing: the event loop does not run HTML Parser tasks, so the HTML Parser is blocked).
  2. Execute the following steps in parallel (meanwhile the main thread runs tasks other than those of the HTML Parser):
    1. Keep checking whether the script has loaded and whether the script-blocking style sheet counter equals 0, until both conditions are true
    2. Restore the previous Parser task and push it onto the event loop
    3. Unfreeze the HTML Parser tasks
  3. Execute the script

From the steps above we can see that, in this scenario, loading and parsing a style sheet blocks the execution of the pending parsing-blocking script, which in turn blocks the HTML Parser.
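The two conditions can be replayed as a small simulation. The function below is a sketch under the assumptions of this article: it consumes a hypothetical list of event names (`stylesheet-start`, `stylesheet-done`, `script-loaded`, all invented here) and records the moment at which the script may execute and the parser may resume.

```javascript
// Simulation: a pending parsing-blocking script runs only once it has
// loaded AND the script-blocking style sheet counter is back to 0.
function simulateParsingBlockingScript(events) {
  let scriptLoaded = false;
  let styleSheetCounter = 0;
  let executed = false;
  const log = [];
  for (const event of events) {
    if (event === 'stylesheet-start') styleSheetCounter += 1;
    else if (event === 'stylesheet-done') styleSheetCounter -= 1;
    else if (event === 'script-loaded') scriptLoaded = true;
    log.push(event);
    if (!executed && scriptLoaded && styleSheetCounter === 0) {
      executed = true;
      // The parser stays frozen until the script has run.
      log.push('execute-script', 'resume-parser');
    }
  }
  return log;
}

console.log(
  simulateParsingBlockingScript(['stylesheet-start', 'script-loaded', 'stylesheet-done'])
);
```

Even though the script finishes downloading first in this run, it only executes after the style sheet is done.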


      
<!DOCTYPE html>
<html lang="en">
<head>
   <title>Document</title>
   <!-- Pending parsing-blocking scripts will not execute until I have finished loading and parsing -->
   <link rel="stylesheet" type="text/css" href="./index.css?lazy=1000" />
</head>
<body>
    <!-- I will not execute until the style sheet above has been parsed, and I block the Parser until I have executed -->
    <script src="./index.js"></script>
    <!-- I will not be parsed until the script above has executed -->
    <span>hello</span>
</body>
</html>

Why is style parsing designed to block the execution of these scripts? Consider the following scenario:

/* index.css */
.color {
    color: red;
}

      
<!DOCTYPE html>
<html lang="en">
<head>
  <title>Document</title>
   <link rel="stylesheet" type="text/css" href="index.css" />
</head>
<body>
    <p class="color">What is my color</p>
    <script>
        // Select the paragraph by its class name, "color"
        const element = document.querySelector('.color');
        const color = window.getComputedStyle(element).color;
        console.log(color) // rgb(255, 0, 0);
        // If the script ran without waiting for the CSS to load, it would read the wrong style
    </script>
</body>
</html>

async & defer

We know that the script tag also has async and defer attributes that affect how it is loaded and executed, so let’s see how they behave. Some things to know first:

  • Scripts with type module behave as if defer were set by default.
  • async takes precedence over defer, meaning that when async is present, the defer attribute is ignored.
  • For script tags created by JavaScript, async defaults to true.

When the HTML Parser encounters a script tag with the defer attribute and without the async attribute, it puts the script into a queue called the list of scripts that will execute when the document has finished parsing and downloads it in parallel (without blocking the main thread) but does not execute it. The queue is executed only after HTML parsing is complete, which we will discuss later.

When the HTML Parser encounters a script tag with the async attribute, it stores the script in a set called the set of scripts that will execute as soon as possible and starts a parallel download. Unlike defer, once an async script has loaded it looks for the first opportunity to execute (the next turn of the event loop). As a result, async scripts run at unpredictable times and out of order (whichever finishes downloading first runs first). If an async script finishes downloading before HTML parsing is complete, its execution still blocks subsequent Parser tasks (although its download does not block parsing).

There is another collection of scripts called the list of scripts that will execute in order as soon as possible. So far, I have only found the following kind of script that falls into it:

// Script element created by JavaScript with async set to false
const script = document.createElement('script')
script.async = false; // for script tags created by JavaScript, async defaults to true
script.src = 'dy.js';
document.body.append(script);

For this type of script, loading and execution work much as they do for async scripts, except that these scripts execute in the order in which they were added, whereas async scripts are unordered. In addition, the execution of both kinds of script is not blocked by style sheet parsing.
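The rules in this section can be condensed into one decision function. It is a sketch of the behavior as this article describes it, for external scripts (those with a src) only; `classifyScript`, its option names, and the short labels it returns are invented for this example.

```javascript
// Sketch: which collection does an external script end up in?
function classifyScript({
  type = 'classic',
  async: asyncAttr, // undefined means "not set explicitly"
  defer: deferAttr = false,
  parserInserted = true, // false for scripts created from JavaScript
} = {}) {
  // Scripts created from JavaScript default to async = true.
  const isAsync = asyncAttr ?? !parserInserted;
  // Module scripts behave as if defer were set.
  const isDeferred = deferAttr || type === 'module';

  // async wins over defer: "set of scripts that will execute as soon as possible"
  if (isAsync) return 'async';
  // Dynamic script with async = false:
  // "list of scripts that will execute in order as soon as possible"
  if (!parserInserted) return 'in-order';
  // "list of scripts that will execute when the document has finished parsing"
  if (isDeferred) return 'defer';
  // Everything else is a pending parsing-blocking script.
  return 'parsing-blocking';
}

console.log(classifyScript({ defer: true, async: true })); // async takes precedence
console.log(classifyScript({ parserInserted: false, async: false })); // in-order
```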

At this point we should also have the following questions:

  • If my async script has not finished downloading by the time parsing ends, how can I tell when it is ready to use?
  • When will my defer script be executed? When is it available? Is it ready by the time window’s onload event fires?

Work after parsing finishes, and when window.onload fires

After the HTML Parser finishes parsing, some other work remains, such as firing events and running scripts. Note that the end of HTML parsing does not mean that style sheets have been parsed; style sheets are not parsed by the HTML Parser, so the script-blocking style sheet counter may still be greater than 0.

Some prerequisite knowledge: document.readyState takes one of three values: loading, interactive, or complete. It is loading while the document is loading, and each time the state changes the readystatechange event fires on document.

After parsing the HTML text, the Parser performs the following steps:

  • Set document.readyState to interactive, which fires the readystatechange event (this means that all DOM elements can now be manipulated).
  • Suspend this task (the current Parser task; other tasks in the event loop run in the meantime until the following conditions are met) until the script-blocking style sheet counter is 0 and the defer scripts have finished downloading. Then execute all the defer scripts.
  • Fire the DOMContentLoaded event, at which point some async scripts may not have finished downloading
  • Pause the task until all async scripts and all scripts in the list of scripts that will execute in order as soon as possible have been downloaded and executed. Scripts from the two collections may be interleaved.
  • Pause this task until the onload/onerror events of all DOM elements (e.g. img) have fired
  • Change document.readyState to complete and fire window.onload (this means that all scripts are available and all DOM nodes have fired onload/onerror)
  • At this point the HTML Parser task is done, and control returns to the event loop.
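One way to observe the order of these steps is to record the events as they fire. The helper below is a small sketch; `logLifecycle` is a hypothetical name, and the document and window objects are passed in as parameters so the function can also be exercised outside a browser.

```javascript
// Sketch: record the document lifecycle events in the order they fire.
function logLifecycle(doc, win, log = []) {
  doc.addEventListener('readystatechange', () => {
    log.push(`readystatechange: ${doc.readyState}`);
  });
  doc.addEventListener('DOMContentLoaded', () => log.push('DOMContentLoaded'));
  win.addEventListener('load', () => log.push('load'));
  return log;
}

// In a page, call it from an inline script in <head>, before parsing ends:
//   const log = logLifecycle(document, window);
//   window.addEventListener('load', () => console.log(log));
```

Following the steps above, a page instrumented this way should record readystatechange: interactive, then DOMContentLoaded, then readystatechange: complete, and finally load.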

Closing remarks

Summary

  • Style sheets that load external resources block the execution of scripts, except dynamically inserted scripts and scripts with the async attribute.
  • Async scripts and dynamically inserted non-async scripts execute as soon as possible after loading, at an unpredictable time. The former run out of order and the latter run in order
  • Scripts without async and defer block the Parser while they download, and the execution of any script blocks the Parser.
  • The Parser executes the defer scripts after it has parsed the whole DOM
  • All scripts have run before document.readyState becomes complete

Afterword and preview

Understanding this underlying logic gives us direction when doing compile-time optimizations, and tells us what is blocking the browser when we deal with first-screen loading problems.

For scripts, the commonly recommended way to load them, placing them at the end of the HTML, is already written up on forums and blogs. This knowledge comes in handy when we need to do something special, such as keeping a script from being blocked by the styles in the head, or loading dependencies in parallel.

After reading this far, readers may have noticed that the terms event loop and task appear frequently in this article. In essence, the HTML Parser is a task running on the event loop. The logic of the event loop in the HTML Standard is quite complex: it does not just schedule tasks that execute JavaScript, it schedules essentially all tasks on the page.

My next article will examine the event loop as described in the HTML Standard, including how the event loop schedules tasks, the task suspension mechanism mentioned in this article, and the question of rendering timing that you are probably interested in. This article does not cover rendering, but rendering can happen at any time after the CSS has loaded.

If you think it’s good, please give it a thumbs up. If you are interested in the next article, please follow 😁

Example download: github.com/MinuteWong/…