The high-level structure of the browser

Main components:

1. User interface: includes the address bar, back/forward buttons, bookmark menu, and so on. Every part of the browser display belongs to the user interface except the main window, where the requested page is shown.

2. Browser engine: passes instructions between the user interface and the rendering engine.

3. Rendering engine: Responsible for displaying the requested content. If the requested content is HTML, it is responsible for parsing the HTML and CSS content and displaying the parsed content on the screen.

4. Networking: handles network calls such as HTTP requests. Its interface is platform independent, with a separate underlying implementation for each platform.

5. User interface backend: draws basic widgets such as combo boxes and windows. It exposes a generic, platform-independent interface and uses the operating system’s user interface methods underneath.

6. JavaScript interpreter: Used to parse and execute JavaScript code.

7. Data storage: This is the persistence layer. The browser needs to save various kinds of data, such as cookies, on disk. The new HTML specification (HTML5) defines a “web database,” which is a complete (but lightweight) in-browser database.

Main flow

The rendering engine starts by parsing the HTML document and turning each tag into a DOM node in a tree called the content tree. It also parses the style data, both in external CSS files and in style elements. The style information, together with the visual instructions in the HTML, is used to create another tree: the render tree.

The render tree contains rectangles with visual attributes such as color and dimensions. The rectangles are arranged in the order in which they will be displayed on the screen.

Once the render tree is built, the “layout” phase begins, in which each node is given the exact coordinates where it should appear on the screen. The next stage is painting: the rendering engine traverses the render tree and paints each node using the user interface backend layer.

It should be emphasized that this is a gradual process. For a better user experience, the rendering engine tries to display content on the screen as soon as possible. It does not wait until the entire HTML document has been parsed before it starts building the render tree and laying it out. While the rest of the content is still being received from the network and processed, the rendering engine parses and displays the parts it already has.
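
The incremental behaviour described above can be illustrated with a minimal sketch. It uses Python’s standard html.parser module rather than a real rendering engine, and the ProgressParser class and the chunk boundaries are assumptions made up for this illustration. The point is only that the parser acts on each chunk as it arrives instead of waiting for the whole document.

```python
from html.parser import HTMLParser


class ProgressParser(HTMLParser):
    """Hypothetical helper: records every start tag as soon as it is parsed."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append(tag)


# Assumed network chunks; a real browser receives arbitrary byte ranges.
chunks = [
    "<html><body><p>Hel",
    "lo</p><div>",
    "<img src='example.png'></div></body></html>",
]

parser = ProgressParser()
progress = []
for chunk in chunks:
    parser.feed(chunk)                  # parse what has arrived so far
    progress.append(list(parser.seen))  # snapshot of the tags known at this point
parser.close()

for step, tags in enumerate(progress, start=1):
    print(f"after chunk {step}: {tags}")
```

After the first chunk the parser already knows about html, body and p, which is the kind of partial result a rendering engine could start working with.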

WebKit’s main flow:

Gecko’s main flow:

Parsing and building the DOM tree

Parsing and translation

The process is shown in the figure:

Parsers typically divide the work between two components: a lexical analyzer (lexer), which breaks the input into valid tokens (lexical analysis), and a parser, which analyzes the structure of the document according to the syntax rules of the language and builds the parse tree (syntax analysis).
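
The split between lexer and parser can be sketched with a toy arithmetic language. This is only an illustration under stated assumptions: the grammar (integers, + - * / and parentheses), the function names tokenize and parse, and the tuple-based parse tree are all made up for the example and have nothing to do with how a browser parses HTML.

```python
def tokenize(text):
    """Lexical analysis: turn the input string into (kind, value) tokens."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1                      # whitespace separates tokens and is dropped
        elif ch.isdigit():
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            tokens.append(("NUM", text[i:j]))
            i = j
        elif ch in "+-*/()":
            tokens.append(("OP", ch))
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return tokens


def parse(tokens):
    """Syntax analysis: build a parse tree of nested (op, left, right) tuples."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expression():                   # expression := term (("+" | "-") term)*
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = advance()[1]
            node = (op, node, term())
        return node

    def term():                         # term := factor (("*" | "/") factor)*
        node = factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def factor():                       # factor := NUM | "(" expression ")"
        kind, value = advance()
        if kind == "NUM":
            return int(value)
        if (kind, value) == ("OP", "("):
            node = expression()
            if advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    tree = expression()
    if peek()[0] != "EOF":
        raise SyntaxError("trailing input")
    return tree


print(parse(tokenize("2 + 3 * 4")))    # ('+', 2, ('*', 3, 4))
```

The lexer knows nothing about operator precedence; it only produces tokens. The structure of the tree comes entirely from the grammar rules applied by the parser.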

HTML parser

The output of the parser, the “parse tree”, is a tree of DOM element and attribute nodes. Take the following markup as an example:

<html>
  <body>
    <p>
      Hello World
    </p>
    <div> <img src="example.png"/></div>
  </body>
</html>
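
To make the shape of the resulting tree visible, here is a minimal sketch that feeds the same markup to Python’s standard html.parser module and prints an indented outline. The Node and OutlineBuilder classes and the outline function are hypothetical helpers written for this illustration. They do not produce real DOM objects, and the sketch relies on the img tag being written in self-closing form, as it is in the markup above.

```python
from html.parser import HTMLParser

MARKUP = """<html>
  <body>
    <p>
      Hello World
    </p>
    <div> <img src="example.png"/></div>
  </body>
</html>"""


class Node:
    def __init__(self, name, attrs=()):
        self.name = name
        self.attrs = dict(attrs)
        self.children = []


class OutlineBuilder(HTMLParser):
    """Hypothetical helper: builds a simple tree of Node objects."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.open_nodes = [self.root]   # elements that have not been closed yet

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.open_nodes[-1].children.append(node)
        self.open_nodes.append(node)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <img .../> never become the current parent.
        self.open_nodes[-1].children.append(Node(tag, attrs))

    def handle_endtag(self, tag):
        if self.open_nodes[-1].name == tag:
            self.open_nodes.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:                        # skip whitespace-only text between tags
            self.open_nodes[-1].children.append(Node(f"#text {text!r}"))


def outline(node, depth=0):
    """Return the tree as a list of indented lines."""
    attrs = "".join(f' {key}="{value}"' for key, value in node.attrs.items())
    lines = ["  " * depth + node.name + attrs]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


builder = OutlineBuilder()
builder.feed(MARKUP)
builder.close()
print("\n".join(outline(builder.root)))
```

The printed outline nests p and div under body, with the text under p and the img element (carrying its src attribute) under div.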

Parsing algorithm

The algorithm consists of two phases: tokenization and tree construction.

Tokenization

Tokenization is the lexical analysis step: it parses the input into tokens. HTML tokens include start tags, end tags, attribute names, and attribute values. The tokenizer recognizes a token, passes it to the tree constructor, and then consumes the next character to recognize the next token, repeating until the end of the input.

The initial state is the “data state”. When the character < is encountered, the state changes to the “tag open state”. Consuming an a-z character creates a start tag token, and the state changes to the “tag name state”. The tokenizer stays in this state until the > character is consumed, and each character consumed in the meantime is appended to the new token’s name. In our example, the token created is an html tag token.

When the > character is reached, the current token is emitted and the state changes back to the “data state”. The <body> tag is handled by the same steps. At this point the html and body tags have been emitted, and we are back in the “data state”. Consuming the H character of Hello World causes a character token to be created and emitted, and this continues until the < of the closing tag is reached. One character token is emitted for each character of Hello World.

Now we are back in the “tag open state”. Consuming the next input character, /, creates an end tag token and moves to the “tag name state”. Again, the tokenizer stays in this state until > is reached. The new tag token is then emitted and we return to the “data state”. The </html> input is handled the same way.
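
The walk-through above can be sketched as a small state machine. This is a simplified illustration that covers only the three states named in the text (data, tag open, tag name). It does not handle attributes, comments, or error recovery, so it is not the full tokenizer from the HTML specification. The function name tokenize_html, the token tuples, and the input string are assumptions chosen to match the html, body and Hello World tokens mentioned in the text.

```python
def tokenize_html(source):
    """Emit (kind, value) tokens using the data / tag open / tag name states."""
    state = "data"
    tokens = []
    current = None                      # tag token under construction: [kind, name]

    for ch in source:
        if state == "data":
            if ch == "<":
                state = "tag open"
            else:
                tokens.append(("Character", ch))    # one token per character
        elif state == "tag open":
            if ch == "/":
                current = ["EndTag", ""]
                state = "tag name"
            elif ch.isalpha():
                current = ["StartTag", ch.lower()]
                state = "tag name"
            else:
                raise ValueError(f"unexpected character {ch!r} after '<'")
        elif state == "tag name":
            if ch == ">":
                tokens.append((current[0], current[1]))   # emit the finished tag
                current = None
                state = "data"
            else:
                current[1] += ch.lower()                  # extend the tag name

    tokens.append(("EOF", ""))
    return tokens


tokens = tokenize_html("<html><body>Hello World</body></html>")
for token in tokens:
    print(token)
```

For this input the tokenizer emits a start tag token for html, one for body, eleven character tokens, the two end tag tokens, and finally an end-of-file token.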

Tree construction

When the parser is created, the Document object is created as well. During the tree construction stage, the DOM tree with the Document at its root is modified and elements are added to it. Each token emitted by the tokenizer is processed by the tree constructor. For each token, the specification defines which DOM element corresponds to it, and that element is created when the token is received. The element is added to the DOM tree and also to the stack of open elements. This stack is used to correct nesting mismatches and unclosed tags. The algorithm is also described as a state machine, whose states are called “insertion modes”.

The input to the tree construction stage is the sequence of tokens from the tokenization stage. The first mode is the “initial” mode. Receiving the html token causes a move to the “before html” mode, where the token is reprocessed. This creates an HTMLHtmlElement, which is appended to the root Document object.

The state then changes to “before head”. Next the body token is received. Although our example has no head tag, an HTMLHeadElement is created implicitly and added to the tree.

We now move to the “in head” mode and then to the “after head” mode. The body token is reprocessed, an HTMLBodyElement is created and inserted, and the mode changes to “in body”.

Next, the character tokens of the “Hello World” string are received. The first one causes a “Text” node to be created and inserted, and the other characters are appended to that node.

Receiving the body end token causes a transfer to the “after body” mode. The html end tag is received next, which moves us to the “after after body” mode. Receiving the end-of-file token ends the parsing.
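
The sequence of insertion modes described above can be traced with a toy tree constructor. This is a heavily simplified sketch, not the algorithm from the specification: it handles only the modes named in the text, ignores any content inside head, and uses a hypothetical Node class instead of real DOM interfaces such as HTMLHtmlElement. The token format is also an assumption made for this example. Unlike the specification, the sketch keeps the Document itself at the bottom of the stack of open elements for convenience.

```python
class Node:
    def __init__(self, name, data=""):
        self.name = name
        self.data = data                # only used by "#text" nodes
        self.children = []


def build_tree(tokens):
    """Return (document, list of insertion modes visited)."""
    document = Node("#document")
    open_elements = [document]          # stack of open elements (Document at the bottom)
    mode = "initial"
    history = [mode]

    def insert(name):
        node = Node(name)
        open_elements[-1].children.append(node)
        open_elements.append(node)

    def switch(new_mode):
        nonlocal mode
        mode = new_mode
        history.append(new_mode)

    for kind, value in tokens:
        reprocess = True
        while reprocess:                # a token may be reprocessed in a new mode
            reprocess = False
            token = (kind, value)
            if mode == "initial":
                switch("before html")
                reprocess = True
            elif mode == "before html":
                if token == ("StartTag", "html"):
                    insert("html")
                    switch("before head")
            elif mode == "before head":
                insert("head")          # created implicitly if there is no head tag
                switch("in head")
                reprocess = token != ("StartTag", "head")
            elif mode == "in head":
                open_elements.pop()     # close head
                switch("after head")
                reprocess = token != ("EndTag", "head")
            elif mode == "after head":
                insert("body")
                switch("in body")
                reprocess = token != ("StartTag", "body")
            elif mode == "in body":
                if kind == "Character":
                    siblings = open_elements[-1].children
                    if siblings and siblings[-1].name == "#text":
                        siblings[-1].data += value      # extend the existing Text node
                    else:
                        siblings.append(Node("#text", value))
                elif kind == "StartTag":
                    insert(value)
                elif token == ("EndTag", "body"):
                    switch("after body")
                elif kind == "EndTag" and value in [n.name for n in open_elements]:
                    # Pop up to the matching element, closing unclosed children too.
                    while open_elements.pop().name != value:
                        pass
            elif mode == "after body":
                if token == ("EndTag", "html"):
                    switch("after after body")
            # In "after after body" the end-of-file token simply ends the loop.
    return document, history


tokens = (
    [("StartTag", "html"), ("StartTag", "body")]
    + [("Character", ch) for ch in "Hello World"]
    + [("EndTag", "body"), ("EndTag", "html"), ("EOF", "")]
)
document, history = build_tree(tokens)
print(" -> ".join(history))
```

For the token sequence used in the walk-through, the trace visits the modes in the same order as the text, and the resulting tree has an html element containing an implicitly created head and a body holding a single Text node.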

Parsing the CSS