This is the 15th day of my participation in the August More Text Challenge

The main components of the browser

To understand how a browser works, we need to know what the main components of a browser are:

  • User interface: Every part of the browser display except the main window, where the requested page is shown
  • Browser engine: Passes instructions between the user interface and the rendering engine
  • Rendering engine: Responsible for displaying the requested content
  • Networking: Handles network calls such as HTTP requests
  • UI backend: Used to draw basic widgets
  • JavaScript interpreter: Used to parse and execute JavaScript code
  • Data storage: The persistence layer, where the browser saves data locally

The rendering engine

Main flow

  1. Parsing: HTML parsing and CSS parsing
  2. Build the render tree: Combine the DOM tree with the style information from the style sheets
  3. Layout: Assign each node the coordinates where it should appear on the screen
  4. Paint: Traverse the render tree and paint each node using the UI backend

Parsing

Parsing a document produces a tree of nodes that represents the structure of the document. This tree is called a parse tree or syntax tree.

As follows:

The figure includes a translation step. In many cases the parse tree is not the final product, and parsing is only one part of a translation process: the input document is translated into another format, with the parse tree as an intermediate representation.

HTML parsing

The HTML parsing algorithm consists of two stages: tokenization and tree building.

  • Tokenization: The lexical analysis step, which breaks the input into tokens. HTML tokens include start tags, end tags, attribute names, and attribute values.
  • Tree building: Each token emitted by the tokenizer is passed to the tree builder, which creates the corresponding element. Each element is added to the DOM tree and also to the stack of open elements.
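The two stages above can be sketched with Python's standard html.parser module acting as the tokenizer. This is only a minimal illustration, not the algorithm browsers actually implement: the Node class, the TreeBuilder class, the build_dom helper, and the short VOID_ELEMENTS list are all names invented for this example, and real error recovery is far more involved.

```python
from html.parser import HTMLParser

# Assumption: a small subset of elements that never have an end tag.
VOID_ELEMENTS = {"br", "img", "input", "meta", "link", "hr"}


class Node:
    """A toy DOM node: a tag name, attributes, and child nodes."""

    def __init__(self, tag, attrs=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.children = []


class TreeBuilder(HTMLParser):
    """Receives tokens from HTMLParser and builds a tree from them."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.open_elements = [self.root]  # stack of open elements
        self.tokens = []  # record of the token stream, for inspection

    def handle_starttag(self, tag, attrs):
        self.tokens.append(("start", tag))
        node = Node(tag, attrs)
        self.open_elements[-1].children.append(node)  # add to the tree
        if tag not in VOID_ELEMENTS:
            self.open_elements.append(node)  # and to the stack

    def handle_endtag(self, tag):
        self.tokens.append(("end", tag))
        # Pop back to the matching open element, if there is one.
        # Index 0 is the document root and is never popped.
        for i in range(len(self.open_elements) - 1, 0, -1):
            if self.open_elements[i].tag == tag:
                del self.open_elements[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.tokens.append(("text", text))
            self.open_elements[-1].children.append(Node("#text", [("data", text)]))


def build_dom(html):
    """Tokenize html and return the builder holding the finished tree."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder
```

Calling build_dom on a small document and inspecting builder.tokens shows the token stream, while builder.root holds the tree. Popping the stack back to the matching element is what lets this sketch cope with an unclosed tag such as a missing closing p inside a div.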

CSS parsing

The CSS parser turns each CSS file into a StyleSheet object. Each StyleSheet object contains CSS rules, and each rule is made up of selectors and declarations.
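To make the shape of that result concrete, here is a deliberately simplified sketch. The StyleSheet and Rule classes and the parse_css function are hypothetical names for this example, and the regular expression only handles flat rules of the form "selectors { declarations }". It assumes no comments, at-rules, nested blocks, or semicolons inside values, all of which a real CSS parser must handle.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    selectors: list      # e.g. ["h1", "h2"]
    declarations: dict   # e.g. {"color": "red"}


@dataclass
class StyleSheet:
    rules: list = field(default_factory=list)


# Assumption: rules are flat, so a rule is "text { text }" with no nested braces.
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def parse_css(text):
    """Parse a flat CSS string into a StyleSheet of Rule objects."""
    sheet = StyleSheet()
    for selector_text, body in RULE_RE.findall(text):
        selectors = [s.strip() for s in selector_text.split(",") if s.strip()]
        declarations = {}
        for declaration in body.split(";"):
            if ":" not in declaration:
                continue  # skip empty fragments such as a trailing ";"
            name, value = declaration.split(":", 1)
            declarations[name.strip().lower()] = value.strip()
        sheet.rules.append(Rule(selectors, declarations))
    return sheet
```

For example, parse_css("h1, h2 { color: red; margin: 0 }") yields one rule with two selectors and two declarations.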

The order in which scripts and style sheets are processed:

When the parser reaches a script tag, it parses and executes the script immediately, and parsing of the document halts until the script has finished. If the script is external, the resource must first be fetched from the network. This is also done synchronously, so parsing halts until the fetch is complete.

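Browsers soften this cost with speculative parsing (preloading): while a script is being executed, the rest of the document is scanned for other resources that need to be fetched, so they can be loaded on parallel connections, which improves overall speed. The speculative parser only resolves references to external resources. It does not modify the DOM tree.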
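The scanning step can be illustrated with a small sketch built on Python's html.parser. The PreloadScanner class and scan_for_resources function are names made up for this example, and the choice of which tags and attributes count as external references (script src, img src, and link rel="stylesheet" href) is an assumption. The scanner only collects URLs and never builds a tree.

```python
from html.parser import HTMLParser


class PreloadScanner(HTMLParser):
    """Collects URLs of external resources; it never builds or edits a tree."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "img") and attrs.get("src"):
            self.urls.append(attrs["src"])
        elif tag == "link" and attrs.get("href"):
            # rel may hold several space-separated keywords.
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                self.urls.append(attrs["href"])


def scan_for_resources(remaining_markup):
    """Return the external resource URLs referenced in the markup, in order."""
    scanner = PreloadScanner()
    scanner.feed(remaining_markup)
    scanner.close()
    return scanner.urls
```

A browser would hand the collected URLs to its network layer to fetch in parallel. Here they are simply returned as a list.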

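Style sheets add a complication: a script may ask for style information that has not been loaded and parsed yet, and would then get a wrong answer. Firefox therefore blocks all scripts while a style sheet is still being loaded and parsed. WebKit, on the other hand, blocks a script only if it tries to access a style property that may be affected by a style sheet that has not loaded yet.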

Building the render tree

The render tree is a tree of visual elements in the order in which they will be displayed. It is the visual representation of the document, and its purpose is to let the browser paint the content in the correct order.

The relationship between the render tree and the DOM tree

  • Renderers correspond to DOM elements, but non-visual DOM elements are not inserted into the render tree. An element whose display property is set to none does not appear in the render tree (an element with visibility: hidden does appear).
  • Some DOM elements correspond to several visual objects. These are usually elements with a complex structure that cannot be described by a single rectangle.
  • Some render objects correspond to a DOM node but sit at a different place in the tree. This is the case for floated and absolutely positioned elements: they are out of the normal flow, placed in a different part of the tree, and mapped to the real frame, while a placeholder frame marks where they would otherwise have been.
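The first rule in the list above can be sketched as a recursive walk over the DOM. The DomNode and Renderer classes, the build_render_tree function, and the NON_VISUAL_TAGS set are all assumptions made for this example. The sketch also assumes each node's style has already been computed, and it ignores the second and third rules entirely.

```python
from dataclasses import dataclass, field


@dataclass
class DomNode:
    tag: str
    style: dict = field(default_factory=dict)     # assumed to be already computed
    children: list = field(default_factory=list)


@dataclass
class Renderer:
    node: DomNode
    children: list = field(default_factory=list)


# Assumption: a small sample of elements that never produce a box.
NON_VISUAL_TAGS = {"head", "script", "style", "meta", "title", "link"}


def build_render_tree(node):
    """Return a Renderer for node, or None if it produces no box."""
    if node.tag in NON_VISUAL_TAGS or node.style.get("display") == "none":
        return None  # the whole subtree is left out
    renderer = Renderer(node)
    for child in node.children:
        child_renderer = build_render_tree(child)
        if child_renderer is not None:
            renderer.children.append(child_renderer)
    return renderer
```

Note that a node with visibility set to hidden still gets a renderer in this sketch, because only display: none removes an element from the tree.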

Building the render tree requires calculating the visual properties of each render object, which is done by calculating the style properties of each element. In WebKit jargon, resolving the style and creating a renderer is called “attachment”: each DOM node has an “attach” method, which is called when the node is inserted into the DOM tree. It calculates the node's style properties and generates a renderer for it.

Layout

A renderer has no position or size information when it is created and added to the render tree. The process of calculating these values is called layout or reflow.

HTML uses a flow-based layout model, which means that in most cases the geometry can be calculated in a single pass. Elements later in the flow usually do not affect the geometry of elements earlier in the flow, so layout can proceed through the document from left to right and top to bottom. There are exceptions: HTML tables, for example, may require more than one pass.

The coordinate system is relative to the root frame and uses top and left coordinates.

Layout is a recursive process. It starts at the root renderer, which corresponds to the <html> element of the HTML document, and continues recursively through some or all of the frame hierarchy, computing geometric information for each renderer that requires it.
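A single-pass recursive layout can be sketched as follows, under heavy simplifications: every box is a block that takes the full available width, children are stacked vertically, and the height of a box's own content is assumed to be known in advance. The Box class, its fields, and the layout function are hypothetical names for this example, not any engine's API.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    name: str
    own_height: int = 0   # height of the box's own content (assumed known)
    padding: int = 0
    children: list = field(default_factory=list)
    # Filled in by layout(), relative to the root frame:
    left: int = 0
    top: int = 0
    width: int = 0
    height: int = 0


def layout(box, left, top, available_width):
    """Position box at (left, top), then lay out its children top to bottom."""
    box.left, box.top, box.width = left, top, available_width
    cursor = top + box.padding + box.own_height
    for child in box.children:
        layout(child, left + box.padding, cursor, available_width - 2 * box.padding)
        cursor += child.height  # the next sibling starts below this one
    # The height is only known once all children have been laid out.
    box.height = cursor + box.padding - top
    return box
```

Widths flow down from parent to child and heights flow back up from child to parent, which is why one traversal is enough here. A table whose column widths depend on cell contents would not fit this scheme.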

Painting

In the painting phase, the render tree is traversed and each renderer’s “paint” method is called to display its contents on the screen. Painting is done using the UI infrastructure component.

Painting order

  1. Background color
  2. Background image
  3. Border
  4. Children
  5. Outline
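The order above can be expressed as a recursive function that emits paint operations into a list. This is a sketch only: the paint function and the dictionary layout of a renderer (keys "name", "style", and "children") are assumptions for this example, and it ignores stacking contexts and anything else that can change the order in a real engine.

```python
def paint(renderer, display_list=None):
    """Append paint operations for renderer and its descendants, in order."""
    if display_list is None:
        display_list = []
    name = renderer["name"]
    style = renderer.get("style", {})
    if "background-color" in style:
        display_list.append(("background-color", name))
    if "background-image" in style:
        display_list.append(("background-image", name))
    if "border" in style:
        display_list.append(("border", name))
    # Children are painted after the border and before the outline.
    for child in renderer.get("children", []):
        paint(child, display_list)
    if "outline" in style:
        display_list.append(("outline", name))
    return display_list
```

Because children are painted between the parent's border and its outline, a child's background covers the parent's background, while the parent's outline is drawn on top of everything inside it.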