The inner workings of modern browsers

Summary figure

The core CPU and GPU of a computer

The Central Processing Unit (CPU) is the brain of a computer. A CPU core can handle many different tasks one by one as they come in. Most computers today have more than one core on a chip.

The Graphics Processing Unit (GPU) is good at handling simple tasks across many cores at the same time. As the name suggests, it was first developed to handle graphics, which is why phrases such as “GPU-accelerated” or “GPU-backed” are associated with fast, smooth rendering.

Many GPU cores, each holding a wrench, to suggest that each one handles only a limited kind of task

When you start an application on a computer or phone, it is the CPU and GPU that power it. Applications usually run on the CPU and GPU through mechanisms provided by the operating system.

Three layers of computer architecture: the bottom is the machine hardware, the middle is the operating system, and the top layer is the application program.
Executing programs on processes and threads

The process acts as a bounding box, and the thread is an abstract fish swimming in it

A process can be described as an application's executing program. A thread lives inside a process and executes any part of that process's program.

When an application is started, one or more processes are created to help it work. The operating system gives each process a region of memory to work with, and all of the application's state is kept in that private memory space.

A process can ask the operating system to start another process to run a different task. The new process is given its own memory, and the processes talk to each other through IPC (Inter-Process Communication). With this design, when one of the application's worker processes becomes unresponsive, it can be restarted without stopping the rest of the application.

Schematic diagram of individual processes communicating through IPC
Browser architecture

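A browser can be built as a single process with many threads, or as several processes that communicate over IPC. The choice is an implementation detail that differs between browsers.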

Chrome's multi-process architecture

At the top is the browser process, which coordinates the other processes that take care of different parts of the application.

What each Chrome process does:

  • Browser Process:
    • Address bar, bookmark bar, forward and back buttons, etc
    • Invisible, privileged work such as network requests and file access
  • Renderer Process
    • Responsible for everything inside a tab where a website is displayed
  • Plugin Process
    • Handles any plugin used by a web page, such as Flash
  • GPU Process
    • Responsible for GPU-related work
  • Other processes, e.g. extension processes and utility processes

Different processes point to different parts of the browser UI

Chrome runs a separate renderer process for each tab, so if one web page crashes, the others are not affected. Another advantage of splitting the browser into multiple processes is security through sandboxing: because the operating system provides a way to restrict a process's privileges, the browser can sandbox certain processes. The cost is memory. Since each process has its own private memory space, each one holds its own copy of common infrastructure, so more memory is used overall. To save memory, Chrome limits the number of processes it starts; once the limit is reached, tabs showing the same site share a process.

Chrome applies the same idea to its own services. On powerful hardware, it may split each service into its own process for better stability. On resource-constrained hardware, it consolidates services into a single process to save memory, at the cost of some stability.

Per-iframe renderers – site isolation

Site isolation runs a separate renderer process for each cross-site iframe. Under the earlier one-renderer-per-tab model, cross-site iframes ran in the same renderer process as the page embedding them, so different sites shared one memory space.

The same-origin policy is the core security model of the web: one site cannot access data from another site without that site's consent. Process isolation is the most effective way to separate sites.
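
To make the same-origin idea concrete, here is a minimal sketch using the standard URL class. The function name sameOrigin is made up for this example. Note that site isolation groups pages by site (scheme plus registrable domain), which is coarser than origin, and that this sketch does not handle opaque origins such as data: URLs.

```javascript
// Two URLs share an origin when scheme, host, and port all match.
// `sameOrigin` is a hypothetical helper name, not a browser API.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

console.log(sameOrigin("https://example.com/a", "https://example.com/b")); // true
console.log(sameOrigin("https://example.com", "http://example.com"));      // false (scheme differs)
console.log(sameOrigin("https://example.com", "https://api.example.com")); // false (host differs)
console.log(sameOrigin("https://example.com:443", "https://example.com")); // true (443 is the default port)
```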

Site Isolation Diagram
What happens when you navigate?

It starts with the Browser Process

Everything outside of a tab is handled by the browser process. The browser process has several threads: the UI thread draws the browser's buttons and input fields, the network thread deals with the network stack to receive data from the internet, and the storage thread controls access to files. When you type a URL into the address bar, the input is handled by the browser process's UI thread.

At the top is the browser UI, and at the bottom is the browser process diagram with the UI, network, and storage threads
A simple navigation
  1. Process the input

The UI thread of the browser process first has to determine whether the input is a search query or a URL, and so whether to send the user to a search engine or to the requested site.

The UI thread asks if the input is a search query or a URL address
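
The decision can be sketched roughly as follows. This is a toy heuristic written for illustration, not Chrome's actual address bar logic: the function names and the rules (no whitespace, plus either an explicit scheme or a host-like shape) are assumptions. A real browser also consults things like the list of known top-level domains.

```javascript
// Returns the normalized URL, or null if the candidate cannot be parsed.
function tryParse(candidate) {
  try {
    return new URL(candidate).href;
  } catch {
    return null;
  }
}

// Hypothetical heuristic: treat the input as a URL only when it has no
// whitespace and either starts with a scheme or looks like a host name.
function classifyInput(text) {
  const input = text.trim();
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  const looksLikeHost =
    /^(localhost|([a-z0-9-]+\.)+[a-z]{2,})(:\d+)?(\/\S*)?$/i.test(input);
  if (!/\s/.test(input) && (hasScheme || looksLikeHost)) {
    const href = tryParse(hasScheme ? input : "https://" + input);
    if (href !== null) return { kind: "url", target: href };
  }
  return { kind: "search", target: input };
}

console.log(classifyInput("mysite.com"));           // { kind: 'url', target: 'https://mysite.com/' }
console.log(classifyInput("how do browsers work")); // { kind: 'search', target: 'how do browsers work' }
```
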
  2. Start the navigation

When the Enter key is pressed, the UI thread initiates a network call to fetch the site content, and a loading spinner is shown in the corner of the tab. The network thread goes through the appropriate protocols for the request, such as a DNS lookup and establishing a TLS connection.

Secure Sockets Layer (SSL) and its successor, Transport Layer Security (TLS), are security protocols that provide confidentiality and data integrity for network communication. Both encrypt the connection on top of the transport layer.

SSL sits between TCP/IP and the application-layer protocols and secures the data they exchange. The SSL protocol has two layers. The SSL Record Protocol is built on a reliable transport protocol (such as TCP) and provides basic functions such as data encapsulation, compression, and encryption for higher-level protocols. The SSL Handshake Protocol runs on top of the record protocol and lets the two sides authenticate each other and negotiate encryption keys before any application data is sent.

TLS provides confidentiality and data integrity between two communicating applications. It also consists of two layers: the TLS Record Protocol and the TLS Handshake Protocol. The lower layer, the TLS Record Protocol, sits on top of a reliable transport protocol (such as TCP) and is independent of the application protocol, which is why TLS is generally described as a transport layer security protocol.

The UI thread tells the network thread to navigate to mysite.com

At this point, the network thread might receive a server redirect header like HTTP 301. In this case, the network thread tells the UI thread that the server is requesting a redirect. Then, another URL request is initiated.

  3. Read the response

The response header contains the Content-Type, and the payload is the actual data

Once the response body starts to come in, the network thread looks at the first few bytes of the stream if necessary. The Content-Type header in the response should declare the type of the data, but since it may be missing or wrong, MIME type sniffing is used to work out the type.
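
As a rough illustration of sniffing, the sketch below trusts a declared Content-Type when one is present and otherwise checks a handful of well-known byte signatures. It is a simplified stand-in, not the sniffing algorithm browsers actually implement; the function name and the fallback type are assumptions made for this example.

```javascript
// A few well-known file signatures ("magic numbers").
const SIGNATURES = [
  { type: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { type: "image/jpeg", bytes: [0xff, 0xd8, 0xff] },
  { type: "image/gif", bytes: [0x47, 0x49, 0x46, 0x38] },            // "GIF8"
  { type: "application/pdf", bytes: [0x25, 0x50, 0x44, 0x46, 0x2d] }, // "%PDF-"
];

// declaredType: value of the Content-Type header (may be missing).
// body: Uint8Array holding the first bytes of the response.
function sniffMimeType(declaredType, body) {
  const declared = (declaredType || "").split(";")[0].trim().toLowerCase();
  if (declared !== "") return declared; // trust the header in this sketch

  for (const { type, bytes } of SIGNATURES) {
    if (bytes.every((byte, i) => body[i] === byte)) return type;
  }

  // Fall back to a text check for HTML near the start of the stream.
  const head = new TextDecoder().decode(body.subarray(0, 512)).trimStart().toLowerCase();
  if (head.startsWith("<!doctype html") || head.startsWith("<html")) return "text/html";

  return "application/octet-stream"; // unknown binary data
}

const png = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x00]);
console.log(sniffMimeType(undefined, png)); // image/png
console.log(sniffMimeType("", new TextEncoder().encode("<!DOCTYPE html><html>"))); // text/html
```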

If the response is an HTML file, the next step is to pass the data to the renderer process. If it is some other kind of file, such as a zip archive, the request is a download, so the data is passed to the download manager.

The network thread asks whether the response data is HTML from a safe site

A Safe Browsing check also happens at this point. If the domain and the response data match a known malicious site, the network thread raises an alert so that a warning page is displayed. In addition, a Cross-Origin Read Blocking (CORB) check ensures that sensitive cross-site data does not reach the renderer process.

  4. Find the renderer

Once all of the checks are done and the network thread is confident that the browser should navigate to the requested site, the network thread tells the UI thread that the data is ready. The UI thread then finds a renderer process to carry on rendering the web page.

The network thread tells the UI thread to look for the renderer

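Since the network request can take time to return a response, an optimization is applied to speed this step up. When the UI thread sends the URL request to the network thread, it already knows which site is being navigated to, so it tries to find or start a renderer process in parallel. If the navigation is redirected to a different site, that renderer process may go unused and a different one may be needed.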

  5. Commit the navigation

Now that the data and the renderer process are ready, the browser process sends an IPC (inter-process communication) message to the renderer process to commit the navigation. It also passes on the data stream so that the renderer process can keep receiving the HTML data. Once the browser process hears confirmation that the commit has happened in the renderer process, the navigation is complete and the document loading phase begins.

At this point, the address bar is updated, and the security indicator and site settings UI reflect the new page's site information. The tab's session history is updated so that the back and forward buttons step through the site that was just navigated to. To make tab and session restore possible after you close a tab or window, the session history is stored on disk.

IPC between the browser and the renderer process, requesting the rendering of the page.
Extra step: initial load complete

Once the navigation is committed, the renderer process carries on loading resources and rendering the page. Once the renderer process finishes rendering, it sends an IPC back to the browser process (this happens after the onload events have fired on all frames in the page and finished executing). At this point, the UI thread stops the loading spinner on the tab.

Navigate to another site

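The simple navigation is complete. But what happens if the user enters a different URL in the address bar? The browser process goes through the same steps to navigate to the other site, but first it has to check whether the currently rendered site cares about the beforeunload event.

The beforeunload event lets a page show a “Leave this site?” prompt when the user tries to navigate away or close the tab.

Note: Do not add unconditional beforeunload handlers. The handler has to run before navigation can even start, so it adds latency. Add one only when it is needed, for example when the user would lose unsaved data.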
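
One way to follow this advice is to attach the handler only while there is something to lose. The sketch below is an illustration under assumptions: createUnsavedChangesGuard and setDirty are made-up names, and the guard takes the event target as a parameter (in a page you would pass window). Some older browsers also expect event.returnValue to be set, which is left out here.

```javascript
// Attaches a beforeunload handler only while there are unsaved changes.
// `createUnsavedChangesGuard` is a hypothetical helper, not a browser API.
function createUnsavedChangesGuard(target) {
  const onBeforeUnload = (event) => {
    // Cancelling the event asks the browser to show its leave-site prompt.
    event.preventDefault();
  };
  let attached = false;

  return {
    setDirty(isDirty) {
      if (isDirty && !attached) {
        target.addEventListener("beforeunload", onBeforeUnload);
        attached = true;
      } else if (!isDirty && attached) {
        target.removeEventListener("beforeunload", onBeforeUnload);
        attached = false;
      }
    },
  };
}

// In a page: mark dirty on edit, clean again after a successful save.
if (typeof window !== "undefined") {
  const guard = createUnsavedChangesGuard(window);
  guard.setDirty(true);  // user started editing
  guard.setDirty(false); // changes saved, navigation is fast again
}
```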

The browser process sends IPC to the renderer process telling it to navigate to another site

If the navigation is initiated by the renderer process (for example, via window.location.href = XXX), the renderer first checks its beforeunload handlers. It then goes through the same steps as a navigation initiated by the browser process, except that the navigation request is sent from the renderer process to the browser process.

When the new navigation goes to a different site than the currently rendered one, a separate renderer process is started to handle the new navigation, while the current renderer process is kept around to handle events such as unload.

Two IPCs from the browser process: one tells the new renderer to render the page, the other tells the old renderer to unload
Service Worker to be added

The internal mechanics of the renderer process

The renderer process handles web content

The renderer process is responsible for everything that happens inside a tab. In a renderer process, the main thread handles most of the code sent to the user. If a web worker or service worker is used, parts of the JavaScript are handled by worker threads. Compositor and raster threads also run inside the renderer process to render the page efficiently and smoothly.

The core job of the renderer process is to turn HTML, CSS, and JavaScript into a web page that the user can interact with.

The renderer process contains a main thread, worker threads, a compositor thread, and a raster thread
  1. Parsing
  • When the renderer process receives HTML data, the main thread parses the text string (HTML) and turns it into a Document Object Model (DOM). The DOM is the browser's internal representation of the page, as well as the data structure and API that developers interact with through JavaScript.

How HTML is parsed is defined by the HTML Standard, which is designed to handle markup errors gracefully.

  2. Subresource loading

External resources such as images, CSS, and JavaScript need to be loaded from the network or from the cache. The main thread could request them one by one as it finds them while parsing to build the DOM, but to speed things up, a “preload scanner” runs concurrently. If the document contains something like <img>, the preload scanner sends the request to the network thread in the browser process.

The main thread parses the HTML and builds the DOM tree
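
The idea behind the preload scanner can be sketched with a toy function that looks ahead in the raw markup for resource URLs. This is only an illustration: a real scanner works on the tokens produced by the HTML parser rather than on a regular expression, and the function name and the tags covered here are assumptions.

```javascript
// Toy look-ahead: collect URLs from <img src>, <script src>, and <link href>.
// A regular expression is NOT how browsers do this; it only shows the idea
// of discovering resources before the DOM is fully built.
function scanForPreloads(html) {
  const pattern =
    /<(?:img|script)\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']|<link\b[^>]*?\bhref\s*=\s*["']([^"']+)["']/gi;
  const urls = [];
  for (const match of html.matchAll(pattern)) {
    urls.push(match[1] || match[2]);
  }
  return urls;
}

const page = `
<html>
  <head>
    <link rel="stylesheet" href="style.css">
    <script src="app.js"></script>
  </head>
  <body><img src="hero.png" alt=""></body>
</html>`;

console.log(scanForPreloads(page)); // [ 'style.css', 'app.js', 'hero.png' ]
```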

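When the HTML parser finds a <script> tag, it pauses parsing of the document and has to load, parse, and execute the JavaScript first, because the script can change the shape of the document (for example with document.write()).

  3. Hint to the browser how to load resources

You can add the async or defer attribute to the <script> tag so that the browser loads the script without blocking parsing.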

  4. Style calculation

The DOM alone is not enough to determine what the page looks like, because elements are styled with CSS. The main thread parses the CSS and determines the computed style of each DOM node.

The main thread parses the CSS to add computed styles
  5. Layout

Knowing the computed style of each node is still not enough to render a page. Layout is the process of finding the geometry of elements. The main thread walks the DOM and the computed styles and creates a layout tree that holds information such as x and y coordinates and bounding box sizes. The layout tree may look similar to the DOM tree, but it only contains information about what is visible on the page. If display: none is applied to an element, that element is not part of the layout tree. Conversely, if a pseudo-element with content such as p::before { content: "Hi!" } is applied, it is included in the layout tree even though it is not in the DOM.

The main thread iterates through the computed style DOM tree to generate the layout tree
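
The difference between the DOM tree and the layout tree can be shown with a small sketch. The node shape used here (tag, style, before, children) is invented for the example, and the geometry calculation itself is left out. Only the two rules from the paragraph above are modeled: display: none subtrees are dropped, and ::before content is added.

```javascript
// Builds a simplified layout tree from a toy DOM with computed styles.
// Geometry (x, y, width, height) is deliberately omitted in this sketch.
function buildLayoutTree(node) {
  // Elements with display: none (and their descendants) produce no boxes.
  if (node.style && node.style.display === "none") return null;

  const children = [];
  // Generated content is not in the DOM but does get a box.
  if (node.before) {
    children.push({ name: "::before", text: node.before, children: [] });
  }
  for (const child of node.children || []) {
    const box = buildLayoutTree(child);
    if (box !== null) children.push(box);
  }
  return { name: node.tag, children };
}

const dom = {
  tag: "body",
  children: [
    { tag: "p", before: "Hi!", children: [] },
    { tag: "div", style: { display: "none" }, children: [{ tag: "span" }] },
  ],
};

console.log(JSON.stringify(buildLayoutTree(dom)));
// {"name":"body","children":[{"name":"p","children":[{"name":"::before","text":"Hi!","children":[]}]}]}
```
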
  6. Paint

Having a DOM, styles, and layout is still not enough to render a page: you also need to know the order in which to paint. For example, z-index may be set on some elements, in which case painting in the order the elements appear in the HTML gives an incorrect result.

Because z-index is not taken into account, page elements are painted in HTML order, resulting in an incorrectly rendered image

In the paint step, the main thread walks the layout tree to create paint records. A paint record is a note of the painting process, like “background first, then rectangles, then text”.

The main thread walks the layout tree and produces paint records
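
Why the order matters can be seen in a small sketch that sorts boxes by z-index and falls back to document order for ties. This is a simplification made for illustration: real paint order also depends on stacking contexts, positioning, and several other rules, and the box shape used here is an assumption.

```javascript
// Orders boxes back to front: lower z-index first, document order for ties.
// Stacking contexts and positioning rules are ignored in this sketch.
function paintOrder(boxes) {
  return boxes
    .map((box, index) => ({ box, index }))
    .sort((a, b) => (a.box.zIndex ?? 0) - (b.box.zIndex ?? 0) || a.index - b.index)
    .map(({ box }) => box.name);
}

const boxes = [
  { name: "header", zIndex: 2 },
  { name: "content" },
  { name: "modal", zIndex: 5 },
  { name: "footer" },
];

console.log(paintOrder(boxes)); // [ 'content', 'footer', 'header', 'modal' ]
```
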
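  • Updating the rendering pipeline is costly
DOM + style, layout, and paint trees, in the order they are generated

The most important thing to understand about the rendering pipeline is that each step uses the result of the previous step to create new data. For example, if something changes in the layout tree, the paint order has to be regenerated for the affected parts of the document. If you animate an element, the browser has to run these operations for every frame. Most displays refresh 60 times per second (60 fps), and the animation looks smooth only if something moves on every frame; if the browser misses frames in between, the page feels janky.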

Animation frames on the timeline
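
The time available per frame is simple arithmetic. The sketch below assumes a 60 Hz display by default; the function names are made up for this example, and the browser's own overhead within each frame is not modeled.

```javascript
// Time available to produce one frame at a given refresh rate.
function frameBudgetMs(refreshRateHz) {
  return 1000 / refreshRateHz;
}

// How many frames are missed if the work for one update takes workMs.
function missedFrames(workMs, refreshRateHz = 60) {
  const framesNeeded = Math.ceil(workMs / frameBudgetMs(refreshRateHz));
  return Math.max(0, framesNeeded - 1);
}

console.log(frameBudgetMs(60).toFixed(2)); // 16.67
console.log(missedFrames(10));             // 0 (fits within one frame)
console.log(missedFrames(40));             // 2 (the update spans three frames)
```
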
  7. Actually drawing the page
  • Rasterization
Simple raster processing diagram

Now the browser knows the structure of the document, the style of each element, the geometry of the page, and the paint order. Turning this information into pixels on the screen is called rasterization.

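  • Compositing

Compositing is a technique that separates parts of a page into layers, rasterizes them separately, and composites them into a page on a separate thread, the compositor thread. When the page scrolls, the layers are already rasterized, so the compositor only has to composite a new frame.

  • Dividing into layers

To find out which elements belong in which layer, the main thread walks the layout tree to create a layer tree. If a part of the page that should be its own layer (such as a slide-in side menu) is not getting one, you can hint to the browser with the CSS will-change property.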

  • Raster and composite off the main thread

Once the layer tree has been created and the paint order determined, the main thread commits that information to the compositor thread, which then rasterizes each layer. A layer can be as large as the entire page, so the compositor thread divides it into tiles and sends each tile to the raster threads. The raster threads rasterize each tile and store it in GPU memory.

Raster threads create bitmaps of the tiles and send them to the GPU

The compositor thread can prioritize the raster work so that tiles inside the viewport (or near it) are rasterized first. A layer also has multiple tilings at different resolutions to handle things like zooming in.
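
Splitting a layer into tiles (the blocks described above) and rasterizing the ones nearest the viewport first can be sketched like this. The tile size, the vertical-only distance measure, and the function names are assumptions made for the example, not values or APIs taken from Chrome.

```javascript
// Splits a layer into square tiles; edge tiles may be smaller.
function tileLayer(layerWidth, layerHeight, tileSize) {
  const tiles = [];
  for (let y = 0; y < layerHeight; y += tileSize) {
    for (let x = 0; x < layerWidth; x += tileSize) {
      tiles.push({
        x,
        y,
        width: Math.min(tileSize, layerWidth - x),
        height: Math.min(tileSize, layerHeight - y),
      });
    }
  }
  return tiles;
}

// Vertical distance from a tile to the viewport; 0 when they overlap or touch.
function distanceToViewport(tile, viewport) {
  const gapAbove = viewport.top - (tile.y + tile.height);
  const gapBelow = tile.y - viewport.bottom;
  return Math.max(0, gapAbove, gapBelow);
}

// Returns a copy of the tiles ordered so the closest are rasterized first.
function prioritize(tiles, viewport) {
  return [...tiles].sort(
    (a, b) => distanceToViewport(a, viewport) - distanceToViewport(b, viewport)
  );
}

const tiles = tileLayer(512, 1024, 256);              // 2 columns x 4 rows
const queue = prioritize(tiles, { top: 600, bottom: 900 });
console.log(tiles.length);                            // 8
console.log(queue[queue.length - 1].y);               // 0 (the top row is rasterized last)
```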

Once the tiles are rasterized, the compositor thread gathers information about them, called draw quads, to create a compositor frame.

  • Draw quads: contain information such as the tile's location in memory and where on the page to draw the tile, taking page compositing into account.
  • Compositor frame: a collection of draw quads that represents one frame of a page.

The compositor frame is then submitted to the browser process via IPC. At this point, another compositor frame may be added from the UI thread for a change in the browser UI, or from other renderer processes such as those for extensions. These compositor frames are sent to the GPU to be displayed on the screen. If a scroll event comes in, the compositor thread creates another compositor frame to send to the GPU.

The compositor thread creates the compositor frame and sends it to the browser process, which in turn sends it to the GPU

User input and the compositor

From the browser's point of view, anything the user does is input. This includes clicking, scrolling, touching the screen, and moving the mouse.

For example, when the user touches the screen, the browser process is the first to receive the gesture. However, the browser process only knows where the gesture occurred, because the content inside the tab is handled by the renderer process. So the browser process sends the event type and its coordinates to the renderer process, which handles the event by finding the event target and running the attached event listeners.

The compositor receives input events

The viewport hovering over the page layers
Understanding non-fast scrollable regions

Running JavaScript is the job of the renderer's main thread. When a page is composited, the compositor thread marks each region of the page that has event handlers attached as a “non-fast scrollable region”. When an input event occurs inside such a region, the compositor thread sends the event to the main thread. When an input event occurs outside it, the compositor thread carries on compositing new frames without waiting for the main thread.

Be careful when writing event handlers
const area = document.getElementById('area'); // assumed: the #area element
document.body.addEventListener('touchstart', event => {
    if (event.target === area) {
        event.preventDefault();
    }
});

Event delegation is a common event handling pattern in web development: since events bubble, a single handler is attached to the topmost element and delegates work based on the event target. The problem is that the entire page is then marked as a non-fast scrollable region, so every time an input event comes in, the compositor thread has to ask the main thread whether it needs to handle the event and wait for the answer. This defeats the compositor's ability to scroll smoothly.

You can mitigate this by passing the passive: true option to the event listener. This hints to the browser that you still want to listen to the event on the main thread, but that the compositor can go ahead and composite a new frame without waiting.

document.body.addEventListener('touchstart', event => {
    if (event.target === area) {
        // Note: preventDefault() is ignored inside a passive listener.
        event.preventDefault();
    }
}, {passive: true});

For example, suppose you want to limit scrolling in a certain area to the horizontal direction only. Using passive: true lets the page scroll smoothly, but vertical scrolling may already have started by the time you want to call event.preventDefault() to limit the scroll direction. You can check for this with event.cancelable.

document.body.addEventListener('pointermove', event => {
    // Only try to cancel while the event can still be cancelled.
    if (event.cancelable) {
        event.preventDefault();
    }
}, {passive: true});

You can also use the CSS touch-action property to eliminate the event handler entirely, for example: #area { touch-action: pan-x; }

Finding the event target

When the compositor thread sends an input event to the main thread, the first thing the main thread does is run a hit test to find the event target. The hit test uses the paint records generated during rendering to find what is underneath the coordinates where the event occurred.

The main thread looks at the paint records and asks what is drawn at point x, y
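
A hit test over paint records can be sketched as a search from the topmost painted item down to the bottom one. The record shape (a target name plus a rectangle) is an assumption made for this example; real paint records carry much more information.

```javascript
// paintRecords are listed in paint order, so the last one is on top.
// Returns the target of the topmost record containing the point, or null.
function hitTest(paintRecords, x, y) {
  for (let i = paintRecords.length - 1; i >= 0; i--) {
    const r = paintRecords[i];
    const inside =
      x >= r.x && x < r.x + r.width && y >= r.y && y < r.y + r.height;
    if (inside) return r.target;
  }
  return null;
}

const records = [
  { target: "body", x: 0, y: 0, width: 800, height: 600 },
  { target: "button", x: 100, y: 100, width: 80, height: 30 },
];

console.log(hitTest(records, 120, 110)); // button
console.log(hitTest(records, 10, 10));   // body
console.log(hitTest(records, 900, 10));  // null
```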

Event optimization

A typical display refreshes the screen 60 times a second, but some events fire more often than that. To reduce the load on the main thread, Chrome coalesces continuous events (such as wheel, mousewheel, mousemove, pointermove, and touchmove) and delays dispatching them until just before the next frame is rendered. Discrete events such as keydown, keyup, mouseup, mousedown, touchstart, and touchend are dispatched immediately.
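
The effect of coalescing (merging) can be simulated with plain data. In this sketch, events carry a time in milliseconds, and consecutive continuous events of the same type that fall into the same fixed 60 Hz frame window are merged into one dispatch. The fixed windows and the data shapes are assumptions for illustration; a browser aligns dispatch with its actual rendering schedule.

```javascript
const CONTINUOUS = new Set(["wheel", "mousewheel", "mousemove", "pointermove", "touchmove"]);

// Merges consecutive continuous events of the same type within one frame.
// Discrete events always get their own dispatch.
function coalesce(events, frameMs = 1000 / 60) {
  const dispatches = [];
  for (const event of events) {
    const frame = Math.floor(event.time / frameMs);
    const last = dispatches[dispatches.length - 1];
    const canMerge =
      CONTINUOUS.has(event.type) && last && last.type === event.type && last.frame === frame;
    if (canMerge) {
      last.latest = event;        // handlers see the newest position
      last.coalesced.push(event); // but the in-between points are kept
    } else {
      dispatches.push({ type: event.type, frame, latest: event, coalesced: [event] });
    }
  }
  return dispatches;
}

const result = coalesce([
  { type: "pointermove", time: 1 },
  { type: "pointermove", time: 5 },
  { type: "pointermove", time: 9 },
  { type: "pointermove", time: 20 },
  { type: "keydown", time: 21 },
  { type: "keydown", time: 22 },
]);

console.log(result.length);              // 4
console.log(result[0].coalesced.length); // 3
```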

Use getCoalescedEvents to get intra-frame events

Event coalescing helps most web applications deliver a good user experience. However, if you are building a drawing application that draws a path from the coordinates of touchmove events, the in-between coordinates may be lost to coalescing, and the line will not be smooth. In that case, you can call the getCoalescedEvents method on the pointer event to get the coalesced events.

window.addEventListener('pointermove', event => {
    const events = event.getCoalescedEvents();
    for (const coalesced of events) {
        const x = coalesced.pageX;
        const y = coalesced.pageY;
        // use the x and y coordinates to draw a line
    }
});

Reference:

  • CPU, GPU, Memory, and multi-process architecture
  • What happens in navigation
  • Input is coming to the Compositor
  • Inner workings of a Renderer Process
  • How browsers work: Behind the scenes of the new Web browser