Preface

In interviews, we are often asked this question: what happens in the browser between entering a URL in the address bar and the page being displayed? It may seem like a cliche, but how well the question is answered says a lot about the breadth and depth of the candidate's knowledge.

This article explains, from the browser's perspective, what happens inside the browser after you enter a URL and press Enter. After reading it, you will know:

  • Which processes make up the browser, and what each of them does
  • What the browser's internal processes and threads do after a URL is entered in the address bar
  • How the internal processes handle events when we interact with the page

Browser architecture

Before we talk about browser architecture, let's understand two concepts: processes and threads.

A process is one execution of a program. It is a dynamic concept, and it is the basic unit by which resources are allocated and managed while the program runs. A thread is the basic unit of CPU scheduling and dispatch, and it shares all of the process's resources with the other threads in the same process.

In simple terms, a process can be understood as a running application, and a thread as something that executes the application's code. Threads run inside a process: a process may have one or more threads, while a thread belongs to exactly one process.

The browser is an application too. Running an application means the computer starts a process, and the operating system allocates memory to that process. The threads of the process then use those resources to carry out the application's work.

To meet its functional needs, a process can start new processes to handle other tasks. Each new process has its own independent memory space and cannot share memory with the process that started it. If these processes need to communicate with each other, they do so through Inter-Process Communication (IPC).

Process 1

Many applications use this multi-process approach because processes are independent of each other and do not affect each other. That is, when one process dies, it does not affect the execution of the other processes. You only need to restart the stalled process to resume running.

Multi-process architecture for browsers

If we were to develop a browser, its architecture could be a single-process multithreaded application, or a multi-process application using IPC communication.

Different browsers use different architectures. The following uses Chrome as an example to describe the multi-process architecture of the browser.

In Chrome, there are four main processes:

  • Browser Process: Responsible for the parts of the browser outside the page, such as the back and forward buttons, the address bar and the bookmark bar, and for the browser's invisible low-level operations, such as network requests and file access.
  • Renderer Process: Responsible for displaying everything inside a tab. Also called the rendering engine.
  • Plugin Process: Controls the plugins used by web pages.
  • GPU Process: Handles the GPU tasks of the entire application.
Process relationship

What is the relationship between these four processes?

First, when we want to browse a web page, we enter a URL in the browser's address bar. The Browser Process sends a request to that URL, retrieves the HTML content, and passes it to the Renderer Process. The Renderer Process parses the HTML and hands any resources that must be fetched over the network back to the Browser Process for loading. It also notifies the Browser Process when the Plugin Process is needed to load plug-in resources and execute plug-in code. After parsing, the Renderer Process computes the image frames and hands them to the GPU Process, which turns them into the image shown on the screen.

Process relationship

Benefits of a multi-process architecture

Why does Chrome use multi-process architecture?

First, higher fault tolerance. In today's web applications, HTML, JavaScript and CSS are increasingly complex. Code running in the rendering engine often has bugs, and some bugs crash the rendering engine outright. With a multi-process architecture, each rendering engine runs in its own process and the processes do not affect one another: when one page crashes, the other pages keep running normally.

Browser fault tolerance

Second, better security and sandboxing. Rendering engines often encounter untrusted or even malicious code from the network, which may try to exploit vulnerabilities to install malicious software on your computer. To address this, browsers restrict the permissions of individual processes and run them in sandboxed environments, making them more secure and reliable.

Third, better responsiveness. In a single-process architecture, all tasks compete for the same CPU resources, which slows the browser's response, while a multi-process architecture avoids this disadvantage.

Multi-process architecture optimization

As mentioned earlier, the Renderer Process is responsible for displaying a tab, which suggests one Renderer Process per tab. However, processes cannot share memory, so different processes often end up holding copies of the same content.

The browser process mode

To save memory, Chrome provides four process models, each of which handles the processes for tabs differently.

  • Process-per-site-instance (default) – One process for each site-instance
  • Process-per-site – One process for each site
  • Process-per-tab – One process for each tab
  • Single process – All tabs share a single process

Site and site-instance need to be defined here:

  • Site refers to the same registered domain name (e.g. google.com, bbc.co.uk) and scheme (e.g. https://). For example, a.baidu.com and b.baidu.com belong to the same site. Note that same-site is not the same as same-origin: the same-origin policy also compares the subdomain and the port.

  • Site-instance refers to a group of connected pages from the same site, where connected means the pages can obtain references to each other in script code. A new page and an existing page belong to the same site-instance if they belong to the same site as defined above and the new page was opened in one of the following two ways:

    • The user clicked a link of the form <a target="_blank">
    • JS code opened the new page (e.g. window.open)
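The difference between same-site and same-origin can be sketched in a few lines. This is only an illustration: the function names and the tiny suffix table are assumptions made for the example, whereas a real browser derives the registered domain from the full Public Suffix List.

```javascript
// Hypothetical, tiny stand-in for the Public Suffix List.
const MULTI_LABEL_SUFFIXES = new Set(['co.uk', 'com.cn']);

// Returns scheme + registered domain, e.g. "https://baidu.com".
function siteOf(urlString) {
  const url = new URL(urlString);
  const labels = url.hostname.split('.');
  const lastTwo = labels.slice(-2).join('.');
  // Keep one extra label when the suffix itself has two labels (bbc.co.uk).
  const keep = MULTI_LABEL_SUFFIXES.has(lastTwo) ? 3 : 2;
  return `${url.protocol}//${labels.slice(-keep).join('.')}`;
}

function isSameSite(a, b) {
  return siteOf(a) === siteOf(b);
}

// Origin = scheme + full host + port, so subdomains matter here.
function isSameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}
```

With these helpers, https://a.baidu.com and https://b.baidu.com are same-site but not same-origin, which is the distinction the definition above draws.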

With these concepts understood, the four process models are explained below.

The first is single process: as the name implies, all tabs use the same process. Next comes process-per-tab, which creates a new process for each tab that is opened. With process-per-site, if you open a page on a.baidu.com and a page on b.baidu.com, both tabs use the same process, because the two pages belong to the same site. As a result, if one of the tabs crashes, the other crashes too.

Process-per-site-instance is the most important, because it is Chrome's default and therefore the model almost all users run. If you open one tab to visit a.baidu.com and then open another tab to visit b.baidu.com, the two tabs use two processes. If instead a page on a.baidu.com opens b.baidu.com through JS code, the two tabs use the same process.
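As a rough sketch of how the four models differ, the function below maps a tab to a key, and tabs with the same key would share a renderer process. The field names (tabId, site, groupId) are hypothetical; groupId stands for the group of connected pages that a tab belongs to.

```javascript
// Returns a key identifying the renderer process a tab would use.
// Tabs with equal keys share a process.
function processKey(model, tab) {
  switch (model) {
    case 'single-process':
      return 'shared';
    case 'process-per-tab':
      return `tab:${tab.tabId}`;
    case 'process-per-site':
      return `site:${tab.site}`;
    case 'process-per-site-instance':
      // Same site AND same group of connected pages.
      return `site:${tab.site}|group:${tab.groupId}`;
    default:
      throw new Error(`unknown model: ${model}`);
  }
}

function countProcesses(model, tabs) {
  return new Set(tabs.map((tab) => processKey(model, tab))).size;
}

const site = 'https://baidu.com';
const tabA = { tabId: 1, site, groupId: 1 }; // a.baidu.com
const tabB = { tabId: 2, site, groupId: 2 }; // b.baidu.com, opened separately
const tabC = { tabId: 3, site, groupId: 1 }; // b.baidu.com, opened from tabA by JS
```

For these three tabs, process-per-tab needs three processes, process-per-site needs one, and process-per-site-instance needs two, because tabA and tabC are connected while tabB is not.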

Default mode selection

So why is process-per-site-instance the default process model?

Process-per-site-instance balances performance and ease of use, and is a middle-of-the-road, general-purpose model.

  • Compared with process-per-tab, it starts far fewer processes, which means a smaller memory footprint
  • Compared with process-per-site, it better isolates unrelated tabs under the same domain name, which is more secure

What happens during navigation

We've talked about multi-process architecture in browsers, its benefits, and how Chrome optimizes it. Let's take a closer look at how processes and threads work together to render a web page, using the simple scenario of a user browsing the web.

Page loading process

As mentioned earlier, most of the work outside the tab is done by the Browser Process, which has different worker threads for different tasks:

  • UI thread: controls the buttons and input fields of the browser.
  • Network thread: handles network requests and fetches data from the network.
  • Storage thread: controls file access.
Browser process thread

Step 1: Process the input

When we type into the address bar and press Enter, the UI thread determines whether the input is a search query or a URL. If it is a search query, the query is handed to the default search engine, which gives the URL to request. If it is a URL, the browser requests that URL directly.
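The decision between search query and URL could look roughly like the sketch below. The heuristics and the search endpoint (search.example) are assumptions made for illustration and are much simpler than what a real browser does.

```javascript
// Hypothetical search endpoint, used only for illustration.
const SEARCH_URL = 'https://search.example/?q=';

// Returns the normalized URL, or null if the text does not parse as one.
function tryParse(candidate) {
  try {
    return new URL(candidate).href;
  } catch {
    return null;
  }
}

function resolveInput(text) {
  const input = text.trim();
  // "https://..." style input with an explicit scheme.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  // A single token that looks like a host name, e.g. "example.com/docs".
  const looksLikeHost = /^[^\s/]+\.[a-z]{2,}(\/\S*)?$/i.test(input);
  const href = hasScheme
    ? tryParse(input)
    : looksLikeHost
      ? tryParse(`https://${input}`)
      : null;
  return href
    ? { kind: 'url', url: href }
    : { kind: 'search', url: SEARCH_URL + encodeURIComponent(input) };
}
```

Under these assumptions, "example.com/docs" is treated as a URL, while "how browsers work" is turned into a search URL.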

Process the input

Step 2: Start navigation

After Enter is pressed, the UI thread hands the URL (either the search URL for the query or the URL that was entered) to the network thread. The UI thread then shows a loading indicator on the tab, while the network thread performs the steps needed for the request, such as DNS lookup and establishing a TLS connection. If the network thread receives a 301 redirect response from the server, it tells the UI thread about the redirect and then issues a new network request.
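The redirect handling in this step can be sketched as a loop that keeps requesting until a non-redirect response arrives. The fetchOnce callback and the fake server table are hypothetical stand-ins for the real network stack, and the cap on redirects is an arbitrary value chosen for the example.

```javascript
// `fetchOnce(url)` is a hypothetical function returning
// { status, location?, body? } for a single request.
function navigate(startUrl, fetchOnce, maxRedirects = 20) {
  const visited = [startUrl];
  let url = startUrl;
  for (let hops = 0; hops <= maxRedirects; hops++) {
    const response = fetchOnce(url);
    const isRedirect =
      [301, 302, 303, 307, 308].includes(response.status) && response.location;
    if (!isRedirect) return { finalUrl: url, visited, response };
    // Location may be relative, so resolve it against the current URL.
    url = new URL(response.location, url).href;
    visited.push(url);
  }
  throw new Error('too many redirects');
}

// Fake server used only for this example.
const fakeServer = {
  'http://example.com/': { status: 301, location: 'https://example.com/' },
  'https://example.com/': { status: 200, body: '<html></html>' },
};
const result = navigate('http://example.com/', (url) => fakeServer[url]);
```

Here result.visited records both URLs, and result.finalUrl is the HTTPS address that finally answered with 200.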

Start navigation

Step 3: Read the response

After receiving the response from the server, the network thread parses the HTTP response and determines the media type (MIME type) of the response body from the Content-Type field in the response header. If the media type is HTML, the response data is passed to the renderer process for further work. If it is a ZIP file or some other file, it is passed to the download manager.

At the same time, the browser runs a Safe Browsing security check: if the domain name or the response data matches a known malicious site, the network thread displays a warning page. In addition, the network thread performs a CORB (Cross-Origin Read Blocking) check to make sure that sensitive cross-site data is not sent to the renderer process.
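A minimal sketch of this routing decision, assuming the headers arrive as a plain object with lower-case keys. It follows the simplified two-way split described above; the function names are hypothetical, and the fallback a real browser uses when the header is missing is not modeled.

```javascript
// Extracts the media type from a Content-Type header value,
// e.g. "text/html; charset=utf-8" -> "text/html".
function mimeTypeOf(headers) {
  const value = headers['content-type'] || '';
  return value.split(';')[0].trim().toLowerCase();
}

// Decides who handles the response body.
function routeResponse(headers) {
  return mimeTypeOf(headers) === 'text/html' ? 'renderer' : 'download-manager';
}
```

For example, a response with Content-Type "text/html; charset=utf-8" goes to the renderer, while "application/zip" goes to the download manager.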

Step 4: Find the renderer process

Once the network thread has confirmed that the browser can navigate to the requested page, it notifies the UI thread that the data is ready, and the UI thread finds a renderer process to render the page.

Find the renderer

To speed up this step, the browser proactively looks for and starts a renderer process as early as step 2. If all goes well, the renderer process is already standing by when the network thread receives the data. However, if a redirect is encountered, the prepared renderer process may not be usable, in which case a new renderer process is started.

Step 5: Submit the navigation

Now that the data and the renderer process are both ready, the Browser Process sends an IPC message to the Renderer Process to submit the navigation, and passes it the prepared data. Once the Renderer Process has received the data, it sends an IPC message back to tell the Browser Process that the navigation has been submitted and the page has started loading.

Submit the navigation

At this point the address bar is updated, the security indicator (the little lock in front of the address) is updated, and the tab's history is updated, so that the back and forward buttons can step through the pages.

Step 6: Initialization is complete

Once the navigation is submitted, the renderer process starts loading resources and rendering the page (detailed below). When rendering is finished (the onload event has fired on the page and on all of its iframes), the renderer process sends an IPC message to the browser process, and the UI thread stops showing the loading icon on the tab.

Principles of Web Rendering

After the navigation process is complete, the browser process hands the data to the renderer process, which takes care of everything inside the tab. Its core job is to turn HTML, CSS and JS code into a web page that the user can interact with. So how does the renderer process work?

The renderer process contains the following threads:

  • A main thread
  • Multiple worker threads
  • A compositor thread
  • Multiple raster threads
Threads in the renderer process

Different threads have different job responsibilities.

Build the DOM

When the renderer process receives the message to submit the navigation, it starts receiving data from the browser process. The main thread parses that data and turns it into a Document Object Model (DOM).

The DOM is both the browser's internal data structure for the page and the API through which web developers interact with the page from JavaScript.

Subresource loading

While the DOM is being built, the parser comes across images, CSS, JavaScript and other resources that must be fetched from the network or the cache. The main thread could request them one by one as it finds them, but to improve efficiency the browser also runs a preload scanner. If the HTML contains tags such as img and link, the preload scanner passes those requests to the network thread of the Browser Process so the resources can be downloaded.

Loading child resources

Download and execute JavaScript

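If a <script> tag is encountered while the DOM is being built, the parser pauses HTML parsing and loads, parses and executes the JavaScript first, because the script can change the document (for example through document.write()).

However, there are several ways to tell the browser how to load a resource. For example, if you add an attribute such as async or defer to the <script> tag, the browser loads the script asynchronously and does not block HTML parsing.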

Style calculation

The DOM tree only describes the structure of the page. To know what the page looks like, we also need the style of each DOM node. When the main thread parses the page and encounters CSS, in a <style> tag or in a stylesheet loaded with a <link> tag, it parses the CSS and determines the computed style of each DOM node.

The computed style is the concrete style that the main thread works out for each DOM element based on CSS selectors. Even if the page has no custom styles at all, the browser supplies default styles.
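To make the idea of a computed style concrete, here is a deliberately small sketch. It supports only tag and .class selectors, uses a made-up table of default styles, and lets more specific rules override less specific ones; the real cascade has many more rules.

```javascript
// Hypothetical user-agent defaults; real browsers ship a much larger sheet.
const DEFAULT_STYLES = {
  h1: { display: 'block', 'font-size': '2em' },
  span: { display: 'inline' },
};

// Supports only "tag" and ".class" selectors.
function matches(selector, node) {
  return selector.startsWith('.')
    ? node.classes.includes(selector.slice(1))
    : node.tag === selector;
}

// Class selectors outrank tag selectors.
function specificity(selector) {
  return selector.startsWith('.') ? 10 : 1;
}

function computeStyle(node, rules) {
  const applicable = rules
    .map((rule, order) => ({ ...rule, order }))
    .filter((rule) => matches(rule.selector, node))
    // Lowest priority first, so later assignments win.
    .sort((a, b) =>
      specificity(a.selector) - specificity(b.selector) || a.order - b.order);
  return Object.assign(
    {},
    DEFAULT_STYLES[node.tag],
    ...applicable.map((rule) => rule.declarations),
  );
}
```

An h1 with no author rules still gets display: block from the defaults, which mirrors the point that the browser provides a style even when the page defines none.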

Style calculation

Layout

With the DOM tree and the computed styles in hand, we still need to know where each node sits on the page. Layout is the process of working out the geometry of all elements.

The main thread walks the DOM and the computed styles and builds a layout tree (also called the render tree) that holds the page coordinates and box size of each element. Elements hidden with display: none are left out of the layout tree, while pseudo-elements appear in the layout tree even though they are not in the DOM.
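The sketch below shows the two ideas in this step on a toy tree: hidden elements are dropped, pseudo-elements are added, and each box gets a position. The node shape (name, style, height, before) is hypothetical, and the layout rule is reduced to stacking every box vertically.

```javascript
// Builds a layout tree from a DOM-like tree. Nodes with display: none
// are skipped together with their subtree; a `before` pseudo-element
// is added even though it is not a DOM node.
function buildLayoutTree(node) {
  if (node.style?.display === 'none') return null;
  const children = [];
  if (node.before) {
    children.push({ name: `${node.name}::before`, height: node.before.height, children: [] });
  }
  for (const child of node.children || []) {
    const box = buildLayoutTree(child);
    if (box) children.push(box);
  }
  return { name: node.name, height: node.height || 0, children };
}

// Simplified block layout: boxes are stacked vertically.
// Returns the y coordinate just below the box.
function layout(box, y = 0) {
  box.y = y;
  let cursor = y;
  for (const child of box.children) cursor = layout(child, cursor);
  // A container is at least as tall as its content.
  box.height = Math.max(box.height, cursor - y);
  return y + box.height;
}

const dom = {
  name: 'body',
  children: [
    { name: 'h1', height: 40, before: { height: 10 } },
    { name: 'aside', height: 100, style: { display: 'none' } },
    { name: 'p', height: 20 },
  ],
};
const layoutTree = buildLayoutTree(dom);
layout(layoutTree);
```

In the resulting tree the aside is missing, h1::before is present, and the p box starts directly below the h1.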

layout

Paint

After layout we know the structure, style and geometry of the elements, but to draw the page we also need to know the order in which they are drawn. In the paint stage, the main thread traverses the layout tree and generates a series of paint records. A paint record can be seen as a note recording the order in which elements are drawn.
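A paint stage along these lines might be sketched as follows. The record shape and the zIndex field are assumptions, and the ordering rule is simplified to sorting siblings by z-index only.

```javascript
// Walks a layout tree and emits paint records in drawing order.
// Simplification: siblings are ordered by z-index only; real
// stacking rules are more involved.
function paint(box, records = []) {
  records.push({ op: 'drawBackground', target: box.name });
  if (box.text) {
    records.push({ op: 'drawText', target: box.name, text: box.text });
  }
  const ordered = [...(box.children || [])]
    .sort((a, b) => (a.zIndex || 0) - (b.zIndex || 0));
  for (const child of ordered) paint(child, records);
  return records;
}

const paintRecords = paint({
  name: 'root',
  children: [
    { name: 'overlay', zIndex: 1 },
    { name: 'content', text: 'hi' },
  ],
});
```

Although overlay comes first in the tree, its record is emitted last because of its higher z-index, so it ends up drawn on top of content.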

paint

Compositing

We now have the document structure, the style of each element, the geometry and the paint order. To put the page on screen, this information has to be turned into pixels. That conversion is called rasterizing.

The simplest way to draw a page is to rasterize only the content inside the viewport. When the user scrolls, the rasterized frame is moved and the newly exposed part is rasterized to fill the gap, as follows:

The simplest rasterization process

The first version of Chrome used this simple approach. Its drawback is that every time the page scrolls, the raster thread has to rasterize the content that moves into view, which costs performance. To optimize this, Chrome uses a more sophisticated process called compositing.

So, what is compositing? Compositing is a technique that separates a page into layers, rasterizes them separately, and then combines them into a page in a dedicated thread, the compositor thread. When the user scrolls, the layers are already rasterized, so all the browser has to do is composite a new frame. Animations work the same way: layers are moved and a new frame is composited.

The rasterization process of synthesis

To use compositing, we need to decide which elements go into which layer. The main thread traverses the layout tree to create a layer tree. Elements with the will-change CSS property get their own layer. For elements without will-change, the browser decides case by case whether to give them a separate layer.

layer tree

You might be tempted to give every element on the page its own layer. But beyond a certain number of layers, compositing them becomes slower than rasterizing a small part of the page each frame, so it is important to measure the rendering performance of your application.

Once the layer tree is created and the paint order is determined, the main thread notifies the compositor thread, which starts rasterizing each layer in the layer tree. A layer can be as large as the entire page, so the compositor thread splits it into tiles and sends them to the raster threads. When a raster thread finishes a tile, it stores the result in GPU memory.

The raster thread creates the bitmap of a tile and sends it to the GPU

To optimize the display experience, the compositor thread can prioritize the raster work, so that tiles in or near the viewport are rasterized first.

Once the tiles of the layers are rasterized, the compositor thread gathers tile information called draw quads to build a compositor frame.

  • Draw quad: contains information such as the tile's location in memory and where the tile is drawn on the page once the layers are composited.
  • Compositor frame: a collection of draw quads that represents one frame of the page.
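Tiling and prioritization can be illustrated with the sketch below. The tile size and the distance measure (vertical distance to the viewport) are assumptions picked for the example, not values taken from Chrome.

```javascript
const TILE = 256; // hypothetical tile size in pixels

// Splits a layer into tiles; tiles on the right and bottom edge may be smaller.
function splitIntoTiles(layerWidth, layerHeight) {
  const tiles = [];
  for (let y = 0; y < layerHeight; y += TILE) {
    for (let x = 0; x < layerWidth; x += TILE) {
      tiles.push({
        x,
        y,
        width: Math.min(TILE, layerWidth - x),
        height: Math.min(TILE, layerHeight - y),
      });
    }
  }
  return tiles;
}

// Vertical distance between a tile and the viewport; 0 when they overlap.
function distanceToViewport(tile, viewport) {
  if (tile.y + tile.height <= viewport.top) return viewport.top - (tile.y + tile.height);
  if (tile.y >= viewport.bottom) return tile.y - viewport.bottom;
  return 0;
}

// Tiles in or near the viewport come first.
function rasterOrder(tiles, viewport) {
  return [...tiles].sort(
    (a, b) => distanceToViewport(a, viewport) - distanceToViewport(b, viewport));
}
```

For a 256 x 1024 layer and a viewport covering y = 600 to 700, the tile starting at y = 512 is rasterized first, followed by the tiles at 768, 256 and 0.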

When all of the steps above are complete, the compositor thread submits the compositor frame to the browser process via IPC. The browser process's UI thread may submit another compositor frame of its own to update the browser's UI. These compositor frames are sent to the GPU and displayed on the screen. If a scroll event comes in, the compositor thread builds another compositor frame and sends it to the GPU to update the page.

The compositor thread builds compositor frames, which are sent to the browser process and then to the GPU

The advantage of compositing is that it does not involve the main thread, so the compositor thread does not have to wait for style calculation or JavaScript execution. This is why compositing-only animations are the smoothest. If an animation changes layout or paint, the main thread has to recompute, and the animation is naturally much slower.

Browser handling of events

When the page has been rendered, the tab shows an interactive web page, and the user can move the mouse, click on the page and so on. How does the browser handle these events?

Take a click as an example. When the mouse clicks on the page, the Browser Process is the first to receive the event, but it only knows the type of the event and where it occurred. How the click is actually handled is up to the Renderer Process of the tab. The Browser Process therefore passes the event information to the renderer process, which uses the coordinates of the event to find the target object and runs the click listeners bound to it.

Click events are routed from the browser process to the renderer process

The compositor thread receives events in the renderer process

As described above, the compositor thread can build compositor frames from the rasterized layers without involving the main thread. Take scrolling as an example: if no event listeners are bound on the page, the compositor thread creates the compositor frame independently of the main thread. If listeners are bound, the compositor thread has to wait for the main thread to handle the event before it can create the compositor frame. So how does the compositor thread decide whether an event needs to be routed to the main thread?

Since running JS is the main thread's job, when a page is composited the compositor thread marks the regions of the page that have event handlers attached as non-fast scrollable regions. If an event occurs in one of these regions, the compositor thread sends the event to the main thread and waits for it to be handled. If the event occurs outside these regions, the compositor thread composites the new frame without waiting for the main thread.
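The routing decision can be sketched as a point-in-rectangle check. The rectangle format and the function names are hypothetical; the regions stand for the areas covered by elements that have event handlers attached.

```javascript
function contains(rect, point) {
  return point.x >= rect.x && point.x < rect.x + rect.width &&
         point.y >= rect.y && point.y < rect.y + rect.height;
}

// Decides whether an input event has to wait for the main thread.
function routeInputEvent(event, nonFastScrollableRegions) {
  return nonFastScrollableRegions.some((region) => contains(region, event))
    ? 'main-thread'
    : 'compositor-only';
}

const regions = [{ x: 0, y: 0, width: 100, height: 50 }];
```

With this single region, an event at (10, 10) is sent to the main thread, while an event at (10, 200) is handled by the compositor alone.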

A user event occurred in the non-fast scroll area

Because of how non-fast scrollable regions are marked, developers need to be careful with globally bound events. For example, with event delegation the events of a target element are handled on the root element body, as follows:

document.body.addEventListener('touchstart', event => {
  if (event.target === area) {
    event.preventDefault()
  }
})

From the developer's point of view there is nothing wrong with this code. From the browser's point of view, however, a listener is bound to the body element, so the entire page is marked as a non-fast scrollable region. Even if some parts of the page have no listeners at all, every time the user triggers an event the compositor thread has to communicate with the main thread and wait for its response, and the smooth mode in which the compositor builds frames on its own no longer works.

An event handling diagram of a page when the entire page is a non-fast scrolling area

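To avoid this, pass the passive: true option when binding the event. It tells the browser that you still want to listen for the event on the main thread, but that the compositor thread may go ahead and create the compositor frame without waiting for the main thread's event handling.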

document.body.addEventListener('touchstart', event => {
  if (event.target === area) {
    // Note: in a passive listener the browser ignores preventDefault().
    event.preventDefault()
  }
}, { passive: true })

Find the target object of the event

When the compositor thread receives an event and determines that it occurred in a non-fast scrollable region, it sends the event to the main thread. The first thing the main thread does is run a hit test to find the target of the event. The hit test traverses the paint records generated in the paint stage to find the element whose area contains the coordinates of the event.
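A hit test over paint records might be sketched like this, assuming each record carries a hypothetical target name and the rectangle it was painted into.

```javascript
// Paint records are in drawing order, so the last record that
// contains the point is the one drawn on top.
function hitTest(paintRecords, point) {
  for (let i = paintRecords.length - 1; i >= 0; i--) {
    const { target, rect } = paintRecords[i];
    const inside =
      point.x >= rect.x && point.x < rect.x + rect.width &&
      point.y >= rect.y && point.y < rect.y + rect.height;
    if (inside) return target;
  }
  return null; // nothing was painted at this point
}

const records = [
  { target: 'body', rect: { x: 0, y: 0, width: 800, height: 600 } },
  { target: 'button', rect: { x: 10, y: 10, width: 100, height: 30 } },
];
```

A click at (20, 20) hits the button because it was painted after the body, while a click at (500, 500) falls through to the body.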


Browser optimizations for events

Most screens refresh 60 times per second, i.e. 60 FPS, but some events fire more often than that. Continuous events such as wheel, mousewheel, mousemove, pointermove and touchmove fire 60 to 120 times per second. If every one of them were sent to the main thread, then, because the screen refreshes less often, the main thread would run far more hit tests and JS code than necessary, which is a waste of performance.

Events flooded the screen refresh timeline, causing the page to lag

To optimize this, the browser coalesces these continuous events and delays dispatching them until the next rendered frame, i.e. just before the next requestAnimationFrame.
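The coalescing behaviour can be simulated with a small queue. This is a sketch under simplifying assumptions: the EventQueue class is hypothetical, only the latest event of each continuous type is kept, and flushFrame() stands in for the moment where a browser would start the next frame.

```javascript
const CONTINUOUS = new Set(['wheel', 'mousewheel', 'mousemove', 'pointermove', 'touchmove']);

class EventQueue {
  constructor(dispatch) {
    this.dispatch = dispatch; // called on the (simulated) main thread
    this.pending = new Map(); // latest continuous event per type
  }

  // Called for every raw input event.
  push(event) {
    if (CONTINUOUS.has(event.type)) {
      this.pending.set(event.type, event); // overwrite the older event
    } else {
      this.dispatch(event); // other events are not delayed
    }
  }

  // Called once per frame.
  flushFrame() {
    for (const event of this.pending.values()) this.dispatch(event);
    this.pending.clear();
  }
}

const delivered = [];
const queue = new EventQueue((event) => delivered.push(event));
queue.push({ type: 'mousemove', x: 1 });
queue.push({ type: 'mousemove', x: 2 });
queue.push({ type: 'mousemove', x: 3 });
queue.push({ type: 'keydown', key: 'a' });
queue.flushFrame();
```

After the flush, the main thread has seen only two events: the keydown, delivered immediately, and the last of the three mousemove events.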

Same axis of events as before, but this time events are merged and delayed

Non-continuous events, such as keydown, keyup, mousedown, mouseup, touchstart and touchend, are sent directly to the main thread for execution.

Conclusion

The browser uses a multi-process architecture: work is split across processes by function, and within a process across threads by task. When the user browses to a page, the browser process handles the input, starts the navigation, requests and reads the response data, finds a renderer process and submits the navigation. The renderer process then parses the HTML into the DOM, loads subresources, downloads and executes JS code, and goes through style calculation, layout, paint and compositing to build an interactive web page step by step. Afterwards the browser process receives the page's interaction events and hands them to the renderer process, whose main thread runs a hit test, finds the target element and runs the bound event handlers to complete the interaction.

Most of this article is a compilation, interpretation and translation of the Inside look at modern web browser series. I hope it gives you some inspiration.

Related references

  • Why do browsers use multi-process architectures
  • Learn how Chrome works
  • Browser multi-process architecture
  • Illustrate the basic workings of the browser
  • Inside look at modern web browser (part 2)
  • Inside look at modern web browser (part 3)

This work is reproduced (read the original text)