This is day 4 of my participation in the August Writing Challenge. For details, see: August Writing Challenge

In this article we’ll take an in-depth look at the Chrome browser, from its high-level architecture down to the rendering pipeline. If you’re wondering how a browser turns your code into a working website, or you’re not sure why certain techniques are recommended for improving performance, this article is for you.

For front-end developers, it’s important to know not only what works but why. Once you understand how the browser works, you can work with it rather than against it, and it will reward you with better performance.

Core computing terms and Chrome’s multi-process architecture

The core of a computer is the CPU and GPU

The CPU (central processing unit) and the GPU (graphics processing unit) are the two most important computing units in a computer, and together they largely determine its computing performance.

The CPU can be thought of as the brain of your computer. A CPU core can handle many different kinds of tasks one by one as they come in. Unlike CPUs, GPUs excel at simple tasks, but can run them across many cores at the same time. As the name suggests, the GPU was originally developed to handle graphics. This is why “using the GPU” or “GPU support” in the context of graphics is associated with fast rendering and smooth interaction. In recent years, with GPU-accelerated computing, more and more computation has become possible on the GPU alone.

When you launch an application on a computer or mobile phone, the CPU and GPU are what power it. Typically, applications run on the CPU and GPU using mechanisms provided by the operating system.

Computer architecture.

We can divide the computer into three layers from the bottom up: the machine hardware at the bottom, the operating system in the middle, and the applications at the top. Thanks to the operating system, applications can use hardware resources through the capabilities it provides, without accessing the hardware directly.

Executing programs on processes and threads

A process acts as a bounding box, and threads are like abstract fish swimming inside it

Another concept to master before delving into browser architecture is processes and threads. A process can be described as an application’s executing program. A thread lives inside a process and executes any part of its process’s program.

When you start an application, a process is created. The program may create threads to help it do its work, but this is optional. The operating system gives the process a private slab of memory to work with, and when the application is closed, the process goes away and this memory is freed. Coroutines are smaller still: they are lightweight units of execution that run inside a thread. In JavaScript, async/await is built on a coroutine-like mechanism.
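To make the coroutine idea concrete, here is a minimal sketch using JavaScript generator functions, which behave like coroutines: each one can pause at `yield` and be resumed later. The names `task` and `runRoundRobin` are hypothetical, and the sketch only illustrates cooperative scheduling on a single thread; it is not a description of how any engine implements async/await internally.

```javascript
// A cooperative task written as a generator function (a coroutine).
function* task(name, steps, log) {
  for (let i = 1; i <= steps; i++) {
    log.push(`${name}:${i}`);
    yield; // pause here and hand control back to the scheduler
  }
}

// A tiny round-robin scheduler that runs everything on one thread.
function runRoundRobin(coroutines) {
  const queue = [...coroutines];
  while (queue.length > 0) {
    const current = queue.shift();
    const { done } = current.next(); // resume until the next yield
    if (!done) queue.push(current);  // not finished: go to the back of the line
  }
}

const log = [];
runRoundRobin([task('A', 2, log), task('B', 3, log)]);
console.log(log.join(' ')); // A:1 B:1 A:2 B:2 B:3
```

The two tasks interleave without any extra thread being created, which is the sense in which coroutines are a smaller unit than threads.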

Inter-process communication (IPC): A process can ask the operating system to start another process to run a different task. When two processes need to talk to each other, they can do so through inter-process communication (IPC).

Many programs are designed to work this way. The advantage is that if one process sends a message to another and gets no response, the sending process can keep working.

Chrome’s multi-process architecture

Browser Architecture

So how do you build a Web browser using processes and threads? Well, it could be a process with many different threads, or it could be many different processes with several threads communicating via IPC.

The important thing to note here is that these different architectures are implementation details. There is no standard specification for how to build a Web browser. One browser’s approach may be completely different from another.

Here we will use Chrome’s latest architecture as described in the following image.

At the top is the browser process, which coordinates with the other processes that take care of different parts of the application. For the renderer, multiple processes are created and one is assigned to each tab. Until recently, Chrome gave each tab its own process where possible; now it tries to give each site its own process, including iframes (see Site Isolation).

Which process controls what?

The following table describes each Chrome process and what it controls:

| Process | What it controls |
| --- | --- |
| Browser | Controls the “chrome” part of the application, including the address bar, bookmarks, and back and forward buttons. It also handles the invisible, privileged parts of the web browser, such as network requests and file access. |
| Renderer | Controls anything inside a tab where a website is displayed. |
| Plugin | Controls any plugins used by the website, such as Flash. |
| GPU | Handles GPU tasks in isolation from other processes. It is split into a separate process because the GPU handles requests from multiple applications and draws them on the same surface. |

There are many more processes, such as extension processes and utility processes. If you want to see how many processes are running in your Chrome, click the options menu icon (⋮) in the upper right corner, select More tools, and then select Task manager. This opens a window with a list of the processes currently running and how much CPU and memory each is using.

Benefits of Chrome’s multi-process architecture

Chrome uses multiple renderer processes. In the simplest case, you can imagine that each tab has its own renderer process. Suppose you have three tabs open, each run by a separate renderer process. If one tab becomes unresponsive, you can close it and move on while keeping the other tabs alive. If all tabs ran in one process, one unresponsive tab would make all of them unresponsive, and you would have to restart the browser.

Another benefit of splitting the browser’s work into multiple processes is security and sandboxing.

Because processes have their own private memory space, they often contain copies of common infrastructure (such as V8, Chrome’s JavaScript engine). This means more memory usage, because these copies cannot be shared the way they could be if they were threads inside the same process. To save memory, Chrome limits how many processes it can start. The limit depends on how much memory and CPU power your device has, but once Chrome hits the limit, it starts running multiple tabs from the same site in a single process.
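The process limit described above can be pictured with a toy model. This is only a sketch under stated assumptions, not Chrome's actual allocation policy: the function name `assignRenderers`, the rule "site equals hostname", and the fallback of joining the least-loaded process are all hypothetical choices made for illustration.

```javascript
// Toy model: assign each tab to a renderer process.
// Assumptions (not Chrome's real policy): below the limit every tab gets a
// new process; at the limit, a tab reuses a process already hosting the same
// site if one exists, and otherwise joins the least-loaded process.
// processLimit must be at least 1.
function assignRenderers(tabUrls, processLimit) {
  const processes = []; // each entry: { id, sites: Set, tabs: [] }
  for (const url of tabUrls) {
    const site = new URL(url).hostname; // simplification: site == hostname
    let target;
    if (processes.length < processLimit) {
      target = { id: processes.length + 1, sites: new Set(), tabs: [] };
      processes.push(target);
    } else {
      target = processes.find(p => p.sites.has(site)) ||
        processes.reduce((a, b) => (b.tabs.length < a.tabs.length ? b : a));
    }
    target.sites.add(site);
    target.tabs.push(url);
  }
  return processes;
}

const result = assignRenderers(
  ['https://a.example/1', 'https://b.example/', 'https://a.example/2'],
  2
);
console.log(result.map(p => `${p.id}: ${p.tabs.join(', ')}`));
```

With a limit of two, the third tab shares a process with the first because both belong to the same site.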

Navigation

What happens when you type a URL into the address bar

Let’s look at a simple Web browsing use case: You type a URL into a browser, and the browser pulls data from the Internet and displays a page. Here we’ll focus on the part where the user requests the site and the browser prepares to render the page – also known as navigation.

When you type a URL into the address bar, your input is handled by the UI thread of the browser process.

1. Process input

When the user starts typing in the address bar, the first thing the UI thread asks is “Is this a search query or a URL?”. In Chrome, the address bar is also a search input field, so the UI thread needs to parse the input and decide whether to send you to a search engine or to the site you requested.

Chrome’s address bar doubles as a search bar
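The "search query or URL?" decision can be sketched with a simple heuristic. The function `classifyInput` and its rules are assumptions made for illustration; Chrome's real logic is considerably more involved (for example, this sketch treats `localhost` as a search query).

```javascript
// Hypothetical heuristic: decide whether address-bar text is a URL or a search.
function classifyInput(text) {
  const input = text.trim();
  // Empty input or anything containing whitespace is treated as a search.
  if (input === '' || /\s/.test(input)) return { kind: 'search', query: input };

  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  const looksLikeHost = /^[^\s/]+\.[a-z]{2,}(:\d+)?(\/.*)?$/i.test(input);

  if (hasScheme || looksLikeHost) {
    try {
      // Assume https when the user typed a bare host name.
      const url = new URL(hasScheme ? input : `https://${input}`);
      return { kind: 'url', url: url.href };
    } catch {
      // Not parseable as a URL: fall through and treat it as a search.
    }
  }
  return { kind: 'search', query: input };
}

console.log(classifyInput('example.com'));
console.log(classifyInput('how do browsers work'));
```

The `URL` constructor does the actual validation here; the regular expressions only decide whether it is worth trying.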

2. Start navigation

When the user presses Enter, the UI thread initiates a network call to get the site’s content. A loading spinner appears on the corner of the tab, and the network thread goes through the appropriate protocols, such as a DNS lookup and establishing a TLS connection, for the request.

If the server responds with a redirect (such as an HTTP 301), the network thread tells the UI thread that a redirect is needed, and a new request is started for the new URL.

3. Process response data

When the network thread receives the response, it needs to work out what kind of data it is. The Content-Type header should say so, but since it may be missing or wrong, the network thread also looks at the first few bytes of the data (MIME type sniffing) if necessary.

If the response is an HTML file, the data is passed to a renderer process. If it is a ZIP file or some other file type, it is treated as a download request and the data is passed to the download manager.

This is also where the browser’s security checks happen: before rendering starts, the network thread checks whether the data is safe. If the response comes from a known malicious site, the network thread displays a warning page. In addition, the Cross-Origin Read Blocking (CORB) check ensures that sensitive cross-site data does not reach the renderer process.
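The routing decision above can be sketched as follows. This is a simplified illustration, not Chrome's sniffing algorithm: `sniffMimeType` and `routeResponse` are hypothetical names, only two formats are recognised, and everything that is not HTML is sent to the download manager (a real browser also displays images, PDFs, and so on).

```javascript
// Guess a MIME type from the first bytes of a response (very simplified).
function sniffMimeType(bytes) {
  // ZIP archives start with the magic bytes 50 4B 03 04 ("PK\x03\x04").
  if (bytes.length >= 4 && bytes[0] === 0x50 && bytes[1] === 0x4b &&
      bytes[2] === 0x03 && bytes[3] === 0x04) {
    return 'application/zip';
  }
  const head = String.fromCharCode(...bytes.slice(0, 64)).trimStart().toLowerCase();
  if (head.startsWith('<!doctype html') || head.startsWith('<html')) return 'text/html';
  return 'application/octet-stream';
}

// Prefer the declared Content-Type; sniff only when it is missing.
function routeResponse(contentTypeHeader, firstBytes) {
  const declared = (contentTypeHeader || '').split(';')[0].trim().toLowerCase();
  const mime = declared || sniffMimeType(firstBytes);
  return mime === 'text/html' ? 'renderer' : 'download-manager';
}

const toBytes = s => Uint8Array.from(s, ch => ch.charCodeAt(0));
console.log(routeResponse('text/html; charset=utf-8', toBytes('')));
console.log(routeResponse(null, toBytes('<!DOCTYPE html><html>')));
console.log(routeResponse('', Uint8Array.of(0x50, 0x4b, 0x03, 0x04)));
```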

4. Find a renderer process

Once all the checks are done and the network thread is confident that the browser should navigate to the requested site, it tells the UI thread that the data is ready. The UI thread then finds a renderer process for the current site to carry on rendering the page.

In step 2, when the UI thread hands the URL to the network thread, it already knows which site is being navigated to. The UI thread can therefore find or start a renderer process in parallel with the network request. If the response arrives as expected, the renderer process is already standing by, which shortens the time between starting the request and starting to render the page. Of course, if the navigation is redirected to a different site, the renderer prepared in advance may not be used. Compared with ordinary navigations, though, redirects are the minority. In day-to-day work as well, it is better to fit the solution to the specific scenario than to chase a perfect one.

5. Commit navigation

After the previous steps, both the data and the renderer process are ready. The browser process sends an IPC message to the renderer process to commit the navigation, and also passes on the data stream so that the renderer process can keep receiving HTML data. Once the browser process receives confirmation from the renderer process that the commit has happened, navigation is complete and the document loading phase begins.

At this point, the address bar shows a security indicator along with information about the site. The site is also added to the tab’s session history. So that the tab can be restored later, the session history is stored on disk when the tab or window is closed.

Extra step: Loading complete

Once the navigation is committed, the renderer process carries on loading resources and rendering the page. When the renderer process “finishes” rendering, it sends an IPC message to the browser process. This happens after the onload event has fired on all frames in the page and finished executing. On receiving this message, the UI thread stops the loading spinner on the tab to show that the page has loaded.

“Finished” does not mean that all loading is over, though, because client-side JavaScript may still load additional resources and render new views after this point.

Navigating to a different site

That completes a simple navigation. When we enter a different URL, the browser process goes through the same steps again. Before it starts the new navigation, however, it needs to check whether the current site cares about the beforeunload event.

The beforeunload event can prompt the user to confirm before navigating away or closing the tab. If the user declines, the new navigation or the close is cancelled.

Since everything inside a tab, including your JavaScript code, is handled by the renderer process, the browser process has to check with the renderer process whether the current site has a beforeunload handler before a new navigation begins.

If the navigation is initiated from the renderer process, for example when the user clicks a link or client-side JavaScript runs location = 'http://newsite.com', the renderer process first checks for beforeunload handlers. It then goes through the same steps as a navigation initiated by the browser process; the only difference is that the navigation request is sent from the renderer process to the browser process.

When a navigation goes to a different site, a separate renderer process is used to handle it. To support events like unload, the old renderer process is kept around to handle them. For a more detailed description of the lifecycle, see Page Lifecycle.

Service worker

A service worker is a technology that lets web developers control what is cached locally and when to fetch fresh data from the network. If the service worker is written to serve the page from the cache, the request never needs to be sent to the server.

It is worth noting that a service worker’s code runs in a renderer process. When a navigation starts, the network thread checks the requested URL against the registered service worker scopes. If a service worker is registered for that URL, the UI thread finds a renderer process to execute the service worker’s code. The service worker then lets the developer decide whether the response comes from the cache or from the network.

Navigation preload

If the service worker ends up deciding to fetch data from the network, the round trip between the browser process and the renderer process adds delay. Navigation Preload is a mechanism that speeds this up by loading resources in parallel with service worker startup. These requests are marked with a special header, so the server can decide what content to return for them.

Render && parse

The renderer process handles the content of the page

The renderer process is responsible for everything that happens inside a tab. In a renderer process, the main thread handles most of the code you send to the user: parsing, compiling, and running it. If you use a web worker or a service worker, parts of your JavaScript are handled by worker threads. Compositor and raster threads also run inside the renderer process so that pages can be rendered efficiently and smoothly.

The renderer process’s core job is to turn HTML, CSS, and JavaScript into a web page that the user can interact with.

renderer.png

The parsing process

The following sections focus on how the renderer process turns the text received from the network thread into an image.

Creating the DOM

When the renderer process receives the commit message for a navigation and starts to receive HTML data, the main thread begins to parse the HTML text and turn it into a Document Object Model (DOM).

The DOM is the browser’s internal representation of the page’s structure. It is also the data structure and API through which web developers can read and manipulate page elements from JavaScript.

Parsing an HTML document into a DOM is defined by the HTML Standard. You may have noticed that the browser never throws an error while parsing HTML, even when closing tags are missing or start and end tags do not match. This is because the HTML Standard is designed to handle such errors gracefully. If you are interested, read “An introduction to error handling and strange cases in the parser” in the HTML Standard.

Loading subresources

A website usually uses external resources such as images, CSS, and JavaScript. These files also need to be loaded from the network or the cache. The main thread could request them one by one as it finds them while parsing, but to speed things up, the preload scanner runs concurrently. When the preload scanner sees a tag like <img> or <link> in the tokens produced by the HTML parser, it sends a request to the network thread in the browser process. The main thread then decides whether to wait for a resource, depending on whether that resource blocks parsing.
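As a rough illustration of what a preload scan does, the sketch below pulls resource URLs out of an HTML string. It is a toy: a real preload scanner works on the tokens generated by the HTML parser, not on regular expressions, and the function name `preloadScan` is hypothetical.

```javascript
// Toy preload scan: collect URLs of <img>, <script> and <link> resources.
function preloadScan(html) {
  const found = [];
  const tagPattern = /<(img|script|link)\b([^>]*)>/gi;
  for (const [, tag, attrs] of html.matchAll(tagPattern)) {
    const name = tag.toLowerCase();
    // <link> points at its resource with href; <img> and <script> use src.
    const attrName = name === 'link' ? 'href' : 'src';
    const match = new RegExp(`\\b${attrName}\\s*=\\s*["']([^"']+)["']`, 'i').exec(attrs);
    if (match) found.push({ tag: name, url: match[1] });
  }
  return found;
}

const page =
  '<link rel="stylesheet" href="a.css"><img src="b.png">' +
  '<script src="c.js" defer></script><p>hi</p>';
console.log(preloadScan(page));
```

Each URL found this way could be handed to the network layer right away, while the main thread is still busy building the DOM.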

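JavaScript can block parsing

When the HTML parser finds a <script> tag, it pauses parsing of the HTML document and has to load, parse, and execute the JavaScript code first. This is because JavaScript can change the structure of the document (for example with document.write()), so the parser has to wait for the script to run before it can continue.

Tell the browser how to load the resource

If our JavaScript code doesn’t need to change the document while it is being parsed (for example, it does not call document.write()), we can add the async or defer attribute to the <script> tag. The browser then loads the script without blocking the parser.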

Style Calculation

To present a page, the DOM alone is not enough, because we also need styles to decide what the page looks like. The main thread parses the CSS and determines the computed style of each DOM node, that is, which styles apply to each element based on CSS selectors. You can see this information in the Computed section of the browser’s developer tools.

computedstyle.png

Even if we don’t provide any CSS, every DOM node still has a computed style: an <h1> tag is displayed larger than an <h2> tag. This is because the browser has a default style sheet with built-in styles for different tags. You can find these default styles in the Chromium source code.

Layout

After style calculation, the renderer process knows the structure of the document and the style of each node, but that is still not enough to render a page.

Layout is the process of working out the geometry of elements. The main thread walks through the DOM and the computed styles and creates a layout tree that holds information such as x and y coordinates and bounding box sizes. The layout tree has a structure similar to the DOM tree, but it contains only what is visible on the page. Elements with display: none are not part of the layout tree, whereas elements with visibility: hidden are. Similarly, when we use a pseudo-element with content (e.g. p::before { content: 'Hi!' }), it is included in the layout tree even though it is not in the DOM, which is why we can’t get pseudo-elements through the DOM API.

Working out the layout of a page is a challenging task. Even for a simple page that flows block by block from top to bottom, you must consider how big the font is and where to break lines, because these affect the size and shape of a paragraph, which in turn affects where the next element goes.

CSS can make elements float, clip content that overflows its parent, and change the writing direction. As you can imagine, the layout stage is a lot of work. In Chrome, an entire team of engineers works on layout, and you can watch the video for more details.
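The difference between the DOM tree and the layout tree can be shown with a small model. This is a sketch over plain objects, not real DOM nodes: the node shape (`tag`, `style`, `before`, `children`) and the function `buildLayoutTree` are hypothetical, and no geometry is computed.

```javascript
// Build a simplified layout tree from a DOM-like object tree.
function buildLayoutTree(node) {
  const style = node.style || {};
  // display: none removes the element and its whole subtree from layout.
  if (style.display === 'none') return null;

  const children = [];
  // A ::before pseudo-element with content gets a box although it is not in the DOM.
  if (node.before) {
    children.push({ name: `${node.tag}::before`, hidden: false, children: [] });
  }
  for (const child of node.children || []) {
    const box = buildLayoutTree(child);
    if (box) children.push(box);
  }
  // visibility: hidden elements still take up space, so they stay in the tree.
  return { name: node.tag, hidden: style.visibility === 'hidden', children };
}

const dom = {
  tag: 'body',
  children: [
    { tag: 'p', before: { content: 'Hi!' }, children: [{ tag: 'em' }] },
    { tag: 'div', style: { display: 'none' }, children: [{ tag: 'span' }] },
    { tag: 'span', style: { visibility: 'hidden' } },
  ],
};
const tree = buildLayoutTree(dom);
console.log(JSON.stringify(tree, null, 2));
```

In the output, the `div` and its child are gone, the hidden `span` is present, and `p` has an extra `p::before` child that never existed in the input tree.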

Paint

The DOM, styles, and layout are still not enough to render a page. Imagine trying to reproduce a painting. We know the size, shape, and position of the elements, but we also need to know the order in which to paint them.

At this stage, the main thread walks the layout tree to create paint records. A paint record is a note of the painting steps, such as “background first, then text, then rectangle”.

The rendering process is expensive

In the rendering pipeline, each step uses the result of the previous one to create new data, so a change at any step triggers changes in everything that follows. For example, when the layout tree changes, the paint order has to be regenerated for the affected parts of the page.

When elements are animated, the browser has to run these operations for every frame. If it cannot keep up and frames are skipped, the page appears janky to the user.

Normally, rendering operations keep up with the screen refresh, but because they run on the main thread, they can be blocked while your application is running JavaScript.

To avoid getting in the way of rendering, we can split JavaScript work into small chunks and schedule each chunk with requestAnimationFrame(). See Optimize JavaScript Execution for details. When a large amount of computation is required, Web Workers can also be used to avoid blocking the main thread.
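The chunking idea can be sketched like this. The helper `processInChunks` is hypothetical; it takes the scheduler as a parameter so that the same code can be driven by `requestAnimationFrame` in a browser or, as in this demo, by a fake frame queue that can be run anywhere.

```javascript
// Process `items` a few at a time, yielding back to the scheduler between chunks.
function processInChunks(items, handle, chunkSize, schedule) {
  let index = 0;
  function runChunk() {
    const end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) handle(items[index]);
    // More work left: ask for another slot instead of blocking now.
    if (index < items.length) schedule(runChunk);
  }
  schedule(runChunk);
}

// Demo with a fake frame queue standing in for the browser's frame loop.
const frames = [];
const fakeSchedule = callback => frames.push(callback);
const seen = [];
processInChunks([1, 2, 3, 4, 5], x => seen.push(x), 2, fakeSchedule);

let frameCount = 0;
while (frames.length > 0) {
  frames.shift()(); // "render a frame", then run the next queued chunk
  frameCount++;
}
console.log(seen, frameCount); // [ 1, 2, 3, 4, 5 ] 3
```

In a page, the last argument would be something like `callback => requestAnimationFrame(callback)`, so that the browser gets a chance to render between chunks.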

Compositing

The browser now knows the structure of the document, the style of each element, the geometry of the page, and the paint order. Turning this information into pixels on the screen is called rasterization, a term from computer graphics.

A naive approach is to rasterize the content inside the viewport, and to rasterize more as the user scrolls the page. Modern browsers, however, use a more sophisticated process called compositing.

Compositing is a technique that splits a page into layers, rasterizes them separately, and composites them into a page in a separate thread called the compositor thread. When the user scrolls, the layers are already rasterized, so all the browser has to do is composite a new frame from them. Animation can be achieved in the same way, by moving layers and compositing a new frame.

Layers

To determine which elements belong in which layer, the main thread walks the layout tree to create a layer tree. As developers, when a part of the page should be rendered in its own layer, we can hint this to the browser with the CSS property will-change. The criteria the browser uses to create layers are documented.

Although layers can improve performance, that does not mean every element should get its own layer. Compositing an excessive number of layers can itself hurt performance, so decide on layers case by case.

Raster threads and the compositor thread

Once the layer tree is created and the paint order is determined, the main thread commits this information to the compositor thread. The compositor thread then rasterizes each layer. A layer can be as large as the entire page, so the compositor thread divides it into tiles and sends each tile to the raster threads. The raster threads rasterize the tiles and store them in GPU memory.

The compositor thread can prioritize tiles among the raster threads so that whatever is inside the viewport (or near it) is rasterized first. A layer can also have multiple tilings at different resolutions to support things like zooming.
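Splitting a layer into tiles (the blocks described above) and rasterizing the visible ones first can be modelled in a few lines. This is a one-dimensional toy with hypothetical function names and an assumed tile height of 256 pixels; the real compositor works in two dimensions and uses more elaborate priorities.

```javascript
// Split a layer of the given height into horizontal tiles.
function tileLayer(layerHeight, tileHeight) {
  const tiles = [];
  for (let top = 0; top < layerHeight; top += tileHeight) {
    tiles.push({ top, bottom: Math.min(top + tileHeight, layerHeight) });
  }
  return tiles;
}

// Order tiles so that those overlapping the viewport come first,
// followed by the rest sorted by their distance from the viewport.
function prioritizeTiles(tiles, viewportTop, viewportHeight) {
  const viewportBottom = viewportTop + viewportHeight;
  const distance = t =>
    t.bottom <= viewportTop ? viewportTop - t.bottom :
    t.top >= viewportBottom ? t.top - viewportBottom : 0;
  return [...tiles].sort((a, b) => distance(a) - distance(b) || a.top - b.top);
}

const tiles = tileLayer(1000, 256);               // tile tops: 0, 256, 512, 768
const ordered = prioritizeTiles(tiles, 300, 200); // viewport covers 300..500
console.log(ordered.map(t => t.top));             // [ 256, 512, 0, 768 ]
```

The tile that intersects the viewport is first in line, and the tiles just outside it follow, which is useful when the user is about to scroll.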

Once the tiles are rasterized, the compositor thread gathers tile information called draw quads and uses it to create a compositor frame.

Draw quads

Stored in memory, these contain information such as each tile’s location and describe how to compose the page from the tiles.

Compositor frame

A collection of draw quads that represents one frame of a page.

The compositor frame is then submitted to the browser process. If the browser UI changes, or if a plugin’s UI changes, another compositor frame is created. So every time there is an interaction, such as a scroll, the compositor thread creates another compositor frame, and the new content is then drawn on the GPU.

The advantage of compositing is that it is done without involving the main thread. The compositor thread does not have to wait for style calculation or JavaScript execution. This is why compositing-only animations are considered best for smooth interaction; if layout or paint has to be recalculated, the main thread must get involved.

In essence, rendering in the browser is the process of turning text into images and displaying new images as the user interacts with the page. In this process, the main thread of the renderer process does the computation, while the compositor thread and the raster threads produce the image. The more detailed concepts of forced layout, reflow, and repaint will be explained in a later article.

Look at events from the browser’s perspective

When we hear “input event”, we usually think of typing in a text box or clicking the mouse, but from the browser’s point of view, input means any gesture from the user. Scrolling the mouse wheel and touching the screen are input events too.

When the user interacts with the page, the browser process receives the event first. However, the browser process only cares about which tab the event occurred in, so it sends the event type and its coordinates to the renderer process responsible for that tab. The renderer process then handles the event by finding the event target and running the event listeners attached to it.

How the compositor thread handles events

In the previous section, we learned that the compositor thread can composite rasterized layers to keep interactions smooth. If no event listeners are attached to the page, the compositor thread can create new compositor frames completely independently of the main thread. But what if the page does listen for events?

Marking “slow scroll” regions

Since running JavaScript is the main thread’s job, when a page is composited, the compositor thread marks the regions of the page that have event handlers attached (Chrome calls these non-fast scrollable regions). With this information, when an input event occurs inside such a region, the compositor thread sends it to the main thread for handling. If the event occurs outside these regions, the compositor thread goes ahead and composites new frames without waiting for the main thread.
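The decision the compositor makes for each input event amounts to a hit test against the marked regions. The sketch below assumes rectangular regions and uses a hypothetical function name, `routeInput`; it is only meant to illustrate the rule described above.

```javascript
// Decide where an input event at `point` should be handled.
// `listenerRegions` are rectangles covering elements with event handlers.
function routeInput(point, listenerRegions) {
  const hit = listenerRegions.some(r =>
    point.x >= r.x && point.x < r.x + r.width &&
    point.y >= r.y && point.y < r.y + r.height);
  // Inside a marked region the main thread must run the handler first;
  // elsewhere the compositor can produce the next frame on its own.
  return hit ? 'main-thread' : 'compositor-only';
}

const regions = [{ x: 0, y: 0, width: 100, height: 50 }];
console.log(routeInput({ x: 10, y: 10 }, regions));  // main-thread
console.log(routeInput({ x: 10, y: 200 }, regions)); // compositor-only
```

If a single region covers the whole page, every event takes the `main-thread` path, which is the situation the next section warns about.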

Be careful when writing event handlers

A common pattern in web development is event delegation. Because events bubble, we can attach a single event handler to an ancestor element and dispatch on the event target. See the code below.

    // `area` is assumed to be a reference to the element we care about.
    document.body.addEventListener('touchstart', event => {
      if (event.target === area) {
        event.preventDefault();
      }
    });

With a single handler for all elements, this pattern is attractive. From the browser’s point of view, however, the entire page is now marked as a “slow scroll” region. This means that even if parts of the page don’t care about the input, the compositor thread has to wait for the main thread to handle the event after every interaction, and the benefit of compositing off the main thread is lost.

To solve this problem, we can pass the passive: true option (not supported by IE) to the event listener. This tells the browser that the event still needs to be sent to the main thread for handling, but the compositor thread can go ahead and composite new frames without waiting.

    document.body.addEventListener('touchstart', event => {
      if (event.target === area) {
        // Note: inside a passive listener, preventDefault() is ignored by the browser.
        event.preventDefault();
      }
    }, { passive: true });

For details, see the MDN documentation on using passive listeners to improve scroll performance.

Finding the event target

When the compositor thread sends an input event to the main thread, the first thing to do is a hit test to find the event target. The hit test uses the paint record data generated during rendering to find out what lies beneath the coordinates where the event occurred.

Reduce the number of events sent to the main thread

To keep animation smooth, most displays refresh 60 times per second. A typical touchscreen device delivers touch events 60 to 120 times per second, and a typical mouse delivers events 100 times per second. Input events therefore often arrive more frequently than the screen refreshes.

If a continuous event like touchmove were sent to the main thread 120 times per second, it could trigger far more hit tests and JavaScript execution than the screen can actually show, and hurt performance.

To reduce the number of events sent to the main thread, Chrome coalesces continuous events. Events such as wheel, mousewheel, mousemove, pointermove, and touchmove are delayed and dispatched right before the next requestAnimationFrame.

Discrete events such as keydown, keyup, mouseup, mousedown, touchstart, and touchend are sent to the main thread immediately.
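The split between continuous and discrete events can be modelled with a small queue. This is a sketch with hypothetical names (`InputQueue`, `flushFrame`, `coalescedCount`), not Chrome's implementation, and it does not model the relative ordering of discrete and pending continuous events.

```javascript
const CONTINUOUS_TYPES = new Set(
  ['wheel', 'mousewheel', 'mousemove', 'pointermove', 'touchmove']);

class InputQueue {
  constructor(dispatch) {
    this.dispatch = dispatch; // callback standing in for the main thread
    this.pending = new Map(); // event type -> events received this frame
  }

  push(event) {
    if (!CONTINUOUS_TYPES.has(event.type)) {
      this.dispatch(event);   // discrete events are delivered right away
      return;
    }
    const events = this.pending.get(event.type) || [];
    events.push(event);
    this.pending.set(event.type, events);
  }

  // Intended to be called once per frame, before animation callbacks run.
  flushFrame() {
    for (const events of this.pending.values()) {
      const latest = events[events.length - 1];
      // Deliver only the most recent event, noting how many were merged.
      this.dispatch({ ...latest, coalescedCount: events.length });
    }
    this.pending.clear();
  }
}

const delivered = [];
const queue = new InputQueue(e => delivered.push(e));
queue.push({ type: 'touchmove', y: 10 });
queue.push({ type: 'touchmove', y: 20 });
queue.push({ type: 'keydown', key: 'a' });
queue.flushFrame();
console.log(delivered);
```

The main thread sees two events instead of three: the key press immediately, and one touchmove carrying the latest position. In the browser, `PointerEvent.getCoalescedEvents()` gives access to the individual events that were merged when an application needs them, for example in a drawing app.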

Conclusion

At this point, we have seen how the browser works, from the user typing in the address bar to the page appearing on screen. Let’s summarize.

  • The browser process, as the most important process, is responsible for most of the work outside the tab, including the address bar, network requests, tab state management, and so on.
  • Different renderer processes render different sites, and renderer processes are independent of each other.
  • The renderer process obtains site resources from the browser process while rendering the page. Only resources that pass the security checks reach the renderer process.
  • In the renderer process, the main thread is responsible for most of the work except image generation. Reducing the code that runs on the main thread is the key to optimizing interaction performance.
  • The compositor thread and the raster threads in the renderer process are responsible for image generation, which can be made more efficient with layers.
  • When the user interacts with the page, the event travels from the browser process to the compositor thread of the renderer process, which decides whether to pass it to the main thread depending on whether it falls in a region with event listeners.

Finally, thanks for reading.

If there are any mistakes in this article, please correct them in the comments section. If this article has helped you, please like 👍 and follow 😊.