What happens from entering the URL to rendering the page? How the browser works (part 1)

Preface

This article is the sixth in a series on advanced front-end development. It looks at how browsers work: first a brief look at browser history, then a walk through what happens between entering a URL and rendering the page, seen from the perspective of Chrome's composition and multi-process architecture.

The rise and fall of browsers

The tide of the Internet never stops, and the browser, the surfboard for riding it, keeps getting better. Since the Internet age began, there have been three browser wars.

1. The first browser war was between Internet Explorer and Netscape. Internet Explorer eventually toppled Netscape thanks to the sheer reach of Windows.
2. In the second browser war, Netscape rose from the ashes as Firefox, which took a bite out of Internet Explorer's share.
3. While Firefox and Internet Explorer were locked in a long fight, Chrome appeared out of thin air. Backed by Google, it gained momentum quickly and left its two predecessors gasping for breath. Within a few years it became a mainstream browser, overtaking Firefox's market share and then Internet Explorer's, which set the stage for the third browser war.

With the rise of the mobile Internet, the browser landscape has become even more chaotic, so I will not go into it here.

Currently, there are five major browsers: IE, Firefox, Safari, Chrome, and Opera.

Gossip about Chrome’s pedigree

Out of curiosity, many people have dug into Chrome's history and discovered that the name WebKit is hidden behind it. Anyone who has studied browsers has probably heard of WebKit at some point. WebKit originated from KDE's open source project KHTML and flourished in Apple's Safari project. It has been home to many innovations, including the HTML5 and CSS3 trends of recent years. WebKit is small, flexible, and powerful, and the industry loves it because it is open source. It can be found in everything from the Nokia S60 browser to Safari on the more expensive iPhone.

Google has never had a dull sense of smell, and WebKit's qualities caught its eye. In September 2008, Google released a beta version of Chrome. Chrome uses WebKit code, inherits WebKit's excellent layout engine, and renders pages with astonishing speed. Since Chrome uses WebKit's source code and its layout engine, can we assume that Google simply put a shell on top of WebKit to create Chrome?

WebKit vs. Chromium vs. Chrome

Project	Origin	Characteristics	Nature
WebKit	Originated from KDE's open source project KHTML and thrived in Apple's Safari project	Consists of two parts. One is WebCore, the layout engine (developed by Apple), which parses HTML and CSS and lays out the page. The other is JavaScriptCore (JSCore), the engine that executes a page's JavaScript	Open source project, browser kernel, rendering engine
Chromium	Started as an open source project based on WebKit. Later, because WebKit2 conflicted with Chromium's own sandbox design, Google started Blink, a fork of WebKit, as Chromium's new rendering engine	1. Google reorganized the WebKit code. Chromium's code is much more readable and compiles more efficiently than WebKit's, which by comparison is cryptic and much harder to develop against. 2. Uses the V8 engine, which is more efficient than WebKit's JSCore	Experimental project, open source software
Chrome	Early Chrome inherited only the WebCore part of WebKit. It was later rebuilt on Chromium, the open source project that Google itself started	Uses Google's V8 as its JavaScript engine, which greatly improves script execution speed and is an important reason why Chrome is so fast	Commercial project

Browser Composition

Component	Description
User interface	Includes the address bar, back/forward buttons, bookmarks menu, and so on
Browser engine	The interface for querying and manipulating the rendering engine
Rendering engine	Displays the requested content. For example, if the requested content is HTML, it parses the HTML and CSS and displays the parsed result
Networking	Performs network calls such as HTTP requests. It exposes a platform-independent interface and works across platforms
JavaScript interpreter	Parses and executes JavaScript code
Data storage	The persistence layer. The browser needs to save data such as cookies to disk. HTML5 also defines Web Database, a lightweight and complete client-side storage technology

Rendering engine

The browser’s rendering engine, also known as the layout engine, is the browser’s core.

Browser kernel	Browsers that use it
Trident	Internet Explorer (Windows)
Presto	Formerly Opera, which now uses Blink
Gecko	Firefox
WebKit	Safari
Blink	Chromium and Chrome. Blink is a fork of WebKit's WebCore layout engine and is paired with V8

Multi-process Architecture of Browsers (take Chrome as an example)

The concepts of processes and threads may seem a bit vague to some front-end developers, but in order to better understand the multi-process architecture of browsers, let’s briefly discuss processes and threads.

Process and thread concepts

A thread is the basic unit of independent scheduling and dispatching, the smallest unit of execution that an operating system can schedule. It lives inside a process and is the unit that actually does the work there. A thread is a single sequential flow of control within a process, and a process can run several threads concurrently, each performing a different task.

When a program starts, the operating system allocates a block of memory for it that holds the code, the running data, and a main thread that executes tasks. This running environment is called a process. In other words, a process is a running instance of a program. Threads cannot exist on their own: they are started and managed by a process. Using multiple threads inside a process to handle tasks in parallel improves efficiency.

Process versus thread relationships

Aspect	Description
Crashes	A failure in any thread of a process crashes the entire process
Shared data	The threads in a process can all read and write the process's shared data
Memory reclamation	When a process shuts down, the operating system reclaims its memory
Process isolation	Processes are isolated from each other, and a process cannot access the data of other processes

The early single-process architecture era

All of the browser's functional modules ran in the same process (networking, plug-ins, the JavaScript runtime, the rendering engine, pages, and so on). That process had the following two threads:

Page main thread: page rendering, page presentation, JavaScript environment, plug-in modules
Network thread: responsible for resource requests

Running so many functional modules in a single process is the main reason single-process browsers were unstable, sluggish, and insecure.

Unstable. Plug-ins are the module most likely to cause problems. A plug-in crash takes down the entire browser, and so does a crash of the rendering engine, which complicated JavaScript code can trigger.
Not smooth. The rendering module, the JavaScript execution environment, and plug-ins all run on the same thread, so only one of them can execute at a time. In addition, memory may not be fully reclaimed after a complex page is closed. Such page memory leaks are another important reason single-process browsers slow down.
Insecure. Plug-ins can be written in C/C++ and can reach any operating system resource, and page scripts can gain system privileges by exploiting browser vulnerabilities.

Advantages of a multi-process architecture: pages render independently of one another, and the browser gains security (sandboxing), robustness, and site isolation (each cross-site iframe gets its own renderer process).

Chrome multi-process architecture (which processes are there? What are these processes responsible for?)

Chrome uses a multi-process architecture, with a browser process at the top that coordinates the browser's other processes. The latest Chrome includes one main browser process, one GPU process, one network process, multiple renderer processes, and multiple plug-in processes.

Process	Function
Browser process	Controls the browser's user interface outside the tab content, including the address bar and the back/forward buttons, and coordinates the browser's other processes
Network process	Mainly responsible for loading a page's network resources. It ran as a module inside the browser process before it became a separate process
Renderer process	Its core task is to turn HTML, CSS, and JavaScript into a web page that users can interact with. Both the layout engine Blink and the JavaScript engine V8 run in this process. By default, Chrome creates a renderer process for each tab. For security reasons, renderer processes run in a sandbox
GPU process	Responsible for drawing the entire browser interface. Chrome had no GPU process when it was first released. The GPU was originally used to achieve 3D CSS effects, but later both web pages and Chrome's own UI were drawn on the GPU, which made the GPU a common requirement of the browser. Chrome then added a GPU process to its multi-process architecture
Plug-in process	Mainly responsible for running plug-ins. Plug-ins are prone to crashing, so they are isolated in a plug-in process to ensure that a plug-in crash does not affect the browser or the page
Utility process	Sometimes the main browser process needs to do "dangerous" things such as decoding images and decompressing files. If such an operation failed inside the main process, the whole main process would crash, which is unacceptable. Chromium therefore provides a utility process mechanism: when the main process needs to carry out a risky task, it starts a utility process to do the work in its place and communicates with it through IPC messages

When Chrome runs in an environment with plentiful resources, it may also run the following as separate processes:

UI process
Storage process
Device process
Audio process
Video process
Profile process

How do these processes work together to render a page, from entering the URL to rendering the page?

What happens during navigation?

Perhaps the most common use of Chrome is to search for keywords in the address bar or navigate to a website, so let's look at how the browser sees this process. The browser process handles most of the work outside the tab, and it is further divided into different threads:

UI Thread: Controls buttons and input fields on the browser
Network Thread: Processes network requests and obtains data from the network
Storage thread: Controls file access

When we enter text in the browser address bar and hit Enter to get the page content, the process can be divided into the following steps in the view of the browser:

1. Process input

The UI thread of the browser process first needs to determine whether the user entered a URL or a query.
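
As a rough illustration of this decision, here is a minimal TypeScript sketch. The function name `classifyInput` and the two regular expressions are assumptions made for this example; Chrome's real address bar logic is far more elaborate (it also handles cases such as `localhost`, IP addresses, and custom search engines).

```typescript
// Hypothetical heuristic, for illustration only.
type OmniboxInput =
  | { kind: "url"; url: string }
  | { kind: "search"; query: string };

function classifyInput(raw: string): OmniboxInput {
  const text = raw.trim();
  // An explicit scheme such as "https://" means the user typed a URL.
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(text)) {
    return { kind: "url", url: text };
  }
  // No whitespace plus a dotted host ("example.com/path") also looks like a URL.
  if (!/\s/.test(text) && /^[^\s/]+\.[a-z]{2,}(\/.*)?$/i.test(text)) {
    return { kind: "url", url: "https://" + text };
  }
  // Everything else is sent to the search engine as a query.
  return { kind: "search", query: text };
}
```

With this sketch, "example.com/docs" is treated as a URL, while "how browsers work" is treated as a search query.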

2. Start the navigation

When the user presses Enter, the UI thread of the browser process tells the network thread to fetch the site's content and shows the loading spinner on the tab.
The network thread performs a DNS lookup and then establishes a TLS connection for the request.
If the network thread receives a redirect response such as a 301, it tells the UI thread that the server is requesting a redirect, and a new URL request is started.
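
The redirect handling described above can be sketched as a simple loop. This is a simplified model, not Chrome's implementation: `FakeServer` is a hypothetical in-memory table that stands in for the network, and the limit of 5 redirects is an arbitrary assumption.

```typescript
// Hypothetical in-memory "server": maps a URL to the response it would return.
interface FakeResponse {
  status: number;
  location?: string; // target of a redirect, if any
}
type FakeServer = Record<string, FakeResponse>;

function followRedirects(
  server: FakeServer,
  startUrl: string,
  maxRedirects: number = 5
): { finalUrl: string; hops: string[] } {
  const hops: string[] = [];
  let url = startUrl;
  for (let i = 0; i <= maxRedirects; i++) {
    hops.push(url);
    const res = server[url];
    if (res === undefined) {
      throw new Error("no response for " + url);
    }
    // A 3xx status with a Location header triggers another URL request.
    if (res.status >= 300 && res.status < 400 && res.location !== undefined) {
      url = res.location;
      continue;
    }
    return { finalUrl: url, hops };
  }
  throw new Error("too many redirects");
}
```

For a table in which "http://a.test/" answers 301 with a location of "https://a.test/", the loop records both URLs as hops and finishes on the second one.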

3. Read the response

When the response comes back, the network thread determines the format of the response body from the Content-Type header, using MIME type sniffing where needed.
If the response is HTML, the next step is to pass the data to the renderer process. If it is a ZIP file or some other file, the data is passed to the download manager.
The Safe Browsing check also happens at this point: if the domain or the response data matches a known malicious site, the network thread shows a warning page. A Cross-Origin Read Blocking (CORB) check is performed as well, to ensure that sensitive cross-site data does not reach the renderer process.
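
A minimal sketch of this routing decision follows. It is deliberately simplified: `KNOWN_MALICIOUS_HOSTS` is a hypothetical list standing in for the Safe Browsing database, only `text/html` is sent to the renderer (real browsers render many more types), and MIME sniffing and CORB are not modeled.

```typescript
type Destination = "renderer" | "download-manager" | "warning-page";

// Hypothetical blocklist standing in for the Safe Browsing data.
const KNOWN_MALICIOUS_HOSTS: string[] = ["malware.example"];

function routeResponse(host: string, contentType: string): Destination {
  // Safe Browsing check: known malicious sites get a warning page.
  if (KNOWN_MALICIOUS_HOSTS.indexOf(host) !== -1) {
    return "warning-page";
  }
  // Drop parameters such as "; charset=utf-8" and normalize the case.
  const [mime = ""] = contentType.split(";");
  return mime.trim().toLowerCase() === "text/html"
    ? "renderer"
    : "download-manager";
}
```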

4. Find a renderer process

When all the above checks are complete and the network thread is sure that the browser can navigate to the requested page, it tells the UI thread that the data is ready. The UI thread then finds a renderer process to render the page.

Because a network request takes time to return, there is an optimization. When the UI thread sends the URL request to the network thread, the browser already knows which site it is navigating to, so the UI thread looks for and starts a renderer process in parallel. If all goes well, the renderer process is already waiting when the network thread receives the data. If the navigation is redirected, however, the prepared renderer process may not be usable and a different one may have to be started.

5. Commit the navigation

The browser process sends an IPC message to the renderer process to commit the navigation. Once the browser process receives confirmation from the renderer process that the commit has happened, the navigation is complete and the page loading phase begins.

At this point, the address bar is updated to show the information for the new page. The tab's session history is updated so that the back and forward buttons can step through the page that was just navigated to, and this information is stored on disk so that it can be restored after the tab or window is closed.

6. Extra steps

How does the renderer process work (how does the renderer process network resources)?

The renderer process is responsible for almost everything inside the tab. Its core purpose is to convert HTML, CSS, and JavaScript into a web page that users can interact with. The renderer process mainly contains the following threads:

Main thread
Worker threads
Compositor thread
Raster threads

1. Build the DOM

The main thread parses the HTML and builds the DOM tree. When the renderer process receives the commit message for the navigation and starts to receive HTML data, the main thread parses the text string and builds the DOM tree. The details of how the DOM tree is generated are not repeated here, since a previous article covered them.

2. Load subresources

Web pages usually contain additional resources such as images, CSS, and JavaScript, which need to be fetched from the network or the cache. The main thread could request them one by one as it builds the DOM, but to speed things up the preload scanner runs at the same time. If the HTML contains tags that reference such resources, the preload scanner passes the requests to the network thread in the browser process so that the resources can be downloaded.

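3. JavaScript can block parsing

When the HTML parser encounters a <script> tag, it pauses parsing of the HTML document until the script has been loaded, parsed, and executed.

However, there are several ways to tell the browser how to handle a resource. For example, if you add an attribute such as async or defer to the <script> tag, the browser loads the script without blocking the parser.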

4. Style calculation

Having the DOM is not enough to know what the page looks like. The main thread also parses the CSS and, based on the CSS selectors, determines the final computed style of each node. Even if the page loads no CSS at all, the browser applies a default style to each element.

5. Layout

When render objects are created and added to the tree, they have no position or size. The process of calculating these values is called layout or reflow. Layout is a recursive process that starts at the root render object, which corresponds to the HTML document element, and continues recursively through some or all of the frame hierarchy, computing geometry for every render object that needs it. The root render object is positioned at 0,0, and its size is the viewport, the visible part of the browser window. Every render object has a layout (reflow) method, and each one calls the layout method of those children that need layout. In short, layout is the process of finding the geometry of all the elements. The specific process is as follows:

By traversing the DOM and the computed styles of its elements, the main thread builds a layout tree that contains the coordinates and box size of each element. The layout tree is similar to the DOM tree, but it contains only what is visible on the page. An element set to display: none does not appear in the layout tree. Pseudo-elements such as :before or :after appear in the layout tree even though they are not in the DOM tree.
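
The relationship between the DOM tree and the layout tree can be sketched as follows. The `DomNode` and `LayoutNode` shapes are simplified assumptions for this example; real layout objects also carry coordinates, box sizes, and pseudo-element boxes, which are omitted here.

```typescript
// Simplified, hypothetical node shapes for illustration.
interface DomNode {
  tag: string;
  display?: string; // computed "display" value, if set
  children: DomNode[];
}
interface LayoutNode {
  tag: string;
  children: LayoutNode[];
}

function buildLayoutTree(node: DomNode): LayoutNode | null {
  // display: none removes the element and its whole subtree from the layout tree.
  if (node.display === "none") {
    return null;
  }
  const children: LayoutNode[] = [];
  for (const child of node.children) {
    const box = buildLayoutTree(child);
    if (box !== null) {
      children.push(box);
    }
  }
  return { tag: node.tag, children };
}
```

Given a body that contains a div, a span with display: none, and a p, the resulting layout tree contains only the div and the p.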

6. Paint each element

Even when we know the position and style (size, shape) of every element, we still need to know the order in which the elements are drawn in order to paint the page correctly. During the paint phase, the main thread traverses the layout tree and creates paint records. A paint record can be thought of as a note that records the sequence of drawing steps. If you have ever used JavaScript to draw on a canvas, the process will look familiar: move to (x1, y1), draw rectangle A, fill the background, insert text, move to (x2, y2), draw circle B, insert text, and so on.
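
The idea of a paint record can be sketched as a traversal that emits drawing steps in order. The `Box` shape and the textual record format below are assumptions made for this example; they are not Chrome's internal data structures, and stacking rules that can reorder painting are ignored.

```typescript
// Hypothetical layout box with just enough data to describe drawing steps.
interface Box {
  name: string;
  x: number;
  y: number;
  width: number;
  height: number;
  text?: string;
  children: Box[];
}

function paint(box: Box, records: string[] = []): string[] {
  // Background first, then the box's own text, then its children on top.
  records.push(
    `fillRect ${box.name} at (${box.x},${box.y}) size ${box.width}x${box.height}`
  );
  if (box.text !== undefined) {
    records.push(`drawText "${box.text}" in ${box.name}`);
  }
  for (const child of box.children) {
    paint(child, records);
  }
  return records;
}
```

The returned array is the "note" described above: an ordered list of steps that a later stage can replay.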

7. Composite

Layering

Once the layer tree has been created and the paint order determined, the main thread commits that information to the compositor thread. The compositor thread then rasterizes each layer. Because a layer can be as large as the whole page, the compositor thread splits it into tiles and sends the tiles to the raster threads. The raster threads rasterize each tile and store the result in GPU memory.

Tiling
Tiling uses the raster threads and the compositor thread.
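
Splitting a layer into tiles can be sketched in a few lines. The function name `splitIntoTiles` and the default tile size of 256 pixels are assumptions chosen for this example, not values taken from Chrome.

```typescript
interface Tile {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Cut a layer into a grid of tiles; tiles on the right and bottom edges may be smaller.
function splitIntoTiles(
  layerWidth: number,
  layerHeight: number,
  tileSize: number = 256
): Tile[] {
  const tiles: Tile[] = [];
  for (let y = 0; y < layerHeight; y += tileSize) {
    for (let x = 0; x < layerWidth; x += tileSize) {
      tiles.push({
        x,
        y,
        width: Math.min(tileSize, layerWidth - x),
        height: Math.min(tileSize, layerHeight - y)
      });
    }
  }
  return tiles;
}
```

Each tile is an independent unit of work, which is what allows several raster threads to rasterize different parts of the same layer at once.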

Three common rendering processes

During its lifetime, a page is rendered at least once (steps 1-7 above) when it is first generated. Reflow and repaint may also be triggered repeatedly while the user is visiting the page, and both affect performance. Reflow is the more frightening of the two because it is the more expensive. Here is a brief introduction to the two concepts:

Repaint: the process of redrawing an element after its appearance (background color, border, and so on) has changed without changing its layout.
Reflow: when a DOM change affects an element's geometry (its position and size), the browser needs to recalculate that geometry and place the element in the correct position on the page. This process is called reflow.

Note: a repaint does not necessarily trigger a reflow, but a reflow always triggers a repaint.

JS/CSS > Style > Layout > Paint > Composite

This pipeline corresponds to reflow: layout is recalculated, and painting follows.

JS/CSS > Style > Paint > Composite

This pipeline corresponds to repaint: no layout is needed, and the current element is repainted without recalculating its parent.

JS/CSS > Style > Composite

This pipeline is a special case: it skips both layout and paint, and the change is applied only at the compositing stage.
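
The three pipelines can be summarized as a lookup from the changed CSS property to the stages that have to run again. The property lists below are a small illustrative selection chosen for this example, not a complete or authoritative table, and the exact behavior can differ between browser engines.

```typescript
type Stage = "style" | "layout" | "paint" | "composite";

// Illustrative examples only.
const PAINT_ONLY_PROPS: string[] = ["color", "background-color", "box-shadow"];
const COMPOSITE_ONLY_PROPS: string[] = ["transform", "opacity"];

function stagesFor(property: string): Stage[] {
  if (COMPOSITE_ONLY_PROPS.indexOf(property) !== -1) {
    return ["style", "composite"]; // neither layout nor paint
  }
  if (PAINT_ONLY_PROPS.indexOf(property) !== -1) {
    return ["style", "paint", "composite"]; // repaint
  }
  // Anything that can change geometry (width, height, margin, ...) is treated
  // conservatively as a reflow, which runs the full pipeline.
  return ["style", "layout", "paint", "composite"];
}
```

Falling back to the full pipeline for unknown properties is a deliberately conservative choice in this sketch: it may overestimate the work, but never underestimates it.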

Further reading

Inside look at modern web browser (part 1)
Inside look at modern web browser (part 2)
Inside look at modern web browser (part 3)
Inside look at modern web browser (part 4)
Illustrate the basic workings of the browser
Repaint and reflow explained in detail