The age-old interview question: What happens when you type google.com into your browser and press Enter?
Introduction to the browser
Browser history
In 1991 Tim Berners-Lee released the first browser, WorldWideWeb, later renamed Nexus.
In 1993 NCSA developed Mosaic, a multi-platform browser, and licensed it to companies to create their own products.
In 1994 Marc Andreessen, one of the developers of Mosaic, founded Netscape Communications and developed the Mosaic Netscape browser (later Netscape Navigator). After its release, it captured the vast majority of the market share.
After acquiring the Mosaic license in 1995, Microsoft developed IE 1.0/2.0, bundled with Windows 95.
In 1996, Microsoft released Internet Explorer 3.0, the first commercial browser to support scripting languages and CSS, and began to close the gap with Netscape, which then held about 86 percent of the market.
Netscape opened the source code in 1998 under the name Mozilla.
In 1999, IE’s market share reportedly reached 99%, largely because it was bundled with Windows.
In 2003 Apple released the Safari browser, bundled with its operating system.
Mozilla launched Firefox 1.0 in 2004.
In 2005, Apple open-sourced WebKit, the core of its Safari browser. In 2015, Microsoft moved away from Internet Explorer because of performance and experience issues and launched its replacement, Edge, which was later rebuilt on Chromium.
Google developed Chromium based on WebKit in 2008 and released Chrome based on Chromium.
According to StatCounter, as of June 2021, Chrome accounted for 65.27% of the browser market, while IE accounted for only 0.61%. IE is clearly about to exit the stage of browser history. Edge, IE’s successor, had just 3.4 percent of the market.
The high-level structure of the browser
A high-level diagram of the browser divides it into the following subsystems:
- User Interface: everything in the browser window other than the area that displays the tab’s page content, including the address bar, the forward and back buttons, and so on.
- Browser Engine: passes data between the user interface and the rendering engine.
- Rendering Engine: renders the page content requested by the user.
- Networking: a functional module under the rendering engine, responsible for network requests.
- JavaScript Interpreter: a functional module under the rendering engine that parses and executes JS.
- Data Persistence: helps the browser store various kinds of data, such as cookies.
The rendering engine is often referred to as the browser’s kernel, and different browsers use different kernels:
Internet Explorer uses Trident, Firefox uses Gecko, Safari uses WebKit, and Chrome, Opera, and Edge use Blink (launched in 2013).
The browser process architecture
Processes and threads
A brief introduction to processes and threads.
When we start a program, the operating system creates a process to run its code and allocates memory to that process, where the state of the application is stored.
When the application is closed, the process ends and its memory is reclaimed.
A process can ask the operating system to create other processes, each with its own independent memory space. If two processes need to exchange data, they must do so through an Inter-Process Communication (IPC) channel.
A process can also split its work into many small tasks and create multiple threads to perform them in parallel. Threads in the same process share memory, so they can communicate directly through shared data.
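The difference between the two kinds of communication can be sketched in JavaScript. This is only a single-threaded simulation under stated assumptions: `sendOverIpc` is a hypothetical helper that stands in for an IPC channel by serializing the message, and two typed-array views over one `SharedArrayBuffer` stand in for two threads of the same process.

```javascript
// Simulated IPC: the message is serialized, so the receiver gets a copy.
// sendOverIpc is a hypothetical helper, not a real browser API.
function sendOverIpc(message) {
  return JSON.parse(JSON.stringify(message));
}

// "Process A" owns some state and sends it to "process B".
const stateInProcessA = { visits: 1 };
const stateInProcessB = sendOverIpc(stateInProcessA);
stateInProcessB.visits += 1; // changes only B's copy, A is unaffected

// Threads of one process can look at the same memory.
const sharedMemory = new SharedArrayBuffer(4);
const viewOfThread1 = new Int32Array(sharedMemory);
const viewOfThread2 = new Int32Array(sharedMemory);
Atomics.add(viewOfThread1, 0, 1); // "thread 1" writes
const seenByThread2 = Atomics.load(viewOfThread2, 0); // "thread 2" sees the write
```

The copy made by the simulated IPC call is what keeps one process from corrupting the memory of another, which is the isolation the multi-process architecture relies on.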
Early single-process browsers
In early browsers all functions ran in a single process, and this single-process structure has many problems:
- Unstable: a frozen thread can bring down the entire process. For example, if one tab freezes, the entire browser stops working.
- Unsafe: because the whole browser lives in one process, all pages share the same memory, so a script running in the JS engine may be able to access data that belongs to other pages.
- Slow: a single process is responsible for too many things, which can lead to performance problems.
To solve these problems, browsers now use a multi-process architecture.
Multi-process browser architecture
- The browser process controls the browser UI, such as the address bar and the forward and back buttons, and is responsible for coordinating the other processes.
- The renderer process is responsible for everything displayed inside a tab. Renderers run in a sandbox. Typically the browser creates one renderer process per tab, depending on the process model in use.
- The GPU process handles GPU tasks, such as drawing the browser interface, on behalf of the other processes.
- The plug-in process is responsible for controlling all plug-ins used within the browser.
- The network process is responsible for initiating and receiving network requests.
- Other processes, such as cache processes and extension processes.
Chromium supports four process models:
- Single process: the browser engine and the rendering engine run in the same process.
- Process-per-tab: all sites visited in a tab use the same process.
- Process-per-site: all pages from the same site use the same process.
- Process-per-site-instance (default): a new process is created when visiting a different site, and also for separately opened pages of the same site. However, if a new page is opened from an existing page and both belong to the same site, the new page reuses the parent page’s renderer process.
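One way to picture the four models is as a function that maps a page to a process key, where pages with the same key share a renderer process. The sketch below is illustrative only: `siteOf`, `processKey`, `tabId`, and `openerGroup` are hypothetical names, and the site is approximated as the scheme plus the last two host labels, whereas real browsers consult the public suffix list.

```javascript
// Hypothetical helper: approximate a "site" as scheme + last two host labels.
function siteOf(url) {
  const { protocol, hostname } = new URL(url);
  const labels = hostname.split(".");
  return `${protocol}//${labels.slice(-2).join(".")}`;
}

// Returns a key; pages with the same key would share a renderer process.
// openerGroup identifies pages that were opened from one another.
function processKey(model, { url, tabId, openerGroup }) {
  switch (model) {
    case "single-process":
      return "browser";
    case "process-per-tab":
      return `tab:${tabId}`;
    case "process-per-site":
      return `site:${siteOf(url)}`;
    case "process-per-site-instance":
      return `site:${siteOf(url)}#${openerGroup}`;
    default:
      throw new Error(`unknown model: ${model}`);
  }
}
```

Under the default model, two same-site pages share a process only when they also share an opener group, which matches the reuse rule described for pages opened from a parent page.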
Navigation process
1. Process input
When you type an address into the browser’s address bar, the UI thread in the browser process captures your input and determines whether it’s a search query or a URL.
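The decision between a search query and a URL can be sketched with a simple heuristic. The function below is an assumption for illustration, not the logic any real browser uses: `classifyInput` is a hypothetical name, and the rule is simply "has a scheme, or looks like a dotted host name with no spaces".

```javascript
// Hypothetical heuristic: decide whether address-bar text is a URL or a search.
function classifyInput(text) {
  const input = text.trim();
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  const looksLikeHost = !/\s/.test(input) && /^[^\s.]+(\.[^\s.]+)+/.test(input);
  if (hasScheme || looksLikeHost) {
    const candidate = hasScheme ? input : `https://${input}`;
    try {
      return { type: "url", url: new URL(candidate).href };
    } catch {
      // Not parseable as a URL after all: treat it as a search query.
    }
  }
  return { type: "search", query: input };
}
```

With this rule, `classifyInput("google.com")` is treated as a URL, while `classifyInput("how do browsers work")` is treated as a search query.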
If the current page has registered a beforeunload handler, the browser can show a “Leave this site?” warning when you try to navigate away. So when a new navigation request comes in, the browser process must first check with the current renderer process.
When the navigation goes to a different site, a new renderer process is started to handle it, while the old renderer process is kept around to handle events such as unload.
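On the page side, the check comes from a `beforeunload` listener. A minimal sketch follows, assuming a hypothetical `warnBeforeLeaving` helper and a `hasUnsavedChanges` callback supplied by the page; in a real page the first argument would be `window`. Some older browsers also expect `event.returnValue` to be set, which is omitted here.

```javascript
// pageWindow stands in for the page's `window`; any EventTarget works here.
// hasUnsavedChanges is a hypothetical callback supplied by the page.
function warnBeforeLeaving(pageWindow, hasUnsavedChanges) {
  pageWindow.addEventListener("beforeunload", (event) => {
    if (hasUnsavedChanges()) {
      // Cancelling the event asks the browser to show its confirmation dialog.
      event.preventDefault();
    }
  });
}
```

Because the listener runs inside the renderer process, the browser process has to wait for it before it can commit to the new navigation.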
2. Network request
If it is a search query, the keyword is sent to the search engine.
In the case of a URL, when the user hits Enter, the UI thread asks the network thread to fetch the corresponding content.
The details of network communication are not covered here.
3. Parse the response
When the network thread obtains the data, it uses Safe Browsing, Google’s site-security service, to check whether the site is malicious. Safe Browsing judges whether a site is safe by examining its data, for example by checking whether its IP is on Google’s blocklist.
If so, a warning page is displayed and the browser blocks access, although you can still choose to continue.
The network process parses the HTTP response headers and forwards them to the browser process.
The browser process determines the data type from the Content-Type response header. If it is an HTML file, the data is passed to the renderer process. If it is another kind of file that should be downloaded, it is handed to the download manager.
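The routing rule described above can be written as a small function. This is a sketch of the article's simplified rule only: `routeResponse` and the returned labels are hypothetical, and real browsers handle many more content types (images, PDFs, and so on) than the two outcomes shown.

```javascript
// Hypothetical dispatcher: decide where a response goes from its Content-Type.
// headers is a plain object with lower-case header names.
function routeResponse(headers) {
  const raw = headers["content-type"] || "";
  // Drop parameters such as "; charset=utf-8" before comparing.
  const contentType = raw.split(";")[0].trim().toLowerCase();
  return contentType === "text/html" ? "renderer" : "download-manager";
}
```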
4. Prepare the rendering process
When the returned data is ready and has passed the security checks, the network thread notifies the UI thread that the data is ready, and the UI thread finds a renderer process to render the page.
As an optimization, the UI thread starts the renderer process in parallel with the network request, so by the time the network thread receives the data, the renderer process is already on standby.
5. Submit navigation
Now that the response header data and the renderer process are ready, the browser process sends an IPC message to the renderer process to commit the navigation.
After receiving the message, the renderer process starts receiving the HTML data through a data pipe set up with the network process.
When the renderer process receives the data, it confirms the commit to the browser process. Once the browser process receives that confirmation, the navigation is complete and the document loading phase begins.
At this point, the browser process updates its UI state, including the address bar and the session history used by the back and forward buttons.
Rendering process
The data received by the renderer process is HTML. The core task of the renderer process is to render HTML, CSS, JS, image and other resources into web pages that users can interact with.
1. Build a DOM tree
When the renderer process receives the commit message for the navigation and starts to receive HTML data, its main thread begins to parse the HTML and construct the DOM. The DOM (Document Object Model) is the browser’s internal representation of the page, and it is also the data structure and API that JavaScript can interact with.
The HTML parsing algorithm consists of two stages, tokenization and tree building.
It starts with tokenization, a lexical analysis process that splits the input into tokens. HTML tokens include start tags, end tags, attribute names, and attribute values.
The DOM tree is then built from the recognized tokens. During tree construction a Document object is created first, and the tree rooted at that Document is then modified continuously as elements are added to it.
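The two stages can be sketched with a toy parser. This is far simpler than the algorithm in the HTML specification: `tokenize` and `buildTree` are hypothetical names, attributes and comments are ignored, void elements such as img are not handled, and an end tag simply closes the most recently opened element.

```javascript
// Stage 1: tokenization. Split markup into start-tag, end-tag and text tokens.
function tokenize(html) {
  const tokens = [];
  const pattern = /<\/([a-z][a-z0-9]*)\s*>|<([a-z][a-z0-9]*)[^>]*>|([^<]+)/gi;
  for (const match of html.matchAll(pattern)) {
    if (match[1]) {
      tokens.push({ type: "endTag", name: match[1].toLowerCase() });
    } else if (match[2]) {
      tokens.push({ type: "startTag", name: match[2].toLowerCase() });
    } else if (match[3].trim()) {
      tokens.push({ type: "text", data: match[3].trim() });
    }
  }
  return tokens;
}

// Stage 2: tree construction. Keep a stack of open elements under a Document root.
function buildTree(tokens) {
  const documentNode = { name: "#document", children: [] };
  const openElements = [documentNode];
  for (const token of tokens) {
    const parent = openElements[openElements.length - 1];
    if (token.type === "startTag") {
      const element = { name: token.name, children: [] };
      parent.children.push(element);
      openElements.push(element);
    } else if (token.type === "endTag") {
      // Toy rule: close the innermost open element, never the Document itself.
      if (openElements.length > 1) openElements.pop();
    } else {
      parent.children.push({ name: "#text", data: token.data });
    }
  }
  return documentNode;
}
```

Calling `buildTree(tokenize("<html><body><p>content</p></body></html>"))` yields a Document node whose descendants are html, body, p, and a text node, mirroring the nesting of the markup.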
HTML code references additional resources, such as images and CSS, that need to be downloaded from the network or loaded from a cache. These resources do not block HTML parsing because they do not affect DOM generation.
The preload scanner runs concurrently. If the HTML document contains tags such as <img> or <link>, the preload scanner peeks at the tokens generated by the HTML parser and sends requests to the network thread in the browser process.
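The idea of the preload scanner can be sketched as a pass that collects resource URLs without building any DOM. The function below is an assumption for illustration: `scanForPreloads` is a hypothetical name, and a regular expression over the raw markup stands in for peeking at the parser's tokens.

```javascript
// Toy preload scanner: find resource URLs in markup without building the DOM.
// Looks at src on img/script tags and href on link tags only.
function scanForPreloads(html) {
  const urls = [];
  const tagPattern =
    /<(img|script)\b[^>]*\bsrc\s*=\s*["']([^"']+)["']|<link\b[^>]*\bhref\s*=\s*["']([^"']+)["']/gi;
  for (const match of html.matchAll(tagPattern)) {
    urls.push(match[2] || match[3]);
  }
  return urls;
}
```

Each URL found this way could be requested right away, so the download is already in progress by the time the parser reaches the corresponding tag.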
However, when the parser encounters a script tag, HTML parsing stops and the script is loaded and executed first, because the script may change the HTML structure of the current page. So script tags need to be placed in the right position, or marked with the async or defer attribute so that they load without blocking the parser.
Such as:
<html>
<body>
content
<script>
document.write("--foo")
</script>
</body>
</html>
When the parser reaches the inline script, it pauses, runs the JavaScript, and then continues parsing once the script has finished.
<html>
<body>
content
<script type="text/javascript" src="foo.js"></script>
</body>
</html>
In this case, when the parser reaches the script tag, DOM parsing is paused and foo.js is downloaded. Once the download completes, the script is executed and DOM parsing continues. This is why JavaScript files block DOM construction.
<html>
<head>
<link rel="stylesheet" type="text/css" href="theme.css">
</head>
<body>
<p>content</p>
<script>
let e = document.getElementsByTagName('p')[0]
e.style.color = 'blue'
</script>
</body>
</html>
When JavaScript accesses the style of an element, it has to wait for the stylesheet to be downloaded and parsed before the script can run, so in this case CSS blocks DOM parsing as well.
2. Style calculation
After the HTML parsing is complete, we’ll have a DOM tree, but we don’t yet know the style of the nodes in the DOM tree.
The main thread parses the CSS and determines the computed style of each DOM node.
- Parse the CSS
Parse the CSS into a styleSheets structure that the browser can understand.
- Normalize property values
All property values are converted to standardized computed values that the rendering engine can easily work with, such as 2em to 32px (with a 16px base font size), red to rgb(255, 0, 0), bold to 700, and so on.
- Compute the style of each node
The final style of each node is worked out by applying the CSS inheritance and cascade rules. In addition to the styles written by the page author, browsers have built-in default style sheets.
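The normalization and merging steps can be sketched as follows. This is a deliberately tiny model under stated assumptions: `normalizeValue` and `computeStyle` are hypothetical names, only a couple of named colors and font weights are covered, the base font size defaults to 16px, and author styles simply override the browser defaults without real cascade or specificity rules.

```javascript
// Small lookup tables; a real engine knows far more keywords.
const NAMED_COLORS = new Map([["red", "rgb(255, 0, 0)"], ["blue", "rgb(0, 0, 255)"]]);
const FONT_WEIGHTS = new Map([["normal", "400"], ["bold", "700"]]);

// Convert one declared value into a standardized computed value.
function normalizeValue(property, value, parentFontSizePx = 16) {
  const em = /^(\d*\.?\d+)em$/.exec(value);
  if (em) return `${parseFloat(em[1]) * parentFontSizePx}px`;
  if (property === "color" && NAMED_COLORS.has(value)) return NAMED_COLORS.get(value);
  if (property === "font-weight" && FONT_WEIGHTS.has(value)) return FONT_WEIGHTS.get(value);
  return value;
}

// Merge browser defaults with author styles, then normalize every value.
function computeStyle(userAgentStyle, authorStyle, parentFontSizePx = 16) {
  const merged = { ...userAgentStyle, ...authorStyle };
  const computed = {};
  for (const [property, value] of Object.entries(merged)) {
    computed[property] = normalizeValue(property, value, parentFontSizePx);
  }
  return computed;
}
```

For example, merging the defaults `{ display: "block" }` with the author styles `{ color: "red", "font-size": "2em" }` gives a computed style in which the color is `rgb(255, 0, 0)` and the font size is `32px`.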
3. Layout stage
Now the renderer has the entire DOM tree and the style of each node, but it’s not enough to draw the entire page.
Layout is the process of finding the geometry of elements. The main thread walks the DOM and the computed styles and creates a layout tree that contains information such as x and y coordinates and bounding box sizes.
A layout tree may have a similar structure to the DOM tree, but it contains only information related to what is visible on the page.
If display: none is applied, the element is not part of the layout tree (however, elements with visibility: hidden are in the layout tree).
Similarly, if a pseudo-element with content such as p::before { content: "Hi!" } is applied, it is included in the layout tree even though it is not in the DOM.
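The inclusion rules above can be sketched as a recursive walk. The node shape used here (`tag`, `style`, `before`, `children`) and the name `buildLayoutTree` are assumptions made for the example, and no geometry is computed: the sketch only shows which nodes end up in the layout tree.

```javascript
// node: { tag, style: { display, visibility }, before, children } (hypothetical shape)
// `before` holds the content of a ::before pseudo-element, if any.
function buildLayoutTree(node) {
  const style = node.style || {};
  // display: none removes the element and its whole subtree.
  if (style.display === "none") return null;
  // visibility: hidden keeps the box in the tree but marks it invisible.
  const box = { tag: node.tag, visible: style.visibility !== "hidden", children: [] };
  if (node.before) {
    // Generated content gets a box even though it has no DOM node.
    box.children.push({ tag: "::before", visible: box.visible, text: node.before, children: [] });
  }
  for (const child of node.children || []) {
    const childBox = buildLayoutTree(child);
    if (childBox) box.children.push(childBox);
  }
  return box;
}
```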
4. Layering
The main thread traverses the layout tree to generate the layer tree.
Why not give every element its own layer? Because compositing across a very large number of layers would make the later compositing step less efficient.
So how does the browser decide whether to promote an element to a new layer?
- Elements with properties that create a stacking context are promoted to their own layer.
- Areas that need to be clipped are also created as layers.
5. Layer drawing
The main thread traverses each layer and generates the drawing instructions.
6. Rasterization
The main thread sends the drawing instructions to the compositor thread.
The compositor thread divides each layer into tiles and sends the tiles to the raster threads, which rasterize each tile and store the result in GPU memory, giving priority to tiles near the viewport.
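Tiling and prioritization can be sketched for a single-column layer. Everything here is an assumption for illustration: `tilesByPriority` is a hypothetical name, the 256-pixel tile size is arbitrary, and priority is measured as the vertical distance between the tile center and the viewport center.

```javascript
// Split a layer into square tiles and order them so that tiles closest to the
// viewport come first. layer: { width, height }, viewport: { top, height }.
function tilesByPriority(layer, viewport, tileSize = 256) {
  const viewportCenterY = viewport.top + viewport.height / 2;
  const tiles = [];
  for (let y = 0; y < layer.height; y += tileSize) {
    for (let x = 0; x < layer.width; x += tileSize) {
      const tileCenterY = y + tileSize / 2;
      tiles.push({ x, y, distance: Math.abs(tileCenterY - viewportCenterY) });
    }
  }
  // Smaller distance means the tile should be rasterized sooner.
  return tiles.sort((a, b) => a.distance - b.distance);
}
```

For a 256 by 1024 layer with the viewport covering the third tile, that tile comes first in the returned order and the tile at the top of the layer comes last.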
7. Display
Once all the tiles have been rasterized, the compositor thread collects the draw quads for them into a frame and sends it to the browser process, which receives the renderer’s output and displays it on screen.