1. Why do I need to know how browsers work

  1. Write better code.
  2. Provide a better user experience.

2. Browser history

  • 1991: Tim Berners-Lee releases the first web browser, WorldWideWeb, which at the time was simple and largely text-based.
  • 1993: Mosaic, a browser that could display text and images at the same time, was unveiled and instantly embraced by users around the world.
  • 1994: Netscape is released by a company co-founded by one of the developers of Mosaic. Although Netscape could only display simple static HTML and had no JS or CSS, it was hugely popular and successful around the world, gaining a majority of the market share (the Opera browser project also started that same year).
  • 1995: Microsoft releases Internet Explorer 1.0 and Internet Explorer 2.0, and the browser wars begin in earnest.
  • 1996: Internet Explorer 3.0 is released and integrated with the Windows operating system. Netscape’s market share reached 86%.
  • 1998: Netscape open-sources its browser code and creates the Mozilla Organization (the Mozilla Foundation followed in 2003).
  • 1999: Within four years of its release, Internet Explorer, helped by its bundling with the Windows operating system, displaces Netscape as the leading browser. Its market share later peaked at roughly 95% in the early 2000s.
  • 2003: Apple releases Safari, which is included in all of Apple’s operating systems.
  • 2004: Firefox 1.0 is released, beginning the second browser war.
  • 2005: Apple open-sources WebKit, the core of its Safari browser.
  • 2008: Google creates the Chromium project, using Apple’s open-source WebKit as its engine, and releases its own browser, Chrome, based on that project. Chrome grew so quickly that it became the most popular browser in the world.
  • 2015: Internet Explorer gradually fell behind because of its performance and user-experience problems. In 2015, Microsoft moved away from Internet Explorer and launched the Edge browser, built on its own EdgeHTML engine.
  • 2020: Microsoft launches a new Microsoft Edge browser based on Chromium.

3. Browser structure

There are many different browsers, but they all offer similar functionality and share a similar architecture. The browser structure can be roughly divided as follows:

  • User interface: Displays everything the user sees except the page content inside the tab window.
  • Browser engine: Sits between the user interface and the rendering engine and transfers data between them.
  • Rendering engine: Is responsible for rendering the page content requested by the user. The rendering engine can be further divided into several modules.
    • Network module: Handles network requests.
    • JS interpreter: Parses and executes JS.
  • Data persistence layer: Helps the browser store various types of data, such as cookies.

4. Rendering engine

The rendering engine is the heart and soul of a browser. We often refer to the rendering engine as the core of the browser, which varies from browser to browser.

  • IE: Trident
  • Firefox: Gecko
  • Safari: WebKit
  • Chrome/Opera/Edge: Blink

This article uses Chrome as an example to explain how the browser works unless otherwise specified.

5. Processes and threads

A browser is an application that runs on an operating system. Each application must start at least one process to perform its functionality. Each program often needs to run many tasks, and the process creates threads to help it perform these small tasks.

Two concepts are introduced here: processes and threads.

  • Process: A process is the basic unit of resource allocation and scheduling in an operating system. It can request and own computer resources, and it is the running instance of a program.
  • Thread: A thread is the smallest unit of execution that an operating system can schedule. A process can have multiple concurrent threads, each performing a different task in parallel.

Specifically, when we start a program, a process is created to execute its code, and the process is allocated memory in which the state of the application is stored. When the application is closed, that memory is reclaimed. A process can start more processes to perform tasks. Since the memory allocated to each process is independent, data that needs to move between two processes must be sent through inter-process communication (IPC).

Many programs use a multi-process structure so that one stuck process does not bring down the rest. Because the processes are independent of each other, a failure in one does not affect the entire application.

For example, think of your laptop as an application, and the external mouse is a process of that application. If there is a problem with the external mouse, it will not affect the continued use of the laptop.

A process can divide tasks into more small tasks, and then create multiple threads to execute different tasks in parallel. Threads in the same process can directly communicate with each other and share data.

6. Processes in the browser

Today’s browsers use a multi-process architecture, but early browsers were single-process. In a single-process browser, one process contains the page thread responsible for rendering and displaying pages, the JS thread that executes JS code, and various other threads. The single-process structure causes a number of problems:

  • Instability: The death of one thread can bring down the entire process. For example, if you open several tabs and one of them freezes, the entire browser may stop working properly.
  • Unsafe: Data can be shared between the threads of the browser process, so a JS thread can freely access all data inside that process.
  • Not smooth: A single process has too many things to do and can run inefficiently.

To solve these problems, modern browsers use a multi-process structure that splits the browser into processes by function:

  • Browser process: Controls the parts of Chrome’s user interface outside the tab content, including the address bar, bookmarks, and the back and forward buttons, and coordinates the browser’s other processes.
  • Network process: Responsible for initiating and receiving network requests.
  • GPU process: Responsible for the GPU work needed to draw the entire browser interface.
  • Plugin process: Responsible for controlling all plug-ins used by a site, such as Flash. Plug-ins here do not mean the extensions installed from the Chrome Web Store.
  • Renderer process: Responsible for controlling the display of all content within a tab. By default, the browser creates a renderer process for each tab.

7. Four process models for browsers

Chrome has four process models.

Process-per-site-instance (default)

By default, Chromium creates a renderer process for each instance of a site that a user visits, which ensures that pages from different sites are rendered independently and that separate visits to the same site are isolated from each other. Simply put, a new process is created for each different site, and also for each independent visit to the same site (for example, the same site opened separately in two tabs).

Process-per-site

Indicates that the same site uses the same process.

Process-per-tab

Indicates that all sites in the same tab use one process.

Single process

Indicates that the browser engine and rendering engine share one process.

The specific advantages and disadvantages of each model are described in Chromium’s process model documentation. The process-per-site-instance model creates more processes and uses more memory, but it is the safest: each tab, and each site within a tab, is isolated from the others. When the renderer process of one tab freezes, the other tabs are not affected.
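The difference between the four models can be sketched as a choice of grouping key: page visits that map to the same key share a renderer process. The `Visit` shape, the `instanceId` field, and the key formats below are assumptions made for illustration, not Chromium's actual data structures.

```typescript
// Hypothetical description of one page visit (not a Chromium type).
interface Visit {
  tabId: number;      // tab in which the page is shown
  site: string;       // site of the page, e.g. "a.example"
  instanceId: number; // distinguishes independent visits to the same site
}

type ProcessModel =
  | "process-per-site-instance"
  | "process-per-site"
  | "process-per-tab"
  | "single-process";

// Visits that produce the same key would share one renderer process.
function processKey(model: ProcessModel, visit: Visit): string {
  switch (model) {
    case "process-per-site-instance":
      return `${visit.site}#${visit.instanceId}`;
    case "process-per-site":
      return visit.site;
    case "process-per-tab":
      return `tab-${visit.tabId}`;
    case "single-process":
      return "browser";
  }
}

// Counts how many distinct processes a set of visits needs under a model.
function countProcesses(model: ProcessModel, visits: Visit[]): number {
  const keys: string[] = [];
  for (const visit of visits) {
    const key = processKey(model, visit);
    if (keys.indexOf(key) === -1) keys.push(key);
  }
  return keys.length;
}

const sampleVisits: Visit[] = [
  { tabId: 1, site: "a.example", instanceId: 1 },
  { tabId: 2, site: "a.example", instanceId: 2 }, // same site, separate visit
  { tabId: 2, site: "b.example", instanceId: 3 }, // different site, same tab
];

const perSiteInstance = countProcesses("process-per-site-instance", sampleVisits); // 3
const perSite = countProcesses("process-per-site", sampleVisits);                  // 2
const perTab = countProcesses("process-per-tab", sampleVisits);                    // 2
const single = countProcesses("single-process", sampleVisits);                     // 1
```

The counts show the trade-off described above: the default model needs the most processes for the same three visits, in exchange for the strongest isolation.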

8. What happens when you type an address in the browser’s address bar

8.1. Determine the address or keyword

When something is typed into the browser’s address bar, the browser process’s UI thread captures the input. If the input is a web address, the UI thread starts a network thread, which asks DNS to resolve the domain name and then connects to the server to fetch the data. If the input is a string of keywords rather than an address, the browser treats it as a search and sends it to the default search engine.
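The address-or-keyword decision can be sketched as a small heuristic. This is only an illustration: the regular expressions and the `SEARCH_TEMPLATE` URL are assumptions, and a real browser applies far more rules than this.

```typescript
// Result of interpreting the address bar input (hypothetical type).
type ResolvedInput =
  | { kind: "url"; url: string }
  | { kind: "search"; url: string };

// Assumption: a made-up default search engine endpoint.
const SEARCH_TEMPLATE = "https://search.example/?q=";

function resolveInput(input: string): ResolvedInput {
  const text = input.trim();

  // Case 1: the input carries an explicit scheme such as "https://".
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(text)) {
    return { kind: "url", url: text };
  }

  // Case 2: no whitespace and it looks like "host.tld", optionally
  // followed by a port and a path. Treat it as an address.
  if (/^[^\s\/]+\.[a-z]{2,}(:\d+)?(\/\S*)?$/i.test(text)) {
    return { kind: "url", url: "https://" + text };
  }

  // Otherwise: treat the input as keywords for the default search engine.
  return { kind: "search", url: SEARCH_TEMPLATE + encodeURIComponent(text) };
}

const typedAddress = resolveInput("example.com/docs");
// { kind: "url", url: "https://example.com/docs" }
const typedKeywords = resolveInput("how browsers work");
// { kind: "search", url: "https://search.example/?q=how%20browsers%20work" }
```

A heuristic this simple misclassifies some inputs (for example, a phrase like "node.js" looks like a host name), which is one reason real implementations are more elaborate.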

8.2. SafeBrowsing inspection

Let’s look at what happens once the network thread has the data. First, it uses SafeBrowsing to check whether the site is malicious. If it is, the browser shows a warning page saying that the site has a security problem and blocks access. The user can still choose to continue to the site anyway.

SafeBrowsing is a site-security system from Google that checks a site’s data to determine whether it is safe, for example by checking whether the site’s address is on Google’s list of unsafe sites.

8.3. The renderer process renders

When the returned data is ready and has passed the security check, the network thread notifies the UI thread. The UI thread then creates a renderer process to render the page. The browser process passes the data to the renderer process over IPC, and the rendering stage formally begins.

The data received by the renderer process is HTML. The core task of the renderer process is to turn HTML, CSS, JS, images, and other resources into a web page that users can interact with. The main thread of the renderer process parses the HTML and constructs the DOM (Document Object Model), which is both the browser’s internal representation of the page and the data structure and API that web developers interact with through JS.

8.3.1. DOM tree construction

The HTML is first tokenized: the tokenizer performs lexical analysis on the input and breaks it into tokens such as start tags, end tags, and text. The DOM tree is then constructed from the recognized tokens. The Document object is created at the start of DOM tree construction, and the tree, with the document as its root node, is modified continuously as elements are added to it.
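A heavily simplified sketch of these two steps follows. It assumes well-formed markup with no attributes, comments, or void elements, and the `Token` and `DomNode` types are invented for this example; the real HTML parsing algorithm is far more involved and error-tolerant.

```typescript
type Token =
  | { type: "open"; tag: string }
  | { type: "close"; tag: string }
  | { type: "text"; text: string };

interface DomNode {
  name: string; // tag name, "#document", or "#text"
  text?: string;
  children: DomNode[];
}

// Lexical analysis: split the markup into start tags, end tags, and text.
function tokenize(html: string): Token[] {
  const tokens: Token[] = [];
  const pattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)\s*>|([^<]+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    if (match[2] !== undefined) {
      const tag = match[2].toLowerCase();
      tokens.push(match[1] === "/" ? { type: "close", tag } : { type: "open", tag });
    } else if (match[3].trim() !== "") {
      tokens.push({ type: "text", text: match[3].trim() });
    }
  }
  return tokens;
}

// Tree construction: keep a stack of open elements and attach each new
// node to the element on top of the stack.
function buildDom(tokens: Token[]): DomNode {
  const root: DomNode = { name: "#document", children: [] };
  const openElements: DomNode[] = [root];
  for (const token of tokens) {
    const current = openElements[openElements.length - 1];
    if (token.type === "open") {
      const element: DomNode = { name: token.tag, children: [] };
      current.children.push(element);
      openElements.push(element);
    } else if (token.type === "text") {
      current.children.push({ name: "#text", text: token.text, children: [] });
    } else if (openElements.length > 1) {
      openElements.pop(); // an end tag closes the innermost open element
    }
  }
  return root;
}

const sampleDom = buildDom(tokenize("<html><body><p>Hello</p><p>World</p></body></html>"));
// sampleDom: #document > html > body > [p > "Hello", p > "World"]
```

The tree is built incrementally as tokens arrive, which matches the description above of the document root being extended while parsing proceeds.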

HTML often pulls in additional resources, such as images, CSS, and JS scripts. Images and CSS need to be downloaded over the network or loaded from the cache. These resources do not block HTML parsing because they do not affect DOM generation. However, when a script tag is encountered during HTML parsing, parsing stops while the JS is loaded and executed. The reason is that the browser does not know whether the JS will change the HTML structure of the current page: if the script calls document.write() to modify the HTML, the parsing done so far may be wasted. That is why it is important to place script tags appropriately, or to use the async or defer attributes so that the JS is loaded asynchronously.

After the HTML parsing is complete, you get a DOM Tree, but you don’t know what each node in the DOM Tree should look like. The main thread parses the CSS and determines the computed style of each DOM node. Even if no custom CSS style is provided, the browser will have a default style sheet.

8.3.2. Generate a Layout tree

Once you know the DOM structure and the style of each node, you need to know where each node goes on the page: its coordinates and how much area it occupies. This stage is called layout. The main thread generates a Layout Tree by walking the DOM and the computed styles. Each node in the Layout Tree records its x and y coordinates and the size of its bounding box.

Note that the DOM Tree and the Layout Tree do not correspond one to one. Nodes with display: none do not appear in the Layout Tree.

If you add content through the ::before pseudo-element, that content appears in the Layout Tree but not in the DOM Tree. This is because the DOM comes from parsing the HTML and does not depend on styles, whereas the Layout Tree is generated from the DOM together with the computed styles. The Layout Tree corresponds to the nodes that are finally displayed on the screen.
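The two differences between the DOM Tree and the Layout Tree can be sketched as follows. The `StyledNode` and `LayoutNode` shapes are assumptions for this example, and the sketch leaves out the coordinate and size calculation that real layout performs.

```typescript
// A DOM node together with the part of its computed style this sketch needs.
interface StyledNode {
  name: string;
  display: "block" | "inline" | "none";
  beforeContent?: string; // content of a ::before rule, if one applies
  children: StyledNode[];
}

interface LayoutNode {
  name: string;
  children: LayoutNode[];
}

function buildLayoutTree(node: StyledNode): LayoutNode | null {
  // display: none removes the node and its whole subtree from layout.
  if (node.display === "none") return null;

  const children: LayoutNode[] = [];

  // ::before content exists only in the layout tree, never in the DOM.
  if (node.beforeContent !== undefined) {
    children.push({ name: `::before("${node.beforeContent}")`, children: [] });
  }

  for (const child of node.children) {
    const box = buildLayoutTree(child);
    if (box !== null) children.push(box);
  }
  return { name: node.name, children };
}

const styledBody: StyledNode = {
  name: "body",
  display: "block",
  children: [
    { name: "p", display: "block", beforeContent: ">> ", children: [] },
    {
      name: "div",
      display: "none",
      children: [{ name: "span", display: "inline", children: [] }],
    },
  ],
};

const layoutBody = buildLayoutTree(styledBody);
// layoutBody: body > p > ::before(">> "); the div and its span are absent
```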

8.3.3. Confirm the drawing sequence

Next you need to know in what order to draw the nodes. For example, the z-index property affects the stacking order in which nodes are drawn, so drawing the page purely in DOM order would produce incorrect results.

To make sure the correct stacking order is shown on the screen, the main thread traverses the Layout Tree to create paint records. These records list the order of the drawing operations, and this stage is called paint.
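A minimal sketch of why DOM order alone is not enough: the hypothetical `paintOrder` function below sorts items back to front by z-index and falls back to document order for ties. Real stacking rules (stacking contexts, positioned versus non-positioned elements) are considerably more detailed; here `zIndex: 0` simply stands in for "auto".

```typescript
interface PaintItem {
  name: string;
  zIndex: number; // 0 stands in for "auto" in this sketch
}

// Returns names back to front: lower z-index is painted first,
// and items with equal z-index keep their document order.
function paintOrder(itemsInDomOrder: PaintItem[]): string[] {
  return itemsInDomOrder
    .map((item, index) => ({ item, index }))
    .sort((a, b) => a.item.zIndex - b.item.zIndex || a.index - b.index)
    .map((entry) => entry.item.name);
}

const paintSequence = paintOrder([
  { name: "header", zIndex: 10 },
  { name: "content", zIndex: 0 },
  { name: "modal", zIndex: 5 },
  { name: "footer", zIndex: 0 },
]);
// ["content", "footer", "modal", "header"]
```

Although the header comes first in the DOM, it is painted last, so it ends up on top of everything else.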

8.3.4. Rasterization

Once the drawing order of the document is known, the document is turned into pixels to display on the screen. This step is called rasterization.

01- Early rasterization

Chrome originally used a very simple approach: rasterize only the content inside the user’s viewport and, as the user scrolls the page, rasterize more content to fill in the missing parts. The obvious problem with this approach is that it can cause display delays.

02- Compositing

As Chrome improved, it moved to a more sophisticated process called compositing. Compositing is a technique that splits parts of a page into layers, rasterizes the layers separately, and then combines them into the final page on a separate thread, the compositor thread.

In simple terms, the elements of the page are divided into layers according to certain rules and rasterized, and then the content of the visible area is combined into a frame that is displayed to the user.

8.3.5. Complete process

  1. The main thread traverses the Layout Tree to generate the Layer Tree.
  2. After the Layer Tree is generated and the drawing order is confirmed, the main thread passes this information to the compositor thread.
  3. The compositor thread rasterizes each layer. Since a layer can be as large as the entire page, the compositor thread cuts it into many tiles and sends each tile to a raster thread.
  4. The raster threads rasterize each tile and store the results in GPU memory.
  5. Once the tiles are rasterized, the compositor thread collects tile information called draw quads, which record where each tile is in memory and where on the page it should be drawn. From this information, the compositor thread generates a compositor frame.
  6. The compositor frame is passed through IPC to the browser process, which then passes it to the GPU.
  7. The GPU renders the frame to the screen, and the user can see the content of the page.
  8. When the page changes, for example when the user scrolls, a new compositor frame is generated. The new frame is passed to the GPU and then shown on the screen.
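Step 3 above, cutting a layer into tiles, can be sketched as follows. The `Tile` type, the 256-pixel tile size, and the `visibleTiles` helper are assumptions chosen for illustration and do not reflect Chrome's actual tile sizes or prioritization logic.

```typescript
interface Tile {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Cuts a layer of the given size into square tiles; tiles on the right
// and bottom edges are clipped to the layer bounds.
function splitIntoTiles(layerWidth: number, layerHeight: number, tileSize: number): Tile[] {
  if (tileSize <= 0) throw new Error("tileSize must be positive");
  const tiles: Tile[] = [];
  for (let y = 0; y < layerHeight; y += tileSize) {
    for (let x = 0; x < layerWidth; x += tileSize) {
      tiles.push({
        x,
        y,
        width: Math.min(tileSize, layerWidth - x),
        height: Math.min(tileSize, layerHeight - y),
      });
    }
  }
  return tiles;
}

// Tiles that overlap the viewport vertically; these are the ones a
// compositor would most want rasterized first.
function visibleTiles(tiles: Tile[], scrollY: number, viewportHeight: number): Tile[] {
  return tiles.filter(
    (tile) => tile.y < scrollY + viewportHeight && tile.y + tile.height > scrollY
  );
}

const layerTiles = splitIntoTiles(800, 2000, 256); // 4 columns x 8 rows = 32 tiles
const onScreen = visibleTiles(layerTiles, 300, 600); // rows at y = 256, 512, 768: 12 tiles
```

Working in tiles means that a scroll only changes which tiles are needed, instead of forcing the whole layer to be rasterized again.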

8.3.6. Reflow and repaint

01- Reflow

When you change the size or position properties of an element, style calculation, layout, paint, and all of the subsequent steps are redone.

02- Repaint

When you change the color properties of an element, layout is not triggered again, but style calculation and paint still are.
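The distinction can be summarized as a lookup from the changed property to the pipeline stages that must run again. The property lists below are small illustrative samples chosen for this sketch, not an exhaustive or authoritative classification.

```typescript
type Stage = "style" | "layout" | "paint" | "composite";

// Illustrative samples only; real browsers classify many more properties.
const GEOMETRY_PROPERTIES = ["width", "height", "top", "left", "margin", "padding"];
const COLOR_PROPERTIES = ["color", "background-color", "outline-color"];

function stagesTriggeredBy(property: string): Stage[] {
  if (GEOMETRY_PROPERTIES.indexOf(property) !== -1) {
    // Size or position changed: everything from style onward is redone.
    return ["style", "layout", "paint", "composite"];
  }
  if (COLOR_PROPERTIES.indexOf(property) !== -1) {
    // Only appearance changed: layout is skipped.
    return ["style", "paint", "composite"];
  }
  // Unknown property: assume the most expensive path in this sketch.
  return ["style", "layout", "paint", "composite"];
}

const afterWidthChange = stagesTriggeredBy("width"); // includes "layout"
const afterColorChange = stagesTriggeredBy("color"); // skips "layout"
```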

8.3.7. Performance optimization (reflow and repaint)

Reflow and repaint, that is, redoing layout and redoing paint, occupy the main thread. Since JS also runs on the main thread, the two compete for execution time. If you write an animation that constantly triggers reflow and repaint, the browser has to run style calculation, layout, and paint on every frame.

A page feels smooth to the user only when it refreshes at about 60 frames per second. If JS tasks need to run while an animation is playing, they compete with it, because layout, paint, and JS execution all run on the main thread. JS gets to use the main thread only when the layout and paint work for a frame has finished and there is still time left in that frame.

If the JS runs for too long, it does not hand the main thread back by the start of the next frame, so the next frame of the animation cannot be rendered on time and the animation stutters.

01-requestAnimationFrame()

The callback passed to requestAnimationFrame() is invoked once per frame, before the frame is drawn. By doing only a small piece of work in each callback, you can break a long-running JS task into smaller chunks, one per frame, so that JS pauses and hands back the main thread before each frame’s time runs out. This way, the main thread is ready to do layout and paint on time at the start of the next frame. React’s rendering architecture, React Fiber, is built on a similar idea of splitting work into small units that yield the main thread.
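A minimal sketch of this chunking pattern is shown below. To keep it runnable outside a browser, the scheduler is passed in as a parameter: in a page you would pass `(cb) => requestAnimationFrame(cb)`, while the demo uses a hand-rolled fake scheduler. The `runInChunks` helper, its parameters, and the fixed chunk size are all assumptions for illustration; a real implementation would more likely stop each chunk based on elapsed time.

```typescript
type FrameCallback = (timestamp: number) => void;
type FrameScheduler = (callback: FrameCallback) => void;

// Processes `items` a few at a time, yielding between chunks so that the
// main thread is free for layout and paint at the start of each frame.
function runInChunks<T>(
  items: T[],
  work: (item: T) => void,
  chunkSize: number,
  schedule: FrameScheduler,
  onDone?: () => void
): void {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  let next = 0;
  const step: FrameCallback = () => {
    const end = Math.min(next + chunkSize, items.length);
    while (next < end) {
      work(items[next]);
      next++;
    }
    if (next < items.length) {
      schedule(step); // more work left: continue in the next frame
    } else if (onDone) {
      onDone();
    }
  };
  schedule(step);
}

// Demo with a fake scheduler that stores callbacks until "frames" are run.
const pendingCallbacks: FrameCallback[] = [];
const fakeSchedule: FrameScheduler = (callback) => {
  pendingCallbacks.push(callback);
};

const processedItems: number[] = [];
let finished = false;
runInChunks([1, 2, 3, 4, 5], (n) => { processedItems.push(n); }, 2, fakeSchedule, () => {
  finished = true;
});

// Simulate the browser producing frames roughly every 16 ms.
let frameCount = 0;
while (pendingCallbacks.length > 0) {
  const callback = pendingCallbacks.shift();
  if (callback) {
    frameCount++;
    callback(frameCount * 16);
  }
}
// frameCount is 3: chunks [1, 2], [3, 4], [5]
```

Injecting the scheduler also makes the chunking logic easy to test, since frames can be simulated deterministically.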

02-Transform

The compositing and rasterization steps described above do not occupy the main thread. They run only on the compositor thread and the raster threads, which means they do not have to compete with JS for the main thread.

CSS has a property called transform. Animations implemented with this property do not go through layout and paint; they run directly on the compositor thread and raster threads, so they are not affected by JS running on the main thread.

More importantly, an animation implemented with transform does not need to go through layout, paint, or style calculation on every frame, which saves a lot of computation time and makes complex animations easier to implement smoothly.

9. Content reference

This article is a summary of a video by Luke, who posts on Bilibili (station B) as objTube. If anything here is not clear enough, please refer to the original video.

  1. Inside Look at Modern Web Browser (parts 1-4)
  2. How Browsers Work
  3. The Process Models
  4. High Performance Animations
  5. Webkit Technology Insider by Zhu Yongsheng