Preface

Hello, this is the first article in a series on browser architecture.

The goals of this article are as follows:

  • Review what I know about browsers and networking, and in the process build a complete, connected picture of both.
  • Share it with everyone so we can improve together. If it helps us make even a little progress, I will be delighted.
  • Establish a top-down approach to learning.

A few caveats:

  • My understanding is limited. If anything is described incorrectly, please point it out and I will correct it promptly.
  • The article draws on many references, which I have organized according to my own understanding. Links to the references are listed under each subheading. If anything infringes on your rights, leave me a message and I will remove it immediately.

Outline

  1. The past and present of browsers and browser kernels
  2. Processes and threads
  3. Evolution of browser architectures
    1. Single-process architecture
    2. Multi-process architecture
    3. Service-oriented architecture
  4. Introduction to multi-process architecture
  5. Process model
  6. The composition of the browser kernel
  7. Further reading

The past and present of browsers and browser kernels

Sources: “The history of the browser UserAgent” series; “History repeats itself: from KHTML to WebKit to Blink”

This section briefly lists the history of browsers and browser kernels.

The browser

  • 1990: Nexus (WorldWideWeb) is born
  • January 23, 1993: Mosaic is born
  • December 1994: Netscape (Mozilla) is born
  • April 1995: Opera is born
  • August 16, 1995: Internet Explorer is born
  • September 23, 2002: Firefox is born
  • January 7, 2003: Safari is born
  • September 2, 2008: Chrome is born

If you’re interested in this history, take a look at “The history of the browser UserAgent”.

Browser kernel

The browser kernel consists of three main technical branches: a layout and rendering engine, a JavaScript engine, and other components.

  • 1997: Trident
  • 1998: KHTML and KJS
  • 2000: Gecko
  • 2001: WebKit
  • 2003: Presto
  • 2010: Hybrid engines (dual kernel)
  • 2013: Blink
  • 2015: EdgeHTML

Some interesting things happened along the way in the development of browser kernels, as described in “History repeats itself: from KHTML to WebKit to Blink”.

Summary

You don’t need to memorize this history. The key point is the lineage KHTML -> WebKit -> Blink: differing ideas led to forks, and those forks kept pushing the browser kernel forward.

Processes and threads

Sources: “What’s the Diff: Programs, Processes, and Threads”; Wikipedia, “Threads”

Process

A process is a program that is currently running on a computer. The program itself is only a description of instructions and data and how they are organized; the process is the actual running instance of that program.

Thread

A thread is the smallest unit of execution that an operating system can schedule. In most cases it is contained within a process and is the unit that actually does the process’s work.

The difference and connection between the two

When the user gives the command to run a program, a process is created. A process needs resources (memory, file handles, and so on) to do its work. Each CPU core runs only one process at any given moment, and different processes are isolated from each other.

The threads in a process share that process’s memory and resources. A single-threaded process contains exactly one thread, so in that case the process and the thread are effectively the same thing.

A multithreaded process contains several threads, which lets it work on several tasks at the same time. Each thread has its own stack, but all threads in the process share the heap. In other words, different threads in the same process can run in parallel while working on the same data.
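The stack-versus-heap point above can be illustrated with a minimal Python sketch. The names `shared_log`, `worker`, and `run_workers` are made up for this example. Note also that in standard CPython the threads interleave rather than run truly in parallel, but the sharing model is the same: local variables are private to each thread, while objects such as the list are visible to all of them.

```python
import threading

shared_log = []           # one object on the heap, visible to every thread
lock = threading.Lock()   # guards concurrent appends to the shared list


def worker(name: str, count: int) -> None:
    # local_total is a local variable: each thread gets its own copy
    # in its own stack frame, so threads cannot overwrite each other's value.
    local_total = 0
    for i in range(count):
        local_total += i
    # The list is shared, so every thread's result ends up in the same place.
    with lock:
        shared_log.append((name, local_total))


def run_workers() -> list:
    """Start three threads in one process and collect their results."""
    shared_log.clear()
    threads = [
        threading.Thread(target=worker, args=(f"t{n}", 5)) for n in range(3)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(shared_log)
```

Calling `run_workers()` returns one entry per thread, all collected in the single shared list. Separate processes, by contrast, would each get their own copy of `shared_log`.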

Summary

  • Understand the concepts of process and thread.
  • Processes are isolated from each other. A process can contain multiple threads, and threads in the same process can run in parallel.

To learn more about the differences between programs, processes, and threads, read “What’s the Diff: Programs, Processes, and Threads”.

Evolution of browser architectures

This section refers to the following:

  • How browsers work and practice (multi-process and service-oriented architecture)
  • Multi-process Architecture
  • Inside look at modern web browser (part 1)

Before: Single-process architecture

In a single-process browser, all of the browser’s functional modules run in the same process. These modules include networking, plug-ins, the JavaScript runtime environment, the rendering engine, the pages themselves, and so on.

Figure: Single-process browser architecture

Having so many functional modules running in a single process is a major reason why single-process browsers are unstable, sluggish, and insecure.

Problem 1: Instability

Early browsers needed plug-ins to implement powerful features such as Web video and Web games. Plug-ins were the most crash-prone module, and because they ran inside the browser process, the crash of a single plug-in could bring down the entire browser.

The rendering engine was also unstable: some complex JavaScript code could crash it. Just as with plug-ins, a crash of the rendering engine crashed the entire browser.

Problem 2: Sluggishness

The rendering modules, JavaScript execution environments, and plug-ins of all pages ran in the same thread, which means only one module could execute at a time. A long-running script or plug-in therefore blocked everything else and made the whole browser unresponsive.

Besides slow scripts and plug-ins, memory leaks in pages were another major cause of slowdowns. The browser kernel is very complex, and after running a complex page and then closing it, memory was often not fully reclaimed. As a result, the longer the browser was used, the more memory it occupied and the slower it became.

Problem 3: Unsafe

Again, this can be explained in terms of plug-ins and page scripts.

Plug-ins can be written in C/C++ and can access any operating system resource. Running a plug-in in a page therefore means that the plug-in has full control over your computer. A malicious plug-in can install a virus or steal your passwords.

Page scripts can gain access to the system by exploiting a vulnerability in the browser. Once they have that access, they too can do malicious things to your computer.

Now: Multi-process architecture

Modern multi-process architectures are more stable because they place web applications in separate processes that are isolated from each other. A crash in one application typically does not compromise other applications or the browser itself, and each application’s access to other applications’ data is restricted.

Chrome uses a separate process for each browser tab, which protects the application as a whole from errors and glitches in the rendering engine. Chrome also restricts each rendering engine process’s access to other processes and to the rest of the system. In some ways, this brings the benefits of memory protection and access control to Web browsers.

The future: Service-oriented architecture

Figure: Chrome’s service-oriented design, with services either split across multiple processes or merged into a single browser process.

Because a multi-process architecture uses more resources and is more complex, Chrome is changing its architecture so that each part of the browser runs as a service that can easily be split into separate processes or aggregated into a single process. (In the figure, note how different processes are merged into the browser process.)

The idea is as follows:

  • When Chrome runs on powerful hardware, it may split each service into a separate process to provide greater stability.
  • On a resource-constrained device, Chrome consolidates the services into a single process to reduce the memory footprint.
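The split-or-consolidate decision can be sketched roughly as follows. This is only an illustration of the idea, not Chrome’s actual logic: the function `plan_processes`, the `low_memory` flag, and the service names are all hypothetical.

```python
def plan_processes(services: list, low_memory: bool) -> dict:
    """Return a mapping of process name -> services hosted in that process.

    Hypothetical policy: a constrained device folds every service into the
    browser process; otherwise each service gets a process of its own.
    """
    if low_memory:
        # Resource-constrained device: one process hosts everything.
        return {"browser": list(services)}

    # Powerful hardware: isolate each service in a separate process.
    plan = {"browser": []}
    for name in services:
        plan[f"{name}-process"] = [name]
    return plan
```

For example, `plan_processes(["network", "storage"], low_memory=True)` places both services in the `browser` process, while `low_memory=False` gives each one its own process.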

Process models are described in more detail in the “Process Models” section of the Multi-process Architecture document.

Introduction to multi-process architecture

This section refers to the following:

  • Inside look at modern web browser (part 1)
  • Geek Time – How browsers work and practice

Different processes

  • Browser: mainly responsible for the user interface, user interaction, child process management, and storage.
  • Renderer: controls everything displayed inside a tab where a website is shown.
  • Plugin: controls the plug-ins used by a website, such as Flash.
  • GPU: handles GPU tasks in a dedicated process, isolated from the other processes.

There are other processes as well, such as the Extension process and the Utility process. To see how many processes Chrome is running, click the menu icon in the upper right corner, then choose More tools → Task manager.

This opens a window with a list of currently running processes and their CPU/ memory usage information.

Summary

In this section, you’ve seen what processes exist in the browser and what each browser process does.

Process model

Source for process models: “Browser multi-process architecture”

Site and site-instance

First, let’s introduce the concepts of site and site-instance.

  • A site is the combination of a registered domain name (for example, google.com) and a scheme (for example, https://). For example, z.baidu.com and b.baidu.com belong to the same site.
  • A site-instance is a collection of connected pages from the same site. Two pages are considered connected if they can obtain references to each other in script code. A new page and an old page belong to the same site-instance if they belong to the same site as defined above and the new page was opened in one of the following two ways:
    1. The user opened the new page by clicking a link
    2. JavaScript code opened the new page (for example, with window.open)
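As a rough illustration of the site definition above, the sketch below derives a site key (scheme plus registered domain) from a URL. The helper names `site_of` and `same_site` are made up for this example, and treating the last two host labels as the registered domain is a simplifying assumption: real browsers consult the Public Suffix List, so this sketch gives wrong answers for domains such as example.co.uk.

```python
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Return 'scheme://registered-domain' for a URL (simplified)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".")
    # ASSUMPTION: the registered domain is the last two labels of the host.
    # Real browsers use the Public Suffix List instead.
    registered_domain = ".".join(labels[-2:])
    return f"{parts.scheme}://{registered_domain}"


def same_site(url_a: str, url_b: str) -> bool:
    """Two URLs are same-site when scheme and registered domain both match."""
    return site_of(url_a) == site_of(url_b)
```

With this sketch, `same_site("https://z.baidu.com", "https://b.baidu.com")` is true, while an `http://` and an `https://` URL on the same domain count as different sites because the scheme differs.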

Four process models

Chromium provides four process models that affect how the browser assigns pages to renderer processes. For example, one model creates a new process for every tab, while another does not. The four models are listed below.

  • Process-per-site-instance (default)
    • Each site-instance uses one process
  • Process-per-site
    • Each site uses one process
  • Process-per-tab
    • Each tab uses one process
  • Single process
    • All tabs share one process
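One way to compare the four models is to ask which key decides whether two pages share a renderer process. The sketch below is a toy model and not Chromium code: `Page`, `Model`, `process_key`, and `count_processes` are hypothetical names, and site-instance ids are assumed to be unique across the whole browser.

```python
from dataclasses import dataclass
from enum import Enum


class Model(Enum):
    PER_SITE_INSTANCE = "process-per-site-instance"
    PER_SITE = "process-per-site"
    PER_TAB = "process-per-tab"
    SINGLE = "single-process"


@dataclass(frozen=True)
class Page:
    tab_id: int
    site: str               # e.g. "https://baidu.com"
    site_instance_id: int   # assumed unique across the browser


def process_key(page: Page, model: Model):
    """Pages with the same key share one renderer process."""
    if model is Model.PER_SITE_INSTANCE:
        return ("site-instance", page.site_instance_id)
    if model is Model.PER_SITE:
        return ("site", page.site)
    if model is Model.PER_TAB:
        return ("tab", page.tab_id)
    return ("single",)      # single-process: everything shares one key


def count_processes(pages, model: Model) -> int:
    """Number of distinct renderer processes needed under a model."""
    return len({process_key(p, model) for p in pages})


# Tabs 1 and 2 are connected (same site-instance); tab 3 is the same site
# opened independently; tab 4 is a different site.
PAGES = [
    Page(tab_id=1, site="https://baidu.com", site_instance_id=1),
    Page(tab_id=2, site="https://baidu.com", site_instance_id=1),
    Page(tab_id=3, site="https://baidu.com", site_instance_id=2),
    Page(tab_id=4, site="https://google.com", site_instance_id=3),
]
```

For these four tabs, `count_processes` yields 4 processes under process-per-tab, 3 under process-per-site-instance, 2 under process-per-site, and 1 under single process.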

Why use process-per-site-instance

This model balances performance and ease of use, which makes it a reasonable general-purpose default:

  • Compared with process-per-tab, it opens far fewer processes, which means a smaller memory footprint.
  • Compared with process-per-site, it better isolates unrelated tabs under the same domain name, which is more secure.

The composition of the browser kernel

References:

  • From browser multi-process to JS single thread: a comprehensive overview of the JS running mechanism
  • Inside WebKit Technology
  • howbrowserswork
  • Browser kernel principle – Chromium Blink foundation

The browser kernel consists of three main technical branches: a layout and rendering engine, a JavaScript engine, and other components. Next, let’s take a look at what’s inside the browser kernel.

Figure: WebKit architecture

Figure: WebKit2 interface and process model

Figure: Blink architecture

The browser’s renderer process is multi-threaded. So which threads does it contain?

  1. GUI rendering thread
    • Responsible for rendering the browser interface: parsing HTML and CSS, building the DOM tree and RenderObject tree, layout, painting, and so on.
  2. JS engine thread
    • Also known as the JS kernel, it handles JavaScript scripts (for example, the V8 engine).
    • The JS engine thread is responsible for parsing JavaScript scripts and running the code.
  3. Event trigger thread
    • Belongs to the browser rather than the JS engine, and is used to control the event loop.
  4. Timer trigger thread
    • The thread behind setInterval and setTimeout.
  5. Asynchronous HTTP request threads
    • After an XMLHttpRequest connects, the browser opens a new thread to handle the request.

Figure: Browser kernel threads
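The division of labor between the timer thread, the event queue, and the JS engine thread can be sketched as a toy event loop in Python. This is an analogy under stated assumptions, not how any browser is implemented: `run_event_loop` is a hypothetical name, `threading.Timer` stands in for the timer trigger thread, and a `queue.Queue` stands in for the event queue.

```python
import queue
import threading


def run_event_loop(timers):
    """Run a toy event loop.

    timers: list of (delay_seconds, label) pairs, like setTimeout calls.
    Returns the labels in the order their callbacks were executed.
    """
    tasks = queue.Queue()   # stand-in for the event queue
    executed = []

    # "Timer thread": each timer waits off the main thread and, when it
    # fires, only puts a task on the queue. It never runs the callback.
    for delay, label in timers:
        threading.Timer(delay, tasks.put, args=(label,)).start()

    # "JS engine thread" (here, the calling thread): take one task at a
    # time from the queue and run it to completion.
    for _ in range(len(timers)):
        label = tasks.get()       # blocks until a task is available
        executed.append(label)    # stand-in for running the callback
    return executed
```

Calling `run_event_loop([(0.3, "slow"), (0.01, "fast")])` runs the callback with the shorter delay first, even though it was registered second. The waiting happens on helper threads, while callbacks still execute one at a time on a single thread.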

Further reading

The Chromium project is so large that you can easily get bogged down in details if you try to read through all of it. I recommend reading these only when you need them. The difficulty is not that the documents are hard to understand, but that the project is large and the sheer amount of information can be overwhelming.

  • Chromium developer documentation
  • Blink Development Document
  • Chromium blog
  • Protecting Browsers from Extension Vulnerabilities

Conclusion

This first article in the series started with the history of browsers, then introduced the evolution of browser architectures, the roles of the different browser processes, and the process models.

If you now have a basic understanding of browser architecture, which processes a browser has, and what each one does, this article has achieved its purpose. For a more in-depth look, see the browser architecture details section of the “Inputing URLS to Render Completion series (1)”.

If you enjoyed this article, please like and bookmark it so you can follow the rest of the series. You are welcome to hold me to it and nudge me to write more.

Thank you again for reading.

Maybe it’s not very good, but once you start, you keep going. It’s the same with everything.

Don’t fear that truth is infinite; every inch gained brings an inch of joy.