How browsers work

Which processes a tab involves

Opening a tab in Chrome involves five kinds of processes:

Browser main process: responsible for interface display, user interaction, child process management, and storage.
Render process: its core task is to turn HTML, CSS, and JavaScript into a web page the user can interact with. Blink and the V8 JavaScript engine both run in this process. By default, Chrome starts a render process for each tab.
GPU process: supports 3D rendering for CSS effects, graphics, and video.
Network process: responsible for loading the page's network resources.
Plug-in process: responsible for running browser plug-ins.

Threads in the render process

The render process consists of the following threads:

GUI rendering thread: parses HTML and CSS, builds the DOM tree, and performs layout and painting.
JS engine thread: parses and executes JavaScript. It is mutually exclusive with the GUI rendering thread: while the JS engine is executing, the GUI thread is suspended until the JS engine is idle.
Event-triggering thread: drives the event loop, queuing event callbacks for the JS engine thread to run.
Timer thread: handles timers such as setTimeout and setInterval.
Asynchronous HTTP request thread: handles the browser's Ajax requests.

Rendering pipeline

Build the DOM tree: convert the HTML string into a DOM tree structure that the browser can work with.
Parse CSS into the CSSOM: parse CSS code, including inline CSS, into structured styleSheets that the browser can understand.
Build the render tree: combine the DOM tree and the CSSOM into a render tree that contains only visible elements. For example, the head tag and elements with display: none are not in the render tree.
Layout: calculate the coordinates of each node in the render tree.
Layering, compositing, display: split the render tree into layers, composite them, and display the result on the screen.
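
The render tree step above can be sketched in a few lines. This is only a toy model: DomNode, RenderNode, NON_VISUAL_TAGS, and buildRenderTree are names made up for this example, and a real engine such as Blink works on computed styles rather than a single display field.

```typescript
// Hypothetical, simplified node shapes for illustration only.
interface DomNode {
  tag: string;
  style?: { display?: string };
  children: DomNode[];
}

interface RenderNode {
  tag: string;
  children: RenderNode[];
}

// A small, non-exhaustive set of tags that never produce anything on screen.
const NON_VISUAL_TAGS = new Set(["head", "script", "meta", "link", "style", "title"]);

// Returns the render-tree node for `node`, or null when the node is not rendered.
function buildRenderTree(node: DomNode): RenderNode | null {
  if (NON_VISUAL_TAGS.has(node.tag)) return null;
  if (node.style?.display === "none") return null; // the whole subtree is skipped
  const children: RenderNode[] = [];
  for (const child of node.children) {
    const rendered = buildRenderTree(child);
    if (rendered !== null) children.push(rendered);
  }
  return { tag: node.tag, children };
}

const sampleDom: DomNode = {
  tag: "html",
  children: [
    { tag: "head", children: [{ tag: "title", children: [] }] },
    {
      tag: "body",
      children: [
        { tag: "h1", children: [] },
        { tag: "div", style: { display: "none" }, children: [{ tag: "p", children: [] }] },
      ],
    },
  ],
};

const sampleRenderTree = buildRenderTree(sampleDom);
console.log(JSON.stringify(sampleRenderTree));
// {"tag":"html","children":[{"tag":"body","children":[{"tag":"h1","children":[]}]}]}
```

Both the head subtree and the hidden div (together with its p child) are missing from the result, while they would still be present in the DOM tree.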

Reflow and repaint

Concepts

Reflow: if JS or CSS changes the geometry of an element, such as its position or size, the browser triggers a reflow: it rebuilds the render tree and runs the entire rendering pipeline again.
Repaint: if a change affects only the appearance of an element, such as its color, and not its geometry, the browser does not trigger a new layout. It skips the layout calculation and goes directly to the paint stage, so the update is more efficient.
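
As a rough mental model of the difference between a reflow (rearrange) and a repaint (redraw), you can ask which pipeline stage a changed CSS property restarts from. The sketch below is an assumption-heavy simplification: the property lists are illustrative and far from complete, firstStageFor is a made-up helper, and whether transform or opacity really stays on the compositor depends on the browser and on how the element is layered.

```typescript
type PipelineStage = "layout" | "paint" | "composite";

// Illustrative, non-exhaustive lists (assumptions for this sketch).
const LAYOUT_PROPS = new Set(["width", "height", "margin", "padding", "top", "left", "display", "font-size"]);
const COMPOSITE_ONLY_PROPS = new Set(["transform", "opacity"]);

// Returns the earliest stage that has to run again after `property` changes.
function firstStageFor(property: string): PipelineStage {
  if (LAYOUT_PROPS.has(property)) return "layout"; // geometry changed: reflow, then repaint
  if (COMPOSITE_ONLY_PROPS.has(property)) return "composite"; // typically no layout or paint
  return "paint"; // e.g. color or background-color: repaint only
}

console.log(firstStageFor("width")); // layout
console.log(firstStageFor("color")); // paint
console.log(firstStageFor("transform")); // composite
```

The earlier the stage, the more work the browser repeats, which is why geometry changes are the ones to batch or avoid.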

How to reduce reflow and repaint:

Avoid setting style properties one by one. If you need to restyle a DOM node, define the styles in a class and change the node's className.
To add or clone elements, build them in memory in a DocumentFragment, then add the fragment to the DOM with a single appendChild.
Use display: none as rarely as possible and prefer visibility: hidden. display: none causes a reflow, while visibility: hidden causes only a repaint.
Do not use table layout, because a small change may cause the entire table to be reflowed or repainted.
Debounce or throttle handlers for the resize event.
Use position: absolute or position: fixed for animated elements, so that they are out of the normal flow and do not affect the layout of other elements.
Use CSS transform for animations, because composited animations run off the main thread.
When modifying elements in batches, first remove them from the document flow, then put them back after the modification is complete.
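
The throttling mentioned for the resize event can be sketched as follows. This is one minimal way to write it, not the only one: throttle, fakeTime, and throttledResize are names invented for this example, and the clock is passed in as a parameter purely so the behaviour can be checked without real timers.

```typescript
// Wraps `fn` so that it runs at most once every `intervalMs` milliseconds.
// `now` is injectable (defaults to Date.now) so the sketch can use a simulated clock.
function throttle<Args extends unknown[]>(
  fn: (...args: Args) => void,
  intervalMs: number,
  now: () => number = Date.now,
): (...args: Args) => void {
  let lastRun = -Infinity;
  return (...args: Args): void => {
    const current = now();
    if (current - lastRun >= intervalMs) {
      lastRun = current;
      fn(...args);
    }
  };
}

// Simulated clock and a stand-in for an expensive resize handler.
let fakeTime = 0;
let layoutRuns = 0;
const throttledResize = throttle(() => { layoutRuns += 1; }, 100, () => fakeTime);

// 26 simulated resize events, one every 10 ms from t=0 to t=250.
for (fakeTime = 0; fakeTime <= 250; fakeTime += 10) {
  throttledResize();
}
console.log(layoutRuns); // 3 (the handler ran at t=0, t=100 and t=200)
```

In a page you would register the wrapped function instead of the raw handler, for example with window.addEventListener("resize", throttle(handler, 100)). Debouncing is the related variant that waits until events stop arriving before running the handler once.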

What happened from entering the URL to displaying the page?

DNS resolution
Establishing a TCP connection
Sending an HTTP request
The server processes the request
The server returns the response
The browser parses the response and renders the page

Detailed summary:

1. The user types into the address bar, and the browser decides whether the input is a search query or a URL. If it is a search query, the browser combines it with the default search engine to form a new URL. If the input conforms to URL rules, the browser completes it into a valid URL, for example by adding the protocol.
2. After the user presses Enter, the browser tab shows a loading state, but the previous page is still displayed because the response data for the new page has not arrived yet.
3. The browser process builds the request line and sends the URL request to the network process via inter-process communication (IPC).
4. The network process receives the URL and first checks the local cache. If a valid cached copy exists, it intercepts the request and returns the cached response directly with status 200; otherwise it starts the network request.
5. The network process resolves the domain name to an IP address through DNS. If the domain name is already in the DNS cache, the cached result is returned directly; otherwise a DNS query is sent. The port number comes from the URL, not from DNS; if the URL does not specify one, it defaults to 80 for HTTP and 443 for HTTPS. For an HTTPS request, a TLS connection must also be established.
6. Chrome allows a maximum of six TCP connections to the same domain name at a time. If 10 requests target the same domain name at once, four of them are queued until an ongoing request completes. If fewer than six requests are in flight, the TCP connection is established directly.
7. The TCP three-way handshake establishes the connection. The HTTP request is passed down to the transport layer, which adds a TCP header containing the source port number, the destination port number, and a sequence number used to keep the data in order.
8. The network layer adds an IP header to the packet, including the source and destination IP addresses, and passes it down to the lower layers.
9. The lower layers transmit the packet to the destination server over the physical network.
10. The network layer of the destination server receives the packet, parses the IP header, and passes the unwrapped payload up to the transport layer.
11. The transport layer of the destination server parses the TCP header, identifies the port, and passes the unwrapped payload up to the application layer.
12. At the application layer, the HTTP server parses the request headers and the request body. If a redirect is required, the server returns status code 301 or 302 with the redirect address in the Location field of the response header, and the browser redirects according to the status code and Location. If not, the server first uses the If-None-Match value in the request header to determine whether the requested resource has been updated. If it has not, the server returns a 304 status code, which tells the browser that its cached copy is still usable. Otherwise, it returns the new data with a 200 status code; if the browser should cache the data, the server adds a field such as Cache-Control: max-age=2000 to the response header. The response then travels back to the network process through the layers: application, transport, and network layer on the server, then network, transport, and application layer on the client.
13. After the data transfer is complete, TCP closes the connection with a four-way handshake. If the browser or server adds Connection: keep-alive to the HTTP header, the TCP connection stays open, which saves the time of establishing a connection for the next request.
14. The network process parses the response and determines its type from the Content-Type field in the response header. If it is a byte stream (such as application/octet-stream), the request is handed over to the download manager and the navigation ends. If the type is text/html, the browser process is notified that the document has been obtained and is ready to render.
15. The browser process checks whether the current page B was opened from page A and whether B belongs to the same site as A (the same root domain name and protocol). If so, the render process of page A is reused; otherwise, a separate render process is created.
16. The browser process sends a “submit document” message to the render process. After receiving it, the render process sets up a “pipeline” with the network process for data transmission. When the document data has been transferred, the render process returns a “confirm submission” message to the browser process.
17. After receiving the confirmation, the browser process updates the page state, including the security state, the URL in the address bar, and the back and forward history, and updates the web page, which is blank at this point.
18. The render process parses the document and loads sub-resources. The HTML parser converts HTML into a DOM tree, and the CSS parser converts CSS into a CSSOM tree. Together they form the render tree, which leaves out non-visual HTML elements and does not yet contain the positions of the elements to be drawn. Layout then calculates the width, height, and position of each element, after which the page is painted and finally displayed on the screen.
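
The server-side decision in the application layer step (redirect, 304, or 200) can be sketched like this. It is only an illustration under stated assumptions: handleRequest, storedResources, redirectTable, and the sample paths are all hypothetical, the 404 branch is added just so the function is total, and a real server would also handle header casing, multiple ETags, and much more.

```typescript
interface HttpRequest {
  path: string;
  headers: Record<string, string>; // header names are assumed to be lower-cased already
}

interface HttpResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical in-memory server state for this sketch.
interface StoredResource {
  etag: string;
  body: string;
}
const storedResources: Record<string, StoredResource> = {
  "/index.html": { etag: '"v2"', body: "<h1>Hello</h1>" },
};
const redirectTable: Record<string, string> = { "/old": "/index.html" };

function handleRequest(req: HttpRequest): HttpResponse {
  // 1. Redirect: the new address goes into the Location field of the response header.
  const target = redirectTable[req.path];
  if (target !== undefined) {
    return { status: 301, headers: { Location: target }, body: "" };
  }

  const resource = storedResources[req.path];
  if (resource === undefined) {
    return { status: 404, headers: {}, body: "" }; // not part of the text above
  }

  // 2. Conditional request: the browser's cached copy is still current.
  if (req.headers["if-none-match"] === resource.etag) {
    return { status: 304, headers: { ETag: resource.etag }, body: "" };
  }

  // 3. Fresh data, with caching instructions for the browser.
  return {
    status: 200,
    headers: { ETag: resource.etag, "Cache-Control": "max-age=2000" },
    body: resource.body,
  };
}

console.log(handleRequest({ path: "/old", headers: {} }).status); // 301
console.log(handleRequest({ path: "/index.html", headers: { "if-none-match": '"v2"' } }).status); // 304
console.log(handleRequest({ path: "/index.html", headers: {} }).status); // 200
```

The 304 response has an empty body, which is the whole point: the browser keeps using the copy it already has.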

Web security

XSS attacks

Cross-site scripting (XSS): an attacker injects malicious scripts into an HTML file or the DOM, and the scripts attack users while they browse the page.

How to defend:

Both the front end and the server check input for dangerous content, such as script tags, and filter or escape it.
Set the HttpOnly attribute on cookies so that they cannot be read through JS, which protects user information.
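
One common way to handle sensitive characters is to escape them before inserting user input into HTML. The sketch below is a minimal version of that idea: escapeHtml and HTML_ESCAPES are names chosen for this example, and it only covers text placed in HTML content or quoted attribute values, not inside scripts, URLs, or CSS.

```typescript
// Characters with special meaning in HTML and their entity replacements.
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Replaces every special character so the browser treats the input as text, not markup.
function escapeHtml(untrusted: string): string {
  return untrusted.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

const userComment = '<script>alert("xss")</script>';
console.log(escapeHtml(userComment));
// &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
```

After escaping, the comment is displayed literally instead of being executed as a script.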

CSRF attacks

Cross-site request forgery (CSRF): an attacker lures the user into opening the attacker's website and, relying on the user's login state, sends cross-site requests from there. In simple terms, a CSRF attack exploits a user's login state through a third-party site to perform malicious actions.

How to defend:

Make full use of the SameSite attribute of cookies to stop them from being sent with third-party requests.
Verify the source site of the request, using the Origin and Referer request headers.
CSRF token: each request carries a token generated by the server. The server checks that the token is valid, which prevents third-party websites from calling the interface maliciously.
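
The three defenses can be combined on the server roughly as follows. Everything here is a sketch under assumptions: buildSessionCookie, isRequestAllowed, TRUSTED_ORIGINS, the shop.example origin, and the csrfToken field are hypothetical, and rejecting every request without an Origin header is stricter than many real deployments.

```typescript
// Builds a Set-Cookie header value for the session cookie.
// SameSite=Strict keeps the cookie off cross-site requests; HttpOnly hides it from JS.
function buildSessionCookie(sessionId: string): string {
  return `session=${sessionId}; Path=/; HttpOnly; Secure; SameSite=Strict`;
}

// Hypothetical allow-list of sites that may send state-changing requests.
const TRUSTED_ORIGINS = new Set(["https://shop.example"]);

interface IncomingRequest {
  method: string;
  headers: Record<string, string | undefined>; // lower-cased header names assumed
  csrfToken?: string; // token echoed back by the page, e.g. from a hidden form field
}

// Returns true when a request may proceed.
function isRequestAllowed(req: IncomingRequest, expectedToken: string): boolean {
  // Safe methods are assumed not to change server state.
  if (req.method === "GET" || req.method === "HEAD") return true;

  // Verify the source site of the request.
  const requestOrigin = req.headers["origin"];
  if (requestOrigin === undefined || !TRUSTED_ORIGINS.has(requestOrigin)) return false;

  // Verify the server-generated token. A production check would normally
  // use a constant-time comparison instead of ===.
  return req.csrfToken === expectedToken;
}

console.log(isRequestAllowed(
  { method: "POST", headers: { origin: "https://evil.example" }, csrfToken: "t0k3n" },
  "t0k3n",
)); // false
```

A forged request from another site fails the origin check even if the attacker somehow guessed the token, and a request from the right origin still fails without the token.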

Principles of Garbage Collection

Generational hypothesis

Most objects live in memory for only a short time.
Objects that survive tend to live much longer.

Generational collection

New generation: stores short-lived objects. Its capacity is small, from 1 to 8 MB, and it is collected by the secondary garbage collector (the minor GC).
Old generation: stores long-lived objects. Its capacity is much larger, and it is collected by the main garbage collector (the major GC).

Simplified process

1. Mark the active and inactive objects in the space.
2. After marking is complete, reclaim all inactive objects together.
3. Defragment memory.

Detailed process

How the secondary garbage collector works

1. The secondary garbage collector divides the new generation into two halves: the object area and the free area. New objects are allocated in the object area, and a garbage collection is triggered when the object area is nearly full.
2. First, the secondary garbage collector marks the objects in the object area to distinguish active objects from inactive ones.
3. After marking, the secondary garbage collector copies the active objects from the object area to the free area, placing them one after another, and then empties the object area. Because the survivors end up packed together, this copying step also defragments memory.
4. After the copy is complete, the roles of the object area and the free area are swapped, which completes the garbage collection.
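
The four steps above can be simulated with plain arrays. This is a toy model, not V8's implementation: NewGeneration, YoungObject, and scavenge are names made up here, arrays stand in for the two memory areas, and objects are moved by reference rather than copied to new addresses.

```typescript
interface YoungObject {
  id: string;
  refs: YoungObject[]; // outgoing references to other objects
}

class NewGeneration {
  objectArea: YoungObject[] = []; // where new objects are allocated
  freeArea: YoungObject[] = []; // empty until a collection runs

  allocate(id: string): YoungObject {
    const obj: YoungObject = { id, refs: [] };
    this.objectArea.push(obj);
    return obj;
  }

  scavenge(roots: YoungObject[]): void {
    // Step 2: mark everything reachable from the roots as active.
    const active = new Set<YoungObject>();
    const pending = [...roots];
    while (pending.length > 0) {
      const obj = pending.pop()!;
      if (active.has(obj)) continue;
      active.add(obj);
      pending.push(...obj.refs);
    }
    // Step 3: move active objects to the free area, packed together,
    // then empty the object area.
    this.freeArea = this.objectArea.filter((obj) => active.has(obj));
    this.objectArea = [];
    // Step 4: swap the roles of the two areas.
    [this.objectArea, this.freeArea] = [this.freeArea, this.objectArea];
  }
}

const young = new NewGeneration();
const objA = young.allocate("a");
const objB = young.allocate("b");
young.allocate("c"); // never referenced from a root: garbage
objA.refs.push(objB);

young.scavenge([objA]);
console.log(young.objectArea.map((o) => o.id)); // [ 'a', 'b' ]
console.log(young.freeArea.length); // 0
```

Only the surviving objects are touched, so the cost of a collection depends on how much is still alive rather than on how much garbage there is, which suits a generation where most objects die young.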

How the main garbage collector works

Because the new generation is small, it fills up with garbage easily. To deal with this, objects that survive two garbage collections are promoted to the old generation.

The main garbage collector uses the mark-sweep algorithm.

1. Marking: starting from the root elements, the collector traverses the object graph recursively. Objects that can be reached are marked as active, and objects that cannot be reached are marked as garbage.
2. Sweeping: the garbage is cleared, but this leaves memory fragments behind. The mark-compact algorithm was introduced to solve this: after marking, garbage is not cleared directly. Instead, the live objects are moved to one end, and then everything beyond the boundary of the live objects is cleared.
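
The difference between plain sweeping and the compacting variant is easiest to see on a small simulated heap. This is a sketch with invented names (OldObject, Slot, markFrom, sweep, compact); array positions stand in for memory addresses, and because the objects hold JavaScript references, the pointer updating a real collector must do after moving objects is not shown.

```typescript
interface OldObject {
  id: string;
  refs: OldObject[];
  marked: boolean;
}

type Slot = OldObject | null; // null means free memory

// Marking stage: flag every object reachable from the roots.
function markFrom(roots: OldObject[]): void {
  const stack = [...roots];
  while (stack.length > 0) {
    const obj = stack.pop()!;
    if (obj.marked) continue;
    obj.marked = true;
    stack.push(...obj.refs);
  }
}

// Mark-sweep: free unmarked slots in place, which leaves holes between live objects.
function sweep(heap: Slot[]): Slot[] {
  return heap.map((slot) => (slot !== null && slot.marked ? slot : null));
}

// Compacting variant: slide live objects to one end, free everything after them.
function compact(heap: Slot[]): Slot[] {
  const live = heap.filter((slot): slot is OldObject => slot !== null && slot.marked);
  const free = new Array<Slot>(heap.length - live.length).fill(null);
  return [...live, ...free];
}

const makeObj = (id: string): OldObject => ({ id, refs: [], marked: false });
const showHeap = (heap: Slot[]): string =>
  heap.map((slot) => (slot === null ? "-" : slot.id)).join(" ");

const objP = makeObj("p");
const objQ = makeObj("q");
const objR = makeObj("r");
const objS = makeObj("s");
objP.refs.push(objR); // q and s are unreachable

const oldHeap: Slot[] = [objP, objQ, objR, objS];
markFrom([objP]);

console.log(showHeap(sweep(oldHeap))); // p - r -
console.log(showHeap(compact(oldHeap))); // p r - -
```

Both results free two slots, but only the compacted heap leaves them as one contiguous block that can hold a larger object.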

Incremental marking

Garbage collection blocks the main thread. If a collection takes too long, the page visibly stutters. To avoid this, the marking work is split into many small tasks. Each task runs only briefly and is interleaved with other JS tasks, so no noticeable jank occurs. React Fiber applies a similar idea.

Tri-color marking

With incremental marking, the work is split into many small tasks. If the earlier black-and-white marking scheme were kept, the collector could not tell how far marking had progressed when it resumes between two small tasks. To solve this problem, V8 uses tri-color marking:

White: the object has not been marked yet.
Gray: the object itself has been visited, but its member variables have not all been marked.
Black: the object and its member variables have all been marked.

In each marking task, the collector continues marking from the gray objects it finds. When no gray objects remain in the heap, marking is complete.
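
Tri-color (three-color) marking in small steps can be sketched as a worklist of gray objects. The names here (IncrementalMarker, GraphObject, step, budget) are invented for this example, and it assumes the object graph does not change between steps; a real collector such as V8 additionally needs a write barrier to stay correct while the program keeps mutating references.

```typescript
type MarkColor = "white" | "gray" | "black";

interface GraphObject {
  id: string;
  refs: GraphObject[];
  color: MarkColor;
}

class IncrementalMarker {
  private grayList: GraphObject[] = [];

  constructor(roots: GraphObject[]) {
    for (const root of roots) this.shade(root);
  }

  // white -> gray: the object has been seen, but its references are not scanned yet.
  private shade(obj: GraphObject): void {
    if (obj.color === "white") {
      obj.color = "gray";
      this.grayList.push(obj);
    }
  }

  // Processes at most `budget` gray objects, then returns so other JS tasks can run.
  // Returns true once no gray objects remain, meaning marking is complete.
  step(budget: number): boolean {
    for (let i = 0; i < budget && this.grayList.length > 0; i++) {
      const obj = this.grayList.pop()!;
      for (const ref of obj.refs) this.shade(ref);
      obj.color = "black"; // the object and its direct references are handled
    }
    return this.grayList.length === 0;
  }
}

const newObj = (id: string): GraphObject => ({ id, refs: [], color: "white" });
const n1 = newObj("n1");
const n2 = newObj("n2");
const n3 = newObj("n3");
const orphan = newObj("orphan"); // unreachable from the root
n1.refs.push(n2);
n2.refs.push(n3);

const marker = new IncrementalMarker([n1]);
let steps = 1;
while (!marker.step(1)) steps += 1; // between calls, other work could run

console.log(steps); // 3
console.log([n1, n2, n3, orphan].map((o) => o.color).join(" ")); // black black black white
```

The gray list is exactly the progress record that a two-color scheme lacks: pausing after any step loses nothing, and whatever is still white at the end is garbage.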
