This is the 29th day of my participation in the August Wenwen Challenge.

The browser

I believe web developers are no strangers to browsers. On the surface, a browser is simply an application consisting of an address bar, a menu bar, tabs, a page window, and a status bar. Behind that surface sit more complex logic and components, such as the browser engine, rendering engine, JS interpreter, networking, and data storage.

Browser kernel

The browser kernel has two parts: the rendering engine and the JS engine. As JS engines have become increasingly independent, the term "kernel" now tends to refer only to the rendering engine, which is mainly responsible for parsing and laying out the resources returned by network requests and then presenting them to the user.

Different browsers have different kernels. The five major browsers use the following kernels:

Internet Explorer uses the Trident kernel

Firefox uses the Gecko kernel

Safari uses the WebKit kernel

Chrome also used the WebKit kernel in its early days, and started using the Blink kernel in 2013

Opera first used Presto, then abandoned it for WebKit, and then followed Chrome with Blink.

How browsers work

So what happens between typing a URL into the address bar and seeing a beautiful page? Let’s look at a diagram:

The diagram above gives a simplified view of the entire process, from entering a URL to the browser presenting the page. Let’s break it down step by step:

Step 1: When we enter a URL in the address bar and press Enter, the browser creates an HTTP request and, through its network process, asks a DNS (Domain Name System) server to translate the domain name in the URL into the IP address of the target server. The port number is not provided by DNS; it comes from the URL itself or from the protocol default (80 for HTTP, 443 for HTTPS). The browser then establishes a TCP connection to the target server using that IP address and port, and sends the request. The target server constructs an HTTP response based on the request information and returns the requested content to the browser in that response. At this point the request is complete and the browser has what we asked for. Next, the browser parses the response and turns it into a beautiful page.
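To make the request step a little more concrete, here is a minimal sketch of how a URL could be split into the pieces the network step needs, and what the text of a simple HTTP/1.1 GET request looks like. The names parseTarget, buildGetRequest and RequestTarget are hypothetical helpers made up for this illustration, and real browsers handle far more cases (query encoding, credentials, IPv6 hosts, and so on).

```typescript
// Hypothetical shape for the parts of a URL that the network step needs.
interface RequestTarget {
  scheme: string;
  host: string;
  port: number;
  path: string;
}

// Splits an http(s) URL by hand. Assumption: only simple URLs are supported.
function parseTarget(url: string): RequestTarget {
  const match = /^(https?):\/\/([^\/:]+)(?::(\d+))?(\/[^#]*)?(?:#.*)?$/.exec(url);
  if (match === null) {
    throw new Error("Unsupported URL: " + url);
  }
  const scheme = match[1];
  // The port is not looked up via DNS: it is explicit or the protocol default.
  const defaultPort = scheme === "https" ? 443 : 80;
  return {
    scheme: scheme,
    host: match[2],
    port: match[3] ? parseInt(match[3], 10) : defaultPort,
    path: match[4] ? match[4] : "/",
  };
}

// Builds the text of a minimal HTTP/1.1 GET request for the target.
function buildGetRequest(target: RequestTarget): string {
  const defaultPort = target.scheme === "https" ? 443 : 80;
  const hostHeader =
    target.port === defaultPort ? target.host : target.host + ":" + target.port;
  return [
    "GET " + target.path + " HTTP/1.1",
    "Host: " + hostHeader,
    "Connection: close",
    "",
    "",
  ].join("\r\n");
}

const exampleTarget = parseTarget("https://example.com/index.html");
const exampleRequest = buildGetRequest(exampleTarget);
```

Here exampleTarget.port falls back to 443 because the URL names no port, and exampleRequest is the text that would be sent over the TCP connection once it is established.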

What the browser gets in the previous step is actually a piece of HTML code, as shown below:

Once the HTML code is in hand, the browser allocates stack memory to provide an environment for code execution, and assigns a main thread that parses and executes the code line by line. JavaScript is single-threaded, meaning it can only do one thing at a time, which brings in a technical term: pushing onto the stack and popping off the stack. In other words, as the code is processed line by line, the previous line must finish and be popped off the stack, leaving the main thread idle, before the next line can be pushed onto the stack and executed.
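The push and pop behaviour described above can be sketched with a tiny model of a call stack. CallStackSketch is a hypothetical class written only for this illustration; it is not how a real engine is implemented, but it shows that one piece of work must be popped before the next one is pushed.

```typescript
// A toy call stack: each unit of work is pushed, run to completion, then popped.
class CallStackSketch {
  private frames: string[] = [];
  readonly log: string[] = [];

  run(label: string, body: () => void): void {
    this.frames.push(label);
    this.log.push("push " + label + " (depth " + this.frames.length + ")");
    body(); // nothing else can start on this thread until body returns
    this.frames.pop();
    this.log.push("pop " + label + " (depth " + this.frames.length + ")");
  }
}

const stackDemo = new CallStackSketch();
stackDemo.run("line 1", () => {
  // A function called by line 1 sits on top of it in the stack.
  stackDemo.run("helper called by line 1", () => {});
});
// Line 2 is only pushed after line 1 has been popped.
stackDemo.run("line 2", () => {});
```

After this runs, stackDemo.log records "pop line 1 (depth 0)" immediately before "push line 2 (depth 1)", which is the single-threaded ordering the paragraph describes.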

But there is a special case: if the parser encounters a resource request from a link/script/img/audio/video tag, the browser uses a separate thread to fetch that resource, and the main thread continues with the code that follows. Note that the new thread and the main thread do not conflict. JavaScript is single-threaded, but the browser itself is multithreaded. JS code executes step by step on a single thread; when one of these special tags (such as link/script/img) is encountered, the browser hands the request to another thread while the main thread carries on, so parsing is not blocked by slow network responses or similar delays. One caveat: a classic script tag without the async or defer attribute still pauses HTML parsing until the script has been fetched and executed.
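As a rough illustration of this hand-off, the following sketch simulates a parser that passes resource requests to "another thread" and keeps going. Everything here (ParseStep, simulateParse, the timeline strings) is a hypothetical simplification: the network work is modelled as a plain list that is drained afterwards, and the blocking behaviour of classic scripts is deliberately left out.

```typescript
// One parsed tag; src is set only for tags that request a resource.
interface ParseStep {
  tag: string;
  src?: string;
}

// Returns the order of events as seen in this simplified model.
function simulateParse(steps: ParseStep[]): string[] {
  const timeline: string[] = [];
  const inFlight: string[] = []; // stands in for work done on network threads
  for (const step of steps) {
    if (step.src !== undefined) {
      // The main thread only hands the request off; it does not wait for it.
      inFlight.push(step.src);
      timeline.push("main: saw <" + step.tag + ">, handed off " + step.src);
    } else {
      timeline.push("main: parsed <" + step.tag + ">");
    }
  }
  // In this model the downloads complete after the main thread has moved on.
  for (const src of inFlight) {
    timeline.push("network: finished " + src);
  }
  return timeline;
}

const parseTimeline = simulateParse([
  { tag: "h1" },
  { tag: "img", src: "a.png" },
  { tag: "p" },
]);
```

In parseTimeline the paragraph is parsed before the image finishes loading, which mirrors the second server in the restaurant analogy below.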

Here’s an example: suppose a restaurant has only one waitress, who is responsible for both taking orders and serving food. One day, while she is taking table A’s order, the cook tells her that table B’s food is ready to serve. Because one person can only do one thing at a time (the equivalent of JS being single-threaded), table B’s dishes have to wait until table A has finished ordering. Meanwhile, the guests at table B grow impatient: “Why haven’t our dishes arrived? We won’t come back.” The boss immediately decides to hire another server (the equivalent of the browser creating a new thread), so that server A can take table A’s order while server B serves table B’s dishes.

Once the returned code has been processed from top to bottom, a DOM tree is generated.
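To give a feel for how a tree falls out of top-to-bottom parsing, here is a minimal sketch that builds a tree from very simple, well-formed markup using a stack of open elements. DomNodeSketch and buildDomTree are hypothetical names, and the assumptions are strong: lowercase tags only, no attributes, no self-closing tags, no error recovery. A real HTML parser is far more involved.

```typescript
interface DomNodeSketch {
  tag: string; // element name, or "#text" for text nodes
  text: string; // only used by text nodes
  children: DomNodeSketch[];
}

function buildDomTree(html: string): DomNodeSketch {
  const root: DomNodeSketch = { tag: "#document", text: "", children: [] };
  const openElements: DomNodeSketch[] = [root]; // stack of unclosed elements
  // Matches a closing tag, an opening tag, or a run of text.
  const tokenPattern = /<\/([a-z0-9]+)>|<([a-z0-9]+)>|([^<]+)/g;
  let token = tokenPattern.exec(html);
  while (token !== null) {
    const current = openElements[openElements.length - 1];
    if (token[1]) {
      // Closing tag: the innermost open element is finished.
      if (openElements.length > 1) {
        openElements.pop();
      }
    } else if (token[2]) {
      // Opening tag: attach to the current element and descend into it.
      const element: DomNodeSketch = { tag: token[2], text: "", children: [] };
      current.children.push(element);
      openElements.push(element);
    } else if (token[3] && token[3].trim() !== "") {
      current.children.push({ tag: "#text", text: token[3].trim(), children: [] });
    }
    token = tokenPattern.exec(html);
  }
  return root;
}

const domSketch = buildDomTree(
  "<html><body><h1>Hello</h1><p>World</p></body></html>"
);
```

For this input, domSketch has a single html child, whose body child holds an h1 and a p, each containing one text node.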

In the previous step we obtained a DOM tree, but it is a bare, undecorated one. Now we add some decoration to the tree: CSS parsing.

We all know that the pages we visit look so good because of CSS. The CSS on a page is represented by stylesheets called CSSStyleSheet; each is a collection of CSSRule objects, and each rule consists of a selector part and a declaration part. The declaration is a set of key-value pairs of CSS properties and values.
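The structure described above (a stylesheet holding rules, each with a selector and a set of property/value declarations) can be sketched as plain data. The types and the parseStyleSheet function below are hypothetical and only loosely modelled on that description; they are not the browser's CSSStyleSheet API, and the parser assumes flat rules with no comments, at-rules, or nesting.

```typescript
interface DeclarationSketch {
  property: string;
  value: string;
}

interface StyleRuleSketch {
  selector: string;
  declarations: DeclarationSketch[];
}

// Parses "selector { prop: value; ... }" blocks into rule objects.
function parseStyleSheet(css: string): StyleRuleSketch[] {
  const rules: StyleRuleSketch[] = [];
  const rulePattern = /([^{}]+)\{([^{}]*)\}/g;
  let match = rulePattern.exec(css);
  while (match !== null) {
    const declarations: DeclarationSketch[] = [];
    for (const part of match[2].split(";")) {
      const colon = part.indexOf(":");
      if (colon === -1) {
        continue; // skip empty fragments such as a trailing ";"
      }
      declarations.push({
        property: part.slice(0, colon).trim(),
        value: part.slice(colon + 1).trim(),
      });
    }
    rules.push({ selector: match[1].trim(), declarations: declarations });
    match = rulePattern.exec(css);
  }
  return rules;
}

const parsedRules = parseStyleSheet(
  ".text-color { color: red; font-size: 14px } p { margin: 0 }"
);
```

Here parsedRules contains two rules: one for the selector .text-color with two declarations, and one for p with a single margin declaration.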

After CSS parsing is complete, CSS rule matching is performed: for each CSS rule, the browser finds the HTML elements that satisfy the rule’s selector part. For example, the class selector .text-color matches all HTML elements with class="text-color". The declaration part is then applied to those elements, and the same goes for other selectors. The actual rule-matching process also takes into account default and inherited CSS properties and the priority of each rule.

At this point, the CSS rules have been parsed and matched.
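Rule matching with priority can be sketched as follows. This is a simplified illustration under stated assumptions: only single tag, .class and #id selectors are supported, priority is approximated with the weights 1, 10 and 100, and default and inherited properties are not modelled. ElementSketch, MatchRule and computeStyle are hypothetical names.

```typescript
interface ElementSketch {
  tag: string;
  id: string;
  classes: string[];
}

interface MatchRule {
  selector: string;
  declarations: { [property: string]: string };
}

function matchesSelector(el: ElementSketch, selector: string): boolean {
  if (selector.charAt(0) === ".") {
    return el.classes.indexOf(selector.slice(1)) !== -1;
  }
  if (selector.charAt(0) === "#") {
    return el.id === selector.slice(1);
  }
  return el.tag === selector;
}

// Rough priority: id beats class, class beats tag.
function specificityOf(selector: string): number {
  if (selector.charAt(0) === "#") return 100;
  if (selector.charAt(0) === ".") return 10;
  return 1;
}

function computeStyle(
  el: ElementSketch,
  rules: MatchRule[]
): { [property: string]: string } {
  const matched = rules
    .map((rule, index) => ({ rule: rule, index: index }))
    .filter((entry) => matchesSelector(el, entry.rule.selector));
  // Lowest priority first; ties keep source order, so later rules win.
  matched.sort(
    (a, b) =>
      specificityOf(a.rule.selector) - specificityOf(b.rule.selector) ||
      a.index - b.index
  );
  const style: { [property: string]: string } = {};
  for (const entry of matched) {
    const declarations = entry.rule.declarations;
    Object.keys(declarations).forEach((property) => {
      style[property] = declarations[property];
    });
  }
  return style;
}

const paragraphStyle = computeStyle(
  { tag: "p", id: "", classes: ["text-color"] },
  [
    { selector: ".text-color", declarations: { color: "red" } },
    { selector: "p", declarations: { color: "black", margin: "0" } },
  ]
);
```

In this example both rules match the paragraph, and the class rule overrides the tag rule's color, so paragraphStyle ends up with color "red" and margin "0".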

After parsing the CSS in the previous step, it’s time to apply the CSS rules to the HTML elements. First we need to compute the layout, which is the process of arranging elements and calculating geometric information such as the size and position of each element on the page. HTML uses a flow layout model: page elements are traversed in order and placed from left to right and from top to bottom to determine their positions.

Each HTML element corresponds to a box region described by the CSS box model, and elements fall into two basic types, inline and block. Inline elements do not force a line break and are laid out from left to right. A block element means the layout moves down to a new line. Besides laying elements out in normal flow according to these inline and block rules, there are also specially designated layouts, such as absolute/relative/fixed positioning, flex, and float.
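The left-to-right, top-to-bottom flow described above can be sketched with a small function that places fixed-size boxes. This is only an illustration under strong assumptions: every item already knows its width and height, block items take the full container width, inline items wrap when they no longer fit, and margins, padding and positioned or flex layouts are ignored. FlowItem, PlacedBox and layoutFlow are hypothetical names.

```typescript
interface FlowItem {
  label: string;
  display: "inline" | "block";
  width: number;
  height: number;
}

interface PlacedBox {
  label: string;
  x: number;
  y: number;
  width: number;
  height: number;
}

function layoutFlow(items: FlowItem[], containerWidth: number): PlacedBox[] {
  const placed: PlacedBox[] = [];
  let cursorX = 0;
  let cursorY = 0;
  let lineHeight = 0; // height of the tallest inline box on the current line
  for (const item of items) {
    if (item.display === "block") {
      // A block starts on a fresh line and occupies the full width.
      cursorY += lineHeight;
      cursorX = 0;
      lineHeight = 0;
      placed.push({ label: item.label, x: 0, y: cursorY, width: containerWidth, height: item.height });
      cursorY += item.height;
    } else {
      // An inline box continues the current line, wrapping if it does not fit.
      if (cursorX > 0 && cursorX + item.width > containerWidth) {
        cursorY += lineHeight;
        cursorX = 0;
        lineHeight = 0;
      }
      placed.push({ label: item.label, x: cursorX, y: cursorY, width: item.width, height: item.height });
      cursorX += item.width;
      lineHeight = Math.max(lineHeight, item.height);
    }
  }
  return placed;
}

const flowItems: FlowItem[] = [
  { label: "a", display: "inline", width: 40, height: 10 },
  { label: "b", display: "inline", width: 40, height: 12 },
  { label: "c", display: "inline", width: 40, height: 10 },
  { label: "d", display: "block", width: 0, height: 20 },
  { label: "e", display: "inline", width: 30, height: 10 },
];
const flowResult = layoutFlow(flowItems, 100);
```

With a container 100 units wide, a and b share the first line, c wraps to y = 12, the block d starts at y = 22 and spans the full width, and e begins a new line at y = 42.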

After the layout stage, the DOM tree with CSS has become a DOM tree with layout information, and the next step is to render and draw it.

Through the steps above, we have parsed the HTML and CSS and, through layout, obtained a DOM tree with layout information. The next step is to render the content. This is the job of the Paint module, which maps the DOM tree with layout information to a visual image. It traverses the DOM tree and invokes each node’s drawing method to draw its content onto a canvas or bitmap, which is finally presented in the browser’s application window as the actual page the user sees. The size and position of each node were calculated in the preceding layout stage, and the content of each node depends on the corresponding HTML element: text, images, and so on.
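The traversal described above can be sketched as a walk over a tree that already carries layout information, emitting one drawing instruction per node. PaintNode and paintTree are hypothetical names, and the "drawing" here is just a list of strings standing in for real calls onto a canvas or bitmap.

```typescript
interface PaintNode {
  label: string;
  x: number;
  y: number;
  width: number;
  height: number;
  children: PaintNode[];
}

// Visits parents before children, so children are drawn on top of them.
function paintTree(node: PaintNode, displayList: string[] = []): string[] {
  displayList.push(
    "draw " + node.label + " at (" + node.x + "," + node.y + ") size " +
      node.width + "x" + node.height
  );
  for (const child of node.children) {
    paintTree(child, displayList);
  }
  return displayList;
}

const laidOutPage: PaintNode = {
  label: "body",
  x: 0,
  y: 0,
  width: 100,
  height: 60,
  children: [
    { label: "h1", x: 0, y: 0, width: 100, height: 20, children: [] },
    { label: "p", x: 0, y: 20, width: 100, height: 40, children: [] },
  ],
};
const paintCommands = paintTree(laidOutPage);
```

The resulting paintCommands list draws body first, then h1, then p, using the positions and sizes computed during layout.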

In general, layout and painting are quite time-consuming operations. If the whole DOM tree had to be laid out and painted again after every slight change (known as reflow and repaint), it would be quite inefficient. Therefore, browser kernels generally implement incremental layout and incremental painting. When the content or style of a DOM tree node changes, the kernel determines the scope of the change, marks the other nodes affected by it during the layout phase, and marks a dirty region during the painting phase and notifies the system to repaint it. In daily development, we should therefore minimize reflow and repaint to improve the performance of our programs.
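The idea of marking affected nodes can be illustrated with a simple dirty-flag sketch. It assumes one particular policy, chosen here only for illustration: a change marks the node and all of its ancestors as dirty, and the next layout pass descends only into dirty nodes. TrackedNode, addNode, markDirty and relayoutDirty are hypothetical names, and real engines use more refined rules for deciding what is affected.

```typescript
interface TrackedNode {
  label: string;
  dirty: boolean;
  parent: TrackedNode | null;
  children: TrackedNode[];
}

function addNode(label: string, parentNode: TrackedNode | null): TrackedNode {
  const node: TrackedNode = { label: label, dirty: false, parent: parentNode, children: [] };
  if (parentNode !== null) {
    parentNode.children.push(node);
  }
  return node;
}

// Marks the changed node and its ancestors; stops early at an already dirty one.
function markDirty(node: TrackedNode): void {
  let current: TrackedNode | null = node;
  while (current !== null && !current.dirty) {
    current.dirty = true;
    current = current.parent;
  }
}

// Re-lays out only dirty nodes and returns the labels that were visited.
function relayoutDirty(node: TrackedNode, visited: string[] = []): string[] {
  if (!node.dirty) {
    return visited; // clean subtree: skip it entirely
  }
  visited.push(node.label);
  node.dirty = false;
  for (const child of node.children) {
    relayoutDirty(child, visited);
  }
  return visited;
}

const pageRoot = addNode("root", null);
const headerNode = addNode("header", pageRoot);
const mainNode = addNode("main", pageRoot);
const articleNode = addNode("article", mainNode);
const sidebarNode = addNode("sidebar", mainNode);

markDirty(articleNode);
const firstPass = relayoutDirty(pageRoot);
const secondPass = relayoutDirty(pageRoot);
```

After the article changes, firstPass visits only root, main and article, leaving header and sidebar untouched, and secondPass visits nothing because no node is dirty any more.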

From resource download to final page presentation, the rendering process can be understood simply as a linear chain of transformations. The original input is a URL, and the final output is a page bitmap; along the way the data passes through the Loader, Parser, Layout, and Paint modules in turn.
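Viewed this way, the whole process resembles a chain of functions in which each stage consumes the previous stage's output. The sketch below only illustrates that shape: the stage functions are hypothetical stubs that wrap a string, not real loader, parser, layout or paint implementations.

```typescript
// Each stage takes the previous stage's output and produces the next form.
type StageSketch = (input: string) => string;

const loaderStage: StageSketch = (url) => "bytes(" + url + ")";
const parserStage: StageSketch = (bytes) => "domTree(" + bytes + ")";
const layoutStage: StageSketch = (tree) => "layoutTree(" + tree + ")";
const paintStage: StageSketch = (tree) => "bitmap(" + tree + ")";

// Runs the stages in order, feeding each result into the next stage.
function runPipeline(url: string, stages: StageSketch[]): string {
  return stages.reduce((value, stage) => stage(value), url);
}

const pipelineResult = runPipeline("https://example.com/", [
  loaderStage,
  parserStage,
  layoutStage,
  paintStage,
]);
```

The nesting in pipelineResult, bitmap(layoutTree(domTree(bytes(...)))), reflects the order in which the modules transform the input.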

That’s how browsers work.