Foreword

I have wanted to write a summary article on browser rendering for a long time. I have read many articles about the browser's rendering mechanism, but they always felt slightly unsatisfying: none of them strings the whole process together from start to finish. So in this article I intend to connect principle and practice, and explain the meaning behind each step from both perspectives.

This article is fairly heavy on theory, so if you have questions about any section, let's discuss them in the comments.

Processes and threads

  • A process is the smallest unit of operating-system resource allocation; a process contains one or more threads.
  • Threads are managed by their process, and browsers use a multi-process model.

In daily use, we visit websites in browser tabs. If one tab crashes, it has no impact on the other tabs; in fact, each tab is a separate process.

They are independent of each other.


Processes in the browser are divided into the following five categories:

  1. Browser process: you can think of the browser process as a unified “scheduler” that coordinates the other processes. For example, when we type a URL in the address bar, the browser process first calls the network process. It also handles subprocess management and some storage.

  2. Renderer process: this is the most important process for us. Each tab has its own renderer process, whose main job is to render the page.

  3. Network process: this process handles requests for network resources and hands the responses to the renderer process for rendering.

  4. GPU process: this process can call on the hardware for drawing, achieving rendering acceleration. CSS3 properties such as translate3d trigger the GPU process, enabling hardware acceleration.

  5. Plugin process: each plugin in Chrome also runs in a separate process.

The processes are independent and do not affect one another. With this general picture of the browser's five process types and the role each plays, let's move on.

From URL input to page display: what happens in between

I’m sure every qualified front-end engineer knows this question already, and there are plenty of answers on the web.

But most answers treat the problem superficially, so I'll try to explain the process from every angle.

Network Resource Layer

First of all, let’s take a look at the lifecycle of a normal URL entry in terms of resource loading.

When we enter a URL in the address bar, the browser process handles the interaction. It then assigns a renderer process to prepare the page, while calling the network process to load the resources.

After the network process finishes loading the resources, it hands them to the renderer process for page rendering. That is the whole loading flow from the process perspective.

The big picture: the browser process schedules, the network process loads the resources, and the renderer process renders what was loaded.

Let’s take a closer look at what happens during the request after the URL is entered.

The seven-layer network model (OSI)

Let’s take a look at this graph for a moment:

If you are not familiar with it, take a look at the seven layers a network request passes through.

A detailed description of each of them can be found here

These seven layers can be grouped into the following four:

  • The application, presentation, and session layers are usually grouped together as the application layer, whose main protocol is HTTP.

  • At the transport layer, the browser's HTTP traffic is carried over TCP. (The common transport protocols are TCP and UDP.)

  • The network layer generally uses the IP protocol.

  • The data link layer and the physical layer are grouped together as the physical layer.

Let's look at how the browser loads a URL from the perspective of these layers.

First, when we enter a domain name in the address bar, the browser looks for the requested resource in the memory/disk cache to see whether it hits the cache.

If a strong cache is hit, the cached resource is returned directly and none of the following steps happen.

We will ignore the effects of caching for the moment, but we will discuss negotiated caching and strong caching in more detail in the following sections.

Suppose we are visiting the page for the first time and there is no cache:

If the domain name we are accessing has not been resolved yet, the address-bar input must first be resolved. Domain-name resolution relies on the DNS protocol to turn a domain name into an IP address, the server's real address.

You can think of DNS as a mapping table from domain names to IP addresses. In reality it is a distributed database that lets us look up the IP address for a domain name.

Note that DNS resolution is based on UDP, not TCP.

Here’s a quick question: why is DNS resolution based on UDP and not TCP?

DNS resolution is a lookup across servers. Because domain names are hierarchical (top-level, second-level, and so on), each level is queried iteratively. If every one of those queries ran over TCP, the resolver would have to do a three-way handshake for each query. UDP doesn't need that: it just sends the packet and waits for the reply.
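The iterative lookup described above can be sketched in a few lines. This is a toy model, not a real resolver: the zone table, server names, and the IP address (from the RFC 5737 documentation range) are all made up for illustration.

```javascript
// Sketch of iterative DNS resolution over a hypothetical zone table.
// A real resolver queries root, TLD, and authoritative servers over UDP;
// here each "server" is just a lookup in a nested object.
const zones = {
  ".": { "com.": "tld-server" },
  "tld-server": { "taobao.com.": "auth-server" },
  "auth-server": { "www.taobao.com.": "203.0.113.7" }, // made-up example IP
};

function resolveIteratively(fqdn) {
  // Walk the hierarchy: root -> TLD -> authoritative.
  let server = ".";
  const labels = fqdn.split(".").filter(Boolean);
  for (let i = labels.length - 1; i >= 0; i--) {
    const name = labels.slice(i).join(".") + ".";
    const next = zones[server][name];
    if (!next) throw new Error("NXDOMAIN: " + name);
    if (i === 0) return next; // the deepest answer is the IP address
    server = next;            // otherwise it is a referral to the next server
  }
}

console.log(resolveIteratively("www.taobao.com")); // "203.0.113.7"
```

Each hop in the loop corresponds to one round trip, which is exactly why doing a TCP handshake per hop would be so wasteful.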

TCP is more reliable than UDP (because of the three-way handshake on connect and the four-way handshake on teardown), but this also makes it consume more time than UDP.

UDP is commonly used for video and live streaming. For DNS resolution, UDP is preferred because of its speed; and even if a packet is lost, the query can simply be sent again.

TCP transmission is segmented: the data is divided into multiple packets, which are sent one by one, each after the previous one is acknowledged. The obvious benefit of this approach is reliability. The cost is that it is not as fast as UDP, which communicates directly without establishing a connection.

At this point, DNS resolution has turned the domain name into the corresponding IP address.

Once we have the IP address, we can use it to reach the server.

If the request address is https, an extra step is added before data is sent: an SSL negotiation to ensure data security.

When IP addressing succeeds, the browser knows the server's address, but it does not necessarily send the data immediately: the request may be queued first. For example, if there are many requests to one domain, HTTP/1.1 allows at most 6 TCP connections per origin, that is, at most 6 requests in flight at once, so the rest wait in a queue.

When its turn in the queue comes, the request is sent. TCP first establishes the connection with a three-way handshake, and only then transfers data.
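The per-origin connection limit and its queue can be sketched as a tiny concurrency limiter. This is a minimal model of the queueing behavior, not the browser's actual implementation; `fakeRequest` and the 10 ms latency are made up.

```javascript
// Sketch: at most 6 "connections" run at once; extra requests wait in a
// queue, which is the Queueing phase you see in DevTools.
const MAX_CONNECTIONS = 6;

function makeQueue(limit) {
  let active = 0;
  const waiting = [];
  return function enqueue(task) {
    return new Promise((resolve) => {
      const run = async () => {
        active++;
        resolve(await task());
        active--;
        if (waiting.length > 0) waiting.shift()(); // wake the next request
      };
      if (active < limit) run();
      else waiting.push(run); // all 6 slots busy: queue it
    });
  };
}

const enqueue = makeQueue(MAX_CONNECTIONS);
let peak = 0;
let current = 0;
const fakeRequest = (id) => async () => {
  current++;
  peak = Math.max(peak, current);
  await new Promise((r) => setTimeout(r, 10)); // simulated network latency
  current--;
  return id;
};

// 8 requests to one origin: the 7th and 8th must wait for a free slot.
const allDone = Promise.all(
  [...Array(8).keys()].map((i) => enqueue(fakeRequest(i)))
).then((ids) => {
  console.log("finished", ids.length, "requests; peak concurrency:", peak);
  return peak;
});
```

The peak concurrency never exceeds 6, mirroring how the 7th request to the same origin sits in Queueing until a connection frees up.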

We've already said that TCP is segmented: a very large payload is broken into multiple packets that are transmitted in sequence.

If packet loss occurs during TCP transmission, TCP resends packets.

If you're interested, think about why establishing a TCP connection takes a three-way handshake while closing one takes four.

The server receives the packets in sequence.
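The segmentation and in-order delivery described above can be sketched as follows. This is a toy illustration of the idea, with a made-up sample message; real TCP tracks byte offsets, acknowledgements, and retransmission timers.

```javascript
// Sketch: TCP-style segmentation and in-order reassembly.
// A large payload is split into numbered segments; even if segments
// arrive out of order, the receiver reorders them by sequence number.
function segment(payload, size) {
  const segments = [];
  for (let seq = 0; seq * size < payload.length; seq++) {
    segments.push({ seq, data: payload.slice(seq * size, (seq + 1) * size) });
  }
  return segments;
}

function reassemble(segments) {
  return segments
    .slice()
    .sort((a, b) => a.seq - b.seq) // put segments back in sequence order
    .map((s) => s.data)
    .join("");
}

const message = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>...</html>";
const outOfOrder = segment(message, 10).reverse(); // simulate reordering in transit
console.log(reassemble(outOfOrder) === message); // true
```

The sequence numbers are also what make retransmission work: if segment 3 never arrives, the receiver knows exactly which piece to ask for again.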

After TCP establishes the connection, the browser sends the requested data through an HTTP request.

An HTTP request contains:

  • The request line
  • Request header
  • Request body

Connection: keep-alive is enabled by default in HTTP/1.1. It allows the previous TCP connection to be reused for subsequent requests within a certain window, instead of re-establishing a connection each time (that is, the TCP connection to the same domain is kept open for a while).

The server then receives the request data and parses the request line, request headers, and request body. After processing, it returns a response line, response headers, and a response body.
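The three-part structure the server parses can be made concrete with a tiny parser. This is a simplified sketch (the sample request, its path, and its body are invented); real servers also handle folded headers, chunked bodies, and much more.

```javascript
// Sketch: split a raw HTTP/1.1 request into request line, headers, and body.
// Headers end at the first blank line (\r\n\r\n); everything after is the body.
function parseRequest(raw) {
  const [head, body = ""] = raw.split("\r\n\r\n");
  const [requestLine, ...headerLines] = head.split("\r\n");
  const [method, path, version] = requestLine.split(" ");
  const headers = {};
  for (const line of headerLines) {
    const i = line.indexOf(":");
    headers[line.slice(0, i).trim().toLowerCase()] = line.slice(i + 1).trim();
  }
  return { method, path, version, headers, body };
}

const raw =
  "POST /cart HTTP/1.1\r\n" +
  "Host: www.taobao.com\r\n" +
  "Connection: keep-alive\r\n" +
  "\r\n" +
  "itemId=42";

const req = parseRequest(raw);
console.log(req.method, req.headers["connection"], req.body);
// POST keep-alive itemId=42
```

The response the server sends back has exactly the same shape: a status line instead of a request line, then headers, a blank line, and the body.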

Note: some status codes returned by the server deserve special attention:

  • 301/302: both represent redirects. When either is returned, the browser repeats the whole sequence of operations above against the URL given in the Location response header.

  • 304: the resource has not changed, so the browser does not download it again and serves it from its cache instead.
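The redirect-following behavior behind 301/302 can be sketched against a hypothetical response table (the statuses below anticipate the taobao.com chain examined next; no real network requests are made).

```javascript
// Sketch: how a browser follows 301/302 redirects via the Location header.
// The response table is made up for illustration.
const responses = {
  "http://taobao.com/":      { status: 302, location: "http://www.taobao.com/" },
  "http://www.taobao.com/":  { status: 301, location: "https://www.taobao.com/" },
  "https://www.taobao.com/": { status: 200, body: "<html>...</html>" },
};

function fetchFollowingRedirects(url, maxHops = 10) {
  const chain = [url];
  let res = responses[url];
  while ((res.status === 301 || res.status === 302) && chain.length <= maxHops) {
    url = res.location; // repeat the whole request sequence against Location
    chain.push(url);
    res = responses[url];
  }
  return { chain, status: res.status };
}

const { chain, status } = fetchFollowingRedirects("http://taobao.com/");
console.log(chain.join(" -> "), status);
```

Note that every hop in `chain` costs a fresh DNS resolution and TCP handshake in the real world, which is why redirects are worth eliminating.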

That, at a basic level, is how a browser turns a URL into a network request and response.

Taking taobao.com as an example

Having said so much dry theory, let’s try it in practice.

First we open a new browser tab and type taobao.com in the address bar.

Since this is my first visit to the page, there is no cache. As mentioned earlier, the browser process starts a renderer process for the page, and the network process starts the request.

First let’s open the Chrome Developer Tool:

If you're interested, try it yourself: when we type http://taobao.com/, the browser performs DNS resolution and the TCP three-way handshake, then sends the request. When the response comes back with status 302, the browser redirects to http://www.taobao.com/ based on the returned Location header.

The request to http://www.taobao.com/ in turn returns a 301 status code and redirects us to https://www.taobao.com/.

Each redirect then repeats the DNS resolution and TCP connection steps.

I recommend doing this in a new incognito window, which removes the DNS cache and other browser caching interference so the results are cleaner.

From this glimpse at redirected domain access, we can see that each redirect's DNS resolution and TCP connection setup is very time-consuming. In real projects, we should avoid resource redirects wherever possible, and when a resource does redirect, replace the reference with the new address.

Let's take the third request, https://www.taobao.com/, as an example and analyze the stages of a request (without any caching):

Analyze the meaning of a complete waterfall diagram request

Let’s take a look at the corresponding waterfall in Chrome:

  • Queueing: the browser queues a request in the following cases:

    • A request with higher priority exists.
    • Six TCP connections are already open to this origin, which is the limit (applies to HTTP/1.0 and HTTP/1.1 only).
    • The browser is briefly allocating space in the disk cache.
  • Stalled: the request is stalled, for any of the reasons described under Queueing above (for example, TCP connection reuse or proxy negotiation after the connection is initiated).

  • DNS Lookup: the time spent resolving the request's domain name to an IP address.

  • Initial Connection: the time spent establishing the TCP connection (including retries) and starting the SSL negotiation.

  • SSL: the time spent negotiating SSL when requesting an HTTPS domain.

  • Request sent: the time spent sending the request.

  • TTFB: Time To First Byte. This includes one round-trip delay plus the time the server takes to prepare the response. In layman's terms, it is the time from sending the request to receiving the first byte of the response.

TTFB is usually a rough indication of how long the server (the backend) takes to process the request, from receiving it to returning the response.

  • Content Download: needless to say, the time spent downloading the response body.

Under HTTP/1.1, Chrome supports at most 6 concurrent TCP connections to the same domain. Note that this limit is on TCP connections, not HTTP requests.

Since pipelining was introduced in HTTP/1.1, clients can send multiple requests without waiting, which further improves the protocol's efficiency.

A small example: on the same TCP connection, we would normally send request A, wait for the server's response, and only then send request B. The pipelining mechanism lets the browser send A and B at the same time, but the server must still respond to A and B in that order.

Of course, careful readers will notice that when we visit taobao.com, almost all static resources are served over HTTP/2. So let's briefly walk through the stages of HTTP's evolution:

The stages of HTTP's development

  • HTTP/0.9 supported only HTML transfer, with no request headers.

  • HTTP/1.0 introduced request and response headers, so the receiver can distinguish images, HTML, or JS based on the headers.

  • HTTP/1.0 established a TCP connection for every request and closed it when the request completed, which is obviously time-consuming. HTTP/1.1 therefore enables the Connection: keep-alive request header by default so a TCP connection can be reused. Even with keep-alive, though, HTTP/1.1 still sends one request and waits for its response before sending the next. Pipelining was proposed to address this, but because the server still handles requests on the same TCP connection one by one, it leads to a serious problem: head-of-line blocking.

If the first request loses a packet, the server waits for it to be retransmitted and processed before handling the next request, even when the browser pipelines multiple requests at once.

  • HTTP/2.0 proposed many optimizations, the most famous of which solves HTTP/1.1's head-of-line blocking problem:
    • Multiplexing: over the same TCP connection, the binary framing layer lets multiple requests be sent at the same time, and the server can likewise process and return responses in any order rather than strictly in request order. That resolves HTTP/1.1's head-of-line blocking.
    • Header compression: in HTTP/2, request headers are compressed to improve transmission performance.
    • Server push: in HTTP/2, the server can proactively push resources to the client, so the browser can download and cache them in advance.
  • HTTP/3.0: under TCP, a lost packet forces everything behind it to wait, so TCP-level head-of-line blocking remains even in HTTP/2. HTTP/3 solves this once and for all by building on UDP instead, adding the QUIC protocol layer on top.

I haven’t done much research on HTTP 3.0 and 2.0, so I won’t do a detailed comparison. If you have more detailed suggestions, please leave them in the comments section. I will add this content if necessary later.

On HTTP/1.1 pipelining and HTTP/2.0 multiplexing

In fact, this question puzzled me for a long time: what exactly is the difference between them? Until one day I saw this answer on StackOverflow:

  • HTTP/1.1 without pipelining: Each HTTP request on a TCP connection must be responded to before the next request can be made.

  • HTTP/1.1 with Pipelining: Each HTTP request on a TCP connection can be sent immediately without waiting for a response from the previous request. The responses will be returned in the same order.

  • HTTP/2 multiplexing: Each HTTP request over a TCP connection can be sent immediately without waiting for a previous response to return. Responses can be returned in any order.
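The three models differ mainly in the order responses can come back. The toy simulation below makes that visible; the requests, their ids, and processing times are invented, and "server time" stands in for real network behavior.

```javascript
// Sketch: response ordering under pipelining vs multiplexing.
// Each "request" has a server processing time; we look at the order
// in which responses come back.
const requests = [
  { id: "A", time: 30 }, // A is slow to process
  { id: "B", time: 10 }, // B is fast
];

// HTTP/1.1 with pipelining: both requests are sent at once, but responses
// must return in request order, so fast B still waits behind slow A.
function pipelined(reqs) {
  return reqs.map((r) => r.id); // order is pinned to the request order
}

// HTTP/2 multiplexing: responses return as soon as each one is ready.
function multiplexed(reqs) {
  return reqs.slice().sort((a, b) => a.time - b.time).map((r) => r.id);
}

console.log(pipelined(requests));   // ["A", "B"]
console.log(multiplexed(requests)); // ["B", "A"] - B is no longer blocked by A
```

That reordering freedom is exactly what "solving head-of-line blocking at the HTTP layer" means in practice.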

Browser rendering

First let's take a look at a rough diagram of the browser's loading process.

Roughly speaking, this is what a browser's rendering process looks like, but there are many details involved: the loading order of files, what blocks what, the critical rendering path (CRP), and so on…

Let’s peel back the veil of browser rendering layer by layer.

How CSS and JS affect the DOM

You can see CSS’s detailed analysis of JS and DOM builds here.

Does CSS block the DOM?

Let’s start by looking at how CSS affects the DOM:

  1. CSS loading does not block the construction of the DOM.
  2. CSS loading does block the rendering of the DOM nodes that come after it.

To understand these two sentences, let’s take a look at the content of this Demo:
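The demo itself isn't reproduced in this text, so here is a minimal reconstruction of what it might look like. The slow.css URL is a placeholder for any stylesheet served slowly; the listener is registered before the stylesheet link, because an inline script placed after a pending stylesheet would itself be blocked waiting for it.

```html
<!DOCTYPE html>
<html>
<head>
  <script>
    // Fires when the DOM tree is built - before slow.css has loaded.
    document.addEventListener('DOMContentLoaded', () => {
      console.log(document.getElementById('app')); // the node already exists
    });
  </script>
  <!-- placeholder URL: any slow-loading stylesheet shows the effect -->
  <link rel="stylesheet" href="https://example.com/slow.css" />
</head>
<body>
  <div id="app">hello</div>
</body>
</html>
```

With network throttling on, the log appears while the page is still blank, and the content paints only once the CSS arrives: DOM construction was not blocked, but rendering was.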

First, set the network speed in Chrome DevTools to a slow preset. You will find that the console prints the node with id=app first, but the page renders nothing at that point; the content is painted only after the CSS finishes loading.

This means that loading CSS does not block DOM tree construction, but loading and parsing a CSS file does block page rendering.

Does JS block the DOM?

There is no doubt here: JS execution blocks both DOM tree construction and CSSOM building. The async and defer attributes are the two special cases, which I'm sure you're already familiar with, so I won't go into them here.

The principle to hold on to is that, inside the renderer process, the JS thread and the render thread are mutually exclusive.

Why CSS goes at the top and JS at the bottom

Now that we're clear about how JS and CSS block, let's look at a classic interview question: why put CSS at the top and JS at the bottom?

Why CSS goes at the top

As we mentioned earlier, CSS loading and parsing does not block Dom building, but it does block rendering of subsequent elements on the page. As a result, if CSS is placed at the top, subsequent Dom elements will be rendered only after the CSS code has been parsed.

Some of you might think: if I put the CSS at the bottom instead, the DOM elements render first, then the page is repainted once the styles are parsed. Wouldn't the page appear faster to the user?

Let’s take a look at this code:
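The code being profiled isn't included in this text; a plausible counterpart, with the stylesheet moved to the bottom and a placeholder slow.css URL, might be:

```html
<!-- Same kind of page, stylesheet moved to the end of <body>.
     The browser paints the unstyled content first, then paints again
     once slow.css loads: two renders, the first one wasted. -->
<!DOCTYPE html>
<html>
<body>
  <div id="app">hello</div>
  <!-- placeholder URL: any slow stylesheet shows the effect -->
  <link rel="stylesheet" href="https://example.com/slow.css" />
</body>
</html>
```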

We can see that putting CSS at the bottom does produce two renders, but the first render, without any style, is actually an “invalid render”.

Meanwhile, let's compare the overhead: one render with CSS at the top versus two renders with CSS at the bottom.

Using Chrome Performance to profile the version with CSS at the bottom, we find that the browser really does paint the elements twice. In other words, putting CSS at the bottom causes a repaint (and possibly a reflow), which is a very time-consuming process.

We'll talk later about how repaint and reflow can be avoided.

So with CSS at the top:

the browser paints the page only once, with no unnecessary repaint or reflow steps.

Why JS needs to go at the bottom

We said that JS blocks DOM tree building and rendering, and that JS executes only after the CSS files before it have loaded.

Note that while parsing the HTML, the browser preloads all external resources ahead of time. Put simply, CSS and JS external resources can be requested over the network in parallel.

Without further ado, let’s also use performance to look at code like this:
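The exact code profiled here isn't included in this text; a hypothetical version, with a placeholder slow.js URL, could be:

```html
<!-- A synchronous external script (no async/defer) placed before the
     content. HTML parsing stops at the <script>, so #app is neither
     built nor painted until slow.js has downloaded and executed. -->
<!DOCTYPE html>
<html>
<body>
  <!-- placeholder URL: a slow synchronous script -->
  <script src="https://example.com/slow.js"></script>
  <div id="app">hello</div>
</body>
</html>
```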

With the JS placed before the elements, none of the subsequent elements are built or rendered until the JS finishes executing; the rendering thread resumes building the DOM tree and painting the page only after the script has loaded and run.

JS blocks HTML parsing and rendering, and note also that JS execution must wait for the preceding CSS to load and be applied. This guarantees that the JS can read and manipulate the styles.

So if JS follows CSS, the CSS loading can also indirectly block the DCL (DOMContentLoaded) event.

Of course, we will discuss defer and Async in detail in the future.

Here's a bonus point: before the page finishes parsing the HTML, the browser scans ahead for external resources and hands them to the network process for download, so CSS and JS downloads happen in parallel.

That's why we put JS at the bottom: scripts at the bottom let the page finish rendering first, and only then block to execute.

Illustrated: how CSS and JS load

  • CSS loading and execution blocks subsequent JS execution; at the same time, CSS loading blocks the rendering of the page.
  • CSS loading may block subsequent DOM parsing, depending on whether JS follows it.
  • JS loading and parsing block the DOM parsing that follows.

Afterword

I originally planned to go through the whole browser rendering process and then continue on to performance optimization, but halfway through I realized that the topics of performance and rendering form a huge branch of their own.

After much deliberation, this article sticks to combing out a complete, general picture of the rendering process; separate articles will split the different details out and break them down one by one.

If you are interested in any part of browser rendering, or have questions about any point in this article, please leave me a comment in the comments section.