This is the eighth part of this series. The previous part is Web Performance Metrics; the next part covers the page rendering process.
There are plenty of articles on the Internet about how a page gets rendered, so you can go and search for them. It is also a classic interview question, for example:
- What happens when you go from URL input to page presentation?
- O&m development answer: What happens from entering the URL to displaying the page
- What happens when we type a URL into the browser?
- What happens after the browser enters the URL and returns (Hyperdetailed version)
- What-happens-when
- What happens when the browser enters the URL?
- Common interview question: What happens after a browser enters a URL
- What happens from the time you enter the URL until the page loads?
Many more could be listed; some are brief and some very detailed. The full story of how a page is rendered could not be covered even in several books, and every summary has its own emphasis and focus, so it pays to read several of them as complements. I also recommend writing your own article about how pages render: it helps you organize the knowledge and fill in the gaps.
Before discussing performance optimization strategies for Web pages, it is worth walking through the process from navigation to rendering on screen, understanding at which stages of that process performance problems occur, and then proposing corresponding optimizations for those problems.
Networking is a vast topic. With some trepidation I will try to walk through the process, using the page loading flow as the theme of this series and covering the core knowledge and performance-related content as much as possible 😇.
In addition, this article refers to some books, and the source of the non-original pictures has been indicated. For more details, I suggest you read these books:
- TCP/IP Volume 1: Protocols
- How the Web Is Connected
- HTTP: The Definitive Guide
- Illustrated HTTP
Prerequisite knowledge
To understand the network loading process of a page, we need to know how the network communicates, including communication objects and communication protocols.
Client & server
Communication objects refer to computers connected to the Internet, which can be divided into clients and servers.
- Client: the device the user uses to access the network and the applications on that device, such as the browser on a PC or an app on a mobile phone
- Server: stores web pages and applications; servers can be divided into proxy servers and origin servers (an origin server is defined relative to a proxy)
The process of a client downloading a web page from a server can be illustrated in the following figure.
Client and server
Network Protocol & Protocol layering
Communication between a client and a server over a network requires a prior agreement to determine how data should be encapsulated, addressed, transmitted, routed, and received at the destination. This agreement is called a protocol.
These protocols are implemented across devices from different manufacturers and operating systems; as long as two devices support the same protocol and follow it when communicating, they can interoperate.
Just as the W3C Performance Working Group, the Web performance standards, and the major browser vendors work together, we need standards bodies to write communication protocol specifications that can be implemented by computers from different vendors, so that computers around the world can talk to each other.
OSI seven layer reference model
In 1984, the International Organization for Standardization (ISO) published an international standard called Open Systems Interconnection (OSI) as a reference model for communication protocol design; it is usually called the OSI reference model.
OSI seven-layer communication model
The OSI reference model divides the communication protocol into seven layers and defines the roles of each layer:
The application layer
Provides services to the application and specifies the details related to communication within the application. Protocols include file transfer, email, and remote login (virtual terminal).
The presentation layer
To convert information processed by an application into a format suitable for network transmission, or to convert data from the next layer into a format that the upper layer can process. Therefore, it is mainly responsible for data format conversion.
Specifically, it is to convert the data format inherent to the device into the network standard transmission format. Different devices may interpret the same bit stream differently. Thus, keeping them consistent is the main role of this layer.
The session layer
Responsible for establishing and disconnecting communication connections (logical paths through which data flows), as well as data segmentation and other data transmission-related management.
The transport layer (TCP)
Provides reliable data transfer. It is processed only on the communicating end nodes, not on routers.
The network layer (IP)
Transfers data to the destination address, which may belong to another network reached through routers. This layer is therefore responsible for addressing and routing.
Data link layer
Responsible for the communication and transmission of interconnected nodes on the physical plane. For example, communication between two nodes connected to an Ethernet. The sequence of 0 and 1 is divided into meaningful data frames and sent to the peer end (data frame generation and reception).
The physical layer
Responsible for converting between the 0/1 bit stream and physical signals such as voltage levels and light pulses.
TCP/IP four-layer model
At the same time, the Internet Engineering Task Force (IETF) developed the TCP/IP protocol. Like Web performance standards, TCP/IP has a standardized process:
- Draft stage
- Proposed standard stage
- Draft standard stage
- The standard stage
The protocols were eventually standardized, documented in RFC (Request For Comment) documents and published on the Internet. RFC not only records the protocol specification content, but also contains the implementation and application of the protocol and experimental information.
All communication protocol documents can be found at rfc-editor.org. A file named rfc-index.txt contains an overview of all RFCs and their corresponding protocol numbers, and can be retrieved from the home page. Here I have compiled some of the protocol documents we use most often.
Table: RFC protocol documents
Protocol | RFC |
---|---|
HTTP/1.1 | RFC 2616 – HTTP/1.1; RFC 7230 – HTTP/1.1 Message Syntax and Routing; RFC 7231 – HTTP/1.1 Semantics and Content; RFC 7232 – HTTP/1.1 Conditional Requests; RFC 7233 – HTTP/1.1 Range Requests; RFC 7234 – HTTP/1.1 Caching; RFC 7235 – HTTP/1.1 Authentication |
HTTP 2 | RFC 7540 – HTTP/2 RFC 7541 – HPACK: Header Compression for HTTP/2 |
DNS | RFC 6895 – Domain Name System (DNS) |
TLS | RFC 4346 – TLS |
TCP | RFC 793 – TCP |
UDP | RFC 768 – UDP |
IP | RFC 791 – IP RFC 8200 – Internet Protocol, Version 6 (IPv6) Specification |
If the content of a protocol specification is extended, a new numbered RFC document is recorded, and the new RFC document specifies which existing RFC is extended. If an existing protocol specification is modified, a new RFC document is issued and the old RFC is invalid. The new RFC document specifies which existing RFC is to be invalid.
I will not walk through each protocol here the way the Web performance standards were, but you can click through and get a sense of what a protocol defines just by scanning its table of contents.
Looking at the actual content of the standards produced by the IETF, you will also notice that these protocols put great weight on practicality: they can be implemented directly from what the specification defines.
The TCP/IP protocol suite is widely used because it is so practical, whereas the OSI reference model only roughly defines the role of each layer without specifying protocols and interfaces in detail; it is therefore only a reference model and does not provide an implementable scheme.
Nevertheless, many communication protocols were designed with the OSI reference model in mind and can be mapped to one of its seven layers. We can map the various protocols of the TCP/IP suite onto the OSI reference model, and when learning a protocol, knowing which layer it belongs to tells you its position and role within the overall communication stack.
Page loading process
The communication model for loading a page over HTTP is structured as shown in the figure below. The description of the page loading process in this article proceeds layer by layer, using the OSI reference model's classification of communication functions and the functions defined by TCP/IP to deepen the understanding of the page loading process.
The OSI model is very fine-grained, with each layer independent, but in practice such fully independent layering is not achievable. As you can see, the application layer, presentation layer, and session layer are all implemented inside the application (you can find the implementation in the Chrome source code); they are not implemented in complete isolation.
The figure also shows a five-layer model: in the TCP/IP four-layer model, the network interface layer is often split into the data link layer and the physical layer.
Communication model and architecture implementation
There are many ways to trigger a request: typing in the address bar, tags such as img and video in HTML, requests initiated from script (including beacons), and so on. In every case there is eventually a request URL.
For HTML documents, you typically start with input in the address bar. The browser first parses the URL to determine which application layer protocol is used to initiate the request. If HTTP is used, the browser searches the local cache and then decides whether to initiate the network request.
Let’s start the page load journey with URL parsing.
Parsing the URL
The browser’s first job is to parse the URL. Once the URL is parsed, the browser knows which resource to access and where it is located.
Information in the URL
The first thing you need to know is the protocol type of the URL, which determines what type of request will be created, for example:
- http:// uses HTTP to access a Web server
- file:// reads a local file on the client
- ftp:// uses the FTP protocol to access an FTP server and upload or download files
The URL format varies with the protocol type, so browsers parse URLs according to the protocol type.
Various formats of URLs, from How the Web Is Connected
According to this rule, we can obtain this information from the URL of the HTTP request:
- Protocol type: HTTP or HTTPS defines communication rules between the Web client and Web server
- Server domain name: Obtain the IP address of the target server through domain name resolution
- Port number: The number used to identify the server program to connect to
- File pathname: the location of the resource on the server
The communication rules defined by the protocol are applied to the application layer to prepare HTTP packets and receive and parse HTTP packets. The IP address and port number obtained through the server domain name are used in the TCP/IP layer connection establishment phase, and the file path is used to search for resources after the request reaches the server.
Internal redirection (HTTP > HTTPS)
The HTTP Strict Transport Security (HSTS) specification defines a security mechanism: when a site is accessed over https://, the server can add strict-transport-security to the response header, telling the browser to remember this setting; later, when the site is accessed via http://, the browser automatically replaces HTTP with HTTPS.
// max-age is the expiration time in seconds
strict-transport-security: max-age=86400
So when we visit www.taobao.com/ and URL parsing finds that the HTTP protocol is being used, the browser checks whether it stored strict-transport-security information during a previous visit to www.taobao.com/. If that record exists and has not expired, no network request is sent to the server; the browser instead redirects internally to the https:// address.
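To make the server side of this concrete, here is a minimal sketch (assuming a Node.js environment and TypeScript; the port, max-age, and response body are arbitrary illustration values, not taken from Taobao) of a server that attaches the strict-transport-security header to its responses:

```ts
import * as http from "node:http";

// Minimal sketch: in practice this header should only be sent over HTTPS,
// but the header itself looks the same either way.
const server = http.createServer((req, res) => {
  // Tell the browser to use HTTPS for this host for the next 24 hours.
  res.setHeader("Strict-Transport-Security", "max-age=86400");
  res.writeHead(200, { "Content-Type": "text/html; charset=utf-8" });
  res.end("<!DOCTYPE html><html><body>hello</body></html>");
});

server.listen(8080); // the port is an arbitrary choice for this sketch
```

On the next http:// navigation to the same host within max-age seconds, the browser rewrites the URL to https:// before any network request is made.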
More details are covered later in “Browsers Receiving Responses – HTTP Redirection.”
Find the HTTP cache
If no internal redirection is required, the client's local cache is looked up by the resource's URL. Here we take the Taobao mobile home page main.m.taobao.com/?sprefer=sy… as an example.
You can type chrome://cache in the address bar to view the resources cached by the browser. Chrome records the responses to all requests, including documents, static resources, Ajax requests, redirects, and so on. Because storage space is limited, these records are cleared according to certain rules.
Note: the chrome://cache and chrome://net-internals/#dns pages have been removed from the latest versions of Chrome, so you need an older version of Chrome to see these entries; after downloading one, delete the latest local version of Chrome to view them.
I downloaded it here (www.chromedownloads.net/).
View resources cached in Chrome
Just because the browser has a cache entry does not mean it can be used; we need to look at the details of the cached resource, for example the cache information for main.m.taobao.com/?sprefer=sy….
The browser accesses the resource cache for the first time
This is the information cached by the browser on the first visit: the response status code was 200 and it carried two cache-related response headers:
cache-control: max-age=300, s-maxage=600
etag: W/"2c8d-17145f41a6e"
Cache-control tells the browser that the resource will be cached for 300 seconds (5 minutes) on the client and 10 minutes on the proxy cache server.
When we visit main.m.taobao.com/?sprefer=sy… again within 5 minutes, the browser reads the resource directly from the local cache and sends no network request.
Fetch resources from the local cache
When the resource is accessed after the five minutes have passed and is found to have expired, the browser initiates a network request to the server and adds if-none-match to the request header, with the ETag value from the cached copy as its value. In other words, the browser sends back the ETag the server previously returned to identify the resource; the server compares it with the current resource to determine whether the resource has changed.
When the resource has not changed, the server returns 304, telling the browser to use the local cached copy directly.
The browser negotiates caching with the server
At this point, many people will find that when accessing main.m.taobao.com/?sprefer=sy… again, the browser does not fetch the cached copy directly from the local cache but initiates a network request instead.
The normal caching mechanism would be to fetch a copy of the local cache during the cache lifetime and then make a network request after the cache expires, but in this case, Chrome does something else.
You’ll notice that even while the local cache is still valid, the browser ignores cache-control: max-age=300 and adds cache-control: max-age=0 to the request header, indicating that it does not want to use the local copy directly; instead it asks the server to verify whether the local resource is still valid and only then decides whether to use the local cache.
The browser added the request header cache-control:max-age=0 to negotiate the cache with the server
Why do browsers do this? Do all resource caches have this policy?
In Reload, Reloaded: Faster and Leaner Page Reloads, the Chrome team explained that users typically reload a page for one of two reasons: the page is broken, or they want to see fresher content. Revalidating every resource on reload handles the broken-page case, but it makes it slow and expensive to see the new page, especially on mobile devices.
As a result, Chrome changed its behavior when reloading a page to only verify that the main resource was updated, ensuring maximum reuse of cached resources and freshness of the page.
This is probably why, when we access main.m.taobao.com/?sprefer=sy… again, the browser adds cache-control: max-age=0 to the request header.
Also, when we open a new tab to visit main.m.taobao.com/?sprefer=sy…, the browser at first reads directly from the cache; but after several visits, even in a new tab, it sends a network request instead of reading directly from the cache.
Is Chrome ignoring Cache-Control: max-age? When a user refreshes a page or opens the same page several times, the browser guesses that the user wants to see a fresher page. So for the main document it makes a network request and negotiates with the server over whether the local cache can be used; for other static resources it respects the cache-control of the cached copy and does not add max-age=0.
Cache-control :max-age=0
HTTP caching is based on the Cache-Control, ETag, and If-None-Match headers. For historical reasons, there are also some other cache-related request and response headers, which are not described here but are briefly listed in a table.
HTTP cache-related protocol headers
The process of finding the cache can be seen in this flowchart:
Obtaining cache process
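The flow above can be condensed into a small decision function. The sketch below is only conceptual TypeScript (the entry shape and field names are my own assumptions, not Chrome's internal types): a fresh entry is served from cache, a stale entry with a validator triggers a conditional request, and anything else goes to the network.

```ts
interface CacheEntry {
  storedAt: number;   // ms timestamp when the response was cached
  maxAge: number;     // seconds, from cache-control: max-age
  etag?: string;      // validator, if the server sent one
}

type CacheAction =
  | { kind: "use-cache" }                        // read local copy, no request
  | { kind: "revalidate"; ifNoneMatch: string }  // conditional request, expect 200 or 304
  | { kind: "fetch" };                           // plain network request

function decideCacheAction(entry: CacheEntry | undefined, now = Date.now()): CacheAction {
  if (!entry) return { kind: "fetch" };          // nothing cached yet

  const ageSeconds = (now - entry.storedAt) / 1000;
  if (ageSeconds < entry.maxAge) return { kind: "use-cache" };              // still fresh

  if (entry.etag) return { kind: "revalidate", ifNoneMatch: entry.etag };   // stale but validatable
  return { kind: "fetch" };                      // stale and no validator
}

// Example: a copy cached 10 minutes ago with max-age=300 and an ETag
console.log(decideCacheAction({ storedAt: Date.now() - 600_000, maxAge: 300, etag: 'W/"2c8d-17145f41a6e"' }));
// -> { kind: "revalidate", ifNoneMatch: 'W/"2c8d-17145f41a6e"' }
```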
Making a Network request
If there is no local cache or cache resources are not available, you need to make a network request.
Application layer: DNS resolves domain names
Before making a network request, we need to establish a connection with the target server; and before establishing a connection, we need to know the target server's IP address, which requires domain name resolution.
Domain Name System (DNS) is a protocol at the application layer like HTTP. It provides domain name to IP address resolution service. Users usually use host names or domain names to access each other’s computers, rather than directly through IP addresses.
DNS provides lookup of IP addresses by domain name, or reverse lookup of domain names from IP addresses. For the principles of DNS resolution, refer to How Networks Are Connected. The figure below shows the DNS resolution process.
The procedure for querying DNS servers, from How Networks Are Connected, is as follows:
- 0. Query DNS records from the browser cache. If no, query DNS records from the operating system cache.
- 1. Send a request to the local DNS server for the IP address of www.lab.glasscom.com. If the local DNS server has cache and has not expired, the IP address is returned directly.
- 2. The local DNS server queries the root DNS server, and the root DNS server returns the com server address.
- 3. The local DNS server queries the com server for the IP address of www.lab.glasscom.com, and the com server returns the address of the glasscom.com server.
- 4. The local DNS server queries the glasscom.com server for the IP address of www.lab.glasscom.com, and the glasscom.com server returns the address of the lab.glasscom.com server.
- 5. The local DNS server queries the IP address of the lab.glasscom.com server www.lab.glasscom.com. The level 3 domain name (lab.glasscom.com) server returns the IP address of the Level 4 domain name (www.lab.glasscom.com) server, which is the IP address of the Web server.
- 6. The local DNS server returns the IP address of www.lab.glasscom.com to the client
- 7. The client sends a request to the Web server based on this IP.
This query process is based on two premises:
- DNS server information of the root domain is stored on all DNS servers on the Internet.
- The IP addresses of the DNS servers that manage lower-level domains are registered with their upper-level DNS servers, and the IP addresses of the upper-level DNS servers are registered with the upper-level DNS servers.
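You can observe the end result of this resolution chain yourself. Below is a minimal sketch using Node.js's built-in dns module in TypeScript; www.lab.glasscom.com is the example host from the book, so substitute any resolvable name.

```ts
import { promises as dns } from "node:dns";

async function main() {
  // Ask the configured DNS servers for the A records (IPv4 addresses) of the host.
  // The recursive walk through the root / com / glasscom.com servers described
  // above is performed for us by the local DNS resolver.
  const addresses = await dns.resolve4("www.lab.glasscom.com");
  console.log(addresses); // e.g. [ "192.0.2.10" ] (whatever the zone actually returns)
}

main().catch(console.error);
```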
Application layer: Prepares HTTP request messages
To initiate a network request, the browser prepares an HTTP message. In HTTP 1.1, a request message consists of the request method, request URI, protocol version, optional request header field, and content entity.
The composition of the request message
So the browser needs to make sure that:
- Request method
- The request URI
- Protocol version
- Request header field
- Content of the entity
Each content involves a lot of knowledge, here is a brief introduction.
Request method
The request method indicates the type of operation the client asks the server to perform. HTTP/1.0 and HTTP/1.1 support the following methods. By default, the browser initiates a GET request; if a request is initiated from a script, the method specified in the script is used.
Methods supported by HTTP/1.0 and HTTP/1.1
Protocol version
The protocol version indicates the HTTP version used by the client. The latest version so far is HTTP/3. Here is a brief list of the features defined in each release.
The development of the HTTP protocol
Most major browsers had added HTTP/2 support by the end of 2015, and many top sites at home and abroad have deployed HTTP/2.
For an introduction to HTTP/1, HTTP/2, and HTTP/3, see Li Bing's course "Browser Working Principles and Practice", which vividly describes the development of the HTTP protocol.
Request header field
The request header fields mainly give the server information such as the size of the message body, the language used, authentication information, and various negotiation content. For example:
- Access source: Host, Referer, User-Agent
- Authentication information: Cookie and Authorization
- Cache negotiation: If-Modified-Since, If-None-Match
- Content encoding, character set, and data format negotiation: Accept, Accept-Encoding, Accept-Language
- Others
The browser (the application) does not handle the transmission of the data itself; it delegates the message to the network control software in the operating system (the protocol stack), which delivers it to the server. Once the HTTP request message is prepared at the application layer (the browser), it is handed to the TCP/IP protocol stack in the operating system.
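To make the message structure concrete, here is a sketch that assembles an HTTP/1.1 request message by hand in TypeScript. The host, path, and header values are invented for illustration; a real browser adds many more header fields.

```ts
// Build a minimal HTTP/1.1 GET request message: request line, header fields,
// a blank line, and (for GET) no entity body.
function buildRequest(host: string, path: string): string {
  const lines = [
    `GET ${path} HTTP/1.1`,   // method, request URI, protocol version
    `Host: ${host}`,          // required in HTTP/1.1
    "Accept: text/html",      // content negotiation
    "Accept-Encoding: gzip",  // allow a compressed response
    "Connection: close",      // ask the server to close after responding
    "",                       // blank line terminates the header section
    "",
  ];
  return lines.join("\r\n");  // header lines are separated by CRLF
}

console.log(buildRequest("example.com", "/index.html"));
```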
Presentation layer: HTTP/2 binary framing
If HTTP/2 is used, HTTP/2 defines a binary framing layer between the application layer and the transport layer. We can map it to the presentation layer of the OSI seven-layer model, since it is responsible for data format conversion.
We can see that the framing layer converts HTTP 1.1 messages into HTTP/2 frames.
The problem with a single TCP connection is that only one request can be in flight at a time: the client must wait for the response before making another request. This is the head-of-line blocking problem. As discussed earlier, a typical workaround is to open multiple connections, one per request. But if messages can be broken into smaller independent parts and sent over a single connection, the problem is solved, and that is exactly what HTTP/2 does: messages are broken into frames, each frame is assigned a stream identifier, and the frames are sent independently over one TCP connection. This enables fully bidirectional multiplexing of request and response messages, as shown in the figure below.
This is where HTTP/2 differs from HTTP/1 when it is used for communication, and how messages are divided into frames.
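The frames mentioned above have a fixed 9-byte header defined in RFC 7540 section 4.1: a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a 31-bit stream identifier. Here is a minimal sketch of encoding that header in TypeScript (assuming Node.js Buffers; the payload and stream id are arbitrary example values):

```ts
import { Buffer } from "node:buffer";

// Encode the fixed 9-byte HTTP/2 frame header (RFC 7540, section 4.1).
function encodeFrameHeader(length: number, type: number, flags: number, streamId: number): Buffer {
  const header = Buffer.alloc(9);
  header.writeUIntBE(length, 0, 3);               // 24-bit payload length
  header.writeUInt8(type, 3);                     // frame type (0x0 = DATA, 0x1 = HEADERS, ...)
  header.writeUInt8(flags, 4);                    // type-specific flags (0x1 = END_STREAM for DATA)
  header.writeUInt32BE(streamId & 0x7fffffff, 5); // reserved bit zero + 31-bit stream identifier
  return header;
}

// Example: a DATA frame carrying 11 bytes on stream 1, marked END_STREAM.
const payload = Buffer.from("hello world");
const frame = Buffer.concat([encodeFrameHeader(payload.length, 0x0, 0x1, 1), payload]);
console.log(frame.subarray(0, 9)); // <Buffer 00 00 0b 00 01 00 00 00 01>
```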
Session layer: the application establishes a connection by delegating to the protocol stack (TCP/IP) through the Socket interface
The process in which the application (application layer) asks the operating system (transport layer) through the Socket interface to establish a connection corresponds to the session layer of the OSI seven-layer model. The interaction between the application and the operating system is shown in the figure; for more detail, refer to How Networks Are Connected.
From How Networks Are Connected
The connection operation requires a three-way handshake, which is implemented at the transport layer. Here we walk through connection establishment, data sending and receiving, and disconnection with the four-way handshake, all implemented by the transport layer.
The overall TCP flow
The first step in the data sending and receiving operation is to create a socket. Typically, an application on the server side creates a socket and enters the connection waiting state at startup. The client typically creates a socket when the user triggers a specific action that requires access to the server. At this stage, network packets have not been transmitted.
After the socket is created, the client initiates a connection operation to the server.
- First, the client generates a TCP packet with a SYN of 1 and sends it to the server (①). The header of the TCP packet also contains the initial sequence number used by the client to send data to the server, and the window size required by the server to send data to the client, A.
- When this packet arrives at the server, the server returns a TCP packet (②) with SYN set to 1. As with ①, the header of this packet contains a sequence number and window size, as well as an ACK number B acknowledging receipt of the packet.
- When this packet arrives at the client, the client returns a TCP packet (③) to the server containing the ACK number for confirmation.
At this point, the connection is complete and the two parties enter the data sending and receiving phase. The operation of the data sending and receiving phase varies somewhat from application to application, using the Web as an example.
- First the client sends a request message to the server. TCP splits the request message into blocks of a certain size, prefixes each block with a TCP header, and sends them to the server (④). The TCP header contains a sequence number, which indicates the byte offset of the data currently being sent.
- When the server receives data, it returns the ACK number (⑤) to the client.
- In the initial phase the server is simply receiving data; as the data is received it is passed on to the application and the receive buffer is gradually freed, so the server needs to tell the client the new window size. When the server has received the complete request message from the client, it returns the response message to the client; that process is just the reverse direction (⑥⑦).
After the server's response message has been sent, the data sending and receiving phase ends and the disconnect operation begins. Taking the Web as an example:
- The server initiates the disconnect process first.
- In this process, the server sends a TCP packet with FIN 1 (⑧), and the client returns an ACK number (⑨) confirming receipt. Next, the two parties exchange a group of TCP packets with FIN 1 in the opposite direction and TCP packets containing ACK numbers.
- Finally, after waiting some time, the socket is deleted.
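From the application's point of view, all of the above is hidden behind the socket API: connecting triggers the three-way handshake, writing hands data to the protocol stack, and ending the socket starts the FIN sequence. A minimal Node.js sketch in TypeScript (host and port are example values):

```ts
import * as net from "node:net";

// connect() asks the protocol stack to perform the three-way handshake (1)-(3).
const socket = net.createConnection({ host: "example.com", port: 80 }, () => {
  // The connection is established; write() hands data to the protocol stack,
  // which splits it into TCP segments and sends them (4)-(7).
  socket.write("GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n");
});

socket.on("data", (chunk) => process.stdout.write(chunk)); // reassembled response data
socket.on("end", () => socket.end()); // the peer sent FIN; end() sends our FIN (8)-(9)
```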
Session layer: TLS negotiation
After the three-way handshake establishes the connection at the transport layer, if HTTPS is used, a TLS handshake is also required to establish a secure connection. This handshake process, called TLS negotiation, determines which cipher will be used to encrypt the communication, authenticates the server, and ensures a secure connection is in place before the actual data transfer begins. So it takes another three or four round trips to the server before the content request is actually sent.
Every TLS connection starts with a handshake. If the client has not previously established a session with the server, both parties perform a full handshake process to negotiate the TLS session. During the handshake, the client and server perform the following four main steps.
- (1) Exchange supported functions and reach agreement on required connection parameters.
- (2) Verify the certificate presented, or use other means for authentication.
- (3) Agree on the shared master key that will be used to protect the session.
- (4) Verify that the handshake message has not been modified by a third party.
In a full handshake, the messages exchanged are roughly:
- (1) The client starts a new handshake and submits the capabilities it supports to the server.
- (2) The server selects the connection parameters.
- (3) The server sends its certificate chain (only if server authentication is required).
- (4) Depending on the chosen key exchange method, the server sends additional information needed to generate the master key.
- (5) The server announces that it has finished its part of the negotiation.
- (6) The client sends the additional information needed to generate the master key.
- (7) The client switches to encrypted mode and notifies the server.
- (8) The client computes a MAC over the handshake messages it has sent and received, and sends it.
- (9) The server switches to encrypted mode and notifies the client.
- (10) The server computes a MAC over the handshake messages it has sent and received, and sends it.
Assuming no errors, at this point the connection is established and you can start sending application data.
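In application code the whole negotiation is again hidden behind a single call. Below is a minimal sketch with Node.js's tls module in TypeScript (the host is an example value); the callback fires once the handshake steps (1) through (10) have completed, after which application data is sent encrypted.

```ts
import * as tls from "node:tls";

// connect() performs the TCP handshake and then the TLS handshake described above.
const socket = tls.connect({ host: "example.com", port: 443, servername: "example.com" }, () => {
  console.log("protocol:", socket.getProtocol()); // e.g. "TLSv1.3"
  console.log("cipher:", socket.getCipher());     // the negotiated cipher suite
  // From here on, application data (the HTTP request) travels encrypted.
  socket.end("GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n");
});

socket.on("data", (chunk) => process.stdout.write(chunk));
socket.on("end", () => socket.end());
```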
Transport layer: Encapsulates packets into segments
Once the connection is established, the browser can finally send the data request. The application calls the write method of the Socket library to hand the HTTP message prepared at the application layer to the protocol stack. Application data is usually large, so the transport layer splits it according to the network packet size: the data is divided into segments and a TCP header is added to each.
Split sending of application data
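Conceptually the split looks like the sketch below: the data is cut into pieces no larger than the MSS (maximum segment size), and each piece later gets its own TCP header. This is only an illustration in TypeScript; the real work happens inside the kernel's protocol stack, and 1460 bytes is just the typical MSS on Ethernet.

```ts
import { Buffer } from "node:buffer";

// Split application data into MSS-sized chunks, as the TCP module does
// before prepending a TCP header to each segment.
function segment(data: Buffer, mss = 1460): Buffer[] {
  const segments: Buffer[] = [];
  for (let offset = 0; offset < data.length; offset += mss) {
    segments.push(data.subarray(offset, offset + mss));
  }
  return segments;
}

const message = Buffer.alloc(4000, "a");            // pretend this is an HTTP message
console.log(segment(message).map((s) => s.length)); // [ 1460, 1460, 1080 ]
```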
Network layer: encapsulated into datagrams
The IP module receives the TCP segments, then adds an IP header and an Ethernet MAC header before sending the network packets.
IP datagram encapsulation process
Data link layer: encapsulated into frames
The data unit passed from IP to the network interface layer is called an IP datagram. After the network adapter driver obtains the datagram from the IP module, it adds a header and a start frame delimiter to the beginning of the datagram, and a frame check sequence (FCS) for error detection to the end. With the header, start frame delimiter, and FCS added, the packet is ready to be sent over the network cable. This bit stream transmitted over Ethernet is called a frame.
Packets sent by network cards, from How Networks Are Connected
Physical layer: digital conversion to electrical signals
The internal structure of a network card is shown below. The MAC module in the card converts the digital information into electrical signals, bit by bit starting from the header, and these are then sent out by the PHY (or MAU) transceiver module.
Network card, from How Networks Are Connected
Here, the rate at which digital information is converted into an electrical signal is the transmission rate of the network. The PHY (MAU) module then converts the signal into a format suitable for the network cable and sends it out over the cable.
The network adapter converts digital information into electrical or optical signals
The request arrives at the proxy server
Proxy is a mechanism between the client and the Web server to transfer access operations, including forward proxy and reverse proxy.
If you think of the Internet outside the LAN as a vast repository of resources, then resources are distributed at various sites on the Internet. LAN clients can access the resources in this library only through a unified proxy server, which is called forward proxy.
That is, on the same LAN as the client, the forward proxy server enables the client to access the Internet to access Internet resources. Therefore, the forward proxy server can be used as a firewall to monitor and manage LAN access to the Internet.
Forward proxies are used in scenarios such as bypassing network restrictions, monitoring and managing LAN access to the external network, caching, and hiding client information.
Forward proxy
If the LAN provides resources to the Internet so that other users on the Internet can access resources within the LAN, you can also use a proxy server, which provides a service called reverse proxy service. That is, the reverse proxy server and the server reside on the same LAN. The reverse proxy server allows extranet clients to access sites on the LAN to access resources on the site.
The reverse proxy
Reverse proxy is mainly used in the following scenarios:
- Firewall, protect and hide raw resource server
- Load balancing, when there are multiple servers, uses proxy servers to forward requests
- Reverse proxy cache, used to cache resources on the server
We can share the load of the Web server by adding a proxy server between the client and the server in three ways:
Load balancing
When user traffic is high, you can cope by adding more servers. A load-balancing proxy distributes requests across the origin servers, reducing the load on each origin server and improving processing speed.
Caching proxy server
The caching proxy is deployed on the server's LAN. The proxy forwards each request to the origin server; when the resource on the origin has not changed, the origin returns 304 and the resource is fetched directly from the cache server, saving the origin server's processing time and the network transfer time of the response data.
Content delivery service
A service that deploys proxy servers at the edge of the Internet and provides a cache refresh mechanism. If the proxy server has a valid cache, it returns the cached resource directly without sending a request to the origin server, saving both the network transfer time and the server processing time. Static resources are generally published to CDN cache servers.
CDN content delivery: return cached resources or go back to the origin
The original idea of caching is just like the working process of the cache server mentioned above. It saves the data that has been accessed and then uses it when it is accessed again to improve the efficiency of the access operation. However, this approach is ineffective for the first access, and each subsequent access requires a query to the original server to see if the data has changed, worsening the response time if network congestion occurs.
One way to improve this is to have the Web server notify the cache server immediately when the raw data is updated, so that the data on the cache server is always up to date, so that there is no need to confirm whether the raw data has changed every time, and the caching effect can be applied from the first access. The cache server used by the content distribution service does just that.
Content delivery service operators deploy many cache servers on the Internet. These servers are neither on the client’s LAN nor on the server’s LAN. When the client accesses the Web server, the client accesses the cache server closest to the user.
Deploy caching servers at the edge of the Internet
How do you find the nearest cache server to the client? When the DNS server returns the Web server IP address, we can manipulate the returned content so that it returns the IP address of the cache server closest to the client.
In addition, static pages are usually deployed on the cache server, and dynamic pages cannot be stored on the cache server. But we can separate the dynamic parts, where the content changes every time, from the static parts, where the content does not change, and only keep the static parts in the cache.
Here is an example from the Tencent Cloud website illustrating how a CDN works. Assume the domain name of your origin site is www.test.com. After the domain is onboarded to the CDN and the acceleration service is enabled, when a user initiates an HTTP request the actual flow is as follows:
From the Tencent Cloud website. The details are as follows:
- When a user sends a request to an image resource (for example, 1.jpg) at www.test.com, the user sends a domain name resolution request to the Local DNS.
- The Local DNS finds that www.test.com has been configured with a CNAME, www.test.com.cdn.dnsv1.com, so the resolution request is forwarded to Tencent DNS (GSLB), Tencent Cloud's in-house scheduling system, which assigns the best node IP to the request.
- Local DNS Obtains the resolved IP address returned by Tencent DNS.
- The user obtains the resolved IP address.
- The user initiates an access request to resource 1.jpg to the obtained IP address.
- If the node cache corresponding to this IP contains 1.jpg, the data is returned directly to the user (10) and the request ends. If 1.jpg is not cached, the node requests 1.jpg from the origin (6, 7, 8); after obtaining the resource, it caches it on the node (9) according to the user-defined cache policy (see the cache expiration configuration in the product documentation) and returns it to the user (10). The request then ends.
Load balancer: forwards requests
When the number of visits to a server grows and all of them go to the same server, server performance becomes a problem. You can deploy the application on multiple Web servers and then use a device called a load balancer: the client sends requests to the load balancer, and the load balancer decides which Web server to forward each request to.
To the client, the load balancer looks like a Web server. You register the load balancer's IP address in DNS instead of the Web servers' actual addresses, so the client treats the load balancer as the Web server and sends requests to it.
The job of the load balancer is to decide which Web server to forward the request to based on the load of the Web server.
A load balancer for assigning access to multiple Web servers
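A toy version of such a device can be sketched in a few lines: pick a backend in round-robin order and forward the request to it. The sketch below uses Node.js's http module in TypeScript; the backend addresses are made up, and a real load balancer also tracks health, sessions, and actual load.

```ts
import * as http from "node:http";

const backends = [
  { host: "10.0.0.1", port: 8080 },  // example backend addresses
  { host: "10.0.0.2", port: 8080 },
];
let next = 0;

// The client talks to this server as if it were the Web server.
http.createServer((clientReq, clientRes) => {
  const backend = backends[next++ % backends.length];  // round-robin choice
  const proxyReq = http.request(
    { host: backend.host, port: backend.port, path: clientReq.url, method: clientReq.method, headers: clientReq.headers },
    (proxyRes) => {
      clientRes.writeHead(proxyRes.statusCode ?? 502, proxyRes.headers);
      proxyRes.pipe(clientRes);                         // stream the backend response back
    }
  );
  proxyReq.on("error", () => { clientRes.writeHead(502); clientRes.end("Bad Gateway"); });
  clientReq.pipe(proxyReq);                             // stream the request body to the backend
}).listen(80);
```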
Requests are forwarded to the source server
When the CDN returns to the source or the request reaches the load balancer, the request needs to be forwarded to the source server, at this time the source server needs to receive the request, process and return the response.
Unpack and assemble
When the network packet arrives at the Web server, the server receives the packet and processes it. Servers can be divided into many categories based on their purpose, and their hardware and operating systems differ from those of clients. However, the network related parts, such as network card, protocol stack, Socket library and other functions are the same as the client. TCP and IP have the same functionality, or specifications, regardless of hardware and operating systems.
The server now receives the network packet in the following format:
Packets sent by network cards, from How Networks Are Connected
The server needs to unpack, take out the client’s data (what the application is responsible for generating), find the server’s application and hand it over. The specific process is as follows:
The link layer
The NIC driver determines the protocol type based on the MAC header and delivers the packet to the corresponding protocol stack.
The network layer
The IP module of the protocol stack checks the IP header.
- Check the receiver IP address to determine whether the packet is addressed to this machine.
- Check the IP header to determine whether the packet has been fragmented. If it has, the fragments are held in memory and reassembled into the original packet once all of them have arrived.
- Check the protocol number field in the IP header and hand the packet to the corresponding module. For example, if the protocol number is 06 (hexadecimal), the packet is handed to the TCP module; if it is 11 (hexadecimal), it is handed to the UDP module.
The transport layer
Here we assume that the packet is handed over to the TCP module for processing, and then we move on. The previous steps are the same for any packet, but subsequent TCP modules operate differently depending on the contents of the packet.
- If the TCP module receives a packet that initiates a connection:
- Verify the TCP header’s control bit SYN.
- Check the receiving port number;
- Make a new copy of the corresponding waiting socket;
- Record the IP address and port number of the sender.
- When receiving a packet, the TCP module:
- Locate the corresponding socket according to the sender IP address, sender port number, receiver IP address, and receiver port number of the received packet.
- Blocks of data are pieced together and stored in the receive buffer;
- Return an ACK to the client
Moving up from the link layer to the network layer and then the transport layer, the network packet is unpacked and reassembled, and the data is finally delivered to the Web server application on the server side.
Allocating received packets, from How Networks Are Connected
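The key step at the transport layer is the lookup keyed by those four values. The sketch below is purely conceptual TypeScript (not real kernel code; the types and names are my own) showing the protocol-number dispatch and the 4-tuple socket lookup:

```ts
import { Buffer } from "node:buffer";

// The protocol number in the IP header decides which module gets the packet.
const PROTOCOLS: Record<number, string> = { 0x06: "TCP", 0x11: "UDP" };

interface SocketState { receiveBuffer: Buffer[] }

// Sockets are keyed by the 4-tuple: sender IP/port and receiver IP/port.
const sockets = new Map<string, SocketState>();
const key = (srcIp: string, srcPort: number, dstIp: string, dstPort: number) =>
  `${srcIp}:${srcPort}->${dstIp}:${dstPort}`;

function deliverTcpSegment(srcIp: string, srcPort: number, dstIp: string, dstPort: number, data: Buffer): void {
  const socket = sockets.get(key(srcIp, srcPort, dstIp, dstPort)); // find the matching socket
  if (!socket) return;              // no such connection; a real stack would send RST
  socket.receiveBuffer.push(data);  // piece the data together in the receive buffer
  // ...and an ACK for the received bytes would be returned to the client here.
}
```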
Handle the request
The server receives the HTTP request message and processes it according to its content. The request message contains a command called a method and a URI (a file path) identifying the data source; the server program returns the corresponding data to the client, but the internal work the server does varies depending on the method and URI.
In the simplest case, shown in the figure below, the request method is GET and the URI is an HTML file name. In this case, you simply read out the HTML document from the file and return it as a response message.
How does the Web work
However, the file contents specified by the URI are not limited to HTML documents, but can also be a program. For example, when making an Ajax request, it may be a Controller on the server side, which needs to query or update data from the database through some business logic in the program, and finally return the results.
There may also be access control before the server program runs, for example checking whether the user is logged in before allowing the query or update; if not, no result is returned and a 401 is sent, indicating that the client does not have access permission.
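A minimal sketch of that server-side branching, using Node.js's http and fs modules in TypeScript; the paths, the cookie check, and the /api/orders route are invented for illustration:

```ts
import * as http from "node:http";
import { promises as fs } from "node:fs";

http.createServer(async (req, res) => {
  const url = new URL(req.url ?? "/", "http://localhost");

  if (req.method === "GET" && url.pathname.endsWith(".html")) {
    try {
      // Static case: read the HTML document and return it as the response body.
      const body = await fs.readFile("." + url.pathname);
      res.writeHead(200, { "Content-Type": "text/html; charset=utf-8" });
      res.end(body);
    } catch {
      res.writeHead(404);
      res.end("Not Found");
    }
  } else if (url.pathname === "/api/orders") {
    // Dynamic case: access control first, then business logic.
    if (!req.headers.cookie?.includes("session=")) {
      res.writeHead(401);                      // not logged in: no permission
      res.end("Unauthorized");
      return;
    }
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ orders: [] }));   // would normally query a database here
  } else {
    res.writeHead(404);
    res.end("Not Found");
  }
}).listen(8080);
```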
Returns a response
When the server has finished processing the request message, it can return the response message. The process is the same as when the client sends a request message to the server.
First, the Web server calls the write method of the Socket library, passing the response message to the protocol stack. At this point we need to tell the protocol stack who the response should be sent to, but we do not pass the client's IP address and similar details directly; we only pass the descriptor of the socket used for the communication. The socket itself holds all of the communication state, including information about the peer, so the descriptor is all that is needed.
Next, the stack splits the data into network packets and sends them with headers. These packets contain the addresses of the receiving clients, which are forwarded by switches and routers to the client over the Internet.
Browser receives response
The response message sent by the Web server is broken up into multiple packets and sent to the client, which then needs to receive the data. First, the network card restores the signal to digital information, and the protocol stack assembles the split network packet and pulls out the response message, which is then passed on to the browser. This process is the same as the server’s receive operation.
Now that the browser receives the response from the server, it needs to parse it first. The browser decides what to do next based on the status code and HTTP header message of the response.
Status code | Meaning | Common status codes |
---|---|---|
1xx | Informational: the received request is being processed | 101: Switching Protocols. The server is switching to the protocol listed in the Upgrade header, as requested by the client; see Protocol Upgrade Mechanism |
2xx | Success: the request was processed successfully | 200: OK. The request succeeded and the message body contains the requested resource |
3xx | Redirection: further action is required to complete the request | 301: Moved Permanently. 302: Found, temporary redirect (HTTP 1.0). 304: Not Modified, the requested resource has not changed, so the server does not return it and the client can use its local copy. 307: Temporary Redirect (HTTP 1.1), also used by Chrome for internal redirects |
4xx | Client error | 401: Unauthorized. 404: Not Found, the server could not find the resource corresponding to the requested URL |
5xx | Server error | 500: Internal Server Error. 502: Bad Gateway, the proxy or gateway received an invalid response |
HTTP redirection
When a server resource is moved, the client is told to go somewhere else, but we generally try to avoid this situation. The other redirects we see more often are:
- When the URL is opened on a mobile device, the server redirects to the mobile version of the resource for adaptation.
- We applied for a new domain name, chang20159.com, hoping to redirect it to chang20159.github.io
- The HTTP url is redirected to the HTTPS url
As it happens, Taobao covers all of these cases. When we enable mobile mode in the browser and type taobao.com in the address bar, we can see seven redirects.
main.m.taobao.com/?sprefer=sy… is the final address of the resource. Since the document has not changed on the server, the server returns 304, telling the browser to fetch it from the local cache.
Seven redirects from visits to Taobao.com
We’ll see 3 Internal redirects out of 7. Why is that?
Because when accessing www.taobao.com, the server adds this to the response header:
strict-transport-security: max-age=31536000
This tells the browser that the server uses HTTP Strict Transport Security (HSTS) and that the site may only be accessed over HTTPS. When the browser sees this response header, it stores the setting with the domain www.taobao.com as the key, valid for max-age seconds.
When I access www.taobao.com over HTTP, the browser finds this configuration and internally redirects straight to the https:// version of www.taobao.com.
So the 307 Internal Redirect is a virtual response created by the browser; the request is never actually sent to the server.
If you open chrome://net-internals/#hsts and delete the domain security policy for www.taobao.com, the response to the www.taobao.com request becomes 301 Moved Permanently.
The process of redirecting the request to the server is the same as described above. After redirecting to the final storage address of the resource, we finally get the required response data.
The next step is to detect the response body and do further processing.
Detection response body
Common HTTP requests include requests for HTML documents, static resources (JS/CSS/images), Ajax data requests, and so on. When we get the HTTP response message, we need to know whether it is complete, whether it is compressed, what type it is, and so on, before the browser can proceed to the next step; for example, if it is an HTML document, it needs to be rendered.
Detection truncation (Content-Length)
Content-Length indicates the size of the entity body in bytes. This size reflects any content encoding: for example, if a text file is gzip-compressed, Content-Length indicates the compressed size, not the original size.
After receiving the response entity, the client uses Content-Length to detect truncation. Knowing where the message should end lets the client detect truncation caused by a server crash and correctly delimit multiple messages sharing a persistent connection.
Detection of content-encoding
When the server returns a response, it may compress the response content to help reduce the transmission time, and it may scramble or encrypt the content to prevent unauthorized third parties from seeing the contents of the document.
The Content-Encoding response header specifies the encoding algorithm used by the server; when the client receives the response, it must decode the body using the same algorithm.
Examples of content encoding from the Authoritative GUIDE to HTTP
Some commonly used content encodings are listed below; for details, see Content-Encoding | MDN.
Detecting media types and character sets (Content-Type)
The Web handles many types of data, including text, images, audio, and video, and each type is displayed differently. So we must first know what type of data has been returned, otherwise it cannot be displayed correctly. We need some information to determine the data type, and in principle this is based on the value of the Content-Type header field in the response message.
The Content-Type response header specifies the MIME type of the response entity. Its value is typically a string like the following.
Content-Type: text/html
The part to the left of the "/" is the main type, indicating the broad category of the data; the subtype on the right indicates the specific data type. In the example above the main type is text and the subtype is html, meaning an HTML document that follows the HTML specification. Some of the main types are listed below.
Data types defined by Content-Type, from How the Web Is Connected
In addition, if the data type is text, the encoding also needs to be determined. The charset parameter carries this information, as below, where utf-8 indicates Unicode (UTF-8) encoding.
Content-Type: text/html; charset=utf-8
Note that the Content-Type header describes the media type of the original response entity: if the entity has been content-encoded, Content-Type refers to the type before encoding.
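A sketch of these checks on a received response, using Node.js's zlib module in TypeScript for decompression; the header values and example body are illustrative, and a real client supports more encodings and charsets:

```ts
import { Buffer } from "node:buffer";
import * as zlib from "node:zlib";

interface RawResponse { headers: Record<string, string>; body: Buffer }

function processBody(res: RawResponse): string {
  // 1. Detect truncation: the body should be exactly Content-Length bytes.
  const expected = Number(res.headers["content-length"]);
  if (!Number.isNaN(expected) && res.body.length !== expected) {
    throw new Error(`truncated body: got ${res.body.length} of ${expected} bytes`);
  }

  // 2. Undo the content encoding (only gzip is handled in this sketch).
  const encoding = res.headers["content-encoding"];
  const decoded = encoding === "gzip" ? zlib.gunzipSync(res.body) : res.body;

  // 3. Use the media type and charset to decide how to interpret the bytes.
  const contentType = res.headers["content-type"] ?? "application/octet-stream";
  const charset = /charset=([\w-]+)/i.exec(contentType)?.[1] ?? "utf-8";
  return decoded.toString(charset as BufferEncoding);
}

// Example: a gzip-compressed HTML response
const html = "<!DOCTYPE html><html><body>hi</body></html>";
const body = zlib.gzipSync(Buffer.from(html));
console.log(processBody({
  headers: { "content-length": String(body.length), "content-encoding": "gzip", "content-type": "text/html; charset=utf-8" },
  body,
}));
```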
Synthesize and determine the data type of the responding entity
The way the Content-Type field represents data types is defined in the MIME specification. MIME is a unified standard, but it is only a guideline: the Web server has to set the Content-Type value correctly, and that is not always the case. If the server administrator is careless, the Content-Type value can be wrong. So checking Content-Type alone does not always let you determine the data type accurately.
Therefore we sometimes need to combine other information to determine the data type, such as the extension of the requested file or the format of the content itself. For example, we can treat a file as HTML if its extension is .html or .htm, or treat the data as an HTML document if its content begins with an HTML doctype or tag.
Not just for text files like HTML, but also for images. Images are compressed binary data, but they also have information at the beginning that indicates the format of the content, and we can use this information to determine the type of data. However, there is no universal specification for this part of the logic, so it varies from browser to browser and from version to version.
For example, in the Chrome browser source code for MIME judgment logic:
Detecting MIME types is tricky because we need to balance compatibility issues with security issues. Below is a survey of other browser behavior, followed by a description of how we intend to behave.
HTML payload, no Content-Type header:
- IE 7: Render as HTML
- Firefox 2: Render as HTML
- Safari 3: Render as HTML
- Opera 9: Render as HTML
Here the choice seems clear: => Chrome: Render as HTML
HTML payload, Content-Type: “text/plain”:
- IE 7: Render as HTML
- Firefox 2: Render as text
- Safari 3: Render as text (Note: Safari will render as HTML if the URL has an HTML extension)
- Opera 9: Render as text
…
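A toy version of such sniffing in TypeScript, nothing like Chrome's actual implementation, just the idea of inspecting the first bytes of the payload:

```ts
import { Buffer } from "node:buffer";

// Guess a MIME type from the first bytes of the payload when Content-Type
// is missing or untrustworthy.
function sniffMimeType(body: Buffer): string {
  const head = body.subarray(0, 512).toString("latin1").trimStart().toLowerCase();
  if (head.startsWith("<!doctype html") || head.startsWith("<html")) return "text/html";

  // Binary formats announce themselves with magic numbers.
  if (body.subarray(0, 8).equals(Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]))) return "image/png";
  if (body[0] === 0xff && body[1] === 0xd8 && body[2] === 0xff) return "image/jpeg";
  if (head.startsWith("gif87a") || head.startsWith("gif89a")) return "image/gif";

  return "application/octet-stream";   // fall back to "unknown binary"
}

console.log(sniffMimeType(Buffer.from("<!DOCTYPE html><html></html>"))); // text/html
```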
Processing response body
After determining the data type, the next step is to call the program used to display the content according to the data type and display the data. For basic data types such as HTML documents, plain text, and images, the browser has the ability to display these content itself, so it is the browser’s responsibility to display them.
Data that the browser can display itself, such as HTML documents and images, is delegated to the browser to display on the screen in this manner. However, the Web server may also return other types of data, such as data from word processing, slides, and other applications. The data cannot be displayed by the browser itself, and the browser invokes the corresponding program.
These programs can be plug-ins of the browser, or they can be independent programs. In any case, different types of data correspond to different programs. The corresponding relationship is set in the browser. The called program is then responsible for displaying the corresponding content.
We’ll explain in detail how browsers display HTML documents later.
summary
That is the complete page loading process. Many details have not been explored in depth, but this timeline is enough for locating performance bottlenecks in the loading process. Now let's draw the loading timeline.
Page loading process
We can see how many steps are needed to load the main document resource from the network. The main factors that affect performance are:
- Eight client-server round trips are required before data can be sent
- One DNS resolution was performed
- TCP three-way handshake
- TLS 4 handshakes
- Each application layer data is sent in segments
- The larger the data, the longer the transmission time
- TCP slow start
- When the document is loaded, requests for resources in the document are concurrent
- HTTP / 1.1
- HTTP/2
- HTTP/3
- First-screen and non-first-screen loading policies
- preload
- Document preloading
- Static resource preloading
- Data preloading
- Lazy loading
- Lazy loading of images
- JS is loaded on demand
- preload
Without targeted optimization, it is almost impossible for the first screen to reach the goal of loading within a second. In the following parts, following the network loading process, optimization strategies for these network performance problems are proposed from four angles: caching strategy, loading strategy, network optimization, and resource optimization.
Thoughts
I've written myself bald and can't think of anything more 🤣
Series articles 😇
- preface
- What is Web Performance?
- Why care about Web performance?
- Web performance optimization history
- W3C and Web Performance Working Group
- Performance criteria
- Performance indicators
- Page rendering process
- Loading
- Rendering
- Optimization strategy
- Load performance
- Rendering performance
- Optimization practice
- How do I troubleshoot loading performance problems?
- How do I troubleshoot rendering performance issues?
- Performance testing tool
- Performance non-degradation mechanism
- explore
- Performance optimization and artificial intelligence
- The effect of image size on memory, FPS and CPU