What happens between entering the URL and seeing the page?

  • URL parsing
  • Cache check
  • DNS resolution
  • TCP three-way handshake
  • Data transfer
  • TCP four-way wave
  • Page rendering

Step 1: URL parsing

Address parsing

Protocol   Default port   Description
http       80             Hypertext Transfer Protocol
https      443            HTTP with SSL/TLS encryption and certificate authentication layered on top
ftp        21             File Transfer Protocol, mainly used to transfer files between a client computer and a server
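
As a rough illustration of this step, the sketch below splits a URL into its parts with the standard URL class and falls back to the default ports from the table. The function name parseUrl and the example addresses are made up for this illustration.

```javascript
// Default ports for the protocols in the table above.
const DEFAULT_PORTS = { "http:": 80, "https:": 443, "ftp:": 21 };

// parseUrl is a hypothetical helper name used only in this sketch.
function parseUrl(input) {
  const url = new URL(input);
  return {
    protocol: url.protocol, // includes the trailing colon, e.g. "https:"
    host: url.hostname,
    // url.port is "" when the URL uses the default port of its protocol.
    port: url.port === "" ? DEFAULT_PORTS[url.protocol] : Number(url.port),
    path: url.pathname,
    query: Object.fromEntries(url.searchParams),
  };
}

console.log(parseUrl("https://www.example.com/list?page=2"));
```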

URL encoding

Some special characters have to be encoded before they are passed between client and server, and decoded on the other side.

  • encodeURI, decodeURI: encode and decode Chinese characters, spaces, etc. in a complete URL, leaving URL delimiters such as : / ? & intact

  • encodeURIComponent, decodeURIComponent: encode and decode Chinese characters, spaces, and also delimiters such as : and / (generally used when a query string parameter value itself contains a URL or path)

  • escape, unescape: mainly used to encode and decode information (such as cookies) passed between different pages on the client; both are now deprecated
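
A small sketch of the difference between the first two pairs. The target URL and the redirect address are invented for the example.

```javascript
// A URL containing Chinese characters and a space (made up for this example).
const target = "https://example.com/搜索?q=a b";

// encodeURI keeps the URL structure (: / ? =) and only escapes what is not allowed in a URL.
const whole = encodeURI(target);

// encodeURIComponent also escapes : / ? =, so a whole URL can travel as one parameter value.
const encodedParam = encodeURIComponent(target);
const redirect = "https://example.com/redirect?to=" + encodedParam;

console.log(whole);
console.log(redirect);
console.log(decodeURIComponent(encodedParam) === target); // round trip restores the original
```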

Step 2: Cache check

Cache location: Memory Cache, Disk Cache

  • Opening a page: look in the Disk Cache first; use the cached copy if it is there, otherwise send a network request
  • Normal refresh (F5): because the tab has not been closed, the Memory Cache is still available and is used first, then the Disk Cache
  • Forced refresh (Ctrl+F5): the browser does not use the cache, so requests are sent with the header Cache-Control: no-cache (plus Pragma: no-cache for compatibility), and the server returns 200 with the latest content

Strong cache: Expires / Cache-Control

The browser decides how to handle the strong cache from the response headers returned by the first request for the resource.

  • Expires: the cache expiration time, an absolute date after which the resource expires (HTTP/1.0)
  • Cache-Control: for example, Cache-Control: max-age=2592000 means that for 2,592,000 seconds (30 days) after the resource is first retrieved, repeated requests read it from the cache (HTTP/1.1)
  • If both are present, Cache-Control takes priority over Expires
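
The freshness check described by these rules can be sketched roughly as follows. This is a simplification, not how any particular browser implements it: it only looks at max-age and Expires, and the function name isFresh and the lower-case header keys are assumptions of the sketch.

```javascript
// Returns true when a cached response can still be used without contacting the server.
// headers: response headers saved with the cache entry (lower-case keys assumed)
// storedAt, now: timestamps in milliseconds
function isFresh(headers, storedAt, now) {
  const match = /max-age=(\d+)/.exec(headers["cache-control"] || "");
  if (match) {
    // Cache-Control takes priority over Expires.
    return now - storedAt < Number(match[1]) * 1000;
  }
  if (headers["expires"]) {
    return now < Date.parse(headers["expires"]);
  }
  return false; // no strong-cache information in this sketch
}

const storedAt = Date.parse("2024-01-01T00:00:00Z");
const day = 24 * 60 * 60 * 1000;
console.log(isFresh({ "cache-control": "max-age=2592000" }, storedAt, storedAt + 29 * day));
console.log(isFresh({ "cache-control": "max-age=2592000" }, storedAt, storedAt + 31 * day));
```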

When the server enables the strong cache ("forced cache") for static resources (HTML/CSS/JS/images), then within the validity period a normal page load reads them from the browser cache instead of fetching them from the server again, unless we clear the cache and refresh.

  • Advantages: fewer requests to the server, faster loading of resources, faster page rendering
  • Disadvantages: when a resource is updated on the server while a local cached copy is still valid, the client does not get the latest version in time. Solutions:
    • Do not cache HTML pages. Each time resources are published with updated content, give the resource files different names (for example, webpack adds a hash to the file name), so the files requested by the page change too
    • If the file name stays the same, append a timestamp to the end of the requested resource URL, so that the request fetches the file again instead of reading the cache
    • Do not set up the strong cache and rely on the negotiated cache instead (in real projects, the two are usually used together)
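
To illustrate the first solution, here is a minimal sketch of deriving a file name from the file content, so that a content change produces a new URL. The small FNV-1a style hash is only a stand-in chosen to keep the example dependency-free; it is not what webpack uses, and hashedName is a hypothetical helper.

```javascript
// Tiny non-cryptographic hash (FNV-1a style over UTF-16 code units), used here as a stand-in.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Inserts the content hash before the extension: app.js -> app.<hash>.js
// Assumes the file name has an extension.
function hashedName(fileName, content) {
  const dot = fileName.lastIndexOf(".");
  return `${fileName.slice(0, dot)}.${fnv1a(content)}${fileName.slice(dot)}`;
}

console.log(hashedName("app.js", "console.log('v1')"));
console.log(hashedName("app.js", "console.log('v2')")); // different content, different name
```

Because the HTML page itself is not cached, it always references the newest file name, while unchanged files keep their name and stay cached.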

Negotiated cache: Last-Modified / ETag

The negotiated cache is the process in which, after the strong cache has expired, the browser sends a request carrying a cache identifier, and the server decides from that identifier whether the cached copy can still be used.

  • Negotiation succeeds: the server returns 304 Not Modified

  • Negotiation fails: the server returns 200 and the requested resource

  • Last-Modified and If-Modified-Since
    • The first time a resource is accessed, the server returns the resource and sets Last-Modified (the time the resource was last modified on the server) in the response header; the browser caches the file together with the response headers
    • The next time the resource is requested, the browser sees that it has a Last-Modified value, so it adds an If-Modified-Since request header with that value
    • The server compares If-Modified-Since with the resource's last modification time. If the resource has not changed, it returns 304 with an empty response body and the browser reads from its cache. If If-Modified-Since is earlier than the last modification time, the file has been updated, so the server returns the new resource with 200
    • However, Last-Modified only has one-second resolution. If a file is modified more than once within the same second, the server still treats the cached copy as a hit and does not return the updated resource
  • ETag and If-None-Match
    • An ETag is a unique identifier for the current resource file, generated by the server and returned with the response. It is regenerated whenever the resource changes
    • The next time the browser requests the resource, it puts the ETag value from the previous response into the If-None-Match request header. The server only has to compare the client's If-None-Match with the ETag of its own copy to tell whether the resource has changed relative to the client's copy
    • If the ETag does not match, the server returns the new resource (including the new ETag) to the client as a regular GET 200 response. If it matches, the server returns 304 and the client uses its cached copy
  • Real-world example: a bundled static file
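
The server side of the negotiation can be sketched roughly like this. It is a simplified model rather than any real server's code: respond, the resource object shape, and the lower-case header keys are assumptions made for the example.

```javascript
// resource: { body, etag, lastModified }, where lastModified is a millisecond timestamp
// requestHeaders: request headers with lower-case names (an assumption of this sketch)
function respond(requestHeaders, resource) {
  const ifNoneMatch = requestHeaders["if-none-match"];
  const ifModifiedSince = requestHeaders["if-modified-since"];

  let notModified = false;
  if (ifNoneMatch !== undefined) {
    // When the client sends an ETag, it is checked in preference to the date.
    notModified = ifNoneMatch === resource.etag;
  } else if (ifModifiedSince !== undefined) {
    // Last-Modified has one-second resolution, so drop the milliseconds before comparing.
    const lastModifiedSeconds = Math.floor(resource.lastModified / 1000) * 1000;
    notModified = lastModifiedSeconds <= Date.parse(ifModifiedSince);
  }

  const validators = {
    etag: resource.etag,
    "last-modified": new Date(resource.lastModified).toUTCString(),
  };
  return notModified
    ? { status: 304, headers: validators, body: "" }
    : { status: 200, headers: validators, body: resource.body };
}

const page = { body: "hello", etag: '"v1"', lastModified: Date.parse("2024-01-01T00:00:00Z") };
console.log(respond({ "if-none-match": '"v1"' }, page).status);
console.log(respond({ "if-none-match": '"v0"' }, page).status);
```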

Data cache: localStorage

On each page refresh, first check whether the data exists locally and whether it is still within its validity period (which you define yourself). If it is, read the data locally. If there is no data or it has expired, send the request again and store the latest result.
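
A minimal sketch of this check, assuming a storage object with the same getItem/setItem shape as localStorage. The helper names readCache, writeCache, and createMemoryStorage are invented for the example; the in-memory storage only exists so the sketch also runs outside a browser.

```javascript
// Returns the cached data, or null when nothing is stored or the entry has expired.
function readCache(storage, key, maxAgeMs, now = Date.now()) {
  const raw = storage.getItem(key);
  if (raw === null) return null;
  const { savedAt, data } = JSON.parse(raw);
  return now - savedAt < maxAgeMs ? data : null;
}

// Stores the data together with the time it was saved.
function writeCache(storage, key, data, now = Date.now()) {
  storage.setItem(key, JSON.stringify({ savedAt: now, data }));
}

// Stand-in for window.localStorage when running outside a browser.
function createMemoryStorage() {
  const map = new Map();
  return {
    getItem: (key) => (map.has(key) ? map.get(key) : null),
    setItem: (key, value) => map.set(key, String(value)),
  };
}

const storage = createMemoryStorage();
writeCache(storage, "cityList", ["Beijing", "Shanghai"]);
console.log(readCache(storage, "cityList", 60 * 1000));
```

In a page, window.localStorage would be passed as storage; when readCache returns null, the data is requested again and saved with writeCache.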

Step 3: DNS resolution

DNS resolution: the domain name server (DNS server)

  • Once a server is deployed it has an external (public) IP address, and the server can be found through that external IP
  • An external IP is hard to remember, but a domain name is easy to remember
  • The domain name server (DNS) records which host address (external IP) each domain name corresponds to, for example www.baidu.com -> 127.0.0.1 (an illustrative address)
  • DNS resolution is the process of looking up, on the DNS server, the external IP address for the domain name in the URL the browser has parsed
    • DNS results are cached too: after resolving a name once, the browser usually keeps the result locally
    • An uncached lookup goes to the local DNS server (recursive query), which in turn queries the root, top-level domain, and authoritative DNS servers (iterative queries)

Each DNS resolution takes roughly 20 to 120 milliseconds
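
The lookup order can be modelled with a toy example. Everything here is invented for illustration: the server tables, the names, and the address (taken from a range reserved for documentation). Real resolvers are far more involved.

```javascript
// Hypothetical data held by each level of DNS server.
const ROOT = { com: "tld-com" };
const TLD = { "tld-com": { "example.com": "ns-example" } };
const AUTHORITATIVE = { "ns-example": { "www.example.com": "192.0.2.10" } };

const browserCache = new Map();
let upstreamLookups = 0;

// What the local DNS server does on a miss: ask root, then TLD, then authoritative.
function iterativeLookup(domain) {
  upstreamLookups++;
  const labels = domain.split(".");
  const tldServer = ROOT[labels[labels.length - 1]];
  const nameServer = TLD[tldServer][labels.slice(-2).join(".")];
  return AUTHORITATIVE[nameServer][domain];
}

// The browser answers from its own cache when it can.
function resolve(domain) {
  if (!browserCache.has(domain)) {
    browserCache.set(domain, iterativeLookup(domain));
  }
  return browserCache.get(domain);
}

console.log(resolve("www.example.com")); // full lookup
console.log(resolve("www.example.com")); // served from the cache
console.log(upstreamLookups);
```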

Optimization

Reduce the number of DNS requests

Use fewer distinct domain names in the page. In real projects, however, resources are usually spread across different servers as much as possible, for the following reason.

Deploying resources on separate servers has real benefits: server resources are used more sensibly and HTTP concurrency improves. Browser concurrency is limited per origin, with at most about 4 to 7 simultaneous connections to the same origin. When resources are deployed separately there are more origins, so more HTTP requests can run concurrently. This is why large companies generally deploy resources on separate servers, especially rich media such as images.

DNS Prefetch

When the number of DNS lookups cannot be reduced, we can resolve domain names ahead of time, so that the result is already cached when a request needs it.

  • Processing method
<link rel="dns-prefetch" href="//g.alicdn.com"/>
  • Tmall source code

Step 4: TCP three-way handshake

  • Sequence number (seq): identifies the byte stream sent from the TCP source to the TCP destination; the sender sets it when sending data
  • Acknowledgment number (ack): valid only when the ACK flag is 1; ack = seq + 1
  • Flags: there are six flag bits, URG, ACK, PSH, RST, SYN, and FIN, defined as follows
    • URG: the urgent pointer is valid
    • ACK: the acknowledgment number is valid
    • PSH: the receiver should hand the packet to the application layer as soon as possible
    • RST: resets the connection
    • SYN: initiates a new connection
    • FIN: releases a connection
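
As a sketch of how these fields fit together, the code below decodes the flag bits of a segment and lists the three segments of a handshake. The bit values follow the TCP header layout; the initial sequence numbers 100 and 300 are arbitrary example values, and the helper names are made up.

```javascript
// Bit positions of the six flags in the TCP header.
const FLAGS = { FIN: 0x01, SYN: 0x02, RST: 0x04, PSH: 0x08, ACK: 0x10, URG: 0x20 };

// Lists the names of the flags set in a flag byte.
function flagNames(bits) {
  return Object.keys(FLAGS).filter((name) => (bits & FLAGS[name]) !== 0);
}

// The three segments of the handshake; each ack is the peer's seq + 1.
function handshake(clientSeq, serverSeq) {
  return [
    { from: "client", flags: FLAGS.SYN, seq: clientSeq },
    { from: "server", flags: FLAGS.SYN | FLAGS.ACK, seq: serverSeq, ack: clientSeq + 1 },
    { from: "client", flags: FLAGS.ACK, seq: clientSeq + 1, ack: serverSeq + 1 },
  ];
}

for (const segment of handshake(100, 300)) {
  console.log(segment.from, flagNames(segment.flags).join("+"), segment);
}
```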

TCP vs UDP: TCP establishes and tears down the connection between client and server with the three-way handshake and the four-way wave, which keeps communication reliable. UDP skips these steps: it is connectionless and sends data immediately, which is faster but less reliable (it is used, for example, in real-time communication).

Step 5: Data transfer

HTTP message information

Request message (HTTP Request)

  • Start line: method, URL, and protocol version, for example POST / HTTP/1.1

  • Headers, for example Authorization, If-Modified-Since
  • Body, for example the data passed by a POST request

Response message (HTTP Response)

  • Start line (status line): protocol version and status code, for example HTTP/1.1 404 Not Found

  • Headers, for example the server time (Date)
  • Body: most of the information the client needs is here
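
The three parts of a message can be seen by splitting a raw HTTP/1.1 message by hand. This is a simplified sketch (parseHttpMessage is a made-up helper, and real parsers handle many more cases); the sample messages are invented.

```javascript
// Splits a raw HTTP/1.1 message into start line, headers, and body.
function parseHttpMessage(raw) {
  // A blank line separates the head from the body.
  const [head, ...rest] = raw.split("\r\n\r\n");
  const [startLine, ...headerLines] = head.split("\r\n");
  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(":");
    headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
  }
  return { startLine, headers, body: rest.join("\r\n\r\n") };
}

const request =
  "POST / HTTP/1.1\r\n" +
  "Host: example.com\r\n" +
  "Content-Type: application/json\r\n" +
  "\r\n" +
  '{"name":"test"}';

const response =
  "HTTP/1.1 404 Not Found\r\n" +
  "Date: Mon, 01 Jan 2024 00:00:00 GMT\r\n" +
  "\r\n" +
  "Not Found";

console.log(parseHttpMessage(request));
console.log(parseHttpMessage(response));
```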

HTTP messages | MDN

Response status code

Used to indicate whether an HTTP request completed successfully. Responses are divided into five types: informational response, success response, redirection, client error, and server error.
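
The five classes correspond to the first digit of the status code, which a short sketch can show. The function name statusClass is made up for the example.

```javascript
// Maps a status code to one of the five response classes by its first digit.
function statusClass(code) {
  const classes = {
    1: "informational response",
    2: "success response",
    3: "redirection",
    4: "client error",
    5: "server error",
  };
  return classes[Math.floor(code / 100)] || "unknown";
}

for (const code of [200, 304, 404, 500]) {
  console.log(code, statusClass(code));
}
```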

HTTP response status codes | MDN

concurrency

Step 6: TCP four-way wave

With Connection: keep-alive, once the communication channel has been established (TCP three-way handshake), neither the server nor the client actively closes it. The next request can then be sent without another three-way handshake, which saves network time.

HTTP/1.0 does not use persistent connections by default; Connection: keep-alive has to be set manually. HTTP/1.1 uses persistent connections (Connection: keep-alive) by default. A connection is only reused for requests to the same origin; requests to a different origin need their own connection.
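
The default behaviour of the two versions can be written as one small decision function. This is only a sketch of the rule, with a made-up name (isPersistent) and lower-case header keys assumed.

```javascript
// Decides whether the connection stays open after the response.
// httpVersion: "1.0" or "1.1"; headers: lower-case header names (assumed)
function isPersistent(httpVersion, headers) {
  const connection = (headers["connection"] || "").toLowerCase();
  if (httpVersion === "1.0") {
    // HTTP/1.0: closed by default, kept open only on request.
    return connection === "keep-alive";
  }
  // HTTP/1.1: kept open by default, closed only on request.
  return connection !== "close";
}

console.log(isPersistent("1.0", {}));
console.log(isPersistent("1.0", { connection: "keep-alive" }));
console.log(isPersistent("1.1", {}));
console.log(isPersistent("1.1", { connection: "close" }));
```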

Step 7: Render the page

Some differences between HTTP/1.0 and HTTP/1.1

  • Cache handling: HTTP/1.0 mainly uses the If-Modified-Since and Expires headers to control caching. HTTP/1.1 introduces more cache control strategies, such as entity tags (ETag), If-Unmodified-Since, If-Match, and If-None-Match.
  • Bandwidth optimization and use of network connections: HTTP/1.1 introduces the Range request header, which allows a client to request only part of a resource. The status code of such a response is 206 (Partial Content), which helps developers make full use of bandwidth and connections.
  • Error notification management: HTTP/1.1 adds 24 error status codes. For example, 409 (Conflict) indicates that the request conflicts with the current state of the resource, and 410 (Gone) indicates that a resource on the server has been permanently deleted.
  • Host header handling: HTTP/1.0 assumes that each server is bound to a unique IP address, so the URL in the request message does not carry a hostname. With the development of virtual hosting, however, one physical server can host multiple virtual hosts (multi-homed web servers) that share the same IP address. HTTP/1.1 request and response messages should support the Host header field, and a request message without a Host header is rejected with an error (400 Bad Request).
  • Persistent connections: HTTP/1.1 supports persistent connections and pipelining, so multiple HTTP requests and responses can be sent over one TCP connection, reducing the cost and latency of establishing and closing connections. Connection: keep-alive is enabled by default in HTTP/1.1, which makes up for HTTP/1.0 having to create a connection for every request.
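
The Range mechanism mentioned above can be sketched as a small server-side function. It is a simplification that handles only a single bytes=start-end range and treats the content as an ASCII string (one character per byte); serveRange is a made-up name.

```javascript
// Serves either the whole content (200) or the requested byte range (206).
function serveRange(rangeHeader, content) {
  const match = /^bytes=(\d+)-(\d*)$/.exec(rangeHeader || "");
  if (!match) {
    return { status: 200, headers: {}, body: content };
  }
  const start = Number(match[1]);
  const last = content.length - 1;
  const end = match[2] === "" ? last : Math.min(Number(match[2]), last);
  if (start > end) {
    // The requested range lies outside the content.
    return { status: 416, headers: {}, body: "" };
  }
  return {
    status: 206,
    headers: { "content-range": `bytes ${start}-${end}/${content.length}` },
    body: content.slice(start, end + 1),
  };
}

console.log(serveRange("bytes=0-4", "hello world"));
console.log(serveRange("bytes=6-", "hello world"));
console.log(serveRange(undefined, "hello world").status);
```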

New features of HTTP/2 compared to HTTP/1.x

  • New binary format: HTTP/1.x parsing is text-based, and parsing a text protocol has inherent weaknesses. Text can be expressed in many forms, so many scenarios have to be considered to make parsing robust. Binary is different: it only deals with combinations of 0 and 1. For this reason HTTP/2 uses a binary format, which is both convenient and robust to parse.
  • Multiplexing (connection sharing): every request is assigned an ID, so one connection can carry multiple requests. The requests on a connection can be interleaved arbitrarily, and the receiver sorts them back into separate requests by their request ID.
  • Header compression: HTTP/2 uses an encoder to reduce the size of the headers that have to be transferred. Both sides keep a table of header fields, which avoids sending the same headers repeatedly and reduces the amount of data transferred.
  • Server push: for example, a page uses style.css. When the client receives the style.css data, the server also pushes the style.js file to the client.

What is the difference between multiplexing in HTTP/2 and persistent connection reuse in HTTP/1.x?

  • HTTP/1.0: one request and one response per connection. The connection is established, used, and then closed, so every request needs a new connection.
  • HTTP/1.1: several requests are queued and processed serially on one connection. A later request can only run after the previous one has returned. If one request times out, all requests behind it are blocked with no way around it. This is what is usually called head-of-line blocking.
  • HTTP/2: multiple requests can run in parallel on one connection at the same time. One slow request does not affect the normal execution of the other requests.
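
The idea behind multiplexing can be modelled in a few lines: each message is cut into frames tagged with a stream ID, frames of different streams share the connection, and the receiver regroups them by ID. This is a toy model with made-up helper names, not the HTTP/2 wire format.

```javascript
// Cuts a payload into frames tagged with the stream (request) ID.
function toFrames(streamId, payload, frameSize) {
  const frames = [];
  for (let i = 0; i < payload.length; i += frameSize) {
    frames.push({ streamId, data: payload.slice(i, i + frameSize) });
  }
  return frames;
}

// Sends frames of all streams over one connection, taking turns (round-robin).
function interleave(streams) {
  const queues = streams.map((frames) => [...frames]);
  const wire = [];
  while (queues.some((queue) => queue.length > 0)) {
    for (const queue of queues) {
      if (queue.length > 0) wire.push(queue.shift());
    }
  }
  return wire;
}

// The receiver regroups the frames by stream ID.
function reassemble(wire) {
  const messages = {};
  for (const { streamId, data } of wire) {
    messages[streamId] = (messages[streamId] || "") + data;
  }
  return messages;
}

const wire = interleave([toFrames(1, "a long response body", 5), toFrames(3, "short", 5)]);
console.log(wire.map((frame) => frame.streamId));
console.log(reassemble(wire));
```

The short message on stream 3 is complete after the second frame on the wire, even though stream 1 is still being sent, which is the behaviour the last bullet describes.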

Summary of Performance Optimization

Using the cache

  • Use the strong cache and the negotiated cache for static resource files (extension: when files are updated, how do we make sure clients refresh in time?)
  • Cache API data that is not updated frequently in local storage (extension: differences between cookies, localStorage, and Vuex/Redux)

DNS optimization

  • Deploy resources on separate servers to increase HTTP concurrency (at the cost of slower DNS resolution)
  • DNS Prefetch

TCP three-way handshake and four-way wave

  • Connection: keep-alive

Data transfer

  • Reduce data transfer size
    • Compress content or data (webpack, etc.)
    • Enable GZIP compression on the server (typically reduces size by about 60%)
    • Request large data sets in batches (for example, pull-to-refresh or pagination, so that the first load fetches less data)
  • Reduce the number of HTTP requests
    • Merge resource files
    • Icon fonts
    • CSS sprites (sprite images)
    • Base64 images (images converted to Base64)
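
As a sketch of the Base64 technique, a small image can be turned into a data URI and embedded directly in HTML or CSS, so no separate request is needed. The SVG icon and the helper name toDataUri are made up for the example.

```javascript
// Builds a data URI from text content.
// btoa only accepts Latin-1 strings; binary images would have to be read as bytes first.
function toDataUri(mimeType, text) {
  return `data:${mimeType};base64,${btoa(text)}`;
}

const icon =
  '<svg xmlns="http://www.w3.org/2000/svg" width="8" height="8"><rect width="8" height="8"/></svg>';
const uri = toDataUri("image/svg+xml", icon);

console.log(uri);
console.log(`<img src="${uri}" alt="icon">`);
```

Base64 output is about one third larger than the original bytes, so this is usually only worth it for small images.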

CDN servers (geographically distributed)

Use HTTP/2

Network optimization is the focus of front-end performance optimization, because most of the time is spent at the network layer, especially on the first page load. Reducing the wait, that is, shortening the white screen, is what matters. Ways to make the wait friendlier for the user: loading indicators, skeleton screens (client-side and server-side), and lazy loading of images.