1. URL parsing

  • Protocol (the transport protocol the client and server use to exchange information):
    • HTTP: Hypertext Transfer Protocol, default port 80; transfers text as well as other content (rich text, images, byte streams) between client and server
    • HTTPS: HTTP over SSL/TLS encryption for security, default port 443
    • FTP: File Transfer Protocol, used to upload and download files, for example uploading local code to a server
  • Domain name:
    • Top-level domain: com
    • Second-level domain: qq.com
    • Third-level domain: www.qq.com, sports.qq.com
    • Fourth-level domain: kbs.sports.qq.com
  • Port number:
    • Distinguishes the different services running on one server; each service (project) has its own port
    • The port number ranges from 0 to 65535
  • Encoding:
    • URL: www.xxx.com?from=http://www.qq.com…
    • This URL contains two HTTP addresses, so a parser could treat the second one as part of the URL structure even though it is only a parameter value; the parameter value therefore needs to be encoded
    • encodeURI/decodeURI: encode/decode a whole URL, escaping characters such as spaces and Chinese characters while leaving the characters that structure the URL (: / ? & = #) untouched
    • encodeURIComponent/decodeURIComponent: encode/decode a single piece of a URL, such as a parameter value passed after the question mark; the structural characters are escaped as well
    let str = `http://www.xxx.com?from=${encodeURIComponent('http://www.qq.com/?lx=xx')}`
    // "http://www.xxx.com?from=http%3A%2F%2Fwww.qq.com%2F%3Flx%3Dxx"
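As a rough sketch of the parts listed above, the standard URL class can split an address into protocol, host name, port, and parameters. The host www.xxx.com is the placeholder used above; the port 8080 and the path /index.html are made up for illustration.

```javascript
// Hypothetical address: placeholder host from the notes, made-up port and path.
const target = "http://www.qq.com/?lx=xx";
const url = new URL(
  `https://www.xxx.com:8080/index.html?from=${encodeURIComponent(target)}`
);

console.log(url.protocol); // "https:"
console.log(url.hostname); // "www.xxx.com"
console.log(url.port);     // "8080"
console.log(url.search);   // "?from=http%3A%2F%2Fwww.qq.com%2F%3Flx%3Dxx"

// URLSearchParams decodes the value again when it is read back.
const from = url.searchParams.get("from"); // "http://www.qq.com/?lx=xx"

// encodeURI keeps the characters that structure a URL and only escapes
// the others, such as the space in this made-up path.
const whole = encodeURI("http://www.xxx.com/a b?x=1"); // "http://www.xxx.com/a%20b?x=1"
console.log(from, whole);
```

Because the nested address is escaped with encodeURIComponent, its own "?" and "=" cannot be confused with the outer URL's query string.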

2. Check the cache

  • The first time a page is loaded there is no cache: resources are fetched from the server and then cached on the client

  • The second time the page is loaded, the browser first checks the local cache; if a cached copy exists and has not expired, it is read from the local cache instead of being fetched from the server

  • Cache locations

    • Memory Cache: cached in memory
    • Disk Cache: cached on disk
  • Opening the page: check the disk cache for a match; if there is one, use it, otherwise send a network request

  • Normal refresh (F5): the page has not been closed, so the memory cache is still available and is used first, followed by the disk cache

  • Forced refresh (for example Ctrl+F5): the browser cache is skipped; the request is sent with Cache-Control: no-cache in its request headers and the server returns fresh content
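The lookup order described above can be modelled roughly as follows. This is only a toy sketch: the two Map objects, the mode names "open", "refresh", and "force", and the fakeServer function are all invented for illustration, and expiry checks are left out.

```javascript
// Hypothetical in-memory stand-ins for the browser's two cache locations.
const memoryCache = new Map();
const diskCache = new Map();

// mode: "open" (page opened), "refresh" (F5), or "force" (forced refresh).
// fetchFromServer is a caller-supplied function standing in for the network.
function loadResource(url, mode, fetchFromServer) {
  // A normal refresh looks in the memory cache first.
  if (mode === "refresh" && memoryCache.has(url)) {
    return { source: "memory", body: memoryCache.get(url) };
  }
  // Opening or refreshing the page may fall back to the disk cache.
  if (mode !== "force" && diskCache.has(url)) {
    const body = diskCache.get(url);
    memoryCache.set(url, body);
    return { source: "disk", body };
  }
  // A forced refresh, or a miss in both caches, goes to the server.
  const body = fetchFromServer(url);
  memoryCache.set(url, body);
  diskCache.set(url, body);
  return { source: "network", body };
}

const fakeServer = (url) => `content of ${url}`;
const firstVisit = loadResource("/app.js", "open", fakeServer);   // network
memoryCache.clear(); // simulate closing the page
const reopened = loadResource("/app.js", "open", fakeServer);     // disk
const refreshed = loadResource("/app.js", "refresh", fakeServer); // memory
const forced = loadResource("/app.js", "force", fakeServer);      // network
console.log(firstVisit.source, reopened.source, refreshed.source, forced.source);
```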

2.1 Strong cache: Expires / Cache-Control

  • How the browser handles the strong cache is determined by the response headers the server sets when the resource is first requested
    • Expires: the date and time at which the cached resource expires (HTTP/1.0)
    • Cache-Control: how long the resource may be cached (HTTP/1.1); for example, Cache-Control: max-age=2592000 means the cached copy is used for the 2592000 seconds (30 days) following the first fetch
    • Cache-Control takes precedence over Expires when both are present
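A simplified version of the freshness check these two headers imply might look like the sketch below. The function name isFresh and the sample dates are made up, and real browsers also honour directives such as no-cache and no-store, which are ignored here.

```javascript
// Returns true when a cached response may be reused without contacting the server.
// headers: response headers with lower-case names; fetchedAt and now: ms timestamps.
function isFresh(headers, fetchedAt, now) {
  const match = /max-age=(\d+)/.exec(headers["cache-control"] || "");
  if (match) {
    // Cache-Control: max-age wins over Expires when both are present.
    return now - fetchedAt < Number(match[1]) * 1000;
  }
  if (headers["expires"]) {
    return now < Date.parse(headers["expires"]);
  }
  return false; // no strong-cache information at all
}

const fetchedAt = Date.parse("2024-01-01T00:00:00Z");
const headers = {
  "cache-control": "max-age=2592000",         // 30 days
  "expires": "Mon, 01 Jan 2024 00:00:10 GMT", // 10 seconds after the fetch
};
const day = 24 * 3600 * 1000;

const afterTenDays = isFresh(headers, fetchedAt, fetchedAt + 10 * day);   // true
const afterFortyDays = isFresh(headers, fetchedAt, fetchedAt + 40 * day); // false
const expiresOnly = isFresh({ expires: headers.expires }, fetchedAt, fetchedAt + 10 * day); // false
console.log(afterTenDays, afterFortyDays, expiresOnly);
```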

  • If there is no local cache, the resource is fetched from the server; if there is a valid local cache, the server is not involved at all (the request never reaches it). HTML pages therefore generally do not set a strong cache: otherwise, after the server updates a file, the client would keep using the locally cached page and would not see the update in time
  • CSS, JS, and images generally do set a strong cache
    • When these files change on the server, the HTML page, which is not strongly cached, is refreshed first. Every time a CSS or JS file is updated, either append a timestamp to the requested CSS/JS URL, or have webpack emit the updated resource under a different file name by putting a hash value in the name
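The hashed file name idea can be illustrated with a few lines of plain JavaScript. This is not how webpack computes its hashes: the FNV-1a function below is just a small stand-in, and the names contentHash and hashedName are invented for the example.

```javascript
// FNV-1a, a small non-cryptographic hash, used here only as a stand-in
// for the content hash a bundler would compute.
function contentHash(content) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    hash ^= content.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Turns "app.js" plus its content into "app.<hash>.js".
function hashedName(fileName, content) {
  const dot = fileName.lastIndexOf(".");
  return `${fileName.slice(0, dot)}.${contentHash(content)}${fileName.slice(dot)}`;
}

const v1 = hashedName("app.js", "console.log('v1')");
const v2 = hashedName("app.js", "console.log('v2')");
const v1Again = hashedName("app.js", "console.log('v1')");
console.log(v1, v2, v1 === v1Again);
```

Unchanged content keeps its name and stays in the strong cache, while changed content gets a new name that the refreshed HTML page points to.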

2.2 Negotiated cache: Last-Modified (HTTP/1.0) / ETag (HTTP/1.1)

  • The negotiated cache is used only when the strong cache does not exist or has expired

  • With the negotiated cache, once the strong cache has expired the browser sends a request carrying the cache identifier, and the server decides from that identifier whether the cached copy may still be used

  • Last-Modified: the time the resource file was last modified on the server

  • ETag: an identifier the server generates for the current version of the resource (for example from its modification time or content)

  • If-Modified-Since: request header carrying the Last-Modified value the server returned last time

  • If-None-Match: request header carrying the ETag value the server returned last time

  • The server compares If-Modified-Since / If-None-Match with the resource's latest modification time / identifier. If they match, it returns 304 and the browser reads from its cache; if not, it returns 200 with the latest resource and new identifiers

  • In short: with the strong cache, the server is never contacted as long as the cached copy is valid. With the negotiated cache, the browser asks the server whether the resource has changed: if not, the server returns 304 and the cache is read; if so, it returns the new content and new identifiers

  • HTML pages can use the negotiated cache
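The server-side decision described above can be sketched as a small function. The name respond and the sample resource are hypothetical, and the comparison is simplified (exact ETag match only, no lists of ETags and no weak validators).

```javascript
// resource: { body, etag, lastModified } as currently stored on the server.
// requestHeaders: request headers with lower-case names.
function respond(resource, requestHeaders) {
  const ifNoneMatch = requestHeaders["if-none-match"];
  const ifModifiedSince = requestHeaders["if-modified-since"];
  const validators = { ETag: resource.etag, "Last-Modified": resource.lastModified };

  let unchanged = false;
  if (ifNoneMatch !== undefined) {
    // When the client sends an ETag, the ETag decides the outcome.
    unchanged = ifNoneMatch === resource.etag;
  } else if (ifModifiedSince !== undefined) {
    unchanged = Date.parse(resource.lastModified) <= Date.parse(ifModifiedSince);
  }

  return unchanged
    ? { status: 304, headers: validators, body: null } // client reads its cache
    : { status: 200, headers: validators, body: resource.body };
}

const resource = {
  body: "<html>v2</html>",
  etag: '"v2"',
  lastModified: "Tue, 02 Jan 2024 00:00:00 GMT",
};
const stale = respond(resource, { "if-none-match": '"v1"' });   // status 200
const current = respond(resource, { "if-none-match": '"v2"' }); // status 304
const byDate = respond(resource, {
  "if-modified-since": "Tue, 02 Jan 2024 00:00:00 GMT",
}); // status 304
console.log(stale.status, current.status, byDate.status);
```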

2.3 Data Caching

3. DNS resolution

  • After purchasing a domain name, you set up domain name resolution, registering the domain name and the server's public IP address with a DNS server

  • On the first request there is no local DNS cache, so a complete DNS resolution is required, which takes roughly 20~120 ms
  • From the second request on, the previously cached DNS record is generally used directly
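The effect of the local DNS cache can be imitated with a small wrapper. Everything here is a stand-in: createCachedResolver is an invented helper, the resolver function replaces a real DNS query, and 203.0.113.10 is a documentation-only address, not a real record for the domain.

```javascript
// resolve: caller-supplied function (hostname) => IP address, standing in for
// a complete DNS resolution. ttlMs: how long a cached record stays valid.
function createCachedResolver(resolve, ttlMs) {
  const cache = new Map(); // hostname -> { address, expiresAt }
  return function lookup(hostname, now = Date.now()) {
    const hit = cache.get(hostname);
    if (hit && hit.expiresAt > now) {
      return { address: hit.address, fromCache: true };
    }
    const address = resolve(hostname);
    cache.set(hostname, { address, expiresAt: now + ttlMs });
    return { address, fromCache: false };
  };
}

let fullResolutions = 0;
const lookup = createCachedResolver((hostname) => {
  fullResolutions += 1;
  return "203.0.113.10"; // documentation-only address
}, 60000);

const firstLookup = lookup("www.qq.com", 0);     // complete resolution
const secondLookup = lookup("www.qq.com", 1000); // served from the cache
const afterExpiry = lookup("www.qq.com", 61000); // record expired, resolved again
console.log(firstLookup.fromCache, secondLookup.fromCache, afterExpiry.fromCache);
```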

3.1 DNS Resolution Optimization

3.1.1 Reducing the Number of DNS Requests
  • Reduce the number of DNS requests: ideally a project would access a single server and avoid spreading requests over many servers and domain names. In practice, for better overall performance, a project usually runs on many servers that form a server cluster

  • Advantages of splitting servers: resources are used sensibly, the system withstands more load, and HTTP concurrency increases (browsers allow about 6 to 7 concurrent HTTP connections per origin)

    • Side effect: requests between the different domains become cross-origin

3.1.2 DNS prefetch
  • DNS prefetch takes advantage of the browser's multithreading: domain names are resolved in advance, in parallel with page rendering
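In a page, DNS prefetch is usually requested with link tags whose rel attribute is dns-prefetch. The helper below is a small sketch that generates such tags; dnsPrefetchTags and the host names are made up for illustration.

```javascript
// Builds one <link rel="dns-prefetch"> tag per distinct host name.
function dnsPrefetchTags(hostnames) {
  const unique = [...new Set(hostnames)];
  return unique.map((host) => `<link rel="dns-prefetch" href="//${host}">`);
}

// Hypothetical host names for static assets and images.
const tags = dnsPrefetchTags(["static.xxx.com", "img.xxx.com", "static.xxx.com"]);
console.log(tags.join("\n"));
// <link rel="dns-prefetch" href="//static.xxx.com">
// <link rel="dns-prefetch" href="//img.xxx.com">
```

Placing these tags in the page head lets the browser resolve the other domains before the first resource from them is requested.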

4. TCP three-way handshake

  • seq (sequence number): identifies the byte stream sent from the TCP source to the TCP destination; the sender marks its data with it
  • ack (acknowledgment number): valid only when the ACK flag bit is 1; ack = received seq + 1
  • Flag bits
    • ACK: the acknowledgment number is valid
    • RST: resets the connection
    • SYN: initiates a new connection
    • FIN: releases a connection
    • …

    1. Client -> server: SYN=1 requests a new connection, with sequence number seq=x
    2. Server -> client: SYN=1 confirms the new connection and ACK=1 marks the acknowledgment number as valid; the server sends its own sequence number seq=y and the acknowledgment number ack=x+1
    3. Client -> server: the client sees ACK=1 and ack=x+1, then replies with ACK=1 and ack=y+1

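  • Why three handshakes rather than two or four: with only two, the server could not confirm that the client received its sequence number; a fourth is unnecessary because the server's SYN and ACK travel in one segment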

  • As a reliable transmission control protocol, the core idea of TCP is to guarantee reliable data transmission while keeping transmission efficient

  • UDP is faster but unreliable, so it is used more for messaging and video streams; TCP is slower but reliable
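The three handshake segments described in this section can be written out as plain objects to make the seq/ack arithmetic concrete. This is only a model of the numbers, not real networking code; threeWayHandshake and the initial sequence numbers 100 and 300 are made up.

```javascript
// x and y are the initial sequence numbers picked by the client and the server.
function threeWayHandshake(x, y) {
  // 1. Client asks for a new connection.
  const syn = { from: "client", SYN: 1, ACK: 0, seq: x };
  // 2. Server accepts and acknowledges the client's sequence number.
  const synAck = { from: "server", SYN: 1, ACK: 1, seq: y, ack: syn.seq + 1 };
  // 3. Client acknowledges the server's sequence number.
  const ack = { from: "client", SYN: 0, ACK: 1, seq: x + 1, ack: synAck.seq + 1 };
  return [syn, synAck, ack];
}

const segments = threeWayHandshake(100, 300);
console.log(segments);
// The second segment carries ack 101, the third carries seq 101 and ack 301.
```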

5. Data transmission

  • HTTP messages
    • Request message
    • Response message
  • Response status codes
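To make the shape of the two message types concrete, here is a minimal sketch that builds a request message and parses a response message by hand. The helpers buildRequest and parseResponse are invented for the example, and the parser assumes a well-formed message; real clients handle many more cases.

```javascript
// Builds the text of an HTTP/1.1 request message:
// request line, header lines, an empty line, then the body.
function buildRequest(method, path, headers, body = "") {
  const headerLines = Object.entries(headers).map(([name, value]) => `${name}: ${value}`);
  return [`${method} ${path} HTTP/1.1`, ...headerLines, "", body].join("\r\n");
}

// Parses the text of a well-formed HTTP response message:
// status line, header lines, an empty line, then the body.
function parseResponse(raw) {
  const separator = raw.indexOf("\r\n\r\n");
  const [statusLine, ...headerLines] = raw.slice(0, separator).split("\r\n");
  const [version, code, ...reason] = statusLine.split(" ");
  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(":");
    headers[line.slice(0, colon).toLowerCase()] = line.slice(colon + 1).trim();
  }
  return {
    version,
    status: Number(code),
    reason: reason.join(" "),
    headers,
    body: raw.slice(separator + 4),
  };
}

const request = buildRequest("GET", "/index.html", { Host: "www.xxx.com" });
const response = parseResponse('HTTP/1.1 304 Not Modified\r\nETag: "abc"\r\n\r\n');
console.log(JSON.stringify(request));
console.log(response.status, response.reason); // 304 Not Modified
```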

6. TCP four-way wave

  • Disconnecting: the FIN flag releases the connection

  • Why does connecting take three handshakes while closing takes four waves?

    • When connecting, the server can reply to the client's SYN with a single SYN+ACK segment
    • When closing, the server may not be able to close the connection as soon as it receives the client's FIN, so it first replies with only an ACK ("I have received your FIN"). It sends its own FIN only after it has finished sending all of its remaining data. The ACK and the FIN therefore cannot be sent together, which is why four waves are needed
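The reason the ACK and the FIN are separate can be seen in a small model of the closing sequence. As before, this only models the order of segments; fourWayClose, the sequence numbers, and the pending data are hypothetical, and sequence numbers of the data segments are left out.

```javascript
// u: the client's current sequence number.
// v: the sequence number the server uses for its FIN.
// pendingSegments: data the server still has to send before it can close.
function fourWayClose(u, v, pendingSegments) {
  const log = [];
  // 1. Client asks to release the connection.
  log.push({ from: "client", FIN: 1, seq: u });
  // 2. Server acknowledges the FIN right away...
  log.push({ from: "server", ACK: 1, ack: u + 1 });
  // ...but first finishes sending whatever data is still pending.
  for (const data of pendingSegments) {
    log.push({ from: "server", data });
  }
  // 3. Only then does the server send its own FIN.
  log.push({ from: "server", FIN: 1, ACK: 1, seq: v, ack: u + 1 });
  // 4. Client acknowledges the server's FIN.
  log.push({ from: "client", ACK: 1, ack: v + 1 });
  return log;
}

const closing = fourWayClose(500, 800, ["last chunk"]);
console.log(closing);
```

With an empty pendingSegments list the log still has four control segments, which matches the four waves.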
  • In the HTTP/1.0 era, a new TCP connection was established for every request

  • In the HTTP/1.1 era, connections are persistent (long connections): the connection stays open and later requests reuse the established channel, Connection: keep-alive
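The difference between the two versions comes down to a default: HTTP/1.1 keeps the connection open unless told otherwise, while HTTP/1.0 closes it unless the client opts in. A simplified decision function, with the invented name isPersistent, might look like this.

```javascript
// Decides whether the TCP connection stays open after the response.
// httpVersion: "1.0" or "1.1"; headers: request headers with lower-case names.
function isPersistent(httpVersion, headers) {
  const connection = (headers["connection"] || "").toLowerCase();
  if (connection.includes("close")) return false; // explicit request to close
  if (httpVersion === "1.1") return true;         // persistent by default
  return connection.includes("keep-alive");       // HTTP/1.0 must opt in
}

const http10Default = isPersistent("1.0", {});                         // false
const http10KeepAlive = isPersistent("1.0", { connection: "keep-alive" }); // true
const http11Default = isPersistent("1.1", {});                         // true
const http11Close = isPersistent("1.1", { connection: "close" });      // false
console.log(http10Default, http10KeepAlive, http11Default, http11Close);
```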

Also HTTP1.0 VS HTTP1.1 VS HTTP2.0