HTTP

DNS

In the Domain Name System (DNS), the process of converting a domain name into an IP address is called DNS resolution.

Core system architecture

  1. Root DNS server: manages the IP addresses of the top-level domain (TLD) DNS servers, such as those for com, net, and cn
  2. Top-level domain DNS server: manages the authoritative DNS servers of the domains registered under it. For example, the com TLD server can return the IP address of the authoritative DNS server for baidu.com
  3. Authoritative DNS server: manages the IP addresses of the hosts under its own domain. For example, the authoritative DNS server for baidu.com can return the IP address of www.baidu.com

DNS query

  • The query goes first to the root DNS server, then to the top-level domain DNS server, then to the authoritative DNS server
  • The domain name is resolved from right to left: for www.baidu.com, com is looked up first, then baidu.com, then www.baidu.com
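
The right-to-left walk can be sketched with a few dictionaries standing in for the three kinds of server. This is only an illustration under stated assumptions: the server names (tld-com, ns-baidu) and the address 192.0.2.10 are made up, and a real resolver speaks the DNS protocol rather than reading dictionaries.

```python
# Toy data standing in for real DNS servers (all names and addresses are made up).
ROOT = {"com": "tld-com"}                      # root server knows the TLD servers
TLD = {"tld-com": {"baidu.com": "ns-baidu"}}   # TLD server knows authoritative servers
AUTHORITATIVE = {"ns-baidu": {"www.baidu.com": "192.0.2.10"}}  # knows its own hosts


def resolve(hostname: str) -> str:
    """Walk root -> TLD -> authoritative, reading the name from right to left."""
    labels = hostname.split(".")        # ["www", "baidu", "com"]
    tld = labels[-1]                    # rightmost label is looked up first
    domain = ".".join(labels[-2:])      # then the registered domain, "baidu.com"
    tld_server = ROOT[tld]              # ask the root server
    auth_server = TLD[tld_server][domain]        # ask the TLD server
    return AUTHORITATIVE[auth_server][hostname]  # ask the authoritative server


print(resolve("www.baidu.com"))
```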

DNS cache

  • Companies and carriers (ISPs) run their own DNS servers, which cache DNS results
  • The operating system caches DNS results
  • The operating system also has a special host mapping file, known as the hosts file
  • Browsers have their own DNS cache as well

Complete DNS query

  1. Check your browser’s DNS cache
  2. Check the DNS cache of the operating system
  3. Check the hosts file
  4. Check the corporate DNS cache
  5. Check the carrier DNS cache
  6. If every cache misses, the query starts at the root DNS server and works down through the top-level domain server to the authoritative server
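
The lookup order above amounts to trying each cache in turn and only falling back to a full query when all of them miss. A minimal sketch, assuming each layer can be modeled as a plain dictionary; the function name lookup, the layer contents, and the addresses are hypothetical.

```python
def lookup(hostname, layers, resolve_from_root):
    """Return (ip, source). `layers` is an ordered list of (name, cache dict)."""
    for name, cache in layers:
        if hostname in cache:
            return cache[hostname], name      # first layer that knows the name wins
    # Every cache missed: fall back to the full root -> TLD -> authoritative query.
    return resolve_from_root(hostname), "root query"


layers = [
    ("browser cache", {}),
    ("operating system cache", {}),
    ("hosts file", {"dev.local": "127.0.0.1"}),
    ("corporate DNS cache", {}),
    ("carrier DNS cache", {"www.baidu.com": "192.0.2.10"}),
]

print(lookup("www.baidu.com", layers, lambda host: "198.51.100.7"))
print(lookup("example.org", layers, lambda host: "198.51.100.7"))
```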

Request methods

  • GET: retrieves a resource
  • POST: submits data to the server in the entity body
  • PUT: transfers a file. As with an upload over FTP, the file content is carried in the body of the request
  • HEAD: like GET, but the response contains only the headers (for example, the modification time) and not the resource content
  • DELETE: deletes a file; the opposite of PUT
  • OPTIONS: asks which request methods the server supports

Status code

1xx

  • 101 Switching Protocols: the server is switching protocols, for example when upgrading to WebSocket

2xx

  • 200 OK: the request succeeded and the data is returned normally
  • 204 No Content: the request succeeded, but the response has no body. PUT and DELETE requests are commonly answered with 204
  • 206 Partial Content: the response to a range (chunked download) request; only part of the resource is returned

Status code 206 is usually accompanied by the Content-Range header field, which gives the byte range that the response body covers so that the client can confirm it. For example, Content-Range: bytes 0-999/2000 means that the first 1000 bytes of a 2000-byte resource were returned.
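
The arithmetic behind that example can be checked with a small parser. This is a sketch, assuming only the "bytes first-last/total" form of the header; the function name parse_content_range is hypothetical.

```python
import re


def parse_content_range(value: str):
    """Parse 'bytes first-last/total' into (first, last, total)."""
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    first, last = int(match[1]), int(match[2])
    total = None if match[3] == "*" else int(match[3])  # '*' means total unknown
    return first, last, total


first, last, total = parse_content_range("bytes 0-999/2000")
received = last - first + 1   # both ends of the range are inclusive
print(received, "of", total, "bytes")
```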

3xx

  • 301 Moved Permanently: permanent redirect
  • 302 Found: temporary redirect
  • 304 Not Modified: "cache redirect"; the client should use its cached copy

304 is returned for conditional requests such as If-Modified-Since to indicate that the resource has not been modified. It is used for cache control.

  • 303 See Other: similar to 302, but the redirected request is changed to GET
  • 307 Temporary Redirect: temporary redirect, similar to 302, but the method and body of the request stay unchanged after the redirect
  • 308 Permanent Redirect: permanent redirect, similar to 301, but the method and body of the request stay unchanged after the redirect
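
The difference between these codes is mostly about which method the follow-up request uses. The sketch below encodes the rules from the list, plus one assumption that is not in the notes: for 301 and 302, browsers commonly turn a POST into a GET. The function name method_after_redirect is hypothetical.

```python
def method_after_redirect(status: int, method: str) -> str:
    """Return the HTTP method a client would use for the redirected request."""
    method = method.upper()
    if status == 303 and method != "HEAD":
        return "GET"        # 303: the redirected request is changed to GET
    if status in (301, 302) and method == "POST":
        return "GET"        # assumption: common browser behavior for 301/302
    return method           # 307/308: method (and body) stay unchanged


for status in (301, 302, 303, 307, 308):
    print(status, method_after_redirect(status, "POST"))
```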

4xx

  • 400 Bad Request: the request message is malformed
  • 401 Unauthorized: authentication is required
  • 403 Forbidden: the server refuses access to the resource
  • 404 Not Found: the resource was not found
  • 405 Method Not Allowed: the server does not allow the HTTP method used in the request
  • 413 Request Entity Too Large: the body of the request is too large
  • 414 Request-URI Too Long: the URI in the request line is too long
  • 429 Too Many Requests: the client sent too many requests, usually triggering the server's rate-limiting policy
  • 431 Request Header Fields Too Large: a single header field, or the request headers as a whole, is too large

5xx

  • 500 Internal Server Error: an error occurred inside the server
  • 501 Not Implemented: the server does not support the requested method
  • 502 Bad Gateway: the server, acting as a gateway or proxy, received an invalid response from the upstream server
  • 503 Service Unavailable: the server is overloaded or down for maintenance and temporarily cannot process requests
  • 504 Gateway Timeout: the server, acting as a gateway or proxy, did not receive a response from the upstream server in time

TCP state

TCP is stateful. A connection starts in the CLOSED state, becomes ESTABLISHED once the handshake succeeds, passes through the FIN-WAIT states on the side that initiates the close, and finally returns to CLOSED.
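
The life cycle can be written down as a small transition table. This sketch covers only the side that opens and then closes the connection; the event labels are informal descriptions chosen for this example, not names from any API.

```python
# State transitions for the side that actively opens and actively closes.
TRANSITIONS = {
    ("CLOSED", "send SYN"): "SYN_SENT",
    ("SYN_SENT", "recv SYN+ACK, send ACK"): "ESTABLISHED",
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "timeout"): "CLOSED",
}


def run(events, state="CLOSED"):
    """Apply events in order and return the list of states visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]   # KeyError means an invalid event
        path.append(state)
    return path


print(" -> ".join(run([
    "send SYN", "recv SYN+ACK, send ACK", "send FIN",
    "recv ACK", "recv FIN, send ACK", "timeout",
])))
```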

Layering

  • Application layer

HTTP, DNS, and FTP all belong to the application layer.

  • Transport layer

Provides data transfer between two computers over a network connection. TCP and UDP both belong to the transport layer.

  • Network layer

Handles the packets that travel across the network. IP belongs to the network layer.

  • Data link layer

Handles the hardware that connects to the network. Network adapters and optical fiber belong to the data link layer.

Cookie

  • Cookie: by default a cookie can be read by scripts (document.cookie), which makes it easy to steal through XSS attacks
  • HttpOnly: the cookie is transmitted only in HTTP requests; every other way of accessing it, including JavaScript, is blocked
  • SameSite: protects against cross-site request forgery (CSRF, also written XSRF) attacks

SameSite=Strict sends the cookie only on requests that originate from the same site; SameSite=Lax sends it on same-site requests and also on GET requests that arrive from a third-party site.

  • Secure: the cookie is sent only over encrypted HTTPS connections, never over plaintext HTTP. The cookie itself is not encrypted, though; it is still stored in plaintext in the browser
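
As an illustration, the three attributes can be set with Python's standard http.cookies module. This is a sketch rather than a recommended configuration: the cookie name session and its value are made up, and the samesite attribute assumes Python 3.8 or newer.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "abc123"              # hypothetical cookie name and value
cookie["session"]["httponly"] = True      # not readable from JavaScript
cookie["session"]["secure"] = True        # sent over HTTPS only
cookie["session"]["samesite"] = "Lax"     # limits cross-site sending
cookie["session"]["path"] = "/"

header = cookie.output()                  # a complete "Set-Cookie: ..." line
print(header)
```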

Request headers

  • Accept: the media types the client is willing to accept, such as text/css or text/html
  • Accept-Encoding: the content encodings the client supports, for example gzip, deflate, br

If-xxx conditional requests

  • If-Modified-Since: carries the time of the browser's cached copy, taken from the Last-Modified response header. The server compares it with the resource's modification time and returns 304 if they are the same
  • If-None-Match: carries the ETag of the cached copy. The server compares it with the resource's current ETag and returns 304 if they match
  • If-Range: carries an ETag or a time. If it matches the resource's current ETag or modification time, the server handles the request as a Range request; otherwise it returns the whole resource

Response headers

  • ETag: a unique identifier for the resource, computed by the server. It can be a strong or a weak ETag; a weak ETag starts with W/
  • Location: the new URI to redirect to
  • Allow: the request methods the server allows
  • Content-Encoding: the encoding of the response body
  • Content-Range: the range of the resource that the response carries, mainly used when answering Range requests
  • Content-Type: the media type of the response body, such as text/css or text/html
  • Content-Length: the size of the response body in bytes
  • Last-Modified: the time the resource was last modified

Caching

  • Forced cache: while it is valid, the browser does not need to contact the server at all
  • Negotiated cache: the browser has to contact the server whether or not the cached copy turns out to be valid

Forced cache

  • Neither force refresh (Ctrl+F5) nor refresh (F5) uses it
  • It takes effect only when pressing Enter in the address bar, following a page link, opening a new window, or going forward or back
  • Pragma
  • Expires
  • Cache-Control
  • The status code is 200

Pragma (deprecated)

  • no-cache: the cached copy may be kept, but the browser must check with the server for a newer version before using it

Expires

  • An expiration time in GMT; it has lower priority than Cache-Control

Cache-Control

  • no-store: nothing may be cached; every request must go to the server for a fresh copy
  • no-cache: the response may be cached, but the browser must check with the server for a newer version before using it
  • must-revalidate: once the cached copy has expired, it must be revalidated with the server before it is used again
  • max-age: how long the cached copy stays valid, in seconds
  • public: both the client and proxies may cache the response
  • private: only the client may cache the response; proxies may not
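
How a client might act on these directives can be sketched as follows. It is a simplified model that looks only at no-store, no-cache, and max-age and ignores Expires, Pragma, and proxy behavior; both function names are hypothetical.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control value into {directive: argument or True}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.lower()] = arg.strip('"') if arg else True
    return directives


def usable_without_asking_server(value: str, age_seconds: int) -> bool:
    """True if a cached copy of this age can be used as a forced-cache hit."""
    directives = parse_cache_control(value)
    if "no-store" in directives or "no-cache" in directives:
        return False                       # must go back to the server
    max_age = directives.get("max-age")
    if not isinstance(max_age, str):
        return False                       # no lifetime given in this model
    return age_seconds < int(max_age)


print(usable_without_asking_server("public, max-age=600", 100))
print(usable_without_asking_server("public, max-age=600", 700))
```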

Status code 200

  • from memory cache: the response is served from memory; it lasts only until the current browser tab is closed
  • from disk cache: the response is served from disk

Negotiated cache

  • Force refresh (Ctrl+F5) does not use it
  • It takes effect when refreshing (F5), pressing Enter in the address bar, following a page link, opening a new window, or going forward or back
  • ETag/If-None-Match
  • Last-Modified/If-Modified-Since
  • The status code is 304

ETag/If-None-Match

  • The ETag is generated by the server, and different servers may generate different values for the same resource
  • It identifies accurately whether the resource has been modified
  • Computing the ETag costs some server performance
  • It takes priority over Last-Modified

Last-Modified/If-Modified-Since

  • Whenever the resource's modification time changes, the resource is sent to the client again, whether or not its content actually changed
  • It has lower priority than ETag/If-None-Match

The status code is 304

When the If-None-Match or If-Modified-Since value in the client's request headers matches the server's current ETag or Last-Modified value, the server returns 304 (Not Modified).
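
The server-side decision can be sketched in a few lines. Several things here are assumptions made for the example: the ETag is derived from a SHA-256 hash of the body, the modification time is compared as an exact string, and the names make_etag and respond are hypothetical. The ETag check is tried first because it has the higher priority.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a strong ETag from the content (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes, last_modified: str):
    """Return (status, headers, body) for a possibly conditional GET."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Last-Modified": last_modified}
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        not_modified = if_none_match == etag          # ETag has priority
    else:
        not_modified = request_headers.get("If-Modified-Since") == last_modified
    if not_modified:
        return 304, headers, b""                      # client reuses its cache
    return 200, headers, body


stamp = "Wed, 21 Oct 2015 07:28:00 GMT"
status, headers, _ = respond({}, b"hello", stamp)
print(status, headers["ETag"])
print(respond({"If-None-Match": headers["ETag"]}, b"hello", stamp)[0])
```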

Vary

Controls caching: it lists the request headers that a cache must take into account when deciding whether a stored response can be reused for a later request or a new response must be requested from the origin server.
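
One way to picture this is that the headers named in Vary become part of the cache key. A minimal sketch under that assumption; cache_key is a hypothetical helper, and real caches handle more cases (such as Vary: *).

```python
def cache_key(url: str, request_headers: dict, vary: str) -> tuple:
    """Build a cache key from the URL plus the request headers named in Vary."""
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((name, lowered.get(name, "")) for name in names)


gzip_key = cache_key("/app.js", {"Accept-Encoding": "gzip", "User-Agent": "A"}, "Accept-Encoding")
br_key = cache_key("/app.js", {"Accept-Encoding": "br", "User-Agent": "A"}, "Accept-Encoding")
other_ua = cache_key("/app.js", {"Accept-Encoding": "gzip", "User-Agent": "B"}, "Accept-Encoding")

print(gzip_key == br_key)     # different encodings need different cached responses
print(gzip_key == other_ua)   # User-Agent is not listed in Vary, so it is ignored
```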

Idempotence

A request is idempotent if executing it once and executing it several times in a row leave the server in the same state.

The GET, HEAD, PUT, DELETE, and OPTIONS methods are idempotent.

POST is not idempotent.
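
The difference shows up as soon as a request is repeated. The toy in-memory "server" below is only an illustration; the class ToyServer and its methods are hypothetical and stand in for PUT-like and POST-like handlers.

```python
class ToyServer:
    """In-memory stand-in for a server, used only to compare repeated requests."""

    def __init__(self):
        self.files = {}      # state changed by PUT and DELETE
        self.comments = []   # state changed by POST

    def put(self, path, content):
        self.files[path] = content       # repeating it overwrites with the same value

    def delete(self, path):
        self.files.pop(path, None)       # repeating it leaves the file absent

    def post(self, comment):
        self.comments.append(comment)    # repeating it adds another entry


server = ToyServer()
server.put("/a.txt", "v1")
server.put("/a.txt", "v1")
server.post("hi")
server.post("hi")
print(server.files, server.comments)
```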

Reading the request waterfall chart

  • Queued / Queueing: time the request spent waiting in the queue because of head-of-line blocking before the browser processed it; 1.25 seconds here
  • Stalled: time the browser spent allocating resources and scheduling a connection; 2.58 milliseconds
  • Initial connection: time spent establishing the connection to the server; 17.08 milliseconds
  • SSL: time spent on the SSL handshake; 15.84 milliseconds
  • Request sent: time spent sending the request data; 0.26 milliseconds
  • Waiting (TTFB): time spent waiting for the server's response. TTFB (Time To First Byte) is the time until the first byte of the response arrives and includes server processing time and network transmission time; 105.65 milliseconds
  • Content Download: time spent receiving the response data; 1.95 milliseconds

HTTPS

  • The default port is 443
  • HTTP over SSL/TLS

SSL/TLS

  • SSL (Secure Sockets Layer) sits at layer 5 (the session layer) of the seven-layer OSI model
  • It was renamed TLS (Transport Layer Security) in 1999, when it was standardized
  • TLS 1.2 is the version in wide use at present
  • TLS is made up of several sub-protocols, such as the record protocol, handshake protocol, alert protocol, change cipher spec protocol, and extension protocol, and it combines symmetric encryption, asymmetric encryption, identity authentication, and other techniques
  • TLS cipher suites are named as "key exchange algorithm + signature algorithm + symmetric encryption algorithm + digest algorithm"

When establishing a TLS connection, the browser and server need to agree on a suitable set of encryption algorithms for secure communication. Such a combination of algorithms is called a cipher suite. For example, in TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, ECDHE is the key exchange algorithm, RSA is the signature (authentication) algorithm, AES is the symmetric encryption algorithm, 128 is the key length in bits, GCM is the block cipher mode, and SHA256 is the digest (hash) algorithm.
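
The naming convention is regular enough to split mechanically. The sketch below assumes names of exactly this shape (TLS_kx_auth_WITH_cipher_bits_mode_digest); other suites, including the shorter TLS 1.3 names, do not follow it, and parse_cipher_suite is a hypothetical helper.

```python
def parse_cipher_suite(name: str) -> dict:
    """Split a suite name shaped like TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256."""
    prefix, _, rest = name.partition("_WITH_")
    _, key_exchange, authentication = prefix.split("_")   # "TLS", "ECDHE", "RSA"
    cipher, key_bits, mode, digest = rest.split("_")      # "AES", "128", "GCM", "SHA256"
    return {
        "key_exchange": key_exchange,
        "authentication": authentication,
        "cipher": cipher,
        "key_bits": int(key_bits),
        "mode": mode,
        "digest": digest,
    }


print(parse_cipher_suite("TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"))
```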

  • The record protocol defines the basic unit in which TLS sends and receives data: the record
  • The handshake protocol is the most complex TLS sub-protocol, far more complex than TCP's SYN/ACK exchange. During the handshake, the browser and server negotiate the TLS version, random numbers, cipher suite, and other information, then exchange certificates and key parameters. Finally, the two sides agree on the session key, which is used in the hybrid encryption that follows

OpenSSL

  • A well-known open source cryptography library and toolkit that supports almost all publicly available encryption algorithms and protocols
  • It is a concrete implementation of SSL/TLS; web servers such as Apache and Nginx rely on it under the hood for TLS

Symmetric encryption algorithm

  • Encryption and decryption use the same key
  • AES is the most commonly used algorithm
  • AES stands for Advanced Encryption Standard. The key length can be 128, 192, or 256 bits. It is the most widely used symmetric encryption algorithm, with high security and good performance
  • The newest kind of block cipher mode is AEAD (Authenticated Encryption with Associated Data), which adds authentication on top of encryption. GCM, CCM, and Poly1305 are commonly used
  • Only one key is used, so it is fast, but that key must be kept secret, so symmetric encryption alone cannot exchange the key securely

Asymmetric encryption algorithm

  • Also known as public-key encryption
  • There are two keys. One, the public key, can be given to anyone; data encrypted with the public key can be decrypted only with the private key
  • The other, the private key, must be kept strictly confidential; data encrypted with the private key can be decrypted only with the public key
  • The public key and the private key are different, hence "asymmetric"
  • The asymmetric algorithms used in TLS include DH, DSA, RSA, and ECC
  • The security of RSA rests on the difficulty of integer factorization; it uses the product of two very large prime numbers as the material from which the keys are generated
  • ECC (Elliptic Curve Cryptography) rests on the difficulty of the elliptic curve discrete logarithm problem. It uses a specific curve equation and base point to generate the public and private keys. Its sub-algorithm ECDHE is used for key exchange and ECDSA for digital signatures
  • Asymmetric encryption solves the key exchange problem, but it is slow

Hybrid encryption

  • At the start of communication, an asymmetric algorithm such as RSA or ECDHE is used to solve the key exchange problem
  • One side then generates a "session key" for the symmetric algorithm from random numbers and encrypts it with the other side's public key
  • Finally, the other side decrypts the session key with its private key. The symmetric key has now been exchanged securely, and from then on only symmetric encryption is used, not asymmetric encryption
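
The flow can be walked through with toy numbers. Everything below is deliberately insecure and only illustrates the order of operations: the RSA key uses the textbook primes 61 and 53, and the "symmetric cipher" is an XOR with a SHA-256 keystream standing in for AES. Neither is what TLS actually does, and the helper name keystream_xor is hypothetical.

```python
import hashlib
import secrets

# Toy RSA key pair from the primes 61 and 53 (far too small for real use).
N, E, D = 3233, 17, 2753        # public key (N, E), private key (N, D)


def keystream_xor(key: int, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 keystream (same call decrypts)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(f"{key}:{counter}".encode()).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Client: pick a random session key and encrypt it with the server's public key.
session_key = secrets.randbelow(N - 2) + 2
wrapped = pow(session_key, E, N)

# Server: recover the session key with its private key.
unwrapped = pow(wrapped, D, N)

# Both sides now switch to the fast symmetric cipher.
ciphertext = keystream_xor(session_key, b"GET / HTTP/1.1")
plaintext = keystream_xor(unwrapped, ciphertext)
print(plaintext)
```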

SSL Handshake Procedure

  1. The client sends a Client Hello message to the server. It contains the cipher suites the client supports, the compression methods it supports, the SSL version number, and a random number generated by the client.
  2. The server replies with a Server Hello message containing the cipher suite it selected from the client's list, a random number generated by the server, and the server's SSL version number.
  3. The server then sends a Certificate message, which contains the server's public key certificate.
  4. The server then sends Server Hello Done, indicating that the initial negotiation is complete.
  5. After receiving these messages, the client responds with a Client Key Exchange message. It contains a random secret called the pre-master secret, which is used for encrypting the communication, and is itself encrypted with the public key from the certificate received in step 3.
  6. The client then sends a Change Cipher Spec message to tell the server that everything it sends from now on will be encrypted with the master secret derived from the pre-master secret in step 5.
  7. The client then sends a Finished message, which contains a check value computed over all the handshake messages exchanged so far, for integrity verification.
  8. After receiving the client's Change Cipher Spec message, the server responds with its own Change Cipher Spec message.
  9. The server then sends a Finished message to the client, showing that it verified the client's check value correctly. The SSL handshake is now complete.
  10. From this point on, HTTP data is transmitted encrypted with the master secret.

HTTP/2

Header compression

  • HTTP/2 uses a purpose-built algorithm, HPACK. The client and server each maintain a "dictionary" and refer to repeated strings by index number; strings are further compressed with Huffman coding, and integers use a compact variable-length encoding. Compression rates of 50% to 90% can be reached

Binary framing

  • Both the headers and the body are sent in binary form

Multiplexing

  • Multiple requests can be sent at the same time, and the server can respond to them at the same time and in any order, which solves the "head-of-line blocking" problem

Server push

  • Actively push resources to clients

Streams

  • Frames from different streams are interleaved rather than sent one stream after another; each frame carries an ID that marks which stream it belongs to
  • Stream IDs opened by the client are odd; stream IDs opened by the server are even
  • Either the client or the server can send a signal frame to cancel the current stream while leaving the TCP connection open
  • The client can specify a priority for each stream, and the server responds according to the priority
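
The stream ID rules can be illustrated by reassembling interleaved frames. This is a simplified model: a frame is just a (stream_id, payload) pair, the connection-level stream 0 is ignored, and both helper names are hypothetical.

```python
def initiated_by(stream_id: int) -> str:
    """Odd stream IDs are opened by the client, even ones by the server."""
    return "client" if stream_id % 2 == 1 else "server"


def reassemble(frames):
    """Group interleaved (stream_id, payload) frames back into per-stream data."""
    streams = {}
    for stream_id, payload in frames:
        streams[stream_id] = streams.get(stream_id, b"") + payload
    return streams


# Frames from three streams arrive interleaved on one TCP connection.
frames = [(1, b"<html>"), (3, b"body{"), (1, b"</html>"), (2, b"pushed"), (3, b"}")]

for stream_id, data in sorted(reassemble(frames).items()):
    print(stream_id, initiated_by(stream_id), data)
```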

HTTP/3

QUIC

  • Quick UDP Internet Connections, introduced by Google in 2013
  • HTTP/3 is HTTP over QUIC
  • It belongs to the transport layer of the protocol stack

Characteristics

  • A transport-layer protocol built on UDP
  • Adds reliability on top of UDP, providing TCP-like features such as packet retransmission, congestion control, and pacing of the sending rate
  • Data within a single stream is delivered in order; different streams are not ordered relative to each other
  • Fast handshake: connections can be established in 0-RTT or 1-RTT

Round-trip time (RTT): the time it takes for data to travel from one end of the network to the other and back.

  • Connection migration: each connection is marked with its own UUID, so data transfer can continue without reconnecting when the network environment changes, as long as the UUID stays the same
  • Uses TLS 1.3 as its security protocol, which has a shorter handshake and lower protocol latency
