History of HTTP protocol

As network technology developed, HTTP/1.1, first specified in 1997 and revised in 1999, could no longer meet everyday needs, so Google began pushing SPDY as a next-generation protocol. SPDY never became a standard itself, but because the HTTP/2 standard was developed with the involvement of the SPDY team and borrows heavily from SPDY’s design, SPDY is generally regarded as the predecessor of HTTP/2.

So far, there are three major versions of HTTP in wide use: 1.0, 1.1, and 2. (In practice, many enterprises have not yet adopted HTTP/2, let alone HTTP/3.)

HTTP1.0

Problems with HTTP1.0

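  • Poor security: content is transmitted in plaintext

    HTTP/1.0 messages are sent as unencrypted text, so anyone on the network path can read them. The text format lasted until HTTP/2, which transfers data in a binary format. Note that binary framing by itself is not encryption; confidentiality comes from SSL/TLS (see the HTTPS section).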

  • Wasted performance: connections cannot be reused

    Every request needs its own TCP connection. Setting up and tearing down a TCP connection takes time, which reduces network utilization.

  • Head-of-line blocking

    HTTP/1.0 requires that the next request be sent only after the response to the previous request has arrived. If that response is delayed, all subsequent requests are blocked behind it.

  • The server cannot push messages

    Before HTTP/2, the server could not proactively push messages to the client.
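
To get a feel for the cost of opening a new TCP connection per request, here is a rough latency model. It is only a sketch: the function name and the numbers (10 requests, 50 ms of connection setup, 100 ms per request/response) are assumptions chosen for illustration, and connection teardown is ignored.

```python
def total_time_ms(n_requests, handshake_ms, request_ms, reuse_connection):
    """Rough model: one TCP setup per request vs. one setup shared by all requests."""
    if reuse_connection:
        return handshake_ms + n_requests * request_ms
    return n_requests * (handshake_ms + request_ms)

# Hypothetical numbers, not measurements.
one_connection_per_request = total_time_ms(10, 50, 100, reuse_connection=False)
one_shared_connection = total_time_ms(10, 50, 100, reuse_connection=True)
print(one_connection_per_request, one_shared_connection)  # 1500 1050
```

The gap grows with the number of requests and with the connection setup time.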

HTTP1.1

New features compared with HTTP/1.0:

  • Persistent connections (connection reuse)

    HTTP/1.1 uses the Connection: keep-alive header field to establish persistent connections, which are enabled by default. This allows multiple HTTP requests and responses to be transmitted over a single TCP connection. The client can ask the server to close the connection by sending Connection: close in the request header. Because the number of connections a server can hold is limited, the server can set an idle timeout with the Keep-Alive: timeout=N header. If the client does not send a request within that period, the connection is closed.

  • Pipelining, introduced to make sending HTTP requests more efficient

    Although a persistent connection lets a single connection carry multiple HTTP requests, each request still has to wait for the response to the previous one before it can be sent. Pipelining (pictured) removes that wait: the client can send the next request, and all the ones after it, without waiting for earlier responses to come back.

    Pipelining is not a “silver bullet”. It requires responses to be received in the same order the requests were sent (first in, first out): when requests A, B, and C are sent in turn, the responses must arrive in the order A > B > C. If the response to request A has not finished, the responses to the subsequent requests cannot be received and are blocked. Because of this head-of-line blocking, pipelining is not widely used.

  • Resumable transfers (by default only resumable downloads are supported)

    If a client’s download from the back end is interrupted, the client can continue from the position where the connection broke off. To make this work, the client records how much it has downloaded as it goes. When it reconnects, it adds a Range: bytes=first-last header to the request specifying which byte interval to fetch, so the back end returns only the data between the first and last offsets.
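
The resumable download can be sketched as follows. This is a simplified illustration, not a complete implementation: serve_range is a hypothetical helper, and it only handles the single-range forms bytes=first-last and bytes=first- (real servers also support suffix and multi-part ranges).

```python
import re

def serve_range(data, range_header):
    """Return (status, body) the way a range-aware server might (simplified)."""
    match = re.fullmatch(r"bytes=(\d+)-(\d*)", range_header.strip())
    if match is None:
        return 200, data                      # unrecognized Range: send the whole file
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else len(data) - 1
    if match.group(2) and last < first:
        return 200, data                      # invalid range: ignore the header
    if first >= len(data):
        return 416, b""                       # Range Not Satisfiable
    last = min(last, len(data) - 1)
    return 206, data[first:last + 1]          # Partial Content, offsets are inclusive

file_bytes = b"0123456789"
received = file_bytes[:4]                     # the download broke off after 4 bytes
status, rest = serve_range(file_bytes, f"bytes={len(received)}-")
resumed = received + rest                     # the client appends the missing tail
```

The client only needs to remember how many bytes it already has; the open-ended form bytes=N- asks for everything from offset N onward.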

HTTP2

By the time HTTP/2 was born, HTTP/1.1 was already the mainstream worldwide, so HTTP/2 had to remain compatible with HTTP/1.1. That meant facing two major constraints:

  • The “http://” and “https://” URL schemes cannot be changed

  • The HTTP request/response communication model and message structure cannot be changed. HTTP remains strictly request/response, with one request corresponding to one response

  • Compatibility with HTTP/1.1. How can a new HTTP/2 protocol be invented without changing the request/response message structure? HTTP/2 is not a parallel alternative to HTTP/1.1: it keeps HTTP/1.1’s semantics and adds a layer between them and TCP that acts as a translation layer.

  • Multiplexing. All HTTP requests can share one connection. Each request is assigned an ID, and the frames of all requests are interleaved on that connection. The recipient uses the IDs to reassemble the frames into their separate requests.

  • Header compression. The headers of HTTP requests and responses carry a large number of repeated fields. HTTP/2 compresses headers to improve transmission efficiency and performance. The client and server each cache an index table; once a header field has been transmitted, later messages send only its index and the receiver looks the field up in the table.

  • Binary framing. Unlike HTTP/1.x, HTTP/2 uses a binary protocol that splits messages into multiple frames. With HTTP/1.1 pipelining, responses must come back in order and each response is a complete message. HTTP/2 sends requests as frames, and responses may arrive out of order, so even if an earlier request is blocked, later requests are not affected. That said, because HTTP/2 still runs over TCP, head-of-line blocking at the TCP level is unavoidable.

  • Server push
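
The multiplexing and framing ideas above can be sketched with a toy model. This is not the real HTTP/2 wire format: the (stream_id, chunk, end_of_stream) tuple, the 4-byte frame size, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import zip_longest

def to_frames(stream_id, payload, frame_size=4):
    """Split one message into (stream_id, chunk, end_of_stream) frames."""
    chunks = [payload[i:i + frame_size] for i in range(0, len(payload), frame_size)]
    return [(stream_id, chunk, i == len(chunks) - 1) for i, chunk in enumerate(chunks)]

def interleave(*frame_lists):
    """Put frames from several streams onto one connection, round-robin."""
    wire = []
    for group in zip_longest(*frame_lists):
        wire.extend(frame for frame in group if frame is not None)
    return wire

def reassemble(wire):
    """Group frames by stream ID; a message completes when its last frame arrives."""
    buffers, completed = defaultdict(bytes), {}
    for stream_id, chunk, end_of_stream in wire:
        buffers[stream_id] += chunk
        if end_of_stream:
            completed[stream_id] = buffers.pop(stream_id)
    return completed

# Stream 1 carries a long message, stream 3 a short one.
wire = interleave(to_frames(1, b"slow-response-body"), to_frames(3, b"fast"))
messages = reassemble(wire)
completion_order = list(messages)   # stream IDs in the order their messages completed
```

Because frames carry a stream ID, the short message on stream 3 completes before the long one on stream 1 even though stream 1 started first, which is exactly what ordered pipelining cannot do.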

HTTPS

Because HTTP is stateless and transmits data in plaintext, HTTP requests have several major security issues:

  • Eavesdropping: a third party can intercept and view the communication
  • Tampering: a third party can intercept and modify the communication
  • Impersonation: a third party can pose as the client or the server and take part in the communication

To solve the problems above, we first need to understand some basic concepts of cryptography.

Symmetric encryption: data is encrypted and decrypted using the same key

Asymmetric encryption: data is encrypted and decrypted using two different keys; the public key is used for encryption and the private key for decryption

Private key: a secret string generated by a special algorithm

Public key: derived from the private key by a special algorithm. The process is irreversible: the public key can be calculated from the private key, but the private key cannot be calculated from the public key

Symmetric encryption is used to encrypt packets

When packets would otherwise be transmitted in plaintext, symmetric encryption can be used to encrypt them. The server and client share the same key. When the client sends a packet to the server, the client encrypts it with the key and the server decrypts it with the same key. Conversely, the server encrypts the packets it sends to the client with that key, and the client decrypts them.
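
A minimal sketch of the shared-key idea, using only the Python standard library. The cipher here (XOR with a SHA-256-derived keystream) is a toy chosen so the example runs without third-party packages; it is not what TLS uses and is not secure for real traffic. The key and message values are hypothetical.

```python
import hashlib

def keystream(key, length):
    """Derive `length` pseudo-random bytes from the shared key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(key, data):
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

shared_key = b"hypothetical-shared-key"           # known to both client and server
request = b"GET /account HTTP/1.1"
ciphertext = xor_cipher(shared_key, request)      # client encrypts with the key
decrypted = xor_cipher(shared_key, ciphertext)    # server decrypts with the same key
```

The same function and the same key serve both directions, which is what makes the scheme symmetric.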

This encrypts the messages, but transmitting the key becomes a new problem: how can we ensure the key is not leaked during communication? We can use asymmetric encryption (one-way) to encrypt the key. In addition, to prevent third parties from impersonating the client or server, a signature mechanism is introduced: the sender signs with its private key, and the receiver checks the signature with the sender’s public key to verify the sender’s identity.

Bidirectional asymmetric encryption

In bidirectional asymmetric encryption, each side has its own pair of public and private keys. As shown in the figure above, the client has the pair PubA and PriA, and the server has the pair PubB and PriB. Each side keeps its private key to itself and discloses its public key.

There are two kinds of operation: encryption/decryption and signing/verification.

  • Encryption and decryption: when the client sends a packet, it encrypts the packet with the server’s public key PubB, and the server decrypts it with its own private key PriB. Conversely, the server encrypts with the client’s public key PubA, and the client decrypts with its own private key PriA

  • Signing and verification: the sender signs a packet with its own private key, and the receiver verifies the signature with the sender’s public key. If the data has been tampered with, verification fails, which prevents third parties from altering data undetected. When the client sends a packet, it signs with its private key PriA and the server verifies with the client’s public key PubA, and vice versa.

In short: the sender signs with its own private key and encrypts with the other party’s public key; the receiver decrypts with its own private key and verifies with the sender’s public key.
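
The sign-then-encrypt flow can be sketched with textbook RSA in pure Python (Python 3.8+ for the modular inverse via pow). Everything here is an illustrative assumption: the helper names, the message, and the tiny keys built from well-known Mersenne primes. Real systems use vetted libraries, padding schemes, and keys of 2048 bits or more.

```python
import hashlib

def make_keypair(p, q, e=65537):
    """Textbook RSA from two primes: returns (public, private) as (exponent, modulus)."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (e, n), (pow(e, -1, phi), n)

def apply_key(key, value):
    """The RSA primitive: raise value to the key's exponent modulo n."""
    exponent, n = key
    return pow(value, exponent, n)

def digest(message, n):
    """Hash the message and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(private_key, message):
    return apply_key(private_key, digest(message, private_key[1]))

def verify(public_key, message, signature):
    return apply_key(public_key, signature) == digest(message, public_key[1])

# Toy keys (not secure): client holds PubA/PriA, server holds PubB/PriB.
pub_a, pri_a = make_keypair(2**31 - 1, 2**61 - 1)
pub_b, pri_b = make_keypair(2**61 - 1, 2**89 - 1)

message = b"pay 10 to Bob"
# Client: sign with its own private key PriA, encrypt with the server's public key PubB.
signature = sign(pri_a, message)
ciphertext = apply_key(pub_b, int.from_bytes(message, "big"))
# Server: decrypt with its own private key PriB, verify with the client's public key PubA.
plain_int = apply_key(pri_b, ciphertext)
plaintext = plain_int.to_bytes((plain_int.bit_length() + 7) // 8, "big")
is_authentic = verify(pub_a, plaintext, signature)
```

If an attacker changes the message in transit, the recomputed digest no longer matches the signature and verify returns False.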

Unidirectional asymmetric encryption

Unidirectional asymmetric encryption is the main form of B/S (browser/server) communication. On the Internet, a website is open to everyone, so the server does not verify the validity of each client; only the client verifies the validity of the server. Sites such as Baidu and Taobao are open in this way, and we need to verify their authenticity and legitimacy when visiting them in order to prevent phishing sites from stealing user information.

As shown in the figure above, the client communicates with the server. The client encrypts the request message using the server’s public key, the server decrypts the request message using its private key, and the server responds to the client in plaintext.

In this case, data is encrypted in only one direction. How can we ensure that the server’s response packets are not intercepted or tampered with?

This is where symmetric encryption comes in.

  • Key negotiation: before sending the real data, the client and server negotiate a secret key, as shown in the figure above. The client first encrypts the symmetric encryption key PriC with the server’s public key PubB and sends it to the server. The server decrypts it with its own private key PriB to obtain PriC, then notifies the client in plaintext that the key was received successfully. Both parties now know the symmetric encryption key PriC.

  • Symmetric encryption and decryption of packets: the client encrypts its packets with the key PriC and sends them to the server, which decrypts them with the same key. The server likewise encrypts its response packets with PriC, and the client decrypts them with PriC

The above one-way asymmetric encryption + symmetric encryption communication model is the prototype of SSL/TLS
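
The two steps of this combined model can be sketched as follows. It is a toy under stated assumptions: textbook RSA with small Mersenne-prime keys stands in for the server’s PubB/PriB, an XOR keystream stands in for the symmetric cipher, and reusing one keystream for both directions is done only to keep the example short. None of this is how a real TLS library is implemented.

```python
import hashlib
import secrets

def make_keypair(p, q, e=65537):
    """Textbook RSA key pair from two primes, as ((e, n), (d, n))."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (e, n), (pow(e, -1, phi), n)

def xor_cipher(key, data):
    """Toy symmetric cipher: XOR with a SHA-256 keystream (encrypts and decrypts)."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# Server key pair PubB / PriB (toy size, not secure).
pub_b, pri_b = make_keypair(2**61 - 1, 2**89 - 1)

# Step 1: key negotiation. The client picks PriC and sends it encrypted with PubB.
pri_c = secrets.token_bytes(16)
wrapped = pow(int.from_bytes(pri_c, "big"), *pub_b)
server_pri_c = pow(wrapped, *pri_b).to_bytes(16, "big")   # server decrypts with PriB

# Step 2: both directions now use the shared symmetric key PriC.
request = xor_cipher(pri_c, b"GET /balance")              # client -> server
response = xor_cipher(server_pri_c, b"balance=42")        # server -> client
```

The slow asymmetric operation runs once per connection to move the key; everything after that uses the cheaper symmetric cipher.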

CA organizations and digital certificates

Before we talk about CAs and digital certificates, let’s look at a weakness of asymmetric encryption: it is vulnerable to interception by a third party.

A server’s public key is public. A third party can intercept the server’s public key and replace it with its own, then communicate with the client to obtain data. When a client receives a server public key, how can it determine that the key belongs to the legitimate server rather than to a third party?

As shown in the figure above, a third party C joins the communication by first impersonating the server. The client believes that C is the server and sends its public key PubA to C. C then impersonates the client and sends its own public key PubC to the server, obtaining the server’s public key PubB in return. Finally, C impersonates the server again and gives the client its forged public key PubC.

In this way, both the client and server think they are communicating with each other, but the whole exchange is actually intercepted by a third party. The root cause is that public keys are transmitted in plaintext. The client cannot determine whether a public key belongs to the target server, and the server cannot determine whether a public key belongs to the real client. How can this problem be solved?

  • Instead of sending a bare public key, the server/client sends a digital certificate, which is an enhanced public key
  • Digital certificates are issued by a certificate authority (CA), and the CA’s signature is what lets the other party verify that a certificate it receives is authentic

Digital certificate issuance and verification

Digital certificate issuance: a CA also has a pair of public and private keys, and only the CA knows its private key. The server/client sends its own information and public key to the CA, and the CA uses its private key to generate a digital certificate from this information.

Digital certificate verification: when one party sends a digital certificate in B/S communication, the other party uses the CA’s public key to verify it. If someone has forged the certificate, verification fails.



As shown in the figure above, the server/client first submits its own information and public key to the CA. The CA returns a digital certificate after authentication, and the server/client then uses the digital certificate instead of the bare public key for communication.

The authenticity of CA institutions

Each level of CA has its own digital certificate. The top-level root CA is recognized worldwide.

As shown in the figure above, certificate issuance and verification are two opposite processes, and both proceed one level at a time.

The server sends its digital certificate to the client, and the client verifies it against the level C3 CA that issued it. To verify that the C3 CA is itself valid, the client verifies the C3 certificate against the level C2 CA. This continues up the chain until verification against the root CA succeeds.
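
Issuance and level-by-level verification can be sketched with toy certificates. This is an illustration only: the dictionary layout, the helper names, the subject example.com, and the shortened chain (one intermediate CA instead of C3 and C2) are assumptions, and the textbook RSA keys built from Mersenne primes are far too small to be secure. Real certificates use the X.509 format.

```python
import hashlib
import json

def make_keypair(p, q, e=65537):
    """Textbook RSA key pair from two primes, as ((e, n), (d, n))."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (e, n), (pow(e, -1, phi), n)

def digest(data, n):
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def issue(issuer_name, issuer_private, subject, subject_public):
    """CA step: sign the subject's identity and public key with the issuer's private key."""
    body = {"subject": subject, "issuer": issuer_name, "public_key": list(subject_public)}
    d, n = issuer_private
    encoded = json.dumps(body, sort_keys=True).encode()
    return {"body": body, "signature": pow(digest(encoded, n), d, n)}

def signed_by(cert, issuer_public):
    """Check a certificate's signature with the issuer's public key."""
    e, n = issuer_public
    encoded = json.dumps(cert["body"], sort_keys=True).encode()
    return pow(cert["signature"], e, n) == digest(encoded, n)

def verify_chain(chain, trusted_roots):
    """chain runs from the server certificate upward; trusted_roots maps name -> public key."""
    for cert, issuer_cert in zip(chain, chain[1:]):
        if cert["body"]["issuer"] != issuer_cert["body"]["subject"]:
            return False
        if not signed_by(cert, tuple(issuer_cert["body"]["public_key"])):
            return False
    root_key = trusted_roots.get(chain[-1]["body"]["issuer"])
    return root_key is not None and signed_by(chain[-1], root_key)

root_pub, root_pri = make_keypair(2**107 - 1, 2**127 - 1)
c2_pub, c2_pri = make_keypair(2**89 - 1, 2**107 - 1)
server_pub, server_pri = make_keypair(2**61 - 1, 2**89 - 1)
attacker_pub, attacker_pri = make_keypair(2**31 - 1, 2**61 - 1)

c2_cert = issue("ROOT CA", root_pri, "C2 CA", c2_pub)
server_cert = issue("C2 CA", c2_pri, "example.com", server_pub)
forged_cert = issue("C2 CA", attacker_pri, "example.com", attacker_pub)

trusted = {"ROOT CA": root_pub}               # the client's built-in root store
genuine_ok = verify_chain([server_cert, c2_cert], trusted)
forged_ok = verify_chain([forged_cert, c2_cert], trusted)
```

The forged certificate claims the same issuer, but because it was not signed with the C2 CA’s private key, the signature check with the C2 CA’s public key fails.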

Finally, in network communication each entity authenticates itself with a digital certificate rather than a bare public key. This standard system is called PKI (Public Key Infrastructure).

SSL/TLS

SSL stands for Secure Sockets Layer, and TLS stands for Transport Layer Security.

HTTPS = HTTP + SSL/TLS

SSL/TLS sits between the application-layer protocol and TCP, and it also supports other application-layer protocols such as FTP.

Four handshakes for SSL/TLS

The SSL/TLS protocol is designed for communication security. Before any HTTP request is sent, the four SSL/TLS handshakes must be completed to verify identity and negotiate the key.

  • First handshake: as shown in the figure above, the client asks the server for its digital certificate
  • Second handshake: the server returns its digital certificate, and the client verifies the certificate once it has received it
  • Third handshake: key negotiation. The client encrypts the symmetric encryption key and sends it to the server
  • Fourth handshake: the server tells the client that the key was received successfully

HTTPS complete communication model

https = tcp + SSL/TLS + http

The complete HTTPS communication process goes through three stages:

  • 1. TCP establishes a connection through a three-way handshake
  • 2. Verify the identity of the communicating party through SSL/TLS and negotiate the symmetric encryption key
  • 3. Use the key to encrypt all HTTP requests/responses on the TCP connection

Stages 1 and 2 happen only once, when the connection is established. After that, as long as the connection is not closed, each HTTP request/response only goes through stage 3, so there is not much performance loss.
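
The three stages map onto Python’s standard socket and ssl modules roughly as follows. This is a sketch, not a production client: https_get is a hypothetical helper, the host you pass it is up to you, and real code would normally use an HTTP library rather than raw sockets.

```python
import socket
import ssl

def make_context():
    """Default client context: loads system trust roots and verifies certificate and hostname."""
    return ssl.create_default_context()

def https_get(host, path="/", port=443, timeout=5.0):
    """Fetch one resource, going through the three stages in order."""
    context = make_context()
    # Stage 1: TCP three-way handshake.
    with socket.create_connection((host, port), timeout=timeout) as tcp:
        # Stage 2: TLS handshake (certificate verification and key negotiation).
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            # Stage 3: the HTTP request and response travel encrypted.
            request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls.sendall(request.encode("ascii"))
            chunks = []
            while True:
                chunk = tls.recv(4096)
                if not chunk:
                    break
                chunks.append(chunk)
    return b"".join(chunks)

# Example call (needs network access): https_get("example.com")
```

The example sends Connection: close to keep the read loop simple; leaving the connection open instead would let later requests skip stages 1 and 2.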