HTTP 0.9

HTTP 0.9 is the original version of the protocol. It supports only GET requests and can return resources only in HTML format.

HTTP 1.0

  • Added the POST and HEAD request methods.
  • The request line must end with a protocol version field (HTTP/1.0), and each message must include headers describing metadata.
  • Content is no longer limited to the HTML of version 0.9: the Content-Type header supports a variety of data formats via MIME (Multipurpose Internet Mail Extensions) types, such as text/html and image/jpeg.
  • Caching is supported, allowing clients to reuse a site's responses within a specified period of time.
  • Other new features include status codes, multi-character-set support, multipart types, authorization, caching, and content encoding.

However, version 1.0 works in such a way that each TCP connection can carry only one request (short connections are the default). Once the server responds, the connection is closed, and the next request must establish a new TCP connection; keep-alive is not supported.
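This behavior can be sketched with a toy local server in Python (the server, port, and response below are invented for illustration, not a real web server): each HTTP/1.0 request gets its own TCP connection, and the server closes the connection after one response.

```python
import socket
import threading

def tiny_http10_server(ready):
    # Toy server: answers exactly two connections, one request each.
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(2)
    ready["port"] = srv.getsockname()[1]
    ready["event"].set()
    for _ in range(2):
        conn, _ = srv.accept()
        conn.recv(4096)                # read the request (ignored here)
        conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: 5\r\n\r\nhello")
        conn.close()                   # HTTP/1.0: close after one response
    srv.close()

ready = {"event": threading.Event()}
threading.Thread(target=tiny_http10_server, args=(ready,), daemon=True).start()
ready["event"].wait()

def fetch_once(port):
    # Each request needs a brand-new TCP connection.
    c = socket.socket()
    c.connect(("127.0.0.1", port))
    c.sendall(b"GET / HTTP/1.0\r\nHost: localhost\r\n\r\n")
    data = b""
    while True:
        chunk = c.recv(4096)
        if not chunk:                  # server closed the connection
            break
        data += chunk
    c.close()
    return data

r1 = fetch_once(ready["port"])
r2 = fetch_once(ready["port"])         # second request, second connection
```

Connecting twice is unavoidable in this model: after the first response the client's socket reads EOF, so the same connection cannot carry a second request.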

Long connections and short connections

Short connection: the client and server connect only when they need to exchange data and disconnect as soon as the exchange finishes. Advantage: easy to manage, and every open connection is a useful connection.

Long connection: the two parties establish a connection and do not disconnect after a single read/write. Advantage: it saves the time spent opening and closing TCP connections, which suits clients that make frequent requests. Disadvantage: a malicious attacker can open a large number of long connections and overwhelm the server, so servers typically close connections that have been idle for a long time and limit the maximum number of connections per client.

Creating a TCP connection is expensive because of the three-way handshake between client and server and TCP slow start. As a result, HTTP 1.0 performs poorly, and the more external resources a web page loads, the more pronounced the problem becomes.

To solve this problem, some browsers use a non-standard Connection field when making requests.

Connection: keep-alive

HTTP 1.1

  • The biggest change in version 1.1 is the introduction of persistent connections: TCP connections are not closed by default and can be reused by multiple requests without declaring Connection: keep-alive. This standardizes version 1.0's keep-alive workaround, so one TCP connection can carry multiple HTTP requests, and either side may close the connection if it finds the other inactive for a period of time. Standard practice, however, is for the client to send Connection: close with its last request, explicitly asking the server to close the TCP connection.

  • A pipelining mechanism was added, allowing multiple requests to be sent over the same TCP connection without waiting for responses, which increases concurrency and further improves the efficiency of the HTTP protocol. For example, suppose the client needs two resources. Previously it would send request A over the connection, wait for the server's response, and only then send request B. Pipelining lets the browser send requests A and B together, although the server must still respond to them in the same order.

  • Added request methods such as PUT, PATCH, OPTIONS, and DELETE.

  • The Host field was added to the client request header to specify the server's domain name. HTTP/1.0 assumed each server was bound to a unique IP address, so the URL in the request message did not carry a hostname. With the development of virtual hosting, however, a single physical server can host multiple virtual hosts (multi-homed Web servers) that share the same IP address.
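A minimal sketch of name-based virtual hosting (the hostnames, site contents, and `dispatch` helper are invented for illustration; real servers do far more parsing and validation): the server reads the Host header from the raw request and picks the matching site.

```python
# Toy name-based virtual hosting: two invented sites share one IP/port,
# and the Host header decides which one serves the request.
SITES = {
    "blog.example.com": "blog home page",
    "shop.example.com": "shop home page",
}

def dispatch(raw_request: bytes) -> str:
    lines = raw_request.decode("ascii").split("\r\n")
    # Find the Host header among the request headers (case-insensitive).
    host = next((ln.split(":", 1)[1].strip()
                 for ln in lines[1:] if ln.lower().startswith("host:")),
                None)
    return SITES.get(host, "default site")

req = b"GET / HTTP/1.1\r\nHost: shop.example.com\r\n\r\n"
page = dispatch(req)
```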

The Content-Length field

A TCP connection can now carry multiple responses, so there must be a mechanism to tell which packets belong to which response. This is what the Content-Length field is for: it declares the length of the current response.

Content-Length: 3495

This header tells the browser that the current response is 3,495 bytes long and that any bytes after it belong to the next response.
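The mechanism can be sketched in Python with a toy parser (deliberately simplified: no chunked encoding, well-formed headers assumed): Content-Length tells the parser where one response body ends and the next response begins.

```python
# Toy parser: split a byte stream containing back-to-back responses using
# each response's Content-Length header.
def split_responses(stream: bytes):
    responses = []
    while stream:
        head, _, rest = stream.partition(b"\r\n\r\n")
        length = 0
        for line in head.split(b"\r\n"):
            if line.lower().startswith(b"content-length:"):
                length = int(line.split(b":", 1)[1])
        responses.append(rest[:length])   # body of the current response
        stream = rest[length:]            # what follows is the next response
    return responses

wire = (b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nfirst"
        b"HTTP/1.1 200 OK\r\nContent-Length: 6\r\n\r\nsecond")
bodies = split_responses(wire)
```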

In version 1.0, the Content-Length field was not required, because the browser knew all data had been received once it noticed the server had closed the TCP connection.

Although version 1.1 allows TCP connections to be reused, all data traffic within one connection is sequential: the server processes requests in queue order and does not start the next response until the current one is finished. If the first request takes a long time to process, many requests pile up behind it, causing "head-of-line blocking". In addition, HTTP is stateless, so repeated header fields must be attached to every request, reducing bandwidth utilization.
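Pipelining and the strictly in-order responses described above can be sketched with a toy local server (the port, paths, and one-byte responses are invented): the client writes both requests before reading anything, and the server answers them in request order.

```python
import socket
import threading

def pipelined_server(ready):
    # Toy server: reads two pipelined requests, then responds in order.
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    ready["port"] = srv.getsockname()[1]
    ready["event"].set()
    conn, _ = srv.accept()
    buf = b""
    while buf.count(b"\r\n\r\n") < 2:     # wait until both requests arrive
        buf += conn.recv(4096)
    for body in (b"A", b"B"):             # respond in request order
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 1\r\n\r\n" + body)
    conn.close()
    srv.close()

ready = {"event": threading.Event()}
threading.Thread(target=pipelined_server, args=(ready,), daemon=True).start()
ready["event"].wait()

c = socket.socket()
c.connect(("127.0.0.1", ready["port"]))
# Pipelining: both requests go out before any response is read.
c.sendall(b"GET /a HTTP/1.1\r\nHost: x\r\n\r\n"
          b"GET /b HTTP/1.1\r\nHost: x\r\n\r\n")
data = b""
while True:
    chunk = c.recv(4096)
    if not chunk:
        break
    data += chunk
c.close()
```

If producing response A were slow here, response B could not be sent early: that serialization is exactly the head-of-line blocking HTTP/2 set out to remove.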

HTTP 2.0

To address version 1.1's low utilization, HTTP/2.0 was proposed. It adds full-duplex operation: the client can send multiple requests at once and the server can process multiple requests at once, which solves head-of-line blocking at the HTTP layer (HTTP/2.0 uses multiplexing to handle multiple requests concurrently on one connection, supporting several orders of magnitude more concurrent requests than HTTP/1.1). In addition, the status line and request/response headers of HTTP messages are informational fields rather than real data, so version 2.0 builds a table of these fields: each field gets an index, the client and server share the table, and only the index numbers pass between them. This avoids the repetitive, verbose fields of earlier versions, and the headers are delivered in compressed form, improving utilization.
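The shared field table can be illustrated with a toy Python class (the real mechanism is HPACK, which this sketch does not implement; all names here are invented): the first occurrence of a field is sent literally and appended to both tables, and later occurrences are sent as a small index number.

```python
# Toy version of HTTP/2's shared header table: repeated fields are replaced
# by an index into a table that the sender and receiver build up in lockstep.
class HeaderTable:
    def __init__(self):
        self.table = []

    def encode(self, field):
        if field in self.table:
            return self.table.index(field)   # repeat field: send index only
        self.table.append(field)
        return field                         # first time: send the full field

    def decode(self, token):
        if isinstance(token, int):
            return self.table[token]         # index lookup
        self.table.append(token)             # new field: remember it
        return token

sender, receiver = HeaderTable(), HeaderTable()
field = ("user-agent", "toy-client/1.0")
first = receiver.decode(sender.encode(field))    # full field on first use
second = receiver.decode(sender.encode(field))   # just an index afterwards
```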

In addition, a server push feature was added: the server can actively send data to the client without receiving a request.

Multiplexing allows multiple request-response exchanges to be in flight simultaneously over a single HTTP/2 connection.

The current mainstream protocol version is HTTP/1.1.

  • Multiplexing: In HTTP/1.x, even with long connections enabled, requests are sent serially, so bandwidth utilization is poor even when bandwidth is plentiful. HTTP/2.0 uses multiplexing, sending multiple requests in parallel to improve bandwidth utilization.

  • Binary protocol

    • The HTTP/1.1 header is always text (ASCII-encoded), while the body can be text or binary. HTTP/2 is a fully binary protocol: headers and bodies alike are binary, carried as "frames" of two kinds, header frames and data frames.

    • One advantage of a binary protocol is that additional frame types can be defined. HTTP/2 defines nearly ten, setting the stage for future advanced applications. Doing this with text would make parsing cumbersome; parsing binary is much more convenient.

  • Multiplexing

    • HTTP/2 multiplexes the TCP connection: within a single connection, both the client and the server can send multiple requests or responses at the same time, without having to wait their turn, thus avoiding head-of-line blocking.

    • For example, within one TCP connection the server receives request A and request B. It starts responding to A, finds the processing slow, sends the part of A's response that is ready, responds to B, and then sends the rest of A's response.

  • Data streams

    • Because HTTP/2 packets are sent out of order, consecutive packets on the same connection may belong to different responses, so each packet must be marked to indicate which response it belongs to.

    • HTTP/2 calls all the packets of a single request or response a stream. Each stream has a unique ID, and every packet must carry its stream ID so the receiver can tell which stream it belongs to. The standard also stipulates that streams initiated by the client have odd IDs and streams initiated by the server have even IDs.

    • Either side can cancel a stream partway through by sending a signal (an RST_STREAM frame). In version 1.1, the only way to cancel a request was to close the TCP connection; HTTP/2 can cancel one request while keeping the TCP connection open and available for other requests.

    • The client can also assign priorities to streams; the higher the priority, the sooner the server responds.

  • Header compression

    • HTTP is stateless, so all information must be attached to each request. Many request fields are therefore repeated, such as Cookie and User-Agent, and sending the same content with every request wastes bandwidth and hurts speed.

    • HTTP/2 optimizes this with header compression. On one hand, headers are compressed (with gzip or compress) before being sent; on the other, the client and server both maintain a header table that stores fields and assigns each an index, so a field that has already been sent is represented by just its index number, which increases speed.

  • Server push

    • HTTP/2 allows a server to send resources to a client unsolicited, which is called server push.

    • This means that when we request data from an HTTP/2-enabled web server, the server can push some resources to the client along with the response, so the client does not have to open another connection and request them. This works very well for loading static resources.

    • Resources pushed by the server are stored on the client side, so the client can load them directly from local storage without going over the network, which is naturally much faster.

    • A common scenario: a client requests a web page that contains many static resources. Normally the client must receive the page, parse the HTML source, find the static resources, and then request each one. But the server can anticipate that a client requesting the page will likely request its static resources too, and proactively send them along with the page.

    • Server push can deliver the resources the client will need along with index.html, saving the client from issuing repeated requests. Since no extra requests or connections are involved, pushing static resources this way greatly improves speed.
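The stream mechanism described above can be sketched as a toy demultiplexer (the frame layout and payloads are invented; real HTTP/2 frames carry typed binary headers): interleaved frames are reassembled per stream ID, with client-initiated streams using odd IDs.

```python
# Toy demultiplexer for HTTP/2-style streams: each frame carries a stream ID,
# frames from different streams may interleave, and the receiver reassembles
# each stream independently.
def demux(frames):
    streams = {}
    for stream_id, payload in frames:
        streams.setdefault(stream_id, b"")
        streams[stream_id] += payload
    return streams

# Interleaved frames from two client-initiated streams (odd IDs 1 and 3).
frames = [(1, b"index"), (3, b"sty"), (1, b".html"), (3, b"le.css")]
streams = demux(frames)
```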

HTTP 3.0

HTTP 2.0 already performs well, so what are its downsides?

  • Long connection setup time (essentially a TCP problem)
  • Head-of-line blocking
  • Poor performance on the mobile Internet (weak network environments)
  • …

Those familiar with HTTP 2.0 will know that these shortcomings are basically caused by the TCP protocol: the water that carries the boat can also capsize it.

TCP head-of-line blocking

TCP delivers packets strictly in order. If one packet is lost, the receiver must wait for its retransmission, blocking all subsequent packets.
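A toy model of this blocking (the sequence numbers and payloads are invented; this is a simulation, not real TCP code): the receiver may buffer out-of-order segments, but it can hand data to the application only in sequence order, so one missing segment stalls everything behind it.

```python
# Toy model of TCP in-order delivery: out-of-order segments are buffered,
# and data is released to the application only when it becomes contiguous.
def deliver_in_order(arrivals):
    delivered, buffered, expected = [], {}, 0
    for seq, data in arrivals:
        buffered[seq] = data
        while expected in buffered:        # release the next contiguous segment
            delivered.append(buffered.pop(expected))
            expected += 1
    return delivered

# Segment 1 is lost and retransmitted last; segments 2 and 3 arrive on time
# but must wait behind it.
arrivals = [(0, b"a"), (2, b"c"), (3, b"d"), (1, b"b")]
```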

Upgrading TCP itself is not easy. UDP does not carry the baggage TCP takes on to guarantee reliable delivery, but UDP itself is unreliable and cannot be used directly.

The QUIC protocol and HTTP 3.0

QUIC stands for Quick UDP Internet Connections.

HTTP 3.0, also known as HTTP over QUIC, abandons TCP and instead uses QUIC, which is built on UDP.

QUIC protocol in detail

Since HTTP 3.0 adopts the QUIC protocol, it basically inherits the powerful features of HTTP 2.0, and it further solves some of the problems in HTTP 2.0, while inevitably introducing new problems of its own.

Head-of-line blocking

Head-of-line blocking can occur at both the HTTP layer and the TCP layer; HTTP/1.x suffers from it at both levels.

  • HTTP/2.0's multiplexing solves head-of-line blocking at the HTTP layer, but the problem still exists at the TCP layer.

  • Packets may arrive at the TCP receiver out of order, but TCP must collect and reorder all data before handing it to the upper layer. If a packet is lost, TCP must wait for its retransmission, so a single lost packet blocks data delivery for the entire connection.

  • The QUIC protocol is built on UDP. A single connection can carry multiple streams that do not affect one another; when one stream loses a packet, the impact is confined to that stream, which solves head-of-line blocking.
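The per-stream independence can be contrasted with the TCP model in a toy Python sketch (stream numbers and payloads are invented; this is a simulation, not the real QUIC protocol): ordering is tracked per stream, so a loss in one stream leaves the others unaffected.

```python
# Toy model of QUIC-style per-stream ordering: each stream has its own
# sequence space, so a gap in one stream never stalls another stream.
def deliver_per_stream(arrivals):
    delivered, buffered, expected = {}, {}, {}
    for stream, seq, data in arrivals:
        buffered.setdefault(stream, {})[seq] = data
        expected.setdefault(stream, 0)
        delivered.setdefault(stream, [])
        # Release contiguous data for this stream only.
        while expected[stream] in buffered[stream]:
            delivered[stream].append(buffered[stream].pop(expected[stream]))
            expected[stream] += 1
    return delivered

# Stream 1 loses its first packet; stream 2 still delivers everything.
arrivals = [(1, 1, b"y"), (2, 0, b"h"), (2, 1, b"i")]
out = deliver_per_stream(arrivals)
```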

Why does HTTP3.0 use UDP?