Preface
At the time of writing, HTTP/3 is still a draft, but the latest version of Chrome already supports it by default. With Chrome holding roughly 70% of the browser market, HTTP/3 has effectively gone mainstream. HTTP/3 was designed to make the Web more efficient and secure and to shorten content delivery delays, and it is a refinement of HTTP/2. Instead of TCP, it uses the QUIC protocol, which is built on UDP.
What are the advantages of QUIC over TCP? To understand TCP’s shortcomings, it is best to start from the beginning.
HTTP/0.9
In 1991, Tim Berners-Lee published the first documented version of HTTP at CERN, later named HTTP/0.9. Because the Internet was not yet widespread at the time, only GET requests were supported and there were no headers. Without headers, HTTP/0.9 could only transfer plain text and HTML documents (images, video, and other formats could not be transferred).
HTTP/0.9 is stateless: each transaction is processed independently and the connection is released when the transaction ends. The client first establishes a TCP connection to the server and then sends a request; the server returns the page content and closes the connection. If the requested page does not exist, no error status code is returned.
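The exchange described above can be sketched in a few lines. This is a minimal illustration, not a real server: the `PAGES` table and both function names are hypothetical, and no socket is opened. It shows that an HTTP/0.9 request is a single `GET` line and that the response is the bare document, with no status line or headers.

```python
# Hypothetical in-memory "site"; a real server would read files from disk.
PAGES = {"/index.html": "<html>Hello</html>"}


def build_http09_request(path: str) -> bytes:
    """An HTTP/0.9 request is one line: GET and a path, no version, no headers."""
    return f"GET {path}\r\n".encode("ascii")


def handle_http09_request(raw: bytes) -> bytes:
    """Return the bare document body; HTTP/0.9 has no status line or headers."""
    method, _, path = raw.decode("ascii").strip().partition(" ")
    if method != "GET":
        # Only GET exists in HTTP/0.9; this sketch answers anything else with nothing.
        return b""
    # A missing page cannot be signalled with a status code, only with some HTML text.
    body = PAGES.get(path, "<html>Page not found</html>")
    return body.encode("ascii")


request = build_http09_request("/index.html")
response = handle_http09_request(request)
```

Because the response carries no status code, the client has no reliable way to tell a real page from an error page, which is one of the gaps HTTP/1.0 later closed.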
HTTP/1.0
Features
In 1996, HTTP/1.0 was released, which added the following major features over HTTP/0.9.
- More request methods: POST and HEAD were added. (POST allows clients to submit data to the server.)
- HTTP headers were introduced, allowing metadata to be sent with both requests and responses and making the protocol flexible and extensible. Thanks to the Content-Type header, documents other than plain text and HTML can be transferred.
- Status codes were added so that the client knows whether a request succeeded or failed and can adjust its behavior accordingly (for example, by updating or using its local cache).
- A caching mechanism was added.
- Authentication was added.
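To make the role of the status line and headers concrete, here is a minimal sketch of parsing an HTTP/1.0 response. The function name `parse_response` and the sample bytes are assumptions made for illustration; real clients handle many more edge cases (folded headers, repeated header names, and so on).

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.0 response into (status code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line, e.g. "HTTP/1.0 200 OK"
    _version, code, _reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body


# Illustrative response: the Content-Type header tells the client the body is an image.
raw = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: image/png\r\n"
    b"Content-Length: 4\r\n"
    b"\r\n"
    b"\x89PNG"
)
status, headers, body = parse_response(raw)
```

The status code lets the client branch on success or failure, and `Content-Type` is what makes non-HTML payloads possible.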
Defects
HTTP/1.0 still has flaws.
- Short-lived connections: a TCP connection cannot be reused. A new TCP connection must be established for every request and is closed as soon as the request has been processed.
- Wasted bandwidth: a client cannot request only part of an object.
- Head-of-line blocking: when a page requests many resources and the number of outstanding requests reaches the concurrency limit, the remaining resources can be requested only after earlier ones have completed, so the bandwidth cannot be fully utilized.
HTTP/1.1
Features
In 1997, HTTP/1.1 was released to further improve the protocol. The main improvements are as follows.
- Improved cache handling
HTTP/1.0 relies on the Expires header to decide whether a cached copy is still fresh (strong caching); HTTP/1.1 adds the Cache-Control header, which takes precedence over Expires. If the cached copy is no longer fresh, the client revalidates it with the server (negotiated caching). In HTTP/1.0, revalidation uses the Last-Modified / If-Modified-Since headers; HTTP/1.1 adds ETag / If-None-Match.
- Persistent connections
HTTP/1.1 keeps connections alive by default (Connection: keep-alive). Multiple HTTP requests and responses can be sent over one TCP connection, which reduces the cost and delay of repeatedly opening and closing connections.
- Bandwidth optimization
HTTP/1.0 wasted bandwidth: for example, the client might need only part of an object while the server sent the whole thing. HTTP/1.1 adds the Range request header, which lets the client ask for part of a resource (the response status code is 206). This is the basis for resumable downloads.
- The Host header
HTTP/1.0 assumed that each server was bound to a unique IP address, so the request did not carry a host name. With virtual hosting, one server can host multiple sites that share the same IP address. HTTP/1.1 therefore requires requests to include the Host header field.
- New error status codes
24 status codes were added. For example, 409 indicates that the request conflicts with the current state of the resource, and 410 indicates that a resource on the server has been permanently deleted.
- Pipelining
A client may send a second request before the response to the first has been fully received, which reduces communication latency.
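The Range mechanism from the bandwidth optimization item above can be sketched as follows. This is a simplified illustration under stated assumptions: `serve_range` is a hypothetical helper, and it handles only the `bytes=start-end` and `bytes=start-` forms, not suffix ranges or multiple ranges.

```python
def serve_range(resource: bytes, range_header: str):
    """Return (status, headers, body) for a single 'bytes=start-end' range."""
    unit, _, spec = range_header.partition("=")
    start_text, _, end_text = spec.partition("-")
    if unit != "bytes" or not start_text.isdigit():
        # Unsupported form: fall back to sending the whole resource.
        return 200, {"Content-Length": str(len(resource))}, resource
    start = int(start_text)
    end = int(end_text) if end_text.isdigit() else len(resource) - 1
    end = min(end, len(resource) - 1)
    if start > end:
        # The requested range lies outside the resource.
        return 416, {"Content-Range": f"bytes */{len(resource)}"}, b""
    body = resource[start:end + 1]
    headers = {
        "Content-Range": f"bytes {start}-{end}/{len(resource)}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body


data = bytes(range(100))
# Resume a download that stopped after the first 40 bytes.
status, headers, body = serve_range(data, "bytes=40-")
```

A client that already holds the first 40 bytes only needs the 206 response's 60 bytes to complete the download, which is the essence of a resumable transfer.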
Defects
As the Web evolved, HTTP/1.1 also showed some limitations:
- Persistent connections allow a TCP connection to be reused, but with domain sharding multiple TCP connections still have to be established, which puts pressure on the server.
- Pipelining solves only part of the problem. The client can send several requests at once, but responses must be returned in request order, so one slow request blocks the responses to all the requests behind it.
- HTTP/1.1 headers carry too much information, which increases the cost of each transfer.
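The pipelining limitation described above can be shown with a little arithmetic. The sketch below rests on simplifying assumptions that are not from the article: all requests are sent at time 0, the server works on them concurrently, and transfer time is ignored. The function name and the timings are hypothetical.

```python
from itertools import accumulate


def pipelined_delivery_times(processing_times):
    """Responses must leave in request order, so each one waits for all earlier ones."""
    return list(accumulate(processing_times, max))


# Hypothetical server-side processing times in milliseconds for three pipelined requests.
processing = [300, 20, 50]
in_order = pipelined_delivery_times(processing)
```

Under these assumptions the second and third responses are ready after 20 ms and 50 ms, yet none can be delivered before the first one finishes at 300 ms.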
HTTP/2.0
Features
In 2015, HTTP/2.0 was released. It is derived from the SPDY protocol and still runs over TCP. Compared with HTTP/1.1, HTTP/2.0 adds the following major features:
- Multiplexing
HTTP/1.1 is a text-based protocol in which messages are delimited within the byte stream, so data on a connection is transmitted sequentially and cannot be sent in parallel. HTTP/2.0 is based on binary frames, and each frame identifies the stream it belongs to. Frames from different requests can therefore be interleaved on one connection and reassembled correctly at the other end.
- Header compression
HTTP/1.1 headers are sent as plain text, which adds 500 to 8000 bytes to the cost of each request. HTTP/2.0 predefines 61 header fields in a static table, such as the common :method GET and :method POST. A dynamic table (a queue) stores fields that are not predefined, with indexes starting at 62, so that later transfers can send just the index. In addition, static Huffman coding is supported for string keys and values, which reduces the amount of data transmitted and improves security.
- Server push
In addition to the response to the original request, the server can push additional resources to the browser without the browser having to request them. HTTP/2.0 server push cannot replace WebSocket, because pushed data is not delivered to the client application itself.
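To make the multiplexing item above concrete, here is a minimal sketch of HTTP/2-style framing. It uses the 9-byte frame header layout (24-bit length, 8-bit type, 8-bit flags, one reserved bit plus a 31-bit stream identifier) and only DATA frames. The helper names and payloads are assumptions for illustration; a real implementation also handles HEADERS frames, flow control, and settings.

```python
import struct
from collections import defaultdict

DATA = 0x0  # frame type for DATA frames


def encode_frame(stream_id: int, payload: bytes, frame_type: int = DATA, flags: int = 0) -> bytes:
    """Prefix the payload with a 9-byte header: length, type, flags, stream id."""
    length = len(payload)
    header = struct.pack(
        "!BHBBI", length >> 16, length & 0xFFFF, frame_type, flags, stream_id & 0x7FFFFFFF
    )
    return header + payload


def decode_frames(buffer: bytes):
    """Yield (stream_id, payload) for each frame found in the buffer."""
    offset = 0
    while offset < len(buffer):
        hi, lo, _type, _flags, stream_id = struct.unpack_from("!BHBBI", buffer, offset)
        length = (hi << 16) | lo
        offset += 9
        yield stream_id & 0x7FFFFFFF, buffer[offset:offset + length]
        offset += length


# Two responses interleaved on one connection (streams 1 and 3).
wire = b"".join([
    encode_frame(1, b"<html>"),
    encode_frame(3, b"body{"),
    encode_frame(1, b"</html>"),
    encode_frame(3, b"}"),
])

# The receiver regroups payloads by stream id, so interleaving does not corrupt them.
streams = defaultdict(bytes)
for stream_id, payload in decode_frames(wire):
    streams[stream_id] += payload
```

Because every frame names its stream, the receiver can rebuild each response independently of the order in which frames were interleaved.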
HTTP/3.0
HTTP/2.0 performs significantly better than HTTP/1.1, but because it still runs over TCP, it cannot escape TCP’s limitations.
TCP limitations
- Head-of-line blocking
HTTP/2.0 carries multiple requests over a single TCP connection. When TCP loses a packet, the whole connection waits for the retransmission, which blocks every request on that connection. TCP is a byte-stream protocol, and the TCP layer must deliver bytes complete and in order. If a segment with a lower sequence number is lost, the application layer cannot read the data from the kernel even if segments with higher sequence numbers have already arrived.
- TCP/TLS handshake delay
Before an HTTP request can be sent, a TCP three-way handshake and a TLS four-way handshake are required, which together cost about three round-trip times (RTTs). TCP’s congestion control also means that every new connection begins in slow start, which limits throughput early in the connection.
- Network migration requires reconnection
A TCP connection is identified by a four-tuple (source IP address, source port, destination IP address, destination port). If any IP address or port changes, the TCP and TLS handshakes must be performed again, which is inconvenient for mobile devices that switch networks.
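The head-of-line blocking item above comes down to in-order delivery. The sketch below is a deliberately simplified model with hypothetical names and segment sizes: it tracks which segments have arrived and computes how many contiguous bytes the application could read.

```python
def readable_bytes(received_segments, next_expected=0):
    """Return how many contiguous bytes the application can read.

    received_segments maps a segment's starting sequence number to its length.
    """
    while next_expected in received_segments:
        next_expected += received_segments[next_expected]
    return next_expected


# Segments of 100 bytes each; the one starting at byte 100 was lost in transit.
arrived = {0: 100, 200: 100, 300: 100}
before_retransmit = readable_bytes(arrived)

arrived[100] = 100  # the retransmission finally arrives
after_retransmit = readable_bytes(arrived)
```

In this model 300 bytes have arrived but only the first 100 are readable until the missing segment is retransmitted, regardless of which HTTP/2.0 requests the later bytes belong to.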
Characteristics of QUIC
As stated at the beginning of this article, HTTP/3.0 runs over QUIC, a protocol built on UDP. QUIC has four main advantages.
- Connection migration
QUIC identifies the two endpoints of a connection by connection IDs rather than by addresses. Even if a mobile device changes networks and its IP address changes, the original connection can be reused “seamlessly” as long as both sides still hold the connection ID and TLS keys, which eliminates the cost of reconnecting.
- No head-of-line blocking
Like HTTP/2, QUIC can carry multiple streams concurrently. UDP itself may lose packets, so QUIC gives every packet a unique number and retransmits lost data to provide reliability. When a packet belonging to a stream is lost, HTTP/3.0 cannot read that stream's later data, even if the stream's other packets have arrived, until QUIC retransmits the lost data and hands it over. However, the streams on a QUIC connection are independent of one another: packet loss in one stream affects only that stream, and other streams are not affected.
- Custom congestion control
QUIC improves on TCP’s congestion control. It is pluggable: different congestion control algorithms can be implemented at the application level without support from the operating system or kernel, different connections of a single application can be configured with different algorithms, and the algorithm can be changed without downtime or upgrades. Monotonically increasing packet numbers let QUIC distinguish an original packet from its retransmission, which avoids the retransmission ambiguity problem. QUIC also supports more ACK blocks (up to 256), and RTT can be calculated accurately.
- Forward secrecy and forward error correction
After sending a group of packets, the sender XORs them together (a cheap operation) and sends the result as well. If one packet of the group is lost, the receiver can reconstruct it from the packets that did arrive and the XOR result, which improves reliability without waiting for a retransmission.
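The XOR scheme in the last item can be sketched as follows. This is a minimal illustration, assuming equal-length packets and at most one lost packet per group; the function names and sample packets are hypothetical, and this is not the wire format of any QUIC implementation.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets):
    """XOR all packets of a group together to form the parity packet."""
    return reduce(xor_bytes, packets)


def recover_missing(received, parity):
    """Rebuild the single missing packet from the survivors and the parity packet."""
    return reduce(xor_bytes, received, parity)


group = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(group)

# Suppose the second packet is lost; the other two and the parity packet arrive.
recovered = recover_missing([group[0], group[2]], parity)
```

XORing the parity with every surviving packet cancels them out and leaves the missing one. If two packets of the same group are lost, this simple scheme cannot recover either and retransmission is still needed.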
Additional questions
What’s the difference between GET and POST?
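- GET request data is appended to the URL as a query string, while POST data is carried in the HTTP message body.
- GET parameters are limited in length (by browsers and servers), while POST parameters are not limited in the same way.
- POST is somewhat safer than GET because its parameters do not appear in the URL, browser history, or server logs; neither method is confidential without HTTPS.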
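The first difference is easy to see by writing out both requests. The sketch below builds them as plain strings using the standard library's `urlencode`; the host `example.com`, the `/search` path, and the parameters are placeholders chosen for illustration.

```python
from urllib.parse import urlencode

params = {"q": "http3", "page": "2"}
encoded = urlencode(params)

# GET: the parameters travel in the request target (the URL's query string).
get_request = (
    f"GET /search?{encoded} HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "\r\n"
)

# POST: the same parameters travel in the message body.
post_request = (
    "POST /search HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(encoded)}\r\n"
    "\r\n"
    f"{encoded}"
)
```

Both requests carry identical data; only its position differs, which is why GET parameters show up in address bars and logs while POST parameters do not.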
How does TCP differ from UDP?
- TCP is connection-oriented; UDP is connectionless.
- TCP is byte-stream oriented; UDP is datagram oriented.
- TCP provides reliable service: data sent over a TCP connection arrives error-free, without loss or duplication, and in order. UDP is best-effort, so reliable delivery is not guaranteed.
- TCP connections are point-to-point; UDP supports one-to-one, one-to-many, many-to-one, and many-to-many communication.
- The TCP header is at least 20 bytes; the UDP header is only 8 bytes.
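The header sizes in the last item can be checked by laying out the fixed fields with `struct`. This is a sketch of the field layout only: the port numbers and other field values are arbitrary examples, the checksums are left as 0, and TCP options (which make the header longer than 20 bytes) are omitted.

```python
import struct

# UDP header: source port, destination port, length, checksum (4 x 16 bits).
UDP_HEADER = struct.Struct("!HHHH")

# TCP header without options: source port, destination port, sequence number,
# acknowledgment number, data offset/reserved, flags, window, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIBBHHH")

# A datagram with a 12-byte payload; the length field covers header plus payload.
udp_header = UDP_HEADER.pack(5000, 53, 8 + 12, 0)

# A SYN segment: data offset of 5 words (20 bytes) in the upper 4 bits, SYN flag 0x02.
tcp_header = TCP_HEADER.pack(5000, 80, 1000, 0, 5 << 4, 0x02, 65535, 0, 0)
```

The extra TCP fields (sequence and acknowledgment numbers, window, flags) are what pay for reliability, ordering, and flow control.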
Common status codes
1xx codes are interim responses that tell the requester to continue with the operation.
- 100 indicates that the requester should continue with the request, and that the server has received the first part of the request and is waiting for the rest.
- 101 indicates that the requester has asked the server to switch protocols and that the server has confirmed and is ready to switch.
2xx indicates that the request was processed successfully.
- 200 indicates that the server successfully processed the request.
- 201 indicates that the request succeeds and the server creates a new resource.
- 202 indicates that the server has accepted the request but has not yet processed it.
- 203 indicates that the server successfully processed the request, but the returned information may have come from another source.
- 204 indicates that the server successfully processed the request but returns no content; updated metadata may still be returned in the response headers.
- 205 indicates that the server successfully processed the request but returns no content. Unlike 204, this response asks the requester to reset the document view for the next input.
3xx indicates that further action is required to complete the request, and typically, these status codes are used for redirection.
- 301 indicates a permanent redirect.
- 302 indicates a temporary redirect.
- 304 indicates that the resource has not been modified, so the locally cached copy can be used.
- 305 indicates that the requested resource must be accessed through a proxy.
4xx indicates that the request probably contains an error that prevents the server from processing it.
- 400 indicates that the server does not understand the request syntax.
- 401 indicates that authentication is required.
- 403 indicates that the server rejects the request.
- 404 indicates that the server could not find the requested page.
- 410 indicates that the requested resource is permanently deleted.
5xx indicates that the server encountered an internal error while processing the request.
- 500 indicates that the server encountered an error and could not complete the request.
- 501 indicates that the server is not capable of completing the request. For example, the server may return this code if it does not recognize the request method.
- 502 indicates that the server, acting as a gateway or proxy, received an invalid response from the upstream server.
- 503 indicates that the server is currently unavailable (overloaded or down for maintenance).
- 504 indicates that the server, acting as a gateway or proxy, did not receive a timely response from the upstream server.
- 505 indicates that the server does not support the HTTP version of the request.
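The five classes above follow directly from the first digit of the code. As a small sketch, the snippet below pairs that rule with the reason phrases from Python's standard `http.HTTPStatus` table; the `CLASSES` labels and the `describe` helper are names chosen for this example.

```python
from http import HTTPStatus

# Class labels keyed by the first digit of the status code.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return 'code phrase (class)' using the standard library's status table."""
    status = HTTPStatus(code)
    return f"{code} {status.phrase} ({CLASSES[code // 100]})"


summary = [describe(code) for code in (101, 204, 304, 410, 504)]
```

A client can often decide how to react from the class alone, for example retrying on some 5xx responses but not on 4xx ones.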