This article appeared at https://jaychen.cc
Author: jaychen
Some notes on HTTP2.
HTTP2 grew out of SPDY, a protocol developed under Google's leadership. Google later handed the work over to the IETF, which standardized it as HTTP2, and Google has since dropped SPDY in favor of HTTP2. HTTP2 keeps HTTP/1.x semantics (methods, status codes, and header fields stay the same) and adds four major new features:
- Binary framing
- Header compression
- Server push
- Multiplexing
I’ll go through these four features, then look at which HTTP/1.x optimizations are still worth keeping.
Binary framing
HTTP/1.x is a text protocol, while HTTP2 is a binary protocol through and through, and that is what makes most of its new features possible. The binary layer in HTTP2 is called binary framing.
The basic unit of the HTTP2 protocol is the frame, similar to a segment in TCP.
+--------------------------------------------------------------+ ^
| | |
| Length (24) | |
| | |
| | |
+----------------------+---------------------------------------+ |
| | | +
| | |
| Type (8) | Flag (8) | Frame Header
| | | +
+----+-----------------+---------------------------------------+ |
| | | |
| | | |
| R | Stream Identifier (31) | |
| | | v
+----+---------------------------------------------------------+
| |
| Frame Payload |
| |
+--------------------------------------------------------------+
A frame consists of a Frame Header and a Frame Payload. What used to be the header and body of an HTTP/1.x message is now carried in the Frame Payload.
- The Type field indicates whether the Frame Payload holds header data or body data. Besides the header and body types, there are several other frame types.
- The Length field gives the size of the Frame Payload.
- The Frame Payload stores the header or body data.
- The Stream Identifier indicates which stream the frame belongs to. It is the foundation of the second HTTP2 feature: multiplexing.
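To make the layout in the diagram concrete, here is a minimal Python sketch that packs and unpacks the 9-byte frame header. The function names `build_frame` and `parse_frame_header` are made up for this illustration, and only a few frame types are listed; it is not a full HTTP2 implementation.

```python
import struct

# A few frame type codes from the HTTP2 specification (not the full list).
FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS", 0x4: "SETTINGS", 0x5: "PUSH_PROMISE"}


def build_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Prefix a payload with the 9-byte frame header (hypothetical helper)."""
    length = len(payload)
    # Length is 24 bits, so it is packed as one byte plus one 16-bit word.
    header = struct.pack(
        ">BHBBI", length >> 16, length & 0xFFFF, frame_type, flags,
        stream_id & 0x7FFFFFFF,  # the top bit is the reserved R bit
    )
    return header + payload


def parse_frame_header(data: bytes) -> dict:
    """Decode Length (24), Type (8), Flags (8), R (1) and Stream Identifier (31)."""
    if len(data) < 9:
        raise ValueError("a frame header is 9 bytes long")
    length_hi, length_lo, frame_type, flags, stream = struct.unpack(">BHBBI", data[:9])
    return {
        "length": (length_hi << 16) | length_lo,
        "type": FRAME_TYPES.get(frame_type, hex(frame_type)),
        "flags": flags,
        "stream_id": stream & 0x7FFFFFFF,  # mask out the reserved bit
    }


frame = build_frame(0x0, 0x1, 3, b"hello")
header = parse_frame_header(frame)
```

Here `header` describes a DATA frame of length 5 on stream 3, and the payload is whatever follows the first 9 bytes.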
Multiplexing
In HTTP/1.x, requests compete for TCP connections: without keep-alive every request pays for a new three-way handshake, and even a reused connection carries only one request at a time. This wastes time and resources, and it cannot be avoided in HTTP/1.x. Browsers also limit the number of concurrent connections to the same domain name. A common HTTP/1.x optimization is therefore to spread static resources across several domain names to get around the browser's concurrency limit.
In HTTP2, all requests to a server share a single TCP connection, which is the killer feature of HTTP2. Because of this, many of the optimizations of the HTTP/1.x era can be retired. But if all requests share the same TCP connection, how does the client or server know which request a frame belongs to?
This is what the Stream Identifier described above is for: it identifies which request a frame belongs to.
When the client sends several requests at the same time, each request is split into frames, and frames from different requests are interleaved on the single TCP connection. All frames of one request carry the same Stream Identifier, so when they arrive the server can reassemble each complete request by Stream Identifier.
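The reassembly step can be sketched in a few lines of Python. This is a toy model: frames are reduced to hypothetical `(stream_id, chunk)` pairs, and the function name `reassemble` is invented for the example. It relies on frames of the same stream arriving in order, which TCP guarantees on a single connection.

```python
from collections import defaultdict


def reassemble(frames):
    """Group (stream_id, chunk) pairs by stream, keeping arrival order per stream."""
    streams = defaultdict(bytearray)
    for stream_id, chunk in frames:
        streams[stream_id] += chunk
    return {stream_id: bytes(buf) for stream_id, buf in streams.items()}


# Two requests whose frames are interleaved on one connection.
wire = [
    (1, b"request A, part 1; "),
    (3, b"request B, part 1; "),
    (1, b"request A, part 2"),
    (3, b"request B, part 2"),
]
messages = reassemble(wire)
```

After the call, `messages[1]` holds both parts of request A and `messages[3]` both parts of request B, even though the parts were mixed on the wire.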
Header compression
In HTTP/1.x, header data is sent with every request, and fields such as user-agent and accept-language hardly ever change between requests, so resending them each time is wasted traffic. HTTP2 therefore introduces the HPACK compression method to reduce the traffic consumed by HTTP headers on each request.
HPACK compression works as follows:
The client and the server each maintain a “static dictionary” with three columns per row, similar to the table below:
index | header name | header value |
---|---|---|
2 | :method | GET |
3 | :method | POST |
When the request header contains :method: GET, the client sends only the index of that row in the static dictionary, which in this case is 2. When the server receives the request, it looks up the header name and value for index 2 in the static dictionary and knows that the client has issued a GET request.
The client and the server must use exactly the same static dictionary. The complete static dictionary is defined in the HPACK specification (RFC 7541, Appendix A), and both sides follow it.
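As a rough illustration of how small an indexed header is on the wire, here is a Python sketch of the "indexed header field" form from the HPACK specification: a leading 1 bit followed by the index as a 7-bit prefix integer. `STATIC_TABLE` below is only a small excerpt of the real table, and the function names are invented for this example.

```python
# Excerpt of the HPACK static table: (name, value) -> index.
STATIC_TABLE = {
    (":method", "GET"): 2,
    (":method", "POST"): 3,
    (":path", "/"): 4,
    (":scheme", "https"): 7,
}


def encode_integer(value: int, prefix_bits: int, first_byte_flags: int = 0) -> bytes:
    """Prefix-integer encoding as described in RFC 7541, section 5.1."""
    max_prefix = (1 << prefix_bits) - 1
    if value < max_prefix:
        # Small values fit in the prefix of the first byte.
        return bytes([first_byte_flags | value])
    out = [first_byte_flags | max_prefix]
    value -= max_prefix
    while value >= 128:
        out.append((value % 128) + 128)  # 7 bits of data plus a continuation bit
        value //= 128
    out.append(value)
    return bytes(out)


def encode_indexed(name: str, value: str) -> bytes:
    """Encode a header that is fully present in the (excerpted) static table."""
    index = STATIC_TABLE[(name, value)]
    return encode_integer(index, 7, 0x80)  # 0x80 marks an indexed header field


get_bytes = encode_indexed(":method", "GET")  # b"\x82"
```

So the whole `:method: GET` header shrinks to the single byte `0x82`.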
You’ll notice that some rows in the static dictionary have a header name but no header value. This is because the values of some header fields vary, such as user-agent, so the standard does not fix a value for them.
If the static dictionary has no value for a header, the HPACK algorithm does the following:
Assume the HTTP request header contains user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36. HPACK Huffman-encodes the user-agent value and finds that the index of user-agent in the static dictionary is 58. The client then sends the index of user-agent together with the Huffman-encoded value to the server.
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36
58: Huffman('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36')
When the server receives the request, it decodes the Huffman-encoded value and appends the user-agent header after the last row of the static dictionary. These appended rows are called the “dynamic dictionary”. The static dictionary has 61 rows, so the first appended row gets index 62.
index | header name | header value |
---|---|---|
2 | :method | GET |
3 | :method | POST |
. | . | . |
62 | user-agent | the decoded header value |
When the client sends the request, it adds the same row to the table it maintains, so that the tables kept by the client and the server stay consistent. Later requests that need to carry the user-agent field can simply send 62.
This only works because in HTTP2 all requests travel over one TCP connection, so the rows added by earlier requests are still available to later ones.
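The bookkeeping on both sides can be sketched in Python as follows. This is a deliberately simplified model, not real HPACK: it skips Huffman coding, table size limits, and eviction, it sends the header name in full instead of referencing index 58, and the class and function names are invented for this example.

```python
STATIC_SIZE = 61  # number of rows in the HPACK static dictionary


class DynamicTable:
    """Rows appended after the static dictionary; the newest row gets index 62."""

    def __init__(self):
        self.entries = []  # newest entry first

    def add(self, name, value):
        self.entries.insert(0, (name, value))

    def find(self, name, value):
        for offset, entry in enumerate(self.entries):
            if entry == (name, value):
                return STATIC_SIZE + 1 + offset
        return None

    def lookup(self, index):
        return self.entries[index - STATIC_SIZE - 1]


def encode(table, name, value):
    """Sender side: send an index if the row is known, else a literal and remember it."""
    index = table.find(name, value)
    if index is not None:
        return ("indexed", index)
    table.add(name, value)
    return ("literal", name, value)


def decode(table, message):
    """Receiver side: mirror the sender's table updates."""
    if message[0] == "indexed":
        return table.lookup(message[1])
    _, name, value = message
    table.add(name, value)
    return (name, value)


client, server = DynamicTable(), DynamicTable()
first = encode(client, "user-agent", "Mozilla/5.0")   # sent in full
decode(server, first)
second = encode(client, "user-agent", "Mozilla/5.0")  # ("indexed", 62)
```

The first request carries the full header, and the second one carries only the index 62, which the server resolves from its own copy of the table.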
Server push
Server push refers to the server actively pushing data to the client.
For example, index.html has the following code
<!DOCTYPE html>
<html>
<head>
<link rel="stylesheet" href="style.css">
</head>
<body>
<h1>hello world</h1>
<img src="something.png">
</body>
</html>
Normally, it takes three requests to display the page:
- Initiate a request for the index.html page
- Parsing index.html reveals the style.css and something.png resources, which are then fetched with two more requests.
If the server is configured with server push, the exchange looks like this:
- The browser requests index.html.
- The server sees that the requested index.html references style.css and something.png, and sends them to the browser along with the page.
In this way, the browser obtains all the resources with a single request to the server.
Moving from HTTP/1.x to HTTP2
HTTP2 was designed to fix some of the performance problems of HTTP/1.x, so once HTTP2 is in use, many of the optimizations made for HTTP/1.x are no longer needed. What should we pay attention to when using HTTP2?
HTTPS
The relationship between HTTPS and HTTP2 is an interesting one. Google made HTTPS mandatory when developing SPDY, so it would have been logical for HTTP2, which is based on SPDY, to require it as well. The community pushed back, however, and the standard does not make HTTPS mandatory. Even so, both Chrome and Firefox have stated that they will only support HTTP2 over HTTPS, so in practice HTTPS is a prerequisite for using HTTP2.
Unnecessary optimization
In the HTTP/1.x era, the following methods were often used to reduce the number of browser requests or to increase the number of requests the browser can make concurrently:
- Domain name sharding: Static resources are spread across different domain names to get around the browser's per-domain concurrency limit. (Mentioned in the multiplexing section.)
- Merging files: The front end often merges several small files into one large file so that the browser can retrieve them with a single request. The drawback is that if only a small part of the file changes, the whole thing has to be sent again.
With HTTP2, the optimizations above are unnecessary.