• HTTP/2 Frequently Asked Questions
  • Original: the HTTP/2 FAQ
  • Translated by: the Gold Translation Project
  • Permalink: github.com/xitu/gold-m…
  • Translator: YueYong
  • Proofread by: Ranjay, Ziyin Feng

Here are answers to frequently asked questions about HTTP/2.

  • General questions
    • Why is HTTP being revised?
    • Who made HTTP/2?
    • What is the relationship between HTTP/2 and SPDY?
    • Is it HTTP/2.0 or HTTP/2?
    • What are the key differences between HTTP/2 and HTTP/1.x?
    • Why is HTTP/2 binary?
    • Why does HTTP/2 need multiplexing?
    • Why just one TCP connection?
    • What are the benefits of server push?
    • Why do headers need compression?
    • Why HPACK?
    • Can HTTP/2 make cookies (or other headers) better?
    • What about non-browser users of HTTP?
    • Does HTTP/2 require encryption?
    • How does HTTP/2 improve security?
    • Can I use HTTP/2 now?
    • Will HTTP/2 replace HTTP/1.x?
    • Will there be an HTTP/3?
  • Implementation questions
    • Why the rules around CONTINUATION on HEADERS frames?
    • What is the minimum or maximum HPACK state size?
    • How can I avoid keeping HPACK state?
    • Why is there a single compression/flow-control context?
    • Why is there an EOS symbol in HPACK?
    • Can I implement HTTP/2 without implementing HTTP/1.1?
    • Is the priority example in Section 5.3.2 correct?
    • Is TCP_NODELAY required for HTTP/2 connections?
  • Deployment questions
    • How do I debug encrypted HTTP/2?
    • How do I use HTTP/2 server push?

General questions

Why is HTTP being revised?

HTTP/1.1 has served the Web for more than fifteen years, but its age is starting to show.

Loading a web page is more resource-intensive than ever (see the HTTP Archive’s page size statistics), and loading all of those assets efficiently is difficult, because HTTP practically allows only one outstanding request per TCP connection.

In the past, browsers have used multiple TCP connections to issue parallel requests. However, there are limits to this: if too many connections are used, it is counterproductive (TCP congestion control is effectively negated, leading to congestion events that hurt performance and the network), and it is fundamentally unfair to other applications (because browsers claim more than their share of network resources).

At the same time, the large number of requests means a lot of duplicated data “on the wire”.

Both of these factors mean that HTTP/1.1 requests have a lot of overhead associated with them; if too many requests are made, performance suffers.

This has led the industry to a place where it is considered “best practice” to do things like spriting, data: inlining, domain sharding, and concatenation. These non-standard workarounds are indications of underlying problems in the protocol itself, and they cause a number of problems of their own when used.

Who made HTTP/2?

HTTP/2 was developed by the IETF’s HTTP Working Group, which maintains the HTTP protocol. The working group is made up of a number of HTTP implementers, users, network operators, and HTTP experts.

It is important to note that although the working group’s mailing list is hosted on the W3C website, this is not a W3C effort. However, Tim Berners-Lee and the W3C TAG are kept up to date with the WG’s progress.

Many people contributed to the effort, especially engineers from “big” projects such as Firefox, Chrome, Twitter, Microsoft’s HTTP stack, curl, and Akamai, as well as a number of HTTP implementers in Python, Ruby, and Node.js.

For more information about the IETF, see the Tao of the IETF; you can also see who is contributing on GitHub’s Contributors chart, and who is implementing on the Implementation List.

What is the relationship between HTTP/2 and SPDY?

When HTTP/2 was first discussed, it was apparent that SPDY was already gaining traction with implementers (such as Mozilla and nginx), and that it was showing significant improvements over HTTP/1.x.

SPDY/2 was chosen as the basis for HTTP/2 after a call for proposals and a selection process. Since then, it has changed significantly, based on working-group discussion and implementer feedback.

Throughout the process, SPDY’s core developers worked on HTTP/2, including Mike Belshe and Roberto Peon.

In February 2015, Google announced plans to remove support for SPDY in favor of HTTP/2.

Is it HTTP/2.0 or HTTP/2?

The working group decided to drop the minor version (“.0”) because it caused a lot of confusion in HTTP/1.x. In other words, the HTTP version indicates only wire compatibility, not feature sets or “highlights.”

What are the key differences between HTTP/2 and HTTP/1.x?

At a high level, HTTP/2:

  • is binary, instead of textual
  • is fully multiplexed, instead of ordered and blocking
  • can therefore use one connection for parallelism
  • uses header compression to reduce overhead
  • allows servers to “push” responses proactively into client caches

Why is HTTP/2 binary?

Compared with text-based protocols such as HTTP/1.x, binary protocols are more efficient to parse, more compact “on the wire,” and, most importantly, much less error-prone with respect to things like whitespace handling, capitalization, line endings, blank lines, and so on.

For example, HTTP/1.1 defines four different ways to parse a message; in HTTP/2, there is exactly one code path.

Admittedly, HTTP/2 can no longer be used through telnet, but we already have some tool support, such as a Wireshark plugin.
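
Concretely, every HTTP/2 frame begins with a fixed nine-octet header, which is why a single code path suffices. A minimal sketch of parsing it, following the frame layout in RFC 7540, Section 4.1:

```python
import struct

def parse_frame_header(data: bytes):
    """Parse the fixed 9-octet HTTP/2 frame header (RFC 7540, Section 4.1)."""
    if len(data) < 9:
        raise ValueError("need at least 9 octets")
    # 24-bit payload length, 8-bit type, 8-bit flags, then one reserved bit
    # followed by a 31-bit stream identifier.
    length = int.from_bytes(data[0:3], "big")
    frame_type, flags = data[3], data[4]
    stream_id = struct.unpack("!I", data[5:9])[0] & 0x7FFFFFFF  # clear R bit
    return length, frame_type, flags, stream_id
```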

Why does HTTP/2 need multiplexing?

HTTP/1.x has a problem called “head-of-line blocking”: in effect, only one request can be outstanding on a connection at a time, so later requests must wait behind earlier ones.

HTTP/1.1 tried to fix this with pipelining, but it didn’t work very well (a large or slow response can still block the requests behind it). In addition, pipelining has proven very hard to deploy, because many intermediaries and servers don’t handle it correctly.

This forces clients to use a number of heuristics (often guesswork) to decide which requests to put on which connections; since a page commonly loads ten times (or more) as many resources as there are available connections, this can severely hurt performance and often results in a “waterfall” of blocked requests.

Multiplexing addresses these problems by allowing multiple request and response messages to be in flight at the same time; it is even possible to intermingle parts of one message with another on the wire.

This, in turn, means that a client needs only one connection per origin to load a page.
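
As a toy illustration of why interleaving works (the frame shapes here are simplified assumptions, not a real parser): each frame names its stream, so a receiver can reassemble interleaved messages regardless of arrival order.

```python
from collections import defaultdict

def demultiplex(frames):
    """Toy demultiplexer: reassemble per-stream bodies from interleaved
    (stream_id, payload) tuples arriving on one connection."""
    streams = defaultdict(bytearray)
    for stream_id, payload in frames:
        streams[stream_id] += payload
    return streams

# Frames from two responses can be interleaved on the wire and still be
# reassembled unambiguously, because each frame carries its stream id:
interleaved = [(1, b"<html>"), (3, b"body{"), (1, b"</html>"), (3, b"}")]
assert demultiplex(interleaved)[1] == b"<html></html>"
```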

Why just one TCP connection?

With HTTP/1, browsers open between four and eight connections per origin. Since many sites use multiple origins today, that can mean more than thirty connections are opened just to load a single web page.

One application opening that many connections simultaneously breaks many of the assumptions TCP was built upon; and since each connection starts a flood of data in its responses, there is a real risk that buffers in the intervening network will overflow, causing congestion events and retransmissions.

In addition, using so many connections unfairly monopolizes network resources, “stealing” them from better-behaved applications (VoIP is a good example).

What are the benefits of server push?

When a browser requests a page, the server sends the HTML in the response, and then needs to wait for the browser to parse the HTML and issue requests for all of the embedded assets before it can start sending the JavaScript, images, and CSS.

Server push potentially avoids this round-trip delay by “pushing” the responses it thinks the client will need into the client’s cache.

However, pushed responses are not a “panacea”; used improperly, they can hurt performance. Correct use of server push is an ongoing area of experimentation and research.

Why do headers need compression?

Patrick McManus of Mozilla showed this vividly by calculating the effect of headers for an average page load.

Assuming a page has 80 resources to load (a conservative number for today’s Web), and each request carries 1400 bytes of headers (again, not uncommon, thanks to cookies, the Referer header, etc.), it takes at least seven to eight round trips just to get the headers out “on the wire.” That doesn’t count response time; it’s just the time needed to get the requests out of the client.

This is because of TCP’s slow-start mechanism, which paces packets out on new connections based on how many packets have been acknowledged, effectively limiting the number of packets that can be sent in the first few round trips.

By contrast, even mild compression of headers allows those requests to get onto the wire within one round trip, perhaps even one packet.

This overhead is considerable, especially when you consider the impact on mobile clients, which typically see round-trip latencies of several hundred milliseconds, even under good conditions.
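
A rough, back-of-the-envelope version of that calculation, using the figures from the paragraphs above; the segment size and the (historical) initial congestion window of one segment are illustrative assumptions, since both vary by TCP implementation:

```python
# Figures from the text above: 80 requests x 1400 bytes of headers each.
header_bytes = 80 * 1400            # 112,000 bytes of request headers
mss = 1460                          # assumed TCP segment payload size
segments = -(-header_bytes // mss)  # ceiling division: ~77 segments

cwnd, rounds = 1, 0                 # assumed historical initial window
while segments > 0:
    segments -= cwnd
    cwnd *= 2                       # slow start roughly doubles per round trip
    rounds += 1
print(rounds)                       # 7 round trips under these assumptions
```

Larger modern initial windows reduce the count, but a page’s headers alone still consume several round trips without compression.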

Why HPACK?

SPDY/2 proposed using a single GZIP context in each direction for header compression, which was easy and efficient to implement.

Since then, a major attack, CRIME, has been documented against the use of stream compression (such as GZIP) inside encryption.

With CRIME, an attacker who can inject data into the encrypted stream gains the ability to “probe” the plaintext and recover it. Since this is the Web, JavaScript makes that possible, and there have been demonstrations of recovering cookies and authentication tokens from TLS-protected HTTP resources using CRIME.

Therefore, we could not use GZIP compression. Since no other secure, efficient algorithm could be found for this use case, we created a new, header-specific compression scheme that operates at a coarse granularity; because HTTP headers often don’t change between messages, this still gives reasonable compression efficiency and is much safer.
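
To sketch the coarse-grained idea (greatly simplified; real HPACK also has a dynamic table, literal representations, and Huffman coding): headers found in HPACK’s static table (RFC 7541, Appendix A) can be encoded as a single index byte rather than compressed as a bit stream.

```python
# The first entries of the HPACK static table (RFC 7541, Appendix A).
STATIC_TABLE = {
    (":authority", ""): 1, (":method", "GET"): 2, (":method", "POST"): 3,
    (":path", "/"): 4, (":path", "/index.html"): 5,
    (":scheme", "http"): 6, (":scheme", "https"): 7, (":status", "200"): 8,
}

def encode_indexed(name: str, value: str) -> bytes:
    """Emit a one-byte 'indexed header field' (bit pattern 1xxxxxxx) when
    the header is in the static table; real HPACK has more representations."""
    index = STATIC_TABLE[(name, value)]
    return bytes([0x80 | index])

assert encode_indexed(":method", "GET") == b"\x82"  # a whole header in 1 byte
```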

Can HTTP/2 make cookies (or other headers) better?

This effort was chartered to work on a revision of the wire protocol, i.e., how HTTP headers, methods, and so on are put “onto the wire,” without changing HTTP’s semantics.

That’s because HTTP is so widely used. If we used this version of HTTP to introduce a new state mechanism (one example that has been discussed) or to change the core methods (thankfully, this has not yet been proposed), it would mean that the new protocol was incompatible with the existing Web.

In particular, we want to be able to translate from HTTP/1 to HTTP/2 and back with no loss of information. If we started “cleaning up” the headers (and most will agree that HTTP headers are pretty messy), we would have interoperability problems with much of the existing Web.

Doing that would just create friction against the adoption of the new protocol.

That said, the working group is responsible for all of HTTP, not just HTTP/2. As such, we can work on new mechanisms that are version-independent, as long as they are backwards-compatible with the existing Web.

What about non-browser users of HTTP?

Non-browser applications should be able to use HTTP/2 as well, if they are already using HTTP.

Early feedback has been that HTTP/2 has good performance characteristics for HTTP “APIs,” because the APIs don’t need to account for things like request overhead in their design.

Having said that, the main focus of the improvements we are considering is the typical browsing use case, since this is the core use case for the protocol.

Here’s what our charter says about this:

The resulting specification(s) are expected to meet these goals for common existing deployments of HTTP; in particular, Web browsing (desktop and mobile), non-browsers (“HTTP APIs”), Web serving (at a variety of scales), and intermediation (by proxies, corporate firewalls, “reverse” proxies, and content delivery networks). Likewise, current and future semantic extensions to HTTP/1.x (e.g., headers, methods, status codes, cache directives) should be supported in the new protocol.

Note that this does not include uses of HTTP where non-specified behaviors are relied upon (such as timeouts, connection state, and intercepting proxies); these may or may not be enabled by the final product.

Does HTTP/2 need encryption?

No. After extensive discussion, the working group did not reach consensus on requiring the use of encryption (e.g., TLS) for the new protocol.

However, some implementations have stated that they will only support HTTP/2 when it is used over an encrypted connection, and currently no browser supports HTTP/2 unencrypted.

How does HTTP/2 improve security?

HTTP/2 defines a required profile of TLS, including the version, a cipher-suite blacklist, and the extensions used.

See related specifications for details.

There is also discussion of additional mechanisms, such as using TLS for http:// URLs (so-called “opportunistic encryption”); see RFC 8164 for details.

Can I use HTTP/2 now?

In browsers, the latest versions of Edge, Safari, Firefox, and Chrome support HTTP/2, as do other Blink-based browsers (e.g., Opera and Yandex Browser). See caniuse for details.

There are also several servers available (including beta support from Akamai, Google, and Twitter’s main sites), as well as a number of open-source implementations that you can deploy and test.

See Implementation List for more information.

Will HTTP/2 replace HTTP/1.x?

The goal of the working group is that typical uses of HTTP/1.x can switch to HTTP/2 and see some benefit. Having said that, we can’t force the whole world to migrate, and because of the way people deploy proxies and servers, HTTP/1.x is likely to remain in use for quite some time.

Will HTTP/3 appear?

If the negotiation mechanisms introduced by HTTP/2 work well, it should be possible to support new versions of HTTP much more easily than in the past.

Implementation questions

Why the rules around CONTINUATION on HEADERS frames?

CONTINUATION exists because a single value (e.g., Set-Cookie) can exceed 16KiB - 1, which means it cannot fit into a single frame.

It was decided that the least error-prone way to deal with this is to require that all of the header data arrive in back-to-back frames, which makes decoding and buffer management easier.
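
A simplified sketch of the resulting framing (stream identifiers, lengths, and the other header fields are omitted here): a header block is split into one HEADERS frame followed by CONTINUATION frames, with END_HEADERS set only on the last fragment.

```python
HEADERS, CONTINUATION = 0x1, 0x9   # frame type codes (RFC 7540)
END_HEADERS = 0x4                  # flag marking the final fragment

def split_header_block(block: bytes, max_frame_size: int = 16384):
    """Yield (frame_type, flags, fragment) tuples for one header block;
    fragments must be sent back-to-back on the connection."""
    chunks = [block[i:i + max_frame_size]
              for i in range(0, len(block), max_frame_size)] or [b""]
    for i, chunk in enumerate(chunks):
        frame_type = HEADERS if i == 0 else CONTINUATION
        flags = END_HEADERS if i == len(chunks) - 1 else 0
        yield frame_type, flags, chunk
```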

What is the minimum or maximum HPACK state size?

The receiver always controls the amount of memory used in HPACK, and can set it to zero at a minimum, with a maximum related to the largest integer representable in a SETTINGS frame, currently 2^32 - 1.
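
For reference, RFC 7541 (Section 4.1) defines how entries count against that state: each entry costs its name length plus its value length plus 32 octets of overhead. A small sketch:

```python
def hpack_entry_size(name: bytes, value: bytes) -> int:
    """Size an entry charges against the HPACK dynamic table:
    name length + value length + 32 octets (RFC 7541, Section 4.1)."""
    return len(name) + len(value) + 32

# A receiver advertising a table size of 0 stores no entries at all;
# the theoretical maximum setting value is 2**32 - 1.
assert hpack_entry_size(b"cookie", b"a=b") == 6 + 3 + 32
```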

How can I avoid keeping HPACK state?

Send a SETTINGS frame setting the state size (SETTINGS_HEADER_TABLE_SIZE) to zero, and then RST all streams until a SETTINGS frame with the ACK bit set has been received.
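
A minimal sketch of building such a SETTINGS frame by hand, using the frame layout from RFC 7540 (Section 6.5); the helper name is ours:

```python
import struct

def settings_frame(settings):
    """Build a SETTINGS frame (type 0x4) on stream 0; each setting is a
    16-bit identifier plus a 32-bit value (RFC 7540, Section 6.5)."""
    payload = b"".join(struct.pack("!HI", ident, value)
                       for ident, value in settings.items())
    header = (len(payload).to_bytes(3, "big")   # 24-bit payload length
              + bytes([0x4, 0x0])               # type SETTINGS, no flags
              + struct.pack("!I", 0))           # stream identifier 0
    return header + payload

SETTINGS_HEADER_TABLE_SIZE = 0x1
frame = settings_frame({SETTINGS_HEADER_TABLE_SIZE: 0})  # keep no state
```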

Why is there a single compression/flow-control context?

Simplicity.

The original proposals had stream groups, which would share context, flow control, and so on. While that would benefit proxies (and the experience of users going through them), it added a fair bit of complexity. It was decided that we would start with the simple thing, see how painful it was, and address the pain (if any) in a future revision of the protocol.

Why is there an EOS symbol in HPACK?

For reasons of CPU efficiency and security, HPACK’s Huffman encoding pads Huffman-encoded strings out to the next byte boundary; between 0 and 7 bits of padding may be needed for any particular string.

If you consider Huffman decoding in isolation, any symbol longer than the required padding would work; however, HPACK’s design allows for bytewise comparison of Huffman-encoded strings. By requiring that the bits of the EOS symbol be used for padding, we ensure that equal strings always encode to equal bytes, so users can compare Huffman-encoded strings bytewise to determine equality. This, in turn, means that many headers can be interpreted without being Huffman decoded.
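
A small sketch of the padding rule (the helper below is hypothetical and operates on a bit string for clarity; real HPACK works on the output of the Huffman coder): because the most significant bits of the EOS symbol are all ones, the padding is deterministic, so equal inputs always yield equal bytes.

```python
def huffman_pad(bits: str) -> bytes:
    """Pad a Huffman-encoded bit string to a byte boundary using the most
    significant bits of the EOS symbol, which are all ones (RFC 7541, 5.2)."""
    bits += "1" * ((8 - len(bits) % 8) % 8)   # fixed all-ones padding
    return int(bits, 2).to_bytes(len(bits) // 8, "big") if bits else b""

# Because the padding is fixed, equal strings encode to equal bytes, so
# encoded headers can be compared without Huffman-decoding them first.
assert huffman_pad("10100") == huffman_pad("10100")
```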

Can I implement HTTP/2 without implementing HTTP/1.1?

Mostly, yes.

For HTTP/2 over TLS (h2), if you do not implement the http/1.1 ALPN identifier, then you do not need to support any HTTP/1.1 features.

For HTTP/2 over TCP (h2c), you need to implement the initial upgrade request.

An h2c-only client needs to generate an OPTIONS request for “*” or a HEAD request for “/”, which are fairly safe and easy to construct. A client that only wishes to implement HTTP/2 should treat an HTTP/1.1 response without a 101 status code as an error.

An h2c-only server can accept a request containing the Upgrade header field with a fixed 101 response. Requests without the h2c upgrade token can be rejected with a 505 (HTTP Version Not Supported) status code that contains the Upgrade header field. A server that does not wish to process the HTTP/1.1 response should reject stream 1 with a REFUSED_STREAM error code immediately after sending the connection preface, to encourage the client to retry the request over the upgraded HTTP/2 connection.
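
On the wire, the upgrade exchange looks roughly like this (the host name and the empty HTTP2-Settings payload are placeholders; see RFC 7540, Section 3.2 for the authoritative example):

```python
# An h2c upgrade request as it appears on the wire. The HTTP2-Settings
# value is a base64url-encoded SETTINGS payload; an empty payload is
# shown here for brevity.
upgrade_request = (
    b"OPTIONS * HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Connection: Upgrade, HTTP2-Settings\r\n"
    b"Upgrade: h2c\r\n"
    b"HTTP2-Settings: \r\n"
    b"\r\n"
)
# The fixed response an h2c-only server can send before switching to HTTP/2:
upgrade_response = (
    b"HTTP/1.1 101 Switching Protocols\r\n"
    b"Connection: Upgrade\r\n"
    b"Upgrade: h2c\r\n"
    b"\r\n"
)
```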

Is the priority example in Section 5.3.2 correct?

No. Stream B has a weight of 4 and stream C has a weight of 12. To determine the proportion of available resources that each stream receives, sum all the weights (16) and divide each stream’s weight by the total. Stream B therefore receives one-quarter of the available resources, and stream C receives three-quarters. Thus, as the specification states, stream B ideally receives one-third of the resources allocated to stream C.
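
Spelled out as arithmetic:

```python
weights = {"B": 4, "C": 12}
total = sum(weights.values())                 # 16
shares = {s: w / total for s, w in weights.items()}
assert shares["B"] == 1 / 4 and shares["C"] == 3 / 4
assert shares["B"] / shares["C"] == 1 / 3     # B gets one-third of C's share
```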

Is TCP_NODELAY required for HTTP/2 connections?

Yes, probably. Even for a client that is only downloading a lot of data over a single stream, some packets still need to be sent back in the opposite direction to achieve maximum transfer speed. Without TCP_NODELAY set (that is, with the Nagle algorithm still enabled), outgoing packets can be held for a while so they can be merged with subsequent ones.

For example, if such a packet is telling the peer that there is more window available for sending data, delaying it by several milliseconds (or more) can have a severe impact on high-speed connections.
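
In Python, for example, the option is set on the underlying socket like this (host and port are placeholders):

```python
import socket

# Disable Nagle's algorithm on the connection carrying HTTP/2, so that
# small control packets (e.g., ones opening more flow-control window)
# are sent immediately instead of being held back to coalesce.
sock = socket.create_connection(("example.com", 443))
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
```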

Deployment questions

How do I debug encrypted HTTP/2?

There are many ways to get access to the application data; the easiest is to use NSS keylogging in combination with the Wireshark plugin (included in recent development releases). This works with both Firefox and Chrome.
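
As an aside, recent Python versions (3.8+) can write the same NSS key log format from your own client code, which browsers produce when the SSLKEYLOGFILE environment variable is set; the file path below is arbitrary:

```python
import ssl

# Write TLS session keys in the NSS key log format that Wireshark reads.
ctx = ssl.create_default_context()
ctx.keylog_filename = "tls_keys.log"   # arbitrary output path
# Any TLS connection made with ctx can now be decrypted in Wireshark.
```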

How do I use HTTP/2 server push?

HTTP/2 server push allows a server to provide content to clients without waiting for a request. This can improve the time it takes to retrieve a resource, particularly for connections with a large bandwidth-delay product, where the network round-trip time makes up most of the time spent on a resource.

Pushing resources that vary based on the contents of a request can be unwise. Currently, browsers only use pushed responses if they would otherwise make a matching request (see Section 4 of RFC 7234).

Some caches don’t respect variations in all request header fields, even if they are listed in the Vary header field. To maximize the likelihood that a pushed resource will be accepted, content negotiation is best avoided. Content negotiation based on the Accept-Encoding header field is widely respected by caches, but other header fields may not be supported as well.
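
One common deployment pattern (support varies by server, so treat this as an illustrative assumption rather than a universal API) is to emit a preload Link header and let an HTTP/2-capable front end, such as Apache’s mod_http2 or H2O, turn it into a push. A minimal WSGI sketch, with hypothetical paths:

```python
# A WSGI app adding a preload hint; an HTTP/2 front end configured for
# push may translate the Link header into a server push of the asset.
def app(environ, start_response):
    headers = [
        ("Content-Type", "text/html"),
        ("Link", "</static/style.css>; rel=preload; as=style"),
    ]
    start_response("200 OK", headers)
    return [b"<html><link rel='stylesheet' href='/static/style.css'></html>"]
```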
