Preface

HTTP2.0 greatly improves web performance and further reduces network latency while remaining fully compatible with HTTP1.1 semantics. Its goals are low latency and high throughput, and for front-end developers it means fewer manual optimizations are needed. This article looks at what the new features are, how they work, and how they improve performance:

  • Binary framing
  • Header compression
  • Flow control
  • Multiplexing
  • Request priority
  • Server push

Binary framing

Without changing HTTP1.x semantics, methods, status codes, URLs, or header fields, how can HTTP2.0 break through the performance limitations of HTTP1.1, improve transport performance, and achieve low latency and high throughput? One key is the binary framing layer added between the application layer (HTTP) and the transport layer (TCP).

Before looking at binary framing and how it works, here is some basic terminology:

  • Frame: the smallest unit of HTTP2.0 communication. Every frame starts with a fixed 9-byte header (8 bytes in early drafts of the specification) that contains the frame’s length, type, and flags, one reserved bit, and a 31-bit identifier for the stream the frame belongs to. A frame carries one specific type of data, such as HTTP headers or payload.
  • Message: a unit of communication larger than a frame. It is a logical HTTP message, such as a request or a response, and consists of one or more frames.
  • Stream: a unit of communication larger than a message. It is a virtual channel within a TCP connection that can carry messages in both directions. Each stream has a unique integer identifier.
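To make the frame header concrete, here is a minimal Python sketch that decodes one. It assumes the 9-byte layout of the final specification (RFC 7540): a 24-bit length, an 8-bit type, 8 bits of flags, one reserved bit, and a 31-bit stream identifier. The names FrameHeader and parse_frame_header are made up for this illustration and do not come from any library.

```python
import struct
from typing import NamedTuple


class FrameHeader(NamedTuple):
    length: int      # payload length in bytes (24 bits)
    frame_type: int  # e.g. 0x0 = DATA, 0x1 = HEADERS
    flags: int       # 8 bits of type-specific flags
    stream_id: int   # 31-bit stream identifier


def parse_frame_header(data: bytes) -> FrameHeader:
    """Decode the fixed 9-byte header that starts every frame."""
    if len(data) < 9:
        raise ValueError("a frame header is 9 bytes long")
    length = int.from_bytes(data[0:3], "big")
    frame_type, flags, word = struct.unpack("!BBI", data[3:9])
    stream_id = word & 0x7FFFFFFF  # drop the reserved top bit
    return FrameHeader(length, frame_type, flags, stream_id)


# A HEADERS frame (type 0x1) with a 16-byte payload,
# the END_HEADERS flag (0x4) set, on stream 1.
raw = bytes([0x00, 0x00, 0x10, 0x01, 0x04, 0x00, 0x00, 0x00, 0x01])
header = parse_frame_header(raw)
```

Because the stream identifier sits in every frame header, a receiver can tell which stream a frame belongs to without looking at the payload.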

What is binary framing

At the binary framing layer, HTTP2.0 splits all transmitted information into smaller messages and frames and encodes them in binary. The HTTP1.x headers are carried in HEADERS frames, and the request body is carried in DATA frames.

How does binary framing work

All HTTP2.0 communication takes place over a single TCP connection that can carry any number of bidirectional streams, and each stream carries messages. A message consists of one or more frames. Frames from different streams can be interleaved on the wire and are then reassembled according to the stream identifier in each frame’s header; within a single stream, frames still arrive in order.
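The reassembly step can be sketched in a few lines of Python. This is a simplified model, not a real HTTP2.0 implementation: frames are represented as (stream_id, payload, end_stream) tuples, and the function name reassemble is hypothetical.

```python
from collections import defaultdict

# Frames as they might arrive on one connection: two streams interleaved.
# Each tuple is (stream_id, payload, end_stream), a simplified stand-in
# for a real DATA frame.
frames = [
    (1, b"<html>", False),
    (3, b"body{", False),
    (1, b"</html>", True),
    (3, b"}", True),
]


def reassemble(frames):
    """Group payloads by stream id and return the completed messages."""
    buffers = defaultdict(bytearray)
    complete = {}
    for stream_id, payload, end_stream in frames:
        buffers[stream_id] += payload
        if end_stream:
            complete[stream_id] = bytes(buffers.pop(stream_id))
    return complete


messages = reassemble(frames)
```

Each stream gets its own buffer, so the order in which different streams' frames arrive does not matter.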

Contribution of binary framing to performance optimization

Binary framing is the foundation for the features described below. It splits data into smaller units that are easier to handle and encapsulates them. First, because many resources share a single connection, the server has fewer connections to maintain, uses less memory, and achieves higher throughput per connection; the multiplexing section below shows this in more detail. Second, fewer TCP connections mean less network congestion and less time spent in slow start, so recovery from congestion and packet loss is faster.

Header compression

HTTP1.1 does not support HTTP header compression, which is one of the problems SPDY and HTTP2.0 set out to solve. SPDY uses DEFLATE, while HTTP2.0 uses HPACK, an algorithm designed specifically for header compression.

What is header compression

Each HTTP1.x message (request or response) carries header information that describes the resource and its attributes. HTTP2.0 keeps header tables on both the client and the server to track and store previously sent key-value pairs. Request and response headers mean essentially the same thing in HTTP2.0, except that all header names must be lowercase and the request line is split into separate pseudo-header fields: :method, :scheme, :authority, and :path.

How does header compression work

Data that has already been sent is not sent again with every request and response. Each new header key-value pair is either appended to the current table or replaces a previous value in the table. The header table exists for the lifetime of the HTTP2.0 connection and is updated incrementally by both the client and the server.

Contribution of header compression to performance optimization

HTTP2.0 uses the header table to compress headers. This makes headers more compact and faster to transmit, which is especially helpful on mobile networks. It also reduces the amount of data in each exchange, which eases network congestion.
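The idea of a shared, incrementally updated table can be illustrated with a toy model in Python. This is deliberately not HPACK: it has no static table, no Huffman coding, no size limit or eviction, and it appends entries at the end of the table as described above. The class name ToyHeaderTable and the sample headers are invented for the example.

```python
class ToyHeaderTable:
    """Toy shared header table; each side of the connection keeps one."""

    def __init__(self):
        self.entries = []  # list of (name, value) pairs, oldest first

    def encode(self, headers):
        """Replace pairs that were already sent with a table index."""
        out = []
        for pair in headers:
            if pair in self.entries:
                out.append(("index", self.entries.index(pair)))
            else:
                self.entries.append(pair)
                out.append(("literal", pair))
        return out

    def decode(self, encoded):
        """Rebuild the header list, updating the table the same way."""
        headers = []
        for kind, value in encoded:
            if kind == "index":
                headers.append(self.entries[value])
            else:
                self.entries.append(value)
                headers.append(value)
        return headers


client, server = ToyHeaderTable(), ToyHeaderTable()
first = [(":method", "GET"), (":path", "/"), ("user-agent", "demo")]
second = [(":method", "GET"), (":path", "/style.css"), ("user-agent", "demo")]

wire1 = client.encode(first)    # everything is new, so all literals
wire2 = client.encode(second)   # only :path has changed
decoded1 = server.decode(wire1)
decoded2 = server.decode(wire2)
```

On the second request only the changed :path is sent in full; the other two headers travel as small indexes into the table that both sides already hold.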

Flow control

HTTP2.0 provides a simple flow-control mechanism for both individual streams and the connection as a whole:

  • Flow control is hop-by-hop rather than end-to-end: it applies between the two endpoints of each HTTP connection.
  • Flow control is based on WINDOW_UPDATE frames: the receiver advertises how many bytes it is prepared to receive on a stream and on the entire connection.
  • Flow control is directional: each receiver may choose whatever window size it wants for each stream and for the entire connection.
  • Flow control cannot be disabled in the final specification (RFC 7540), although early drafts allowed it. A receiver that does not need it can advertise the maximum window size (2^31-1 bytes) for individual streams and for the entire connection.
  • The frame type determines whether flow control applies. Currently only DATA frames are subject to flow control; all other frame types do not consume space in the flow-control window. This ensures that important control frames are not blocked by flow control.
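The window bookkeeping described in the list above can be sketched as follows. This is a simplified sender-side model under a few assumptions: both windows start at 65,535 bytes (the default initial window in RFC 7540), only DATA payload bytes are counted, and the names FlowControlWindow, sendable, and send_data are hypothetical.

```python
INITIAL_WINDOW = 65_535  # default initial window size in RFC 7540


class FlowControlWindow:
    """Bytes the peer is currently willing to receive."""

    def __init__(self, size=INITIAL_WINDOW):
        self.available = size

    def consume(self, nbytes):
        # Called when DATA payload bytes are sent.
        if nbytes > self.available:
            raise RuntimeError("would exceed the flow-control window")
        self.available -= nbytes

    def window_update(self, increment):
        # Called when a WINDOW_UPDATE frame arrives from the receiver.
        self.available += increment


def sendable(pending, stream_window, connection_window):
    """DATA is limited by both the stream and the connection window."""
    return min(pending, stream_window.available, connection_window.available)


def send_data(nbytes, stream_window, connection_window):
    stream_window.consume(nbytes)
    connection_window.consume(nbytes)


connection = FlowControlWindow()
stream = FlowControlWindow()

first = sendable(100_000, stream, connection)    # capped by the windows
send_data(first, stream, connection)
blocked = sendable(100_000 - first, stream, connection)

stream.window_update(20_000)       # receiver frees space on the stream
connection.window_update(10_000)   # and, separately, on the connection
resumed = sendable(100_000 - first, stream, connection)
```

The sender stalls once either window reaches zero and resumes only as far as the smaller of the two windows allows.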

Multiplexing

In HTTP1.1, browsers limit the number of simultaneous requests to the same domain. Requests beyond that limit are blocked until a connection becomes free. Multiplexing in HTTP2.0 removes this bottleneck.

What is multiplexing

Building on the binary framing layer, HTTP2.0 can send multiple requests and responses at the same time over one shared TCP connection. HTTP messages are broken into independent frames without changing the semantics of the messages themselves, interleaved on the wire, and reassembled at the other end according to the stream identifier in each frame header.

How does multiplexing work

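Let’s see how it works by comparing it to HTTP1.x.

  • HTTP1.x: responses on a connection are returned one at a time and in order, so a slow response holds up everything queued behind it. To load resources in parallel, the browser has to open several TCP connections to the same domain.
  • HTTP2.0: all requests and responses share a single TCP connection. Frames from different streams are interleaved on the wire, so a slow response does not block the others.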
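The sender side of multiplexing can be sketched in Python as splitting each message into frames and taking turns between streams. This is an illustration only: frames are (stream_id, payload) tuples, the round-robin order is one possible policy rather than something the protocol mandates, and the function names are hypothetical.

```python
from itertools import zip_longest


def split_into_frames(stream_id, body, max_size):
    """Cut one message body into frames of at most max_size bytes."""
    return [(stream_id, body[i:i + max_size])
            for i in range(0, len(body), max_size)]


def interleave(*frame_lists):
    """Take one frame from each stream in turn (simple round-robin)."""
    wire = []
    for group in zip_longest(*frame_lists):
        wire.extend(frame for frame in group if frame is not None)
    return wire


large = split_into_frames(1, b"AAAAAA", 2)  # three frames on stream 1
small = split_into_frames(3, b"BB", 2)      # one frame on stream 3
wire = interleave(large, small)
```

The small response on stream 3 is fully delivered after the second frame on the wire, instead of waiting for the large response on stream 1 to finish.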

Contribution of multiplexing to performance optimization

  1. Requests and responses can be sent in parallel and interleaved without affecting one another
  2. Multiple requests and responses can be sent in parallel over a single connection
  3. Page load time is reduced because unnecessary delays are eliminated
  4. Extra workarounds for HTTP1.x limitations are no longer needed

Request priority

Once the HTTP message is divided into many individual frames, performance can be further optimized by optimizing the interleaving and transmission order of those frames.

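What is request priority

In the final specification (RFC 7540), each stream can be given a weight between 1 and 256 and can be declared dependent on another stream; streams with higher weights should receive proportionally more resources. Early drafts instead used a 31-bit priority value, with 0 as the highest priority and 2^31-1 as the lowest.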

How does request priority work

The client specifies priorities, which the server can use as a guide for the order in which it sends data. For example, the client may rank .css > .js > .jpg, and the server returns results in that order to make more efficient use of the underlying connection and improve the user experience. When relying on request priorities, check whether the server actually honors them and whether they can cause head-of-line blocking, for example when a slow, high-priority response holds up the delivery of other resources.

Contribution of request priority to performance optimization

The server can allocate resources (CPU, memory, bandwidth) based on stream priority, and once response data is ready, it sends the highest-priority frames to the client first. The browser can dispatch each request as soon as it discovers the resource, assign a priority to each stream, and let the server determine the best order of responses. Requests do not have to wait in a queue, which saves time and makes the most of each connection.
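As a rough illustration of how a server might turn priorities into resource allocation, the sketch below divides bandwidth in proportion to per-stream weights. It assumes RFC 7540 style weights (1 to 256) and ignores stream dependencies entirely; the weights, file names, and the function bandwidth_shares are invented for the example, and real servers are free to schedule differently.

```python
def bandwidth_shares(weights):
    """Split bandwidth between sibling streams in proportion to weight."""
    for name, weight in weights.items():
        if not 1 <= weight <= 256:
            raise ValueError(f"weight for {name} must be between 1 and 256")
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical weights expressing .css > .js > .jpg
shares = bandwidth_shares({"style.css": 12, "app.js": 3, "photo.jpg": 1})
```

With these weights the style sheet would get three quarters of the available bandwidth while all three responses still make progress, which also limits the head-of-line blocking risk mentioned above.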

Server push

A powerful new feature of HTTP2.0 is that the server can send multiple responses to a single client request. The server pushes resources to the client without an explicit request from the client.

What is Server Push (in HTTP2.0)

The server returns additional responses ahead of time, pushing extra resources to the client based on the request it received. For example, the client requests stream 1 (/page.html), and the server pushes stream 2 (/script.js) and stream 4 (/style.css) while it returns the response on stream 1.

How does server push work

  • The PUSH_PROMISE frame signals that the server intends to push a resource to the client.
  • The PUSH_PROMISE frame contains only the request headers of the resource to be pushed. If the client has no objection, the server then sends the pushed response (HEADERS and DATA frames) on the promised stream. The client can reject a push, for example because the resource is already in its cache, by sending RST_STREAM for the promised stream.
  • Server push follows request-response semantics: the server may push resources only in association with a request that the client has made.
  • PUSH_PROMISE frames must be sent before the response data that references the promised resources, to avoid a race in which the client requests a resource the server is about to push.
  • After an HTTP2.0 connection is established, the client and server exchange SETTINGS frames, which can limit the maximum number of concurrent streams in each direction. The client can therefore limit the number of pushed streams with SETTINGS_MAX_CONCURRENT_STREAMS, or disable server push completely by setting SETTINGS_ENABLE_PUSH to 0.
  • All pushed resources must comply with the same-origin policy. In other words, the server cannot push arbitrary third-party resources to the client; it may push only content for which it is authoritative.
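As an illustration of the SETTINGS exchange mentioned in the list above, the Python sketch below builds the bytes of a SETTINGS frame that turns server push off. It assumes the RFC 7540 wire format: a 9-byte frame header, frame type 0x4 for SETTINGS sent on stream 0, and a payload of 16-bit identifier plus 32-bit value pairs, with SETTINGS_ENABLE_PUSH as identifier 0x2. The helper build_settings_frame is hypothetical and not part of any library.

```python
import struct

SETTINGS_FRAME_TYPE = 0x4
SETTINGS_ENABLE_PUSH = 0x2
SETTINGS_MAX_CONCURRENT_STREAMS = 0x3


def build_settings_frame(settings):
    """Serialize a SETTINGS frame (no flags, stream 0) from an id->value dict."""
    payload = b"".join(
        struct.pack("!HI", identifier, value)
        for identifier, value in settings.items()
    )
    header = len(payload).to_bytes(3, "big") + struct.pack(
        "!BBI", SETTINGS_FRAME_TYPE, 0, 0
    )
    return header + payload


# A client that does not want any pushed resources.
frame = build_settings_frame({SETTINGS_ENABLE_PUSH: 0})
```

A client that only wants to cap concurrency rather than refuse push outright could send SETTINGS_MAX_CONCURRENT_STREAMS with a small value in the same way.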

Contribution of server push to performance optimization

Server push is a mechanism for sending data before the client asks for it. In HTTP2.0, the server can send multiple responses to a single client request. If the client requests your home page, the server can respond with the home page content together with the logo and style sheet, because it knows the client will need them. This removes extra request round trips, speeds up page response, and improves the user experience.