Someone in the group chat brought up an interview question: what is the difference between HTTP Keep-Alive and the operating system's TCP keepalive?

This could be treated as a rote, memorize-the-answer interview question, but it is hard to give a systematic and definite answer once the interviewer presses on the details. It is also a good interview question, so here I will explain my own understanding with the help of some code.

HTTP keepalive

In HTTP 1.0, each TCP connection carried only one HTTP transaction (a request plus its response): the connection was established when the request was made and released when the request completed. As web content grew more complex, with many images, CSS files, and other resources per page, this model became too inefficient. HTTP 1.1 therefore introduced the HTTP persistent connection, also known as HTTP Keep-Alive, whose purpose is to reuse one TCP connection for multiple HTTP requests and so improve performance.

In HTTP 1.0, keep-alive is disabled by default; you need to add "Connection: keep-alive" to the HTTP header to enable it. In HTTP 1.1, keep-alive is enabled by default, and "Connection: close" is used to disable it.
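The default rule above fits in a few lines of Java. This is only a hand-written sketch to make the rule concrete, not Netty's code; the class name KeepAliveRule and its method are invented for illustration.

```java
import java.util.Locale;

/** Hypothetical helper: decides whether a connection should stay open after a request. */
final class KeepAliveRule {
    private KeepAliveRule() {}

    /**
     * @param httpVersion      "HTTP/1.0" or "HTTP/1.1"
     * @param connectionHeader value of the Connection header, or null if it is absent
     */
    static boolean isKeepAlive(String httpVersion, String connectionHeader) {
        boolean close = false;
        boolean keepAlive = false;
        if (connectionHeader != null) {
            // The Connection header may carry several comma-separated tokens.
            for (String raw : connectionHeader.split(",")) {
                String token = raw.trim().toLowerCase(Locale.ROOT);
                if (token.equals("close")) {
                    close = true;
                } else if (token.equals("keep-alive")) {
                    keepAlive = true;
                }
            }
        }
        if (close) {
            return false;            // an explicit "close" always wins
        }
        if ("HTTP/1.0".equals(httpVersion)) {
            return keepAlive;        // HTTP 1.0: off unless the client asks for it
        }
        return true;                 // HTTP 1.1: on by default
    }

    public static void main(String[] args) {
        System.out.println(isKeepAlive("HTTP/1.0", null));         // false
        System.out.println(isKeepAlive("HTTP/1.0", "keep-alive")); // true
        System.out.println(isKeepAlive("HTTP/1.1", null));         // true
        System.out.println(isKeepAlive("HTTP/1.1", "close"));      // false
    }
}
```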

We can use Netty to implement an HTTP server. The complete server code is not listed here; just look at the key channelRead method and the server bootstrap.

@Override
public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception {
    if (msg instanceof HttpRequest) {
        HttpRequest request = (HttpRequest) msg;
        // True if the request asks for a persistent connection
        boolean keepAlive = HttpUtil.isKeepAlive(request);

        // Handle code: httpResponse is built here

        httpResponse.headers().set(HttpHeaderNames.CONTENT_TYPE, "text/html; charset=UTF-8");
        httpResponse.headers().setInt(HttpHeaderNames.CONTENT_LENGTH, httpResponse.content().readableBytes());
        if (keepAlive) {
            // Mark the response as keep-alive and leave the connection open
            httpResponse.headers().set(HttpHeaderNames.CONNECTION, HttpHeaderValues.KEEP_ALIVE);
            ctx.writeAndFlush(httpResponse);
        } else {
            // Close the connection once the response has been written
            ctx.writeAndFlush(httpResponse).addListener(ChannelFutureListener.CLOSE);
        }
    }
}

The server is bootstrapped as follows:

serverBootstrap.channel(NioServerSocketChannel.class)
        .group(boss, work)
        .handler(new LoggingHandler(LogLevel.INFO)) // handler is executed at initialization
        .childHandler(new ChannelInitializer<SocketChannel>() {
            @Override
            protected void initChannel(SocketChannel ch) throws Exception {
                ch.pipeline().addLast("http-coder", new HttpServerCodec());
                ch.pipeline().addLast("aggregator", new HttpObjectAggregator(1024 * 1024));
                ch.pipeline().addLast(new HttpServerHandler());
            }
        })
        .option(ChannelOption.SO_BACKLOG, 1024)
        .childOption(ChannelOption.SO_KEEPALIVE, true)
        .childOption(ChannelOption.TCP_NODELAY, true);

Netty encapsulates the HTTP implementation, so the code is very simple: if the request is determined to be keep-alive, the response header is also marked with the Keep-Alive flag, and that is all it takes to implement the keep-alive function.

Is that really all there is to implementing keep-alive? And how do we know that it actually works? We can print key information, such as the channel ID, in channelRead.

System.out.println("keepAlive=" + keepAlive);
System.out.println("channel id=" + ctx.channel().id());
System.out.println("http uri: " + uri);
// Print the request parameters.

Then we request the page twice in the browser and look at the log:

INFO: [id: 0xee8bc5e1, L:/0:0:0:0:0:0:0:0:8080] READ: [id: 0x734e2ebb, L:/0:0:0:0:0:0:0:1:8080 - R:/0:0:0:0:0:0:0:1:37386]
July 06, 2021 10:03:48 PM io.netty.handler.logging.LoggingHandler channelReadComplete
INFO: [id: 0xee8bc5e1, L:/0:0:0:0:0:0:0:0:8080] READ COMPLETE
keepAlive=true
channel id=734e2ebb
http uri: /a.txt?name=chen&f=123;key=456
name=chen
f=123
key=456
keepAlive=true
channel id=734e2ebb
http uri: /favicon.ico
keepAlive=true
channel id=734e2ebb
http uri: /a.txt?name=chen&f=123;key=456
name=chen
f=123
key=456
keepAlive=true
channel id=734e2ebb
http uri: /favicon.ico

As you can see, no matter how many times the page is refreshed, the server log records only one socket connection, and the channel ID is the same every time.

What do the server-side logs look like when keep-alive is not used?

July 06, 2021 9:51:27 PM io.netty.handler.logging.LoggingHandler channelRead
INFO: [id: 0xade39344, L:/0:0:0:0:0:0:0:0:8080] READ: [id: 0x26d40041, L:/0:0:0:0:0:0:0:1:8080 - R:/0:0:0:0:0:0:0:1:33130]
July 06, 2021 9:51:27 PM io.netty.handler.logging.LoggingHandler channelReadComplete
INFO: [id: 0xade39344, L:/0:0:0:0:0:0:0:0:8080] READ COMPLETE
keepAlive=true
channel id=26d40041
http uri: /a.txt?name=chen&f=123;key=456
name=chen
f=123
key=456
July 06, 2021 9:51:29 PM io.netty.handler.logging.LoggingHandler channelRead
INFO: [id: 0xade39344, L:/0:0:0:0:0:0:0:0:8080] READ: [id: 0x600995e6, L:/0:0:0:0:0:0:0:1:8080 - R:/0:0:0:0:0:0:0:1:33156]
July 06, 2021 9:51:29 PM io.netty.handler.logging.LoggingHandler channelReadComplete
INFO: [id: 0xade39344, L:/0:0:0:0:0:0:0:0:8080] READ COMPLETE
keepAlive=true
channel id=600995e6
http uri: /a.txt?name=chen&f=123;key=456
name=chen
f=123
key=456

This time the client used a different socket port for each of the two requests, 33130 and 33156, and the channel IDs differ as well. This confirms that two separate connections were made.

Now you have a concrete picture of how HTTP Keep-Alive works. In fact, it is easy to understand. HTTP is built on TCP (to be rigorous, this applies only to HTTP 1.1; HTTP 2 is more complex, and HTTP 3 is based on UDP), and TCP is a stream protocol. Sending stateless HTTP requests one after another over the same TCP stream is a natural way to implement keep-alive, but there is a problem.

Implementing a long connection is simple: both the client and the server just keep the connection open. The key question is how the browser knows that a response is complete once the connection stays open. With a short connection, the server closes the connection as soon as it finishes the response, so the browser knows it has received the whole response and closes its side too (a TCP connection is bidirectional, so both ends have to close). With a long connection, the server cannot close the connection after the response, so it must put a special marker in the response header that tells the browser where the response ends.

In general, this special marker is Content-Length, which gives the size of the response body in bytes. For example, Content-Length: 120 means that the response body contains 120 bytes, so the browser knows the response is complete once it has received 120 bytes of body.
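To make this concrete, here is a minimal Java sketch of how a client could separate several responses that arrive back to back on one connection, using nothing but Content-Length. The names ContentLengthFraming, readBodies, and parse are invented for illustration, and the parser assumes well-formed input; a real HTTP client has to handle far more than this.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ContentLengthFraming {

    /** Reads one line ending in CRLF (or LF); returns null at end of stream. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') {
                buf.write(b);
            }
        }
        if (b == -1 && buf.size() == 0) {
            return null;
        }
        return new String(buf.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Reads every response body from a stream that carries several responses in a row. */
    static List<String> readBodies(InputStream in) throws IOException {
        List<String> bodies = new ArrayList<>();
        while (readLine(in) != null) {                 // status line of the next response
            int contentLength = 0;
            String header;
            while ((header = readLine(in)) != null && !header.isEmpty()) {
                int colon = header.indexOf(':');
                if (colon > 0 && header.substring(0, colon).trim().equalsIgnoreCase("Content-Length")) {
                    contentLength = Integer.parseInt(header.substring(colon + 1).trim());
                }
            }
            // Read exactly contentLength bytes; the next byte belongs to the next response.
            byte[] body = new byte[contentLength];
            int read = 0;
            while (read < contentLength) {
                int n = in.read(body, read, contentLength - read);
                if (n == -1) {
                    throw new EOFException("stream ended before the full body arrived");
                }
                read += n;
            }
            bodies.add(new String(body, StandardCharsets.UTF_8));
        }
        return bodies;
    }

    /** Convenience wrapper for in-memory input. */
    static List<String> parse(String wire) {
        try {
            return readBodies(new ByteArrayInputStream(wire.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String wire = "HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
                    + "HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nbye";
        System.out.println(parse(wire)); // prints [hello, bye]
    }
}
```

Without the length, the reader could not tell where "hello" stops and the next status line begins, which is exactly the problem a kept-alive connection creates.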

The Content-Length field must reflect the true length of the response body, but in practice that length is not always easy to obtain, for example when the body comes from a network file or is generated dynamically. To get the exact length in such cases, the server has to buffer the entire content in a large enough block of memory and compute the length only after everything has been generated. This costs more memory, and it also makes the client wait longer. This is where the Transfer-Encoding: chunked response header comes in. It indicates that the response body is transmitted in chunks, so the server can send the response to the browser piece by piece rather than all at once. Each chunk is prefixed with its own size, and a final chunk of size zero tells the browser that the response is over.

So HTTP keep-alive is really just connection reuse.

That covers HTTP Keep-Alive. What about TCP keepalive?
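Here is a small sketch of how a chunked body is put back together. Each chunk is a hexadecimal size line, the chunk data, and a CRLF; a chunk of size 0 ends the body. The class name ChunkedBodyDecoder is made up for this example, and it assumes an ASCII body with no trailers so that a String can stand in for the raw bytes.

```java
/** Hypothetical decoder for a chunked body; assumes ASCII content and no trailers. */
final class ChunkedBodyDecoder {
    private ChunkedBodyDecoder() {}

    static String decode(String wire) {
        StringBuilder body = new StringBuilder();
        int pos = 0;
        while (true) {
            int lineEnd = wire.indexOf("\r\n", pos);
            if (lineEnd < 0) {
                throw new IllegalArgumentException("missing chunk-size line");
            }
            String sizeLine = wire.substring(pos, lineEnd);
            int semicolon = sizeLine.indexOf(';');      // drop optional chunk extensions
            if (semicolon >= 0) {
                sizeLine = sizeLine.substring(0, semicolon);
            }
            int size = Integer.parseInt(sizeLine.trim(), 16);  // the size is hexadecimal
            pos = lineEnd + 2;
            if (size == 0) {
                break;                                  // zero-size chunk: the body is complete
            }
            if (pos + size > wire.length()) {
                throw new IllegalArgumentException("chunk is shorter than its declared size");
            }
            body.append(wire, pos, pos + size);
            pos += size + 2;                            // skip the data and its trailing CRLF
        }
        return body.toString();
    }

    public static void main(String[] args) {
        String wire = "5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n";
        System.out.println(decode(wire)); // prints hello world
    }
}
```

The server never has to know the total length in advance: it only needs the size of the piece it is about to send.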

TCP keepalive

When a TCP long connection is used (that is, an established TCP connection is reused), the connection needs to be kept alive so that it is not killed by a gateway along the path. The application layer can do this by sending heartbeat packets at regular intervals. Linux, however, already provides TCP keepalive: once it is enabled on a socket, the application layer does not need to care when heartbeats are sent or what they contain, because the OS manages it. The OS periodically sends probe packets on the TCP connection. These probes not only keep the connection alive but also check whether the connection is still valid, and the OS automatically closes connections that are not.

Many articles explain the TCP mechanism far more clearly than I can, so I will not go into detail. In simple terms, TCP's keepalive mechanism is intended to keep connections alive, provide a heartbeat, and detect connection errors. It is implemented with a timer: on Linux, a connection must be idle for 7200 seconds (2 hours) by default before the first probe is sent.
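For reference, this is roughly how a plain Java socket opts in to TCP keepalive. It is a sketch under a few assumptions: it needs JDK 11 or later for jdk.net.ExtendedSocketOptions, the per-socket timer options exist only on platforms that support them (hence the supportedOptions check), and the class name TcpKeepAliveConfig and the values 60/10/3 are made up for illustration. The detectionSeconds helper is only a rough estimate of how long a dead peer can go unnoticed, using the usual Linux defaults of a 7200 second idle time, a 75 second probe interval, and 9 probes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;
import java.net.SocketOption;
import java.util.Set;
import jdk.net.ExtendedSocketOptions;

final class TcpKeepAliveConfig {
    private TcpKeepAliveConfig() {}

    /** Enables SO_KEEPALIVE and, where the platform allows it, overrides the probe timers. */
    static void configure(Socket socket, int idleSeconds, int intervalSeconds, int probeCount) {
        try {
            socket.setKeepAlive(true);   // SO_KEEPALIVE: let the OS send the probes
            Set<SocketOption<?>> supported = socket.supportedOptions();
            if (supported.contains(ExtendedSocketOptions.TCP_KEEPIDLE)) {
                socket.setOption(ExtendedSocketOptions.TCP_KEEPIDLE, idleSeconds);
            }
            if (supported.contains(ExtendedSocketOptions.TCP_KEEPINTERVAL)) {
                socket.setOption(ExtendedSocketOptions.TCP_KEEPINTERVAL, intervalSeconds);
            }
            if (supported.contains(ExtendedSocketOptions.TCP_KEEPCOUNT)) {
                socket.setOption(ExtendedSocketOptions.TCP_KEEPCOUNT, probeCount);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Rough estimate: idle time before the first probe plus all unanswered probes. */
    static int detectionSeconds(int idleSeconds, int intervalSeconds, int probeCount) {
        return idleSeconds + intervalSeconds * probeCount;
    }

    /** Configures an unconnected socket and reports whether SO_KEEPALIVE is now on. */
    static boolean demo() {
        try (Socket socket = new Socket()) {
            configure(socket, 60, 10, 3);
            return socket.getKeepAlive();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
        System.out.println(detectionSeconds(7200, 75, 9)); // prints 7875
    }
}
```

Nothing here mentions HTTP at all: the option lives on the socket, and the probes are sent by the kernel whether or not any application data flows.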

Conclusion

The purpose of HTTP Keep-Alive is connection reuse: requests and responses are transmitted serially over the same connection. The purpose of TCP's keepalive mechanism is to keep the connection alive, provide a heartbeat, and detect connection errors.

There is no direct relationship between the two. Going back to the Netty code, notice this part:

.childHandler(new ChannelInitializer<SocketChannel>() {
    @Override
    protected void initChannel(SocketChannel ch) throws Exception {
        ch.pipeline().addLast("http-coder", new HttpServerCodec());
        // ...
    }
})
.option(ChannelOption.SO_BACKLOG, 1024)
.childOption(ChannelOption.SO_KEEPALIVE, true)
.childOption(ChannelOption.TCP_NODELAY, true);

Do you think HTTP Keep-Alive will still work if ChannelOption.SO_KEEPALIVE is set to false?