Caption: by @olga
Hi everyone, I'm Chengxiang Moying!
HTTP is a core part of networking knowledge. Its basic unit is the request or response message, which consists of headers and an entity (the body). Most day-to-day use of HTTP comes down to setting the right request and response headers.
This "working HTTP" series goes beyond the usual header-by-header tour. It looks at what problem each part of the protocol solves, how it was designed, and how it works.
HTTP is a stateless, "loose" protocol. It keeps no state between requests, and because it involves two parties (client and server), much of what it specifies is only a recommendation about requests and responses that either side may ignore.
“It says here that the suggested retail price is $2…”
“Oh, I don't take suggestions!”
The previous two articles covered HTTP caching and HTTP content (entity) encoding and compression. The article on content encoding also mentioned transfer encoding, which changes how a message is transferred. Content encoding and transfer encoding complement each other and are commonly used together.
This article is about HTTP's transfer encoding mechanism.
2. HTTP transfer encoding
2.1 What is transfer encoding?
A transfer encoding is declared with the Transfer-Encoding header, which names the transfer encoding applied to the current message.
Transfer encoding changes the format of a message and the way it is transmitted. It does not reduce the amount of data sent and can even make it slightly larger. That may sound like a bad idea, but it exists to solve specific problems.
In short, transfer encoding is mainly used together with persistent connections. It lets data be sent in chunks and marks the end of a message within a persistent connection. More on this later.
In earlier designs, just as the Accept-Encoding request header advertises the content encodings a client accepts, the TE request header advertised the transfer encodings it supports. In current HTTP/1.1 practice, however, chunked is effectively the only transfer encoding in use (RFC 7230 also registers compress, deflate, and gzip as transfer codings, but they are rarely seen), and every HTTP/1.1 recipient must be able to decode it, so there is usually no need to rely on the TE header.
These details are covered later. Since transfer encoding and persistent connections are related, let's first look at what a persistent connection is.
2.2 Persistent Connection
A persistent connection is what is commonly called a long-lived connection; the name means exactly what it says.
In the early days of HTTP, every exchange followed the same cycle: open a connection, send the request, transfer the data, close the connection. With persistent connections the closing step is dropped, so the client and server can keep exchanging messages over the same connection.
The goal is efficiency. HTTP runs on top of TCP, so it inherits TCP's costs, such as the three-way handshake and slow start, which makes every connection a valuable resource. Reusing connections is therefore important for HTTP performance, and this is why the mechanism was added to the protocol.
Early HTTP/1.0 had no persistent connections. The concept was introduced later through the Connection: keep-alive header, which asks the other side not to close the TCP connection after sending its data, because the connection will be used again.
HTTP/1.1 made this the default: every connection is treated as persistent unless a Connection: close header explicitly states that it will be closed once the message has been sent.
As a result, HTTP/1.1 no longer needs the keep-alive value of the Connection header. For historical reasons, many clients and servers still send it.
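The rules above can be summed up in a few lines. The following is a minimal sketch, not taken from any real server; the function name is_persistent and the plain dict of headers are assumptions made for illustration.

```python
def is_persistent(http_version: str, headers: dict) -> bool:
    """Decide whether the connection should stay open after this message."""
    # The Connection header is case-insensitive and may hold several tokens.
    connection = ""
    for name, value in headers.items():
        if name.lower() == "connection":
            connection = value
    tokens = {t.strip().lower() for t in connection.split(",") if t.strip()}

    if "close" in tokens:
        return False              # an explicit close always wins
    if http_version == "HTTP/1.1":
        return True               # HTTP/1.1: persistent by default
    return "keep-alive" in tokens  # HTTP/1.0: only when asked for explicitly


print(is_persistent("HTTP/1.1", {}))                            # True
print(is_persistent("HTTP/1.1", {"Connection": "close"}))       # False
print(is_persistent("HTTP/1.0", {"Connection": "keep-alive"}))  # True
```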
Persistent connections raise a new problem: how does the recipient know that the current message has been fully received?
2.3 Determining when the transfer is complete
Before persistent connections, a recipient could treat the closing of the connection as the end of the transfer, and most browsers did, but this was never the proper way. The Content-Length header should be used to state the length of the entity being sent.
With a persistent connection, for example, the recipient relies on Content-Length to know when the data has been fully received.
Content-Length tells the recipient where a response entity ends. For this to work, its value must match the actual length of the entity; if it does not, all kinds of problems arise.
As shown in the figure above, if Content-Length is smaller than the actual entity length, the content is truncated; if it is larger, the recipient cannot tell that the response is complete, and the request hangs in the pending state.
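Both failure modes are easy to reproduce. The sketch below is an illustration under stated assumptions: io.BytesIO stands in for a persistent connection, and read_body is a hypothetical helper, not part of any HTTP library. On a real socket a too-large Content-Length would make the read block; here it raises instead.

```python
import io


def read_body(stream, content_length: int) -> bytes:
    """Read exactly content_length bytes of entity from a connection-like stream."""
    body = stream.read(content_length)
    if len(body) < content_length:
        # A real socket would keep waiting here (the pending state).
        raise EOFError(f"expected {content_length} bytes, got {len(body)}")
    return body


# Two responses sent back to back on one persistent connection.
wire = io.BytesIO(b"Hello CxmyDev" + b"NEXT RESPONSE")

print(read_body(wire, 13))  # b'Hello CxmyDev' (correct length)
wire.seek(0)
print(read_body(wire, 5))   # b'Hello' (too small: truncated)
print(wire.read())          # b' CxmyDevNEXT RESPONSE' (leftover pollutes the next response)
```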
Ideally, when we respond to a request, we know the size of the entity in advance. In practice, the length is sometimes hard to obtain, for example when the entity comes from a network file or is generated dynamically. If we still want the length up front, the only option is to allocate a large enough buffer and wait until all of the content has been buffered.
This is not a good solution. Buffering everything first consumes more memory, and it also takes more time, which makes the client wait too long.
What is needed is a mechanism that does not depend on Content-Length to determine whether the entity has been fully transferred. This is where the Transfer-Encoding header comes in.
2.4 Transfer-Encoding: chunked
As mentioned earlier, chunked is the only Transfer-Encoding value you are likely to meet in HTTP/1.1. It indicates that the message body is sent in chunks.
So in practice all we need to do is send Transfer-Encoding: chunked in the header, and we can then wrap the entity in chunks for transfer.
Rules for chunked transfer:
1. Each chunk consists of a data length, written in hexadecimal, followed by the actual data.
2. The length occupies a line of its own, and both the length line and the data end with CRLF (\r\n).
3. The length counts only the bytes of the chunk data; the CRLF that follows the data is not included.
4. A final chunk with a length of 0 marks the end of the entity.
In this example, the response header first declares Transfer-Encoding: chunked. The body then carries a first chunk “01234567890” of length b (hexadecimal for 11), followed by the chunks “Hello CxmyDev” and “123”, and ends with a chunk of length 0 that marks the end of the response.
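The same body can be built by hand. This is a minimal sketch of the chunk rules, assuming the first chunk is the 11-byte string 01234567890; encode_chunked is a hypothetical helper name, not a library function.

```python
def encode_chunked(chunks) -> bytes:
    """Encode an iterable of byte strings as a chunked message body."""
    out = bytearray()
    for data in chunks:
        if not data:
            continue  # a zero-length chunk would end the body early
        out += f"{len(data):x}\r\n".encode("ascii")  # hex length on its own line
        out += data + b"\r\n"                        # data, then a CRLF that is not counted
    out += b"0\r\n\r\n"                              # last chunk, empty trailer
    return bytes(out)


body = encode_chunked([b"01234567890", b"Hello CxmyDev", b"123"])
print(body)
# b'b\r\n01234567890\r\nd\r\nHello CxmyDev\r\n3\r\n123\r\n0\r\n\r\n'
```

Note that the sender never needs to know the total length: each chunk is written as soon as its data is available.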
2.5 Chunked trailers
With chunked transfer, the sender has the option of appending extra data after the last chunk. This is called the trailer.
The trailer carries data that the server can only send at the end. A client is allowed to ignore and discard it, so both sides need to agree on what it contains.
The trailer consists of header fields. Most HTTP headers can be sent in a trailer; the exceptions include Transfer-Encoding, Trailer, and Content-Length.
Trailers are generally used for values that cannot be determined when the response starts. Content-MD5 is a common example: as with the length, the MD5 of an entity that is being sent in chunks is hard to compute before the response begins.
Note the Trailer header in the response headers: it announces that a Content-MD5 trailer field will follow at the end. If there are several trailer fields, they are listed separated by commas.
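A response with a Content-MD5 trailer might be assembled as follows. This is a sketch under assumptions: the helper name encode_chunked_with_md5 and the sample chunks are made up for illustration, and the digest is base64-encoded as Content-MD5 requires.

```python
import base64
import hashlib


def encode_chunked_with_md5(chunks) -> bytes:
    """Encode chunks and append a Content-MD5 trailer computed on the fly."""
    md5 = hashlib.md5()
    out = bytearray()
    for data in chunks:
        if not data:
            continue
        md5.update(data)  # hash each chunk as it is sent
        out += f"{len(data):x}\r\n".encode("ascii") + data + b"\r\n"
    out += b"0\r\n"       # last chunk
    digest = base64.b64encode(md5.digest()).decode("ascii")
    out += f"Content-MD5: {digest}\r\n".encode("ascii")  # trailer field
    out += b"\r\n"        # blank line ends the trailer and the message
    return bytes(out)


head = (
    b"HTTP/1.1 200 OK\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"Trailer: Content-MD5\r\n"  # announces the trailer field in advance
    b"\r\n"
)
response = head + encode_chunked_with_md5([b"Hello CxmyDev", b"123"])
print(response.decode("ascii"))
```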
3. Combining content encoding and transfer encoding
Content encoding and transfer encoding are generally used together: the entity is first compressed with a content encoding and then sent in chunks with the transfer encoding. The client receives the chunks, reassembles them, and decompresses the result to recover the original data.
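The round trip can be sketched in a few lines, assuming gzip as the content encoding. The helper names to_chunks and from_chunks and the 16-byte chunk size are assumptions for illustration; the decoder ignores trailers and does no error handling.

```python
import gzip


def to_chunks(payload: bytes, chunk_size: int = 16) -> bytes:
    """Sender side: split an (already compressed) payload into chunks."""
    out = bytearray()
    for i in range(0, len(payload), chunk_size):
        piece = payload[i:i + chunk_size]
        out += f"{len(piece):x}\r\n".encode("ascii") + piece + b"\r\n"
    return bytes(out + b"0\r\n\r\n")


def from_chunks(body: bytes) -> bytes:
    """Receiver side: reassemble the chunk data into one payload."""
    out = bytearray()
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)
        size = int(body[pos:eol].split(b";")[0], 16)  # drop chunk extensions
        pos = eol + 2
        if size == 0:
            return bytes(out)      # last chunk reached
        out += body[pos:pos + size]
        pos += size + 2            # skip the data and its trailing CRLF


original = b"Hello CxmyDev " * 50
wire = to_chunks(gzip.compress(original))                 # compress, then chunk
assert gzip.decompress(from_chunks(wire)) == original     # dechunk, then decompress
```

The order matters: the transfer encoding is applied last by the sender and removed first by the receiver.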
4. Transfer encoding summary
By now we should have a good idea of how transfer encoding works. Here is a quick summary:
1. Transfer encoding is declared with the Transfer-Encoding header. In current HTTP/1.1 practice its only common value is chunked, which means the body is sent in chunks.
2. Transfer encoding mainly solves the problem of sending data in chunks over a persistent connection and determining where the entity ends.
3. Chunk format: data length (hexadecimal) + chunk data.
4. Additional data can be sent in a trailer after the last chunk.
5. Transfer encoding is usually used together with content encoding.
In addition, chunked transfer encoding is a standard part of HTTP/1.1, and every HTTP/1.1 implementation should support it. A server that receives a message with a transfer encoding it does not understand should respond with the 501 Not Implemented status code.
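On the server side, that check could look like the sketch below. It assumes a server that implements only chunked; the names SUPPORTED_TRANSFER_CODINGS and transfer_encoding_status are hypothetical.

```python
# Assumption: this server only implements the chunked transfer coding.
SUPPORTED_TRANSFER_CODINGS = {"chunked"}


def transfer_encoding_status(header_value: str):
    """Return 501 if the Transfer-Encoding value names an unsupported coding, else None."""
    codings = [c.strip().lower() for c in header_value.split(",") if c.strip()]
    if any(c not in SUPPORTED_TRANSFER_CODINGS for c in codings):
        return 501  # Not Implemented
    return None


print(transfer_encoding_status("chunked"))        # None
print(transfer_encoding_status("gzip, chunked"))  # 501
```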
Reference links:
- Transfer-Encoding in the HTTP protocol: https://imququ.com/post/transfer-encoding-header-in-http.html
- RFC 7230, section 3.3.1: Transfer-Encoding: https://tools.ietf.org/html/rfc7230#page-28
- RFC 7230, section 4.4: Trailer: https://tools.ietf.org/html/rfc7230#section-4.4
- RFC 7230, section 4.1.2: Chunked trailer part: https://tools.ietf.org/html/rfc7230#section-4.1.2