Serial connection
HTTP/0.9 and early HTTP/1.0 processed requests serially. Suppose a page references three stylesheets, all served from the same protocol, domain, and port. The browser then issues four requests (the HTML document plus the three stylesheets), but opens only one TCP connection at a time: as soon as a resource finishes downloading, the connection is closed and a new one is opened for the next request in the queue. As pages grow in size and resource count, this accumulated network latency leaves users staring at a blank screen for so long that they lose patience.
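The cost of serial processing is easy to see in a toy model. The sketch below is illustrative only: the handshake and transfer times are made-up assumptions, not measurements.

```python
# Toy model of serial HTTP request timing (all numbers are illustrative).
# Each request pays a TCP setup/teardown cost plus a transfer time, and
# the next request cannot start until the previous connection is closed.

HANDSHAKE_MS = 50   # assumed cost of TCP setup + teardown per request
TRANSFER_MS = 100   # assumed download time per resource

def serial_total(num_requests: int) -> int:
    """Total wall-clock time when requests run one after another."""
    return num_requests * (HANDSHAKE_MS + TRANSFER_MS)

# One HTML document plus three stylesheets = four requests.
print(serial_total(4))  # 4 * 150 = 600 ms of accumulated latency
```

Every request pays the full per-connection overhead again, which is exactly what parallel and persistent connections set out to amortize.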
Parallel connection
To improve network throughput, later HTTP versions allow clients to open multiple TCP connections at the same time and request multiple resources in parallel, making fuller use of the available bandwidth. Each connection still incurs some setup latency, but because the transfers overlap, the overall delay is far lower than with serial connections. Since every connection consumes system resources, and servers must handle many concurrent users, browsers cap the number of concurrent connections per origin. The HTTP specification does not mandate a hard limit (RFC 2616 once recommended at most two persistent connections per server, a suggestion later dropped by RFC 7230), so each vendor sets its own:
- IE 7: 2
- IE 8/9: 6
- IE 10: 8
- IE 11: 13
- Firefox: 6
- Chrome: 6
- Safari: 6
- Opera: 6
- iOS WebView: 6
- Android WebView: 6
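With a per-origin cap of six connections, a page with more resources downloads them in waves. A small sketch of that arithmetic (the cap of six is the typical value from the list above; the wave model is a simplification, since real browsers refill slots as soon as any request finishes):

```python
import math

MAX_CONNECTIONS = 6  # typical per-origin cap in modern browsers

def download_rounds(num_resources: int) -> int:
    """Minimum number of sequential 'waves' needed when at most
    MAX_CONNECTIONS requests can be in flight at once."""
    return math.ceil(num_resources / MAX_CONNECTIONS)

print(download_rounds(20))  # 20 same-origin resources -> 4 waves
```

This is also why techniques such as domain sharding emerged in the HTTP/1.x era: each additional origin brings its own pool of six connections.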
Persistent connection (long connection)
Earlier HTTP versions used a separate TCP connection for each request, paying the cost of connection establishment, congestion control (slow start), and teardown every time. Improved HTTP/1.0 and HTTP/1.1 (where it is the default) support persistent connections: once a request completes, the connection is kept open for a period of time so that upcoming HTTP requests can be served quickly. The same TCP channel is reused until the client's keep-alive probing fails or the server's idle timeout expires. The feature is activated via the Connection: keep-alive header, and either side can close the connection by sending Connection: close. The two optimizations of parallel and persistent connections are therefore complementary: parallel connections let the browser open several TCP connections on the first page load, while persistent connections ensure that subsequent requests reuse those open connections; this combination is the standard mechanism for modern web pages.
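The header negotiation above can be sketched as plain request text. This is a minimal sketch (the hostname and path are placeholders): under HTTP/1.0 the client must opt in to persistence explicitly, while under HTTP/1.1 persistence is the default and Connection: close is the opt-out.

```python
# Build the raw request headers for a GET, showing how persistence
# is negotiated differently in HTTP/1.0 vs HTTP/1.1.

def build_request(path: str, host: str, version: str, keep_alive: bool) -> str:
    headers = [f"GET {path} HTTP/{version}", f"Host: {host}"]
    if version == "1.0" and keep_alive:
        headers.append("Connection: keep-alive")  # opt in (HTTP/1.0)
    elif version == "1.1" and not keep_alive:
        headers.append("Connection: close")       # opt out (HTTP/1.1)
    return "\r\n".join(headers) + "\r\n\r\n"

print(build_request("/style.css", "example.com", "1.0", keep_alive=True))
```

Note that for an HTTP/1.1 request with keep_alive=True, no Connection header is emitted at all: persistence is simply the default.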
Pipelined connection
Persistent connections let us reuse one connection for multiple requests, but the requests must still be served in FIFO order: the next request in the queue can only be sent after the previous one has reached the server, been processed, and returned at least its first response byte to the client. HTTP pipelining allows the client to send multiple requests back-to-back over the same TCP channel without waiting for responses, eliminating the intervening round-trip delays. However, HTTP/1.x does not allow response data to arrive interleaved on one connection (no multiplexing). Imagine the client sends an HTML request and several CSS requests at once, and the server processes them all in parallel: if the finished CSS responses sit in the output buffer while the HTML request hits a problem and hangs indefinitely, everything behind it stalls, and in severe cases the buffer overflows. This condition is called head-of-line blocking, and it is why pipelining, although defined in HTTP/1.1, was never enabled by default in practice.
Head-of-line blocking is not unique to HTTP; it is a common phenomenon in any store-and-forward (buffered) packet-switched network.
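A toy simulation makes the blocking concrete (the timings below are made-up assumptions): even when later responses finish early on the server, the client cannot receive them until the stalled response ahead of them has been delivered.

```python
# Toy simulation of head-of-line blocking on a pipelined connection.
# Responses must be delivered in request order, so each response is
# received only after both (a) it is ready on the server and (b)
# everything ahead of it in the queue has been delivered.

def delivery_times(ready_times):
    """ready_times[i] = when response i finishes on the server (ms).
    Returns when each response actually reaches the client."""
    delivered, latest = [], 0
    for ready in ready_times:
        latest = max(latest, ready)  # blocked behind earlier responses
        delivered.append(latest)
    return delivered

# The HTML (first) response stalls for 500 ms; three CSS responses
# are ready after 10, 20, and 30 ms but are blocked behind it.
print(delivery_times([500, 10, 20, 30]))  # [500, 500, 500, 500]
```

HTTP/2 avoids this at the HTTP layer by multiplexing frames from many streams over one connection, so a slow response no longer holds up completed ones.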
Conclusion
- Browsers typically open at most six TCP connections to the same origin (the same protocol, domain, and port).
- A single TCP connection can carry multiple HTTP requests, but each must wait until at least the first byte of the previous response has reached the client.
- Because of head-of-line blocking, the client cannot simply send all queued requests at once on one connection; HTTP/2 resolves this with stream multiplexing.