This article was originally published by its author (author homepage: zhihu.com/people/hrsonion/posts). Thanks to the original author for sharing it so generously.

1. Introduction

A classic interview question asks: what happens between typing a URL into the browser and the page being displayed?

Most answers focus on how the DOM is built and painted once the response arrives. But have you ever wondered how the dozens of images referenced by image tags in that HTML are downloaded: in what order, over how many connections, and with which protocol?

To understand this, we need to answer the following five questions:

1) After establishing a TCP connection with the server, does a modern browser close it once an HTTP request completes? If not, when is it closed?

2) How many HTTP requests can one TCP connection carry?

3) Can HTTP requests on one TCP connection be sent together (e.g., three requests sent together and three responses received together)?

4) Why does refreshing the page sometimes not require re-establishing the SSL connection?

5) Does the browser limit the number of TCP connections to the same host?

With these questions in mind, let’s read on.

(This article is simultaneously published at: www.52im.net/thread-2680…)

2. Related articles

  • Web Programming Slacker’s Guide (7): Understand HTTP in a Nutshell
  • From HTTP/0.9 to HTTP/2: Understanding the History and Design of the HTTP Protocol
  • Introduction to Brain-dead Network Programming (3): Some must-know HTTP protocols
  • Introduction to Web Programming (4): A Quick Understanding of HTTP/2 Server Push

3. First question: Is the connection to the server closed after an HTTP request? When is it closed?

As the heading says, let’s start with the first question: after establishing a TCP connection with the server, does a modern browser close it once an HTTP request completes? If not, when is it closed?

In HTTP/1.0, the server closes the TCP connection after sending each HTTP response. Setting up and tearing down a TCP connection for every request is expensive, so although it was not part of the standard, some servers supported the Connection: keep-alive header, which asks that the TCP connection not be closed once the HTTP request completes. The benefit is that later HTTP requests can reuse the connection instead of establishing a new one, and as long as the connection is kept, the SSL handshake overhead is avoided as well.

The following two screenshots show the timing breakdown for two visits I made to www.github.com within a short time:

▲ First visit: the initial connection and SSL overhead are present

▲ Second visit: the initial connection and SSL overhead disappear, indicating that the same TCP connection is reused

Persistent connections: because keeping the TCP connection open has so many benefits, HTTP/1.1 wrote the Connection header into the standard and made persistent connections the default. Unless the request specifies Connection: close, the TCP connection between the browser and the server is kept open for a period of time rather than being closed when the request ends.

So the answer to the first question is: by default, an established TCP connection is not closed after a request; it is closed after the request completes only if Connection: close is declared in the request header (for details, see: tools.ietf.org/html/rfc261…).
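
The rule above can be written down as a small decision function. This is only a sketch of the defaults described in this section: the function name connection_persists is made up for illustration, it ignores idle timeouts, and for HTTP/1.0 it assumes the server is one of those that honours keep-alive.

```python
def connection_persists(http_version, connection_header=None):
    """Return True if the TCP connection is kept open after this request.

    Hypothetical helper: it models only the defaults described in the text.
    """
    # The Connection header may hold several comma-separated tokens.
    tokens = {t.strip().lower() for t in (connection_header or "").split(",")}
    if "close" in tokens:
        return False  # an explicit close always wins
    if "keep-alive" in tokens:
        return True   # assumes the server honours keep-alive (non-standard in HTTP/1.0)
    # No explicit header: HTTP/1.1 is persistent by default, HTTP/1.0 is not.
    return http_version == "HTTP/1.1"
```

For example, connection_persists("HTTP/1.1") is True, while connection_persists("HTTP/1.1", "close") and connection_persists("HTTP/1.0") are both False.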

4. Second question: How many HTTP requests can a TCP connection correspond to?

The answer to the first question already covers this: as long as the connection is kept open, a single TCP connection can carry multiple HTTP requests.
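
One way to see this for yourself is to count TCP connections on the server side. The sketch below uses only Python’s standard library and a throwaway server on the loopback interface; CountingHandler and count_connections are names invented for this example, and the behaviour shown is that of Python’s http.server and http.client, not of any particular browser.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: http.server creates one instance per TCP connection."""

    protocol_version = "HTTP/1.1"  # lets http.server keep connections open
    connections = 0

    def setup(self):
        super().setup()
        CountingHandler.connections += 1  # a new TCP connection was accepted

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        if self.close_connection:
            # The client sent "Connection: close"; echo it so the client knows.
            self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def count_connections(n_requests, close_each=False):
    """Send n_requests GETs and return how many TCP connections the server saw."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    conn = http.client.HTTPConnection(host, port)
    headers = {"Connection": "close"} if close_each else {}
    try:
        for _ in range(n_requests):
            conn.request("GET", "/", headers=headers)
            conn.getresponse().read()  # drain the body before the next request
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return CountingHandler.connections
```

With the default headers, count_connections(3) should report a single connection for all three requests; with close_each=True every request ends up on its own connection.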

5. Third question: Can HTTP requests sent in a TCP connection be sent together?

Let’s look at the third question: can HTTP requests sent in a TCP connection be sent together (say, three requests sent together and three responses received together)?

HTTP/1.1 has a limitation: a single TCP connection can handle only one request at a time. In other words, on the same TCP connection the lifetimes of any two HTTP requests, from start to finish, cannot overlap.

The HTTP/1.1 specification defines pipelining to address this, but browsers disable it by default.

Let’s look at what pipelining is, as described in RFC 2616:

Original text: A client that supports persistent connections MAY “pipeline” its requests (i.e., send multiple requests without waiting for each response). A server MUST send its responses to those requests in the same order that the requests were received.

In other words, a client that supports persistent connections may send multiple requests on one connection without waiting for each response, and the server must send its responses in the order in which the requests were received.

One likely reason the standard requires this is that HTTP/1.1 is a text protocol and a response carries nothing that identifies which request it belongs to, so the order has to match. For example, suppose you send the server two requests, GET /query?q=A and GET /query?q=B. The server returns two results, and the browser has no way to tell from the content which response corresponds to which request.
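
To make the ordering argument concrete, here is a toy model of the client side. It is not a real HTTP client: PipelinedConnection and match_responses are hypothetical names, and requests and responses are plain strings. The only point is that, without an identifier in the response, the client can pair responses with requests purely by arrival order.

```python
from collections import deque


class PipelinedConnection:
    """Toy model: remembers the order of sent requests, nothing else."""

    def __init__(self):
        self._pending = deque()

    def send(self, request):
        self._pending.append(request)

    def receive(self, response_body):
        # The response carries no request ID, so the only possible pairing
        # is with the oldest request that has not been answered yet.
        return self._pending.popleft(), response_body


def match_responses(requests, response_bodies):
    """Pair each arriving response with a request, first in, first out."""
    conn = PipelinedConnection()
    for request in requests:
        conn.send(request)
    return [conn.receive(body) for body in response_bodies]
```

If the server answers in order, each result is paired with the right query. If it answered in the opposite order, the result for q=B would silently be attributed to the request for q=A, which is why the specification insists on in-order responses.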

Pipelining looks like a good idea, but there are many problems in practice:

1) Some proxy servers cannot handle HTTP Pipelining correctly;

2) Implementing pipelining correctly is complex. For details, see HTTP/1.x Connection Management.

3) Head-of-line blocking: after a TCP connection is established, suppose the client sends several consecutive requests to the server on that connection. According to the standard, the server must return the results in the order in which it received the requests. If the server takes a long time to process the first request, all subsequent responses have to wait until the first one has finished.
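
Head-of-line blocking is easy to quantify with a small calculation. The sketch below assumes, purely for illustration, that the server starts working on all pipelined requests at time 0 and that each request needs the given processing time; the function name is hypothetical.

```python
from itertools import accumulate


def pipelined_delivery_times(processing_times):
    """Time at which each response can be sent when responses must stay in order.

    Response i cannot leave before every earlier response is ready, so its
    delivery time is the maximum processing time among requests 0..i.
    """
    return list(accumulate(processing_times, max))
```

With processing times [5, 1, 1], the two fast requests are ready after 1 time unit but cannot be delivered until time 5, because they are stuck behind the slow first request. With [1, 5, 1], only the last request is held back.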

So HTTP Pipelining is not enabled in modern browsers by default.

HTTP/2, however, provides multiplexing, which allows multiple HTTP requests to be in flight simultaneously over one TCP connection. Exactly how multiplexing is implemented is a separate topic; here we can simply look at the effect of using HTTP/2:

▲ Green is the waiting time between sending the request and the response starting to arrive, and blue is the download time of the response. You can see that the requests all run in parallel over the same connection

So here is the answer to this question: HTTP/1.1 has pipelining, which can do this, but since browsers disable it by default, it is effectively not available. With HTTP/2, the multiplexing feature allows multiple HTTP requests to be processed in parallel over the same TCP connection.
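
The core idea behind multiplexing can be sketched in a few lines, with the caveat that this is a heavy simplification and not the real HTTP/2 frame format. In HTTP/2 every frame carries a stream identifier, so pieces of different responses can be interleaved on one connection and sorted out again by the receiver. The function names below and the tuple representation of a frame are assumptions made for this example.

```python
from collections import defaultdict
from itertools import zip_longest


def split_into_frames(stream_id, body, frame_size):
    """Cut one response body into (stream_id, chunk) frames."""
    return [(stream_id, body[i:i + frame_size])
            for i in range(0, len(body), frame_size)]


def interleave(*frame_lists):
    """Put frames from several streams onto one connection, round-robin."""
    wire = []
    for group in zip_longest(*frame_lists):
        wire.extend(frame for frame in group if frame is not None)
    return wire


def demultiplex(wire):
    """Reassemble each body on the receiving side using the stream ID."""
    bodies = defaultdict(str)
    for stream_id, chunk in wire:
        bodies[stream_id] += chunk
    return dict(bodies)
```

Because each chunk is labelled, the short response on stream 3 does not have to wait for the long response on stream 1 to finish, which is exactly the problem pipelining could not solve.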

So how do browsers improve page loading efficiency in the HTTP/1.1 era?

There are two main points:

1) Maintain the established TCP connection with the server and process multiple requests sequentially on the same connection;

2) Establish multiple TCP connections with the server.

6. Fourth question: Why is it sometimes unnecessary to re-establish the SSL connection when refreshing the page?

The discussion of the first question already answers this: browsers and servers sometimes keep the TCP connection open for a period of time. When the TCP connection does not need to be re-established, the SSL session already set up on it is simply reused.

7. Fifth question: Is there a limit on the number of TCP connections established by the browser to the same Host?

Suppose we were in the HTTP/1.1 era, when there was no multiplexing. What would a browser do when it got a web page with dozens of images?

Opening just one TCP connection and downloading the images one after another is not acceptable, because users would wait far too long. But opening a TCP connection for every image is not acceptable either, because the computer or the server may not be able to cope: with 1000 images, you cannot open 1000 TCP connections. Even if your computer could handle it, a NAT device along the path might not.

So the answer is: yes. Chrome allows up to six TCP connections to the same host. Other browsers differ somewhat; see: developers.google.com/web/tools/c… .
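
The effect of such a limit can be estimated with a small scheduling simulation. This is a sketch under simplifying assumptions: each connection handles one request at a time, connection setup costs nothing, and a waiting request is given to whichever connection becomes free first. The function name and the default of six connections (taken from the Chrome figure above) are illustrative.

```python
import heapq


def total_download_time(durations, max_connections=6):
    """Time until the last download finishes with a capped connection pool."""
    if max_connections < 1:
        raise ValueError("max_connections must be at least 1")
    # One entry per open connection: the time at which it becomes idle.
    free_at = [0] * min(max_connections, len(durations))
    heapq.heapify(free_at)
    finish = 0
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest idle connection
        end = start + duration
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish
```

For twelve images that each take one time unit, a single connection needs 12 units, six connections need 2, and twelve connections would need 1, which shows both why browsers open several connections and why the gain levels off.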

So, back to the original question: if the HTML you receive contains dozens of image tags, how, in what order, over how many connections, and with which protocol are those images downloaded?

If the images are all served over HTTPS under the same domain name, the browser negotiates with the server as part of the SSL/TLS handshake (via ALPN) whether HTTP/2 can be used, and if so, uses multiplexing over that connection. Not every resource under that domain is necessarily fetched over a single TCP connection, but multiplexing will very likely be used.
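
For readers who want to check what a given server negotiates, Python’s ssl module exposes this step. The sketch below is an assumption-laden illustration rather than what a browser does internally: make_alpn_context and negotiated_protocol are names invented here, the call needs network access, and it only reports the negotiation result without speaking HTTP/2 itself.

```python
import socket
import ssl


def make_alpn_context():
    """Client TLS context that offers HTTP/2 first, then HTTP/1.1."""
    context = ssl.create_default_context()
    context.set_alpn_protocols(["h2", "http/1.1"])
    return context


def negotiated_protocol(host, port=443, timeout=5.0):
    """Return the ALPN protocol chosen by the server, or None if none was chosen."""
    context = make_alpn_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.selected_alpn_protocol()
```

A server that supports HTTP/2 would be expected to answer with "h2"; anything else means the client has to fall back to HTTP/1.1 on that connection.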

What if HTTP/2 cannot be used, or HTTPS is not available (in practice HTTP/2 is only implemented over HTTPS, so plain HTTP falls back to HTTP/1.1)? Then the browser opens multiple TCP connections to the same host, up to a maximum that depends on the browser’s settings. New requests are sent on whichever of these connections is idle; if every connection is busy, the remaining requests have to wait.


