This article is taken from my public account [Sun Wukong, Don’t talk nonsense]

Most engineers know TCP’s three-way handshake and four-way teardown (the “four waves”). To keep things from being boring, I’d like to start with a plain-language description of the three-way handshake.

Story mode

Once upon a time, there were two balls. One was called “Little Frog” and the other was called “Little Turtle”. They lived on different parts of a mountain. Both balls had a mouth and ears, but neither could see. One day, Little Frog went for a walk and happened to meet Little Turtle. They wanted to talk, but neither was sure whether the other could hear or speak (a conversation only works if both sides can hear and both sides can speak). Here, the ability to speak stands for the ability to send, and the ability to hear stands for the ability to receive.

Finally, Little Turtle said: “Hello ~ hello”

Little Frog heard the greeting, so Little Frog knew that Little Turtle could speak. (Round 1: Little Turtle can send messages)

So Little Frog politely replied: “Hello, hello ~”

Little Turtle heard Little Frog’s reply. Now Little Turtle knew that Little Frog could not only hear, but also speak. (Round 2: Little Frog can send and receive messages)

At this point, Little Frog still did not know whether Little Turtle could hear. So Little Turtle had to tell Little Frog that the reply had arrived, by answering it and saying a little more. (Round 3: Little Turtle can receive messages)

And so, after these three rounds, the happy chatter of the two little ones began.

That’s all there is to a TCP handshake.

There is a very important idea here: each party not only needs to know that it has a certain ability (to send information, to receive information), but, more importantly, needs to let the other party know that it has this ability. To borrow a popular phrase: it is not enough that I think so, I need you to think so too.

Explain the TCP three-way handshake

With this story behind us, let’s look at the formal, boring, standard TCP three-way handshake:

TCP establishes reliable (receive assured) full-duplex communication through a three-way handshake.

Before the client and server can exchange application data, they must agree on the starting packet sequence numbers and other connection-related details. For security reasons, each end generates its initial sequence number randomly.

  • SYN
    • The client selects a random sequence number x and sends a SYN packet, which may also include other TCP flags and options.
  • SYN ACK
    • The server increments x by one, selects its own random sequence number y, appends its own flags and options, and returns the response.
  • ACK
    • The client increments both x and y by one and sends the final ACK packet of the handshake.
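
The sequence-number arithmetic in the three steps above can be sketched in a few lines of Python. This is only an illustration, not a real TCP stack: the function name `three_way_handshake` and the dictionary layout are made up for this example, and the initial sequence numbers are passed in so the output is predictable.

```python
import random

SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn=None, server_isn=None):
    """Return the three handshake segments as plain dictionaries.

    client_isn / server_isn play the roles of x and y from the list above.
    If they are not given, random 32-bit values are used.
    """
    x = random.getrandbits(32) if client_isn is None else client_isn
    y = random.getrandbits(32) if server_isn is None else server_isn

    # 1. SYN: the client announces its initial sequence number x.
    syn = {"flags": "SYN", "seq": x, "ack": None}
    # 2. SYN ACK: the server acknowledges x + 1 and announces y.
    syn_ack = {"flags": "SYN-ACK", "seq": y, "ack": (x + 1) % SEQ_MODULUS}
    # 3. ACK: the client acknowledges y + 1 and continues from x + 1.
    ack = {
        "flags": "ACK",
        "seq": (x + 1) % SEQ_MODULUS,
        "ack": (y + 1) % SEQ_MODULUS,
    }
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```

With x = 100 and y = 300, the SYN ACK acknowledges 101 and the final ACK carries sequence number 101 and acknowledges 301.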

Do I have to shake hands three times?

Yes. We want reliable (guaranteed receipt) full-duplex communication, and, as in the story, the three-way handshake is what lets both the client and the server be sure that sending and receiving work in both directions. In detail:

First handshake: The client sends a network packet and the server receives it. In this way, the server can conclude that the sending capability of the client and the receiving capability of the server are normal.

Second handshake: The server sends the packet and the client receives it. In this way, the client can conclude that the receiving and sending capabilities of the server and the client are normal. However, the server cannot confirm whether the client’s reception capability is normal.

Third handshake: The client sends the packet and the server receives it. In this way, the server can conclude that the receiving capability of the client and the sending capability of the server are normal.

At this point, the client and server can be sure that both the receiving and sending capabilities are normal.
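
The reasoning above can be written down as a small bookkeeping exercise. The sketch below is a toy model, not network code: the fact strings and the function name `handshake_knowledge` are invented for this example. It records what each side has confirmed after each handshake.

```python
ALL_FACTS = {
    "client can send",
    "client can receive",
    "server can send",
    "server can receive",
}


def handshake_knowledge():
    """Return a snapshot of what each side has confirmed after each step."""
    known = {"client": set(), "server": set()}
    history = []

    def snapshot():
        history.append({side: set(facts) for side, facts in known.items()})

    # First handshake: the server receives the client's packet.
    known["server"] |= {"client can send", "server can receive"}
    snapshot()

    # Second handshake: the client receives a reply to its own packet,
    # so it has confirmed all four capabilities.
    known["client"] |= ALL_FACTS
    snapshot()

    # Third handshake: the server receives a reply to its own packet.
    known["server"] |= {"client can receive", "server can send"}
    snapshot()

    return history


for step, state in enumerate(handshake_knowledge(), start=1):
    for side in ("client", "server"):
        missing = sorted(ALL_FACTS - state[side])
        print(f"after handshake {step}: {side} still unsure about {missing}")
```

Only after the third snapshot is the list of open questions empty for both sides, which is why two handshakes are not enough.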

What happens if there’s no third handshake?

The third handshake mainly prevents a stale connection request segment from suddenly reaching the server and being treated as a new request, which would waste server resources.

Suppose only two handshakes were used to establish a connection. A connection request segment sent by the client could be held up at some network node for so long that it reaches the server only after the connection it belonged to has already been released. By then it is an invalid (stale) connection request segment. The server, however, cannot tell: it mistakes the stale segment for a new connection request, sends a confirmation to the client, and agrees to establish a connection.

The client never asked for this connection, so it ignores the server’s confirmation and sends no data. The server, however, believes a new connection has been established and keeps waiting for data from the client. In this way the server wastes a lot of connections.
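
The scenario above can be replayed with a toy model. The classes `Server` and `Client` and the function `deliver_stale_syn` are hypothetical names for this sketch, and the model makes one assumption beyond the text: the client answers an unexpected confirmation with a reset (RST), which is how real TCP stacks behave, instead of silently ignoring it.

```python
class Server:
    """Toy server that counts how many connections it considers open."""

    def __init__(self, require_final_ack):
        self.require_final_ack = require_final_ack
        self.pending = set()      # SYN seen, waiting for the final ACK
        self.established = set()  # connections the server holds open

    def on_syn(self, conn_id):
        if self.require_final_ack:
            self.pending.add(conn_id)      # three-way: not open yet
        else:
            self.established.add(conn_id)  # two-way: open immediately
        return "SYN-ACK"

    def on_ack(self, conn_id):
        if conn_id in self.pending:
            self.pending.remove(conn_id)
            self.established.add(conn_id)

    def on_rst(self, conn_id):
        self.pending.discard(conn_id)


class Client:
    """Toy client that only confirms connections it actually asked for."""

    def __init__(self):
        self.wanted = set()

    def on_syn_ack(self, conn_id):
        return "ACK" if conn_id in self.wanted else "RST"


def deliver_stale_syn(require_final_ack):
    """Deliver one stale SYN and return the number of open connections."""
    server = Server(require_final_ack)
    client = Client()  # the client no longer wants "old-conn"
    server.on_syn("old-conn")
    reply = client.on_syn_ack("old-conn")
    if reply == "ACK":
        server.on_ack("old-conn")
    else:
        server.on_rst("old-conn")
    return len(server.established)


print("two-way handshake, wasted connections:", deliver_stale_syn(False))
print("three-way handshake, wasted connections:", deliver_stale_syn(True))
```

With two handshakes the server ends up holding one connection that nobody uses; with three handshakes the stale request never becomes an established connection.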

Optimization of the TCP handshake phase

The TCP three-way handshake has been described above. Even though we cannot change the number of handshakes, there is still plenty of room for web performance optimization.

Do you need three handshakes for every connection?

In principle, every connection requires three handshakes. In practice, however, the common case is that the same client connects to the same server many times. In that case, a full handshake before every exchange of data is unnecessary. The full, capability-checking handshake only needs to run once; later connections can start exchanging data with the service on the server side right away.

So how do we do that? With TFO.

TFO

TCP Fast Open (TFO) is a mechanism that transmits data while a TCP connection is still being established. It lets the data exchange start earlier.

  1. During the first handshake, the client adds the Fast Open Cookie Request TCP option to its SYN packet. In this way, the server knows that this is a TFO request and includes some additional information in the second handshake message it returns.

  2. The server generates a cookie by encrypting the client’s IP address with a key. The cookie can be regarded as a handshake credential that the server issues to the client. The server sends the SYN | ACK response to the client and includes the cookie in the options of the response packet.

  3. The client saves the cookie after receiving it in the second handshake message.

  4. In later connection requests, the client carries the cookie, together with its first data, in the SYN. The server checks whether the cookie is valid (whether the corresponding IP address has completed a handshake before and the cookie is still within its validity period).

  5. If it is, the server accepts the data carried in the SYN and can start responding immediately, without waiting for the third handshake to complete.
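
The cookie logic in steps 2 to 5 can be sketched as follows. This is a simplified illustration under stated assumptions, not how any particular kernel implements it: it uses a truncated HMAC-SHA256 from the Python standard library as a stand-in for "encrypting the IP address with a key", it omits the validity period, and the names `SERVER_KEY`, `make_cookie`, `is_valid_cookie` and `handle_syn` are hypothetical.

```python
import hashlib
import hmac
import os

SERVER_KEY = os.urandom(16)  # hypothetical secret known only to the server


def make_cookie(client_ip, key=SERVER_KEY):
    """Derive an 8-byte cookie from the client IP and the server key.

    Assumption: a truncated HMAC stands in for the encryption step.
    """
    digest = hmac.new(key, client_ip.encode(), hashlib.sha256).digest()
    return digest[:8]


def is_valid_cookie(client_ip, cookie, key=SERVER_KEY):
    """Check the cookie by recomputing it for this client IP."""
    return hmac.compare_digest(make_cookie(client_ip, key), cookie)


def handle_syn(client_ip, cookie, data, key=SERVER_KEY):
    """Decide what the server does with a SYN that may carry a cookie."""
    if cookie is not None and is_valid_cookie(client_ip, cookie, key):
        # Steps 4-5: valid cookie, so the data in the SYN is accepted early.
        return {"accept_data": True, "new_cookie": None, "data": data}
    # Steps 1-2: no (or invalid) cookie, so fall back to the normal
    # handshake and hand out a cookie in the SYN | ACK.
    return {
        "accept_data": False,
        "new_cookie": make_cookie(client_ip, key),
        "data": None,
    }


# First connection: the client has no cookie yet and receives one.
first = handle_syn("203.0.113.7", None, b"")
# Later connection: the client presents the saved cookie along with data.
second = handle_syn("203.0.113.7", first["new_cookie"], b"GET / HTTP/1.1")
print(first["accept_data"], second["accept_data"])
```

Because the cookie depends on the client’s IP address and a key only the server knows, a cookie issued to one address is rejected when it is presented from another.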

According to test data, TFO can reduce HTTP transfer latency by 15% and full-page download time by 10% on average, and by up to 40% in some cases.

Wow, TFO sounds great. So does everyone use it?

Limitations on TFO use

TFO is a relatively new technology, and it must be supported by both the server and the client.

On the server side, TFO is only supported in Linux kernel 3.7 and later. To enable it, set /proc/sys/net/ipv4/tcp_fastopen to 3 on a Linux machine running a suitable kernel version.
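
To check the current setting, the value in /proc/sys/net/ipv4/tcp_fastopen can be read and decoded. In the Linux kernel documentation the value is a bitmask in which 0x1 enables the client side and 0x2 enables the server side, which is why 3 turns on both. The helper names `decode_tfo` and `read_tfo_setting` below are made up for this sketch, and only those two bits are decoded.

```python
from pathlib import Path

TFO_SYSCTL = Path("/proc/sys/net/ipv4/tcp_fastopen")


def decode_tfo(value):
    """Decode the two basic bits of the tcp_fastopen bitmask."""
    return {
        "client": bool(value & 0x1),  # may send data in an outgoing SYN
        "server": bool(value & 0x2),  # may accept data in an incoming SYN
    }


def read_tfo_setting(path=TFO_SYSCTL):
    """Return the decoded setting, or None if it cannot be read
    (for example on a non-Linux system)."""
    try:
        return decode_tfo(int(path.read_text().split()[0]))
    except (OSError, ValueError, IndexError):
        return None


print(read_tfo_setting())
```

Changing the value requires root privileges, so this sketch only reads it.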

On the browser side, support in the browser engine is required. Client support for TFO is as follows:

  • Edge on Windows: build 14352 and later.
  • Chrome: supported on Linux and Android; not supported on Windows.
  • Firefox: disabled by default; it can be enabled manually.

Conclusion

The TCP three-way handshake is a well-established, mature topic, so writing about it in a fresh way was a challenge. This article uses a story plus drawings, in the hope that this approach makes it more interesting.


Front-end performance optimization series:

(1): Start with the TCP three-way handshake

(2): Blocking in the TCP transmission process

(3): Optimization of the HTTP protocol

(4): Image optimization

(5): Browser cache strategy

(6): How does the browser work?

(7): Webpack performance optimization