Oh, old TCP knowledge

Hi everyone, I’m Master Kong. Today I want to go over the common knowledge points of network protocols. Why this topic? The year is coming to an end, and many of you are probably preparing for interviews next year. Whether you work on the front end, the back end, or the client, reviewing network protocols is indispensable.

When it comes to network protocols, HTTP and TCP are unavoidable. In the layered network model, HTTP is an application-layer protocol and TCP is a transport-layer protocol. There are other application-layer protocols, such as SMTP and FTP, and the transport layer also has UDP. HTTP is built on top of TCP, so let’s talk about TCP and IP first.

TCP and IP

Suppose you are chatting with a colleague on WeChat. You have a lot of software installed on your computer, such as NetEase Cloud Music and QQ. Have you ever wondered why a message sent from WeChat arrives at the other party’s WeChat and not at some other app? And, silly as it sounds, why does the message arrive at your colleague’s computer and not at the computer next door? The answer lies in the TCP and IP protocols. IP is the easier one to understand: every computer has an IP address, and because we know the IP address of the colleague’s computer, the message reaches our colleague rather than Old Wang next door. That is the job of the IP layer.

IP gets the data to your colleague’s computer, but it still has to reach the WeChat process running there. Many programs run on that computer and they all share the same IP address, so how is this done? The answer is the port, and that is the TCP layer’s job. When data passes through the sender’s TCP layer, the destination port (the port used by the WeChat process) is added. When the packet arrives at your colleague’s computer, the TCP layer unpacks it, reads the destination port, and hands the data to the WeChat process. (From the computer’s perspective: “The port number is 10086. Please hand this data to the WeChat process.”)

The TCP layer adds not only the destination port but also the source port, and the IP layer adds not only the destination IP but also the source IP. Together these form a four-tuple (source IP, source port, destination IP, destination port), and this four-tuple uniquely identifies a connection.
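To make the idea of "four values determine a connection" concrete, here is a minimal Python sketch of how a host might look up the owner of a connection. The `FourTuple` type, the `deliver` helper, the addresses, and the process labels are all hypothetical illustrations and not how any real kernel is written; only port 10086 comes from the example above.

```python
from typing import Dict, NamedTuple, Optional


class FourTuple(NamedTuple):
    """Hypothetical connection key: source IP, source port, destination IP, destination port."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def deliver(table: Dict[FourTuple, str], key: FourTuple) -> Optional[str]:
    """Return the owner of this connection, or None if no connection matches."""
    return table.get(key)


# Two connections to the same destination IP and port are still distinct,
# because their source ports differ. All addresses are made-up example values.
connections = {
    FourTuple("192.0.2.10", 50001, "192.0.2.20", 10086): "wechat (chat)",
    FourTuple("192.0.2.10", 50002, "192.0.2.20", 10086): "wechat (file transfer)",
}

packet = FourTuple("192.0.2.10", 50001, "192.0.2.20", 10086)
print(deliver(connections, packet))
```

Changing any one of the four values yields a different key, which is why the same destination port can serve many connections at once.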

We often say that TCP is a connection-oriented, reliable, byte-stream-based transport-layer protocol. That definition contains three abstractions worth examining:

What does byte stream based mean?
What is connection-oriented?
How is reliability achieved?

Byte stream based

Let’s start with the first question. What does it mean that TCP is based on byte streams? Suppose we write 1000 bytes into a socket. The 1000 bytes are first copied into the kernel send buffer, but how they are then sent out through the network card is not fixed. They may go out as 300 bytes and then 700 bytes, or as 500 and 500. No matter how they are split, every byte has its own sequence number.

The split varies because it is affected by the path MTU (maximum transmission unit), the size of the send window, the size of the congestion window, and other factors (these concepts are described later, just follow along). This is also where reliability comes in. Because the data has been segmented at the TCP layer, one piece of data is now scattered across several packets, and those packets may arrive in a different order from the one in which they were sent. When the kernel receives out-of-order packets, it does not hand them straight to the application above (HTTP and so on). It first reassembles them in order, and the reassembly relies on the sequence number. So with byte-stream transmission, how is a packet’s sequence number determined? It is simply the sequence number of the first byte in the packet.
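The splitting and reordering described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `segment` and `reassemble` helpers and the starting sequence number 5000 are made up for the example, and a real TCP stack also handles overlaps, retransmissions, and sequence number wraparound.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[int, bytes]  # (sequence number of the first byte, payload)


def segment(data: bytes, first_seq: int, sizes: Sequence[int]) -> List[Segment]:
    """Split data into segments of the given sizes, numbering each by its first byte."""
    if sum(sizes) != len(data):
        raise ValueError("sizes must add up to len(data)")
    segments, offset = [], 0
    for size in sizes:
        segments.append((first_seq + offset, data[offset:offset + size]))
        offset += size
    return segments


def reassemble(segments: Sequence[Segment], first_seq: int) -> bytes:
    """Put segments back in order by sequence number and check that nothing is missing."""
    out = bytearray()
    expected = first_seq
    for seq, payload in sorted(segments, key=lambda s: s[0]):
        if seq != expected:
            raise ValueError(f"gap: expected byte {expected}, got {seq}")
        out += payload
        expected += len(payload)
    return bytes(out)


data = bytes(i % 256 for i in range(1000))
segs = segment(data, first_seq=5000, sizes=[300, 700])
arrived_out_of_order = list(reversed(segs))
print(reassemble(arrived_out_of_order, first_seq=5000) == data)
```

Whether the 1000 bytes travel as 300 + 700 or 500 + 500, the receiver ends up with the same byte stream, because ordering depends only on the sequence numbers.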

Three-way handshake

Now for the second question: connection-oriented. That’s right, we’re back to the old chestnut: the three-way handshake and the four-way wave. They are also part of what makes TCP reliable. To be reliable, both parties must confirm that the other side is OK before a connection is established, and that is the three-way handshake. Let’s first see what the three handshakes do, and then why three is the right number rather than two or ten.

The picture above shows a bunch of terms such as SYN, seq, ACK, and ISN. Don’t be afraid; we will explain them one by one. As mentioned above, because TCP needs to be reliable, it cannot just start sending data. What if the other side is not online? The data would simply be lost. The purpose of the handshake is therefore to confirm that both sides are OK. So how do we tell that a given exchange is a handshake rather than normal data transfer? This is where the SYN packet comes in. SYN is a flag in the packet indicating that the purpose of this exchange is to shake hands.

Besides exchanging SYN packets, the three-way handshake does one more important thing: the two sides exchange their initial sequence numbers (seq). Because byte-stream-based TCP is ordered and every byte is numbered, once the two sides have confirmed each other’s initial sequence number, every subsequent byte is numbered by counting up from it. The initial sequence number is produced by an ISN function, which generates a more or less random number; note that it does not start at zero. The exchange succeeds when one end sends its initial sequence number and receives an ACK from the other end whose value is that sequence number plus 1.

Ok, now that we’ve got a few concepts and meanings we’re going to look at the three-way handshake.

At the beginning, the sending end C and the receiving end S have had no contact with each other and the door is closed, so both are in the CLOSED state.
The first time: C sends a SYN packet carrying its initial sequence number ISN(C) and enters the SYN-SENT state. Once S receives this packet, it knows that C has the capability of sending packets.
The second time: After receiving the SYN packet from C, S knows that C wants to perform a handshake, so it sends its own SYN packet carrying its sequence number ISN(S) together with an ACK of ISN(C)+1. ISN(C)+1 means: “I received your initial sequence number; your next packet should start from ISN(C)+1.” S is now in the SYN-RCVD state. Once C receives this packet, it knows that S has the capability of both receiving and sending packets.
The third time: After receiving the SYN and ACK from S, C needs to reply so that S knows that C also has the capability of receiving packets. C sends back ACK = ISN(S)+1, which means, as above: “I received your initial sequence number; your next packet should start from ISN(S)+1.” At this point the three-way handshake is complete and both parties are in the ESTABLISHED state.
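The sequence and acknowledgment numbers in the three steps above can be written out as a small Python sketch. It only models the numbers, not real packets or sockets; the `three_way_handshake` function, the dictionary layout, and the example ISN values 1000 and 5000 are assumptions made for illustration.

```python
from typing import Dict, List

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit values and wrap around


def three_way_handshake(isn_c: int, isn_s: int) -> List[Dict[str, object]]:
    """Return the three handshake segments exchanged between C and S."""
    # 1. C -> S: SYN carrying ISN(C)
    syn = {"from": "C", "flags": "SYN", "seq": isn_c}
    # 2. S -> C: SYN carrying ISN(S), plus ACK = ISN(C) + 1
    syn_ack = {"from": "S", "flags": "SYN+ACK", "seq": isn_s,
               "ack": (isn_c + 1) % SEQ_SPACE}
    # 3. C -> S: ACK = ISN(S) + 1; the SYN consumed one sequence number,
    #    so C's own sequence number is now ISN(C) + 1
    ack = {"from": "C", "flags": "ACK", "seq": (isn_c + 1) % SEQ_SPACE,
           "ack": (isn_s + 1) % SEQ_SPACE}
    return [syn, syn_ack, ack]


for seg in three_way_handshake(isn_c=1000, isn_s=5000):
    print(seg)
```

Each ACK value is simply the peer's initial sequence number plus 1, which is the "I received your initial sequence number" message in numeric form.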

Since TCP is a reliable transport-layer protocol, the purpose of the handshake is to confirm that both parties can receive and send packets. From the description above, three times is exactly enough. With fewer, one end’s ability cannot be confirmed: after only two handshakes, the receiving end S does not know whether C received its packet, that is, whether C can receive. More than three would work, but there is no need, because three is already enough to know the status of both parties.

You may have noticed that each side of the connection consumes one sequence number during the handshake. Can we avoid consuming one? No. Remember the rule: a segment that does not occupy a sequence number, such as a pure ACK, does not need to be acknowledged, while a segment that consumes a sequence number, such as a SYN, must be acknowledged by the peer. If it is not acknowledged, it is retransmitted until a specified number of retries is reached.

Four-way wave

After the three-way handshake, let’s look at the four-way wave. Each side passes through several states during the four waves; pay attention to them, because interviewers like to ask about them. As before, let’s see why four waves are needed and what each wave does.

For convenience, the party that actively closes the connection is called “A” and the party that passively closes is called “B”.

The first time: A decides to close the connection and stop playing, so it sends a FIN packet. The FIN packet is the counterpart of the SYN packet: it indicates that A wants to disconnect. Because a FIN must be acknowledged by the peer, it consumes a sequence number. A is now in the FIN_WAIT1 state.
The second time: B receives the FIN packet and thinks, “He doesn’t want to play anymore.” B responds with an ACK, which means, “I know.” B is now in the CLOSE_WAIT state, and A enters the FIN_WAIT2 state after receiving the ACK.
The third time: After sending the ACK, B may still have unsent data, which it must finish sending to A. When the data has been sent, nothing more ties A and B together, so B sends a FIN packet, meaning, “The data has been sent, you can disconnect.” B is now in the LAST_ACK state.
The fourth time: This FIN also needs to be acknowledged, so A replies with an ACK as soon as it receives the FIN and enters the TIME_WAIT state. B enters the CLOSED state when it receives the ACK, and A closes automatically after waiting 2MSL.
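The states that A and B pass through during the four waves can be laid out as a small transition table. The following Python sketch is a simplified illustration: the `TRANSITIONS` table, the event names, and the `run` helper are made up for this example and cover only the normal close path described above.

```python
from typing import Dict, List, Tuple

# (current state, event) -> next state, for the normal close path only
TRANSITIONS: Dict[Tuple[str, str], str] = {
    # Active closer A
    ("ESTABLISHED", "send FIN"): "FIN_WAIT1",
    ("FIN_WAIT1", "recv ACK"): "FIN_WAIT2",
    ("FIN_WAIT2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
    # Passive closer B
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(state: str, events: List[str]) -> List[str]:
    """Apply events in order and return every state visited, including the first."""
    visited = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        visited.append(state)
    return visited


a_states = run("ESTABLISHED", ["send FIN", "recv ACK", "recv FIN, send ACK", "2MSL timeout"])
b_states = run("ESTABLISHED", ["recv FIN, send ACK", "send FIN", "recv ACK"])
print(" -> ".join(a_states))
print(" -> ".join(b_states))
```

Reading the two printed paths side by side shows why only the active closer ever sits in TIME_WAIT.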

Can you wave three times?

Looking at the flow, four waves clearly work. So here’s the question: would three be enough? In some cases, yes. If the passive closer has no data left to send, it can send the ACK and FIN together. But if there is still data, what happens when we try to send ACK+DATA+FIN together? Processing the data takes time. If the ACK is held back until the data has been processed, the active closer may not receive an ACK in time and will retransmit its FIN packet.

Why does the active closer need to enter the TIME_WAIT state at the end? What does this state represent?

TIME_WAIT is the last state the active closer enters before the connection is closed. MSL (maximum segment lifetime) is the longest time a packet can survive on the network. Normally, a packet that has been in the network longer than the MSL without reaching the peer is discarded.

One MSL ensures that the last ACK has had time to reach the peer. What if the ACK has still not reached the peer after 1 MSL?
If the active closer’s ACK fails to arrive, the passive closer retransmits its FIN packet, and the second MSL ensures that this retransmitted FIN has time to reach the active closer.

Therefore, 2MSL = Maximum lifetime of ACK messages (MSL) + Maximum lifetime of FIN messages (MSL).

Why do FIN packets also need to consume a sequence number?

FIN packets consume a sequence number just as SYN packets do. Why? Simply saying “FIN packets must be acknowledged by the peer, and segments that must be acknowledged consume a sequence number” is a bit of a stretch. Let’s take a look at the figure.

Assume the sender’s sequence number is currently 100 and it sends 100 bytes of data (bytes 100 to 199). The corresponding ACK would then be 200 (199+1).

Because 100 bytes have already been sent, the seq of the FIN packet is 200.
If the FIN packet did not consume a sequence number, the ACK for the FIN would be 200 rather than 201, exactly the same as the ACK for the data, so the sender could not tell whether its FIN had been acknowledged.

Therefore, both SYN packets and FIN packets consume a sequence number to distinguish them from normal data.
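The arithmetic in this example can be checked with a tiny Python helper. The `expected_ack` function is a hypothetical name introduced here for illustration; it just encodes the rule that payload bytes, a SYN, and a FIN each advance the sequence number.

```python
def expected_ack(seq: int, payload_len: int, syn: bool = False, fin: bool = False) -> int:
    """ACK number the peer returns for a segment: every payload byte counts,
    and SYN and FIN each consume one extra sequence number."""
    return seq + payload_len + int(syn) + int(fin)


data_ack = expected_ack(seq=100, payload_len=100)          # 100 data bytes, 100..199
fin_ack = expected_ack(seq=200, payload_len=0, fin=True)   # FIN sent with seq 200
pure_ack = expected_ack(seq=200, payload_len=0)            # a pure ACK consumes nothing
print(data_ack, fin_ack, pure_ack)
```

Because the FIN moves the expected ACK from 200 to 201, the sender can tell "data received" apart from "FIN received".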

Reliability

As we saw above, data transmission in TCP is based on byte streams: the data is divided into segments and then sent out. The size of a segment is determined by several factors, such as the path MTU, the send window size, and the receive window size. So what are these factors? Let’s start with the path MTU. The data is ultimately sent through the link layer, and the amount of data a link-layer frame can carry is limited; this limit is called the MTU. You can check the MTU of your local network card with netstat -i.

netstat -i
en0   1500  <Link#6>    08:f8:bc:6f:6a:03 34427890     0 37802460 26255     0

Note that this is only the local MTU. On a real network, after leaving the network card your data may pass through a series of devices such as routers and switches, and each of them has its own MTU. Which MTU matters along this long path? The smallest one, and it is called the path MTU. It is like the bucket effect: the capacity of a bucket is determined by its shortest plank. When a packet is larger than the MTU, it is split into pieces that fit and then sent out. The IP layer noticed that the link layer has a size limit and said: “Since the link layer has a size limit, an oversized packet will be split anyway. I might as well do it myself: before handing data to link-layer brother, I’ll split it according to his requirements so as not to bother him.” The TCP layer was not happy to see the IP layer splitting data: “What are you doing? I sit above you. If my little brother the IP layer has to split my data, where does that leave my face?” So, to avoid fragmentation at the IP layer, the TCP layer proactively divides the data into small segments before handing them to the IP layer. The largest segment that TCP will produce is called the maximum segment size (MSS). What is the value of the MSS? It is calculated as follows:

MSS = MTU - IP header size - TCP header size

Without options, the IP header and the TCP header are each 20 bytes. If MTU = 1500, then MSS = 1500 - 20 - 20 = 1460. The TCP layer taking the initiative to divide the data has been praised by the IP layer and the link layer. IP layer: “Big brother is reliable.” Link layer: “Big brother’s big brother is reliable.”
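The MSS formula is easy to try out in Python. The helper names `mss` and `segments_needed` and the 4000-byte payload are assumptions for illustration, and the 20-byte defaults are the header sizes without options.

```python
def mss(mtu: int, ip_header: int = 20, tcp_header: int = 20) -> int:
    """MSS = MTU - IP header size - TCP header size (defaults assume no options)."""
    return mtu - ip_header - tcp_header


def segments_needed(payload_len: int, segment_size: int) -> int:
    """Number of segments needed to carry payload_len bytes (ceiling division)."""
    return -(-payload_len // segment_size)


size = mss(1500)
print(size, segments_needed(4000, size))
```

With larger headers (for example when options are present) the same function shows how the usable payload per segment shrinks.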

I can only eat so much: sliding window

In socket communication, the sender has a send buffer and the receiver has a receive buffer. The sender writes data into the socket’s send buffer, and when the buffer fills up or after a certain time, the buffered data is sent out through the network card segment by segment. When the data reaches the receiver, it is not handed to the application one piece at a time while everything else waits, which would be inefficient. Instead it is placed in the receive buffer, and the application keeps fetching data from that buffer. The buffer acts as a cushion, which makes sense, but what if the sender sends too fast, or the receiving application processes too slowly, and the receive buffer fills up? The receiver cannot simply keep accepting data; it has to tell the sender to stop sending for now. This brings us to the concept of the “sliding window” in TCP. We know that TCP sends data as a byte stream, which means every byte has an ordered sequence number. What can sequence numbers do for us? First, data can be reassembled by sequence number. Second, an ACK means that every byte before that sequence number has been received. The sliding window is closely related to the ACK.

The figure above shows the state of the bytes from the sender’s point of view. The bytes fall into four parts: sent and acknowledged, sent but not yet acknowledged, not yet sent but allowed to be sent, and not allowed to be sent. The “sent and acknowledged” part is in the past; it only causes the window to slide to the right. The size of the sliding window is really made up of the “sent but not yet acknowledged” and “not yet sent but allowed to be sent” parts. The remaining “not allowed to be sent” part exists because the receiver does not have enough space. Now let’s see what the sliding window looks like from the receiver’s side.

The window size is actually the same on both sides. The only difference is that, for the receiver, data is either received or not yet received, and data that cannot be received means there is not enough space. How does the sender know how much free space the receiver currently has? Every ACK from the receiver carries its current window size, so the sender learns the receiver’s window from the ACKs. As shown in the figure above, when the receiver has received bytes 32 to 35, it replies with ACK=36 and its window slides forward by 4 bytes. When the sender receives ACK=36, it knows the receiver has received everything before 36, so the sender’s window also slides forward by 4 bytes.
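The sender's bookkeeping can be sketched as a small Python class. This is a simplified model, not a real TCP implementation: the `SendWindow` class, its field names, and the window size of 8 bytes are assumptions for illustration; only the bytes 32 to 35 and ACK=36 come from the example above.

```python
class SendWindow:
    """Sender-side view of the sliding window (simplified, no retransmission)."""

    def __init__(self, first_seq: int, window: int) -> None:
        self.una = first_seq  # oldest byte sent but not yet acknowledged
        self.nxt = first_seq  # next byte to send
        self.wnd = window     # window size last advertised by the receiver

    def usable(self) -> int:
        """Bytes that may still be sent right now (not sent, but allowed)."""
        return max(0, self.una + self.wnd - self.nxt)

    def send(self, nbytes: int) -> int:
        """Send up to nbytes, limited by the usable window; return the amount sent."""
        allowed = min(nbytes, self.usable())
        self.nxt += allowed
        return allowed

    def on_ack(self, ack: int, window: int) -> None:
        """An ACK confirms every byte before `ack` and carries the receiver's window."""
        if self.una < ack <= self.nxt:
            self.una = ack  # the window slides to the right
        self.wnd = window


w = SendWindow(first_seq=32, window=8)  # assumed window of 8 bytes
w.send(4)                               # bytes 32..35 are now sent but unacknowledged
print("usable before ACK:", w.usable())
w.on_ack(ack=36, window=8)              # the receiver confirms everything before 36
print("usable after ACK:", w.usable())
```

If the receiver advertises a window of 0 in `on_ack`, `usable()` drops to 0 and `send` sends nothing, which is the zero-window situation.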

The sliding window is great for handling as much data as the receiver can take, but here’s the problem: what happens if the sender is super fast and the receiver is super slow? At some point the window shrinks to zero, and the receiver tells the sender: “Back off, there’s no room.” The sender gets the notification and thinks: “So it’s a weakling. I’ll take a break and wait for its next ACK to tell me when to continue.” Normally, once the receiver has processed some data, it tells the sender that it can continue sending. But accidents happen. Suppose the owner of the receiving host is listening to NetEase Cloud Music, playing 2K, and watching dance videos on Bilibili all at the same time. The network card is under heavy pressure, and the ACK announcing the reopened window is lost. Now the sender has no idea that the receiver has already processed its data. What can it do? If the ACK is lost for good, the sender cannot just wait forever; it has to take the initiative. So it starts a zero-window probe timer. When the receiver’s window is 0, the sender periodically sends a probe packet, forcing the peer to respond and thereby learning the current state of its receive window. The zero-window probe makes the whole mechanism robust.

Take it slowly: congestion control

As described above, the sliding window controls how much data is sent according to what the receiver can handle. Note that this only concerns the receiver. If the network itself is in poor condition and the sender pushes out a lot of data at once, then even if the receive window is not full (that is, the problem has nothing to do with the window size), what happens? Most likely the sender retransmits like crazy, because the network is poor and no ACKs come back. Looking at it the other way around: the network is already in poor condition and we pour in even more data and retries, which only makes things worse. So the sender needs to go easy, and that is what the “congestion window (CWND)” is for. The congestion window is the maximum number of MSS-sized segments that can be sent before an ACK is received from the peer. The actual send window is the smaller of the congestion window and the receive window. With MTU = 1500 the MSS is 1460, so the congestion window says how many 1460-byte segments may be in flight. When a connection has just been established, the sender knows nothing about the network. If the network is in poor condition, sending a lot of data right away is unwise, so slow start is the right choice: it lets the sender stop in time. Slow start does not mean staying slow forever. If the network is fine, the window grows over time. At the beginning of the connection, CWND grows by one segment for every ACK received, which means it doubles every round trip: for example 10 at first, then 20 after one round of ACKs, then 40, and so on. It’s smart: we only have to tolerate a slow beginning, and CWND catches up exponentially. Sit back, have some tea, and all is well. When CWND reaches SSTHRESH (the slow start threshold), CWND is no longer small and doubling it again could be dangerous. From then on it grows a little at a time instead of doubling: 1 MSS is added to CWND per round trip.

When CWND < SSTHRESH, the congestion window grows exponentially (slow start)
When CWND >= SSTHRESH, the congestion window grows linearly (congestion avoidance)

Even adding 1 MSS at a time, the window could grow without bound given enough time, so why isn’t that a problem in practice? One reason is that the real send window is the smaller of the two windows, and the receive window cannot be infinitely large. The other is that sooner or later a timeout occurs. When it does, SSTHRESH is set to CWND / 2 and CWND is reset to 1 segment; slow start and then congestion avoidance begin all over again. For example: “Assume the initial SSTHRESH of a TCP connection is 8. A timeout occurs when the congestion window has risen to 12, and TCP applies slow start and congestion avoidance. Find the size of the congestion window for each transmission from the 1st to the 15th.”

CWND starts at 1 and doubles until it reaches SSTHRESH (8): 1, 2, 4, 8. From there it grows by 1 MSS each time: 9, 10, 11, 12. At 12 a timeout occurs, so SSTHRESH becomes 6 and CWND starts again from 1 and doubles: 1, 2, 4. The next doubling would exceed SSTHRESH = 6, so CWND becomes 6 and then grows by 1 MSS each time: 7, 8, 9. The window sizes for the 1st and the 15th transmissions are therefore 1 and 9, respectively.
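The worked example (initial SSTHRESH of 8, timeout when the window reaches 12, 15 transmissions) can be replayed with a short Python simulation. This is a textbook-style sketch under stated assumptions, not what a real TCP stack does: the `simulate_cwnd` function is made up for this example, the window is counted in segments, a single timeout is modelled, and doubling is capped at SSTHRESH, as in the example.

```python
from typing import List


def simulate_cwnd(initial_ssthresh: int, timeout_at: int, rounds: int) -> List[int]:
    """Return the congestion window (in segments) used for each transmission round."""
    cwnd, ssthresh = 1, initial_ssthresh
    timed_out = False
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        if cwnd == timeout_at and not timed_out:
            # Timeout: halve the threshold and restart slow start from 1 segment
            ssthresh = cwnd // 2
            cwnd = 1
            timed_out = True
        elif cwnd < ssthresh:
            # Slow start: double per round, but do not overshoot ssthresh
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: grow by one segment per round
            cwnd += 1
    return history


history = simulate_cwnd(initial_ssthresh=8, timeout_at=12, rounds=15)
print(history)
```

The first and fifteenth entries of the printed list are the two values asked for in the exercise.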

These algorithms are clever, but how does the sender know that a packet has timed out in the first place? It is actually easy to judge: if a certain time passes and the sender has still not received an ACK, the packet has probably timed out on the network. Normally the sender then retransmits using a backoff strategy, with each retransmission interval on the order of a few hundred milliseconds. A few hundred milliseconds is fast for a human but slow for a computer. Is there a faster way? Let’s start with an example. Suppose you want to send four packets ([1,100], [101,200], [201,300], [301,400]). After the first packet is sent, the reply is ACK=101. The second packet, however, is lost on the network. When the sender goes on to send the third and fourth packets, the receiver does not reply with ACK=301 and ACK=401, but with ACK=101 each time. Remember that an ACK means all data before that sequence number has been received. As described above, the sender would normally wait a few hundred milliseconds before noticing the loss and resending. To react faster, the sender can instead treat three duplicate ACKs as a sign of packet loss and retransmit immediately. This is “fast retransmit”. But an ACK that only says “everything before 101 has been received” (the first packet) carries little information. With selective acknowledgment (SACK), the receiver additionally reports which later blocks it has received. For example, if the third packet were also missing and only the fourth had arrived, the receiver would report [301,400] as received, so the sender would know to retransmit exactly the second and third packets.
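The duplicate-ACK behaviour in this example can be sketched in Python. Both helpers, `cumulative_acks` and `fast_retransmit_seq`, are hypothetical names for illustration, and the fifth packet [401,500] is an assumption added so that three duplicate ACKs actually occur; SACK blocks are not modelled.

```python
from typing import Dict, List, Optional, Tuple


def cumulative_acks(arrivals: List[Tuple[int, int]], first_seq: int = 1) -> List[int]:
    """ACK number the receiver sends after each arriving (start, end) packet.
    The ACK is always the first byte that has not yet arrived in order."""
    buffered: Dict[int, int] = {}  # start -> end of packets waiting for a gap to fill
    next_expected = first_seq
    acks = []
    for start, end in arrivals:
        buffered[start] = end
        while next_expected in buffered:
            next_expected = buffered.pop(next_expected) + 1
        acks.append(next_expected)
    return acks


def fast_retransmit_seq(acks: List[int], threshold: int = 3) -> Optional[int]:
    """Return the sequence number to retransmit once the same ACK has been
    repeated `threshold` times after its first appearance, else None."""
    last, duplicates = None, 0
    for ack in acks:
        if ack == last:
            duplicates += 1
            if duplicates == threshold:
                return ack
        else:
            last, duplicates = ack, 0
    return None


# Packet [101,200] is lost; [401,500] is an extra packet assumed for this sketch.
arrivals = [(1, 100), (201, 300), (301, 400), (401, 500)]
acks = cumulative_acks(arrivals)
print(acks, fast_retransmit_seq(acks))
```

With only the four packets from the example, the sender sees two duplicates, so the threshold of three is not yet reached and the retransmission timer would still be needed.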

Finally

Search WeChat for [pretend to understand programming] and let’s be decadent together.

