Do you know anything about TCP buffers? What do they have to do with sticky packets and unpacking (packet splitting) in TCP transmission? At what stage of TCP do sticky packets and unpacking occur?
A quick review of TCP concepts: TCP is a connection-oriented, reliable, full-duplex, point-to-point byte-stream transport protocol. TCP communication must first establish a connection, for which the kernel allocates the necessary resources. After the data exchange, both parties must close the connection to release those system resources. Long-lived connections can reuse the same channel without disconnecting. So what is a TCP buffer?
An operating system has two spaces: user space and kernel space. Each socket connection lives in kernel space, and the kernel keeps a send buffer and a receive buffer for each socket. TCP’s full-duplex mode and its flow control both depend on how full these two buffers are.
When we call “getOutputStream()” on a socket to obtain an OutputStream and write bytes to it, the bytes are actually written into the send buffer. At that point the data has not necessarily been sent to the other machine. The “write()” method simply copies user-space data into the kernel send buffer, and TCP decides when to send it.
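The per-socket buffers can be inspected from Java. The following is a minimal sketch, assuming an unconnected “java.net.Socket” is enough to query the defaults; the class name “SocketBufferSizes” and the method “defaultBufferSizes” are made up for this illustration, and the reported values depend on the operating system.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;

class SocketBufferSizes {
    // Returns {sendBufferSize, receiveBufferSize} in bytes for a fresh, unconnected socket.
    // The numbers are OS defaults (SO_SNDBUF / SO_RCVBUF) and differ between systems.
    static int[] defaultBufferSizes() {
        try (Socket socket = new Socket()) {
            return new int[] { socket.getSendBufferSize(), socket.getReceiveBufferSize() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = defaultBufferSizes();
        System.out.println("send buffer (bytes):    " + sizes[0]);
        System.out.println("receive buffer (bytes): " + sizes[1]);
    }
}
```

A “write()” call returns once the bytes fit into the send buffer, which is why a successful write says nothing about whether the peer has received the data yet.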
TCP sends data from the send buffer through the network card to the kernel receive buffer on the target machine. If the application never calls “recv()” to read it, the data keeps piling up in the socket’s receive buffer.
The TCP sticky packet and unpacking problem
If you understand the flow described above, the sticky packet and unpacking problems will be easier to understand. Let me ask a question first: do sticky packets and unpacking occur during transmission?
At what stage do the sticky packet and unpacking problems occur? First we need to be clear that TCP delivers data reliably, so these problems are definitely not caused by the transmission itself. Data travels from the send buffer to the NIC and then into the receiver’s buffer: sticky packets appear when data is read from the receive buffer, and unpacking happens on the way from the send buffer to the NIC.
Sticky packets: two or more messages sent by the sender reach the receiving application merged together, with no boundary between them.
Suppose the sender needs to send two pieces of data: “Relax, you’ll be fine!” and “Read the article, you can learn.” The two pieces of data are first put into the send buffer, and are then sent through the network card into the receiver’s receive buffer. If the receiver does not take the data out of the receive buffer in time, the data accumulates there, so the two pieces pile up together and look like a single piece of data. This is the sticky packet problem!
So what is the unpacking problem? The length of each packet TCP sends is limited. If a message is too large, TCP splits it into two packets for sending.
Suppose the data to be sent, “Relax, you’ll be fine!”, is large enough that TCP splits it into “Relax, you’ll” and “ be fine!” when sending. The receiver then receives two packets, which is the unpacking problem.
In fact, if the message is large enough, it may be broken into three or more packets. However, no matter how many packets it is split into, TCP ensures that the bytes are delivered in order and intact.
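Both effects come from TCP being a byte stream without message boundaries. The sketch below imitates that with in-memory streams instead of a real socket, which is an assumption made to keep the result reproducible; “StreamBoundaryDemo”, “sendAll” and “readInChunks” are illustrative names, and the short messages “Hello” and “World” stand in for the sentences above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class StreamBoundaryDemo {
    // Appends every message to one byte stream, a stand-in for the receive buffer.
    // Nothing in the resulting bytes records where one message ended.
    static byte[] sendAll(String... messages) {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        for (String message : messages) {
            byte[] bytes = message.getBytes(StandardCharsets.US_ASCII);
            wire.write(bytes, 0, bytes.length);
        }
        return wire.toByteArray();
    }

    // Reads the stream with a buffer of chunkSize bytes, like repeated read() calls.
    static List<String> readInChunks(byte[] wire, int chunkSize) {
        if (chunkSize <= 0) throw new IllegalArgumentException("chunkSize must be positive");
        List<String> chunks = new ArrayList<>();
        ByteArrayInputStream in = new ByteArrayInputStream(wire);
        byte[] buf = new byte[chunkSize];
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            chunks.add(new String(buf, 0, n, StandardCharsets.US_ASCII));
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[] wire = sendAll("Hello", "World");
        System.out.println(readInChunks(wire, 64)); // a large read returns both messages glued together
        System.out.println(readInChunks(wire, 3));  // small reads cut across the message boundary
    }
}
```

With a real socket the chunk sizes are decided by timing and buffer state rather than by the caller, but the receiver faces the same situation: what one read returns is unrelated to what one write sent.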
So what causes sticky packets and unpacking? It comes down to the TCP buffers and the sliding window, the MSS/MTU limits, and Nagle’s algorithm.
Given the sticky packet and unpacking problems, how can we avoid or handle them in real development? The answer is to define an application-level protocol for our communication. If messages are stuck together, the protocol lets us tell them apart. If a message has been split, the receiver can wait until a complete message has been assembled and only then process it.
The first method is a fixed-length protocol: every message has a fixed length, and each side cuts the stream into pieces of that length as agreed. Suppose we need to send the words “hello” and “very” with an agreed length of five bytes. A word shorter than 5 bytes is padded with 0, so “very” is sent as four letters followed by one 0 byte.
Because anything shorter than the agreed length has to be padded with 0, a fixed-length protocol wastes bandwidth.
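The fixed-length rule can be sketched in a few lines. This is a minimal illustration, assuming ASCII text, the five-byte frame from the example, and zero bytes as padding; “FixedLengthFraming”, “encode”, “encodeAll” and “decode” are names invented here, not an existing API.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FixedLengthFraming {
    static final int FRAME_SIZE = 5; // the agreed length from the example above

    // Pads one message with zero bytes up to FRAME_SIZE.
    static byte[] encode(String message) {
        byte[] body = message.getBytes(StandardCharsets.US_ASCII);
        if (body.length > FRAME_SIZE) {
            throw new IllegalArgumentException("message longer than " + FRAME_SIZE + " bytes");
        }
        return Arrays.copyOf(body, FRAME_SIZE); // copyOf fills the tail with 0
    }

    // Encodes several messages back to back, as they would sit in the byte stream.
    static byte[] encodeAll(String... messages) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String message : messages) {
            out.write(encode(message), 0, FRAME_SIZE);
        }
        return out.toByteArray();
    }

    // Cuts the stream every FRAME_SIZE bytes and strips the zero padding.
    // A trailing partial frame is ignored here; a real receiver would keep it until more bytes arrive.
    static List<String> decode(byte[] stream) {
        List<String> messages = new ArrayList<>();
        for (int off = 0; off + FRAME_SIZE <= stream.length; off += FRAME_SIZE) {
            int len = FRAME_SIZE;
            while (len > 0 && stream[off + len - 1] == 0) {
                len--;
            }
            messages.add(new String(stream, off, len, StandardCharsets.US_ASCII));
        }
        return messages;
    }

    public static void main(String[] args) {
        System.out.println(decode(encodeAll("hello", "very")));
    }
}
```

Note that stripping trailing zeros makes a message that genuinely ends in a 0 byte indistinguishable from padding, which is another cost of this scheme besides the wasted bandwidth.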
The second method is a special-character delimiter: a special delimiter character is appended to the end of each message, and the receiver treats that delimiter as the sign that the message is complete. For example, a message ends whenever “\n” is encountered.
Messages can be separated this way, but only on the condition that the message content itself never contains the delimiter.
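A delimiter-based receiver has to keep leftover bytes between reads, because a chunk may end in the middle of a message. The sketch below assumes “\n” as the delimiter and works on strings rather than raw bytes for brevity; “DelimiterFraming”, “encode” and “feed” are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

class DelimiterFraming {
    static final String DELIMITER = "\n";

    // Holds data that has arrived but does not yet end with a delimiter.
    private final StringBuilder pending = new StringBuilder();

    // Appends the delimiter; rejects content that would break the framing.
    static String encode(String message) {
        if (message.contains(DELIMITER)) {
            throw new IllegalArgumentException("message must not contain the delimiter");
        }
        return message + DELIMITER;
    }

    // Accepts whatever chunk arrived and returns every message completed by it.
    List<String> feed(String chunk) {
        pending.append(chunk);
        List<String> complete = new ArrayList<>();
        int idx;
        while ((idx = pending.indexOf(DELIMITER)) >= 0) {
            complete.add(pending.substring(0, idx));
            pending.delete(0, idx + DELIMITER.length());
        }
        return complete;
    }

    public static void main(String[] args) {
        DelimiterFraming decoder = new DelimiterFraming();
        System.out.println(decoder.feed("hel"));       // no delimiter yet, nothing is complete
        System.out.println(decoder.feed("lo\nvery\n")); // completes the first message and a second one
    }
}
```

The same decoder handles both cases from the earlier sections: a chunk holding several messages yields several results, and a message split across chunks is returned only once its delimiter arrives.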
The third method is a fixed-length header: before sending, you obtain the size in bytes of the content to be sent, and then put a fixed-size integer header in front of the content that states the byte length of the message body.
This approach avoids the special-character problem and is a practical choice for production, as I described in a previous article.
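A length-prefixed format can be sketched with “java.nio.ByteBuffer”. The 4-byte big-endian header, the UTF-8 body and the names “LengthPrefixFraming”, “encode” and “decode” are assumptions made for this illustration; the article does not fix a header size.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class LengthPrefixFraming {
    static final int HEADER_SIZE = 4; // assumed: one 4-byte int holding the body length

    // Produces header (body length) followed by the body bytes.
    static byte[] encode(String message) {
        byte[] body = message.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(HEADER_SIZE + body.length)
                .putInt(body.length)
                .put(body)
                .array();
    }

    // Reads every complete message from the buffer and leaves incomplete data unread,
    // so the caller can append more bytes and try again.
    static List<String> decode(ByteBuffer in) {
        List<String> messages = new ArrayList<>();
        while (in.remaining() >= HEADER_SIZE) {
            in.mark();
            int length = in.getInt();
            if (length < 0) {
                throw new IllegalStateException("negative length in header: " + length);
            }
            if (in.remaining() < length) {
                in.reset(); // body not fully received yet: rewind to the header
                break;
            }
            byte[] body = new byte[length];
            in.get(body);
            messages.add(new String(body, StandardCharsets.UTF_8));
        }
        return messages;
    }

    public static void main(String[] args) {
        byte[] first = encode("hello");
        byte[] second = encode("very");
        ByteBuffer stream = ByteBuffer.allocate(first.length + second.length);
        stream.put(first).put(second).flip();
        System.out.println(decode(stream));
    }
}
```

Because the header says exactly how many bytes belong to the body, the body may contain any byte values, including the characters a delimiter scheme would have to forbid. A production decoder would usually also cap the accepted length to guard against corrupt or hostile headers.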
In fact, Java programmers do not need to worry too much about the send and receive buffers, because the lower layers are already wrapped for us. It is enough to understand the concept, along with how and why sticky packets and unpacking happen.
Looking at the data exchange between user space and kernel space, you may notice that one complete interaction requires four copies of the data, which can hurt performance. This is also why interviewers often ask about “zero copy”. Try to build your own understanding of zero copy on top of this article; it lays a solid foundation for studying Netty later.