The network layer

The layers of the network are roughly as shown in the figure. On the sending side, each layer adds its own header to the packet; on the receiving side, each layer strips the corresponding header again.

  1. Assume that the application layer transmits data over HTTP
  2. The data passes through the transport layer, which determines the ports and adds the TCP header
  3. It then passes through the IP layer, which determines the IP address of the peer and adds the IP header
  4. Finally it passes through the network interface layer, which adds an Ethernet header carrying the MAC addresses
  5. On the receiving side, the receiver is located by its MAC address and the Ethernet header is removed
  6. The IP packet is parsed, the host with the matching IP address is found, and the IP header is removed
  7. The TCP segment is parsed, the service listening on the destination port is found (for example, suppose QQ listens on port 80 and WeChat on port 81), and the TCP header is removed
  8. Once the service is found, it parses the data according to the HTTP protocol and retrieves the original data, which completes the transfer
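
The eight steps above can be sketched as a toy program. This is only an illustration: the bracketed header strings and the function names `encapsulate` and `decapsulate` are made up for this example and are not real wire formats.

```python
# Toy encapsulation: each layer prepends a header on send and strips it on receive.
# The header strings are illustrative placeholders, not real wire formats.
LAYERS = ["TCP", "IP", "ETH"]  # order in which headers are added on the sender


def encapsulate(payload: bytes) -> bytes:
    """Sender side: wrap the application data in one header per layer."""
    frame = payload
    for name in LAYERS:
        frame = f"[{name}]".encode() + frame
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Receiver side: strip the headers in the reverse order they were added."""
    for name in reversed(LAYERS):
        header = f"[{name}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"expected {name} header")
        frame = frame[len(header):]
    return frame


http_request = b"GET /user HTTP/1.1\r\nHost: www.juejin.cn\r\n\r\n"
frame = encapsulate(http_request)
print(frame[:20])          # outermost header comes first
print(decapsulate(frame))  # the original HTTP data
```

The outermost header (Ethernet) is the last one added and the first one removed, which is why the receiver works through the layers in the opposite order.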

Tip: an HTTP request consists mainly of a request line, request headers, and a request body

    GET /user HTTP/1.1                  (request line: an HTTP/1.1 GET request for the /user interface)
    Host: www.juejin.cn                 (request header: domain name)

    name=Pretend to know programming    (request body: parameter)

The advantage of layering is that each layer hides its details behind an abstraction, so a complex network request can be broken down and handled one layer at a time.

TCP protocol

Definition: Transmission Control Protocol (TCP) is a connection-oriented, reliable, byte-stream-based transport-layer communication protocol. TCP is often compared with UDP, which is unreliable because the sender sends data without waiting for an acknowledgement from the other end. So why is TCP reliable?

The TCP header

Let’s look at the TCP header. Apart from the options and padding on the last line, whose length varies, every row is 4 bytes wide, so a TCP header occupies at least 5 * 4 = 20 bytes.

  • Source port: the port of the sender, for example 12345
  • Destination port: the port of the receiver, for example 80
  • Sequence number: TCP is byte-stream based, so each TCP segment carries the sequence number of its first data byte
  • Acknowledgment number: the next byte the receiver expects. Assume the sender sends a segment with sequence number 200 and length 300; after receiving it, the receiver replies with acknowledgment number 500 (200 + 300)
  • Data offset: the length of the TCP header (20 bytes plus the variable-length options), expressed in 4-byte words
  • Reserved: for future use
  • URG: the segment contains urgent data, which should be delivered as soon as possible rather than in the original order
  • ACK: the acknowledgment number is valid only when ACK=1. After a connection is established, ACK is always 1
  • PSH: when PSH=1, the receiver should hand the data to the application immediately instead of waiting for its buffer to fill
  • RST: resets the connection. A serious error has occurred and the connection must be re-established
  • SYN: used to synchronize sequence numbers during the three-way handshake
  • FIN: used to close the connection during the four-way wave
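
To make the field layout concrete, here is a minimal sketch that unpacks the fixed 20-byte part of a TCP header with Python's `struct` module. The function name `parse_tcp_header` and the hand-built sample segment are assumptions for illustration; options are ignored.

```python
import struct


def parse_tcp_header(segment: bytes) -> dict:
    """Parse the fixed 20-byte part of a TCP header (options are ignored)."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes")
    src, dst, seq, ack, off_flags, window, checksum, urgent = struct.unpack(
        "!HHIIHHHH", segment[:20]
    )
    flags = off_flags & 0x3F  # low 6 bits: URG ACK PSH RST SYN FIN
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": (off_flags >> 12) * 4,  # data offset counts 4-byte words
        "URG": bool(flags & 0x20),
        "ACK": bool(flags & 0x10),
        "PSH": bool(flags & 0x08),
        "RST": bool(flags & 0x04),
        "SYN": bool(flags & 0x02),
        "FIN": bool(flags & 0x01),
        "window": window,
    }


# A hand-built SYN segment: port 12345 -> 80, sequence 1000, data offset 5, SYN set.
syn = struct.pack("!HHIIHHHH", 12345, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
header = parse_tcp_header(syn)
print(header)
```

A data offset of 5 words gives the 20-byte minimum header; any larger value means options are present.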

Connection-oriented

Put simply, the two ends must shake hands to confirm the relationship before they communicate. The three-way handshake:

  1. Initially, both the sender and the receiver are in the CLOSED state (a notional state); the receiver then starts listening and enters LISTEN
  2. First handshake: the sender sends a SYN packet carrying its initial sequence number ISN(C) to request a connection, and enters the SYN-SENT state
  3. When the receiver gets the SYN packet, its state changes from LISTEN to SYN-RCVD
  4. Second handshake: the receiver sends back its own initial sequence number ISN(S) together with the acknowledgment number ISN(C)+1, that is, a SYN+ACK packet
  5. The sender receives the SYN+ACK, confirms that the acknowledgment number is its own initial sequence number plus 1, and enters the ESTABLISHED state
  6. Third handshake: the sender sends an ACK whose sequence number is ISN(C)+1 (the number its next data will start from) and whose acknowledgment number is ISN(S)+1
  7. The receiver receives the ACK, confirms that it is correct, and enters the ESTABLISHED state

A few notes:

  • The initial sequence number (ISN) is not 0. It is a pseudo-random value produced by an internal function, which makes it harder to forge packets for the connection
  • A SYN packet carries no data but still consumes a sequence number. After the handshake, each side's sequence number is its ISN + 1: this shows that the SYN consumed one sequence number, and it is also the number that side's data will start from (for example, if the ISN is 1, the first data byte sent after the connection is established is numbered 2)
  • A pure ACK does not consume a sequence number; otherwise every ACK would itself need to be acknowledged, and the acknowledgments would never end
  • If a packet consumes a sequence number, the peer must acknowledge it with an ACK; otherwise the packet is retransmitted
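
The sequence and acknowledgment numbers exchanged in the handshake can be traced with a few lines of code. This is a sketch only: the function name `three_way_handshake` is made up, and `random.randrange` merely stands in for however the operating system really picks an ISN.

```python
import random

SEQ_MOD = 2**32  # sequence numbers are 32 bits wide and wrap around


def three_way_handshake(isn_c: int, isn_s: int) -> list:
    """Return the (flags, seq, ack) triples exchanged during the handshake."""
    syn = ("SYN", isn_c, None)
    # The SYN consumed one sequence number, so the server acknowledges ISN(C)+1.
    syn_ack = ("SYN+ACK", isn_s, (isn_c + 1) % SEQ_MOD)
    # The client's next sequence number is ISN(C)+1; it acknowledges ISN(S)+1.
    ack = ("ACK", (isn_c + 1) % SEQ_MOD, (isn_s + 1) % SEQ_MOD)
    return [syn, syn_ack, ack]


# ISNs are not 0; random values stand in for the kernel's generator here.
segments = three_way_handshake(random.randrange(SEQ_MOD), random.randrange(SEQ_MOD))
for flags, seq, ack in segments:
    print(flags, "seq =", seq, "ack =", ack)
```

The final ACK carries no data and consumes no sequence number, so the client's first data byte also uses ISN(C)+1.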

Four waves:

Assume that the party that actively initiates the close is A and the passive party is B

  1. A sends a FIN packet and enters the FIN_WAIT1 state
  2. After receiving the FIN packet, B immediately replies with an ACK and enters CLOSE_WAIT
  3. After receiving the ACK from B, A enters FIN_WAIT2
  4. B finishes processing and sends any data it still has to send
  5. B sends its own FIN packet to confirm the close and enters LAST_ACK
  6. After receiving the FIN packet, A replies with an ACK and enters TIME_WAIT for 2MSL; B enters CLOSED once it receives that ACK
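
The state changes on both sides can be written down as a small transition table. This is a simplified sketch covering only the ordinary close described above (simultaneous close is left out), and the event labels and the `walk` helper are invented for this example.

```python
# (state, event) -> next state, covering only the ordinary teardown.
CLOSE_TRANSITIONS = {
    # Active closer (A)
    ("ESTABLISHED", "send FIN"): "FIN_WAIT1",
    ("FIN_WAIT1", "recv ACK"): "FIN_WAIT2",
    ("FIN_WAIT2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2MSL elapsed"): "CLOSED",
    # Passive closer (B)
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def walk(events, state="ESTABLISHED"):
    """Apply the events in order and return every state visited."""
    visited = [state]
    for event in events:
        state = CLOSE_TRANSITIONS[(state, event)]
        visited.append(state)
    return visited


side_a = walk(["send FIN", "recv ACK", "recv FIN, send ACK", "2MSL elapsed"])
side_b = walk(["recv FIN, send ACK", "send FIN", "recv ACK"])
print("A:", " -> ".join(side_a))
print("B:", " -> ".join(side_b))
```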

Byte stream based

Why is TCP said to be byte-stream based? Suppose you write 1000 bytes of data into a socket. Depending on the size of the send window, the receive window, or the path MTU, those 1000 bytes may be split in several ways:

They might go out as 500 + 500, or as 100 + 900. Either way, the data is transmitted in segments, and each segment needs a sequence number. TCP assigns a monotonically increasing sequence number to every byte, so the receiver can string the segments back together by sequence number.

Reliable

One reason a TCP connection is reliable is that, although segments are not guaranteed to arrive in the order they were sent, the receiver can reassemble them by sequence number and so restore the original order.
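
Reassembly by sequence number can be sketched as follows. The `reassemble` function is a hypothetical helper for illustration; a real receiver does this incrementally as segments arrive, while this version sorts a whole batch at once.

```python
def reassemble(segments, first_seq):
    """Rebuild the byte stream from (seq, data) segments that arrived in any order.

    Returns the contiguous data and the next sequence number still expected.
    """
    stream = bytearray()
    expected = first_seq
    for seq, data in sorted(segments):
        if seq + len(data) <= expected:
            continue  # duplicate of bytes that were already delivered
        if seq > expected:
            break  # gap: later bytes must wait for the missing segment
        stream += data[expected - seq:]  # skip any overlapping prefix
        expected = seq + len(data)
    return bytes(stream), expected


# 1000 bytes starting at sequence number 1, split 500 + 500, arriving out of order.
arrived = [(501, b"B" * 500), (1, b"A" * 500)]
data, next_seq = reassemble(arrived, first_seq=1)
print(len(data), next_seq)
```

If a segment in the middle is missing, the function stops at the gap, which mirrors how the application never sees bytes beyond a hole in the stream.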

The sliding window

First, one concept needs to be understood. When you send a block of data, a small block is not necessarily sent out immediately, because there is a send buffer: only when the send buffer has accumulated a certain amount does the operating system actually send the data out through the network card. Similarly, the reading side also has a buffer, and the upper-layer application reads data from this receive buffer each time. If the upper-layer program does not take data away in time, the receive buffer backs up; when it has no space left, the receiver has to tell the sender to stop sending. From the sender's point of view, the data falls into four categories:

  1. Data that has been sent and acknowledged by the peer
  2. Data that has been sent but not yet acknowledged by the peer
  3. Data that has not yet been sent but may still be sent
  4. Data that has not yet been sent and may not be sent, because the peer's window is exhausted (a zero window)

From the receiver's point of view:

  1. Data that has been received and acknowledged
  2. Data that has not been received yet but can be received
  3. Data that cannot be received

Assuming bytes 32-35 have been acknowledged, the sliding window moves forward past them.
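
The sender's side of the window can be modelled with three numbers. This is a minimal sketch, assuming a window measured in bytes; the class `SendWindow` and its method names are invented for this example.

```python
class SendWindow:
    """Sender-side view of the byte stream (illustrative model only)."""

    def __init__(self, base, window):
        self.acked = base      # everything below this is sent and acknowledged
        self.next_seq = base   # next byte to be sent
        self.window = window   # receiver's advertised window, in bytes

    def can_send(self):
        """Bytes that are not yet sent but may still be sent."""
        return max(0, self.acked + self.window - self.next_seq)

    def send(self, n):
        """Send up to n bytes, limited by the window; return the amount sent."""
        n = min(n, self.can_send())
        self.next_seq += n
        return n

    def on_ack(self, ack, window):
        """A cumulative ACK slides the left edge; the peer also reports its window."""
        if self.acked < ack <= self.next_seq:
            self.acked = ack
        self.window = window


w = SendWindow(base=32, window=8)
sent = w.send(20)        # only 8 bytes fit into the window
w.on_ack(36, window=8)   # bytes 32-35 acknowledged: the window slides by 4
print(sent, w.can_send())
w.on_ack(40, window=0)   # zero window: nothing more may be sent
print(w.can_send())
```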

Congestion control

First, some concepts.

RTT: the time from sending a packet until its acknowledgment comes back.

MSS: the TCP maximum segment size. The MTU (the largest data block the link layer allows) is typically 1500 bytes. If a packet exceeds the MTU, it has to be split, and the IP layer does the splitting. Since the data comes from TCP anyway, it is better to split it directly at the TCP layer: MSS = 1500 (MTU) - 20 (IP header size) - 20 (TCP header size) = 1460.

We know that the receiver uses a sliding window to limit the data sent, so does the sender need no limit of its own? It does. If the network is currently in poor condition and the sender keeps sending blindly, many packets will never reach the peer, causing unnecessary retransmissions.

CWND: the congestion window, the amount of data the sender may send without having received an ACK. The amount that can actually be in flight is the smaller of the receive window and the congestion window. CWND=10 means that 10 MSS of data can be sent.
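
The arithmetic above fits in two small functions. This is a sketch under the stated assumptions (20-byte IP and TCP headers without options, CWND counted in MSS units); the names `mss_for` and `sendable_bytes` are made up for this example.

```python
IP_HEADER = 20   # bytes, assuming no IP options
TCP_HEADER = 20  # bytes, assuming no TCP options


def mss_for(mtu: int) -> int:
    """Largest TCP payload that fits into one link-layer frame."""
    return mtu - IP_HEADER - TCP_HEADER


def sendable_bytes(cwnd_segments: int, rwnd_bytes: int, mss: int, in_flight: int = 0) -> int:
    """Bytes the sender may still put on the wire without a new ACK.

    The limit is the smaller of the congestion window and the receive window,
    minus whatever is already in flight.
    """
    limit = min(cwnd_segments * mss, rwnd_bytes)
    return max(0, limit - in_flight)


mss = mss_for(1500)
print(mss)                                 # 1460
print(sendable_bytes(10, 65535, mss))      # limited by CWND: 14600
print(sendable_bytes(10, 8000, mss))       # limited by the receive window: 8000
```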

  1. Slow start

Because the state of the network is unpredictable, the sender should not transmit a large amount of data right away; but for the sake of efficiency, it should not keep transmitting small amounts for long either. The best approach is to start slowly and double CWND every RTT:

1RTT: CWND = 10 * 2 = 20
2RTT: CWND = 20 * 2 = 40
…

Of course, it cannot keep doubling indefinitely.

  2. Congestion avoidance

ssthresh is the slow start threshold. Once CWND reaches ssthresh during slow start, it stops doubling; instead, CWND grows by 1 MSS per RTT. Assuming ssthresh = 40:

1RTT: CWND = 41
2RTT: CWND = 42

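  3. Fast recovery

When packet loss is detected, ssthresh is set to half of the current CWND (ssthresh = CWND / 2). The new CWND then depends on how the loss was detected:

  • Retransmission timeout: CWND = 1, and slow start begins again
  • Three duplicate ACKs received (fast retransmission): CWND = ssthresh + 3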
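
The three phases can be tied together in a small simulation. This is a simplified sketch, not a real TCP stack: CWND is counted in whole MSS units, growth in slow start is capped at ssthresh, and loss handling follows the usual Reno-style rules, where a timeout resets CWND to 1 and three duplicate ACKs set it to ssthresh + 3. The function names are invented for this example.

```python
def next_cwnd(cwnd: int, ssthresh: int) -> int:
    """CWND after one loss-free RTT, in units of MSS."""
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh)  # slow start: double, capped at ssthresh
    return cwnd + 1                     # congestion avoidance: +1 MSS per RTT


def on_loss(cwnd: int, kind: str):
    """Return (cwnd, ssthresh) after a loss; kind is 'timeout' or 'dup_acks'."""
    ssthresh = max(cwnd // 2, 2)        # halve, but never below 2 MSS
    if kind == "timeout":
        return 1, ssthresh              # start over with slow start
    return ssthresh + 3, ssthresh       # fast recovery after 3 duplicate ACKs


cwnd, ssthresh = 10, 40
trace = [cwnd]
for _ in range(4):
    cwnd = next_cwnd(cwnd, ssthresh)
    trace.append(cwnd)
print(trace)                     # growth over four loss-free RTTs
print(on_loss(cwnd, "timeout"))
print(on_loss(cwnd, "dup_acks"))
```

A timeout is treated as the stronger congestion signal, which is why it throws CWND back to 1, while duplicate ACKs show that later segments are still getting through.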

Retransmission mechanism and timer

Network transmission is not always reliable, so there must always be a fallback.

An ACK number always means that all bytes before that number have been received

Suppose three segments are sent, 101:200 (OK), 201:300 (FAIL), and 301:400 (OK), and the second one is lost. Even though the third segment is received successfully, the ACK number stays at 201. Only after the second segment has been retransmitted successfully does the ACK number move forward; because the third segment has already been received, it then jumps to 401.
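
The cumulative ACK rule can be checked against this example with a short function. `cumulative_ack` is a hypothetical helper written for illustration; it takes the byte ranges received so far, with both ends inclusive as in the 101:200 notation.

```python
def cumulative_ack(received, first_seq):
    """Return the ACK number: the first byte not yet received in order.

    received: iterable of (start, end) byte ranges, both ends inclusive.
    """
    expected = first_seq
    for start, end in sorted(received):
        if start > expected:
            break  # hole in the stream: the ACK cannot move past it
        expected = max(expected, end + 1)
    return expected


# Segment 201:300 is lost, 301:400 arrives anyway.
print(cumulative_ack([(101, 200), (301, 400)], first_seq=101))
# After the retransmitted 201:300 arrives, the hole is filled.
print(cumulative_ack([(101, 200), (301, 400), (201, 300)], first_seq=101))
```

The ACK number only ever describes the contiguous prefix, so a single hole holds it back no matter how much later data has arrived.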

SACK fast retransmission

Even when a packet has been lost, the sender still has to wait for the timer to expire before retransmitting it, which is inefficient when the network loses packets frequently. Fast retransmission was introduced for this: once the sender receives three duplicate ACKs, it concludes that the segment starting at that ACK number has been lost and retransmits it immediately instead of waiting for the timer. With SACK (selective acknowledgment), the receiver additionally reports which later ranges it has already received, so the sender retransmits only what is missing.
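
Detecting the three duplicate ACKs is a matter of counting repeats. This is a minimal sketch of that trigger only: the function `fast_retransmit_points` is made up for this example, and SACK blocks are not modelled.

```python
def fast_retransmit_points(acks, threshold=3):
    """Return the ACK numbers at which a fast retransmit would be triggered.

    A retransmit fires when the same ACK number is seen `threshold` more times
    after its first occurrence (that is, on the third duplicate by default).
    """
    triggers = []
    last, duplicates = None, 0
    for ack in acks:
        if ack == last:
            duplicates += 1
            if duplicates == threshold:
                triggers.append(ack)  # resend the segment starting at `ack`
        else:
            last, duplicates = ack, 0
    return triggers


# The receiver keeps asking for byte 201 while later segments arrive.
print(fast_retransmit_points([201, 201, 201, 201, 401]))
```

The first ACK for 201 is the normal one; only the three repeats after it count as duplicates.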

  • Connection-establishment timer

A SYN packet may fail to reach the server for some reason, so the client starts a timer every time it sends a SYN packet, for example 2 s. If it does not receive a SYN+ACK from the server within 2 s, the client sends another SYN packet, and the next timer may be longer, for example 3 s. If the retries keep failing, connection establishment fails.

  • Retransmission timer

After a packet is sent, a timer is started. If no ACK is received from the peer before the timer expires, the packet is retransmitted

  • Delayed ACK timer

After receiving a packet, the receiver does not reply with an ACK right away but waits for a short while. If it has data of its own to send back in the meantime, the ACK is piggybacked on that data to improve efficiency.

  • Persist timer

When the receive window on the receiving end drops to 0, the receiver tells the sender in an ACK that it cannot accept data for now. When will the window open again? The sender cannot wait forever, so it probes the receiver at intervals.

  • Keepalive timer

If SO_KEEPALIVE is enabled on a TCP connection, the keepalive timer takes effect. If the two parties have not communicated for a long time, neither knows whether the other is still alive, so the timer triggers a probe. If the peer turns out to be dead, the connection can be released instead of being maintained further
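
Enabling keepalive from application code looks roughly like this in Python. It is a sketch: `SO_KEEPALIVE` is the standard socket option, while the three tuning options are platform-specific, so the helper sets them only where the `socket` module exposes them. The helper name `enable_keepalive` and the timing values are arbitrary choices for this example.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive and, where supported, tune its timing."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These tuning options exist only on some platforms (for example Linux).
    tuning = (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # failed probes before giving up
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
        except OSError:
            pass  # the platform exposes the constant but rejects the value


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
print(keepalive_on)
```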

  • FIN_WAIT2 timer

After the active closer sends a FIN packet and the peer replies with an ACK packet, the active closer starts a timer. This guards against the peer never sending its own FIN, or that FIN never arriving. If the timer expires, the active closer releases the connection by itself.

  • TIME_WAIT timer

TIME_WAIT is the last state entered by the side that actively closes. It lasts 2MSL, where MSL is the maximum segment lifetime.

  1. One MSL covers the case where the last ACK fails to reach the peer and the peer resends its FIN
  2. One MSL ensures that the ACK itself has time to reach the peer

Worst-case scenario: the ACK spends 1MSL in the network, and the peer then retransmits the FIN, which also spends 1MSL in the network