The Autumn Recruitment Escort series of articles is continuously updated. The previous articles are listed below; help yourself:

  • HTML & CSS: Autumn Recruitment Escort – HTML, CSS
  • JS: Autumn Recruitment Escort – JS Interview Part 1, Autumn Recruitment Escort – JS Interview Part 2
  • Browser: Autumn Recruitment Escort – Browser

The transport layer

The transport layer lies between the application layer and the network layer. In Internet protocols, we focus on TCP and UDP. Let’s look at the services provided by the transport layer:

  • Transport layer protocols provide logical communication between running processes.

  • Application processes send messages to each other using the logical communication provided by the transport layer, without worrying about the details of the physical infrastructure that carries these messages

  • The transport layer protocol is implemented on the end system:

    • At the sender, the transport layer breaks the message received from the application process into segments
    • A transport-layer header is added to each segment
    • These segments are passed to the network layer
    • The network layer encapsulates each segment in a datagram and sends it toward the destination
    • At the receiver, the network layer extracts the transport-layer segment from the datagram and passes it up to the transport layer
    • The transport layer processes the received segment
    • The data in the segment is delivered to the receiving application process
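
The encapsulation and decapsulation steps above can be sketched as a toy model (all class and function names here are invented for illustration, not a real protocol stack):

```python
from dataclasses import dataclass

@dataclass
class Segment:
    """Transport-layer unit: header fields (ports) plus the application payload."""
    src_port: int
    dst_port: int
    payload: bytes

@dataclass
class Datagram:
    """Network-layer unit: addressing plus the encapsulated segment."""
    src_ip: str
    dst_ip: str
    segment: Segment

def transport_send(data: bytes, src_port: int, dst_port: int) -> Segment:
    """Sender: turn an application message into a transport-layer segment."""
    return Segment(src_port, dst_port, data)

def network_send(seg: Segment, src_ip: str, dst_ip: str) -> Datagram:
    """Network layer: encapsulate the segment in a datagram."""
    return Datagram(src_ip, dst_ip, seg)

def transport_receive(dgram: Datagram) -> bytes:
    """Receiver: extract the segment and hand the payload to the application."""
    return dgram.segment.payload
```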

Services provided by both UDP and TCP

  • Transport-layer multiplexing and demultiplexing
  • Integrity checking

Additional services provided by TCP

  • Reliable data transfer service
  • Congestion control

1. Differences between TCP and UDP

TCP is a connection-oriented, reliable, byte-stream-based transport-layer protocol. UDP is a connectionless transport-layer protocol.

(1) Connection-oriented services

  • Connection-oriented service: TCP has the client and server exchange transport-layer control information with each other before application data starts to flow.
  • This so-called handshake alerts the client and server, letting them prepare for the arrival of packets
  • After a TCP connection is established between the sockets of the two processes, both parties can send and receive packets on the connection
  • When packet transmission ends, the connection must be torn down.

(2) Reliable data transmission service

  • Communication processes rely on TCP to deliver all data error-free and in the proper order

(3) Byte-stream oriented

  • UDP data transmission is based on datagrams, because it inherits only the IP layer's features, whereas TCP maintains state and turns the IP packets into a byte stream.

2. Three-way handshake for TCP connections

  • The client sends a SYN to indicate that it wants to establish a connection to the server, along with an initial sequence number (ISN)

  • The server returns an ACK (acknowledgment number: client ISN + 1) as confirmation, and also sends a SYN of its own (with the server's own ISN)

  • The client sends an ACK to confirm receipt of the reply (acknowledgment number: server ISN + 1)
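
The sequence and acknowledgment numbers exchanged above can be sketched as follows (the ISNs here are made up; real stacks choose them randomly):

```python
def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three handshake segments as (flags, seq, ack) tuples."""
    syn = ("SYN", client_isn, None)                    # 1. client -> server
    syn_ack = ("SYN+ACK", server_isn, client_isn + 1)  # 2. server -> client
    ack = ("ACK", client_isn + 1, server_isn + 1)      # 3. client -> server
    return [syn, syn_ack, ack]
```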

1. Why three handshakes, not two or four?

Because TCP connections are full-duplex, data can be transmitted in both directions simultaneously, so we must make sure that both parties can send and receive data.

  • First handshake: proves that the client can send data
  • Second handshake: the ACK proves that the server can receive data, and the SYN proves that the server can send data
  • Third handshake: proves that the client can receive data
  • A fourth handshake would be wasted, while two handshakes cannot guarantee that "both parties can send and receive".

2. Why does the client need to send a final acknowledgment? Mainly to prevent a stale, invalid connection-request segment from suddenly arriving at the server and causing an error.

3. TCP waves four times

  • The closing party sends a FIN to shut down data transmission in its direction

  • Upon receiving the FIN, the server sends an ACK (acknowledgment number: FIN sequence number + 1) as confirmation

  • After its own data transmission is complete, the server also sends a FIN to shut down data transmission in the other direction

  • The client replies with an ACK to confirm
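
Analogously to the handshake, the four wave segments can be sketched with their sequence and acknowledgment numbers (illustrative values only; in practice the server's ACK and FIN may also be combined into one segment):

```python
def four_way_wave(client_seq: int, server_seq: int):
    """Return the four teardown segments as (sender, flags, seq, ack) tuples."""
    fin1 = ("client", "FIN", client_seq, None)             # 1. client closes its side
    ack1 = ("server", "ACK", server_seq, client_seq + 1)   # 2. server confirms
    fin2 = ("server", "FIN", server_seq, client_seq + 1)   # 3. server closes its side
    ack2 = ("client", "ACK", client_seq + 1, server_seq + 1)  # 4. client confirms
    return [fin1, ack1, fin2, ack2]
```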

1. Why four waves but only three handshakes? Releasing a TCP connection requires "four waves" because the ACK acknowledgment and the FIN release segment are sent separately, in the second and third waves, whereas during connection setup the SYN and ACK travel together. Why are they combined when connecting but separated when releasing?

  • When establishing a connection, the passive server needs no extra preparation when it leaves the CLOSED state and enters the handshake phase: it can directly return a combined SYN and ACK segment to proceed with establishing the connection.
  • When releasing a connection, the passive server receives the active client's request to release the connection but cannot release it immediately, because it may still have necessary data to send. Therefore, the server first sends an ACK to confirm receipt of the packet, and only returns a FIN to release the connection after its CLOSE-WAIT phase is complete.

4. Principle of TCP reliable data transmission

(1) RDT1.0 protocol

Consider the simplest case first, where the underlying communication channel is completely reliable

  • The sender

    • rdt_send(data): receives data from the upper layer
    • make_pkt(data): generates a packet containing this data
    • udt_send(packet): sends the packet into the channel
  • The receiving end

    • rdt_rcv(packet): receives a packet from the underlying channel
    • extract(packet, data): extracts the data from the packet
    • deliver_data(data): delivers the data to the upper layer

(2) RDT2.0 protocol

The first protocol assumed the channel was completely reliable, but in a more realistic model the bits in a packet may be corrupted.

  • The sender

    • State 1:

      • The sending protocol waits for data to come down from the upper layer
      • rdt_send(data): receives data from the upper layer
      • make_pkt(data, checksum): generates a packet containing this data and a checksum
      • udt_send(sndpkt): sends the packet
    • State 2:

      • Wait for an ACK or NAK packet from the receiver
      • rdt_rcv(rcvpkt) && isACK(rcvpkt): the receiver correctly received the packet; the state returns to waiting for an upper-layer call
      • rdt_rcv(rcvpkt) && isNAK(rcvpkt): the receiver's response to the previous packet requests retransmission; resend the packet and keep waiting for the ACK or NAK sent back by the receiver
  • The receiving party

    • If the packet is not corrupted, return an ACK
    • If the packet is corrupted, return a NAK
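
The corruption check behind the ACK/NAK decision can be sketched with the classic 16-bit ones'-complement checksum (in the style of RFC 1071; this is an illustration, not TCP's exact pseudo-header computation):

```python
def internet_checksum(data: bytes) -> int:
    """Ones'-complement sum of 16-bit words, folded to 16 bits and inverted."""
    if len(data) % 2:
        data += b"\x00"  # pad to an even number of bytes
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return (~total) & 0xFFFF

def is_corrupted(data: bytes, checksum: int) -> bool:
    """Receiver side: recompute and compare; mismatch means send a NAK."""
    return internet_checksum(data) != checksum
```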

(3) RDT2.1 protocol

Rdt2.0 appears to work, but it has a fatal flaw: it does not consider that the ACK or NAK itself may be corrupted. So sequence numbers were introduced on top of rdt2.0.

The sender and the receiver each maintain an FSM that alternates between sequence numbers 0 and 1 (figures omitted).

(4) RDT 3.0

In addition to bit corruption, we must also consider packet loss in the underlying channel.

Introduce a countdown timer:

  • Start the timer each time a packet is sent
  • On a timer interrupt, retransmit the packet
  • Stop the timer once the acknowledgment arrives
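
The timer logic above can be sketched as a retransmission loop over a simulated lossy channel (the names and the channel model are invented for illustration):

```python
import random

def rdt3_send(packet, channel, max_tries=10):
    """Send, 'start the timer', and retransmit on timeout until an ACK arrives.
    Returns the number of transmissions needed."""
    for attempt in range(1, max_tries + 1):
        ack = channel(packet)      # send and wait for an ACK...
        if ack is not None:        # ...which may never come (simulated timeout)
            return attempt         # ACK received: stop the timer, done
        # timer interrupt: fall through and retransmit
    raise TimeoutError("no ACK after %d tries" % max_tries)

def lossy_channel(loss_rate, rng):
    """Simulated channel that loses the packet (or its ACK) with loss_rate."""
    def channel(packet):
        return None if rng.random() < loss_rate else ("ACK", packet)
    return channel
```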

(5) Pipelining

Rdt3.0 is a correctly functioning transport protocol, but its stop-and-wait behavior (waiting for an ACK from the receiver before returning to the state of waiting for an upper-layer call) makes it inefficient.

Workaround: instead of operating in stop-and-wait mode, allow the sender to send multiple packets without waiting for acknowledgments. This technique is called pipelining.

The impact of pipelining:

  • The range of sequence numbers must be increased, because each in-transit packet must have a unique identifier
  • The sender and receiver of the protocol must buffer multiple packets
  • The required sequence-number range and buffering depend on how the protocol handles lost, corrupted, and delayed packets. There are two basic approaches to pipelined error recovery: Go-Back-N (GBN) and selective repeat (SR)

(6) Go-Back-N protocol

Go-Back-N protocol (GBN, a sliding-window protocol): allows the sender to send multiple packets without waiting for acknowledgment, but the number of unacknowledged packets cannot exceed a maximum value N.

N is limited for the purposes of flow control and congestion control

The GBN protocol responds to three kinds of events:

  • The sender:

    • When the upper layer calls

      • If the window is full, tell the upper layer to wait
      • If the window is not full, a packet is generated and transmitted
    • When an ACK is received

      • The window slides to the right
    • Timeout event

      • Retransmit all packets that have been sent but not yet acknowledged; if an ACK is received while earlier packets remain unacknowledged, restart the timer
  • Receiver:

    • If the packet numbered n is received correctly and in order, an ACK is issued for n
    • In all other cases, the receiver discards the packet and re-sends an ACK for the most recently received in-order packet
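
The sender-side bookkeeping above can be sketched as a minimal GBN window (the names `base` and `nextseq` follow the textbook convention; this is an illustration, not a real stack):

```python
class GBNSender:
    def __init__(self, window_size: int):
        self.N = window_size
        self.base = 0        # oldest unacknowledged sequence number
        self.nextseq = 0     # next sequence number to use

    def can_send(self) -> bool:
        return self.nextseq < self.base + self.N   # window not full

    def send(self):
        """Upper-layer call: return the sequence number used, or None if full."""
        if not self.can_send():
            return None      # window full: the upper layer must wait
        seq = self.nextseq
        self.nextseq += 1
        return seq

    def on_ack(self, acknum: int):
        """Cumulative ACK: everything up to and including acknum is confirmed."""
        if acknum >= self.base:
            self.base = acknum + 1   # slide the window to the right

    def on_timeout(self):
        """Go back N: every sent-but-unacknowledged packet is retransmitted."""
        return list(range(self.base, self.nextseq))
```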

(7) Selective repeat protocol (SR)

To guarantee packet order, the GBN sliding-window protocol retransmits a whole window of data; with a large window length and bandwidth, this causes efficiency problems from repeated transmission.

Selective repeat: lets the sender retransmit only those packets that it suspects were received in error at the receiver, avoiding unnecessary retransmissions.

(8) Reliable data transmission in TCP

The Internet's network-layer service (IP service) is unreliable: it guarantees neither the delivery of datagrams, nor their in-order delivery, nor the integrity of the data in them. On top of it, TCP creates a reliable data transfer service, which ensures that the data stream delivered to the other end is uncorrupted, gap-free, non-redundant, and in order.

We will explain how TCP achieves reliable data transmission based on the above principles:

  1. If the underlying channel were completely reliable, then per rdt1.0, TCP would only need to transfer the data.
  2. Unfortunately, bits are often corrupted during network transmission, so per rdt2.0, a checksum is added to verify data correctness
  3. The protocol now seems perfect, but packets can still be lost in transit. Per rdt3.0, a timer is introduced: when a packet is not acknowledged within a period of time, it is sent again and the timer restarts
  4. However, there is still a problem with timers. If a response packet is merely delayed, how can its retransmission be distinguished from other packets? So we introduced sequence numbers. The receiver can then determine from the byte number whether arriving data is new or a retransmission.
  5. The protocols above solve reliable transmission, but they are stop-and-wait protocols: during transmission, if a response has not been received, the upper-layer application has to wait, which is too inefficient. Therefore pipelining is introduced, sending multiple packets without waiting for response packets.
  6. The network would be flooded with as many acknowledgments as packets, because every packet sent would require an acknowledgment. The way to improve efficiency is cumulative acknowledgment: the receiver does not reply to each packet one by one, but after accumulating a certain number of packets, tells the sender that all data up to and including this packet has been received. For example, if segments 1, 2, 3, 4 are received, the receiver only needs to tell the sender "I received 4", and the sender knows that 1 through 4 were all received.
  7. Cumulative acknowledgment improves network efficiency, but on packet loss the GBN method retransmits everything from the lost packet onward. Although this preserves packet order, it wastes serious resources once packets are lost and traffic is heavy. Therefore, the received segments can be recorded in the options field of a TCP packet, each block of received data needing two boundaries. This lets the sender retransmit only the lost data based on the options field. This approach looks a lot like SR, so we say that TCP's error-recovery mechanism for reliable data transmission is a hybrid of GBN and SR.
  8. Can the sender transmit indefinitely until all the data in the buffer is sent? No, because it must consider the receiver's buffer and its ability to read data. If packets are sent faster than the receiver can accept them, packets will be retransmitted frequently, wasting network resources. Therefore, the range of data the sender transmits must take the state of the receiver's buffer into account. This is TCP flow control. The solution: sliding windows.
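
The cumulative acknowledgment of step 6 can be sketched as a small helper (sequence numbers are simplified to small integers rather than byte offsets):

```python
def cumulative_ack(received: set) -> int:
    """Given the set of received sequence numbers (starting from 1),
    return the largest n such that 1..n have all arrived, or 0 if none."""
    n = 0
    while n + 1 in received:
        n += 1
    return n
```

So acknowledging 4 implicitly acknowledges 1, 2, and 3, while a gap caps the ACK at the last in-order number.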

5. TCP flow control

For both the sender and the receiver, TCP puts data to be sent into a send buffer and received data into a receive buffer. Flow control means using the remaining size of the receive buffer to control how much the sender sends. If the receiver's receive buffer is full, the sender cannot continue sending.

To understand flow control in detail, you first need to understand the concept of sliding windows.

TCP sliding windows are divided into two types: the send window and the receive window.

Send window

The structure of the sliding window on the sending end is as follows:

There are four major parts:

  • Sent and acknowledged
  • Sent but not yet acknowledged
  • Not yet sent, but allowed to be sent
  • Not yet sent, and not allowed to be sent

There are some important concepts, which I have highlighted in the diagram:

The send window is the boxed area in the figure. SND means send, WND means window, UNA means unacknowledged, and NXT means next, indicating the next position to send.
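
The pointer arithmetic implied by SND.UNA, SND.NXT, and SND.WND can be sketched as follows (all values are illustrative sequence numbers):

```python
def window_parts(snd_una: int, snd_nxt: int, snd_wnd: int):
    """Split the send window into its two in-window regions:
    bytes sent but not yet acknowledged, and bytes not sent but sendable."""
    sent_unacked = snd_nxt - snd_una           # sent but not yet acknowledged
    usable = snd_una + snd_wnd - snd_nxt       # not sent, but allowed to be sent
    return sent_unacked, usable
```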

Receiving window

The window structure of the receiving end is as follows:

RCV stands for receive, NXT stands for the next receive position, and WND stands for the size of the receive window.

Flow control process

6. TCP congestion control

The flow control in the previous section happens between the sender and the receiver, without taking the state of the whole network into account. If the current network is particularly bad and prone to packet loss, the sender should take note of that. And that is where congestion control comes in.

For congestion control, TCP maintains two core states per connection:

  • Congestion Window (CWND)
  • Slow start threshold (ssthresh)

There are several algorithms involved:

  • Slow start
  • Congestion avoidance
  • Fast retransmission and fast recovery

Let’s take these states and algorithms apart. First, start with the congestion window.

Congestion window

The congestion window (CWND) is the amount of data the sender can still transmit.

Earlier we introduced the concept of the receive window, so what's the difference between the two?

  • The receive window (RWND) is a limit imposed by the receiver
  • The congestion window (CWND) is a limit imposed by the sender

What do they limit?

They limit the size of the send window.

With these two windows, how do you calculate the send window?

Send window size = min(RWND, CWND)

Take the smaller of the two. Congestion control works by controlling changes in CWND.

Slow start

At the start of data transmission, you do not know whether the current network is stable or congested. If you act too aggressively and send packets too fast, massive packet loss follows, producing an avalanche of network disaster.

So congestion control starts with a conservative algorithm that slowly adapts to the network, called slow start. It works as follows:

  • First, during the three-way handshake, each side announces its own receive window size
  • Both sides initialize their own congestion window (CWND) size
  • At the start of transmission, the congestion window grows by 1 for each ACK the sender receives, which means CWND doubles every RTT. If the initial window is 10, then after the first round's 10 segments are sent and their ACKs received, CWND becomes 20, then 40 in the second round, 80 in the third, and so on.

Is it just going to double forever? Of course not. The threshold is called the slow start threshold, and when CWND reaches it, it's time to hit the brakes. Not so fast, buddy. Hold on!

How do you control the size of CWND after the threshold is reached?

This is what congestion avoidance does.

Congestion avoidance

Once the threshold is reached, CWND grows by only 1/CWND per ACK. Do the math: after one round of RTT, the sender has received CWND ACKs, so the congestion window CWND increases by 1 in total.

That is to say, where one RTT used to double CWND, now each RTT increases CWND by just 1.

Of course, slow start and congestion avoidance work together, as one.
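
The two growth regimes can be sketched per RTT (CWND in units of segments; all numbers are illustrative):

```python
def cwnd_after_rtts(cwnd: int, ssthresh: int, rtts: int) -> int:
    """Evolve the congestion window over a number of loss-free round trips."""
    for _ in range(rtts):
        if cwnd < ssthresh:
            cwnd *= 2      # slow start: doubles every RTT
        else:
            cwnd += 1      # congestion avoidance: +1 per RTT
    return cwnd
```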

Fast retransmission

During TCP transmission, if a packet is lost, that is, when the receiving end finds that segments have not arrived in sequence, the receiver sends duplicate ACKs.

For example, if the fifth packet is lost, then even when the sixth and seventh packets reach the receiver, the receiver keeps returning the ACK for the fourth packet. When the sender receives three duplicate ACKs, it realizes the packet was lost and immediately retransmits it without waiting for the RTO to expire.

This is called fast retransmission, and it addresses the question of whether retransmission is necessary.

Selective retransmission

Then you might ask: since retransmission is needed, do you retransmit only the 5th packet, or all of the 5th, 6th, and 7th packets?

Of course only the 5th: the 6th and 7th have already arrived, and TCP's designers are not stupid. Why resend what has already been delivered? Simply record which packets arrived and which did not, and retransmit accordingly.

After receiving packets from the sender, the receiver replies with ACK packets, and it can add the SACK attribute to the header options, using left edge and right edge values to tell the sender which intervals of data it has received. So even if the fifth packet is lost, when the sixth and seventh packets arrive, the receiver still tells the sender that those two have arrived; since the fifth has not, it is retransmitted. This mechanism, called Selective Acknowledgment (SACK), addresses the problem of what to retransmit.
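
Building the SACK intervals can be sketched like this (sequence numbers are simplified to packet numbers; real SACK edges are byte offsets):

```python
def sack_blocks(received: set):
    """Collapse received sequence numbers into contiguous [left, right) blocks,
    like the left edge / right edge pairs carried in the SACK option."""
    blocks = []
    for seq in sorted(received):
        if blocks and seq == blocks[-1][1]:
            blocks[-1][1] = seq + 1          # extend the current block
        else:
            blocks.append([seq, seq + 1])    # start a new block
    return [tuple(b) for b in blocks]
```

A gap between blocks (here, packet 5 missing between blocks starting at 6 and earlier data) is exactly what the sender retransmits.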

Fast recovery

After receiving three duplicate ACKs, the sender realizes a packet was lost, concludes that the network is congested, and enters the fast recovery phase.

At this stage, the sender changes as follows:

  • The slow start threshold is reduced to half of the current CWND
  • CWND is set to the new threshold
  • CWND then grows linearly

These are the classic TCP congestion control algorithms: slow start, congestion avoidance, fast retransmission, and fast recovery.

The application layer

At the application layer, we focus on HTTP, which uses TCP as its supporting transport protocol

  • The client initiates a connection to the server
  • Once the connection is established, a process between the browser and the server can access TCP through a socket interface
  • The client sends HTTP request packets through the socket and receives HTTP response packets from the socket
  • TCP provides reliable data transmission services
  • The response packet from the server is returned to the client intact

Note: HTTP is a stateless protocol. The server sends the requested file to the client without storing any state information about the client

1. Format of HTTP packets

(Example: the request header information sent when accessing hackr.jp)

(1) HTTP request packets

  • Request line: Request method + URI + protocol version

    • GET, POST, PUT, HEAD, DELETE, OPTIONS, TRACE, CONNECT, LINK, UNLINK

      Method    Description                               Supported HTTP version
      GET       Retrieve a resource                       1.0, 1.1
      POST      Transfer an entity body                   1.0, 1.1
      PUT       Transfer a file                           1.0, 1.1
      HEAD      Obtain the message header                 1.0, 1.1
      DELETE    Delete a file                             1.1
      OPTIONS   Ask which methods are supported           1.1
      TRACE     Trace the request path                    1.1
      CONNECT   Require a tunneling proxy connection      1.1
      LINK      Establish a relationship with a resource  1.0
      UNLINK    Sever the relationship with a resource    1.0
    • URI

      • Protocol: HTTP or HTTPS
      • Login information: optional; specifies a user name and password as login information for fetching data from the server
      • Server address: a common URL, resolved by DNS to the host's unique IP address
      • Port number: the port of the socket used to access the server; the default port of a Web server is 80
      • Hierarchical file path: specifies the path of the resource on the server
      • Query string: optional; passes parameters for the resource at the specified file path
      • Fragment identifier: optional; marks a sub-resource within the fetched resource
    • Protocol versions: HTTP/0.9, HTTP/1.0, and HTTP/1.1

  • Header field: see below

  • Entity body
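
The request layout above (request line, header fields, blank line, entity body) can be sketched as plain text (the host and path are made-up examples):

```python
def build_request(method: str, path: str, host: str, headers=None, body: str = "") -> str:
    """Assemble a raw HTTP/1.1 request: request line, headers, blank line, body."""
    lines = ["%s %s HTTP/1.1" % (method, path), "Host: " + host]
    for name, value in (headers or {}).items():
        lines.append("%s: %s" % (name, value))
    lines.append("")       # the blank line that separates headers from the body
    lines.append(body)
    return "\r\n".join(lines)
```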

(2) HTTP response packets

  • Status line: protocol version + status code + status code cause phrase

    • Protocol versions: HTTP/0.9, HTTP/1.0, and HTTP/1.1

    • Status code:

      • 200 OK: the request was processed normally
      • 204 No Content: the request was processed successfully, but the response contains no entity body
      • 206 Partial Content: the client made a range request and the server successfully executed that part of the GET request
      • 301 Moved Permanently: permanent redirect; the requested resource has been assigned a new URI
      • 302 Found: temporary redirect; the requested resource has temporarily been assigned a new URI
      • 303 See Other: the requested resource has another URI; use the GET method to fetch it
      • 304 Not Modified: the server allows access to the resource, but the request's conditions were not met (so the cached copy can be used)
      • 307 Temporary Redirect: temporary redirect
      • 400 Bad Request: the request message contains a syntax error
      • 401 Unauthorized: authentication over HTTP is required
      • 403 Forbidden: access to the requested resource was refused by the server
      • 404 Not Found: the requested resource could not be found on the server
      • 500 Internal Server Error: the server failed while executing the request
      • 503 Service Unavailable: the server is overloaded or down for maintenance
  • Header field: see below

  • Entity body

(3) Header fields

  • General header fields

  • Request header fields

  • Response header fields

  • Entity header fields

2. Differences between GET and POST

If an interviewer tries to trip you up on GET vs POST, don't blame me when you get to show off.

  • GET is used to fetch data; POST is used to submit data
  • GET parameters are limited in length (by the URL length; depending on the browser and server, up to about 2048 bytes), while POST parameters are unlimited.
  • A GET request appends its data to the URL: a "?" separates the URL from the data, and multiple parameters are joined with "&". A POST request puts the data in the HTTP request body.
  • GET parameters are in plaintext in the URL; POST parameters are in the request body, but a developer can still see them with a packet-capture tool, where they are plaintext too.
  • GET requests are cached proactively by browsers, whereas POST requests are not, unless set manually.
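
Where the parameters travel can be sketched with Python's standard `urlencode` (the path and parameter names are made-up examples):

```python
from urllib.parse import urlencode

params = {"q": "tcp", "page": "2"}
query = urlencode(params)            # parameters joined with "&": "q=tcp&page=2"

# GET: the encoded parameters ride in the URL after "?"
get_request_line = "GET /search?%s HTTP/1.1" % query

# POST: the same encoding, but carried in the request body instead
post_body = query
```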

3. Persistent and non-persistent connections

To send multiple requests, the evolution from HTTP/0.9 to HTTP/2 went from multiple TCP connections => keep-alive => pipelining => multiplexing, continuously reducing the performance cost of repeatedly creating TCP connections.

(1) HTTP for non-persistent connections

Each request/response between the client and the server is sent over a separate TCP connection

Disadvantages:

  • A new connection must be established for each requested object, and TCP buffers and TCP variables must be allocated in both the client and the server, placing a heavy burden on the server
  • Each object suffers a delivery delay of two RTTs: one to establish the TCP connection and one to request and receive the object

(2) HTTP for persistent connections

HTTP Persistent Connections (also called HTTP keep-alive)

In the HTTP1.1 persistent connection scenario, the server keeps the TCP connection open after sending the response. Subsequent request packets and response packets can be transmitted over the same TCP connection.

Pipelining

Persistent connections make pipelining possible: the next request can be issued without waiting for the previous one's response

Multiplexing

HTTP transmission is based on a request-response model: one request, then its response, in strict alternation. Notably, requests on a connection are executed serially in a task queue; once the first request is processed too slowly, the processing of subsequent requests is blocked. This is the famous HTTP head-of-line blocking problem.

Multiplexing replaces the original serial, blocking mechanism: all requests are made concurrently over a single TCP connection. Before multiplexing, all transmission was based on text; with multiplexing, transmission is based on binary data frames organized into messages and streams, so frames can be delivered out of order. Because multiplexing is stream-based, all requests under the same domain name share one connection, and there is no blocking between concurrent requests in that domain.

4. Differences between HTTP versions

Features and differences between HTTP versions

HTTP 1.0:

  • Any data type can be sent
  • There are three methods: GET, POST, and HEAD
  • TCP connections cannot be reused (no persistent connections)
  • There are rich request and response headers; Last-Modified/If-Modified-Since and Expires in the headers serve as cache identifiers

HTTP 1.1:

  • Introduces more request methods: PUT, PATCH, DELETE, OPTIONS, TRACE, CONNECT
  • TCP connections are not closed by default and can be reused by multiple requests, by setting the Connection: keep-alive header
  • Introduces a pipelining mechanism, so that multiple requests can be sent back-to-back on the same TCP connection
  • Enhanced cache management and control: Cache-Control, ETag/If-None-Match
  • Supports chunked responses and resumable downloads, which helps with large file transfers, via the Range request header
  • Supports virtual hosting: multiple virtual hosts can exist on one physical server and share the same IP address

HTTP 2:

  • Binary framing: no longer plain text, avoiding parsing ambiguity and reducing request size
  • Server push: the server can push additional resources to the client without an explicit request from the client
  • Multiplexing: requests and responses are sent concurrently over a shared TCP connection
  • Better security: in practice, HTTP/2 requires at least TLS 1.2
  • Header compression with the HPACK algorithm: an index table plus Huffman coding greatly saves bandwidth