Compared with the previous version of this computer network interview summary, this version expands the coverage of how TCP ensures reliable transmission (timeout retransmission, the stop-and-wait protocol, sliding windows, flow control, and congestion control) and makes some additions to the existing content.

The structure and function of each layer of OSI and TCP/IP, and the protocols at each layer

The five-layer protocol architecture

When learning computer networks, we usually adopt a compromise that combines the advantages of OSI and TCP/IP: a five-layer protocol architecture, which is both concise and makes the concepts clear.

Taking the Internet as the setting, here is a very brief introduction to the role of each layer, from top to bottom.

1 The application layer

The task of the application layer is to complete specific network applications through interaction between application processes. Application-layer protocols define the rules of communication and interaction between application processes (a process is a program running on a host). Different network applications require different application-layer protocols. The Internet has many application-layer protocols, such as the Domain Name System (DNS), the HTTP protocol that supports World Wide Web applications, and the SMTP protocol that supports e-mail. We call the unit of data that the application layer exchanges a message.

The domain name system

The Domain Name System (DNS) is a core service of the Internet. As a distributed database that maps domain names and IP addresses to each other, it lets people access the Internet more conveniently, without having to remember numeric IP strings that only machines read easily. For example, a company's Web site can be regarded as its portal on the Internet, and its domain name is the equivalent of its street address, usually built from the company's name or abbreviation, as in IBM (www.ibm.com), Oracle (www.oracle.com), and Cisco (www.cisco.com).
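The mapping idea behind DNS can be sketched as a simple lookup table. This is only a toy illustration (the `HOSTS` table, the addresses in it, and the `resolve` helper are all made up for this sketch, not real DNS records or a real resolver API); actual DNS is a distributed, hierarchical database queried over the network.

```python
# Toy stand-in for the distributed DNS database: a local name -> IP table.
# The addresses below are illustrative placeholders, not real records.
HOSTS = {
    "www.ibm.com": "93.184.216.10",
    "www.oracle.com": "93.184.216.11",
}

def resolve(domain: str) -> str:
    """Look up a domain name, mimicking what a resolver does for us."""
    if domain not in HOSTS:
        raise KeyError(f"NXDOMAIN: no record for {domain}")
    return HOSTS[domain]
```

In practice you would not write this yourself; you would call the system resolver (for example `socket.getaddrinfo` in Python), which performs the real distributed lookup.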

The HTTP protocol

HyperText Transfer Protocol (HTTP) is the most widely used network protocol on the Internet. All WWW (World Wide Web) documents must comply with this standard. HTTP was originally designed to provide a way to publish and receive HTML pages. (Baidu Encyclopedia)
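To get a concrete feel for what an HTTP message looks like on the wire, here is a sketch that builds a minimal HTTP/1.1 GET request and parses the status line of a response. The helper names (`build_get`, `parse_status_line`) are ours, not a standard API; the message framing (CRLF line endings, mandatory Host header, blank line ending the headers) is what HTTP/1.1 specifies.

```python
def build_get(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request as raw bytes."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"        # Host is mandatory in HTTP/1.1
        "Connection: close\r\n"
        "\r\n"                     # blank line ends the header section
    ).encode("ascii")

def parse_status_line(response: bytes):
    """Split a status line like 'HTTP/1.1 200 OK' into (version, code, reason)."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason
```

A real client would of course use a library (`http.client`, `urllib`, etc.) rather than hand-rolling bytes, but the raw form makes the protocol's text-based structure visible.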

2 The transport layer

The main task of the transport layer is to provide a general data transfer service for communication between processes on two hosts. Application processes use this service to transmit application-layer messages. "General" means the service is not tied to any specific network application: multiple applications can use the same transport-layer service. Because a host can run multiple processes at the same time, the transport layer performs multiplexing and demultiplexing. Multiplexing means that multiple application-layer processes can simultaneously use the transport-layer service below them; demultiplexing means that the transport layer delivers the received data to the corresponding process in the application layer above.

The transport layer mainly uses the following two protocols

  1. TCP (Transmission Control Protocol) – Provides a connection-oriented, reliable data transmission service.
  2. User Datagram Protocol (UDP) — Provides a connectionless, best-effort data transfer service (with no guarantee of reliability).

Main features of UDP

  1. UDP is connectionless;
  2. UDP uses best-effort delivery, that is, reliable delivery is not guaranteed, so hosts do not need to maintain complex connection state (which involves many parameters);
  3. UDP is message-oriented;
  4. UDP has no congestion control, so congestion in the network does not slow down the source host's sending rate (useful for real-time applications such as live streaming and real-time video conferencing);
  5. UDP supports one-to-one, one-to-many, many-to-one, and many-to-many interactive communication;
  6. The UDP header is only 8 bytes, shorter than the 20-byte (minimum) TCP header.
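The connectionless, no-acknowledgement nature of UDP is easy to see from Python's socket API: the sender simply fires a datagram at an address, with no handshake and no ACK coming back. This is a loopback sketch (binding to port 0 asks the OS for any free port); delivery works here only because loopback rarely drops packets, not because UDP guarantees anything.

```python
import socket

# Receiver: bind a UDP socket on loopback; no listen()/accept() is needed.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))          # port 0 = let the OS pick a free port
addr = receiver.getsockname()

# Sender: no connection is established; sendto() just emits one datagram.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello over UDP", addr)   # best effort: no ACK will come back

data, peer = receiver.recvfrom(1024)     # one recvfrom returns one whole datagram
print(data)                              # b'hello over UDP'
sender.close()
receiver.close()
```

Note the contrast with TCP below: there is no connection object, and message boundaries are preserved (one `sendto` corresponds to one `recvfrom`).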

Main features of TCP

  1. TCP is connection-oriented. (Just like making a phone call, you dial to establish the connection before talking, and hang up to release the connection afterwards.)
  2. Each TCP connection has exactly two endpoints; a TCP connection is always point-to-point (one-to-one).
  3. TCP provides reliable delivery: data transmitted over a TCP connection arrives without errors, without loss, without duplication, and in order.
  4. TCP provides full-duplex communication. TCP allows the application process on either end to send data at any time. Both ends of a TCP connection have a send buffer and a receive buffer, used to temporarily store the data flowing in each direction.
  5. TCP is byte-stream oriented. A "stream" in TCP is a sequence of bytes flowing into or out of a process. "Byte-stream oriented" means that although an application hands data to TCP one block at a time (in varying sizes), TCP treats the application's data as an unstructured stream of bytes.
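These properties are visible from the programmer's side in a loopback echo sketch: `create_connection` performs the three-way handshake under the hood, and both sides can send and receive on the same socket (full duplex). One byte-stream caveat that this short demo glosses over: a single `recv` may return fewer bytes than were sent, so real code loops until it has a complete message.

```python
import socket
import threading

# Server: listen on loopback, echo back whatever arrives once.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))               # port 0 = any free port
srv.listen(1)
port = srv.getsockname()[1]

def echo_once():
    conn, _ = srv.accept()               # completes the three-way handshake
    data = conn.recv(1024)               # byte stream: no message boundaries
    conn.sendall(b"echo:" + data)
    conn.close()                         # starts the connection teardown

t = threading.Thread(target=echo_once)
t.start()

# Client: connect() performs the three-way handshake before any data flows.
cli = socket.create_connection(("127.0.0.1", port))
cli.sendall(b"hi")
reply = cli.recv(1024)
cli.close()
t.join()
srv.close()
print(reply)                             # b'echo:hi'
```

Compare this with the UDP sketch above: here a connection must be set up before any data moves, and torn down afterwards.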

3 The network layer

The network layer is responsible for providing communication services for different hosts on a packet-switched network. When sending data, the network layer encapsulates the segments or user datagrams produced by the transport layer into packets for transmission. Because the network layer of the TCP/IP architecture uses the IP protocol, packets there are also called IP datagrams, or datagrams for short.

One caveat here: do not confuse the transport layer's "user datagram (UDP)" with the network layer's "IP datagram". In addition, regardless of the layer at which a data unit is transmitted, it can loosely be referred to as a "packet".

Another task of the network layer is to select suitable routes, so that the packets handed down by the source host's transport layer can find their way to the destination host through the routers in the network.

It is emphasized here that the word “network” in the network layer is not the specific network we usually talk about, but refers to the name of the third layer in the computer network architecture model.

The Internet is composed of a large number of heterogeneous networks interconnected by routers. The network-layer protocols used by the Internet are the connectionless Internet Protocol (IP) and many routing protocols, so the Internet's network layer is also called the internet layer or the IP layer.

4 The data link layer

The data link layer is commonly referred to as the link layer. Data between two hosts is always transmitted one link at a time, which requires dedicated link-layer protocols. When transmitting data between two adjacent nodes, the data link layer assembles the IP datagrams handed down by the network layer into frames, which are transmitted over the link between the two adjacent nodes. Each frame includes data and the necessary control information (such as synchronization information, address information, and error control).

When receiving data, the control information lets the receiver know at which bit a frame starts and ends. In this way, when the data link layer receives a frame, it can extract the data portion and hand it up to the network layer. The control information also enables the receiver to detect errors in the received frame. If an error is found, the data link layer simply discards the faulty frame rather than forwarding it further and wasting network resources. If errors in link-layer transmission must also be corrected (that is, the data link layer not only detects errors but corrects them), then a reliable transmission protocol is used to correct them. This approach makes the link-layer protocol more complex.

5 The physical layer

The unit of data transmitted at the physical layer is the bit. The role of the physical layer is to realize the transparent transmission of bit streams between adjacent computer nodes, shielding as far as possible the differences between specific transmission media and physical devices, so that the data link layer above it does not have to consider the specific transmission medium of the network. "Transparent transmission of the bit stream" means the actual circuit does not change the bit stream; the circuit appears invisible to the bits being carried.

Among the various protocols used on the Internet, the most important and best known is TCP/IP. These days, "TCP/IP" often does not refer only to the two specific protocols TCP and IP, but to the entire TCP/IP protocol family used by the Internet.

To summarize

Above we took a first look at the five-layer architecture of computer networks; a seven-layer (OSI) architecture diagram is attached below as a summary. Photo credit: blog.csdn.net/yaopeng_200…

TCP three-way handshake and four-way wave (an interview favorite)

To deliver data to the target accurately, TCP adopts a three-way handshake strategy.

Graphic illustration:

Image source: Illustrated HTTP

Simple schematic:

  • Client – sends a packet with the SYN flag – first handshake – Server
  • Server – sends a packet with the SYN/ACK flags – second handshake – Client
  • Client – sends a packet with the ACK flag – third handshake – Server

Why three handshakes?

The purpose of the three-way handshake is to establish a reliable communication channel. Communication, simply put, is the sending and receiving of data, and the main purpose of the three-way handshake is for both sides to confirm that their own and each other's sending and receiving work normally.

First handshake: the Client can confirm nothing; the Server confirms that the peer's sending is normal.

Second handshake: the Client confirms that its own sending and receiving are normal, and that the peer's sending and receiving are normal; the Server confirms that its own receiving and the peer's sending are normal.

Third handshake: the Client confirms that its own sending and receiving are normal, and that the peer's sending and receiving are normal; the Server confirms that its own sending and receiving are normal, and that the peer's sending and receiving are normal.

Thus three handshakes confirm that both sides can send and receive properly; none of the three can be omitted.
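The confirmation logic above can be written down as a tiny state-machine sketch. The state names follow TCP's actual state diagram (CLOSED, LISTEN, SYN_SENT, SYN_RCVD, ESTABLISHED); the function itself is just an illustration, not a protocol implementation.

```python
def three_way_handshake():
    """Toy walk-through of the TCP connection-setup states."""
    client, server = "CLOSED", "LISTEN"
    segments = []

    segments.append("SYN")       # 1st handshake: client -> server;
    client = "SYN_SENT"          # on receipt, the server learns the client sends OK

    segments.append("SYN+ACK")   # 2nd handshake: server -> client;
    server = "SYN_RCVD"          # on receipt, the client knows both directions work

    segments.append("ACK")       # 3rd handshake: client -> server;
    client = "ESTABLISHED"       # on receipt, the server knows its sending works too
    server = "ESTABLISHED"

    return segments, client, server
```

Running it returns the three segments in order and leaves both endpoints in the ESTABLISHED state, matching the diagram above.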

Why do we return SYN

The receiver returns a SYN to tell the sender that the message it received was indeed the one the sender sent.

If SYN is transmitted, why also transmit ACK

Both directions of the communication must be verified. Transmitting SYN proves that the channel from the sender to the receiver is fine, but the channel from the receiver back to the sender also needs the ACK signal for verification.

SYN is the handshake signal used when TCP/IP establishes a connection. When a normal TCP connection is established between a client and a server, the client first sends a SYN message, the server responds with a SYN-ACK, and the client then responds with an ACK (acknowledgement character: in data communication, a transmission control character sent from the receiving station to the transmitting station to confirm that the transmitted data has been received). In this way, a reliable TCP connection is established between the client and the server, and data can be passed between them.

To disconnect a TCP connection, “four waves” are required:

  • Client – sends a FIN to shut down data transfer from the client to the server
  • Server – on receiving the FIN, sends back an ACK whose acknowledgement number is the received sequence number plus one (like a SYN, a FIN occupies one sequence number)
  • Server – closes the connection with the client and sends a FIN to the client
  • Client – sends back an ACK for acknowledgement, setting the acknowledgement number to the received sequence number plus one

Why four waves

After data transmission ends, either party can send a connection-release notification and, once the other party acknowledges it, enter the half-closed state. When the other party has no more data to send, it sends its own connection-release notification; after the first party acknowledges that, the TCP connection is completely closed.

Here's an analogy: A and B are on the phone. As the call comes to an end, A says, "I have nothing more to say," and B replies, "I know." But B may still have something to say, and A cannot force B to end the call at A's own pace, so B may keep talking for a while longer. Finally B says, "I'm done talking," and A answers, "I know," and that is how the call ends.

The summary above is brief by comparison; for more detail, this article is recommended: blog.csdn.net/qzcsu/artic…

Differences between TCP and UDP

UDP does not need to establish a connection before transmitting data, and the remote host does not need to give any acknowledgement after receiving a UDP packet. Although UDP does not provide reliable delivery, in some situations it is the most efficient way to work (typically for instant messaging), such as QQ voice, QQ video, and live streaming.

TCP provides a connection-oriented service. A connection must be established before data transfer and released after it. TCP does not provide broadcast or multicast services. Because TCP provides a reliable, connection-oriented transport service (before transferring data, TCP performs a three-way handshake to establish a connection; during the transfer there are mechanisms such as acknowledgements, windows, retransmission, and congestion control; after the transfer the connection is torn down to save system resources), it inevitably incurs a lot of overhead, such as acknowledgements, flow control, timers, and connection management. This not only makes the protocol data unit's header much larger but also consumes many processor resources. TCP is generally used for scenarios such as file transfer, sending and receiving mail, and remote login.

How does TCP ensure reliable transmission

  1. Application data is split into blocks that TCP considers best suited for sending.
  2. TCP numbers each packet it sends. The receiver sorts the packets and hands the data to the application layer in order.
  3. Checksum: TCP maintains a checksum over its header and data. This is an end-to-end checksum designed to detect any change in the data during transmission. If the checksum of a received segment is wrong, TCP discards the segment and does not acknowledge it.
  4. The TCP receiver discards duplicate data.
  5. Flow control: each side of a TCP connection has a fixed amount of buffer space. The TCP receiver only allows the sender to send as much data as the receiver's buffer can accept. When the receiver cannot keep up with the sender's data, it can prompt the sender to lower its sending rate, preventing packet loss. The flow-control protocol TCP uses is a variable-size sliding-window protocol. (TCP uses sliding windows to achieve flow control.)
  6. Congestion control: when the network is congested, the sender reduces its data transmission.
  7. Stop-and-wait is also designed to achieve reliable transmission: its basic principle is to stop after sending each packet and wait for the other party's acknowledgement, sending the next packet only after the acknowledgement arrives. Timeout retransmission: after TCP sends a segment, it starts a timer and waits for the destination to acknowledge the segment; if an acknowledgement is not received in time, the segment is resent.
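Point 3, the checksum, is concrete enough to compute. TCP (like IP and UDP) uses the 16-bit one's-complement Internet checksum of RFC 1071; a receiver verifies a segment by checksumming the data together with the transmitted checksum and expecting zero. A minimal sketch:

```python
def internet_checksum(data: bytes) -> int:
    """RFC 1071 Internet checksum: one's-complement sum of 16-bit words."""
    if len(data) % 2:                 # pad odd-length data with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold any carry back in
    return ~total & 0xFFFF
```

A segment's bytes plus its own checksum sum (in one's complement) to 0xFFFF, so re-checksumming data and checksum together yields 0 exactly when nothing was corrupted in transit.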

Stop-and-wait protocol

  • The stop-and-wait protocol exists to achieve reliable transmission. Its basic principle is to stop after sending each packet and wait for the other party's acknowledgement, sending the next packet only after the acknowledgement is received;
  • In the stop-and-wait protocol, if the receiver receives a duplicate packet, it discards the packet but still sends an acknowledgement.

1) No error:

The sender sends a packet; the receiver receives it within the specified time and replies with an acknowledgement; the sender then sends the next packet.

2) Error (timeout retransmission): if no acknowledgement arrives within the timeout period, the sender retransmits the packet. Two variants are described below:

Automatic retransmission request ARQ
Continuous ARQ protocol

3) Acknowledgement lost and acknowledgement delayed

  • Acknowledgement loss: the acknowledgement message is lost in transit

    A sends message M1 and B receives it; B sends A an acknowledgement for M1, but the acknowledgement is lost in transit. A does not know this, so after the timeout A retransmits M1, and B takes the following two measures on receiving it again:

    1. Discard the duplicate M1 and do not deliver it to the upper layer.
    2. Send the acknowledgement to A again. (B must not assume that because the acknowledgement was sent once it need not be resent; the very fact that A retransmitted M1 shows that B's acknowledgement was lost.)
  • Acknowledgement delay: the acknowledgement message arrives late

    A sends M1; B receives it and sends an acknowledgement. A receives no acknowledgement within the timeout period, so A retransmits M1; B receives it again and sends another acknowledgement (B has now received two copies of M1). A then receives the second acknowledgement B sent and goes on to send other data. A while later, A receives the first acknowledgement of M1 that B sent (A has also now received two acknowledgements). The handling is as follows:

    1. A simply discards the duplicate acknowledgement on receiving it.
    2. B simply discards the duplicate M1 on receiving it.

Automatic retransmission request ARQ protocol

In the stop-and-wait protocol, timeout retransmission means that if no acknowledgement is received within a certain period, the previously sent packet is retransmitted (the packet just sent is assumed lost). Therefore, a timeout timer must be set each time a packet is sent, and the retransmission time should be somewhat longer than the average round-trip time of a packet. This mode of automatic retransmission is often called automatic repeat request (ARQ).

Pros: Simple

Disadvantages: Low channel utilization
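The stop-and-wait behavior and its cost can be simulated. In this sketch each loss of a packet (or of its acknowledgement) shows up as a timeout followed by a retransmission of the same packet; the `loss_rate` parameter and the seeded random generator are artificial stand-ins for a lossy channel.

```python
import random

def stop_and_wait(packets, loss_rate=0.3, seed=42):
    """Simulate stop-and-wait ARQ over a lossy channel.

    The sender transmits one packet, then stops until it is acknowledged;
    a lost packet (or lost ACK) means a timeout and a retransmission.
    Returns (delivered packets, total transmissions).
    """
    rng = random.Random(seed)
    delivered = []
    transmissions = 0
    for pkt in packets:
        while True:
            transmissions += 1
            if rng.random() >= loss_rate:   # packet and its ACK both got through
                delivered.append(pkt)
                break                       # only now may the next packet go out
            # else: the timeout fires and the loop retransmits the same packet
    return delivered, transmissions
```

Despite losses, everything is eventually delivered in order; the price is extra transmissions plus idle waiting between packets, which is exactly the low channel utilization noted as the disadvantage.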

Continuous ARQ protocol

The continuous ARQ protocol can improve channel utilization. The sender maintains a sending window, and all packets within the window can be sent out consecutively without waiting for acknowledgements. The receiver generally uses cumulative acknowledgement: it acknowledges the last packet that arrived in sequence, indicating that all packets up to and including that one have been received correctly.

Advantages: high channel utilization and easy to implement; even if an acknowledgement is lost, retransmission may not be necessary.

Disadvantages: it cannot tell the sender exactly which packets the receiver has received correctly. For example, if the sender sends five packets and the third one (number 3) is lost, the receiver can only acknowledge the first two. The sender has no way of knowing what happened to the last three packets and must retransmit all three. This is also called Go-Back-N: the sender has to go back and retransmit the N packets that were already sent.
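A back-of-the-envelope simulation makes the Go-Back-N cost visible. In this sketch every packet in the current window is transmitted; if one is lost, the receiver discards everything after the gap and the sender's window slides back to the lost packet. The function name and its parameters (`window`, `drop_once`, where each listed packet is lost exactly once) are illustrative choices for the sketch.

```python
def go_back_n(num_packets, window, drop_once):
    """Count total transmissions when each packet in drop_once is lost once."""
    drops = set(drop_once)
    base = 0                  # sequence number of the first unacknowledged packet
    transmissions = 0
    while base < num_packets:
        window_end = min(base + window, num_packets)
        first_loss = None
        for seq in range(base, window_end):
            transmissions += 1                 # every packet in flight is sent
            if first_loss is None and seq in drops:
                drops.discard(seq)             # this packet is lost (only once)
                first_loss = seq               # later in-flight packets are wasted
        # cumulative ACK: the window advances to the first gap, or past the window
        base = window_end if first_loss is None else first_loss
    return transmissions
```

With 5 packets, a window of 3, and packet 2 lost once, the loss costs a whole extra round of in-flight packets: 6 transmissions instead of the loss-free 5.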

The sliding window

  • TCP uses sliding window to realize flow control mechanism.
  • Sliding window is a flow-control technique. In the early days of network communication, both parties sent data without regard to the state of the network. Because nobody knew how congested the network was and everyone sent data at once, intermediate packet-switching nodes became blocked and dropped packets, and no one could get data through; the sliding-window mechanism was introduced to solve this problem.
  • TCP uses sliding windows for transmission control. The size of the sliding window indicates how much buffer space the receiver has available for incoming data, and the sender uses the window size to decide how many bytes it may send. When the window is 0, the sender generally can no longer send datagrams, with two exceptions: urgent data may still be sent, for example to let the user terminate a process running on the remote machine; and the sender may send a 1-byte datagram to prompt the receiver to re-announce the next byte it expects and its current window size.

Flow control

  • TCP uses sliding window to achieve flow control.
  • Flow control is used to control the transmission rate of the sender and ensure that the receiver receives the packet in time.
  • The window field in the acknowledgement packet sent by the receiver can be used to control the window size of the sender, thus affecting the sending rate of the sender. If the window field is set to 0, the sender cannot send data.

Congestion control

If, at some point, the demand for a resource in the network exceeds what the resource can supply, network performance deteriorates; this situation is called congestion. Congestion control prevents too much data from being injected into the network, so that the routers and links in the network are not overloaded. The premise of congestion control is that the network must be able to withstand the existing load. Congestion control is a global process involving all hosts, all routers, and every factor that can degrade the network's transmission performance. Flow control, by contrast, is point-to-point control of traffic, an end-to-end problem: all it does is suppress the sender's rate so that the receiver can keep up.

For congestion control, the TCP sender maintains a congestion Window (CWND) state variable. The size of the congestion control window depends on the degree of network congestion and changes dynamically. The sender makes its send window the smaller of the congestion window and the receiver’s receive window.

TCP congestion control adopts four algorithms: slow start, congestion avoidance, fast retransmission and fast recovery. At the network layer, routers can also adopt appropriate packet dropping policies (such as active queue management AQM) to reduce network congestion.

  • Slow start: the idea behind the slow-start algorithm is that when a host begins to send data, injecting a large number of bytes into the network immediately may cause congestion, because the network's load conditions are not yet known. Experience shows it is better to probe first, gradually enlarging the sending window from small to large; that is, gradually increasing the value of the congestion window. The initial value of cwnd is 1, and after each transmission round cwnd is doubled.
  • Congestion avoidance: the idea of the congestion-avoidance algorithm is to let the congestion window cwnd grow slowly, increasing cwnd by 1 for every round-trip time (RTT).
  • Fast retransmit and fast recovery: in TCP/IP, fast retransmit and recovery (FRR) is a congestion-control algorithm that quickly recovers lost packets. Without FRR, if a packet is lost, TCP uses a timer and pauses transmission; during the pause no new or duplicate packets are sent. With FRR, if the receiver receives a segment out of order, it immediately sends a duplicate acknowledgement to the sender; if the sender receives three duplicate acknowledgements, it assumes the segment the acknowledgements point to has been lost and retransmits it immediately. With FRR there is no delay caused by a transmission pause. FRR works most efficiently when individual packets are lost; it is not very efficient when multiple packets are lost within a short period.
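The interplay of slow start, congestion avoidance, and the reaction to a loss can be traced per transmission round with a small model. This sketch follows Reno-style fast recovery (on a loss signalled by three duplicate ACKs, ssthresh is halved and cwnd restarts from the new ssthresh); the parameter names and the default ssthresh of 8 are illustrative, not taken from any real implementation.

```python
def cwnd_trace(rounds, ssthresh=8, loss_rounds=()):
    """Trace the congestion window (in segments) across transmission rounds."""
    cwnd = 1                              # slow start begins at 1 segment
    trace = []
    for r in range(rounds):
        trace.append(cwnd)
        if r in loss_rounds:              # loss signalled by 3 duplicate ACKs
            ssthresh = max(cwnd // 2, 1)  # multiplicative decrease
            cwnd = ssthresh               # fast recovery (TCP Reno style)
        elif cwnd < ssthresh:
            cwnd *= 2                     # slow start: exponential growth
        else:
            cwnd += 1                     # congestion avoidance: +1 per RTT
    return trace
```

Without losses the trace shows the characteristic shape: doubling (1, 2, 4, 8) until ssthresh, then linear growth (9, 10, ...); a loss cuts the window roughly in half and growth resumes linearly from there.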

Typing a URL into the browser ->> the process until the home page is displayed (an interview favorite)

Baidu seems to like asking that question the most.

Opening a web page: which protocols are used during the whole process

Image source: Illustrated HTTP

Common status codes

Relationships between various protocols and HTTP

Interviewers usually ask questions like this to test your understanding of the computer network knowledge system.

Image source: Illustrated HTTP

Long and short HTTP connections

HTTP/1.0 uses short connections by default. That is, each time the client and server perform an HTTP operation, a connection is established and then closed when the task completes. When a client browser accesses an HTML page or another type of Web page that contains other Web resources (such as JavaScript files, image files, and CSS files), the browser re-establishes an HTTP session each time it encounters such a resource.

With HTTP/1.1, persistent connections are used by default to keep the connection alive. When the HTTP protocol works over a persistent connection, this line appears in the response header:

Connection: keep-alive

In the case of a persistent connection, when a web page is opened, the TCP connection between the client and the server for the transfer of HTTP data is not closed. When the client accesses the server again, it continues to use the established connection. Keep-alive does not stay connected permanently, it has a hold time, which can be set in different server software (such as Apache). To implement persistent connections, both the client and the server need to support persistent connections.
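The difference is observable with Python's standard library alone. The sketch below starts a throwaway local HTTP/1.1 server and sends two requests over one `http.client.HTTPConnection`, i.e. one underlying TCP connection, instead of opening a new connection per request (the handler class and its canned "ok" body are invented for the demo):

```python
import http.client
import http.server
import threading

class OkHandler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"       # HTTP/1.1 => keep-alive by default
    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # lets reuse work
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):       # silence request logging
        pass

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), OkHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# One persistent connection, two HTTP requests over it.
conn = http.client.HTTPConnection("127.0.0.1", port)
statuses = []
for _ in range(2):
    conn.request("GET", "/")
    resp = conn.getresponse()
    resp.read()                         # must drain the body before reusing
    statuses.append(resp.status)
conn.close()
server.shutdown()
```

Note the two requirements for reuse visible in the code: the server must delimit each response (here via Content-Length), and the client must read each body fully before issuing the next request on the same connection.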

The long and short connections of HTTP protocol are essentially the long and short connections of TCP protocol.

— What are HTTP Long and Short Connections?

In closing

Computer network FAQ review

  • ① TCP three-way handshake and four-way wave
  • ② Typing a URL in the browser ->> the process until the home page is displayed
  • ③ The difference between HTTP and HTTPS
  • ④ The differences between TCP and UDP
  • ⑤ Common status codes

Advice

I highly recommend reading Illustrated HTTP. It is not a long book, but it is very informative, whether you want a systematic understanding of networking or are just preparing for interviews. The articles below are for reference only. When I took this course in my sophomore year, the textbook we used was Computer Networks, 7th edition (edited by Xie Xiren). I don't recommend that you read it: it is very thick and quite theoretical, and I'm not sure you could read it through calmly.

Open Source Documentation Recommendation

Java-Guide: a guide covering the core knowledge most Java programmers need to master. It is being improved step by step, and your participation is welcome.

Github: github.com/Snailclimb/…

Reference:

Blog.csdn.net/qq_16209077…

Blog.csdn.net/zixiaomuwu/…

Blog.csdn.net/turn__back/…

If you are in full bloom, the breeze will come of its own accord. Welcome to follow my WeChat official account, "Java Interview Clearance Manual", an official account with warmth. It holds plenty of resources; reply with the keyword "1" and you may find what you're looking for!