This article is based on JavaGuide.

Common computer network architectures

The commonly discussed five- and seven-layer protocol architectures are used here.

1. Application Layer

  • Task: carry out specific network applications through interaction between application processes. (A process is a program running on a host.)

  • Application layer protocols include DNS and HTTP.

  • The data units that the application layer exchanges are called messages.

2. Transport Layer

  • Task: provide a general-purpose data transfer service for communication between processes on two hosts.

  • Main protocols:

    1. TCP: Connection-oriented, reliable data transfer service
    2. UDP: connectionless, best-effort data transfer service

3. Network Layer

  • Task: select suitable routes and switching nodes between networks so that data is delivered in a timely manner.

  • Protocol: IP (Internet Protocol)

TCP related

TCP three-way handshake and four-way wave

Before walking through the handshake itself, it is important to know why it is done, so let's first note its purpose.

Why a three-way handshake?

The purpose of the three-way handshake is to establish a reliable communication channel, that is, one over which data can be both sent and received. In other words, the three-way handshake lets both parties confirm that sending and receiving work on their own side and on the other side.

Three-way handshake process

The following table shows the three steps of the handshake and what each one confirms:

| Sender state | Data transfer | Receiver state | What is confirmed |
| --- | --- | --- | --- |
| SYN-SENT | >>> segment with the SYN flag >>> | SYN-RECEIVED | The receiver confirms that the sender can send and that it can itself receive |
| ESTABLISHED | <<< segment with the SYN and ACK flags <<< | SYN-RECEIVED | The sender confirms that sending and receiving work on both sides; the receiver still knows only that the sender can send and that it can itself receive |
| ESTABLISHED | >>> segment with the ACK flag >>> | ESTABLISHED | Both ends confirm that sending and receiving work on their own side and on the other side |

Why a four-way wave?

TCP is full-duplex, so each of the two directions has to be shut down separately. Either party can send a connection-release notice (FIN) once it has finished sending data. After the other party acknowledges it, the connection is in a half-closed state. When the other party also has no more data to send, it sends its own connection-release notice, and once that is acknowledged the TCP connection is completely closed.

Four-way wave process

| Active closer state | Data transfer | Passive closer state | What is confirmed |
| --- | --- | --- | --- |
| FIN-WAIT-1 | >>> FIN with a sequence number >>> | CLOSE-WAIT | The active closer shuts down data transfer from client to server |
| FIN-WAIT-2 | <<< ACK with acknowledgment number = received sequence number + 1 <<< | CLOSE-WAIT | |
| TIME-WAIT | <<< FIN <<< | LAST-ACK | The server shuts down data transfer from server to client |
| TIME-WAIT | >>> ACK with acknowledgment number = received sequence number + 1 >>> | CLOSED | Both ends confirm that both directions are closed |
| CLOSED | | | |
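
The state changes in the four-way wave can be sketched as a small lookup table. This is only an illustration: the event names ("send FIN", "recv ACK", "timer expires") and the `run` helper are made up for this example, the state names use the RFC-style hyphenated spelling, and the starting state ESTABLISHED is assumed for both sides.

```python
# (current state, event) -> next state. Event names are illustrative only.
TRANSITIONS = {
    # Active closer
    ("ESTABLISHED", "send FIN"): "FIN-WAIT-1",
    ("FIN-WAIT-1", "recv ACK"): "FIN-WAIT-2",
    ("FIN-WAIT-2", "recv FIN"): "TIME-WAIT",    # the final ACK is sent here
    ("TIME-WAIT", "timer expires"): "CLOSED",
    # Passive closer
    ("ESTABLISHED", "recv FIN"): "CLOSE-WAIT",  # an ACK is sent in reply
    ("CLOSE-WAIT", "send FIN"): "LAST-ACK",
    ("LAST-ACK", "recv ACK"): "CLOSED",
}


def run(start, events):
    """Apply the events in order and return every state visited."""
    state = start
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


active = run("ESTABLISHED", ["send FIN", "recv ACK", "recv FIN", "timer expires"])
passive = run("ESTABLISHED", ["recv FIN", "send FIN", "recv ACK"])
```

Here `active` walks through FIN-WAIT-1, FIN-WAIT-2 and TIME-WAIT before reaching CLOSED, while `passive` goes through CLOSE-WAIT and LAST-ACK.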

The difference between TCP and UDP

I think it is important to understand what really distinguishes the two. The beginning of the article briefly introduces both protocols in terms of their main characteristics, and the characteristics listed for one are usually exactly what the other lacks.

  • TCP: Connection-oriented, reliable data transfer service
  • UDP: connectionless, best-effort data transfer service

Beyond the labels "connection-oriented" and "reliable" themselves, what these properties lead to is also an important difference.

TCP is connection-oriented, so a three-way handshake is required to establish a connection before data transmission, and a four-way wave is required to release the connection afterwards. This connection management is part of what makes TCP a reliable data transfer service. In the three-way handshake and the four-way wave, we mentioned that segments carry various flags, such as SYN and ACK. The TCP header has to hold these flags together with sequence and acknowledgment numbers, so it is larger than the UDP header (at least 20 bytes versus 8), and each connection takes more resources to maintain. Why else is TCP reliable? More on that in the next section.

UDP does not establish a connection and has no acknowledgment or retransmission, so it is not as reliable as TCP. However, because it skips all of that, it needs less work and fewer resources, so its transmission efficiency is relatively higher.
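
To make the difference in header overhead concrete, here is a minimal sketch that packs a UDP header and an option-free TCP header with `struct`. The helper names `udp_header` and `tcp_header` and the port and sequence values are invented for this example, and the checksum fields are simply left at zero.

```python
import struct

SYN, ACK = 0x02, 0x10  # TCP flag bits


def udp_header(src_port, dst_port, payload_len):
    """UDP header: source port, destination port, length, checksum."""
    length = 8 + payload_len  # the length field counts header plus payload
    return struct.pack("!HHHH", src_port, dst_port, length, 0)


def tcp_header(src_port, dst_port, seq, ack, flags, window=65535):
    """TCP header without options: ports, seq, ack, offset, flags, window, checksum, urgent pointer."""
    data_offset = 5 << 4  # header length in 32-bit words, stored in the high 4 bits
    return struct.pack("!HHIIBBHHH", src_port, dst_port, seq, ack,
                       data_offset, flags, window, 0, 0)


udp = udp_header(5000, 53, payload_len=12)
tcp = tcp_header(5000, 80, seq=100, ack=301, flags=SYN | ACK)
print(len(udp), len(tcp))
```

The TCP header is more than twice the size of the UDP one even before any options are added, and on top of that each TCP endpoint has to keep per-connection state.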

How does TCP ensure reliable transmission?

I was asked this question again in my last interview and answered it poorly. Although I had memorized some of the technical terms, I was completely unfamiliar with how they apply in practice. I hope that understanding how each mechanism works will also make its applications clearer.

1. Checksum

The checksum is a check value computed over the transmitted data. Both the sender and the receiver compute it and compare the results. The calculation I have seen treats the segment as a sequence of 16-bit integers, adds them all up, adds any carry back into the low-order bits instead of discarding it, and finally takes the bitwise inverse of the sum. The result is the checksum.

If the two values are inconsistent, the data must have been corrupted in transit. The converse does not hold: a matching checksum does not guarantee that the data is correct.
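
The calculation described above can be sketched as follows. This is a simplified illustration: the function name `internet_checksum` is chosen for this example, and a real TCP checksum also covers a pseudo-header with the IP addresses, which is left out here.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, inverted at the end."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data to a whole 16-bit word
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


payload = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(payload)

# Receiver side: the data together with its checksum sums to zero.
verified = internet_checksum(payload + checksum.to_bytes(2, "big")) == 0

# Swapping two 16-bit words changes the data but not the checksum,
# which is why a matching checksum does not prove the data is correct.
swapped = bytes.fromhex("f2030001f4f5f6f7")
undetected = internet_checksum(swapped) == checksum
```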

2. Acknowledgments and sequence numbers

Each segment sent by the sender carries a sequence number (seq). After receiving it, the receiver replies with an acknowledgment number (ack) that names the next byte it expects: the received sequence number plus the length of the data received. A SYN or FIN counts as one, which is where the "+1" in the handshake comes from.

Sequence numbers also let the receiver put data back in order and discard duplicates, which is another part of TCP's reliability.
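
A minimal sketch of how sequence numbers allow reordering and duplicate removal is shown below. The function `reassemble` and the sample segments are made up for this example, and it assumes segments never partially overlap, which real TCP has to handle.

```python
def reassemble(segments, initial_seq):
    """Rebuild the byte stream from (seq, payload) pairs given in arrival order.

    Returns the in-order data and the next expected sequence number,
    which is the value the receiver would send back as its ack.
    """
    buffered = {}            # out-of-order segments waiting for a gap to fill
    expected = initial_seq
    data = bytearray()
    for seq, payload in segments:
        if seq < expected or seq in buffered:
            continue         # duplicate: already delivered or already buffered
        buffered[seq] = payload
        while expected in buffered:
            chunk = buffered.pop(expected)
            data += chunk
            expected += len(chunk)
    return bytes(data), expected


arrivals = [(105, b"world"), (100, b"hello"), (100, b"hello"), (110, b"!")]
stream, next_ack = reassemble(arrivals, initial_seq=100)
```

The segment at 105 arrives first and waits in the buffer until the one at 100 shows up, and the second copy of 100 is dropped.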

3. Timeout retransmission

After sending data, the sender waits for a while for a reply segment with the ACK flag. If no ACK arrives within that time, the data that was just sent is retransmitted.
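
The retry loop can be sketched like this. It is a toy model: `send_reliably`, `drop_first` and the `transmit` callback are hypothetical names, the callback stands in for "send and wait for an ACK until the timer fires", and the adaptive timeout calculation used by real TCP is not modeled.

```python
def send_reliably(payload, transmit, max_attempts=5):
    """Send until an ACK arrives; return the number of attempts used.

    transmit(payload) must return True if an ACK came back before the
    timeout and False otherwise.
    """
    for attempt in range(1, max_attempts + 1):
        if transmit(payload):
            return attempt
    raise TimeoutError(f"no ACK after {max_attempts} attempts")


def drop_first(n):
    """Build a fake channel that loses the first n transmissions."""
    remaining = [n]

    def transmit(_payload):
        if remaining[0] > 0:
            remaining[0] -= 1
            return False  # the segment or its ACK was lost
        return True

    return transmit


attempts = send_reliably(b"data", drop_first(2))
```

With two losses the data gets through on the third attempt. If the channel loses more transmissions than `max_attempts` allows, the sender gives up with a `TimeoutError`.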

4. Connection management

The three-way handshake and the four-way wave. A reliably established connection is the precondition for reliable transmission.

5. Flow control

Each side of a TCP connection has a fixed amount of buffer space. The receiver only allows the sender to send as much data as the receive buffer can accept. If the receiver cannot process the sender's data fast enough, the sender is told to lower its sending rate to prevent packet loss. In other words, TCP sets the sender's transmission speed according to the receiver's processing capacity. This mechanism is called flow control. The flow control protocol TCP uses is a variable-size sliding window protocol.

If the advertised window size is 0, the sender stops sending data and periodically sends a window probe to the receiver. Once the window size recovers, it resumes sending.
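
The interaction between the receive buffer and the advertised window can be sketched with a small tick-based simulation. Everything here is an assumption made for illustration: the function `simulate_flow_control`, the idea of discrete ticks, and the `consume_schedule` list that says how many bytes the receiving application reads after each tick.

```python
def simulate_flow_control(total_bytes, buffer_size, consume_schedule):
    """Return a list of (advertised window, bytes sent) pairs, one per tick."""
    buffered = 0   # bytes sitting in the receive buffer
    sent = 0
    log = []
    for consumed in consume_schedule:
        if sent >= total_bytes:
            break
        window = buffer_size - buffered        # what the receiver advertises
        chunk = min(window, total_bytes - sent)
        sent += chunk                          # chunk == 0 models a window probe
        buffered += chunk
        log.append((window, chunk))
        buffered -= min(buffered, consumed)    # the application reads some data
    return log


log = simulate_flow_control(total_bytes=6, buffer_size=4, consume_schedule=[0, 2, 4])
```

In this run the first tick fills the buffer, the second tick sees a zero window so nothing is sent, and the third tick resumes once the application has read two bytes.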

6. Congestion control

When a lot of data is being sent, the network can become congested. If senders keep pushing large amounts of data, the congestion gets worse and many packets are lost. The resulting flood of timeout retransmissions can then seriously degrade transmission.

//TODO: Details

  1. Slow start
  2. Congestion avoidance
  3. Fast retransmission and fast recovery
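
As a rough illustration of the first two items, the sketch below traces a congestion window (cwnd) using the simplified textbook model: it doubles every round during slow start until it reaches the threshold (ssthresh), grows by one per round during congestion avoidance, and falls back to 1 with a halved threshold after a timeout. The function `cwnd_trace` and its parameters are invented for this example, the unit is whole segments, and fast retransmission and fast recovery are not modeled.

```python
def cwnd_trace(rounds, ssthresh, loss_rounds=()):
    """Return the congestion window used in each transmission round."""
    cwnd = 1
    trace = []
    for rnd in range(1, rounds + 1):
        trace.append(cwnd)
        if rnd in loss_rounds:
            ssthresh = max(cwnd // 2, 2)      # timeout: halve the threshold
            cwnd = 1                          # and restart slow start
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)    # slow start: exponential growth
        else:
            cwnd += 1                         # congestion avoidance: linear growth
    return trace


trace = cwnd_trace(rounds=8, ssthresh=8, loss_rounds={6})
```

Here the window grows 1, 2, 4, 8 in slow start, then 9 and 10 in congestion avoidance, and drops back to 1 after the loss in round 6.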

What happens from URL input to page presentation?

  1. Obtain the IP address by resolving the domain name

  2. The three-way handshake establishes a TCP connection

  3. Sending an HTTP request

  4. The server processes the request and returns an HTTP response

  5. The browser parses the response and renders the page

  6. Four waves end the connection
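
The network-facing steps above can be sketched with plain sockets. This is a minimal illustration under several assumptions: it handles only unencrypted `http://` URLs, the helper names `build_request` and `fetch` are made up, the request sends `Connection: close` so the server ends the response by closing, and the rendering step is not modeled at all.

```python
import socket
from urllib.parse import urlsplit


def build_request(url):
    """Split the URL and build the raw bytes of a simple GET request."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    host_header = host if port == 80 else f"{host}:{port}"
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return host, port, request


def fetch(url, timeout=5.0):
    """Run steps 1-4 and 6 for one URL and return the raw response bytes."""
    host, port, request = build_request(url)
    # Step 1: resolve the domain name to an IP address.
    ip = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)[0][4][0]
    # Step 2: connecting performs the TCP three-way handshake.
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        sock.sendall(request)            # Step 3: send the HTTP request.
        chunks = []
        while True:                      # Step 4: read the HTTP response.
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Step 6: leaving the with-block closes the socket and the connection.
    return b"".join(chunks)
```

A call such as `fetch("http://example.com/")` needs network access and returns the status line, headers and body as one byte string.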

HTTP related

HTTP long and short connections

This is also one of the differences between HTTP/1.0 and HTTP/1.1.

HTTP/1.0 uses short connections by default. That is, each time the client and server perform an HTTP operation, a connection is established, and it is torn down when the operation finishes.

From HTTP/1.1 onward, long connections are the default. With a long connection, the client and server keep the TCP connection used to transfer HTTP data open and reuse it for later requests. The connection is not kept forever: it has a keep-alive timeout that can be configured in the server software. Both the client and the server must support long connections.

In essence, HTTP long and short connections are TCP long and short connections.
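
To see a long connection being reused, the sketch below starts a throwaway HTTP/1.1 server on the local machine and sends two requests through a single `http.client.HTTPConnection`. The handler class `HelloHandler` and the function `two_requests_one_connection` are invented for this example. The client's local port is recorded after each request, and the same port both times indicates that the same TCP connection carried both.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep the connection open

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # required for keep-alive
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def two_requests_one_connection():
    """Send two GETs over one connection and return the local port used each time."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    local_ports = []
    try:
        for _ in range(2):
            conn.request("GET", "/")
            response = conn.getresponse()
            response.read()  # finish reading before reusing the connection
            local_ports.append(conn.sock.getsockname()[1])
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return local_ports


ports = two_requests_one_connection()
```

With a short connection, each request would need a fresh TCP connection and therefore its own handshake and teardown.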

HTTP is a stateless protocol. How is user state saved?

HTTP does not save state, that is, it is stateless. In other words, the protocol itself keeps no record of earlier requests and responses. So how is user state saved? Take the shopping cart as an example of user state: when an item is added to the cart, how does the system know which user added it?

Sessions are designed to solve this problem. A Session records user state on the server side. The server creates a Session for each user, uses it to identify that user, and keeps track of the user through it. Sessions generally have a time limit and are destroyed once it expires.

So how does the server store sessions? The most common options are in memory and in a database.

And how does the server keep track of which session belongs to which client? In most cases, this is done by putting a session ID in a cookie. Cookies are stored on the client.
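
The pieces described above (a server-side store, a time limit, and a session ID carried in a cookie) can be sketched as follows. This is an in-memory illustration only: the class `SessionStore`, the cookie name `session_id`, the helper functions and the 30-minute default are all assumptions made for this example.

```python
import secrets
import time
from http.cookies import SimpleCookie


class SessionStore:
    """Keeps session data in memory and expires it after ttl_seconds."""

    def __init__(self, ttl_seconds=1800, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        self._sessions = {}  # session ID -> (expiry time, data dict)

    def create(self):
        session_id = secrets.token_hex(16)  # random, hard-to-guess ID
        self._sessions[session_id] = (self._clock() + self._ttl, {})
        return session_id

    def get(self, session_id):
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            del self._sessions[session_id]  # time limit reached: destroy it
            return None
        return data


def set_cookie_header(session_id):
    """Value for a Set-Cookie response header carrying the session ID."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["httponly"] = True
    return cookie["session_id"].OutputString()


def session_id_from_cookie(cookie_header):
    """Extract the session ID from the Cookie request header, if present."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    return morsel.value if morsel is not None else None
```

On the first request the server would call `create()` and send the ID back with `set_cookie_header`. On later requests it reads the ID with `session_id_from_cookie` and looks up the user's data, such as the cart, with `get()`.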

What if cookies are disabled? The most common approach mentioned in JavaGuide is to append the session ID directly to the URL. I don't think that is safe. Please do more research for the details.