Today’s content may feel dry, but it is all carefully organized fundamentals. The road to improving one’s skills is never smooth sailing. Go for it!!

To better promote the research and development of the Internet, the International Organization for Standardization (ISO) developed a seven-layer reference model for network interconnection, called the Open Systems Interconnection Reference Model (OSI/RM). The OSI reference model is thus an architecture with seven layers. The hardware and software elements involved in sending and receiving messages are called entities; each OSI layer contains multiple entities, and entities at the same layer on different systems are called peer entities. Other layered architectures exist as well: the TCP/IP architecture, and the frequently discussed five-layer architecture, as follows:

  • OSI’s seven-layer protocol architecture is conceptually clearer and more complete, but it is rarely used in practice because of its complexity
  • TCP/IP is a four-layer architecture: application layer, transport layer, internet layer, and network interface layer; however, the network interface layer has little actual content
  • In practice, therefore, a compromise is usually taken: a five-layer protocol architecture that combines the advantages of OSI and TCP/IP

1. The Main Functions of the OSI Layers

1.1 Physical Layer

In the OSI reference model, the physical layer is the lowest layer.

The main function of the physical layer is to use the transmission medium to provide a physical connection for the data link layer and to realize transparent transmission of the bit stream. Its data unit is the bit.

The physical layer realizes transparent transmission of bit streams between adjacent nodes and shields, as far as possible, the differences between specific transmission media and physical devices, so that the data link layer above it need not consider what the network’s transmission medium actually is. “Transparent transmission of the bit stream” means the bit stream carried by the actual circuit is not altered. The physical layer is responsible for passing bits between computers and defines the rules for transferring bit streams over the physical medium, for example how a cable connects to the network card. In a nutshell, it has one more function: to provide the mechanical, electrical, functional, and procedural characteristics needed to establish, maintain, and release physical connections.

The physical interface standard defines the boundary and interface between the physical layer and the physical transmission medium. Physical interfaces have four characteristics: mechanical, electrical, functional, and procedural.

  • Mechanical characteristics: define the shape and size of the pluggable connectors used for the physical connection, the number and arrangement of pins in the connector, and so on.

  • Electrical characteristics: specify the signal levels, impedance and impedance matching, transmission rate, and distance limits when transmitting a binary bit stream over the physical connection.

  • Functional characteristics: define the function assigned to each signal line on a physical interface. Signal lines are classified as data lines, control lines, timing lines, and ground lines.

  • Procedural characteristics: define the set of operating procedures, that is, the rules and timing for each signal line, used to transmit the binary bit stream.

1.2 Data Link Layer

In the OSI reference model, the data link layer is the second layer.

Building on the services provided by the physical layer, the data link layer provides services to the network layer. Its most basic service is to reliably transmit data from the network layer to the network layer of an adjacent node.

The data link layer has two main functions: framing and error control. Framing means defining frames that carry control information such as bit synchronization, source address, and destination address alongside the data.

The most basic function of the data link layer is to provide transparent and reliable basic data services to its users. Transparency means there are no restrictions on the content, format, or encoding of the data transmitted over this layer, and no need to interpret the meaning of the information. Reliable transmission saves users from worrying about lost information, interference, and out-of-order delivery; these problems can occur at the physical layer, so error-detecting and error-correcting codes must be used at the data link layer to detect and correct them. The data link layer thus strengthens the physical layer’s raw bit-stream transmission: it turns the physical connection provided by the physical layer into a logical, error-free data link, making it appear as an error-free line to the network layer.

1.3 Network Layer

The layer above the network layer is the transport layer, and the layer below it is the data link layer. Through routing algorithms, the network layer selects an appropriate transmission path for IP packets from the source host to the destination host, providing the transport layer with services for end-to-end data transmission.

The network layer is responsible for communication between adjacent computers. Its functions are:

  • Processing packet-send requests from the transport layer: on receiving a request, load the data into an IP datagram, fill in the header, select a path to the destination host, and hand the datagram to the appropriate network interface.

  • Processing incoming datagrams: first check their validity, then route them; if a datagram has reached the destination host, remove the header and hand the rest to the appropriate transport protocol; if it has not yet reached its destination, forward it.

  • Handle problems such as path, flow control, and congestion.

The protocol at the network layer is IP, which comes in two versions: IPv4 and IPv6.

Features of the IP protocol:

  • Connectionless: IP is a connectionless, unreliable packet transport protocol that provides only best-effort service; after an IP packet has been sent, IP maintains no state information about it.
  • Unreliable: IP does not guarantee that every IP packet arrives at the destination host correctly, without loss, and in order.
  • Point-to-point: IP is a point-to-point network-layer communication protocol.
  • IP transfers data point to point: from the source host to a router, from router to router, and from the last router to the destination host. It finds a path, usually through multiple routers, between the two communicating hosts, and it shields the upper layers from the differences in data-link-layer and physical-layer protocols and technologies among the underlying networks.

The following is an analysis of IPv6, which is now widely deployed:

IPv6 packet structure (you can grab a pen and sketch the IPv6 header as you read)

  • Version: Indicates the protocol version. The IPv6 value is 6

  • Traffic class: mainly used for QoS

  • Flow label: Identifies packets in the same flow

  • Payload length: Indicates the number of bytes after the IPv6 packet header, including the extended header

  • Next header: indicates the type of the header that follows this one. If there is an extension header, it gives the type of the first extension header; otherwise it gives the type of the upper-layer protocol. This field is the core mechanism behind many IPv6 features

  • Hop limit: similar to the TTL in IPv4; the count decreases by one at each hop, and the packet is discarded when it reaches zero

  • Source Address: Identifies the source address of the packet

  • Destination Address: Indicates the destination address of the packet
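To make the field layout above concrete, here is a minimal Python sketch (my own addition, not from the original article; the helper names are hypothetical) that packs and parses the 40-byte IPv6 fixed header with the standard `struct` module:

```python
import struct

def build_ipv6_header(payload_len, next_header, hop_limit, src, dst,
                      traffic_class=0, flow_label=0):
    """Pack the 40-byte IPv6 fixed header described above.
    src and dst are 16-byte address values (bytes objects)."""
    # Version (6), Traffic Class, and Flow Label share the first 32 bits.
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    return struct.pack("!IHBB16s16s", first_word, payload_len,
                       next_header, hop_limit, src, dst)

def parse_ipv6_header(data):
    """Unpack the fixed header fields from the first 40 bytes of a packet."""
    first_word, payload_len, next_hdr, hop_limit, src, dst = \
        struct.unpack("!IHBB16s16s", data[:40])
    return {
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_len,
        "next_header": next_hdr,   # e.g. 6 = TCP, 17 = UDP
        "hop_limit": hop_limit,
        "src": src,
        "dst": dst,
    }
```

Note how the `!` (network byte order) format string mirrors the header layout: one 32-bit word for version/traffic class/flow label, then the 16-bit payload length, two 8-bit fields, and the two 128-bit addresses.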

1.4 Transport Layer

The transport layer is the focus of this article; it gets its own detailed section below, so we will not expand on it here.

1.5 Session Layer

The session layer defines how to start, control, and end a session, including the control and management of multiple bidirectional messages, so that the application can be notified even when only part of a continuous message has completed; this allows the presentation layer to see a continuous stream of data. In some cases, once the presentation layer has received all of the data, the data is handed over on the presentation layer’s behalf. Examples: RPC, SQL, etc.

Functions: responsible for establishing and releasing communication connections (the logical paths over which data flows), deciding how to divide the data to be transferred, and other management related to data transfer.

1.6 Presentation Layer

The presentation layer converts information produced by the application into a format suitable for network transmission, or converts data from the layer below into a format the layer above can process. It is mainly responsible for data format conversion, ensuring that application-layer information from one system can be read by the application layer of another system.

Concretely, it converts a device’s native data format into the network’s standard transmission format. Different devices may interpret the same bit stream differently, so the presentation layer’s primary responsibility is to keep these interpretations consistent.

1.7 Application Layer

The application layer serves applications and specifies the communication details within them. Its protocols include:

  • **HTTP**: a basic client/server access protocol; the browser sends a request to the server, and the server responds with the corresponding web page

  • **File Transfer Protocol (FTP)**: provides interactive access; based on the client/server model; connection-oriented, using TCP’s reliable transport service. Main benefit: it reduces file incompatibility across different operating systems

  • **Remote login protocol (TELNET)**: client/server model; adapts to the differences among many computers and operating systems by means of the Network Virtual Terminal (NVT)

  • **Simple Mail Transfer Protocol (SMTP)**: client/server model; connection-oriented. Basic functions: composing, sending, reporting delivery status, displaying mail, and handling mail for the recipient

  • **DNS**: an Internet service that translates domain names into IP addresses

  • **Trivial File Transfer Protocol (TFTP)**: client/server model; uses UDP datagrams; supports only file transfer, not interaction; the TFTP code has a small memory footprint

  • **Simple Network Management Protocol (SNMP)**: the SNMP model has four components: managed nodes, a management station, management information, and the management protocol

2. Transport Layer: TCP

TCP/IP means TCP and IP working together. TCP handles the communication between application software and network software, while IP handles the communication between computers. TCP splits the data into packets and reassembles them when they arrive; IP is responsible for delivering each packet to the recipient.

2.1 TCP Packet Structure

  • Source port: Identifies different application processes on the same computer
  • Destination port: identifies the receiving application on the destination machine

The source and destination port numbers in the TCP header and the source and destination IP addresses in the IP packet uniquely determine a TCP connection.

  • Sequence number and acknowledgment number: the key to TCP’s reliable transmission. The sequence number is the number of the first byte of data carried in this segment; in a TCP stream, every byte has a sequence number. For example, if a segment’s sequence number is 300 and it carries 100 bytes of data, the next segment’s sequence number is 400. Sequence numbers therefore preserve the order of TCP transmission. The acknowledgment number (ACK) indicates the next byte number expected and implies that all data before that number has been received correctly. It is valid only when the ACK flag is 1; for example, in the SYN segment sent when establishing a connection, the ACK bit is 0.

  • Data offset/header length: 4 bits. Because the header may contain options, the TCP header length is variable; with no options it is 20 bytes. The 4-bit field can be at most 1111 (15 in decimal), and its unit is 32-bit words, so the maximum header length is 15 * 32 / 8 = 60 bytes. The header length is also called the data offset because it gives the starting offset of the data area within the segment.

  • Reserved: Reserved for defining new uses in the future, now set to 0.

  • Control bits: URG, ACK, PSH, RST, SYN, FIN: six flag bits in total, each indicating one control function.

  1. URG: urgent-pointer flag. If 1, the urgent pointer is valid; if 0, the urgent pointer is ignored.

  2. ACK: indicates the acknowledgement number. If the value is 1, the acknowledgement number is valid. If the value is 0, the packet contains no acknowledgement information and the acknowledgement number field is ignored.

  3. PSH: push flag. A value of 1 indicates data with push flag, indicating that after receiving the packet segment, the receiver should send the packet segment to the application program as soon as possible instead of queuing in the buffer.

  4. RST: Reset connection flag, used to reset a faulty connection due to a host crash or other reasons. It can also be used to reject invalid message segments and connection requests.

  5. SYN: synchronization flag, used to establish a connection. In a connection request, SYN=1 and ACK=0 indicate that the segment carries no piggybacked acknowledgment; the connection reply does carry an acknowledgment, with SYN=1 and ACK=1.

  6. FIN: Finish flag, used to release the connection. If the value is 1, the sender has no data to send, that is, the local data flow is closed.

  • Window: the sliding-window size, used to tell the sender how much buffer space the receiver has, thereby controlling the sender’s transmission rate and achieving flow control. The window is a 16-bit field, so the maximum window size is 65535.

  • Checksum: computed over the entire TCP segment, including the TCP header and TCP data, in 16-bit words. It is calculated and stored by the sender and verified by the receiver.

  • Urgent pointer: valid only when the URG flag is 1. The urgent pointer is a positive offset that, added to the value in the sequence-number field, gives the sequence number of the last byte of urgent data. TCP’s urgent mode is a way for the sender to deliver urgent data to the other end.

  • Options and padding: the most common option is the Maximum Segment Size (MSS). Each end specifies it in the first segment of the communication (the segment with the SYN flag set that establishes the connection), announcing the maximum segment length it can accept. Because the option length need not be a multiple of 32 bits, padding bits (extra zeros) are appended to ensure the TCP header is a multiple of 32 bits.

  • Data: the data part of a TCP segment is optional and carries the actual payload. When a connection is established or released, only TCP headers are exchanged. A header without any data is also used when one side has nothing to send but must acknowledge received data, and in many timeout cases a segment without data is sent as well.
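The header fields above can be exercised with a small Python sketch using the standard `struct` module. This is my own illustration, not part of the original article, and the helper names are hypothetical; it packs a 20-byte option-free TCP header (data offset = 5 words) and parses it back:

```python
import struct

# Flag bit positions within the six control bits (URG ACK PSH RST SYN FIN).
FLAGS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
         "RST": 0x04, "SYN": 0x02, "FIN": 0x01}

def build_tcp_header(src_port, dst_port, seq, ack, flags=0,
                     window=65535, checksum=0, urgent=0):
    """Pack a 20-byte TCP header with no options (data offset = 5 words)."""
    offset_flags = (5 << 12) | flags   # data offset lives in the top 4 bits
    return struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                       offset_flags, window, checksum, urgent)

def parse_tcp_header(data):
    """Unpack the fixed fields from the first 20 bytes of a TCP segment."""
    (src, dst, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", data[:20])
    return {
        "src_port": src, "dst_port": dst,
        "seq": seq, "ack": ack,
        "header_len": (offset_flags >> 12) * 4,   # in bytes, 20..60
        "flags": {name: bool(offset_flags & bit)
                  for name, bit in FLAGS.items()},
        "window": window, "checksum": checksum, "urgent_ptr": urgent,
    }
```

For instance, a SYN segment with sequence number 300 would be built with `build_tcp_header(12345, 80, 300, 0, flags=FLAGS["SYN"])`, and `parse_tcp_header` would report a 20-byte header with only the SYN flag set.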

2.2 TCP Reliability

TCP is often described as a reliable transport. But what makes TCP reliable? How many mechanisms does it use to ensure reliable transmission?

  • Acknowledgment mechanism

  • Timeout retransmission mechanism

  • Flow control

  • Congestion control

2.2.1 Acknowledgment Mechanism

In TCP, every byte of data is numbered; this number is the sequence number.

Analyzing the diagram above:

When host 1 sends bytes 1–1000 to host 2, host 2, on receiving them, replies with an ACK segment (each ACK carries a corresponding acknowledgment number), indicating that bytes 1–1000 have been received and that the next transmission should start from byte 1001. When host 1 receives this reply, it knows the other side has received all of bytes 1–1000, so the next time it sends, it starts from byte 1001, and so on.

Of course, when host 1 sends data to host 2, it may also happen that host 1 receives no reply from host 2 within a certain period. To ensure that data truly arrives in such cases, TCP has a timeout retransmission mechanism.
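The cumulative-ACK rule described above ("acknowledge the next byte expected") can be sketched in a few lines of Python; `next_expected` is a hypothetical helper I am adding for illustration, not real TCP code:

```python
def next_expected(segments):
    """Given received in-order segments as (first_seq, length) pairs,
    return the cumulative ACK a receiver would send: the number of the
    next byte it expects (everything before it has arrived)."""
    expected = segments[0][0]
    for first_seq, length in segments:
        if first_seq != expected:   # a gap: stop at the first missing byte
            break
        expected = first_seq + length
    return expected
```

So after receiving bytes 1–1000 the receiver acknowledges 1001; if the segment starting at 1001 is missing but a later one arrived, it still acknowledges only 1001.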

2.2.2 Timeout Retransmission Mechanism

If host 1 does not receive an acknowledgement from host 2, the following conditions occur:

  1. The data is never delivered to host 2, so host 2 does not send back an acknowledgement.

According to the figure, host 1 sends data to host 2. The data may not reach host 2 due to other reasons (such as network congestion). After waiting for a certain amount of time, host 1 finds that host 2 has not responded and sends the previous data again.

  2. Host 2 receives the data and sends an acknowledgment packet, but the packet is lost.

According to the figure, host 2 receives the data from host 1, but the acknowledgment it sends back fails to reach host 1 in time. Host 1, having received no acknowledgment, therefore sends the data again, even though host 2 already received it the first time. As a result, host 2 receives a lot of duplicate data, which it clearly does not need, so TCP must be able to recognize duplicates and discard them. Since every byte of data has its own sequence number, duplicate data necessarily means multiple segments carrying the same sequence numbers, so segments with repeated sequence numbers can be identified as duplicates and dropped.

2.2.3 Flow control

The receiving end can process data only so fast. If the sender transmits too quickly, the receiver’s buffer fills up; if the sender keeps transmitting, packets are lost, triggering a chain reaction of loss and retransmission. TCP therefore adjusts the sender’s transmission rate according to the receiver’s processing capacity; this mechanism is called flow control.

The receiver puts the buffer size it can accept into the “window size” field in the TCP header and informs the sender through ACK segments. The larger the window field, the higher the network throughput. When the receiver finds its buffer nearly full, it sets the window to a smaller value to notify the sender, and on receiving the new window size the sender slows down. If the receive buffer is completely full, the window is set to 0 and the sender stops sending data; however, the sender must then periodically send window-probe segments so that the receiver can report its window size again.

How does the receiver tell the sender the window size? As shown in the TCP packet structure above, the TCP header contains a 16-bit window-size field, which carries this information. In addition, the TCP header options (up to 40 bytes) can include a window scale factor M; the actual window size is the value of the window field shifted left by M bits.
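As a small worked example of the window scale factor just described (assuming the RFC 7323 rule that the advertised 16-bit window is shifted left by M, with M at most 14):

```python
def effective_window(window_field, scale_factor):
    """Effective receive window in bytes: the advertised 16-bit window
    shifted left by the scale factor M negotiated in the TCP options.
    RFC 7323 caps M at 14."""
    assert 0 <= window_field <= 0xFFFF and 0 <= scale_factor <= 14
    return window_field << scale_factor
```

With no scaling (M = 0) the maximum window stays at 65535 bytes; an advertised window of 1000 with M = 7 means an effective window of 128000 bytes.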

2.2.4 Congestion Control

The difference between congestion control and flow control: congestion control prevents too much data from being injected into the network, so that the routers and links in the network are not overloaded. Flow control is point-to-point control of traffic, an end-to-end problem: it throttles the sender’s rate so that the receiver has time to receive.

Congestion control mechanisms

  • Slow start

“Slow” in slow start does not mean that cwnd grows slowly (in fact it grows exponentially); it means that TCP starts sending with cwnd = 1.

Rather than sending a large amount of data at the outset, the sender first probes the network’s level of congestion. For simplicity, the slow-start algorithm is illustrated here with the congestion window measured in segments; in reality the congestion window is measured in bytes.

To prevent network congestion caused by excessive cwnd growth, a slow-start threshold (the ssthresh state variable) is set: when cwnd < ssthresh, use the slow-start algorithm; when cwnd = ssthresh, either slow start or congestion avoidance may be used; when cwnd > ssthresh, use the congestion-avoidance algorithm.

  • Congestion avoidance

Congestion avoidance does not completely prevent congestion; it means that during the congestion-avoidance phase the congestion window grows according to a linear law, making the network less prone to congestion.

The congestion window cwnd grows slowly: each round-trip time (RTT), the sender’s congestion window increases by one.

Whether in the slow-start or the congestion-avoidance phase, whenever the sender judges that the network is congested (the basis being a missing acknowledgment; the loss could have another cause, but since the sender cannot tell, it treats it as congestion), it sets the slow-start threshold to half of the sending window size at the moment congestion appeared, then resets the congestion window to 1 and runs the slow-start algorithm.

  • Fast retransmission

Fast retransmission requires the receiver to send a duplicate acknowledgment as soon as it receives an out-of-order segment (so the sender learns early that a segment has not arrived), rather than waiting to piggyback an acknowledgment on its own outgoing data. The fast-retransmission algorithm says that as soon as the sender receives three duplicate acknowledgments in a row, it should immediately retransmit the missing segment, without waiting for the retransmission timer to expire.

  • Fast Recovery (used with fast retransmission)

  • Fast recovery works as follows:

  1. With fast recovery, the slow-start algorithm is used only when a TCP connection is first established or when the network times out.
  2. When the sender receives three duplicate acknowledgments in a row, it performs multiplicative decrease, halving ssthresh.
  3. The sender then assumes the network is probably not congested, reasoning that it would not be receiving multiple duplicate acknowledgments if it were. So instead of running slow start, it sets cwnd to the new ssthresh and runs the congestion-avoidance algorithm.
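The interplay of slow start, congestion avoidance, and the two loss reactions can be sketched with a toy Python simulation. This is a textbook-level approximation I am adding for illustration (real TCP grows cwnd per ACK rather than per round, and measures it in bytes):

```python
def simulate_cwnd(rounds, ssthresh=16, loss_events=()):
    """Trace cwnd (in segments) per RTT round: exponential growth below
    ssthresh, linear growth at or above it. loss_events maps a round
    number to 'timeout' (back to slow start, cwnd = 1) or 'dupack'
    (fast recovery: halve ssthresh, set cwnd to the new ssthresh)."""
    loss_events = dict(loss_events)
    cwnd, trace = 1, []
    for r in range(rounds):
        trace.append(cwnd)
        event = loss_events.get(r)
        if event == "timeout":          # retransmission timer expired
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1
        elif event == "dupack":         # three duplicate ACKs arrived
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh             # fast recovery skips slow start
        elif cwnd < ssthresh:
            cwnd *= 2                   # slow start: double every RTT
        else:
            cwnd += 1                   # congestion avoidance: +1 per RTT
    return trace
```

With the default ssthresh of 16, the trace begins 1, 2, 4, 8, 16 and then grows by 1 per round; a duplicate-ACK event drops cwnd to half its value, while a timeout sends it all the way back to 1.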

2.3 TCP Connection and Release

TCP connection establishment

  1. The process by which TCP establishes a connection is called a handshake.
  2. During the handshake, the client and the server exchange three TCP segments, hence the name three-way handshake.
  3. The three-way handshake mainly prevents an already-invalid connection-request segment from suddenly reaching the server and being acted on, thereby ensuring reliable transmission.

The following is a schematic of the TCP three-way handshake, and to explain it, I suggest you take out a piece of paper and draw the structure.

TCP connection release

  1. At the end of the data transfer, both sides of the communication can release the connection.
  2. Releasing a TCP connection requires a four-way handshake (four waves).

The following is a schematic of the TCP four-way handshake, and to explain it, I suggest you take out a piece of paper and draw the structure.

TCP’s three-way handshake and four-way wave
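In everyday programming the handshakes are invisible: the kernel performs the three-way handshake inside `connect()`/`accept()` and the four-way release inside `close()`. A minimal self-contained Python sketch over the loopback interface:

```python
import socket

# A listening socket on loopback stands in for the server.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

# connect() performs the three-way handshake (SYN, SYN+ACK, ACK) for us.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))
conn, addr = server.accept()         # returns once the handshake completes

client.sendall(b"hello")
data = conn.recv(5)                  # the bytes sent over the connection

# close() on each side drives the FIN/ACK exchanges of the four-way release.
client.close()
conn.close()
server.close()
```

Running a packet capture (e.g. tcpdump on the loopback interface) while this runs would show exactly the SYN/SYN+ACK/ACK and FIN/ACK sequences described in this section.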

2.4 Can You Answer These?

2.4.1 Why is it a three-way handshake to establish a TCP connection and a four-way handshake to close a TCP connection?

When the server receives a SYN request from a client, it can send the SYN and the ACK in a single SYN+ACK segment (the ACK acknowledges, the SYN synchronizes). But when the server receives a FIN, it may not be able to close its socket immediately, so it first replies with just an ACK, in effect saying, “I received the FIN you sent.” Only after the server has finished sending all of its remaining data can it send its own FIN. The ACK and FIN therefore cannot be combined into one segment, which is why four waves are needed.

2.4.2 Can DATA be Carried during a three-way handshake?

In fact, the third handshake can carry data; the first and second handshakes, however, cannot.

Why is that? Consider what could happen if the first handshake were allowed to carry data: an attacker could stuff a large amount of data into every SYN segment, not caring whether the server receives or responds to SYN packets properly, and then madly resend such SYN segments, costing the server a great deal of time and memory to receive them.

In other words, data must not be carried on the first handshake for one simple reason: it would make the server more vulnerable to attack. By the third handshake, the client is already in the ESTABLISHED state; as far as the client is concerned, the connection is established and it already knows that the server’s receiving and sending capabilities are normal, so carrying data poses no problem.

2.4.3 Why does TIME_WAIT return to CLOSED only after 2MSL?

Although both parties have agreed to close the connection and all four segments have been coordinated and sent, so that logically the connection could return to the CLOSED state (just as SYN_SEND leads to ESTABLISHED), the network is unreliable: there is no guarantee that the final ACK will be received by the peer. The peer’s socket in the LAST_ACK state may therefore time out waiting for that ACK and retransmit its FIN. The TIME_WAIT state exists so that a possibly lost ACK can be retransmitted.

In short, after all four segments are sent, the connection could in principle enter CLOSED directly; but because the network may be unreliable and the final ACK may be lost, the TIME_WAIT state is kept so that the ACK can be resent.
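One practical consequence of TIME_WAIT: a server restarted immediately after closing connections may fail to bind its old address until 2MSL has elapsed. The conventional remedy is the `SO_REUSEADDR` socket option, as in this small Python sketch:

```python
import socket

# The side that closes a TCP connection first lingers in TIME_WAIT for
# 2*MSL. During that window, a plain bind() to the same address can fail
# with EADDRINUSE, so servers conventionally set SO_REUSEADDR before
# binding to allow an immediate restart.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(("127.0.0.1", 0))
s.listen(1)
reuse = s.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
s.close()
```

This does not shorten TIME_WAIT itself; it only lets a new socket bind while the old connection's state drains, which is safe for listening servers.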

3. Transport Layer: UDP

3.1 UDP Basics

Each UDP packet consists of a UDP header and a UDP data area. The header consists of four 16-bit (2-byte) fields, which describe the packet’s source port, destination port, total length, and checksum.

The UDP packet format is shown in the following figure.

The meanings of each field in a UDP packet are as follows:

  • Source port: occupies the first 16 bits of the UDP header and usually contains the UDP port used by the sending application. The receiving application uses this value as the destination port for its response. The field is optional: if the sender does not write its own port number into it, it is set to 0, in which case the receiver cannot send a response.
  • Destination port: a 16-bit port used by the UDP software on the receiving computer.
  • Length: a 16-bit field giving the length of the UDP packet, including the UDP header and the UDP data. Since the header is 8 bytes long, the minimum value of this field is 8.
  • Checksum: a 16-bit field used to check whether the data has been corrupted during transmission.
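For illustration (my own sketch, with hypothetical helper names), the fixed 8-byte UDP header described above can be packed and parsed with Python’s `struct` module:

```python
import struct

def build_udp_header(src_port, dst_port, payload_len, checksum=0):
    """Pack the fixed 8-byte UDP header; the length field covers the
    header plus the data, so its minimum value is 8."""
    return struct.pack("!HHHH", src_port, dst_port, 8 + payload_len, checksum)

def parse_udp_header(data):
    """Unpack the four 16-bit fields from the first 8 bytes."""
    src, dst, length, checksum = struct.unpack("!HHHH", data[:8])
    return {"src_port": src, "dst_port": dst,
            "length": length, "checksum": checksum}
```

A 12-byte payload addressed to port 53 (DNS), for example, yields a length field of 8 + 12 = 20.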

Features:

  • UDP is a connectionless protocol that transmits data without establishing a connection between the source and the end. When it wants to transmit, it simply grabs the data from the application and throws it on the network as quickly as possible. On the sending side, the speed at which UDP can transmit data is limited only by the speed at which the application can generate the data, the power of the computer, and the transmission bandwidth; At the receiving end, UDP queues each message segment, and the application reads one message segment at a time from the queue.

  • Since no connection is established for data transmission, there is no need to maintain connection state, including sending and receiving state, so a single server machine can simultaneously transmit the same message to multiple clients.

  • The header of a UDP packet is short, only 8 bytes, with little overhead compared to the 20 bytes of a TCP packet.

  • Throughput is not regulated by the congestion control algorithm, but is limited only by the rate of data generated by the application software, transmission bandwidth, and the performance of the source and terminal hosts.

  • UDP uses best-effort delivery, meaning reliable delivery is not guaranteed, so hosts do not need to maintain complex connection state tables (which have many parameters).

  • UDP is message oriented. UDP takes the messages handed down by the sending application, adds a header, and passes them to the IP layer; it neither splits nor merges them, so message boundaries are preserved and the application must choose an appropriate message size.
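The connectionless, message-oriented behavior described above is easy to see with Python’s datagram sockets: there is no connect/accept step, and each `sendto()`/`recvfrom()` preserves one message boundary. A self-contained loopback sketch:

```python
import socket

# No connection setup: the receiver simply binds, the sender simply sends.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
receiver.settimeout(5.0)             # don't block forever if a datagram is lost
port = receiver.getsockname()[1]

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Each sendto() is one self-contained datagram; boundaries are preserved.
sender.sendto(b"first", ("127.0.0.1", port))
sender.sendto(b"second", ("127.0.0.1", port))

msg1, _ = receiver.recvfrom(1024)    # one recvfrom() returns one datagram
msg2, _ = receiver.recvfrom(1024)

sender.close()
receiver.close()
```

Unlike a TCP stream, the two messages never merge: each `recvfrom()` yields exactly one datagram. On a real network (rather than loopback) either datagram could also be lost or reordered, which is the "unreliable" part of UDP.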

3.2 Differences Between TCP and UDP

1. Comparison

2. Summary

  • TCP provides connection-oriented reliable services to the upper layer, while UDP provides connectionless unreliable services to the upper layer.

  • Although UDP is not as accurate as TCP, it is well suited to many scenarios with high real-time requirements

  • TCP can be used when data accuracy matters most and speed can be relatively slower

This article on the bottom six layers of the OSI model (physical, data link, network, transport, session, and presentation) is mostly theory and, as warned, mostly dry (it condenses about 10 hours of college coursework, with some pictures taken from other blogs). The next post will focus on the evolution of HTTP and HTTPS.

It’s already early in the morning, and this time tomorrow will be the National Day and Mid-Autumn Festival. Wish you a happy and healthy family!!