I have collected my previous articles in a GitHub repository, and stars are welcome: github.com/crisxuan/be… This article is included there.

The transport layer sits between the application layer and the network layer. It is the fourth layer of the OSI model and an important part of the network architecture. The transport layer is responsible for end-to-end communication across the network.

The transport layer plays a critical role in communication between applications running on different hosts. Now let’s discuss the protocols of the transport layer.

Transport Layer Overview

The transport layer of a computer network is much like a highway. A highway moves people or goods from one end to the other; the transport layer moves messages from one end to the other, and each "end" is an end system. In a computer network, any device that can exchange information can be called an end system, such as a mobile phone, a computer, or other networked equipment.

When the transport layer carries packets, it follows certain protocol rules, such as the maximum amount of data in a single transmission and the choice of transport protocol. The transport layer provides logical communication between two hosts that are not directly connected, making it appear as if they were.

Transport layer protocols are implemented in the end systems, not in routers. Routers only look at addresses and forward packets. This is like a courier delivering a parcel: the courier decides where to go from the recipient’s address (building XXX, unit XXX, room XXX).

How does TCP determine which port it belongs to?

Remember the structure of a packet? Here’s a review.

As a packet passes down through each layer, that layer’s protocol attaches its own header to the packet. A complete packet header diagram is shown above.

When data reaches the transport layer, a TCP header is attached to it. This header contains the source port number and the destination port number.

At the sending end, the transport layer converts the messages it receives from the sending application process into transport layer packets, known in computer networking as segments. The transport layer generally splits the application message into smaller pieces, adds a transport layer header to each piece, and sends it toward the destination.
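As a rough illustration of the splitting step described above, here is a minimal sketch that cuts an application message into pieces and prepends a toy header carrying the source and destination port. The `segment_message` and `reassemble` helpers, the 4-byte header layout, and the `max_payload` value are all assumptions made up for this example; real TCP and UDP headers contain more fields.

```python
import struct

def segment_message(message: bytes, src_port: int, dst_port: int,
                    max_payload: int = 8) -> list:
    """Split a message into chunks and prepend a toy 4-byte header to each.

    The header holds only the source and destination port (2 bytes each,
    network byte order). This layout is hypothetical, not a real protocol.
    """
    segments = []
    for start in range(0, len(message), max_payload):
        payload = message[start:start + max_payload]
        header = struct.pack("!HH", src_port, dst_port)
        segments.append(header + payload)
    return segments

def reassemble(segments: list) -> bytes:
    """Strip the 4-byte toy header from each segment and join the payloads."""
    return b"".join(segment[4:] for segment in segments)

# A 21-byte message split into pieces of at most 8 payload bytes.
segments = segment_message(b"hello transport layer", 10637, 45438)
```

Each element of `segments` starts with the same two port numbers, which is what lets the receiving side decide where the payload belongs.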

During transmission, the available transport layer protocols (the "vehicles") are mainly TCP and UDP. How to choose between these two protocols, and what characterizes each of them, is the focus of our discussion.

Background on TCP and UDP

Among the TCP/IP protocols, TCP and UDP are the most representative ones that implement the transport layer. To discuss TCP and UDP, we need to start with their definitions.

TCP stands for Transmission Control Protocol. From the name, we can roughly tell that TCP controls transmission, and controlled here means reliable. Indeed, TCP provides the application layer with a reliable, connection-oriented service that reliably delivers packets to the receiver.

UDP stands for User Datagram Protocol. As the name suggests, UDP focuses on datagrams: it gives the application layer a way to send datagrams directly, without establishing a connection.

Why does computer networking have so many names for a single piece of data?

In a computer network, different layers use different terms. As mentioned above, transport layer packets are called segments. More precisely, TCP packets are called segments, while UDP packets are called datagrams; network layer packets are also called datagrams.

For consistency, however, we generally call both TCP and UDP packets segments. This is simply a convention, and there is no need to agonize over the names.

Sockets

Before TCP or UDP packets are sent, they have to pass through a door: the socket. The socket connects upward to the application layer and downward to the transport layer. An operating system provides application programming interfaces (APIs) between applications and hardware. In a computer network, the socket is likewise an interface, and it has its own API.

When TCP or UDP is used for communication, the socket API is widely used to set the IP address and port number and to send and receive data.

Note that sockets are not inherently tied to TCP/IP; they just make TCP/IP convenient to use. How? You can call the following socket API methods directly.

Method      Description
socket()    Create a socket
bind()      Give the socket a name, commonly used to bind a port number
listen()    Get ready to receive connections
connect()   Initiate a connection (the side that opens the conversation)
accept()    Accept an incoming connection (the receiving side)
write()     Send data
read()      Receive data
close()     Close the connection

Socket types

There are three main types of sockets, described below.

  • Datagram sockets: A datagram socket provides a connectionless service and cannot guarantee reliable data transmission. Data may be lost or duplicated in transit, and in-order delivery is not guaranteed. Datagram sockets use the User Datagram Protocol (UDP) to transmit data. Because a datagram socket cannot guarantee reliability, the program itself has to deal with possible data loss.
  • Stream sockets: A stream socket provides a connection-oriented, reliable data transfer service that guarantees both the reliability and the ordering of data. Stream sockets can offer this because they use the Transmission Control Protocol (TCP).
  • Raw sockets: A raw socket allows IP packets to be sent and received directly, without any protocol-specific transport layer formatting. Raw sockets can read and write IP packets that the kernel has not processed.
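To make the first two socket types concrete, here is a minimal sketch using Python’s standard `socket` module. The variable names are just for illustration. Raw sockets are left out on purpose, since creating one usually requires administrator privileges.

```python
import socket

# A stream socket (SOCK_STREAM) speaks TCP over IPv4.
tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# A datagram socket (SOCK_DGRAM) speaks UDP over IPv4.
udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Remember the types before closing, so they can be inspected later.
tcp_type = tcp_sock.type
udp_type = udp_sock.type

# A raw socket would use socket.SOCK_RAW; it is omitted here because it
# normally needs elevated privileges.
tcp_sock.close()
udp_sock.close()
```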

Socket communication flow

In a computer network, communication requires at least two end systems and therefore at least one pair of sockets. The socket communication process is as follows.

  1. The socket API is called to create an endpoint on a communication link. Once the endpoint is created, a socket descriptor that describes the socket is returned. Just as file descriptors are used to access files, socket descriptors are used to access sockets.
  2. Once an application has a socket descriptor, it can bind a unique name to the socket. A server must bind a name so that it can be reached on the network.
  3. After the socket has been allocated on the server and a name has been bound to it with bind, the listen API is called. listen signals that the server is willing to wait for client connections, and it must be called before the accept API.
  4. The client application calls connect on a stream (TCP-based) socket to send a connection request to the server.
  5. The server application uses the accept API to accept the client’s connection request. The server must have successfully called bind and listen before calling accept.
  6. Once a connection is established between the stream sockets, the client and the server can make read/write API calls.
  7. When the server or the client wants to stop, it calls the close API to release all system resources acquired by the socket.
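The flow above can be sketched with Python’s standard `socket` module. This is a minimal single-connection echo exchange on the loopback interface, not production code. The `run_echo_server` and `echo_once` names are made up for this example, and Python spells read/write as `recv`/`sendall`.

```python
import socket
import threading

def run_echo_server(server_sock: socket.socket) -> None:
    """Accept one client, echo back whatever it sends, then close."""
    conn, _addr = server_sock.accept()       # accept the connection request
    with conn:
        data = conn.recv(1024)               # read
        conn.sendall(data)                   # write

def echo_once(message: bytes) -> bytes:
    # Server side: create the socket, bind a name (address + port), listen.
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]

    server = threading.Thread(target=run_echo_server, args=(server_sock,))
    server.start()

    # Client side: create a stream socket and connect to the server.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
        client.connect(("127.0.0.1", port))
        client.sendall(message)
        reply = client.recv(1024)

    server.join()
    server_sock.close()                      # release the socket's resources
    return reply
```

Calling `echo_once(b"ping")` sends the bytes to the server thread and returns whatever comes back. Note that bind and listen run before the server thread ever calls accept, matching the ordering requirement in the list.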

Although the socket API sits between the application layer and the transport layer in the communication model, it is not itself part of that model. The socket API lets applications interact with the transport and network layers.

Before we move on, let’s play a brief interlude to talk about IP.

Talk about IP

IP stands for Internet Protocol. It is the network layer protocol of the TCP/IP suite. IP was designed to solve two main problems:

  • Improve network scalability, so that large-scale networks can be interconnected.
  • Decouple the application layer from the link layer, so that each can evolve independently.

IP is the core of the entire TCP/IP protocol family and the foundation of the Internet. To make large-scale interconnection possible, IP favors adaptability, simplicity, and operability, and gives up some reliability. IP guarantees neither the delivery time nor the reliability of packets: transmitted packets may be lost, duplicated, delayed, or delivered out of order.

We know that the layer below TCP is IP. Since IP is unreliable, how can we make sure the data arrives correctly?

This brings us to the TCP transport mechanism, which we’ll talk about later.

Port numbers

Before we talk about port numbers, let’s talk about file descriptors and how sockets relate to port numbers.

To make resources easier to use and to improve the performance, efficiency, and stability of the machine, computers run a layer of software called the operating system, which manages the computer’s resources for us. When a program needs a resource, it asks the operating system, which allocates and manages the resource on the program’s behalf. Typically, when we want to access a kernel device or a file, the program calls a system function; the system opens the device or file and returns a file descriptor, fd (an integer ID). We can access the device or file only through this file descriptor, which can be thought of as a number corresponding to an open file or device.

In the same way, we can ask the operating system to create a socket for us, and it returns the socket’s ID. From then on, our program uses network resources by operating on that socket ID. Every process that communicates over the network has at least one socket. Writing data to the socket ID is equivalent to sending data to the network, and reading data from it is equivalent to receiving data. Each of these sockets has a unique identifier: the file descriptor fd.

A port number is a 16-bit non-negative integer ranging from 0 to 65535. This range is divided into three segments by the Internet Assigned Numbers Authority (IANA):

  • Well-known (standard) port numbers range from 0 to 1023.
  • Registered port numbers range from 1024 to 49151.
  • Private port numbers range from 49152 to 65535.
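The three ranges are easy to express in code. The following is a small sketch; the `port_range` function name and the returned labels are made up for illustration.

```python
def port_range(port: int) -> str:
    """Classify a port number into one of the three IANA ranges."""
    if not 0 <= port <= 65535:
        raise ValueError("a port number must fit in 16 bits (0-65535)")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "private"
```

For example, `port_range(80)` falls in the well-known range, while a port such as 50000 falls in the private range.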

Multiple applications can run on one computer. When a segment reaches the host, which application should it be handed to? How do you know the segment is meant for the HTTP server and not the SSH server?

By port number? When a packet arrives at the server, the port number is what distinguishes the applications, so the port number should be enough.

Here is a counterexample to refute cxuan. If two pieces of data arriving at the server were both sent to port 80, how do you tell them apart? Or suppose the two arrive at the same server port but use different protocols; how do you tell them apart then?

Therefore, the port number alone is not enough to identify a segment.

On the Internet, communications are generally told apart by the source IP address, destination IP address, source port number, and destination port number. If any one of these differs, the segments are considered to belong to different communications. These fields are also the basis of multiplexing and demultiplexing.

Determining the port number

Before actual communication, the port number has to be determined. There are two ways to do this:

  • Port numbers specified by standard

Standard port numbers are assigned statically: each program has its own port number, and each port number has its own purpose. A port number is a 16-bit number ranging from 0 to 65535. Ports in the range 0 to 1023 are the statically assigned ones. For example, HTTP uses port 80, FTP uses port 21, and SSH uses port 22. Port numbers of this kind have a special name: well-known port numbers.

  • Port numbers assigned dynamically

The second approach is dynamic allocation: the client application does not set a port number itself at all, and the operating system assigns a non-conflicting port number to each application. With dynamic allocation, different TCP connections can be told apart even when the same client initiates them.
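Dynamic allocation can be observed directly. In the sketch below, binding to port 0 asks the operating system to choose a free port; the `two_os_assigned_ports` helper is a made-up name, and the exact range the OS picks from depends on the platform.

```python
import socket

def two_os_assigned_ports() -> tuple:
    """Bind two TCP sockets to port 0 and return the ports the OS chose."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 means "pick any free port for me".
        first.bind(("127.0.0.1", 0))
        second.bind(("127.0.0.1", 0))
        return first.getsockname()[1], second.getsockname()[1]
    finally:
        first.close()
        second.close()
```

Because both sockets are open at the same time on the same address, the operating system hands out two different port numbers, which is what keeps the two endpoints distinguishable.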

Multiplexing and demultiplexing

We have seen that each socket on a host is assigned a port number. When a segment arrives at the host, the transport layer examines the destination port number in the segment and directs the segment to the corresponding socket. The data in the segment then passes through the socket to the attached process. This delivery step is demultiplexing; gathering data from different sockets at the sender and adding headers to form segments is multiplexing. Let’s look at both concepts.

There are two kinds of multiplexing and demultiplexing: connectionless and connection-oriented.

Connectionless multiplexing and demultiplexing

The developer’s code determines whether a socket uses a well-known port number or a dynamically assigned one. Suppose a process on host A with port 10637 wants to send data to port 45438 on host B, using UDP at the transport layer. The data is generated at the application layer, processed by the transport layer, and then encapsulated into an IP datagram at the network layer. The IP datagram is delivered to host B over the link layer on a best-effort basis. Host B then examines the port number in the segment to decide which socket it belongs to, as shown in the following sequence.

A UDP socket is identified by a two-tuple consisting of a destination IP address and a destination port number.

Therefore, if two UDP segments have different source IP addresses and/or different source port numbers but the same destination IP address and destination port number, both segments are directed through the same socket to the same destination process.
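A toy model of this lookup, assuming the receiving host keeps a table keyed by the two-tuple. The table contents, the IP addresses, and the `udp_demux` name are invented for illustration; a real kernel’s data structures look different.

```python
# Demultiplexing table: (destination IP, destination port) -> socket label.
udp_sockets = {
    ("10.0.0.2", 45438): "socket of process P1",
    ("10.0.0.2", 53): "socket of the DNS process",
}

def udp_demux(src_ip: str, src_port: int, dst_ip: str, dst_port: int):
    """Pick the receiving socket from the destination fields only.

    The source IP and source port are accepted but deliberately ignored,
    because a UDP socket is identified by the destination two-tuple.
    """
    return udp_sockets.get((dst_ip, dst_port))

# Two segments from different sources, same destination.
from_host_a = udp_demux("10.0.0.1", 10637, "10.0.0.2", 45438)
from_host_c = udp_demux("10.0.0.3", 20001, "10.0.0.2", 45438)
```

Both lookups return the same socket label, even though the segments came from different hosts and ports.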

Here is a question. When host A sends a message to host B, why does B need to know the source port number? Say I send a girl a message telling her I’m interested in her. Does she need to know exactly which part of me the message came from? Isn’t it enough to know it came from me? In fact, she does need to know: if she wants to respond, she has to know exactly where to direct her response.

In a segment from A to B, the source port number serves as part of the return address. That is, when B needs to send a segment back to A, B takes the source port number from the A-to-B segment and uses it as the destination port number, as shown in the following figure.
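The return-address idea shows up directly in the socket API: `recvfrom` hands the receiver the sender’s IP address and source port along with the data. Below is a minimal loopback sketch; the `udp_round_trip` name, the uppercase reply, and the 2-second timeout are choices made for this example.

```python
import socket

def udp_round_trip(message: bytes) -> tuple:
    """Send a datagram from A to B, then reply from B to A's source address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as a_sock, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as b_sock:
        a_sock.bind(("127.0.0.1", 0))        # OS picks A's source port
        b_sock.bind(("127.0.0.1", 0))        # OS picks B's port
        a_sock.settimeout(2.0)
        b_sock.settimeout(2.0)

        # A -> B
        a_sock.sendto(message, b_sock.getsockname())

        # B learns A's (IP, source port) from the incoming datagram ...
        data, return_addr = b_sock.recvfrom(1024)

        # ... and uses it as the destination of the reply.
        b_sock.sendto(data.upper(), return_addr)

        reply, _ = a_sock.recvfrom(1024)
        return reply, return_addr == a_sock.getsockname()
```

B never needs to be told A’s port in advance; it reads it out of the datagram it received.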

Connection-oriented multiplexing and demultiplexing

If connectionless multiplexing and demultiplexing means UDP, then connection-oriented multiplexing and demultiplexing means TCP. The difference between them lies in how a socket is identified: a UDP socket is identified by a two-tuple, while a TCP socket is identified by a four-tuple of source IP address, destination IP address, source port number, and destination port number, as mentioned above. When a TCP segment arrives at a host from the network, the host uses all four values to demultiplex the segment to the corresponding socket.

The figure above shows connection-oriented multiplexing and demultiplexing. In the figure, host C sends two HTTP requests to host B, and host A sends one HTTP request to host B. Hosts A, B, and C each have a unique IP address. Host B can tell host C’s two HTTP connections apart because they use different source port numbers. Host B can also tell host A’s connection from host C’s, because the two hosts have different source IP addresses.
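The same scenario can be modeled as a table keyed by the four-tuple. This is only a sketch: the IP addresses, the source port numbers, and the socket labels are invented, and `tcp_demux` is a hypothetical helper rather than anything a real TCP stack exposes.

```python
HOST_A, HOST_B, HOST_C = "10.0.0.1", "10.0.0.2", "10.0.0.3"

# (source IP, source port, destination IP, destination port) -> socket label.
tcp_connections = {
    (HOST_C, 7532, HOST_B, 80): "connection socket 1",
    (HOST_C, 26145, HOST_B, 80): "connection socket 2",
    (HOST_A, 26145, HOST_B, 80): "connection socket 3",
}

def tcp_demux(src_ip: str, src_port: int, dst_ip: str, dst_port: int):
    """Pick the connection socket using all four values of the segment."""
    return tcp_connections.get((src_ip, src_port, dst_ip, dst_port))

# All three segments target port 80 on host B, yet each reaches its own socket.
c_first = tcp_demux(HOST_C, 7532, HOST_B, 80)
c_second = tcp_demux(HOST_C, 26145, HOST_B, 80)
a_only = tcp_demux(HOST_A, 26145, HOST_B, 80)
```

Host A and host C happen to use the same source port here, and the differing source IP address is still enough to keep their connections apart.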

UDP

Finally, we’re starting to talk about UDP.

UDP stands for User Datagram Protocol. It gives applications a way to send encapsulated IP packets without establishing a connection. If an application developer chooses UDP instead of TCP, the application is dealing almost directly with IP.

UDP takes the data from the application, attaches the source and destination port number fields used for multiplexing and demultiplexing along with a few other fields, and passes the resulting segment to the network layer. The network layer encapsulates the transport layer segment in an IP datagram and makes a best-effort attempt to deliver it to the destination host. The key point is that when UDP delivers a datagram to the destination host, there is no handshake between the sending and receiving transport layer entities. For this reason, UDP is called a connectionless protocol.

UDP characteristics

UDP is the transport layer protocol used by streaming media, voice communication, and video conferencing applications. The commonly used DNS protocol also runs over UDP. These applications and protocols choose UDP mainly for the following reasons.

  • Speed: With UDP, as soon as the application process hands data to UDP, UDP packs the data into a UDP segment and passes it immediately to the network layer. TCP, by contrast, has congestion control: it assesses congestion before sending, and if the network is heavily congested, the TCP sender is throttled. UDP is used to achieve real-time performance.
  • No connection establishment: TCP requires a three-way handshake before data transfer, whereas UDP needs no preparation at all, so UDP adds no connection setup delay. If we compare TCP and UDP to developers, TCP is the engineer who designs everything carefully, refuses to write code without a design, and takes every factor into account before starting. Very reliable. UDP is the one who starts coding the moment the requirements arrive, with no design and no technology selection, just code. This kind of developer is unreliable, but great for fast iterative development, because work can start right away.
  • No connection state: TCP has to maintain connection state in the end systems, including receive and send buffers, congestion control parameters, and sequence and acknowledgment numbers. UDP maintains none of this state and tracks none of these parameters. As a result, a server dedicated to a particular application can generally support more active users when the application runs over UDP.
  • Small header overhead: Every TCP segment carries at least 20 bytes of header, whereas UDP carries only 8 bytes.

It is worth noting that not every application that uses UDP is unreliable: an application can implement reliable data transfer itself by adding acknowledgment and retransmission mechanisms. The defining characteristic of UDP, therefore, is speed.

UDP Packet Structure

The UDP packet structure is shown below. Each UDP packet consists of a UDP header and a UDP data area. The header is made up of four 16-bit (2-byte) fields: the source port, the destination port, the packet length, and the checksum.
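Four 16-bit fields map neatly onto Python’s `struct` module. The sketch below packs and parses the 8-byte header in network byte order; the `build_udp_header` and `parse_udp_header` helpers are names invented for this example, and the checksum is simply passed in rather than computed.

```python
import struct

# Four unsigned 16-bit fields in network (big-endian) byte order:
# source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_header(src_port: int, dst_port: int, payload: bytes,
                     checksum: int = 0) -> bytes:
    """Pack the header; the length field covers header plus payload."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum)

def parse_udp_header(datagram: bytes) -> dict:
    """Unpack the first 8 bytes of a datagram into named fields."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack(
        datagram[:UDP_HEADER.size])
    return {"src_port": src_port, "dst_port": dst_port,
            "length": length, "checksum": checksum}

payload = b"hello"
datagram = build_udp_header(10637, 45438, payload) + payload
fields = parse_udp_header(datagram)
```

`UDP_HEADER.size` is 8, which matches the 8-byte header size, and the length field for a 5-byte payload comes out as 13.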

  • Source port: This field occupies the first 16 bits of the UDP header and usually contains the UDP port used by the application sending the datagram. The receiving application uses this value as the destination port when it sends a response. The field is optional; when no source port is set, it defaults to 0, which is typical of communication that expects no reply.
  • Destination port: The port of the receiver. The field is 16 bits long.
  • Length: This 16-bit field gives the length of the UDP packet, including both the header and the data. Because the header is 8 bytes long, the value ranges from 8 to 65535 bytes.
  • Checksum: UDP uses the checksum for error detection, that is, to check whether the data was altered on its way from the source to the destination host. The sender’s UDP computes the one’s complement sum of all the 16-bit words in the segment and then inverts the result. Any overflow during the summation is wrapped around and added back into the lowest bit. The following example adds three 16-bit words.

The sum of the first two 16-bit words is

Adding the third 16-bit word to that result gives

The last addition overflows. The overflow bit 1 is wrapped around and added to the lowest bit, and then the result is inverted: every 1 becomes 0 and every 0 becomes 1. So if the wrapped sum is 1000 0100 1001 0101, its inverse is 0111 1011 0110 1010, which is the checksum. On the receiving end, all four 16-bit words are added together, including the checksum. If the data arrived intact, the result is 1111 1111 1111 1111; if the result is anything else, an error occurred in transmission.
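Here is a small sketch of the calculation. It follows the standard Internet checksum, where a carry out of the top bit is added back into the low bit (end-around carry). The three input words are illustrative values picked for this example, not the ones from the original figures, and the real UDP checksum also covers a pseudo-header and the payload, which this sketch leaves out.

```python
def ones_complement_sum(words) -> int:
    """Add 16-bit words, wrapping any carry back into the low bit."""
    total = 0
    for word in words:
        total += word
        # Fold the carry (bit 16) back into the 16-bit result.
        total = (total & 0xFFFF) + (total >> 16)
    return total

def checksum(words) -> int:
    """Invert the one's complement sum to get the 16-bit checksum."""
    return ~ones_complement_sum(words) & 0xFFFF

# Three illustrative 16-bit words (assumed values).
words = [0b0110011001100000, 0b0101010101010101, 0b1000111100001100]

value = checksum(words)

# Receiver side: summing the data words plus the checksum gives all ones
# when nothing was corrupted.
receiver_sum = ones_complement_sum(words + [value])
```

With these inputs the third addition overflows, so the carry is folded back in before the inversion. Flipping any single bit in one of the words makes `receiver_sum` differ from 0xFFFF.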

Let’s consider a question. Why does UDP provide error detection?

The reason is the end-to-end design principle, which aims to reduce the probability of the various errors that can occur in transmission to an acceptable level.

Consider transferring a file from host A to host B. The communication between A and B goes through three stages. First, host A reads the file from disk and splits the data into packets. Next, the packets travel over the network connecting host A and host B. Finally, host B receives the packets and writes them to disk. This process looks simple but is actually quite complicated, and many things can disrupt it. Disk read/write errors, buffer overflows, memory errors, network congestion, and other factors can all corrupt or lose packets, which shows that the network used for communication is unreliable.

Since the communication goes through only these three stages, we would like to add an error detection and correction mechanism to one of them to check the data.

The network layer cannot do this. Its main purpose is to increase the data transmission rate, and it does not need to consider data integrity; checking the integrity and correctness of data is left to the end systems. During transmission, therefore, we can only ask the network layer to provide the best service it can, and we cannot expect it to guarantee data integrity.

UDP is considered unreliable because, although it provides error detection, it cannot recover from errors and has no retransmission mechanism.

In addition, cxuan has put together six PDFs. Reply "cxuan" to the official account to get all of the author’s PDFs.

You can also download them from the link below.

Link: Pan.baidu.com/s/1mYAeS9hI… Password: p9rs