I. Theoretical five-layer model
The implementation of the Internet is divided into several layers. Each layer has its own function, and each layer is supported by the layer below it. Users only come into contact with the top layer, which we call the application layer. To understand the Internet, you have to start at the bottom and work through the functionality of each layer from the bottom up.
There are three common network models:
- OSI seven-layer model
- Theoretical five-layer model
- TCP/IP four-layer model
The relationship between them is shown below.
Among them, the theoretical five-layer model is a conceptual architecture that combines the advantages of the OSI seven-layer model and the TCP/IP four-layer model. The following discussion is based on the theoretical five-layer model.
The structure of the theoretical five-layer model is shown below.
The functions of each layer are as follows:
- The application layer
  - Directly provides services for user application processes.
  - Common protocols include HTTP, HTTPS, SMTP, and TELNET.
- The transport layer
  - The task of the transport layer is to provide end-to-end service (reliable in the case of TCP) between two processes that communicate at the application layer, hiding from them the details of data communication below the transport layer. End-to-end means that data travels from a given port to a specified port.
  - A port is an integer between 0 and 65535 that identifies a specific application.
  - Common protocols include TCP and UDP.
- The network layer
  - The task of the network layer is to select an appropriate route so that a packet can find its destination by address and be delivered to the transport layer of the destination host.
  - The network layer protocol is the IP protocol.
- The data link layer
  - The data link layer groups data into frames and determines the format in which the electrical signals are grouped.
  - The data link layer protocol is the Ethernet protocol.
- The physical layer
  - The physical layer defines the standards for the devices that connect computers to each other. Network signals are transmitted as 0s and 1s in the form of electrical signals.
Simply put, the lower the layer, the closer it is to the hardware; the higher the layer, the closer it is to the user.
What is a protocol?
Each layer serves a function. To implement these functions, everyone has to follow a common set of rules. These rules are called protocols.
Many protocols are defined at every layer of the Internet. Collectively they are called the Internet Protocol Suite, and they are the heart of the Internet.
In the following sections, we introduce the functions of each layer and explain the functions of the main protocols in each layer.
II. Physical layer
To join a network, a computer has to be connected to it through some kind of equipment: optical fiber, cable, twisted pair, radio waves, and so on. This is the physical layer, the physical means by which computers are connected. It mainly specifies the electrical characteristics of the network and is responsible for transmitting the electrical signals 0 and 1.
III. Data link layer
1. Definition
The physical layer is the medium used to transmit signals, which are 0 and 1 electrical signals. But there are no rules about how electrical signals are grouped and what each signal bit means.
This is the function of the data link layer, which determines the grouping mode of 0 and 1 above the physical layer for information transfer between two devices (the same type of data link node).
2. Ethernet protocol
In the early days, each company had its own way of grouping electrical signals. Gradually, a protocol called Ethernet came to dominate.
Ethernet specifies that a group of electrical signals constitutes a packet, called a frame. Each frame is divided into two parts: the header (Head) and the data (Data).
- Headers and data
  - The header contains descriptive information about the packet, such as the sender, the receiver, and the data type. The data is the actual content of the packet.
- Restrictions on headers and data
  - The header has a fixed length of 18 bytes. The data is at least 46 bytes and at most 1500 bytes long. Therefore, a whole frame is at least 64 bytes and at most 1518 bytes long. If the data is longer than that, it must be split across multiple frames.
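The size limits above can be illustrated with a short Python sketch. It only counts bytes and does not build real Ethernet frames; the helper names `split_into_frames` and `frame_sizes` are hypothetical, and padding short data with zero bytes is an assumption made for the illustration.

```python
HEADER_LEN = 18   # framing overhead per frame, as stated above
MIN_DATA = 46     # shortest allowed data field
MAX_DATA = 1500   # longest allowed data field


def split_into_frames(payload: bytes) -> list[bytes]:
    """Split payload into data fields that each fit in one frame.

    A chunk shorter than MIN_DATA is padded with zero bytes (an
    assumption of this sketch), because the data field may not be
    shorter than 46 bytes.
    """
    chunks = [payload[i:i + MAX_DATA] for i in range(0, len(payload), MAX_DATA)]
    if not chunks:
        chunks = [b""]
    return [chunk.ljust(MIN_DATA, b"\x00") for chunk in chunks]


def frame_sizes(payload: bytes) -> list[int]:
    """Return the total size of each frame (header plus data field)."""
    return [HEADER_LEN + len(data) for data in split_into_frames(payload)]
```

For instance, 1501 bytes of data need two frames: one full 1518-byte frame and one minimum-size 64-byte frame.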
MTU(Maximum Transmission Unit)
The MTU is the limit that a link layer (and the physical medium beneath it) places on the amount of data a single frame can carry.
- The data field of an Ethernet frame is defined as 46 to 1500 bytes long. This maximum of 1500 bytes is called the maximum transmission unit (MTU) of Ethernet. Different network types have different MTUs.
- If a packet is routed from Ethernet onto a dial-up link and is larger than the MTU of the dial-up link, the packet must be fragmented.
MTU Impact on IP protocols
Due to the MTU limitation at the link layer, if the packet size exceeds 1500 bytes, the packet must be fragmented and sent.
- A large IP packet is divided into several smaller packets, and each of them is labeled with the same 16-bit ID in the IP header. During reassembly, this ID identifies which original packet each fragment came from.
- In the 3-bit flag field of each small packet's IP header, the second bit is DF (Don't Fragment), where 0 means fragmentation is allowed, and the third bit is MF (More Fragments), which serves as the end marker: it is set to 0 on the last fragment and to 1 on every other fragment.
- On arrival at the peer, the small packets are reassembled in order and handed to the transport layer. If any one of them is missing, reassembly at the receiver fails, and the IP layer is not responsible for retransmitting the data.
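The fragmentation rules above can be sketched in Python. This is a simplified model, assuming a 20-byte IPv4 header without options; the function name `fragment` and the sample ID `0x1234` are made up for the illustration. It relies on the fact that the IPv4 fragment offset is counted in units of 8 bytes, so every fragment except the last carries a multiple of 8 data bytes.

```python
IP_HEADER_LEN = 20  # assumes an IPv4 header without options


def fragment(payload_len: int, mtu: int = 1500, ident: int = 0x1234):
    """Plan the fragments for an IP payload of payload_len bytes.

    Returns a list of (id, offset_in_8_byte_units, mf_flag, data_len).
    """
    # Largest data size per fragment, rounded down to a multiple of 8.
    max_data = (mtu - IP_HEADER_LEN) // 8 * 8
    fragments = []
    offset = 0
    while True:
        data_len = min(max_data, payload_len - offset)
        more = offset + data_len < payload_len  # MF = 1 unless this is the last fragment
        fragments.append((ident, offset // 8, int(more), data_len))
        offset += data_len
        if not more:
            break
    return fragments
```

With the default MTU, a 4000-byte payload is planned as three fragments of 1480, 1480, and 1040 data bytes that share one ID, and only the last has MF set to 0.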
Impact of MTU on UDP
- If the data carried in a UDP datagram exceeds 1472 bytes (1500 - 20 (IP header) - 8 (UDP header)), it is split into multiple IP datagrams at the network layer.
- If any one of these IP datagrams is lost, reassembly at the receiver's network layer fails. This means that once a UDP datagram is fragmented at the network layer, the probability of losing the entire datagram increases greatly.
- In a LAN environment, it is recommended to keep UDP data within 1472 bytes; on the Internet, it is recommended to keep UDP data within 548 bytes.
Impact of MTU on TCP
- A TCP packet cannot be arbitrarily long either; it is still subject to the MTU. The maximum length of the data in a single TCP packet is called the MSS (Maximum Segment Size).
- When TCP establishes a connection, the two sides first negotiate the MSS. (It can only be negotiated during the three-way handshake; otherwise the default of 536 bytes is used.) Ideally, the MSS is exactly the largest size that does not cause fragmentation at the IP layer, which in turn depends on the MTU of the data link layer.
- When sending SYN, each side writes the MSS it supports into the TCP header. Once both sides know the other's MSS, the smaller value is chosen as the final MSS.
- The MSS value is carried in the variable-length options area of the TCP header, which is at most 40 bytes long (kind=2).
- MSS = MTU - IP header - TCP header, which is the data length of the TCP packet.
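The arithmetic in this section can be checked with a few lines of Python. The sketch assumes minimum-size headers (20-byte IP header and 20-byte TCP header, both without options), and the function names are hypothetical.

```python
IP_HEADER_LEN = 20   # minimum IPv4 header, no options
TCP_HEADER_LEN = 20  # minimum TCP header, no options
UDP_HEADER_LEN = 8


def mss_for_mtu(mtu: int) -> int:
    """MSS = MTU - IP header - TCP header."""
    return mtu - IP_HEADER_LEN - TCP_HEADER_LEN


def negotiated_mss(local_mtu: int, peer_mtu: int) -> int:
    """Each side announces its MSS in the SYN; the smaller one is used."""
    return min(mss_for_mtu(local_mtu), mss_for_mtu(peer_mtu))


def max_udp_payload(mtu: int) -> int:
    """Largest UDP data size that avoids IP fragmentation."""
    return mtu - IP_HEADER_LEN - UDP_HEADER_LEN
```

With an MTU of 1500 this gives an MSS of 1460 bytes and a maximum UDP payload of 1472 bytes; with 576 it gives 536 and 548 bytes, matching the figures quoted above.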
3. The MAC address
As mentioned above, the header of an Ethernet packet contains information about the sender and receiver. So how are senders and receivers identified?
Ethernet stipulates that every device connected to the network must have a network interface card (NIC). Packets travel from one NIC to another. The address of the NIC is the address used for sending and receiving packets, and it is called the MAC address.
Each NIC leaves the factory with a MAC address that is unique in the world. The address is 48 bits long and is usually written as 12 hexadecimal digits.
The first six hexadecimal digits identify the manufacturer, and the last six are the serial number that the manufacturer assigned to the NIC. With the MAC address, you can locate the network card and determine the path of the packet.
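As a small illustration of this layout, the following Python sketch splits a MAC address into its two halves. The function name `split_mac` is hypothetical, and any address passed to it here is a made-up example rather than a real device.

```python
def split_mac(mac: str) -> tuple[str, str]:
    """Split a MAC address into (manufacturer part, NIC serial part).

    Accepts the common "AA:BB:CC:DD:EE:FF" and "AA-BB-CC-DD-EE-FF" forms.
    """
    digits = mac.replace(":", "").replace("-", "").lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    # First six hex digits: manufacturer; last six: serial number.
    return digits[:6], digits[6:]
```

Twelve hexadecimal digits at 4 bits each give exactly the 48 bits mentioned above.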
4. Broadcast
Defining the address is only the first step, there are more steps to follow:
- 1) First: how does one network card know the MAC address of another network card?
  The answer is the ARP protocol, which solves this problem. ARP is covered in the network layer part of this article; for now, it is enough to know that the MAC address of the receiver must be known before an Ethernet packet can be sent.
- 2) Second: even with the MAC address, how can the system deliver the packet to exactly the right recipient?
  The answer is that Ethernet works in a very primitive way. Instead of sending the packet precisely to the receiver, it sends the packet to every computer on the network and lets each computer decide for itself whether it is the receiver.
In the figure above, computer No. 5 sends a packet to computer No. 3. Computers No. 1, 2, 3, 4, and 6 in the same subnetwork all receive the packet. Each of them reads the packet header, finds the recipient's MAC address, and compares it with its own MAC address. If the two match, the computer accepts the packet for further processing; otherwise it discards the packet. This mode of transmission is called broadcasting.
With a packet definition, a MAC address for the network card, and a way to send the broadcast, the “link layer” can transmit data between multiple computers.
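The broadcast behavior described in this section can be modeled in a few lines of Python. This is a toy simulation, not real networking code: the `broadcast` function, the dictionary layout of a frame, and the host names and MAC addresses in any call are all invented for the example.

```python
def broadcast(frame: dict, hosts: dict[str, str]) -> list[str]:
    """Deliver a frame to every host and return the hosts that accept it.

    hosts maps a host name to its MAC address; frame needs a "dst" key
    holding the destination MAC address.
    """
    accepted = []
    for name, mac in hosts.items():
        # Every host sees the frame and compares the destination MAC
        # with its own; non-matching hosts simply discard the frame.
        if frame["dst"].lower() == mac.lower():
            accepted.append(name)
    return accepted
```

Called with six hosts and a frame addressed to the third one, only that host ends up in the returned list, even though the loop "delivers" the frame to all of them.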
IV. Network layer
1. Origin of the network layer
The Ethernet protocol relies on MAC addresses to send data. In theory, a network card in Chengdu could find a network card in Houston by MAC address alone; this is technically possible.
There is, however, a major drawback. Ethernet sends data packets by broadcast, which is inefficient and limited to the subnetwork of the sender. That is, if two computers are not in the same subnetwork, the broadcast does not reach the other one. This design makes sense, because otherwise every computer on the Internet would receive every packet, and that would be a disaster.
The Internet is a vast network made up of countless subnetworks. It is almost unimaginable that computers in Chengdu and Houston would sit in the same subnetwork.
Therefore, a way must be found to tell which MAC addresses belong to the same subnetwork and which do not. Within the same subnetwork, packets are sent by broadcast; otherwise, they are sent by routing. (Routing, meaning how packets are distributed to different subnetworks, is a big topic that this article does not cover.) Unfortunately, MAC addresses alone cannot do this. A MAC address depends only on the vendor and has nothing to do with the network the card is attached to.
This led to the creation of the network layer. Its job is to introduce a new set of addresses that lets us tell whether different computers belong to the same subnetwork. These addresses are called network addresses, or simply addresses.
So, with the advent of the network layer, each computer has two kinds of address: a MAC address and a network address. The MAC address is bound to the network card, and the network address is assigned by the administrator. There is no relationship between the two; they are combined arbitrarily.
Network addresses help us determine which subnetwork the computer is on, and MAC addresses send packets to the destination network card in that subnetwork. Therefore, it is logical to assume that the network address must be processed first and then the MAC address.
2. IP protocol
The protocol that specifies network addresses is called the IP protocol, and the addresses it defines are called IP addresses. At present, versions 4 and 6 of the IP protocol, called IPv4 and IPv6, are in wide use.
1) IPv4
Definition of IPv4
- IPv4 specifies that a network address consists of 32 bits.
- By convention, an IP address is written as four decimal numbers separated by dots, ranging from 0.0.0.0 to 255.255.255.255.
Subnet mask
Every computer on the Internet is assigned an IP address. The address is divided into two parts: the first part represents the network and the second part represents the host. Take the 32-bit IP address 14.215.177.39 as an example, and assume that the network part is the first 24 bits (14.215.177); the host part is then the last 8 bits (the final 39). Computers in the same subnetwork must have the same network part in their IP addresses, so under this assumption 14.215.177.2 and 14.215.177.1 are in the same subnetwork. The problem, however, is that the network part cannot be determined from the IP address alone. With 14.215.177.39, the address itself does not reveal whether the network part is the first 24 bits, the first 16 bits, or even the first 28 bits. So how can we tell from their IP addresses whether two computers belong to the same subnetwork? This requires another parameter, the subnet mask.
Subnet mask:
- A subnet mask is a parameter that describes a subnetwork. It has the same form as an IP address: a 32-bit binary number, with all 1s in the network part and all 0s in the host part. For example, for the IP address 172.16.254.1, if the network part is known to be the first 24 bits and the host part the last 8 bits, the subnet mask is 11111111.11111111.11111111.00000000, which is 255.255.255.0 in decimal.
- Knowing the subnet mask, we can tell whether any two IP addresses are in the same subnetwork. The method is to perform a bitwise AND between each of the two IP addresses and the subnet mask (the result bit is 1 only if both bits are 1, otherwise 0), and then compare the results. If they are equal, the two addresses are in the same subnetwork; otherwise, they are not.
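The bitwise AND comparison can be written out directly in Python. This is a minimal sketch with hypothetical helper names (`ip_to_int`, `same_subnet`); it converts the dotted-decimal text by hand so that the AND operation is visible, rather than relying on a library.

```python
def ip_to_int(address: str) -> int:
    """Convert dotted-decimal IPv4 text to a 32-bit integer."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"invalid IPv4 address: {address!r}")
    value = 0
    for part in parts:
        value = (value << 8) | part
    return value


def same_subnet(ip_a: str, ip_b: str, mask: str) -> bool:
    """AND each address with the mask and compare the network parts."""
    m = ip_to_int(mask)
    return (ip_to_int(ip_a) & m) == (ip_to_int(ip_b) & m)
```

With the mask 255.255.255.0, the addresses 14.215.177.2 and 14.215.177.1 compare as the same subnetwork, while 14.215.177.39 and 14.215.178.39 do not; with the mask 255.255.0.0 the latter pair would match.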
NAT
We know that IPv4 addresses are only 32 bits long, and the number of Internet users on the planet has long exceeded what that address space can cover. So why have the addresses not run out yet?
Because there are techniques that relieve the address shortage indirectly, such as NAT.
Network Address Translation (NAT)
- Each small local area network (LAN) uses private addresses from a private network segment, and these are translated into a public address when the LAN connects to the outside world. In this way, dozens or hundreds of computers need only one public address.
- You can even nest a private network inside a private network, NAT over NAT, layer upon layer. This greatly reduces the number of public IP addresses needed, and it is what has kept IPv4 alive to this day instead of leaving us unable to get online.
- However, NAT also has many disadvantages. It is convenient for a private address to reach an Internet address, but difficult for an Internet address to reach a private address. Many services are restricted as a result and can only be made to work through complex settings, which also reduces the efficiency of network processing.
2) IPv6
Definition
- An IPv6 address consists of eight groups of four hexadecimal digits, with the groups separated by colons (:).
- IPv6 addresses support a notation that drops leading zeros. The address shown in the figure can be written as 2001:D12:0:0:2AA:987:FE29:9871.
- When consecutive groups in the colon-hexadecimal format have the value 0, those groups can be compressed into a double colon (::), which gives the final simplified form 2001:D12::2AA:987:FE29:9871.
- Note: the double colon may appear only once in an address.
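Python's standard `ipaddress` module applies the same compression rules, which makes it convenient for checking a notation by hand. In the sketch below, the fully written-out address is an assumption: it is reconstructed from the compressed forms quoted above, because the original figure is not available here.

```python
import ipaddress

# Assumed full form of the example address (reconstructed, see above).
full = "2001:0D12:0000:0000:02AA:0987:FE29:9871"
addr = ipaddress.IPv6Address(full)

print(addr.exploded)    # all eight groups, four digits each
print(addr.compressed)  # leading zeros dropped, zero run replaced by "::"
```

The module prints hexadecimal digits in lowercase, so the compressed form appears as `2001:d12::2aa:987:fe29:9871`, which is the same address as the simplified form given in the text.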
IPv6 number segment division and prefix representation
IPv6 has a huge 128-bit address space. Such a large space is not divided arbitrarily; instead, address ranges are allocated according to bit positions.
The following figure shows the IPv6 address structure
For example, with n=48 and m=16 as defined in RFC4291, the subnet prefix (n + m bits) and the interface ID are 64 bits each.
IPv6 does not have the concept of a subnet mask. Instead, it identifies subnets by their prefix.
The notation is IPv6 address/prefix length, for example:
- 2001:C3:0:2C6A::/64 denotes a subnet.
- 2001:C3:0:2C6A:C9B4:FF12:48BC:1A22/64 denotes the address of a node within that subnet.
An IPv6 address consists of a subnet prefix and an interface ID. The subnet prefix is defined and assigned by the address allocation and management organization, and the interface ID can be generated by each operating system.
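The prefix notation can be explored with Python's standard `ipaddress` module. The sketch below reuses the two example values from this section and assumes a 64-bit interface ID, which follows from the /64 prefix length.

```python
import ipaddress

subnet = ipaddress.IPv6Network("2001:C3:0:2C6A::/64")
node = ipaddress.IPv6Interface("2001:C3:0:2C6A:C9B4:FF12:48BC:1A22/64")

# The node address lies inside the subnet, and its own /64 prefix is that subnet.
print(node.ip in subnet)
print(node.network == subnet)

# The low 64 bits of the address form the interface ID.
interface_id = int(node.ip) & ((1 << 64) - 1)
print(f"{interface_id:016x}")
```

Both membership checks print `True`, and the interface ID is the last four groups of the node address written without colons.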
Advantages of IPv6
IPv6 is used to solve the problem of IPv4 address exhaustion. IPv4 addresses are 32 bits and IPv6 addresses are 128 bits. Besides the number of addresses, IPv6 has many advantages.
- IPv6 uses smaller routing tables, which lets routers forward packets faster.
- IPv6 adds enhanced multicast support and flow control, which benefits multimedia applications and quality of service (QoS) control.
- IPv6 adds support for automatic configuration. This improves on and extends the DHCP protocol and makes network management, especially for LANs, more convenient and faster.
- IPv6 offers higher security. Users can encrypt data at the network layer and verify IP packets, which greatly enhances network security.
- IPv6 is more extensible. It allows the protocol to be extended when new technologies or applications require it.
- IPv6 has a better header format. The new header format simplifies and speeds up routing and improves efficiency.
- …
3) The difference between IPv4 and IPv6
- In a frame, the type field of the data link layer is 0x86DD for IPv6 and 0x0800 for IPv4.
- The IPv6 header is fixed at 40 bytes, while the IPv4 header is 20 to 60 bytes. A fixed-length header makes code that processes IPv6 packets simpler and more efficient.
- The IPv6 header drops the checksum field to improve the forwarding efficiency of routers. Note, however, that the UDP checksum is mandatory under IPv6, whereas it is optional under IPv4; TCP requires a checksum in both.
- When an IPv6 packet carries the upper-layer protocols ICMPv6, TCP, or UDP, the value of Next Header is 58, 6, or 17 respectively, similar to the Protocol field in the IPv4 header.
- If the value is not one of the preceding three, the IPv6 header is followed by an extension header. Extension headers are a new concept introduced by IPv6. Each IPv6 packet can carry zero or more extension headers, which are organized as a linked list. When an IPv6 packet carries an extension header, the value of Next Header is the type of that extension header. Why introduce extension headers? This is another improvement of IPv6 over IPv4: extension headers replace the optional fields of IPv4, simplify the IPv6 header, and make IPv6 more extensible. When an IPv6 packet is sent in fragments, IPv6 uses an extension header to carry the information about each fragment. As shown in the figure, a Next Header value of 44 in the IPv6 header indicates that an extension header follows, and that extension header carries the IPv6 fragment information. In IPv4, by contrast, the fragment information is recorded in the fragment fields of the IPv4 header itself.
4) IP protocol summary
As mentioned above, the IP protocol has two main functions:
- Assign IP addresses to each computer
- Determine which addresses are in the same subnetwork
3. IP packets
Data sent over IP is called an IP packet. We put the IP packet directly into the “data” section of an Ethernet packet, without modifying the Ethernet specification at all. That is the beauty of the Internet's layered structure: changes in an upper layer do not affect the structure of the layers below.
Specifically, IP packets are also divided into header and data: The header ranges from 20 to 60 bytes (IPv6 is fixed to 40 bytes), and the total length of the entire packet is up to 65535 bytes. Therefore, theoretically, the data portion of an IP packet is 65515 bytes at most.
As shown in the figure, the 20 bytes in the header are fixed and contain the version, length, IP address, and other information, as well as the optional variable part of the header. The data is the content of the IP packet.
When placed in an Ethernet packet, the Ethernet packet looks like this:
In the Ethernet protocol, the data portion of an Ethernet packet is only 1500 bytes at most. Therefore, if an IP packet is larger than 1500 bytes, it needs to be split into several Ethernet packets and sent separately.
4. The ARP protocol
There is one last point about the network layer. Because IP packets are sent in Ethernet packets, we must know both the MAC address and IP address of the other party. Usually, the IP address of the other party is known, but we do not know its MAC address.
So, we need a mechanism to get MAC addresses from IP addresses.
Again, there are two cases:
- 1) The first case: if the two hosts are in different subnetworks, there is in fact no way to obtain the other host's MAC address. The packet can only be handed to the gateway that connects the two subnetworks, and the gateway takes care of it.
- 2) The second case: if the two hosts are in the same subnetwork, we can use the ARP protocol to obtain the other host's MAC address. ARP also sends a packet (contained in an Ethernet packet) that holds the IP address of the host being queried. The destination MAC address field is filled with FF:FF:FF:FF:FF:FF, which marks it as a broadcast address. Every host in the subnetwork receives this packet, extracts the IP address from it, and compares it with its own IP address. If the two are the same, the host replies and reports its own MAC address to the sender; otherwise it discards the packet.
In short, with the ARP protocol we can obtain the MAC address of any host in the same subnetwork, and can therefore send packets to any host.
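The second case can be modeled as a small Python simulation. This is only a sketch of the request and reply logic, not a real ARP implementation: the function `arp_resolve`, the dictionary standing in for the subnetwork, and all addresses used in a call are invented for the illustration.

```python
from typing import Optional

BROADCAST_MAC = "FF:FF:FF:FF:FF:FF"


def arp_resolve(target_ip: str, subnet_hosts: dict[str, str]) -> Optional[str]:
    """Simulate an ARP query on one subnetwork.

    subnet_hosts maps each host's IP address to its MAC address.
    Returns the MAC address of the host that answers, or None.
    """
    # The request is addressed to the broadcast MAC, so every host gets it.
    request = {"dst_mac": BROADCAST_MAC, "query_ip": target_ip}
    for host_ip, host_mac in subnet_hosts.items():
        # Each host compares the queried IP address with its own.
        if host_ip == request["query_ip"]:
            return host_mac  # the matching host replies with its MAC address
        # Every other host discards the request.
    return None
```

A `None` result corresponds to the first case above: the target is not on this subnetwork, so the packet has to be handed to the gateway instead.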
ARP attack
ARP attacks exploit the fact that the ARP protocol was designed without any security verification. By forging ARP packets, an attacker can steal the communication data of legitimate users, causing serious harm such as degraded network throughput and leaked private information.
ARP attacks mainly occur in local area networks (LANs). If a computer on the LAN is infected with an ARP Trojan, the infected system tries to intercept the traffic of other computers on the network through ARP spoofing, which in turn causes communication failures for those computers.
Common symptoms of ARP attacks on a LAN include intermittent Internet access, failed file copies, and a surge in ARP packets; one MAC address corresponding to multiple IP addresses; network data that cannot be sent; information sent over the network being stolen; and mismatched protocol addresses in packets, which produce large numbers of ARP packets on the network. In a LAN environment, ARP attacks are a major security threat. In traditional networks, ARP attacks are countered through static binding, but this approach makes the network harder to extend.
V. Transport layer
1. Origin of transport layer
With MAC addresses and IP addresses, we can already establish communication between any two hosts on the Internet.
The next problem is that many applications on the same host need to use the network at once, for example when you browse the Web and chat with friends online at the same time. When a packet comes in from the Internet, how do you know whether it carries the content of a web page or an online chat?
That is, we need one more parameter to indicate which program (process) the packet is for. This parameter is called the port, and it is in effect a number assigned to each program that uses the network card. Each packet is sent to a specific port on the host, so different programs can each get the data they need.
A port is an integer between 0 and 65535, which takes exactly 16 bits. Ports 0 to 1023 are reserved for the system; user programs can only choose ports above 1023. Whether browsing the Web or chatting online, the application picks a port at random and then contacts the corresponding port on the server.
The function of the transport layer is to establish port-to-port communication. In contrast, the function of the network layer is to establish host-to-host communication. Once the host and the port are identified, programs can communicate with each other. For this reason, Unix systems call the combination of host and port a socket. With it, you can develop network applications.
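The host-plus-port idea maps directly onto Python's standard `socket` module. The sketch below is a minimal loopback demonstration, assuming the machine allows UDP traffic on 127.0.0.1; the function name `loopback_demo` is hypothetical. It uses UDP, the simpler of the two transport protocols covered in the following sections.

```python
import socket


def loopback_demo(message: bytes = b"hello") -> tuple[bytes, int]:
    """Send one UDP datagram between two sockets on this machine.

    Returns the received data and the port the receiver was bound to.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
        receiver.settimeout(2.0)             # do not wait forever if nothing arrives
        host, port = receiver.getsockname()  # the (host, port) pair identifies the socket
        sender.sendto(message, (host, port))
        data, _ = receiver.recvfrom(1024)
        return data, port
```

The sender never names a program, only a host and a port; the operating system hands the datagram to whichever socket is bound to that port.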
2. UDP: User Datagram Protocol
Now, we have to add port information to packets, which requires new protocols. The simplest implementation is called UDP, and the format is almost nothing more than a port number in front of the data.
A UDP packet consists of a header and data. As shown in the figure, the UDP header is fixed at 8 bytes and contains a 16-bit source port, a 16-bit destination port, a 16-bit UDP length, and a 16-bit UDP checksum. The entire UDP packet is then placed in the data section of the IP packet, which, as mentioned earlier, is placed in the Ethernet packet, so the entire Ethernet packet now looks like this:
UDP packets are very simple. The header is only 8 bytes, and the total length is no more than 65,535 bytes, which fits into an IP packet.
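The four 16-bit header fields can be packed with Python's standard `struct` module. This is a simplified sketch with hypothetical function names; in particular, the checksum is left at 0 as a placeholder, whereas a real network stack computes it.

```python
import struct


def build_udp_datagram(src_port: int, dst_port: int, data: bytes) -> bytes:
    """Prefix data with an 8-byte UDP header (checksum left as 0 here)."""
    length = 8 + len(data)  # header plus data, in bytes
    if length > 65535:
        raise ValueError("UDP datagram too long")
    checksum = 0            # placeholder; a real stack computes this
    # "!HHHH": network byte order, four unsigned 16-bit fields.
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return header + data


def parse_udp_header(datagram: bytes) -> tuple[int, int, int, int]:
    """Return (source port, destination port, length, checksum)."""
    return struct.unpack("!HHHH", datagram[:8])
```

Because the length field is 16 bits wide, the total length cannot exceed 65,535 bytes, which is the limit mentioned above.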
The maximum size of a UDP packet
- In a LAN environment, it is recommended that UDP data be no more than 1472 bytes
  - The data field of an Ethernet frame must be between 46 and 1500 bytes long, which is determined by the physical properties of Ethernet. These 1500 bytes are called the MTU (maximum transmission unit) of the link layer. This does not mean that the whole link-layer frame is limited to 1500 bytes: the MTU refers only to the data area of the frame and does not include the 18 bytes at its front and rear.
  - So, in effect, these 1500 bytes are the length limit for IP datagrams at the network layer. Since the header of an IP datagram is at least 20 bytes, the data area of an IP datagram is at most 1480 bytes. These 1480 bytes hold the TCP segment sent by TCP or the UDP datagram sent by UDP.
  - Since the header of a UDP datagram is 8 bytes, the data area of a UDP datagram is at most 1472 bytes. These 1472 bytes are what we can actually use.
  - What happens when we send more than 1472 bytes of UDP data? The IP datagram is then larger than 1500 bytes and exceeds the MTU. In this case, the IP layer of the sender must fragment it, dividing the datagram into several pieces that each fit within the MTU, and the IP layer of the receiver must reassemble them. This is a lot of extra work. Worse, because of the nature of UDP, if one fragment is lost in transmission the receiver cannot reassemble the datagram, and the entire UDP datagram is discarded.
  - Therefore, in a normal LAN environment, I recommend keeping UDP data within 1472 bytes.
- During Internet programming, it is recommended that UDP data be no more than 548 bytes
  - Internet programming is different, because routers on the Internet may set the MTU to different values. If we send data assuming an MTU of 1500 bytes and a network along the path has an MTU below 1500 bytes, the system has to use various mechanisms to adjust to the smaller MTU so that the datagram can reach its destination, which involves a lot of unnecessary work.
  - Since 576 bytes is commonly cited as the standard MTU on the Internet, I recommend keeping the UDP data length within 548 bytes (576-8-20) when programming UDP on the Internet.
  - Note: more precisely, the IPv4 protocol specifies 576 bytes as the minimum reassembly buffer size of the IP layer. That, rather than a standard Internet MTU of 576, is the reason UDP packets should not exceed this size.
3. TCP: Transmission Control Protocol
The advantage of UDP is that it is simple and easy to implement. Its disadvantage is poor reliability: once a packet is sent, there is no way to know whether it was received. The TCP protocol was created to solve this problem and improve network reliability. The protocol is very complex, but it can be thought of as UDP with an acknowledgment mechanism, in which every packet sent must be acknowledged. If a packet is lost, no acknowledgment arrives, and the sender knows that it must resend the packet.
As a result, TCP ensures that data is not lost. Its disadvantages are a complicated process, a difficult implementation, and higher resource consumption. As shown in the figure, the TCP header ranges from 20 to 60 bytes. In addition to the source and destination port numbers, it contains further information, such as the sequence number and the acknowledgment number, to ensure a reliable connection and the resending of data.
Like UDP packets, TCP packets are embedded in the data section of IP packets. In theory a TCP packet has no length limit. However, to keep the network efficient, a TCP packet is usually no longer than an IP packet, so that a single TCP packet does not need to be split.
VI. Application layer
The application receives data from the transport layer and then interprets it. Because the Internet is an open architecture, and data comes from so many different sources, it has to be formatted beforehand, otherwise it’s impossible to read. The role of the application layer is to dictate the data format of the application.
For example, TCP can transfer data for a variety of applications, such as Email, WWW, FTP, and so on. Then, different protocols must dictate the format of E-mail, web pages, and FTP data, and these application protocols constitute the application layer. This is the highest layer, directly facing the user. Its data is in the data section of the TCP packet.
So Ethernet packets now look like this: