The IP protocol sits at the third layer of the OSI reference model, the network layer. The main function of the network layer is to enable end-to-end communication between hosts. IP is the most important protocol at this layer; ARP (which resolves MAC addresses) and ICMP (which reports errors in data delivery) also operate here.

The data link layer delivers packets within a single data link, while the network layer delivers packets across different data links. For example, host A is connected to router B through Wi-Fi, router B is connected to router C through Ethernet, and router C is connected to host D through Wi-Fi. In this case, packets sent from host A to host D rely on the network layer to cross all three links.

This article introduces the basics of the IP protocol and the IP header. The IP protocol can be divided into three functional modules: IP addressing, routing, and packet fragmentation and reassembly.

The IP address

An IP address identifies a communication endpoint at the network layer. It differs from a MAC address at the data link layer, which identifies different computers on the same link.

For example, suppose I want to travel from my home in Zhenjiang to Northeastern University in Shenyang. The two ends of the journey, home and school, are the equivalent of IP addresses. No single means of transportation can take me directly from home to school, so I take a taxi to the railway station, then a high-speed train to Shenyang Station, and then a bus to the school. These three legs use three different modes of transportation (data links), and each leg has its own starting point and end point, which are the equivalent of MAC addresses. Each leg is called a hop.

An IP address is a 32-bit unsigned integer. For readability, it is divided into four parts, each an 8-bit integer written in decimal with a range of 0 to 255.

For example, 172.20.1.1 can be written as 10101100 00010100 00000001 00000001. The conversion rule is simple: convert each of the four decimal numbers (0-255) to an 8-bit binary number.
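
The conversion rule can be sketched in a few lines of Python. The function names dotted_to_binary and dotted_to_int are illustrative choices for this sketch, not part of any standard.

```python
def dotted_to_binary(address: str) -> str:
    """Render a dotted-decimal IPv4 address as four 8-bit binary groups."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    return " ".join(format(o, "08b") for o in octets)


def dotted_to_int(address: str) -> int:
    """Pack the four octets into the single 32-bit integer they represent."""
    value = 0
    for part in address.split("."):
        value = (value << 8) | int(part)
    return value


print(dotted_to_binary("172.20.1.1"))  # 10101100 00010100 00000001 00000001
```

The second helper shows that the four-part notation is only a display format: the address itself is one 32-bit number.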

In terms of functions, an IP address consists of two parts: the network ID and the host ID.

Network IDs distinguish different network segments. Hosts on the same network segment must have the same network ID, and different network segments must have different network IDs.

The host ID distinguishes hosts on the same network segment. A host ID must be unique within its network segment.

A 32-bit IP address is divided into these two parts, so how many of the leading bits form the network ID? There are two ways to decide: IP address classes and subnet masks.

IP classification

IP addresses are divided into four classes: class A, class B, class C, and class D. The class is determined by the first four bits of the address:

A class A IP address is an address whose first bit is 0. The first eight bits of a class A address are the network ID. In decimal notation, the theoretical range of class A addresses is 0.0.0.0 to 127.255.255.255. There are therefore at most 128 class A networks (126 in practice; the reserved networks are not covered here). Each network can hold up to 2^24 - 2 = 16,777,214 hosts, because the all-zeros and all-ones host IDs are reserved.

A class B IP address is an address whose first two bits are 10. The first 16 bits of a class B address are the network ID. In decimal notation, class B addresses range from 128.0.0.0 to 191.255.255.255. The host ID is 16 bits long, so a network segment can contain at most 2^16 - 2 = 65,534 host addresses.

A class C IP address is an address whose first three bits are 110. The first 24 bits of a class C address are the network ID. In decimal notation, class C addresses range from 192.0.0.0 to 223.255.255.255. The last eight bits are the host ID, so a network segment can contain at most 2^8 - 2 = 254 host addresses.

A class D IP address is an address whose first four bits are 1110, which gives the range 224.0.0.0 to 239.255.255.255. Class D addresses are used for multicast: all 32 bits identify the multicast group, and there is no host ID.
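
The leading-bit rules above can be checked with a small Python sketch. The function name address_class and the "other" return value are assumptions made for illustration; the article only covers classes A through D.

```python
def address_class(address: str) -> str:
    """Classify an IPv4 address by the leading bits of its first octet."""
    first_octet = int(address.split(".")[0])
    if not 0 <= first_octet <= 255:
        raise ValueError(f"invalid first octet in {address!r}")
    if first_octet >> 7 == 0b0:      # leading bit 0
        return "A"
    if first_octet >> 6 == 0b10:     # leading bits 10
        return "B"
    if first_octet >> 5 == 0b110:    # leading bits 110
        return "C"
    if first_octet >> 4 == 0b1110:   # leading bits 1110
        return "D"
    return "other"                   # leading bits 1111, not covered here


for sample in ("10.0.0.1", "172.20.1.1", "192.168.0.1", "224.0.0.1"):
    print(sample, address_class(sample))
```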

Subnet mask

An IP address is 32 bits long, so it can represent only a limited number of hosts, about 4.3 billion. Class-based allocation wastes many of them. There are only about 16,500 class A and class B networks combined, while far more network segments in the world contain more than 254 hosts.

We know that the essence of IP address classification is to distinguish between network and host identifiers. Another more flexible and fine-grained method is to use subnet masks.

The subnet mask is also 32 bits long and consists of a run of consecutive 1s followed by 0s. The number of 1s is the length of the network ID. Take the IP address 172.20.100.52 as an example. By class it is a class B address (the first 16 bits are the network ID), but with the subnet mask 255.255.255.192 the first 26 bits become the network ID, leaving 6 bits for the host ID.
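
As a rough illustration, the split of 172.20.100.52 under a 26-bit mask can be computed with Python's standard ipaddress module. The helper name split_address is an assumption of this sketch.

```python
import ipaddress


def split_address(cidr: str):
    """Return (netmask, network address, host ID) for an address like 'a.b.c.d/n'."""
    iface = ipaddress.ip_interface(cidr)
    mask = int(iface.netmask)
    addr = int(iface.ip)
    network = ipaddress.ip_address(addr & mask)   # keep the bits under the 1s
    host_id = addr & ~mask & 0xFFFFFFFF           # keep the bits under the 0s
    return str(iface.netmask), str(network), host_id


print(split_address("172.20.100.52/26"))
# ('255.255.255.192', '172.20.100.0', 52)
```

A bitwise AND with the mask yields the network part, and an AND with the inverted mask yields the host part.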

Routing control

Routing is the function of forwarding packets toward their destination address. It is usually performed by routers. (These are not to be confused with the small wireless router you use at home.)

Each router stores a routing control table, in which it looks up the address of the next router for a given destination IP address. This process is depicted below:

Host A has the IP address 10.1.1.30 and sends data to host 10.1.2.10. The routing table of host A holds two entries. Since the destination address 10.1.2.10 does not match segment 10.1.1.0/24, the packet is sent to the default route 10.1.1.1, which is the IP address of the left network interface of router 1 in the figure.

Router 1 then looks up the destination address 10.1.2.10 in its own routing control table. It finds that the address belongs to segment 10.1.2.0/24, so it forwards the data to the next router, 10.1.0.2, which is the address of the left network interface of router 2.

Router 2 looks up the destination IP address 10.1.2.10 in its routing control table and sends the data out through interface 10.1.2.1, the IP address of its right network interface. Host B sees that the destination IP address matches its own and accepts the data.
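
The lookup each node performs can be sketched as follows. This is a simplified illustration, assuming the common longest-prefix-match rule (the article does not name the matching rule), and the two tables are hypothetical reconstructions of the entries mentioned in the walkthrough.

```python
import ipaddress


def next_hop(routing_table, destination):
    """Pick the entry with the longest matching prefix; None if nothing matches."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, hop in routing_table:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network.prefixlen, hop))
    if not matches:
        return None
    return max(matches)[1]  # the longest prefix wins


# Hypothetical tables based on the example in the text.
HOST_A_TABLE = [
    ("10.1.1.0/24", "direct"),      # own segment, delivered on the local link
    ("0.0.0.0/0", "10.1.1.1"),      # default route via router 1
]
ROUTER_1_TABLE = [
    ("10.1.1.0/24", "direct"),
    ("10.1.2.0/24", "10.1.0.2"),    # reach host B's segment via router 2
]

print(next_hop(HOST_A_TABLE, "10.1.2.10"))    # 10.1.1.1
print(next_hop(ROUTER_1_TABLE, "10.1.2.10"))  # 10.1.0.2
```

The default route 0.0.0.0/0 matches every address but has prefix length 0, so it is used only when no more specific entry matches.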

Routing control table

The key to routing control is the routing control table. Administrators can set it up manually, which is called static routing, but most do not. Routers can instead exchange information with other routers and refresh their routing tables automatically. The protocol for this exchange is not defined by IP itself; it is handled by separate protocols known as routing protocols.

Routing loops

In the figure above, suppose host A sends data to an IP address that does not exist, and the default routes configured on routers 1, 2, and 3 form a loop. The data would then be forwarded around the network endlessly, causing congestion. The TTL field of the IP header, covered below, solves this problem.
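
A toy simulation shows why a hop limit bounds the damage of such a loop. It assumes the TTL behavior described in the IP header section (each router decrements the TTL by 1 and the packet is dropped at 0); the function name and router labels are hypothetical.

```python
from itertools import cycle


def routers_visited(initial_ttl, loop):
    """List the routers a packet passes through before its TTL runs out."""
    visited = []
    ttl = initial_ttl
    routers = cycle(loop)  # the default routes send the packet round and round
    while ttl > 0:
        visited.append(next(routers))
        ttl -= 1           # each router decrements the TTL by 1
    return visited         # the last router listed discards the packet


print(routers_visited(5, ["router 1", "router 2", "router 3"]))
# ['router 1', 'router 2', 'router 3', 'router 1', 'router 2']
```

Without the counter, the loop over cycle() would never end, which mirrors the endless forwarding described above.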

IP packet segmentation and reassembly

In the discussion of the data link layer, we mentioned that different data links have different maximum transmission units (MTUs). One task of the IP protocol is therefore to fragment and reassemble data. Fragmentation is performed by the sending host and by routers, and reassembly is performed by the receiving host.
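
A minimal sketch of fragmentation, assuming a 20-byte IP header without options and representing each fragment as a tuple (offset in 8-byte units, more-fragments flag, data). The tuple layout and the function name are choices made for this illustration, not a real packet format.

```python
def fragment(payload: bytes, mtu: int, header_len: int = 20):
    """Split a payload into fragments that each fit in one MTU-sized packet."""
    # Data per fragment must be a multiple of 8 bytes (except the last one),
    # because offsets are expressed in 8-byte units.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    for start in range(0, len(payload), max_data):
        chunk = payload[start:start + max_data]
        more_fragments = start + max_data < len(payload)
        fragments.append((start // 8, more_fragments, chunk))
    return fragments


for offset, more, data in fragment(bytes(4000), mtu=1500):
    print(offset, more, len(data))
# 0 True 1480
# 185 True 1480
# 370 False 1040
```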

Path MTU Discovery

Fragmentation increases the load on routers, so we would rather routers did not fragment IP packets at all. In addition, if a single fragment is lost, the entire IP datagram becomes invalid.

Path MTU discovery solves these problems. The host first learns the smallest MTU among all data links on the path and fragments the data to that size, so no router along the way needs to fragment anything.

To find the path MTU, the host first sends a packet with the Don't Fragment flag in the IP header set to 1. A router that would otherwise need to fragment the packet discards it instead and sends an ICMP Destination Unreachable message back to the host, reporting the MTU of the next link.

The host adopts the MTU from the ICMP message as its current MTU and fragments data to that size. This repeats until no more ICMP messages arrive, at which point the current MTU is the path MTU.
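
The probing loop can be simulated without any real network traffic. In this sketch the path is modeled as a plain list of link MTUs, and the ICMP reply is modeled as the MTU of the first link that is too small; both are simplifying assumptions, and the function name is hypothetical.

```python
def discover_path_mtu(link_mtus):
    """Simulate path MTU discovery; returns (path MTU, number of probes sent)."""
    current = link_mtus[0]  # start from the MTU of the local link
    probes = 0
    while True:
        probes += 1
        # A probe with Don't Fragment set is dropped at the first link whose
        # MTU is smaller than the probe; that router reports its link MTU.
        reported = next((mtu for mtu in link_mtus if mtu < current), None)
        if reported is None:
            return current, probes  # no ICMP reply: the probe got through
        current = reported          # shrink to the reported MTU and retry


print(discover_path_mtu([1500, 1400, 1500, 1280]))  # (1280, 3)
```

Each retry strictly lowers the probe size, so the loop always ends at the smallest MTU on the path.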

The following uses UDP as an example:

Reassembly

The receiving end reassembles the data using the Flags and Fragment Offset fields in the IP header. The details are explained in the analysis of the IP header.
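
A simplified sketch of reassembly, assuming every fragment is given as a tuple (offset in 8-byte units, more-fragments flag, data) and that all fragments already belong to the same datagram. Real implementations also handle timeouts, duplicates, and overlapping fragments, which this sketch treats as incomplete input.

```python
def reassemble(fragments):
    """Rebuild the payload from fragments in any order; None if incomplete."""
    ordered = sorted(fragments, key=lambda frag: frag[0])
    if not ordered or ordered[-1][1]:
        return None  # the final fragment (more-fragments = 0) is missing
    payload = bytearray()
    for offset_units, _more, data in ordered:
        if offset_units * 8 != len(payload):
            return None  # a gap or overlap: some fragment is missing
        payload += data
    return bytes(payload)


# Fragments arriving out of order are still put back together correctly.
arrived = [
    (185, True, b"B" * 1480),
    (370, False, b"C" * 1040),
    (0, True, b"A" * 1480),
]
print(len(reassemble(arrived)))  # 4000
```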

The IP header (IPv4)

The IP header is a fairly complex structure. There is no need to memorize its layout; understanding the role of each part is enough to deepen our understanding of the IP protocol.

Some of the important parts are as follows:

  • Total Length: the total number of bytes in the IP header plus the data. This field is 16 bits long, so the maximum length of an IP packet is 65535 bytes (2^16 - 1). Different data links have different MTUs, but the IP protocol hides these differences. From the perspective of the upper layer, IP can always carry packets of up to 65535 bytes by fragmenting them itself.

  • ID (Identification): used for fragment reassembly. Fragments of the same datagram share the same ID. However, even when the IDs match, fragments whose destination address, source address, or upper-layer protocol differ are considered to belong to different datagrams.

  • Flags: Consists of three bits used for fragmentation and reassembly.

    The first bit is unused and must currently be 0.

    The second bit (Don't Fragment) indicates whether the packet may be fragmented: 0 means fragmentation is allowed and 1 means it is not. This bit is used in path MTU discovery.

    The third bit (More Fragments) indicates whether this is the last fragment of a fragmented packet: 1 means more fragments follow and 0 means this is the last fragment.

  • FO (Fragment Offset): consists of 13 bits and indicates the position of the fragment within the original data. It can express 8192 (2^13) positions in units of 8 bytes, so it covers 8 x 8192 = 65536 bytes of data, with a largest offset of 8 x 8191 = 65528 bytes.

  • TTL (Time To Live): the number of routers a packet may still pass through. Each router decrements the TTL by 1, and a packet whose TTL reaches 0 is discarded. This prevents packets from circulating forever in a routing loop, the problem mentioned earlier.

  • Protocol: Indicates the protocol of the header that follows the IP header. For example, the TCP protocol number is 6, and the UDP protocol number is 17.

  • Header Checksum: used to detect whether the IP header has been corrupted.

  • Options: used only for testing or diagnostics. If options are present, Padding is added so that the header length is a multiple of 32 bits.
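
To tie the fields together, here is a rough sketch that unpacks a 20-byte IPv4 header with Python's struct module and verifies its checksum. The sample header bytes are an illustrative example that does not come from this article, and the dictionary keys are names chosen for this sketch.

```python
import struct


def checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words, then complemented."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options are ignored)."""
    (ver_ihl, _tos, total_length, ident, flags_fo,
     ttl, protocol, _hdr_checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    header_len = (ver_ihl & 0x0F) * 4
    return {
        "version": ver_ihl >> 4,
        "header_length": header_len,
        "total_length": total_length,
        "identification": ident,
        "dont_fragment": bool(flags_fo & 0x4000),    # second flag bit
        "more_fragments": bool(flags_fo & 0x2000),   # third flag bit
        "fragment_offset": (flags_fo & 0x1FFF) * 8,  # converted to bytes
        "ttl": ttl,
        "protocol": protocol,
        # Summing a valid header including its checksum field yields 0.
        "checksum_ok": checksum(raw[:header_len]) == 0,
        "source": ".".join(map(str, src)),
        "destination": ".".join(map(str, dst)),
    }


# Illustrative header: a UDP packet from 192.168.0.1 to 192.168.0.199.
SAMPLE = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
for name, value in parse_ipv4_header(SAMPLE).items():
    print(f"{name}: {value}")
```

Changing any byte of SAMPLE makes checksum_ok come out False, which is how a receiver notices a damaged header.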