This article will familiarize you with the TCP/IP protocol.

This article is a long one, so let’s start with a mind map of its contents.

One, layers of the computer network architecture

Computer network architecture is layered

As you can see, TCP/IP differs slightly from OSI in the layered modules. The OSI reference model focuses on “what are the necessary functions of the communication protocol”, while TCP/IP puts more emphasis on “what kind of program should be developed to implement the protocol on the computer”.

Two, TCP/IP fundamentals

1. Specific meaning of TCP/IP

Taken literally, TCP/IP might seem to refer only to the two protocols TCP and IP, and in practice it sometimes does. In many cases, however, it is the collective name for the group of protocols needed to communicate over IP. Specifically, IP and ICMP, TCP and UDP, TELNET and FTP, and HTTP all belong to TCP/IP. They are closely tied to TCP or IP and are an essential part of the Internet. Because the term refers to these protocols as a whole, TCP/IP is sometimes called the Internet protocol suite. Communication over the Internet requires matching network protocols, and TCP/IP was originally developed as a protocol family for the Internet. So the protocols of the Internet are TCP/IP, and TCP/IP is the protocol suite of the Internet.

2. The packet

The terms packet, frame, datagram, segment, and message all describe units of data. They are roughly distinguished as follows:

Packet is a general-purpose term.
A frame is the unit of data at the data link layer.
A datagram is the unit of data at the network layer and above, in protocols such as IP and UDP.
A segment is a piece of the data stream in TCP.
A message is the unit of data in an application protocol.

At each layer, a header is attached to the data being sent. The header contains the information that layer needs, such as the destination address and other protocol-related fields. Generally, the information supplied for the protocol is the header, and the content to be sent is the data. From the point of view of a lower layer, everything it receives from the layer above, header included, is simply its own data.

A packet transmitted over the network therefore consists of two parts: the header used by the current protocol, and the data handed down from the layer above. The structure of the header is defined in detail by the protocol specification. The header states explicitly how the protocol should read the data; conversely, by looking at the header you can tell what information the protocol needs and what data it will process. The packet header is, so to speak, the face of the protocol.

3. Data processing flow

The following figure shows user A sending an email to user B:

① Application processing. First, the application encodes the message, which corresponds to the function of the OSI presentation layer. Once encoded, the mail is not necessarily sent immediately. Managing when to establish the communication connection and when to send the data corresponds to the function of the OSI session layer.
② TCP module processing. Following the application's instructions, TCP establishes the connection, sends the data, and tears the connection down. TCP delivers the data from the application layer reliably to the peer. To do this, it attaches a TCP header to the front of the application data.
③ IP module processing. IP treats the TCP header and TCP data handed down by TCP as its own data and prepends its own IP header. Once the IP packet has been generated, IP consults the routing control table to decide which router or host should receive it.
④ Network interface (Ethernet driver) processing. To Ethernet, the IP packet handed down from IP is just data. The driver attaches an Ethernet header to this data and sends it. The resulting Ethernet frame is transmitted to the receiver through the physical layer.
⑤ Network interface (Ethernet driver) processing. On receiving an Ethernet frame, the host first looks at the destination MAC address in the Ethernet header to decide whether the frame is addressed to it. If not, the host discards the data. If it is, the host reads the type field in the Ethernet header to determine the kind of data and passes it to the corresponding module, such as IP or ARP. In this example it is IP.
⑥ IP module processing. The IP module does something similar when it receives the data. It checks whether the destination IP address in the header matches its own. If so, it passes the data to the module indicated by the protocol field of the header, such as TCP or UDP. In this example it is TCP. On a router, the destination address is usually not the router's own. In that case the router consults the routing control table to find the host or router the data should go to next and forwards it.
⑦ TCP module processing. The TCP module first computes the checksum to determine whether the data has been corrupted. It then checks whether the data is arriving in sequence-number order. Finally, it checks the port number to identify the application. Once the data has been received in full, it is passed to the application identified by the port number.
⑧ Application processing. The receiving application gets exactly the data the sender sent. It parses the data and displays the content.
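
The way each layer wraps the data from the layer above, and the way the receiver unwraps it again, can be sketched in Python. This is only an illustration: the three header layouts are simplified toy formats invented for this example (not the real Ethernet, IP, or TCP headers), and the addresses, ports, and the names encapsulate and decapsulate are assumptions.

```python
import struct

# Toy header layouts (assumptions, NOT the real wire formats).
TCP_FMT = "!HH"      # source port, destination port
IP_FMT = "!4s4sB"    # source IP, destination IP, protocol number
ETH_FMT = "!6s6sH"   # destination MAC, source MAC, type field


def encapsulate(message: bytes) -> bytes:
    """Sender side: every layer prepends its header to the upper layer's data."""
    segment = struct.pack(TCP_FMT, 49152, 25) + message
    packet = struct.pack(
        IP_FMT, bytes([192, 168, 0, 1]), bytes([192, 168, 0, 2]), 6
    ) + segment
    frame = struct.pack(ETH_FMT, b"\x04" * 6, b"\x02" * 6, 0x0800) + packet
    return frame


def decapsulate(frame: bytes):
    """Receiver side: every layer reads its header and passes the rest upward."""
    eth_len = struct.calcsize(ETH_FMT)
    _dst_mac, _src_mac, ether_type = struct.unpack(ETH_FMT, frame[:eth_len])
    if ether_type != 0x0800:          # 0x0800 marks an IPv4 payload
        raise ValueError("not an IP packet")
    packet = frame[eth_len:]

    ip_len = struct.calcsize(IP_FMT)
    _src_ip, _dst_ip, protocol = struct.unpack(IP_FMT, packet[:ip_len])
    if protocol != 6:                 # protocol number 6 is TCP
        raise ValueError("not a TCP segment")
    segment = packet[ip_len:]

    tcp_len = struct.calcsize(TCP_FMT)
    _src_port, dst_port = struct.unpack(TCP_FMT, segment[:tcp_len])
    # The destination port tells us which application gets the message.
    return dst_port, segment[tcp_len:]
```

Calling decapsulate(encapsulate(b"hello")) hands back the destination port together with the original message, which mirrors steps ① to ⑧ in miniature.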

Three, TCP and UDP at the transport layer

TCP/IP has two representative transport layer protocols, namely TCP and UDP.

TCP is a connection-oriented, reliable stream protocol. A stream is a continuous run of data with no internal boundaries. When an application sends messages over TCP, their order is preserved, but they reach the receiver as one continuous stream with nothing marking where one message ends and the next begins. TCP implements sequence control and retransmission control to provide reliable transmission. It also offers functions such as “flow control” and “congestion control”, and it improves network utilization.
UDP is a datagram protocol with no reliability. The finer details of handling are left to the upper-layer application. With UDP, the size of each message sent is preserved, but there is no guarantee that the message will arrive. Applications therefore sometimes retransmit as needed.
The strengths and weaknesses of TCP and UDP cannot be compared in simple, absolute terms. TCP is used where reliable transmission at the transport layer is necessary. UDP, on the other hand, is mainly used for communication that demands high speed and real-time performance, and for broadcast communication. Choose between TCP and UDP according to the purpose of the application.

1. The port number

The addresses used at the data link layer and the IP layer are the MAC address and the IP address, respectively. The former identifies different computers on the same link, while the latter identifies hosts and routers interconnected in a TCP/IP network. The transport layer has a similar concept of an address: the port number. The port number identifies the different applications communicating on the same computer, so it is also called the program address.

1.1 Identifying applications by Port Number

Multiple programs can run on a computer at the same time. It is these port numbers that transport layer protocols use to identify the applications that are communicating with each other on the host and accurately transfer data.

1.2 Communication identification by IP address, port number, and protocol number

It is not enough to identify a particular communication by the target port number alone.

The communication is identified by port number, IP address, and protocol number

Communication between ① and ② takes place on two computers. They both have the same target port number, 80. This can be distinguished by the source port number.
The target and source port numbers of ③ and ① are the same, but their source IP addresses are different.
In addition, when the IP address and port number are all the same, we can also distinguish by the protocol number (TCP and UDP).
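
The three cases above can be made concrete by treating each communication as a key built from all five values. The sketch below is illustrative only; the FiveTuple name, the addresses, and the session labels are assumptions, not part of any real network stack.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """The five values that together identify one communication."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


first = FiveTuple("10.0.0.1", 2001, "10.0.0.9", 80, "TCP")
# Same destination port 80, told apart by the source port.
second = FiveTuple("10.0.0.1", 2002, "10.0.0.9", 80, "TCP")
# Same ports as the first one, told apart by the source IP address.
third = FiveTuple("10.0.0.2", 2001, "10.0.0.9", 80, "TCP")
# Same addresses and ports as the first one, told apart by the protocol.
fourth = FiveTuple("10.0.0.1", 2001, "10.0.0.9", 80, "UDP")

# Using the tuple as a dictionary key keeps all four communications separate.
sessions = {
    first: "session-1",
    second: "session-2",
    third: "session-3",
    fourth: "session-4",
}
```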

1.3 Determining the Port number

Standard assigned port numbers: This method is also called static assignment. Each application has its own designated port number, although that does not mean any port number can be chosen at will. Widely used application protocols such as HTTP, FTP, and TELNET have fixed port numbers. These are known as well-known port numbers and range from 0 to 1023. Beyond these, there are officially registered port numbers ranging from 1024 to 49151, though these can be used for any communication purpose.
Dynamic allocation: The server must decide which port number it listens on, but the client receiving the service does not need to. The client application can leave the port number unset and let the operating system assign one. Dynamically allocated port numbers range from 49152 to 65535.
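
The three ranges can be captured in a few lines of Python. The function name classify_port and the labels it returns are assumptions chosen for this sketch; only the numeric boundaries come from the text.

```python
def classify_port(port: int) -> str:
    """Return the range a port number falls into, using the boundaries above."""
    if not 0 <= port <= 65535:
        raise ValueError("a port number must fit in 16 bits")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic"
```

For example, classify_port(80) falls in the well-known range, while a port the operating system hands to a client, such as 49152, falls in the dynamic range.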

1.4 Port numbers and Protocols

Port numbers are scoped by the transport layer protocol in use, so different transport layer protocols can use the same port number independently.
In addition, well-known port numbers are assigned regardless of the transport layer protocol: for a given port number, the same kind of application handles the traffic whichever protocol carries it.

2. UDP

UDP does not provide complex control mechanisms; it uses IP to provide a connectionless communication service.
It passes data from the application to the network as-is, the moment it receives it. Even when the network is congested, UDP performs no flow control to relieve the congestion.
If a packet is lost in transit, UDP does not retransmit it.
If packets arrive out of order, UDP does not put them back in order.
Any of these controls that are needed must be implemented by the application that uses UDP.
UDP is used mainly for: 1. communication with a small total number of packets (such as DNS and SNMP); 2. multimedia communication such as video and audio (real-time communication); 3. application communication confined to a specific network such as a LAN; 4. one-to-many communication (broadcast, multicast).

3. TCP

TCP is quite different from UDP. It implements a full set of control functions for data transmission, such as retransmission when a packet is lost and reordering of segments that arrive out of order. UDP has none of these.
In addition, as a connection-oriented protocol, TCP sends data only after confirming that the peer exists, which keeps wasted traffic under control.
Through these mechanisms, TCP achieves highly reliable communication on top of the connectionless IP network (mainly through checksums, sequence numbers, acknowledgements, retransmission control, connection management, and window control).

3.1 Three-way handshake (key point)

TCP provides connection-oriented communication. Connection-oriented means that the two ends make preparations with each other before data communication begins.
The three-way handshake means that establishing a TCP connection requires the client and the server to exchange a total of three packets. In socket programming, this process is triggered when the client calls connect.

Here is a flow chart of the three-way handshake:

First handshake: The client sets the SYN flag to 1, randomly generates an initial sequence number seq=J, and sends the packet to the server. The client enters the SYN_SENT state and waits for the server to confirm.
Second handshake: When the server receives the packet, the flag SYN=1 tells it that the client is requesting a connection. The server sets both the SYN and ACK flags to 1, sets the acknowledgement number ack=J+1, randomly generates its own sequence number seq=K, and sends the packet to the client to confirm the connection request. The server enters the SYN_RCVD state.
Third handshake: On receiving the confirmation, the client checks that ack is J+1 and that the ACK flag is 1. If so, it sets the ACK flag to 1 and ack=K+1 and sends the packet to the server. The server checks that ack is K+1 and that the ACK flag is 1. If so, the connection is established: the client and the server both enter the ESTABLISHED state, the three-way handshake is complete, and data transfer between them can begin.
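
To make the numbers in the three steps easier to follow, here is a small simulation that builds the three packets from the two initial sequence numbers J and K. It is a sketch, not a TCP implementation: the dictionary layout and the function name three_way_handshake are assumptions, and real sequence numbers are 32-bit values, which is why the arithmetic wraps at 2**32.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits wide and wrap around


def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three handshake packets for initial sequence numbers J and K."""
    syn = {
        "flags": {"SYN"},
        "seq": client_isn,                       # seq=J
    }
    syn_ack = {
        "flags": {"SYN", "ACK"},
        "seq": server_isn,                       # seq=K
        "ack": (client_isn + 1) % SEQ_SPACE,     # ack=J+1
    }
    ack = {
        "flags": {"ACK"},
        "seq": (client_isn + 1) % SEQ_SPACE,     # the SYN consumed one number
        "ack": (server_isn + 1) % SEQ_SPACE,     # ack=K+1
    }
    return [syn, syn_ack, ack]
```

With J=100 and K=300, the second packet carries ack=101 and the third carries ack=301, exactly the checks each side performs before entering ESTABLISHED.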

3.2 Four waves (key point)

A TCP connection is terminated by four waves: tearing down the connection requires the client and the server to exchange a total of four packets. In socket programming, this process is triggered when either the client or the server calls close.
Because a TCP connection is full-duplex, each direction must be closed separately. The principle is that when one side has finished sending data, it sends a FIN to terminate the connection in that direction. Receiving a FIN only means that no more data will flow in that direction; the receiver of the FIN can still send data on the connection until it sends a FIN of its own. The side that closes first performs an active close, and the other side performs a passive close.

Here is a diagram of the four waves:

The side that initiates the close can be either the client or the server. The description below assumes it is the client.
First wave: The client sends a FIN with seq=M to shut down data transfer from the client to the server, and enters the FIN_WAIT_1 state. This means “I, the client, have no more data to send you.” If the server still has data to send, it does not have to close the connection yet and can keep sending.
Second wave: After receiving the FIN, the server sends an ACK with ack=M+1, telling the client “I have received your request, but I am not ready yet; please keep waiting for my message.” The server enters the CLOSE_WAIT state, and the client enters the FIN_WAIT_2 state and waits for the server's FIN.
Third wave: When the server has confirmed that all its data has been sent, it sends a FIN with seq=N to the client, telling the client that it is ready to close the connection. The server enters the LAST_ACK state.
Fourth wave: On receiving the FIN, the client knows it can close the connection, but it does not trust the network: the server might not receive the final ACK. So the client sends an ACK with ack=N+1 and enters the TIME_WAIT state. If the server does not receive the ACK, it retransmits its FIN, and the client can acknowledge it again. When the server receives the ACK, it closes the connection. If the client receives nothing further after waiting 2MSL, it concludes that the server has closed normally and closes its own side as well. The four waves are now complete.

The above is the case where one party actively closes and the other party passively closes. In practice, there will also be the case where active closing is initiated at the same time. The specific process is as follows:

3.3 Improving reliability with sequence numbers and acknowledgements

In TCP, when data from the sender reaches the receiving host, the receiving host returns a notification that the data has been received. This message is called an acknowledgement (ACK). After sending data, the sender waits for the peer's acknowledgement. If one arrives, the data has reached the peer successfully. Otherwise, there is a good chance the data was lost.
If no acknowledgement arrives within a certain period, the sender can assume the data was lost and send it again. So even when packets are lost, the data still reaches the peer, which makes the transmission reliable.
Not receiving an acknowledgement does not necessarily mean the data was lost. The data may have arrived while the acknowledgement was lost on the way back. In that case the sender also retransmits, mistakenly believing the data never reached its destination.
An acknowledgement may also simply be delayed for other reasons, and it is not unusual for it to arrive after the source host has already retransmitted the data. In this case, the source host simply retransmits the data as the mechanism dictates.
It is undesirable for the destination host to receive the same data over and over. To provide reliable transmission to upper-layer applications, the destination host must discard duplicate packets. This is what sequence numbers are for.
A sequence number is a number assigned in order to each byte (8-bit octet) of the data sent. The receiver reads the sequence number and the data length from the TCP header of the received segment and returns the sequence number it expects to receive next as the acknowledgement number. With sequence numbers and acknowledgement numbers, TCP can tell which data has been received and which still needs to be received, and thereby achieves reliable transmission.
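
The receiver's side of this bookkeeping can be sketched as follows. This is a deliberately simplified model: the Receiver class is an invention for this example, it numbers bytes starting from whatever initial value it is given, and it simply discards anything that is not the next expected data instead of buffering out-of-order segments as a real TCP may do.

```python
class Receiver:
    """Tracks the next expected sequence number and drops duplicate data."""

    def __init__(self, initial_seq: int) -> None:
        self.next_expected = initial_seq
        self.data = bytearray()

    def receive(self, seq: int, payload: bytes) -> int:
        """Process one segment and return the acknowledgement number."""
        if seq == self.next_expected:
            # New in-order data: accept it and advance by its length.
            self.data += payload
            self.next_expected += len(payload)
        # Duplicates (and anything out of order) are discarded; either way
        # the reply names the sequence number the receiver wants next.
        return self.next_expected
```

If a 1000-byte segment starting at sequence number 1 arrives twice, the second copy is discarded and both replies carry the acknowledgement number 1001.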

3.4 Determining the retransmission timeout

The retransmission timeout is the interval the sender waits for an acknowledgement before retransmitting data. If no acknowledgement arrives within this period, the sender sends the data again. Ideally, this would be the shortest time within which “the acknowledgement is certain to have come back”.
TCP has to deliver high-performance communication in any network environment and keep doing so however network congestion changes. To that end, it measures the round-trip time and its deviation every time it sends a packet. The retransmission timeout is a value slightly larger than the sum of the round-trip time and the deviation.
On BSD-derived Unix systems and on Windows, timeouts are managed in units of 0.5 seconds, so the retransmission timeout is an integer multiple of 0.5 seconds. The initial default, however, is generally set to about 6 seconds.
If no acknowledgement arrives after the data has been retransmitted, it is sent again. Each time, the wait for an acknowledgement grows exponentially, doubling or quadrupling.
Data is not retransmitted indefinitely, either. If there is still no acknowledgement after a certain number of retransmissions, TCP concludes that the network or the peer host has failed, forcibly closes the connection, and notifies the application that communication was terminated abnormally.
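
One common way to turn “round-trip time plus deviation” into a number is the smoothed estimator described in RFC 6298. The sketch below follows that scheme as an assumption; the article itself does not give the constants, and real stacks also apply a clock granularity and a minimum timeout that are left out here. The class and function names and the 60-second cap are likewise choices made for this example.

```python
class RtoEstimator:
    """Smoothed RTT estimator in the style of RFC 6298 (simplified)."""

    ALPHA = 1 / 8   # weight of a new sample in the smoothed RTT
    BETA = 1 / 4    # weight of a new sample in the RTT deviation
    K = 4           # how many deviations to add on top of the RTT

    def __init__(self) -> None:
        self.srtt = None     # smoothed round-trip time
        self.rttvar = None   # round-trip time deviation

    def update(self, rtt: float) -> float:
        """Feed one RTT measurement (seconds) and return the new timeout."""
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        return self.srtt + self.K * self.rttvar


def backed_off_timeout(rto: float, retries: int, max_rto: float = 60.0) -> float:
    """Double the timeout for every retransmission already made, up to a cap."""
    return min(rto * 2 ** retries, max_rto)
```

A steady stream of identical measurements makes the deviation shrink, so the timeout settles just above the round-trip time; each unanswered retransmission then doubles the wait.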

3.5 Sending data by segment

When a TCP connection is established, the two ends also determine the unit in which data will be sent, known as the “maximum segment size” (MSS). Ideally, the MSS is exactly the largest data length that IP can carry without fragmentation.
When TCP transmits a large amount of data, it splits the data into MSS-sized pieces and sends them. Retransmission is also done in units of MSS.
The MSS is worked out between the two hosts during the three-way handshake. When each host sends its connection request, it writes an MSS option in the TCP header to tell the other side the MSS its interface can accommodate. The smaller of the two values is then used.
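
As a rough illustration of the arithmetic, the sketch below derives an MSS from a link MTU, picks the smaller of two advertised values, and cuts a byte string into MSS-sized pieces. It assumes IPv4 and TCP headers of 20 bytes each with no options; the function names are hypothetical.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IP packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


def negotiate_mss(local_mss: int, peer_mss: int) -> int:
    """Both sides advertise an MSS; the smaller value is used."""
    return min(local_mss, peer_mss)


def split_into_segments(data: bytes, mss: int) -> list:
    """Cut the application data into pieces of at most mss bytes."""
    return [data[i:i + mss] for i in range(0, len(data), mss)]
```

On an Ethernet link with an MTU of 1500 bytes this gives an MSS of 1460, so 3000 bytes of application data go out as two full segments and one short one.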

3.6 Using window control to improve speed

If TCP acknowledges every single segment before sending the next, it has one drawback: the longer the round-trip time of a packet, the lower the communication performance.
To solve this problem, TCP introduced the concept of a window. Acknowledgement is no longer done per segment but in larger units, which shortens the transfer time considerably. In other words, the sending host does not wait for an acknowledgement after sending a segment but keeps sending. As shown in the figure below:

Window control
The window size is the maximum amount of data that can be sent without waiting for an acknowledgement. The window size in the figure above is 4 segments. This mechanism relies on large buffers so that multiple segments can be acknowledged at the same time.

3.7 Sliding Window control

The data inside the window can be sent without waiting for an acknowledgement. However, the sender is still responsible for retransmission if part of the data is lost before the acknowledgement covering the whole window arrives. For this reason, the sending host keeps the data in a buffer until it receives the corresponding acknowledgement.
The part outside the sliding window consists of data that has not yet been sent and data that the peer has already acknowledged. Once data has been sent and acknowledged as expected, it can be cleared from the buffer.
When an acknowledgement arrives, the window slides forward to the sequence number in that acknowledgement. In this way, multiple segments can be sent in succession, which improves communication performance. This mechanism is known as sliding window control.
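
The sender's view of a sliding window can be modelled with two counters. The sketch below is a simplification made for illustration: it counts whole segments numbered from 0 rather than bytes, it ignores loss and retransmission, and the class and method names are assumptions.

```python
from typing import List


class SlidingWindowSender:
    """Sender-side window bookkeeping, counted in whole segments."""

    def __init__(self, total_segments: int, window_size: int) -> None:
        self.total = total_segments
        self.window = window_size
        self.base = 0        # oldest segment not yet acknowledged
        self.next_seq = 0    # next segment that has never been sent

    def send_allowed(self) -> List[int]:
        """Send every segment the window currently permits and list them."""
        limit = min(self.base + self.window, self.total)
        sent = list(range(self.next_seq, limit))
        self.next_seq = max(self.next_seq, limit)
        return sent

    def on_ack(self, next_expected: int) -> None:
        """Cumulative acknowledgement: slide the window up to next_expected."""
        if next_expected > self.base:
            self.base = min(next_expected, self.next_seq)
```

With a window of 4, the sender emits segments 0 to 3 at once; an acknowledgement asking for segment 2 slides the window so that segments 4 and 5 may follow without any further waiting.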

3.8 Retransmission control under window control

With window control, loss falls into two cases:

① An acknowledgement fails to come back. In this case the data has already reached the peer and does not need to be retransmitted, because a later acknowledgement confirms it, as shown in the following figure:

② A segment is lost. When the receiving host receives data with a sequence number other than the one it expects next, it returns an acknowledgement for the data it has received so far. As shown in the following figure, once a segment is lost, the sender keeps receiving acknowledgements with the number 1001. So when the window is large and a segment is lost, acknowledgements with the same number are returned repeatedly. If the sending host receives the same acknowledgement three times in a row, it retransmits the corresponding data. This mechanism is more efficient than the timeout management described earlier and is therefore called fast retransmit control.
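
The duplicate-acknowledgement rule in case ② can be sketched as a small counter. One assumption needs stating: this example treats “three times” as three duplicates arriving after the first copy of an acknowledgement, which is how fast retransmit is commonly implemented; the threshold is a parameter so that a different reading can be tried. The function name is hypothetical.

```python
from typing import Iterable, List


def fast_retransmit_points(acks: Iterable[int], threshold: int = 3) -> List[int]:
    """Return the acknowledgement numbers that would trigger a fast retransmit.

    An acknowledgement counts as a duplicate when it repeats the previous
    acknowledgement number; the retransmit fires on the threshold-th duplicate.
    """
    retransmits = []
    last_ack = None
    duplicates = 0
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == threshold:
                retransmits.append(ack)   # resend the data starting at ack
        else:
            last_ack = ack
            duplicates = 0
    return retransmits
```

Fed the stream 1001, 1001, 1001, 1001, the function reports 1001 once, meaning the segment starting at sequence number 1001 is resent without waiting for the timeout.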

Four, IP at the network layer

IP (IPv4, IPv6) corresponds to layer 3, the network layer, of the OSI reference model. The main job of the network layer is to “enable communication between end nodes”. This kind of communication is also called “end-to-end communication”.
The layer below the network layer, the data link layer, mainly transfers packets between nodes connected by the same type of data link. As soon as a path spans multiple data links, the network layer is needed. The network layer can cross different data links, so packets can be delivered between two nodes even when they sit on different data links.
IP can be roughly divided into three functional modules: IP addressing, routing (forwarding to the final node), and IP fragmentation and reassembly.

1. The IP address

1.1 Overview of IP Addresses

In computer communication, identifying the peer requires an identifier that works like an address. At the data link layer, the MAC address is the identifier used to tell apart computers on the same link.
The network layer has address information of this kind too: the IP address. The IP address identifies the destination of a communication among all the hosts connected to the network. For this reason, every host or router in a TCP/IP network must be configured with its own IP address.
Whatever kind of data link a host is connected to, the form of its IP address stays the same.
An IP address (IPv4 address) is a 32-bit unsigned integer. Inside the computer, IP addresses are handled in binary. Since people are not used to reading binary, the 32 bits are split into four groups of 8 bits separated by “.”, and each group is converted to a decimal number. As follows:

2⁸	2⁸	2⁸	2⁸
10101100	00010100	00000001	00000001	(Base 2)
10101100.	00010100.	00000001.	00000001	(Base 2)
172.	20.	1.	1	(Base 10)
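
The same conversion can be written out in Python. The two function names are assumptions for this sketch; the standard ipaddress module does the same job, but doing it by hand shows the grouping into 8-bit pieces.

```python
def to_dotted_decimal(address: int) -> str:
    """Split a 32-bit integer into four 8-bit groups and print them in decimal."""
    groups = [(address >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(group) for group in groups)


def from_dotted_decimal(text: str) -> int:
    """Rebuild the 32-bit integer from dotted decimal notation."""
    value = 0
    for part in text.split("."):
        value = (value << 8) | int(part)
    return value
```

Passing 0b10101100_00010100_00000001_00000001, the binary value from the table, to to_dotted_decimal gives the decimal form shown in its last row.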

1.2 An IP address consists of a network identifier and a host identifier

As shown below, the network identifier is set to a different value for each segment of a data link. It must be chosen so that no two interconnected segments have the same address. Hosts connected to the same segment must share the same network address, and the host identifier must not be repeated within a segment. By setting network addresses and host addresses this way, the IP address of every host is guaranteed not to collide with any other across the whole interconnected network. That is, each IP address is unique.

As shown in the following figure, when an IP packet is forwarded to a router, it is routed by the network identifier of the destination IP address. Even without looking at the host identifier, the network identifier alone is enough to tell whether the destination lies in a given network segment.

1.3 IP Address Classification

IP addresses are divided into four classes: A, B, C, and D. The class, and with it the split between network identifier and host identifier, is determined by the first one to four bits of the address.
A Class A address starts with the bit “0”. Bits 1 through 8 are its network identifier. In decimal notation, 0.0.0.0 to 127.0.0.0 are Class A network addresses. The last 24 bits of a Class A address are the host identifier, so a single network can hold up to 16,777,214 host addresses.
A Class B address starts with the bits “10”. Bits 1 through 16 are its network identifier. In decimal notation, 128.0.0.0 to 191.255.0.0 are Class B network addresses. The last 16 bits of a Class B address are the host identifier, so a single network can hold up to 65,534 host addresses.
A Class C address starts with the bits “110”. Bits 1 through 24 are its network identifier. In decimal notation, 192.0.0.0 to 223.255.255.0 are Class C network addresses. The last 8 bits of a Class C address are the host identifier, so a single network can hold up to 254 host addresses.
A Class D address starts with the bits “1110”. Bits 1 through 32 are its network identifier. In decimal notation, 224.0.0.0 to 239.255.255.255 are Class D network addresses. Class D addresses have no host identifier and are commonly used for multicast.
One point needs attention when assigning the host identifier. The bits of the host identifier must not be all 0 or all 1. All zeros is used only to denote the network address itself or when the IP address is not known, and a host identifier of all ones is normally used as the broadcast address. Both must therefore be excluded when allocating addresses. This is why a Class C network holds at most 254 (2⁸ - 2 = 254) host addresses.
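
The class rules and the host-count arithmetic can be checked with a short sketch. The function names are assumptions; addresses whose first four bits are 1111 are outside the four classes described here, so the sketch simply labels them “other”.

```python
def address_class(address: str) -> str:
    """Determine the class of an IPv4 address from its leading bits."""
    first_octet = int(address.split(".")[0])
    if first_octet >> 7 == 0b0:
        return "A"
    if first_octet >> 6 == 0b10:
        return "B"
    if first_octet >> 5 == 0b110:
        return "C"
    if first_octet >> 4 == 0b1110:
        return "D"
    return "other"  # leading bits 1111, not covered by the four classes above


def max_hosts(host_bits: int) -> int:
    """Usable host addresses: every bit pattern except all zeros and all ones."""
    return 2 ** host_bits - 2
```

max_hosts(8), max_hosts(16), and max_hosts(24) reproduce the limits quoted for Class C, B, and A networks.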

1.4 Broadcast Address

A broadcast address is used to send a packet to all hosts connected to the same link. Setting all the host identifier bits of an IP address to 1 yields the broadcast address.
There are two kinds of broadcast: local and directed. A broadcast within the sender's own network is called a local broadcast; a broadcast to a different network is called a directed broadcast.

1.5 IP multicast

Multicast is used to send packets to all hosts in a specific group. Because it uses IP directly, it offers no reliable transmission.
Unlike broadcast, multicast can pass through routers and delivers packets only to the groups that need them. See below:

IP multicast
Multicast uses Class D addresses. If the first four bits of an address are “1110”, it is a multicast address. The remaining 28 bits serve as the multicast group number.
In addition, for multicast, all hosts (hosts other than routers, and end hosts) must belong to the group 224.0.0.1, and all routers must belong to the group 224.0.0.2.

1.6 Subnet Mask

Today, the split between the network identifier and the host identifier of an IP address is no longer dictated by the address class. Instead, an identifier called the “subnet mask” makes it possible to divide networks at a finer granularity than Class A, B, or C. In effect, this mechanism takes part of what was originally the host address of a Class A, B, or C network and uses it as a subnet address, dividing the original network into multiple physical networks.
The subnet mask is a 32-bit number when written in binary. The bits corresponding to the network identifier of the IP address are all 1, and the bits corresponding to the host identifier are all 0. An IP address is thus no longer bound by its class and can set the length of its network identifier freely through the subnet mask. The 1 bits of the subnet mask must, of course, be contiguous and start from the first bit of the address.
There are two ways to write a subnet mask. In the first, the IP address and the subnet mask are written on two separate lines. For example, if the first 26 bits of 172.20.100.52 are the network address:

The IP address	172.	20.	100.	52
Subnet mask	255.	255.	255.	192

The network address	172.	20.	100.	0
Subnet mask	255.	255.	255.	192

The broadcast address	172.	20.	100.	63
Subnet mask	255.	255.	255.	192

The second method is to append the number of bits in the network address to the IP address, separated by a slash (/), as follows:

The IP address	172.	20.	100.	52	/ 26
The network address	172.	20.	100.	0	/ 26
The broadcast address	172.	20.	100.	63	/ 26

In addition, with the second notation, the trailing “0” parts of the network address can be omitted. For example, 172.20.0.0/26 is the same as 172.20/26.
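
The values in the tables above can be reproduced with Python's standard ipaddress module. The helper name subnet_info and the dictionary keys are assumptions made for this sketch.

```python
import ipaddress


def subnet_info(cidr: str) -> dict:
    """Derive the mask, network address, and broadcast address from slash notation."""
    interface = ipaddress.ip_interface(cidr)
    network = interface.network
    return {
        "netmask": str(network.netmask),
        "network": str(network.network_address),
        "broadcast": str(network.broadcast_address),
    }
```

Calling subnet_info("172.20.100.52/26") yields the subnet mask, network address, and broadcast address listed for that example.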

2. Routing

The address used to deliver a packet is the network layer address, that is, the IP address. The IP address alone, however, is not enough to get the packet to the destination. Along the way, information of the form “send to this router or host next” is needed to actually move the data toward its destination. This information is stored in the routing control table.
The routing control table can be set manually by an administrator or refreshed automatically as routers exchange information with one another. The former is called static routing control and the latter dynamic routing control.
IP always assumes that the routing control table is correct. IP itself, however, does not define any protocol for building routing control tables; it has no mechanism for doing so. The table is built by separate protocols known as routing protocols.

2.1 IP Address and Routing Control

The network address part of an IP address is used for routing control.
The routing control table records network addresses together with the address of the router each should be sent to next.
When sending an IP packet, the host or router first determines the destination address in the packet header, then looks up the entry in the routing control table whose network address matches it, and forwards the packet to the next router given by that entry. If several entries in the routing control table match, the one with the longest matching network address is chosen.

Routing control table and IP packet forwarding
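
Choosing the entry with the longest matching network address can be sketched as a lookup over a small table. The table contents, the next-hop addresses, and the function name next_hop are made up for this example; a real router uses far more efficient data structures than a linear scan.

```python
import ipaddress
from typing import Dict, Optional


def next_hop(routing_table: Dict[str, str], destination: str) -> Optional[str]:
    """Return the next hop of the matching entry with the longest prefix."""
    dest = ipaddress.ip_address(destination)
    best_network = None
    best_hop = None
    for prefix, hop in routing_table.items():
        network = ipaddress.ip_network(prefix)
        if dest in network:
            # Prefer the entry whose network part is longest.
            if best_network is None or network.prefixlen > best_network.prefixlen:
                best_network, best_hop = network, hop
    return best_hop


# Hypothetical routing control table: network address -> next router.
table = {
    "0.0.0.0/0": "10.0.0.254",        # default route
    "172.20.0.0/16": "10.0.0.1",
    "172.20.100.0/24": "10.0.0.2",
}
```

A packet for 172.20.100.52 matches all three entries, and the /24 entry wins because it matches the most bits.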

3. IP fragmentation and reassembly

Each type of data link has a different maximum transmission unit (MTU), because each type of data link serves a different purpose, and the MTU it can carry varies accordingly.
Every host must be able to handle IP fragments. Fragmentation happens only when a packet is too large to be sent over a link in one piece.
Fragmented IP datagrams are reassembled only by the destination host. Routers fragment packets but do not reassemble them.

3.1 Discovering a Path MTU

Fragmentation has its drawbacks. For example, it increases the processing load on routers, so routers should avoid fragmenting IP packets whenever possible.
Path MTU discovery was introduced to address these shortcomings. The path MTU is the largest packet size that can travel from the sending host to the receiving host without being fragmented, that is, the smallest MTU among all the data links on the path.
Once the path MTU is known, the sending host can size its packets to fit, so routers along the path do not need to fragment them, and TCP can send the largest packets the path allows.
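The definition of the path MTU reduces to taking a minimum. The sketch below also derives the largest TCP payload that fits, assuming 20-byte IPv4 and TCP headers without options. The link MTU values are illustrative examples, and both function names are hypothetical.

```python
def path_mtu(link_mtus):
    """The path MTU is the smallest MTU of all links on the path."""
    return min(link_mtus)

def max_tcp_payload(pmtu, ip_header=20, tcp_header=20):
    """Largest TCP payload that fits in one unfragmented packet."""
    return pmtu - ip_header - tcp_header

# Example path: an Ethernet link, an FDDI link, and a PPPoE link.
links = [1500, 4352, 1492]
pmtu = path_mtu(links)
print(pmtu, max_tcp_payload(pmtu))
```

In practice the sender does not know the link MTUs in advance and learns the path MTU from ICMP error messages, which this sketch leaves out.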

4. IPv6

IPv6 (IP version 6) is the Internet protocol standardized to solve the exhaustion of IPv4 addresses. An IPv4 address consists of four 8-bit bytes, 32 bits in total. An IPv6 address is four times as long, 128 bits, and is generally written as eight groups of 16 bits.

4.1 Features of IPv6

Larger address space and aggregation of routing control table entries.
Better performance. The header has a fixed length (40 bytes) and no longer carries a header checksum. The simpler header structure reduces the load on routers, and routers no longer fragment packets.
Plug and play support. IP addresses can be assigned automatically even without a DHCP server.
Authentication and encryption. These provide security functions that guard against forged IP addresses and eavesdropping.
Multicast and Mobile IP are defined as extension functions.

4.2 IPv6 address notation

A 128-bit IPv6 address is generally written in hexadecimal as groups of 16 bits separated by colons (:).
A run of consecutive all-zero groups can be omitted and replaced by two colons (::). However, the double colon may appear only once in an address.
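Python's standard `ipaddress` module applies these notation rules, which makes it convenient for experimenting. The sample address and the helper name `is_valid_ipv6` are only illustrative.

```python
import ipaddress

addr = ipaddress.IPv6Address("1080:0:0:0:8:800:200C:417A")

# Full form: eight zero-padded 16-bit groups.
print(addr.exploded)    # 1080:0000:0000:0000:0008:0800:200c:417a
# Short form: the run of zero groups is replaced by "::".
print(addr.compressed)  # 1080::8:800:200c:417a

def is_valid_ipv6(text):
    """Return True if text parses as an IPv6 address."""
    try:
        ipaddress.IPv6Address(text)
        return True
    except ValueError:
        return False

# "::" may appear only once, so the second string is rejected.
print(is_valid_ipv6("1080::8:800:200c:417a"))
print(is_valid_ipv6("1080::8::417a"))
```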

4.3 Structure of an IPv6 address

As with IPv4, the type of an IPv6 address is identified by its leading bits.
Communication over the Internet uses global unicast addresses. Such an address is unique across the Internet and must be formally assigned.

Unspecified address	0000… 0000 (128 bits)	::/128
Loopback address	0000… 0001 (128 bits)	::1/128
Unique local address	1111 110	FC00::/7
Link-local unicast address	1111 1110 10	FE80::/10
Multicast address	1111 1111	FF00::/8
Global unicast address	(other)
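The table can be turned directly into a small classifier. This is a minimal sketch that follows the table as written, so anything outside the listed prefixes is reported as global unicast. The function name `classify` is hypothetical.

```python
import ipaddress

# Prefixes taken from the table above, checked in order.
PREFIXES = [
    ("unspecified", ipaddress.ip_network("::/128")),
    ("loopback", ipaddress.ip_network("::1/128")),
    ("unique local", ipaddress.ip_network("fc00::/7")),
    ("link-local unicast", ipaddress.ip_network("fe80::/10")),
    ("multicast", ipaddress.ip_network("ff00::/8")),
]

def classify(text):
    """Name the address type by comparing the leading bits with each prefix."""
    addr = ipaddress.IPv6Address(text)
    for name, network in PREFIXES:
        if addr in network:
            return name
    return "global unicast"  # the "(other)" row of the table

for sample in ["::", "::1", "fd12:3456::1", "fe80::1", "ff02::1", "2001:db8::1"]:
    print(sample, "->", classify(sample))
```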

4.4 Global Unicast Address

A global unicast address is unique worldwide. It is the most commonly used kind of IPv6 address, both for Internet communication and for communication within a domain.
As shown in the following figure, the address consists of an n-bit global routing prefix, an m-bit subnet ID, and a (128-n-m)-bit interface ID. Current IPv6 networks use n = 48, m = 16, and 128-n-m = 64, so the first 64 bits form the network ID and the last 64 bits form the host ID.
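To make the bit layout concrete, the sketch below splits an address into its three fields with shifts and masks. The defaults follow the n = 48 and m = 16 values given above. The function name and the sample address are only illustrative.

```python
import ipaddress

def split_global_unicast(text, n=48, m=16):
    """Split an address into (global routing prefix, subnet ID, interface ID)."""
    value = int(ipaddress.IPv6Address(text))
    interface_bits = 128 - n - m
    routing_prefix = value >> (128 - n)                    # top n bits
    subnet_id = (value >> interface_bits) & ((1 << m) - 1)  # next m bits
    interface_id = value & ((1 << interface_bits) - 1)      # remaining bits
    return routing_prefix, subnet_id, interface_id

prefix, subnet, interface = split_global_unicast(
    "2001:db8:1234:5678:9abc:def0:1234:5678")
print(hex(prefix), hex(subnet), hex(interface))
```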

4.5 Link-local Unicast Address

A link-local unicast address is unique within a single data link. It is used for communication on the same link without passing through a router. The interface ID generally holds the 64-bit version of the MAC address.
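One common way to obtain the 64-bit version of a 48-bit MAC address is the modified EUI-64 format: insert the bytes FF FE in the middle and flip the universal/local bit of the first byte. The sketch below assumes that method. The function name and the sample MAC address are hypothetical, and many systems today use randomized interface IDs instead.

```python
import ipaddress

def link_local_from_mac(mac):
    """Build an fe80::/10 address whose interface ID is derived from mac."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the MAC address.
    eui64 = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    prefix = 0xFE80 << 112  # fe80:: in the top 16 bits
    return ipaddress.IPv6Address(prefix | int.from_bytes(eui64, "big"))

print(link_local_from_mac("00:1a:2b:3c:4d:5e"))
```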

4.6 Unique Local Address

A unique local address is used when no Internet communication takes place.
Although networks that use unique local addresses are not connected to the Internet, the global ID is generated randomly so that the address is unique as far as possible.
L is usually set to 1.
The global ID is chosen randomly.
The subnet ID identifies a subnet within the domain.
The interface ID identifies an interface.
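The fields listed above can be assembled as follows. The sketch assumes the usual layout of a 7-bit prefix (1111 110), the 1-bit L flag, a 40-bit global ID, a 16-bit subnet ID, and a 64-bit interface ID. It draws the global ID from Python's `secrets` module, which is a simplification chosen for this example rather than a standardized generation procedure. The function name is hypothetical.

```python
import ipaddress
import secrets

def make_unique_local(subnet_id, interface_id, global_id=None):
    """Assemble a unique local address with L = 1 (first byte 0xfd)."""
    if global_id is None:
        global_id = secrets.randbits(40)  # random 40-bit global ID
    value = (
        (0xFD << 120)          # prefix 1111 110 followed by L = 1
        | (global_id << 80)
        | (subnet_id << 64)
        | interface_id
    )
    return ipaddress.IPv6Address(value)

print(make_unique_local(subnet_id=1, interface_id=1))
```

Passing `global_id` explicitly lets a site reuse the value it generated once for all of its subnets.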

4.7 IPv6 fragmentation

In IPv6, fragmentation is performed only by the originating host. Routers do not take part in it.
The minimum MTU in IPv6 is 1280 bytes. Devices with limited system resources, such as embedded systems, therefore do not need to perform path MTU discovery. They can simply send IP packets in pieces of at most 1280 bytes.
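A sending host that skips path MTU discovery could size its fragments as in the sketch below. It assumes the fixed 40-byte IPv6 header plus an 8-byte fragment extension header in every fragment, and data lengths that are multiples of 8 bytes except for the last fragment. The constant and function names are hypothetical.

```python
IPV6_MIN_MTU = 1280
IPV6_HEADER = 40       # fixed IPv6 header length
FRAGMENT_HEADER = 8    # fragment extension header length

def fragment_sizes(payload_len, mtu=IPV6_MIN_MTU):
    """Return the data length carried by each fragment the sender emits."""
    per_fragment = (mtu - IPV6_HEADER - FRAGMENT_HEADER) // 8 * 8
    sizes = []
    remaining = payload_len
    while remaining > 0:
        size = min(per_fragment, remaining)
        sizes.append(size)
        remaining -= size
    return sizes

print(fragment_sizes(3000))
```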

4.8 IP Header (Omitted)

5. IP protocol related technologies

IP is designed to deliver packets to their final destination host, but IP alone is not enough to communicate. Other functions are also needed, such as resolving host names and MAC addresses and handling errors that occur while packets are in transit.

5.1 DNS

When we visit a website we normally type not an IP address but a string of letters and dots. In general, users of TCP/IP do not work with IP addresses directly. This is possible thanks to the Domain Name System (DNS), which automatically translates such a string into a specific IP address.
DNS works for IPv6 as well as IPv4.
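From a program's point of view, this translation is usually a single call to the system resolver. The sketch below uses the standard `socket.getaddrinfo` function, which returns both IPv4 and IPv6 results. The wrapper name `resolve` is hypothetical, and the answers depend on the resolver configuration of the machine running it.

```python
import socket

def resolve(hostname):
    """Return the sorted IPv4 and IPv6 addresses reported for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry ends with a socket address whose first item is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve("example.com")` on a connected machine would return whatever addresses the DNS currently holds for that name.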

5.2 ARP

Once the IP address is known, an IP datagram can be sent to the destination. At the underlying data link layer, however, actual communication requires the MAC address that corresponds to that IP address.
ARP (Address Resolution Protocol) is a protocol for address resolution. Using the destination IP address as a clue, it finds the MAC address of the next device that should receive the packet. ARP applies only to IPv4, not IPv6. IPv6 uses ICMPv6 neighbor discovery messages instead.
RARP (Reverse ARP) is the reverse of ARP: it finds an IP address from a MAC address.
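To show what an ARP request carries, the sketch below packs the 28-byte ARP message for Ethernet and IPv4. It only builds the bytes and does not send anything, since that would need a raw socket and administrator rights. The function name and the sample addresses are hypothetical.

```python
import socket
import struct

def build_arp_request(sender_mac, sender_ip, target_ip):
    """Pack an ARP request asking which MAC address owns target_ip."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,                                    # hardware type: Ethernet
        0x0800,                               # protocol type: IPv4
        6,                                    # MAC address length in bytes
        4,                                    # IPv4 address length in bytes
        1,                                    # operation: 1 = request, 2 = reply
        bytes.fromhex(sender_mac.replace(":", "")),
        socket.inet_aton(sender_ip),
        b"\x00" * 6,                          # target MAC is not known yet
        socket.inet_aton(target_ip),
    )

packet = build_arp_request("00:1a:2b:3c:4d:5e", "192.168.1.10", "192.168.1.1")
print(len(packet), packet.hex())
```

The reply uses the same layout with the operation set to 2 and the previously unknown MAC address filled in.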

5.3 ICMP

ICMP is used to confirm whether an IP packet reached its destination, to report why an IP packet was discarded, and to help correct network settings.
In IPv4, ICMP plays only a supporting role. In other words, IPv4 communication is still possible without ICMP. In IPv6 the role of ICMP is expanded, and IPv6 cannot communicate properly without ICMPv6.
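The best-known use of ICMP is the echo request sent by ping to check reachability. As a sketch, the code below builds an ICMPv4 echo request (type 8, code 0) and computes its checksum with the standard Internet checksum. It does not send the message, and the function names are hypothetical.

```python
import struct

def internet_checksum(data):
    """One's-complement sum of 16-bit words, as used by ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:   # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier, sequence, payload=b""):
    """Build an ICMP echo request with a valid checksum."""
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

message = build_echo_request(identifier=1, sequence=1, payload=b"ping")
print(message.hex())
```

A receiver verifies the message by running the same checksum over all of it and expecting zero.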

5.4 DHCP

Setting the IP address of every host by hand is tedious. In particular, with portable devices such as laptops, smartphones, and tablets, the IP address would have to be set again every time the device moves to a new place.
The Dynamic Host Configuration Protocol (DHCP) was created to set IP addresses automatically and to manage their allocation centrally. With DHCP, a computer can communicate over TCP/IP as soon as it is connected to the network. In other words, DHCP makes plug and play possible.
DHCP is available in IPv6 as well as IPv4.
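The idea of central address management can be modelled with a toy lease pool. This is not the DHCP message exchange itself, only a sketch of the bookkeeping a server might do. The class name `LeasePool` and its methods are invented for the example.

```python
import ipaddress

class LeasePool:
    """Toy pool that hands out one address per client MAC address."""

    def __init__(self, network):
        # hosts() skips the network and broadcast addresses.
        self.free = list(ipaddress.ip_network(network).hosts())
        self.leases = {}

    def request(self, mac):
        """Return the client's existing lease or allocate a new address."""
        if mac not in self.leases:
            if not self.free:
                raise RuntimeError("address pool exhausted")
            self.leases[mac] = self.free.pop(0)
        return self.leases[mac]

    def release(self, mac):
        """Return the client's address to the pool, if it holds one."""
        address = self.leases.pop(mac, None)
        if address is not None:
            self.free.append(address)

pool = LeasePool("192.168.1.0/29")
print(pool.request("00:1a:2b:3c:4d:5e"))
print(pool.request("00:1a:2b:3c:4d:5f"))
```

A real server also tracks lease times and renewals, which this sketch leaves out.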

5.5 NAT

Network Address Translator (NAT) is a technology that lets hosts use private addresses on a local network and translates them into a global IP address when they connect to the Internet.
Network Address Ports Translator (NAPT) translates not only IP addresses but also TCP and UDP port numbers, so that multiple hosts can communicate through a single global IP address.
NAT (NAPT) was developed for IPv4, which is facing address exhaustion. However, NAT is also used with IPv6 to improve network security, and NAT-PT is often used for communication between IPv4 and IPv6.
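The port translation can be pictured as a pair of lookup tables. The sketch below is a toy model under simple assumptions: ports are handed out sequentially from a made-up starting value and entries never expire. The class name `Napt` and its methods are hypothetical.

```python
import itertools

class Napt:
    """Toy NAPT table mapping private (ip, port) pairs to global ports."""

    def __init__(self, global_ip, first_port=40000):
        self.global_ip = global_ip
        self._ports = itertools.count(first_port)
        self.outbound = {}  # (private_ip, private_port) -> global_port
        self.inbound = {}   # global_port -> (private_ip, private_port)

    def translate_out(self, private_ip, private_port):
        """Rewrite the source of an outgoing packet."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            port = next(self._ports)
            self.outbound[key] = port
            self.inbound[port] = key
        return self.global_ip, self.outbound[key]

    def translate_in(self, global_port):
        """Find the private destination for a returning packet."""
        return self.inbound.get(global_port)

nat = Napt("203.0.113.5")
print(nat.translate_out("192.168.1.10", 5000))
print(nat.translate_out("192.168.1.11", 5000))
print(nat.translate_in(40001))
```

Two hosts using the same private port receive different global ports, which is what lets them share one global address.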

5.6 IP tunnel

Two IPv6 networks with an IPv4 network sandwiched between them

In the network environment shown in the preceding figure, network A and network B cannot communicate with each other directly. An IP tunnel must be used for them to communicate.
The IP tunnel treats each whole IPv6 packet sent from network A as data, prepends an IPv4 header to it, and forwards the result to network C.
An IP header is normally followed by a TCP or UDP header. However, cases where an IP header is followed by another IP header, or by an IPv6 header, are becoming more common. This method of placing one network-layer header directly after another is called an IP tunnel.
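The encapsulation step can be sketched as follows. The code prepends a 20-byte IPv4 header whose protocol field is 41, the number assigned to encapsulated IPv6. To keep the example short it leaves the header checksum at zero, which a real implementation would have to compute. The function names and tunnel endpoint addresses are hypothetical.

```python
import socket
import struct

IPPROTO_IPV6_IN_IPV4 = 41  # protocol number for an encapsulated IPv6 packet

def encapsulate(ipv6_packet, tunnel_src, tunnel_dst, ttl=64):
    """Prepend an IPv4 header so the IPv6 packet can cross an IPv4 network."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        (4 << 4) | 5,              # version 4, header length 5 x 32-bit words
        0,                         # type of service
        20 + len(ipv6_packet),     # total length
        0,                         # identification
        0,                         # flags and fragment offset
        ttl,
        IPPROTO_IPV6_IN_IPV4,
        0,                         # header checksum, left as 0 in this sketch
        socket.inet_aton(tunnel_src),
        socket.inet_aton(tunnel_dst),
    )
    return header + ipv6_packet

def decapsulate(ipv4_packet):
    """Strip the outer IPv4 header at the far end of the tunnel."""
    header_len = (ipv4_packet[0] & 0x0F) * 4
    if ipv4_packet[9] != IPPROTO_IPV6_IN_IPV4:
        raise ValueError("not an IPv6-in-IPv4 packet")
    return ipv4_packet[header_len:]

inner = b"\x60" + bytes(39)  # stand-in for a 40-byte IPv6 header
outer = encapsulate(inner, "192.0.2.1", "198.51.100.1")
print(len(outer), decapsulate(outer) == inner)
```

The router at the far end of the tunnel removes the outer header and forwards the original IPv6 packet into network B.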