The article was first published on the official account: Mr. Wu’s Garden of Eden


This article is mainly about computer networking. Some basic networking knowledge helps you get more out of it, but readers with no background can still follow along and come away with a basic understanding. I have tried to cover some common interview questions along the way, so whether you are doing day-to-day development or preparing for interviews, this article should give you a deeper understanding of computer networks.

The article occasionally digs to the bottom of things: it explains what the three-way handshake is, then goes a step further and asks why three handshakes are needed instead of two, and so on. It is a long article, but half an hour to an hour of reading will give you a sense of what computer networking is all about.

This article goes down only as far as the network layer. The data link layer and the physical layer below it are not discussed in depth, since they belong to hardware rather than software development.

The main contents of the article are as follows:

  1. Overview of computer networks

  2. The application layer

      • HTTP
      • Socket

  3. The transport layer

      • TCP
      • UDP

  4. The network layer

      • IP
      • DHCP
      • ICMP

Computer Network Overview:

A computer network is the product of combining communication technology with computer technology; in other words, computer networks exist to solve the problem of communication between computers. The problem of communication is the problem of exchanging data and information. In the real world we say “information” to describe what is exchanged; in the computer world we say “data” instead. Both words are abstractions: what the real world exchanges is so varied that a piece of text is called a message and a picture is also called a message. The words simply generalize over all of these things, and there is nothing lofty about them.

Moving on. Now that we know what a computer network is, you can probably list many related things: network cables, network cards, routers, IP addresses. Professionals also know TCP, UDP, HTTP and FTP. These things belong to different layers of the network, and there are different ways to divide the layers. The OSI model divides a computer network into seven layers (search for “OSI seven-layer model”). Because it is divided too finely and does not match real-world practice, it is often described as a theoretical success but a market failure. It is still a good tool for learning about computer networks.

What works in practice is dividing the network into five or four layers. I prefer five. The network cable belongs to the physical layer, the network adapter to the data link layer, and the router to the network layer, whose protocols include IP and ICMP. The final letter P stands for Protocol, so we sometimes say “the IP protocol”, “the HTTP protocol” and so on.

Computer networks have a layered architecture because the system as a whole is too complex and involves too many kinds of components to specify all at once. So it is divided into layers according to function. The advantage is that each layer is transparent to the others, which makes standardization easier, and a change inside one layer does not affect the work of the other layers.

Layering is used widely in computing. Its benefits are transparency and easier standardization. What does transparency mean? It means I do not need to know how you work, as long as you give me what I want. The simplest example: I do not need to know how a phone works in order to use one. I only need to know how to tap the screen and deal with what it shows me. What appears on the screen is what the phone gives us; we do not need to think about how the battery supplies power or how much current flows. That is what transparency means.

The same holds in computer networks. A layer does not need to understand how the layers above and below it work. It only needs to accept the data the lower layer hands up, process it, format the result according to the agreed specification and pass it to the next layer, which will receive it correctly. After layering, changes in one layer do not affect the others. IPv4 and IPv6 are different versions of the network-layer protocol, but switching from IPv4 to IPv6 makes no difference to HTTP at the application layer. HTTP does not care which one you use, as long as the data is delivered in HTTP format. That is the idea.

The application layer

The application layer has many protocols, each with its own function and use. This layer has the most protocols, and they are also the most specific. HTTP, for example, transports web content, FTP transfers files, SMTP transfers mail, and there are TELNET and others. Each protocol is complex, with detailed rules and many features, so I will not write about them here. If you need to, you can study the details of a particular protocol separately.

Socket

Between the application layer and the transport layer there is an interface called the socket. A socket does not belong to either layer. It is like a pipe between them, connecting the operating system to a specific application process, and applications use the operating system’s network functions by operating on sockets. “Socket” is a strange name. Because the transport layer is generally controlled by the operating system while application-layer protocols are controlled by application processes, we need an interface to bridge the application process and the underlying protocols. Such an interface is called an API (Application Programming Interface), and the concrete implementation that UNIX defined is called the socket.

Most operating systems now support sockets. Microsoft has its own adapted API called Winsock. These are all the same kind of API, customized for different operating systems. Another thing to distinguish is the popular WebSocket: WebSocket is an application-layer protocol, whereas a socket does not belong to any layer and is the concrete implementation of the interface through which a process communicates over the network via the operating system. The interface is the API; the implementation is the socket. We can program directly against sockets, that is, define our own transfer protocol for communication between two computers.

For example, if you find HTTP too complex, with its status codes and caching too bloated for the program you want to write, you only need to use a socket to send and receive data and write code to process whatever arrives. That is entirely enough for communication between computers, with no need to follow any existing application-layer protocol. The data-processing rules you define are called a custom protocol.
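To make the idea of a custom protocol concrete, here is a minimal sketch in Python. The rule it uses (every message is a 4-byte big-endian length followed by that many bytes) is one I made up for illustration, and the helper names `send_message`, `recv_exactly` and `recv_message` are hypothetical. It uses `socket.socketpair()` so that both ends live in one process.

```python
import socket
import struct

# Made-up rule for this sketch: each message is a 4-byte big-endian length
# followed by exactly that many payload bytes.
HEADER = struct.Struct("!I")


def send_message(sock: socket.socket, payload: bytes) -> None:
    """Send one framed message."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exactly(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; a stream socket may return fewer per recv call."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_message(sock: socket.socket) -> bytes:
    """Receive one framed message."""
    (length,) = HEADER.unpack(recv_exactly(sock, HEADER.size))
    return recv_exactly(sock, length)


# Two connected sockets in the same process stand in for two programs.
left, right = socket.socketpair()
send_message(left, b"ping")
send_message(left, b"a second message")
first = recv_message(right)
second = recv_message(right)
left.close()
right.close()
```

The length prefix is there because a byte stream has no message boundaries of its own; without some rule like this the receiver could not tell where one message ends and the next begins.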

When your program runs, the operating system automatically assigns it a port number. Together with the machine’s IP address, this uniquely identifies your process: when the operating system receives data addressed to that IP address and port, it finds your process and hands the data over, and the process handles it according to the program’s logic. Port numbers are either assigned automatically or configured manually when you call certain socket functions. So now you know where port numbers come from: they distinguish processes for the purpose of communication. Web servers such as Nginx also use sockets to bind port 80.
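The port assignment described above can be observed with a few lines of Python. This is only a sketch: binding to port 0 asks the operating system to choose a free port, and the loopback address `127.0.0.1` and the helper name `bind_ephemeral_udp_socket` are my own choices for the example.

```python
import socket


def bind_ephemeral_udp_socket(host: str = "127.0.0.1") -> socket.socket:
    """Bind a UDP socket to port 0 so that the OS picks a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))
    return sock


receiver = bind_ephemeral_udp_socket()
receiver.settimeout(2.0)  # avoid blocking forever if nothing arrives
ip, port = receiver.getsockname()  # the address and port the OS assigned

# A second socket sends to that IP address and port; the OS uses the pair
# to find the receiving socket and hand the data to it.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", (ip, port))
data, peer = receiver.recvfrom(1024)

sender.close()
receiver.close()
```

A server such as Nginx does the manual variant of the same thing: instead of port 0 it binds a fixed, well-known port.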

The transport layer

The two most typical transport-layer protocols are TCP and UDP. There are others, such as DCCP and SCTP, if you are curious. The transport layer provides logical communication between application processes. How should we understand this? Different processes run on a computer, and the transport layer has to work out which process a piece of data is meant for. It does so by port number. A networked process must use a socket; the socket is either assigned a port number automatically or bound to a chosen port with the socket’s bind function.

Both TCP and UDP perform multiplexing at the transport layer. This is not the multiplexing of the physical layer, which refers to sharing a physical line: frequency-division, time-division, code-division and wavelength-division multiplexing (the last is a form of frequency division, since wavelength and frequency are related). Transport-layer multiplexing is about how the computer handles the many packets it receives at the same time: how they are distributed and by what rule. Below the transport layer all packets are processed the same way, but at the transport layer different kinds of datagrams are dispatched differently. All UDP datagrams addressed to the same destination port, whichever host sent them, are delivered to the same socket. TCP segments are dispatched by connection. A TCP connection is one-to-one: it is identified by the source IP address and port together with the destination IP address and port, so each connection corresponds to exactly one port on each host. If another host wants two TCP connections to the same port on my machine at the same time, it needs two sockets (for example in two processes or threads) with two different port numbers on its side.

Let’s talk more about UDP.

UDP is a very simple extension of IP. All it does is take data up from the network layer, identify the port number, deliver the data to the right process and, optionally, perform error detection. It does nothing else. This simplicity brings many benefits. Because UDP is simple, it is easy to customize: you can add whatever functions you need at the application layer, which can process UDP datagrams and handle its specific communication requirements as it sees fit. Besides being easy to extend at the application layer, UDP has the following benefits:

  1. Unlike TCP, no connection needs to be established, so the delay before sending data is small. You do not have to set up a connection before sending.
  2. The implementation is relatively simple and does not require complex machinery such as connection maintenance.
  3. The header overhead is small. On top of the payload, UDP adds extra fields for the port numbers and error detection. This extra data is called the UDP header, which I will come to shortly.
  4. The application layer has good control over when data is sent and what is sent. If you want to build complex functions on top of UDP, you are free to customize as you like.

These features are mainly compared with TCP.

Let’s take a look at what a UDP datagram looks like.

As you can see, besides the data to be transmitted, a UDP datagram carries some additional data (the header), 8 bytes in total, made up of four 2-byte fields. The first two fields give the port the datagram comes from on the sending machine and the port it is destined for on the other computer. The third field is the length of the datagram, including the 8 header bytes. Because this field is only 16 bits, the size of the data section is limited. In practice the data portion is usually not very large. In a word, UDP is simple, and that shows in its concise header.

In the UDP header, the source port number can even be omitted (set to 0). That means I only want to send the data and do not need a reply; you could not reply anyway, because you do not know my port number. The checksum field can also be omitted, meaning the data is not checked, although it generally is. The check guards against errors during transmission that would leave the data incomplete or corrupted; displaying such data would produce garbage. If a datagram fails the check, it is discarded at the transport layer, normally silently, and that is the end of that transmission. UDP provides no retransmission mechanism. Any retransmission is controlled by the application layer, which must construct the data again and send it again. ICMP datagrams, which report other kinds of delivery problems back to the sender, were supposed to be covered later, but I did not get that far. Next time.
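The 8-byte header can be written out directly with Python’s `struct` module. This is only a sketch of the layout: the function names `build_udp_datagram` and `parse_udp_datagram` and the port numbers are hypothetical, and the checksum is left at 0 here.

```python
import struct

# Four 16-bit fields in network byte order:
# source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes,
                       checksum: int = 0) -> bytes:
    """Prefix the payload with an 8-byte UDP header."""
    length = UDP_HEADER.size + len(payload)  # the length field includes the header
    if length > 0xFFFF:
        raise ValueError("payload too large for the 16-bit length field")
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_datagram(datagram: bytes) -> dict:
    """Split a datagram back into header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(datagram)
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER.size:length],
    }


datagram = build_udp_datagram(5000, 53, b"hi")
fields = parse_udp_datagram(datagram)
```

In real programs the operating system builds this header for you; the sketch only shows what ends up on the wire.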

To continue.

UDP has one tricky aspect: error checking. I will not describe the checksum algorithm here; there is plenty of material online, and it is the same method the IP protocol uses. UDP error checking helps ensure correct port-to-port delivery and also verifies whether the data section contains errors. The transport-layer check is therefore the only point in the whole transmission at which the data itself is checked (TCP checks it too). During error detection some additional data is added to the checksum calculation. It is used only for the calculation and then dropped; what is submitted is still in the format shown in the picture above. What is added? Look at the picture below.

The red part is called the pseudo-header. Do not be misled by the name: it only takes part in the calculation. It is similar to a key in encryption, which is needed to compute the ciphertext but is not transmitted with it. Including it lets error detection tell whether the data was delivered to the correct destination and whether the protocol really is UDP, whose value in the protocol number field is 17 (decimal). Under IPv6 the contents of the red box are different. In this way, errors that go undetected at the IP layer are caught here, and the checksum ensures that the whole submitted datagram is correct.
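Here is a rough sketch of that calculation for IPv4, based on my reading of the UDP RFC: the pseudo-header holds the source address, the destination address, a zero byte, the protocol number 17 and the UDP length. The function names and the example addresses are hypothetical, and this is an illustration rather than production code.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    """IPv4 pseudo-header: used only for the calculation, never transmitted."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_length))


def build_udp_datagram(src_ip: str, dst_ip: str, src_port: int,
                       dst_port: int, payload: bytes) -> bytes:
    """Build header + payload with the checksum field filled in."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    checksum = internet_checksum(pseudo_header(src_ip, dst_ip, length)
                                 + header + payload)
    checksum = checksum or 0xFFFF  # 0 is reserved for "no checksum"
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def verify_udp_datagram(src_ip: str, dst_ip: str, datagram: bytes) -> bool:
    """Recompute over pseudo-header + datagram; a valid datagram yields 0."""
    return internet_checksum(pseudo_header(src_ip, dst_ip, len(datagram))
                             + datagram) == 0


good = build_udp_datagram("192.0.2.1", "192.0.2.2", 5000, 53, b"hello")
corrupted = good[:-1] + b"x"
```

Verifying `good` against a different destination address fails even though the datagram bytes are untouched, which is exactly the misdelivery check the pseudo-header exists for.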

Well, that covers the difficult part of UDP. I will not go into what UDP is used for. As for the pseudo-header, I found no detailed or accurate explanation online, and in the end I read the UDP RFC to understand it.

Let’s talk about TCP.

TCP is an excellent protocol with a rich set of features. It is point-to-point, that is, a port-to-port protocol. It is a reliable, ordered byte-stream protocol, and it is connection-oriented.

Both receiver and sender set aside storage to buffer data, which is used for retransmission and for reassembling segments.

Connection-oriented means that a connection must be established before communication. The connection state is maintained by the two endpoints, not by the nodes in between.

TCP provides reliable data transmission, flow control and congestion control.

TCP’s reliable delivery is based on an acknowledgement mechanism. I send you a packet and you must tell me you have received it. If you do not, I treat the packet as lost and send it again, until you tell me it has arrived. This logic ensures reliable transmission.

In the concrete implementation, TCP uses cumulative acknowledgement: what is acknowledged is a byte number in the data stream, not the number of data blocks received. This is why TCP is called an ordered byte-stream protocol.

If I send a TCP segment and wait a long time without receiving your acknowledgement, I send it again. An algorithm calculates the timeout. Its main input is the round-trip time (RTT), the time between sending a segment and receiving its ACK. The timeout for the current segment is set from a weighted combination of the historical average and the most recent RTT sample, so that it reflects both the current and the past state of the network. If a segment I have sent exceeds the timeout, I retransmit it.

TCP also has a sliding window mechanism. Sending one piece of data and then waiting for its acknowledgement would seriously hurt throughput, so data is sent in several segments at a time, and an acknowledgement only states how many bytes have been received so far. Suppose segment 5 is lost while segments 6, 7 and 8 arrive. Because segment 5 is missing, when segment 6 arrives the receiver sends an ACK carrying the sequence number of the first byte of segment 5, which says “I have not received these bytes yet”. Segment 7 then arrives, and the receiver again sends an ACK with the first byte of segment 5; the same happens for segment 8. When the sender receives three duplicate ACKs with the same number, it resends segment 5 immediately instead of waiting for the timer to expire. This mechanism is called fast retransmit. Why three duplicate ACKs? www.zhihu.com/question/21… has a clear answer, which argues that three is the best choice.
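The lost-segment scenario can be replayed with a small simulation. It is a simplified sketch, assuming equal-sized segments of 100 bytes numbered from 1 and a stream starting at byte 0; the function names are hypothetical and no real networking is involved.

```python
def cumulative_acks(arrivals, segment_size=100, first_seq=0):
    """Return the ACK number the receiver sends after each arriving segment.

    `arrivals` lists 1-based segment indexes in arrival order. The ACK is
    always the number of the first byte the receiver is still missing.
    """
    received = set()
    next_expected = 1  # index of the next in-order segment
    acks = []
    for index in arrivals:
        received.add(index)
        while next_expected in received:
            next_expected += 1
        acks.append(first_seq + (next_expected - 1) * segment_size)
    return acks


def fast_retransmit_trigger(acks, threshold=3):
    """Return the ACK number that triggers fast retransmit, or None."""
    last_ack = None
    duplicates = 0
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == threshold:
                return ack  # resend the segment that starts at this byte
        else:
            last_ack = ack
            duplicates = 0
    return None


# Segment 5 is lost; segments 6, 7 and 8 still arrive.
acks = cumulative_acks([1, 2, 3, 4, 6, 7, 8])
resend_from = fast_retransmit_trigger(acks)
```

With these assumptions segment 5 covers bytes 400 to 499, so the receiver keeps answering with 400, and the third repetition is what tells the sender to resend segment 5 right away.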

TCP also implements flow control. How? As just mentioned, TCP has a sliding window, and the receiver has a buffer as well. If the application reads slowly while data arrives quickly, the incoming data can exceed the receiver’s buffer. Anything beyond that cannot be accepted, so the receiver can only keep returning the acknowledgement for the last data that fitted into the buffer, and the sender keeps resending the same data, wasting network resources. Flow control addresses this: every acknowledgement carries the amount of buffer space remaining, and the sender sends no more than that, because it knows the receiver cannot take more. What if the acknowledgement says the remaining space is 0? The sender cannot simply stop, because if it sends nothing it receives no acknowledgements and never learns the new window size, even after the receiver has freed plenty of space. So when the advertised window is 0, the sender periodically sends small probe packets (zero window probes, ZWP) to find out how much space is left. Whether the probe is accepted or dropped, the receiver answers with an acknowledgement that carries its remaining buffer space. That is how TCP flow control works.

Another function of TCP is congestion control. TCP uses an end-to-end approach: the two computers do the controlling, and the routers in between provide no explicit control. There are several congestion control algorithms. The usual flow is as follows. At the start, the slow start algorithm is used and the sending window grows as 1, 2, 4, 8, 16, 32, 64. Once the window reaches a threshold, say 64, growth slows down: additive increase takes over and the window grows by one each time, 65, 66, 67… Suppose that at 80 three duplicate ACKs suddenly arrive. We conclude that a segment has been lost, apply multiplicative decrease (halving) at once, and the window drops to 40. Then we add one at a time again. If at 50 a segment times out, the window is cut straight to 1, because the network is in bad shape. Why the difference? Receiving three duplicate ACKs means the other side received at least three later segments; they arrived out of order, but the ACKs made it back. The network is still fairly clear: only one segment was lost while three others got through. A timeout means the segments may not have reached the receiver at all and were dropped by a router along the way. The network is then in very poor condition, and the sending rate must be cut sharply to avoid congestion.

Why have congestion control at all? Imagine the network without it. The routers in the middle are already overloaded, yet every sender keeps retransmitting after its timeouts, and all hosts do the same because everyone is timing out. That can only make the congestion worse. It is like a traffic jam: the road is already blocked and new cars keep arriving from behind, so of course it gets more and more congested. That is why we have congestion control. Other protocols, such as UDP and ICMP, have none. UDP is not held back by the state of the network at all: it sends whenever it wants, and if the network is congested, so be it, since it does not guarantee delivery anyway. Because the UDP sending rate is unlimited, a large volume of UDP datagrams can seriously affect today’s networks. For congestion control TCP uses slow start, additive increase, multiplicative decrease and fast recovery. Look them up in detail if you need to; this part is a little complicated. Real networks use these algorithms in combination rather than any single one.
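The window sizes in the walkthrough above can be reproduced with a toy model. This is a deliberately simplified sketch, not a real TCP implementation: one `"ack"` event stands for a whole round trip in which the full window was acknowledged, the window is counted in segments, and the event names and the function `congestion_window_trace` are made up for the example.

```python
def congestion_window_trace(events, ssthresh=64, initial_cwnd=1):
    """Return the window size after each event.

    Events: "ack" (a full window was acknowledged), "dup3" (three
    duplicate ACKs arrived) and "timeout" (the retransmission timer fired).
    """
    cwnd = initial_cwnd
    trace = [cwnd]
    for event in events:
        if event == "ack":
            if cwnd < ssthresh:
                cwnd = min(cwnd * 2, ssthresh)  # slow start: double
            else:
                cwnd += 1  # additive increase: one per round trip
        elif event == "dup3":
            ssthresh = max(cwnd // 2, 2)  # multiplicative decrease
            cwnd = ssthresh
        elif event == "timeout":
            ssthresh = max(cwnd // 2, 2)
            cwnd = initial_cwnd  # back to slow start
        else:
            raise ValueError(f"unknown event: {event}")
        trace.append(cwnd)
    return trace


# 1 → 64 by doubling, 65 → 80 one at a time, halve to 40,
# climb to 50, then a timeout drops the window to 1.
events = ["ack"] * 6 + ["ack"] * 16 + ["dup3"] + ["ack"] * 10 + ["timeout"]
trace = congestion_window_trace(events)
```

Capping the doubling at the threshold is a simplification of mine; it happens to match the numbers in the text, where 32 doubles exactly to 64.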

Ok, so with some of the functionality of TCP out of the way, let’s look at the TCP header.

  • 1. Source port: the 16-bit port number of the sender.
  • 2. Destination port: the 16-bit port number of the receiver.
  • 3. Sequence number: 32 bits, the sequence number of the first byte of data in this segment. If the segment carries no data (for example a pure acknowledgement), the sequence number does not advance. A segment with SYN=1 or FIN=1 is treated as containing one byte of data.
  • 4. Acknowledgement number: 32 bits, the sequence number of the next byte the receiver expects from the sender. It equals the sequence number of the segment just received plus the length of its data (plus 1 for a SYN or FIN), and it tells the other side “I have received everything before this number”.
  • 5. Header length: 4 bits, so the largest value is 15. The unit is 4 bytes, one row in the figure above, so the value of this field is the number of rows. The minimum header is 20 bytes (value 5) and the maximum is 15 x 4 = 60 bytes.
  • 6. Reserved: 6 bits, all 0.
  • 7. Urgent bit URG: URG=1 indicates that the segment contains urgent data that should be delivered as soon as possible, and that the urgent pointer field is valid.
  • 8. Acknowledgement bit ACK: ACK=1 indicates that the acknowledgement number field is valid; ACK=0 indicates that it is not.
  • 9. Push bit PSH: when the sender sets PSH=1, the receiver delivers the data to the application process as soon as possible.
  • 10. Reset bit RST: RST=1 indicates a serious error in the TCP connection; the connection must be released and then re-established.
  • 11. Synchronize bit SYN: used to synchronize sequence numbers when establishing a connection. SYN=1 with ACK=0 is a connection request; SYN=1 with ACK=1 means the receiver agrees to establish the connection.
  • 12. Finish bit FIN: FIN=1 indicates that the sender of this segment has finished sending data and asks to release the connection.
  • 13. Window: provides flow control; it gives the number of bytes of buffer space the receiver has available.
  • 14. Checksum: covers the header and the data. It is calculated and stored by the sender and verified by the receiver.
  • 15. Urgent pointer: valid only when URG=1. It indicates the number of bytes of urgent data in this segment.
  • 16. Options: variable length, up to 40 bytes. Most segments do not use this field, so the TCP header is usually 20 bytes.
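The field list above maps directly onto a 20-byte fixed header, which the following sketch unpacks with Python’s `struct` module. The function name `parse_tcp_header`, the example port numbers and the hand-built segment are hypothetical; the sketch does not verify the checksum or interpret the options.

```python
import struct

# Fixed 20-byte part of the header, in network byte order:
# source port, destination port, sequence number, acknowledgement number,
# header length + reserved + flag bits, window, checksum, urgent pointer.
TCP_FIXED_HEADER = struct.Struct("!HHIIHHHH")

FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
             "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


def parse_tcp_header(segment: bytes) -> dict:
    """Split a TCP segment into header fields, options and payload."""
    (src_port, dst_port, seq, ack, offset_and_flags,
     window, checksum, urgent_pointer) = TCP_FIXED_HEADER.unpack_from(segment)
    # The 4-bit header length counts 4-byte rows.
    header_length = (offset_and_flags >> 12) * 4
    if header_length < TCP_FIXED_HEADER.size or header_length > len(segment):
        raise ValueError("invalid TCP header length")
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_length": header_length,
        "flags": {name for name, bit in FLAG_BITS.items()
                  if offset_and_flags & bit},
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent_pointer,
        "options": segment[TCP_FIXED_HEADER.size:header_length],
        "payload": segment[header_length:],
    }


# A hand-built connection request: SYN=1, ACK=0, header length 5 rows.
example_syn = TCP_FIXED_HEADER.pack(
    50000, 80, 5433, 0, (5 << 12) | FLAG_BITS["SYN"], 65535, 0, 0)
parsed = parse_tcp_header(example_syn)
```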

Note that the TCP checksum also uses a pseudo-header. The protocol field in the pseudo-header is 6 for TCP and 17 for UDP. The transport-layer check is the only point in the whole transmission at which the correctness of the data is verified.

The following describes the TCP connection mechanism.

We know that TCP is connection-oriented: the two parties must establish a connection before exchanging data. TCP’s connection mechanism is famous for the three-way handshake and the four-way wave. The four-way wave is the process of closing a TCP connection. Let’s look at both.

The three-way handshake.

A few concepts first. The sequence number identifies the first byte of the data I send. When a connection is established, the operating system initializes a sequence number, such as 5433, from the computer’s clock. Some sources say the initial value increases by 1 every 4 µs; I have not studied it to that depth. At this point I am only setting up the connection and not sending any data, so the sequence number in my header is the initial value.

1. One end initiates the connection: it sends the target computer a segment with SYN=1 and an initial sequence number (seq) X. The initial sequence number is derived from the system time, mainly to make it harder for a third party to forge connections.

2. The target computer receives the request, agrees to it and returns a segment with SYN=1 and ACK=1. Because SYN=1, the target also initializes its own sequence number Y. ACK=1 indicates that the acknowledgement number is valid; its value is X+1, that is, ack=X+1.

3. After receiving the target computer’s segment, the initiator sends back an acknowledgement. ACK=1 marks it as an acknowledgement, and the acknowledgement number is set to Y+1.

Note that the acknowledgement number confirms the data the other side sent and I received. It is therefore the sequence number of the segment the other side sent plus the length of its data.

The two computers start with different sequence numbers. The three-way handshake is how they tell each other their sequence numbers.

A bit more on sequence numbers. Initialization: the initial value is derived from the system’s time information. The sequence number increases by 1 for every byte of data sent, and also by 1 for SYN=1 or FIN=1. The offset of the current sequence number from the initial sequence number, minus 1, is the number of bytes sent so far, because establishing the connection added an extra 1. The acknowledgement number is the sequence number of the received segment plus the length of its data. Each end has its own sequence number. That should make things clear.
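The sequence-number rules can be checked with a little arithmetic. The sketch below models segments as plain dictionaries; the names `three_way_handshake` and `ack_for` and the initial sequence numbers are hypothetical, and nothing is sent over a network.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers are 32 bits and wrap around


def ack_for(segment_seq: int, payload_len: int,
            syn: bool = False, fin: bool = False) -> int:
    """Acknowledgement number for a received segment.

    A SYN or a FIN counts as one byte in addition to the payload.
    """
    return (segment_seq + payload_len + int(syn) + int(fin)) % SEQ_SPACE


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three handshake segments for the given initial numbers."""
    syn = {"flags": {"SYN"}, "seq": client_isn}
    syn_ack = {"flags": {"SYN", "ACK"}, "seq": server_isn,
               "ack": ack_for(syn["seq"], 0, syn=True)}
    ack = {"flags": {"ACK"}, "seq": syn_ack["ack"],
           "ack": ack_for(syn_ack["seq"], 0, syn=True)}
    return [syn, syn_ack, ack]


syn, syn_ack, ack = three_way_handshake(client_isn=4444, server_isn=9000)

# After the handshake the client sends 100 bytes starting at seq 4445,
# so the server acknowledges 4545.
data_ack = ack_for(ack["seq"], 100)

# Bytes sent so far = offset from the initial sequence number, minus 1.
bytes_sent = (data_ack - 4444) - 1
```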

Why three handshakes instead of two.

Suppose I want to establish a connection with you and send SYN=1, seq=4444, but this segment is delayed in the network for a long time. I assume it was lost and send a new request: SYN=1, ACK=0, seq=5555. The old request then reaches you and you reply SYN=1, ACK=1, ack=4445. With only two handshakes the connection would now count as established, and there is a problem: our views of my seq do not match. When I send data, my seq values are counted from 5555, not 4444, so the data exchange goes wrong. Hence three handshakes. When I receive ack=4445 I know it is stale and do not act on it. I wait for ack=5556, reply with ack=Y+1, and only then, with the seq values exchanged and confirmed, do we start transmitting data.

Here’s the four-wave mechanism.

I am ready to disconnect, so I send FIN=1, seq=2222, ACK=1. (According to the official document, the ACK bit is 0 only in the first segment sent when establishing a connection; in all other segments it is 1, including those sent when disconnecting. Because the network does not guarantee that acknowledgements are delivered, we simply include the ACK every time; the field would be wasted otherwise.)

1. Once I have sent the FIN, I will not send you any more segments carrying data.

2. You receive my FIN and reply with an acknowledgement (ack=2222+1).

At this point, although you have acknowledged my disconnection request, you can go on transmitting. For example, if my FIN arrives before you have finished sending your data, you can continue until you are done.

You send a piece of data and I return an acknowledgement; you send another and I acknowledge again. Eventually you are finished and decide that you can disconnect too.

3. So you send me a FIN.

4. I receive your FIN and send you an acknowledgement. When you receive it, you can close immediately. I cannot: I worry that my acknowledgement may be lost on the way, in which case you will retransmit your FIN. If I receive no retransmitted FIN from you within a waiting period (twice the maximum segment lifetime), I close.

The four waves are needed because TCP is full-duplex: you can send data to me and I can send data to you, so each direction has to be closed separately, and each close takes two waves. A sends a FIN and B returns an ACK, which confirms to A that B has received its request. In the same way, B’s FIN needs an acknowledgement from A. That makes four waves.

Note also that B can continue to send data after A has requested disconnection. The four-way wave can also be shortened to three: A requests disconnection, and B, having no more data to send, returns its FIN together with the acknowledgement in a single segment. After A returns its acknowledgement, both sides are closed. There are still four steps really, but the middle two travel in the same segment.

Why wave four times?

Why four, when three would seem to be enough? As described above, three can work when the ACK and the FIN are combined. But you cannot drop an acknowledgement altogether: without it, how would I know that you received my disconnection request? That is how to answer this question.

The more interesting question is why the party that initiates the disconnection needs to wait for twice the Maximum Segment Lifetime (MSL).

The main purpose of waiting for 2MSL is to cover the case where the peer does not receive the last ACK. If that ACK is lost, the peer’s wait for it times out and it resends its FIN (the third wave); on receiving the resent FIN, the waiting side can send another ACK. Until 2MSL has elapsed the connection stays in the TIME_WAIT state, and the ports on both ends cannot be reused for it. Any late segments that arrive during the 2MSL wait are discarded. TTL and MSL are related but not simply equal: MSL must be greater than or equal to TTL.

The network layer

The core functions of the network layer are forwarding and routing. The network layer contains many protocols: because it has to implement routing, many routing protocols live in this layer. To implement forwarding, we need to know where to forward to, that is, we need addresses, and that is the job of the IP protocol. The IP protocol specifies the addressing rules and how to slice a datagram into smaller pieces. In addition, the network layer includes the important ICMP protocol. Messages carried by this protocol are like commands in the network: computers and intermediate routers play the role of officers and soldiers, acting and responding according to the content of the command.

Let’s talk about IP addresses.

We know that the length of an IPv4 address is limited to 32 bits, so the number of addresses is also limited and they need to be divided up, for example, part for the United States and part for China. Once the space is broken into chunks, each chunk is broken into smaller ones for provinces and cities, enterprises, or government agencies. This process is called subnetting. So an IP address is defined as consisting of two parts: the network number and the host number. The number of bits in the network number is not fixed. If a subnet is large, its network number has few bits, which leaves more bits for the host number, so the subnet can contain many hosts. As an example, China Unicom has a very large subnet, so its network number has very few bits, say 4, which leaves 28 bits for the host number and lets it identify nearly 2^28 hosts. All the routers in the cities that belong to the China Unicom network, and all the devices that access the Internet through Unicom, are hosts in the Unicom subnet. Of course, this description is extremely inaccurate. It is mainly meant to make the concepts of network number and host number easy to understand.
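The trade-off between network bits and host bits is plain arithmetic. Here is a minimal sketch, where `host_capacity` is a name made up for this example. It counts every possible host-number value, which is why the text says "nearly": a few values are reserved in practice.

```python
ADDRESS_BITS = 32  # length of an IPv4 address


def host_capacity(network_bits):
    """Number of distinct host-number values left after the network number."""
    host_bits = ADDRESS_BITS - network_bits
    return 2 ** host_bits


print(host_capacity(4))    # 4-bit network number, 28 host bits
print(host_capacity(24))   # 24-bit network number, 8 host bits
```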

A long time ago, network addresses were divided into five classes: A, B, C, D, and E. This scheme is no longer in use: it wasted a great many IP addresses, so today subnets are almost never divided by class. The IPv4 address space has already been fully allocated, but NAT lets every Internet-connected device still get an IP address (NAT is a little more complicated than that). With the rapid development of the Internet of Things and the spread of network resources, more and more devices need to access the Internet, so the country is vigorously promoting IPv6. Many people describe the number of IPv6 addresses as enough to assign an IP address to every grain of sand on the planet, let alone every device. Funnily enough, the push for IPv6 will also benefit the real-name system. Ahem, let’s go on.

Subnets are now written in the form x.x.x.x/n. For example, 202.113.132.45/24 means that the first 24 bits are the network number and the last 8 bits are the host number. Subnets are now divided according to this rule. Interested readers can look into the specific process of subnetting.
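To see how the /24 example splits into a network number and a host number, Python's standard `ipaddress` module can do the work. This is only an illustration of the notation. The variable names are arbitrary.

```python
import ipaddress

iface = ipaddress.ip_interface("202.113.132.45/24")
network = iface.network

print(network)                   # the network number together with its prefix length
print(network.netmask)           # the same prefix written as a subnet mask
print(32 - network.prefixlen)    # number of host bits

# The host number is whatever remains after masking out the network bits.
host_number = int(iface.ip) & int(network.hostmask)
print(host_number)
```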

Whether to study subnetting depends on your own interest, but I hope everyone understands the following, which is very useful: some special network addresses.

It is enough to get a first impression of these special addresses for now. You will run into them often in later projects and can learn their specific meaning then.

The following are some private network addresses, which are not routed on the public network. In other words, the following addresses are normally only reachable inside your own small LAN:

| RFC 1918 block name | IP address range | Number of addresses | Classful description | Largest CIDR block (subnet mask) | Host bits |
| --- | --- | --- | --- | --- | --- |
| 24-bit block | 10.0.0.0 - 10.255.255.255 | 16777216 | A single class A network | 10.0.0.0/8 (255.0.0.0) | 24 |
| Shared Address Space (RFC 6598) | 100.64.0.0 - 100.127.255.255 | 4194304 | 64 contiguous class-B-sized (/16) blocks | 100.64.0.0/10 (255.192.0.0) | 22 |
| 20-bit block | 172.16.0.0 - 172.31.255.255 | 1048576 | 16 contiguous class B networks | 172.16.0.0/12 (255.240.0.0) | 20 |
| 16-bit block | 192.168.0.0 - 192.168.255.255 | 65536 | 256 contiguous class C networks | 192.168.0.0/16 (255.255.0.0) | 16 |
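A quick way to check whether an address falls into one of the four blocks listed above is to test membership with the standard `ipaddress` module. The helper name `find_block` is made up for this sketch.

```python
import ipaddress

# The four address blocks listed above, written as CIDR blocks.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("100.64.0.0/10"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def find_block(address):
    """Return the CIDR block that contains the address, or None if there is none."""
    ip = ipaddress.ip_address(address)
    for block in PRIVATE_BLOCKS:
        if ip in block:
            return str(block)
    return None


print(find_block("192.168.1.10"))
print(find_block("172.20.3.4"))
print(find_block("8.8.8.8"))
print([block.num_addresses for block in PRIVATE_BLOCKS])
```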

It is enough to be aware of these.

Ok, with the network basics covered, let’s look at the format requirements for data. TCP or UDP segments from the transport layer get a header added at the network layer and are then handed to the data link layer. What does the network layer do to the data? First, the network layer has the forwarding function, and forwarding needs to know the destination, so the network layer adds the destination address to the data. The destination address is the IP address, which uniquely identifies a host. Second, the network layer fragments packets that exceed the Maximum Transmission Unit (MTU). The MTU of Ethernet is 1500 bytes. If the MTU is too small, a packet is divided into many pieces, and if it is too large, the transmission delay and packet error rate increase. So if the data handed down by the transport layer is larger than the MTU, it is fragmented at the network layer. Let’s look at the header of the IP datagram:

  1. Version, 4 bits: the version of the IP protocol. The widely used version today is 4, and for IPv6 the value is 6.

  2. Header length, 4 bits: the unit is 4 bytes (32 bits, one row in the figure above). The maximum value of a 4-bit field is 15, so the header length ranges from 20 to 60 (4 x 15) bytes. It is usually 20 bytes, which is also the minimum. When the header of an IP packet is not a multiple of 4 bytes long, it must be padded using the final padding field, so the data portion always starts at a multiple of 4 bytes.

  3. Differentiated services, 8 bits: almost unused at present. To support differentiated services, all routers along the path need to support them, but most routers do not, so the field is almost never used.

  4. Total length, 16 bits: the length of the header plus the data, in bytes. The maximum length of a datagram is therefore 2^16-1 = 65535 bytes. However, the MTU of the data link layer is smaller than 65535. When a datagram is encapsulated into a link-layer frame, its total length (header plus data) must not exceed the MTU of the data link layer below. If it does, the datagram needs to be fragmented (that is what the second row in the figure above is for).

  5. Identification, 16 bits: the IP software maintains a counter in memory that is incremented by 1 each time a datagram is generated, and assigns its value to the identification field. This “identification” is not a sequence number, because IP is a connectionless service and datagrams are not delivered in order. When a datagram must be fragmented because its length exceeds the MTU of the network, the value of this field is copied into the identification field of every fragment. The shared value allows the fragments to be correctly reassembled into the original datagram.

  6. Flags, 3 bits: currently only 2 of the bits are meaningful.

    The middle bit of the flags field is “Don’t Fragment” (DF). Fragmentation is allowed only when DF=0 and is not allowed when DF=1.

    The lowest bit of the flags field is “More Fragments” (MF). MF=1 means that more fragments follow this one. MF=0 means that this is the last fragment of the datagram.

  7. Fragment offset, 13 bits: after a long IP datagram is fragmented, this is the position of the start of a fragment within the original packet, that is, where the fragment begins relative to the start of the data portion of the original datagram. The unit is 8 bytes, which means the data length of every fragment except the last must be a multiple of 8 bytes (64 bits). Why is the IPv4 fragment offset counted in units of 8 bytes? The total length field has 16 bits but the fragment offset has only 13, so counting in single bytes could not describe every position. With units of 8 bytes, 13 bits cover 2^13 x 8 = 2^16 bytes.

  8. TTL, 8 bits: TTL (Time To Live) is the maximum number of routers an IP packet can be forwarded through (each router decrements the TTL by 1). It prevents datagrams from looping endlessly in the network because of some error and wasting network resources. As you can see, a router decrements this field and recalculates the header checksum when forwarding.

  9. Protocol, 8 bits: indicates which protocol the data carried by this datagram belongs to (such as TCP or UDP), so that the IP layer of the destination host knows which upper-layer protocol to hand the data portion to.

  10. Header checksum, 16 bits: note that this field only covers the header of the datagram, not the data portion. Each time the datagram passes through a router, the router has to recalculate the header checksum (the TTL always changes, and the flags and fragment offset change too if the router fragments the datagram again), so leaving out the data portion reduces the amount of calculation.

  11. The source IP address contains 32 bits.

  12. The destination IP address contains 32 bits.
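The total length, flags, and fragment offset fields work together during fragmentation. The following sketch computes the fragments for one datagram under simplifying assumptions: DF=0, a 20-byte header on every fragment, and no IP options. The function name `fragment` and the sample sizes are chosen only for illustration.

```python
def fragment(total_length, mtu, header_length=20):
    """Split one datagram into (offset_field, MF, fragment_total_length) tuples.

    Assumes DF=0 and the same header length on every fragment.
    """
    payload = total_length - header_length
    # Data carried per fragment must be a multiple of 8 bytes (except in the last one),
    # because the offset field counts in 8-byte units.
    max_data = (mtu - header_length) // 8 * 8
    fragments = []
    sent = 0
    while sent < payload:
        data = min(max_data, payload - sent)
        more = 1 if sent + data < payload else 0   # MF=1 while more fragments follow
        fragments.append((sent // 8, more, header_length + data))
        sent += data
    return fragments


# A 4000-byte datagram crossing an Ethernet link with an MTU of 1500 bytes.
for offset_field, mf, length in fragment(total_length=4000, mtu=1500):
    print(offset_field, mf, length)
```

Multiplying each offset field by 8 gives the byte position of that fragment's data in the original datagram, which is what the receiver uses to put the pieces back in order.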

This is the header of the IP datagram.
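To make the field layout concrete, here is a minimal sketch that unpacks the fixed 20-byte part of a header with Python's `struct` module. The sample header is hand-built with made-up values, its checksum is left at 0, and `parse_ipv4_header` is a hypothetical helper and not part of any library.

```python
import socket
import struct

HEADER_FORMAT = "!BBHHHBBH4s4s"   # network byte order, 20 bytes in total


def parse_ipv4_header(raw):
    """Decode the fixed 20-byte part of an IPv4 header into a dict."""
    (ver_ihl, dscp, total_length, identification, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack(HEADER_FORMAT, raw[:20])
    return {
        "version": ver_ihl >> 4,
        "header_length": (ver_ihl & 0x0F) * 4,           # field counts 4-byte units
        "total_length": total_length,
        "identification": identification,
        "DF": (flags_frag >> 14) & 1,
        "MF": (flags_frag >> 13) & 1,
        "fragment_offset": (flags_frag & 0x1FFF) * 8,    # field counts 8-byte units
        "ttl": ttl,
        "protocol": protocol,
        "checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


# Hand-built sample header (illustrative values only, checksum not computed).
sample = struct.pack(
    HEADER_FORMAT,
    (4 << 4) | 5,                   # version 4, header length 5 x 4 = 20 bytes
    0,                              # differentiated services
    1500,                           # total length
    0x1234,                         # identification
    (0 << 14) | (1 << 13) | 185,    # DF=0, MF=1, offset field 185
    64,                             # TTL
    6,                              # protocol 6 = TCP
    0,                              # header checksum
    socket.inet_aton("192.168.0.2"),
    socket.inet_aton("202.113.132.45"),
)
header = parse_ipv4_header(sample)
print(header)
```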

The following describes the Dynamic Host Configuration Protocol (DHCP).

As we know, there are two ways to set your computer’s IP address. The first is to set a fixed IP address by hand, which also requires configuring a series of other settings such as the subnet mask and the DNS server address. It is cumbersome and requires expertise. I used to follow tutorials without understanding any of it, and after half a day of fiddling still could not get online, so I simply chose to obtain the IP address automatically, which is the second way.

Dynamic Host Configuration Protocol (DHCP) is used to obtain an IP address automatically. The device sends a request to a DHCP server on the local network. The server checks which of its IP addresses are still unassigned and selects one for you, and now you have an IP address. The DHCP server gives you more than the IP address: along with it, it sends you the subnet mask, default gateway address, and DNS server address in the same packet.

What are the advantages of obtaining an IP address dynamically? The first is simplicity: plug in the network cable and it works, with nothing for the user to configure. The second is that IP addresses can be reused: when your computer is shut down, its IP address can be given to someone else.

Here is how a device communicates with the DHCP server to get an IP address.

  1. The device that wants to access the Internet broadcasts a DHCP message to the local network, and whoever responds is a DHCP server. This step is called sending a discover packet.
  2. The DHCP server picks a suitable IP address from its own pool and broadcasts an offer message.
  3. The device then broadcasts a request message asking whether it can use a particular IP address.
  4. The DHCP server sends an acknowledgement message confirming whether that IP address can be used.

The main reason for this exchange is that there may be multiple DHCP servers on a network. When a device asks for an IP address, all the DHCP servers receive the broadcast and each offers an IP address it considers suitable. Having received several offers, the device selects one IP address and broadcasts again, asking whether that address can be used. All the DHCP servers then check whether the address is one of their own. The server that owns it replies with an acknowledgement and updates its database to record that the address is in use. A server that does not own it takes back the address it had just offered.
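The four-step exchange with several servers can be sketched as a toy simulation. Nothing here touches the network: `DhcpServer` and `obtain_address` are hypothetical names, the address pools are invented, and the client simply takes the first offer, which is one possible selection policy among others.

```python
class DhcpServer:
    """Toy DHCP server holding a pool of free addresses (illustrative model only)."""

    def __init__(self, name, pool):
        self.name = name
        self.free = list(pool)
        self.offered = {}   # client id -> address currently on offer
        self.leased = {}    # client id -> address recorded as in use

    def on_discover(self, client):
        address = self.free.pop(0)
        self.offered[client] = address
        return address                       # step 2: the offer

    def on_request(self, client, address):
        offer = self.offered.pop(client, None)
        if offer == address:
            self.leased[client] = address    # record the address as in use
            return True                      # step 4: acknowledgement
        if offer is not None:
            self.free.insert(0, offer)       # not chosen: take the offer back
        return False


def obtain_address(client, servers):
    # Steps 1 and 2: the discover broadcast reaches every server, and each one offers.
    offers = [server.on_discover(client) for server in servers]
    # Step 3: the client picks one offer (here the first) and broadcasts its request.
    chosen = offers[0]
    # Step 4: every server sees the request, but only the owner acknowledges.
    acks = [server.name for server in servers if server.on_request(client, chosen)]
    return chosen, acks


servers = [
    DhcpServer("s1", ["192.168.1.10", "192.168.1.11"]),
    DhcpServer("s2", ["192.168.1.200", "192.168.1.201"]),
]
address, acked_by = obtain_address("aa:bb:cc:dd:ee:ff", servers)
print(address, acked_by)
print(servers[1].free)   # the second server has its offered address back
```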

The specific communication process is as follows:

  1. The default port of the DHCP server is 67.
  2. The default port from which a networked device sends DHCP packets is 68.
  3. Discover packet: source address 0.0.0.0:68, destination address 255.255.255.255:67. Neither party knows the other’s IP address, so they communicate by broadcast.
  4. Offer packet: source address X.X.X.X:67, destination address 255.255.255.255:68.
  5. The request and acknowledgement messages are then sent in the same way.

The acknowledgement packet contains the IP address, the subnet mask, and other data such as the expiration time of the IP address, which allows the server to reclaim the address. When an IP address is about to expire, the device uses DHCP to renew it.

DHCP has the following characteristics:

  1. It works at the application layer and is implemented by application processes.
  2. It uses UDP.
  3. It uses IP broadcast.
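These three characteristics show up directly if you build a discover message by hand: it is just a byte string assembled by an application, carried as a UDP payload from port 68 to the broadcast address on port 67. The sketch below only builds the bytes and sends nothing. The field layout follows the BOOTP/DHCP message format as I understand it from RFC 2131, and `build_discover` and the sample MAC address are invented for this example.

```python
import struct

CLIENT_ENDPOINT = ("0.0.0.0", 68)              # source of the discover message
BROADCAST_ENDPOINT = ("255.255.255.255", 67)   # destination of the discover message


def build_discover(mac, xid):
    """Build a minimal DHCP discover payload (fixed header plus two options)."""
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,             # op: 1 = request from a client
        1,             # htype: 1 = Ethernet
        6,             # hlen: length of a MAC address
        0,             # hops
        xid,           # transaction id chosen by the client
        0,             # secs
        0x8000,        # flags: ask the server to answer by broadcast
        b"\x00" * 4,   # ciaddr: the client has no address yet
        b"\x00" * 4,   # yiaddr: filled in by the server in its offer
        b"\x00" * 4,   # siaddr
        b"\x00" * 4,   # giaddr
        mac,           # chaddr (struct pads it to 16 bytes with zeros)
        b"",           # sname (64 zero bytes)
        b"",           # file (128 zero bytes)
    )
    magic_cookie = bytes([99, 130, 83, 99])
    options = bytes([53, 1, 1, 255])   # option 53 (message type) = 1 (discover), then end
    return header + magic_cookie + options


packet = build_discover(bytes.fromhex("aabbccddeeff"), xid=0x12345678)
print(len(packet), CLIENT_ENDPOINT, BROADCAST_ENDPOINT)
```

Actually sending it would mean writing these bytes to a UDP socket with broadcast enabled, which usually needs elevated privileges because port 68 is a low port.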

This seems like a simple and interesting protocol.

Next, let’s look at another interesting protocol, the Internet Control Message Protocol (ICMP).

It is closely tied to IP. Its main functions are error reporting and network probing. Error reporting covers the errors that can occur while data is forwarded across the network. For example, when a router finds that a packet’s TTL has reached 0, it discards the packet and sends an ICMP packet to the sender. For network probing, the most commonly used tool is ping, which is built on ICMP echo request and reply packets, one of the ICMP packet formats. According to these functions, ICMP packets fall into two categories.

The first category is error messages:

Destination unreachable: sent when the destination network or the destination port cannot be reached.

Source quench: when a router’s buffer is full, it sends this packet to the source host to make it lower its sending rate, as a form of congestion control. Other congestion control methods are used nowadays, so this packet is rarely seen.

Time exceeded: typically sent when the number of routers on the path exceeds the TTL value.

Parameter problem: sent when there is a problem in the header of the datagram.

Redirect: sent to the source host when the destination network should be reached not through this router but through another router next to it.

The second category is network probing messages:

Echo request and reply: used by ping.

Timestamp request and reply: used to obtain a timestamp.

In fact, there are many ICMP packet types, and they are not all easy to understand. It is fine to skim them without digging too deep, as long as you remember that much of the status and error reporting during transmission on the network is based on ICMP.
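As a small taste of what an ICMP packet looks like, here is a sketch that builds the echo request used by ping. It only constructs the bytes, since sending them requires a raw socket and usually administrator rights. The helper names are made up. The layout (type 8, code 0, checksum, identifier, sequence number, data) and the 16-bit one's-complement checksum are the standard ones.

```python
import struct


def internet_checksum(data):
    """16-bit one's-complement checksum, the same scheme the IP header uses."""
    if len(data) % 2:
        data += b"\x00"                              # pad to whole 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)     # fold the carry back in
    return ~total & 0xFFFF


def build_echo_request(identifier, sequence, payload=b"ping"):
    """ICMP echo request: type 8, code 0, checksum, identifier, sequence, data."""
    # First pass with a zero checksum field, then fill in the real value.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


packet = build_echo_request(identifier=1, sequence=1)
print(packet.hex())
# A receiver verifies the packet by running the same checksum over all of it.
print(internet_checksum(packet))
```

A checksum of 0 over the complete packet means the packet arrived intact, which is how the receiving side validates it.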

There are some situations in which ICMP error packets are not sent:

  • No ICMP error packet is sent in response to an ICMP error packet. If the error report itself runs into an error, that error is not reported.
  • For a fragmented IP datagram, an ICMP packet is sent only for the first fragment, not for the others. For example, if the target network is unreachable, the first fragment already hits the problem, and all the later fragments are headed for the same network. To reduce duplication and network load, the report is generally sent only for the first fragment.
  • Packets sent to a multicast IP address do not trigger ICMP packets. Such packets reach many hosts, which could cause a flood of error packets, so no ICMP packets are sent for them.
  • Datagrams with special IP addresses such as 127.0.0.0 or 0.0.0.0 do not trigger ICMP packets.
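The rules above can be written down as a single check. This is a hedged sketch and not how any particular router implements it: `should_send_icmp_error` and its parameters are hypothetical, and it assumes the special-address rule applies to the source address of the offending datagram.

```python
import ipaddress


def should_send_icmp_error(source, destination, fragment_offset=0, is_icmp_error=False):
    """Return False for the cases listed above in which no ICMP error is sent."""
    src = ipaddress.ip_address(source)
    dst = ipaddress.ip_address(destination)
    if is_icmp_error:             # never report an error about an ICMP error packet
        return False
    if fragment_offset != 0:      # only the first fragment triggers a report
        return False
    if dst.is_multicast:          # multicast destination: avoid a flood of reports
        return False
    if src.is_loopback or src.is_unspecified:   # 127.x.x.x or 0.0.0.0
        return False
    return True


print(should_send_icmp_error("192.168.0.2", "8.8.8.8"))
print(should_send_icmp_error("192.168.0.2", "8.8.8.8", fragment_offset=185))
print(should_send_icmp_error("192.168.0.2", "224.0.0.1"))
print(should_send_icmp_error("0.0.0.0", "8.8.8.8"))
```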

With the development of the Internet, several ICMP packet types are no longer in use. For example:

Information request and reply messages.

Subnet mask request and reply messages.

Router solicitation and advertisement messages.

Let’s take a look at a clever use of ICMP: Traceroute, which discovers the routers that a datagram passes through on its way to the destination host.

Traceroute: detecting the routing path

The first group of IP datagrams has its TTL set to 1, so the first router discards them and sends back an ICMP message, which also lets the sender measure the round-trip time.

The second group of IP datagrams has TTL=2.

... and so on, with the TTL increased by 1 for each group.

Stop condition: Traceroute wraps its probes in UDP packets addressed to an unused port number. When a probe finally reaches the destination host, the host returns an ICMP error saying that the port is unreachable, and at that point the entire routing path has been probed. When measuring a route, each group consists of three datagrams, to guard against problems such as packet loss.
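The whole procedure fits in a few lines if the network is simulated. In the sketch below nothing is actually sent: `send_probe` pretends to be the network, the path is an invented list of hop addresses, and the reply names are plain strings standing in for the ICMP time exceeded and port unreachable messages.

```python
def send_probe(path, ttl):
    """Simulate one UDP probe to an unused port along a list of hop addresses.

    The last entry of `path` is the destination host, the others are routers.
    """
    for router in path[:-1]:
        ttl -= 1                          # each router decrements the TTL
        if ttl == 0:
            return ("time exceeded", router)
    return ("port unreachable", path[-1])


def traceroute(path, max_ttl=30, probes_per_hop=3):
    route = []
    for ttl in range(1, max_ttl + 1):
        # Three probes per TTL value, in case one of them is lost.
        replies = [send_probe(path, ttl) for _ in range(probes_per_hop)]
        kind, hop = replies[0]
        route.append(hop)
        if kind == "port unreachable":    # stop condition: the destination answered
            break
    return route


# Hypothetical path: two routers, then the destination host.
path = ["10.0.0.1", "172.16.0.1", "202.113.132.45"]
print(traceroute(path))
```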

Well, that wraps up the network layer. I hope you enjoyed it. You can review it and read it again, and if you have any questions, feel free to reach out to me. WeChat: WytheHard.


Welcome to follow the official account: Mr. Wu’s Garden of Eden