1. Overview
1.1 Five-layer model
To understand the Internet, you have to start at the lowest layer and work upward, understanding the function of each layer in turn.
There are different ways to divide the layers: some models have seven layers, others have four. I think the Internet is easiest to explain when divided into five layers.
As the figure above shows, the lower layers are closer to the hardware, and the higher layers are closer to the user.
1.2 Layers and protocols
Each layer is designed to accomplish a function, and in order to accomplish that function, everyone needs to follow the same rules.
Rules that everyone follows are called protocols.
Each layer of the Internet defines a number of protocols. Collectively, these protocols are called Internet protocols. They are the core of the Internet. The following describes the functions of each layer, mainly introducing the main protocols of each layer.
2. Physical layer
What’s the first thing you need to do to network computers? First, of course, you connect them. You can use optical fiber, cable, twisted pair, radio waves, and so on.
This is called the ‘physical layer’: the physical means of connecting computers together. It mainly specifies the electrical characteristics of the network and is responsible for transmitting the electrical signals that represent 0 and 1.
3. Link layer
3.1 Definition
Bare 0s and 1s have no meaning on their own. There must be a rule for how to read them: how many electrical signals make up a group? What does each signal bit mean?
This is the function of the ‘link layer’, which sits above the ‘physical layer’ and determines how the 0s and 1s are grouped.
3.2 Ethernet protocol
In the early days, each company had its own way of grouping electrical signals. Gradually, a protocol called Ethernet came to dominate.
Ethernet specifies that a group of electrical signals constitutes a packet called a ‘frame’. Each frame is divided into two parts: the header and the data.
The ‘header’ contains the description of the packet, such as sender (MAC address), receiver, data type, and so on. ‘Data’ is the specific content of the packet.
The length of the ‘header’ is fixed at 18 bytes. The length of the ‘data’ is at least 46 bytes and at most 1500 bytes, so the whole ‘frame’ is between 64 and 1518 bytes long. If the data is very long, it must be split into multiple frames to be sent.
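The size limits above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and padding short data with zero bytes is an assumption, not something the text above specifies.

```python
HEADER_BYTES = 18       # fixed header size stated above
MIN_DATA_BYTES = 46     # shortest allowed data part
MAX_DATA_BYTES = 1500   # longest allowed data part


def split_into_frames(data: bytes) -> list:
    """Split data into chunks that each fit the data part of one frame."""
    chunks = [data[i:i + MAX_DATA_BYTES]
              for i in range(0, len(data), MAX_DATA_BYTES)]
    if not chunks:
        chunks = [b""]
    # Assumption: a short final chunk is padded with zero bytes up to 46 bytes.
    chunks[-1] = chunks[-1].ljust(MIN_DATA_BYTES, b"\x00")
    return chunks


def frame_sizes(data: bytes) -> list:
    """Total size in bytes (header plus data) of each frame needed for data."""
    return [HEADER_BYTES + len(chunk) for chunk in split_into_frames(data)]


print(frame_sizes(b"x" * 4000))  # [1518, 1518, 1018]
print(frame_sizes(b"hi"))        # [64]
```

Every frame size falls between 64 and 1518 bytes, matching the range given above.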
3.3 MAC address
As mentioned above, the ‘header’ of an Ethernet packet contains information about the sender and receiver. How are the sender and receiver identified?
Ethernet specifies that every device connected to the network must have a network interface card (NIC), and that packets are sent from one NIC to another. The address of the NIC is the sending or receiving address of the packet, and it is called the MAC address.
Each NIC has a unique MAC address. The MAC address contains 48 binary digits and is usually represented by 12 hexadecimal digits.
The first six hexadecimal digits identify the vendor, and the last six are the serial number that the vendor assigns to the network adapter. With the MAC address, you can identify the network adapter and the path a packet should take.
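As a small illustration of this layout, the sketch below splits a MAC address into its two halves. The function name and the sample address are hypothetical, chosen only for this example.

```python
def split_mac(mac: str) -> tuple:
    """Return (vendor_part, adapter_part) of a MAC address as hex strings."""
    digits = mac.replace(":", "").replace("-", "").upper()
    if len(digits) != 12 or any(c not in "0123456789ABCDEF" for c in digits):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    # First six hex digits: vendor; last six: the vendor's serial number.
    return digits[:6], digits[6:]


vendor, adapter = split_mac("00-B0-D0-86-BB-F7")  # hypothetical address
print(vendor, adapter)  # 00B0D0 86BBF7
```

Twelve hexadecimal digits at 4 bits each give the 48 binary digits mentioned above.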
3.4 Broadcast
First, how does one NIC know the MAC address of another NIC?
Answer: the ARP protocol solves this problem. ARP converts a known IP address into a MAC address, but it can only be used within a local area network (LAN). The operating system first determines whether the destination IP address is on the LAN. If it is not, the frame is sent to the gateway (the default gateway address is set when the operating system initializes the network), and the gateway forwards the message according to its routing table. I’ll save the details for later. For now, all you need to know is that an Ethernet packet can only be sent once the MAC address of the recipient is known.
Second, even with the MAC address, how can the system accurately send packets to the recipient?
Answer: Ethernet takes a very primitive approach. Instead of delivering a packet precisely to the receiver, it sends the packet to every computer on the network and lets each computer decide for itself whether it is the receiver.
In the figure above, computer 1 sends a packet to computer 2. Computers 3, 4, and 5 on the same subnetwork receive the packet as well. Each of them reads the packet’s header, finds the receiver’s MAC address, and compares it with its own MAC address. If the two match, it accepts the packet for further processing; otherwise it discards the packet. This mode of transmission is called ‘broadcasting’.
With the definition of the packet, the MAC address of the network card, and the broadcast mode of transmission, the ‘link layer’ can transmit data between multiple computers.
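The broadcast behaviour described above can be simulated with a toy model. The `Host` class, the `broadcast` function, and the MAC addresses below are all invented for illustration; real Ethernet hardware does this filtering in the network card.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    mac: str
    inbox: list = field(default_factory=list)

    def receive(self, frame: dict) -> bool:
        """Accept the frame only if its destination MAC matches our own."""
        if frame["dst_mac"] == self.mac:
            self.inbox.append(frame["data"])
            return True
        return False  # not addressed to us: discard


def broadcast(hosts: list, sender: Host, dst_mac: str, data: str) -> list:
    """Hand one frame to every other host; return the names that accepted it."""
    frame = {"src_mac": sender.mac, "dst_mac": dst_mac, "data": data}
    return [h.name for h in hosts if h is not sender and h.receive(frame)]


# Five computers on one subnetwork, with made-up MAC addresses.
hosts = [Host(f"computer{i}", f"00:00:00:00:00:0{i}") for i in range(1, 6)]
accepted = broadcast(hosts, hosts[0], "00:00:00:00:00:02", "hello")
print(accepted)  # ['computer2']
```

Computers 3, 4, and 5 all see the frame, but only computer 2 keeps it.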
4. Network layer
4.1 Origin of the Network Layer
The Ethernet protocol relies on MAC addresses to send data. In theory, relying on MAC addresses alone, a network card in Shanghai could find a network card in Los Angeles; technically this is possible.
However, this approach has a major shortcoming. Ethernet sends packets by broadcast, handing a copy to every member of the network. This is not only inefficient but also confined to the subnetwork where the sender is located. That is, if two computers are not on the same subnetwork, a broadcast cannot reach from one to the other. This design makes sense; otherwise every computer on the Internet would receive every packet, and that would be a disaster.
The Internet is a huge network made up of countless subnetworks, and it is hard to imagine that computers in Shanghai and Los Angeles would be on the same subnetwork. It is almost impossible.
So we must find a way to distinguish which MAC addresses belong to the same subnetwork and which do not. If they are on the same subnetwork, the packet is sent by broadcast; otherwise it is sent by “routing”. (“Routing” refers to how packets are distributed to different subnetworks. It is a big topic that is not covered in detail here.) Unfortunately, the MAC address itself cannot do this; it depends only on the vendor, not on the network where the computer is located.
This leads to the creation of the “network layer”. It introduces a new set of addresses that let us tell whether different computers belong to the same subnetwork. This set of addresses is called “network addresses”.
As a result, once the “network layer” appeared, every computer had two kinds of addresses: a MAC address and a network address. There is no connection between the two. The MAC address is bound to the network adapter, and the network address is assigned by the administrator; they just happen to be combined on the same machine.
The network address helps us determine the subnetwork on which the computer is located, and the MAC address sends the packet to the target network card in that subnetwork. Therefore, it logically follows that network addresses must be processed first and MAC addresses must be processed later.
4.2 IP
The protocol that defines network addresses is called the IP protocol. The addresses it defines are called IP addresses.
Currently, the fourth version of the IP protocol, IPv4 for short, is still widely used (the pool of unallocated IPv4 addresses was reported exhausted on November 25, 2019, and IPv6 is being deployed as its successor). This version specifies that a network address consists of 32 binary bits.
Traditionally, we write an IP address as four decimal numbers separated by dots, ranging from 0.0.0.0 to 255.255.255.255.
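To make the link between the 32 binary bits and the four decimal numbers concrete, here is a minimal sketch of the conversion in both directions. The function names are illustrative only.

```python
def ip_to_int(ip: str) -> int:
    """Pack a dotted-decimal IPv4 address into a single 32-bit integer."""
    parts = [int(p) for p in ip.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a valid IPv4 address: {ip!r}")
    value = 0
    for part in parts:
        value = (value << 8) | part  # each segment contributes 8 bits
    return value


def int_to_ip(value: int) -> str:
    """Unpack a 32-bit integer back into dotted-decimal form."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


bits = format(ip_to_int("172.16.254.1"), "032b")
print(bits)                   # 10101100000100001111111000000001
print(int_to_ip(2**32 - 1))   # 255.255.255.255
```

Each of the four segments is 8 bits wide, which is why no segment can exceed 255.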
So, how do we tell if two computers are on the same LAN by their IP addresses?
Every computer on the Internet is assigned an IP address. The address is divided into two parts: the first part represents the network and the second part represents the host. Take the IP address 172.16.254.1, a 32-bit address, and suppose its network part is the first 24 bits (172.16.254); then the host part is the last 8 bits (the final .1). Computers on the same subnetwork must have the same network part in their IP addresses, so 172.16.254.2 is on the same subnetwork as 172.16.254.1.
The problem, however, is that we cannot tell the network part from the IP address alone. Again taking 172.16.254.1, the address itself does not reveal whether the network part is the first 24 bits, the first 16 bits, or even the first 28 bits.
So, how can you tell from the IP address whether two computers belong to the same subnetwork? This requires another parameter, the subnet mask.
The “subnet mask” is a parameter that describes the subnetwork. In form it resembles an IP address: it is also a 32-bit binary number, with all 1s in the network part and all 0s in the host part. For example, for the IP address 172.16.254.1, if the network part is known to be the first 24 bits and the host part the last 8 bits, the subnet mask is 11111111.11111111.11111111.00000000, which is written in decimal as 255.255.255.0.
Knowing the subnet mask, we can determine whether any two IP addresses are on the same subnetwork. The method is to AND each of the two IP addresses with the subnet mask (a result bit is 1 only if both bits are 1; otherwise it is 0) and then compare the results. If they are the same, the two addresses are on the same subnetwork; otherwise they are not.
For example, suppose the IP addresses 172.16.254.1 and 172.16.254.233 both have the subnet mask 255.255.255.0. Are they on the same subnetwork? ANDing each with the subnet mask gives 172.16.254.0 in both cases, so they are on the same subnetwork.
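The AND comparison can be tried out with Python’s standard `ipaddress` module, used here only to convert between dotted-decimal text and integers. The helper names `network_part` and `same_subnet` are made up for this sketch.

```python
import ipaddress


def network_part(ip: str, mask: str) -> str:
    """AND an IP address with a subnet mask and return the result as text."""
    ip_bits = int(ipaddress.IPv4Address(ip))
    mask_bits = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip_bits & mask_bits))


def same_subnet(ip_a: str, ip_b: str, mask: str) -> bool:
    """Two addresses share a subnetwork if their AND results are equal."""
    return network_part(ip_a, mask) == network_part(ip_b, mask)


mask = "255.255.255.0"
print(network_part("172.16.254.1", mask))    # 172.16.254.0
print(network_part("172.16.254.233", mask))  # 172.16.254.0
print(same_subnet("172.16.254.1", "172.16.254.233", mask))  # True
```

With a different mask the answer can change, which is why the mask must be known in addition to the addresses.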
In summary, the IP protocol has two main functions: one is to assign an IP address to each computer, and the other is to determine which addresses are on the same subnetwork.
4.3 IP Packets
Data sent according to the IP protocol is called an IP packet. As you would expect, it must include IP address information.
But as mentioned earlier, Ethernet packets contain only MAC addresses, not IP addresses. Do we need to change the Ethernet definition and add another field?
The answer is no. We can put the IP packet directly into the “data” part of the Ethernet packet, so there is no need to change the Ethernet specification at all. That’s the beauty of the layered structure of the Internet: changes in an upper layer do not touch the structure of the layers below.
Specifically, IP packets are also divided into “header” and “data” parts.
The Header contains the version, length, and IP address information, and the Data contains the specific content of the IP packet. When it is put into an Ethernet packet, the Ethernet packet looks like this.
The length of the “header” portion of an IP packet ranges from 20 to 60 bytes, and the total length of the entire packet is a maximum of 65,535 bytes. Thus, in theory, the “data” portion of an IP packet should be a maximum of 65,515 bytes. As mentioned earlier, the maximum “data” portion of an Ethernet packet is 1500 bytes. Therefore, if an IP packet exceeds 1500 bytes, it needs to be split into several Ethernet packets and sent separately.
4.4 ARP protocol
One final note about the network layer.
Because IP packets are sent in Ethernet packets, we must know both the MAC address and the IP address of the other party. Usually, the other party’s IP address is known (explained later), but its MAC address is not known.
So, we need a mechanism to get a MAC address from an IP address.
This can be divided into two cases:
First case: if the two hosts are not on the same subnetwork, there is in fact no way to get the other party’s MAC address. The packet can only be handed to the “gateway” that connects the two subnetworks and left for the gateway to deal with.
Second case: if the two hosts are on the same subnetwork, we can use the ARP protocol to get the other party’s MAC address. ARP also sends a packet (contained in an Ethernet packet) that holds the IP address of the host being queried. In the field for the other party’s MAC address it fills in FF:FF:FF:FF:FF:FF, which indicates a broadcast address. Each host on the subnetwork receives the packet, pulls out the IP address, and compares it with its own. If they match, the host replies with its MAC address; otherwise it discards the packet.
To sum up, ARP can only be used within a single subnetwork. With ARP, we can get the MAC address of a host on the same subnetwork and send packets to the target computer.
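The ARP exchange within one subnetwork can be sketched as a toy lookup. Here a plain dictionary stands in for the hosts on the subnetwork; the function name, the addresses, and the dictionary itself are assumptions made for illustration, not how ARP is implemented in an operating system.

```python
from typing import Optional

BROADCAST_MAC = "FF:FF:FF:FF:FF:FF"


def arp_resolve(target_ip: str, subnet_hosts: dict) -> Optional[str]:
    """Simulate an ARP query: broadcast target_ip, return the replying MAC.

    subnet_hosts maps each host's IP address to its MAC address.
    """
    request = {"dst_mac": BROADCAST_MAC, "query_ip": target_ip}
    for host_ip, host_mac in subnet_hosts.items():
        # Every host receives the broadcast and compares the queried IP
        # with its own address.
        if host_ip == request["query_ip"]:
            return host_mac  # the matching host replies with its MAC
        # Non-matching hosts simply discard the request.
    return None  # nobody on this subnetwork owns that IP address


subnet = {  # hypothetical hosts on one subnetwork
    "192.168.1.100": "00:00:00:00:00:01",
    "192.168.1.101": "00:00:00:00:00:02",
}
print(arp_resolve("192.168.1.101", subnet))  # 00:00:00:00:00:02
print(arp_resolve("10.0.0.7", subnet))       # None
```

The second call returns nothing because the queried address is not on this subnetwork, which is the case where the packet has to go to the gateway instead.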
Across different subnetworks, how does a frame reach the target computer when we cannot get its MAC address? It is handed to the gateway, which forwards it according to its routing table.
5. Transport layer
5.1 Origin of the Transport Layer
With MAC addresses and IP addresses, we can establish communication between any two hosts on the Internet.
The next problem is that many programs on the same host use the network at once; for example, you may be browsing the web while chatting with friends online. When a packet arrives from the Internet, how do you know whether it carries the content of a web page or of an online chat?
That is, we also need a parameter that indicates which program (process) the packet is for. This parameter is called the “port”, and it is effectively a number assigned to each program that uses the network card. Each packet is sent to a specific port on the host, so different programs can get the data they need.
A “port” is an integer between 0 and 65535, which is exactly 16 binary bits. Ports 0 to 1023 are reserved by the system, so users can only choose ports larger than 1023. Whether it is for web browsing or online chat, the application picks a random port and communicates with the corresponding port on the server.
The function of the “transport layer” is to establish port-to-port communication. The network layer, by contrast, establishes host-to-host communication. As long as we identify the host and the port, we can implement communication between programs. Therefore, Unix systems refer to a host plus a port as a “socket”. With it, network application development becomes possible.
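The port ranges and the idea of a socket as “host plus port” can be written down directly. The function name and both addresses below are hypothetical values picked for this example.

```python
MAX_PORT = 2**16 - 1        # 65535: the largest value that fits in 16 bits
SYSTEM_PORT_LIMIT = 1023    # ports 0 to 1023 are reserved by the system


def port_kind(port: int) -> str:
    """Classify a port as 'system' (0-1023) or 'user' (1024-65535)."""
    if not 0 <= port <= MAX_PORT:
        raise ValueError(f"port out of range: {port}")
    return "system" if port <= SYSTEM_PORT_LIMIT else "user"


# A socket in the sense used above: a host plus a port.
client_socket = ("192.168.1.100", 51775)  # hypothetical client host and port
server_socket = ("203.0.113.10", 80)      # hypothetical server host and port

print(port_kind(server_socket[1]), port_kind(client_socket[1]))  # system user
```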
5.2 UDP protocol
Now we have to add port information to the packet, which requires a new protocol. The simplest implementation is called UDP, and its format is little more than port numbers placed in front of the data.
A UDP packet is also composed of “header” and “data”.
The “header” section mainly defines the sending and receiving ports, and the “data” section is the specific content. Then, the entire UDP packet is placed in the “data” section of the IP packet, which, as mentioned earlier, is placed inside the Ethernet packet, so the entire Ethernet packet now looks like this:
UDP packets are very simple. The “header” section is only 8 bytes in total, and the total length is no more than 65,535 bytes, which fits into an IP packet.
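As a rough sketch, the 8-byte header can be built with Python’s `struct` module. The field layout used here (source port, destination port, length, checksum, 16 bits each) comes from the UDP specification rather than from the text above, and leaving the checksum at 0 is a simplification; the function name and port values are illustrative.

```python
import struct

UDP_HEADER_BYTES = 8
MAX_UDP_BYTES = 65535


def build_udp_packet(src_port: int, dst_port: int, data: bytes) -> bytes:
    """Prepend an 8-byte UDP header to data."""
    length = UDP_HEADER_BYTES + len(data)
    if length > MAX_UDP_BYTES:
        raise ValueError("UDP packet would exceed 65,535 bytes")
    checksum = 0  # simplification: 0 means no checksum was computed
    # "!" selects network byte order; each "H" is an unsigned 16-bit field.
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return header + data


packet = build_udp_packet(51775, 53, b"hello")
print(len(packet))                        # 13
print(struct.unpack("!HHHH", packet[:8])) # (51775, 53, 13, 0)
```

The length field is itself 16 bits wide, which is where the 65,535-byte ceiling comes from.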
5.3 TCP protocol
UDP is simple and easy to implement. However, it is not reliable: once a packet is sent, there is no way to know whether it was received.
To solve this problem and improve network reliability, the TCP protocol was born. This protocol is very complex, but it can be thought of, approximately, as UDP with an acknowledgement mechanism. Each packet sent must be acknowledged. If a packet is lost and no acknowledgement arrives, the sender knows it must resend the packet.
Therefore, TCP ensures that data is not lost. Its disadvantages are that the process is complex, it is difficult to implement, and it consumes more resources.
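The acknowledge-and-resend idea can be illustrated with a toy simulation. This is not how TCP is actually implemented (real TCP uses timers, sequence numbers over byte streams, and sliding windows); it only shows a sender repeating a packet until an acknowledgement arrives. The function name and the `drop_attempts` parameter are invented for this sketch.

```python
def send_reliably(packets: list, drop_attempts: set, max_tries: int = 5) -> list:
    """Send packets one at a time, resending until each is acknowledged.

    drop_attempts holds (sequence number, attempt) pairs that the simulated
    network loses. Returns a log of every transmission that was made.
    """
    log = []
    for seq, _payload in enumerate(packets):
        for attempt in range(1, max_tries + 1):
            log.append((seq, attempt))
            acknowledged = (seq, attempt) not in drop_attempts
            if acknowledged:
                break  # move on to the next packet
            # No acknowledgement arrived: loop around and resend.
        else:
            raise RuntimeError(f"packet {seq} was never acknowledged")
    return log


# The first transmission of packet 1 is lost, so it is sent twice.
log = send_reliably(["a", "b", "c"], drop_attempts={(1, 1)})
print(log)  # [(0, 1), (1, 1), (1, 2), (2, 1)]
```

The extra transmission in the log is the cost of reliability that the paragraph above refers to.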
TCP packets, like UDP packets, are embedded in the “data” part of IP packets. There is no limit on the length of a TCP packet; in theory, it can be indefinitely long. In practice, to ensure network efficiency, the length of a TCP packet usually does not exceed that of an IP packet, so that a single TCP packet does not have to be split.
6. Application layer
The application receives data from the transport layer and then interprets it. Because the Internet has an open architecture, data comes from a wide variety of sources, so its format must be specified in advance; otherwise it could not be interpreted.
The purpose of the application layer is to define the data format of the application.
For example, the TCP protocol can transfer data to a wide variety of programs, such as Email, WWW, FTP, and so on. Then, different protocols must define the format of E-mail, web pages, FTP data, and these application protocols constitute the “application layer”.
This is the highest level, facing the user directly. Its data is located in the “data” section of the TCP packet. As a result, the current Ethernet packet looks like this.
At this point, the whole five-layer structure of the Internet has been covered, from bottom to top. This is how the Internet is structured from a systems point of view.
7. Summary
So let’s briefly summarize what we’ve covered.
We already know that network communication is all about exchanging packets. Computer A sends a packet to computer B. Computer B receives the packet and replies with a packet, thus realizing communication between the two computers. The structure of the packet is basically as follows:
To send this packet, you need to know two addresses:
- MAC address of the peer
- IP address of the peer
With these two addresses, packets can be delivered to the recipient accurately. However, as mentioned earlier, MAC addresses have limitations. If two computers are not on the same subnetwork, they cannot learn each other’s MAC address, and packets must be forwarded through the gateway.
In the figure above, computer 1 is sending a packet to computer 4. It first checks whether computer number 4 is on the same subnetwork, finds that it is not, and sends the packet to gateway A. Gateway A, through the routing protocol, finds that computer 4 is on subnetwork B, and sends the packet to gateway B, which then forwards the packet to computer 4.
Computer 1 must know the MAC address of gateway A to send the packet to gateway A. So, the destination address of the packet is actually divided into two cases:
Scenario | Packet addresses |
---|---|
Same subnetwork | MAC address and IP address of the peer |
Different subnetworks | MAC address of the gateway and IP address of the peer |
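The two rows of the table can be expressed as a small decision function. The function names, MAC addresses, and IP addresses below are hypothetical; the sketch assumes the sender already knows both MAC addresses.

```python
import ipaddress


def network_of(ip: str, mask: str) -> int:
    """Network part of an address: the bitwise AND of address and mask."""
    return int(ipaddress.IPv4Address(ip)) & int(ipaddress.IPv4Address(mask))


def frame_addresses(src_ip: str, dst_ip: str, mask: str,
                    peer_mac: str, gateway_mac: str) -> dict:
    """Choose the destination MAC and IP for a packet, as in the table."""
    same_subnet = network_of(src_ip, mask) == network_of(dst_ip, mask)
    return {
        "dst_mac": peer_mac if same_subnet else gateway_mac,
        "dst_ip": dst_ip,  # the IP address is always that of the peer
    }


PEER_MAC = "00:00:00:00:00:04"     # made-up values for illustration
GATEWAY_MAC = "00:00:00:00:00:FE"

local = frame_addresses("192.168.1.100", "192.168.1.50", "255.255.255.0",
                        PEER_MAC, GATEWAY_MAC)
remote = frame_addresses("192.168.1.100", "192.168.2.50", "255.255.255.0",
                         PEER_MAC, GATEWAY_MAC)
print(local["dst_mac"])   # 00:00:00:00:00:04
print(remote["dst_mac"])  # 00:00:00:00:00:FE
```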
Routing protocol
Routing protocols mainly run on routers and are used to determine the path by which a packet reaches its destination. They include RIP, IGRP (Cisco proprietary), EIGRP (Cisco proprietary), OSPF, IS-IS, and BGP. A routing protocol plays a role like map navigation: it is responsible for finding the route. It works at the network layer.
Working principle:
The figure shows two LANs connected to the Internet, one behind Router1 and one behind Router2. The terminal with IP 1.1 needs to send a data packet to the host with IP 4.2. Host 1.1 packages the data and specifies source address 1.1 and destination address 4.2 in the packet header. When the packet reaches the gateway (Router1 here), the gateway sees that destination address 4.2 belongs to network segment 4.0. It looks up that segment in its routing table and finds that the packet should be forwarded through interface S0 of Router1.
When the data packet reaches Router2, Router2 finds that the destination address belongs to one of its own subnets, looks up network segment 4.0 in its routing table, and forwards the packet out of its E0 interface. The host with IP address 4.2 sees that the packet is addressed to it and receives it. This completes one packet transmission.
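A routing-table lookup of this kind can be sketched as follows. The abbreviated addresses 1.1 and 4.2 are expanded here to the hypothetical networks 192.168.1.0/24 and 192.168.4.0/24, and the tables are invented to mirror the walkthrough. Preferring the most specific matching route is a common convention assumed for this sketch, not something the text above states.

```python
import ipaddress


def lookup(routing_table: list, dst_ip: str):
    """Return the outgoing interface of the most specific matching route."""
    dst = ipaddress.ip_address(dst_ip)
    best = None
    for network_text, interface in routing_table:
        network = ipaddress.ip_network(network_text)
        if dst in network and (best is None
                               or network.prefixlen > best[0].prefixlen):
            best = (network, interface)
    return best[1] if best else None


# Hypothetical tables: E0 faces the local LAN, S0 faces the other router.
router1 = [("192.168.1.0/24", "E0"), ("192.168.4.0/24", "S0")]
router2 = [("192.168.4.0/24", "E0"), ("192.168.1.0/24", "S0")]

print(lookup(router1, "192.168.4.2"))  # S0
print(lookup(router2, "192.168.4.2"))  # E0
```

The same destination leaves Router1 through S0 and Router2 through E0, as in the two steps described above.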
Before sending a packet, the computer must determine whether the other party is on the same subnetwork and then select the appropriate MAC address. Now, let’s look at how this process works in practice.
8. User’s Internet access settings
8.1 Static IP Address
You buy a new computer, plug in the network cable, and turn it on. Can the computer access the Internet?
Usually you have to configure some settings first. Sometimes the administrator (or ISP) will tell you the following four parameters, which you enter into the operating system so that the computer can connect to the Internet:
- IP address of the local PC
- Subnet mask
- DNS IP address
- Gateway IP address
Below is the Windows system settings window.
All four parameters are necessary, and I’ll explain later why you need them to get online. Because they are fixed, the computer gets the same IP address every time it is turned on, so this situation is called “static IP address Internet access”.
However, such settings are technical and intimidating to the average user. In addition, if one computer holds an IP address permanently, other computers cannot use that address, which is not flexible enough. For these two reasons, most users use “dynamic IP address Internet access”.
8.2 Dynamic IP Address
A “dynamic IP address” means that when the computer is turned on, it is automatically assigned an IP address, with no manual configuration. This uses a protocol called DHCP.
This protocol provides that in each subnetwork, there is a computer responsible for the management of all IP addresses in the network, it is called “DHCP server”. When a new computer joins the network, it must send a DHCP request packet to the DHCP server to apply for an IP address and related network parameters.
As mentioned earlier, if two computers are on the same subnetwork, they must know each other’s MAC and IP addresses before they can send packets. However, the new computer does not know these two addresses, how to send packets?
The DHCP protocol has some clever rules.
8.3 the DHCP protocol
First, it is an application-layer protocol, built on top of UDP, so the entire packet looks like this:
(1) The first part, the “Ethernet header”, sets the MAC address of the sender (the local machine) and the MAC address of the receiver (the DHCP server). The former is the MAC address of the local network adapter. The latter is not yet known, so a broadcast address is filled in: FF-FF-FF-FF-FF-FF.
(2) The “IP header” that follows sets the IP address of the sender and the IP address of the receiver. At this point, the machine knows neither. Therefore, the sender’s IP address is set to 0.0.0.0 and the receiver’s to 255.255.255.255.
(3) The final “UDP header” sets the sender’s port and the receiver’s port. DHCP fixes these: port 68 for the sender (the client) and port 67 for the receiver (the server).
Once the packet is constructed, it is ready to be sent. Ethernet broadcasts it, so every computer on the same subnetwork receives the packet. Because the receiver’s MAC address is FF-FF-FF-FF-FF-FF, it is not clear whom the packet is for. Therefore, each computer that receives the packet must also examine its IP addresses to determine whether the packet is meant for itself. When it sees that the sender’s IP address is 0.0.0.0 and the receiver’s is 255.255.255.255, the DHCP server knows “this packet is for me”, and the other computers can discard the packet.
The DHCP server then reads the packet’s data, assigns an IP address, and sends back a “DHCP response” packet. The structure of the response packet is similar: the MAC addresses in the Ethernet header are the NIC addresses of both parties; the IP addresses in the IP header are the DHCP server’s IP address (sender) and 255.255.255.255 (receiver); and the UDP header uses port 67 (sender) and port 68 (receiver). The IP address and local network parameters assigned to the requester are contained in the “data” part.
The new computer receives the response packet and knows its IP address, subnet mask, gateway address, DNS server, and so on.
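The addressing of the DHCP request can be summarized as a nested dictionary. This is a simplified model for illustration, not the real DHCP wire format: the function names, the client MAC address, and the placeholder data are assumptions. The request ports (68 for the client, 67 for the server) are the reverse of the response ports given above.

```python
BROADCAST_MAC = "FF-FF-FF-FF-FF-FF"


def build_dhcp_request(client_mac: str) -> dict:
    """Model the headers of a DHCP request sent by a machine with no IP yet."""
    return {
        "ethernet": {"src_mac": client_mac, "dst_mac": BROADCAST_MAC},
        "ip": {"src_ip": "0.0.0.0", "dst_ip": "255.255.255.255"},
        "udp": {"src_port": 68, "dst_port": 67},
        "data": "DHCP request",  # placeholder for the real payload
    }


def is_for_dhcp_server(packet: dict) -> bool:
    """The check described above: 0.0.0.0 sender, 255.255.255.255 receiver."""
    ip = packet["ip"]
    return ip["src_ip"] == "0.0.0.0" and ip["dst_ip"] == "255.255.255.255"


request = build_dhcp_request("00-00-00-00-00-01")  # hypothetical client MAC
print(request["ethernet"]["dst_mac"])  # FF-FF-FF-FF-FF-FF
print(is_for_dhcp_server(request))     # True
```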
8.4 Internet Access Settings: Summary
The one point to remember from this part is this: whether you use a “static IP address” or a “dynamic IP address”, the first step in getting a computer online is to determine four parameters. These four values are important enough to repeat:
- IP address of the local PC
- Subnet mask
- DNS IP address
- Gateway IP address
With these numbers, the computer can “surf” the Internet. Next, let’s look at an example of how Internet protocols work when a user visits a web page.
9. An example: visiting a web page
Let’s assume that, following the steps in the previous section, the user has set his network parameters:
- IP address of the local PC: 192.168.1.100
- Subnet mask: 255.255.255.0
- DNS IP address: 8.8.8.8
- Gateway IP address: 192.168.1.1
Then he opens his browser, wants to visit Google, and types into the address bar: www.google.com.
This means that the browser needs to send Google a packet requesting a web page.
9.2 DNS protocol
We know that to send a packet, you have to know the IP address of the other party. But, for now, we only know the domain name www.google.com, not its IP address.
The DNS protocol can help us translate this domain name into an IP address. We know that the DNS server is 8.8.8.8, so we send a DNS packet to this address (port 53).
The DNS server then responds, telling us that Google’s IP address is 172.194.72.105. Now we know the other party’s IP address.
9.3 Subnet Mask
Next, we need to determine if the IP address is in the same subnetwork, using the subnet mask.
Given that the subnet mask is 255.255.255.0, the local machine ANDs it with its own IP address 192.168.1.100 (a result bit is 1 only if both bits are 1; otherwise it is 0), which gives 192.168.1.0. It then ANDs the mask with Google’s IP address 172.194.72.105, which gives 172.194.72.0. The two results are not equal, so the conclusion is that Google is not on the same subnetwork as the local machine.
Therefore, to send a packet to Google, we must forward it through gateway 192.168.1.1, which means that the MAC address of the recipient will be the MAC address of the gateway.
9.4 Application Layer Protocols
The HTTP protocol is used to browse the web, and its entire packet structure looks like this:
The HTTP portion of the content looks something like this:
We assume that the length of this section is 4960 bytes and it will be embedded in the TCP packet.
9.5 TCP protocol
TCP packets require ports to be set. The receiver’s (Google’s) HTTP port is 80 by default, and the sender’s (local host’s) port is a randomly chosen integer between 1024 and 65535; suppose it is 51775.
The TCP packet header is 20 bytes long, and when you add the embedded HTTP packets, the total length becomes 4980 bytes.
9.6 IP
The TCP packet is then embedded in an IP packet. The IP packet must carry the IP addresses of both parties, which are known: 192.168.1.100 for the sender (local host) and 172.194.72.105 for the receiver (Google).
The header length of an IP packet is 20 bytes. When embedded TCP packets are added, the total length becomes 5000 bytes.
9.7 Ethernet Protocols
Finally, the IP packets are embedded in the Ethernet packets. The MAC address of the sender is the MAC address of the local network adapter, and the MAC address of the gateway 192.168.1.1 is the MAC address of the receiver (obtained through ARP).
The maximum length of the data portion of an Ethernet packet is 1500 bytes, whereas the current IP packet length is 5000 bytes. Therefore, IP packets must be split into four packets. Because each packet has its own IP header (20 bytes), the lengths of the four IP packets are 1500, 1500, 1500, and 560, respectively.
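The arithmetic in this example can be checked with a short script. The constant and function names are illustrative; the sketch only computes sizes and does not model the other fields involved in splitting IP packets.

```python
HTTP_BYTES = 4960          # assumed size of the HTTP content in this example
TCP_HEADER = 20
IP_HEADER = 20
ETHERNET_MAX_DATA = 1500   # largest data part of one Ethernet packet


def fragment_sizes(ip_packet_len: int) -> list:
    """Sizes of the IP packets after splitting to fit Ethernet's data part."""
    payload = ip_packet_len - IP_HEADER           # bytes that must be carried
    per_fragment = ETHERNET_MAX_DATA - IP_HEADER  # payload room per packet
    sizes = []
    while payload > 0:
        chunk = min(payload, per_fragment)
        sizes.append(chunk + IP_HEADER)  # every piece gets its own IP header
        payload -= chunk
    return sizes


tcp_len = HTTP_BYTES + TCP_HEADER  # 4980
ip_len = tcp_len + IP_HEADER       # 5000
print(tcp_len, ip_len)             # 4980 5000
print(fragment_sizes(ip_len))      # [1500, 1500, 1500, 560]
```

Each full packet carries 1480 bytes of the original payload, so three of them carry 4440 bytes and the last carries the remaining 540 bytes plus its 20-byte header.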
9.8 Server Response
After forwarding through multiple gateways, Google’s server 172.194.72.105 received the four Ethernet packets.
Based on the serial number of the IP header, Google pieced together the four packets, took out the complete TCP packet, read the “HTTP request” inside, then made an “HTTP response” and sent it back over TCP.
After receiving the HTTP response, the machine can display the web page and complete a network communication.