A brief walkthrough of computer networks

First, the background

There are two main reasons for writing this blog. One is that I had been thinking about the nature of RPC. When I read the Dubbo source code, I began to wonder how communication between consumers and providers really works, and how the lowest level is implemented. The other is that most companies' job postings say “familiar with TCP/IP networking”. Even if I am not truly familiar with it, I should at least understand the general process of the network.

To this end, I dug out my university “Computer Networks” textbook and started learning from scratch. This blog is a summary. I am going to try to simplify my thinking about the nature of computer networks. It is not necessarily complete or correct, so corrections are welcome.

The pictures in this blog come from the Internet.

Second, the overview

2.1. Computer network history

The network began with ARPANET, built for the US military, and then the NSF formed NSFNET. As more and more hosts joined, the Internet was commercialized, and all kinds of networks and services flourished.

The forms of communication on the network have also grown richer over time. At first it only carried communication between different hosts. Later it supported document exchange. Finally, text, pictures, audio and video in all their forms appeared.

2.2. Network architecture

Currently, there are two network reference models: the OSI seven-layer model and the TCP/IP reference model.

The differences between the two models are not important to the topic of this article and will not be explored here. In the following sections, the physical layer, data link layer, network layer of the OSI reference model and the network layer, transport layer and application layer of the TCP/IP reference model will be introduced according to the chapters in the textbook Computer Networks.

Let’s start with the conclusion, and briefly talk about my thinking on each layer:

2.2.1 Physical layer

At the physical layer, the output is a bit stream, while the input can vary: telephone lines (old dial-up Internet access), cable, optical fiber, wireless links and so on. There are many different transmission media, and different media carry different signals. How to turn the signals of these different media into a uniform, standardized bit stream is the problem the physical layer has to solve. It is a bit like the Java virtual machine: the underlying system may be Linux or Windows, but the JVM hides the differences and provides a unified environment for executing class files.

Similarly, different transmission media require different protocols. I understand a protocol here as an industry standard: only when the standard is set and the whole world follows it can everyone exchange what they need. The best known is probably ADSL.

2.2.2 Data link layer

The data signals that the physical layer transmits over physical lines contain errors (measured, for example, by the bit error rate). The data link layer has two main purposes:

One is to turn an error-prone physical link into an error-free data link by adding error detection, error control and flow control on top of the physical layer. Methods include CRC (cyclic redundancy check) and ARQ (automatic repeat request, that is, feedback retransmission).
The other is the MAC sublayer. MAC stands for Medium Access Control. It controls who sends data onto the medium first and who sends later, to prevent chaos.

In a nutshell: error control and multiple access.
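The error-detection half of this can be sketched in a few lines. This is only an illustration, assuming a made-up frame layout (payload followed by a 4-byte checksum) and using Python's zlib.crc32 for convenience. It is not the real Ethernet frame format, and the function names are hypothetical.

```python
import zlib


def make_frame(payload: bytes) -> bytes:
    """Sender side: append a 4-byte CRC-32 checksum to the payload."""
    checksum = zlib.crc32(payload)
    return payload + checksum.to_bytes(4, "big")


def check_frame(frame: bytes) -> bool:
    """Receiver side: recompute the CRC and compare it with the trailer."""
    payload = frame[:-4]
    received = int.from_bytes(frame[-4:], "big")
    return zlib.crc32(payload) == received


frame = make_frame(b"hello, link layer")

# Simulate a transmission error by flipping one bit of the first byte.
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]

print(check_frame(frame))      # the intact frame passes the check
print(check_frame(corrupted))  # the damaged frame fails the check
```

With ARQ, a receiver that sees a failed check would not acknowledge the frame, and the sender would retransmit it.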

For the data link layer, there are several well-known concepts to be explained.

The network card

In a local area network, the coaxial cable that connects multiple computers is called the shared bus transmission medium, or shared medium for short. Multiple hosts send and receive data over the shared medium, which is known as multiple access. To prevent collisions between hosts that access the medium at the same time, a protocol is needed to control access. This protocol is CSMA/CD (Carrier Sense Multiple Access with Collision Detection).

In essence, a computer's network card implements data sending and receiving and collision detection at the physical layer and the MAC sublayer. One end of the network card connects to the transmission medium and the other end to the computer. Each network card also has a physical address (MAC address), which in theory is unique and fixed.

Switches

Through a switch, many machines can be joined into a LAN. In other words, the switch takes care of communication within the LAN.

How the switch handles communication within the LAN is described in detail later. For now, just understand what a switch does.

Bridges/routers

Bridges are the basis of routers and gateways. A bridge can connect multiple LANs so that they can communicate. The bridge connects to each LAN through a network card, and records the mapping between host MAC addresses and bridge ports in an internal forwarding table.

How the forwarding table is generated and how routers communicate are analyzed later. For now, just understand what bridges and routers do.

2.2.3 Network layer

Communication between networks is usually completed through routers. The router here is not the home router, but the router of a network operator (such as China Telecom, China Unicom or China Mobile). Routers have routing tables, generated by routing algorithms, that store destination addresses and information about how to reach them.

But a routing table cannot grow without limit, so the concept of the autonomous system (AS) was proposed. The computers and routers under one administration constitute an autonomous system, for example the network of a university or a company.

Within an autonomous system

Routers within an autonomous system update their routing tables using a protocol such as OSPF, which requires that

Each router floods the state of its own links (which neighbors it can reach, and at what cost) to the other routers
Each router computes a shortest-path tree with itself as the root
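The shortest-path tree in the second point is typically computed with Dijkstra's algorithm. Below is a minimal sketch, assuming a hypothetical four-router topology with made-up link costs. Real OSPF builds this graph from the link-state information it has received, which is not modeled here.

```python
import heapq


def shortest_paths(graph, source):
    """Dijkstra's algorithm: return (cost to each node, parent in the tree)."""
    dist = {source: 0}
    parent = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbor, cost in graph[node].items():
            new_d = d + cost
            if new_d < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_d
                parent[neighbor] = node
                heapq.heappush(heap, (new_d, neighbor))
    return dist, parent


# Hypothetical topology: router -> {neighbor: link cost}
topology = {
    "R1": {"R2": 1, "R3": 4},
    "R2": {"R1": 1, "R3": 2, "R4": 7},
    "R3": {"R1": 4, "R2": 2, "R4": 3},
    "R4": {"R2": 7, "R3": 3},
}

dist, parent = shortest_paths(topology, "R1")
print(dist)    # cost from R1 to every router
print(parent)  # each router's parent in R1's shortest-path tree
```

The parent map is the tree itself: following parents from any router back to R1 gives the path, and the first hop on that path is what would go into R1's routing table.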

Between autonomous systems

The routing protocol used between autonomous systems is BGP. It requires each autonomous system to choose at least one router as its speaker, and the speakers of different autonomous systems exchange routing information with each other.

So that is the essence of the network. In fact, a network path is host-router-router-host, with any number of routers in the middle.

The key to networking is how a packet hops from one router to another, and how each router knows which router is the next hop. This is the routing table matching process.

The routing table has three matching principles:

1) Longest prefix match (the longer the mask, the more specific the route)

2) Administrative distance (the smaller the better)

3) Metric (as computed by the dynamic routing protocol)

That is, in general there will always be a route to some next hop, but that next hop may not be the destination. The route gets more specific as the packet gets closer to the destination (though not necessarily at every hop).
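The first principle, matching on the longest prefix, can be sketched with Python's standard ipaddress module. The routing table and next-hop addresses below are made up for illustration. The other two principles only matter when several routes to the same prefix compete, and are not modeled.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, next hop)
ROUTES = [
    ("0.0.0.0/0", "192.0.2.1"),    # default route
    ("10.0.0.0/8", "192.0.2.2"),
    ("10.1.0.0/16", "192.0.2.3"),
]


def next_hop(destination, routes=ROUTES):
    """Return the next hop of the matching route with the longest prefix."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, hop in routes:
        network = ipaddress.ip_network(prefix)
        if addr in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, hop)
    return best[1] if best else None


print(next_hop("10.1.2.3"))  # matches /0, /8 and /16; the /16 wins
print(next_hop("10.9.9.9"))  # matches /0 and /8; the /8 wins
print(next_hop("8.8.8.8"))   # only the default route matches
```

A real router does this lookup in specialized data structures rather than a linear scan, but the result is the same.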

You can see each hop along the way with the traceroute command.

2.2.4 Transport Layer

The essence of a network is process communication between different hosts. The network layer, the data link layer and the physical layer realize the data communication between the hosts and provide the basis for the process communication.

The transport layer provides the protocols for process communication: TCP and UDP.

The so-called three-way handshake that TCP uses to establish a connection is purely a transport layer concept. As far as the network layer, data link layer and physical layer are concerned, there is nothing special about the handshake. Establishing a connection does not reserve a physical line, nor does it reserve a fixed route for the IP packets. The connection is just a logical concept of the TCP protocol. It is a bit like a Kafka topic: we often say that data is stored under a topic, but in fact the data is stored in partitions.

This layer does not have much to do with the nature of the network itself. But for application development, you need to know this layer well.

2.2.5 Application layer

The transport layer realizes process communication between different hosts, and the application layer builds its functions on top of that process communication.

Typical examples are FTP and HTTP.

2.3. Network Message

A message is simply a piece of information to be transmitted. During network transmission, a message is generally not sent as one whole piece. Instead it is divided into several packets that are transmitted separately. This requires a protocol to describe each packet, which results in the structure: header fields + packet data

Since the network is layered and each layer has its own protocol, header fields keep being added as the data moves down the stack. The whole structure from the layer above becomes the data of the layer below, and adding that layer's header fields produces an entirely new structure. (Commonly known as nesting dolls.)
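The nesting can be sketched as follows. The bracketed tags such as [TCP] are placeholders standing in for real headers, which carry many fields, and the function names are hypothetical.

```python
def encapsulate(payload: bytes, header: bytes) -> bytes:
    """A layer wraps whatever it is given by putting its own header in front."""
    return header + payload


def decapsulate(packet: bytes, header: bytes) -> bytes:
    """The peer layer strips its own header and passes the rest upward."""
    if not packet.startswith(header):
        raise ValueError("unexpected header")
    return packet[len(header):]


message = b"GET /index.html"               # application layer data
segment = encapsulate(message, b"[TCP]")   # transport layer adds its header
packet = encapsulate(segment, b"[IP]")     # network layer adds its header
frame = encapsulate(packet, b"[ETH]")      # data link layer adds its header

print(frame)

# The receiver unwraps the layers in the opposite order.
received = decapsulate(decapsulate(decapsulate(frame, b"[ETH]"), b"[IP]"), b"[TCP]")
print(received)
```

Each layer only looks at its own header and treats everything after it as opaque data.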

Third, the details

3.1 Local Area Network (LAN)

Two computers connected by a network cable can form a minimal LAN.

3.1.1 Hub

Many computers can form a LAN through a hub, which repeats every bit it receives to all of its other ports.

A hub is a pure hardware network device. It has none of the “memory” or “learning” ability of a switch, and no MAC address table, so it cannot send data to a specific target and broadcasts instead. When it sends data to a node, it does not send it directly to the destination node, but sends the packet to every node connected to the hub.

How the hub works:

You stand in the school atrium and shout “Xiao Fang, I have come to see you!” (broadcast)
If someone else happens to be shouting at that moment, you must wait for him to finish. (queueing)
If another person starts shouting at exactly the same time as you, neither of you will be heard. (collision)
While you are shouting, you cannot hear what others are saying. Only when you have finished do you start listening. (half-duplex mode, listening)
Sure enough, your girlfriend’s voice comes from the building across the way: “Go to hell!” (response)

Hubs can cause a broadcast storm.

I actually think of this as a broadcast jam. The root of all evil is broadcasting. Hubs work by broadcasting: every message is sent to all the machines in the LAN, and once enough devices are connected, each message is rebroadcast from device to device. This floods the LAN with a large number of identical messages (which is why I call it a jam). Moreover, a machine that receives a message will generally reply to it, and those replies can aggravate the broadcast storm (jam).

3.1.2 Switch

A switch is an upgraded hub that keeps a table mapping ports to MAC addresses (see the switch diagram in 2.2.2).

With a switch, communication between two hosts in the LAN no longer has to be broadcast.
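How the port-MAC table removes the need to broadcast can be sketched with a toy learning switch. The class name, port numbers and shortened MAC addresses are made up for illustration. Real switches also age out table entries, which is not modeled here.

```python
class LearningSwitch:
    """Toy switch that learns which MAC address lives behind which port."""

    def __init__(self, port_count):
        self.ports = list(range(port_count))
        self.mac_table = {}  # MAC address -> port

    def receive(self, in_port, src_mac, dst_mac):
        """Handle one frame and return the list of ports it is sent out of."""
        # Learn: the sender is reachable through the port the frame came in on.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: behave like a hub and flood.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination is on the same port, nothing to forward
        return [out_port]


switch = LearningSwitch(4)
first = switch.receive(0, "aa:aa", "bb:bb")  # bb:bb unknown, so flood
reply = switch.receive(1, "bb:bb", "aa:aa")  # aa:aa was learned on port 0
again = switch.receive(0, "aa:aa", "bb:bb")  # bb:bb was learned on port 1
print(first, reply, again)
```

Only the very first frame is flooded. After both hosts have sent something, traffic between them touches only their two ports.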

3.2 LAN communication

3.2.1 Bridge

In fact, network communication is the communication between multiple LANs, and a bridge is a network device that realizes the interconnection of multiple LANs. The working principle of a bridge is basically the same as that of a router and a gateway.

The working principle of the network bridge is as shown below:

As you can see, the bridge connects to multiple LANs through multiple ports to enable communication between those LANs.

Multiple bridges can also be connected to each other.

3.2.2 Bridge and Router

The main differences between a bridge and a router are:

A bridge can only connect networks that are logically the same network (it is equivalent to a layer-2 switch), while a router can connect different networks. A bridge joins machines in different physical locations into one large LAN, and all the segments it connects belong to that same LAN. The segments may use different link-layer technologies, such as Ethernet and Token Ring. A router connects different networks, which are independent of each other.
A bridge forwards based on MAC addresses, while a router forwards based on IP addresses.
Bridges do not isolate broadcasts, while routers do.
A bridge works at the data link layer, while a router works at the network layer.

3.2.3 IP Protocol

Routers forward based on IP, so it is necessary to introduce some IP basics here.

3.2.3.1 Features of IP Protocol

IP is a connectionless, unreliable packet delivery protocol.
IP is a point-to-point network layer communication protocol.

3.2.3.2 IPv4 packet format

3.2.3.3 IPv4 address classification

IPv4 addresses fall into five categories

The network number is used to identify the network in which the host resides, and the host number is used to identify the host in that network.

IP addresses are divided into five classes: class A was intended for a small number of very large networks, class B for medium-sized organizations, class C for small networks, class D is used for multicast, and class E is reserved for experimental use. The number of addresses each class can hold varies.
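The class of an address is determined by its leading bits, which is equivalent to checking the range of the first octet. A minimal sketch follows, with a hypothetical helper name. The ranges are the standard classful ones.

```python
def address_class(ip: str) -> str:
    """Return the classful category (A to E) of a dotted-decimal IPv4 address."""
    first_octet = int(ip.split(".")[0])
    if not 0 <= first_octet <= 255:
        raise ValueError("not a valid IPv4 address: " + ip)
    if first_octet < 128:   # leading bit  0
        return "A"
    if first_octet < 192:   # leading bits 10
        return "B"
    if first_octet < 224:   # leading bits 110
        return "C"
    if first_octet < 240:   # leading bits 1110
        return "D"
    return "E"              # leading bits 1111


for ip in ["10.0.0.1", "172.16.0.1", "192.168.10.0", "224.0.0.1", "240.0.0.1"]:
    print(ip, address_class(ip))
```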

As can be seen from the figure above, an IP address is composed of network number + host number. This structure is not flexible enough and leads to low utilization of IP addresses (for example, hundreds of addresses assigned to a site with only one or two hosts). Therefore the concept of the subnet was proposed: part of the host number is borrowed as a subnet number, so that one network can be divided into several smaller subnets.

Subnet rules:

The new structure consists of network number + subnet number + host number
All hosts in the same subnet use the same subnet number
The subnets must be very close together. Dividing a network into subnets is an internal matter of the organization: no application is required, and it has no effect on addressing by external routers.

3.2.3.4 Subnet Mask

A subnet mask is used to extract the network number and subnet number from an IP address. The default subnet masks for classes A, B and C are 255.0.0.0, 255.255.0.0 and 255.255.255.0 respectively.

Example: the class C network 192.168.10.0 has 200 machines to be divided into 4 subnets. Find the subnet mask.

Step 1: 200 machines in 4 subnets means 50 machines per subnet. The default subnet mask of a class C address is 255.255.255.0.
Step 2: The powers of 2 from 2^0 to 2^10 are 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024.
Step 3: For each subnet to have 50 usable IP addresses, it needs at least 52 addresses, because the first and last address of each subnet are the network address and the broadcast address and cannot be assigned to machines. The smallest power of 2 that is at least 52 is 64, so each subnet gets 64 IP addresses.
Step 4: The last octet of the subnet mask is 256-64=192, so the subnet mask is 255.255.255.192.
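The same calculation can be checked with Python's standard ipaddress module. This is a sketch of the steps above, not a general planning tool, and the function name plan_subnets is made up.

```python
import ipaddress


def plan_subnets(network: str, hosts_per_subnet: int):
    """Split a network into equal subnets that each fit the given host count."""
    net = ipaddress.ip_network(network)
    # Step 3: add 2 for the network address and the broadcast address.
    needed = hosts_per_subnet + 2
    # Number of host bits of the smallest power of 2 that is >= needed.
    host_bits = (needed - 1).bit_length()
    new_prefix = net.max_prefixlen - host_bits
    return list(net.subnets(new_prefix=new_prefix))


subnets = plan_subnets("192.168.10.0/24", 50)
for subnet in subnets:
    print(subnet, subnet.netmask, subnet.num_addresses)
```

For this example the function yields four subnets of 64 addresses each, all with the mask 255.255.255.192, which matches Step 4.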

3.2.3.5 Classless Inter-Domain Routing CIDR

CIDR (Classless Inter-Domain Routing) eliminates the traditional concepts of class A, B and C addresses and of subnets, and so allocates the IPv4 address space more efficiently. An IP address = network prefix + host number. It is written in slash notation: the IP address is followed by a slash “/” and then the number of bits in the network prefix, for example:

IP address 20.1.1.1 with subnet mask 255.192.0.0 is written as 20.1.1.1/10 in CIDR notation. The 10 stands for 10 consecutive 1 bits in the mask, that is, the network prefix occupies 10 bits.
The CIDR address 200.1.1.2/24 indicates that the first 24 bits are used as the network prefix.

One of the biggest benefits of CIDR is that it greatly reduces the size of routers’ routing tables and reduces address waste.

CIDR groups contiguous IP addresses that share the same network prefix into a “CIDR address block”. The addresses must be contiguous, otherwise it would be impossible to design a prefix that includes the desired addresses but excludes the unwanted ones. For this reason supernet blocks, that is, large contiguous blocks of addresses, are assigned to ISPs, which then divide them among their users. This reduces the burden on routers.
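Both the slash notation and the aggregation of contiguous blocks can be tried out with Python's standard ipaddress module. The four 200.1.x.0/24 blocks below are made up to illustrate aggregation.

```python
import ipaddress

# Slash notation: 20.1.1.1/10 means the first 10 bits are the network prefix.
iface = ipaddress.ip_interface("20.1.1.1/10")
print(iface.netmask)  # the mask with 10 leading 1 bits
print(iface.network)  # the CIDR block that contains 20.1.1.1

# 200.1.1.2 belongs to the block whose first 24 bits are 200.1.1
in_block = ipaddress.ip_address("200.1.1.2") in ipaddress.ip_network("200.1.1.0/24")
print(in_block)

# Four contiguous /24 blocks collapse into a single /22 route.
blocks = [ipaddress.ip_network(f"200.1.{i}.0/24") for i in range(4)]
aggregated = list(ipaddress.collapse_addresses(blocks))
print(aggregated)
```

The last step is why contiguous allocation keeps routing tables small: a router outside the ISP needs one entry for the /22 instead of four entries for the /24 blocks.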

3.2.4 Router

3.2.4.1 Routing jump

Network communication is the process in which information hops from the source host to the source router, then from one router to another, until it arrives at the destination router and finally at the destination host.

Single router:

Multiple routers:

3.2.4.2 Autonomous system

To solve the problem of generating complex routing tables and updating routing information, people put forward the concept of the autonomous system. Its features are:

The network of an autonomous system belongs to one organization, such as a university or a corporation. Each autonomous system can choose its own routing protocol internally
Routing involves both interior gateway protocols (IGP), used within an autonomous system, and exterior gateway protocols (EGP), used between autonomous systems

The main IGPs are the Routing Information Protocol (RIP) and Open Shortest Path First (OSPF).

The most commonly used EGP is BGP-4. BGP-4 requires each autonomous system to select at least one router as its “speaker”. Speakers exchange routing information directly with each other.

A detailed introduction to IGP and EGP is left as a gap to fill in later. For now it is enough to know that routers generate and dynamically update their internal and external routing table information mainly through these protocols.

3.3 Network Layer Related Protocols

3.3.1 Internet Control Message Protocol (ICMP)

IP provides only best-effort service. During transmission, the source host has no way of knowing whether the network is connected, whether the destination host is reachable, or whether a route is available. ICMP is used to pass control information between hosts and routers, including error reports and a limited amount of control and status information.

ICMP is a network layer protocol and is a component of the IP protocol.

The ping and traceroute commands are typical applications of ICMP.

The details of ICMP are left to fill in later. A general impression is enough here.

3.3.2 IP Multicast and IGMP Protocol

3.3.3 VPN and MPLS Protocol

3.3.4 Address resolution protocol ARP

During network communication, the source host needs to know both the IP address and the MAC address of the destination host (or of the next-hop router, if the destination is in another network). The IP address is generally known in advance, so how do we find the MAC address from a known IP address? With ARP, which solves the problem of mapping IP addresses to MAC addresses.

The process of finding a MAC address from an IP address is called forward address resolution, and the corresponding protocol is ARP.

The process of finding an IP address from a MAC address is called reverse address resolution, and the corresponding protocol is RARP.

In essence, ARP broadcasts a query within the LAN to find the MAC address that corresponds to an IP address.
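The broadcast lookup can be sketched as a small simulation. Everything here is made up for illustration: the Host class, the resolve function, and the addresses. Real ARP messages are frames on the wire, not method calls, and real caches expire.

```python
class Host:
    """Toy host with one IP address, one MAC address and an ARP cache."""

    def __init__(self, ip, mac):
        self.ip = ip
        self.mac = mac
        self.arp_cache = {}  # IP address -> MAC address

    def answer_arp(self, target_ip):
        """Reply with our MAC only if the query asks for our own IP."""
        return self.mac if target_ip == self.ip else None


def resolve(sender, target_ip, lan):
    """Find the MAC for target_ip: check the cache first, then broadcast."""
    if target_ip in sender.arp_cache:
        return sender.arp_cache[target_ip]
    for host in lan:  # the query reaches every host in the LAN
        if host is sender:
            continue
        mac = host.answer_arp(target_ip)
        if mac is not None:
            sender.arp_cache[target_ip] = mac  # remember the answer
            return mac
    return None  # nobody in this LAN owns that IP


a = Host("192.168.1.10", "aa:aa:aa:aa:aa:aa")
b = Host("192.168.1.20", "bb:bb:bb:bb:bb:bb")
c = Host("192.168.1.30", "cc:cc:cc:cc:cc:cc")
lan = [a, b, c]

found = resolve(a, "192.168.1.30", lan)
missing = resolve(a, "192.168.1.99", lan)
print(found, missing, a.arp_cache)
```

The cache is what keeps ARP from turning every packet into a broadcast: after the first lookup, the answer is served locally.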

The details of ARP are left to fill in later. A general impression is enough here.

3.3.5 Mobile IP Protocol

3.4. Transport layer

The network layer considers mainly point-to-point communication, while the transport layer considers end-to-end communication.

3.4.1 Basic concepts

TPDU

The packets exchanged between transport layer entities are called Transport Protocol Data Units (TPDU).

Application processes, transport layer interfaces, and sockets

A host can be identified by its IP address, and an application process on the host can be identified by its port number.

Application processes communicate through the transport layer protocols (TCP, UDP).

A socket is defined by an IP address and a port number. A process on the network can be identified through its socket.
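The idea that an (IP address, port number) pair identifies a process endpoint can be seen with Python's standard socket module. This is a minimal sketch that sends one UDP datagram over the loopback interface inside a single process. The function name is made up, and binding to port 0 asks the operating system to pick any free port.

```python
import socket


def udp_roundtrip(message: bytes):
    """Send one datagram between two sockets on this machine."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))        # port 0: let the OS choose
        receiver.settimeout(2.0)               # do not block forever
        receiver_addr = receiver.getsockname()  # the (IP, port) of the receiver

        sender.sendto(message, receiver_addr)
        data, sender_addr = receiver.recvfrom(1024)
        return data, receiver_addr, sender_addr
    finally:
        sender.close()
        receiver.close()


data, receiver_addr, sender_addr = udp_roundtrip(b"ping")
print(data)
print("receiver socket:", receiver_addr)
print("sender socket:", sender_addr)
```

Both endpoints share the IP address 127.0.0.1 here, so it is only the port numbers that tell the two sockets apart.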

3.4.2 User Datagram Protocol UDP

To be filled in later.

3.4.3 Transmission Control Protocol TCP