The previous article covered the basic principles of computer organization that front-end developers without a computer science background need to master. This article is about computer networks.
The main purpose of this article is likewise to fill in basic computer science knowledge for front-end developers who did not major in the field. For example, if you want to do Node development but are not familiar with how data is transmitted over the network, many APIs will be confusing. In that case, take a look at this article, carefully prepared for you 😁!
Admittedly, this material is rather dry, so I combine text with pictures and mark the less important knowledge points to make reading as painless as possible. Enough talk! Take a drink of water and let's go!
A high-quality article on hands-on topics: juejin.cn/post/692867…
1. Computer network concepts (a quick look)
A computer network is a system in which dispersed computer systems, each with independent functions, are connected by communication equipment and lines, and in which software enables resource sharing and information transfer.
Note that, in terms of geographic coverage, computer networks include local area networks (LANs) and wide area networks (WANs), with Ethernet as the representative LAN technology. The most important distinction between the two is that LANs are based on broadcast technology and WANs are based on packet switching technology. (A passing acquaintance with these terms is enough for now. Both are covered later, and once you understand broadcast and packet switching you will have a general understanding of LANs and WANs.)
2. Performance metrics of a computer network
This section describes basic concepts to be mastered when you learn specific protocols and analyze packets of these protocols.
2.1 Rate
Rate is the speed at which data (0s and 1s) is transferred. For example, when you download something with Xunlei (Thunder) at 1 megabit per second, that figure is the rate. It is the most important performance metric in computer networks.
2.2 Bandwidth
In computer networks, bandwidth is the maximum amount of data that can be transmitted per unit of time (generally 1 second). For example, if your broadband plan is 100 megabits, the maximum transfer rate is 100 megabits per second.
2.3 Throughput
Throughput represents the amount of data passing through a network (or channel, interface) per unit of time.
Let's tie these three points together with an example:
- At most 100 cars per second can pass along a road (so the bandwidth is 100 cars per second).
- In practice, 100 cars do not pass every second. Say 0 cars pass in the first second, 10 cars in the second second, and so on (but never more than 100).
- So the flow is 0 cars per second in the first second, 10 in the second, and 30 in the third. None of these figures is the bandwidth; they are the throughput, the traffic that actually passes in a given interval (which may at times equal the bandwidth).
- In short, bandwidth is the maximum rate, and throughput is the actual rate at some point in time. Throughput can never exceed the bandwidth.
2.4 Delay
Delay (latency) is the time it takes for data (a message, packet, or bit stream) to travel from one end of a network (or link) to the other. The unit is seconds. There are several kinds of delay:
(1) Transmission (sending) delay
It is like the time from when I start speaking a sentence to you until I finish speaking it: the time from sending the first bit of a frame until its last bit has been sent. It equals the frame length divided by the sending rate.
(2) Propagation delay
As shown in the GIF, the propagation delay is the time a bit needs to travel along the channel from the sender to the receiving host's interface. It equals the channel length divided by the signal's propagation speed.
(3) Queuing delay
- As a packet travels through the network, it passes through many routers.
- After entering a router, the packet must first wait in the input queue to be processed.
- After the router determines the forwarding interface, the packet waits again in the output queue to be forwarded. These waits cause the queuing delay.
- The length of the queuing delay usually depends on the current traffic in the network. When traffic is heavy, the queues overflow and packets are lost.
(4) Processing delay
When a router or host receives a data packet, it takes a certain amount of time to process it, such as analyzing the header of the data packet, checking the header error, searching the routing table and selecting the outgoing interface for the data packet, which results in processing delay.
(5) Round trip time (RTT)
In computer networks, round trip time is also an important performance indicator. It represents the total time between the time the sender sends the data and the time the sender receives the acknowledgement from the receiver (who sends the acknowledgement immediately after receiving the data)
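(6) Delay-bandwidth product
The delay-bandwidth product is the propagation delay multiplied by the bandwidth. It is the number of bits that can be "in flight" on the link at once.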
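To make the delay definitions in this section concrete, here is a minimal sketch that computes them with simple arithmetic. The function names and all of the numbers (a 1500-byte frame, a 100 Mbit/s link, a 1000 km cable, a signal speed of 2×10^8 m/s) are illustrative assumptions, not values from the article.

```python
def transmission_delay(frame_bits: float, rate_bps: float) -> float:
    """Time to push every bit of the frame onto the link, in seconds."""
    return frame_bits / rate_bps


def propagation_delay(distance_m: float, speed_mps: float) -> float:
    """Time for one bit to travel the length of the channel, in seconds."""
    return distance_m / speed_mps


def delay_bandwidth_product(prop_delay_s: float, bandwidth_bps: float) -> float:
    """Number of bits that can be in flight on the link at once."""
    return prop_delay_s * bandwidth_bps


# Hypothetical example values (assumptions for illustration only).
frame_bits = 1500 * 8          # one 1500-byte frame
bandwidth_bps = 100e6          # a 100 Mbit/s link
distance_m = 1_000_000         # 1000 km of cable
speed_mps = 2e8                # rough signal speed in a cable

t_tx = transmission_delay(frame_bits, bandwidth_bps)
t_prop = propagation_delay(distance_m, speed_mps)
bdp_bits = delay_bandwidth_product(t_prop, bandwidth_bps)

print(f"transmission delay: {t_tx * 1e3:.3f} ms")
print(f"propagation delay:  {t_prop * 1e3:.3f} ms")
print(f"delay-bandwidth product: {bdp_bits:.0f} bits")
```

With these made-up numbers the propagation delay dominates, which is typical of long links: sending the frame is quick, but the bits still have to travel the distance.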
3. On to the main content: the OSI reference model
The OSI reference model is a seven-layer framework for network interconnection. I won't go into detail here, since each layer is covered later; a first impression is all that is needed.
As shown in the figure below, layers 1, 2, and 3 mainly involve physical links and devices, such as optical fibers, routers, and hubs, and are mainly responsible for data communication. Layers 5, 6, and 7 are implemented in software (HTTP, for example, is a protocol handled in software) and are mainly responsible for processing the transmitted data.
4. Physical layer
4.1 What is the use of the physical layer?
For the physical layer, one might say, isn’t it just the network cable, like the network cable that connects the router at home, or the optical fiber on the telegraph pole? It doesn’t matter what the physical medium is, like optical fiber or twisted pair, as long as you can transmit data according to the physical standard.
What are the main tasks of the physical layer?
- It lays down rules such as electrical characteristics: for example, a signal level of +10V to +15V represents binary 0, and -10V to -15V represents binary 1. As long as a network cable meets this specification, it doesn't matter what material it is made of.
- There are other characteristics that we don't need to know. It is enough to know that the physical layer is the standard that defines the interface to the transmission media.
4.2 In what form does fiber broadband transmit data?
- First, the data coming out of the computer's network card is an electrical signal. Optical fiber carries light pulses: a light pulse represents 1, and the absence of a pulse represents 0.
- The frequency of visible light is about 10^8 MHz, so the bandwidth of an optical fiber communication system is far greater than that of other transmission media.
- So the data from the computer first has to be converted from electrical signals to optical signals and sent through the fiber. Near the server, the optical signals are converted back to electrical signals.
4.3 Physical layer device: the repeater
Why do you need a repeater?
The power of a signal gradually decays as it travels along the line. Once it has attenuated to a certain extent, the signal becomes distorted, which leads to receiving errors.
A repeater regenerates and restores the signal, increasing the distance over which it can be transmitted.
Note that the two ends of a repeater are connected to network segments, not subnets. You need to learn IP address classification at the network layer to fully understand what a network segment is. For now, loosely think of different network segments as the networks attached to different routers.
Getting through this many concepts is not easy. Have a drink of water, and then we continue!
Let’s move on to the next layer, the data link layer!
5. Data link layer
5.1 What does the Data Link Layer do?
Let's start with a little story:
- The network layer is the big boss, responsible for assigning tasks to its little secretary, the data link layer. The boss tells the secretary to deliver 5 documents to company B, and the secretary finds a courier, the physical layer, to do it.
- But the physical layer is a fool. All he knows is to grab the documents and dash to company B; whether anything gets lost on the way, he has no idea. So the secretary (the data link layer) has to keep track: 5 documents are being sent in total, and she writes that on the outside of the parcel. When the secretary at company B receives the parcel from the fool, she checks whether any document is missing. If one is, she asks the fool to go back and fetch it again.
- From this story, we can summarize the main functions of the data link layer.
5.2 Functions of the Data Link Layer
(1) Framing
The data link layer does not mindlessly forward the boss's information; it wraps it up, for example with the document count. A network layer packet encapsulated at the link layer is called a data frame.
(2) Transparent transmission
Transparent transmission means that whatever the boss sends, even if the document contains the secretary's own dismissal notice, the secretary must deliver it exactly as it is. The data part of a frame may contain the same characters as the frame delimiter. In that case, measures must be taken so that the receiver is not misled about which bytes mark the beginning of the frame and which are frame data.
(3) Error control
Error control is what happens when the parcel arrives at the secretary of company B: the label says 5 documents but she sees only 3, so she asks the fool to resend the missing ones. The usual error detection method is the CRC (cyclic redundancy check). I won't go into detail, because I don't know much about it myself, but a link layer frame reserves an FCS (frame check sequence) field for this code, which is used to determine whether a frame is corrupted.
(4) Error correction
Error correction is when the link layer knows which of documents 1, 2, 3, 4, and 5 are missing and corrects the problem by resending them.
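The framing, transparent transmission, and error control ideas above can be sketched together in a few lines. This is a simplified illustration, not any real link layer's format: the `FLAG` and `ESC` byte values, the frame layout, and the function names are all assumptions made for this example. The checksum uses `zlib.crc32` from the Python standard library as a stand-in for the FCS.

```python
import zlib

FLAG = 0x7E  # hypothetical frame delimiter byte
ESC = 0x7D   # hypothetical escape byte


def stuff(data: bytes) -> bytes:
    """Insert ESC before any FLAG or ESC byte so data never looks like a delimiter."""
    out = bytearray()
    for b in data:
        if b in (FLAG, ESC):
            out.append(ESC)
        out.append(b)
    return bytes(out)


def unstuff(data: bytes) -> bytes:
    """Reverse stuff(): drop each escape byte and keep the byte that follows it."""
    out = bytearray()
    escaped = False
    for b in data:
        if not escaped and b == ESC:
            escaped = True
            continue
        out.append(b)
        escaped = False
    return bytes(out)


def build_frame(payload: bytes) -> bytes:
    """Frame = FLAG + stuffed(payload + 4-byte CRC) + FLAG."""
    fcs = zlib.crc32(payload).to_bytes(4, "big")
    return bytes([FLAG]) + stuff(payload + fcs) + bytes([FLAG])


def parse_frame(frame: bytes):
    """Return the payload, or None if the checksum does not match."""
    body = unstuff(frame[1:-1])
    payload, fcs = body[:-4], body[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != fcs:
        return None  # corrupted frame: the receiver discards it
    return payload


# The payload deliberately contains the delimiter and escape bytes.
payload = bytes([0x01, FLAG, ESC, 0x02])
frame = build_frame(payload)
recovered = parse_frame(frame)

# Flip the bits of one data byte to simulate corruption on the wire.
damaged = bytearray(frame)
damaged[1] ^= 0xFF
damaged_result = parse_frame(bytes(damaged))
```

Here `recovered` equals the original payload even though it contains the delimiter byte, which is the point of transparent transmission, while the damaged frame fails the CRC check and is rejected.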
There are also some data link layer functions not mentioned in the story, such as:
(5) Flow control
If the sender sends very fast and the receiver receives very slowly, transmission errors result, so the sending rate has to be limited.
Note that TCP at the transport layer also has flow control. The difference is that TCP's flow control is end-to-end, while the link layer's is point-to-point (for example, from one router to the next).
The flow control methods are the sliding window protocols, including selective retransmission, which I'll leave for the TCP discussion. Next up is Ethernet, the most common LAN technology today. Understanding it helps us understand LANs.
6. Ethernet and wireless networks
Ethernet is a LAN technology. It specifies the access control method, transmission control rules, network topology, transmission rate, and so on, covering parts of the data link layer and the physical layer. It uses a media access method called CSMA/CD (described below). There are other LAN technologies as well, such as the wireless LAN (WLAN).
6.1 Ethernet Frame Format
- Among these fields, the destination address and source address are MAC addresses, that is, the physical addresses of devices. MAC addresses identify network cards (NICs), and each NIC has a unique MAC address.
- If host A needs to send a message to host B on the same LAN, host A sends an Ethernet frame. After receiving the frame, the network card in each host compares the destination MAC address with its own MAC address. If they differ, the frame is discarded; if they match, it is accepted. This is how host B receives the message from A.
- At the end is the CRC cyclic redundancy code, used for error control, that is, to check the correctness of the frame.
- In Ethernet, there are three types of destination address: unicast, broadcast, and multicast. A unicast address is used when A sends to host B: there is exactly one receiver, and the lowest-order bit of the highest (first) byte of the destination address is 0.
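The address rules above can be sketched as a small classifier plus the accept-or-discard check a network card performs. The function names and the sample MAC addresses are hypothetical, and multicast group membership is deliberately left out of this sketch.

```python
def parse_mac(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def address_type(mac: str) -> str:
    """Classify a destination MAC address as broadcast, multicast, or unicast."""
    raw = parse_mac(mac)
    if raw == b"\xff" * 6:
        return "broadcast"          # all 48 bits are 1
    if raw[0] & 0x01:
        return "multicast"          # lowest bit of the first byte is 1
    return "unicast"                # lowest bit of the first byte is 0


def nic_accepts(own_mac: str, dest_mac: str) -> bool:
    """Decide whether a NIC keeps a frame or discards it."""
    kind = address_type(dest_mac)
    if kind == "broadcast":
        return True
    if kind == "unicast":
        return parse_mac(own_mac) == parse_mac(dest_mac)
    return False  # multicast groups are not modeled in this sketch


host_b = "00:1a:2b:3c:4d:5e"  # hypothetical address of host B
print(address_type(host_b))                    # unicast
print(address_type("ff:ff:ff:ff:ff:ff"))       # broadcast
print(nic_accepts(host_b, "00:1A:2B:3C:4D:5E"))
```

Comparing the parsed bytes rather than the strings means that upper-case and lower-case spellings of the same address match.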
6.2 Features of Ethernet
- Connectionless. No connection is established between the sender and the receiver.
- Unreliable. The receiver does not send acknowledgements to the sender, and erroneous frames are simply discarded.
6.3 Ethernet Topology
Ethernet-related topologies include star and bus.
The star topology is as follows:
The bus topology is as follows:
In the early days of the Internet, the Ethernet bus topology was common. Reliability decreases as the number of sites on bus Ethernet increases, while developments in large-scale integrated circuits and specialized chips make star Ethernet cheaper and more reliable.
Note that although the Ethernet topology is a star physically, it is a bus logically.
6.4 The network card
Data from the computer becomes Ethernet frames as it passes through the network card. The card also handles some link management (implementing CSMA/CD) as well as encoding and decoding. (I don't really understand the encoding part; it seems to be Manchester coding, which specifies how bits are represented as high and low levels.)
6.5 WLAN
A WLAN is a local area network (LAN) that uses wireless communication technologies such as radio waves, laser, and infrared.
For wireless LANs, we will only introduce the typical network structure.
- In the image above, the AP is the communication base station. Mobile devices covered by the same base station can communicate with each other directly through that base station.
- Mobile devices under different base stations not only need to send data through their own base station; the data must also be transmitted between the base stations. Only then can two wireless devices within range of different base stations communicate.
6.6 CSMA/CD protocol
Since this is the protocol Ethernet uses, there are only a few features to keep in mind.
- Each station checks whether other computers on the bus are sending data, both before and during its own transmission.
- It applies to bus-type, half-duplex networks. (Half-duplex means data may travel in both directions, but only in one direction at a time.)
6.7 Devices at the link layer
(1) Network bridge
A bridge forwards and filters MAC frames based on their destination address. When a bridge receives a frame, it does not forward it to all interfaces. Instead, it checks the frame's destination MAC address and then decides which port to forward the frame to, or whether to discard it.
The important thing to note is that a bridge connects different network segments. What is a network segment? Briefly, and with details to follow in the IP address section: two hosts are on the same network segment when ANDing each one's IP address with the subnet mask yields the same network address.
(2) Ethernet switch
When it comes to switches, we have to mention two concepts: the collision domain and the broadcast domain.
- Collision domain: the range within which only one device can send at any given moment.
- Broadcast domain: if a station sends a broadcast signal, the range of devices that can receive that signal is called the broadcast domain.
- In other words, a broadcast domain can span network segments, while a collision domain exists only within a single network segment.
For example, in a company everyone's computer is connected to a switch, because switches isolate collision domains. The problem with a shared collision domain is that only one machine can transmit at a time, and with so many people in a company that would be far too slow. The switch in turn connects to a router, which isolates broadcast domains. Moreover, frames at your data link layer cannot reach the Internet without going through a router, which is a network layer device.
If you have read this far, you really have stamina. Awesome! Have some water, and let's continue!
7. Network layer
This part is about concepts. A quick look is enough!
7.1 Network Layer Concepts
The main task of the network layer is to move packets from one host to another. (Most computer networks cannot transmit data of arbitrary length in one continuous run, so the network actually divides data into small chunks, called packets.) It thus provides host-to-host communication, on top of which the various forms of process-to-process communication are built.
7.2 Learning Network Layer Concepts
7.2.1 Packet switching
When host H1 wants to send data (a message) to another host H2, it first divides the data into several packets of equal length and sends them one by one to node A, the node directly connected to H1. After receiving a packet, A first stores it in a buffer, then uses a routing algorithm to decide which node the packet should be forwarded to next. The packet is relayed from node to node in this way until it reaches the final destination, H2.
That explanation is long-winded. Put simply, packet switching means splitting data into blocks. In addition, no connection is established, packets are stored and forwarded (store-and-forward means the Ethernet switch's controller first buffers the packets arriving at an input port, checks whether they are correct, and filters out collision fragments and erroneous packets), and routes are allocated dynamically (switching equipment, such as a router, chooses different routes according to the state of the network).
7.2.2 Datagrams
A datagram is a basic unit of data transmitted over a network, consisting of a header and the data itself. Put plainly, it is data with addresses attached: if you type "hello" in WeChat, that string of text together with the source address and destination address is a datagram.
7.2.3 Datagram Format
- The fixed part of the header is 20 bytes, 20 * 8 = 160 bits in total (1 byte = 8 bits).
- The first 4 bits (bits 0 to 3) are the version number; the version can be IPv4 or IPv6.
- Header length: the unit is 4 B and the minimum value is 5. Why 5? The header is at least 20 bytes, and 4 * 5 = 20 bytes.
- Differentiated services: can be skipped.
- Total length: header + data.
- Time to live is the TTL, which tells the network whether the packet has been in the network too long and should be discarded. The value is decremented by one at every router, and the packet is discarded when it reaches 0.
- Protocol: which protocol the data part uses. We just need to know that TCP is represented by 6 and UDP by 17.
- The header checksum is 16 bits. This field validates only the header of the datagram, not the data portion.
- The source address and destination address are both IP addresses. The destination address is obtained through a DNS query.
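As a rough illustration of the fields listed above, the following sketch packs and unpacks a 20-byte IPv4 header with Python's `struct` module. The helper names, the default TTL of 64, and the sample addresses are assumptions for this example; options, fragmentation fields, and differentiated services are simply set to zero.

```python
import socket
import struct

HEADER_FORMAT = "!BBHHHBBH4s4s"  # 20 bytes, network byte order


def ipv4_checksum(header: bytes) -> int:
    """16-bit one's-complement sum over the header, then inverted."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(src: str, dst: str, protocol: int, payload_len: int, ttl: int = 64) -> bytes:
    """Build a minimal header: version 4, header length 5 (5 * 4 B = 20 B)."""
    version_ihl = (4 << 4) | 5
    header = struct.pack(
        HEADER_FORMAT,
        version_ihl, 0, 20 + payload_len,  # version/IHL, services, total length
        0, 0,                              # identification, flags + fragment offset
        ttl, protocol, 0,                  # TTL, protocol, checksum placeholder
        socket.inet_aton(src), socket.inet_aton(dst),
    )
    checksum = ipv4_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:]


def parse_header(header: bytes) -> dict:
    """Extract the fields discussed in the text from the first 20 bytes."""
    (version_ihl, _services, total_length, _ident, _flags_offset,
     ttl, protocol, checksum, src, dst) = struct.unpack(HEADER_FORMAT, header[:20])
    return {
        "version": version_ihl >> 4,
        "header_bytes": (version_ihl & 0x0F) * 4,
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


header = build_header("192.168.0.3", "172.38.1.5", protocol=6, payload_len=100)
fields = parse_header(header)
print(fields)
```

A receiver can validate the header by running `ipv4_checksum` over all 20 bytes, including the stored checksum; a result of 0 means the header is intact.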
7.2.4 IP fragmentation
Why fragment?
The amount of data that a link layer frame can encapsulate is limited. The MTU (the maximum packet size that can pass through a given layer of a communication protocol) of Ethernet is 1500 bytes.
Next, let's look at which fields in the IP packet identify fragmented data.
- Identification: the same for all fragments of the same datagram.
- Flags: three bits, of which only two are meaningful. The lowest bit is called MF (More Fragments). MF=1 means more fragments follow; MF=0 means this is the last of the fragments.
- The middle bit of the flags field is DF (Don't Fragment). Fragmentation is allowed only when DF=0.
- Fragment offset: the relative position of a fragment within the original packet after a longer packet has been fragmented.
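The fragmentation fields above can be sketched by computing how a payload would be split for a given MTU. This is a minimal sketch that assumes a fixed 20-byte header with no options; the function name and the 3800-byte payload are made up for illustration. The fragment offset is expressed in units of 8 bytes, which is how the IPv4 header stores it.

```python
def fragment(payload_len: int, mtu: int = 1500, header_len: int = 20) -> list:
    """Split a payload into fragments; offsets are in 8-byte units."""
    # Data per fragment must be a multiple of 8 so offsets stay whole numbers.
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        more_fragments = offset + size < payload_len
        fragments.append({
            "offset": offset // 8,       # fragment offset field
            "length": size,              # bytes of data in this fragment
            "MF": int(more_fragments),   # 1 = more fragments follow
        })
        offset += size
    return fragments


# A hypothetical 3800-byte payload sent over Ethernet (MTU 1500).
pieces = fragment(3800)
for piece in pieces:
    print(piece)
```

For this example the payload is split into data chunks of 1480, 1480, and 840 bytes, and only the last fragment has MF=0.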
7.2.5 IP Address Classification
There are five classes of IP addresses:
- Class A: 1.0.0.0 ~ 126.255.255.255
- Class B: 128.0.0.0 ~ 191.255.255.255
- Class C: 192.0.0.0 ~ 223.255.255.255
- Class D: 224.0.0.0 ~ 239.255.255.255
- Class E: 240.0.0.0 ~ 254.255.255.255
- Of these, 127.0.0.0 ~ 127.255.255.255 is used for loopback testing, class D addresses are for multicast, and class E addresses are reserved for research.
The IP addresses we see most often look like 192.168.*.* (where * is a number from 0 to 255). Both at the office and at home you are on this network segment. Isn't that strange? How can your home network segment be the same as the company's?
The reason is that some addresses are so-called private IP addresses. They cannot be used to communicate with computers on the public network and are for internal (intranet) use only. For example:
- Class A: 10.0.0.0 ~ 10.255.255.255
- Class B: 172.16.0.0 ~ 172.31.255.255
- Class C: 192.168.0.0 ~ 192.168.255.255
As you can see, class C private addresses are in the 192.168 network segment, and every LAN can use these private IP addresses.
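The class ranges above depend only on the first byte of the address, so they are easy to check in code. In this sketch, `address_class` is a hypothetical helper written for this article, while `ipaddress.ip_address(...).is_private` comes from the Python standard library (note that it also treats some other reserved ranges, such as loopback, as private).

```python
import ipaddress


def address_class(ip: str) -> str:
    """Return the classful category of an IPv4 address based on its first byte."""
    first = int(ip.split(".")[0])
    if first == 127:
        return "loopback"
    if 1 <= first <= 126:
        return "A"
    if 128 <= first <= 191:
        return "B"
    if 192 <= first <= 223:
        return "C"
    if 224 <= first <= 239:
        return "D"
    if 240 <= first <= 255:
        return "E"
    return "reserved"  # first byte is 0


def is_private(ip: str) -> bool:
    """Standard-library check; broader than the three classic private ranges."""
    return ipaddress.ip_address(ip).is_private


for ip in ["10.1.2.3", "172.16.0.1", "192.168.1.10", "8.8.8.8", "224.0.0.1"]:
    print(ip, address_class(ip), is_private(ip))
```

This also answers the question in the text: your home and your office can both use 192.168 addresses because private addresses only need to be unique inside each LAN.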
There are some special addresses you need to know about.
Note that "all ones" here refers to binary digits: an IPv4 address consists of four bytes, each byte is 8 bits, and eight 1 bits make decimal 255, so all ones is 255.255.255.255.
- The first row, all ones, is 255.255.255.255. If you write this as the destination address, the packet is broadcast on the local network.
- The second row: a specific network number with a host number of all zeros. For example, 192.169.10.1 is a class C address, so its network number is 192.169.10 and its host number is 1. With the host number set to all zeros, 192.169.10.0 denotes the network segment itself.
- The third row: take the same class C address, 192.169.10.1. With the host number set to all ones (eight 1 bits, which is 255), 192.169.10.255 is the broadcast address of that network segment.
- The fourth row is the most familiar one: 127 as the network number with a host number that is not all zeros or all ones. For example, 127.0.0.1 represents the local machine and is called the loopback address.
7.2.6 Network Address Translation
From the section on IP address classes we know that private IP addresses cannot interact directly with the external network. In small companies, most computers have addresses in the 192.168 network segment, all of them private. So how do they exchange data with the outside world? This brings us to network address translation (NAT).
As shown in the figure above, 192.168.0.3 and 192.168.0.4 belong to a private network segment and cannot communicate with the external network directly. Because the router runs NAT software, it can use its own IP address, 172.38.1.5, to access the external network on behalf of the internal hosts. When data comes back from the external network, the router translates the destination address back to the private 192.168 address of the internal host.
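The translation described above can be sketched as a lookup table kept by the router. The article only mentions address translation; the use of port numbers to tell internal hosts apart, the `NatRouter` class, and the starting port 40000 are assumptions added for this illustration, modeled loosely on how port-based NAT commonly works.

```python
class NatRouter:
    """Toy NAT table: maps (private IP, private port) to a public port."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound_map = {}  # (private_ip, private_port) -> public_port
        self.inbound_map = {}   # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip: str, private_port: int):
        """Rewrite the source of an outgoing packet to the router's address."""
        key = (private_ip, private_port)
        if key not in self.outbound_map:
            self.outbound_map[key] = self.next_port
            self.inbound_map[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound_map[key]

    def translate_inbound(self, public_port: int):
        """Find which internal host a reply belongs to, or None if unknown."""
        return self.inbound_map.get(public_port)


router = NatRouter("172.38.1.5")
first = router.translate_outbound("192.168.0.3", 5000)
second = router.translate_outbound("192.168.0.4", 5000)
reply_target = router.translate_inbound(second[1])
print(first, second, reply_target)
```

Both internal hosts appear to the outside world as 172.38.1.5, and the table lets the router deliver each reply to the right private address.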
7.2.7 Subnet Division and Subnet mask
First of all, why divide a network into subnets?
One thing to know up front:
Subnetting does not increase the number of available IP addresses. It actually reduces it, because in each subnet the all-zeros network address and the all-ones broadcast address cannot be used as host IP addresses.
Why subnet:
- A class A network can accommodate 16,777,214 hosts.
- In practice, however, a class A network is never used as a single subnet, because it would be very inconvenient to manage and would suffer from broadcast storms and other problems. It needs to be divided into several smaller subnets according to actual needs. A class B network can accommodate 65,534 hosts and often needs subnetting too.
- Even in a small enterprise, controlling which departments' computers can and cannot reach each other is done through subnetting.
Next, let's look at subnets in action.
As shown on the right of the figure above, the network 145.13.0.0 is divided into three subnets, including 145.13.3.0 and 145.13.21.0. Here is the question: if a packet arrives with destination IP address 145.13.3.10, how do we know which subnet to deliver it to?
AND the destination IP address with the subnet mask. If the destination IP address is 145.13.3.10 and the subnet mask is 255.255.255.0, the result is the network address 145.13.3.0, so the packet is delivered to subnet 145.13.3.0.
Some people will ask what a subnet mask is. It has the same format as an IP address, from 0.0.0.0 to 255.255.255.255, and its main purpose is to help divide subnets. Understanding this much is enough for front-end developers.
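The AND operation from the subnet example can be written out directly. This is a minimal sketch using the standard `ipaddress` module to convert between dotted notation and integers; the three helper names are made up for this illustration.

```python
import ipaddress


def network_address(ip: str, mask: str) -> str:
    """Bitwise AND of the IP address and the subnet mask."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip_int & mask_int))


def broadcast_address(ip: str, mask: str) -> str:
    """Network address with every host bit set to 1."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    host_bits = ~mask_int & 0xFFFFFFFF
    return str(ipaddress.IPv4Address((ip_int & mask_int) | host_bits))


def usable_hosts(host_bit_count: int) -> int:
    """All-zeros (network) and all-ones (broadcast) cannot be assigned to hosts."""
    return 2 ** host_bit_count - 2


print(network_address("145.13.3.10", "255.255.255.0"))
print(broadcast_address("192.169.10.1", "255.255.255.0"))
print(usable_hosts(24), usable_hosts(16))
```

The last line reproduces the host counts quoted in the text: 24 host bits for a class A network and 16 host bits for a class B network, each minus the two reserved addresses.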
7.3 The ARP protocol
Why do you need ARP?
Let's briefly review the Ethernet frame format.
It contains a source address and a destination address, and both are MAC addresses. What is a MAC address for? Take two neighboring routers, A and B: how can A send data to B? It has to know B's physical address, which works like a house number. I need to know where you live before I can deliver anything to you.
You certainly know your own MAC address, because it is on your network card. The question is, what are everyone else's MAC addresses? ARP is here to help you find the other party's MAC address.
Next, let's walk through how ARP works (this is the more official description; feel free to skip it):
- Each host keeps an ARP table in its own ARP cache that maps IP addresses to MAC addresses.
- Before sending a packet to the destination host, the source host first checks whether its ARP table contains the MAC address for the destination IP address.
- If so, the packet is sent directly to that MAC address. If not, the source host broadcasts an ARP request on the local network segment to query the destination host's MAC address.
- The ARP request contains the source host's IP address and hardware address, and the destination host's IP address. On receiving the request, every host on the network checks whether the destination IP address in the packet is its own.
- If it is not, the host ignores the packet. If it is, the host first adds the sender's MAC and IP addresses to its own ARP table, overwriting the entry if that IP address is already there.
- It then sends an ARP reply to the source host, telling it that its MAC address is the one being looked for.
- After receiving the ARP reply, the source host adds the destination host's IP and MAC addresses to its ARP table and uses them to start transmitting data.
- If the source host receives no ARP reply, the ARP query has failed.
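The steps above can be simulated with a few objects standing in for hosts on one LAN. This is a toy model, not real ARP packet handling: the `Host` class, its method names, and the sample addresses are all hypothetical, and the "broadcast" is just a loop over the other hosts.

```python
class Host:
    """A host with an IP address, a MAC address, and an ARP table."""

    def __init__(self, ip: str, mac: str):
        self.ip = ip
        self.mac = mac
        self.arp_table = {}  # IP address -> MAC address

    def handle_request(self, sender_ip: str, sender_mac: str, target_ip: str):
        """Process an ARP request; reply with our MAC only if we are the target."""
        if target_ip != self.ip:
            return None  # not for us: ignore the request
        self.arp_table[sender_ip] = sender_mac  # add or overwrite the sender's entry
        return self.mac

    def resolve(self, target_ip: str, lan: list):
        """Look up the ARP table first, then broadcast a request on the LAN."""
        if target_ip in self.arp_table:
            return self.arp_table[target_ip]
        for host in lan:  # every other host on the segment sees the broadcast
            if host is self:
                continue
            reply = host.handle_request(self.ip, self.mac, target_ip)
            if reply is not None:
                self.arp_table[target_ip] = reply
                return reply
        return None  # no reply: the ARP query failed


host_a = Host("192.168.0.3", "aa:aa:aa:aa:aa:03")
host_b = Host("192.168.0.4", "aa:aa:aa:aa:aa:04")
lan = [host_a, host_b]

mac_of_b = host_a.resolve("192.168.0.4", lan)
missing = host_a.resolve("192.168.0.99", lan)
print(mac_of_b, missing, host_b.arp_table)
```

After the first lookup, both sides have learned each other's address: A caches B's MAC from the reply, and B caches A's MAC from the request itself.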
7.4 The DHCP protocol
DHCP (Dynamic Host Configuration Protocol) is a LAN protocol. A server controls a range of IP addresses, and when a client joins the network it automatically obtains the IP address and subnet mask assigned by the server. In other words, when you connect to a LAN, the DHCP server automatically assigns you an IP address. Windows users may have noticed the "obtain an IP address automatically" option in the network card settings. If the router provides a DHCP service, you automatically receive an IP address picked for you.
This service can be enabled on the router.
General working process (a rough understanding is enough)
7.5 The ICMP protocol
ICMP is a network layer protocol. Why do we need ICMP?
When a network is newly set up, a simple test is usually run first to verify that it works. But IP does not provide reliable transmission: if a packet is lost, IP does not tell the transport layer that it was lost or why.
So we need a protocol to fill this role: ICMP.
ICMP provides the following functions:
- Confirming whether an IP packet successfully reached its destination address
- Reporting the reason an IP packet was discarded in transit
Here's an example:
Host H2 receives a UDP packet from host H1 and finds that no process is listening on the destination port in the packet. H2 sends an ICMP message back to H1, indicating that the UDP packet could not be delivered to an application process and had to be discarded.
The following are four common ICMP error report packets
The ping command uses ICMP to check whether the destination host can be reached.
7.6 Network devices: an overview of routers
A router is a dedicated computer with multiple input ports and multiple output ports whose task is to forward packets.
The following figure shows how a router forwards packets; the parts are described one by one.
Next, let's look at what a router's input port does:
- First, the physical layer, the fool layer, delivers bits. The bits from the physical layer are reassembled into data frames at the data link layer, and the frame is then stripped off to obtain the network layer packet, which is handed to the router. At this point the router determines what type of packet it is.
- If it is a packet of routing information exchanged between routers, it is delivered to the routing processor shown in the figure above for processing and computation. If it is a data packet, it is placed in a queue to wait, and then a suitable output port is selected for it.
Finally, let's look at what a router's output port does:
As you can see from the figure above, output ports do the reverse of input ports, converting network layer packets into link layer frames and finally into a physical layer bit stream.
One important point about input and output ports: each has a buffer queue. If data arrives too fast and leaves too slowly, the buffer queue holds the data so that it can be processed one item at a time, balancing the input and output speeds. But the queue has a limit; once the limit is exceeded and the queue can hold no more, packets are lost.
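The buffer queue behavior described above, including packet loss when the queue is full, can be sketched with a bounded queue. The `PortBuffer` class, the capacity of 3, and the arrival and departure rates are all invented for this illustration.

```python
from collections import deque


class PortBuffer:
    """A bounded FIFO queue that drops packets once it is full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.queue = deque()
        self.dropped = 0

    def enqueue(self, packet) -> bool:
        """Store an arriving packet, or drop it if there is no room."""
        if len(self.queue) >= self.capacity:
            self.dropped += 1
            return False
        self.queue.append(packet)
        return True

    def dequeue(self):
        """Forward the oldest waiting packet, if any."""
        return self.queue.popleft() if self.queue else None


# Hypothetical load: 2 packets arrive per tick, but only 1 can leave per tick.
buffer = PortBuffer(capacity=3)
forwarded = 0
packet_id = 0
for tick in range(5):
    for _ in range(2):
        buffer.enqueue(packet_id)
        packet_id += 1
    if buffer.dequeue() is not None:
        forwarded += 1

print(f"forwarded={forwarded} dropped={buffer.dropped} waiting={len(buffer.queue)}")
```

Because input is faster than output, the queue fills up after a few ticks and every extra arrival from then on is lost, which is exactly the overflow case the text warns about.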
We are about to reach the knowledge that matters most for the front end: the transport layer and the application layer! Get ready!
8. Transport layer
The transport layer exists only on hosts. It provides logical communication between processes, with either reliable or unreliable delivery. For example, a video chat between QQ on your computer and QQ on your long-distance girlfriend's computer is communication between processes on different computers.
Here is a brief description of the reliable transport protocol TCP and the unreliable transport protocol UDP.
TCP is connection-oriented and reliable, does not support broadcast or multicast, and has a larger delay. It suits large file transfers. UDP is connectionless and does not acknowledge received packets, but its delay is small, so it suits small messages.
8.1 What are port numbers for?
A port number identifies which application on a host is communicating, that is, which application is using the port.
So why can a port be assigned to only one application and not several?
Suppose a server runs two applications, A and B, providing service A and service B, and both listen on the same port. When data arrives, the server cannot tell whether it is meant for A or for B.
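The reasoning above can be modeled as a lookup table from port number to application. This is a toy model of what an operating system does, not a real socket API: the `PortTable` class, its methods, and port 3000 are hypothetical choices for illustration.

```python
class PortTable:
    """Toy model of how a host maps port numbers to listening applications."""

    def __init__(self):
        self.listeners = {}  # port number -> application name

    def bind(self, port: int, app: str) -> None:
        """Register an application on a port; a port can have only one owner."""
        if not 0 <= port <= 65535:
            raise ValueError("port must be between 0 and 65535")
        if port in self.listeners:
            raise OSError(f"port {port} is already in use by {self.listeners[port]}")
        self.listeners[port] = app

    def deliver(self, port: int, data: bytes) -> str:
        """Hand incoming data to whichever application owns the port."""
        app = self.listeners.get(port)
        if app is None:
            return "port unreachable"
        return f"{app} received {data!r}"


table = PortTable()
table.bind(3000, "service A")

try:
    table.bind(3000, "service B")   # a second listener on the same port
    conflict = None
except OSError as error:
    conflict = str(error)

print(conflict)
print(table.deliver(3000, b"hi"))
print(table.deliver(8080, b"hi"))
```

Because each port has exactly one entry in the table, delivery is never ambiguous; the second `bind` is rejected rather than leaving the host unsure whether data belongs to A or B.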
8.2 The UDP protocol
UDP is a connectionless transport-layer protocol in the reference model. It provides a simple, unreliable, transaction-oriented message delivery service.
(1) Features of the UDP protocol
- UDP is connectionless, which removes the overhead and delay of setting up a connection before sending data. TCP’s three-way handshake and four-way wave cost time and resources; UDP has none of that cost.
- UDP uses best-effort delivery, that is, reliable delivery is not guaranteed. Who ensures reliable delivery, then? The application layer, the layer above UDP.
- UDP is message-oriented and suits applications that transmit a small amount of data at a time. What does this mean? As shown in the figure below, the UDP layer treats everything handed down by the application layer as a single datagram, and the IP layer then adds an IP header. On Ethernet, a link-layer frame can carry at most 1500 bytes of data, so if the transport layer hands down too much data, the network layer has to fragment it. Because UDP is unreliable, fragmentation makes loss more likely: if any fragment is lost, the whole datagram is lost. That is why UDP suits small amounts of data.
- UDP has no congestion control and suits many real-time applications. If the network is congested, UDP ignores it and keeps sending at its own rate. Some people would call this a bit of a trap: the road is jammed and it still keeps pushing data out. Seen the other way around, this is also UDP’s advantage: it tolerates packet loss, so when network conditions are good UDP works well for real-time applications such as video conferencing.
- The UDP header is small, only 8 bytes, while the TCP header is at least 20 bytes. This is another way UDP reduces network transmission overhead.
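To make "connectionless" concrete, here is a small sketch that sends one datagram over the loopback interface, assuming Python's standard `socket` module. The helper name `udp_roundtrip` and the 2-second timeout are choices made for this illustration. Note that the sender never calls `connect()` and no handshake takes place.

```python
import socket

def udp_roundtrip(payload: bytes) -> bytes:
    """Send one UDP datagram to ourselves on loopback and return what arrives."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(2.0)          # UDP gives no delivery guarantee, so do not wait forever
        address = receiver.getsockname()
        sender.sendto(payload, address)   # no connection setup before sending
        data, _peer = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()

print(udp_roundtrip(b"hello"))
```

If the datagram were lost, `recvfrom` would simply time out; retrying would be the application's job, which is the "reliability belongs to the application layer" point above.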
(2) UDP header
- The 16-bit source and destination port numbers are 2B (16 bits) each, so the port number range is 0-65535. The source port number is optional: if you do not want a response from the peer, you can write all zeros into it. The destination port number must be present.
- The 16-bit UDP length is the length of the header plus the data. For example, if the data is 2B, and the header is fixed at 8B, then the UDP length is 2 + 8 = 10B.
- The 16-bit UDP checksum is used to detect errors in the header and the data. If anything is wrong, the datagram is thrown away.
Separately, if the destination host cannot find an application on the destination port number, it returns an ICMP 'port unreachable' error message to the sender.
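The four header fields above can be sketched with Python's `struct` module. This is only an illustration: the function names are hypothetical, and the checksum is left at 0 (which in UDP over IPv4 means "not computed") rather than calculated.

```python
import struct

# Four 16-bit fields in network (big-endian) byte order: 8 bytes in total.
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_datagram(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """Prefix the payload with an 8-byte UDP header."""
    length = UDP_HEADER.size + len(payload)   # header + data
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload

def parse_udp_datagram(datagram: bytes) -> dict:
    """Split a datagram back into its header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack(datagram[:UDP_HEADER.size])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER.size:length],
    }

# Source port 0 means "no reply expected"; 2 bytes of data give a UDP length of 10.
datagram = build_udp_datagram(0, 53, b"hi")
print(parse_udp_datagram(datagram)["length"])  # 10
```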
8.3 The TCP protocol
Put simply, TCP is a connection-oriented, reliable, byte-stream-based transport-layer communication protocol.
The characteristics of TCP are as follows:
- TCP is a connection-oriented transport-layer protocol. TCP’s three-way handshake and four-way wave, for example, exist to set up and tear down connections.
- Each TCP connection can have only two endpoints; every TCP connection is point-to-point. That is, TCP is communication between processes on different computers.
- TCP provides reliable delivery: no errors, no loss, no duplication, and in order. In short: reliable and ordered, nothing lost and nothing repeated.
- TCP provides full-duplex communication. Full duplex means that both sides of the connection can send and receive data at the same time. Both ends have a send cache and a receive cache. The send cache is a queue of data ready to be sent, and the receive cache is a queue for data being received.
- TCP is byte-stream oriented. Let’s explain what that means: picture data blocks numbered 1, 2, 3, 4, ..., each representing one byte. TCP turns the application layer’s data into such a sequence of bytes and sends it. If you have used Node and know Buffer, a Buffer is a sequence of bytes like this.
Header format of a TCP segment
As shown in the figure below, let’s take a look at some of the more important header fields. Only the fixed 20 bytes of the TCP header are covered here.
- Source port and destination port: the port of the sending application and the port of the destination application respectively.
- Sequence number: every byte in the byte stream carried by a TCP connection is numbered in order. This field holds the sequence number of the first data byte in this segment.
- Acknowledgment number: the sequence number of the first data byte expected in the next segment from the peer. If the acknowledgment number is N, all data up to byte N-1 has been received correctly. As shown in the figure below, let’s give an example:
The receiver receives a segment made up of bytes 1, 2, and 3. The receiver then sends an acknowledgement segment to the sender. Its acknowledgment number should be 4, because bytes 1, 2, and 3 have been received.
- Data offset: how far the start of the data in a TCP segment is from the start of the segment, in other words the length of the TCP header.
- The six control bits are described as follows:
Control bit | Role |
---|---|
ACK | 1 means the acknowledgment number is valid; 0 means the segment carries no acknowledgment information and the acknowledgment number is ignored |
PSH | When set to 1, the receiver should hand the segment’s data to the application as soon as it arrives, rather than waiting until the buffer is full |
RST | Set to 1 to reset the connection. Receiving a segment with RST set usually means something has gone wrong |
SYN | Set to 1 to initiate a connection |
FIN | Set to 1 when the sender has finished sending. Used to release the connection, indicating that the sender has no more data to send |
URG | Set to 1 to tell the receiving TCP module that the urgent pointer field is valid and points to urgent data |
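As a rough illustration of the fixed 20-byte header and the six control bits, the sketch below unpacks a hand-built segment with Python's `struct` module. The function name and the sample values (port 50000, sequence number 1000, and so on) are made up for the example. It relies on the standard layout in which the data offset occupies the top 4 bits of a 16-bit field and is counted in 4-byte words.

```python
import struct

# Bit masks of the six control bits inside the 16-bit "data offset + flags" field.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header."""
    (src_port, dst_port, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "data_offset": (offset_flags >> 12) * 4,   # header length in bytes
        "flags": [name for name, bit in FLAG_BITS.items() if offset_flags & bit],
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }

# A SYN segment with no options: data offset 5 words = 20 bytes.
syn = struct.pack("!HHIIHHHH", 50000, 80, 1000, 0, (5 << 12) | FLAG_BITS["SYN"], 65535, 0, 0)
header = parse_tcp_header(syn)
print(header["flags"], header["data_offset"])  # ['SYN'] 20
```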
8.3.3 Establishing a TCP connection
Take a look at the process of establishing a connection, as shown in the figure below:
- First, the client sends a segment asking the server to set up a connection, with the SYN bit set to 1. seq is the sequence number, which is randomly generated.
- After receiving the segment, the server allocates cache and variables for the TCP connection. The cache here means byte-stream queues (both sender and receiver have them: if the two sides are to communicate with each other, each has a send cache and a receive cache). The server then returns an acknowledgement segment in which the SYN control bit is 1, meaning the connection is allowed, and the uppercase ACK flag is set to acknowledge receipt of the client’s segment. It also sets its own seq, again a random number. The lowercase ack is the acknowledgment number: where the server expects the client’s next data to start.
- Finally, the client returns an acknowledgement to the server. The SYN control bit becomes 0, meaning this is no longer a connection request and data is about to be sent, and the ACK flag is set, confirming that the server’s segment has been received.
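The three segments described above can be sketched as plain data. This is a toy model, not a real TCP stack: the `Segment` class and function name are hypothetical, and it assumes the standard rule that a SYN consumes one sequence number, so each side acknowledges the peer's initial sequence number plus 1.

```python
import random
from dataclasses import dataclass
from typing import Optional

SEQ_SPACE = 2 ** 32  # sequence numbers are 32-bit and wrap around

@dataclass
class Segment:
    syn: int            # SYN control bit
    ack_flag: int       # uppercase ACK control bit
    seq: int            # sequence number
    ack: Optional[int]  # lowercase ack: acknowledgment number (None if ACK is 0)

def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three handshake segments for the given initial sequence numbers."""
    first = Segment(syn=1, ack_flag=0, seq=client_isn, ack=None)
    second = Segment(syn=1, ack_flag=1, seq=server_isn, ack=(first.seq + 1) % SEQ_SPACE)
    third = Segment(syn=0, ack_flag=1, seq=(first.seq + 1) % SEQ_SPACE,
                    ack=(second.seq + 1) % SEQ_SPACE)
    return [first, second, third]

# Both initial sequence numbers are random, as described above.
for segment in three_way_handshake(random.randrange(SEQ_SPACE), random.randrange(SEQ_SPACE)):
    print(segment)
```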
8.3.4 Releasing a TCP connection
Take a look at the process of releasing a connection, as shown in the figure below:
- The client initiates a request to release the connection: FIN=1, seq=u. Here u is the sequence number of the last byte sent earlier, plus 1.
FIN is used to release a connection. FIN=1 indicates that the sender of the segment has finished sending data, has nothing new to transmit, and wants to release the connection.
The client then waits for the server’s acknowledgement.
- After receiving the client’s request to release the connection, the server returns an acknowledgement: ACK=1, seq=v, ack=u+1. From this point the client can no longer send data to the server; it can only receive. If the server still has data to send to the client, it can keep sending. What does v mean here? It is the server’s own sequence number, which depends on how much data the server has already sent to the client.
- When the server has nothing left to transmit, it sends its own release segment to the client: FIN=1, ACK=1, ack=u+1, seq=w. Here w plays the same role as v above. Why are they not both v? Because the server may have sent data between this step and the previous one, so its sequence number may have changed.
- After receiving the FIN=1 segment, the client returns ACK=1, seq=u+1, ack=w+1. After sending it, the client enters a waiting state (TIME_WAIT), waits for two time periods (2MSL, twice the maximum segment lifetime), and then closes.
Why wait two time periods at the end?
- The client’s last ACK segment may be lost in transit, so the server never receives it. The server then times out and retransmits its FIN segment; the client returns the last ACK again and waits another two time periods before closing.
- If the client did not wait for these two time periods, it would not receive the server’s retransmitted FIN. The server could then not close normally, because it would never receive the client’s acknowledgement.
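The four release segments and their sequence numbers can be sketched as follows. This is a toy model with hypothetical names: `server_bytes_after_ack` stands for the data the server sends between its ACK and its FIN, which is what makes w differ from v.

```python
def four_way_release(u: int, v: int, server_bytes_after_ack: int = 0) -> list:
    """Return the four segments of a connection release, using the article's u, v, w."""
    w = v + server_bytes_after_ack   # server's sequence number moves on if it sent more data
    return [
        {"from": "client", "FIN": 1, "seq": u},
        {"from": "server", "ACK": 1, "seq": v, "ack": u + 1},
        {"from": "server", "FIN": 1, "ACK": 1, "seq": w, "ack": u + 1},
        {"from": "client", "ACK": 1, "seq": u + 1, "ack": w + 1},
    ]

# The server sends 50 more bytes after acknowledging the client's FIN, so w = 950.
for segment in four_way_release(u=500, v=900, server_bytes_after_ack=50):
    print(segment)
```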
8.3.5 Common interview questions on the TCP three-way handshake and four-way wave
Why are there three handshakes to connect and four waves to close?
- When the server receives the peer’s FIN segment while the connection is closing, it only means the peer will send no more data; the peer can still receive. The server may not have finished sending its own data, so it can either close immediately or send its remaining data first and then send its own FIN segment to indicate that it agrees to close the connection now.
- So the server’s ACK and FIN are usually sent separately, which makes one extra step.
Why can’t you connect with two handshakes?
Here are some common answers on the Internet:
- If only two handshakes were needed to establish a connection, little would change for the client: it would still need the server’s reply before entering the ESTABLISHED state. The server, however, would enter the ESTABLISHED state as soon as it received the connection request.
- Now suppose the network is congested and the client’s connection request is stuck for a long time, so the client resends the request after a timeout. If the server receives the resent request and replies correctly, the two sides communicate and release the connection when they are done. If the earlier, stale connection request then reaches the server, with only two handshakes the server enters the ESTABLISHED state as soon as it receives it, waiting for data or actively sending data.
- But by then the client has already entered the CLOSED state, so the server waits forever, which wastes the server’s connection resources.
- I think this is only one problem caused by two handshakes, though. The most important point is that with two handshakes the server has confirmed the client’s initial sequence number, but the client has not confirmed the server’s initial sequence number, so reliable transmission cannot be guaranteed.
What if the connection is established, but the client suddenly fails?
TCP has a keepalive timer for this. If the server receives no data from the client for two hours, it sends a probe segment to the client every 75 seconds. If the client still has not responded after 10 consecutive probes, the server considers the client faulty and closes the connection.
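A quick back-of-the-envelope calculation with the numbers above, assuming the server waits one full interval after each probe before giving up (the constant names are made up for this sketch):

```python
IDLE_BEFORE_PROBING_S = 2 * 60 * 60   # two hours of silence before the first probe
PROBE_INTERVAL_S = 75                 # one probe every 75 seconds
MAX_PROBES = 10                       # unanswered probes before giving up

def seconds_until_dead_peer_detected() -> int:
    """Rough time from the client's last data to the server closing the connection."""
    return IDLE_BEFORE_PROBING_S + MAX_PROBES * PROBE_INTERVAL_S

print(seconds_until_dead_peer_detected())  # 7950, i.e. about 2 hours 12.5 minutes
```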
What is a SYN flood attack?
- A SYN flood attack exploits a feature of the TCP protocol: the three-way handshake.
- The attacker sends a TCP SYN segment. When the server replies with its SYN+ACK, the attacker never sends the final acknowledgement, so the TCP connection is left pending, in a half-open state. Because the server does not receive the final ACK, it keeps retransmitting its SYN+ACK to the attacker.
- This wastes server resources. The attacker sends the server a large number of such connection requests, and since none of them completes the three-way handshake, these pending connections consume CPU and memory on the server, which may eventually crash.
Why is the initial sequence number (ISN) random?
To enhance security: a random ISN makes it harder for a third party to forge segments, such as a reset (RST) segment, that the receiver would accept.
Can the first of the three handshakes carry data? Why?
- No. The three-way handshake has not completed yet, and allowing data here would amplify SYN flood attacks.
- If an attacker forges thousands of handshake segments each carrying 1K+ bytes of data, and the receiver allocates a large cache to hold all this data, memory can easily run out, leading to denial of service.
Can the third of the three handshakes carry data? Why?
Yes. By the third handshake the client is already in the ESTABLISHED state. From the client’s point of view the connection is established, and it knows that the server can both receive and send. So the segment can carry data.
8.3.6 How TCP achieves reliable transmission
Reliable transmission is implemented through the following four mechanisms:
- Checksum. A pseudo-header is used to strengthen the error detection of the TCP checksum. The destination IP address in the pseudo-header checks whether the TCP segment was delivered to the wrong host, and the transport-layer protocol number in the pseudo-header checks whether the right transport-layer protocol was selected. Note that the pseudo-header is never actually transmitted; it is only used when computing the checksum.
- Sequence numbers. Earlier we mentioned that TCP is byte-stream oriented: the first byte is number 1, the second byte is number 2, and so on. The TCP header has a sequence number field, which holds the sequence number of the first byte in a segment. A segment is each of the packets you send. With sequence numbers, the data can be handed to the application layer in order.
- Acknowledgement. The receiver acknowledges the data it has received, and the sender sends the rest of the data only after receiving the receiver’s acknowledgement segment.
- Retransmission. If the TCP sender does not receive an acknowledgement within the specified period of time, it must retransmit the segment it sent (timeout retransmission). The retransmission timeout changes dynamically, based on the weighted average round-trip time (RTTs).
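The article only says that the timeout follows a weighted average round-trip time. One common way to do this is the estimator from RFC 6298, sketched below. The class name is hypothetical, and the weights 1/8 and 1/4 and the factor 4 come from that RFC, not from the article. The RFC's clock granularity term and 1-second minimum are left out to keep the sketch short.

```python
ALPHA = 1 / 8   # weight of a new sample in the smoothed RTT
BETA = 1 / 4    # weight of a new sample in the RTT variation

class RtoEstimator:
    """Track a weighted average RTT and derive a retransmission timeout from it."""

    def __init__(self) -> None:
        self.srtt = None     # smoothed (weighted average) round-trip time
        self.rttvar = None   # smoothed deviation of the samples

    def update(self, sample: float) -> float:
        """Feed one measured RTT (any time unit) and return the new timeout."""
        if self.srtt is None:
            # First measurement: start from the sample itself.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # The deviation is updated first, using the old average.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - sample)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * sample
        return self.srtt + 4 * self.rttvar

estimator = RtoEstimator()
print(estimator.update(100))  # 300.0
print(estimator.update(200))  # 362.5
```

A single slow sample raises the timeout only moderately, because the old average keeps most of the weight.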
8.3.7 TCP flow control
Why do you need flow control?
For example, if the sender is very fast and the receiver is very slow, the receiver cannot keep up and serious packet loss occurs.
TCP implements flow control through the sliding window mechanism. Here’s the simple answer (I suggest you find an animated tutorial online; text does not explain this very well):
- In TCP, a sliding window is used for transmission control. The size of the sliding window tells the sender how much buffer the receiver has available to receive data. The sender uses the window size to decide how many bytes it may send.
- When the sliding window is 0, the sender normally cannot send any more segments. There are two exceptions. One is urgent data, for example to let a user terminate a process running on a remote machine. The other is that the sender may send a 1-byte segment asking the receiver to restate the next byte it expects and the current size of its window.
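A toy simulation of the idea, under simplifying assumptions chosen for this sketch: time advances in rounds, the receiver's application reads a fixed number of bytes per round, and the advertised window is simply the free space in the receive buffer. All names are hypothetical.

```python
def simulate_flow_control(total: int, capacity: int, read_per_round: int) -> list:
    """Return how many bytes the sender is allowed to send in each round."""
    if read_per_round <= 0:
        raise ValueError("the receiving application must read something each round")
    buffered = 0          # bytes sitting in the receiver's buffer
    remaining = total     # bytes the sender still has to send
    sent_log = []
    while remaining > 0:
        window = capacity - buffered       # window advertised by the receiver
        sent = min(window, remaining)      # a window of 0 means nothing may be sent
        buffered += sent
        remaining -= sent
        sent_log.append(sent)
        buffered -= min(buffered, read_per_round)  # the application drains the buffer
    return sent_log

# 1000 bytes to send, a 400-byte receive buffer, the application reads 100 bytes per round.
print(simulate_flow_control(1000, 400, 100))  # [400, 100, 100, 100, 100, 100, 100]
```

After the first burst fills the buffer, the sender is throttled to exactly the receiver's reading speed, which is the point of flow control.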
8.3.8 TCP congestion control
I also find the text for this part too stiff. I did not understand it at first; only after watching a video did I understand the basic principle. Here I give both the text version and the video address.
The video address is as follows: Congestion Control Video edition
The text version is as follows:
- If the network is congested, packets are lost and the sender keeps retransmitting them, which makes the congestion worse. Therefore, when congestion occurs, the sender’s rate should be controlled. This is similar to flow control, but the starting point is different: flow control makes sure the receiver can keep up, while congestion control reduces the congestion of the whole network.
- TCP implements congestion control through four algorithms: slow start, congestion avoidance, fast retransmission, and fast recovery.
- The sender maintains a state variable called the congestion window (cwnd). Note the difference between the congestion window and the sender window: the congestion window is only a state variable; it is the sender window that actually determines how much data the sender can send.
- For the sake of discussion, make the following assumptions:
  - The receiver has a large enough receive cache, so flow control does not come into play;
  - Although the TCP window is measured in bytes, here the unit of window size is taken to be the segment.
Slow start and congestion avoidance
- Sending begins with slow start. Initially cwnd = 1, so the sender can send only one segment. When the acknowledgement is received, cwnd is doubled, so the number of segments the sender can send per round becomes 2, 4, 8, ...
- Note that slow start doubles cwnd every round, so cwnd grows very fast, which makes the sender send too fast and increases the likelihood of network congestion. Therefore a slow start threshold, ssthresh, is set: when cwnd >= ssthresh, the sender enters congestion avoidance and increases cwnd by only 1 per round.
- If a timeout occurs, set ssthresh = cwnd / 2 and then restart slow start.
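The three rules above can be traced round by round with a short simulation. This is a sketch under the article's simplifications (window counted in segments, one update per round). The function name and the `timeout_rounds` parameter are hypothetical, and integer division with a floor of 1 for ssthresh is an assumption made here.

```python
def simulate_cwnd(rounds: int, ssthresh: int, timeout_rounds=()) -> list:
    """Return the congestion window (in segments) used in each transmission round."""
    cwnd = 1
    history = []
    for round_index in range(rounds):
        history.append(cwnd)
        if round_index in timeout_rounds:
            ssthresh = max(cwnd // 2, 1)   # timeout: halve the threshold...
            cwnd = 1                       # ...and restart slow start
        elif cwnd < ssthresh:
            cwnd *= 2                      # slow start: double every round
        else:
            cwnd += 1                      # congestion avoidance: add 1 every round
    return history

# Threshold 8, with a timeout in round 6 (counting from 0).
print(simulate_cwnd(10, ssthresh=8, timeout_rounds={6}))
# [1, 2, 4, 8, 9, 10, 11, 1, 2, 4]
```

The trace shows exponential growth up to the threshold, linear growth after it, and the drop back to 1 after the timeout.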
Fast retransmission and fast recovery
- The receiver is required to acknowledge, each time it receives a segment, the last segment it has received in order. For example, if M1 and M2 have both been received and then M4 arrives, it should send an acknowledgement for M2 again.
- If the sender receives three duplicate acknowledgements, it knows that the next segment has been lost. In this case, fast retransmission is performed and the next segment is retransmitted immediately. For example, if three duplicate acknowledgements for M2 are received, M3 is lost and M3 is retransmitted immediately.
- In this case only an individual segment was lost; the network is not congested. So fast recovery is performed: set ssthresh = cwnd / 2 and cwnd = ssthresh, and note that the sender goes directly into congestion avoidance.
- The "slow" and "fast" in slow start and fast recovery refer to the value cwnd is set to, not to how fast cwnd grows. Slow start sets cwnd to 1, while fast recovery sets cwnd to ssthresh.
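A minimal sketch of the sender side, assuming segments are numbered M1, M2, ... and each acknowledgement carries the number of the last in-order segment. It counts three duplicates after the first acknowledgement (the usual interpretation) and uses integer division for ssthresh. The function name and the tuple format are made up for the example.

```python
def process_acks(acks: list, cwnd: int) -> tuple:
    """Scan a sequence of ACKs; return (segments to retransmit, final cwnd)."""
    last_ack = None
    duplicates = 0
    retransmit = []
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == 3:
                retransmit.append(ack + 1)     # fast retransmission of the missing segment
                ssthresh = max(cwnd // 2, 1)   # fast recovery: halve the threshold...
                cwnd = ssthresh                # ...and continue from it, not from 1
        else:
            last_ack = ack
            duplicates = 0
    return retransmit, cwnd

# M1 and M2 are acknowledged, then M3 is lost: M4, M5 and M6 each trigger another ACK for M2.
print(process_acks([1, 2, 2, 2, 2], cwnd=16))  # ([3], 8)
```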
9. Application layer
9.1 What is the Use of the Application Layer?
The application layer provides services for application communication.
- Define the types of messages exchanged, such as request messages and response messages.
- Define the syntax and meaning of each message type, such as what a field means, for example the Content-Type field in HTTP.
- Define how and when a process passes data from the transport layer up to the application layer.
Some important application-layer protocols are as follows:
9.2 Common models at the application layer
The first is the client/server model, or C/S architecture. E-mail and the Web are examples.
The second is the P2P model, where each host can both provide and request services. Xunlei (Thunder) downloads, for example, use P2P technology.
9.3 Short and long connections
TCP connections are used in two ways: short-lived connections and long-lived connections.
- Short connection:
  - When the client has a request, it establishes a TCP connection, and when it receives the server’s response, it disconnects. The next request sets up a new connection, receives the response, and disconnects again, and so on. This approach has two main disadvantages:
  - Setting up a TCP connection takes three handshakes and tearing it down takes four waves, which is seven packets. If the request and the response take one packet each, only 2/9 of the packets in a short connection carry useful data, which is very low utilization.
  - When the TCP connection is closed, TCP enters the TIME_WAIT state. If short connections are used too frequently, the client machine may accumulate a large number of TCP connections in the TIME_WAIT state.
- Long connection:
  - After a TCP connection is established between the client and the server, the client keeps using it for data exchange until there is no more data to transmit or the connection is interrupted abnormally. During idle periods, keep-alive packets are usually used to keep the connection open. At present, the long connection mode is widely used.
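The 2/9 figure, and how it changes when one connection is reused, can be checked with a few lines. This follows the article's simplification of one packet per request and one per response; the function names are hypothetical.

```python
from fractions import Fraction

HANDSHAKE_PACKETS = 3   # three-way handshake
WAVE_PACKETS = 4        # four-way wave

def short_connection_utilization(requests: int) -> Fraction:
    """Every request pays for its own connection setup and teardown."""
    useful = 2 * requests
    total = requests * (HANDSHAKE_PACKETS + WAVE_PACKETS + 2)
    return Fraction(useful, total)

def long_connection_utilization(requests: int) -> Fraction:
    """All requests share one connection setup and teardown."""
    useful = 2 * requests
    total = HANDSHAKE_PACKETS + WAVE_PACKETS + 2 * requests
    return Fraction(useful, total)

print(short_connection_utilization(10))  # 2/9
print(long_connection_utilization(10))   # 20/27
```

With ten requests the long connection already spends about three quarters of its packets on useful data, and keep-alive packets (ignored here) would only slightly lower that.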
9.4 DNS
What is DNS? It translates a domain name into an IP address. Take www.qq.com: that is a domain name, but a network packet needs the peer’s IP address, and a domain name cannot be put into the packet header. So you need to ask a server which IP address the qq domain name corresponds to.
The general communication process is as follows:
- The user’s host runs a DNS client; that is, our PC or mobile device runs the client side of the DNS application.
- The browser extracts the domain name field, which is the host name to visit, from the URL it receives, for example http://www.baidu.com/, and passes the host name to the DNS client.
- The DNS client sends the DNS server a query message that contains the host name to be resolved. (In between, a series of cache lookups and the work of the distributed DNS server cluster take place.)
- The DNS client eventually receives a reply message that contains the IP address corresponding to the host name.
- Once the browser receives the IP address from DNS, it can open a TCP connection to the HTTP server located at that IP address.
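The steps above can be sketched with Python's standard library, as an illustration only: `extract_host` and `resolve` are hypothetical helper names, and `socket.getaddrinfo` hands the query to the operating system's resolver rather than speaking the DNS protocol itself. `resolve` needs network access, so it is only shown in a comment.

```python
import socket
from urllib.parse import urlsplit

def extract_host(url: str) -> str:
    """Step 2: pull the host name out of a URL."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError("no host name in URL: " + url)
    return host

def resolve(host: str) -> list:
    """Steps 3-4: ask the system resolver for the IPv4 addresses of a host name."""
    infos = socket.getaddrinfo(host, None, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})   # info[4] is the (address, port) pair

print(extract_host("http://www.baidu.com/"))  # www.baidu.com
# With network access: resolve(extract_host("http://www.baidu.com/"))
```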
A little extra knowledge about domain names is needed here; a rough understanding is enough.
9.5 World Wide Web and HTTP Protocol
The World Wide Web (WWW) is a large-scale, online information store, a collection of countless web sites and pages.
Knowledge blind spot: many people see little difference between an internet, the Internet, and the World Wide Web. In fact, the relationship between the three is: internets include the Internet, and the Internet includes the World Wide Web.
- internet (lowercase i). Any network of devices that can communicate with each other is called an internet, even if it has only two machines (computers, mobile phones, and so on) and regardless of the technology they use to communicate. So internets include wide area networks, metropolitan area networks, and local area networks (LANs). The internationally standard spelling is internet, and the letter i must be lowercase!
- Internet (capital I). The Internet is one particular internet: not a network of two machines, but a very large network of tens of millions of devices. The Internet uses the TCP/IP protocol to let different devices communicate with each other. But a network that uses TCP/IP is not necessarily the Internet; a LAN can also use TCP/IP.
- The Internet is built on the TCP/IP protocol suite, which is made up of many protocols placed in different layers. Many of them sit in the application layer, such as FTP, SMTP, and HTTP. Therefore, the services provided by the Internet generally include: WWW (World Wide Web) service, E-mail service (Outlook), remote login (QQ) service, file transfer (FTP) service, network telephony, and so on.
- The World Wide Web. Whatever uses the HTTP protocol at the application layer is called the World Wide Web. The reason you see Baidu’s web page when you type Baidu’s address into the browser is that your browser and Baidu’s server communicate using the HTTP protocol.
The World Wide Web uses the Uniform Resource Locator as an identifier to access resources.
The URL format is as follows:
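As a rough illustration of the parts of a URL, the standard library's `urlsplit` can take one apart. The sample URL below, including its port, path, query, and fragment, is made up for the example.

```python
from urllib.parse import urlsplit

# Hypothetical URL used only to show the individual components.
parts = urlsplit("http://www.baidu.com:80/index.html?from=home#top")

print(parts.scheme)    # http
print(parts.hostname)  # www.baidu.com
print(parts.port)      # 80
print(parts.path)      # /index.html
print(parts.query)     # from=home
print(parts.fragment)  # top
```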
- Users click hyperlinks to access resources, and these resources are delivered to the user through the Hypertext Transfer Protocol (HTTP).
- The HTTP protocol defines how the browser requests World Wide Web documents from the World Wide Web server, and how the server sends the documents to the browser.
HTTP message analysis:
Here is a brief description of each part:
- Method: The action that the client wants the server to perform on the resource. It is a single word, such as GET, POST, or HEAD.
- Request-URL: The resource being requested. When talking to the server directly, the request URL can be just the absolute path of the resource, and the server assumes that it is itself the host/port of the URL.
- Version: The HTTP version used by the message. Its format is HTTP/<major version number>.<minor version number>
- Status-code: A three-digit number that describes what happened during the request. The first digit of each status code describes the general category of the status (for example, "success", "error", and so on).
- Reason-phrase: A human-readable version of the numeric status code, consisting of all the text up to the end-of-line sequence. The reason phrase is meaningful only to humans, so although the response lines HTTP/1.0 200 NOT OK and HTTP/1.0 200 OK carry different reason phrases, both are treated as indicating success.
- Header: There can be zero or more headers. Each consists of a name, followed by a colon (:), then optional whitespace, then a value, and finally a CRLF. The headers are terminated by a blank line (CRLF), which marks the end of the header list and the beginning of the entity body.
- Entity-body: The entity body contains a block of arbitrary data. Not all messages contain an entity body; sometimes a message simply ends with a bare CRLF.
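The parts described above can be pulled out of a raw message with a few string operations. This is a simplified sketch, not a full HTTP parser: the function name is hypothetical, header names are lower-cased for lookup, and repeated or folded headers are not handled.

```python
def parse_http_message(raw: bytes) -> tuple:
    """Split a raw HTTP message into (start line parts, headers, entity body)."""
    head, _, body = raw.partition(b"\r\n\r\n")       # the blank line ends the header list
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0].split(" ", 2)              # three parts; the last may contain spaces
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")         # name, colon, optional whitespace, value
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body

response = b"HTTP/1.0 200 NOT OK\r\nContent-Type: text/plain\r\nContent-Length: 2\r\n\r\nhi"
start_line, headers, body = parse_http_message(response)
print(start_line)               # ['HTTP/1.0', '200', 'NOT OK']
print(headers["content-type"])  # text/plain
print(body)                     # b'hi'
```

A client deciding whether the request succeeded would look only at the status code `200` here and ignore the odd reason phrase.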
Here are the common headers:
General headers: These can appear in both request and response messages and provide the most basic information about the message.
- Connection: Lets the client and server specify options related to the request/response connection. HTTP/1.1 defaults to keep-alive.
- Date: Provides a date and time stamp indicating when the message was created.
- Transfer-Encoding: Tells the receiver what encoding was applied to the message so that it could be transferred reliably.
- Cache-Control: Carries caching directives along with the message.
Request headers: These are meaningful only in request messages. They describe who or what is sending the request, where the request comes from, or the client’s preferences and capabilities.
- Host: Gives the host name and port number of the server receiving the request.
- Referer: Provides the URL of the document that contains the URI of the current request.
- User-Agent: Tells the server the name of the application that made the request.
- Accept: Tells the server which media types it may send.
- Accept-Encoding: Tells the server which encodings it may send.
- Accept-Language: Tells the server which languages it may send.
- Range: Requests a specified range of the resource, if the server supports range requests.
- If-Range: Allows a conditional request for a range of a document.
- Authorization: Contains the data that the client provides to the server to authenticate itself.
- Cookie: Used by the client to send data to the server.
Response headers: These give the client some additional information, such as who is sending the response, the responder’s capabilities, and even special instructions related to the response.
- Age: How old the response is (the time since it was originally created).
- Server: The name and version of the server application software.
- Accept-Ranges: The range types that the server accepts for this resource.
- Set-Cookie: Sets data on the client so that the server can identify the client.
Entity headers: These describe the length and content of the body, or the resource itself.
- Allow: Lists the request methods that can be performed on this entity.
- Location: Tells the client where the entity is actually located and directs the receiver to the location (URL) of the resource.
- Content-Base: The base URL used when resolving relative URLs in the body.
- Content-Encoding: Any encoding that has been applied to the body.
- Content-Language: The natural language best suited to understanding the body.
- Content-Length: The length of the body.
- Content-Type: The type of object that the body is.
- ETag: The entity tag associated with this entity.
- Last-Modified: The date and time when this entity was last modified.
The entity body: This optional part is essentially the payload that HTTP transmits. HTTP messages can carry many types of digital data, such as pictures, videos, HTML documents, e-mail, software applications, and so on.
Common HTTP methods and status codes will not be introduced in detail here. That is the end of this article! I think it’s amazing that you were so patient!