Summary of TCP/IP basic knowledge

 

Computer networking is a foundational course, but what the teacher covers in class is only a starting point, and for those who have to learn on their own it is undoubtedly harder. The road ahead is long. Computer networking is also a fairly dry subject and this article is long, so I suggest reading it patiently; I hope you get something out of it. Here is the general structure of the article.

 

Preparatory knowledge

Xie Xiren's Computer Networks is the textbook many universities choose for this course. Its first chapter is a general overview that roughly describes how computer networks developed, the basic background everyone should know. Here I summarize it as preparatory knowledge for studying networking.

A Brief History of the Internet

  • Stage 1: the 1950s, data communication technology and basic research on network theory
  • Stage 2: the 1960s, ARPANET and packet-switching technology
  • Stage 3: the mid-1970s, standardization of network architectures and network protocols
  • Stage 4: the 1990s, the Internet, high-speed networks, wireless networks, mobile Internet, and the development of network security technology

The development of the Internet

“The development of computer networks has mainly gone through the following seven stages.”

  • “Batch processing”: to let more people use computers, batch processing systems appeared. Batch processing means loading user programs and data onto cards or magnetic tape in advance and having the computer read and run them in a fixed order.

 

  • “Time-sharing systems”: after batch systems came time-sharing systems, in which multiple terminals connect to one computer so that several users can use the machine at the same time.

 

  • “Computer communication technology”: with time-sharing systems, terminals were connected to computers, but that did not mean computers were connected to each other. As computers became more numerous, convenient data exchange between them drew more and more attention. At first, moving data between two hosts was quite tedious, so computer communication technology (communication lines between computers) arose. People could now read data from another computer in real time, greatly reducing the time needed to transfer data.
  • “The birth of computer networks”: in the 1970s people began experimenting with computer networks based on packet-switching technology and researching how computers from different manufacturers could communicate with each other. In the 1980s networks capable of connecting many computers appeared, and network communication technology entered a period of rapid development.
  • “The popularization of the Internet”: in the 1990s, as prices fell, performance improved, and applications of every kind appeared, computers became more and more widespread. Facing this trend, manufacturers not only had to guarantee that their own products could interconnect, but also had to keep their network technology compatible with Internet technology (TCP/IP).
  • “The Internet age”: with the Internet everywhere, people can hardly live without it. Life, study, and work all rely on networked information, and the era of the Internet of Everything has already arrived.
  • “The network security era”: the Internet has brought disruptive change to the world and enormous convenience to daily life. Like water, electricity, and gas, it has become an indispensable national resource, and in a world where everything is connected, network security has become a vital part of national security. In the early days of the Internet people cared mainly about pure connectivity, building links without restriction; today people are no longer satisfied with “simply connecting” and want “secure connections”.

Network performance indicators

  • “Bit and rate”: a bit is the unit of data in a computer and of information in information theory; the word comes from “binary digit”, a number in base two. Rate in networking refers to the speed at which a host connected to the network transmits data over a digital channel, also known as the data rate or bit rate.

  • “Bandwidth” : In computer networks, bandwidth is used to describe the ability of the communication lines of the network to transmit data, so network bandwidth is the “highest data rate” that can pass from one point of the network to another in a unit of time. Bandwidth in this sense is measured in bits per second.

  • Throughput: Throughput indicates the amount of data through a network (channel or interface) in a unit time. It indicates the data transmission capability of the current network.

  • Delay:

    • 1. “Sending (transmission) delay”: the time a host or router needs to send out a data frame, i.e. from sending the frame's first bit until its last bit has been sent. It equals the frame length divided by the sending rate.
    • 2. “Propagation delay”: the time an electromagnetic wave needs to travel a certain distance along the channel. It equals the channel length divided by the propagation speed.
  • Bandwidth-delay product: the bandwidth-delay product gives the number of bits that fit on a link at once, which is why it is also called the link's length measured in bits. (See the sketch after this list.)

  • Round-trip time (RTT): the total time from when the sender starts sending data until it receives the acknowledgement from the receiver (assuming the receiver acknowledges as soon as the data arrives). The RTT generally includes the various delays packets experience inside the network.

  • Utilization: utilization can be divided into channel utilization and network utilization. Channel utilization is the percentage of time a channel is actually in use (carrying data); a completely idle channel has zero utilization. Network utilization is the weighted average of the channel utilizations across the whole network. Higher utilization is not always better: queuing theory tells us that as a channel's utilization rises, the delay it introduces grows rapidly, so a channel or network running at very high utilization suffers very large delays.
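
To make the delay definitions above concrete, here is a minimal Python sketch (the link parameters are made-up examples, not values from the article) that computes the sending delay, the propagation delay, and the bandwidth-delay product:

```python
# Assumed example values, chosen only for illustration.
frame_bits = 8_000            # a 1000-byte frame
send_rate = 100e6             # a 100 Mbit/s link
link_length_m = 2_000_000     # 2000 km of fiber
propagation_speed = 2e8       # roughly 2/3 the speed of light in fiber, in m/s

sending_delay = frame_bits / send_rate                   # time to push all bits onto the link
propagation_delay = link_length_m / propagation_speed    # time for one bit to cross the link

# Bandwidth-delay product: how many bits are "in flight" on the link at once.
bdp_bits = send_rate * propagation_delay

print(f"sending delay:     {sending_delay * 1e3:.3f} ms")
print(f"propagation delay: {propagation_delay * 1e3:.3f} ms")
print(f"bandwidth-delay product: {bdp_bits / 1e3:.0f} kbit (the link's 'length in bits')")
```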

Classification of computer networks

By “geographic coverage”, computer networks can be divided into three categories:

  • Local Area Network (LAN): a network spanning anywhere from a few metres to around ten kilometres, such as an office, a dormitory, or an Internet cafe. Its characteristics: small coverage area, few users, easy to configure, and high connection rates.
  • Metropolitan Area Network (MAN): used to connect the LANs of enterprises, institutions, and schools within a city or region so that resources can be shared across that region.
  • Wide Area Network (WAN): also called a long-haul network, a WAN interconnects LANs or MANs in different cities. Because the distances are long and signals attenuate badly, such networks usually need leased lines and dedicated protocols, and they form a mesh structure. Since many users share a WAN, the connection rate per user is generally low.

The topology of a computer network

  • Bus structure:

    • Advantages: low cost, easy to expand, high line utilization;
    • Disadvantages: low reliability, hard to maintain, low transmission efficiency.

 

  • Annular structure:

    • Advantages: token control, no line competition, strong real-time, easy transmission control;
    • Disadvantages: Difficult maintenance and low reliability

 

  • Star structure

    • Advantages: High reliability, easy management, easy expansion, high transmission efficiency.
    • Disadvantages: low line utilization, and the central node requires high reliability and redundancy.

There are three different hierarchical models of computer networks:

  • “OSI Seven-layer Model”

 

  • “Five-layer structure Model”

 

  • “TCP/IP Hierarchical Structure Model”

 

TCP/IP is the protocol suite that the current Internet follows. It is not simply TCP plus IP, but a collection of protocols at different layers, which together form the TCP/IP protocol stack we usually talk about. For easier understanding, though, the rest of this article is organized according to the five-layer model.

The physical layer


Here is a suggestion first: when learning computer networks, do not study each protocol in isolation; try to understand why it exists and what role it plays in the network as a whole.

Digital signals and analog signals

The role of the physical layer is to shield the layers above from the differences between transmission media and communication methods. As we all know, there are only two kinds of signals in nature: digital and analog. So what is an analog signal, and what is a digital signal? Put plainly, an analog signal is a continuously varying physical quantity; its amplitude is continuous (meaning it can take infinitely many values within a range) and its waveform is continuous in time, so it is a continuous signal. If we sample a continuous signal we get a sampled signal, which is discrete in time. A digital signal, by contrast, is discrete in the time domain and has two distinct physical states, represented by “0” and “1”, much like a light switch with its two positions. Digital and analog signals can of course be converted into each other: an analog signal is usually quantized into a digital one using PCM (pulse-code modulation), which maps different ranges of the analog value to different binary values; conversely, a digital signal can be turned into an analog one by modulating a carrier, for example by shifting its phase.

At the physical layer, data is transmitted over different physical media, and the devices that work at this layer are hubs. Transmission media fall roughly into two categories:

  • Guided transmission media: There are different types of guided transmission media, such as coaxial cable, optical cable, and twisted pair. The twisted pair can be further divided according to whether it is shielded or not.
  • “Unguided transmission medium” : Unguided transmission medium refers to the propagation of radio waves through space, using different frequency bands to transmit different signals.

Channels

Channel utilization came up earlier among the performance indicators, but channels themselves were not introduced in detail, so let's take a closer look now. By transmission medium, channels can be divided into three categories:

  • Wired channel: A wired channel uses wires as the transmission medium. Signals are transmitted along wires and the signal energy is concentrated near wires. Therefore, the transmission efficiency is high, but the deployment is not flexible. The transmission media used in this type of channel include overhead wires, telephone lines, twisted pair cables, symmetric cables, coaxial cables, etc., and optical fibers for the transmission of modulated optical pulses.
  • “Wireless channel” : the wireless channel mainly includes the radio channel which transmits radio waves and the underwater acoustic channel which transmits sound waves. Radio signals are radiated over the entire free space by the antenna of a transmitter. Radio waves in different frequency bands travel in different ways.
  • “Storage channel” : In a sense, data storage media such as magnetic tape, optical disc and disk can also be regarded as a communication channel. The process of writing data into the storage medium is equivalent to the process of transmitting signals to the channel by a transmitter, and the process of reading data out of the storage medium is equivalent to the process of receiving signals from the channel by a receiver.

A channel is a pathway along which information is transmitted. Channel capacity describes the maximum rate at which a channel can carry information without error, and can be used to measure the channel's quality. Another important parameter of a channel is the signal-to-noise ratio: the larger the signal-to-noise ratio, the larger the channel capacity. Here is the famous Shannon formula:

 
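In symbols the Shannon formula is C = B log₂(1 + S/N). As a quick sanity check, here is a tiny Python computation with assumed numbers (a 3 kHz channel and a signal-to-noise ratio of 1000, i.e. 30 dB, neither of which comes from the article):

```python
import math

B = 3000      # assumed bandwidth in Hz
snr = 1000    # assumed signal-to-noise ratio S/N

C = B * math.log2(1 + snr)   # Shannon capacity: the error-free limit in bit/s
print(f"channel capacity ≈ {C:.0f} bit/s")   # ≈ 29902 bit/s
```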

Where C is the channel capacity (bit/s), B is the channel bandwidth (Hz), and S/N is the signal-to-noise ratio.

Channel multiplexing

We know that when there is no data to send the channel sits idle, yet when traffic surges (as during the recent 618 shopping festival) the flow of information slows down. So what is channel multiplexing? Multiplexing simply means sharing the channel among several users. It can be divided into the following kinds:

  • “Time-division multiplexing (TDM)”: TDM slices the use of the channel into time slots. With TDM all users occupy the same frequency band, but at different times (divided in time, not in frequency). TDM can waste line capacity when a user's slot goes unused.
  • “Frequency-division multiplexing (FDM)”: FDM separates signals by frequency. With FDM all users transmit at the same time, but each occupies a different slice of the bandwidth.
  • “Statistical TDM”: a statistical TDM system, also called asynchronous TDM, adds a buffer-like mechanism: data is forwarded once enough of it has accumulated, which greatly improves channel utilization.

 

Data link layer


Ethernet frame

 

The data link layer receives IP datagrams from the network layer and encapsulates them so that they can be transmitted over the link. The encapsulated datagram is called an Ethernet frame, or MAC frame. A MAC frame consists of the following important parts:

  • Destination MAC Address: The destination MAC address of the MAC frame occupies 6 bytes and identifies the address of the target host.
  • Source MAC Address: The same as the destination MAC address, the source MAC address occupies six bytes and identifies the address of the source host.
  • Type: the type field occupies two bytes and records the protocol used by the upper layer; for example, 0x0800 indicates the IP protocol.
  • “Data part” : The data part is naturally the IP datagram from the upper layer.
  • “FCS”: the frame check sequence takes up 4 bytes and is used for error detection. A MAC frame found to be in error is discarded rather than delivered to the destination host.

Error detection

Why do we need error detection? Real communication links are not ideal: a bit can be corrupted in transit, a 1 turning into a 0 or a 0 into a 1. This is called a bit error. Over a period of time, the ratio of erroneous bits to the total number of bits transmitted is the bit error rate (BER), which is closely related to the signal-to-noise ratio. In practice the BER can never be reduced to zero, so to keep data transmission reliable, computer networks must use various error detection measures. MAC frames will inevitably pick up errors as they propagate. In the Ethernet frame section we mentioned the frame check sequence (FCS); from the FCS we can tell whether a MAC frame was corrupted or lost in transit. Error detection will come up again at the transport layer, so what is the difference between the two? It can be summed up as follows:

  • The goal of error detection at the data link layer is “no bit errors”.
  • The goal of error detection at the transport layer is “no transmission errors”, which additionally means dealing with lost frames, duplicated frames, and frames arriving out of order.

There are two main error detection methods: parity check and cyclic redundancy check (CRC). Parity checking is very simple and is not the focus of this article; the following is mainly about CRC. A cyclic redundancy check generates a fixed-length check code from the data being transmitted or stored, and is used to detect errors that may occur during transmission or storage. The check value is computed and appended to the data before it is sent or stored, and the receiver recomputes it to see whether the data has changed. With CRC we can calculate the FCS check code that sits at the end of the MAC frame, and with the FCS we can tell whether a MAC frame was damaged in transit. A toy sketch of this check follows.
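Ethernet's FCS is a CRC-32, which Python happens to expose through zlib.crc32, so a rough illustration of the idea is easy to write (the frame contents below are made up, and real Ethernet hardware handles the bit ordering differently):

```python
import zlib

def add_fcs(frame: bytes) -> bytes:
    """Sender side: append a 4-byte CRC-32 'FCS' to the frame."""
    fcs = zlib.crc32(frame)
    return frame + fcs.to_bytes(4, "big")

def check_fcs(frame_with_fcs: bytes) -> bool:
    """Receiver side: recompute the CRC and compare it with the trailing FCS."""
    frame, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return zlib.crc32(frame) == int.from_bytes(fcs, "big")

payload = b"example MAC frame payload"            # made-up data
wire = add_fcs(payload)

print(check_fcs(wire))                            # True: frame arrived intact
corrupted = bytes([wire[0] ^ 0x01]) + wire[1:]    # flip one bit in transit
print(check_fcs(corrupted))                       # False: the bit error is detected
```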

Adapters

When it comes to adapters, think of the adapters in everyday life. When we charge a mobile phone we need a power adapter, which is essentially a converter, a carrier that transfers energy from one form to another. Adapters in computers are much the same. Consider this picture:
 

As we all know, data travels over the external medium serially, while the computer processes data internally in parallel. How do you turn serial data into parallel data (and back)? That is the adapter's job: it acts as a bridge that converts between the two transmission forms.

CAM table

We all know about switches: they are multi-port bridges that forward data at the data link layer based on MAC addresses. A switch actually maintains a table, called the CAM table, which records hosts' MAC addresses and the corresponding switch ports. Take a look at the following graph:

 

Three hosts, A, B, and C, are connected to the switch. Initially the CAM table holds no information. At some point host A (the source MAC) wants to send a message to host B (the destination MAC). The switch first checks whether its CAM table already has an entry for host A; seeing that it does not, it writes host A's information into the table. Now the switch's CAM table looks like this:

 

At this point the switch's CAM table holds host A's entry, but host A still wants to reach host B. What now? First the switch checks whether its CAM table has an entry for B; if it does, it forwards the message straight to B. And if it does not? The switch falls back on another idea: it broadcasts host A's message to every host connected to it. Host C also receives the message, but after checking the destination address it discards it. Host B receives the message, sees that it is addressed to itself, and accepts it. The switch then adds a new entry to its CAM table:

 

In this way the CAM table holds entries for both host A and host B, and the next time host A sends a message to host B the switch forwards it directly instead of broadcasting. A minimal sketch of this learn-and-forward behaviour follows.
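Here is a minimal, hypothetical Python simulation of the learning behaviour described above (the port numbers and MAC addresses are made up):

```python
class Switch:
    """Toy model of a learning switch: the CAM table maps MAC address -> port."""

    def __init__(self):
        self.cam = {}   # MAC address -> port number

    def receive(self, in_port, src_mac, dst_mac):
        # Learn: remember which port the source MAC was seen on.
        self.cam[src_mac] = in_port
        # Forward: use the CAM table if possible, otherwise flood.
        if dst_mac in self.cam:
            return f"forward to port {self.cam[dst_mac]}"
        return "flood to all other ports"

sw = Switch()
print(sw.receive(1, "AA:AA:AA:AA:AA:AA", "BB:BB:BB:BB:BB:BB"))   # flood: B is unknown
print(sw.receive(2, "BB:BB:BB:BB:BB:BB", "AA:AA:AA:AA:AA:AA"))   # forward to port 1
print(sw.receive(1, "AA:AA:AA:AA:AA:AA", "BB:BB:BB:BB:BB:BB"))   # forward to port 2
```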

CSMA/CD protocol

CSMA/CD sees fairly little use nowadays. It is used in the following two places:
  • Wired networks
  • 10M/100M half-duplex wired Ethernet

Networks using CSMA/CD have the following three characteristics:

  • “The network is a bus structure”: all computers attach to the same bus, and at any given moment only one computer is allowed to transmit, i.e. half-duplex communication is used.
  • “Carrier sense” : the channel must be continuously monitored before and during transmission. Messages can be sent only when the channel is idle.
  • Collision detection: the host keeps monitoring the channel while sending. If two hosts send at the same time, a collision occurs and both stop transmitting immediately; the backoff algorithm then makes each host wait a random period of time before trying again.

To supplement: CSMA has the following listening (persistence) strategies:

  • Non-persistent CSMA: if the line is busy, wait a while before listening again; if it is idle, send immediately. Collisions are reduced, but channel utilization drops.
  • “1-persistent CSMA”: if the line is busy, keep listening; as soon as it is idle, send immediately. Channel utilization rises, but collisions increase.
  • “p-persistent CSMA”: if the line is busy, keep listening; when it is idle, send with probability p and keep listening with probability 1 − p (p is a chosen probability value).

Network layer

IP protocol

Overview of the IP protocol: the IP protocol goes hand in hand with IP addresses. So what is an IP address?

❝ An IP address (Internet Protocol address) is assigned to a device when it connects to the network and is used to identify it. With IP addresses, devices can communicate with one another; without them, we would have no way of knowing which device is the sender and which is the receiver. An IP address has two main functions: identifying a device or network, and addressing. ❞

The quoted text above really makes two points, summarized as follows:

  • An IP address is used to mark a host. Without an IP address, a host cannot be identified. (Mark host)
  • Because hosts are uniquely tagged, they can be used to find hosts in the network. (address)

Now think back to the MAC address discussed earlier. A MAC address is a host's identity: it is fixed when the network adapter leaves the factory and normally does not change. (It can be altered in software, but two hosts with the same MAC address must not be attached to the same LAN.) “So why do we need an IP address when we have a MAC address, and why a MAC address when we have an IP address?” This is a classic question with many answers online; here are two recommended articles:

  • Why use a MAC address when you have an IP address?
  • Why is there a MAC address and an IP address?

After reading the above two articles, I conclude as follows:

  • Historical cause: Ethernet predates the Internet, and MAC addresses were in use before IP addresses. Both protocols are used together to avoid affecting existing protocols
  • “Layered implementation”: once the protocols are layered, the data link layer does not need to care about end-to-end forwarding, and the network layer does not need to care about the details of the link below it.
  • The IP address of a host changes as it moves to a different network, while the MAC address does not. So we use IP addresses for addressing across networks, and MAC addresses to deliver the datagram once it is on the same network as the destination host.

IP datagrams

An IP datagram looks like this:

 

There are a few important things that have to be said:

  • Version: a 4-bit binary number, indicating the IP protocol version used by the IP packet. The IP protocol version 4 in the TCP/IP protocol family is mainly used on the Internet.
  • Header length: occupies 4 bits and gives the length of the entire header, options included, in units of 32-bit words. The receiver uses it to work out where the header ends and the data begins. For a normal IP datagram with no options the value is 5, i.e. a 20-byte header.
  • Type of Service (TOS): 8 bits used to specify how the datagram should be handled.
  • TTL (Time To Live): occupies 8 bits and gives the maximum time a datagram may travel in the network; in practice the field counts the maximum number of routers the datagram may pass through. The initial value is set by the source host (typically 32, 64, 128, or 255) and is decreased by one at every router that processes the datagram. When the field reaches 0 the datagram is discarded and an ICMP message is sent to notify the source host, which prevents datagrams from circling a routing loop forever.
  • Upper-layer Protocol ID: occupies 8 bits. The IP protocol can carry various upper-layer protocols. The target end sends the received IP data to the upper-layer protocols, such as TCP or UDP, based on the upper-layer protocol ID.

Subnet masks and IP addresses

An ordinary IP address consists of a network part and a host part. So what is the network number? The network number names the network the computer currently sits on, and that network is made up of a number of hosts. How do you compute the network number? This is where the subnet mask comes in. An IP address and a subnet mask always come as a pair: comparing the subnet mask with the IP address tells you the network number and the host number. For ease of representation, a subnet mask is always a run of consecutive 1s followed by a run of consecutive 0s; the 1s and 0s cannot alternate. Take a look at the following example (a short sketch using Python's ipaddress module also follows the figures below).

 

Now that you know host A's IP address and subnet mask, convert them to binary form. The 1 bits of the subnet mask correspond to the network part of the IP address, and the 0 bits correspond to the host part. The picture below makes it clear:

 

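As a quick check of the idea above, Python's standard ipaddress module can compute the network number directly; the address and mask here are made-up examples, not taken from the figure:

```python
import ipaddress

# Assumed example: host A's address with a 255.255.255.0 subnet mask.
iface = ipaddress.ip_interface("192.168.1.56/255.255.255.0")

print(iface.network)            # 192.168.1.0/24 -> the network number
print(iface.network.netmask)    # 255.255.255.0
print(int(iface.ip) & ~int(iface.network.netmask) & 0xFFFFFFFF)   # 56 -> the host part
```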
ICMP

As we know, IP is an unreliable protocol; TCP is the reliable transport protocol, which we will come to at the transport layer. So if a datagram cannot be delivered, how does the network layer report the problem? That is what ICMP, the Internet Control Message Protocol, is for. ICMP messages are carried in the data part of IP datagrams and come in two kinds: error-report messages and query messages. Error messages simply report a problem; what to do about it is the responsibility of higher-level protocols. Error messages are always sent back to the original source (the only usable addresses in the ICMP datagram are the source and destination IP addresses), and query messages always occur in pairs (request and reply).

ARP

IP addresses are used for addressing, while MAC addresses are used to deliver the datagram once it is on the same network as the destination. Now there is a problem: host A wants to send a message to host B. After a series of forwarding steps the message reaches host B's network and we know B's IP address, but, as noted earlier, transmission at the link layer needs a MAC address, and B's IP address alone is not enough for communication. Take a look at the chart below:

 

This is where ARP comes in handy. ARP is the Address Resolution Protocol: its basic job is to find the MAC address of a target device from its IP address so that communication can proceed, and it is an essential protocol in the IPv4 network layer. “Just as switches work at the data link layer, routers work at the network layer; switches have CAM tables, and routers have routing tables.” Now the router needs host B's MAC address in order to deliver the message. It therefore sends an ARP request, which is broadcast to every host attached to it; only host B, recognizing its own IP address in the request, replies with an ARP response that tells the router its MAC address. As shown below:

 

Each time the router receives an ARP response, it records an entry mapping that IP address to its MAC address, so next time it does not need to broadcast a request for the same host. Of course, just as entries in a switch's CAM table have a lifetime, entries in the ARP cache do too; if they lived forever, the router would waste a great deal of storage on stale data. A small sketch of such an ARP cache follows.
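A hypothetical, minimal model of an ARP cache with entry lifetimes (the timeout value and the addresses are assumptions, not from the article):

```python
import time

class ArpCache:
    """Toy ARP cache: IP address -> (MAC address, time the entry was learned)."""

    def __init__(self, lifetime_s=300):           # assumed 5-minute entry lifetime
        self.lifetime_s = lifetime_s
        self.entries = {}

    def learn(self, ip, mac):
        self.entries[ip] = (mac, time.time())

    def lookup(self, ip):
        entry = self.entries.get(ip)
        if entry is None:
            return None                           # unknown: would trigger an ARP request
        mac, learned_at = entry
        if time.time() - learned_at > self.lifetime_s:
            del self.entries[ip]                  # expired: forget it and resolve again
            return None
        return mac

cache = ArpCache()
cache.learn("192.168.1.2", "BB:BB:BB:BB:BB:BB")   # learned from an ARP reply
print(cache.lookup("192.168.1.2"))                # cached MAC, no broadcast needed
print(cache.lookup("192.168.1.9"))                # None -> broadcast an ARP request
```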

Internal gateway protocols

There are two widely used routing protocols inside the Internet's autonomous systems, namely RIP and OSPF. The following describes both. An introduction to RIP first:
  • Routing Information Protocol (RIP) is the first widely used internal gateway protocol IGP. RIP is a distributed routing protocol based on the distance vector. As a standard protocol on the Internet, RIP has the advantages of simple implementation and low cost.
  • Basic algorithm: the distance-vector algorithm (V-D algorithm for short). Each gateway periodically broadcasts route-refresh messages whose main content is a table of (V, D) pairs. In a (V, D) pair, V stands for “vector” and identifies a destination (network or host) the gateway can reach; D stands for “distance”, the number of hops from this gateway to destination V. When other gateways receive a (V, D) message, they refresh their routing tables according to the shortest-path principle (one update step is sketched after this list).
  • RIP is only suitable for small networks (15 hops is the limit); if the network is too large, it takes a long time for news of a failure to reach every router.
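
A hypothetical sketch of a single distance-vector update step, assuming every link costs one hop (both tables are made up):

```python
# Our current routing table: destination -> (distance, next hop)
table = {"net1": (1, "direct"), "net2": (5, "B")}

# A (V, D) advertisement received from neighbouring gateway "C"
advert = {"net2": 2, "net3": 3}

for dest, dist in advert.items():
    new_dist = dist + 1                     # one extra hop to reach neighbour C
    if dest not in table or new_dist < table[dest][0]:
        table[dest] = (new_dist, "C")       # shorter (or new) path: update the route

print(table)
# {'net1': (1, 'direct'), 'net2': (3, 'C'), 'net3': (4, 'C')}
```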

“Next, what is OSPF?”

  • Basic Definition: Open Shortest Path First (OSPF) is an Interior Gateway Protocol (IGP) used to determine routes within a single Autonomous system (AS).
  • Basic algorithm: Dijkstra's shortest-path-first algorithm. Neighbour relationships are established by exchanging HELLO packets with neighbours, and a designated router (DR) is elected on broadcast networks.

NAT

NAT technology is conceptually simple, so what is it for? Network Address Translation (NAT) was proposed in 1994. NAT is used when hosts on a private network have been assigned only local IP addresses (private addresses valid only inside that network) but need to communicate with hosts on the Internet. The method requires NAT software to be installed on the router that connects the private network to the Internet. A router running NAT software is called a “NAT router”, and “it has at least one valid global IP address”. All hosts using local addresses must have those addresses translated into global IP addresses on the NAT router before they can talk to the Internet. Because this lets a small number of public IP addresses represent many private ones, NAT also “helps slow the depletion of the available IP address space”. “Simply put, NAT is how a LAN communicates with the Internet.” NAT can be divided into three types (a toy NAPT sketch follows the list):

  • Static NAT: Static NAT is the simplest and easiest to implement. Each host on the internal network is permanently mapped to a valid address on the external network. Static NAT is implemented when an internal host must be accessed as a fixed external address.
  • Pooled NAT(dynamic ADDRESS NAT) : Dynamic NAT defines a series of valid IP addresses (address pools) on the external network and maps them to the internal network through dynamic allocation. Dynamic NAT works like this: When an internal host needs to access the Internet, it allocates an available IP address from the public IP address pool to the host. When the communication is complete, the obtained public IP address is released back to the address pool. If an external public IP address is assigned to an internal host for communication, it cannot be assigned to another internal host.
  • Port-level NAT (NAPT) : NAPT maps internal IP addresses to different ports of the same IP address on the external network. Network Address Port Translation (NAPT) maps multiple internal IP addresses to one legitimate public IP Address. However, different protocol ports correspond to different internal IP addresses. This is the translation between < internal address + internal port > and < external address + external port >.
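
A hypothetical sketch of the NAPT idea (all addresses and ports are made up): the router rewrites <internal address, internal port> to <its own public address, an allocated port> and remembers the mapping so that replies can be translated back:

```python
import itertools

PUBLIC_IP = "203.0.113.5"                  # assumed public address of the NAT router
next_port = itertools.count(40000)         # pool of external ports to hand out

nat_table = {}                             # (internal ip, internal port) -> external port

def outbound(src_ip, src_port):
    """Translate the source <ip, port> of an outgoing packet."""
    key = (src_ip, src_port)
    if key not in nat_table:
        nat_table[key] = next(next_port)   # allocate a new external port
    return PUBLIC_IP, nat_table[key]

def inbound(dst_port):
    """Translate a reply arriving at the router's public address back again."""
    for (ip, port), ext_port in nat_table.items():
        if ext_port == dst_port:
            return ip, port
    return None                            # no mapping: drop the packet

print(outbound("192.168.1.10", 51000))     # ('203.0.113.5', 40000)
print(outbound("192.168.1.11", 51000))     # ('203.0.113.5', 40001)
print(inbound(40001))                      # ('192.168.1.11', 51000)
```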

IPv6

After IPv4 comes IPv6, so why do we need IPv6 when we already have IPv4? As early as the last century people foresaw that IPv4 addresses would one day run out, and IPv6 was developed to solve exactly that problem. IPv6 (IP version 6) is the standardized Internet protocol designed to address IPv4 address exhaustion. An IPv4 address is four 8-bit bytes, i.e. 32 bits; an IPv6 address is four times as long, 128 bits, usually written as eight 16-bit groups. IPv6 addresses are, for practical purposes, inexhaustible, so why not replace all IPv4 addresses with IPv6 right now? Switching from IPv4 to IPv6 takes time and requires renumbering every host and router on the network, and with the Internet as widespread as it is, replacing every address is a daunting task. Existing networks therefore carry both IPv4 and IPv6, so how do the two communicate? There are two techniques, “dual stack” and “tunnelling”, described as follows:

  • “Dual stack”: a node runs IPv4 and IPv6 side by side. When a datagram must be converted from one header format to the other, some IPv6 header information is inevitably lost in the translation.
  • “Tunnelling”: what is tunnelling? It can be understood fairly literally; the picture below should help. Put plainly, tunnelling wraps the data in another layer of encapsulation for part of the journey and unwraps it afterwards. As the figure shows, when data crosses from an IPv6 network into an IPv4 network, the IPv6 packet is encapsulated inside an IPv4 packet.

 

Transport layer

Stop-and-wait protocol

What is the stop-and-wait protocol? Look at the picture below and you will probably get the idea.

 

The stop-and-wait protocol covers the following three situations:

  • “No errors”: in the error-free case, host A keeps sending messages to host B as shown in the figure above, and host B replies to each one.
  • “Errors”: if something goes wrong, for example host A never receives a reply from host B, there must be a mechanism for A to resend the message. The choice of the “retransmission time” matters: it should not be less than the RTT (the time for A's message to reach B plus the time for B's reply to reach A).
  • “Acknowledgement lost and acknowledgement late”: you will probably understand these two cases from the picture below.

 

Data may be lost or delayed in transit; lost data is retransmitted, and data that arrives too late is simply ignored. While we are on stop-and-wait, ARQ (Automatic Repeat reQuest) deserves a mention. With the continuous ARQ protocol the sender does not have to wait for the acknowledgement of the previous packet and can send several packets in a row, which improves channel utilization and lets enough data be moved within a given time.

UDP

Compared with TCP, UDP is relatively simple; TCP is the focus of the transport layer. Briefly, UDP has the following characteristics (a minimal socket sketch follows the list):

  • Connectionless; provides unreliable (best-effort) delivery
  • Datagram oriented
  • No congestion control
  • UDP datagram header has low overhead
  • Support one-to-one, one-to-many, many-to-many, many-to-one data transmission
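
A minimal sketch of UDP's connectionless, datagram-oriented style using Python's standard socket module (the loopback address and port number are arbitrary choices for the example):

```python
import socket

# "Server": bind to a port and wait for a datagram. No connection is set up.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 9999))           # assumed port for the example

# "Client": just send a datagram; UDP gives no acknowledgement back.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"hello over UDP", ("127.0.0.1", 9999))

data, addr = server.recvfrom(2048)         # one whole datagram, boundaries preserved
print(data, "from", addr)

client.close()
server.close()
```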

TCP

TCP overview

TCP is the other main transport layer protocol. It has the following features (a minimal socket sketch follows the list):

  • TCP is a connection-oriented transport layer protocol
  • Provide reliable delivery
  • Use full duplex communication
  • Byte-stream oriented
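
A minimal sketch of TCP's connection-oriented, byte-stream behaviour with Python's socket module (the loopback address and port are assumptions for the example):

```python
import socket
import threading

# Set up the listening ("server") socket first.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 8888))              # assumed loopback port
srv.listen(1)

def handle():
    conn, _ = srv.accept()                 # returns once the handshake completes
    data = conn.recv(1024)                 # read from the reliable byte stream
    conn.sendall(data.upper())             # reply on the same connection (full duplex)
    conn.close()

t = threading.Thread(target=handle)
t.start()

cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(("127.0.0.1", 8888))           # this is where the three-way handshake happens
cli.sendall(b"hello over tcp")
print(cli.recv(1024))                      # b'HELLO OVER TCP'

cli.close()
t.join()
srv.close()
```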

The format of a TCP segment is shown in the following image (taken from the Internet).

 

Some fields in datagrams are defined as follows:

  • Source Port: indicates the port number of the sending host
  • Destination Port: indicates the port number of the receiving host
  • Sequence number: every byte in the byte stream carried over a TCP connection is numbered consecutively, with the starting number chosen when the connection is established. The sequence number field in the header gives the number of the first data byte carried in this segment.
  • Acknowledgement number: the sequence number of the first data byte expected in the next segment from the other side. An acknowledgement number of N means all data up to byte N − 1 has been received correctly.
  • Data offset: the distance from the start of the TCP segment to the start of its data portion, i.e. the header length.
  • Window: The window field specifies how much data the other party is now allowed to send. The window value is always changing dynamically. The window refers to the receiving window of the party sending the message (not its own sending window).
  • Checksum: The range of checksum and field check includes header and data. When calculating the checksum, add a 12-byte dummy header to the front of the TCP packet segment (same as UDP).
  • ACK confirmation: The ACK number field is valid only when ACK=1. When ACK=0, the confirmation number is invalid. According to TCP, ACK must be set to 1 for all packet segments transmitted after a connection is established.
  • “PUSH PUSH” : When two application processes communicate interactively, sometimes the application on one side wants to receive the response immediately after typing a command, rather than waiting for the entire cache to fill up and deliver it up. In this case, the sender TCP sets PSH to 1 and immediately creates a packet segment to send. After receiving the packet segment with PSH=1, the receiving TCP delivers the packet to the receiving application process as soon as possible.
  • Reset RST: When RST is 1, it indicates that a serious error occurs in the TCP connection (for example, the host crashes or other reasons) and the connection must be released before the transport connection is re-established.
  • SYN: the synchronization bit used when establishing a connection. SYN=1 with ACK=0 marks a connection-request segment.
  • “FIN”: used to release a connection; FIN=1 means the sender has finished sending data and asks for the connection to be released.

Sliding window

To improve transmission efficiency, TCP sends data using a mechanism called the sliding window. Below is a schematic of the sender's sliding window; the window size is the length of the green and red parts combined. It works like this: as soon as the sender receives an acknowledgement, the window slides to the right (a toy sketch follows the figure).

 

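A hypothetical, much-simplified model of the sender's window (real TCP counts bytes rather than whole packets; all numbers here are made up):

```python
# Sender-side sliding window over packets 0..9 with an assumed window size of 4.
window_size = 4
base = 0            # oldest unacknowledged packet
next_seq = 0        # next packet to send
total = 10

def try_send():
    global next_seq
    while next_seq < base + window_size and next_seq < total:
        print(f"send packet {next_seq}")
        next_seq += 1

def on_ack(ack):
    """ack = number of the next packet the receiver expects (cumulative ACK)."""
    global base
    base = max(base, ack)   # the window slides to the right
    try_send()              # sliding frees room, so more packets can go out

try_send()                  # sends packets 0, 1, 2, 3
on_ack(2)                   # 0 and 1 acknowledged -> sends 4, 5
on_ack(6)                   # up to 5 acknowledged -> sends 6, 7, 8, 9
```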
Flow control

Flow control can be summed up in one sentence: ❝ the receiver feeds back to the sender how much it can accept, and that feedback controls the size of the sender's sliding window. ❞ The following is an answer from Zhihu that I found to be the most vivid explanation; read it together with the figures.

 

 

How does TCP's sliding window achieve flow control? The figures above should make it clear.

Congestion control

  • “Slow start”: the idea of slow start is that when a TCP connection has just been established, the sender should not inject a large amount of data at once and congest the network, but should increase the congestion window gradually according to the feedback it receives.

  • “Congestion avoidance”: in congestion avoidance the congestion window grows slowly (linearly), rather than exponentially as in slow start.

  • Fast retransmission: If the sender receives three consecutive repeated acknowledgements, it should immediately retransmit the unreceived packet segment without waiting for the retransmission timer to expire.

  • Quick Recovery: Quick recovery has the following two features

    • When the sender receives three consecutive repeated acknowledgements, the multiplication reduction algorithm is performed to halve the slow start threshold. This is to prevent network congestion. Note that the slow start algorithm is not executed.
    • When the fast recovery algorithm runs, the congestion window is set to the new (halved) slow-start threshold, and the congestion avoidance algorithm then takes over so that the congestion window grows slowly. (A toy trace of these phases is sketched below.)
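
A hypothetical, heavily simplified trace of how the congestion window (cwnd) moves through slow start, congestion avoidance, and fast recovery; the window is counted in segments and every number is made up:

```python
cwnd, ssthresh = 1, 16        # assumed starting values, in segments

def on_round(three_dup_acks=False):
    """Advance one round trip and print the resulting congestion window."""
    global cwnd, ssthresh
    if three_dup_acks:
        ssthresh = max(cwnd // 2, 2)    # multiplicative decrease: halve the threshold
        cwnd = ssthresh                 # fast recovery: do not fall back to slow start
    elif cwnd < ssthresh:
        cwnd *= 2                       # slow start: exponential growth
    else:
        cwnd += 1                       # congestion avoidance: additive growth
    print(f"cwnd={cwnd:3d}  ssthresh={ssthresh}")

for _ in range(5):
    on_round()                          # 2, 4, 8, 16, 17 (slow start, then avoidance)
on_round(three_dup_acks=True)           # 8 (threshold halved, fast recovery)
for _ in range(2):
    on_round()                          # 9, 10 (congestion avoidance resumes)
```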

The three-way handshake and the four-way wave come up in interviews all the time, but before getting into the handshake it is worth spelling out what “ideal transmission conditions” would look like:

  • The transmission channel does not produce errors
  • No matter how fast the sender sends data, the receiver can always receive it in time.

The ideal is, after all, only an ideal; neither of the two conditions above holds in a real environment. So let's talk about how to bring reality closer to the ideal, which is exactly what the three-way handshake is about. First of all, the three-way handshake and the four-way wave belong to TCP; UDP is connectionless, so it has neither. Both exist to make transmission more reliable. Let's look at the flow chart of the three-way handshake.

 

Since the goal is reliable transmission, the point is simply to confirm that both the client and the server can send and receive data normally.

  • “First handshake”: the Client can conclude nothing yet; the Server learns that the Client's sending and its own receiving work correctly.
  • “Second handshake”: the Client learns that its own sending and receiving work and that the Server's sending and receiving work; the Server still only knows that the Client can send and that it can receive.
  • “Third handshake”: the Server additionally learns that its own sending and the Client's receiving work. At this point both sides know that data can be sent and received correctly in both directions.

Why is a third handshake needed? In short, mainly to prevent an old, invalid connection-request segment that suddenly reaches the server from being mistaken for a new connection and causing errors. After the three steps above, the Client and Server can transmit reliably. Once the three-way handshake is understood, the four-way wave is not hard. The flow chart first.

 

Like the three-way handshake, the four-way wave exists for the sake of reliable transmission. The four-way wave is the process of disconnecting the Client from the Server. You might wonder: if establishing the connection takes three segments, why does tearing it down take four? Isn't once or twice enough? Well, just as the handshake needs confirmation from both ends, so does the four-way wave.

  • First wave: the Client sends a disconnection request to the Server.
  • Second wave: the Server sends back an acknowledgement of the disconnection. Once the Client receives it, the TCP connection enters the half-closed state: the channel for sending data from the Client to the Server is closed.
  • Third wave: The Server sends a disconnection request to the Client.
  • Fourth Wave: The Client sends the confirmation of disconnection to the Server. After the Server receives this message, the TCP connection is completely disconnected.

Another way to look at the question raised above: suppose the second and third waves were merged, i.e. the Server sent its ACK and its FIN to the Client together. If the Server still had data of its own to deliver at that moment, the connection would already be closing and that data could not be received, as the figure below suggests.

 

Here is a recommended article to help you better understand how a TCP connection is established and torn down: “Two animated figures: thoroughly understand the TCP three-way handshake and four-way wave”.

TCP and UDP application scenarios

As for how TCP and UDP relate, the picture below (taken from the Internet) should make it clear:

 

 

TCP is reliable and UDP is not, so why do we still use unreliable UDP to move data? As we know, UDP does not establish a connection before sending, and the remote host does not acknowledge what it receives. Although UDP offers no reliable delivery, in some situations it is the most efficient choice, typically for real-time communication: QQ voice, QQ video, live streaming, and so on. TCP provides a connection-oriented service: the connection must be set up before data is transferred and released afterwards, and TCP offers no broadcast or multicast service. Because TCP provides reliable, connection-oriented transport (the three-way handshake before data transfer; acknowledgements, windows, retransmission, and congestion control during transfer; and connection release afterwards to free system resources), it inevitably carries a lot of overhead: acknowledgements, flow control, timers, connection management, and so on. This makes the protocol header much larger and consumes a lot of processor resources. TCP is typically used for file transfer, sending and receiving email, and remote login.

The application layer


HTTP

❝ HTTP is a standard for requests and responses between a client (the user) and a server (the website), usually carried over TCP. Using a web browser, web crawler, or other tool, the client issues an HTTP request to a given port on the server (80 by default); we call this client the user agent. The answering server stores resources such as HTML files and images; we call it the origin server. ❞

HTTP is now used all over the World Wide Web and will get an article of its own later, but HTTPS deserves a mention here. HTTP and HTTPS are essentially the same protocol, except that HTTPS is wrapped in Secure Sockets Layer (SSL) or Transport Layer Security (TLS); from that alone you can tell that HTTPS is secure where plain HTTP is not.

FTP

The File Transfer Protocol (FTP) is an application layer protocol in the TCP/IP family. Running on top of TCP, it is a reliable transfer protocol used to distribute and share files between users. Network administrators also use FTP for tasks such as upgrading device software, downloading logs, and saving configurations.

DNS

We said earlier that IP addresses are used to locate hosts, but in everyday life it is hard to remember these irregular addresses; we only remember a site's domain name. So what do we do? This is why the DNS protocol came into being.

 

DNS is the domain name resolution protocol: when we know a domain name but not the IP address of the server behind it, DNS does the lookup for us (a one-line example is sketched below).
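As a small illustration, Python's standard library can ask the system's DNS resolver to do the lookup in one call (the domain here is just an example):

```python
import socket

# Resolve a domain name to an IPv4 address via the system's DNS resolver.
ip = socket.gethostbyname("example.com")   # example domain; the result may vary
print(ip)
```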

DHCP

What is the DHCP protocol? ❝ The Dynamic Host Configuration Protocol (DHCP) is a communication protocol that lets network administrators manage and assign IP addresses centrally and automatically. On an IP network, every device connected to the Internet needs a unique IP address. DHCP lets administrators monitor and hand out IP addresses from a central point, and when a computer moves to another place in the network it automatically receives a new address. ❞ As this wiki excerpt makes clear, DHCP's job is to assign IP addresses to hosts dynamically, greatly reducing the network administrator's workload.