Preface

To be a good Android developer, you need a complete knowledge system. Here, let’s grow into what we want to be.

Network optimization has always been considered one of the deepest areas of mobile optimization. To optimize the network further, we must first lay a more solid networking foundation. In this article, we revisit the key knowledge of computer networks to build a relatively comprehensive system of network knowledge in your mind.

Outline

First, revisit computer networks

1. What is a computer network?

  • 1) It consists mainly of general-purpose, programmable hardware that is interconnected.
  • 2) This hardware can transmit different types of data.
  • 3) A computer network involves not only software concepts but also hardware devices.
  • 4) Computer networks are not limited to information and communication; they support a wide and growing range of applications.

2. Classification of computer networks

1) According to the scope of action

Wide Area Network (WAN)

Tens to thousands of kilometers; spans provinces and countries.

Metropolitan Area Network (MAN)

5 km to 50 km; between cities and within a city.

Local Area Network (LAN)

Within 1 km; within a district, a household, or a company.

2) By network user

Public network

Any network that anyone can join by paying a fee.

Private network

A dedicated network set up by a department, organization, or individual to meet specific business needs. For example, the military, railways, and banks all have their own private networks.

2. The Evolution of the Internet

1. Evolution of the global Internet

1) Single network

ARPANET, a network created by the U.S. Department of Defense in 1969 to connect surrounding computers.

Computers can exchange information directly through a switch.

2) Three-level structure

The prototype of the modern Internet. It connected universities, research institutes, and laboratories across the United States.

From top to bottom, it is composed of a backbone network, regional networks, and campus networks.

3) Multi-level ISP

Internet Service Provider (ISP): a company that provides Internet access, such as China Telecom, China Mobile, and China Unicom. From top to bottom, the structure consists of backbone ISPs and regional ISPs.

Backbone ISP

China’s backbone ISPs include China Telecom, China Mobile, and China Unicom, which connect to backbone ISPs in the US and other countries.

Regional ISP

For example, China Mobile operates as Beijing Mobile in Beijing and Shanghai Mobile in Shanghai; these are regional ISPs. Regional ISPs connect company, campus, and home networks.

4) Understand the backbone lines of the modern Internet

We can explore the backbone lines of the Internet on Infrapedia.

As you can see, China’s backbone exits are mostly located in coastal areas such as Guangdong and Fujian. They connect to other backbone networks around the world through undersea cables to form the Internet.

2. History of Internet Development in China

1) 1980

China’s Ministry of Railways begins computer networking experiments.

2) 1989

China’s first public network goes into operation.

3) 1994

China officially connects to the Internet.

4) Today

The five largest public computer networks in China today:

  • China Telecom Internet (CHINANET)
  • China Unicom Internet (UNINET)
  • China Mobile Internet (CMNET)
  • China Education and Research Network (CERNET)
  • China Science and Technology Network (CSTNET)

3. Chinese Internet companies

  • Zhang Chaoyang founded Sohu in 1996.
  • Ding Lei founded NetEase in 1997.
  • In 1998, Wang Zhidong founded Sina, and Ma Huateng and Zhang Zhidong founded Tencent.
  • Jack Ma founded Alibaba in 1999.
  • Robin Li founded Baidu in 2000.

Third, revisit network layering

Why should the network be layered?

Because complex systems are usually layered. This is a general architectural design question, not one specific to network protocols: whenever the logic is complex or the requirements change frequently, layering is the usual answer.

Consider 🤔: What problems must be solved when designing a computer network?

  • 1) A clear data path must be ensured during transmission.
  • 2) The target computer must be identified.
  • 3) The state of the target computer must be known.
  • 4) It must be possible to tell whether the data has been corrupted.

In short, a computer network has many complex problems to solve, so we use a layered design in which different layers solve different problems and implement different functions.

1. Basic principles of layered design

1) Mutual independence

Each layer implements only one relatively independent function.

2) Flexibility

Each layer should be designed to be flexible and extensible so that it can adapt to future changes in the network.

3) Low coupling

Layers are decoupled from each other as far as possible, so that a change inside one layer does not affect the others.

2. OSI seven-layer model

| OSI layer | Function |
| --- | --- |
| Application layer | Provides interfaces and services to computer users. |
| Presentation layer | Data processing: encoding and decoding, encryption and decryption, etc. |
| Session layer | Manages (establishes, maintains, reconnects) communication sessions. |
| Transport layer | Manages end-to-end communication connections. |
| Network layer | Data routing: determines the path of data across the network. |
| Data link layer | Manages data communication between adjacent nodes. |
| Physical layer | Defines the optical and electrical physical characteristics of data communication. |

1) The sad story of OSI

  • 1) From the beginning, OSI was intended to be the standard for computer networks around the world.
  • 2) However, OSI ran into difficulties in the market because TCP/IP was already running successfully worldwide.
  • 3) In the end, OSI never became the widely used standard model.

2) Reasons for the failure of the OSI seven-layer model

  • 1) The OSI experts did not fully combine theory with practice.
  • 2) The OSI standards took too long to produce, so equipment built to the OSI standards could not reach the market in time.
  • 3) The design of the OSI model is not entirely reasonable; some functions are repeated in multiple layers.

3. TCP/IP four-layer model

We need to understand the conversion of protocols between different devices during data communication. As can be seen from the following figure, the router only includes the network layer and the network interface layer.

In terms of the number of protocols, the TCP/IP four-layer model forms a ⏳ hourglass shape with a narrow middle and large ends. As shown in the figure below:

Fourth, get to know modern network topology

1. Why do you need to know the network topology?

Because it helps us form a mental picture of the computer network.

2. Network topology classification

1) Edge part

Home

It consists of terminal machines, routers, gateways, and regional ISPs.

Enterprise

Unlike the home network topology, the gateways are divided into internal gateways and a unified gateway.

2) Core part

Consists of regional ISPs, backbone ISPs, routers, and submarine or cross-regional cables. The communication equipment (usually from Huawei) is mainly deployed by carriers such as China Mobile and China Unicom.

The network topology of the modern Internet forms a tree structure.

3) C/S mode

Consists of clients and servers that communicate with each other.

4) P2P mode

There is no distinction between server and client; nodes connect to each other as peers. The advantage is faster downloads. Xunlei (Thunder) downloads, for example, use this mode.

5. Network performance indicators

1. Rate

bps <==> bit/s

The units of network data rates and the common equipment they correspond to

Why does a 100M China Telecom fiber line reach a measured peak speed of only about 12 MB/s?

Network rates are usually given in Mbps, so the 100M here means 100 Mbps.

100M = 100 Mbps = 100 Mbit/s
100 Mbit/s = (100 / 8) MB/s = 12.5 MB/s
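The bit-to-byte conversion above can be written as a one-line function. This is only a minimal sketch; the function name is made up for illustration.

```python
def mbps_to_megabytes_per_second(mbps: float) -> float:
    """Convert a line rate in megabits per second to megabytes per second."""
    bits_per_byte = 8
    return mbps / bits_per_byte


print(mbps_to_megabytes_per_second(100))  # 12.5
```

This is why a "100M" line shows roughly 12 MB/s in a download manager: the carrier quotes bits, the download tool reports bytes.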

2. Delay

1) Transmission delay

Transmission delay = data length (bit) / transmission rate (bit/s)

The data length is determined by the user, and the transmission rate is determined by the computer’s network card.

2) Propagation delay

Propagation delay = path distance (m) / propagation speed (m/s)

The path distance is determined by where the two endpoints are, while the propagation speed is limited by the transmission medium.

3) Queuing delay

The time a packet waits in a network device before it is processed. For example, a router must finish processing the packets ahead in the queue before it can process the next one.

4) Processing delay

The time a network device or the destination machine takes to process a packet after it arrives.

Total delay = transmission delay + queuing delay + propagation delay + processing delay
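As a rough worked example of the total-delay formula, the sketch below adds up the four components. The numbers (a 1 MB payload, a 100 Mbit/s link, 1000 km of fiber, and a propagation speed of about 2 × 10^8 m/s) are assumptions chosen for illustration, not measurements.

```python
def transmission_delay(data_bits: float, rate_bps: float) -> float:
    """Time needed to push all bits onto the link, in seconds."""
    return data_bits / rate_bps


def propagation_delay(distance_m: float, speed_mps: float) -> float:
    """Time the signal needs to travel along the medium, in seconds."""
    return distance_m / speed_mps


def total_delay(data_bits, rate_bps, distance_m, speed_mps,
                queuing_s=0.0, processing_s=0.0) -> float:
    """Sum of transmission, queuing, propagation and processing delay."""
    return (transmission_delay(data_bits, rate_bps)
            + queuing_s
            + propagation_delay(distance_m, speed_mps)
            + processing_s)


# Assumed scenario: 1 MB (8,000,000 bits) over 100 Mbit/s and 1000 km of fiber.
delay = total_delay(8_000_000, 100_000_000, 1_000_000, 200_000_000)
print(f"{delay * 1000:.1f} ms")  # 85.0 ms
```

In this scenario the transmission delay (80 ms) dominates; for small packets over long distances the propagation delay dominates instead.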

3. Round-Trip Time (RTT)

  • An important indicator for evaluating network quality.
  • The time it takes a data packet to travel to the other end and back during end-to-end communication.

You can run the ping command to view the RTT

1) Ping IP addresses in the current city

quchao@quchaodeMacBook-Pro ~ % ping 119.29.148.149
PING 119.29.148.149 (119.29.148.149): 56 data bytes
64 bytes from 119.29.148.149: icmp_seq=0 ttl=116 time=13.210 ms
64 bytes from 119.29.148.149: icmp_seq=1 ttl=116 time=19.118 ms
64 bytes from 119.29.148.149: icmp_seq=2 ttl=116 time=34.384 ms

2) Ping an IP address in the USA

quchao@quchaodeMacBook-Pro ~ % ping 191.101.238.160
PING 191.101.238.160 (191.101.238.160): 56 data bytes
64 bytes from 191.101.238.160: icmp_seq=0 ttl=52 time=191.791 ms
64 bytes from 191.101.238.160: icmp_seq=1 ttl=52 time=180.278 ms
64 bytes from 191.101.238.160: icmp_seq=2 ttl=52 time=186.399 ms
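To compare RTTs without reading them by eye, the `time=` values can be pulled out of the ping output. The sketch below parses the three local-city replies shown earlier; the helper name is hypothetical, and the regular expression assumes the macOS-style output format used in this article.

```python
import re

PING_OUTPUT = """\
64 bytes from 119.29.148.149: icmp_seq=0 ttl=116 time=13.210 ms
64 bytes from 119.29.148.149: icmp_seq=1 ttl=116 time=19.118 ms
64 bytes from 119.29.148.149: icmp_seq=2 ttl=116 time=34.384 ms
"""


def rtt_samples_ms(ping_output: str) -> list[float]:
    """Extract every 'time=<n> ms' value from ping output."""
    return [float(value) for value in re.findall(r"time=([\d.]+) ms", ping_output)]


samples = rtt_samples_ms(PING_OUTPUT)
average = sum(samples) / len(samples)
print(f"min={min(samples)} ms, avg={average:.3f} ms, max={max(samples)} ms")
```

The same function applied to the USA replies gives values roughly ten times larger, which is mostly propagation delay.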

6. Application layer

The transport layer and the layers below it already provide a complete communication service. The application layer is the layer that faces the user. It mainly defines the communication rules between applications, such as the packet types (request packet, response packet), the packet syntax and format, when data is sent, and the rules that application processes follow.

1. Domain Name System (DNS) Service

The “domain” part corresponds to the network number, and the “name” part corresponds to the host name.

1) Function

Maps hard-to-remember dotted-decimal IP addresses to human-readable domain names.

2) Domain name

  • A domain name is easier to remember than an IP address.
  • Domain names are translated into IP addresses by the DNS service.
  • Domain names are made up of dots, letters, and numbers.
  • The dots separate the different domain levels.
  • Domain names are divided into top-level, second-level, and third-level domains. For example, www.taobao.com => third-level domain . second-level domain . top-level domain.
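The right-to-left reading of domain levels can be illustrated with a small helper. This is only a sketch with a made-up function name; it splits on dots and does not consult any DNS server.

```python
def domain_levels(domain: str) -> dict[str, str]:
    """Label the parts of a domain name, reading from right to left."""
    labels = domain.rstrip(".").split(".")
    names = ["top-level", "second-level", "third-level"]
    # The last label is the top-level domain, the one before it the second level, etc.
    return {name: label for name, label in zip(names, reversed(labels))}


print(domain_levels("www.taobao.com"))
# {'top-level': 'com', 'second-level': 'taobao', 'third-level': 'www'}
```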

Common categories of top-level domains

  • Country
    • cn
    • us
    • uk
    • ca
  • Generic
    • com
    • net
    • gov
    • org

Second-level domain

For example: qq, aliyun, taobao, google, facebook, and so on.

The top-level, second-level, and third-level domains form a tree structure. Above the top-level DNS servers there are also root DNS servers.

3) Domain name server

Anyone with a server that is reachable from the public network can set up a domain name server.

2. Dynamic Host Configuration Protocol (DHCP)

1) What is it?

The network administrator only needs to configure a pool of shared IP addresses. Each newly connected machine applies for an address from this pool through DHCP and is then configured automatically, without manual work by the administrator. When a machine no longer needs its address, the address is returned to the pool for other machines to use. Its features are as follows:

  • 1) DHCP is a LAN protocol.
  • 2) DHCP is an application-layer protocol that uses UDP.

2) Function

  • Plug-and-play networking.
  • On the IP configuration page, select “Obtain an IP address automatically” and “Obtain DNS server address automatically” to have DHCP assign a temporary IP address (usually an intranet address).
  • Addresses are leased, and the lease can be renewed halfway through.

3) Process of DHCP

  • 1) The DHCP server listens on port 67 by default.
  • 2) The host broadcasts a DHCP discover packet using UDP.
  • 3) The DHCP server replies with a DHCP offer packet.
  • 4) The host sends a DHCP request packet to the DHCP server.
  • 5) The DHCP server responds with a DHCP ACK and provides the IP address.

4) IP addresses leased through DHCP have a lease period. How is the lease renewed?

When 50% of the lease period has elapsed, the client sends a DHCP request directly to the DHCP server that provided its IP address. After receiving the DHCP ACK message from the server, the client updates its configuration based on the new lease and any other updated TCP/IP parameters carried in the response.
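The lease lifecycle described above (lease an address, renew at the halfway point, return the address to the pool) can be modelled with a toy in-memory pool. This is an illustrative sketch only: the class, its methods, and the addresses are invented for the example, and it does not speak the real DHCP wire protocol.

```python
class DhcpPool:
    """Toy model of a DHCP address pool; not a real DHCP implementation."""

    def __init__(self, addresses, lease_seconds):
        self.free = list(addresses)
        self.lease_seconds = lease_seconds
        self.leases = {}  # client id -> (ip, expiry time)

    def request(self, client, now):
        """Lease the first free address (discover/offer/request/ack collapsed)."""
        ip = self.free.pop(0)
        self.leases[client] = (ip, now + self.lease_seconds)
        return ip

    def renew_at(self, client):
        """Time at which the client should renew: 50% of the lease period."""
        _, expiry = self.leases[client]
        return expiry - self.lease_seconds / 2

    def renew(self, client, now):
        """Extend the lease for a full period starting from `now`."""
        ip, _ = self.leases[client]
        self.leases[client] = (ip, now + self.lease_seconds)

    def release(self, client):
        """Return the client's address to the pool for other machines."""
        ip, _ = self.leases.pop(client)
        self.free.append(ip)


pool = DhcpPool(["192.168.1.100", "192.168.1.101"], lease_seconds=3600)
ip = pool.request("phone", now=0)
print(ip, pool.renew_at("phone"))  # 192.168.1.100 1800.0
```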

3. HyperText Transfer Protocol (HTTP)

1) What is it?

  • Hypertext is text displayed on a computer that contains links to other text.
  • Every resource has a uniform locator of the form http(s)://<host>:<port>/<path>.
  • HTTP runs on top of TCP, so it is a reliable data transfer protocol.

2) Web server

Divided into a hardware part (a physical computer or a virtual machine in the cloud) and a software part (Nginx, Apache).

Process

  • 1) Accept the client connection
  • 2) Receive the request packet
  • 3) Process the request
  • 4) Access the web resource
  • 5) Build the response
  • 6) Send the response

3) HTTP request method

| Method | Description |
| --- | --- |
| GET | Gets the specified server resource. |
| POST | Submits data to the server. |
| DELETE | Deletes the specified server resource. (Rarely used) |
| PATCH | Partially updates the specified server resource. |
| PUT | Modifies (replaces) data. |
| OPTIONS | Lists the request methods that can be applied to a resource; used for cross-origin requests. |
| CONNECT | Establishes a connection tunnel for a proxy server. |
| HEAD | Gets metadata about a resource. |
| TRACE | Traces the transmission path of the request and response. |

Difference between GET and POST

  • 1) GET parameters are passed in the URL; POST parameters are placed in the request body.
  • 2) Because GET parameters travel in the URL, their length is limited (by browsers and servers); POST has no such limit.
  • 3) GET is less secure than POST because its parameters are exposed directly in the URL, so it should not be used to pass sensitive information.
  • 4) GET parameters can only be URL-encoded, while POST supports a variety of encodings.
  • 5) GET requests are actively cached by the browser, while POST requests are not cached by default.
  • 6) GET request parameters are preserved in full in the browsing history, while POST parameters are not.
  • 7) Underneath, GET and POST both run over a TCP connection and are no different. They differ in practice because of the HTTP specification and browser/server restrictions.
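The first difference in the list (parameters in the URL versus in the request body) can be seen by building both kinds of request with the Python standard library. The sketch below only constructs the request objects and sends nothing over the network; the parameter names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"sort": "0", "keyword": "android"}  # hypothetical parameters

# GET: parameters are URL-encoded and appended to the URL after "?".
get_request = Request("https://www.wanandroid.com/?" + urlencode(params))

# POST: the same parameters travel in the request body instead.
post_request = Request(
    "https://www.wanandroid.com/",
    data=urlencode(params).encode("utf-8"),
)

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.data)
```

`urllib` chooses POST automatically when a body is supplied, which mirrors the convention that GET requests carry no body.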

4) How HTTP specifies resources

1) Specified in the address

www.wanandroid.com/repo/100.ht…

repo/100.html is the requested resource.

www.wanandroid.com/?sort=0&unl…

The part after ? specifies the request parameters.

2) Specified in the request data

5) HTTP request packet

HTTP request packets and response packets meet the following structure:

Start line + header + blank line + entity

The blank line separates the header from the entity.

The format of an HTTP request packet is as follows:

For example:

POST https://www.wanandroid.com HTTP/1.1
Accept-Encoding: gzip
Accept-Language: zh-CN
...

{requested jsonString content}
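The "start line + header + blank line + entity" structure can be made concrete with a small parser. This is a simplified sketch, not a complete HTTP parser: the function name is invented, the sample request uses a `Host` header and a made-up JSON body, and it assumes `\r\n` line endings.

```python
def parse_http_message(raw: str):
    """Split a raw HTTP message into start line, headers and entity."""
    head, _, entity = raw.partition("\r\n\r\n")  # the blank line is the separator
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return start_line, headers, entity


raw_request = (
    "POST / HTTP/1.1\r\n"
    "Host: www.wanandroid.com\r\n"
    "Accept-Encoding: gzip\r\n"
    "Accept-Language: zh-CN\r\n"
    "\r\n"
    '{"key": "value"}'
)

start_line, headers, entity = parse_http_message(raw_request)
print(start_line)                  # POST / HTTP/1.1
print(headers["Accept-Encoding"])  # gzip
print(entity)                      # {"key": "value"}
```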

6) HTTP response packet

7) HTTP response status codes

| Status code | Meaning |
| --- | --- |
| 100 ~ 199 | Intermediate state of protocol processing; further action is required. |
| 200 ~ 299 | Success |
| 300 ~ 399 | Redirection |
| 400 ~ 499 | Client error |
| 500 ~ 599 | Server error |

100 ~ 199

  • 101: Switching Protocols; sent when the server agrees to upgrade from HTTP to WebSocket.

200 ~ 299

  • 200: OK; the data is placed in the response body.
  • 204: No Content; there is no body data after the response headers.
  • 206: Partial Content; typically used for chunked HTTP downloads and resumable transfers, together with the Content-Range response header.

300 ~ 399

  • 301: Moved Permanently; permanent redirect.
  • 302: Found; temporary redirect.
  • 304: Not Modified; returned when the negotiated cache is hit.

400 ~ 499

  • 400: Bad Request; the request is malformed.
  • 403: Forbidden; the server refuses access, for example because of legal restrictions or sensitive information.
  • 404: Not Found; the resource was not found.
  • 405: Method Not Allowed; the request method is not allowed.
  • 406: Not Acceptable; the resource cannot satisfy the conditions of the client’s request.
  • 408: Request Timeout; the request timed out.
  • 409: Conflict; multiple requests conflict with each other.
  • 413: Request Entity Too Large; the data in the request body is too large.
  • 414: Request-URI Too Long; the URI in the request line is too long.
  • 429: Too Many Requests; the client has sent too many requests.
  • 431: Request Header Fields Too Large; the request header fields are too large.

500 ~ 599

  • 500: Internal Server Error; the server encountered an error.
  • 501: Not Implemented; the requested functionality is not supported.
  • 502: Bad Gateway; the server itself is working, but the upstream data channel is faulty.
  • 503: Service Unavailable; the server is busy and cannot respond for now.
8) Working structure of HTTP

Web caching

Web caching usually follows the 80-20 principle: about 20% of a website’s content is popular and the other 80% is rarely requested. The popular content should therefore be cached first.

Memory hierarchy

Cache (CPU cache) / main storage (memory) / secondary storage (disk)

Web proxy

  • A web proxy hides the server deployment structure.
  • Rules, such as firewall rules, can be set up in the web proxy to ensure security.

Classification of web proxies

  • 1) Forward proxy: acts on behalf of the client when accessing the server.
  • 2) Reverse proxy: acts on behalf of the server when returning data to the client. Nginx and HAProxy are well-known proxy software.

Content Delivery Network (CDN)

  • Keeps copies of large content on servers near the user.
  • A CDN can be used to accelerate multimedia content.

The basic principle of a CDN is to deploy many cache servers in the regions or networks where user access is concentrated. When a user visits a website, global load balancing directs the request to the nearest cache server that is working normally, and that cache server responds directly.

Crawlers

Crawlers collect information from the Internet. Search engines such as Baidu and Google are essentially crawlers: they fetch data from across the whole web, build an index, and then serve that content to everyone.

Drawbacks of badly behaved crawlers:
  • They increase network congestion.
  • They consume server resources.

4. HTTPS (secure HTTP)

https://<host>:443/<path>

HTTP is transmitted in plain text, yet we need to send accounts, passwords, personal information, account balances, transaction details, and other sensitive information over the network. A man in the middle can intercept this traffic, which leads to information leakage.

1) Encryption model

  • Symmetric encryption: the same key is used for both encryption and decryption.
  • Asymmetric encryption: data is encrypted with the public key and decrypted with the private key. The public key and private key form a key pair with a specific mathematical relationship.
    • Private key: kept by the owner and never made public.
    • Public key: open to everyone.

2) Digital certificate signature verification

A digital certificate is a credential issued to a specific subject by a trusted organization. A trusted organization is one that both the client and the server trust.

Digital certificate format

  • Certificate format and version
  • Certificate serial number
  • Signature algorithm
  • Validity period
  • Subject name
  • The subject’s public key

3) Secure Sockets Layer (SSL)

SSL is a sublayer between the transport layer and the application layer. It serves two main purposes:

  • 1) Data confidentiality (ensuring the data is not leaked) and data integrity (ensuring the data is not tampered with).
  • 2) Encrypting the data before transmission.

HTTPS communication process

  • 1) Establish a TCP connection to port 443.
  • 2) Perform the SSL handshake to negotiate security parameters.
  • 3) The client sends data.
  • 4) The server sends data.

Secure Sockets Layer (SSL) handshake process

1) Random numbers 1, 2, and 3 are generated and exchanged

2) From random numbers 1, 2, and 3, both ends use the same algorithm to generate the symmetric key used for encrypted communication

HTTPS combines symmetric and asymmetric encryption. While the random numbers are being exchanged, asymmetric encryption protects the communication. Once both parties hold the three random numbers, they use the same algorithm to generate the symmetric key for encrypted communication. The advantage is that the symmetric key is generated independently on both ends and is never transmitted, which reduces the chance of key leakage.
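The idea that both ends derive the same symmetric key from three shared random numbers can be sketched as follows. This is a deliberately simplified stand-in: real SSL/TLS uses its own key derivation function, and the HMAC-SHA256 construction, the function name, and the byte lengths here are assumptions made for illustration.

```python
import hashlib
import hmac
import os


def derive_session_key(random1: bytes, random2: bytes, random3: bytes) -> bytes:
    """Derive a 32-byte symmetric key from three random values.

    Simplified stand-in for the real SSL/TLS key derivation function.
    """
    return hmac.new(random3, random1 + random2, hashlib.sha256).digest()


random1 = os.urandom(32)  # random number 1, generated by the client
random2 = os.urandom(32)  # random number 2, generated by the server
random3 = os.urandom(48)  # random number 3, exchanged under asymmetric encryption

# Each side runs the same algorithm on the same three inputs.
client_key = derive_session_key(random1, random2, random3)
server_key = derive_session_key(random1, random2, random3)
print(client_key == server_key)  # True
```

Only the random numbers travel over the network; the resulting key never does, which is the property the paragraph above describes.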

5. HTTP/2

Its features are as follows:

  • 1) Header compression.
  • 2) Multiplexing: multiple request-response messages can be in flight at the same time over a single HTTP/2 connection. This improves on HTTP/1.1, where the browser limits the number of simultaneous requests (connections) to the same domain and blocks any requests beyond that limit.
  • 3) Faster access: requesting resources takes less time than with HTTP/1.1.
  • 4) Binary framing: HTTP/2 splits all transmitted information into smaller messages and frames and encodes them in binary.
  • 5) Request prioritization.
  • 6) Server push.

6. Cookie

HTTP is a stateless protocol, so the main use of cookies is to store a session ID that uniquely identifies the user. A cookie is essentially a small text file stored in the browser, holding data as key-value pairs.

Life cycle

Set through the Expires and Max-Age attributes:

  • Expires: the expiration time.
  • Max-Age: a duration in seconds, counted from the moment the browser receives the response.

Scope

The Domain and Path attributes bind a cookie to a domain name and path. If the domain name or path of a request does not match these two attributes, the cookie is not sent with the request. Note that a Path of / means the cookie is allowed on every path under the domain.

Security

  • Secure: the cookie is transmitted only over HTTPS.
  • HttpOnly: the cookie is available only to HTTP transmission and cannot be read by JavaScript.
  • SameSite: helps prevent CSRF attacks.
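The life cycle, scope, and security attributes all end up in a single `Set-Cookie` response header. The sketch below builds one with Python's standard `http.cookies` module; the cookie name, value, and domain are hypothetical.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["sessionid"] = "abc123"                 # hypothetical session ID
cookie["sessionid"]["max-age"] = 3600          # life cycle: one hour
cookie["sessionid"]["domain"] = "example.com"  # scope: domain
cookie["sessionid"]["path"] = "/"              # scope: every path
cookie["sessionid"]["secure"] = True           # HTTPS only
cookie["sessionid"]["httponly"] = True         # hidden from JavaScript
cookie["sessionid"]["samesite"] = "Strict"     # CSRF mitigation

header = cookie.output()
print(header)
```

The printed line starts with `Set-Cookie: sessionid=abc123` followed by the attributes, which is what a server would send to the browser.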

Disadvantages

  • 1) Security flaws: cookies are easily intercepted by attackers, who can tamper with them and resend them to the server while the cookie is still valid.
  • 2) Capacity limits: a cookie can hold at most about 4 KB, so it can store only a small amount of information.
  • 3) Performance cost: cookies follow the domain name, so every request under that domain carries the complete set of cookies. As the number of requests grows, this wastes a lot of bandwidth on unnecessary content. Restricting the scope with Domain and Path mitigates the problem.

7. Common problems in HTTP transport

  • 1) Cross-origin problems
  • 2) Data transmission
  • 3) Head-of-line blocking

7. Transport layer

When device A and device B communicate, we can now treat them as connected through a virtual interconnected network. The problems of network topology and data routing have already been solved inside that virtual network. The transport layer focuses on how the two devices communicate with each other directly.

1. Main functions of the transport layer

1) Process-to-process communication

Unlike interprocess communication (Unix domain sockets, shared memory) used within a single operating system, network communication can communicate across devices and across networks.

2) The concept of ports

  • Use ports to mark different network processes.
  • Ports are 16 bits (0 to 65535).

Common protocol ports are as follows:

| Protocol | Port |
| --- | --- |
| FTP | 21 |
| HTTP | 80 |
| HTTPS | 443 |
| DNS | 53 |
| TELNET | 23 |

2. User Datagram Protocol (UDP)

1) Function

UDP does not process datagrams; that is, it neither merges nor splits the data.

2) Features

1) Connectionless

No connection needs to be established before communicating.

2) Reliable delivery is not guaranteed

Data is sent whenever the application wants to send it, and there is no guarantee that it will not be lost in transit.

3) Message-oriented

UDP does no processing on the data; the application-layer data is placed directly into the datagram.

4) No congestion control

It sends data whether or not the network is congested.

5) Small header overhead

The header takes only 8 bytes.

3) Datagram structure

  • The minimum UDP length is 8 bytes, that is, a datagram containing only the UDP header.
  • The checksum is used to detect whether a UDP datagram was corrupted in transit.
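The 8-byte header can be reproduced with `struct`. The sketch below packs the four 16-bit header fields (source port, destination port, length, checksum) in network byte order; the port numbers and payload are arbitrary, and the checksum is left at 0 rather than computed, which UDP over IPv4 permits.

```python
import struct

# Source port, destination port, length, checksum: four unsigned 16-bit fields.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend a UDP header to the payload without touching the payload itself."""
    length = UDP_HEADER.size + len(payload)  # header (8 bytes) + data
    checksum = 0  # 0 means "no checksum computed" for UDP over IPv4
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


datagram = build_udp_datagram(50000, 53, b"hello")
src, dst, length, checksum = UDP_HEADER.unpack(datagram[:8])
print(UDP_HEADER.size, src, dst, length)  # 8 50000 53 13
```

An empty payload gives a length of exactly 8, the minimum mentioned above.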

4) Five examples of UDP-based customization 🌰

1. Web page or app access

Today HTTP usually lets multiple data channels share one connection, which is meant to speed up transmission. But TCP enforces strict ordering: if an earlier packet on the shared connection has not arrived, later packets must wait even when they have nothing to do with it, and this increases latency.

QUIC (Quick UDP Internet Connections) is an improved communication protocol proposed by Google and built on UDP. Its purpose is to reduce the latency of network communication and provide a better interactive experience.

QUIC is representative of application-layer customization: it implements fast connection establishment, reduced retransmission delay, and adaptive congestion control at the application layer.

2. Streaming protocols

Live streaming usually uses RTMP (Real-Time Messaging Protocol), which is based on TCP. For live broadcasting, however, timeliness matters more, so it is better to lose packets than to lose time.

In video playback, some packets can be dropped and some cannot, because among the consecutive frames of a video some are important and some are not. If frames must be dropped, dropping one every so often goes unnoticed by viewers, whereas dropping several frames in a row is noticeable. So when the network is poor, applications tend to drop frames selectively.

When the network is bad, TCP actively slows down transmission, which only makes things worse for a video that is already struggling. In this situation the application layer should be able to retransmit immediately rather than back off. Therefore, many live streaming applications implement their own video transmission protocols on top of UDP.

3. Real-time gaming

Maintaining TCP connections requires the kernel to keep data structures for each one, and there is a limit to the number of TCP connections a machine can support. Because UDP is connectionless, it was often the strategy for handling massive numbers of client connections before asynchronous I/O was introduced.

For games with strict real-time requirements, packets can be sent over a custom reliable UDP protocol. A custom retransmission strategy minimizes the delay caused by packet loss and reduces the impact of network problems on the game.

4. Internet of Things

Nest, owned by Google, founded the Thread Group, which promotes Thread, a protocol for the Internet of Things that is based on UDP.

5. Mobile communications

In 4G networks, GTP-U, the protocol that carries user data through the mobile network, is based on UDP. We will talk about mobile networks in the next article.

3. A first look at the Transmission Control Protocol (TCP)

1. Detailed description of TCP packets

1) Features

  • 1) Connection-oriented: like making a phone call, where you must dial before you can talk.
  • 2) Point-to-point communication.
  • 3) Reliable transmission service.
  • 4) Full-duplex communication: once two devices are connected, both can send and receive data at the same time.
  • 5) Byte-stream-oriented: TCP handles data byte by byte, so it may take one segment of the data for transmission and put the rest into the second and subsequent TCP packets. In other words, TCP may merge or split user data.

The downside of TCP is that it is slow because it requires establishing connections, sending acknowledgement packets, and so on.

2) Packet header fields

Sequence number
  • Range: 0 to 2^32 - 1.
  • Because TCP is byte-stream-oriented, every byte has a corresponding sequence number.
  • The sequence number of a TCP segment is the sequence number of the first byte in that segment.
Acknowledgment number
  • Range: 0 to 2^32 - 1.
  • The sequence number of the first byte the receiver expects next. If the acknowledgment number is S, all data up to and including sequence number S-1 has been received.
Data offset
  • 4 bits, ranging from 0 to 15. The unit is a 32-bit word. => The header ranges from 20 to 60 bytes.
  • The offset of the TCP data from the start of the segment, i.e. the header length. It is required because the size of the TCP options varies.
TCP flags

There are six flag bits, each with a different meaning.

| Flag | Meaning |
| --- | --- |
| URG (Urgent) | Urgent bit. URG = 1 indicates that the segment carries urgent data. |
| ACK (Acknowledgement) | Acknowledgment bit. ACK = 1 means the acknowledgment number is valid. |
| PSH (Push) | Push bit. PSH = 1 indicates that the data should be delivered to the application layer as soon as possible. |
| RST (Reset) | Reset bit. RST = 1 means the connection must be reset and re-established. |
| SYN (Synchronization) | Synchronization bit. SYN = 1 indicates a connection request segment. |
| FIN (Finish) | Finish bit. FIN = 1 indicates that the connection is to be released. |

Window
  • 16 bits: 0 to 2^16 - 1.
  • The window is the amount of data the other side is allowed to send. For example, if the acknowledgment number is 201 and the window is 300, sequence numbers 201 to 500 can be received.
Checksum

As in UDP, it is used to detect whether an error occurred while the TCP data was in transit.

Urgent pointer
  • Used for urgent data (URG = 1).
  • Specifies the location of the urgent data within the segment.
TCP options
  • Up to 40 bytes.
  • Support for future extensions.
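The fixed 20-byte part of the header can be decoded with `struct` to see how the data offset and the six flag bits share one 16-bit field. This is a minimal sketch: the function name and the sample SYN segment (ports, sequence number, window) are made up, and options are not parsed.

```python
import struct

# Flag bits from least significant (bit 0) to bit 5.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header."""
    (src, dst, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    data_offset = offset_flags >> 12  # upper 4 bits, counted in 32-bit words
    flags = [name for bit, name in enumerate(FLAG_NAMES)
             if offset_flags & (1 << bit)]
    return {
        "src_port": src, "dst_port": dst,
        "seq": seq, "ack": ack,
        "header_bytes": data_offset * 4,
        "flags": flags,
        "window": window,
    }


# Hypothetical SYN segment: data offset 5 (20 bytes, no options), SYN bit set.
syn = struct.pack("!HHIIHHHH", 50000, 443, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
print(parse_tcp_header(syn))
```

A data offset of 5 gives the minimum 20-byte header; the maximum value 15 gives 60 bytes, leaving 40 bytes for options.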

4. Basic principles of reliable transmission

1) Stop waiting for the agreement

When the sender sends a message, the receiver receives and sends the acknowledgement to the sender. The sender needs to stop waiting for the acknowledgement from the receiver.

Timeout retransmission

If the sender does not receive the acknowledgement message from the recipient within the timeout period after the message is sent or the message is received after the timeout period, the message is resended to the sender. Timeout retransmissions typically handle three exception cases, as shown below:

  • 1) The message was lost on the way.
  • 2) The confirmation message is lost on the way.
  • 3) The confirmation message has timed out.

Timeout timer (timeout retransmission timer)

  • 1) A timeout timer is set each time a message is sent.
  • 2) It is mainly used in the reliable transport protocol TCP to deal with packet loss: when the TCP sender sends a packet, it sets a timeout timer for that packet.
  • 3) If the acknowledgement of the packet arrives from the receiver before the timer expires, the timer is cancelled.
  • 4) If no acknowledgement arrives before the timer expires (a timeout), the sender assumes that the packet may have been lost, resends it, and sets a new timeout timer.
  • 5) Note that until the timer is cancelled, the sender must keep the sent but unacknowledged packet in its cache, that is, until it receives the acknowledgement from the receiver.
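
The interplay between stop-and-wait and the timeout timer can be sketched as a small simulation. This is only an illustrative model: `stop_and_wait` and `lost_attempts` are hypothetical names, and a "lost attempt" stands for any of the cases in which no acknowledgement arrives before the timer expires.

```python
def stop_and_wait(messages, lost_attempts):
    """Simulate stop-and-wait with timeout retransmission.

    lost_attempts is a set of (message index, attempt number) pairs for which
    no acknowledgement arrives in time, so the timeout timer expires.
    """
    log = []
    for index, message in enumerate(messages):
        attempt = 1
        while True:
            # The message stays cached (here: in `message`) until it is acknowledged.
            log.append(f"send {message} (attempt {attempt})")
            if (index, attempt) in lost_attempts:
                log.append(f"timeout waiting for ACK of {message}, retransmit")
                attempt += 1
                continue
            log.append(f"ACK received for {message}")
            break
    return log


# The first transmission of the second message gets no acknowledgement.
log = stop_and_wait(["M1", "M2", "M3"], lost_attempts={(1, 1)})
```

The sender never moves on to the next message until the current one is acknowledged, which is what makes the protocol simple but slow.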

Characteristics

  • 1) The stop-and-wait protocol is the simplest reliable transport protocol.
  • 2) Its channel utilization is low.
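
A rough calculation shows why the utilization is low. The sketch below assumes that the sender is idle for one round-trip time after each frame and ignores the time needed to transmit the acknowledgement; the frame size, bandwidth, and round-trip time are made-up example values.

```python
def stop_and_wait_utilization(frame_bits, bandwidth_bps, rtt_seconds):
    """Fraction of time the sender actually spends transmitting."""
    send_time = frame_bits / bandwidth_bps
    return send_time / (send_time + rtt_seconds)


# Hypothetical link: 1500-byte frames, 100 Mbit/s, 20 ms round-trip time.
utilization = stop_and_wait_utilization(1500 * 8, 100e6, 0.020)
print(f"channel utilization: {utilization:.2%}")
```

With these numbers the sender transmits for 0.12 ms and then waits 20 ms, so the channel is busy well under 1% of the time.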

Since sending and acknowledging one message at a time is inefficient, can we send and acknowledge in batches?

2) Continuous ARQ (Automatic Repeat reQuest) protocol

Continuous ARQ improves on the stop-and-wait protocol and can greatly increase channel utilization.

The sliding window

  • 1) Data in the window can be sent.
  • 2) The window moves forward only when an acknowledgement is received.
  • 3) Cumulative acknowledgement is used, so not every message needs its own acknowledgement.

Cumulative acknowledgement

Receiving the acknowledgement of the fifth message means that the receiver has received all of messages one through five.
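
The effect of a cumulative acknowledgement on the sender's window can be sketched as follows. The function `slide_window` is a hypothetical helper that numbers whole messages rather than bytes, to match the example above.

```python
def slide_window(first, window_size, cumulative_ack):
    """Return the (first, last) message numbers of the window after an ACK.

    cumulative_ack is the highest message number acknowledged; it implies
    that every earlier message has been received as well.
    """
    new_first = max(first, cumulative_ack + 1)  # a stale ACK never moves the window back
    return new_first, new_first + window_size - 1


# The window covers messages 1..5; an ACK for message 5 confirms 1 to 5 at once.
window = slide_window(first=1, window_size=5, cumulative_ack=5)
```

After the acknowledgement of message 5, the window covers messages 6 to 10, so five new messages can be sent without waiting for five separate acknowledgements.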

5. Reliable transmission of TCP protocol

The reliable transmission of TCP is based on the continuous ARQ protocol.

  • 1) Sliding window
  • 2) Cumulative acknowledgement
  • 3) Selective retransmission

Selective retransmission

  • 1) Selective retransmission specifies exactly which bytes need to be retransmitted.
  • 2) Each byte has a unique 32-bit (4-byte) sequence number.
  • 3) The ranges are carried in the TCP options field as pairs of sequence numbers. Because the options field is limited to 40 bytes and the SACK option (RFC 2018) spends 2 of them on its kind and length fields, at most 8 sequence numbers fit, that is, 4 ranges.
  • 4) What is selected for retransmission is a range of the byte stream marked by its boundaries, for example the ranges 1000 ~ 1200 and 2000 ~ 3000.
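
As a sketch of how the ranges to retransmit can be derived, assume the receiver reports the blocks it has already received out of order (which is how the SACK option works). The helper `missing_ranges` is hypothetical and uses half-open byte ranges.

```python
def missing_ranges(cumulative_ack, received_blocks):
    """Return the byte ranges [start, end) that still need to be retransmitted.

    cumulative_ack:  the next byte the receiver expects in order.
    received_blocks: (start, end) ranges the receiver already holds beyond it.
    """
    gaps = []
    expected = cumulative_ack
    for start, end in sorted(received_blocks):
        if start > expected:
            gaps.append((expected, start))  # nothing received between expected and start
        expected = max(expected, end)
    return gaps


# The receiver has everything before byte 1000, plus two out-of-order blocks.
gaps = missing_ranges(1000, [(1200, 2000), (3000, 3500)])
```

Here the sender only needs to retransmit the ranges 1000 ~ 1200 and 2000 ~ 3000 instead of everything after byte 1000.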

6. TCP flow control

Flow control means keeping the sender from sending faster than the receiver can handle. TCP uses the sliding window for flow control.

1) Sliding window

  • RWND = 300: indicates that the window size is 300.
  • The window field has 16 bits: 0 ~ 2^16 - 1.
  • The window indicates how much data the other side is allowed to send. For example, if the acknowledgement number is 201 and the window is 300, bytes 201 to 500 can be received.
  • The receiver can adjust the size of the sliding window to control the rate at which the sender sends data.
  • Suppose the receiver raises RWND from 0 to 1000 and the message carrying this update is lost. The sender keeps waiting for a non-zero window and the receiver keeps waiting for data, so the two sides are deadlocked.
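
The window arithmetic from the example above can be written down directly. This is a minimal sketch; `usable_window` and its parameter names are hypothetical.

```python
def usable_window(ack_number, rwnd, next_seq):
    """Bytes the sender may still send without overrunning the receiver.

    The receiver accepts bytes ack_number .. ack_number + rwnd - 1;
    next_seq is the sequence number of the next byte the sender would send.
    """
    window_end = ack_number + rwnd  # first byte beyond the window
    return max(0, window_end - next_seq)


full = usable_window(ack_number=201, rwnd=300, next_seq=201)     # nothing in flight yet
partial = usable_window(ack_number=201, rwnd=300, next_seq=401)  # bytes 201..400 in flight
closed = usable_window(ack_number=201, rwnd=0, next_seq=201)     # zero window: must wait
```

With an acknowledgement number of 201 and a window of 300, the sender may send up to byte 500; a window of 0 stops it completely.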

How can this deadlock be resolved?

2) Persistence timer

The persistence timer is used when the sliding window performs flow control.

  • 1) The sender starts the persistence timer when it receives a message advertising a window of 0.
  • 2) Each time the persistence timer expires, the sender sends a window probe packet.
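
The probing loop can be sketched as below. It is a simplified model: `probe_until_open` is a hypothetical helper, the probe interval is fixed (real implementations typically back off between probes), and `window_replies` stands for the window sizes reported in response to successive probes.

```python
def probe_until_open(window_replies, interval=1.0):
    """Send a window probe each time the persistence timer fires.

    Returns (elapsed seconds, window size) for the first non-zero reply,
    or (elapsed seconds, 0) if every probe still reports a closed window.
    """
    elapsed = 0.0
    for rwnd in window_replies:
        elapsed += interval  # the persistence timer expires, a probe is sent
        if rwnd > 0:         # the reply advertises an open window again
            return elapsed, rwnd
    return elapsed, 0


# The window update was lost; the third probe finally learns that RWND is 1000.
result = probe_until_open([0, 0, 1000], interval=5.0)
```

Because the sender keeps asking, a lost window update can no longer leave both sides waiting forever.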

7. TCP congestion control

The problem

  • A data link passes through a large number of devices.
  • Any part of the data link may become the bottleneck of network transmission.

Difference between flow control and congestion control

Flow control is concerned with point-to-point traffic between one sender and one receiver, whereas congestion control is a global concern that takes the whole network into account.

How can you tell whether congestion has occurred?

When a packet times out, the sender assumes that congestion has occurred.

TCP congestion control

1) Slow start algorithm
  • Gradually increase the amount of data sent from small to large (exponential growth, for example: 1, 2, 4, 8, 16).
  • The congestion window grows by one for each acknowledgement received.
  • Once the slow start threshold (ssthresh) is reached, exponential growth stops.
2) Congestion avoidance algorithm
  • Maintains a congestion window variable.
  • As long as the network is not congested, try to expand the congestion window.

TCP congestion control uses the slow start algorithm to grow the window exponentially in the early stage. Once the slow start threshold (ssthresh) is reached, it switches to the congestion avoidance algorithm, which grows the window linearly.
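
The two growth phases can be sketched per round trip as follows. This is a textbook-style simplification: the window is counted in segments, it is capped at the threshold when slow start ends, and loss handling is left out. The names `cwnd_growth`, `ssthresh`, and `initial_cwnd` are illustrative.

```python
def cwnd_growth(ssthresh, rounds, initial_cwnd=1):
    """Return the congestion window after each round trip (no losses)."""
    cwnd = initial_cwnd
    history = [cwnd]
    for _ in range(rounds):
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start: doubles every round trip
        else:
            cwnd += 1                       # congestion avoidance: linear growth
        history.append(cwnd)
    return history


history = cwnd_growth(ssthresh=16, rounds=7)
# history == [1, 2, 4, 8, 16, 17, 18, 19]
```

The window doubles until it reaches the threshold of 16 and then grows by one segment per round trip.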

8. TCP connection establishment – three-way handshake

Why does the sender send a third acknowledgement message?

  • 1) To keep a stale connection request from causing an error. Suppose two handshakes were enough: a connection request that was delayed in the network and is no longer valid would be accepted on arrival, and a duplicate connection would be established. With the three-way handshake, the receiver still acknowledges the delayed request, but the sender has already finished the handshake for its real connection, so it ignores this second acknowledgement and never sends the third message; no duplicate connection is established.
  • 2) Channels are unreliable, and TCP wants to build reliable transmission on top of them; three exchanges is the theoretical minimum for doing so. (UDP does not need to establish reliable transmission, so it does not require a three-way handshake.)
  • 3) Both parties need to confirm that the other has received the initial sequence number they sent, and that confirmation takes at least three messages.
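
The sequence and acknowledgement numbers exchanged during the handshake can be sketched like this. The function `three_way_handshake` and the dictionary layout are hypothetical; the initial sequence numbers are arbitrary example values.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as simple dictionaries."""
    syn = {"flags": "SYN", "seq": client_isn}
    # The server confirms the client's sequence number and announces its own.
    syn_ack = {"flags": "SYN+ACK", "seq": server_isn, "ack": syn["seq"] + 1}
    # The client confirms the server's sequence number: the third message.
    ack = {"flags": "ACK", "seq": client_isn + 1, "ack": syn_ack["seq"] + 1}
    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=100, server_isn=300)
```

Only after the third segment does the server know that its own initial sequence number (300 here) was received, which is why two messages are not enough.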

9. TCP connection release – four waves

1) Wait timer

  • Wait 2MSL => 4 minutes.
  • The party that actively closes the TCP connection sets this timer when it sends the fourth wave, so that its final ACK can reach the peer. If that ACK is lost, the peer retransmits its FIN (the third wave) and the active closer is still there to acknowledge it.
  • MSL (Maximum Segment Lifetime): the longest time a segment may exist in the network. The recommended value is 2 minutes.

2) Why wait for 2MSL?

  • 1) The last ACK is itself never acknowledged, so the active closer has to make sure it reaches the peer. If the peer does not receive it in time, it resends its FIN.
  • 2) 2MSL is long enough for the lost ACK to expire (one MSL) and for the peer's retransmitted FIN to arrive (another MSL), so the active closer can receive the FIN and acknowledge it again. This ensures that the connection ends normally.
  • 3) It ensures that all packets belonging to the current connection have expired.
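
The states that the actively closing side passes through can be sketched as a small transition table. The state names follow the usual TCP state diagram; the event strings and the helper `close_actively` are hypothetical, and the MSL value is the 2 minutes quoted above.

```python
MSL_SECONDS = 120                     # 2 minutes
TIME_WAIT_SECONDS = 2 * MSL_SECONDS   # 2MSL = 4 minutes

# (current state, event) -> next state, for the side that closes first
ACTIVE_CLOSE = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",          # first wave
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",           # second wave
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",  # third and fourth waves
    ("TIME_WAIT", "2MSL timer expires"): "CLOSED",
}


def close_actively(events, state="ESTABLISHED"):
    """Apply the events in order and return every state that was visited."""
    states = [state]
    for event in events:
        state = ACTIVE_CLOSE[(state, event)]
        states.append(state)
    return states


states = close_actively(
    ["send FIN", "recv ACK", "recv FIN, send ACK", "2MSL timer expires"]
)
```

The connection is only fully closed after the wait timer has run for 2MSL in the TIME_WAIT state.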

10. Differences between TCP and UDP

  • 1) UDP is usually used to distribute multimedia information such as video, voice, and other real-time data. TCP is usually used where information must be delivered reliably, such as financial transactions, reliable communication, and message queues (MQ).
  • 2) TCP is connection-oriented, while UDP is connectionless.
  • 3) TCP provides a reliable service: data sent over a TCP connection arrives without errors, loss, or duplication, and in order. UDP makes a best effort to deliver but does not guarantee reliable delivery.
  • 4) The logical communication channel of TCP is a full-duplex reliable channel; that of UDP is an unreliable channel.
  • 5) Each TCP connection can only be point-to-point; UDP supports one-to-one, one-to-many, many-to-one, and many-to-many interactions.
  • 6) TCP is byte-stream oriented: it treats data as an unstructured stream of bytes, so "sticky packet" problems may occur. UDP is message oriented, so there is no sticky packet problem.
  • 7) UDP has no congestion control, so network congestion does not reduce the sending rate of the source host (useful for real-time applications such as IP telephony and real-time video conferencing).
  • 8) The TCP header costs at least 20 bytes; the UDP header is small, only 8 bytes.
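
The message-oriented nature of UDP can be observed with two datagram sockets on the loopback interface. This is a small sketch: `udp_roundtrip` is a hypothetical helper, there is no connection setup, and delivery is only practically reliable here because both sockets live on the same machine.

```python
import socket


def udp_roundtrip(messages):
    """Send each message as one datagram and read the datagrams back."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    receiver.settimeout(2.0)         # do not block forever if a datagram is lost
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for message in messages:
            sender.sendto(message, receiver.getsockname())  # no connect, no handshake
        # Each recvfrom() returns exactly one datagram; boundaries are preserved.
        return [receiver.recvfrom(1024)[0] for _ in messages]
    finally:
        sender.close()
        receiver.close()


received = udp_roundtrip([b"hello", b"world"])
```

With a TCP stream the two sends could arrive merged into a single `recv()` call, which is the "sticky packet" effect mentioned in point 6.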

11. Socket

We can use ports to mark different network processes; a port is a 16-bit number (0 to 65535).

1) Socket concept

  • A socket is an abstract concept that represents one end of a TCP connection.
  • Data can be sent or received through a socket.
  • TCP connection = {Socket1 : Socket2} = {{IP : Port} : {IP : Port}}; in other words, a TCP connection is made up of two sockets.

2) Socket programming

Server programming

  • 1) Create a Socket.
  • 2) Bind the Socket.
  • 3) Listen on the Socket.
  • 4) Accept connections and process data.

The code looks like this:

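import socket


def server():
    # 1. Create the Socket (IPv4, TCP by default)
    s = socket.socket()
    host = "127.0.0.1"
    port = 5678

    # 2. Bind the Socket; bind() takes a single (host, port) tuple
    s.bind((host, port))

    # 3. Listen for incoming connections
    s.listen()

    # 4. Accept each connection, send data, then close it
    while True:
        c, addr = s.accept()
        print("connect addr", addr)
        c.send(b'Socket Study.')
        c.close()


if __name__ == '__main__':
    server()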

Client programming

  • 1) Create a Socket.
  • 2) Connect the Socket.
  • 3) Receive messages.

The code looks like this:

import socket


def client(i):
    # 1. Create the Socket
    s = socket.socket()

    # 2. Connect the Socket to the server
    s.connect(('127.0.0.1', 5678))

    # 3. Receive the message
    print("Received message:%s, client Id:%d" % (s.recv(1024), i))
    s.close()


if __name__ == '__main__':
    for i in range(10):
        client(i)

For communication between processes on the same machine, a domain Socket (Unix domain socket) is recommended. Network communication has to pass through the whole protocol stack, whereas a domain Socket follows a simpler path and consumes fewer system resources.
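
A quick way to try local socket communication in Python is `socket.socketpair()`, which on Unix-like systems returns two already connected Unix domain stream sockets. This is a minimal sketch within a single process; in practice the two ends would usually belong to different processes.

```python
import socket

# On Unix-like systems socketpair() defaults to the AF_UNIX family,
# so no IP address or port is involved.
parent, child = socket.socketpair()
try:
    parent.sendall(b"Socket Study.")
    reply = child.recv(1024)
finally:
    parent.close()
    child.close()

print("Received message:", reply)
```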

12. The four timers of the TCP protocol

  • 1) Timeout timer
  • 2) Persistence timer
  • 3) Time-wait timer
  • 4) Keepalive timer: servers generally set a keepalive timer. It keeps a TCP connection from sitting idle at both ends for a long time while one side has crashed or failed without the other side noticing.

Each time the server receives data from the peer, it resets the timer. If the timer expires, the server sends a probe segment to check whether the client is still online. If no response arrives, the client is considered disconnected and the server terminates the connection. Today, many distributed systems use a similar mechanism to check whether other nodes are online or down: either a keepalive timer, or heartbeat messages that the other nodes send to the primary node at regular intervals to prove that they are online.
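
On most platforms keepalive probing is switched on per socket with the `SO_KEEPALIVE` option. The sketch below assumes illustrative timing values; the finer-grained `TCP_KEEP*` options are not available on every platform, so they are guarded with `hasattr`. The helper `enable_keepalive` is hypothetical.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive probes for an existing socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Optional tuning knobs; they exist only on some platforms (for example Linux).
    if hasattr(socket, "TCP_KEEPIDLE"):   # idle seconds before the first probe
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):  # seconds between probes
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):    # unanswered probes before giving up
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(s)
keepalive_on = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
s.close()
```

Application-level heartbeats, as used by many distributed systems, follow the same idea but are sent by the application itself rather than by the TCP stack.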

13. What are the IP layer and the MAC (Medium Access Control) layer doing during the three-way handshake?

In fact, every message TCP sends is wrapped in IP-layer and MAC-layer headers, because every time TCP sends a message, all the mechanisms of the IP and MAC layers run as well.

Note that any packet travelling on the network is complete. A packet can have lower layers without upper layers, but it cannot have upper layers without lower layers. So for TCP, whether the message is part of the three-way handshake or a retransmission, it needs the IP layer and the MAC layer in order to go out as a network packet; otherwise it cannot be sent.
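
The layering can be pictured as nested wrapping. In the sketch below the bracketed byte strings are placeholders rather than real header formats, and `encapsulate` is a hypothetical helper.

```python
def encapsulate(payload: bytes) -> bytes:
    """Wrap a payload in placeholder TCP, IP, and MAC headers, in that order."""
    tcp_segment = b"[TCP]" + payload   # transport layer adds its header first
    ip_packet = b"[IP]" + tcp_segment  # network layer wraps the segment
    frame = b"[MAC]" + ip_packet       # link layer wraps the packet last
    return frame


# Even a bare handshake segment leaves the host wrapped in all lower layers.
frame = encapsulate(b"SYN")
```

Every TCP segment, including each message of the three-way handshake, goes out inside an IP packet, which in turn goes out inside a MAC frame.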

