Computer networks and operating systems are two fundamentals that every development role has to grind through, whether you work in Java, C++, or testing. For back-end developers, computer networking matters no less than language fundamentals; everyday development constantly touches the network, for example when capturing packets. So approach this body of knowledge with respect and don't let any question slide. Here is my learning process:

1. Reading: For a fundamental computer science subject like this, I recommend studying a classic book properly rather than only reading interview questions. I read two books in total: Tang's Computer Operating System and Illustrated HTTP. Computer Operating System is a university textbook, so its coverage is broad and relatively basic; it is well worth reading for students from non-CS majors. Illustrated HTTP uses plenty of illustrations, which makes it easy to understand and quick to get through.

2. Take notes: computer networking has a huge number of knowledge points, so take notes as you read to make review easier. While taking notes, search each topic online to see whether you missed anything, then fill in the gaps. Why do I keep emphasizing note-taking? Benefit 1: writing notes is your first review of the material and deepens your memory. Benefit 2: if you publish your notes (for example on a public blog), you will want them to be as accurate as possible, which forces you to revisit each topic, check for misunderstandings and omissions, and dig deeper. Benefit 3: you are unlikely to reread a whole book when it is time to prepare for interviews, so your own notes, which you can review quickly and which ideally contain some understanding of your own beyond the book, become very important.

3. Read interview write-ups: Look through published interview experiences to see what interviewers actually ask about computer networks. You may know the material yet still fail to answer, simply because you don't know how interviewers phrase the questions. Have you prepared for the way you will be asked? The emphasis on networks versus operating systems also varies by company and position; reading interview write-ups reveals some patterns, though nothing is absolute, and in the end it comes down to your interviewer's preferences.

1. What is your understanding of the five-layer network protocol architecture?

When studying computer networks we usually take a middle road, combining the strengths of the OSI model and the TCP/IP model into an architecture with only five layers, which is concise yet still explains the concepts clearly.

  • 1. The application layer
The task of the application layer is to complete specific network applications through interaction between application processes. Application layer protocols define the rules for communication and interaction between application processes (a process is a running program on a host). Different network applications require different application layer protocols. The Internet has many of them, such as DNS for domain names, HTTP to support the World Wide Web, and SMTP to support e-mail. The data units exchanged at the application layer are called messages.
  • 2. The transport layer


The main task of the transport layer is to provide a general-purpose data transfer service for communication between processes on two hosts. Application processes use this service to transmit application layer messages. "General-purpose" means that multiple applications can use the same transport layer service, rather than the service being tied to one specific network application.
Because a host can run multiple processes at the same time, the transport layer provides multiplexing and demultiplexing. Multiplexing means that multiple application processes can simultaneously use the services of the transport layer below; demultiplexing is the reverse: the transport layer delivers received data to the corresponding process in the application layer above.
  • 3. The network layer


Two computers communicating across a network may be separated by many data links and even many communication subnets. The task of the network layer is to choose suitable inter-network routes and switching nodes to ensure data is delivered in time. When sending data, the network layer encapsulates the segments or user datagrams produced by the transport layer into packets for transmission. Because the network layer of the TCP/IP architecture uses the IP protocol, these packets are also called IP datagrams, or datagrams for short.

  • 4. Data link layer


The data link layer is often called the link layer for short. Data transmission between two hosts always happens link by link, which requires dedicated link layer protocols. When transmitting between two adjacent nodes, the data link layer assembles the IP datagrams handed down by the network layer into frames and transmits those frames over the link between the nodes. Each frame contains the data plus the necessary control information (e.g. synchronization, addressing, error control).
When receiving data, the control information lets the receiver know at which bit a frame begins and ends, so that on receiving a frame the data link layer can extract the data portion and hand it up to the network layer. The control information also lets the receiver detect errors in a received frame. If an error is detected, the data link layer simply discards the faulty frame to avoid wasting further network resources. If transmission errors must be corrected at the link layer (that is, the link layer must not only detect but also correct errors), a reliable transmission protocol is used, which complicates the link layer protocol.
  • 5. The physical layer


The units of data transmitted at the physical layer are bits. The role of the physical layer is to achieve transparent transmission of bit streams between adjacent nodes, shielding the differences between specific transmission media and physical devices as much as possible, so that the data link layer above it does not have to consider what the network's transmission medium actually is. "Transparent transmission of the bit stream" means the bit stream passes through the actual circuit unchanged; to the transmitted bits, the circuit appears as if it were invisible.

2. What are the network protocols corresponding to each layer?

There are many protocols in the five-layer model; the commonly used ones are listed below:

  • Application layer: HTTP, FTP, SMTP, POP3, Telnet, DNS, SNMP, TFTP
  • Transport layer: TCP, UDP
  • Network layer: IP, ICMP, ARP, RARP
  • Data link layer: PPP, Ethernet (IEEE 802.3)
  • Physical layer: physical transmission media standards such as RS-232



3. How does ARP work?

The ARP protocol sits at the network layer and maps IP addresses to physical (MAC) addresses. Each host keeps an ARP table in its ARP cache recording the mapping between IP addresses and MAC addresses. When a source host needs to send a packet to a destination host, it first checks whether its ARP table contains the MAC address for the destination IP. If it does, the source host sends the packet directly to that MAC address. If not, it broadcasts an ARP request on the local network segment to query the destination host's MAC address.

The ARP request packet contains the source host's IP address and hardware address, and the destination host's IP address. Every host on the segment receives the ARP request and checks whether the destination IP address in it matches its own. If it does not, the host ignores the packet. If it does, the host first adds the sender's MAC address and IP address to its own ARP table (overwriting any existing entry for that IP), then sends an ARP reply to the source host telling it the MAC address it was looking for. When the source host receives the ARP reply, it adds the destination host's IP and MAC addresses to its own ARP table and uses that information to start transmitting data. If the source host never receives an ARP reply, the ARP query has failed.
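The lookup-then-broadcast logic above can be sketched as a toy model (the cache is a plain dict, `broadcast_query` stands in for the real ARP broadcast, and the addresses are made up; this is an illustration, not a real ARP implementation):

```python
# Hypothetical sketch of ARP cache lookup: hit -> use cached MAC,
# miss -> broadcast a query and learn the answer.

def lookup_mac(arp_cache, dest_ip, broadcast_query):
    """Return the MAC for dest_ip, querying the LAN on a cache miss."""
    if dest_ip in arp_cache:               # cache hit: send directly
        return arp_cache[dest_ip]
    mac = broadcast_query(dest_ip)         # cache miss: ARP broadcast on the segment
    if mac is None:                        # nobody answered -> query failed
        raise LookupError("ARP query failed for " + dest_ip)
    arp_cache[dest_ip] = mac               # learn the mapping for next time
    return mac

cache = {"192.168.1.1": "aa:bb:cc:dd:ee:01"}
print(lookup_mac(cache, "192.168.1.1", lambda ip: None))                  # hit
print(lookup_mac(cache, "192.168.1.9", lambda ip: "aa:bb:cc:dd:ee:09"))  # miss -> broadcast
```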

4. What is your understanding of IP address classification?

An IP address is a unified address format provided by the IP protocol. It assigns a logical address to each network and each host on the Internet to shield physical address differences. The IP address addressing scheme divides the IP address space into five classes: A, B, C, D, and E. A, B, and C are basic classes, and D and E are special addresses for multicast and reserved use.

Each IP address contains two IDs: the network ID and the host ID. All hosts on one physical network share the same network ID, and each host on that network (workstation, server, or router) has its own host ID. Classes A through E have the following characteristics:

Class A address: starts with 0. The first byte ranges from 0 to 127.
Class B address: starts with 10. The first byte ranges from 128 to 191.
Class C address: starts with 110. The first byte ranges from 192 to 223.
Class D address: starts with 1110. The first byte ranges from 224 to 239.
Class E address: reserved address starting with 1111
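Since the class is determined entirely by the first octet, the table above can be expressed as a small function (a sketch; real code would also validate the address string):

```python
def ip_class(addr: str) -> str:
    """Classify a dotted-quad IPv4 address by its first octet, per the table above."""
    first = int(addr.split(".")[0])
    if first <= 127:
        return "A"   # leading bit  0
    if first <= 191:
        return "B"   # leading bits 10
    if first <= 223:
        return "C"   # leading bits 110
    if first <= 239:
        return "D"   # leading bits 1110 (multicast)
    return "E"       # leading bits 1111 (reserved)

print(ip_class("10.0.0.1"))     # A
print(ip_class("172.16.0.1"))   # B
print(ip_class("224.0.0.5"))    # D
```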

5. What are the main features of TCP?

1. TCP is connection-oriented (like a phone call: you dial to establish the connection before talking and hang up to release it afterwards);
2. Each TCP connection has exactly two endpoints; a TCP connection is point-to-point (one-to-one);
3. TCP provides reliable delivery: data transmitted over a TCP connection arrives error-free, without loss or duplication, and in order;
4. TCP provides full-duplex communication, allowing the application on either side to send data at any time. Both ends of the connection have a send buffer and a receive buffer to temporarily hold the data exchanged between the two parties;
5. TCP is byte-stream oriented. A "stream" in TCP is a sequence of bytes flowing into or out of a process. Byte-stream oriented means that although an application hands data to TCP in blocks of varying sizes, TCP treats that data as nothing more than an unstructured stream of bytes.

6. What are the main features of UDP?

1. UDP is connectionless;
2. UDP uses best-effort delivery, i.e. reliable delivery is not guaranteed, so hosts do not need to maintain complex connection state (which involves many parameters);
3. UDP is message-oriented;
4. UDP has no congestion control, so congestion on the network does not reduce the source host's sending rate (useful for real-time applications such as live streaming and video conferencing);
5. UDP supports one-to-one, one-to-many, many-to-one, and many-to-many communication;
6. UDP's header overhead is small: only 8 bytes, versus TCP's 20-byte header.

7. What are the differences between TCP and UDP?

TCP provides a connection-oriented service: a connection must be established before data transfer and released afterwards, and TCP does not provide broadcast or multicast service. Because TCP must provide reliable, connection-oriented transport (its reliability shows in the three-way handshake before data transfer; in the acknowledgement, window, retransmission, and congestion control mechanisms during transfer; and in the connection teardown afterwards to free system resources), it inevitably incurs a lot of overhead: acknowledgements, flow control, timers, and connection management. This not only makes the protocol data unit's header much larger but also consumes significant processor resources.
UDP needs no connection before transmitting, and the remote host does not acknowledge receipt of a UDP datagram. Although UDP does not provide reliable delivery, it is the most efficient choice in certain situations, typically instant communication such as QQ voice, QQ video, and live streaming.
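The contrast shows up directly in the socket API. Below is a minimal loopback sketch (ports are chosen by the OS; an illustration, not production code): TCP must complete the handshake via connect/accept before any data moves, while UDP simply addresses each datagram with no connection at all.

```python
import socket
import threading

def tcp_echo_server(srv):
    conn, _ = srv.accept()          # blocks until the 3-way handshake completes
    conn.sendall(conn.recv(1024))   # echo back what was received
    conn.close()

# --- TCP: connection first, then data ---
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
threading.Thread(target=tcp_echo_server, args=(srv,), daemon=True).start()

tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp.connect(srv.getsockname())      # connection established before sending
tcp.sendall(b"hello tcp")
echoed = tcp.recv(1024)
print(echoed)                       # b'hello tcp'
tcp.close()

# --- UDP: no handshake, each datagram is self-addressed ---
udp_srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_srv.bind(("127.0.0.1", 0))
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.sendto(b"hello udp", udp_srv.getsockname())
data, addr = udp_srv.recvfrom(1024)
print(data)                         # b'hello udp'
udp.close()
udp_srv.close()
```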

8. What are the common application layer protocols corresponding to TCP and UDP?

  • 1. Application layer protocol corresponding to TCP
FTP: the File Transfer Protocol, using port 21. When people say a computer "has FTP service running", they mean the file transfer service has been started. FTP is used for downloading files and uploading home pages.
Telnet: a remote-login protocol using port 23, letting a user connect to a remote computer as if logged in locally and providing a DOS-style character-mode communication service. The old BBS systems with pure character interfaces ran servers that opened port 23 to provide this service.
SMTP: the Simple Mail Transfer Protocol, which many mail servers use to send mail, on port 25. This is why common free mail services use this port, and why mail client settings often have an SMTP server field: the server listens on port 25.
POP3: the counterpart of SMTP, used for receiving mail, normally on port 110. As long as you have a program that speaks POP3 (such as Foxmail or Outlook), you can fetch mail directly with the mail client instead of logging in to the webmail interface (for a 163 mailbox, there is no need to open NetEase's website first and then enter your own mailbox to receive mail).
HTTP: the protocol for transferring hypertext from a web server to a local browser, on port 80.
  • 2. Application layer protocols corresponding to UDP


DNS: the domain name resolution service, translating domain names into IP addresses, on port 53.
SNMP: the Simple Network Management Protocol, on port 161, used to manage network devices. Since network devices are numerous, a connectionless service is an advantage here.
TFTP (Trivial File Transfer Protocol): a simple file transfer protocol that runs over UDP on port 69.
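These well-known port assignments are recorded in the operating system's services database, which Python's standard `socket.getservbyname` can query (entries come from the platform's /etc/services, so availability of individual names may vary by system):

```python
import socket

# Look up the well-known ports listed above from the OS services database.
for name, proto in [("ftp", "tcp"), ("telnet", "tcp"), ("smtp", "tcp"),
                    ("pop3", "tcp"), ("http", "tcp"), ("domain", "udp"),
                    ("snmp", "udp"), ("tftp", "udp")]:
    print(name, socket.getservbyname(name, proto))
```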

9. Describe the TCP three-way handshake process in detail.

  • 1. The three-way handshake

Establishing a TCP connection is done by a handshake in which three TCP segments are exchanged between the client and the server.


Initially both the client and the server are in the CLOSED state. In this example, client A actively opens the connection and server B passively opens it.
First, B's TCP server process creates the transmission control block (TCB), ready to accept a connection request from a client process. The server process then sits in the LISTEN state, waiting for a connection request, and responds as soon as one arrives.
First handshake: A's TCP client process also starts by creating its TCB. To establish the connection it sends a connection request segment to B with SYN = 1 in the header and an initial sequence number seq = x. TCP specifies that a SYN segment (SYN = 1) cannot carry data but consumes one sequence number. The TCP client process then enters the SYN-SENT state.
Second handshake: after receiving the connection request segment and agreeing to establish the connection, B sends an acknowledgement to A with both the SYN and ACK bits set to 1, ack = x + 1, and its own initial sequence number seq = y. Note that this segment likewise cannot carry data but also consumes one sequence number. The TCP server process enters the SYN-RCVD state.
Third handshake: on receiving B's acknowledgement, A's TCP client process sends an acknowledgement back to B with ACK = 1, ack = y + 1, and sequence number seq = x + 1. TCP specifies that this ACK segment may carry data; if it carries none, it consumes no sequence number, so the next data segment still uses seq = x + 1. The connection is now established and A enters the ESTABLISHED state; when B receives A's acknowledgement, it enters the ESTABLISHED state too.
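The sequence-number bookkeeping above can be sketched as a toy exchange (the dict "segments" and the initial sequence numbers x = 100, y = 300 are illustrative assumptions, not a real TCP implementation):

```python
# Toy model of the three-way handshake's flags, sequence numbers, and states.
x, y = 100, 300                      # initial sequence numbers of A and B
client_state, server_state = "CLOSED", "LISTEN"

# 1st handshake: A -> B, SYN = 1, seq = x
syn = {"SYN": 1, "seq": x}
client_state = "SYN-SENT"

# 2nd handshake: B -> A, SYN = 1, ACK = 1, seq = y, ack = x + 1
synack = {"SYN": 1, "ACK": 1, "seq": y, "ack": syn["seq"] + 1}
server_state = "SYN-RCVD"

# 3rd handshake: A -> B, ACK = 1, seq = x + 1, ack = y + 1
ack = {"ACK": 1, "seq": synack["ack"], "ack": synack["seq"] + 1}
client_state = server_state = "ESTABLISHED"

print(client_state, server_state)    # ESTABLISHED ESTABLISHED
print(ack["seq"], ack["ack"])        # 101 301
```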

10. Why not just two handshakes?

To prevent an invalid connection request segment from suddenly arriving at B and causing an error. For example, suppose a connection request segment sent by A is not lost but is stuck at a network node for a long time, arriving at B only after the connection has already been released. This is a stale, invalid segment, but B, on receiving it, mistakes it for a new connection request from A and sends back an acknowledgement agreeing to establish a connection.
In that case, without a third handshake, B would consider the new connection established as soon as it sent its acknowledgement and would wait for data from A. Many of B's resources would be wasted this way.
With the three-way handshake, since A never actually made a connection request, A ignores B's acknowledgement and sends no data; receiving no acknowledgement, B knows that A did not request a connection.

11. Why not four handshakes?

One might object that after A sends the third handshake segment, it enters the established state without any further response from B: what if A's acknowledgement is lost or delayed?

We must accept that there is no such thing as a completely reliable communication protocol. After three handshakes, client and server have each confirmed the other's communication state and both have received an acknowledgement, so even adding more handshakes could not guarantee the reliability of subsequent communication; it is therefore unnecessary.

12. Why does the server send back a SYN in response to the client's SYN?

The receiver sends a SYN back to tell the sender: "I have received your SYN."

SYN is the handshake signal TCP/IP uses to establish a connection. When a normal TCP connection is established between a client and a server, the client first sends a SYN segment, the server responds with a SYN-ACK to indicate that it received the message, and the client finishes with an ACK (acknowledgement character: a transmission control character sent from the receiving station to the sending station in data communication). In this way a reliable TCP connection is established between client and server, and data can be transferred between them.

13. If SYN is used, why also ACK?

Communication must be correct in both directions. A successfully delivered SYN proves that the channel from sender to receiver works, but the channel from receiver back to sender still needs the ACK signal for verification.

14. Describe the TCP four-way wave (connection release) process in detail.

After data transmission finishes, both sides may release the connection. Initially, both A and B are in the ESTABLISHED state.



First wave: A's application process sends a connection release segment to its TCP, stops sending data, and actively closes the TCP connection. A sets the FIN control bit in the segment header to 1, with sequence number seq = u (the sequence number of the last byte of previously transmitted data plus 1). A then enters the FIN-WAIT-1 state and waits for B's acknowledgement. Note that TCP specifies that a FIN segment consumes one sequence number even when it carries no data.

Second wave: on receiving the connection release segment, B sends an acknowledgement with ack = u + 1 and its own sequence number seq = v (the sequence number of the last byte B previously transmitted plus 1), then enters the CLOSE-WAIT state. B's TCP server process notifies its higher-level application process that the connection in the A-to-B direction is released. The TCP connection is now half-closed: A has no more data to send, but if B sends data, A will still receive it. That is, the connection in the B-to-A direction is not yet closed and may remain so for some time. After receiving B's acknowledgement, A enters the FIN-WAIT-2 state and waits for B's connection release segment.

Third wave: when B has no more data to send to A, its application notifies TCP to release the connection. B sends a segment with FIN = 1; suppose its sequence number is seq = w (in the half-closed state B may have sent some more data). B must also repeat the acknowledgement number it sent last time, ack = u + 1. B then enters the LAST-ACK state and waits for A's acknowledgement.

Fourth wave: A must acknowledge B's connection release segment. It sends a segment with ACK = 1, ack = w + 1, and sequence number seq = u + 1 (the FIN segment sent earlier consumed one sequence number), then enters the TIME-WAIT state. Note that the TCP connection is not yet released: only after 2MSL (MSL: maximum segment lifetime), as set by the TIME-WAIT timer, does A enter the CLOSED state and revoke its transmission control block, ending the connection. B, on the other hand, enters the CLOSED state and revokes its transmission control block as soon as it receives A's acknowledgement. Thus, when releasing a TCP connection, B finishes earlier than A.

15. Why must TIME-WAIT last 2MSL?

1. To ensure that the last ACK segment sent by A reaches B. This ACK segment may be lost, leaving B in the LAST-ACK state without an acknowledgement of the FIN + ACK segment it sent. B will then retransmit the FIN + ACK on timeout, and A can receive the retransmission within the 2MSL period (timeout plus 1MSL of transmission). A then retransmits its acknowledgement and restarts the 2MSL timer, and finally both A and B enter the CLOSED state normally. If A in the TIME-WAIT state released the connection immediately after sending its ACK, it could not receive B's retransmitted FIN + ACK and would not send the acknowledgement again, so B could never enter the CLOSED state properly.
2. To prevent stale connection request segments from appearing in a later connection. After sending its last ACK segment, A waits 2MSL, which is enough time for every segment generated during the lifetime of this connection to disappear from the network, so no segment from the old connection can show up in the next one.

16. Why can't the second and third waves be merged? What happens between them?

After the second wave, the server knows the client will not request any more data, but the server may still have data to send to the client (for example, the resource from the client's last request may not have finished transmitting). So the server waits until it has finished sending that data before sending its own close request.

17. What is the role of the keepalive timer?

Besides the TIME-WAIT timer, TCP has a keepalive timer. Imagine this scenario: a client has actively established a TCP connection to a server, but then the client's host suddenly fails. The server can obviously no longer receive data from the client, so something is needed to keep the server from waiting forever in vain. That is what the keepalive timer is for.

Each time the server receives data from the client it resets the keepalive timer, which is usually set to two hours. If no data arrives from the client within two hours, the server sends a probe segment and then sends another every 75 seconds. If 10 consecutive probe segments get no response, the server assumes the client has failed and closes the connection.
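In practice these keepalive parameters can be set per socket. A sketch using Python's socket module (the `TCP_KEEP*` constants are platform-specific, hence the `hasattr` guard; 7200 s, 75 s, and 10 probes mirror the defaults described above):

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)            # turn keepalive on
if hasattr(socket, "TCP_KEEPIDLE"):                                # Linux-specific options
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 7200)    # idle time before probing
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 75)     # seconds between probes
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 10)       # probes before giving up
print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))        # nonzero: enabled
```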

18. How does TCP ensure reliable transmission?

1. Checksums: to detect any change to the data in transit; if an error is detected, the segment is discarded and not acknowledged;
2. Reordering out-of-order segments: TCP segments are transmitted as IP datagrams, which may arrive out of order, so TCP segments may too; TCP reorders the out-of-order data before handing it to the application layer;
3. Discarding duplicates: duplicate data is simply dropped;
4. Acknowledgements: when TCP receives data from the other end of the connection it sends an acknowledgement, not immediately but usually delayed by a fraction of a second;
5. Timeout retransmission: after sending a segment, TCP starts a timer and waits for the destination to acknowledge it; if the acknowledgement does not arrive in time, the segment is resent;
6. Flow control: each side of a TCP connection has a fixed amount of buffer space, and the receiver only allows the other end to send as much data as its buffer can accept, preventing a fast host from overrunning a slow host's buffer. TCP implements flow control with a variable-size sliding window protocol.

19. What is your understanding of the stop-and-wait protocol?

The stop-and-wait protocol exists to achieve reliable transmission. Its basic principle: after sending each packet, stop and wait for the other side's acknowledgement, and send the next packet only after the acknowledgement arrives. If the receiver gets a duplicate packet, it discards the packet but still sends an acknowledgement. The protocol covers four situations: no error; error (timeout retransmission); lost acknowledgement; and late acknowledgement.

20. What is your understanding of the ARQ protocol?

  • Automatic Repeat reQuest (ARQ) protocol


Timeout retransmission in the stop-and-wait protocol means retransmitting a previously sent packet if no acknowledgement arrives within a set period (the packet is assumed lost). A timeout timer must therefore be started whenever a packet is sent, with a timeout somewhat longer than the average round-trip time of a packet. This kind of automatic retransmission is called Automatic Repeat reQuest (ARQ).
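A toy simulation of stop-and-wait with timeout retransmission (the lossy channel, the loss rate, and the seed are assumptions for illustration; a real implementation would use actual timers rather than a loop):

```python
import random

def send_stop_and_wait(packets, loss_rate=0.3, seed=42):
    """Deliver packets one at a time, retransmitting on (simulated) timeout."""
    random.seed(seed)
    delivered, attempts = [], 0
    for pkt in packets:
        while True:                            # retransmit until the ACK comes back
            attempts += 1
            if random.random() >= loss_rate:   # packet and its ACK got through
                delivered.append(pkt)
                break                          # "ACK received" -> next packet
            # else: timeout expires, loop around and retransmit the same packet

    return delivered, attempts

data, tries = send_stop_and_wait(["p0", "p1", "p2", "p3"])
print(data)          # all packets arrive, in order
print(tries >= 4)    # True: losses cost extra transmissions
```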

  • Continuous ARQ protocol


The continuous ARQ protocol improves channel utilization. The sender maintains a send window; every packet within the window may be sent consecutively without waiting for acknowledgements. The receiver generally uses cumulative acknowledgement: it acknowledges the last packet that arrived in sequence, indicating that all packets up to that one have been correctly received.

21. What do you know about sliding windows?

TCP uses sliding windows to implement its flow control mechanism. The sliding window is a flow control technique. In the early days of network communication, the communicating parties sent data without regard for network congestion; since no one knew the state of the network and everyone sent at once, intermediate switching nodes became blocked and nobody could get data through. The sliding window mechanism was introduced to solve this problem.

TCP uses the sliding window for transmission control. The size of the window indicates how much buffer space the receiver has available for incoming data, and the sender uses it to decide how many bytes it may send. When the window is 0, the sender may send no more data, with two exceptions: urgent data may still be sent, for instance to let a user abort a process running on the remote machine; and the sender may send a 1-byte segment to prompt the receiver to re-announce the next byte it expects and its current window size.

22. Talk about your understanding of flow control.

TCP uses sliding windows for flow control. Flow control means restraining the sender's sending rate so the receiver can always receive in time. The window field in the receiver's acknowledgement segments controls the size of the sender's window and thus its sending rate; if the window field is set to 0, the sender may not send data.
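The sender-side rule can be sketched as simple arithmetic (the byte counts are illustrative): the sender may have at most `rwnd` unacknowledged bytes in flight, where `rwnd` is the window advertised by the receiver.

```python
def sendable(next_seq, last_acked, rwnd):
    """How many more bytes the sender may transmit right now."""
    in_flight = next_seq - last_acked   # bytes sent but not yet acknowledged
    return max(0, rwnd - in_flight)     # remaining room in the advertised window

print(sendable(next_seq=1000, last_acked=600, rwnd=500))  # 100 bytes still allowed
print(sendable(next_seq=1100, last_acked=600, rwnd=500))  # 0: window full, must wait
print(sendable(next_seq=1100, last_acked=600, rwnd=0))    # 0: receiver closed the window
```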

23. What is your understanding of TCP congestion control? What algorithms does it use?

Congestion control differs from flow control in that the former is a global process while the latter controls point-to-point traffic. If at some moment the demand for a network resource exceeds what the resource can provide, the network's performance degrades; this condition is called congestion.

Congestion control is designed to prevent too much data from being injected into the network, so that routers and links do not become overloaded. Its premise is that the network must be able to withstand the existing load; it is a global process involving all hosts, all routers, and every factor associated with degrading network transmission performance. Flow control, by contrast, is point-to-point control of traffic, an end-to-end problem: its purpose is to restrain the sender's rate so that the receiver can keep up.

For congestion control, the TCP sender maintains a congestion window (cwnd) state variable, whose size depends on the network's congestion level and changes dynamically. The sender sets its send window to the smaller of the congestion window and the receiver's advertised window.

TCP congestion control adopts four algorithms: slow start, congestion avoidance, fast retransmission and fast recovery. At the network layer, routers can also adopt appropriate packet discarding policies (such as active queue management AQM) to reduce network congestion.

  • Slow start:

The idea of the slow start algorithm: when a host begins sending, injecting a large number of bytes into the network immediately may cause congestion, because the network's condition is not yet known. Experience shows it is better to probe first, growing the send window gradually from small to large, i.e. increasing the congestion window from small to large. cwnd starts at 1 and doubles after every transmission round.

  • Congestion avoidance:

The idea of the congestion avoidance algorithm: make the congestion window cwnd grow slowly, increasing the sender's cwnd by 1 for every round-trip time RTT.
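Together, slow start and congestion avoidance produce a cwnd curve that doubles per round up to the slow-start threshold and then grows linearly. A toy trace (ssthresh = 16 is an assumed value; cwnd is measured in segments):

```python
def cwnd_trace(rounds, ssthresh=16):
    """Trace cwnd per transmission round: slow start below ssthresh, then +1 per round."""
    cwnd, trace = 1, []
    for _ in range(rounds):
        trace.append(cwnd)
        cwnd = cwnd * 2 if cwnd < ssthresh else cwnd + 1
    return trace

print(cwnd_trace(9))   # [1, 2, 4, 8, 16, 17, 18, 19, 20]
```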

  • Fast retransmission and fast recovery:



In TCP/IP, Fast Retransmit and Recovery (FRR) is a congestion control algorithm that can quickly recover lost packets.

Without FRR, if a packet is lost, TCP uses a timer to pause transmission, and during this pause no new or duplicate packets are sent. With FRR, if the receiver receives a segment out of order, it immediately sends a duplicate acknowledgement to the sender; if the sender receives three duplicate acknowledgements, it assumes the segment those acknowledgements point to has been lost and retransmits it immediately.

With FRR, there is no delay from a retransmission pause. Fast retransmit and fast recovery (FRR) work most effectively when a single packet is lost; they are much less effective when many packets are lost within a short period.
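The fast-retransmit trigger can be sketched as counting duplicate acknowledgements (a toy model, not a real TCP stack; the ACK sequence below is made up):

```python
def fast_retransmit(acks, threshold=3):
    """Return the sequence numbers retransmitted after 3 duplicate ACKs each."""
    dup, retransmitted = {}, []
    for ack in acks:
        dup[ack] = dup.get(ack, 0) + 1
        if dup[ack] == threshold + 1:     # the original ACK plus 3 duplicates
            retransmitted.append(ack)     # retransmit immediately, no timeout wait
    return retransmitted

# The receiver keeps acking seq 2 because segment 2 never arrived:
print(fast_retransmit([1, 2, 2, 2, 2]))   # [2]
```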

24. What is a sticky packet?

When learning Java NIO, you may find that if the client sends packets to the server continuously, the server sometimes receives two packets stuck together.

1. TCP is byte-stream oriented. Although the application layer and the TCP transport layer exchange data blocks of varying sizes, TCP treats these blocks as one continuous, unstructured stream of bytes with no boundaries.
2. As the TCP segment structure shows, the TCP header contains no field for the length of the application data.

Because of these two points, sticking or splitting of packets can occur when data is transmitted over TCP. When one received chunk contains the contents of two packets sent by the sender, it is called a sticky packet.

The receiver may also get two packets that are each incomplete, or that carry extra data from a neighboring packet; this is splitting combined with sticking. Both problems make processing very difficult for the receiving end, because it cannot tell where a complete packet begins and ends.

25. How are TCP sticky packets generated?

  • The sender generates sticky packets:
Clients and servers that transmit data over TCP often maintain a long connection (when only one batch of data is sent per connection, no sticky packets occur). While the connection stays open, both sides can transmit continuously. However, when the packets to be sent are small, TCP by default enables the Nagle algorithm, which merges these small packets before sending them (data accumulates in the buffer until enough has piled up). The merging happens in the send buffer, which means the data is already in a stuck-together state when it leaves the sender.


  • Sticky packets are generated on the receiver

When the receiver uses TCP to receive data, the process is as follows: data arrives at the receiver and is passed up the network stack to the transport layer; the TCP implementation places it into the receive buffer, from which the application layer actively reads it (in C, via functions such as recv and read). The problem arises when the read function called by the program cannot drain the buffer in time: the next batch of data arrives and is appended to the end of the buffer, so when we finally read, we get a sticky packet. (The data arrives faster than the application layer reads it.)

26. How to solve the problem of packet splitting and sticking?

There are two common framing solutions:

1. Delimit messages with special characters;
2. Prefix each packet with its length in the packet header.
If Netty is used, it provides dedicated encoders and decoders that solve the splitting and sticking problems.
UDP has no sticky packet problem, but it can lose packets and deliver them out of order. Incomplete datagrams are discarded, and only correct ones are delivered. Its unit of transmission is the UDP datagram, which is neither merged nor split.
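The length-prefix approach (option 2 above) can be sketched in plain Python, independent of any framework. The 4-byte big-endian length header here is an arbitrary choice for illustration:

```python
import struct

def frame(payload: bytes) -> bytes:
    """Prefix the payload with a 4-byte big-endian length header."""
    return struct.pack(">I", len(payload)) + payload

def deframe(buffer: bytes):
    """Split a byte stream back into complete messages.
    Returns (messages, leftover) so a partial tail waits for more data."""
    messages = []
    while len(buffer) >= 4:
        (length,) = struct.unpack(">I", buffer[:4])
        if len(buffer) < 4 + length:
            break                          # incomplete message: keep buffering
        messages.append(buffer[4:4 + length])
        buffer = buffer[4 + length:]
    return messages, buffer

# Two sends "stuck together" in one receive are still recoverable:
stream = frame(b"hello") + frame(b"world")
msgs, rest = deframe(stream)
print(msgs, rest)  # → [b'hello', b'world'] b''
```

This is essentially what Netty's LengthFieldBasedFrameDecoder does for you: the receiver no longer cares how TCP chopped up or merged the byte stream.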

27. Do you know anything about HTTP status codes?

  • 1xx Informational
1. 100 Continue: Indicates that everything is fine so far; the client can continue sending the request or ignore this response.
  • 2xx Success
1. 200 OK
2. 204 No Content: The request was processed successfully, but the response contains no entity body. Typically used when the client only needs to send information to the server and no data needs to be returned.
3. 206 Partial Content: Indicates that the client sent a Range request; the response contains the entity content specified by Content-Range.
  • 3xx Redirection
1. 301 Moved Permanently: Permanent redirect.
2. 302 Found: Temporary redirect.
3. 303 See Other: Same function as 302, but 303 explicitly requires the client to use the GET method to obtain the resource.
4. 304 Not Modified: If the request header contains conditions such as If-Match, If-Modified-Since, If-None-Match, If-Range, or If-Unmodified-Since, and the condition is not met, the server returns 304.
5. 307 Temporary Redirect: Like 302, but the browser must not change the POST method to GET when following the redirect.
  • 4xx Client error
1. 400 Bad Request: The request message contains a syntax error.
2. 401 Unauthorized: The request requires authentication information (e.g., BASIC or DIGEST authentication). If the request was already made once, it means user authentication failed.
3. 403 Forbidden: The request was refused.
4. 404 Not Found
  • 5xx Server error
1. 500 Internal Server Error: An error occurred while the server was executing the request.
2. 503 Service Unavailable: The server is temporarily overloaded or down for maintenance and cannot process the request at this time.
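Python's standard library keeps the full registry of these codes in http.HTTPStatus, which is a quick way to check a code's reason phrase when revising:

```python
from http import HTTPStatus

# Look up the standard reason phrase for each code:
for code in (100, 204, 301, 304, 404, 503):
    print(code, HTTPStatus(code).phrase)
# → 100 Continue
# → 204 No Content
# → 301 Moved Permanently
# → 304 Not Modified
# → 404 Not Found
# → 503 Service Unavailable
```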


28. What do the HTTP status codes 301 and 302 represent? What's the difference?

301 and 302 are both HTTP status codes, and both indicate that a URL has moved.

  • The difference:
301 Moved Permanently: the resource has been moved permanently;
302 Found: the resource has been moved temporarily.

29. What is the difference between forward and redirect?

Forward and redirect are two ways of passing a request on: direct (server-side) forwarding and indirect (client-side) forwarding.

Forward: the client/browser sends only one request. A servlet, HTML page, JSP, or other resource responds to that request by delegating to a second resource on the server; objects saved in the request object are shared among these resources.
Redirect: the server responds to the first HTTP request by directing the browser to send a new request to another URL.
  • A colloquial example:

Direct forwarding (forward) is like: "A asks B for a loan; B has no money, so B asks C on A's behalf and passes the outcome, loan or no loan, back to A."
Indirect forwarding (redirect) is like: "A asks B for a loan; B has no money and tells A to go ask C himself."
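The difference can be made concrete with a toy model. The handle_forward and handle_redirect functions below are hypothetical stand-ins for a servlet container and a browser, not any real framework API:

```python
RESOURCES = {"/new": "resource content"}

def handle_forward(path, request_attributes):
    """Server-side forward: a single client request; attributes stored
    in the request object are shared with the second resource."""
    request_attributes["note"] = "set by the first resource"
    return 200, RESOURCES["/new"], request_attributes

def handle_redirect(path):
    """Redirect: the server answers 302 with a Location header; the
    browser must then issue a second, brand-new request."""
    return 302, {"Location": "/new"}

# Forward: one request, shared request attributes.
status, body, attrs = handle_forward("/old", {})
print(status, body, attrs["note"])

# Redirect, seen from the browser side: two requests in total.
status, headers = handle_redirect("/old")
body2 = RESOURCES[headers["Location"]] if status == 302 else None
print(status, body2)
```

In the forward case the client never learns that a second resource was involved; in the redirect case the browser's address bar changes because it really made a second request.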

30. What are HTTP methods?

The first line of the request packet sent by the client contains the method field.

1. GET: obtains a resource. The majority of requests on the web use GET.
2. HEAD: obtains the message header; similar to GET but does not return the message body.
3. POST: transmits an entity body.
4. PUT: uploads a file. Since HTTP itself provides no authentication mechanism, anyone could upload files, so PUT carries security risks.
5. PATCH: partially modifies a resource. PUT can also modify a resource, but only by completely replacing it; PATCH allows partial modification.
6. OPTIONS: queries the methods supported by the specified URL.
7. CONNECT: requires a tunnel to be established when communicating through a proxy server, so that traffic encrypted with Secure Sockets Layer (SSL) or Transport Layer Security (TLS) can be relayed through the tunnel.
8. TRACE: traces the path. The server returns the communication path to the client. When sending the request, a value is placed in the Max-Forwards header field; each server along the way decrements it by 1, and forwarding stops when it reaches zero. TRACE is rarely used, and it is vulnerable to XST (Cross-Site Tracing) attacks.
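In Python's urllib, the method is exactly what ends up in that first request line. The snippet below only constructs requests, it sends nothing over the network (example.com is a placeholder URL):

```python
from urllib.request import Request

get_req  = Request("http://example.com/resource")                 # GET by default
head_req = Request("http://example.com/resource", method="HEAD")  # explicit method
post_req = Request("http://example.com/resource", data=b"k=v")    # data implies POST

for req in (get_req, head_req, post_req):
    print(req.get_method())
# → GET
# → HEAD
# → POST
```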

31. What is the difference between GET and POST?

GET and POST are both HTTP requests at heart, but they have been assigned different roles and are adapted to different scenarios.

A commonly cited low-level difference: a GET request is sent in one piece, while some clients send a POST in two steps, headers first and then the body (for example via the Expect: 100-continue mechanism). At the HTTP level both are still a single request; this behavior depends on the client implementation.

1. In terms of function, GET is used to obtain resources from the server, and POST is used to update resources on the server.
2. From the perspective of REST services, GET is idempotent: reading the same resource always yields the same data. POST is not idempotent, because each request changes the resource. Further, GET does not change resources on the server, whereas POST does.
3. In terms of request parameters, the data of a GET request is appended to the URL, that is, placed in the request line of the HTTP message, with ? separating the URL from the data and parameters joined by &. If the data is plain alphanumerics, it is sent as is; otherwise it is percent-encoded as an application/x-www-form-urlencoded MIME string (spaces become +; Chinese and other non-ASCII characters become the hexadecimal representation of their UTF-8 bytes, e.g. %E4%BD%A0%E5%A5%BD, where each %XX is one byte in hex). A POST request places the submitted data in the body of the HTTP message.
4. In terms of security, POST is safer than GET, because the data of a GET request appears in plain text in the URL, while the parameters of a POST request are wrapped in the request body;
5. In terms of request size, a GET request is limited by the URL length allowed by the browser or server, so the amount of data that can be sent is relatively small, while the size of a POST request is not restricted in this way.
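The encoding rules in point 3 can be observed directly with urllib.parse:

```python
from urllib.parse import quote, urlencode

# Non-ASCII characters become the hex of their UTF-8 bytes:
print(quote("你好"))
# → %E4%BD%A0%E5%A5%BD

# application/x-www-form-urlencoded: spaces become '+',
# and parameters are joined with '&':
print(urlencode({"q": "hello world", "page": 2}))
# → q=hello+world&page=2
```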

32. What is the process from entering a URL in the browser to the page being displayed?

1. DNS resolution: The browser queries DNS to obtain the IP address corresponding to the domain name. The process includes searching the browser's DNS cache, searching the operating system's DNS cache, reading the local hosts file, and finally querying the local DNS server. If the queried domain name falls within the local DNS server's own zone configuration, it returns the resolution result to the client, completing resolution (an authoritative answer). If the domain name is not served by the local DNS server but the server has cached the mapping, it uses the cached mapping to complete resolution (a non-authoritative answer). If the local DNS server has no cached mapping either, it initiates a recursive or iterative query according to its settings.

2. TCP connection: After obtaining the IP address corresponding to the domain name, the browser sends a request to the server for establishing a connection and initiates a three-way handshake.

3. Send an HTTP request: After a TCP connection is established, the browser sends an HTTP request to the server.

4. The server processes the request and returns an HTTP packet: The server receives the request, maps the path parameters to a specific request processor, and returns the processing result and corresponding view to the browser.

5. The browser parses and renders the page: the browser parses and renders the view. If it encounters references to static resources such as JS files, CSS files and images, repeat the above steps and request these resources from the server. The browser renders the page according to the resources and data it requests, and finally presents a complete page to the user.

6. The connection ends.

33. What is the DNS resolution process?

1. The host performs a recursive query to the local DNS server. Recursive query: if the local DNS server queried by the host does not know the IP address of the requested domain name, the local DNS server itself sends query packets to a root DNS server as a DNS client, instead of letting the host carry out further queries on its own. A recursive query therefore returns either the IP address being sought or an error saying the required IP address could not be found.
2. The local DNS server performs an iterative query to the root DNS server. The characteristic of an iterative query is: when the root DNS server receives the iterative query request from the local DNS server, it either gives the IP address being sought or tells the local server which DNS server to query next, and lets the local server perform the subsequent queries itself. The root DNS server usually tells the local DNS server the IP address of a top-level DNS server it knows, and the local DNS server then queries that top-level DNS server. After receiving the query, the top-level DNS server either provides the IP address or tells the local server which authoritative DNS server to query next. Finally, the local DNS server obtains the IP address (or an error) and returns the result to the host that initiated the query.

34. What do you know about domain name caching?

To improve the efficiency of DNS queries, lighten server load, and reduce the number of DNS query packets on the Internet, domain name servers make extensive use of caches that store recently queried domain names and records of where their mapping information was obtained.

Because name-to-address bindings do not change often, the DNS server should set a timer for each cached item to keep the cache correct, and discard items that have lived longer than a reasonable time (for example, two days per item). When the server is asked about an item after it has been removed from the cache, it must go back to the authoritative server that manages that binding. When an authoritative server answers a query, the response carries a validity (time-to-live) value for the binding; increasing this value reduces network overhead, while decreasing it improves the accuracy of resolution.

Caching is not only required in the local domain name server, but also in the host. Many hosts download their entire database of names and addresses from the local server at startup, maintain a cache of their most recently used domain names, and use DNS only when names cannot be found in the cache. The host that maintains the local DNS database should periodically check the DNS server for new mapping information, and the host must remove invalid entries from the cache. Since domain name changes are infrequent, most nodes can maintain database consistency with little effort.
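The timer-per-item idea described above is essentially a TTL cache. A minimal sketch follows, with an injectable clock so the expiry logic is easy to demonstrate; the names (DnsCache, the two-day default TTL mirroring the example in the text) are illustrative, not any real resolver's API:

```python
import time

class DnsCache:
    """Toy DNS cache: each entry expires after `ttl` seconds."""
    def __init__(self, ttl=2 * 24 * 3600, clock=time.monotonic):
        self.ttl, self.clock, self._store = ttl, clock, {}

    def put(self, name, address):
        self._store[name] = (address, self.clock() + self.ttl)

    def get(self, name):
        entry = self._store.get(name)
        if entry is None:
            return None                 # miss: must ask the authoritative server
        address, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[name]       # timer fired: discard the stale binding
            return None
        return address

# A simulated clock makes the expiry visible without waiting:
now = [0.0]
cache = DnsCache(ttl=5, clock=lambda: now[0])
cache.put("example.com", "93.184.216.34")
print(cache.get("example.com"))  # → 93.184.216.34
now[0] = 6.0
print(cache.get("example.com"))  # → None (expired, must re-query)
```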

35. What is your understanding of HTTP long and short connections? Which scenarios do they apply to?

Short connections are used by default in HTTP/1.0. That is, each time the client and server perform an HTTP operation, a connection is established and broken at the end of the task. When the client browser accesses an HTML or other type of Web page that contains other Web resources (such as JavaScript files, image files, CSS files, etc.), the browser re-establishes an HTTP session each time it encounters such a Web resource.

From HTTP/1.1 onwards, long connections are used by default, to preserve the connection. An HTTP response using a long connection carries this header line:

Connection: keep-alive

In the case of a long connection, when a web page is opened, the TCP connection between the client and the server for the transmission of HTTP data is not closed. When the client accesses the server again, it continues to use the established connection.

Keep-alive does not hold a connection forever, but has a hold time that can be set in different server software such as Apache. To implement persistent connections, both clients and servers must support persistent connections.
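Connection reuse can be observed with the standard library alone. The sketch below runs a local HTTP/1.1 server and checks that two requests travel over the same TCP socket (relying on conn.sock, an implementation detail of http.client, which is fine for a demonstration):

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"        # HTTP/1.1 => keep-alive by default
    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):        # keep the demo output clean
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = HTTPConnection("127.0.0.1", server.server_address[1])
conn.request("GET", "/a")
resp_a = conn.getresponse(); resp_a.read()
first_socket = conn.sock                 # remember the underlying TCP socket
conn.request("GET", "/b")                # reused: no new TCP handshake
resp_b = conn.getresponse(); resp_b.read()
reused = conn.sock is first_socket
print(reused)  # → True: one connection carried both requests
server.shutdown()
```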

36. What are the major changes between HTTP 1.0, 1.1 and 2.0?

  • Key changes to HTTP1.1:
1. HTTP1.0 had been in use for years, and 1.1 proposed several improvements. The first was the introduction of long connections: HTTP can send requests continuously over one TCP connection.
2. HTTP1.1 also supports sending only the headers without a body. The idea is to use the headers first to determine whether the request can succeed, and only then send the data, saving bandwidth; some clients do exactly this for POST requests via the Expect: 100-continue mechanism.
3. HTTP1.1 added the Host field. Since one server can host multiple domain names (virtual hosting), the request must carry the target domain name in the Host header.
  • Key changes to HTTP2.0:

1. HTTP2.0 supports multiplexing: the same connection can handle multiple requests concurrently. HTTP messages are broken into multiple frames that are sent concurrently and reassembled by their identifiers on the other end, so HTTP requests no longer have to arrive in order;

2. HTTP2.0 supports server push: after an HTTP request arrives, the server can push additional content to the client besides returning the requested data;
3. HTTP2.0 compresses request headers, and its basic unit is the binary frame, which takes up less space;
4. HTTP2.0 is in practice used over HTTPS, that is, with an SSL/TLS layer added between HTTP and TCP.

37. How does HTTPS work?

1. The client sends the encryption rules (cipher suites) it supports to the server, indicating that it wants to establish a connection.
2. The server selects an encryption algorithm and a hash algorithm, and sends its certificate to the browser along with its identity information. The certificate contains the server information, the public key, and the issuing certificate authority.
3. After receiving the website's certificate, the client does the following:
  • 3.1 Verify the validity of the certificate;

  • 3.2 If the certificate is trusted, the browser generates a random value and encrypts it with the public key in the certificate;

  • 3.3 Compute the hash of the handshake messages, encrypt it with the generated key, and send everything to the server together.

4. After receiving the client's message, the server does the following:
  • 4.1 Use its private key to recover the random value (the key), use that key to decrypt the handshake message, and verify that the hash matches the one sent by the browser;

  • 4.2 Encrypt its own handshake message with the key and send it back.

5. If the hash values match, the handshake succeeds.
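Step 3.1 (certificate validation) is what Python's ssl module enforces by default when you create a client context; no network connection is made here, we only inspect the defaults:

```python
import ssl

# A default client-side context performs certificate validation automatically:
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # → True: the certificate chain must validate
print(ctx.check_hostname)                    # → True: the certificate must match the host name
```

After a real handshake, the negotiated cipher suite from steps 1 and 2 would be visible via the wrapped socket's cipher() method.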

38. What is the difference between HTTP and HTTPS?

1. Cost: HTTPS requires applying to a CA for a certificate; free certificates are rare, so there is usually a fee.
2. Resource consumption: HTTP is the hypertext transfer protocol and transmits information in plain text, while HTTPS transmits over encrypted SSL/TLS, which consumes more CPU and memory.
3. Different ports: HTTP and HTTPS use entirely different connection methods and different ports, 80 for the former and 443 for the latter;
4. Security: an HTTP connection is simple and stateless; HTTPS is HTTP over TLS, a protocol that encrypts transmission and authenticates identity, and it is more secure than HTTP.

39. Advantages and disadvantages of HTTPS?

  • Advantages:
1. HTTPS authenticates users and servers, ensuring that data is sent to the correct client and server.
2. HTTPS is a protocol built from SSL/TLS plus HTTP for encrypted transmission and identity authentication. It is more secure than HTTP, preventing data from being stolen or altered in transit and guaranteeing integrity.
3. HTTPS is the most secure solution under the current architecture; while not absolutely secure, it significantly raises the cost of man-in-the-middle attacks.
  • Disadvantages:
1. The HTTPS handshake is time-consuming, lengthening page load time by nearly 50% and increasing power consumption by 10% to 20%.
2. HTTPS connection caching is less efficient than HTTP's, which increases data overhead and power consumption, and can even affect existing security measures.
3. SSL certificates cost money; the more capable the certificate, the higher the fee, so personal and small websites generally do not use one unless necessary.
4. SSL certificates usually need to be bound to an IP address, and multiple domain names cannot share the same IP address, because IPv4 resources cannot support that level of consumption.
5. The encryption provided by HTTPS has a limited scope and does little against hacking, denial-of-service attacks, or server hijacking. Most importantly, the SSL certificate trust chain is not fully secure: especially where some states can control CA root certificates, man-in-the-middle attacks remain feasible.

40. What is a digital signature?

To prevent data from being tampered with in transit (for example, a hacker modifying the content of your message without your knowledge), the sender creates a digital signature: it computes a digest of the data (for example with MD5), encrypts that digest with its private key to obtain the signature, and sends it along with the data. The receiver computes the digest of the received data in the same way and compares it against the digest recovered from the signature with the sender's public key; if they match, the data is genuine.
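The digest-comparison half of this process can be shown with hashlib. A real signature additionally encrypts the digest with the sender's private key, which needs a crypto library; this sketch only illustrates tamper detection via digests (MD5 is used to match the text, though SHA-256 is preferred today):

```python
import hashlib

def digest(data: bytes) -> str:
    # MD5 to match the text; in practice prefer hashlib.sha256.
    return hashlib.md5(data).hexdigest()

message = b"transfer 100 coins to Alice"
signed_digest = digest(message)        # the sender computes (and would sign) this

# The receiver recomputes the digest over what actually arrived and compares:
print(digest(message) == signed_digest)                         # → True
print(digest(b"transfer 900 coins to Mallory") == signed_digest)  # → False
```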

41. What is a digital certificate?

In asymmetric encryption, each party encrypts with the other's public key. Although digital signatures can ensure that data is not tampered with, the signature is verified with the public key, and if the public key itself is replaced, data can still be forged, because the user does not know that the public key provided by the other party is actually fake. Therefore, to guarantee that the sender's public key is genuine, a CA certificate authority issues a certificate vouching for it. When the user requests the server, the server presents the certificate to the user, and the certificate is verified against the certificates built into the system.

42. What are symmetric encryption and asymmetric encryption?

Symmetric key encryption means that encryption and decryption use the same key. The biggest problem in this mode is key transmission, that is, how to securely send the key to the other party.

Asymmetric encryption refers to the use of a pair of asymmetric keys, that is, a public key and a private key. The public key can be distributed freely, but the private key is known only to itself. The party that sends the ciphertext uses the other party’s public key for encryption. After receiving the encrypted information, the other party uses its own private key to decrypt the encrypted information.

Asymmetric encryption is secure because the private key used for decryption never has to be transmitted. But it is very slow compared with symmetric encryption, so in practice we still use symmetric encryption for the messages themselves, while the symmetric key is exchanged using asymmetric encryption.
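The hybrid scheme in the last paragraph (asymmetric to exchange the key, symmetric for the bulk data) can be illustrated with a deliberately toy symmetric cipher: a repeating-key XOR. Real systems use AES for this step, and the key exchange would use RSA or Diffie-Hellman; the XOR cipher here only demonstrates the "same key encrypts and decrypts" property:

```python
import secrets

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: applying it twice with the same key
    restores the original data. NOT secure; illustration only."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# 1. The sender generates a random symmetric session key ...
session_key = secrets.token_bytes(16)
# 2. ... which would be sent to the receiver encrypted under the
#    receiver's PUBLIC key (omitted: needs an RSA/ECC library).
# 3. Bulk data then travels under the fast symmetric cipher:
ciphertext = xor_cipher(b"hello over the wire", session_key)
plaintext = xor_cipher(ciphertext, session_key)   # the same key decrypts
print(plaintext)  # → b'hello over the wire'
```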