Computer networking is a required course for computer science and technology majors, and its concepts are used across mobile, front-end, and back-end development, so its importance is easy to see. As a result, it has become a common topic in iOS interviews. If you are preparing for an interview, don't skip the networking material. Here is a summary of networking knowledge that I find useful and that has come up in recent interviews.

Last year I wrote a summary of the book Illustrated TCP/IP, which you may also find worth reading.

How are computer networks layered

There are two layered models for networks. One is the OSI (Open Systems Interconnection) model defined by ISO (International Organization for Standardization), which divides the network into seven layers. The other is the TCP/IP four-layer model. OSI is an academic international standard and an idealized concept, while TCP/IP is the de facto international standard and is widely used in practice. The relationship between the two can be seen in this diagram:

Note: There is also a five-layer model, which differs from the four-layer model at the bottom of the stack. The five-layer model keeps the data link layer and the physical layer as two separate layers, while the four-layer model combines them into one layer called the network interface layer. As an interview question, you are generally expected to be able to name the seven layers of the OSI model.
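As a rough text stand-in for the mapping between the two models, here is a minimal Python sketch. The names `OSI_TO_TCPIP`, `tcpip_layer_for`, and `tcpip_layers` are made up for this illustration, and the mapping follows the usual textbook convention rather than any formal specification.

```python
# OSI layers from top (7) to bottom (1), each paired with the layer of the
# TCP/IP four-layer model it is conventionally mapped to.
OSI_TO_TCPIP = [
    ("Application", "Application"),
    ("Presentation", "Application"),
    ("Session", "Application"),
    ("Transport", "Transport"),
    ("Network", "Internet"),
    ("Data Link", "Network Interface"),
    ("Physical", "Network Interface"),
]


def tcpip_layer_for(osi_layer):
    """Return the TCP/IP layer that covers the given OSI layer name."""
    for osi, tcpip in OSI_TO_TCPIP:
        if osi.lower() == osi_layer.lower():
            return tcpip
    raise KeyError(osi_layer)


def tcpip_layers():
    """Return the distinct TCP/IP layers, top to bottom."""
    seen = []
    for _, tcpip in OSI_TO_TCPIP:
        if tcpip not in seen:
            seen.append(tcpip)
    return seen
```

Calling `tcpip_layers()` collapses the seven OSI rows into the four TCP/IP layers, which is the point interviewers usually want to hear.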

The meaning of each layer and the relationship between them is shown in this diagram:

The HTTP protocol

HTTP Protocol Features

  • HTTP is an application-layer protocol built on TCP/IP. The default port number is 80.
  • Flexible: HTTP allows data objects of any type to be transferred. The type being transmitted is marked by Content-Type.
  • Connectionless: Connectionless means that each connection handles only one request. Once the server has processed the client's request and received the client's acknowledgement, it disconnects. (This describes HTTP/1.0; HTTP/1.1 keeps connections alive by default.)
  • Stateless: The HTTP protocol is stateless, meaning the protocol has no memory of earlier transactions. If later processing needs earlier information, that information must be sent again.

Request method

  • GET: Requests the resource identified by the Request-URI. Request parameters are appended to the URL, where they are visible in plain text.

  • POST: Submits data to the resource identified by the Request-URI. It is used to modify a resource on the server or to submit a resource to the server. The data is placed in the request body, and you can specify its encoding. Because the parameters do not appear in the URL they are less exposed than with GET, although the body is only protected in transit when HTTPS is used.

  • HEAD: Requests only the response headers for the resource identified by the Request-URI.

  • PUT: Asks the server to store a resource and identify it with the Request-URI.

  • DELETE: Asks the server to delete the resource identified by the Request-URI.

  • TRACE: Asks the server to send back the request it received, mainly for testing or diagnosis.

  • OPTIONS: Queries the capabilities of the server, or the options and requirements associated with a resource.
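To make the GET/POST difference concrete, here is a small Python sketch that builds the raw text of both requests. The helper names `build_get` and `build_post`, and any host, path, and parameters passed to them, are assumptions made for illustration only.

```python
from urllib.parse import urlencode


def build_get(host, path, params):
    """Raw GET request: the parameters travel in the URL query string."""
    target = f"{path}?{urlencode(params)}"
    return f"GET {target} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def build_post(host, path, params):
    """Raw POST request: the parameters travel in the body,
    and Content-Type says how the body is encoded."""
    body = urlencode(params)
    return (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body.encode())}\r\n"
        "\r\n"
        f"{body}"
    )
```

Printing `build_get("example.com", "/search", {"user": "tom"})` and the matching `build_post(...)` call side by side shows the same parameters moving from the request line into the body.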

Request and response packets

Take this link as an example: zhangferry.com/2019/08/31/…

View the Headers information for its request in Chrome.

General

This section shows the request URL and the request method, which is GET. The status code is 304, meaning the file has not been modified and the cached copy can be used directly. The remote address is 185.199.111.153:443, which is a GitHub server address, because my blog is deployed on GitHub.

Besides 304, there are other status codes, which are:

  • 200 OK: The client request succeeded.
  • 301 Moved Permanently: The request is permanently redirected.
  • 302 Found (formerly Moved Temporarily): The request is temporarily redirected.
  • 304 Not Modified: The file has not been modified. You can use the cached file directly.
  • 400 Bad Request: The server could not understand the client request because of a syntax error.
  • 401 Unauthorized: The request is unauthorized. This status code must be used with the WWW-Authenticate header field.
  • 403 Forbidden: The server received the request but refused to serve it. The server usually gives the reason in the body of the response.
  • 404 Not Found: The requested resource does not exist, for example because an incorrect URL was entered.
  • 500 Internal Server Error: The server could not complete the client request because of an unexpected error.
  • 503 Service Unavailable: The server is currently unable to process client requests and may recover after a period of time.
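If you want to check a status code's standard reason phrase and class quickly, Python's standard library already knows them. The sketch below uses `http.HTTPStatus`; the helper name `describe` and the class labels are my own wording.

```python
from http import HTTPStatus

# The first digit of a status code gives its class.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return (reason phrase, class) for a standard HTTP status code."""
    return HTTPStatus(code).phrase, STATUS_CLASSES[code // 100]
```

For example, `describe(304)` returns the phrase "Not Modified" together with the class "redirection".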

The Response Headers:

Content-Encoding: the compression algorithm applied to the response body

Content-Length: the size of the resource in bytes, written as a decimal number

Content-Type: the media type of the resource. The content type shown in the figure is HTML text, and the text encoding is UTF-8

Last-Modified: the time the resource was last modified, June 8 in this example

Status: 304, the status code meaning the file has not been modified

Note: In a response header, Content-Type indicates the format the client should use to parse the content. In a request header, it indicates the format of the content being uploaded to the server.
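Here is a minimal sketch of reading these response headers in Python. HTTP headers use the same "Name: value" syntax as email headers, so the standard `email` parser can handle them. The header values in `RAW_HEADERS` are invented for the example (only the HTML type and UTF-8 charset come from the text above), and `parse_response_headers` is a hypothetical helper name.

```python
from email import message_from_string

# Example header block; gzip and 1024 are made-up values.
RAW_HEADERS = (
    "Content-Encoding: gzip\r\n"
    "Content-Length: 1024\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "\r\n"
)


def parse_response_headers(raw):
    """Pull out the fields discussed above from a raw header block."""
    msg = message_from_string(raw)
    return {
        "encoding": msg["Content-Encoding"],
        "length": int(msg["Content-Length"]),
        "media_type": msg.get_content_type(),
        "charset": msg.get_content_charset(),
    }
```

Note how Content-Type carries two pieces of information: the media type and, as a parameter, the text encoding.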

The Request Headers:

:method: the request method, GET here

:path: the URL path

:scheme: the URL scheme, HTTPS here

Accept: tells the server which content types the client can handle in the response.

Accept-Encoding: the encodings, usually compression algorithms, that the client accepts for the returned resource.

Accept-Language: tells the server which languages the client prefers for the response. This is only a hint and is not necessarily under the user's full control, so the server should take care not to override the user's explicit choices (such as a language selected from a drop-down list).

Cookie: the cookies the browser holds for this site

User-Agent: the user agent string, which identifies the operating system and browser engine

For the meaning of more request header fields, see HTTP headers

TCP's three-way handshake and four-way wave, and why three and four

Before looking at the TCP handshake, let's look at the format of a TCP segment:

The control flags (such as SYN, ACK, and FIN) mark which stage of the handshake a segment belongs to.

TCP Three-way handshake

The schematic is as follows:

The three-way handshake means that when a TCP connection is established, the client and server need to send a total of three data packets.

1, first handshake (SYN=1, seq=x)

The client sends a TCP packet with the SYN flag set to 1, addressed to the server port it intends to connect to, with its initial sequence number x stored in the Sequence Number field of the packet header.

After sending it, the client enters the SYN_SENT state.

2, second handshake (SYN=1, ACK=1, seq=y, ACKnum=x+1)

The server responds with a packet in which both the SYN flag and the ACK flag are 1. The server chooses its own initial sequence number (ISN) y, places it in the Sequence Number field, and sets the Acknowledgement Number to the client's ISN plus 1 (x+1). After sending it, the server enters the SYN_RCVD state.

3, third handshake (ACK=1, ACKnum=y+1)

The client sends an acknowledgement packet with the SYN flag set to 0 and the ACK flag set to 1. It places the server's sequence number plus 1 (y+1) in the Acknowledgement Number field and its own ISN plus 1 (x+1) in the Sequence Number field.

After sending the packet, the client enters the ESTABLISHED state. When the server receives the packet, it also enters the ESTABLISHED state, and the TCP handshake ends.
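The sequence and acknowledgement numbers in the three steps can be traced with a small simulation. This is only a sketch of the bookkeeping, not a real TCP stack; the function name `three_way_handshake` and the dictionary field names are assumptions for illustration.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn, server_isn):
    """Return the three segments of the handshake as dicts of header fields.

    client_isn is x and server_isn is y in the description above.
    """
    syn = {"from": "client", "SYN": 1, "ACK": 0, "seq": client_isn}
    syn_ack = {
        "from": "server", "SYN": 1, "ACK": 1,
        "seq": server_isn,
        "ack_num": (syn["seq"] + 1) % SEQ_MOD,      # x + 1
    }
    ack = {
        "from": "client", "SYN": 0, "ACK": 1,
        "seq": (client_isn + 1) % SEQ_MOD,          # x + 1
        "ack_num": (syn_ack["seq"] + 1) % SEQ_MOD,  # y + 1
    }
    return [syn, syn_ack, ack]
```

With x = 100 and y = 300, the second segment acknowledges 101 and the third acknowledges 301, matching ACKnum=x+1 and ACKnum=y+1.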

Question 1: Why do you need three handshakes?

Xie Xiren's “Computer Networks” explains it as “to prevent an invalidated connection request segment from suddenly reaching the server and causing an error.” How should we understand this? Suppose a segment from an earlier first handshake was held up in the network for a long time and only reaches the server after the client has given up on it. The server responds with an acknowledgement. If two handshakes were enough to establish a connection, the server would now consider the connection established and wait for the client to send data. But this was only a delayed, abandoned request, so the client never sends anything and server resources are wasted. With a third handshake, the client simply does not acknowledge, and the server knows that no connection was established.

From another perspective, TCP is a full-duplex protocol, so both ends must confirm that they can reliably send and receive. Over the three handshakes, the following is confirmed:

First handshake: The server confirms that its own receiving works and that the client's sending works.

Second handshake: The client confirms that its own sending and receiving work, and that the server's sending and receiving work.

Third handshake: The server confirms that its own sending works and that the client's receiving works.

To achieve full duplex, three handshakes are needed so that each side knows that both it and the other party can send and receive.

TCP Four-way wave

The schematic is as follows:

The four-way wave means that four packets are sent in total, and its purpose is to close the connection.

1, first wave (FIN=1, seq=x)

Suppose the client wants to close the connection. The client sends a packet with the FIN flag set to 1, indicating that it has no more data to send but can still receive data.

After sending it, the client enters the FIN_WAIT_1 state.

2, second wave (ACK=1, ACKnum=x+1)

The server acknowledges the client’s FIN packet and sends an acknowledgement packet indicating that it has received the client’s request to close the connection, but is not ready to close the connection.

After sending, the server enters the CLOSE_WAIT state. After receiving the acknowledgement packet, the client enters the FIN_WAIT_2 state and waits for the server to close the connection.

3, third wave (FIN=1, seq=y)

When the server is ready to close the connection, it sends the client a close request with FIN set to 1.

After sending it, the server enters the LAST_ACK state and waits for the final ACK from the client.

4, fourth wave (ACK=1, ACKnum=y+1)

The client receives the close request from the server, sends an acknowledgement packet, and enters the TIME_WAIT state, where it waits in case that ACK is lost and has to be sent again.

After receiving the acknowledgement packet, the server closes the connection and enters the CLOSED state.

If the client does not receive a retransmitted FIN from the server within a fixed period (2MSL, twice the Maximum Segment Lifetime), it concludes that the server has closed the connection and enters the CLOSED state itself.
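The state changes on both sides can be laid out as a short trace. The sketch below assumes the client closes first and that no packet is lost; `four_way_wave` is a name made up for this example, and each entry records the states once the packet has been delivered.

```python
def four_way_wave():
    """Trace an active close by the client.

    Each entry is (event, flag, client_state, server_state).
    """
    client, server = "ESTABLISHED", "ESTABLISHED"
    trace = []

    # 1st wave: the client has no more data to send.
    client = "FIN_WAIT_1"
    trace.append(("client -> server", "FIN", client, server))

    # 2nd wave: the server acknowledges but may still have data to send.
    server = "CLOSE_WAIT"
    client = "FIN_WAIT_2"
    trace.append(("server -> client", "ACK", client, server))

    # 3rd wave: the server has finished sending and closes its side.
    server = "LAST_ACK"
    trace.append(("server -> client", "FIN", client, server))

    # 4th wave: the client acknowledges and starts the 2MSL timer.
    client = "TIME_WAIT"
    server = "CLOSED"
    trace.append(("client -> server", "ACK", client, server))

    # No retransmitted FIN arrived within 2MSL.
    client = "CLOSED"
    trace.append(("2MSL timer expires", "-", client, server))
    return trace
```

Printing the trace row by row gives a compact table to memorize: the side that closes first is the one that ends up waiting in TIME_WAIT.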

Question 1: Why do you need four waves? Why can't the ACK and FIN packets be sent together?

When the server receives a FIN packet, it may not be able to close the socket immediately, because it may still have data to send. So it first replies with an ACK to tell the client that its FIN was received, and only sends its own FIN after all of its data has been sent. The ACK and the FIN are therefore sent separately, which is why four waves are required.

Question 2: Why does the client have to wait 2MSL in TIME_WAIT before entering the CLOSED state?

MSL is the maximum lifetime of a TCP segment in the network. Waiting 2MSL in TIME_WAIT ensures that any undelivered or delayed segments in both directions have disappeared, and gives the final ACK the best chance of arriving. If the last ACK is lost, the server will resend its FIN. Even if the client process has already exited, the TCP connection is still held in TIME_WAIT, so the client can still resend the final ACK.

The HTTPS flow

HTTPS = HTTP + TLS/SSL (default port 443)

1. The client sends its first request to the server, telling it the protocol versions, encryption algorithms, and compression algorithms it supports, along with a random number it has generated (client random).

2. The server confirms the encryption method both parties will use, and returns the server certificate and a random number generated by the server (server random).

3. After receiving the certificate, the client first verifies that the certificate is valid, then generates a new random number (premaster secret), encrypts it with the public key in the digital certificate, and sends it to the server.

4. After receiving the encrypted random number, the server decrypts it with its private key to obtain the random number (premaster secret).

5. The server and the client each use the three random numbers (client random, server random, premaster secret) to generate the “session key” with the agreed encryption method. This key is used to encrypt the rest of the conversation (symmetric encryption).
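Step 5 can be illustrated with a toy key derivation in Python. This is not the real TLS key derivation function, which is more involved; it only assumes that mixing the three values with HMAC-SHA256 is a fair stand-in for the idea. The function name `derive_session_key` is hypothetical.

```python
import hashlib
import hmac


def derive_session_key(client_random, server_random, premaster_secret):
    """Toy derivation: HMAC-SHA256 keyed by the secret premaster value
    over the two public random numbers.

    Real TLS uses its own PRF/HKDF construction; this only shows that
    both sides can compute the same key from the same three inputs.
    """
    return hmac.new(
        premaster_secret,
        client_random + server_random,
        hashlib.sha256,
    ).digest()
```

Both sides hold the same three inputs after step 4, so both compute the same 32-byte key without ever sending it. An eavesdropper who sees only the two plaintext random numbers is missing the HMAC key and cannot repeat the computation.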

For an introduction to HTTPS that goes from the basics to the details, see: Learn HTTPS by picture

Question 1: Why does the handshake need three random numbers when security depends only on the third one?

The first two random numbers are transmitted in plain text and may be intercepted. The third random number is encrypted with the certificate's public key, so it is what secures the whole process. The purpose of the first two random numbers is to make the final session key “more random.”

Question 2: How does Charles implement HTTPS interception?

Charles implements HTTPS interception by installing Charles’s certificate on the client and trusting it. Charles then acts as a middleman, acting as a server to the client and a client to the server.

Question 3: Why are the captured contents of some HTTPS requests (such as WeChat's) still encrypted, and how is this achieved?

I could not capture any requests from a chat session, but I did capture an encrypted message when a mini program started. Triggering the link manually downloads an encrypted file. My guess is that this is content-level encryption, with decryption done by the client itself rather than as part of the HTTPS setup.

While looking into this, I came across a few other interesting questions:

1. The three HTTPS requests shown in the figure have three different types of icons. What do they mean?

Thanks iOS for that simple answer. The first icon means HTTP/2.0 and the second means HTTP/1.1. The third icon shows a lock because I only configured Charles to capture requests on port 443, while this request uses port 5228, so its content cannot be inspected.

2. For the third request, https://mtalk.google.com:5228, both the icon and the content show a lock. Does this lock mean an extra layer of encryption on top of HTTPS?

I have no definite answer to these questions. If you know, please tell me.

DNS Resolution Process

DNS stands for Domain Name System. It is a distributed database that maps domain names to IP addresses on the Internet, so that users can reach the corresponding server (IP address) through a domain name. The detailed resolution process is as follows:

1. You enter the domain name of the website you want to visit in the browser's address bar. The operating system checks whether the local hosts file contains a mapping for this domain name. If it does, that mapping is used; if not, go to step two.

2. The client sends a query to the local DNS server. If the local DNS server finds the domain name in its local zone configuration, it returns the result to the client. If not, go to step three.

3. Depending on the settings of the local DNS server, a recursive or iterative query is performed until the resolution is complete.

Recursive and iterative queries are illustrated in the following two figures.

Recursive query

As shown in the figure, in a recursive query the DNS servers pass the query along level by level and hand the result back the same way.

Iterative query

As shown in the figure, in an iterative query each DNS server replies with the next DNS server to ask, and the client itself sends each of the queries.
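The three resolution steps and the iterative style of query can be simulated with plain dictionaries. Everything in this sketch is toy data: the server names, the `HOSTS`, `LOCAL_DNS_CACHE`, and `SERVERS` tables, and the address 192.0.2.10 (taken from a range reserved for documentation) are assumptions for illustration, and no real DNS traffic is sent.

```python
HOSTS = {"localhost": "127.0.0.1"}  # stand-in for the local hosts file
LOCAL_DNS_CACHE = {}                # stand-in for the local DNS server's records

# Each toy server either refers the asker to another server or answers.
SERVERS = {
    "root": {"refer": {"com": "tld-com"}},
    "tld-com": {"refer": {"example.com": "ns-example"}},
    "ns-example": {"answer": {"www.example.com": "192.0.2.10"}},
}


def iterative_lookup(name):
    """Ask one server at a time, following each referral ourselves."""
    server, hops = "root", []
    while True:
        hops.append(server)
        zone = SERVERS[server]
        if name in zone.get("answer", {}):
            return zone["answer"][name], hops
        # Follow the referral whose suffix matches the queried name.
        referrals = zone.get("refer", {})
        nxt = next(
            (s for suffix, s in referrals.items()
             if name == suffix or name.endswith("." + suffix)),
            None,
        )
        if nxt is None:
            raise LookupError(name)
        server = nxt


def resolve(name):
    """Return (ip, where the answer came from)."""
    if name in HOSTS:                # step 1: hosts file
        return HOSTS[name], "hosts"
    if name in LOCAL_DNS_CACHE:      # step 2: local DNS server
        return LOCAL_DNS_CACHE[name], "local-dns"
    ip, _ = iterative_lookup(name)   # step 3: query upstream servers
    LOCAL_DNS_CACHE[name] = ip
    return ip, "iterative"
```

The first `resolve("www.example.com")` walks root, then the .com server, then the authoritative server. A second call is answered from the local cache, which is why repeated lookups are fast.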

DNS hijacking

DNS hijacking happens at the DNS server: when a client asks it to resolve a domain name, it returns the wrong server (IP) address.

Common solutions are to use your own resolution server, or to deliver the IP address for the domain to the client directly so that DNS resolution is bypassed.

The difference between cookies and sessions

HTTP is a stateless protocol, meaning that it does not track state across requests and responses. In other words, the server cannot tell who the client is from the network connection alone.

So what can be done? Issue a pass to each client, one per client, and require whoever visits to carry their pass. The server can then identify the client from the pass. That is how cookies work.

  • Cookie: A cookie is a client-side mechanism for saving information about the user. In practice, a cookie is a small piece of text that the server stores on the client machine and that the client sends back to the server with each request. Cookie technology manages client state by writing cookie information into request and response packets.

  • Session: A session is a server-side mechanism that stores information in a structure similar to a hash table. When a request arrives, the server first checks whether it already carries a session ID. If so, it retrieves the session by that ID. If not, it creates a session with a new session ID and returns that ID to the client in the response.

There are the following differences:

1. Storage location: Cookies are stored on the client, and session data is stored on the server.

2. Sessions rely on session IDs, which are usually stored in cookies. If the browser disables cookies, the session stops working unless the ID is passed another way, such as in the URL.

3. Security: Cookies are stored in the browser and may be copied or tampered with by other programs. Keeping the data in a session on the server is much safer.

4. Performance: Sessions are kept on the server for a period of time, so as traffic grows they put pressure on the server. Where reducing server load matters, consider keeping the data in cookies instead.
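The interplay between the two mechanisms can be sketched in a few lines of Python: the session data lives in a server-side dictionary, and only the session ID travels in a cookie. The names `SESSIONS`, `handle_request`, and the cookie name `session_id` are assumptions for this example, not part of any particular framework.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}  # server side: session ID -> session data


def handle_request(cookie_header):
    """Handle one request given its Cookie header (or None).

    Returns (session data, Set-Cookie value or None).
    """
    cookie = SimpleCookie(cookie_header or "")
    sid = cookie["session_id"].value if "session_id" in cookie else None
    set_cookie = None
    if sid not in SESSIONS:
        # Unknown or missing ID: create a session and hand out its ID.
        sid = secrets.token_hex(16)
        SESSIONS[sid] = {"visits": 0}
        out = SimpleCookie()
        out["session_id"] = sid
        out["session_id"]["httponly"] = True
        set_cookie = out["session_id"].OutputString()
    SESSIONS[sid]["visits"] += 1
    return SESSIONS[sid], set_cookie
```

On the first request the server returns a Set-Cookie value. When the client sends that cookie back, the server finds the same session and no new cookie is issued. The visit counter never leaves the server, which is the security difference described in point 3.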

What is a CDN for

A Content Delivery Network (CDN) delivers a website's content to the “edge” of the network closest to users, so as to improve access speed. In summary: CDN = Mirror + Cache + GSLB (global server load balancing).

At present, CDNs mainly cache a website's static data, such as CSS, JS, images, and static pages. Users request dynamic content from the origin server and download the static data from the CDN, which speeds up page loading. For example, more than 90% of Taobao's data is served by CDN.

CDN workflow

Suppose a user requests a static file (such as a CSS file) whose domain name is www.baidu.com. The domain name resolves to the CDN's global load balancing server, which assigns the user to the nearest CDN node. The user then requests the static file directly from that node. If the node does not have the file, the node goes back to the origin site to fetch it and then returns it to the user.
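The workflow can be modelled as a cache in front of an origin. In this sketch, `Origin`, `EdgeNode`, and `pick_node` are hypothetical names, and "nearest" is reduced to a simple region match, which is far cruder than real global load balancing.

```python
class Origin:
    """The origin site, which holds every file."""

    def __init__(self, files):
        self.files = files
        self.fetch_count = 0

    def fetch(self, path):
        self.fetch_count += 1
        return self.files[path]


class EdgeNode:
    """A CDN node that caches whatever it has fetched from the origin."""

    def __init__(self, region, origin):
        self.region = region
        self.origin = origin
        self.cache = {}

    def get(self, path):
        if path in self.cache:
            return self.cache[path], "HIT"
        body = self.origin.fetch(path)  # go back to the origin site
        self.cache[path] = body
        return body, "MISS"


def pick_node(nodes, user_region):
    """Toy load balancer: a node in the user's region, else the first node."""
    for node in nodes:
        if node.region == user_region:
            return node
    return nodes[0]
```

The first request for a file at a node is a miss that reaches the origin. Every later request for that file at the same node is served from the node's cache, so the origin sees far less traffic.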

Reference: In-depth understanding of Http requests, DNS hijacking, and resolution

The role of the Socket

Sockets sit between the application layer and the transport layer:

Their role is to make it easier for the application layer to transmit data through the transport layer. In essence, a socket is an encapsulation of TCP/IP, so applications can communicate by calling the socket API directly. The three-way handshake and four-way wave described above are carried out through sockets.

In iOS's network library layering we can find BSD Sockets, which sits beneath CFNetwork. CFNetwork also contains CFSocket, which is presumably a wrapper around BSD Sockets.
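To see the socket API at work, here is a minimal loopback echo in Python, whose `socket` module is a thin wrapper over the same BSD socket calls. The function name `echo_once` is made up for this sketch, and it assumes the machine allows TCP connections on 127.0.0.1.

```python
import socket
import threading


def echo_once(message):
    """Send bytes to a one-shot echo server on loopback and return the reply."""
    # Port 0 asks the operating system to pick any free port.
    with socket.create_server(("127.0.0.1", 0)) as server:
        port = server.getsockname()[1]

        def serve():
            # accept() hands back a connection whose three-way handshake
            # has already been completed by the operating system.
            conn, _ = server.accept()
            with conn:
                conn.sendall(conn.recv(1024))

        worker = threading.Thread(target=serve)
        worker.start()

        # connect (inside create_connection) triggers the handshake;
        # leaving the with-block closes the socket and starts the wave.
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            reply = client.recv(1024)

        worker.join()
    return reply
```

The application code never builds a SYN or FIN packet itself. It only calls connect, send, receive, and close, and the operating system performs the handshake and the wave underneath.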

What is WebRTC for

WebRTC is an API that can be used in Web apps such as video chat, audio chat or P2P file sharing. With WebRTC, you can add real-time communication capabilities to open standards-based applications. It supports sending video, voice, and generic data between peers, enabling developers to build powerful voice and video communication solutions. The technology is available on all modern browsers as well as native clients on all major platforms. The WebRTC project is open source and supported by Apple, Google, Microsoft and Mozilla.

If a request has a high failure rate at a particular time in a particular place, what are the reasons

This was an open-ended question from a second-round interview at one company. I came up with the following possible reasons:

1. The number of requests at that time is too large.

2. The network nodes in that area are unstable.

3. User behavior patterns, such as peak hours or the specific habits of a group of users.

If you are more familiar with networking, feel free to add to this list.