Preface

I recently decided to reorganize my knowledge and fill in the gaps. After reading the book “Illustrated HTTP”, I found it easier to follow than the usual dry networking material, which makes it a good fit for front-end developers. This article records what I learned from it.

TCP/IP protocol basics

1. Concept of the TCP/IP protocol family

Computers on a network can only communicate if they follow agreed rules, and these rules are called protocols. There are many of them. The collection of protocols associated with the Internet is called the TCP/IP protocol family (from Illustrated HTTP).

2. TCP/IP layering

Why the network is designed in layers: when something needs to change, only the affected layer has to be modified rather than the whole stack. Once the interfaces between layers are fixed, each layer does its own job and its internals can change without affecting the others.

Below is a comparison between the OSI seven-layer model and the TCP/IP model (from: https://juejin.cn/post/6844903510509633550, thanks to the author)

The TCP/IP model is divided into four layers, from top to bottom: application layer, transport layer, network layer, and link layer.

  • Application layer: determines the communication activities involved in providing application services to users. Common application-layer protocols include FTP, DNS, and HTTP
  • Transport layer: transmits data between two networked computers, using TCP or UDP
  • Network layer: handles packets moving across the network; its main job is to choose a transmission route between the two communicating devices
  • Link layer: handles the hardware side of the network connection, including device drivers, network interface cards, and physical media such as optical fiber

3. TCP/IP communication flow

The steps below walk through one complete exchange, from the sender’s application layer to the receiver.

  • The sender starts at the application layer. We usually address a server by domain name rather than by IP address, so the application-layer DNS service comes into play first: it looks up the IP address of the peer from its domain name.

  • At the transport layer, TCP splits the data into segments so that it is easier to transmit, and after sending them it confirms that they were delivered to the other party. To establish the connection reliably, TCP uses the three-way handshake, which is explained in detail later
  • The segments are ready, but which route should they take across the network? This is the job of the Internet Protocol (IP), the most prominent protocol in the network layer, which is why this layer is also called the IP layer. IP looks up its own routing table using the IP address of the peer.

On Windows, run route print on the command line to view the routing table

If the routing table has no specific route to the destination network, the device uses the default route (for example, 192.168.95.254 in the figure above) and obtains the MAC address of that default gateway through the ARP protocol. A MAC address is the physical identifier of a network adapter. The packet is then sent to the gateway’s MAC address, and each hop along the way repeats the process, looking up the next hop and resolving its MAC address, until the packet reaches the communication target

  • Once the physical address is known, the packet is handed to the link layer and the request is ready to be sent
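The routing table lookup described above can be sketched in a few lines. This is only an illustration, not how an operating system implements it: the table entries are hypothetical (only the default gateway 192.168.95.254 comes from the text), and the rule shown is the usual longest-prefix match, where the default route 0.0.0.0/0 is chosen only when nothing more specific matches.

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop).
# Only the default gateway address is taken from the article.
ROUTES = [
    (ipaddress.ip_network("192.168.95.0/24"), "on-link"),
    (ipaddress.ip_network("10.0.0.0/8"), "192.168.95.1"),
    (ipaddress.ip_network("0.0.0.0/0"), "192.168.95.254"),  # default route
]


def next_hop(destination):
    """Return the next hop for a destination using longest-prefix match."""
    address = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if address in net]
    # The most specific network (longest prefix) wins. The default route
    # has prefix length 0, so it is picked only as a last resort.
    best_net, best_hop = max(matches, key=lambda item: item[0].prefixlen)
    return best_hop


print(next_hop("192.168.95.20"))  # same subnet, delivered directly
print(next_hop("8.8.8.8"))        # no specific route, falls back to the default
```

Once the next hop is known, ARP would be used to find its MAC address; that step is outside this sketch.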

The receiving side does the reverse of the sending side, processing the data from the link layer up to the application layer. The following figure illustrates the complete flow (from Illustrated HTTP)
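A rough way to picture this flow is that each layer on the sending side wraps the data in its own header, and each layer on the receiving side removes one header in the opposite order. The sketch below uses made-up text markers such as [TCP] in place of real binary headers, so it only illustrates the ordering.

```python
# Order in which the sender adds headers on the way down the stack.
# The bracketed markers are placeholders, not real header formats.
LAYERS = ["TCP", "IP", "Ethernet"]


def send(http_message):
    """Wrap an application-layer message as it goes down the stack."""
    frame = http_message
    for layer in LAYERS:
        frame = f"[{layer}]{frame}"
    return frame


def receive(frame):
    """Strip the headers in reverse order on the way up the stack."""
    for layer in reversed(LAYERS):
        prefix = f"[{layer}]"
        if not frame.startswith(prefix):
            raise ValueError(f"expected a {layer} header")
        frame = frame[len(prefix):]
    return frame


wire = send("GET / HTTP/1.1")
print(wire)           # [Ethernet][IP][TCP]GET / HTTP/1.1
print(receive(wire))  # GET / HTTP/1.1
```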

4. Three-way handshake

TCP is a reliable transport protocol, so it uses a three-way handshake to make sure that both ends are ready before any data is transmitted.

Specific process:

  • The client sends a SYN packet (seq = A) to the server. The client enters the SYN_SENT state and waits for confirmation from the server.
  • After receiving the SYN packet, the server responds with a SYN+ACK packet with ack = A + 1 and seq = B, and enters the SYN_RECV state. The server keeps connection requests in this state in a queue called the half-connection (SYN) queue.
  • After receiving the SYN+ACK packet from the server, the client sends an ACK packet with ack = B + 1 and seq = A + 1, and enters the ESTABLISHED state. When the server receives this ACK it also enters the ESTABLISHED state, and the three-way handshake is complete.
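The sequence and acknowledgement numbers in the three steps above can be traced with a small simulation. It does not open any real connection; the dictionaries are a hypothetical stand-in for TCP segments, and A and B are whatever initial sequence numbers the two sides pick.

```python
def three_way_handshake(a, b):
    """Return the three segments exchanged for initial sequence numbers a and b."""
    # 1. Client -> server: SYN carrying the client's initial sequence number.
    syn = {"from": "client", "flags": "SYN", "seq": a, "ack": None}
    # 2. Server -> client: SYN+ACK. The server acknowledges a + 1 because
    #    the SYN itself consumes one sequence number.
    syn_ack = {"from": "server", "flags": "SYN+ACK", "seq": b, "ack": syn["seq"] + 1}
    # 3. Client -> server: ACK for the server's SYN, acknowledging b + 1.
    ack = {"from": "client", "flags": "ACK", "seq": a + 1, "ack": syn_ack["seq"] + 1}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(a=100, b=300):
    print(segment)
```

With A = 100 and B = 300 the second segment acknowledges 101, and the third carries seq = 101 and ack = 301.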

The whole process can be colloquially understood as:

  • The client asks the server: Are you there?
  • Server: YES
  • Client: I’m going to start sending messages

Why a three-way handshake is needed:

To prevent a stale connection request that arrives late at the server from causing an error. The main purpose of the three-way handshake is to confirm that the connection works in both directions (full duplex) and that both sides agree on their initial sequence numbers, which the retransmission mechanism relies on.

When a client sends a connection request, it does not wait forever for a reply: if a timeout occurs, it sends the request again and the earlier request becomes invalid. Suppose an earlier request was delayed in the network and reaches the server only after the client has given up on it. The server still answers it. With only a two-way handshake, the server would consider the connection established as soon as it replied and would keep waiting for data from the client. The client, however, has already abandoned that request and ignores the server’s reply, so the server’s resources would be wasted. With the third handshake, the server commits to the connection only after the client confirms it.

5. Four-way wave

When closing a TCP connection, the client and server exchange four packets to confirm the disconnection

Disconnection process:

  • First wave: The client sends a FIN packet with sequence number seq = A to close data transfer from the client to the server, and enters the FIN_WAIT_1 state.
  • Second wave: After receiving the FIN, the server sends an ACK to the client with ack = A + 1 (like a SYN, a FIN occupies one sequence number). The server enters the CLOSE_WAIT state.
  • Third wave: Once it has finished sending its own data, the server sends a FIN with sequence number seq = B and ack = A + 1 to close data transfer from the server to the client. The server enters the LAST_ACK state.
  • Fourth wave: After receiving the FIN, the client sends an ACK to the server with ack = B + 1 and enters the TIME_WAIT state. The server enters the CLOSED state when it receives this ACK. If the client receives nothing more from the server during the 2MSL wait, it concludes that the server has closed normally, and the client also closes the connection.
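The state changes on each side during the four waves can be written as two small transition tables. This is a simplified sketch: the event names are hypothetical labels, only the normal close initiated by the client is modelled, and the FIN_WAIT_2 state (where the client sits between the second and third wave) is standard TCP but not mentioned in the list above.

```python
# (current state, event) -> next state, for the side that closes first.
CLIENT = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",          # first wave
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",           # second wave
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",  # third and fourth wave
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
}

# The same exchange seen from the side that closes second.
SERVER = {
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",  # first and second wave
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",               # third wave
    ("LAST_ACK", "recv ACK"): "CLOSED",                   # fourth wave
}


def run(table, events, state="ESTABLISHED"):
    """Apply events in order and return every state that was visited."""
    history = [state]
    for event in events:
        state = table[(state, event)]  # KeyError means an unexpected event
        history.append(state)
    return history


print(run(CLIENT, ["send FIN", "recv ACK", "recv FIN, send ACK", "2MSL timeout"]))
print(run(SERVER, ["recv FIN, send ACK", "send FIN", "recv ACK"]))
```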

In plain terms:

  • Client: I’m going to close the connection
  • Server: Got it. I still have work to finish. Hold on
  • Server: Ok, I can close it on my side
  • Client: Ok, confirm closing.

Why four waves are needed

When the client asks to close the connection, the server may still have data to send. To make sure that data is transmitted completely, the server sends its ACK and its FIN separately: it acknowledges the client’s FIN immediately, and sends its own FIN only after it has finished sending its remaining data.

Why the client needs the 2MSL wait

  • To make sure that the ACK sent by the client in the fourth wave reaches the server. If the ACK is lost, the server retransmits its FIN, and the client, still in TIME_WAIT, can send the ACK again
  • To make sure that packets still in the network expire before the connection is reused. Each TCP implementation must choose a maximum segment lifetime (MSL), the longest time a segment can exist in the network before it is discarded. Waiting 2MSL is enough for all packets of the old connection to expire.

Understanding the HTTP protocol

  • HTTP is an application-layer protocol used for communication between clients and servers
  • In any HTTP exchange, one end of the connection is the client and the other end is the server.

The client sends a request and the server returns a response; this is how the two communicate.

1. The Web server

HTTP communication requires a server, so here is a brief introduction to Web servers.

(1) With virtual hosts, one Web server can host websites for multiple domain names

  • Virtual hosting lets a single server host websites under several host names or domain names. DNS maps each of those domain names to the same server IP address, so the IP address alone does not tell the server which site is wanted. For this reason, the Host field in the request header must specify the host name or domain name being requested
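As a rough illustration of why the Host field matters, the sketch below picks a site by reading that field from a raw request. The domain names and directory paths are hypothetical, and real servers such as nginx do this through configuration rather than code like this. IPv6 literals in the Host field are not handled.

```python
# Hypothetical sites that share one server and one IP address.
SITES = {
    "www.example.com": "/var/www/example",
    "blog.example.com": "/var/www/blog",
}


def document_root(raw_request):
    """Choose a site directory based on the Host header of a raw HTTP request."""
    head = raw_request.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().split(":")[0].lower()  # drop an optional port
            return SITES.get(host)  # None when the host is not configured
    raise ValueError("the request carries no Host header")


request = "GET /index.html HTTP/1.1\r\nHost: blog.example.com:8080\r\nAccept: */*\r\n\r\n"
print(document_root(request))
```

Both domain names resolve to the same server, so without the Host field the two requests would be indistinguishable.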

(2) Data-forwarding programs that cooperate with the server

  • Proxy: a forwarding application that acts as a middleman between a client and a server. A forward proxy acts on behalf of clients, and a reverse proxy acts on behalf of servers; an nginx reverse proxy is a common example
  • Gateway: receives HTTP requests from the client and lets it reach non-HTTP services behind the gateway
  • Tunnel: relays traffic between the client and the server without interpreting the HTTP requests; it is typically used with encryption such as SSL to provide a secure communication line

2. HTTP is a stateless protocol, and cookies are used to manage state

With the HTTP protocol, every new request produces a new response, and the protocol itself keeps no information about previous requests or responses. In practice, however, state often needs to be preserved, for example to know whether a user is logged in and to identify who is sending a request. Cookies were introduced for this purpose. They work as follows:

  • The client sends a request containing login information, such as the user name and password, to the server
  • After receiving the login information, the server authenticates the user and records the authentication state on the back end, bound to a session ID. In its response, the Set-Cookie header field carries that session ID
  • After receiving the response, the client saves or updates its local cookie. On the next request to the server, the client automatically sends the cookie along. The server can then identify the user and the user’s authentication state from the Cookie field
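The three steps above can be mimicked without any network by passing header dictionaries between a pretend server and a pretend client. Everything here is a simplified assumption: the user store, the cookie name sessionID, and the function names are made up for illustration, and a real application would also hash passwords and expire sessions.

```python
import secrets

USERS = {"alice": "s3cret"}  # hypothetical credential store
SESSIONS = {}                # session ID -> user name, kept on the server


def login(username, password):
    """Server side: check credentials and answer with a Set-Cookie header."""
    if USERS.get(username) != password:
        return {"Status": "401 Unauthorized"}
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = username
    return {"Status": "200 OK", "Set-Cookie": f"sessionID={session_id}; HttpOnly"}


def whoami(request_headers):
    """Server side: identify the user from the Cookie header, if any."""
    for pair in request_headers.get("Cookie", "").split(";"):
        name, _, value = pair.strip().partition("=")
        if name == "sessionID":
            return SESSIONS.get(value)
    return None


def store_cookie(cookie_jar, response_headers):
    """Client side: save or update the cookie carried by Set-Cookie."""
    set_cookie = response_headers.get("Set-Cookie")
    if set_cookie:
        name, _, value = set_cookie.split(";")[0].partition("=")
        cookie_jar[name] = value


def cookie_header(cookie_jar):
    """Client side: build the Cookie header sent with the next request."""
    return {"Cookie": "; ".join(f"{k}={v}" for k, v in cookie_jar.items())}


jar = {}
store_cookie(jar, login("alice", "s3cret"))
print(whoami(cookie_header(jar)))  # alice
print(whoami({}))                  # None, no cookie was sent
```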

3. HTTP supports persistent connections

In the early days, every HTTP exchange opened a new TCP connection and closed it afterwards, so the next request had to open another one. As the Web developed and pages came to contain large amounts of text, images, and other resources, opening a TCP connection for every request no longer scaled: it increased the load on the server and slowed down HTTP requests

Therefore, persistent connections (HTTP keep-alive) became the default in HTTP/1.1 (RFC 2616, 1999). Once a connection is established, multiple HTTP requests can be sent over it until either side closes the connection
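A persistent connection can be observed with Python’s standard library alone. The sketch below starts a throwaway HTTP/1.1 server on the loopback interface and sends two requests through one http.client connection; the handler, paths, and function name are made up for the demo. It checks that the client keeps using the same socket object, which is a reasonable sign that no second TCP connection was opened.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets http.server keep connections open

    def do_GET(self):
        body = f"you asked for {self.path}".encode()
        self.send_response(200)
        # Content-Length tells the client where the body ends, so the
        # connection can stay open for the next request.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass


def fetch_twice_on_one_connection():
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    results, sockets = [], []
    try:
        for path in ("/a", "/b"):
            conn.request("GET", path)
            response = conn.getresponse()
            results.append((response.status, response.read().decode()))
            sockets.append(conn.sock)
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return results, sockets[0] is sockets[1]


results, same_socket = fetch_twice_on_one_connection()
print(results, same_socket)
```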

HTTP messages and status codes

1. HTTP request messages and response messages

Many projects now separate the front end from the back end, so front-end developers should be familiar with requests and responses.

The request message

A request message consists of a message header and a message body. The header contains the request line (request method, URL, HTTP version) followed by the request header fields

(From Illustrated HTTP)

The response message

A response message also consists of a message header and a message body. The header contains the status line (HTTP version, status code, and reason phrase) followed by the response header fields

(From Illustrated HTTP)
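The layout of both messages can be made concrete by splitting a raw request into its parts and assembling a raw response by hand. This is a minimal sketch with hypothetical function names; it ignores details such as repeated header fields and chunked bodies that a real HTTP library has to handle.

```python
def parse_request(raw):
    """Split a raw HTTP request into request line, header fields, and body."""
    head, _, body = raw.partition("\r\n\r\n")  # a blank line ends the header
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {"method": method, "target": target, "version": version,
            "headers": headers, "body": body}


def build_response(status, reason, body):
    """Assemble a raw HTTP response: status line, header fields, blank line, body."""
    length = len(body.encode("utf-8"))
    return (f"HTTP/1.1 {status} {reason}\r\n"
            "Content-Type: text/plain; charset=utf-8\r\n"
            f"Content-Length: {length}\r\n"
            "\r\n"
            f"{body}")


raw = ("POST /login HTTP/1.1\r\n"
       "Host: www.example.com\r\n"
       "Content-Length: 9\r\n"
       "\r\n"
       "name=demo")
print(parse_request(raw))
print(build_response(200, "OK", "hello"))
```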

Header fields

There are many header fields, each with its own purpose, and they include both request header fields and response header fields. It is worth knowing the meaning of the common ones; the less common ones can be looked up in the relevant documentation, so they are not covered in detail here. The following figure shows the request header fields of a request to a Web site

Cookie-related header fields

Cookies are not part of the HTTP/1.1 specification (RFC 2616) itself; they are specified separately (currently in RFC 6265) and are widely used. Only two header fields are involved

  • Set-Cookie: a response header field. The server puts the relevant authentication information in Set-Cookie and sends it to the client, and the client updates its locally saved cookies
  • Cookie: a request header field. When the client sends a request, it automatically includes the locally saved cookie values

A cookie string is typically written as name=value pairs separated by semicolons. For example, in Set-Cookie: status=enable; expires=Tue, 05 Jul 2011 07:26:31, the attributes of the Set-Cookie field have the following meanings:
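A Set-Cookie value can be taken apart by splitting on semicolons: the first pair is the cookie itself, and the rest are attributes. The sketch below does this by hand with a hypothetical function name. The sample string extends the example above with GMT, path, and HttpOnly purely for illustration; those parts are assumptions, not taken from the original example.

```python
def parse_set_cookie(header_value):
    """Split a Set-Cookie value into (name, value, attributes)."""
    parts = [part.strip() for part in header_value.split(";") if part.strip()]
    first, *rest = parts
    name, _, value = first.partition("=")
    attributes = {}
    for part in rest:
        key, _, attr_value = part.partition("=")
        # Attribute names are case-insensitive; flags such as HttpOnly
        # have no value and are stored as an empty string.
        attributes[key.strip().lower()] = attr_value.strip()
    return name.strip(), value.strip(), attributes


sample = "status=enable; expires=Tue, 05 Jul 2011 07:26:31 GMT; path=/; HttpOnly"
print(parse_set_cookie(sample))
```

Here status is the cookie name, enable is its value, expires says when the browser should discard the cookie, path limits which URLs the cookie is sent to, and HttpOnly keeps it out of reach of page scripts.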

2. Status codes of the HTTP response

A response code is also called a status code. When a server responds to a client’s request, the status code describes the result of processing that request. Status codes fall into the following categories

(From Illustrated HTTP.) Some of the most commonly used status codes are:

  • 200: The request succeeded
  • 204: The request succeeded, but the response contains only headers and no body
  • 206: The client made a range request and the server successfully returned the requested range
  • 301: Permanent redirect. The requested resource has been assigned a new URL permanently
  • 302: Temporary redirect. The requested resource is temporarily available at a different URL
  • 303: Similar to 302, but explicitly tells the client to use the GET method to fetch the other URI
  • 304: Tells the client that the requested content has not changed since it was last fetched, so the client can use the copy in the browser cache
  • 400: The request contains a syntax error and the server cannot parse it
  • 401: The request requires authentication, or the authentication failed or has expired
  • 403: The server understood the request but refuses access
  • 404: The requested resource was not found on the server
  • 500: An error occurred inside the server while handling the request
  • 503: The server is temporarily unable to process the request, typically because of overload or maintenance, and is expected to recover after some time. If the delay is known, the response can include a Retry-After header to indicate it. If no Retry-After is given, the client should handle the response as it would a 500 response
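The category of a status code is determined by its first digit, which makes it easy to classify codes in a program. The sketch below uses the standard five class names and looks up the reason phrase through Python’s http.HTTPStatus; the describe function is just a hypothetical helper.

```python
from http import HTTPStatus

# The first digit of the status code selects the class.
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code):
    """Return the code, its standard reason phrase, and its class."""
    category = CLASSES[code // 100]
    phrase = HTTPStatus(code).phrase  # raises ValueError for unknown codes
    return f"{code} {phrase} ({category})"


for code in (200, 304, 404, 503):
    print(describe(code))
```

For front-end code this grouping is often enough for a first decision: 2xx means the response can be used, 3xx is usually handled by the browser, and 4xx or 5xx needs error handling.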

HTTPS = HTTP + Encryption + Authentication + Integrity protection

We have looked briefly at HTTP, but it has some shortcomings:

  • Communication is in plaintext (not encrypted), so the content can easily be eavesdropped
  • The identity of the other party is not verified, so it may be an impostor
  • The integrity of a message cannot be proved, so it may have been tampered with

HTTPS is not a new application-layer protocol. It is HTTP with SSL/TLS inserted between HTTP and TCP: HTTP talks to SSL, and SSL talks to TCP. With SSL in place, HTTP gains encryption, certificate-based authentication, and integrity protection.

(From Illustrated HTTP)

It is worth noting that SSL is independent of HTTP and can also be used with other application-layer protocols. SSL (today in the form of its successor, TLS) is the most widely used network security technology at present

(From Illustrated HTTP)

1. Encryption and key exchange

Encryption can be applied in two ways

  • (1) Encrypting the communication line: HTTP communication takes place over a secure line that is established first
  • (2) Encrypting the communication content itself

At present, Web services mainly encrypt the content. When the content is encrypted, both communicating parties must have the encryption and decryption mechanisms.

Shared key encryption:

Encryption that uses the same key for encrypting and decrypting is called shared key encryption, or symmetric key encryption. In this mode the key must be sent to the peer before the peer can decrypt the content. The question is how to deliver the key safely without it being intercepted, and this is the problem that public key encryption solves

Public key encryption

Public key encryption uses a pair of asymmetric keys: a private key and a public key. As the names imply, the private key is kept secret by its owner and must not be known to anyone else, while the public key can be published freely.

To communicate, the sender encrypts the content with the receiver’s public key and sends it. The receiver decrypts it with its own private key. The private key never has to be transmitted, which solves the key delivery problem, and recovering the original message from only the ciphertext and the public key is extremely difficult.

The remaining problem is how to be sure that a public key is genuine. This is solved with public key certificates issued by a digital certificate authority (CA) and its related organizations. The server operator submits its public key to the certificate authority. After confirming the applicant’s identity, the authority digitally signs the submitted public key and binds the signed public key into a public key certificate. The server sends this certificate to the client, and the client verifies the digital signature on it; if verification succeeds, the public key can be trusted. Verifying the signature requires the authority’s own public key, and delivering that key to clients safely is also difficult, so most client software (such as browsers) ships with the public keys of commonly used certificate authorities built in

HTTPS uses a mixture of shared key encryption and public key encryption

Public key encryption is slower than shared key encryption, so HTTPS combines the strengths of both: public key encryption is used during the key exchange stage to deliver the shared key securely to the other party, and shared key (symmetric) encryption is used for the communication that follows.
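The hybrid idea can be shown with a deliberately tiny toy. The sketch below wraps a random shared key with textbook RSA built from the primes 61 and 53, then encrypts the message with an XOR keystream derived from that key. None of this is secure, and it is not how TLS actually negotiates keys; the numbers, function names, and keystream construction are all assumptions chosen so the example fits in a few lines.

```python
import hashlib
import secrets

# Toy RSA key pair from the primes 61 and 53. Insecure by design.
N, E, D = 3233, 17, 2753  # public key: (N, E), private key: D


def keystream(key, length):
    """Derive `length` bytes from the shared key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(f"{key}:{counter}".encode()).digest()
        counter += 1
    return out[:length]


def symmetric(key, data):
    """Shared key step: XOR with the keystream. The same call decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def client_send(message):
    """Pick a shared key, wrap it with the server's public key, encrypt the message."""
    session_key = secrets.randbelow(N - 2) + 2  # value in [2, N - 1]
    wrapped_key = pow(session_key, E, N)        # slow public key step, done once
    return wrapped_key, symmetric(session_key, message)


def server_receive(wrapped_key, ciphertext):
    """Unwrap the shared key with the private key, then decrypt the message."""
    session_key = pow(wrapped_key, D, N)
    return symmetric(session_key, ciphertext)


wrapped, ciphertext = client_send(b"GET /account HTTP/1.1")
print(server_receive(wrapped, ciphertext))
```

Only the short shared key goes through the slow public key operation; the message itself, however long, uses the fast symmetric step.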

(From Illustrated HTTP)

2. Authenticating the communicating party

SSL uses certificates, issued by a digital certificate authority, to establish the identity of the communicating party. These are usually called CA certificates. As mentioned above, most client software (for example browsers) ships with the public keys of commonly used certificate authorities. The client uses the CA’s public key to verify the signature on the certificate sent by the server. If verification succeeds, the certificate is genuine and the server is who it claims to be; otherwise the certificate is invalid.
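The sign-then-verify step can be sketched with textbook RSA signatures. This is a toy under loud assumptions: the CA key is built from two small Mersenne primes, the certificate is a plain dictionary instead of an X.509 structure, and there is no padding, expiry, or chain of trust. It needs Python 3.8 or later for the modular inverse via pow.

```python
import hashlib

# Toy CA key pair from two Mersenne primes; far too small for real use.
P, Q = 2**61 - 1, 2**89 - 1
CA_N = P * Q
CA_E = 65537                             # CA public exponent, shipped with the client
CA_D = pow(CA_E, -1, (P - 1) * (Q - 1))  # CA private exponent, known only to the CA


def digest(data):
    """Hash the certificate body and reduce it into the RSA modulus range."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % CA_N


def ca_sign(server_name, server_public_key):
    """CA side: bind a server name to its public key with a signature."""
    body = f"{server_name}|{server_public_key}".encode()
    return {"subject": server_name,
            "public_key": server_public_key,
            "signature": pow(digest(body), CA_D, CA_N)}


def client_verify(cert):
    """Client side: check the signature using only the CA public key."""
    body = f"{cert['subject']}|{cert['public_key']}".encode()
    return pow(cert["signature"], CA_E, CA_N) == digest(body)


cert = ca_sign("www.example.com", "server-public-key-123")
print(client_verify(cert))                                   # genuine certificate
print(client_verify(dict(cert, public_key="attacker-key")))  # tampered certificate
```

The client never needs the CA private key: it only needs the CA public key that came with the browser, which is why those keys are embedded in advance.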

3. Ensuring message integrity

HTTPS sends application-layer data together with a message digest called a MAC (Message Authentication Code, not to be confused with the MAC address of a network adapter). The digest is produced by a one-way hash function that condenses the plaintext into a fixed-length string (for example, 128 bits for MD5). The same plaintext always produces the same digest, and in practice different plaintexts produce different digests.

After receiving the data, the receiver runs the same hash function over the received text and compares the result with the digest sent by the sender. If the two are the same, the received text is complete and was not modified in transit, which verifies the integrity of the data.
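The compare-the-digest check can be tried with the hmac module from Python’s standard library. This sketch assumes both sides already share a secret key (the key and messages below are made up) and uses HMAC with SHA-256 rather than the 128-bit digest mentioned above; it shows the general idea, not the exact record format that SSL/TLS uses.

```python
import hashlib
import hmac


def protect(key, message):
    """Sender: return the message together with its authentication code."""
    tag = hmac.new(key, message, hashlib.sha256).digest()
    return message, tag


def verify(key, message, tag):
    """Receiver: recompute the code and compare it with the one received."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, tag)


key = b"shared-secret-from-the-handshake"  # hypothetical shared key
message, tag = protect(key, b"transfer 100 to bob")
print(verify(key, message, tag))                   # unmodified message
print(verify(key, b"transfer 900 to eve", tag))    # tampered message
```

Because the key is mixed into the digest, an attacker who changes the message cannot simply recompute a matching code.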

For highly confidential data, the server also needs to authenticate the client, to ensure that the data is sent only to a legitimate client. This is likewise done by asking the client to present a certificate. For example, the USB keys (U-shields) used for online banking contain digitally signed certificates that verify the customer’s identity

Summary

This article summarizes what I took from Illustrated HTTP, combined with my own thinking and other articles. My main goal was to understand, from a front-end perspective, how the client and server communicate. Compared with what real networking experts write, it does not go very deep. Please go easy on me, and if anything is wrong, corrections are welcome

References

  • Illustrated HTTP
  • Learn the IP routing process step by step
  • This article will familiarize you with TCP/IP
  • Three handshakes and four waves
  • Three handshakes, four waves, why the third handshake, why the four waves
  • How does HTTPS ensure security?
  • Digital signature and HTTPS