This is the last article summarizing the network part of the interview preparation process. It covers the following topics:

  • HTTP Protocol Overview
  • POST and GET requests
  • Cookies and Sessions
  • Encryption of data during transmission
  • Introduction of HTTPS

The HTTP protocol

In the OSI seven-layer model, HTTP sits at the top, in the application layer. When you open a web page in a browser, the browser uses HTTP directly. The client establishes a TCP connection to the server (port 80 by default) and then sends requests, receives replies, and exchanges data over that connection.

There are two common versions of HTTP, 1.0 and 1.1. The main difference is that HTTP 1.0 by default opens a new TCP connection for each request and reply, whereas HTTP 1.1 keeps the connection open by default so that multiple requests and replies can share a single TCP connection. This greatly reduces how often TCP connections are set up and torn down, which improves efficiency.
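Connection reuse can be observed with a small local experiment. The following is only a sketch using Python's standard library: the handler class, the function name, and the paths `/a` and `/b` are made up for illustration. It starts a throwaway HTTP 1.1 server on the loopback interface, sends two requests through one client connection, and records the client's local port each time. If the port is the same, both requests travelled over the same TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 lets the server keep the connection open between requests.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        # Content-Length tells the client where the reply ends without closing the socket.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_twice_on_one_connection():
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    local_ports = []
    for path in ("/a", "/b"):
        conn.request("GET", path)
        conn.getresponse().read()
        # The client's local port identifies the underlying TCP connection.
        local_ports.append(conn.sock.getsockname()[1])
    conn.close()
    server.shutdown()
    server.server_close()
    return local_ports


if __name__ == "__main__":
    first, second = fetch_twice_on_one_connection()
    print("same TCP connection:", first == second)
```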

HTML is usually used to describe the web pages loaded over HTTP, so HTML can also be understood as a data format for web pages. An HTML document is plain text that specifies the text, images, audio, video, and links in a web page, along with their colors, positions, and so on. Regardless of the underlying structure of the computer or the lower-level network protocols in use, a page described in HTML is presented in basically the same way. In this sense, HTML can be seen as sitting at the presentation layer of the OSI seven-layer model.

POST and GET requests

HTTP 1.1 defines eight types of requests (also known as methods), of which the most common are GET and POST.

GET requests are usually used to query and retrieve data, while POST requests are used to send data. In addition to the differences in usage, there are several other differences:

  1. GET requests can be cached and bookmarked, but POST requests generally cannot.
  2. GET requests remain in the browser's history; POST requests do not.
  3. The length of a GET request is limited (depending on the browser, to a few kilobytes) and its URL may contain only ASCII characters, whereas POST requests have no such limits.
  4. The parameters of a GET request are in the URL, so never use a GET request to transfer sensitive data. The data of a POST request is written in the HTTP request body, which is slightly more secure than putting it in the URL.

Note:

A POST request is only slightly more secure than a GET request in that its data is not in the URL; the data is still sent in plain text in the HTTP request body.
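To make the difference in where the parameters travel concrete, here is a minimal sketch using Python's standard `urllib`. Nothing is sent over the network; the URL `http://example.com/search` and the parameter names are placeholders chosen for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"user": "alice", "q": "http"}  # hypothetical form fields
encoded = urlencode(params)              # "user=alice&q=http"

# GET: the parameters become part of the URL itself.
get_request = Request("http://example.com/search?" + encoded)

# POST: the URL stays clean and the parameters go into the request body.
post_request = Request("http://example.com/search", data=encoded.encode("ascii"))

if __name__ == "__main__":
    print(get_request.get_method(), get_request.full_url, get_request.data)
    print(post_request.get_method(), post_request.full_url, post_request.data)
```

In both cases the parameters are plain text; only their position in the message differs.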

Cookies and Sessions

HTTP is a stateless protocol: every time a client requests a web page, the server treats it as a new session. But sometimes we need to keep some information, such as the user name and password entered at login, or information from the user's previous connection. This information is stored using cookies and sessions.

The fundamental difference between the two is that cookies are stored on the client, while sessions are stored on the server. From this, we can draw the following conclusions:

  1. Cookies are relatively insecure: anyone with access to the client can inspect the locally stored cookies and forge them (cookie spoofing).
  2. A session can be given a timeout, after which it becomes invalid. Without a timeout, sessions would occupy server memory for a long time.
  3. The size of a single cookie is limited (about 4 KB), and the number of cookies per site is generally limited as well (20 in older browsers; the exact number depends on the browser).
  4. The client sends its cookies to the server with every request, so the server knows the contents of the cookies, but the client does not know the contents of the session.

When the server receives the cookie, it looks up the client's session by the SessionID in the cookie. If no session is found, a new SessionID is generated and sent to the client.
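The lookup just described can be sketched in a few lines. This is an illustrative model rather than how any particular web framework does it: the in-memory `SESSIONS` dictionary, the `handle_request` function, and the `visits` counter are all assumptions made for the example. Only the cookie parsing uses a real standard-library class, `http.cookies.SimpleCookie`.

```python
import secrets
from http.cookies import SimpleCookie

# Server-side storage: SessionID -> session data (hypothetical in-memory store).
SESSIONS = {}


def handle_request(cookie_header):
    """Return (session_id, visit_count) for a request carrying this Cookie header."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get("SessionID")
    session_id = morsel.value if morsel else None

    if session_id not in SESSIONS:
        # No cookie, or an unknown/expired ID: create a new session.
        session_id = secrets.token_hex(16)
        SESSIONS[session_id] = {"visits": 0}

    SESSIONS[session_id]["visits"] += 1
    return session_id, SESSIONS[session_id]["visits"]


if __name__ == "__main__":
    sid, visits = handle_request(None)                   # first visit, no cookie yet
    print(visits)
    sid2, visits2 = handle_request("SessionID=" + sid)   # browser sends the cookie back
    print(sid2 == sid, visits2)
```

The client only ever holds the opaque ID; the visit counter lives entirely on the server.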

Encryption

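There are two types of encryption: symmetric and asymmetric. Before explaining what these terms mean, consider the basic encryption and decryption process: the sender uses an encryption key to turn plaintext into ciphertext, and the receiver uses a decryption key to turn the ciphertext back into plaintext.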

Symmetric means that the encryption key and the decryption key are the same; asymmetric means that the two are different.

Take an example of symmetric encryption. Suppose the encryption algorithm here is addition and the decryption algorithm is subtraction. If the plaintext data is 10 and the secret key is 1, then the encrypted data is 10 + 1 = 11. If the receiver does not know the secret key, it does not know what to subtract from ciphertext 11. Conversely, if the receiver knows that the secret key is 1, the plaintext data can be calculated with 11-1 = 10.
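The addition example translates directly into code. The function names below are made up for illustration, and the scheme is of course a teaching toy, not a real cipher.

```python
def encrypt(plaintext: int, key: int) -> int:
    # "Encryption algorithm": addition.
    return plaintext + key


def decrypt(ciphertext: int, key: int) -> int:
    # "Decryption algorithm": subtraction with the same key.
    return ciphertext - key


if __name__ == "__main__":
    ciphertext = encrypt(10, 1)
    print(ciphertext)              # 11
    print(decrypt(ciphertext, 1))  # 10
```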

A common asymmetric encryption algorithm is RSA, which relies on the idea that “it is easy to compute the product of two large prime numbers, but difficult to decompose the product back into the two primes”. How it works in detail is beyond the scope of this article, but interested readers can check the reference articles at the end.

In asymmetric encryption, data encrypted with the public key can be decrypted only with the private key, and data encrypted with the private key can be decrypted only with the public key.
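The two-directional property can be tried out with textbook RSA on very small numbers. This is a sketch under loud assumptions: the primes 61 and 53 and the exponent 17 are toy values picked for readability, there is no padding, and keys this small offer no security at all. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
P, Q = 61, 53              # two small primes (toy values)
N = P * Q                  # modulus, part of both keys
PHI = (P - 1) * (Q - 1)
E = 17                     # public exponent, coprime with PHI
D = pow(E, -1, PHI)        # private exponent: modular inverse of E (Python 3.8+)


def apply_key(value: int, exponent: int) -> int:
    """Textbook RSA: the same modular exponentiation both encrypts and decrypts."""
    return pow(value, exponent, N)


if __name__ == "__main__":
    message = 65
    # Encrypt with the public key, decrypt with the private key.
    print(apply_key(apply_key(message, E), D) == message)
    # Encrypt with the private key, decrypt with the public key.
    print(apply_key(apply_key(message, D), E) == message)
```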

Symmetric encryption has the advantage of speed, but both sides need the same secret key, and getting that key to the other side safely is a problem that has to be solved. Therefore, actual network transmission usually combines symmetric and asymmetric encryption. The symmetric key is exchanged under the protection of asymmetric encryption (for example, the client generates it and encrypts it with the server's public key, so only the server can decrypt it). The two parties then communicate using this symmetric key.
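A rough sketch of this combination follows. Everything in it is a simplification chosen for illustration: the RSA key pair uses toy numbers, the "symmetric cipher" is a plain XOR with a pad derived from the session key, and the function names are invented. Real protocols use vetted algorithms and much larger keys.

```python
import hashlib
import secrets

# Toy RSA key pair for the server (N = 61 * 53); far too small for real use.
N, E, D = 3233, 17, 2753


def xor_cipher(data: bytes, session_key: int) -> bytes:
    """Toy symmetric cipher: XOR with a pad derived from the key.
    Applying it twice with the same key restores the original data."""
    pad = hashlib.sha256(str(session_key).encode()).digest()
    return bytes(b ^ pad[i % len(pad)] for i, b in enumerate(data))


def hybrid_round_trip(message: bytes) -> bytes:
    # Client: pick a random session key and wrap it with the server's public key.
    session_key = secrets.randbelow(N - 2) + 2
    wrapped_key = pow(session_key, E, N)

    # Server: unwrap the session key with its private key.
    unwrapped_key = pow(wrapped_key, D, N)

    # Both sides now share the key and switch to the fast symmetric cipher.
    ciphertext = xor_cipher(message, session_key)   # client encrypts
    return xor_cipher(ciphertext, unwrapped_key)    # server decrypts


if __name__ == "__main__":
    print(hybrid_round_trip(b"hello over a shared key"))
```

Only the short session key goes through the slow asymmetric step; the bulk data uses the symmetric cipher.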

HTTPS

We know that HTTP transmits data directly over TCP, in plaintext and without encryption. This creates three risks:

  1. Eavesdropping risk: Third-party nodes can learn the contents of the communication.
  2. Tampering risk: Third-party nodes can modify the contents of the communication.
  3. Impersonation risk: Third-party nodes can impersonate others to take part in the communication.

For example, when you open a web page in an app on your phone, you sometimes see ads pop up at the bottom of the page. This usually means that your HTTP content has been intercepted and tampered with.

The HTTPS protocol is designed to address these three risks, so it can:

  1. Encrypt all information so that third parties cannot read it.
  2. Add an integrity check to the information so that tampering by a third party can be detected.
  3. Use identity certificates to prevent third parties from impersonating a participant in the communication.

Structurally, HTTPS simply adds a TLS/SSL layer between HTTP and TCP, confirming the adage that “all computer problems can be solved by adding an intermediate layer”.

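When HTTPS is used, the server sends the client its certificate, which contains the server's public key. A simplified view of transmission based on asymmetric encryption is as follows:

  1. The client encrypts the information with the public key and sends the ciphertext to the server.
  2. The server decrypts it with its own private key, and sends the returned data back to the client encrypted with the private key.
  3. The client decrypts the reply using the public key.

Note that anyone who holds the public key can decrypt data encrypted with the private key, so step 2 proves that the reply came from the server rather than keeping it secret. In practice, asymmetric encryption is used only to agree on a symmetric key, and the actual data is encrypted with that key.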

The certificate here is a tool for the server to prove its identity and is issued to the applicant by an authoritative certificate authority (CA). If the certificate is forged, or if it is self-signed, the client (for example, the browser) will refuse to trust the certificate and display a warning.
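Client libraries perform the same certificate check as browsers. As a small illustration, the sketch below inspects the default client settings in Python's standard `ssl` module; the function name is made up for the example. With these defaults, a connection to a server with a forged or self-signed certificate fails verification instead of proceeding silently.

```python
import ssl


def default_client_checks():
    """Report whether the default client context verifies the certificate chain
    and checks that the certificate matches the host name."""
    context = ssl.create_default_context()
    verifies_certificate = context.verify_mode == ssl.CERT_REQUIRED
    return verifies_certificate, context.check_hostname


if __name__ == "__main__":
    print(default_client_checks())
```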

To summarize how the HTTPS protocol avoids the three risks mentioned above:

  1. Asymmetric encryption is first used to protect the transmission of a symmetric key, which is then used to encrypt the data symmetrically, so a third party cannot read the content of the communication.
  2. The sender attaches a hash of the data to the data, and the receiver recomputes the hash after decryption and compares the two. If they differ, the data has been modified. Because the transmitted data is encrypted, a third party cannot adjust the hash to match.
  3. Certificates issued by an authority, together with the certificate verification mechanism, prevent a third party from impersonating a participant in the communication.
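The hash comparison in point 2 can be sketched as follows. This shows only the idea of attaching and rechecking a digest, with SHA-256 and the function names chosen as assumptions for the example. The encryption step is left out, and real TLS uses keyed integrity checks rather than a bare hash.

```python
import hashlib
import hmac

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def attach_hash(data: bytes) -> bytes:
    """Sender side: append the hash of the data to the data."""
    return data + hashlib.sha256(data).digest()


def verify_hash(message: bytes) -> bool:
    """Receiver side: recompute the hash and compare it with the attached one."""
    data, received = message[:-DIGEST_SIZE], message[-DIGEST_SIZE:]
    expected = hashlib.sha256(data).digest()
    return hmac.compare_digest(expected, received)


if __name__ == "__main__":
    message = attach_hash(b"transfer 10")
    print(verify_hash(message))                     # intact message
    tampered = b"transfer 99" + message[-DIGEST_SIZE:]
    print(verify_hash(tampered))                    # modified data, stale hash
```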

Reference articles

  1. HTTPS popular science literacy post
  2. Overview of the SSL/TLS protocol operation mechanism
  3. RSA encryption
  4. HTTP method: Compare GET with POST