Preface

For any ambitious programmer, it is important to follow industry trends and to keep broadening your knowledge of computer fundamentals, such as the networking topics discussed in this article. This article gives a detailed overview of how HTTPS works.

In recent years, HTTPS has become increasingly popular as users and Internet companies have grown more security-conscious and the cost of deploying HTTPS has fallen. Many Internet giants are also pushing HTTPS. For example, Google’s Chrome browser shows a “Not secure” warning in the address bar when you visit an HTTP site, WeChat requires all mini programs to use HTTPS, and Apple requires all apps on the App Store to use HTTPS. Most major websites in China and abroad have already migrated to HTTPS, so it is only a matter of time before HTTPS replaces HTTP.

So what exactly is HTTPS, how does it compare to HTTP, and how does it work under the hood? Let’s start with the drawbacks of HTTP.

1. The biggest drawback of HTTP is insecurity

The main reason HTTP is being replaced by HTTPS is that it is insecure. The following figure shows why.

Figure 1. HTTP data transfer process

As the figure shows, all data in an HTTP exchange is transmitted in plaintext, so there is no security at all. Sensitive data such as user passwords and credit card information is especially at risk, and the consequences can be severe once a third party obtains it. Some readers might object: why not just protect sensitive data on the front end, for example with salted MD5? That view is too simplistic. First of all, MD5 is not an encryption algorithm. Its full name is the MD5 Message-Digest Algorithm, and it is an irreversible hash algorithm, so data hashed with MD5 on the front end cannot be restored on the server. Take a password as an example. The front end hashes the user’s password with MD5 and sends the hash value to the server. Since the server cannot recover the password, it handles the user’s request using the hash value directly. A third party who captures the hash value can therefore bypass the front-end login page and submit the hash to the server directly, which is still a security problem. In addition, MD5 itself has known security weaknesses, which are not discussed here.
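The replay problem described above can be sketched in a few lines of Python. This is a minimal illustration, not a real login system: the salt value, the password, and the `front_end_hash` and `server_login` helpers are all hypothetical names chosen for the example.

```python
import hashlib

SALT = "demo-salt"  # hypothetical salt known to both front end and server


def front_end_hash(password: str) -> str:
    """What the browser would send instead of the raw password."""
    return hashlib.md5((SALT + password).encode("utf-8")).hexdigest()


# The server can only store and compare the hash; it never sees the password.
STORED_HASH = front_end_hash("correct horse")


def server_login(received_hash: str) -> bool:
    """Hypothetical server check: compare the received hash with the stored one."""
    return received_hash == STORED_HASH


# Legitimate login: over HTTP the hash crosses the network in plaintext.
sniffed = front_end_hash("correct horse")
legit_ok = server_login(sniffed)

# Replay: an eavesdropper resends the sniffed hash without knowing the password.
replay_ok = server_login(sniffed)

print(legit_ok, replay_ok)
```

Both logins succeed, because from the server’s point of view the hash has effectively become the password.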

In short, hash algorithms such as MD5 and SHA-1 do not make HTTP more secure. The only way to secure HTTP is to use a real encryption algorithm, one that can encrypt and decrypt data with a key. As long as the key is not obtained by a third party, the data is secure. This is the approach HTTPS takes, so let’s look at encryption algorithms first.

2. Encryption algorithm

HTTPS uses encryption to secure data in transit. Specifically, it uses hybrid encryption, a combination of symmetric and asymmetric encryption. To understand it, we first need to understand how the two kinds of algorithm differ and what their strengths and weaknesses are.

2.1 Symmetric Encryption

Symmetric encryption, as its name implies, uses the same key for encryption and decryption. Common symmetric encryption algorithms include DES, 3DES, and AES. Their advantages and disadvantages are as follows:

  • Advantages: the algorithms are public, computation is cheap, and encryption is fast and efficient, which makes them suitable for encrypting large amounts of data.
  • Disadvantages:
    1. Both parties must use the same key, so the key has to be transmitted at some point. The key may be intercepted in transit, so the security of symmetric encryption on its own cannot be guaranteed.
    2. Each pair of users needs its own unique key that nobody else knows, so the number of keys that senders and receivers must hold grows rapidly, and key management becomes a burden for both parties. Symmetric encryption is hard to use in distributed network systems, mainly because key management is difficult and costly.
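To make the key-management point concrete: if every pair of users needs its own key, n users need n(n-1)/2 keys in total. The short sketch below just evaluates that formula; the function name `pairwise_keys` is made up for this example.

```python
def pairwise_keys(users: int) -> int:
    """Number of distinct keys if every pair of users shares its own key."""
    return users * (users - 1) // 2


for n in (2, 10, 100, 1000):
    print(n, "users ->", pairwise_keys(n), "keys")
```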

This article does not cover specific encryption algorithms in detail; interested readers can refer to a detailed explanation of symmetric encryption algorithms. If symmetric encryption were applied directly to HTTP, the result would look like this:

Figure 2. Symmetric encrypted data transmission process

As the figure shows, the encrypted data looks like random gibberish in transit. Even if a third party intercepts it, the data cannot be decrypted without the key, which keeps it secure. But there is a fatal problem. Since both parties must use the same key, one party has to send it to the other before any data can be exchanged. The key can be intercepted at that point, and then the encrypted data is easily decrypted. So how do you keep the key secure in transit? This is where asymmetric encryption comes in.
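The defining property of symmetric encryption, the same key for both directions, can be shown with a toy cipher built only from the Python standard library. This is an assumption-laden sketch: it is not DES, 3DES, or AES, and it must not be used for real data. The helper names `keystream` and `xor_cipher` are invented for the example.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (SHA-256 over a counter)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


key = secrets.token_bytes(16)              # the secret both sides must share
plaintext = b"card=4111-1111-1111-1111"

ciphertext = xor_cipher(key, plaintext)    # what an eavesdropper would see
recovered = xor_cipher(key, ciphertext)    # same key, same function
wrong = xor_cipher(secrets.token_bytes(16), ciphertext)  # a different key

print(ciphertext.hex())
print(recovered)
```

Anyone holding `key` can recover the plaintext, which is exactly why sending the key over the same insecure channel defeats the scheme.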

2.2 Asymmetric encryption

Asymmetric encryption, as its name implies, uses two different keys for encryption and decryption: a public key and a private key. The two keys form a pair. Data encrypted with the public key can only be decrypted with the corresponding private key, and data encrypted with the private key can only be decrypted with the corresponding public key. The basic process for exchanging confidential information is as follows: Party A generates a key pair and publishes one of the keys as the public key; Party B obtains the public key, encrypts the confidential information with it, and sends the result to Party A; Party A then decrypts the message with its own private key. If public and private keys are hard to picture, think of a key and a padlock. You are the only person in the world who holds the key, but you can hand the padlock to anyone. Other people can use the padlock to lock up something important and send it to you, and because only you hold the key, only you can see what was locked up. The most commonly used asymmetric encryption algorithm is RSA; readers who want to know more can refer to a detailed explanation of the RSA algorithm. Its advantages and disadvantages are as follows:

  • Advantages: the algorithm is public, encryption and decryption use different keys, and the private key never needs to be transmitted over the network, so security is high.
  • Disadvantages: computation is expensive, and encryption and decryption are much slower than with symmetric encryption.
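The public/private key relationship can be illustrated with textbook RSA on tiny numbers. This is a sketch for intuition only: the primes, the exponent, and the `apply_key` helper are chosen for the example, and real RSA uses far larger keys plus padding.

```python
# Textbook RSA with tiny primes; for illustration only, never for real use.
p, q = 61, 53
n = p * q                      # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)

public_key, private_key = (e, n), (d, n)


def apply_key(key: tuple, value: int) -> int:
    """Raise `value` to the key's exponent modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65                                     # must be smaller than n

# Party B encrypts with the public key; Party A decrypts with the private key.
ciphertext = apply_key(public_key, message)
decrypted = apply_key(private_key, ciphertext)

# The reverse direction also works: private key first, then public key.
round_trip = apply_key(public_key, apply_key(private_key, message))

print(ciphertext, decrypted, round_trip)
```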

Because asymmetric encryption is so secure, it can be used to solve the key-leakage problem of symmetric encryption. The result looks like this:

Figure 3. The client sends the KEY to the server via asymmetric encryption

In the process above, the client obtains the server’s public key and generates a random code (called KEY here and in the figure; it is the key used for the subsequent symmetric encryption). The client encrypts KEY with the public key and sends it to the server, and the server decrypts it with its private key, so that both sides hold the same KEY. The two parties then use KEY to symmetrically encrypt the data they exchange. While KEY is being transferred under asymmetric encryption, a third party who obtains the public key and the encrypted KEY still cannot recover KEY without the private key (the private key is stored on the server, so the risk of disclosure is minimal). This keeps the subsequent symmetrically encrypted data secure. The flow above is the prototype of HTTPS: it combines the strengths of both kinds of encryption to achieve both secure communication and efficient data transfer.
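The key-exchange flow just described can be simulated end to end in one process. This is a rough sketch under loud assumptions: the RSA key pair is a toy built from two known Mersenne primes with no padding, and the symmetric cipher is a hash-based XOR stand-in rather than AES. Neither is secure, and the names `xor_cipher`, `client_key`, `wrapped_key`, and `server_key` are invented for the example.

```python
import hashlib
import math
import secrets

# --- Server: toy RSA key pair (two known Mersenne primes; NOT secure) ---
P, Q = 2**61 - 1, 2**89 - 1
N, PHI, E = P * Q, (P - 1) * (Q - 1), 65537
assert math.gcd(E, PHI) == 1
D = pow(E, -1, PHI)            # private exponent, never leaves the server


def xor_cipher(key: int, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a SHA-256-derived keystream."""
    seed = key.to_bytes(16, "big")
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# --- Client: generate a random KEY and encrypt it with the public key (E, N) ---
client_key = secrets.randbits(128)
wrapped_key = pow(client_key, E, N)    # the only form of KEY sent over the wire

# --- Server: recover KEY with the private key ---
server_key = pow(wrapped_key, D, N)

# --- Both sides now encrypt traffic symmetrically with the shared KEY ---
request = xor_cipher(client_key, b"GET /account HTTP/1.1")
received = xor_cipher(server_key, request)

print(server_key == client_key, received)
```

The slow asymmetric operation runs once, on a 128-bit value; everything after that uses the fast symmetric cipher.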

3. HTTPS principles in detail

Let’s start with Wikipedia’s definition of HTTPS:

Hypertext Transfer Protocol Secure (HTTPS) is an extension of the Hypertext Transfer Protocol (HTTP). It is used for secure communication over a computer network, and is widely used on the Internet. In HTTPS, the communication protocol is encrypted using Transport Layer Security (TLS) or, formerly, its predecessor, Secure Sockets Layer (SSL). The protocol is therefore also often referred to as HTTP over TLS, or HTTP over SSL.


HTTPS is not an independent communication protocol, but an extension of HTTP to ensure communication security. The relationship between HTTPS and HTTP is as follows:

Figure 4. Relationship between HTTP and HTTPS

That is HTTPS = HTTP + SSL/TLS.
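Python’s standard `http.client` module happens to mirror this relationship, which makes for a quick check. The sketch below only inspects the classes and their default ports; constructing the connection objects does not open a socket, and `example.com` is just a placeholder host.

```python
import http.client

# HTTPSConnection is the plain HTTP client class with a TLS layer added on top.
is_extension = issubclass(http.client.HTTPSConnection, http.client.HTTPConnection)

# Creating the objects does not connect yet; it only records host and port.
plain = http.client.HTTPConnection("example.com")
secure = http.client.HTTPSConnection("example.com")

print(is_extension, plain.port, secure.port)
```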

Here is the most important part: how HTTPS actually works.

Figure 5. HTTPS encryption, decryption, verification and data transmission process

The figure may look overwhelming at first, but bear with me. The entire HTTPS communication process can be divided into two phases: certificate verification and data transmission. The data transmission phase can in turn be divided into an asymmetric encryption part and a symmetric encryption part. The steps below follow the numbers in the figure.

  1. The client requests an HTTPS URL and connects to port 443 of the server (the default HTTPS port, analogous to port 80 for HTTP).

  2. A server that uses HTTPS must have a certificate issued by a Certificate Authority (CA). Certificates generally have to be applied for and are issued by a dedicated digital certificate authority after strict verification (the higher the security level, the more expensive the certificate). A certificate is associated with a key pair consisting of a private key and a public key. The private key is kept by the server and must not be disclosed. The public key is included in the certificate and can be made public. The certificate also carries a digital signature, which is used to verify the integrity and authenticity of the certificate and to detect tampering.

  3. The server responds to the client’s request by sending its certificate, which contains the public key along with other information such as the certificate authority, the company, and the validity period. In Chrome, click the lock icon in the address bar and then click the certificate entry to see the details.

Figure 6. CA certificate of Bilibili

  4. The client parses the certificate and validates it. If the certificate was not issued by a trusted authority, the domain name in the certificate does not match the actual domain name, or the certificate has expired, the browser shows a warning and the visitor can choose whether to continue. Like this:

Figure 7. Browser security warning

If nothing is wrong with the certificate, the client extracts the server’s public key A from it. The client then generates a random code KEY and encrypts it with public key A.

  5. The client sends the encrypted random code KEY to the server; it will serve as the key for the subsequent symmetric encryption.

  6. After receiving the encrypted KEY, the server decrypts it with private key B. At this point the client and server have established a secure connection and the key-leakage problem of symmetric encryption is solved, so both sides can communicate using symmetric encryption.

  7. The server symmetrically encrypts data with KEY and sends it to the client. The client decrypts the data with the same KEY.

  8. Both parties transfer all subsequent data using symmetric encryption.
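The certificate checks in the validation step (trusted issuer, intact signature, matching domain, validity period) can be sketched as a small simulation. Everything here is hypothetical: the certificate is a plain dictionary rather than a real X.509 structure, "Demo Root CA" is an invented authority, and its key pair is toy RSA built from two known Mersenne primes without padding. Real clients rely on their TLS library for all of this.

```python
import hashlib
import math
from datetime import datetime, timezone

# Hypothetical CA key pair (toy RSA; illustration only, NOT secure).
CA_P, CA_Q = 2**61 - 1, 2**89 - 1
CA_N, CA_PHI, CA_E = CA_P * CA_Q, (CA_P - 1) * (CA_Q - 1), 65537
assert math.gcd(CA_E, CA_PHI) == 1
CA_D = pow(CA_E, -1, CA_PHI)                   # the CA's private exponent
TRUSTED_CAS = {"Demo Root CA": (CA_E, CA_N)}   # the client's trust store


def digest(cert: dict) -> int:
    """Hash the signed fields and reduce the result into the toy modulus."""
    body = f"{cert['subject']}|{cert['issuer']}|{cert['not_after']}|{cert['public_key']}"
    return int.from_bytes(hashlib.sha256(body.encode()).digest(), "big") % CA_N


def issue(subject: str, not_after: str, public_key: int) -> dict:
    """What the CA does: sign the certificate fields with its private key."""
    cert = {"subject": subject, "issuer": "Demo Root CA",
            "not_after": not_after, "public_key": public_key}
    cert["signature"] = pow(digest(cert), CA_D, CA_N)
    return cert


def validate(cert: dict, hostname: str, now: datetime) -> str:
    """What the client does: return "ok" or the reason for a warning."""
    ca_key = TRUSTED_CAS.get(cert["issuer"])
    if ca_key is None:
        return "untrusted issuer"
    if pow(cert["signature"], ca_key[0], ca_key[1]) != digest(cert):
        return "bad signature"
    if cert["subject"] != hostname:
        return "domain mismatch"
    if now > datetime.fromisoformat(cert["not_after"]):
        return "expired"
    return "ok"


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
cert = issue("example.com", "2030-01-01T00:00:00+00:00", public_key=12345)
print(validate(cert, "example.com", now))
print(validate(cert, "other.example", now))
print(validate(dict(cert, public_key=99999), "example.com", now))
```

Only when `validate` returns "ok" would the client go on to use the public key from the certificate; any other result corresponds to the browser warning shown in Figure 7.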


4. Summary

To wrap up, here are the differences between HTTPS and HTTP and the disadvantages of HTTPS.

The difference between HTTPS and HTTP:

  • The most important difference is security. HTTP transmits data in plaintext without encryption and is therefore insecure. HTTPS (HTTP + SSL/TLS) encrypts data in transit and is much more secure.
  • To use HTTPS, you need to apply for a certificate from a CA. Free certificates are generally rare, so some cost is involved. Certificate authorities include Symantec, Comodo, DigiCert, and GlobalSign.
  • HTTP pages generally respond faster than HTTPS pages: the added security layer makes connection setup more complex and requires more data to be exchanged, which inevitably affects speed.
  • Because HTTPS is HTTP running on top of SSL/TLS, it consumes more server resources than HTTP.
  • HTTPS and HTTP use different connection methods and different default ports: 443 and 80, respectively.

Disadvantages of HTTPS:

  • In the same network environment, HTTPS has significantly higher response time and power consumption than HTTP.
  • The protection HTTPS offers is limited in scope; it is almost useless against hacking, server hijacking, and similar attacks.
  • Under the existing certificate mechanism, man-in-the-middle attacks are still possible.
  • HTTPS requires more server resources, which can lead to higher costs.

In addition, the SSL/TLS handshake and related key concepts will be covered in detail in the second HTTPS article. That’s all for this article. If you find any mistakes, corrections are welcome. Finally, here are a few reference articles:

  • Wikipedia
  • Differences between HTTP and HTTPS
  • Details on SSL/TLS principles
  • In-depth understanding of HTTP1.x, HTTP 2, and HTTPS
  • Ruan Yifeng’s blog: Principles of the RSA Algorithm (1)
  • Ruan Yifeng’s blog: Principles of the RSA Algorithm (2)
  • Network security “attack and defense”: HTTPS protocol details