Why is HTTPS secure? This is a very common interview question, but answering it properly requires some background knowledge. This article covers that background, so the specific communication flow of HTTPS is not discussed here.

Let’s start with a question: how do you transmit information securely?

  • Keep the transmitted content confidential, that is, never transmit plaintext

  • Prevent the transmitted content from being tampered with, that is, make tampering detectable

  • Confirm that the other party really is who they claim to be, that is, authenticate the identity of the communicating party

Around these points, let’s take a look at common encrypted communication methods and their problems.

Classical encryption

Encryption has its roots in warfare. The famous Caesar cipher, for example, works very simply, as shown below:

Each plaintext letter is shifted to the right by a fixed number of positions to obtain the corresponding ciphertext letter. In the picture above, each letter moves two places to the right, so the plaintext BAG becomes DCI. That way, even if the message is intercepted, the enemy cannot read the real message.

The key of the Caesar cipher is the number of positions (2 in the image above) that each letter is shifted to the right. The key is as important as the plaintext: losing the key is no different from losing the plaintext. Obviously, this key is far too weak, since there are only 25 useful shifts to try. Even the later variant that used a scrambled alphabet was still easy to break.
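To make the shift concrete, here is a minimal Python sketch of the Caesar cipher. The function name caesar_shift is made up for this illustration, and it only handles the uppercase letters A to Z. It also shows why the key is so weak: an eavesdropper can simply try every possible shift.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABC...Z"


def caesar_shift(text: str, key: int) -> str:
    """Shift each uppercase letter `key` places to the right, wrapping around after Z."""
    shifted = []
    for ch in text:
        if ch in ALPHABET:
            shifted.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            shifted.append(ch)  # leave anything that is not A-Z untouched
    return "".join(shifted)


ciphertext = caesar_shift("BAG", 2)       # "DCI"
plaintext = caesar_shift(ciphertext, -2)  # shifting back by the key decrypts

# Brute force: only 25 useful keys exist, so trying them all recovers the message.
candidates = [caesar_shift(ciphertext, -k) for k in range(1, 26)]

print(ciphertext, plaintext, "BAG" in candidates)
```

Decryption is just the same shift in the opposite direction, which is why anyone who learns the key learns the message.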

As technology advanced, modern cryptography, built on the development of computers, brought mathematically grounded encryption methods and (pseudo)randomly generated keys, making it possible to encrypt a piece of information both quickly and safely.

Symmetric encryption

As the name suggests, in symmetric encryption the encryption and decryption processes are symmetric: the same key is used for both. Typical symmetric encryption algorithms are DES and AES.
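The defining property, one key for both directions, can be sketched in a few lines of Python. Note that this is not AES or DES: Python’s standard library ships neither, so the example assumes a toy cipher that XORs the data with a keystream derived from SHA-256. The names keystream and xor_cipher are hypothetical, and the construction is for illustration only; it should not be used to protect real data.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Stretch the key into `length` pseudo-random bytes by hashing key + counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with the keystream. The same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


shared_key = secrets.token_bytes(16)       # both parties must already hold this key
message = b"meet at dawn"

ciphertext = xor_cipher(shared_key, message)
recovered = xor_cipher(shared_key, ciphertext)              # same key: original message
wrong = xor_cipher(secrets.token_bytes(16), ciphertext)     # different key: garbage

print(recovered == message, wrong == message)
```

Everything hinges on shared_key already being in both parties’ hands, which is exactly the key transfer problem discussed in this section.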

DES is a symmetric encryption algorithm adopted in 1977 as a United States Federal Information Processing Standard. It can now be broken by brute force, so it should no longer be used except for compatibility reasons. There is also triple DES, which strengthens DES by applying it three times. Because it is slow, it is rarely chosen for new uses except where backward compatibility is especially important.

The most widely used symmetric encryption algorithm is AES, which was selected through an open, worldwide competition. After years of scrutiny by cryptographers all over the world, it is widely trusted. So, can’t we just use AES to encrypt the communication content?

A fatal problem of symmetric encryption is key transmission. Encryption and decryption use the same key, so one party must first deliver the key to the other before the two can communicate. However, if there were a reliable way to transmit the key, the communication content itself could be transmitted safely in the same way. Symmetric encryption therefore only turns the question of how to transmit the content safely into the question of how to transmit the key safely; by itself it solves nothing.

So how can the key transfer problem be solved?

Asymmetric encryption

To solve the key transfer problem, we can “not transfer” the key.

Asymmetric encryption solves the problem of key transmission with a pair of keys: a public key and a private key. The sender encrypts with the public key and the receiver decrypts with the private key. The public key can be published openly on the network, while the private key is kept by the receiver and must never be disclosed. The private key is what guarantees the security of the communication; once it leaks, the encrypted communication can be decrypted. The most commonly used asymmetric encryption algorithm is RSA.
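As a rough illustration of the public key and private key split, here is textbook RSA in Python with deliberately tiny primes. The numbers (p = 61, q = 53, e = 17) are a common classroom example, not taken from any real system, and the function names are made up for this sketch. Real RSA uses keys of 2048 bits or more together with a padding scheme, both of which are omitted here.

```python
# Toy key generation with tiny primes (real keys use primes hundreds of digits long).
p, q = 61, 53
n = p * q                  # 3233, the modulus shared by both keys
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent: modular inverse of e (Python 3.8+)

public_key = (e, n)        # may be published to anyone
private_key = (d, n)       # stays with the receiver


def encrypt(message: int, key: tuple) -> int:
    """Sender side: encrypt a number smaller than n with the public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, key: tuple) -> int:
    """Receiver side: decrypt with the private key."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


message = 65
ciphertext = encrypt(message, public_key)     # 2790
recovered = decrypt(ciphertext, private_key)  # 65

print(d, ciphertext, recovered)
```

Only (e, n) ever travels over the network; the value d needed for decryption never leaves the receiver.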

Asymmetric encryption seems to solve the key transfer problem perfectly, but it still has an obvious problem.

Asymmetric encryption is far slower than symmetric encryption, roughly one hundredth of its speed. In a browser or instant messaging scenario, this speed may be unacceptable to users. Therefore, symmetric and asymmetric encryption are often used together in practice, as shown in the following figure.

Using asymmetric encryption to protect the symmetric encryption key solves the key transmission problem of symmetric encryption and also avoids the slowness of asymmetric encryption, because only the short key is encrypted asymmetrically while the bulk of the data is encrypted symmetrically.
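The combination can be sketched as follows. This is a toy under stated assumptions, not how any real protocol is implemented: the receiver’s RSA key is built from two small, well-known Mersenne primes with no padding, and the symmetric part is a SHA-256 based XOR keystream standing in for a real cipher such as AES. All names are hypothetical.

```python
import hashlib
import secrets

# Receiver's toy RSA key pair built from two known Mersenne primes.
# Far too small for real use; it only has to be larger than a 16-byte session key.
p, q = 2**61 - 1, 2**89 - 1
n = p * q
e = 65537                            # public exponent, published with n
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent, kept by the receiver


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a SHA-256 based keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Sender: pick a fresh session key, wrap it with the receiver's public key,
# and encrypt the (possibly large) message with the fast symmetric cipher.
message = b"hello over an insecure channel"
session_key = secrets.token_bytes(16)
wrapped_key = pow(int.from_bytes(session_key, "big"), e, n)
ciphertext = xor_cipher(session_key, message)

# Receiver: unwrap the session key with the private key, then decrypt the message.
unwrapped = pow(wrapped_key, d, n).to_bytes(16, "big")
plaintext = xor_cipher(unwrapped, ciphertext)

print(plaintext)
```

Only wrapped_key and ciphertext cross the network. The slow asymmetric step touches 16 bytes, no matter how long the message is.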

The encryption of the communication content is now solved. Even if the encrypted content and the encrypted symmetric key are intercepted, they cannot be decrypted and read without the private key. So is the communication flow secure now? No, one core question remains: on the colorful and complicated Internet, is the other party really who they say they are? The cute girl you are voice chatting with might actually be a burly man. A typical example is the man-in-the-middle attack. The following diagram depicts its basic flow.

The middleman intercepts the communication link between the two parties by some technical means and swaps the public key sent to the sender for his own. He can then read and forge the communication content without being detected. The sender cannot tell whether the public key it received really belongs to the receiver, and the receiver cannot tell that the message has been tampered with.

At this point, neither tamper detection nor authentication of the other party’s identity has been solved.

Hash algorithms

When it comes to detecting tampering, hash algorithms come to mind easily. A hash algorithm takes input of any length and computes a fixed-length value, known as a hash value or message digest. A hash algorithm is not an encryption algorithm; it is only used to verify the integrity of a message. For example, official websites usually publish hash values next to software downloads so that users can compare them.
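Both properties, the fixed output length and the integrity check, are easy to see with Python’s hashlib. In this small sketch the file contents and the helper name is_intact are invented for illustration.

```python
import hashlib

# Inputs of very different sizes produce digests of the same length.
short_digest = hashlib.sha256(b"hi").hexdigest()
long_digest = hashlib.sha256(b"x" * 1_000_000).hexdigest()
print(len(short_digest), len(long_digest))  # 64 hex characters (256 bits) each

# A site publishes the digest of its installer; the user recomputes and compares.
original = b"installer-v1.0 contents"
published_digest = hashlib.sha256(original).hexdigest()


def is_intact(downloaded: bytes, expected_hex: str) -> bool:
    """Return True if the downloaded bytes hash to the published digest."""
    return hashlib.sha256(downloaded).hexdigest() == expected_hex


print(is_intact(original, published_digest))         # unchanged download
print(is_intact(original + b"!", published_digest))  # a single extra byte is detected
```

The comparison only helps if the published digest itself reaches the user unmodified.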

The once-common MD4 and MD5, as well as SHA-1, are no longer secure and are not recommended. SHA-2 or SHA-3 is currently recommended.

Hashing is rarely used on its own in encrypted communication, because it still fails to solve the problem described in the previous section. If the sender sends the message together with its hash value, all the middleman has to do is replace the hash value as well, so the hash alone achieves nothing. Hashing has to be combined with the other techniques in the system.

Message authentication codes

A message authentication code (MAC) is similar to a hash: it takes content of arbitrary length and computes a fixed-length code. The difference is that the calculation also requires a secret key shared by the sender and the receiver. In other words, a message authentication code is a hash algorithm combined with a key.

When sending data, the sender calculates the message authentication code with the shared key and sends it along with the data. After receiving the data, the receiver calculates the message authentication code with the same shared key and compares it with the one sent by the sender. The most common message authentication code algorithm is HMAC.

Because the shared key is known only to the two parties, the middleman cannot compute a valid code for a modified message, so the receiver can detect the tampering by recalculating the message authentication code.
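Python’s standard hmac module makes this flow easy to try out. The sketch below assumes HMAC with SHA-256 and a randomly generated 32-byte shared key; the function names and the messages are invented for the example.

```python
import hashlib
import hmac
import secrets

shared_key = secrets.token_bytes(32)  # known only to sender and receiver


def make_tag(key: bytes, message: bytes) -> bytes:
    """Sender side: compute the message authentication code."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the code and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


message = b"transfer 100 to Bob"
tag = make_tag(shared_key, message)

# The middleman changes the message. Without the key, the best he can do is
# reuse the old tag or attach a plain hash of his own message.
tampered = b"transfer 900 to Eve"
forged_tag = hashlib.sha256(tampered).digest()

print(verify(shared_key, message, tag))          # genuine message is accepted
print(verify(shared_key, tampered, tag))         # old tag no longer matches
print(verify(shared_key, tampered, forged_tag))  # plain hash is not a valid code
```

Unlike the bare hash from the previous section, replacing the code is no longer enough, because producing a matching one requires the shared key.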

What? A shared key? How can a shared key be transmitted safely without being intercepted by the middleman? Indeed, message authentication codes have the same key transfer problem, which can again be solved by introducing asymmetric encryption.

At the same time, because the key is shared, a message authentication code cannot prove anything to a third party and cannot prevent denial. Both parties hold the same key, so it is impossible to determine which of them produced a given message. For example, for a message like “I owe you 500W”, the sender can claim that the receiver wrote it, and the receiver can claim that the sender wrote it.

To solve this problem, digital signatures are introduced.

Digital signatures

Digital signatures sound fancy, but the principle is simple. Let’s start from the asymmetric encryption diagram.

In asymmetric encryption, the sender holds the public key and the receiver holds the private key. The public key can be transmitted directly over the network, while the private key is available only to the receiver.

Now imagine the process shown above in reverse: the receiver sends a message to the sender, encrypting it with the private key, and the sender decrypts it with the public key after receiving it. Since the private key is held only by the receiver, the sender can be certain that the message came from the receiver. Doesn’t that verify the identity of the other party, prevent denial, and allow proof to a third party?

Encrypting with the private key and decrypting with the public key is in fact a digital signature. In digital signatures, however, encrypting with the private key is called generating a signature and decrypting with the public key is called verifying a signature, which is the reverse of asymmetric encryption. Let me draw a picture.

This is just a simple schematic. In real use, the private key is not used to sign the original data directly. Instead, the original data is hashed first and then the hash value is signed, which reduces the amount of data that has to be signed and transmitted. Here’s another picture:

The figure shows the raw data being sent directly, but that raw data does not have to be plaintext. In practice, it can be protected with the scheme described earlier: symmetric encryption whose key is protected by asymmetric encryption.
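The hash-then-sign flow can be sketched with the same kind of toy RSA numbers. The assumptions are loud ones: the key is built from two small Mersenne primes, the SHA-256 digest is simply reduced modulo n so that the tiny key can sign it, and no padding is applied. Real signature schemes use large keys and a padding scheme, so treat this purely as an illustration; the function names are hypothetical.

```python
import hashlib

# Signer's toy RSA key pair (two known Mersenne primes, far too small for real use).
p, q = 2**61 - 1, 2**89 - 1
n = p * q
e = 65537                            # public: used by anyone to verify
d = pow(e, -1, (p - 1) * (q - 1))    # private: used only by the signer


def digest_as_int(message: bytes) -> int:
    """Hash the message, then reduce it below n so the toy key can sign it."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes) -> int:
    """Generate a signature: apply the private key to the hash of the message."""
    return pow(digest_as_int(message), d, n)


def verify(message: bytes, signature: int) -> bool:
    """Verify a signature: apply the public key and compare with a fresh hash."""
    return pow(signature, e, n) == digest_as_int(message)


message = b"I owe you 500W"
signature = sign(message)

print(verify(message, signature))            # genuine message and signature
print(verify(b"I owe you 5W", signature))    # altered message is rejected
print(verify(message, signature + 1))        # altered signature is rejected
```

Because only the holder of d can produce a value that verifies under (e, n), the signer cannot later claim that someone else wrote the message.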

Digital signatures are quite capable. Let’s review the requirements for secure transmission mentioned at the beginning of the article:

  • Keep the transmitted content confidential, that is, never transmit plaintext
  • Prevent the transmitted content from being tampered with, that is, make tampering detectable
  • Confirm that the other party really is who they claim to be, that is, authenticate the identity of the other party

Combined with the encryption schemes above, digital signatures meet all of these requirements. But…

Whether we use asymmetric encryption alone or digital signatures, there is a problem with the public key. The public key is published openly on the network. How can we be sure that the public key used for asymmetric encryption or for signature verification has not been forged?

That is the job of certificates, the topic of the last section of this article.

Certificates

The problem that certificates address is the legitimacy of public keys. That is, the public key must be securely transferred from one party to another, and cannot be swapped or tampered with.

Wait, isn’t that what this article is about? How to transmit information securely? Now the information to be transferred is the public key. No doubt, all of the methods discussed above can be applied here, and digital signatures are a good choice.

Yes, a certificate is a digital signature of a public key.

To its sender, a public key is just an ordinary piece of data to be transmitted. To avoid confusion, let’s call it the public key to be transmitted. The sender generates a pair of public and private keys (public key A and private key A) and uses private key A to digitally sign the public key to be transmitted, indicating that this public key really does come from the sender. A third party that receives the public key (such as a browser) can then use the sender’s public key A to verify the signature and so check whether the public key is legitimate.

I don’t know whether I have confused you. If not, the logical flaw should be easy to spot. To verify the legitimacy of the public key to be transmitted, we introduced public key A. But what about the legitimacy of public key A? Introduce yet another pair of public and private keys? Such endless nesting dolls still do not solve the underlying problem. So what can be done?

Well, there is no way around it. In fact, that is exactly how it works. Take GitHub as an example. Click the little lock on the left side of Chrome’s address bar to view the certificate information.

In the certificate path you can see three layers, which together form a complete certificate chain. The github.com certificate is vouched for by DigiCert SHA2 High Assurance Server CA, which in turn is vouched for by DigiCertA one level up. DigiCertA vouches for itself, which means we must trust it unconditionally, or the nesting would never end.

This DigiCertA is called the root certificate, and it is built into our operating system or browser. These root certificates guarantee security level by level down the chain, all the way to the certificate used in a given connection. In addition to the built-in root certificates, users can also install certificates that they choose to trust.
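The level-by-level guarantee can be sketched as a walk along the chain. Everything below is a toy: the Certificate class, the names “Toy Root CA”, “Toy Intermediate CA” and “example.test”, and the tiny RSA keys built from Mersenne primes are all invented for illustration, and real X.509 certificates carry far more fields and checks (validity dates, revocation, and so on).

```python
import hashlib
from dataclasses import dataclass


def make_keypair(p: int, q: int, e: int = 65537):
    """Toy RSA key pair from two primes: returns (public, private)."""
    n = p * q
    return (e, n), (pow(e, -1, (p - 1) * (q - 1)), n)


@dataclass
class Certificate:
    subject: str        # who the public key belongs to
    public_key: tuple   # the subject's (e, n)
    issuer: str         # who vouches for it
    signature: int = 0  # issuer's signature over the three fields above


def body_digest(cert: Certificate, n: int) -> int:
    """Hash subject, public key and issuer, reduced below the signer's modulus."""
    body = f"{cert.subject}|{cert.public_key}|{cert.issuer}".encode()
    return int.from_bytes(hashlib.sha256(body).digest(), "big") % n


def issue(subject, public_key, issuer, issuer_private) -> Certificate:
    """Issuer signs the certificate body with its private key."""
    d, n = issuer_private
    cert = Certificate(subject, public_key, issuer)
    cert.signature = pow(body_digest(cert, n), d, n)
    return cert


def signed_by(cert: Certificate, issuer_public) -> bool:
    e, n = issuer_public
    return pow(cert.signature, e, n) == body_digest(cert, n)


def verify_chain(chain, trusted_roots) -> bool:
    """chain[0] is the site's certificate; each one must be signed by the next,
    and the last must be a self-signed root that is already trusted."""
    for cert, issuer_cert in zip(chain, chain[1:]):
        if cert.issuer != issuer_cert.subject:
            return False
        if not signed_by(cert, issuer_cert.public_key):
            return False
    root = chain[-1]
    return (trusted_roots.get(root.subject) == root.public_key
            and signed_by(root, root.public_key))


root_pub, root_priv = make_keypair(2**107 - 1, 2**127 - 1)
mid_pub, mid_priv = make_keypair(2**61 - 1, 2**89 - 1)
site_pub, _ = make_keypair(2**17 - 1, 2**19 - 1)

root = issue("Toy Root CA", root_pub, "Toy Root CA", root_priv)  # self-signed
intermediate = issue("Toy Intermediate CA", mid_pub, "Toy Root CA", root_priv)
leaf = issue("example.test", site_pub, "Toy Intermediate CA", mid_priv)

trusted_roots = {"Toy Root CA": root_pub}  # what the system or browser ships with

# A middleman swaps in his own key and signs the forged certificate himself.
evil_pub, evil_priv = make_keypair(2**13 - 1, 2**31 - 1)
forged = issue("example.test", evil_pub, "Toy Intermediate CA", evil_priv)

print(verify_chain([leaf, intermediate, root], trusted_roots))    # genuine chain
print(verify_chain([forged, intermediate, root], trusted_roots))  # forged leaf
```

The forged certificate fails because the middleman does not hold the intermediate’s private key, and the whole check bottoms out in trusted_roots, the one thing that is trusted without further proof.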

In addition to the public key and the signature, a certificate contains some additional information. Most certificates follow the X.509 standard, which is not described in detail here; interested readers can look it up. Let’s take a quick look at the basic certificate information in Chrome.

Conclusion

At this point, most of the problems of secure communication have been solved. Let’s recap.

Symmetric encryption is generally used to encrypt the communication itself, but it has the problem of key transmission.

Asymmetric encryption is roughly one hundredth as fast as symmetric encryption, so it is not used to encrypt the communication content directly. Instead, it is used together with symmetric encryption to protect the symmetric key, which solves the key transmission problem.

Hash algorithms are mainly used to verify the integrity of information.

A message authentication code is a hash algorithm combined with a key. Through the shared key, it not only ensures the integrity of information but also provides authentication, confirming that the message comes from the expected party. However, it also has the key transmission problem.

Digital signatures use the private key to sign and the public key to verify the signature. They confirm the integrity of information, confirm the identity of the communicating party, and prevent denial.

The purpose of a certificate is to guarantee the legitimacy of a public key, and its essence is a digital signature over that public key. Its security is guaranteed by the root certificate at the top of the certificate chain.

With these common technologies in mind, HTTPS is nothing more than a combination of them. In the next article, we will explore the specific communication flow of HTTPS and how it applies these encryption technologies.
