Preface

All the examples in this article use the characters and the scenario introduced in this preface.

People: Little Blue, Little Red, Little Black.

Event: Communication

Encryption algorithms

Symmetric encryption

Symmetric encryption means that both communicating parties hold the same key, and that encryption and decryption are both performed with that key. Here's an example:

Blue and Red write love letters to each other, but they do not want anyone else to read them. How can they keep their letters secret?

Because the letters go through the post office, there is no way to guarantee that Little Black (who has a crush on Little Red) will not read them in transit. All Blue and Red need is for the letters to be unintelligible to anyone else, so they agree on a rule in advance, such as reading only every third word, or only the middle word of each sentence.

With such a rule, even if Black reads a letter, he cannot tell what it really says as long as he does not know the rule. The real meaning of the letter stays hidden from Black.
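The shared rule can be sketched in a few lines of Python. This is only an illustration of "both sides apply the same secret rule", assuming the rule is "read every third word"; the filler words and the names `encode` and `decode` are made up for this sketch, and the scheme is not real cryptography.

```python
from itertools import cycle

# Hypothetical padding words used to hide the real message.
FILLER = ["the", "weather", "is", "nice", "today"]

def encode(message: str, step: int = 3) -> str:
    """Hide each real word behind (step - 1) filler words."""
    filler = cycle(FILLER)
    out = []
    for word in message.split():
        out.extend(next(filler) for _ in range(step - 1))
        out.append(word)
    return " ".join(out)

def decode(letter: str, step: int = 3) -> str:
    """Apply the shared rule: keep only every step-th word."""
    return " ".join(letter.split()[step - 1::step])

letter = encode("meet me at noon")
print(letter)          # what Black sees in transit
print(decode(letter))  # what Red reads using the shared rule
```

Both functions depend on the same `step` value, which plays the role of the shared key: whoever knows it can read every letter.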

Advantages

Encryption and decryption are fast and add little performance overhead, which is especially noticeable when encrypting large amounts of data.

Disadvantages

The biggest drawback of symmetric encryption is that negotiating the encryption rule is itself not secure.

Red and Blue can only communicate by letter, so the encryption rule must also be sent through the mail. That letter is not encrypted, so Black can intercept it and learn the rule. Once Black knows the rule, he can read every letter that follows.

Why should Blue and Red be restricted to communicating by mail? Can't they agree on the rule in private first? In real life they can, but on the Internet a website cannot meet every user privately to agree on a key. It can only send each user a request to generate a symmetric key, after which both sides use that key to encrypt and decrypt their traffic. The initial request that generates and delivers the key is sent in cleartext, and anything sent in cleartext is at risk.

Asymmetric encryption

Asymmetric encryption uses two keys: a private key and a public key. Either key can be used for encryption, but content encrypted with the public key can only be decrypted with the private key, and content encrypted with the private key can only be decrypted with the public key.

Continuing the scenario above, Red and Blue notice that their letters always show signs of having been opened, and they realize that a third party has been reading them. Naturally, they are not happy that their little secrets have been exposed. Blue realizes that sending the key in cleartext is not safe, so he sends Red a safe instead. Red puts her letter in the safe and locks it. Only Blue has the key to the safe, and nobody else has ever seen it.

The safe is the public key and the key is the private key. Contents encrypted by the safe (public key) can only be unlocked by the key (private key).
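A toy version of this idea, using textbook RSA with tiny primes, might look like the sketch below. The primes, the exponent, and the helper name `apply_key` are illustrative assumptions; real keys are thousands of bits long and real RSA adds padding.

```python
from typing import Tuple

# Toy RSA with tiny primes: for illustration only.
p, q = 61, 53
n = p * q                      # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)    # the "safe": anyone may have it
private_key = (d, n)   # the "key to the safe": only Blue has it

def apply_key(value: int, key: Tuple[int, int]) -> int:
    """Encrypt or decrypt a number with one half of the key pair."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

message = 65
ciphertext = apply_key(message, public_key)     # Red locks the safe
recovered = apply_key(ciphertext, private_key)  # Blue unlocks it

# The other direction also works: private key "encrypts", public key "decrypts".
signature = apply_key(message, private_key)
checked = apply_key(signature, public_key)

print(recovered == message, checked == message)
```

The second direction is the basis of digital signatures, which the certificate sections later rely on.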

Advantages

The public key can be given to anyone.

Only the private key can unlock the content encrypted by the public key.

Disadvantages

The private key must never be disclosed; if it leaks, the encryption provides no protection at all. This drawback is acceptable.

Asymmetric encryption costs more in performance than symmetric encryption, so the two are generally used together, as in HTTPS.

Asymmetric encryption also has one major vulnerability: the man-in-the-middle attack, which works as follows.

The critical step is that Black intercepts Blue's safe and replaces it with his own. Neither Blue nor Red notices: Red does not know what Blue's safe looks like, so all she can confirm is that she received a safe. Blue does not know that his safe was intercepted either, because the reply he receives is locked in his own safe and opens with his own key.
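The attack can be simulated with the same kind of toy RSA keys. Everything here (the small primes, the helper names, the number standing in for the letter) is an assumption made for illustration.

```python
from typing import Tuple

Key = Tuple[int, int]

def make_keypair(p: int, q: int, e: int = 17) -> Tuple[Key, Key]:
    """Return (public_key, private_key) for toy RSA with primes p and q."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (e, n), (d, n)

def apply_key(value: int, key: Key) -> int:
    exponent, modulus = key
    return pow(value, exponent, modulus)

blue_public, blue_private = make_keypair(61, 53)
black_public, black_private = make_keypair(67, 71)

# Blue mails his public key (the safe), but Black swaps it in transit.
key_red_receives = black_public

love_letter = 42
sent_by_red = apply_key(love_letter, key_red_receives)

# Black opens the letter with his own private key and reads it ...
read_by_black = apply_key(sent_by_red, black_private)
# ... then re-locks it with Blue's real public key and forwards it.
forwarded = apply_key(read_by_black, blue_public)

# Blue opens it with his own private key and suspects nothing.
read_by_blue = apply_key(forwarded, blue_private)
print(read_by_black, read_by_blue)
```

Nothing in the exchange tells Red whose public key she received, which is exactly the gap that certificates are meant to close.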

HTTPS

As we all know, HTTP packets travel over the network in plain text, which means we are effectively running naked through the network. Anyone who captures traffic with tools such as Wireshark, tcpdump, or dsniff can read the information that other people send over the network.

HTTPS is a combination of HTTP + TLS/SSL.

https = http + tls / ssl

This is the protocol stack in use when we speak HTTP:

And this is the protocol stack in use when we speak HTTPS:

We can see that HTTPS adds a TLS/SSL layer between TCP and HTTP.
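The layering shows up directly in code. The sketch below uses Python's standard `socket` and `ssl` modules: a TCP connection is opened first, a TLS layer is wrapped around it, and plain HTTP is then written through the TLS socket. The function names are made up for this example, and the host passed to `https_get` is up to the caller.

```python
import socket
import ssl

def build_tls_context() -> ssl.SSLContext:
    """Default client settings: verify the certificate chain and the host name."""
    return ssl.create_default_context()

def https_get(host: str, path: str = "/", port: int = 443) -> bytes:
    """Send one HTTP request through TLS over TCP and return the raw response."""
    context = build_tls_context()
    # Layer 1: TCP connection.
    with socket.create_connection((host, port), timeout=10) as tcp_sock:
        # Layer 2: TLS handshake on top of the TCP socket.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            # Layer 3: ordinary HTTP text, now encrypted by the TLS layer.
            request = (
                f"GET {path} HTTP/1.1\r\n"
                f"Host: {host}\r\n"
                "Connection: close\r\n\r\n"
            )
            tls_sock.sendall(request.encode("ascii"))
            chunks = []
            while True:
                data = tls_sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Calling `https_get("example.com")` would perform the whole handshake described in the next sections before any HTTP byte is sent.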

The handshake

The following is a network request to the Juejin homepage. I used Wireshark to capture the HTTPS handshake. I'll skip the TCP three-way handshake and focus on the TLS/SSL handshake.

Client Hello

IP layer: The red box on the left is my intranet address, which the router's NAT translates into a public address when I access the Internet. The red box on the right is the public address of Juejin.

TCP layer: The red box on the left is the port I used to send this request (the source port); the server's reply is delivered back to this port. The red box on the right is the port I want to reach. You can see that HTTPS uses port 443, which is its default port, while HTTP's default port is 80.

Roughly speaking, IP is about finding a host, while TCP is about finding a process on the host.

TLS layer: The important part is in the blue box, which contains:

  • The highest protocol version the client supports (versions are backward compatible)
  • A random number
  • A list of cipher suites (all the suites the client supports, 17 in this case)

Server Hello

As soon as the server receives the Client Hello, it replies with a Server Hello message.

The addresses and port numbers in the red boxes confirm that this message is the reply to the Client Hello above.

The blue box again contains:

  • A version number (versions are backward compatible)
  • A random number
  • One cipher suite that both sides support, chosen from the list sent by the client. This choice matters a great deal, because different cipher suites lead to different steps in the rest of the handshake.

At this point, the client and server have agreed on the protocol version and the cipher suite, and each side holds two random numbers.
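The two hello messages can be modeled roughly as follows. The class and function names are invented for this sketch, and choosing the first match in the server's preference order is an assumption (real servers may be configured to honor the client's order instead). The three cipher suite numbers are the identifiers registered for those suites.

```python
import os
from dataclasses import dataclass, field
from typing import List, Tuple

TLS_1_2 = (3, 3)  # TLS 1.2 is encoded as version 3.3 on the wire

TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 = 0xC02F
TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 = 0xC02B
TLS_RSA_WITH_AES_128_GCM_SHA256 = 0x009C

@dataclass
class ClientHello:
    version: Tuple[int, int]
    cipher_suites: List[int]
    random: bytes = field(default_factory=lambda: os.urandom(32))

@dataclass
class ServerHello:
    version: Tuple[int, int]
    cipher_suite: int
    random: bytes = field(default_factory=lambda: os.urandom(32))

def answer_client_hello(hello: ClientHello, server_suites: List[int],
                        server_version: Tuple[int, int] = TLS_1_2) -> ServerHello:
    """Pick a common version and the first suite both sides support."""
    version = min(hello.version, server_version)
    for suite in server_suites:  # assumed: server preference order
        if suite in hello.cipher_suites:
            return ServerHello(version, suite)
    raise ValueError("no cipher suite in common")

client_hello = ClientHello(TLS_1_2, [
    TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
    TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
    TLS_RSA_WITH_AES_128_GCM_SHA256,
])
server_hello = answer_client_hello(
    client_hello,
    [TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, TLS_RSA_WITH_AES_128_GCM_SHA256],
)
print(hex(server_hello.cipher_suite))
```

After this exchange both sides know `client_hello.random` and `server_hello.random`, the two random numbers used later in key derivation.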

Server Certificate

Right after the Server Hello message, the server sends a Server Certificate message. It contains one very important thing: the certificate issued by a CA (called the CA certificate in this article), also known as a digital certificate.

You can see the CA certificate in the red box.

This certificate usually serves two purposes:

  • It authenticates the server
  • It carries the server's public key, which is used together with the cipher suite's key negotiation algorithm to negotiate the pre-master secret

Server Key Exchange

This message is sent only under certain conditions. As noted under Server Hello, the selected cipher suite determines what happens in the rest of the handshake.

It is sent when the certificate does not contain enough information for the key exchange, in which case the missing information has to be sent separately. Typically, it is sent when the negotiated cipher suite is one of the following types:

  • DHE_DSS
  • DHE_RSA
  • ECDHE_ECDSA
  • ECDHE_RSA

In our capture, the cipher suite negotiated by the client and server is of type ECDHE_RSA, so this message is sent. It immediately follows the Server Certificate message.

Each time a client connects, the server sends fresh DH information (the DH parameters and the DH public key, which is not the public key in the certificate). This information is not in the certificate, so it has to be carried in the Server Key Exchange message, and it must be signed with the server's private key.

In the capture, the DH public key is the Pubkey field, the signature is the Signature field, and the DH parameters consist of Curve Type, Named Curve, and Pubkey.

This information is used for key exchange. HTTPS encrypts the actual traffic with symmetric encryption, so exchanging the symmetric key securely is critical. The information in this message is what allows the client and server to establish that key securely.

For details, you can view the DH algorithm process. The following documents are provided:

  • www.zhihu.com/topic/20030…
  • Blog.csdn.net/lee24486814…
  • www.zhihu.com/question/28…

Server Hello Done

After sending the messages above, the server immediately sends Server Hello Done and waits for the client's response.

This message serves two purposes:

  • It tells the client that the server has sent everything needed to negotiate the pre-master secret
  • Once the client receives it, the client can verify the certificate and proceed with key negotiation

Client Key Exchange

The client sends this message immediately after receiving Server Hello Done from the server. Its main purpose is to negotiate the pre-master secret. There are generally two ways to do this:

  • The client generates the pre-master secret, encrypts it with the server's RSA public key, and sends it to the server
  • The client computes its own DH public key from the DH parameters sent by the server and sends it to the server

The packet capture shows that the second method is used here, because the message contains only a DH public key.

For details about how the client calculates its DH public key, see the DH references above.

In the end, the client and server compute the same pre-master secret. This pre-master secret is the third random number.
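How both sides arrive at the same value without ever sending it can be shown with classic finite-field Diffie-Hellman and tiny textbook numbers. Note the simplification: the capture in this article uses ECDHE, the elliptic-curve variant, while the sketch below uses modular exponentiation with made-up small parameters.

```python
import secrets

# Public DH parameters (tiny textbook values, for illustration only).
P = 23   # prime modulus
G = 5    # generator

def dh_public(private: int) -> int:
    """Public value derived from a private value; safe to send in the clear."""
    return pow(G, private, P)

def dh_shared(their_public: int, my_private: int) -> int:
    """Shared secret computed locally from the peer's public value."""
    return pow(their_public, my_private, P)

# Each side picks a private value that never leaves the machine.
server_private = secrets.randbelow(P - 2) + 1
client_private = secrets.randbelow(P - 2) + 1

# Only the public values cross the network
# (Server Key Exchange and Client Key Exchange).
server_public = dh_public(server_private)
client_public = dh_public(client_private)

server_secret = dh_shared(client_public, server_private)
client_secret = dh_shared(server_public, client_private)
print(server_secret == client_secret)
```

An eavesdropper sees `P`, `G`, and both public values, but with realistically large parameters recovering either private value from them is computationally infeasible.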

We already have two random numbers from the hello messages. The client and server now each derive the key needed for symmetric encryption from these three random numbers. Because the key is computed locally and never transmitted over the network, it is secure.

Of course, the symmetric key is only as secure as the third random number, which is obtained through the DH algorithm. How DH keeps the third random number secret is left for the reader to look up.
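As a rough sketch of that derivation, the code below follows the TLS 1.2 pseudorandom function built on HMAC-SHA256, which turns the pre-master secret and the two hello randoms into a 48-byte master secret. The function names are chosen for this example, and the random inputs are placeholders rather than values from the capture.

```python
import hashlib
import hmac
import os

def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """TLS 1.2 P_hash expansion using HMAC-SHA256."""
    output = b""
    a = seed  # A(0)
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()               # A(i)
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]

def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    return p_sha256(secret, label + seed, length)

def derive_master_secret(pre_master_secret: bytes,
                         client_random: bytes,
                         server_random: bytes) -> bytes:
    """Combine the three random values into the 48-byte master secret."""
    return prf(pre_master_secret, b"master secret",
               client_random + server_random, 48)

# Placeholder inputs standing in for the values exchanged in the handshake.
client_random = os.urandom(32)   # random number 1 (Client Hello)
server_random = os.urandom(32)   # random number 2 (Server Hello)
pre_master = os.urandom(32)      # random number 3 (from the DH exchange)

client_side = derive_master_secret(pre_master, client_random, server_random)
server_side = derive_master_secret(pre_master, client_random, server_random)
print(client_side == server_side, len(client_side))
```

The actual encryption keys are expanded from this master secret in a further step that works the same way.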

Change Cipher Spec

This message is not part of the handshake protocol itself. After computing the key, the client and server each use it to tell the peer that subsequent messages will be encrypted by the TLS record layer.

In other words, the message says: I have computed the symmetric key, and everything I send from now on will be encrypted with it.

Note that the sender does not know whether the other party has computed the key yet. Usually, the client sends this message first.

Finished

This is the first message encrypted by the TLS record layer. Each party must verify the Finished message sent by the other to confirm that the negotiated key works and that the handshake was not tampered with.

The Finished message carries verify_data, which is computed from three inputs:

  1. The master secret
  2. A label: client finished for the client and server finished for the server
  3. handshake_messages, which contains all the handshake protocol messages exchanged so far
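Under the assumption of TLS 1.2 with a SHA-256 based cipher suite, the verify_data computation can be sketched as follows. The helper names are made up for this example, and the byte strings are stand-ins for a real master secret and handshake transcript.

```python
import hashlib
import hmac

def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """TLS 1.2 P_hash expansion using HMAC-SHA256."""
    output = b""
    a = seed
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]

def finished_verify_data(master_secret: bytes, label: bytes,
                         handshake_messages: bytes) -> bytes:
    """12-byte check value over the hash of everything sent in the handshake."""
    transcript_hash = hashlib.sha256(handshake_messages).digest()
    return p_sha256(master_secret, label + transcript_hash, 12)

# Stand-in values for illustration.
master_secret = bytes(48)
transcript = b"ClientHello|ServerHello|Certificate|...|ClientKeyExchange"

client_value = finished_verify_data(master_secret, b"client finished", transcript)
server_value = finished_verify_data(master_secret, b"server finished", transcript)

# If anyone altered a handshake message in transit, the values no longer match.
tampered = finished_verify_data(master_secret, b"client finished",
                                transcript.replace(b"ServerHello", b"ServerHell0"))
print(client_value != tampered)
```

Each side recomputes the value it expects from the peer and aborts the connection if the received Finished message differs.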

Confidentiality

In fact, the cipher suite negotiated by the client and the server contains multiple algorithms at the same time.

Take the cipher suite negotiated in this article's capture, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256:

  • Protocol: TLS
  • Key negotiation algorithm: ECDHE
  • Authentication algorithm: RSA
  • Encryption algorithm: AES-128-GCM
  • Hash algorithm: SHA256

How does HTTPS ensure confidentiality? As we saw, the ECDHE exchange produces a random number (the pre-master secret) known only to the client and server. The symmetric key is then derived from random number 1, random number 2, and this random number 3. Because the key is computed locally and never transmitted over the network, it is safe.

Each communication can then be encrypted and decrypted using the symmetric key.

But how can one side be sure that a DH public key really came from the other side? This is the same problem as with CA certificates: how do we know that a CA certificate is valid? In a CA certificate, the CA uses its private key to digitally sign the applicant's CSR information, the CA's information, and the public key provided by the applicant. If the browser can verify that signature with the CA's public key, the certificate is valid.

The same applies to the DH exchange. When sending the DH information, the server digitally signs the DH parameters and the DH public key with its private key. If the client can verify the signature with the public key from the server's certificate, the information is genuine. The client's DH public key is normally not signed unless client certificates are used; tampering with it is caught because the two sides would derive different keys and the Finished check would fail.

Here is a reference article: How does the DH algorithm Defend against man-in-the-middle attacks

Authentication

How do browsers validate CA certificates?

First, we need to know how a CA certificate is generated. The service provider S applying for the certificate submits a CSR (Certificate Signing Request) to a third-party CA. The CA checks the information in the request, for example whether the domain really belongs to the service provider and whether the address is correct. Only when the CA is satisfied that the information is accurate does it assemble a message from the information you provided, the CA's own information, and your public key. It then computes a digest of that message and digitally signs the digest with the CA private key. The message and the digital signature together form the digital certificate, which is issued to the service provider.

The browser verifies a certificate as follows. It first looks on the local computer for the root certificate of the CA that issued the certificate; root certificates are preinstalled in the operating system. If one is found, the browser checks the contents of the certificate, such as whether the domain name matches and whether the certificate has expired. It then uses the CA's public key to decrypt the digital signature. If the signature checks out, the certificate is valid.

Integrity

How can the client ensure that the CA certificate is complete and has not been tampered with? For example, a third party intercepts a certificate and modifies its content.

We know that a digest of the certificate's information is computed when the certificate is generated. When the client receives the CA certificate, it computes the digest of the certificate's information itself and compares it with the value obtained by decrypting the digital signature. If the two match, the content of the certificate has not been tampered with.
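The sign-then-compare check can be sketched with a toy RSA key and SHA-256. The key below is built from two small Mersenne primes purely so the example runs with the standard library; the function names and the certificate text are invented, and real certificates use padded signatures and far larger keys.

```python
import hashlib

# Toy CA key pair (far too small for real use).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                           # CA public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # CA private exponent

def digest(data: bytes) -> int:
    """Information digest of the certificate body, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def ca_sign(cert_body: bytes) -> int:
    """CA side: encrypt the digest with the CA private key."""
    return pow(digest(cert_body), D, N)

def browser_verify(cert_body: bytes, signature: int) -> bool:
    """Browser side: decrypt with the CA public key and compare digests."""
    return pow(signature, E, N) == digest(cert_body)

cert_body = b"subject=www.example.com; issuer=Example CA; public_key=..."
signature = ca_sign(cert_body)

print(browser_verify(cert_body, signature))                      # untouched certificate
tampered = cert_body.replace(b"www.example.com", b"evil.example")
print(browser_verify(tampered, signature))                       # modified certificate
```

A middleman can change the certificate body, but without `D` he cannot produce a signature that matches the new digest.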

Questions

Is it possible for the middleman to tamper with the certificate?

Suppose a middleman obtains the CA certificate and modifies its domain name. He does not have the CA private key, so he cannot produce a new signature. When the client receives the tampered certificate, it decrypts the signature with the CA's public key to recover the digest of the original certificate, computes the digest of the certificate content it actually received, and finds that the two do not match. It then terminates the connection, so no information is leaked.

Is it possible for the middleman to replace the certificate?

Suppose the middleman also holds a valid certificate issued by the CA. He intercepts the certificate sent by the server and sends his own certificate to the client instead. The client would then communicate using the public key in the middleman's certificate.

In practice, this does not happen, because the domain name is checked during authentication: if the domain being accessed does not match the domain in the certificate, the browser warns that the connection is not secure. It would be a different story if the middleman could obtain a certificate for your domain name, but the CA validates every certificate request, so that cannot be done arbitrarily.

Why is the digital signature computed over a hash value?

This is mainly a performance matter: the hash is much shorter than the certificate's full information, so encrypting and decrypting it is faster. The time hardly matters when the certificate is generated, but if verification took too long in the browser, users would not be willing to wait.

How do we know that a CA's public key can be trusted?

The public keys of CAs are built into the operating system, and these built-in keys have to be trusted. If we did not trust them, we could not verify any CA certificate at all.

A certificate is not necessarily issued by the CA itself; it can also be issued by an agent authorized by the CA, which the CA trusts. So we typically get a three-level certificate: level 1 is the CA's root certificate, level 2 is the certificate of the authorized agent, and level 3 is our own certificate. Together, these three certificates form a certificate chain.

The certificate we see on Juejin belongs to such a three-level chain.
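Walking such a chain from our own certificate up to a trusted root can be sketched like this. Only the issuer names are followed here; a real verifier also checks each signature, the validity dates, and more. The class, the function, and all certificate names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass(frozen=True)
class Cert:
    subject: str   # who the certificate belongs to
    issuer: str    # who signed it

def build_chain(leaf: Cert, presented: List[Cert],
                trusted_roots: Dict[str, Cert]) -> List[Cert]:
    """Follow issuer names from the leaf until a preinstalled root is reached."""
    by_subject = {cert.subject: cert for cert in presented}
    chain = [leaf]
    current = leaf
    while current.issuer not in trusted_roots:
        parent = by_subject.get(current.issuer)
        if parent is None or parent in chain:
            raise ValueError("no path to a trusted root certificate")
        chain.append(parent)
        current = parent
    chain.append(trusted_roots[current.issuer])
    return chain

root = Cert("Example Root CA", "Example Root CA")               # level 1
agent = Cert("Example Intermediate CA", "Example Root CA")      # level 2
site = Cert("www.example.com", "Example Intermediate CA")       # level 3

chain = build_chain(site, [agent], {root.subject: root})
print([cert.subject for cert in chain])
```

The server sends levels 3 and 2 during the handshake, while level 1 comes from the operating system's trust store.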

Does every HTTPS request require a fresh TLS handshake and key exchange?

If every HTTPS request required a full TLS handshake, the complexity of the handshake would add significant latency, which is unacceptable for websites that care about user experience. Is there a way to avoid this?

Yes, through the session identifier: the Session ID generated during the TLS handshake. The server saves the negotiated session under its Session ID, and the browser saves the Session ID too and includes it in later Client Hello messages. If the server finds a matching session, the two sides can complete an abbreviated handshake.
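The server-side bookkeeping can be sketched as a simple lookup table. This is a minimal model under stated assumptions: the class and function names are hypothetical, the "full handshake" is replaced by generating a random master secret, and real servers also expire old sessions.

```python
import secrets
from typing import Dict, Optional, Tuple

class SessionCache:
    """Maps Session IDs to the master secret negotiated for that session."""

    def __init__(self) -> None:
        self._sessions: Dict[bytes, bytes] = {}

    def store(self, master_secret: bytes) -> bytes:
        session_id = secrets.token_bytes(32)
        self._sessions[session_id] = master_secret
        return session_id

    def lookup(self, session_id: bytes) -> Optional[bytes]:
        return self._sessions.get(session_id)

def handle_client_hello(cache: SessionCache,
                        offered_session_id: Optional[bytes]) -> Tuple[str, bytes]:
    """Return the kind of handshake performed and the Session ID in use."""
    if offered_session_id is not None and cache.lookup(offered_session_id) is not None:
        # Known session: reuse the stored secret and skip the key exchange.
        return "abbreviated", offered_session_id
    # Unknown or missing ID: stand-in for a full handshake.
    master_secret = secrets.token_bytes(48)
    return "full", cache.store(master_secret)

cache = SessionCache()
first_kind, session_id = handle_client_hello(cache, None)
second_kind, _ = handle_client_hello(cache, session_id)
print(first_kind, second_kind)
```

The first connection pays for the full handshake, and later connections that present the saved Session ID take the short path.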