Demystify HTTPS

Before we talk about HTTPS, it’s worth saying a little about HTTP. It doesn’t need much introduction, since you use it every day: every HTTP request travels over a TCP connection. Unfortunately, the content of the request is carried in plaintext inside the TCP packets, so anyone who intercepts the traffic can read it, which is awkward to say the least.
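To make "plaintext" concrete, here is a minimal sketch that builds the bytes of an HTTP request by hand. The host, path, and credentials are made up for illustration; the point is only that nothing in these bytes is hidden from someone watching the wire.

```python
def build_http_request(host: str, path: str, form_body: str) -> bytes:
    """Return the bytes a client would write to the TCP socket."""
    body = form_body.encode("utf-8")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


# Hypothetical host, path and credentials, for illustration only.
raw = build_http_request("example.com", "/login", "user=alice&password=hunter2")

# An eavesdropper who captures these bytes needs no key to read them.
print(raw.decode("utf-8"))
```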

Data encryption

To keep the request content from being stolen, there is little we can do about the network path itself, so we have to work on the packets being transmitted. Encrypting the message content is one way to do that.

One family of encryption algorithms is symmetric encryption, in which encryption and decryption use the same secret key. If the request data is encrypted this way, a middleman cannot read the content because he does not have the key. However, before they can use it, both parties must agree on the secret key, and the key itself is at risk of being stolen while it is sent across the network.
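The "same key on both sides" idea can be sketched with a toy stream cipher. This is an illustration only, not a real cipher: it stretches the key into a keystream with SHA-256 and XORs it with the data, with no nonce and no integrity protection. Real systems use vetted algorithms such as AES. The function names are made up for this sketch.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Stretch the key into `length` pseudo-random bytes (SHA-256 over key + counter)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


shared_key = secrets.token_bytes(32)   # both parties must somehow hold this
message = b"transfer 100 to Bob"

ciphertext = xor_cipher(shared_key, message)      # sender encrypts
recovered = xor_cipher(shared_key, ciphertext)    # receiver decrypts, same key

# A middleman without the key only gets garbage.
wrong_guess = xor_cipher(secrets.token_bytes(32), ciphertext)
```

Note that the sketch simply assumes both sides already hold shared_key; getting it to the other side safely is exactly the open problem.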

This is where a different kind of algorithm comes in: asymmetric encryption. It uses two keys. The private key is known only to its owner; the public key can be published on the Internet for anyone to see, so it does not matter if others see it in transit. When user A sends a message to user B, user A encrypts the data with user B’s public key. After receiving the message, user B decrypts it with its own private key. Even if the message is stolen, it cannot be decrypted without the corresponding private key.
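As a rough illustration of the public/private split, here is textbook RSA with deliberately tiny primes. The numbers are the usual classroom example and the function names are made up for this sketch. Real RSA uses keys of 2048 bits or more plus a padding scheme, so treat this only as a picture of which key does what.

```python
# Toy key pair for user B, built from tiny primes (illustration only).
p, q = 61, 53
n = p * q                   # 3233, the modulus shared by both keys
phi = (p - 1) * (q - 1)     # 3120
e = 17                      # public exponent, coprime with phi
d = pow(e, -1, phi)         # private exponent (modular inverse, Python 3.8+)

public_key_b = (e, n)       # may be published to anyone
private_key_b = (d, n)      # known only to user B


def encrypt(message: int, public_key: tuple) -> int:
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, private_key: tuple) -> int:
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)


message = 65                                    # must be smaller than n
ciphertext = encrypt(message, public_key_b)     # user A uses B's PUBLIC key
recovered = decrypt(ciphertext, private_key_b)  # only B's PRIVATE key undoes it
print(ciphertext, recovered)                    # 2790 65
```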

However, asymmetric encryption algorithms are much slower than symmetric ones. A compromise is to use asymmetric encryption only to deliver the symmetric secret key safely, and then use symmetric encryption for the data messages themselves.
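The compromise can be sketched by combining two toy pieces: textbook RSA with tiny numbers to wrap the session key, and a SHA-256-based XOR stream for the data. Both are stand-ins chosen for readability, and the constant and function names are assumptions of this sketch. A 2-byte session key is of course far too small for real use.

```python
import hashlib
import secrets

# Receiver's toy RSA key pair (tiny classroom numbers, illustration only).
N, E, D = 3233, 17, 2753


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256-derived keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Sender: pick a random session key, wrap it with the receiver's public key.
session_key = secrets.randbelow(N - 2) + 2        # integer in [2, N - 1]
wrapped_key = pow(session_key, E, N)              # slow asymmetric step, done once
ciphertext = xor_cipher(session_key.to_bytes(2, "big"), b"GET /account HTTP/1.1")

# Receiver: unwrap the session key with the private key, then decrypt the data.
unwrapped_key = pow(wrapped_key, D, N)
plaintext = xor_cipher(unwrapped_key.to_bytes(2, "big"), ciphertext)
```

Only the short session key goes through the slow asymmetric step; every later message can reuse the fast symmetric cipher.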

Middleman hijacking

You think you can rest easy now? Do you think you can communicate safely? Naive.

Suppose you are communicating with A. Here is a question to search your soul with: how can you be sure that the party you are communicating with really is A?

Assume communication starts normally, but the request is hijacked by middleman C while A’s public key is being transferred. C replaces A’s public key with its own and sends that to you. When you send a request, C decrypts it with its own private key, reads it, re-encrypts it with A’s public key, and forwards it to A. It looks as if you and A are talking to each other, but in fact all the information passes through C and is visible to C.
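The hijack can be traced step by step with two toy RSA key pairs, one for A and one for C. The tiny numbers and the helper name apply_key are assumptions of this sketch, chosen so the arithmetic is easy to follow.

```python
# Toy RSA key pairs as (exponent, modulus); tiny numbers for illustration only.
A_PUBLIC, A_PRIVATE = (17, 3233), (2753, 3233)   # the real server A
C_PUBLIC, C_PRIVATE = (17, 2773), (157, 2773)    # the middleman C


def apply_key(value: int, key: tuple) -> int:
    """Textbook RSA: the same modular exponentiation encrypts and decrypts."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


# 1. A sends its public key, but C intercepts it and hands you C's key instead.
key_you_received = C_PUBLIC

# 2. You encrypt your secret with the key you believe belongs to A.
secret = 65
sent_by_you = apply_key(secret, key_you_received)

# 3. C decrypts with its own private key, reads the secret,
#    then re-encrypts with A's real public key and forwards it.
read_by_c = apply_key(sent_by_you, C_PRIVATE)
forwarded_to_a = apply_key(read_by_c, A_PUBLIC)

# 4. A decrypts normally and notices nothing unusual.
received_by_a = apply_key(forwarded_to_a, A_PRIVATE)
```

Both you and A see a working encrypted channel, yet read_by_c holds the secret in the clear.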

So how do we make sure that the public key we received is really A’s? This brings us back to the problem of delivering a key safely across the network. This time, though, encryption alone cannot save us: we need to make sure that the public key we receive really comes from A, which means that the message has not been tampered with.

The digital certificate

In fact, the crux of why we cannot trust the message content is that we cannot tell whether the public key we received has been modified. If a trusted intermediary S delivered the public key, we would be fine. The problem is that S’s own public key could also be modified in transit. Should we find yet another party to deliver S’s public key? That would go on forever.

What is the root of the problem? We do not have a single public key we can trust. So the solution is blunt: we keep trusted public keys locally. They are not obtained over the Internet but come pre-installed, in the form of the top-level (root) CA certificates that ship with the operating system or browser.

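These pre-installed trust anchors are CA certificates. When we want A’s public key, what we receive is a digital certificate issued by a CA: A’s public key and identifying information, together with a digital signature that the CA computed over that content.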

After receiving the certificate, we compute a digest of its content with the same hash algorithm the CA used, and decrypt the digital signature with the CA’s public key. If the decrypted digest matches the one we computed, we can conclude that the content has not been modified.

Doesn’t sending everything in the open make it easy for someone to tamper with it? No need to worry. We already hold the CA’s public key, and the middleman does not have the CA’s private key, so even if he intercepts the certificate he cannot produce a valid digital signature for modified content.
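The digest-and-signature check can be sketched with a toy CA key. The key here is built from two well-known Mersenne primes purely so the numbers fit on a line, and the certificate body is a made-up byte string rather than a real X.509 structure. Real signatures also add a padding scheme, which this sketch leaves out.

```python
import hashlib

# Toy CA key pair (assumption for illustration; real CA keys are far larger
# and are not built from special-form primes like these).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                              # CA public exponent, pre-installed with N
D = pow(E, -1, (P - 1) * (Q - 1))      # CA private exponent, never leaves the CA


def digest(content: bytes) -> int:
    """SHA-256 digest of the content, reduced so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % N


def sign(content: bytes) -> int:
    """CA side: 'encrypt' the digest with the CA private key."""
    return pow(digest(content), D, N)


def verify(content: bytes, signature: int) -> bool:
    """Client side: 'decrypt' the signature with the CA public key and compare."""
    return pow(signature, E, N) == digest(content)


# Hypothetical certificate body: subject name plus the subject's public key.
cert_body = b"subject=A;public_key=(17,3233)"
signature = sign(cert_body)

# A middleman swaps in his own key but cannot re-sign without D.
forged_body = b"subject=A;public_key=(17,2773)"

print(verify(cert_body, signature))    # True
print(verify(forged_body, signature))  # False
```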

With that, the problem of transmitting information safely is settled for now. (Who knows when a new security problem will turn up; after all, every time the defenses rise by a foot, the attackers rise by ten.)

HTTPS

That, in a nutshell, is HTTPS. An HTTPS request proceeds as follows:

The browser sends an HTTPS request.
The server responds with its own digital certificate.
The browser verifies the certificate against its pre-installed CA certificates. If it checks out, the browser extracts the server’s public key.
The browser generates a symmetric encryption key, encrypts it with the server’s public key, and sends it to the server.
The server decrypts it with its own private key to obtain the symmetric encryption key.
Both sides can now communicate using the shared symmetric encryption key.
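In practice a TLS library performs these steps for you. The sketch below uses Python’s standard ssl module. The helper names are made up, and the handshake the library actually runs differs in detail from the simplified steps above (TLS 1.3, for example, agrees on the key with a Diffie-Hellman exchange instead of encrypting it with the server’s public key).

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Context that trusts the system's pre-installed CA certificates.

    create_default_context() enables certificate verification and
    hostname checking by default.
    """
    return ssl.create_default_context()


def fetch_status_line(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Open a TLS connection, send a HEAD request, return the status line."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as tcp:
        # wrap_socket runs the handshake: the server presents its certificate,
        # the library verifies it against the trusted CAs, and session keys
        # are agreed. A bad certificate raises ssl.SSLCertVerificationError.
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls.sendall(request.encode("ascii"))   # encrypted on the wire
            response = tls.recv(4096)
    return response.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
```

On a machine with network access, calling fetch_status_line("example.com") should return the server’s HTTP status line; the host name is only an example.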
