This article discusses how HTTPS encrypts and decrypts data. Many people know RSA and assume that HTTPS = RSA, that is, that RSA encrypts and decrypts the data itself, but this is not correct. HTTPS uses RSA to authenticate the server and exchange keys, and then uses the exchanged key to encrypt and decrypt data. Authentication relies on asymmetric encryption with RSA, while data transfer relies on symmetric encryption, in which both parties use the same key. So, what are symmetric and asymmetric encryption?

1. Symmetric encryption and asymmetric encryption

Suppose Xiao Wang next door wants to ask Xiao Hong out, but he doesn’t want Xiao Ming to know, so he uses symmetric encryption to pass a note to Xiao Hong, as shown in the picture below:

The data he wants to send is “Meet at 5:00 PM” (UTF-8 encoded if it were Chinese). He encrypts it by shifting each character forward or backward in the ASCII table. His key is 3, which means every character moves 3 positions forward in the ASCII table, so the message becomes “Phhw#dw#8=33#SP”. Anyone who intercepts the note will not know what it means. However, if someone can capture the data, he can also intercept the key and decrypt the data, as shown in the following figure:
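The shift cipher described above can be sketched in a few lines of Python. The function name `shift_text` is made up for this illustration; the message and the key 3 come from the text.

```python
def shift_text(text: str, key: int) -> str:
    """Move every character `key` positions forward in the ASCII table."""
    return "".join(chr(ord(ch) + key) for ch in text)


# Encrypt with key 3, then decrypt by shifting back with the same key.
ciphertext = shift_text("Meet at 5:00 PM", 3)
plaintext = shift_text(ciphertext, -3)

print(ciphertext)
print(plaintext)
```

Because the same key both encrypts and decrypts, anyone who learns the key can read the note, which is exactly the weakness of symmetric encryption mentioned above.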

Therefore, Xiao Wang intends to use asymmetric encryption. The characteristic of asymmetric encryption is that each party has its own pair of a public key and a private key. The public key is sent to the other party, while the private key is never exchanged; each party keeps it secret, as shown below:

The public key of Xiao Hong is:

public_key = (N, e) = (3233, 17)

She sends the public key to Xiao Wang. Her own private key is:

private_key = (N, d) = (3233, 2753)

Note that both the public key and the private key consist of two numbers. N is usually a large integer, while e and d are exponents. Now Xiao Wang wants to send a message to Xiao Hong, so he encrypts it with Xiao Hong’s public key. How? The first letter he wants to send is t = “M”, whose ASCII code is 77. 77 is encrypted as follows:

T = 77 ^ e % N = 77 ^ 17 % 3233 = 3123

That is, raise 77 to the power e and take the result modulo N to get T = 3123. He then sends this number to Xiao Hong (the other letters are handled the same way). After receiving T, Xiao Hong decrypts it with her private key. The calculation is as follows:

t = T ^ d % N = 3123 ^ 2753 % 3233 = 77

The calculation has the same form and restores T to t. As long as the public and private keys are a matching pair, this always works, which can be proven mathematically. This is how RSA encryption and decryption work. Without the private key, the message cannot be decrypted correctly. Conversely, it is also possible to encrypt with the private key and decrypt with the public key.
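The arithmetic above can be checked with Python's built-in three-argument `pow`, which does modular exponentiation. This is only a sketch with the toy numbers from the text. The private exponent 2753 is called `d` here, and `rsa_apply` is a name made up for the illustration. Real RSA uses much larger keys and padding.

```python
# Toy key pair from the text: N = 3233 (= 61 * 53), public exponent 17,
# private exponent 2753.
N, e, d = 3233, 17, 2753


def rsa_apply(value: int, exponent: int, modulus: int) -> int:
    """Raise value to exponent modulo modulus (the core RSA operation)."""
    return pow(value, exponent, modulus)


t = ord("M")                      # ASCII code of "M"
T = rsa_apply(t, e, N)            # encrypt with the public key
recovered = rsa_apply(T, d, N)    # decrypt with the private key

# The reverse direction also works: apply the private key first,
# then undo it with the public key. This is the basis of signatures.
signed = rsa_apply(t, d, N)
checked = rsa_apply(signed, e, N)

print(T, recovered, checked)
```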

So how does HTTPS use RSA to encrypt and decrypt? Let’s start with how an HTTPS connection is established.

2. Establish the HTTPS connection

HTTPS provides the following functions:

  • 1. Verify the identity of the service provider. For example, when I visit Google.com, I am really connected to Google’s servers
  • 2. Prevent data hijacking, such as carriers inserting ads into HTTP pages
  • 3. Prevent sensitive data from being stolen or tampered with

As OpenSSL notes, this is the only way to prevent man-in-the-middle attacks:

We take the MDN website (https://developer.mozilla.org) as an example and capture packets with Wireshark to observe how the HTTPS connection is established, as shown in the figure below:

First, the client (browser) initiates a request to establish an HTTPS connection. The client sends a Client Hello packet, and the server responds with a Server Hello packet and then sends its certificate to the client. Then the two parties exchange keys. Finally, the exchanged key is used to encrypt and decrypt the data.

In Client Hello, the client tells the server about itself, as shown in the following figure:

This includes the TLS version the client wants to use, the cipher suites it supports, the domain name it wants to access, and a random number (nonce) generated for the server. The client has to tell the server in advance which domain it wants to access so that the server can send the certificate for that domain, since the HTTP request has not been sent yet.

In Server Hello, the server replies with its choices:

The cipher suite selected by the server is TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, which means:

(1) Key exchange uses ECDHE

(2) Authentication (signing with the certificate’s key) uses RSA

(3) Data encryption uses AES 128 GCM

(4) Hashing uses SHA256
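To make the naming scheme concrete, here is a small sketch that splits a suite name into the four parts listed above. The helper `parse_cipher_suite` is hypothetical and assumes the name has exactly the shape TLS_<key exchange>_<authentication>_WITH_<cipher>_<hash>; other suite names (for example TLS 1.3 ones) are laid out differently and would not parse.

```python
def parse_cipher_suite(name: str) -> dict:
    """Split a suite name shaped like TLS_<kx>_<auth>_WITH_<cipher>_<hash>."""
    prefix, _, rest = name.partition("_WITH_")
    # prefix looks like "TLS_ECDHE_RSA"
    _, key_exchange, authentication = prefix.split("_")
    # rest looks like "AES_128_GCM_SHA256"; the last field is the hash
    *cipher_parts, hash_name = rest.split("_")
    return {
        "key_exchange": key_exchange,
        "authentication": authentication,
        "cipher": "_".join(cipher_parts),
        "hash": hash_name,
    }


suite = parse_cipher_suite("TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256")
for part, value in suite.items():
    print(f"{part}: {value}")
```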

The server then sends the client four certificates:

The common name of the first certificate is the domain we are visiting, developer.mozilla.org. If the common name were *.mozilla.org, the certificate could be used by every subdomain one level below mozilla.org. The second certificate belongs to the certificate authority (CA) that issued the first one, which is Amazon, meaning that Amazon signed the developer.mozilla.org certificate with its private key. Likewise, the third certificate signs the second and the fourth signs the third, and we can see that the fourth certificate is a root certificate.

To see what is inside a certificate, we can expand the first certificate, as shown below:

A certificate contains three parts: tbsCertificate (the “to be signed” certificate, that is, the content to be signed), the certificate signature algorithm, and the signature produced by the CA. In other words, the CA signs the tbsCertificate with its private key and places the result in the signature section. Why should a certificate be signed? Signatures are used to verify identity.

3. Authentication

Let’s take a look at the contents of tbsCertificate, as shown below:

It includes the public key of the certificate, the common name the certificate applies to, the validity period, and the issuer. The Amazon certificate has the same structure, and we can copy out its public key, as shown below:

Some padding bytes in the middle are shown in grey. You can see that N is a large integer (2048 bits here), while e is usually 65537.

We then decrypt the mozilla.org certificate signature using the CA’s public key in a similar way:

Take the last 64 hexadecimal digits of the decrypted number, which are the 256-bit SHA256 hash. To compute the SHA256 hash of tbsCertificate by hand, use Wireshark to export tbsCertificate to a raw binary file:

Then use OpenSSL to calculate its hash value, as shown below:

liyinchengs-MBP:https liyincheng$ openssl dgst -sha256 ~/tbsCertificate.bin

SHA256(/Users/liyincheng/tbsCertificate.bin)= 5e300091593a10b944051512d39114d56909dc9a504e55cfa2e2984a883a827d

The hash we calculated by hand is the same as the hash recovered from the certificate signature! Only someone who knows Amazon’s private key can sign the mozilla.org certificate correctly, because only a matching public and private key pair produces a consistent result. So we have verified that the first certificate, mozilla.org, was indeed issued by the second certificate, Amazon. In the same way we can verify that Amazon’s certificate was issued by the third certificate and the third by the fourth. The fourth certificate is the root certificate, which is built into the operating system (it can be viewed with the Mac Keychain tool):

Suppose a hacker points your domain name at his own machine through some kind of DNS spoofing and then forges a certificate. Since the root certificate is built into the operating system, he cannot change the public key used to check the signature, and he does not have the matching private key, so he can only sign with his own private key. Because the public and private keys are not a pair, the hash recovered from the signature will not match the hash of the certificate. What if he simply copies the certificate the browser received onto his own server? The certificate sent to the browser would be exactly the same, but since he doesn’t know the certificate’s private key, he cannot complete the rest of the handshake, so this is pointless.
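The check performed above (recover the hash from the signature with the CA's public key, then compare it with a hash computed by hand) can be sketched as follows. Everything here is a simplification for illustration: the key is a toy built from two well-known Mersenne primes rather than a real CA key, the signature is plain RSA over the hash without the padding that real certificates use, and the `tbs_certificate` bytes are made up.

```python
import hashlib

# Toy CA key (assumption for illustration only): two known Mersenne primes.
P, Q = 2**127 - 1, 2**521 - 1
N = P * Q
E = 65537                                  # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))          # private exponent (Python 3.8+)


def digest_as_int(tbs: bytes) -> int:
    """SHA256 hash of the data, interpreted as a big integer."""
    return int.from_bytes(hashlib.sha256(tbs).digest(), "big")


def ca_sign(tbs: bytes) -> int:
    """What the CA does: apply its private key to the hash."""
    return pow(digest_as_int(tbs), D, N)


def verify(tbs: bytes, signature: int) -> bool:
    """What the browser does: apply the public key and compare hashes."""
    return pow(signature, E, N) == digest_as_int(tbs)


tbs_certificate = b"subject=developer.mozilla.org; issuer=Amazon"  # made-up bytes
signature = ca_sign(tbs_certificate)

print(verify(tbs_certificate, signature))          # genuine certificate
print(verify(tbs_certificate + b"!", signature))   # modified certificate
```

A signature made with any other private key would fail the same comparison, which is why a forged certificate cannot pass the check.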

This is how HTTPS can authenticate an identity.

Another example is SSH, where you have to verify the fingerprint manually. For example, you call or email the server administrator to check whether the fingerprint is the same as the one you computed yourself.

The value above is the one you computed yourself. Compare it with the value you were given to confirm that what was sent has not been modified.

So why not just use the RSA key pair to encrypt the data? RSA keys are large and RSA operations are too expensive to encrypt and decrypt data frequently, so a smaller symmetric key is needed. Another reason is that the server has no key belonging to the browser or client with which to encrypt the data it sends back (it cannot use its own private key, because the matching public key is public and anyone could decrypt). Therefore, a key exchange is required.

4. Key exchange

There are two ways to exchange keys: RSA and ECDHE. The RSA way is simpler: the browser generates a key, encrypts it with the RSA public key from the certificate, and sends it to the server, which decrypts it with its private key. Both sides then share the same key. An attacker cannot crack the key while it is in transit. The disadvantage is that if the attacker stores all the encrypted traffic and later obtains the private key (for example, because the certificate expired and the key was no longer guarded carefully), he can decrypt all the previously transmitted data. ECDHE is a more secure key exchange algorithm. As shown in the figure below, both parties exchange keys through ECDHE:

ECDHE stands for Elliptic Curve Diffie-Hellman Ephemeral key exchange. It is an improvement on the Diffie-Hellman key exchange algorithm, whose idea is shown in the figure below:

To obtain the shared secret key K, A computes g ^ a from its private key a and sends it to B. B, whose private key is b, computes K = (g ^ a) ^ b and sends g ^ b to A, who computes the same K = (g ^ b) ^ a. All calculations are done modulo a public prime. This should be easy to understand, and introducing elliptic curves makes it harder to crack.
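The exchange can be sketched in Python as below. The modulus `P`, the base `G` and the helper names are assumptions chosen for the illustration; all calculations are done modulo the public prime `P`, and the toy parameters are far too small for real use.

```python
import secrets

P = 2**127 - 1   # public prime modulus (toy size, assumption)
G = 3            # public base (assumption)


def dh_public(base: int, private: int, modulus: int) -> int:
    """Value sent to the other side: base ^ private mod modulus."""
    return pow(base, private, modulus)


def dh_shared(their_public: int, my_private: int, modulus: int) -> int:
    """Shared key: (other side's value) ^ my_private mod modulus."""
    return pow(their_public, my_private, modulus)


# Each side picks a random private number and never sends it.
a = secrets.randbelow(P - 2) + 1
b = secrets.randbelow(P - 2) + 1

# Only g^a and g^b travel over the network.
sent_by_a = dh_public(G, a, P)
sent_by_b = dh_public(G, b, P)

key_a = dh_shared(sent_by_b, a, P)   # (g^b)^a
key_b = dh_shared(sent_by_a, b, P)   # (g^a)^b
print(key_a == key_b)
```

An eavesdropper sees only `sent_by_a` and `sent_by_b`; recovering `a` or `b` from them is the hard problem the scheme relies on.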

5. Elliptic curve encryption

There are two signature algorithms for certificates, RSA and the newer EC, as shown in the following figure. Google.com uses an ECC certificate:

The difficulty of cracking RSA is that it is infeasible to factor the N of the public key into primes. If you could factor the certificate’s N into its two prime factors, you could calculate the certificate’s private key, but this is infeasible with current computing power. The difficulty of cracking ECC lies in finding the multiplier of a given point.

Here is an elliptic curve equation: y ^ 2 = x ^ 3 + ax + b:

Given a starting point G(x, y), we calculate the point P = 2G by drawing the tangent to the curve at G. The tangent intersects the curve at a point -2G, and reflecting that point across the x-axis gives 2G. To calculate 3G, see the following figure:

Draw the line through 2G and G; it intersects the curve at -3G, and reflecting that point gives 3G. Similarly, 4G is obtained by connecting G and 3G and reflecting. If the line between the last point and the starting point is perpendicular to the x-axis, all points have been used up.

The difficulty of EC is this: given the starting point G and a point K with

K = kG

it is hard to find k (when k is large enough). This k is the private key, and K = kG is the public key. How does ECC encrypt and decrypt data?
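The tangent and chord constructions translate into simple formulas once the curve is taken over integers modulo a prime, which is how it is used in practice. The sketch below uses a tiny textbook curve, y^2 = x^3 + 2x + 2 modulo 17 with G = (5, 1); these parameters and the function names are assumptions for illustration, since real curves use primes of about 256 bits.

```python
P_MOD, A = 17, 2        # toy curve y^2 = x^3 + 2x + 2 over integers mod 17
G = (5, 1)              # starting point on the curve


def add(p1, p2):
    """Add two curve points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None     # the line is vertical: no third intersection point
    if p1 == p2:
        # slope of the tangent at the point
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:
        # slope of the chord through the two points
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    slope %= P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD   # already reflected across the x-axis
    return (x3, y3)


def multiply(k, point):
    """Compute k * point by repeated doubling and adding."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


private_k = 7
public_K = multiply(private_k, G)   # easy to compute, hard to invert
print(multiply(2, G), multiply(3, G), public_K)
```

Computing K from k takes only a handful of doublings and additions, while going back from K to k has no comparable shortcut on a large curve.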

Assume that the data to be encrypted is m. Use m as the x coordinate to get a point M on the curve, pick a random number r, and calculate the points C1 = rG and C2 = M + rK. These two points are sent to the other party as the encrypted data. After receiving them, the other party decrypts with the private key k as follows:

M = C2 - rK = C2 - rkG = C2 - k(rG) = C2 - kC1

This calculation recovers M, and without the private key k the data cannot be decrypted. More details can be found in this Medium article, ECC Elliptic Curve Encryption.

Now that we understand the principle of ECC, how is ECC used for key exchange?
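The encryption and decryption steps can be tried out on a toy curve. The sketch below assumes the small curve y^2 = x^3 + 2x + 2 modulo 17 with G = (5, 1), whose multiples of G form a group of 19 points; the curve, the message point and all function names are chosen only for illustration.

```python
import secrets

P_MOD, A, G = 17, 2, (5, 1)   # toy curve y^2 = x^3 + 2x + 2 mod 17 (assumption)
ORDER = 19                    # number of multiples of G, counting infinity


def add(p1, p2):
    """Add two curve points; None is the point at infinity."""
    if p1 is None or p2 is None:
        return p2 if p1 is None else p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (slope * slope - x1 - x2) % P_MOD
    return (x3, (slope * (x1 - x3) - y1) % P_MOD)


def multiply(k, point):
    """Compute k * point by repeated doubling and adding."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point, k = add(point, point), k >> 1
    return result


def negate(point):
    """Reflect a point across the x-axis."""
    return None if point is None else (point[0], -point[1] % P_MOD)


def encrypt(public_key, message_point):
    r = secrets.randbelow(ORDER - 1) + 1          # random r in 1..18
    return multiply(r, G), add(message_point, multiply(r, public_key))


def decrypt(private_key, c1, c2):
    return add(c2, negate(multiply(private_key, c1)))   # M = C2 - k*C1


k = 7                       # private key
K = multiply(k, G)          # public key
M = (10, 6)                 # a curve point standing in for the encoded message
C1, C2 = encrypt(K, M)
print(decrypt(k, C1, C2) == M)
```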

6. ECC key exchange

The principle is simple, as shown below:

Instead of exchanging two powers of a number, the two parties exchange two points on a curve.

For example, the curve equation used by X25519 (Curve25519) is y^2 = x^3 + 486662x^2 + x. The curve to use is specified during the key exchange, as shown in the figure below:

The curve used by mozilla.org is secp256r1, another popular one, whose parameters are much larger than those of Curve25519. The private key of the certificate is also used to sign the key exchange so that the exchanged key cannot be tampered with; the private key here is Mozilla’s own private key.

Everything from the start of the connection up to this point is sent in clear text. Next, both parties send a Change Cipher Spec packet to announce that the following packets will be encrypted in the agreed way. At this point, the secure connection is fully established.
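A minimal sketch of exchanging points instead of powers, again on an assumed toy curve (y^2 = x^3 + 2x + 2 modulo 17, G = (5, 1), 19 multiples of G) with made-up function names; real exchanges use curves such as the ones named below.

```python
import secrets

P_MOD, A, G = 17, 2, (5, 1)   # toy curve y^2 = x^3 + 2x + 2 mod 17 (assumption)
ORDER = 19                    # number of multiples of G, counting infinity


def add(p1, p2):
    """Add two curve points; None is the point at infinity."""
    if p1 is None or p2 is None:
        return p2 if p1 is None else p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (slope * slope - x1 - x2) % P_MOD
    return (x3, (slope * (x1 - x3) - y1) % P_MOD)


def multiply(k, point):
    """Compute k * point by repeated doubling and adding."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point, k = add(point, point), k >> 1
    return result


# Each side keeps a random number private and sends only a point.
a = secrets.randbelow(ORDER - 1) + 1
b = secrets.randbelow(ORDER - 1) + 1
point_from_a = multiply(a, G)          # sent to B
point_from_b = multiply(b, G)          # sent to A

shared_a = multiply(a, point_from_b)   # a * (bG)
shared_b = multiply(b, point_from_a)   # b * (aG)
print(shared_a == shared_b)
```

Both sides arrive at the point abG, and the symmetric key is then derived from it.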

7. Application of HTTPS certificates

So who does the HTTPS encryption? On the server side it is usually handled by a reverse proxy such as Nginx or Apache, so the business servers behind it do not need to deal with it. On the client side it is usually the browser that encrypts and decrypts. Chrome uses the BoringSSL library, a fork of OpenSSL.

You can use Let’s Encrypt to apply for a TLS certificate for free; it needs to be renewed every 3 months. There are three types of certificates: DV, OV, and EV. DV is suitable for individuals, OV and EV require identity verification, and EV is the most advanced. An EV certificate makes the browser display the enterprise name in the address bar:

However, newer versions of Chrome seem to have removed this, so when we open the console on Medium, we see this message:

As part of an experiment, Chrome temporarily shows only the lock icon in the address bar. Your SSL certificate with Extended Validation is still valid.

Alternatively, we can use OpenSSL to generate a self-signed certificate by executing the following command:

openssl req -x509 -nodes -sha256 -days 365 -newkey rsa:2048 -keyout test.com.key -out test.com.crt

test.com.crt is the certificate and test.com.key is its private key, as shown below:

Then point nginx at both files to enable HTTPS, as shown in the following code:

    server {
        # "listen ... ssl" enables TLS on this port; the old "ssl on;" directive is deprecated
        listen       443 ssl;
        server_name  test.com;
        ssl_certificate        test.com.crt;
        ssl_certificate_key    test.com.key;
    }

You can add this certificate to your system certificate so that browsers and others can trust it, or you can use the mkcert tool in one step.

8. Client certificate

Another type of certificate is the client certificate, which also has to be applied for from a CA. It differs from a TLS server certificate in that a server certificate is usually bound to a domain name, whereas a client certificate can sign any local executable file. The signature verification algorithm is the same as for the TLS certificates discussed above. Why do executables need to be signed? Because if they are not, the system will block them from installing or running, as when you double-click an unsigned DMG package on a Mac:

Windows has a similar warning:

Running an unsigned exe file on Windows triggers a warning:

When we run a signed exe file, a normal prompt appears, like the one for Chrome:

To sum up, this article discussed the principles of symmetric and asymmetric encryption, how RSA is used to verify the identity of the server by checking certificate signatures, how ECC is used to encrypt data and exchange keys, how to generate and use an HTTPS certificate, and what client certificates are. I hope that after reading this article you have a more complete understanding of HTTPS encryption and decryption. I wrote an introduction to HTTPS before, and this article adds more details.