You are probably familiar with the HTTPS protocol, but you may still have the following questions:

  1. How does the HTTPS protocol work?
  2. How does HTTPS address the insecure features of HTTP?
  3. Why does capturing HTTPS traffic require you to trust a certificate?

The HTTP protocol

HTTP is an application-layer protocol that usually runs on top of TCP. It is a plaintext protocol in which the client initiates a request and the server responds with a response.

Since the network is not trusted, the plaintext feature of the HTTP protocol has the following risks:

  • Communications data are at risk of eavesdropping and tampering
  • Target sites are at risk of impersonation
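To make the eavesdropping risk concrete, here is a small sketch of what an HTTP login request looks like on the wire. The host name, path, and credentials are made up for illustration.

```python
# A hypothetical login request, built the way an HTTP client would send it.
body = "user=alice&password=hunter2"
request = (
    "POST /login HTTP/1.1\r\n"
    "Host: bank.example\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
    f"{body}"
).encode("ascii")

# Anyone who can observe the TCP stream sees exactly these bytes,
# so the password can be read (or rewritten) without any decryption.
password_visible = b"password=hunter2" in request
print(request.decode("ascii"))
print("password readable on the wire:", password_visible)
```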

The average site might not matter, but what about a bank site?

Fortunately, in the HTTP era, domestic banks developed ActiveX plug-ins for Internet Explorer to secure their sites, which is worthy of praise.

The solution

Since HTTP is a plaintext protocol, would encrypting the data be enough to guarantee security?

Before we answer that question, let’s take a look at the two common types of encryption algorithm.

The encryption algorithm

The two common types are symmetric encryption algorithms and asymmetric encryption algorithms.

Symmetric encryption

Encryption and decryption use the same key. Symmetric encryption is faster than asymmetric encryption, but once the key is compromised, the communication is no longer secure.
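As a rough illustration of "same key in both directions", here is a toy symmetric cipher. It is a sketch only: the `xor_cipher` helper is invented for this example and XORs the data with a keystream derived from SHA-256. Real systems use a vetted algorithm such as AES, not this.

```python
import hashlib
import secrets


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256-derived keystream (not secure)."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    # zip stops at the shorter input, so extra keystream bytes are ignored.
    return bytes(a ^ b for a, b in zip(data, stream))


key = secrets.token_bytes(16)                       # the shared secret
ciphertext = xor_cipher(key, b"transfer 100 to Bob")
plaintext = xor_cipher(key, ciphertext)             # same key, same function

# Whoever learns `key` can run the same call, which is the weakness noted above.
print(plaintext)
```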

Asymmetric encryption

Asymmetric encryption uses a key pair: data encrypted with the public key can only be decrypted with the private key, and data encrypted with the private key can only be decrypted with the public key. The private key cannot feasibly be derived from the public key.

In general, asymmetric encryption is used to transmit the key for the session, and symmetric encryption is used for the communication itself. This combination addresses both the key-distribution problem of symmetric encryption and the performance problem of asymmetric encryption.
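The key-pair idea can be sketched with textbook RSA. This is a toy under stated assumptions: the two primes are small, well-known Mersenne primes chosen only so the example runs instantly, and there is no padding, so it must not be used for real data.

```python
# Toy textbook RSA (no padding, tiny key): illustration only.
P, Q = 2**61 - 1, 2**89 - 1          # two known Mersenne primes
N = P * Q                            # modulus, part of both keys
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (Python 3.8+)

message = 42

# Public key encrypts, private key decrypts.
ciphertext = pow(message, E, N)
recovered = pow(ciphertext, D, N)

# Private key "encrypts", public key decrypts (the basis of signatures).
signed = pow(message, D, N)
checked = pow(signed, E, N)

print(recovered, checked)
```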

HTTP encrypted communication process

  1. The browser generates a random string A to use as the communication key
  2. The browser encrypts random string A with the server’s public key and sends the ciphertext B to the server. This step is secure because a hacker cannot decrypt B without the server’s private key
  3. The server decrypts B with its private key to recover random string A, the communication key
  4. The server and the browser communicate using random string A and a symmetric encryption algorithm
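The four steps above can be sketched end to end. This is a toy model, not real TLS: `make_keypair` and `xor_cipher` are helpers invented for the example, the RSA key is tiny and unpadded, and the symmetric cipher is a simple XOR keystream.

```python
import hashlib
import secrets


def make_keypair(p, q, e=65537):
    """Toy RSA key pair from two primes: returns (public, private)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, Python 3.8+
    return (e, n), (d, n)


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256-derived keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


server_public, server_private = make_keypair(2**61 - 1, 2**89 - 1)

# Step 1: the browser generates random string A (16 bytes fits below n).
key_a = secrets.token_bytes(16)

# Step 2: the browser encrypts A with the server's public key, giving B.
e, n = server_public
ciphertext_b = pow(int.from_bytes(key_a, "big"), e, n)

# Step 3: the server decrypts B with its private key to recover A.
d, n = server_private
server_key = pow(ciphertext_b, d, n).to_bytes(16, "big")

# Step 4: both sides now use A with the symmetric cipher.
request = xor_cipher(key_a, b"GET /balance")
received = xor_cipher(server_key, request)
print(received)
```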

At first glance, this seems fine, since a hacker can’t break asymmetric encryption, but how does the browser get the public key?

There are two ways:

  1. The public key is built into the browser (unrealistic: with so many domain names, the browser cannot ship a public key for every site)
  2. The server sends the public key to the browser (because it is sent in plaintext, it can be eavesdropped on and tampered with, which enables a man-in-the-middle attack)

Man-in-the-middle attack

  1. The browser asks the server for its public key
  2. The middleman intercepts the response and keeps the server’s real public key for himself
  3. The middleman generates his own key pair and sends the forged public key to the browser
  4. The browser uses the forged public key to communicate with the middleman
  5. The middleman uses the real public key to communicate with the server

Because the browser encrypts with the forged public key, the middleman can read and modify everything, so the communication is not secure.
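The attack can be simulated in a few lines. As before this is a sketch with toy RSA keys; the helper names and the choice of primes are assumptions made for the example.

```python
import secrets


def make_keypair(p, q, e=65537):
    """Toy RSA key pair from two primes: returns (public, private)."""
    n = p * q
    return (e, n), (pow(e, -1, (p - 1) * (q - 1)), n)


def apply_key(key, value):
    """Textbook RSA operation: works for either half of a key pair."""
    exponent, n = key
    return pow(value, exponent, n)


server_public, server_private = make_keypair(2**61 - 1, 2**89 - 1)
attacker_public, attacker_private = make_keypair(2**107 - 1, 2**127 - 1)

# Steps 1-3: the browser asks for the server's key, but the middleman
# keeps the real one and hands over his own forged key instead.
key_seen_by_browser = attacker_public

# Step 4: the browser encrypts its session key with the forged key.
key_a = secrets.randbits(128)
to_middleman = apply_key(key_seen_by_browser, key_a)
stolen_key = apply_key(attacker_private, to_middleman)

# Step 5: the middleman re-encrypts with the real key, so the server
# receives a valid message and notices nothing.
to_server = apply_key(server_public, stolen_key)
server_key = apply_key(server_private, to_server)

print("middleman knows the session key:", stolen_key == key_a == server_key)
```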

Problems that need to be solved

As long as the public key the browser obtains really is the public key of the target website, the communication is secure. So the question becomes: how can the public key be delivered safely over an untrusted network?

This is the problem HTTPS has to solve.

The HTTPS protocol

The HTTPS protocol covers a lot of ground. This article focuses only on the secure key exchange, which is the essence of the HTTPS protocol.

The HTTPS protocol introduces the concepts of CA and digital certificates.

The digital certificate

A digital certificate contains the issuing authority, the validity period, the applicant’s public key, the certificate owner, the certificate signature algorithm, the certificate fingerprint, and the fingerprint algorithm.

CA

A certificate authority (CA) is an organization that issues digital certificates. The certificates of authoritative CAs are trusted by the operating system and come preinstalled with it.

A digital signature

A hash algorithm is applied to the data to obtain a hash value, and the private key is then used to encrypt that hash value to obtain the signature.

Only the matching public key can decrypt the signature, which proves that the signature was produced by the corresponding private key.
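Hash-then-sign can be sketched as follows. The assumptions are the same as in any toy RSA example: a tiny unpadded key, plus a SHA-256 digest truncated to 128 bits so that it fits below the toy modulus. The `sign` and `verify` helpers are invented for illustration.

```python
import hashlib


def make_keypair(p, q, e=65537):
    """Toy RSA key pair from two primes: returns (public, private)."""
    n = p * q
    return (e, n), (pow(e, -1, (p - 1) * (q - 1)), n)


def digest_int(data: bytes) -> int:
    """SHA-256 truncated to 128 bits so it is smaller than the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest()[:16], "big")


def sign(private_key, data: bytes) -> int:
    d, n = private_key
    return pow(digest_int(data), d, n)      # "encrypt" the hash


def verify(public_key, data: bytes, signature: int) -> bool:
    e, n = public_key
    return pow(signature, e, n) == digest_int(data)  # "decrypt" and compare


public_key, private_key = make_keypair(2**61 - 1, 2**89 - 1)
signature = sign(private_key, b"pay Bob 100")

original_ok = verify(public_key, b"pay Bob 100", signature)
tampered_ok = verify(public_key, b"pay Bob 900", signature)
print(original_ok, tampered_ok)
```

Changing a single byte of the data changes the hash, so the old signature no longer matches.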

Certificate Issuing process

  1. The website generates a key pair, keeps the private key, and submits the public key and the website’s domain name to the CA
  2. The CA writes the certificate issuing authority (itself), the certificate validity period, the website’s public key, and the website’s domain name into a text file in plain text
  3. The CA selects a fingerprint algorithm (usually a hash algorithm) and applies it to the content of the text file to obtain the fingerprint. The CA then encrypts the fingerprint and the fingerprint algorithm with its private key to obtain the digital signature. The signature algorithm is recorded in the plaintext part of the certificate
  4. The CA packages the plaintext part, the fingerprint, the fingerprint algorithm, and the digital signature into a certificate and sends the certificate to the server
  5. The server now holds a digital certificate issued by an authoritative CA, along with its own private key
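A minimal sketch of the issuing steps, assuming toy RSA keys and a JSON dictionary standing in for the certificate. The field names, the CA name, and the `issue_certificate` helper are hypothetical; real certificates use the X.509 format.

```python
import hashlib
import json


def make_keypair(p, q, e=65537):
    """Toy RSA key pair from two primes: returns (public, private)."""
    n = p * q
    return (e, n), (pow(e, -1, (p - 1) * (q - 1)), n)


def fingerprint(data: bytes) -> int:
    """SHA-256 truncated to 128 bits so it fits below the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest()[:16], "big")


def issue_certificate(ca_private, domain, site_public):
    # Step 2: the plaintext part of the certificate.
    body = {
        "issuer": "Example Toy CA",
        "valid_until": "2030-01-01",
        "domain": domain,
        "public_key": list(site_public),
        "fingerprint_algorithm": "sha256-truncated-128",
    }
    # Step 3: fingerprint the plaintext, then sign it with the CA private key.
    encoded = json.dumps(body, sort_keys=True).encode()
    d, n = ca_private
    signature = pow(fingerprint(encoded), d, n)
    # Step 4: package plaintext and signature together.
    return {"body": body, "signature": signature}


ca_public, ca_private = make_keypair(2**107 - 1, 2**127 - 1)

# Step 1: the website generates its own key pair and submits the public half.
site_public, site_private = make_keypair(2**61 - 1, 2**89 - 1)

certificate = issue_certificate(ca_private, "bank.example", site_public)
print(json.dumps(certificate["body"], indent=2))
```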

Certificate Verification process

How does the browser verify the validity of the site?

  1. The browser connects to port 443 of the server using HTTPS
  2. The server sends its digital certificate to the browser (in plain text)
  3. The browser first checks whether the CA, the validity period, and the domain name are valid. If not, the connection is terminated (the server cannot be trusted)
  4. If they are valid, the browser obtains the CA’s public key from the operating system and decrypts the digital signature according to the signature algorithm, recovering the certificate fingerprint and the fingerprint algorithm
  5. The browser uses the decrypted fingerprint algorithm to compute the fingerprint of the certificate and compares it with the decrypted fingerprint. If the two match, the certificate is valid and the public key in it can be trusted
  6. The browser can now be confident that it is talking to the real server, and the middleman cannot read the communication because he does not have the website’s private key
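In practice these checks are performed by the browser or the TLS library. As a small illustration (using Python's standard `ssl` module rather than a browser), the default client context already demands a valid certificate chain and a matching domain name, and it loads its trusted CA certificates from the system:

```python
import ssl

# Default client-side settings for verifying a server.
context = ssl.create_default_context()

# A certificate signed by a trusted CA is mandatory.
requires_valid_cert = context.verify_mode == ssl.CERT_REQUIRED

# The domain name in the certificate must match the host being contacted.
checks_domain_name = context.check_hostname

# CA certificates loaded so far. This may be 0 on systems that load CA
# certificates lazily from a directory instead of a single bundle file.
trusted_ca_count = len(context.get_ca_certs())

print(requires_valid_cert, checks_domain_name, trusted_ca_count)
```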

How was the problem solved

  1. The hacker impersonates the CA and gives the browser a fake certificate

    The browser looks up the CA’s public key in the operating system by the CA name and uses it to decrypt the digital signature. The decryption does not produce a matching fingerprint, which proves that the private key used to sign the certificate is not the partner of the CA public key built into the operating system, so the forgery is detected

  2. The hacker tampers with the website’s public key in the certificate

    The website’s public key in the certificate can be altered, but the digital signature is computed with the CA’s private key, so the hacker cannot produce a matching signature. When the browser decrypts the digital signature with the built-in CA public key, the fingerprints do not match, so this forgery is detected as well

  3. The hacker can also obtain the website’s public key

    Indeed, a hacker who visits the website with his own browser receives the same public key that every normal user receives.

    However, the symmetric key used between each client and the server is random and generated per session. The hacker only knows his own random key, not anyone else’s
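The first two cases can be checked with a toy model. This is a sketch under the same assumptions as any textbook RSA example: tiny unpadded keys, a JSON dictionary in place of a real certificate, and hypothetical helper names. The validity-period check is omitted for brevity.

```python
import hashlib
import json


def make_keypair(p, q, e=65537):
    """Toy RSA key pair from two primes: returns (public, private)."""
    n = p * q
    return (e, n), (pow(e, -1, (p - 1) * (q - 1)), n)


def fingerprint(body: dict) -> int:
    encoded = json.dumps(body, sort_keys=True).encode()
    return int.from_bytes(hashlib.sha256(encoded).digest()[:16], "big")


def issue(issuer, signer_private, domain, site_public):
    body = {"issuer": issuer, "domain": domain, "public_key": list(site_public)}
    d, n = signer_private
    return {"body": body, "signature": pow(fingerprint(body), d, n)}


def verify(cert, trust_store, expected_domain) -> bool:
    body = cert["body"]
    ca_public = trust_store.get(body["issuer"])   # look up CA key by name
    if ca_public is None or body["domain"] != expected_domain:
        return False
    e, n = ca_public
    return pow(cert["signature"], e, n) == fingerprint(body)


ca_public, ca_private = make_keypair(2**107 - 1, 2**127 - 1)
site_public, _ = make_keypair(2**61 - 1, 2**89 - 1)
attacker_public, attacker_private = make_keypair(2**521 - 1, 2**607 - 1)
trust_store = {"Example Toy CA": ca_public}       # built into the "OS"

genuine = issue("Example Toy CA", ca_private, "bank.example", site_public)

# Case 1: the hacker claims to be the CA but signs with his own private key.
fake = issue("Example Toy CA", attacker_private, "bank.example", attacker_public)

# Case 2: the hacker swaps the public key but cannot recompute the signature.
tampered = {
    "body": dict(genuine["body"], public_key=list(attacker_public)),
    "signature": genuine["signature"],
}

results = [verify(c, trust_store, "bank.example") for c in (genuine, fake, tampered)]
print(results)
```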

To sum up, the browser relies on the authoritative CA public keys built into the operating system to solve the problem of obtaining a website’s public key safely.

HTTPS man-in-the-middle attack

The HTTPS protocol solves the man-in-the-middle problem of the HTTP era. However, a man-in-the-middle attack is still possible when an HTTPS user actively trusts a forged certificate (for example, 12306 required users to trust its certificate manually). The process of an HTTPS man-in-the-middle attack is as follows:

  1. The client connects to port 443 of the server using HTTPS
  2. The server sends its digital certificate to the client
  3. The hacker intercepts the server’s real certificate and sends a forged certificate to the browser instead
  4. The browser can detect that the certificate it received is fake, but the user chooses to trust it
  5. The browser generates the random symmetric key A, encrypts it with the public key in the forged certificate, and sends it to the server
  6. The hacker intercepts this request and decrypts it with his own private key to obtain the browser’s symmetric key A, which lets him eavesdrop on or tamper with the communication data
  7. The hacker encrypts symmetric key A with the server’s real public key and sends it to the server
  8. The server decrypts symmetric key A with its private key and then communicates with the hacker
  9. The hacker uses symmetric key A to decrypt the data from the server, re-encrypts it with symmetric key A, and sends it to the client
  10. The data received by the client is no longer secure

That’s how an HTTPS man-in-the-middle attack works, and that’s why you have to trust the capture tool’s certificate before you can capture HTTPS traffic.
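The role of manual trust can be sketched with a toy trust store. Everything here is an assumption made for illustration: toy unpadded RSA keys, a dictionary in place of a real certificate, and made-up CA names. The forged certificate is rejected until the user adds the attacker's CA to the trust store, after which the session key can be stolen.

```python
import hashlib
import json
import secrets


def make_keypair(p, q, e=65537):
    """Toy RSA key pair from two primes: returns (public, private)."""
    n = p * q
    return (e, n), (pow(e, -1, (p - 1) * (q - 1)), n)


def fingerprint(body: dict) -> int:
    encoded = json.dumps(body, sort_keys=True).encode()
    return int.from_bytes(hashlib.sha256(encoded).digest()[:16], "big")


def issue(issuer, signer_private, domain, site_public):
    body = {"issuer": issuer, "domain": domain, "public_key": list(site_public)}
    d, n = signer_private
    return {"body": body, "signature": pow(fingerprint(body), d, n)}


def verify(cert, trust_store) -> bool:
    ca_public = trust_store.get(cert["body"]["issuer"])
    if ca_public is None:
        return False
    e, n = ca_public
    return pow(cert["signature"], e, n) == fingerprint(cert["body"])


real_ca_public, _ = make_keypair(2**107 - 1, 2**127 - 1)
proxy_public, proxy_private = make_keypair(2**521 - 1, 2**607 - 1)
trust_store = {"Example Toy CA": real_ca_public}

# Step 3: the hacker (or capture tool) forges a certificate for the site.
forged = issue("Capture Proxy CA", proxy_private, "bank.example", proxy_public)

# Step 4: the browser rejects it, until the user trusts the proxy's CA.
trusted_before = verify(forged, trust_store)
trust_store["Capture Proxy CA"] = proxy_public
trusted_after = verify(forged, trust_store)

# Steps 5-6: the browser encrypts key A with the forged public key,
# and the hacker decrypts it with the matching private key.
key_a = secrets.randbits(128)
e, n = forged["body"]["public_key"]
stolen_key = pow(pow(key_a, e, n), proxy_private[0], proxy_private[1])

print(trusted_before, trusted_after, stolen_key == key_a)
```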

Conclusion

  1. The operating system ships with the public keys of authoritative CAs, which anchor the trust in digital signatures and digital certificates
  2. An HTTPS man-in-the-middle attack requires manually trusting the attacker’s fake certificate