Background

As we know, plain HTTP communication has the following problems:

  • Traffic is sent in plaintext and can be eavesdropped
  • The identity of the other party is not authenticated, so it may be impersonated
  • Message integrity cannot be verified, so content may be tampered with in transit

Using HTTPS solves these data security problems, but do you really understand HTTPS?

When an interviewer asks you a series of probing questions like these, can you answer them fluently?

  1. What is HTTPS and why is it needed?
  2. What does the HTTPS connection process look like?
  3. Which encryption methods does HTTPS use (symmetric and asymmetric), and why is it designed this way? Why is symmetric encryption used for content transmission?
  4. Is HTTPS absolutely secure?
  5. Can HTTPS traffic be captured?

If you can answer these questions well, congratulations: you know enough about HTTPS for an interview.

What is HTTPS

Simply put, HTTPS is HTTP + SSL: the secure version of HTTP, which uses SSL/TLS to encrypt the content of HTTP communication.

HTTPS provides:

  1. Content encryption: establishes a secure channel so that transmitted data stays confidential
  2. Authentication: verifies the authenticity of a website
  3. Data integrity: prevents content from being forged or tampered with by third parties

What is SSL

SSL was created by Netscape in 1994 to secure Internet communication over the Web. It is a standard protocol used to encrypt communication between browsers and servers; TLS is its standardized successor, which is why the two names usually appear together as SSL/TLS. It allows private information such as account passwords, bank card numbers and mobile phone numbers to be transmitted securely and easily over the Internet.

An SSL certificate is a digital certificate, issued by a trusted certificate authority (CA), that is used with the SSL protocol.

How SSL/TLS works:

To understand how SSL/TLS works, we first need to understand encryption algorithms. There are two kinds: symmetric encryption and asymmetric encryption.

Symmetric encryption: both parties use the same key for encryption and decryption. Its advantage is speed; its disadvantage is that the key must be protected, because if the key leaks, anyone holding it can decrypt the traffic. Common symmetric encryption algorithms are AES and DES.
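To make the "same key on both sides" idea concrete, here is a minimal sketch. It is not AES or DES: the Python standard library ships neither, so the hypothetical helper `xor_stream` builds a toy stream cipher from SHA-256 purely for illustration and should not be used to protect real data.

```python
import hashlib
import secrets


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR the data with a SHA-256-derived keystream.

    Illustration only; real systems use vetted ciphers such as AES.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    # zip() stops at len(data), so surplus keystream bytes are ignored.
    return bytes(d ^ s for d, s in zip(data, stream))


shared_key = secrets.token_bytes(32)             # both parties must hold this key
ciphertext = xor_stream(shared_key, b"account=alice;pin=1234")
plaintext = xor_stream(shared_key, ciphertext)   # same key, same operation

wrong_key = secrets.token_bytes(32)
garbage = xor_stream(wrong_key, ciphertext)      # a different key yields noise
```

The same function encrypts and decrypts, which is exactly why the key has to be kept secret by both parties.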

Asymmetric encryption: this requires generating two keys, a public key and a private key.

Public keys are, as the name suggests, public and available to anyone, while private keys are held privately. Most programmers have already used this kind of algorithm: when pushing code to GitHub, we can use an SSH key. We generate a private key and a public key locally; the private key is stored in the local .ssh directory and the public key is uploaded to the GitHub website. From then on, we do not need to enter a username and password each time we push; GitHub identifies us using the public key stored on the website.

Either key can undo the other: data encrypted with the public key can only be decrypted with the private key (this is how confidentiality is achieved), and data transformed with the private key can be checked with the public key (this is how digital signatures work). Asymmetric algorithms avoid having to share a secret key in advance, but they require far more computation than symmetric encryption, so encryption and decryption are much slower. A common asymmetric algorithm is RSA.
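The two directions can be illustrated with textbook RSA. This is a sketch with deliberately tiny primes and no padding, so it is trivially breakable; the helper name `apply_key` is made up for this example, and `pow(e, -1, phi)` assumes Python 3.8 or later.

```python
# Textbook RSA with tiny primes (illustration only, not secure).
p, q = 61, 53
n = p * q                    # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent: modular inverse of e

public_key = (e, n)
private_key = (d, n)


def apply_key(key, value: int) -> int:
    """Raise value to the key's exponent modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65

# Confidentiality: encrypt with the public key, decrypt with the private key.
ciphertext = apply_key(public_key, message)
decrypted = apply_key(private_key, ciphertext)

# Signature: transform with the private key, verify with the public key.
signature = apply_key(private_key, message)
signature_ok = apply_key(public_key, signature) == message
```

Production RSA uses keys of 2048 bits or more plus padding schemes, which is part of why it is so much slower than symmetric encryption.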

HTTPS connection process

The HTTPS connection process is divided into two phases: certificate verification and data transfer.

Certificate verification phase

It basically takes three steps:

  1. The browser sends a request
  2. When the server receives the request, it returns its certificate, which includes the public key
  3. After receiving the certificate, the browser checks whether it is valid. If the certificate is invalid, a warning is displayed. (How the certificate is verified is explained in detail later in this article.)
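From the client side, the three steps above can be sketched with Python's standard `ssl` module. The function name `fetch_certificate` is an assumption made for this example; the default context verifies the certificate chain and the host name, and raises an error if verification fails.

```python
import socket
import ssl


def fetch_certificate(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect to host over TLS and return its verified certificate as a dict.

    Raises ssl.SSLCertVerificationError if the certificate is not valid.
    """
    context = ssl.create_default_context()    # loads the system's trusted root CAs
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # server_hostname is used both for SNI and for the host name check.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


# The default context already enforces verification:
context = ssl.create_default_context()
verification_required = context.verify_mode == ssl.CERT_REQUIRED
hostname_checked = context.check_hostname

# Example call (needs network access), e.g.:
# cert = fetch_certificate("example.com")
# print(cert["subject"], cert["notAfter"])
```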

Data transfer phase

After the certificate is validated:

  1. The browser generates a random number
  2. The browser encrypts the random number with the public key and sends it to the server
  3. The server receives the encrypted value from the browser and decrypts it with its private key
  4. After decryption succeeds, the server encrypts its content with a symmetric encryption algorithm and transmits it to the client

From then on, both sides use the random number generated in step 1 as the basis for symmetrically encrypted communication.
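The data transfer steps above can be simulated end to end. This is a simplified sketch under loud assumptions: the server key is a toy RSA pair built from two Mersenne primes, `xor_stream` is a hypothetical toy cipher standing in for AES, and the key derivation is a bare SHA-256. Real TLS mixes in additional random values and uses a dedicated key derivation function.

```python
import hashlib
import secrets

# Toy RSA key pair for the server (Mersenne primes; illustration only).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537                  # public key: (E, N)
D = pow(E, -1, (P - 1) * (Q - 1))    # private key, known only to the server


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256-derived keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


# Steps 1-2: the browser generates a random number and encrypts it.
client_random = secrets.randbits(128)
encrypted_random = pow(client_random, E, N)

# Step 3: the server decrypts it with its private key.
server_random = pow(encrypted_random, D, N)

# Both sides now derive the same symmetric key from the shared random number.
client_key = hashlib.sha256(client_random.to_bytes(16, "big")).digest()
server_key = hashlib.sha256(server_random.to_bytes(16, "big")).digest()

# Step 4: the server replies using symmetric encryption; the browser decrypts.
reply = xor_stream(server_key, b"HTTP/1.1 200 OK")
received = xor_stream(client_key, reply)
```

Only the short random number goes through the slow asymmetric operation; everything after that uses the fast symmetric cipher.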

Which encryption methods does HTTPS use (symmetric and asymmetric), and why this design

From the above we can see that HTTPS combines symmetric encryption and asymmetric encryption.

In the certificate verification phase, asymmetric encryption is used. In the data transfer phase, symmetric encryption is used.

This design balances security and efficiency.

In the certificate verification phase, asymmetric encryption relies on a public/private key pair. The server’s public key is visible to everyone, including an attacker, but that does not endanger the random number: data encrypted with the public key can only be decrypted with the private key, which never leaves the server. This protects the random number to the greatest extent.

In the content transmission phase, symmetric encryption greatly improves the efficiency of encryption and decryption.

Why use symmetric encryption for content transmission

  1. Symmetric encryption is efficient
  2. A public/private key pair only supports one-way encrypted transmission: anyone can encrypt with the public key, but only the holder of the private key can decrypt. Only the server holds the private key, so the server could not send confidential data back to the client this way. If asymmetric encryption were used in both directions, every client would need its own key pair, which is obviously impractical, because key pairs and their certificates have to be applied for.

Is HTTPS absolutely secure

It is not absolutely secure. It can be subjected to a man-in-the-middle attack.

What is a man-in-the-middle attack

In a man-in-the-middle attack, the attacker creates an independent connection with the two ends of the communication and exchanges the data they receive, making the two ends of the communication think they are talking to each other directly through a private connection, but in fact the whole conversation is completely controlled by the attacker.

HTTPS uses the SSL encryption protocol, which is a very secure mechanism, and the attack described here does not break the protocol itself. Instead, the attacker intercepts the client’s request while the SSL connection is being established and acts as an intermediary, substituting its own certificate and public key and thereby obtaining the symmetric encryption key. With these in hand, requests and responses can be intercepted and tampered with.

How the attack works:

  1. The local request is hijacked (e.g., through DNS hijacking), so all requests are sent to the middleman’s server
  2. The middleman’s server returns the middleman’s own certificate
  3. The client generates a random number, encrypts it with the public key from the middleman’s certificate, and sends it to the middleman. It then builds a symmetric key from the random number and uses it to encrypt the content it transmits
  4. Because the middleman now holds the client’s random number, it can decrypt the content with the same symmetric encryption algorithm
  5. The middleman forwards the content of the client’s request to the official website in a request of its own
  6. Because the communication between the middleman and the server is a legitimate HTTPS session, the official website returns encrypted data over the secure channel it has established
  7. The middleman decrypts the content using the symmetric key it established with the official website
  8. The middleman re-encrypts the data returned by the official website using the symmetric key it established with the client, and sends it on
  9. The client decrypts the returned data using the symmetric key it established with the middleman

Due to the lack of certificate verification, although the client initiates an HTTPS request, the client is completely unaware that its network has been intercepted and the transmitted content is stolen by a middleman.
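The key-exchange part of this attack can be sketched as a small simulation. Everything here is a toy assumption: `make_keypair` and `rsa` are hypothetical helpers implementing textbook RSA over Mersenne primes, and the "client" is one that accepts whatever public key it is handed without verifying a certificate.

```python
import secrets


def make_keypair(p: int, q: int, e: int = 65537):
    """Return (public, private) toy RSA keys; p and q must be distinct primes."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (e, n), (d, n)


def rsa(key, value: int) -> int:
    exponent, modulus = key
    return pow(value, exponent, modulus)


server_pub, server_priv = make_keypair(2**61 - 1, 2**89 - 1)
mitm_pub, mitm_priv = make_keypair(2**107 - 1, 2**127 - 1)

# Steps 1-2: the request is hijacked, so the client receives the middleman's
# key. A client that skips certificate verification cannot tell the difference.
key_seen_by_client = mitm_pub

# Step 3: the client encrypts its random number with the key it was given.
client_random = secrets.randbits(128)
to_middleman = rsa(key_seen_by_client, client_random)

# Step 4: the middleman recovers the client's random number.
stolen_random = rsa(mitm_priv, to_middleman)

# Steps 5-6: the middleman opens its own, perfectly legitimate session
# with the real server, using a random number of its own.
mitm_random = secrets.randbits(128)
to_server = rsa(server_pub, mitm_random)
server_random = rsa(server_priv, to_server)
```

The result is two independent encrypted sessions, with the middleman holding the secret for both of them.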

How does HTTPS protect against man-in-the-middle attacks

Certificates are what allow HTTPS to defend against man-in-the-middle attacks. Suppose a middleman M intercepts the client’s request, hands the client M’s own public key, and then requests the server’s public key, acting as an intermediary so that neither the client nor the server knows the traffic has been intercepted. To rule this out, the client needs proof that the public key it received really belongs to the server.

How do we prove that?

This requires an authoritative, impartial third party: the certificate authority (CA). A CA specializes in authenticating and vouching for public keys; think of it as a guarantee company for public keys. There are more than 100 globally recognized CAs, such as VeriSign, GlobalSign, and WoSign.

How does the browser ensure that the CA certificate is valid?

1. What information does the certificate contain?

Issuer information, public key, company information, domain name, validity period, fingerprint, and so on.

2. What is the basis of the validity of the certificate?

First of all, a certificate authority must itself be accredited; not just any organization is qualified to issue certificates, otherwise it could not be called an authority. In addition, the credibility of a certificate rests on a system of trust: the authority endorses every certificate it issues, so any certificate issued by a trusted authority is considered legitimate. For this reason the authority reviews the applicant’s information, and because different validation levels have different requirements, certificates range from free to cheap to expensive.

3. How does the browser verify the validity of the certificate?

When the browser initiates an HTTPS request, the server returns the website’s SSL certificate. The browser needs to verify the certificate as follows:

  1. Verify that the domain name and validity period are correct. Every certificate contains this information, so this check is easy to complete
  2. Determine whether the certificate comes from a trusted source. Each issued certificate can be traced back along its certificate chain. The operating system (OS) and browser store the root certificates of trusted authorities locally, and use those local root certificates to authenticate the source of the certificates those authorities issue
  3. Determine whether the certificate has been tampered with. The browser checks the digital signature on the certificate using the issuing CA’s public key; any change to the certificate’s content makes the signature check fail
  4. Determine whether the certificate has been revoked, using a Certificate Revocation List (CRL) or the Online Certificate Status Protocol (OCSP). OCSP queries the status of a single certificate instead of downloading a whole list, which reduces interaction with the CA server and improves verification efficiency

The browser considers the certificate valid only if all of the preceding checks pass.
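The four checks can be sketched as a toy validator. This is a simplified model, not how browsers are implemented: the `Certificate` class, `ca_sign` and `is_valid` are names invented for this example, the CA signature is textbook RSA over Mersenne primes, and revocation is modeled as a plain set of serial numbers rather than a real CRL or OCSP query.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone

# Toy CA key pair (textbook RSA; illustration only).
CA_N = (2**61 - 1) * (2**89 - 1)
CA_E = 65537
CA_D = pow(CA_E, -1, (2**61 - 2) * (2**89 - 2))   # known only to the CA


@dataclass
class Certificate:
    domain: str
    public_key: int
    not_before: datetime
    not_after: datetime
    serial: int
    signature: int = 0

    def digest(self) -> int:
        """Hash every signed field, reduced so it fits the toy modulus."""
        fields = [self.domain, self.public_key, self.not_before,
                  self.not_after, self.serial]
        body = "|".join(str(f) for f in fields).encode()
        return int.from_bytes(hashlib.sha256(body).digest(), "big") % CA_N


def ca_sign(cert: Certificate) -> Certificate:
    cert.signature = pow(cert.digest(), CA_D, CA_N)
    return cert


def is_valid(cert: Certificate, host: str, now: datetime, revoked: set) -> bool:
    return (
        cert.domain == host                                      # 1. domain name
        and cert.not_before <= now <= cert.not_after             # 1. validity period
        and pow(cert.signature, CA_E, CA_N) == cert.digest()     # 2-3. CA signature
        and cert.serial not in revoked                           # 4. revocation
    )


utc = timezone.utc
now = datetime(2024, 6, 1, tzinfo=utc)
cert = ca_sign(Certificate("example.com", 12345,
                           datetime(2024, 1, 1, tzinfo=utc),
                           datetime(2025, 1, 1, tzinfo=utc), serial=1))

accepted = is_valid(cert, "example.com", now, set())
wrong_host = is_valid(cert, "evil.example", now, set())
expired = is_valid(cert, "example.com", datetime(2026, 1, 1, tzinfo=utc), set())
revoked = is_valid(cert, "example.com", now, {1})

cert.public_key = 99999            # a middleman swaps in its own key
tampered = is_valid(cert, "example.com", now, set())
```

Swapping the public key changes the digest, so the CA signature no longer matches; this is the check that defeats the middleman's substituted certificate.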
