Details on HTTP itself can be found in the HTTP article of the Computer Networking series.

Most of this article is based on the article "Getting to the bottom of HTTPS encryption".

Definition of HTTPS

HTTPS is an HTTP channel designed for security. It builds on HTTP and secures the transmission process through encryption and identity authentication. HTTPS adds an SSL layer to HTTP, and its security rests on SSL, so the actual encryption work is done by the SSL protocol.

HTTPS = HTTP + SSL. During HTTPS data transmission:

  • The SSL protocol is used to encrypt and decrypt the data
  • The HTTP protocol is used to transmit the encrypted data

You can see that HTTPS is done by HTTP and SSL working together

Definition of SSL

SSL and its successor TLS are security protocols that provide data security and integrity for network communication. They sit between the transport layer and the application layer and encrypt the network connection.

HTTPS's improvements over HTTP

HTTPS improves on HTTP in three ways:

  1. Data security: with hybrid encryption, a man in the middle cannot directly read the plaintext content
  2. Identity authentication: certificates let the client verify that the server it is accessing is the intended one
  3. Data integrity: prevents the transmitted content from being impersonated or tampered with by a man in the middle

For example, a packet capture shows that the data is not transmitted in plaintext.

How does HTTPS ensure security?

  1. HTTPS secures data transmission with a hybrid of asymmetric encryption and symmetric encryption
  2. The public and private keys are stored on the server. When the browser requests a URL, it first requests the public key from the server. To transfer the key, the browser encrypts it with the public key and the server decrypts it with the private key
  3. The security of the asymmetric encryption and decryption process relies on the issuer of the digital certificate, which is used to check whether the certificate has been swapped
  4. The browser also uses the digital signature to recover a digest of the certificate's original content and compares it with the content it received to check whether the certificate has been tampered with
  5. If the certificate is found to have been swapped or tampered with, the exchange is terminated and the key is not transmitted

Next, we go deeper from [cryptography], [symmetric encryption], [asymmetric encryption], [digital certificate], [digital signature], and finally summarize [the whole process of SSL handshake], so as to find out how HTTPS ensures data security, identity authentication, and data integrity:

Some concepts of cryptography

  • Plaintext: raw data that has not been encrypted.
  • Ciphertext: plaintext that has been encrypted by an algorithm, which protects the original data. Ciphertext can be decrypted to recover the original plaintext.
  • Key: a key is a parameter fed into the algorithm that converts plaintext to ciphertext or ciphertext to plaintext. Keys are classified into symmetric keys and asymmetric keys, which are used for symmetric encryption and asymmetric encryption respectively.
  • Symmetric encryption: also called secret-key encryption. The sender and receiver of the information use the same key to encrypt and decrypt data. Its algorithms are public and encryption and decryption are fast, so it is suitable for encrypting large amounts of data. Common symmetric encryption algorithms include DES, 3DES (also known as TDEA), Blowfish, RC5, and IDEA.
  • The encryption and decryption process is as follows:

Plaintext + encryption algorithm + key => Ciphertext, ciphertext + decryption algorithm + key => plaintext

The key used for encryption is the same as the key used for decryption, which is why this kind of encryption is called "symmetric".
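The "same key in both directions" idea can be sketched with a toy cipher. This is only an illustration, not one of the algorithms named above: it XORs the data with a keystream derived from the key using SHA-256, and the names `keystream` and `xor_cipher` are made up for this example. It should not be used to protect real data.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Stretch the key into `length` bytes by hashing key + counter with SHA-256."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR is its own inverse, so one function does both."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


key = secrets.token_bytes(16)                # the single shared key
plaintext = b"hello https"
ciphertext = xor_cipher(key, plaintext)      # plaintext + algorithm + key => ciphertext
recovered = xor_cipher(key, ciphertext)      # ciphertext + algorithm + key => plaintext
print(recovered == plaintext)
```

Anyone who holds `key` can run both steps, which is exactly why the key must stay secret between the two parties.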

  • Asymmetric encryption: asymmetric encryption avoids the key-sharing weakness of symmetric encryption. Because both sides of a symmetrically encrypted communication use the same key, the entire communication can be cracked if either party's key is compromised. Asymmetric encryption instead uses a pair of keys that belong together: a public key and a private key. The private key is kept by its owner and must not be disclosed. The public key is public and can be obtained by anyone. Data encrypted with either key of the pair is decrypted with the other.

  • The ciphertext encrypted by the public key can only be decrypted by the private key. The process is as follows:

Plaintext + encryption algorithm + Public key => Ciphertext, ciphertext + decryption algorithm + private key => plaintext

  • The ciphertext encrypted by the private key can only be decrypted by the public key. The process is as follows:

Plaintext + encryption algorithm + private key => Ciphertext, ciphertext + decryption algorithm + Public key => plaintext

Asymmetric encryption is "asymmetric" because two different keys are used for encryption and decryption. Note that either key can be used to encrypt: public key encryption corresponds to private key decryption, and private key encryption corresponds to public key decryption.
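Both directions can be tried out with a textbook RSA toy. This is a sketch under loud assumptions: the primes 61 and 53 and the exponent 17 are tiny illustrative values, there is no padding, and the helper name `apply_key` is invented here. Real keys are thousands of bits long.

```python
# Toy RSA key pair (illustrative values only).
p, q = 61, 53
n = p * q                      # modulus shared by both keys: 3233
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent  -> public key is (e, n)
d = pow(e, -1, phi)            # private exponent -> private key is (d, n); Python 3.8+


def apply_key(message: int, exponent: int) -> int:
    """Encrypting and decrypting are the same operation with different exponents."""
    return pow(message, exponent, n)


m = 65                                   # plaintext encoded as a number smaller than n

c_public = apply_key(m, e)               # encrypt with the public key
print(apply_key(c_public, d) == m)       # decrypt with the private key

c_private = apply_key(m, d)              # encrypt with the private key
print(apply_key(c_private, e) == m)      # decrypt with the public key
```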

HTTPS uses both asymmetric and symmetric encryption

Why a mixture of asymmetric and symmetric encryption?

Can symmetric encryption alone work?

If both parties have the same key and no one else knows about it, the security of the communication between the two parties can be guaranteed (unless the key is cracked).

The biggest problem, however, is how to get the key to both parties without anyone else learning it. If the server generates a key and sends it to the browser, someone could hijack it in transit and then decrypt everything the two parties exchange, so that does not work.

Another approach: if the browser pre-stored the key of website A, and we could ensure that nobody other than the browser and website A knew that key, symmetric encryption would work in theory. But then the browser would need to pre-store the keys of every HTTPS site in the world, which is obviously unrealistic. What to do? This is where asymmetric encryption comes in.

Is it possible to use only asymmetric encryption?

Given how asymmetric encryption works, we might reason as follows: the server generates a public and private key pair in advance and first sends the public key to the browser in plaintext. Before sending data to the server, the browser encrypts it with that public key, so data travelling from the browser to the server is secure, because only the server has the corresponding private key that can decrypt data encrypted with the public key.

But how is the path from the server to the browser secured? If the server encrypts the data for the browser with its private key, the browser can decrypt it with the public key, but that public key was originally sent to the browser in plaintext. If a man in the middle captured the public key in transit, he can also decrypt the messages from the server. So this does not secure the path from the server to the browser.

Is an improved asymmetric encryption scheme feasible?

The browser and the server each have a public key and a private key, thus ensuring the security of both routes

  1. A web server has public key A and the corresponding private key A'. The browser has public key B and the corresponding private key B'.
  2. The browser transmits public key B in plaintext to the server.
  3. The server transmits public key A in plaintext to the browser.
  4. Content that the browser sends to the server is encrypted with public key A, and the server decrypts it with private key A'. Since only the server has private key A', this data is secure.
  5. Similarly, content that the server sends to the browser is encrypted with public key B, and the browser decrypts it with private key B'. For the same reason, this data is also secure.

So why is this scheme not used? Because asymmetric encryption is time-consuming, while symmetric encryption is comparatively fast. How, then, can we combine the speed of symmetric encryption with the security of asymmetric encryption?

A further difficulty with asymmetric encryption is that it is very hard to verify the authenticity of a public key. As we will see later, digital certificates can ensure that the public key has not been swapped, but they raise two questions of their own: 1. how to tell whether the certificate has been swapped; 2. how to tell whether the certificate has been tampered with.

Advantages of asymmetric encryption + symmetric encryption scheme

  • Fast
  • Secure

The scheme works as follows:

  1. A website has public key A and private key A' for asymmetric encryption.
  2. The browser sends a request to the web server, which sends public key A in plaintext to the browser.
  3. The browser randomly generates a key X for symmetric encryption, encrypts it with public key A, and sends it to the server.
  4. The server obtains key X by decrypting with private key A'.
  5. Now both sides have key X and nobody else knows it. From then on, all data exchanged by the two parties is encrypted and decrypted with key X
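The five steps can be walked through in a short sketch. Everything cryptographic here is a stand-in chosen so the example runs with the standard library alone: the key pair is unpadded textbook RSA built from two known Mersenne primes (far too small for real use), and `stream_xor` is a toy symmetric cipher invented for this example.

```python
import hashlib
import secrets

# Step 1: the website's toy key pair (public key A = (E, N), private key A' = D).
P, Q = 2**61 - 1, 2**89 - 1            # two known Mersenne primes, illustrative only
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))      # Python 3.8+


def stream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a SHA-256 based keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Step 2 is the server sending (E, N) to the browser in plaintext.

# Step 3: the browser generates key X and encrypts it with public key A.
x_browser = secrets.token_bytes(16)
wrapped_x = pow(int.from_bytes(x_browser, "big"), E, N)

# Step 4: the server recovers X with private key A'.
x_server = pow(wrapped_x, D, N).to_bytes(16, "big")

# Step 5: both sides now use X for fast symmetric encryption.
request = stream_xor(x_browser, b"GET /account")
print(stream_xor(x_server, request))
```

Only the 16-byte key goes through the slow asymmetric step; the actual traffic uses the fast symmetric cipher.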

To sum up: for security and efficiency, HTTPS uses both asymmetric encryption and symmetric encryption:

  • Symmetric encryption: the data is encrypted with a key created by the client, and the client has to tell the server this key before sending requests
  • Asymmetric encryption: the public and private keys of the certificate are used to encrypt and decrypt that key respectively, so that the secret key is not stolen in transit
  • All subsequent data transfers encrypt the data symmetrically with that key, and HTTP takes care of the transmission

There is still a security loophole – a man-in-the-middle swap of the public key

  1. The hacker intercepts the public key sent by the server
  2. The hacker sends a forged public key to the client instead
  3. Unable to identify the source of the public key, the client uses the forged public key to encrypt the secret key
  4. The hacker intercepts the encrypted key sent by the client and decrypts it with his own private key to obtain the secret key (key step)
  5. The hacker encrypts the secret key with the intercepted (real) public key and sends it to the server

In this way, the middleman swaps the public key from the server and gets the key X without either party noticing anything unusual. The root cause is that the browser cannot confirm that the received public key is the site’s own because the public key itself is transmitted in plaintext
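The swap can be replayed with toy numbers to see why neither side notices. The sketch below assumes unpadded textbook RSA with tiny illustrative primes and a small integer standing in for key X; `make_keypair` and `rsa` are names made up for this example.

```python
def make_keypair(p: int, q: int, e: int = 17):
    """Return ((e, n), (d, n)): a toy public key and its private key."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))      # Python 3.8+
    return (e, n), (d, n)


def rsa(value: int, key) -> int:
    """Apply a toy RSA key (public or private) to a number."""
    exponent, n = key
    return pow(value, exponent, n)


server_pub, server_priv = make_keypair(61, 53)
attacker_pub, attacker_priv = make_keypair(89, 97)

# Steps 1-2: the attacker intercepts the server's key and forwards a forged one.
key_seen_by_client = attacker_pub

# Step 3: the client cannot tell the difference and encrypts key X with it.
x = 1234
sent_by_client = rsa(x, key_seen_by_client)

# Step 4: the attacker decrypts with his own private key and learns X.
stolen_x = rsa(sent_by_client, attacker_priv)

# Step 5: the attacker re-encrypts X with the real public key for the server.
x_at_server = rsa(rsa(stolen_x, server_pub), server_priv)

print(stolen_x == x, x_at_server == x)
```

The server ends up with the correct X and the client sees a successful exchange, yet the attacker holds X as well.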

So how do you prove that the public key received by the browser must be the public key of the site?

This is the same problem as proving that the ID card Xiaoming shows really is Xiaoming's and not a forgery. The answer is to check that the card carries a seal issued by a government body that everyone recognizes. Similarly, how can we issue an "ID card" to a website? Could there be an institution that acts as a "notary" for the Internet world?

Yes: the CA (certificate authority). CAs are a precondition for the normal operation of today's Internet, and the "ID card" a CA issues is the digital certificate.

The digital certificate

Before using HTTPS, a website needs to apply for a digital certificate from the CA. The digital certificate contains information about the certificate owner, public key, server domain name, authority, certificate content signed by the CA private key, and signature calculation method. The server sends the certificate to the browser, which gets the public key from the certificate. The certificate, like the ID card, proves that “the public key matches the web site.” But there is a new problem, how to prevent the certificate itself from being tampered with in the process of transmission?

How can I prevent the digital certificate from being tampered with?

We generate a "signature" from the original content of the certificate. By comparing the content of the certificate with its signature, we can determine whether the certificate has been tampered with. This is the "anti-counterfeiting technology" of the digital certificate, and the "signature" here is called a digital signature.

What is a digital signature?

The process of making a digital signature is shown on the left and the verification process is shown on the right

Digital signature production process:

  1. The CA has a private key and a public key for asymmetric encryption.
  2. The CA hashes the certificate's plaintext data T with a hash algorithm to get a hash value.
  3. The hash value is encrypted with the private key to produce the digital signature S.

The plaintext T and the digital signature S together form a digital certificate, so that a digital certificate can be issued to a web site. How does the browser take the digital certificate from the server and verify that it is real? (Whether it has been tampered with or replaced)

Browser verification process:

  1. Get the certificate, and from it the plaintext T and the signature S.
  2. Decrypt S with the CA's public key to get S' (the CA is an authority the browser trusts, so the browser keeps its public key; see details below).
  3. Hash the plaintext T with the hash algorithm specified in the certificate to get T'.

After the steps above:

  • If S' is equal to T', the certificate is trustworthy;
  • If S' is not equal to T', the certificate may have been tampered with and is not trustworthy.
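Signing and verification can be sketched as a pair of small functions. The assumptions are loud ones: the CA key pair is unpadded textbook RSA built from two known Mersenne primes, SHA-256 stands in for "the hash algorithm specified in the certificate", and the certificate text and all names (`CA_E`, `CA_D`, `sign`, `verify`) are hypothetical. Real certificates use padded signature schemes.

```python
import hashlib

# Toy CA key pair (illustrative values only).
CA_P, CA_Q = 2**521 - 1, 2**607 - 1
CA_N = CA_P * CA_Q
CA_E = 65537                                        # CA public exponent
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))       # CA private exponent (Python 3.8+)


def digest(text: bytes) -> int:
    """Hash the certificate plaintext T and read the result as a number."""
    return int.from_bytes(hashlib.sha256(text).digest(), "big")


def sign(text: bytes) -> int:
    """CA side: encrypt the hash of T with the private key to get S."""
    return pow(digest(text), CA_D, CA_N)


def verify(text: bytes, signature: int) -> bool:
    """Browser side: S' (S decrypted with the CA public key) must equal T' (hash of T)."""
    s_prime = pow(signature, CA_E, CA_N)
    t_prime = digest(text)
    return s_prime == t_prime


t = b"domain=www.example.com;public_key=A"
s = sign(t)

print(verify(t, s))                                           # untouched certificate
print(verify(b"domain=www.example.com;public_key=EVIL", s))   # modified plaintext
```

Changing even one byte of T changes T', and without `CA_D` the attacker cannot produce a matching S.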

Why does this guarantee that the certificate is trusted?

How can you tell if the certificate has been tampered with by digital signature?

If the middleman tampers with the original text of the certificate, the corresponding hash value changes, but he cannot adjust the signature S to match because he does not have the CA's private key. After receiving the certificate, the browser finds that the hash of the original text is inconsistent with the value decrypted from the signature. This means the certificate has been tampered with and cannot be trusted, so the browser stops sending information to the server, which prevents the information from leaking to the middleman.

Since tampering is impossible, what if the entire certificate was swapped?

Is it possible for the middleman to swap the certificate? Suppose there is another website B that also holds a certificate issued by a CA and wants to hijack website A's information. It acts as a middleman, intercepts the certificate that A sends to the browser, replaces it with its own certificate, and sends that to the browser, so the browser mistakenly takes the public key from B's certificate. Would this lead to the "man-in-the-middle" vulnerability described above?

This doesn’t happen because the certificate contains website A’s information, including the domain name, and the browser compares the domain name in the certificate with the requested domain name to see if it has been switched.
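A minimal sketch of that comparison follows. The function name `domain_matches` and the domain names are hypothetical, and real browsers apply more elaborate rules (wildcards, multiple names per certificate) that are left out here.

```python
def domain_matches(cert_domain: str, requested_domain: str) -> bool:
    """Compare the domain in the certificate with the one the browser asked for."""
    def normalize(name: str) -> str:
        return name.strip().lower().rstrip(".")
    return normalize(cert_domain) == normalize(requested_domain)


requested = "www.site-a.example"

# Certificate really issued to site A: accepted.
print(domain_matches("www.site-a.example", requested))

# Site B's certificate is validly signed by a CA, but it names site B: rejected.
print(domain_matches("www.site-b.example", requested))
```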

Why is the certificate content hashed when making a digital signature?

The hash step may seem redundant: even without it, the signature would still ensure that the certificate has not been tampered with.

The most obvious reason is performance. We have already said that asymmetric encryption is inefficient, and certificate information is usually long, so encrypting it directly would be time-consuming. Hashing yields a fixed-length value (for example, an MD5 hash is always 128 bits), which makes encryption and decryption much faster.

There are also security reasons, but that material is fairly deep; interested readers can look into it on their own.
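The fixed-length property is easy to check with Python's standard `hashlib` module. MD5 is used only because the text mentions it; the inputs are arbitrary.

```python
import hashlib

short_input = b"a"
long_input = b"a" * 1_000_000          # one million bytes

short_hash = hashlib.md5(short_input).digest()
long_hash = hashlib.md5(long_input).digest()

# Both digests are 16 bytes = 128 bits, regardless of input size.
print(len(short_hash) * 8, len(long_hash) * 8)
```

Whatever the size of the certificate, the asymmetric operation only ever has to process this short value.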

How to prove that the public key of a CA organization is trusted?

You may have noticed that when I mentioned the CA's public key, I said in passing that "the browser keeps its public key." How does it keep it? And how can that public key be proved trustworthy?

Let's recall what digital certificates are for: proving that a public key is trustworthy, i.e. that "this public key really belongs to this website". Can the CA's public key also be proved by a digital certificate? Yes. Operating systems and browsers come with the root certificates of the CAs they trust pre-installed; if a CA's root certificate is present, its trusted public key can be taken from it.
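Python's standard `ssl` module gives a view of this from a client's perspective. As a small sketch, `ssl.create_default_context()` builds a client context that relies on the system's trusted CA certificates and has certificate and host name checks switched on; no network connection is made here.

```python
import ssl

# Client-side context that uses the platform's default trusted CA certificates.
context = ssl.create_default_context()

# The server must present a certificate that chains to a trusted CA...
print(context.verify_mode == ssl.CERT_REQUIRED)

# ...and the name in that certificate must match the host being contacted.
print(context.check_hostname)
```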

In fact, there can be more than one layer of certification between certificates: A can trust B, B can trust C, and so on. We call this a trust chain or digital certificate chain: a series of digital certificates that starts from the root certificate and passes trust down layer by layer, so that the holder of the end-entity certificate is trusted and can prove its identity.
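The chain-walking part of this idea can be sketched as below. It is deliberately simplified: certificates are reduced to a "subject was issued by issuer" mapping, the names are hypothetical, and the signature check that a real client performs at every link is omitted.

```python
from typing import Dict, Set


def chain_is_trusted(subject: str, issued_by: Dict[str, str], trusted_roots: Set[str]) -> bool:
    """Follow issuer links upward until a pre-installed root is reached."""
    seen = set()
    current = subject
    while current not in trusted_roots:
        if current in seen or current not in issued_by:
            return False               # loop, or a chain that ends at an unknown issuer
        seen.add(current)
        current = issued_by[current]
    return True


issued_by = {
    "www.example.com": "Example Intermediate CA",
    "Example Intermediate CA": "Example Root CA",
    "www.unknown.example": "Unknown CA",
}
trusted_roots = {"Example Root CA"}    # stands in for the pre-installed root certificates

print(chain_is_trusted("www.example.com", issued_by, trusted_roots))
print(chain_is_trusted("www.unknown.example", issued_by, trusted_roots))
```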

You may also have run into a website that cannot be accessed and prompts you to install a certificate. What gets installed in that case is a root certificate. The browser does not recognize the authority that issued the site's certificate, so you have to download and install that authority's root certificate manually (at your own risk). Once it is installed, you have the authority's public key, which can be used to verify that the certificate sent by the server is trustworthy.

To sum up: the specific process of HTTPS handshake

First, agree in advance on the algorithm used for symmetric encryption and decryption

  1. The client sends the server the set of symmetric encryption and decryption algorithms it supports
  2. The server picks one symmetric encryption and decryption algorithm from the set and sends its choice back to the client, so that afterwards the client and the server can communicate using that symmetric encryption algorithm
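The selection step amounts to picking a common entry from two lists. In this sketch the function name `choose_algorithm` is invented, the algorithm names are just labels, and letting the server's preference order decide is an assumption of the example.

```python
from typing import List, Optional


def choose_algorithm(client_supported: List[str], server_preferred: List[str]) -> Optional[str]:
    """Return the server's most preferred algorithm that the client also supports."""
    for algorithm in server_preferred:
        if algorithm in client_supported:
            return algorithm
    return None                        # no overlap: the handshake cannot continue


client_supported = ["3DES", "AES-128", "AES-256"]
server_preferred = ["AES-256", "AES-128"]

print(choose_algorithm(client_supported, server_preferred))
```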

Second, exchange the certificate used for asymmetric encryption

  1. The server then sends its digital certificate to the client and keeps the private key. (The certificate includes the server domain name, the public key, and the certificate validity period.)
  2. The client parses the certificate. The checks include: comparing the domain name in the certificate with the domain name it requested (prevents the certificate from being swapped); comparing the hash value of the certificate's plaintext data with the value obtained by decrypting the digital signature (prevents the certificate information from being tampered with); the expiration time; and so on. If a problem is found, a warning box pops up indicating that there is a problem with the certificate.

Third, generate the secret key and transmit it with asymmetric encryption

  1. If the certificate is fine, the client generates the key (a random string), encrypts it with the public key provided in the certificate, and sends it to the server
  2. After receiving it, the server also checks the certificate: it compares the domain name in the certificate with its own domain name, and compares the hash value of the certificate's plaintext data with the value obtained by decrypting the digital signature, to guard against the certificate having been tampered with or swapped. It then decrypts with its private key to obtain the secret key

The key is generated randomly so that it is unique and unpredictable (for security)

Fourth, confirm that both parties can send and receive

  1. The client encrypts a message with the session key and sends it to the server, mainly to verify that the server can correctly receive messages encrypted by the client.
  2. The server likewise encrypts a message with the session key and sends it back to the client. If the client can receive it correctly, the TLS layer connection has been established.

Fifth, the handshake is complete and data is transmitted with symmetric encryption

  1. Sending data: before sending, the client and the server encrypt the plaintext with the agreed symmetric encryption algorithm and the agreed key to obtain the ciphertext
  2. Receiving data: on receipt, both parties decrypt the ciphertext with the agreed symmetric decryption algorithm and the agreed key to obtain the plaintext

Must the key-transfer handshake be performed at the SSL/TLS layer for every HTTPS request?

  1. The server maintains a session for each browser (or client in general). During the SSL handshake, after receiving the key from the client, the server saves the key in the corresponding session
  2. It generates the corresponding session ID and sends it to the client
  3. After that, every request from the browser carries the session ID
  4. The server uses the session ID to find the corresponding key and uses it to decrypt and encrypt the data, so the key does not have to be re-created and transferred every time data is transmitted (asymmetric encryption is time-consuming).
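On the server side this is essentially a lookup table from session ID to key. The class below is a hypothetical in-memory sketch (`SessionStore` and its methods are invented names); a real server would also expire old sessions.

```python
import secrets
from typing import Dict, Optional


class SessionStore:
    """Maps session IDs to the symmetric keys agreed during the handshake."""

    def __init__(self) -> None:
        self._keys: Dict[str, bytes] = {}

    def save(self, key: bytes) -> str:
        """Store the key after the handshake and return a new session ID for the client."""
        session_id = secrets.token_hex(16)
        self._keys[session_id] = key
        return session_id

    def lookup(self, session_id: str) -> Optional[bytes]:
        """Find the key for a later request; None means a full handshake is needed."""
        return self._keys.get(session_id)


store = SessionStore()
session_key = secrets.token_bytes(16)          # key received during the handshake
session_id = store.save(session_key)           # sent back to the browser

# A later request carries the session ID, so the key is found without a new handshake.
print(store.lookup(session_id) == session_key)
print(store.lookup("unknown-session-id"))
```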

Application and Summary

HTTPS ensures data security, authentication, and data integrity, but it can also cause some problems:

  • Cost: certificate fee
  • Time: Compared with HTTP, HTTPS handshake is time-consuming, which affects the response speed of the website and may affect user experience

So we used a divide-and-conquer strategy:

  • Content that users merely browse is transmitted using HTTP
  • User information and monetary amounts are transmitted using HTTPS

Review questions

  1. Why can’t we just use asymmetric encryption?
  2. Why use symmetric + asymmetric encryption?
  3. Why a digital certificate?
  4. Why a digital signature?

References

  • HTTP and HTTPS protocols, just read one article
  • Https principle and process
  • Thoroughly understand HTTPS encryption
  • HTTPS handshake process (encryption and decryption process)