This article was written by Zongwei, a member of our team. Doodle Front has been granted exclusive rights to it, including but not limited to editing and marking it as original.
Preface
If you still don’t understand HTTPS after reading this, feel free to add me on WeChat and scold me for cheating and toying with your feelings. (Many of the details come from a friend who works on security at Ant Financial.) ~
The way HTTPS evolved is rather frightening to think through, and there are many ideas worth learning from it. I’m going to keep my distance from friends who work in security.
This article takes you deep into the principles of HTTPS encryption and decryption. After reading it, I hope you will:
- See what problem HTTPS solves
- Understand the principles and application scenarios of symmetric encryption and asymmetric encryption
- Understand the role of CA institutions and root certificates
Why HTTPS
In recent years, companies have been pushing ahead with HTTPS. Google Chrome labels non-HTTPS websites as “Not secure”, Apple requires apps to use HTTPS for communication, and WeChat mini programs also require HTTPS. So why do we have to do this?
Let’s look at HTTP first. The Hypertext Transfer Protocol (HTTP) is an application-layer protocol for distributed, collaborative, hypermedia information systems. It is fair to say that HTTP is the foundation of contemporary Internet communication.
However, HTTP has a fatal defect: the content is transmitted in plaintext, without any encryption. This plaintext data passes through WiFi access points, routers, carriers, data centers, and other physical nodes, and if any node along the way is monitored, the transmitted content is completely exposed. Eavesdropping on or tampering with traffic from such an intermediate node is called a man-in-the-middle (MITM) attack.
Here is an example. It’s a little long, but it shows why I’m so obsessed with security.
Think of passing notes in class when you were a child. You sit by the wall and want to pass the message “I will wait for you at the playground after school tonight” to Xiao Hong, who sits by the window, through six or seven classmates. Even if you fold the note in half, it’s easy for anyone in between to open it and see what you’re trying to say.
Just reading it would be harmless enough. But suppose Xiao Gang, who also likes Xiao Hong, sees that the two of you are about to go home happily together. Unwilling to accept this, he swaps in a different note that says **“You go home by yourself after school tonight, I’m going to the Internet cafe to play games”**.
Xiao Hong sees that you are ditching her to play games and is very sad, so she writes back on the note: **“We agreed to go home together. Why do you want to go play games? Hmph.”**
On the way back, Xiao Gang changes Xiao Hong’s note to **“You go play your games, I’m going home with Xiao Gang”**.
So you and Xiao Hong both end up heartbroken, Xiao Gang steals your love, and you are left completely confused.
Recall the carrier hijacking that was everywhere a few years ago. You would visit a perfectly normal web page, yet the page would inexplicably contain extra tags, redirect scripts, and deceptive red-button ads. Sometimes you would even download a file and end up with something completely different. These are all symptoms of HTTP plaintext data being hijacked by carriers.
Carrier hijacking
Security training for employees at major companies also includes the rule “Do not connect to unfamiliar WiFi”, and for a similar reason: whoever controls a malicious WiFi network can see and tamper with information transmitted in plaintext over HTTP.
In 1994, Netscape proposed HyperText Transfer Protocol Secure (HTTPS) to solve the security problems caused by HTTP plaintext data transmission. Data communication is still HTTP, but SSL/TLS is used to encrypt data packets.
HTTPS Implementation Principle
HTTPS encrypts HTTP packets using SSL/TLS for transmission.
Secure Sockets Layer (SSL) and Transport Layer Security (TLS) are essentially the same protocol family: TLS is the standardized successor to SSL.
When Netscape proposed the HTTPS protocol in 1994, it used SSL for encryption. Later, the Internet Engineering Task Force (IETF) standardized SSL further and published the first TLS specification, TLS 1.0, in 1999. The latest version of the protocol is TLS 1.3, published in 2018.
The working process
Let’s take a look at HTTPS encryption and decryption process first.
HTTPS encryption and decryption process
- When a user initiates an HTTPS request in a browser (for example, to juejin.im/), the connection is made to port 443 of the server by default.
- HTTPS requires the server to be configured with a digital certificate issued by a CA. The certificate carries the public key Pub, and the corresponding private key Private is kept secret on the server.
- After receiving the request, the server returns the configured certificate containing the public key Pub to the client.
- The client receives the certificate and checks its validity: whether it is within its validity period, whether it matches the requested domain name, and whether the certificate one level up is valid (checked recursively until reaching a root certificate built into the system or configured in the browser). If the check fails, the browser shows an HTTPS warning; if it passes, the process continues.
- The client generates a random key for symmetric encryption, encrypts it with the public key Pub from the certificate, and sends it to the server.
- The server receives the ciphertext of the random key and decrypts it with the private key Private paired with Pub, obtaining the random key the client actually sent.
- The server uses the random key sent by the client to symmetrically encrypt the HTTP data to be transmitted and returns the ciphertext to the client.
- The client uses the random key to symmetrically decrypt the ciphertext and obtain the plaintext HTTP data.
- Subsequent HTTPS requests use the previously exchanged random key for symmetric encryption and decryption.
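As a small illustration of the port and certificate checks in the steps above, the sketch below inspects the defaults that Python's standard `ssl` and `http.client` modules start from. It makes no network connection, and the helper name `describe_client_defaults` is made up for this example; it says nothing about how any particular browser is implemented.

```python
import http.client
import ssl


def describe_client_defaults() -> dict:
    """Report the defaults a Python HTTPS client starts from (no network I/O)."""
    # create_default_context() loads the system's trusted root certificates
    # and turns on certificate and hostname verification.
    context = ssl.create_default_context()
    return {
        "default_port": http.client.HTTPS_PORT,
        "certificate_required": context.verify_mode == ssl.CERT_REQUIRED,
        "hostname_checked": context.check_hostname,
    }


if __name__ == "__main__":
    print(describe_client_defaults())
```

A client built on this context refuses to continue when the certificate is expired, does not match the host name, or does not chain to a trusted root, which corresponds to the warning case in the list.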
Symmetric and asymmetric encryption
Symmetric encryption and asymmetric encryption, a public key, a private key, and a random key: why is it so complicated? Wouldn’t a single scheme be enough?
Symmetric encryption means that a single key is used to encrypt a piece of plaintext, and only that same key can decrypt the result. If both sides of the communication hold the key, and absolutely nobody else knows it, then the communication is secure (provided the key is strong enough).
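To make the "one key for both directions" idea concrete, here is a minimal sketch. It is a toy cipher that XORs the data with a keystream derived from the key using SHA-256, chosen only because it fits in a few lines of standard-library Python; it is an assumption of this example, not what HTTPS uses, and it must not be used to protect real data. The function names are made up.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    blocks = bytearray()
    counter = 0
    while len(blocks) < length:
        blocks += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(blocks[:length])


def symmetric_crypt(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


if __name__ == "__main__":
    shared_key = b"known only to both sides"
    ciphertext = symmetric_crypt(shared_key, b"I will wait for you at the playground")
    print(ciphertext.hex())
    print(symmetric_crypt(shared_key, ciphertext).decode())
```

Anyone holding `shared_key` can both read and forge messages, which is why the question of how the key reaches the other side matters so much.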
In the HTTPS scenario, however, the server does not know in advance who its clients will be, and you cannot quietly agree on a communication key with every website administrator beforehand without going over the Internet. The key therefore has to be transmitted over the Internet at some point, and if that transmission is intercepted, all subsequent encryption is useless.
This is where we need another magical type of encryption, asymmetric encryption.
Asymmetric encryption has two keys, one public key and the other private key. Generally, the public key is used for encryption, while the ciphertext can only be unlocked with the private key.
Then, when the client initiates a connection request, the server transmits the public key, the client encrypts the information with the public key, and then sends the ciphertext to the server, which has the private key to decrypt.
But a problem appears when the server needs to return data. If it encrypts with the public key, the client has no private key to decrypt it. If it encrypts with the private key, the client can decrypt with the public key, but that public key was previously transmitted over the Internet and may well have been captured by someone else, so this is not secure either. A single asymmetric encryption process is therefore not enough.
Note that, strictly speaking, the private key should not be used for encryption, only for signing. When a key pair is generated, the mathematical requirements on the different variables are not the same, so the public and private keys differ in their resistance to attack and are not interchangeable in practice. The signing function is also useful in HTTPS, as described below.
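The one-way nature of a key pair can be sketched with textbook RSA on tiny numbers. This is only an illustration under stated assumptions: the primes 61 and 53 are far too small to be secure, no padding is applied, and the function names are made up for this example. It needs Python 3.8 or later for the three-argument `pow` with a negative exponent.

```python
# Toy RSA key pair built from two tiny primes (real keys use far larger primes).
P, Q = 61, 53
N = P * Q                          # modulus, shared by both keys
E = 17                             # public exponent  -> public key is (E, N)
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent -> private key is (D, N)


def encrypt_with_public_key(message: int) -> int:
    """Anyone who has the public key can do this."""
    return pow(message, E, N)


def decrypt_with_private_key(ciphertext: int) -> int:
    """Only the holder of the private key can undo it."""
    return pow(ciphertext, D, N)


if __name__ == "__main__":
    secret = 65                    # any integer smaller than N
    ciphertext = encrypt_with_public_key(secret)
    print(ciphertext)
    print(decrypt_with_private_key(ciphertext))
```

Knowing `E` and `N` is enough to encrypt, but recovering `D` requires factoring `N`, which is what makes publishing the public key safe when the primes are large.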
Only one set of public and private keys can guarantee one-way encryption and decryption, so if we prepare two sets of public and private keys, can we solve this problem? Let’s look at this process.
- The server has an asymmetric key pair: public key A1 and private key A2.
- The client has an asymmetric key pair: public key B1 and private key B2.
- The client sends a request to the server, and the server returns public key A1 to the client.
- The client receives public key A1 and sends its own public key B1 to the server.
- After that, all data sent by the server to the client is encrypted with public key B1, and the client decrypts it with private key B2.
- All data sent from the client to the server is encrypted with public key A1, and the server decrypts it with private key A2.
In this case, the data in both directions is asymmetrically encrypted, which ensures security. So why not use this scheme?
The main reason is that asymmetric encryption and decryption take far longer than symmetric encryption and decryption, which costs a great deal of performance and makes for a poor user experience.
Therefore, we end up with the asymmetric + symmetric encryption scheme introduced above. Let’s review it once more:
- The server has public key A1 and private key A2 for asymmetric encryption.
- The client initiates a request, and the server returns public key A1 to the client.
- The client randomly generates a symmetric encryption key K, encrypts it with public key A1, and sends it to the server.
- After receiving the ciphertext, the server decrypts it with its own private key A2 to obtain the symmetric key K. At this point the symmetric key has been exchanged securely, which solves the problem of the key being stolen in transit.
- After that, the key K is used for symmetric encryption and decryption.
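The review above can be sketched end to end. Everything here is a toy under loud assumptions: the server key pair is textbook RSA built from two publicly known Mersenne primes (so it is trivially breakable), there is no padding, and the symmetric cipher is a SHA-256 keystream XOR rather than a real algorithm. The function names are made up for this example; Python 3.8+ is assumed.

```python
import hashlib
import secrets

# Server's toy RSA key pair: public key A1 = (E, N), private key A2 = (D, N).
# The primes are well-known Mersenne primes, so this key is NOT secure.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))


def symmetric_crypt(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 keystream (encrypts and decrypts)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def client_wrap_key(k: bytes) -> int:
    """Client: encrypt the symmetric key K with the server's public key A1."""
    return pow(int.from_bytes(k, "big"), E, N)


def server_unwrap_key(wrapped: int, key_length: int = 16) -> bytes:
    """Server: recover K with its private key A2."""
    return pow(wrapped, D, N).to_bytes(key_length, "big")


if __name__ == "__main__":
    k = secrets.token_bytes(16)                          # client generates K
    k_on_server = server_unwrap_key(client_wrap_key(k))  # K crosses the network encrypted
    reply = symmetric_crypt(k_on_server, b"HTTP/1.1 200 OK")  # server encrypts with K
    print(symmetric_crypt(k, reply))                     # client decrypts with K
```

The slow asymmetric operation runs once per connection to move 16 bytes, and everything after that uses the cheap symmetric cipher.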
It seems like a perfect solution for both security and performance, but is it really safe?
Certificate Authority (CA)
Consider the man-in-the-middle attack again. Asymmetric encryption algorithms are public, and anyone can generate a public/private key pair of their own.
When the server returns public key A1 to the client, the middleman replaces it with his own public key B1 and sends that to the browser.
The browser knows nothing about this. It encrypts key K with public key B1 and sends it out, where it is intercepted by the middleman. The middleman decrypts it with his private key B2 to obtain K, then encrypts K with the server’s public key A1 and sends it on to the server, completing the communication link without either side noticing.
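The substitution described above can be simulated in a few lines. This is a sketch with toy textbook RSA keys built from publicly known Mersenne primes (insecure by construction, no padding), and all function names are made up for this example; Python 3.8+ is assumed.

```python
import secrets


def make_toy_keypair(p: int, q: int, e: int = 65537):
    """Return (public, private) toy RSA keys as (exponent, modulus) tuples."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (e, n), (d, n)


def rsa_apply(key, value: int) -> int:
    """Raw RSA operation: encrypt with a public key or decrypt with a private key."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


# Server key pair (A1, A2) and the middleman's own key pair (B1, B2).
A1, A2 = make_toy_keypair(2**61 - 1, 2**89 - 1)
B1, B2 = make_toy_keypair(2**107 - 1, 2**127 - 1)


def run_intercepted_exchange(k: int):
    """Simulate the key exchange with a middleman swapping A1 for B1."""
    key_seen_by_client = B1                          # middleman replaced A1 with B1
    to_middleman = rsa_apply(key_seen_by_client, k)  # client encrypts K
    stolen_k = rsa_apply(B2, to_middleman)           # middleman decrypts with B2
    to_server = rsa_apply(A1, stolen_k)              # re-encrypts with the real A1
    server_k = rsa_apply(A2, to_server)              # server decrypts with A2
    return stolen_k, server_k


if __name__ == "__main__":
    k = secrets.randbits(128)
    stolen_k, server_k = run_intercepted_exchange(k)
    print(stolen_k == k, server_k == k)
```

Both the server and the middleman end up holding K, and nothing in the exchange itself tells the client that the key it received was not the server's.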
Man-in-the-middle attack on HTTPS
The core cause of this problem is that the client cannot confirm that the public key it received really comes from the server. To solve this, the Internet introduced a publicly trusted authority, the Certificate Authority (CA).
Before enabling HTTPS, a server must apply to an accredited CA for a digital certificate. The certificate contains information such as the certificate holder, the validity period, and the public key. The server sends the certificate to the client, and the client proceeds with the subsequent encryption steps only after verifying that the identity in the certificate matches the site it is trying to visit.
However, if the middleman is clever and changes only the public key inside the certificate, the client still cannot tell that the certificate has been tampered with, so some anti-counterfeiting technique is needed.
In asymmetric encryption, the public key is generally used for encryption and the private key for decryption. Although encrypting with the private key is theoretically possible, the mathematical design makes it unsuitable. So is the private key only good for decryption?
Besides decryption, the real use of the private key is digital signatures, which are essentially an anti-counterfeiting technique: if anyone tampers with the certificate, verification of the digital signature is bound to fail. The process is as follows:
- The CA has its own public key and private key.
- When issuing a certificate, the CA hashes the plaintext information of the certificate.
- The hash value is signed with the CA private key to obtain a digital signature.
- The plaintext data and the digital signature together form the certificate, which is passed to the client.
- The client receives the certificate and splits it into the plaintext part and the digital signature Sig1.
- The client applies the CA public key to Sig1 to recover the signed value Sig2 (the CA is a publicly trusted body, so its certificate and public key are built into the system or browser).
- The client hashes the plaintext part with the hash algorithm declared in the certificate to obtain T.
- If the hash value T equals Sig2, the certificate is trusted and has not been tampered with.
Because the signature is generated with the CA’s private key, a middleman who tampers with the certificate cannot obtain that private key to produce a matching signature, which guarantees the credibility of the certificate.
One point that can be hard to grasp is how signing works in asymmetric cryptography: the private key signs the message, and the signature is sent to the other side together with the message itself. The receiver verifies the signature with the public key, and if the recovered content matches the message itself, the message has not been tampered with.
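The sign-then-verify steps can be sketched as follows. This is a toy under explicit assumptions: textbook RSA without padding, a CA key pair built from two publicly known Mersenne primes (so it offers no real security), and a made-up certificate string whose fields are purely illustrative. Real certificates use standardized encodings and padded signature schemes. Python 3.8+ is assumed.

```python
import hashlib

# Toy CA key pair. The primes are public Mersenne primes, so this is NOT secure.
P, Q = 2**521 - 1, 2**607 - 1
CA_N = P * Q
CA_E = 65537                                   # CA public key: (CA_E, CA_N)
CA_D = pow(CA_E, -1, (P - 1) * (Q - 1))        # CA private key: (CA_D, CA_N)


def digest(plaintext: bytes) -> int:
    """Hash the certificate plaintext and return the hash as an integer."""
    return int.from_bytes(hashlib.sha256(plaintext).digest(), "big")


def ca_sign(plaintext: bytes) -> int:
    """CA side: sign the hash of the certificate plaintext with the private key."""
    return pow(digest(plaintext), CA_D, CA_N)


def client_verify(plaintext: bytes, signature: int) -> bool:
    """Client side: recover the signed hash with the CA public key and compare."""
    return pow(signature, CA_E, CA_N) == digest(plaintext)


if __name__ == "__main__":
    # Hypothetical certificate content, for illustration only.
    certificate = b"holder=juejin.im; valid_until=2030-01-01; public_key=A1"
    signature = ca_sign(certificate)
    tampered = certificate.replace(b"public_key=A1", b"public_key=B1")
    print(client_verify(certificate, signature))
    print(client_verify(tampered, signature))
```

Swapping the public key inside the certificate changes its hash, and without `CA_D` the middleman cannot produce a signature that matches the new hash.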
In this process, the CA certificates and public keys built into the system or browser are the crucial link, and they also serve as the CA’s public proof of identity. If the system or browser does not include the CA in question, the client will not accept the certificate returned by the server and will display an HTTPS warning.
In fact, CA certificates form a trust chain: A trusts B, and B trusts C. For example, Juejin (juejin.im) applied to RapidSSL for its certificate, and RapidSSL’s identity is in turn certified by the DigiCert Global Root CA, forming a trust chain.
The private keys of CAs at every level are absolutely confidential. Once a CA’s private key is leaked, its credibility is destroyed. There have been several incidents in which CA private keys were leaked, causing a crisis of trust, and major systems and browsers had to revoke the built-in root certificates of the CAs involved.
Some old websites require users to download and install the site’s own root certificate before use. This is because the certificate used by such a website cannot form a trust chain up to any root certificate built into the system, so users have to install the site’s root certificate themselves to complete the chain.
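The idea of walking a trust chain up to a built-in root can be sketched with a plain dictionary. This is a deliberately simplified model: it only follows "issued by" links and skips the signature, validity-period, and revocation checks that a real client performs at every level. The `old-site.example` entry and its private root are hypothetical, added only to illustrate the last paragraph above.

```python
# Who issued each certificate. The last entry is a hypothetical self-run CA.
ISSUED_BY = {
    "juejin.im": "RapidSSL",
    "RapidSSL": "DigiCert Global Root CA",
    "old-site.example": "Old Site Private Root",
}


def is_trusted(subject: str, trusted_roots: set, max_depth: int = 10) -> bool:
    """Follow issuer links until reaching a trusted root, or give up."""
    current = subject
    for _ in range(max_depth):
        if current in trusted_roots:
            return True            # reached a root the system already trusts
        if current not in ISSUED_BY:
            return False           # chain ends at an unknown issuer
        current = ISSUED_BY[current]
    return False                   # chain too long, treat as untrusted


if __name__ == "__main__":
    built_in_roots = {"DigiCert Global Root CA"}
    print(is_trusted("juejin.im", built_in_roots))
    print(is_trusted("old-site.example", built_in_roots))
    # After the user installs the site's own root certificate:
    print(is_trusted("old-site.example", built_in_roots | {"Old Site Private Root"}))
```

Installing a root certificate amounts to adding an entry to `trusted_roots`, which is also why doing so for an untrustworthy party is dangerous.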
Conclusion
The purpose of HTTPS is to solve the problem of information tampering and monitoring during HTTP plaintext transmission.
- To balance performance and security, a combination of asymmetric and symmetric encryption is used.
- To ensure that the public key is not tampered with in transit, the digital signature function of asymmetric cryptography is used, and the public trust in HTTPS certificates is guaranteed by CAs and the system’s root certificate mechanism.
If you liked this article, please follow and give it a like. Thank you 💕😊