This is the 10th day of my participation in Gwen Challenge
The basics of HTTPS security
HTTPS is more secure than plain text HTTP. Security is defined by confidentiality, integrity and identity verification.
Confidentiality — symmetric/asymmetric/mixed encryption
To keep information secure in transit, it must be encrypted with an encryption algorithm. Both encryption and decryption rely on a key, so the crux is how to transfer that key safely in the first place, a problem known as “key exchange.”
Symmetric encryption:
Encryption and decryption use the same key. But how to transfer that key at the very beginning becomes a problem: if a hacker intercepts the key, they can read and tamper with the information, so its security cannot be guaranteed.
Asymmetric encryption:
Encrypt with the public key, decrypt with the private key. Assuming that the client is the sender and the server is the receiver, the server generates a pair of public and private keys. The private key is kept by the server, and the public key is sent to the client. The client encrypts the information with the public key each time and sends it to the server, which then decrypts it with the private key. Even if a hacker intercepts the public key, it does not matter much, because the information can only be decrypted with the private key.
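The public/private key relationship can be sketched with textbook RSA. This is a minimal illustration, not how a real server generates keys: the tiny primes 61 and 53, the exponent 17, and the function names are all assumptions chosen so the numbers stay readable, and real RSA also adds padding.

```python
# Textbook RSA with tiny primes: illustration only, never use for real data.
p, q = 61, 53                  # secret primes chosen by the server (assumed values)
n = p * q                      # modulus shared by both keys: 3233
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent via modular inverse (Python 3.8+)

public_key = (e, n)            # sent to the client
private_key = (d, n)           # never leaves the server


def encrypt(message, key):
    """Client side: encrypt an integer message (< n) with the public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext, key):
    """Server side: recover the message with the private key."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


message = 65
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, private_key)
print(ciphertext, recovered)
```

Knowing only `public_key`, an eavesdropper would have to factor `n` to find `d`; with real key sizes that is considered infeasible.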
Hybrid encryption:
Asymmetric encryption solves the security problem of key exchange, but its encryption and decryption are time-consuming, so using it for every message slows down communication for both parties. Hybrid encryption combines symmetric and asymmetric encryption:
- The server generates a pair of public and private keys based on an asymmetric encryption algorithm and sends the public key to the client
- The client generates a session key for subsequent communication based on a symmetric encryption algorithm, encrypts the information to be sent with the session key, and encrypts the session key itself with the public key. In other words, the client sends two things to the server:
  - The session key, encrypted with the public key
  - The message, encrypted with the session key
- After the server receives them, it first decrypts the session key with the private key, and then decrypts the message with the session key to get the actual information.
After that, the problem of securely passing the key (session key) is actually solved. In each subsequent communication, both ends only need to use the same session key for encryption and decryption, thus ensuring security and performance.
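The hybrid flow above can be sketched as follows. Everything here is a toy under stated assumptions: the RSA key is built from two publicly known Mersenne primes (so it offers no security), the symmetric cipher is a SHA-256 keystream XOR standing in for something like AES, and `client_send` / `server_receive` are hypothetical names.

```python
import hashlib
import secrets

# Toy RSA key pair for the server. ASSUMPTION: publicly known Mersenne primes,
# used only so the example runs without third-party libraries.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (Python 3.8+)


def keystream_xor(key, data):
    """Toy symmetric cipher: XOR data with a SHA-256 based keystream.
    The same call both encrypts and decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def client_send(message, server_public):
    """Client: make a session key, wrap it with the public key, encrypt the message."""
    e, n = server_public
    session_key = secrets.token_bytes(16)
    wrapped_key = pow(int.from_bytes(session_key, "big"), e, n)
    ciphertext = keystream_xor(session_key, message)
    return wrapped_key, ciphertext


def server_receive(wrapped_key, ciphertext):
    """Server: unwrap the session key with the private key, then decrypt the message."""
    session_key = pow(wrapped_key, D, N).to_bytes(16, "big")
    return keystream_xor(session_key, ciphertext)


wrapped, encrypted = client_send(b"GET / HTTP/1.1", (E, N))
print(server_receive(wrapped, encrypted))
```

Only the 16-byte session key goes through the slow asymmetric step; the message itself, however long, uses the fast symmetric one.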
Integrity and authentication — digest algorithms, digital signatures, digital certificates
So far only confidentiality has been addressed, and two problems remain. The first is authentication. The server’s public key is freely distributed, so a hacker holding it can impersonate the client; conversely, a hacker can impersonate the server and distribute their own public key. For both the client and the server, verifying the identity of the other party becomes a problem. The second is data integrity. Since hackers can impersonate clients or servers, they can hijack and tamper with data, so ensuring data integrity also becomes a problem.
Digital Signature:
A digital signature uses the key pair in the reverse direction from asymmetric encryption: the private key encrypts (signs) and the public key decrypts (verifies). Using a digest algorithm, the server hashes the data to be transmitted to obtain a digest of the data, encrypts the digest with the private key to form a signature, and sends the signature and the data to the client. After the client receives the signature and data:
- Solving the authentication problem: the client first verifies the signature with the public key it obtained earlier (that is, decrypts the signature) to get the digest. The public key and private key are a pair, so if the public key sent by the server can decrypt the signature, the signature must have been produced with the server’s private key, and authentication succeeds.
- Solving the data integrity problem: the client then hashes the received data itself to get another digest and compares it with the digest from the signature. If they match, the data is intact and has not been tampered with, because any tampering (even a slight one) would make the client’s hash of the data completely different from the digest in the signature.
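The sign-then-verify steps can be sketched like this. It is a toy under assumptions: the RSA key is built from publicly known Mersenne primes, the digest is reduced modulo `N` only because this toy modulus is smaller than a SHA-256 digest, and real signature schemes add padding. The names `sign` and `verify` are illustrative.

```python
import hashlib

# Toy RSA key pair for the server (ASSUMPTION: public Mersenne primes, no security).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))


def digest_as_int(data):
    """Hash the data with SHA-256 and map the digest into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data):
    """Server: 'encrypt' the digest with the private key to form the signature."""
    return pow(digest_as_int(data), D, N)


def verify(data, signature):
    """Client: 'decrypt' the signature with the public key and compare digests."""
    return pow(signature, E, N) == digest_as_int(data)


data = b"amount=100"
signature = sign(data)
print(verify(data, signature))            # genuine data
print(verify(b"amount=900", signature))   # tampered data
```

A successful check covers both points at once: only the private-key holder could have produced the signature, and the data still hashes to the signed digest.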
Digital Certificates:
But all of this assumes that the public key held by the client really is the server’s public key. The public key is initially distributed freely by the server, so a hacker may have tampered with it in transit and sent their own public key to the client instead. The client has no way of knowing, and may end up communicating with the hacker rather than the server: it believes it received the server’s public key, and that key does decrypt what the other side encrypted with the matching private key, so nothing looks suspicious.
To solve this, we need a trusted third party, a certificate authority (CA), and distribute the server’s public key through it (rather than directly from the server) to ensure that the public key the client gets really belongs to the server. Specifically, the server first submits its information to the CA to apply for a certificate, which contains:
- Basic information: CA information, server information, server public key, and validity period
- Signature: the CA computes a digest of the above information and encrypts the digest with the CA’s private key to obtain a signature.
The certificate is now signed. The CA sends the generated certificate to the server, and the server sends the certificate to the client. The client verifies the certificate’s signature using the public key obtained from the CA. Once the certificate passes, the client accepts that the certificate belongs to the server and can use the public key it carries. At this point, the problem of ensuring that the public key held by the client is indeed the server’s public key is solved, and the rest of the flow is the same as before. So how do you ensure that a CA can be trusted? Small CAs are signed and authenticated by larger CAs, forming a trust chain (certificate chain). At the end of the chain is a self-signed root certificate, which we must trust unconditionally.
As mentioned above, the server needs to send the certificate to the client, so could a hacker tamper with the certificate in the process?
- For example, might a hacker tamper with the basic information of a certificate? No. If the basic information is tampered with, the client will find that its own hash of the information does not match the digest obtained by decrypting the signature with the CA’s public key, so the certificate and the public key it carries are rejected.
- So, can the hacker tamper with the basic information of the certificate, regenerate the corresponding digest, and then encrypt that digest to form a new signature? This is also impossible, because the hacker does not have the CA’s private key and so cannot produce a valid signature.
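Issuing and checking a certificate can be sketched as a signature over the certificate’s basic information. This is a toy model under assumptions: the certificate is a plain dict serialized with JSON rather than X.509, the CA key uses publicly known Mersenne primes, and the field names and `issue_certificate` / `verify_certificate` are hypothetical.

```python
import hashlib
import json

# Toy CA key pair (ASSUMPTION: public Mersenne primes, illustration only).
CA_P, CA_Q = 2**107 - 1, 2**127 - 1
CA_N = CA_P * CA_Q
CA_E = 65537
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))   # CA private exponent


def info_digest(info, modulus):
    """Digest of the certificate's basic information, in canonical JSON form."""
    canonical = json.dumps(info, sort_keys=True).encode()
    return int.from_bytes(hashlib.sha256(canonical).digest(), "big") % modulus


def issue_certificate(info):
    """CA: sign the digest of the basic information with the CA private key."""
    return {"info": info, "signature": pow(info_digest(info, CA_N), CA_D, CA_N)}


def verify_certificate(cert, ca_public):
    """Client: decrypt the signature with the CA public key and compare digests."""
    e, n = ca_public
    return pow(cert["signature"], e, n) == info_digest(cert["info"], n)


cert = issue_certificate({
    "issuer": "Example CA",              # hypothetical values
    "subject": "example.com",
    "public_key": "server-public-key",
    "valid_until": "2030-01-01",
})

# A hacker swaps in their own public key but cannot re-sign without CA_D.
forged = {"info": dict(cert["info"], public_key="attacker-public-key"),
          "signature": cert["signature"]}

print(verify_certificate(cert, (CA_E, CA_N)))
print(verify_certificate(forged, (CA_E, CA_N)))
```

In a certificate chain the same check repeats upward: each certificate is verified with the public key of its issuer until a trusted root is reached.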
HTTPS security implementation – TLS connection
TLS 1.2 Four-way handshake
HTTPS security is implemented through SSL/TLS: HTTPS = HTTP over SSL/TLS, that is, a TLS layer is added between the application layer and the transport layer. In addition to the TCP three-way handshake, the client also needs to establish a TLS connection with the server through a four-way handshake.
First, the TCP connection is established using the three-way handshake.
Then set up a TLS connection with a four-way handshake:
- The client sends a Client Hello message containing:
  - The TLS protocol versions it supports
  - The list of cipher suites it supports
  - A random number (Client Random)
- After receiving the Client Hello message, the server:
  - Sends a Server Hello message in reply, containing:
    - The TLS protocol version negotiated with the client
    - A cipher suite chosen from the client’s list of cipher suites
    - A random number (Server Random)
  - Sends the Server Certificate message to pass the server certificate
  - Sends the Server Key Exchange message, passing the parameters of the key exchange algorithm (Server Params)
  - Sends the Server Hello Done message to indicate that the Server Hello is complete
- After the client verifies the certificate and confirms the identity of the server:
  - It sends the Client Key Exchange message, passing the parameters of the key exchange algorithm (Client Params)
  - The client and the server now both hold Client Params and Server Params, and each uses the ECDHE algorithm to calculate the pre-master secret
  - The client and the server each generate the master secret from the shared Client Random, Server Random, and pre-master secret. (Three random numbers are used to increase randomness, since the random number from either end alone may not be reliable.)
  - The session key is derived from the master secret
  - The client sends a Change Cipher Spec message to inform the server that subsequent communication will be encrypted with the session key
  - The client sends the Finished message
- When the server receives these messages:
  - It sends Change Cipher Spec to indicate that it agrees to use the session key to encrypt the communication between the two parties
  - It sends Finished to end the handshake
This completes the TLS four-way handshake, after which both parties send and receive encrypted HTTP requests and responses.
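How both ends arrive at the same master secret can be sketched as below. Two assumptions are worth flagging: a finite-field Diffie-Hellman exchange over a toy group stands in for ECDHE (the parameter exchange has the same shape, but these parameters are not secure), and the master secret derivation follows the TLS 1.2 PRF construction from RFC 5246 with HMAC-SHA256. Variable names are illustrative.

```python
import hashlib
import hmac
import secrets

# Toy Diffie-Hellman group standing in for ECDHE (ASSUMPTION: not secure parameters).
DH_P = 2**127 - 1
DH_G = 3


def dh_keypair():
    """Return (private value, public parameter to send to the peer)."""
    private = secrets.randbelow(DH_P - 2) + 1
    return private, pow(DH_G, private, DH_P)


def p_sha256(secret, seed, length):
    """P_hash expansion from the TLS 1.2 PRF, using HMAC-SHA256."""
    output = b""
    a = seed
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]


def master_secret(pre_master, client_random, server_random):
    """48-byte master secret from the pre-master secret and both random numbers."""
    return p_sha256(pre_master, b"master secret" + client_random + server_random, 48)


# Hello messages: each side contributes a random number.
client_random = secrets.token_bytes(32)
server_random = secrets.token_bytes(32)

# Key exchange messages: each side sends only its public parameter.
client_private, client_params = dh_keypair()
server_private, server_params = dh_keypair()

# Each side combines its own private value with the peer's parameter.
pre_master_client = pow(server_params, client_private, DH_P).to_bytes(16, "big")
pre_master_server = pow(client_params, server_private, DH_P).to_bytes(16, "big")

master_client = master_secret(pre_master_client, client_random, server_random)
master_server = master_secret(pre_master_server, client_random, server_random)
print(master_client == master_server)
```

The pre-master secret never crosses the network; only the two public parameters and the two random numbers do.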
PS: The above procedure uses ECDHE to exchange session keys, which differs from RSA in two ways:
- With ECDHE, the two ends exchange the parameters of the key exchange algorithm and each uses both parameters to generate the pre-master secret. With RSA, the client generates the pre-master secret directly and sends it to the server. In both cases, the master secret is ultimately generated from pre-master secret + Client Random + Server Random.
- With ECDHE, you can use TLS False Start: the client can send HTTPS requests before the server responds with Finished. With RSA, requests and responses can be sent only after both parties have sent Finished.
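As one way to see the two key exchange families in practice, Python’s standard `ssl` module can restrict a client context to ECDHE cipher suites. This is a minimal sketch, assuming an OpenSSL-backed Python build; the cipher string `ECDHE+AESGCM` is one reasonable choice rather than a recommendation from the source, and the sketch does not itself enable False Start.

```python
import ssl


def ecdhe_only_context():
    """Client context that offers only ECDHE key exchange for TLS 1.2."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # OpenSSL cipher string: ECDHE key exchange combined with AES-GCM.
    ctx.set_ciphers("ECDHE+AESGCM")
    return ctx


ctx = ecdhe_only_context()
suite_names = [cipher["name"] for cipher in ctx.get_ciphers()]
for name in suite_names:
    print(name)
```

TLS 1.3 suites may still appear in the list, since `set_ciphers` does not govern them. Plain RSA key exchange suites such as `AES128-GCM-SHA256` should no longer be offered.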
TLS 1.3 Three-way handshake
In TLS 1.3, the client carries the parameters of the key exchange algorithm (Client Params) in the Client Hello message itself, rather than in its second batch of messages as in TLS 1.2. Therefore, TLS 1.3 requires only three handshake steps and one round trip (1-RTT), whereas TLS 1.2 requires four handshake steps and two round trips (2-RTT).
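A rough back-of-the-envelope calculation shows what the saved round trip means for a new connection. It assumes one round trip for the TCP handshake and one for the HTTP request and response, and ignores processing time; the 50 ms RTT and the function name are illustrative.

```python
def time_to_first_response_ms(rtt_ms, tls_round_trips):
    """Estimated time from connect to first HTTP response on a new connection."""
    tcp_handshake = 1    # SYN / SYN-ACK; the client can send data along with its ACK
    http_exchange = 1    # request out, response back
    return (tcp_handshake + tls_round_trips + http_exchange) * rtt_ms


tls12 = time_to_first_response_ms(50, tls_round_trips=2)
tls13 = time_to_first_response_ms(50, tls_round_trips=1)
print(tls12, tls13)   # 200 150
```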
HTTPS Performance Optimization
HTTPS is generally slower than HTTP because it requires a handshake to create a TLS connection, which costs up to two RTTs. How can HTTPS connection speed be optimized? We can start from the following aspects:
- Hardware optimization: use a faster CPU to accelerate the handshake and transmission, and an SSL accelerator card to accelerate encryption and decryption
- Software optimization: upgrade the Linux kernel, Nginx, OpenSSL, etc.
- Protocol optimization: use TLS 1.3, which greatly simplifies the handshake to only 1-RTT; use ECDHE as the key exchange algorithm, which supports TLS False Start
- Certificate optimization: use ECDSA certificates to speed up certificate transfer, and use OCSP to speed up certificate verification
- Session reuse:
The point of the TLS handshake is to establish a shared master secret (or session key). Recreating the master secret every time a TLS connection is created would be too costly. If the master secret can be cached, the same client and server can use it again when they reconnect. This is session reuse, and it can be implemented in two ways: session ID and session ticket.
- Session ID: After the client and server complete their first handshake, both sides cache a key-value pair. The key is the session ID unique to the session, and the value is the session key. The next time the same client establishes a TLS connection with the server, it carries the previous session ID in the Client Hello, and the server looks up the corresponding value in memory. If it finds the value, it uses the session key directly to restore the session state, so communication is established within one RTT.
  - Disadvantage 1: The server needs to store a large number of key-value pairs, which puts pressure on it
  - Disadvantage 2: If multiple servers are used for load balancing, the client may not hit the server it accessed previously, so it still needs to go through the complete TLS connection process
- Session ticket: The server does not keep the session state but shifts that burden to the client. The server encrypts the session key to form a ticket and sends the ticket to the client, which saves it. When connecting to the server again, the client sends the ticket to the server, and the server decrypts the ticket to obtain the session key.
- Pre-shared key: With either a session ID or a session ticket, 1-RTT is required to restore the session state. The pre-shared key introduced in TLS 1.3 requires only 0-RTT. Its principle is similar to the session ticket, but the client sends the ticket together with the HTTPS request when connecting again. If the server validates the ticket successfully, it can restore the session state and respond with data in its first reply.
For security purposes, you need to verify the validity of the session key in each of these methods.
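The two reuse mechanisms can be contrasted in a small simulation. It is a sketch under assumptions, not how a TLS library stores sessions: the class and function names are hypothetical, the five-minute lifetime is arbitrary, and the ticket protection (SHA-256 keystream XOR plus an HMAC tag) is a toy stand-in for authenticated encryption.

```python
import hashlib
import hmac
import secrets
import time


class SessionIdCache:
    """Session ID: the server keeps session ID -> (secret, expiry) in memory."""

    def __init__(self, ttl_seconds=300.0):
        self.ttl = ttl_seconds
        self._store = {}

    def save(self, secret):
        session_id = secrets.token_bytes(32)
        self._store[session_id] = (secret, time.monotonic() + self.ttl)
        return session_id                      # sent to the client

    def resume(self, session_id):
        entry = self._store.get(session_id)
        if entry is None or entry[1] < time.monotonic():
            self._store.pop(session_id, None)
            return None                        # miss or expired: full handshake
        return entry[0]


# Session ticket: the server stores nothing per session, only these two keys.
ENC_KEY = secrets.token_bytes(32)
MAC_KEY = secrets.token_bytes(32)


def _xor_stream(nonce, data):
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(ENC_KEY + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def issue_ticket(secret):
    """Encrypt and authenticate the secret; the client stores the result."""
    nonce = secrets.token_bytes(16)
    body = _xor_stream(nonce, secret)
    tag = hmac.new(MAC_KEY, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def redeem_ticket(ticket):
    """Check the ticket's validity before trusting the secret inside it."""
    nonce, body, tag = ticket[:16], ticket[16:-32], ticket[-32:]
    expected = hmac.new(MAC_KEY, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None                            # invalid: full handshake
    return _xor_stream(nonce, body)


secret = secrets.token_bytes(48)
cache = SessionIdCache()
session_id = cache.save(secret)
ticket = issue_ticket(secret)
print(cache.resume(session_id) == secret, redeem_ticket(ticket) == secret)
```

Because any server holding `ENC_KEY` and `MAC_KEY` can redeem a ticket, tickets avoid the load-balancing miss described for session IDs.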