Preface
When it comes to network communication protocols, you are probably familiar with TCP and HTTP, which can fairly be called the cornerstones of today’s Internet communication. From a security perspective, however, they carry serious risks:
- Eavesdropping risk. Third-party attackers can eavesdrop on communications at will, for example to obtain payment account passwords.
- Impersonation risk. Third-party attackers can impersonate someone else when communicating with you, for example by posing as a bank’s website to steal bank account passwords.
- Tampering risk. Third-party attackers can modify communications at will, for example by injecting phishing URLs into responses.
This is why the SSL/TLS protocol came into being. SSL/TLS is a secure communication protocol that sits above the transport layer and below the application layer. It is designed to eliminate the security risks listed above and keep network communication secure. HTTPS is simply HTTP running over SSL/TLS, which makes SSL/TLS the cornerstone of secure communication on the Internet today.
Now, if you were asked to design the SSL/TLS protocol, what would you do?
This article walks through the design of a simple SSL/TLS protocol step by step, from the designer’s perspective. At the end, it briefly introduces the working mechanism of TLS 1.2 to give you a deeper understanding of the basic principles of SSL/TLS.
Data encryption based on symmetric encryption algorithm
The risk of eavesdropping exists mainly because the communicating parties transmit data in plaintext, so an attacker can read the content with nothing more than a simple packet capture.
The best way to address the risk of eavesdropping is to encrypt data. That is, the client encrypts the data before sending it. After receiving the ciphertext, the server decrypts and restores the data. This prevents plaintext from spreading over the network and thus prevents third-party attackers from eavesdropping.
When it comes to encryption algorithms, many people first think of symmetric encryption algorithms, which are known for their simplicity and efficiency. Symmetric encryption means that both encryption and decryption use the same key. Common algorithms include DES and AES.
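The idea that one key both encrypts and decrypts can be sketched in a few lines. The cipher below is a toy stand-in (an XOR with a SHA-256-derived keystream), not DES or AES, and the names `keystream` and `xor_cipher` are illustrative assumptions. It only shows the symmetric property and should not be used to protect real data.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


shared_key = secrets.token_bytes(16)   # both parties must hold this same key
ciphertext = xor_cipher(shared_key, b"password=123456")
plaintext = xor_cipher(shared_key, ciphertext)
```

Anyone who holds `shared_key` can recover the plaintext, which is exactly why the way the key reaches the other party matters so much.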
Now, let’s try using symmetric encryption algorithms for secure communication:
Symmetric encryption rests on the premise that both parties hold the same key. There are two main ways to exchange that key, offline and online:
- Offline key exchange: the two parties meet and exchange the key in person (for example, on a USB flash drive). This keeps the key exchange secure, but it does not scale. In most scenarios, the client and server will never meet.
- Online key exchange: the key is sent over the network. But a key transmitted in plaintext can itself be intercepted by the attacker, which makes the encryption meaningless.
Therefore, symmetric encryption alone cannot meet the requirements of secure communication, and we need to keep optimizing……
Data encryption based on asymmetric encryption algorithm
An asymmetric encryption algorithm uses different keys for encryption and decryption. The two keys form a key pair: a public key and a private key. The public key is published and anyone can obtain it, while the private key is kept secret. Data encrypted with the public key can only be decrypted with the corresponding private key. Common asymmetric encryption algorithms include RSA and ECC.
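As a rough illustration of the public/private key relationship, here is textbook RSA with tiny primes. The primes, the exponent, and the function names are assumptions chosen for readability. Real RSA uses keys of 2048 bits or more together with padding, so treat this only as a sketch of the principle.

```python
# Toy RSA with textbook-sized primes; illustration only, never use in practice.
p, q = 61, 53
n = p * q                      # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent: (n, e) is the public key
d = pow(e, -1, phi)            # private exponent: (n, d) is the private key


def encrypt(message: int, public_key: tuple[int, int]) -> int:
    modulus, exponent = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, private_key: tuple[int, int]) -> int:
    modulus, exponent = private_key
    return pow(ciphertext, exponent, modulus)


message = 65                                   # must be smaller than n
ciphertext = encrypt(message, (n, e))          # anyone can do this
recovered = decrypt(ciphertext, (n, d))        # only the key owner can do this
```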
Now, we try to use asymmetric encryption algorithms for secure communication:
With an asymmetric encryption algorithm we can both encrypt data and solve the key exchange problem, which eliminates the risk of eavesdropping. Its biggest drawback, however, is speed: asymmetric encryption and decryption are far slower than symmetric encryption, often by a factor of a thousand or more. For this reason, asymmetric algorithms are usually only suitable for encrypting small amounts of data.
So far, neither symmetric nor asymmetric encryption alone meets our requirements, and we need to keep optimizing……
Data encryption based on symmetric encryption + asymmetric encryption algorithm
Symmetric encryption is fast, but it has a key exchange problem. Asymmetric encryption solves the key exchange problem, but it is slow. We can therefore combine the two: data is encrypted with a symmetric algorithm, and the symmetric key itself is encrypted with an asymmetric algorithm when it is exchanged, so that attackers cannot eavesdrop on the key in transit.
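The combined scheme might look roughly like the sketch below. Both building blocks are toys and are assumptions of this example: the asymmetric part is textbook RSA built from two known Mersenne primes, and the symmetric part is an XOR keystream rather than AES. What matters is the flow: only the short symmetric key passes through the slow asymmetric step.

```python
import hashlib
import secrets

# --- Toy asymmetric part (server key pair) -------------------------------
# 2**61 - 1 and 2**89 - 1 are known Mersenne primes; chosen only so the
# example needs no prime search. Real RSA keys look nothing like this.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))


# --- Toy symmetric part ----------------------------------------------------
def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream (encrypts and decrypts)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Client: pick a fresh symmetric key and wrap it with the server public key.
session_key = secrets.token_bytes(16)
wrapped_key = pow(int.from_bytes(session_key, "big"), E, N)

# Server: unwrap the symmetric key with its private key.
server_key = pow(wrapped_key, D, N).to_bytes(16, "big")

# Both sides now use the fast symmetric cipher for the actual data.
ciphertext = xor_cipher(session_key, b"GET /account HTTP/1.1")
request = xor_cipher(server_key, ciphertext)
```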
Now, we try to use symmetric encryption + asymmetric encryption algorithm to achieve secure communication:
With the symmetric + asymmetric scheme, we eliminate the risk of eavesdropping without a performance penalty, but we still cannot eliminate the risk of impersonation.
Consider the following scenario:
- The attacker intercepts the server’s public key and saves it.
- The attacker impersonates the server and sends its own public key to the client.
- The attacker intercepts the symmetric key that the client encrypted with the attacker’s public key, decrypts it, and saves the plaintext key.
- The attacker re-encrypts the symmetric key with the server’s public key and, posing as the client, forwards it to the server.
After these steps, the attacker holds the symmetric key without either the client or the server noticing. In this scenario, the attacker has moved from passive eavesdropping to active impersonation, while both the client and the server believe they are talking directly to each other.
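The four steps of this attack can be simulated with toy RSA key pairs. The helper names `make_keypair` and `rsa_apply` and the choice of Mersenne primes are assumptions made for this sketch. Real keys are far larger, but the attack works the same way whenever the client cannot tell whose public key it received.

```python
import secrets


def make_keypair(p: int, q: int, e: int = 65537):
    """Toy RSA key pair from two known primes: returns (public, private)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (n, e), (n, d)


def rsa_apply(value: int, key: tuple[int, int]) -> int:
    n, exponent = key
    return pow(value, exponent, n)


# Known Mersenne primes, used only to avoid a prime search in the sketch.
server_pub, server_priv = make_keypair(2**61 - 1, 2**89 - 1)
attacker_pub, attacker_priv = make_keypair(2**107 - 1, 2**127 - 1)

# First two steps: the attacker keeps the real key and hands its own to the client.
key_seen_by_client = attacker_pub

# Client encrypts a fresh symmetric key with what it believes is the server key.
client_key = int.from_bytes(secrets.token_bytes(16), "big")
to_server = rsa_apply(client_key, key_seen_by_client)

# Third step: the attacker decrypts the intercepted key and saves it.
stolen_key = rsa_apply(to_server, attacker_priv)

# Fourth step: the attacker re-encrypts with the real server key and forwards it.
forwarded = rsa_apply(stolen_key, server_pub)
server_key = rsa_apply(forwarded, server_priv)
```

At the end, `client_key`, `stolen_key`, and `server_key` are all equal, so neither endpoint has any sign that a third party now shares the key.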
Therefore, we need a way for the client to be sure that the public key it receives really comes from the genuine server, that is, to authenticate the real identity of the “server”……
CA certificate-based identity authentication
Overview of Digital Certificates
Quoting the definition of Baidu Baike:
Digital certificate refers to a digital authentication that marks the identity information of each party in Internet communication. People can use it to identify each other on the Internet.
A digital certificate is like a real-world ID card: it identifies the legal identity of a network user (a person, company, server, and so on). Just as an ID card must be issued by the Public Security Bureau, a trusted digital certificate must be issued by an authoritative organization, namely a Certificate Authority (CA). A digital certificate issued by a CA is usually called a CA certificate.
A CA certificate contains plaintext information, such as the applicant’s public key, information about the applicant, information about the issuing authority (the CA), the validity period, and the certificate serial number, together with the CA’s digital signature. It is this signature that proves the certificate is valid.
A digital signature is based on an asymmetric encryption algorithm. When issuing a certificate, the CA uses a specified hash algorithm (such as SHA-256) to compute a digital digest of the certificate’s plaintext information, and then encrypts the digest with the CA’s private key to form the signature.
Certificate verification consists of two parts:
- Check that the plaintext information in the certificate is valid, for example, that the certificate has not expired and that the domain name matches.
- Use the CA’s published public key to decrypt the certificate’s signature, which yields digest 1. Use the same hash algorithm to compute digest 2 over the certificate’s plaintext information. Then compare digest 1 and digest 2 to see whether they are equal.
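The signing and the two verification steps can be sketched as follows. Everything here is a simplifying assumption: the certificate is a `|`-separated byte string rather than X.509, the CA key is toy RSA built from known Mersenne primes, and the digest is reduced modulo the small toy modulus. Real signatures also apply a padding scheme, which this sketch omits.

```python
import hashlib
from datetime import date

# Toy CA key pair built from known Mersenne primes (illustration only).
P, Q = 2**107 - 1, 2**127 - 1
CA_N = P * Q
CA_E = 65537
CA_D = pow(CA_E, -1, (P - 1) * (Q - 1))


def digest_as_int(info: bytes) -> int:
    """SHA-256 digest, reduced mod n because the toy modulus is small."""
    return int.from_bytes(hashlib.sha256(info).digest(), "big") % CA_N


def ca_sign(info: bytes) -> int:
    """CA side: 'encrypt' the digest with the CA private key."""
    return pow(digest_as_int(info), CA_D, CA_N)


def verify(info: bytes, signature: int, host: str, today: date) -> bool:
    """Client side: check the plaintext fields, then compare the digests."""
    domain, not_after = info.decode().split("|")[:2]
    if domain != host or today > date.fromisoformat(not_after):
        return False
    digest_1 = pow(signature, CA_E, CA_N)      # decrypted from the signature
    digest_2 = digest_as_int(info)             # recomputed from the plaintext
    return digest_1 == digest_2


cert_info = b"example.com|2030-01-01|server-public-key-goes-here"
cert_signature = ca_sign(cert_info)
```

Changing any byte of `cert_info` changes digest 2, so a certificate whose public key has been swapped no longer matches the CA’s signature.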
There is more than one certificate issuer. For example, agency A can use the root certificate issued by the CA to issue a level-2 certificate to agency B. Agency B can then use its level-2 certificate to issue a level-3 certificate to agency C, and so on. This is the so-called certificate chain.
Use the CA certificate to authenticate the identity of the communication party
Now, we add the CA certificate to authenticate the identity of the communicating party:
Once CA certificates are introduced, the server’s public key is carried in the certificate that the server presents. If the client verifies the server certificate successfully, it knows the public key really belongs to the server, and the rest of the communication flow can proceed normally.
However, if the symmetric key never changes, an attacker still has a chance to brute-force it. We therefore want a symmetric key that is different for each connection……
Use random numbers to generate symmetric keys
To make the symmetric key different for each connection, we can generate it from random numbers. However, the random numbers a computer generates are typically pseudo-random, so to strengthen the randomness further we can combine several random numbers.
We can design it like this:
- The client and the server each generate a random number (random number 1 and random number 2, respectively) and exchange them in the ClientHello and ServerHello packets.
- After authenticating the server, the client generates random number 3, encrypts it with the server’s public key, and sends it to the server. (At this point, both parties hold the same three random numbers.)
- The client and the server then derive the final symmetric key from the three random numbers.
Because the key depends on three random numbers, its randomness is strengthened and it becomes harder for an attacker to crack.
Although random numbers 1 and 2 are transmitted in plaintext, random number 3 is transmitted in encrypted form, so an attacker who sees the first two still cannot easily reconstruct the key.
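One possible way to turn the three random numbers into a key is shown below. Using HMAC-SHA256 as the mixing function is an assumption of this sketch, as are the function name `derive_key` and the byte lengths. The point is only that both sides run the same deterministic function over the same three inputs.

```python
import hashlib
import hmac
import secrets


def derive_key(random_1: bytes, random_2: bytes, random_3: bytes) -> bytes:
    """Mix the three random numbers into a 32-byte symmetric key.

    Random number 3 is the only secret input, so it is used as the HMAC key;
    random numbers 1 and 2 are public and only add per-connection freshness.
    """
    return hmac.new(random_3, random_1 + random_2, hashlib.sha256).digest()


random_1 = secrets.token_bytes(32)   # from ClientHello, sent in plaintext
random_2 = secrets.token_bytes(32)   # from ServerHello, sent in plaintext
random_3 = secrets.token_bytes(48)   # sent encrypted with the server public key

client_key = derive_key(random_1, random_2, random_3)
server_key = derive_key(random_1, random_2, random_3)

# If an attacker rewrites random number 1 in transit, the keys diverge.
tampered_key = derive_key(secrets.token_bytes(32), random_2, random_3)
```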
So far, we have prevented attackers from eavesdropping on the communication between client and server by various means. But suppose the attacker does not care about the content and simply wants to cause damage. For example, the attacker intercepts the ClientHello packet and changes random number 1 to 4. As a result, the client and the server derive different keys. In this scenario, the connection is established, but the client and server cannot communicate properly.
To prevent this, we need a mechanism that verifies the correctness of all messages exchanged during the connection establishment phase (the handshake phase) and prevents incorrect connections from being established……
Verify the correctness of the handshake message
We can use a digital digest to check the correctness of all handshake messages. At the end of the handshake phase, each party uses a hash algorithm (such as SHA-256) to compute a digital digest over all the messages it has sent and received, encrypts the digest with the negotiated symmetric key, and sends it to the other party.
On receiving the encrypted digest from the peer, each party decrypts it with the symmetric key. If decryption succeeds, both sides derived the same key. The decrypted digest is then compared with the locally computed one. If they match, the handshake messages have not been tampered with and the connection can be established.
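A minimal sketch of this check follows. Instead of hashing and then encrypting the digest, it keys the digest with HMAC, which is a deliberate substitution that serves the same purpose here: the value matches only if both the key and the message history agree. The names `finished_tag` and `check_finished` and the sample messages are hypothetical.

```python
import hashlib
import hmac


def finished_tag(key: bytes, messages: list[bytes]) -> bytes:
    """Keyed digest over every handshake message, in order."""
    transcript = hashlib.sha256()
    for message in messages:
        # Length-prefix each message so message boundaries are unambiguous.
        transcript.update(len(message).to_bytes(4, "big") + message)
    return hmac.new(key, transcript.digest(), hashlib.sha256).digest()


def check_finished(key: bytes, messages: list[bytes], peer_tag: bytes) -> bool:
    return hmac.compare_digest(finished_tag(key, messages), peer_tag)


key = b"k" * 32                                   # negotiated symmetric key
sent_by_client = [b"ClientHello random=1", b"ServerHello random=2"]
seen_by_server = [b"ClientHello random=4", b"ServerHello random=2"]  # tampered

client_tag = finished_tag(key, sent_by_client)
```

Here `check_finished(key, seen_by_server, client_tag)` returns `False`, so the server would abort instead of completing a broken connection.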
By now, we have largely eliminated the risks of eavesdropping (through data encryption), impersonation (through certificate authentication), and tampering (through digital digests). However, establishing a secure channel this way requires several steps, such as message exchange, encryption and decryption, and identity authentication, all of which cost performance.
After a handshake, though, the key has already been negotiated and stored by both parties. The next time a connection is established, the two sides can reuse the key negotiated during the previous handshake, avoiding a full renegotiation. We need a session reuse mechanism to improve protocol performance……
Reuse sessions to improve performance
In order to retain the key negotiated last time, we assign a session ID to each connection.
- When a connection is first created, the server generates the session ID and returns it to the client in ServerHello.
- The next time a connection is created, the client sends the session ID to the server in ClientHello, indicating that it wants to reuse the session.
- After receiving the session ID, the server sends a Finished message to indicate that it agrees to reuse the session. From then on, both sides can communicate securely with the keys negotiated in the previous session.
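On the server side, session reuse boils down to a lookup table from session ID to negotiated key. The `SessionCache` class below is a hypothetical in-memory sketch. It leaves out expiry and eviction, which a real server would need.

```python
import secrets
from typing import Optional


class SessionCache:
    """Server-side store mapping session IDs to negotiated keys."""

    def __init__(self) -> None:
        self._keys: dict[bytes, bytes] = {}

    def new_session(self, key: bytes) -> bytes:
        """Full handshake finished: remember the key under a fresh ID."""
        session_id = secrets.token_bytes(32)
        self._keys[session_id] = key
        return session_id

    def resume(self, session_id: bytes) -> Optional[bytes]:
        """Return the stored key, or None to force a full handshake."""
        if not session_id:
            return None
        return self._keys.get(session_id)


cache = SessionCache()
negotiated_key = secrets.token_bytes(32)
session_id = cache.new_session(negotiated_key)   # returned in ServerHello
```

When a later ClientHello carries `session_id`, `cache.resume(session_id)` returns the stored key and the expensive key exchange can be skipped. An unknown or empty ID yields `None`, and the server falls back to a full handshake.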
At this point, we have completed the design of a simple SSL/TLS protocol. The real SSL/TLS protocol is certainly not this simple, but its core ideas and basic principles are similar. It adds further mechanisms for better security, extensibility, and ease of use, such as support for multiple encryption algorithms and the use of a MAC, rather than a plain digital digest, for integrity verification.
Below, we will briefly explain how the real SSL/TLS protocol works.
Overview of the SSL/TLS protocol mechanism
Secure Sockets Layer (SSL) is the predecessor of the Transport Layer Security (TLS) protocol. Their versions have evolved as shown below, and the latest version is TLS 1.3. In this section, we analyze TLS 1.2, the most widely used version, to introduce the basic working mechanism of SSL/TLS.
SSL 1.0 -> SSL 2.0 -> SSL 3.0 -> TLS 1.0 -> TLS 1.1 -> TLS 1.2 -> TLS 1.3
Overview of SSL/TLS protocols
SSL/TLS protocol is located between the transport layer and the application layer in the network protocol stack. It can be divided into two layers, with a total of five sub-protocols:
Record protocol
The Record protocol sits at the lowest layer. It encapsulates the sub-protocols above it and provides secure communication:
- Private connection. Data is encrypted with symmetric encryption algorithms (such as AES and RC4), and the communicating parties negotiate a different encryption key for each connection to achieve better security. The Record protocol can also encapsulate data without encryption, as it does for the Hello packets in the handshake phase.
- Reliable connection. A Message Authentication Code (MAC) is used to verify data integrity. As with encryption, this feature can be omitted during the handshake phase.
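To make the encapsulation concrete, the sketch below frames a fragment with the 5-byte TLS 1.2 record header (content type, version, length) and computes a MAC over the sequence number, header fields, and fragment, following the layout in RFC 5246. Encryption is left out, the function names are hypothetical, and HMAC-SHA256 is assumed as the MAC algorithm, which in practice depends on the negotiated cipher suite.

```python
import hashlib
import hmac
import struct

HANDSHAKE, APPLICATION_DATA = 22, 23      # record content types
TLS_1_2 = 0x0303                          # protocol version on the wire


def build_record(content_type: int, fragment: bytes) -> bytes:
    """Prefix a fragment with the 5-byte header: type, version, length."""
    return struct.pack(">BHH", content_type, TLS_1_2, len(fragment)) + fragment


def parse_record(record: bytes) -> tuple[int, int, bytes]:
    content_type, version, length = struct.unpack(">BHH", record[:5])
    return content_type, version, record[5:5 + length]


def record_mac(mac_key: bytes, seq_num: int, content_type: int,
               fragment: bytes) -> bytes:
    """HMAC over sequence number, header fields, and fragment."""
    header = struct.pack(">QBHH", seq_num, content_type, TLS_1_2, len(fragment))
    return hmac.new(mac_key, header + fragment, hashlib.sha256).digest()


record = build_record(APPLICATION_DATA, b"hello")
mac = record_mac(b"m" * 32, 0, APPLICATION_DATA, b"hello")
```

Including the sequence number in the MAC input means that a replayed or reordered record produces a different MAC and is rejected.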
Handshake protocol
The Handshake protocol provides identity authentication and the negotiation of encryption algorithms and keys for both communicating parties.
- Identity authentication. The peer is authenticated with a CA certificate, which relies on asymmetric cryptography (such as RSA and DSA). This feature is optional, but it is common practice to perform at least one-way authentication.
- Negotiation of security parameters. Security parameters (such as encryption algorithms, hash algorithms, and keys) are negotiated in a way that prevents attackers from obtaining the keys.
- Reliable negotiation. Attackers cannot tamper with packets during the negotiation of security parameters without being detected.
The Handshake protocol includes the following packet types: ClientHello, ServerHello, Certificate, ServerKeyExchange, CertificateRequest, ServerHelloDone, ClientKeyExchange, CertificateVerify, ChangeCipherSpec, Finished.
Change Cipher Spec
The Change Cipher Spec protocol is also used in the handshake phase. When a party sends a ChangeCipherSpec packet, it signals that key negotiation is complete and that every message it sends from then on will be encrypted with the negotiated key.
Alert protocol
The Alert protocol is used only when something abnormal happens on the connection. The current protocol defines the following Alert message types:
- close_notify: the sender will not send any more messages; used to close the connection properly, similar to a FIN packet in TCP.
- unexpected_message: an unexpected message was received.
- bad_record_mac: the MAC of a received message is incorrect, which means the message has been tampered with.
- decryption_failed_RESERVED: decryption failed; used in earlier versions of TLS.
- decompression_failure: decompression failed when the compression feature is in use.
- handshake_failure: the correct security parameters could not be negotiated during the handshake phase.
- no_certificate_RESERVED: used in SSL 3.0 but not in any version of TLS.
- bad_certificate: certificate signature verification failed.
- unsupported_certificate: a certificate of an unsupported type was received.
- certificate_revoked: a revoked certificate was received.
- certificate_expired: an expired certificate was received.
- certificate_unknown: a certificate problem other than the preceding four.
- illegal_parameter: a field in the handshake was out of range or inconsistent with other fields.
- unknown_ca: the certificate was issued by an untrusted CA.
- access_denied: the certificate is valid, but the sender refuses to continue the handshake.
- decode_error: the message could not be decoded.
- decrypt_error: a security step in the handshake phase failed, such as signature verification or Finished message verification.
- export_restriction_RESERVED: used in earlier versions of TLS.
- protocol_version: the protocol version is not supported.
- insufficient_security: the client cannot meet the security level that the server requires.
- internal_error: an internal error in the protocol implementation.
- user_canceled: the user closed the connection abnormally.
- no_renegotiation: renegotiation is refused.
- unsupported_extension: an unsupported extension was received.
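On the wire, an Alert message is just two bytes: a severity level followed by a description code. The sketch below parses such a body. The numeric codes are taken from RFC 5246 rather than from the list above, only a subset of descriptions is included, and the class and function names are illustrative.

```python
from enum import IntEnum


class AlertLevel(IntEnum):
    WARNING = 1
    FATAL = 2


class AlertDescription(IntEnum):
    CLOSE_NOTIFY = 0
    UNEXPECTED_MESSAGE = 10
    BAD_RECORD_MAC = 20
    HANDSHAKE_FAILURE = 40
    BAD_CERTIFICATE = 42
    CERTIFICATE_EXPIRED = 45
    UNKNOWN_CA = 48
    DECRYPT_ERROR = 51
    PROTOCOL_VERSION = 70
    INTERNAL_ERROR = 80


def parse_alert(body: bytes) -> tuple[AlertLevel, AlertDescription]:
    """An Alert body is two bytes: severity level, then description code."""
    if len(body) != 2:
        raise ValueError("alert body must be exactly 2 bytes")
    return AlertLevel(body[0]), AlertDescription(body[1])


level, description = parse_alert(bytes([2, 48]))
```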
Application Data protocol
The Application Data protocol is used in the communication phase to carry application-layer data. The data is encapsulated by the Record protocol and then transmitted over TCP.
Handshake process of SSL/TLS protocol
First handshake
- The client sends a ClientHello packet to the server to initiate connection establishment. The ClientHello packet contains the following information:
  - Version: the TLS protocol version supported by the client
  - Random: a random number generated by the client, later used to generate the master secret
  - SessionID: the session ID. If it is not empty, the client wants to reuse the session
  - CipherSuites: the list of cipher suites supported by the client, which must be carried if the SessionID is empty
  - CompressionMethods: the list of compression algorithms supported by the client
  - Extensions: the extension content
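The ClientHello fields map naturally onto a small data structure. The dataclass below is an in-memory sketch, not the wire encoding. The default cipher suite codes (0xC02F and 0x009C) are examples picked for illustration, and the 4-byte time plus 28 random bytes layout of `random` follows the TLS 1.2 definition.

```python
import secrets
import time
from dataclasses import dataclass, field


def client_random() -> bytes:
    """32 bytes: 4-byte Unix time followed by 28 random bytes."""
    return int(time.time()).to_bytes(4, "big") + secrets.token_bytes(28)


@dataclass
class ClientHello:
    version: int = 0x0303                       # TLS 1.2
    random: bytes = field(default_factory=client_random)
    session_id: bytes = b""                     # empty: no session to reuse
    cipher_suites: list[int] = field(default_factory=lambda: [0xC02F, 0x009C])
    compression_methods: list[int] = field(default_factory=lambda: [0])
    extensions: dict[str, bytes] = field(default_factory=dict)

    def wants_resumption(self) -> bool:
        return len(self.session_id) > 0


hello = ClientHello()
resuming = ClientHello(session_id=b"\x01" * 32)
```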
Second handshake
- The server sends a ServerHello packet to the client, which contains the following information:
  - Version: the TLS version, chosen as the highest version supported by both parties
  - Random: a random number generated by the server, later used to generate the master secret
  - SessionID: the session ID. If it is empty, the server starts a new session and does not want it to be reused. If it equals the SessionID sent by the client, the session is reused. Otherwise, the server starts a new session that may be reused in the future
  - CipherSuite: the selected cipher suite
  - CompressionMethod: the selected compression algorithm
  - Extensions: the extension content
- The server sends a Certificate packet to the client. It carries the server’s certificate in X.509 format, which includes the server’s public key, the server domain name, issuer information, and the validity period.
  This packet is optional. It is sent when the client needs a certificate to authenticate the server.
- The server sends a ServerKeyExchange packet to the client, which contains the security parameters the client needs to generate the premaster secret.
  This packet is optional. It is sent when the information in the Certificate packet is not sufficient for the client to generate the premaster secret.
- The server sends a CertificateRequest packet to the client to request the client certificate. It specifies the desired certificate types, signature algorithms, and CA list.
  This packet is optional. It is sent when bidirectional authentication is enabled.
- The server sends a ServerHelloDone packet to the client, indicating that the server has sent everything related to the key exchange.
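The choices the server reports in ServerHello can be sketched as two small selection functions. Two assumptions are made here: the client is taken to support every version up to the one it advertises, and the server applies its own preference order when picking a cipher suite (a server could equally follow the client’s order). The function names are hypothetical.

```python
def choose_version(client_max: int, server_supported: set[int]) -> int:
    """Highest version the server supports that does not exceed the client's."""
    candidates = [v for v in server_supported if v <= client_max]
    if not candidates:
        raise ValueError("no common protocol version")
    return max(candidates)


def choose_cipher_suite(client_suites: list[int],
                        server_preference: list[int]) -> int:
    """First suite in the server's preference order that the client offers."""
    for suite in server_preference:
        if suite in client_suites:
            return suite
    raise ValueError("no common cipher suite")   # would trigger handshake_failure


TLS_1_0, TLS_1_1, TLS_1_2 = 0x0301, 0x0302, 0x0303
version = choose_version(TLS_1_2, {TLS_1_0, TLS_1_1, TLS_1_2})
suite = choose_cipher_suite([0x009C, 0xC02F], [0xC02F, 0x002F])
```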
Third handshake
- The client sends a Certificate packet carrying the client certificate to the server.
  This packet is optional. It is sent when a CertificateRequest packet has been received from the server.
- The client sends a ClientKeyExchange packet to the server. It contains the premaster secret (a random number) encrypted with the server’s public key, which is later used to generate the master secret.
- The client sends a CertificateVerify packet to the server. It contains a digital signature over all handshake packets sent so far, proving that the client holds the private key matching the public key in the certificate it sent earlier.
  This packet is optional. It is sent only when the client has sent a Certificate packet to the server.
- The client sends a ChangeCipherSpec message to the server, indicating that encrypted transmission starts with the next message.
- The client sends a Finished packet to the server, containing a digest of all handshake messages, to prevent tampering.
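The step from premaster secret to master secret, and the digest carried in Finished, both rely on the TLS 1.2 pseudo-random function (PRF) defined in RFC 5246. The sketch below implements that PRF with HMAC-SHA256, which is the default for TLS 1.2 (some cipher suites use SHA-384 instead). The input values are placeholders, and the helper names are this example’s own.

```python
import hashlib
import hmac


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash expansion from RFC 5246, section 5, with HMAC-SHA256."""
    output = b""
    a = seed                                           # A(0) = seed
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()            # A(i)
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]


def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    return p_sha256(secret, label + seed, length)


def master_secret(premaster: bytes, client_random: bytes,
                  server_random: bytes) -> bytes:
    return prf(premaster, b"master secret", client_random + server_random, 48)


def client_verify_data(master: bytes, handshake_messages: bytes) -> bytes:
    transcript_hash = hashlib.sha256(handshake_messages).digest()
    return prf(master, b"client finished", transcript_hash, 12)


# Placeholder inputs; real values come from the handshake itself.
premaster = b"\x03\x03" + b"\x11" * 46
master = master_secret(premaster, b"\xaa" * 32, b"\xbb" * 32)
verify_data = client_verify_data(master, b"ClientHello...ClientKeyExchange")
```

Because the master secret depends on both Random values and the premaster secret, changing any of them on the wire leaves the two sides with different secrets, and the Finished check fails.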
Fourth handshake
- The server sends a ChangeCipherSpec message to the client, indicating that encrypted transmission starts with the next message.
- The server sends a Finished packet to the client, containing a digest of all handshake messages, to prevent tampering.
Conclusion
The SSL/TLS protocol is not absolutely secure. Attackers keep discovering vulnerabilities in it, and the protocol keeps improving in response. TLS 1.3, released in 2018, brings many enhancements over TLS 1.2. In terms of performance, the handshake is reduced from 2-RTT to 1-RTT, and a 0-RTT mode is supported. In terms of security, handshake messages after ServerHello are encrypted, and insecure key exchange methods such as static RSA and static Diffie-Hellman are no longer supported.
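In application code, the protocol version is usually a matter of configuration rather than implementation. As an illustration, Python’s standard `ssl` module lets a client restrict the versions it will negotiate. The helper names below are hypothetical, and `negotiated_version` needs network access, so it is defined but not called here.

```python
import socket
import ssl


def make_context(min_version: ssl.TLSVersion,
                 max_version: ssl.TLSVersion) -> ssl.SSLContext:
    """Client context that verifies certificates and limits TLS versions."""
    context = ssl.create_default_context()     # CERT_REQUIRED + hostname check
    context.minimum_version = min_version
    context.maximum_version = max_version
    return context


def negotiated_version(host: str, context: ssl.SSLContext) -> str:
    """Connect to host:443 and report the version chosen in the handshake."""
    with socket.create_connection((host, 443), timeout=5) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version() or "unknown"


tls12_only = make_context(ssl.TLSVersion.TLSv1_2, ssl.TLSVersion.TLSv1_2)
tls12_or_later = make_context(ssl.TLSVersion.TLSv1_2,
                              ssl.TLSVersion.MAXIMUM_SUPPORTED)
```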
Although TLS 1.3 changes many mechanisms, the basic principles remain the same. With a thorough understanding of how SSL/TLS works, we can quickly learn the protocol mechanism no matter how the versions evolve.
References
The Transport Layer Security (TLS) Protocol Version 1.2, RFC 5246
The Transport Layer Security (TLS) Protocol Version 1.3, RFC 8446
Overview of SSL/TLS protocol operation mechanism, Ruan Yifeng
Overview of SSL/TLS Encryption, Microsoft documentation
The First Few Milliseconds of an HTTPS Connection, Jeff Moser
Why does HTTPS require 7 handshakes and 9 times delay, faith-oriented programming
HTTPS authoritative guide, Yang Yang, Li Zhenyu et al.
Digital certificate, Baidu Baike