HTTP is a great protocol: simple and convenient to use. But everything has two sides, and HTTP is no exception. It has some fairly serious security problems:
-
Communications are not encrypted; plaintext is used, so content can be eavesdropped
HTTP itself has no encryption, so the contents of requests and responses are all sent in clear text. Anyone who can capture traffic at the TCP/IP level, anywhere along the path, can read them. Strictly speaking, encrypted packets can be captured just as easily, but whoever captures them cannot decrypt them.
-
The identity of the communicating party is not verified, so you may encounter impersonation
HTTP communication has no step for confirming who the other party is, so either the client or the server may be an impostor.
-
The integrity of the message cannot be proved, so it may have been tampered with
HTTP offers no way to determine whether a received request or response has been modified. Content can therefore be changed in transit without the recipient ever knowing. An intermediary sitting between the client and the server can arbitrarily modify requests and responses, which is known as a man-in-the-middle attack.
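To make these weaknesses a little more concrete, here is a minimal sketch of what an intermediary can do with a plaintext HTTP response. The message content and the `eavesdrop` and `tamper` helpers are hypothetical names made up for illustration, and no real network traffic is involved.

```python
# A plaintext HTTP response as it would travel over TCP (made-up content).
original = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 22\r\n"
    b"\r\n"
    b"Transfer 100 to Alice."
)


def eavesdrop(packet: bytes) -> str:
    """Anyone on the path can read the body: there is nothing to decrypt."""
    _headers, _sep, body = packet.partition(b"\r\n\r\n")
    return body.decode("ascii")


def tamper(packet: bytes) -> bytes:
    """An intermediary rewrites the body and forwards the packet."""
    return packet.replace(b"Transfer 100", b"Transfer 900")


received = tamper(original)
print(eavesdrop(original))   # what the server sent
print(eavesdrop(received))   # what the client actually gets
```

The tampered response is still a perfectly well-formed HTTP message, so the client has nothing to check it against.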
HTTPS exists to solve these problems. HTTPS is not a new protocol, though; it just adds encryption and authentication mechanisms on top of HTTP. Whereas HTTP talks to TCP directly, HTTPS talks to the Secure Sockets Layer (SSL) first, and SSL then talks to TCP. HTTPS is short for HTTP Secure.
HTTPS uses the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols. Initially there was only SSL, but SSL 3.0 was followed by TLS 1.0, TLS 1.1, and TLS 1.2. TLS was developed on the basis of SSL and can be understood as its successor. As a result, the two protocols are sometimes referred to collectively as SSL.
Let’s look at how HTTPS solves the three problems with HTTP mentioned above.
-
HTTP communication is not encrypted, uses clear text, and the content may be eavesdropped
To this end, HTTPS encrypts the transmitted data. When it comes to encryption, we have to mention two categories of encryption algorithms: symmetric encryption and asymmetric encryption.
Symmetric encryption means that encryption and decryption use the same key. The most obvious example is the codebook in spy TV dramas: both sides hold a copy of the codebook, the sender encodes messages with it, and the receiver decodes them with it. The catch is that delivering the key safely is crucial, because a symmetric key must never leak. In the military, if a spy got hold of the codebook, the entire intelligence network would be compromised.
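As a rough illustration of "the same key on both sides", here is a toy symmetric cipher built only from the Python standard library. The keystream construction and the `keystream` and `xor_cipher` names are assumptions made for this sketch; it is not a real cipher such as AES and must not be used to protect anything.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Stretch the shared key into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts: XOR with the same keystream undoes itself."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


shared_key = secrets.token_bytes(16)          # the "codebook" both sides hold
ciphertext = xor_cipher(shared_key, b"meet at dawn")
plaintext = xor_cipher(shared_key, ciphertext)
print(plaintext)
```

Whoever holds `shared_key` can both read and forge messages, which is exactly why getting the key to the other side safely is the hard part.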
Then there is asymmetric encryption, which uses a pair of keys: a public key and a private key. Usually the public key encrypts and the private key decrypts. (It is also possible to encrypt with the private key and decrypt with the public key, which we will come to in a moment.) Data encrypted with the public key can only be decrypted with the private key. The public key can therefore be handed to the other party without worrying about it being stolen in transit: even if it is stolen, the thief cannot decrypt anything with it. The private key, of course, must be kept secret.
Since a key can leak in transit, if we had to pick just one kind of encryption we would certainly pick asymmetric encryption. For example, the client holds a public key and a private key and sends the public key to the server; when the server sends a message, it encrypts it with the client’s public key. Likewise, the client keeps a copy of the server’s public key, and so on. It is a good idea, but asymmetric encryption is much slower than symmetric encryption. So in practice the two are combined to get the advantages of both: asymmetric encryption is used to transmit the symmetric key, and symmetric encryption is used for everything after that.
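The combination described above can be sketched as follows. This is a toy under loud assumptions: textbook RSA without padding, a key pair built from two well-known Mersenne primes, and a home-made XOR keystream standing in for a real symmetric cipher. All names (`symmetric`, `session_key`, and so on) are hypothetical, and none of this is safe for real use.

```python
import hashlib
import secrets

# Server side: toy RSA key pair (Mersenne primes chosen only for the demo).
P, Q, E = 2**61 - 1, 2**89 - 1, 65537
N = P * Q                              # public modulus, shared with clients
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)


def symmetric(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256-derived keystream."""
    stream = b""
    block = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + block.to_bytes(4, "big")).digest()
        block += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Client side: pick a random symmetric key and wrap it with the public key.
session_key = secrets.token_bytes(16)
wrapped = pow(int.from_bytes(session_key, "big"), E, N)

# Server side: only the private key can unwrap the symmetric key.
unwrapped = pow(wrapped, D, N).to_bytes(16, "big")

# From here on, both sides use the fast symmetric cipher.
request = symmetric(session_key, b"GET /account HTTP/1.1")
seen_by_server = symmetric(unwrapped, request)
print(seen_by_server)
```

The slow asymmetric operation happens once, for 16 bytes; everything else goes through the cheap symmetric path.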
-
The identity of the communicating party is not verified, so you may encounter impersonation
The answer here is certificates issued by a certificate authority. You may have heard of a site’s HTTPS certificate expiring, for example. An HTTPS certificate has to be issued by a certificate authority, which can cost hundreds of dollars a year. We’ll see how certificates are actually used when we get to the handshake process.
-
The integrity of the message cannot be proved, so it may have been tampered with
HTTP can’t tell whether a packet has been modified, but SSL sends data together with a digest called a Message Authentication Code (MAC), which lets the receiver tell whether the packet has been modified.
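The idea behind a MAC can be sketched with Python’s standard `hmac` module. The key value, the message, and the `tag` and `verify` helpers are hypothetical; in a real connection the MAC key comes out of the handshake and the record format is more involved.

```python
import hashlib
import hmac


def tag(key: bytes, message: bytes) -> bytes:
    """Compute a MAC over the message with the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Recompute the MAC and compare it in constant time."""
    return hmac.compare_digest(tag(key, message), received_tag)


mac_key = b"key agreed during the handshake"   # hypothetical value
message = b"Transfer 100 to Alice."
sent_tag = tag(mac_key, message)

ok = verify(mac_key, message, sent_tag)
tampered_ok = verify(mac_key, b"Transfer 900 to Alice.", sent_tag)
print(ok, tampered_ok)
```

An intermediary who changes the message cannot produce a matching tag without the key, so the modification is detected.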
Let’s look at the specific HTTPS encryption process.
-
The client sends a Client Hello to start the communication. It contains the SSL version and the list of encryption components (cipher suites) the client supports, plus a random number that will be used later; let’s call it random number A.
-
The server responds with a Server Hello: it selects one entry from the client’s list of encryption components and sends it back along with the SSL version. As in the previous step, it also returns a random number to the client; call it random number B.
-
The server sends its certificate, issued by a certificate authority and containing the server’s public key. Once the client has received and verified it, the client uses that public key to send the key material for the subsequent symmetric encryption.
So how do we verify that the certificate is valid? This is where private-key encryption and public-key decryption come in. When an authority issues a certificate, it computes a digest of the certificate to produce a fingerprint, then encrypts the fingerprint with its private key to produce a digital signature. The receiver computes the digest of the certificate using the algorithm the certificate specifies, then uses the authority’s public key to decrypt the digital signature. If the result matches the digest, the certificate is proved to be trusted.
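The sign-then-verify idea can be sketched with textbook RSA. Everything here is an assumption made for illustration: the authority’s key pair is built from two well-known Mersenne primes, there is no padding, and the certificate is just a byte string. Real certificates are X.509 structures verified by a TLS library, and `ca_sign` and `client_verify` are hypothetical names.

```python
import hashlib

# Toy key pair for the certificate authority (demo values, not secure).
P, Q, E = 2**127 - 1, 2**521 - 1, 65537
N = P * Q                              # authority's public modulus
D = pow(E, -1, (P - 1) * (Q - 1))      # authority's private exponent


def fingerprint(cert_body: bytes) -> int:
    """Digest of the certificate contents, as an integer."""
    return int.from_bytes(hashlib.sha256(cert_body).digest(), "big")


def ca_sign(cert_body: bytes) -> int:
    """Authority: encrypt the fingerprint with the private key."""
    return pow(fingerprint(cert_body), D, N)


def client_verify(cert_body: bytes, signature: int) -> bool:
    """Client: decrypt the signature with the public key, compare digests."""
    return pow(signature, E, N) == fingerprint(cert_body)


cert_body = b"subject=example.test; public_key=<server key>"
signature = ca_sign(cert_body)

print(client_verify(cert_body, signature))
print(client_verify(b"subject=evil.test; public_key=<attacker key>", signature))
```

A forged certificate fails the check because the attacker cannot produce a signature that decrypts to the forged certificate’s digest without the authority’s private key.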
-
The server sends a Server Hello Done packet. OK, phase 1 is over.
-
When the client receives the certificate, it verifies it against certificates stored locally; most browsers ship with a set of trusted certificates built in. Once the client has confirmed that the certificate really is trusted, it generates a random number, the pre-master secret, and encrypts it with the public key from the certificate the server sent earlier. This is sent in the Client Key Exchange packet.
-
Then the client sends a Change Cipher Spec packet, which tells the server that from now on communication will be encrypted with the key derived from the pre-master secret. So how is that key derived? In steps 1 and 2, the client and server each generated a random number and sent it to the other, and both sides now also hold the pre-master secret. The symmetric key is generated from random number A, random number B, and the pre-master secret, and it is the key used for all subsequent encryption.
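How three values turn into one shared key can be sketched as below. The expansion function is modeled on the HMAC-based pseudo-random function described for TLS 1.2; this is a simplified illustration rather than a complete key schedule, since real stacks derive several further keys from the result and the details vary by protocol version and cipher suite. The function and variable names are hypothetical.

```python
import hashlib
import hmac
import secrets


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """HMAC-based expansion in the style of the TLS 1.2 PRF."""
    out = b""
    a = seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


def derive_master_secret(pre_master: bytes, random_a: bytes, random_b: bytes) -> bytes:
    """Mix the pre-master secret with both random numbers into a 48-byte secret."""
    return p_sha256(pre_master, b"master secret" + random_a + random_b, 48)


random_a = secrets.token_bytes(32)     # sent by the client in Client Hello
random_b = secrets.token_bytes(32)     # sent by the server in Server Hello
pre_master = secrets.token_bytes(48)   # sent encrypted in Client Key Exchange

# Each side runs the same computation independently and gets the same key.
client_side = derive_master_secret(pre_master, random_a, random_b)
server_side = derive_master_secret(pre_master, random_a, random_b)
print(client_side == server_side)
```

Random numbers A and B travel in the clear, so an eavesdropper knows them; what keeps the key secret is the pre-master secret, which only the holder of the server’s private key can decrypt.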
-
The client sends a Finished packet, which contains a check value over the whole handshake so far, used to confirm its integrity.
-
The server sends a Change Cipher Spec packet to tell the client that it, too, is switching to the symmetric key.
-
The server sends a Finished packet.
-
After the server and client have exchanged Finished packets, the SSL connection is established and HTTP requests can be sent.
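In practice, application code rarely performs these steps by hand. As a rough sketch, Python’s standard `ssl` module runs the whole handshake, including certificate verification, inside `wrap_socket`. The `describe_connection` helper is a hypothetical name, and calling it needs network access and a host of your own choosing.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Default client settings: verify the certificate and the host name."""
    return ssl.create_default_context()


def describe_connection(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect, let the library run the handshake, and report what was negotiated."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        # The handshake described in the steps above happens inside wrap_socket.
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return {
                "version": tls.version(),        # negotiated protocol version
                "cipher": tls.cipher()[0],       # negotiated cipher suite name
                "subject": tls.getpeercert().get("subject"),
            }


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

If certificate verification fails, `wrap_socket` raises an `ssl.SSLError` subclass instead of returning a connection, so no HTTP request is ever sent to an unverified server.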
All told, the whole process is quite roundabout and quite a hassle. It was also a bit of a struggle to understand, because I have had no hands-on contact with the protocol. Having written this, I am still somewhat confused, and the biggest confusion is that I have no intuitive feel for why HTTP is insecure beyond knowing that it has these drawbacks. Can you understand how I feel? I know HTTP has these shortcomings, but that is all I know; I don’t know how an attack is actually carried out in practice. Maybe capturing packets and walking through the process will make it clearer. I have read Uncle Mouse’s blog and seen that he had a similar experience: after reading so much about TCP/IP, it was only in a real server room that he understood what it all meant.
Still, I am now more familiar with the HTTPS encryption process, so the goal has been achieved.