Source: official account [Jie Ge’s IT Journey]
Author: Alaska
ID: Jake_Internet
What does HTTPS add on top of HTTP?
Hi, I’m Jack. Today's topic is the HTTP protocol.
HTTP overview
HTTP is short for Hypertext Transfer Protocol. It is used to transfer hypertext from a World Wide Web server to the local browser.
HTTP is a communication protocol built on TCP/IP for transferring data (HTML files, image files, query results, and so on).
HTTP is an application-layer protocol. Because it is simple and fast, it suits distributed hypermedia information systems. It was proposed in 1990 and has been refined and extended ever since: HTTP/1.0 was followed by HTTP/1.1, which has long been standardized, and later by HTTP/2 and HTTP/3.
HTTP works in a client-server architecture. The browser, acting as the HTTP client, sends requests to the HTTP server by URL. The web server sends a response back to the client based on the request it received.
HTTP features:
- Simple and fast: when a client requests a service from the server, it only needs to send the request method and path. Commonly used request methods are GET, HEAD, and POST; each method specifies a different kind of interaction between the client and the server. Because the protocol is simple, HTTP server programs are small and communication is fast;
- Flexible: HTTP allows any type of data object to be transferred. The type being transferred is marked by the Content-Type header;
- Connectionless: each connection handles only one request. The server closes the connection after processing the client's request and receiving the client's acknowledgement, which saves transmission time (HTTP/1.1 later made persistent connections the default);
- Stateless: HTTP is a stateless protocol, meaning the protocol keeps no memory of earlier transactions. If later processing needs earlier information, that information must be retransmitted, which can increase the amount of data sent per connection. On the other hand, the server responds faster when it does not need the earlier information;
- Supports both B/S (browser/server) and C/S (client/server) modes.
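The "simple and fast" and "flexible" points can be tried out with a minimal sketch that uses only Python's standard library. The handler class, the path, and the page body are made up for illustration; the server is a throwaway one bound to localhost, not a model of a production setup.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a small HTML page."""

    def do_GET(self):
        body = b"<h1>hello</h1>"
        self.send_response(200)
        # Content-Type tells the client what kind of data object follows.
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once(path="/index.html"):
    """Start a local server, send one GET, return (status, content type, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)  # the request is just a method and a path
        response = conn.getresponse()
        result = (response.status, response.getheader("Content-Type"), response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_once()` returns the status code, the Content-Type header, and the body bytes. Swapping the header value and the body is all it takes to serve a different kind of data.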
With all of these advantages, is there anything wrong with HTTP? The answer is yes, and the reason is simple: if HTTP were perfect, why would we need a secure protocol called HTTPS?
Drawbacks of HTTP:
When we send private data (such as a bank card or ID card number) to the server over HTTP, its security cannot be guaranteed.
First, the data may be captured by a man in the middle during transmission, so it can be stolen.
Second, after obtaining the data, the man in the middle may modify or replace it before sending it on to the server.
Finally, when the server receives the data, it cannot tell whether the data has been modified or replaced, nor can it be sure that the data really came from the client.
In summary, HTTP has three drawbacks:
- Confidentiality of messages cannot be guaranteed;
- The integrity and accuracy of the message cannot be guaranteed;
- The reliability of the source cannot be guaranteed;
Introduction to HTTPS
What can be done about these HTTP problems? HTTPS was created to solve them.
HTTPS (Hyper Text Transfer Protocol over Secure Socket Layer) is an HTTP channel designed for security. It is the secure version of HTTP.
In other words, an SSL layer is added underneath HTTP. Secure Sockets Layer (SSL, today succeeded by TLS) is the basis of HTTPS security, so the details of encryption are handled by SSL. HTTPS is now widely used for security-sensitive communication on the World Wide Web, such as payment transactions.
HTTPS uses asymmetric encryption while the connection is being set up, so that an eavesdropper cannot recover the plaintext. Now let’s look at the specific workflow.
Working principle:
HTTPS establishment process
Here, the life of an HTTPS connection, from establishment to disconnection, is divided into 6 phases and 12 steps. Each of the 12 steps is explained below:
1. Client Hello (client): The client sends a Client Hello message to start SSL communication. The message contains the SSL version the client supports and a list of cipher suites (the encryption algorithms and key lengths it can use).
2. Server Hello (server): If the server can do SSL communication, it responds with a Server Hello message. Like the client's message, it contains the SSL version and a cipher suite. The server's cipher suite is selected from the list received from the client.
3. Certificate (server): The server then sends a Certificate message, which contains its public key certificate.
4. Server Hello Done (server): Finally, the server sends a Server Hello Done message to tell the client that the initial stage of the SSL handshake negotiation is complete.
5. Client Key Exchange (client): After the first part of the handshake, the client responds with a Client Key Exchange message. It contains a random string called the pre-master secret, which is used for encrypting the communication. The message is encrypted with the public key from step 3.
6. Change Cipher Spec (client): This message tells the server that everything the client sends after it will be encrypted with keys based on the pre-master secret.
7. Finished (client): This message contains a checksum over all the messages exchanged so far. Whether the handshake succeeds depends on whether the server can decrypt this message correctly.
8. Change Cipher Spec (server): The server likewise sends a Change Cipher Spec message (it is switching to the agreed key).
9. Finished (server): The server likewise sends a Finished message (it has received the key).
10. Send the request (client): Once the Finished messages have been exchanged, the SSL connection is established. The client sends the HTTP request to the server.
11. Send the response (server): The server receives the HTTP request, processes it, and returns the HTTP response.
12. Disconnect (client): Finally the client closes the connection. It sends a close_notify message first, and after that a TCP FIN packet is sent to close the TCP connection.
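The two ends of the handshake that a program can observe are the cipher suite list the client offers in step 1 and the values that are in force once step 9 is done. The sketch below uses Python's standard `ssl` module. Modern Python negotiates TLS, the successor to SSL, so the exact messages may differ from the 12 steps above. The function names are made up for this example, and `describe_connection` needs network access and a host of your choosing.

```python
import socket
import ssl


def offered_cipher_suites():
    """Names of the cipher suites a default client context is willing to offer."""
    context = ssl.create_default_context()
    return [suite["name"] for suite in context.get_ciphers()]


def describe_connection(host, port=443):
    """Complete a handshake with `host` and report what was negotiated.

    Requires network access; `host` is whichever HTTPS server you want to inspect.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=5) as raw:
        # wrap_socket runs the whole handshake before returning.
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return {
                "protocol": tls.version(),          # negotiated protocol version
                "cipher_suite": tls.cipher()[0],    # suite picked from the client's list
                "has_certificate": tls.getpeercert() is not None,
            }
```

`offered_cipher_suites()` works offline and shows what the client would put in its Client Hello; `describe_connection("your.host.example")` shows which of those suites the server picked.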
In addition, in the flow above, the application layer attaches a message digest called a MAC (Message Authentication Code) when it sends data. The MAC makes it possible to detect whether a message has been tampered with, which protects message integrity.
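A minimal sketch of how a MAC exposes tampering, using Python's standard `hmac` module. HMAC-SHA256 and the key below are assumptions made for illustration; the MAC algorithm actually used on a connection depends on the negotiated cipher suite.

```python
import hashlib
import hmac


def make_mac(key: bytes, message: bytes) -> bytes:
    """Compute a MAC over the message with a key both sides share."""
    return hmac.new(key, message, hashlib.sha256).digest()


def is_untampered(key: bytes, message: bytes, mac: bytes) -> bool:
    """Recompute the MAC on the receiving side and compare in constant time."""
    return hmac.compare_digest(make_mac(key, message), mac)
```

The sender transmits the message together with `make_mac(key, message)`. If a man in the middle changes the message without knowing the key, `is_untampered` returns `False` on the receiving side.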
To make this clearer, the following figure shows the digital certificate part in a little more detail than the figure above (image from Illustrated HTTP).
That covers how an HTTPS connection is established and how communication proceeds. Given that workflow, which algorithms make it possible, and how is asymmetric encryption achieved? How does it work mathematically, and what is the theoretical basis? What underpins HTTPS and allows it to encrypt the transmission?
The theoretical principle of HTTPS:
HTTPS is built on encryption and decryption, digital certificates, and digital signatures. Here are the basic concepts behind these technologies.
To ensure the confidentiality of messages, encryption and decryption are needed. Encryption and decryption algorithms are currently divided into symmetric encryption and asymmetric encryption.
Symmetric encryption (shared key encryption)
When the client and server share one secret key for both encrypting and decrypting messages, this is called symmetric encryption. The client and server agree on an encryption key. The client encrypts the message with the key before sending it, and the server decrypts the message with the same key after receiving it.
The encryption process, illustrated:
The symbols used for symmetric encryption here are:
- M: the plaintext, the content we intend to transmit;
- C: the secret key. In symmetric encryption the same secret key is used to encrypt and to decrypt (the algorithm can be as simple as addition, subtraction, multiplication, or division, or it can be very complex);
- N: the ciphertext, the plaintext after encryption with the secret key. This is what travels over the network.
For example, the client wants to send 1 (the plaintext) to the server. It computes 1 + 3 = 4 (3 is the secret key) and transmits the ciphertext 4. The server receives 4 and computes 4 - 3 = 1 (again with the secret key 3) to recover the plaintext. Communication in the other direction works the same way.
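The arithmetic above can be written out as a toy sketch. Adding a number is only the article's illustration and offers no real protection; the function names are made up for this example.

```python
def encrypt(plaintext: int, key: int) -> int:
    """Toy symmetric encryption: add the shared key to the plaintext."""
    return plaintext + key


def decrypt(ciphertext: int, key: int) -> int:
    """Toy symmetric decryption: subtract the same shared key."""
    return ciphertext - key
```

With the shared key 3, `encrypt(1, 3)` gives the ciphertext 4 and `decrypt(4, 3)` gives back the plaintext 1. Both directions use the same key, which is exactly what makes the scheme symmetric.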
Advantages of symmetric encryption:
- Symmetric encryption solves the problem of message confidentiality in HTTP.
Disadvantages of symmetric encryption:
- Symmetric encryption guarantees message confidentiality, but because the client and server share one secret key, that key is especially easy to leak;
- Because the risk of key disclosure is high, it is hard to guarantee the reliability of the message source or the integrity and accuracy of the message;
With symmetric encryption the risk of key disclosure is high, and a fixed key is easy to crack. Is there a better way to encrypt the transmission, for example using a different key for each encryption and decryption, or some other way of increasing security?
Asymmetric encryption (public key encryption)
Since the key is so easy to leak with symmetric encryption, we can use asymmetric encryption to solve the problem. With asymmetric encryption, the client and the server each have a public key and a private key. The public key can be disclosed, but the private key must be visible only to its owner.
A message encrypted with a public key can only be decrypted with the corresponding private key. Conversely, a message encrypted with the private key can only be decrypted with the public key. So the client encrypts a message with the server’s public key before sending it, and the server decrypts it with its own private key after receiving it.
The encryption process, illustrated:
The symbols are as follows:
- M: the plaintext, the content we intend to transmit;
- D: the public key, used for encryption in an asymmetric encryption algorithm;
- E: the private key, used for decryption in an asymmetric encryption algorithm;
- N: the ciphertext, the plaintext after encryption with the public key. This is what travels over the network.
Here the server generates public key D and private key E, keeps the private key to itself, and publishes public key D. A client that wants to communicate with the server encrypts the plaintext with public key D and sends the ciphertext to the server, which holds private key E. The server decrypts the ciphertext with private key E and finally obtains the plaintext.
Introduction to asymmetric encryption algorithm RSA
RSA is among the most influential public key encryption algorithms. It has resisted most of the cryptographic attacks known so far and has been recommended by ISO as a public key data encryption standard.
Today only short RSA keys can be broken by brute force. As of 2008, there was no reliable way in the world to attack the RSA algorithm. Messages encrypted with RSA are practically unbreakable as long as the key is long enough. However, as distributed computing and quantum computing mature, the security of RSA encryption is being challenged.
The RSA algorithm rests on a very simple fact from number theory: multiplying two large prime numbers is easy, but factoring their product is extremely difficult, so the product can be made public as part of the encryption key.
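To see that number theory fact at work, here is a toy sketch of textbook RSA with deliberately tiny primes. The primes, the exponent, and the function names are chosen for illustration only. Real RSA uses primes hundreds of digits long plus padding, so this is not a usable implementation.

```python
from math import gcd


def make_toy_keypair(p: int, q: int, public_exponent: int):
    """Build a (public, private) key pair from two primes. Toy sizes only."""
    n = p * q                      # the product is published
    phi = (p - 1) * (q - 1)        # easy to compute only if p and q are known
    if gcd(public_exponent, phi) != 1:
        raise ValueError("public exponent must be coprime with (p-1)*(q-1)")
    private_exponent = pow(public_exponent, -1, phi)  # modular inverse, Python 3.8+
    return (n, public_exponent), (n, private_exponent)


def encrypt(message: int, public_key) -> int:
    """Encrypt an integer smaller than n with the public key."""
    n, exponent = public_key
    return pow(message, exponent, n)


def decrypt(ciphertext: int, private_key) -> int:
    """Recover the integer with the private key."""
    n, exponent = private_key
    return pow(ciphertext, exponent, n)
```

For example, `make_toy_keypair(61, 53, 17)` publishes the product 3233. Anyone can encrypt with it, but deriving the private exponent requires knowing the factors 61 and 53. With primes this small the factoring is trivial; with large enough primes it is not.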
HTTP performance tuning
Reduce the number of HTTP requests
Reducing the number of HTTP requests is a very important part of performance optimization, so almost every set of optimization guidelines includes one rule ahead of the others: reduce the number of HTTP requests.
Let’s first consider why reducing HTTP requests can optimize performance:
1. It reduces the time spent on DNS requests. Whether or not this always holds, fewer HTTP requests generally mean less time spent on DNS requests and resolution;
2. It reduces the load on the server. This is usually the first reason people think of, and it was the main reason I used to give others. Every HTTP request consumes server resources, especially requests that need computation on the server, such as merging files. Server CPU is not to be taken lightly: disks can be bought cheaply, but CPU resources are not so cheap;
3. It reduces HTTP request headers. Each request to the server carries the cookies for that domain and other information in the HTTP headers, and the response carries headers too, so bandwidth suffers with every request and response;
DNS request and resolution
Put simply, take a URL such as www.taobao.com: the www part is the hostname, taobao is the second-level domain, and com is the top-level domain. In a URL such as www.ali.tao.com, ali is a third-level domain.
When we request a URL, the local DNS server is checked first for a cached resolution result. If there is none, the query goes to a root DNS server, which returns to the local DNS server the address of the DNS server responsible for the top-level domain being queried. The local DNS server then queries that address, which returns the address of the DNS server for the next level of the domain name, and so on until the IP address of the host named in the URL is found. The result is cached for next time and returned.
The DNS resolution for the first request to a URL can be expensive, but the result is cached afterwards, so later requests do not have to go through the whole resolution process again.
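The lookup chain and the effect of caching can be sketched with an in-memory simulation. Nothing here touches the network. The server names, the hostname, and the address are hypothetical stand-ins (192.0.2.0/24 is a range reserved for documentation), and a real resolver also handles TTLs, retries, and many record types.

```python
# Hypothetical stand-ins for the DNS hierarchy, for illustration only.
ROOT = {"com": "tld-server-for-com"}
SERVERS = {
    "tld-server-for-com": {"example.com": "authoritative-for-example"},
    "authoritative-for-example": {"www.example.com": "192.0.2.10"},
}


class LocalResolver:
    """Simulated local DNS server that walks the hierarchy and caches results."""

    def __init__(self):
        self.cache = {}
        self.upstream_queries = 0

    def resolve(self, hostname: str) -> str:
        if hostname in self.cache:          # cached: no upstream traffic at all
            return self.cache[hostname]
        labels = hostname.split(".")
        self.upstream_queries += 1          # ask the root about the top-level domain
        answer = ROOT[labels[-1]]
        for depth in range(2, len(labels) + 1):
            name = ".".join(labels[-depth:])
            self.upstream_queries += 1      # ask the server returned by the last step
            answer = SERVERS[answer][name]
        self.cache[hostname] = answer       # remember the result for next time
        return answer
```

Resolving `www.example.com` the first time costs three simulated upstream queries (root, top-level domain, authoritative server). Resolving it again costs none, which is why only the first request to a domain pays the full price.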
Reduce server stress
Too many HTTP requests are dangerous for the server. If your server is not very powerful, consider this first. Other optimization strategies are only optimizations, but this one decides whether the server stays up at all, so make sure your server keeps running properly.
Taobao, however, has servers fast enough to deliver a good user experience. If your server cannot offer that speed and cannot handle frequent asynchronous requests of this kind, approach such optimizations with care: the latency may make navigation unusable. This, too, depends on the scenario.
Taobao now deploys CDNs widely, which gives us sufficient back-end resources. As the CDN and the back-end environment keep improving, we should focus more on front-end transfer speed and on rendering and parsing speed.
Reduce HTTP headers
HTTP headers can be huge. Open the Taobao.com home page and run alert(document.cookie): you will find that Taobao’s cookie is relatively large. Every request to Taobao’s servers carries this data, along with other header information, and it takes up a lot of space. You can imagine how costly that is.
In practice, once a CDN is in use, this matters much less. The CDN and the Taobao main site are not under the same domain name, so their cookies do not pollute each other, and the CDN domain carries essentially no cookies or extra header information. Requests for static resources therefore do not drag the main site's cookies along and transfer only the resource itself, so the performance impact becomes minimal with a CDN. However, if your static resource server and your main server are in the same domain, you need to keep cookies and other headers small, because they are sent with every request.
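A back-of-the-envelope sketch of that cost: the function below estimates how many bytes the Cookie header alone adds when the same cookie is sent with every request. The function name and the numbers in the usage note are assumptions made for illustration, not measurements of any real site.

```python
def cookie_overhead_bytes(cookie_value: str, request_count: int) -> int:
    """Bytes spent on the Cookie header line across `request_count` requests."""
    header_line = f"Cookie: {cookie_value}\r\n"   # sent again with every request
    return len(header_line.encode("utf-8")) * request_count
```

A 1000-character cookie sent with 50 static resource requests adds 1010 bytes per request, or 50,500 bytes per page load. Serving those resources from a cookie-free domain removes that overhead entirely.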
Conclusion
This time we took a first look at the HTTP and HTTPS network protocols and went through the advantages and disadvantages of HTTP. HTTPS emerged because of HTTP's shortcomings. We followed its working principle through the figures; it is fairly complicated and deserves further study. Then we talked about HTTP performance tuning: reducing the number of requests, reducing server stress, and so on.
In short, different scenarios call for different priorities, and optimization should fit the size and type of the site rather than blindly chase standards and best practices.
That's all for this article.
See you next time!