Let’s take a look at these two images. The first shows a visit to www.12306.cn, which Google’s browser flags as an unsafe link, and the second is https://kyfw.12306… 56 began marking HTTP pages that collect password or credit card data as “insecure.” With Chrome 62, released in October 2017, HTTP pages with user input fields and all HTTP pages viewed in Incognito mode are marked as “insecure.” In addition, Apple announced that all iOS apps would be required to use HTTPS encryption by January 1, 2017.

The history of HTTP and HTTPS

What is HTTP?

The Hypertext Transfer Protocol (HTTP) is a stateless, request-response, application-layer protocol that usually carries its data over TCP/IP. It is the most widely used network protocol on the Internet, and all WWW documents must comply with it. HTTP was originally designed to provide a way to publish and receive HTML pages.

This official demo, set up by Akamai, requests 379 images at the same time over HTTP/1.1 and over HTTP/2. Comparing the request times makes it clear that HTTP/2 performs better.

Multiplexing: Multiple request and response messages are carried over a single HTTP/2 connection. The request streams share one TCP connection, so requests run in parallel without relying on the establishment of multiple TCP connections

HTTP Packet Format
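
The layout of an HTTP/1.1 message can be sketched in a few lines of Python: a start line, header lines, a blank line, then an optional body. This is only a minimal illustration; the helper names `build_request` and `parse_response`, the path `/index.html`, and the sample response are assumptions made for the example.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Serialize an HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"  # a blank line ends the header section
    return head.encode("ascii") + body


def parse_response(raw):
    """Split a raw HTTP/1.1 response into status code, reason, headers, body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


# Hypothetical request and a hand-written sample response.
request = build_request("GET", "/index.html", "www.12306.cn")
sample = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello"
status, reason, headers, body = parse_response(sample)
```

Everything in both messages is readable text, which is why a packet capture of plain HTTP shows the content directly.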

What is HTTPS?

In the book Illustrated HTTP, HTTPS is described as HTTP in an SSL shell. HTTPS is a transport protocol for secure communication over a computer network. It uses SSL/TLS to establish an encrypted channel and encrypt the data packets. The primary purpose of HTTPS is to authenticate the web server and to protect the privacy and integrity of the data exchanged. PS: TLS is a transport-layer encryption protocol and the successor to SSL, which Netscape released in 1995.

HTTP VS HTTPS

HTTP features:
  1. Stateless: The protocol stores no state about the client and cannot “remember” anything. For example, without extra measures you would have to log in again on every visit to a website

  2. Connectionless: Prior to HTTP/1.1, every request needed its own TCP connection, which meant a three-way handshake to open it and a four-way wave to close it. Combined with statelessness, this is wasteful: if a client requests the same resource several times in a short period, the server cannot tell that it has already answered that request, so it spends time and traffic responding in full each time.

  3. Request and response based: The client initiates a request and the server responds

  4. Simple, fast and flexible

  5. Communication is in plaintext: requests and responses do not verify the identity of the other party and cannot protect the integrity of the data

Here is a simple packet capture experiment to look at data transferred using HTTP requests:

** Result analysis: ** The HTTP protocol transmits data in plaintext.

** Strategies for dealing with statelessness: ** Consider a user who browses a shopping site for a long time. The user’s HTTP session state needs to be kept for a while after logging in, so that, for example, requests made within 30 minutes do not require logging in again.

  1. Use the Cookie/Session technique

  2. HTTP/1.1 persistent connections (HTTP keep-alive): The TCP connection is kept open as long as neither end explicitly asks to close it. Connection: keep-alive in the request header indicates that a persistent connection is used
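
The Cookie/Session approach for the 30-minute scenario can be sketched roughly as follows. This is a minimal in-memory illustration, not a production design: the names `sessions`, `login`, and `current_user` are hypothetical, and the `now` parameter exists only so the expiry can be exercised without waiting.

```python
import secrets
import time
from http.cookies import SimpleCookie

SESSION_TTL = 30 * 60  # 30 minutes, as in the scenario above
sessions = {}          # session id -> (user, expiry time); in-memory, for illustration


def login(user, now=None):
    """Create a server-side session and return the Set-Cookie header value."""
    now = time.time() if now is None else now
    session_id = secrets.token_hex(16)          # unguessable identifier
    sessions[session_id] = (user, now + SESSION_TTL)
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["max-age"] = SESSION_TTL
    cookie["session_id"]["httponly"] = True
    return cookie["session_id"].OutputString()


def current_user(cookie_header, now=None):
    """Return the logged-in user for a Cookie header, or None if absent or expired."""
    now = time.time() if now is None else now
    cookie = SimpleCookie(cookie_header)
    if "session_id" not in cookie:
        return None
    entry = sessions.get(cookie["session_id"].value)
    if entry is None or entry[1] <= now:
        return None
    return entry[0]
```

The browser sends the cookie back on every request, so the stateless protocol itself does not change; the state lives in the server-side table keyed by the session id.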

HTTPS features:

HTTPS builds on HTTP and uses SSL or TLS to provide data encryption, identity verification, and data integrity protection

Through packet capture, you can see that data is not transmitted in plaintext, and HTTPS has the following features:

  1. Content encryption: Hybrid encryption is used, so a man in the middle cannot directly view the plaintext content
  2. Authentication: A certificate proves to the client that it is talking to the server it intended to reach
  3. Data integrity protection: Prevents transmitted content from being impersonated or tampered with by a man in the middle

** Hybrid encryption: ** Combines asymmetric and symmetric encryption. The client encrypts the transmitted data with a symmetric key that it generates, and then encrypts that symmetric key with the public key of the asymmetric scheme. What travels over the network is therefore ciphertext encrypted with the symmetric key, plus the symmetric key itself encrypted with the public key. Even if an attacker intercepts the traffic, without the private key they cannot recover the symmetric key and so cannot recover the plaintext.
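
The idea can be illustrated with a deliberately simplified sketch. Everything here is a toy chosen so the example runs with the standard library only: the RSA key is built from two small Mersenne primes with no padding, and the symmetric cipher is an XOR keystream derived from SHA-256. Neither is safe for real use, and the function names are hypothetical.

```python
import hashlib
import secrets

# Toy RSA key pair (assumption for illustration; far too small for real use).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537                     # public key
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent, kept by the server


def keystream_xor(key, data):
    """Toy symmetric cipher: XOR the data with a SHA-256-based keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def client_encrypt(plaintext):
    """Encrypt data with a fresh symmetric key, then wrap that key with the public key."""
    session_key = secrets.token_bytes(16)
    ciphertext = keystream_xor(session_key, plaintext)
    wrapped_key = pow(int.from_bytes(session_key, "big"), E, N)
    return wrapped_key, ciphertext


def server_decrypt(wrapped_key, ciphertext):
    """Unwrap the symmetric key with the private key, then decrypt the data."""
    session_key = pow(wrapped_key, D, N).to_bytes(16, "big")
    return keystream_xor(session_key, ciphertext)
```

An eavesdropper sees only `wrapped_key` and `ciphertext`; recovering the symmetric key requires the private exponent `D`.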

** Digital digest: ** A one-way hash function “digests” the plaintext into a string of fixed length (such as 128 bits). Different plaintexts produce different digests with overwhelming probability, and the same plaintext always produces the same digest.
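
As a small illustration using Python’s standard `hashlib` module (SHA-256 is chosen here as an example hash function; the text above does not name one, and the helper name `digest` is hypothetical):

```python
import hashlib


def digest(text):
    """Return the SHA-256 digest of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


same_1 = digest("hello")
same_2 = digest("hello")        # same plaintext, same digest
different = digest("hello!")    # one extra character changes the digest
long_input = digest("x" * 10000)  # the digest length does not depend on input size
```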

** Digital signature technology: ** A digital signature is another application of public key cryptography. It combines public key encryption with the digital digest to form a practical signing scheme: the sender encrypts the digest of the message with its private key, and anyone holding the sender’s public key can check it. This ensures that:

  • The receiving party can verify the true identity of the sender;

  • The sender cannot deny the sent message afterwards.

  • The recipient or illegitimate party cannot forge or tamper with the packet.
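
A rough sketch of signing and verifying, under loud assumptions: the RSA key below is a toy built from two small Mersenne primes, there is no padding, and the digest is reduced modulo `N` only because the toy modulus is shorter than a SHA-256 digest. The function names are hypothetical; real systems use a vetted cryptographic library.

```python
import hashlib

# Toy RSA key pair (assumption for illustration; not secure).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537                     # public key, known to everyone
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent, known only to the signer


def digest_int(message):
    """Hash the message and map the digest into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message):
    """Signer: transform the digest with the private key."""
    return pow(digest_int(message), D, N)


def verify(message, signature):
    """Receiver: undo the transform with the public key and compare digests."""
    return pow(signature, E, N) == digest_int(message)
```

Only the holder of `D` can produce a signature that verifies, which is what lets the receiver attribute the message to the sender and detect tampering.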

Asymmetric encryption requires a public key, so where does the public key come from? It comes from a digital certificate, which a trusted certificate authority (CA) issues after verifying the identity of the server. The certificate contains the public key and information identifying its owner; the matching private key stays on the server and is not part of the certificate. The certificate is installed on the server and is used for server authentication and for encrypting data in transit.

4. HTTP communication transmission

When the user enters a URL and presses Enter, DNS resolves the domain name to obtain the IP address of the server. The server listens for client requests on port 80, and the client connects to that port over TCP/IP (which can be implemented through sockets). HTTP is an application-layer protocol in the TCP/IP model, so the communication process is essentially the data being encapsulated on its way down the protocol stack and decapsulated on its way back up.
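
The first step, turning a URL into a host and port to connect to, might look like this. It is a minimal sketch: `connection_target` and `DEFAULT_PORTS` are hypothetical names, and the DNS lookup itself is left out so the example does not depend on network access.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def connection_target(url):
    """Return the (host, port) a client would connect to for this URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]  # explicit port wins over the default
    return parts.hostname, port


http_target = connection_target("http://www.12306.cn/")
custom_target = connection_target("http://localhost:8080/index.html")
```

A real client would then resolve the host name, for instance with `socket.getaddrinfo(host, port)`, and open a TCP connection to one of the returned addresses.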

Packets are passed from the application layer to the transport layer. The transport layer establishes a connection with the server through the TCP three-way handshake and releases the connection with the four-way wave.

Why do you need three handshakes? To prevent errors when an invalid connection request segment suddenly reaches the server.

For example, suppose the first connection request segment sent by the client is not lost but is held up at a network node for a long time, so that it reaches the server only after the connection has been released. This is an invalid connection request segment. However, after receiving it, the server mistakes it for a new connection request from the client, sends an acknowledgement, and agrees to establish a connection. Without the “three-way handshake”, a new connection would be established as soon as the server sent the acknowledgement. Since the client has not requested a connection, it ignores the server’s acknowledgement and sends no data. The server, however, thinks that a new transport connection has been established and waits for the client to send data. Without the “three-way handshake”, a lot of server resources would be wasted in this way.

Why do you need four waves? TCP is full-duplex. When the client sends a FIN packet, it is telling the server that it has finished sending data, but it can still receive data from the server. When the server returns an ACK packet, it indicates that it knows the client will send no more data, but the server can still send data to the client. When the server sends its own FIN packet, it is telling the client that it has no more data to send either. Once the client acknowledges the server’s FIN, the TCP connection is terminated.
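
The half-closed state described above can be observed with ordinary sockets on the loopback interface. This is a sketch under assumptions: the function names are hypothetical, the operating system performs the actual handshake and FIN/ACK exchange, and the code only shows that one direction can be closed while the other keeps working.

```python
import socket
import threading


def serve_once(listener, received):
    """Accept one connection, read until the client's FIN, then still send a reply."""
    conn, _ = listener.accept()          # three-way handshake is done by the kernel
    with conn:
        chunks = []
        while True:
            data = conn.recv(1024)
            if not data:                 # empty read: the client has sent its FIN
                break
            chunks.append(data)
        received.append(b"".join(chunks))
        conn.sendall(b"late reply")      # the server-to-client direction is still open
    # leaving the with-block closes the socket, which sends the server's FIN


def half_close_demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(1)
    received = []
    worker = threading.Thread(target=serve_once, args=(listener, received))
    worker.start()
    with socket.create_connection(listener.getsockname()) as client:
        client.sendall(b"hello")
        client.shutdown(socket.SHUT_WR)  # send FIN: "no more data from me"
        reply = b""
        while True:
            chunk = client.recv(1024)    # the client can still receive
            if not chunk:
                break
            reply += chunk
    worker.join()
    listener.close()
    return received[0], reply
```

Calling `half_close_demo()` returns what the server read and what the client received after it had already finished sending.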

5. HTTPS implementation principle

SSL connection establishment process

  1. The client connects to port 443 of the server (for example, Baidu.com) and sends its request. The information sent is mainly random value 1 and the encryption algorithms supported by the client.

  2. After receiving the information, the server responds to the client with its handshake information, including random value 2 and the negotiated encryption algorithm. The chosen algorithm must come from the set of algorithms the client sent to the server.

  3. The server then sends a second response packet containing its digital certificate. The server must have a digital certificate, which can be self-signed or obtained from a certificate authority. The difference is that a self-signed certificate must be explicitly accepted by the user on the client before access can continue, whereas a certificate issued by a trusted authority does not trigger a warning page. The certificate does not contain the private key; it is essentially a public key accompanied by a lot of information, such as the certificate issuer, expiration time, the public key of the server, the signature of the third-party certificate authority (CA), and the domain name information of the server.

  4. The client parses the certificate; this is done by the TLS implementation on the client. It first checks whether the certificate is valid, for example its issuing authority and expiration time. If a problem is found, a warning box is displayed indicating that there is a problem with the certificate. If there is no problem with the certificate, the client generates a random value (the pre-master key).

  5. Once the certificate has been verified, the client assembles the session key from random value 1, random value 2, and the pre-master key. It then encrypts the pre-master key using the certificate’s public key.

  6. Transmit the encrypted information. What is sent here is the pre-master key encrypted with the certificate’s public key. Random values 1 and 2 were already exchanged in steps 1 and 2, so the pre-master key is the only secret the server still needs.

  7. The server decrypts the pre-master key with its private key and then assembles the session key from random value 1, random value 2, and the pre-master key, arriving at the same session key as the client.

  8. The client encrypts a message using the session key and sends it to the server to verify that the server can decrypt it correctly.

  9. The server likewise encrypts a message with the session key and sends it back to the client. If the client can decrypt the message, the SSL connection is established.
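
The point of steps 5 to 7, that both sides end up with the same session key without ever sending it, can be sketched as follows. The derivation function here is an assumption for illustration: a single HMAC-SHA256 call stands in for the key derivation that real SSL/TLS versions define, and the value sizes are only examples.

```python
import hashlib
import hmac
import secrets


def derive_session_key(pre_master, random_1, random_2):
    """Illustrative stand-in for the real derivation: mix the three values with HMAC."""
    return hmac.new(pre_master, random_1 + random_2, hashlib.sha256).digest()


random_1 = secrets.token_bytes(32)    # sent by the client in step 1
random_2 = secrets.token_bytes(32)    # sent by the server in step 2
pre_master = secrets.token_bytes(48)  # generated by the client in step 4

# Each side runs the same computation on the same three inputs.
client_key = derive_session_key(pre_master, random_1, random_2)
server_key = derive_session_key(pre_master, random_1, random_2)
```

An observer who saw both random values still cannot compute the key, because the pre-master key crossed the network only in encrypted form.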

Questions:

1. How can the client be sure that the public key delivered by the server is the real one, not one forged by a middleman?

2. How can the certificate be transferred safely? What if it is swapped in transit?

A digital certificate includes the server’s public key, the issuing authority, the server’s domain name, the CA’s signature over the certificate content (a digest of the certificate content is first computed with a hash function, and the authority then encrypts the digest with its private key to produce the digital signature), and the algorithm used to compute the signature.

Verify the certificate security process

  1. After receiving the certificate, the client reads the server’s public key and the certificate’s digital signature from it. The client then decrypts the digital signature with the public key of the issuing authority, which is already configured locally, to obtain the digest of the certificate information.

  2. The client then computes the digest of the received certificate itself, using the signature algorithm named in the certificate, and compares it with the digest recovered from the signature. If the two match, the certificate was issued by the authority for the server and has not been tampered with by a middleman. A middleman has the authority’s public key and can therefore read the certificate, but after tampering with the content he would need to sign it again, and he does not have the authority’s private key. A signature forged with any other key cannot be decrypted correctly by the client, so if a middleman modifies the certificate, its content and signature no longer match
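
In practice a client rarely implements these checks by hand. As one example, Python’s standard `ssl` module builds a client context that requires a valid certificate chain and a matching host name by default; the snippet below only inspects those settings and makes no network connection.

```python
import ssl

# Client-side context with the platform's trusted CA certificates loaded.
context = ssl.create_default_context()

settings = {
    "verify_mode": context.verify_mode,        # certificate verification is mandatory
    "check_hostname": context.check_hostname,  # certificate must match the requested host
}
```

Wrapping a TCP socket with `context.wrap_socket(sock, server_hostname=host)` then performs the handshake and raises an error if the certificate cannot be verified.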

Can a third-party attacker obtain a certificate that also names the real server (that is, disguise himself as the server)? In practice this is not feasible: when the attacker applies to a CA for a certificate, the CA requires proof of domain ownership, such as the WHOIS information of the domain name and the domain’s administrative email address. The attacker cannot provide that information, so he cannot fool the CA into believing that he owns the server’s domain name.

6. Application and summary

Safety considerations:
  1. The protection HTTPS offers is fairly limited in scope: it does almost nothing against intrusions, denial-of-service attacks, server hijacking, and similar threats

  2. The trust chain behind SSL certificates is not fully secure. In particular, where some countries can control a CA root certificate, man-in-the-middle attacks remain feasible

    A man-in-the-middle (MITM) attack is one in which an attacker intercepts, and possibly tampers with, communication data in a network. It is divided into passive MITM and active MITM. Passive MITM only steals communication data without modifying it, while active MITM can both steal the data and tamper with it. The most common man-in-the-middle attacks occur on public Wi-Fi or public routers.

Cost considerations:
  1. You need to purchase an SSL certificate. The more features the certificate offers, the higher the cost

  2. SSL certificates traditionally needed to be bound to an IP address, so multiple domain names could not share one IP address, and IPv4 addresses are too scarce to support that kind of consumption. (The Server Name Indication extension to TLS addresses this problem, but it requires browser and operating system support.)

  3. According to ACM CoNEXT, using HTTPS increases page load time by nearly 50% and power consumption by 10% to 20%.

  4. HTTPS connection caching is not as efficient as HTTP and has high traffic cost.

  5. The HTTPS connection consumes much more resources on the server side, and it costs more to support websites with many visitors.

  6. The HTTPS handshake takes a long time, which affects the website’s response speed and the user experience. A better approach is divide and rule: like the 12306 website, serve the home page over HTTP and use HTTPS for user information and other sensitive parts

SSL four-way handshake

  1. The client requests an SSL connection and sends the server a random number (Client Random) and the encryption methods it supports, such as RSA public key encryption. This message is transmitted in plaintext.
  2. The server replies with the encryption method it has chosen from those the client supports, a random number (Server Random), and a trusted server certificate containing its public key for asymmetric encryption.
  3. After receiving the reply from the server, the client generates a new random number (the Premaster Secret), encrypts it with the server’s public key, and sends it to the server.
  4. After receiving the reply from the client, the server decrypts it with its private key. Both sides then use Client Random, Server Random, and the Premaster Secret to derive, by an agreed algorithm, the symmetric encryption key (the session key) used for data transmission over the HTTP connection.

Why you need a “three-way handshake”

In the fourth edition of Computer Network by Xie Xiren, the stated purpose of the “three-way handshake” is “to prevent an invalid connection request segment from suddenly reaching the server and causing an error”. In another classic book, Computer Networks, the purpose of the three-way handshake is to solve the problem of “delayed duplicate packets in the network”. The two formulations make the same point. Xie Xiren’s example of an “invalid connection request segment” is the following: the first connection request segment sent by the client is not lost, but is held up at some network node for a long time, so that it reaches the server only some time after the connection has been released. By then it is an invalid segment. However, when the server receives it, it mistakes it for a new connection request from the client and sends an acknowledgement agreeing to establish a connection. Without the “three-way handshake”, a new connection would be established as soon as the server sent that acknowledgement. Since the client has not requested a connection, it ignores the server’s acknowledgement and sends no data. The server, however, assumes that the new transport connection has been established and keeps waiting for data from the client, so many of the server’s resources are wasted. The three-way handshake prevents this: in the case above, the client does not acknowledge the server’s acknowledgement, and because the server receives no acknowledgement, it knows that the client has not requested a connection. The main purpose is to prevent the server from wasting resources by waiting.

One might wonder why the ACK is sent together with the SYN during the TCP handshake, but separately from the FIN when closing. The reason is that TCP works in full-duplex mode. Receiving a FIN only means that the peer has no more data to send; the receiving side may still have data of its own to send, so it acknowledges the FIN first and sends its own FIN later.

Working Principle of SSL

SSL works in three phases: handshake, key export (key derivation), and data transmission.

1. Handshake stage:

In the handshake phase, you need to complete the following three tasks: establishing a TCP connection, verifying the server identity, and distributing the master key. The general process is described as follows:

After the TCP connection is established, the client sends a HELLO packet to the server containing the list of cryptographic algorithms it supports. After receiving the packet, the server selects a symmetric algorithm, an asymmetric algorithm, and a MAC algorithm, and returns its choices to the client together with its certificate (a certificate is an authority-certified binding between an entity and its public key).

Whenever public keys are used in encryption, there is a risk that an intruder substitutes a forged public key. A digital certificate issued by an authority is therefore needed to prove the binding between a public key and an entity.

After receiving the server’s certificate, the client can be sure that the server it is communicating with is the intended one. The client then extracts the server’s public key from the certificate, generates a random master key MS, encrypts it with the server’s public key, and sends it to the server. The server decrypts MS with its own private key, which completes the distribution of the master key.

Both the client and the server have the master key, which no one else knows, and the subsequent data encryption and validation process is easy.

2. Key export (key derivation):

In the key export stage, both communicating parties use the master key to generate the same four keys in the same way. The four keys have the following functions:

EB: session encryption key used to send data from the server to the client

MB: session MAC key used to send data from the server to the client

EA: session encryption key used to send data from the client to the server

MA: session MAC key used to send data from the client to the server

The session encryption key is the symmetric key used to encrypt the transmitted data, while the MAC key is the key used to verify the integrity of the transmitted data.

MAC: The message authentication code (MAC) is a technique for checking packet integrity. It is not complicated. The sender concatenates the data with an authentication key that both communicating parties share, and computes the hash of the result. This hash value is the MAC of the original data. The sender appends the MAC to the original plaintext and sends both to the receiver. The receiver concatenates the same authentication key with the received plaintext, computes the hash in the same way, and compares it with the received MAC. If they are the same, the data has not been tampered with. MA and MB are the authentication keys used for the MAC.
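
A minimal sketch of this check, using the standard `hmac` module. HMAC is a standardized keyed-hash construction that is generally preferred over plain concatenation; it is swapped in here as an example, and the function names and sample key are hypothetical.

```python
import hashlib
import hmac


def make_mac(auth_key, message):
    """Sender: compute a keyed hash of the message with the shared authentication key."""
    return hmac.new(auth_key, message, hashlib.sha256).digest()


def verify_mac(auth_key, message, received_mac):
    """Receiver: recompute the MAC and compare it in constant time."""
    return hmac.compare_digest(make_mac(auth_key, message), received_mac)


shared_key = b"example authentication key"   # placeholder for MA or MB
mac = make_mac(shared_key, b"transfer 100")
```

A message altered in transit, or a MAC computed with a different key, fails `verify_mac`.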

3. Data transmission

SSL divides the data stream into records. For data sent by the client, it computes a MAC over each record with MA (for integrity checking), appends the MAC to the record, encrypts the record and MAC together with EA, and sends the encrypted packet to the server. After receiving the packet, the server decrypts it with the same symmetric key EA and then uses MA to check the data integrity. Data sent from the server to the client is protected in the same way with EB and MB.
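
Putting the key export and data transmission phases together, a simplified sketch might look like this. Several things are assumptions made for illustration: the four keys are expanded from the master key with labelled HMAC calls, the cipher is a toy XOR keystream rather than a real symmetric algorithm, and details of the actual protocol such as sequence numbers and record headers are left out.

```python
import hashlib
import hmac

MAC_LEN = 32  # length of an HMAC-SHA256 tag


def derive_keys(master_key):
    """Expand the master key into EA, MA, EB, MB (illustrative derivation)."""
    return {label: hmac.new(master_key, label.encode("ascii"), hashlib.sha256).digest()
            for label in ("EA", "MA", "EB", "MB")}


def xor_stream(key, data):
    """Toy symmetric cipher: XOR with a SHA-256-based keystream (not secure)."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def protect(record, enc_key, mac_key):
    """Sender: append the MAC to the record, then encrypt record and MAC together."""
    mac = hmac.new(mac_key, record, hashlib.sha256).digest()
    return xor_stream(enc_key, record + mac)


def unprotect(packet, enc_key, mac_key):
    """Receiver: decrypt, split off the MAC, and check integrity."""
    plain = xor_stream(enc_key, packet)
    record, mac = plain[:-MAC_LEN], plain[-MAC_LEN:]
    expected = hmac.new(mac_key, record, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):
        raise ValueError("record failed integrity check")
    return record
```

The client would call `protect(record, keys["EA"], keys["MA"])` and the server `unprotect(packet, keys["EA"], keys["MA"])`; traffic in the other direction uses `EB` and `MB`.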