Common HTTP status codes

1xx

1xx status codes indicate an intermediate state of protocol processing and are rarely used.

2xx

The 2XX status code indicates that the server successfully processed the client’s request, which is the most desirable state.

“200 OK” is the most common success status code and means everything is fine. As long as the request method is not HEAD, the server returns body data in the response.

“204 No Content” is another common success status code, essentially the same as 200 OK but with no body data in the response.

“206 Partial Content” is used for HTTP range requests (chunked downloads and resumable transfers). It indicates that the body returned in the response is not the entire resource but only a part of it, and that the server processed the request successfully.

3xx

3xx status codes indicate that the resource requested by the client has moved, so the client must re-send the request to a new URL — that is, redirection.

“301 Moved Permanently” means a permanent redirect: the requested resource no longer lives at this URL, and the new URL must be used from now on.

“302 Found” means a temporary redirect: the requested resource still exists, but for the time being it must be accessed through a different URL.

Both 301 and 302 use the Location field in the response header to indicate the next URL to jump to, and the browser will automatically redirect to the new URL.

“304 Not Modified” does not indicate a jump. It means the resource has not been modified, so the client can keep using its cached copy. It is used for cache control and is sometimes called a cache redirect.
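
A minimal sketch of how a client picks the next URL from a 3xx response; the helper name and example URLs are illustrative:

```python
from urllib.parse import urljoin

def next_url(status, headers, current_url):
    """Return the URL to follow for a redirect response, or None.

    301/302 carry the target in the Location header; 304 is a cache
    redirect and requires no new request at all.
    """
    if status in (301, 302):
        location = headers.get("Location")
        if location is None:
            return None
        # Location may be relative; resolve it against the current URL.
        return urljoin(current_url, location)
    return None  # 304 and non-redirect statuses: nothing to follow

print(next_url(301, {"Location": "https://new.example.com/"}, "http://old.example.com/"))
print(next_url(302, {"Location": "/tmp-page"}, "http://example.com/a/b"))
print(next_url(304, {}, "http://example.com/"))
```

Browsers do exactly this automatically when they see the Location field.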

4xx

4xx status codes indicate that the message sent by the client is incorrect and the server cannot process it; they are client-side error codes.

“400 Bad Request” indicates that there is an error in the request message sent by the client.

“403 Forbidden” indicates that the server refuses to provide the resource; the request itself is not malformed, but access is denied.

“404 Not Found” indicates that the requested resource does not exist on the server, so it cannot be provided to the client.

5xx

5xx status codes indicate that the client’s request was valid but the server hit an internal error while processing it; they are server-side error codes.

“500 Internal Server Error” is the generic catch-all error code; it tells us nothing about what actually went wrong on the server.

“501 Not Implemented” indicates that the functionality the client requested is not yet supported, along the lines of “opening soon, stay tuned”.

“502 Bad Gateway” is returned when the server acts as a gateway or proxy: the gateway itself works, but an error occurred when it accessed the backend server.

“503 Service Unavailable” indicates that the server is busy and temporarily cannot serve the client, similar to “the network service is busy, please try again later”.
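
The five classes above can be summarized in a small lookup; a sketch for quick reference:

```python
def status_class(code):
    """Map an HTTP status code to the meaning of its class."""
    classes = {
        1: "informational (intermediate state, rarely used)",
        2: "success (server processed the request)",
        3: "redirection (resource moved, re-request with new URL)",
        4: "client error (the request itself is wrong)",
        5: "server error (request was valid, server failed)",
    }
    return classes.get(code // 100, "unknown")

print(status_class(204))
print(status_class(502))
```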

Common HTTP header fields

The Host field

When a client sends a request, it specifies the domain name of the server.

Host: www.A.com

With the Host field, requests can be directed to different sites hosted on the “same” server.
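
A sketch of the request a client might assemble, showing where the Host field sits; the helper function itself is illustrative, not from any library:

```python
def build_request(method, path, host, extra_headers=None):
    """Assemble a minimal HTTP/1.1 request; Host is mandatory in HTTP/1.1."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # A blank line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

print(build_request("GET", "/", "www.A.com").decode())
```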

The Content-Length field

When the server returns data, the Content-Length field tells the client the length of the response body.

Content-Length: 1000
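
One detail worth remembering: Content-Length counts bytes of the encoded body, not characters. A quick sketch:

```python
body = "héllo"                    # 5 characters
encoded = body.encode("utf-8")    # but 'é' occupies 2 bytes in UTF-8
print("Content-Length:", len(encoded))  # Content-Length: 6
```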

Connection field

The Connection field is most commonly used when the client asks the server to use a persistent TCP connection, so that the connection can be reused by subsequent requests.

HTTP/1.1 uses persistent connections by default, but in order to be compatible with older HTTP versions, you need to specify keep-alive as the value of the Connection header.

Connection: keep-alive

A TCP connection established this way can be reused until the client or server actively closes it. Note, however, that this is not a standard field.
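
The version-dependent default described above can be sketched as a small decision function (illustrative, not from any library):

```python
def keep_alive(http_version, connection_header=None):
    """Decide whether the TCP connection persists after this exchange.

    HTTP/1.1 defaults to persistent connections; older versions need an
    explicit "keep-alive". "close" always ends the connection.
    """
    if connection_header is not None:
        value = connection_header.strip().lower()
        if value == "close":
            return False
        if value == "keep-alive":
            return True
    return http_version == "HTTP/1.1"

print(keep_alive("HTTP/1.1"))                # True: default in 1.1
print(keep_alive("HTTP/1.0"))                # False: 1.0 closes by default
print(keep_alive("HTTP/1.0", "keep-alive"))  # True: explicitly requested
```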

The Content-Type field

The Content-Type field is used to tell the client what format the data is in when the server responds.

Content-Type: text/html; charset=utf-8

The header above indicates that the server is sending an HTML page encoded in UTF-8.

When a client requests, it can use the Accept field to declare which data formats it accepts.

Accept: */*

With the header above, the client declares that it can accept data in any format.

The Content-Encoding field

The Content-Encoding field specifies the compression method, i.e. which compression format the server used for the data it returns.

Content-Encoding: gzip

This indicates that the data returned by the server is gzip-compressed, so the client must decompress it with gzip.

When making a request, the client uses the Accept-Encoding field to state which compression methods it can accept.

Accept-Encoding: gzip, deflate
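
The negotiation above can be sketched with Python’s standard gzip module; the HTML body is illustrative:

```python
import gzip

# Server side: compress the response body and label it.
body = b"<html>hello</html>" * 100
compressed = gzip.compress(body)
headers = {"Content-Encoding": "gzip", "Content-Length": str(len(compressed))}

# Client side: seeing Content-Encoding: gzip, decompress before use.
restored = gzip.decompress(compressed)
assert restored == body
print(len(body), "->", len(compressed), "bytes on the wire")
```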

HTTP features

1. Advantages

HTTP’s greatest strengths are “simplicity, flexibility and ease of extension, wide application, and cross-platform.”

  1. Simple

    The basic HTTP message format is header + body, and the header is plain key-value text, which is easy to understand and lowers the barrier to learning and use.

  2. Flexible and easy to expand

    In HTTP, the request methods, URI/URL, status codes, header fields, and other components are not rigidly fixed, so developers can customize and extend them.

    And because HTTP works at the application layer (OSI Layer 7), its lower layers can be changed at will.

    HTTPS adds an SSL/TLS security transport layer between HTTP and TCP, and HTTP/3 even replaces TCP with UDP-based QUIC.

  3. Widely used and cross-platform

2. Disadvantages

HTTP has double-edged features that are both advantages and disadvantages, namely “statelessness” and “plaintext transmission”, plus one major drawback: “insecurity”.

There are many solutions to the stateless problem; a relatively simple one is Cookie technology. Cookies track client state by carrying cookie information in request and response messages. After the client’s first request, the server issues a “sticker” containing the client’s information; when the client makes later requests, it carries the sticker, and the server recognizes it.
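
The “sticker” exchange can be sketched with Python’s standard http.cookies module; the session_id name and value are illustrative:

```python
from http.cookies import SimpleCookie

# Server: after the first request, attach a "sticker" via Set-Cookie.
server_cookie = SimpleCookie()
server_cookie["session_id"] = "abc123"  # illustrative value
set_cookie_header = server_cookie["session_id"].OutputString()
print("Set-Cookie:", set_cookie_header)

# Client: on later requests, send the sticker back in the Cookie header.
client_cookie = SimpleCookie()
client_cookie.load(set_cookie_header)
print("Cookie: session_id=" + client_cookie["session_id"].value)
```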

The serious drawback of HTTP is that it is insecure:

  • Communication uses plaintext (no encryption), so the content can be eavesdropped on.
  • The identity of the communicating party is not verified, so impersonation is possible.
  • The integrity of messages cannot be proved, so they may have been tampered with.

HTTP’s security problems can be solved with HTTPS, that is, by introducing an SSL/TLS layer.

4. HTTP performance

The HTTP protocol is based on TCP/IP and uses a “request-reply” communication mode, so the key to performance lies in these two points.

1. Long connection

A big performance problem with early HTTP/1.0 was that every request required a new TCP connection (three-way handshake), and requests were serial, so connections were needlessly established and torn down, increasing communication overhead.

To solve this, HTTP/1.1 introduced long connections, also called persistent connections. This reduces the overhead of repeatedly establishing and closing TCP connections and lightens the load on the server. The defining characteristic of a persistent connection is that the TCP connection stays open as long as neither end explicitly closes it.

2. Pipeline network transmission

HTTP/1.1 uses long connections, which makes HTTP pipelining possible.

Within the same TCP connection, the client can issue multiple requests: as soon as the first request has been sent, the second can be sent without waiting for the first response, which reduces overall response time.

3. Head-of-line blocking

The request-reply pattern exacerbates HTTP’s performance problems.

When one request in the sequential queue is blocked for some reason, all requests behind it are blocked as well, and the client cannot get its data. This is called “head-of-line blocking”. It’s like being stuck in traffic on the way to work.

HTTP and HTTPS

HTTP has the following security risks because it is transmitted in plaintext:

  • Eavesdropping risk: content on the communication link can be intercepted, so user account information is easily leaked.
  • Tampering risk: for example, junk ads can be forcibly injected, polluting what users see.
  • Impersonation risk: for example, a fake Taobao site can easily make users’ money disappear.

HTTPS adds SSL/TLS protocol between HTTP and TCP layer, which can solve the above risks:

  • Information encryption: the exchanged content cannot be eavesdropped on.
  • Integrity check: if the content is tampered with in transit, it cannot be displayed normally, so tampering is detected.
  • Identity certificate: proves that the Taobao you are talking to is the real Taobao.

The SSL/TLS protocol can guarantee secure communication as long as the certificate authority (CA) does not act maliciously.

How does HTTPS address the above three risks?

  • Hybrid encryption provides confidentiality and removes the eavesdropping risk.
  • A digest algorithm provides integrity: it generates a unique “fingerprint” for the data, used to verify that it has not been tampered with.
  • Placing the server’s public key in a digital certificate removes the impersonation risk.

1. Hybrid encryption

The hybrid encryption method can ensure the confidentiality of information and solve the risk of eavesdropping.

HTTPS uses a hybrid encryption mode that combines symmetric and asymmetric encryption:

  • Asymmetric encryption is used to exchange the “session key” before communication begins; it is not used afterwards.
  • During communication, all plaintext data is encrypted with the symmetric “session key”.

Reasons for adopting the “hybrid encryption” approach:

  • Symmetric encryption uses a single key and is fast in operation, but the key must be kept secret, and there is no safe way to exchange it on its own.
  • Asymmetric encryption uses two keys, a public key and a private key. The public key can be distributed freely while the private key stays secret, which solves the key-exchange problem but is slow.

2. Digest algorithm

A digest algorithm provides integrity: it generates a unique “fingerprint” for the data, which is used to verify that the data is intact and removes the tampering risk.

Before sending, the client computes the plaintext’s “fingerprint” with the digest algorithm, encrypts “fingerprint + plaintext” into ciphertext, and sends it to the server. After decrypting, the server runs the same digest algorithm over the received plaintext and compares the computed fingerprint with the one the client sent. If the fingerprints match, the data is intact.
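
A sketch of the fingerprint check with Python’s hashlib, assuming SHA-256 as the digest (the article doesn’t name one; note that real TLS uses keyed MACs, since a plain hash could simply be recomputed by an attacker who alters the message):

```python
import hashlib

def fingerprint(data):
    """Compute a digest ("fingerprint") of the data, here SHA-256."""
    return hashlib.sha256(data).hexdigest()

plaintext = b"transfer 100 yuan to Alice"
sent = (fingerprint(plaintext), plaintext)  # fingerprint + plaintext

# Receiver recomputes the fingerprint and compares.
received_fp, received_data = sent
assert fingerprint(received_data) == received_fp  # intact

tampered = b"transfer 100 yuan to Mallory"
print(fingerprint(tampered) == received_fp)  # False: tampering detected
```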

3. Digital certificates

The client requests the public key from the server and encrypts the information with the public key. After receiving the ciphertext, the server decrypts it with its own private key.

This raises some questions. How do you ensure that the public key is not tampered with and trusted?

Therefore, a third-party authority (CA) is required to add the server public key to the digital certificate (issued by the CA). As long as the certificate is trusted, the public key is trusted.

A digital certificate is used to ensure the identity of the server’s public key to avoid the risk of impersonation.

4. How does HTTPS establish a connection

Basic flow of SSL/TLS protocol:

  • The client requests and verifies the public key of the server.
  • Both parties negotiate to produce session secret keys.
  • The two parties use the session key for encrypted communication.

The first two steps are the SSL/TLS setup, also known as the handshake phase.

The “handshake phase” of SSL/TLS involves four communications, as shown below:

Detailed process for establishing SSL/TLS protocol:

  1. ClientHello

First, the client initiates an encrypted communication request to the server, known as a ClientHello request.

In this step, the client sends the following information to the server:

(1) SSL/TLS protocol version supported by the client, for example, TLS 1.2.

(2) Client Random number produced by the Client, which is later used to produce “session secret key”.

(3) A list of cipher suites the client supports, such as the RSA encryption algorithm.

  2. ServerHello

When the server receives the client’s request, it sends a response called ServerHello. The server replies with the following content:

(1) Confirmation of the SSL/TLS protocol version. If the version the client supports is not acceptable, the server disables encrypted communication.

(2) Server Random number produced by the Server, which is later used to produce “session secret key”.

(3) List of confirmed cipher suites, such as RSA encryption algorithm.

(4) Digital certificate of the server.

  3. The client responds

After receiving the response from the server, the client uses the CA public key in the browser or operating system to verify the authenticity of the digital certificate of the server.

If there is no problem with the certificate, the client extracts the server’s public key from the digital certificate and uses it to encrypt packets to send the following information to the server:

(1) A random number (pre-master key). The random number is encrypted by the server’s public key.

(2) Notification of change of encrypted communication algorithm, indicating that subsequent information will be encrypted with “session secret key” for communication.

(3) Notification that the client handshake is finished. This message also carries a digest of all the preceding handshake data for the server to verify.

The random number in item (1) is the third random number of the handshake phase. After this, server and client both hold the same three random numbers, and each side uses the negotiated encryption algorithm to generate the “session key” for this communication.

  4. Final response from the server

After receiving the third random number (pre-master key) from the client, the server calculates the session key for the communication through the negotiated encryption algorithm. The final message is then sent to the client:

(1) Notification of change of encrypted communication algorithm, indicating that subsequent information will be encrypted with “session secret key” for communication.

(2) Notification that the server handshake is finished. This message likewise carries a digest of all the preceding handshake data for the client to verify.

At this point, the entire SSL/TLS handshake is complete. From here on, client and server communicate with the ordinary HTTP protocol, but encrypt the content with the “session key”.
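
How the three random numbers yield one shared session key can be sketched as follows. This is a toy derivation using HMAC-SHA256, not the real TLS key schedule; it only illustrates that both sides, holding the same three values, deterministically compute the same key:

```python
import hashlib
import hmac

def derive_session_key(client_random, server_random, pre_master):
    """Toy derivation: mix the three handshake random values into one key.

    Real TLS uses a standardized PRF/HKDF; HMAC-SHA256 stands in here
    only to show that identical inputs give identical keys on both ends.
    """
    return hmac.new(pre_master, client_random + server_random,
                    hashlib.sha256).hexdigest()

c, s, p = b"client-rand", b"server-rand", b"pre-master"
# Client and server run the same computation and agree on the key.
assert derive_session_key(c, s, p) == derive_session_key(c, s, p)
print(derive_session_key(c, s, p)[:16], "...")
```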

HTTP/1.1, HTTP/2, HTTP/3 evolution

Performance improvements in HTTP/1.1 over HTTP/1.0:

  • TCP long connections are used, which removes the overhead that HTTP/1.0’s short connections incurred.
  • Pipelined transmission is supported: as soon as the first request has been sent, the second can follow without waiting for the first response, reducing overall response time.

However, HTTP/1.1 still has performance bottlenecks:

  • Request/response headers are sent uncompressed; the more headers, the greater the latency. Only the body can be compressed.
  • Lengthy headers are sent in full every time; repeatedly sending the same headers is wasteful.
  • The server responds strictly in request order; if one response is slow, the client cannot get the data behind it — head-of-line blocking.
  • There is no request priority control.
  • Requests can only be initiated by the client; the server can only respond passively.

So what does HTTP/2 do to optimize these HTTP/1.1 bottlenecks?

The HTTP/2 protocol is based on HTTPS, so HTTP/2 security is also guaranteed.

HTTP/2 performance improvements over HTTP/1.1:

  1. Header compression

    HTTP/2 will compress headers. If you make multiple requests at the same time and their headers are the same or similar, the protocol will help you eliminate duplicates.

    This is known as the HPACK algorithm: a header table is maintained on both the client and the server, all fields are stored in this table, an index number is generated, and the same fields are not sent later, only the index number is sent, which increases speed.

  2. Binary format

    HTTP/2 is no longer like the plain text packets in HTTP/1.1, but fully adopts the binary format. The header information and data body are both binary, and they are collectively referred to as frames: header information frame and data frame.

    This is not friendly to humans, but it is friendly to computers, because computers only understand binary. After receiving a packet, they do not need to convert the plaintext packet into binary, but directly parse the binary packet, which increases the data transmission efficiency.

  3. Streams

    HTTP/2 packets are not sent strictly in order; consecutive packets within the same connection may belong to different responses, so each packet must be marked to indicate which response it belongs to.

    All the packets of one request or response are collectively called a stream (Stream). Each stream carries a unique number: streams initiated by the client get odd numbers, and streams initiated by the server get even numbers.

    The client can also assign priorities to streams, and the server responds first to requests with higher priority.

  4. Multiplexing

    HTTP/2 allows multiple requests or responses to be concurrent in a single connection, rather than in a one-to-one sequence.

    By removing serial requests in HTTP/1.1, there is no need to queue, and the “queue head blocking” problem is eliminated, reducing latency and greatly improving connection utilization.

  5. Server push

    HTTP/2 also partially improves on the traditional “request-reply” working mode: the server can actively push messages to the client instead of only responding passively.

    For example, when the browser requests only the HTML, the server can proactively push static resources such as JS and CSS files that the page will likely need, reducing latency. This is called Server Push (also Cache Push).
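
The index-table idea behind header compression (point 1 above) can be sketched as follows; this is a greatly simplified illustration of HPACK, omitting its static table, dynamic-table eviction, and Huffman coding:

```python
class HeaderTable:
    """Simplified HPACK sketch: both sides keep a table of header fields
    already seen; later occurrences are sent as a small index instead."""

    def __init__(self):
        self.table = []  # index -> (name, value)

    def encode(self, header):
        if header in self.table:
            return self.table.index(header)  # send just the index
        self.table.append(header)
        return header                        # first time: send literally

    def decode(self, item):
        if isinstance(item, int):
            return self.table[item]          # look up a previously seen field
        self.table.append(item)
        return item

enc, dec = HeaderTable(), HeaderTable()
h = ("user-agent", "Mozilla/5.0")
first = enc.encode(h)    # literal on first use
second = enc.encode(h)   # small integer index afterwards
assert dec.decode(first) == h and dec.decode(second) == h
print(first, "->", second)
```

Both endpoints update their tables in lockstep, which is why the same fields never need to be re-sent in full.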

Shortcomings of HTTP/2

The main problem with HTTP/2 is that multiple HTTP requests are multiplexing a TCP connection, and the underlying TCP protocol does not know how many HTTP requests there are. So once packet loss occurs, TCP’s retransmission mechanism is triggered, so all HTTP requests in a TCP connection must wait for the lost packet to be retransmitted.

  • In HTTP/1.1, if one request in the pipeline is blocked in transit, all the requests queued behind it are blocked.
  • In HTTP/2, multiple requests share one TCP connection, so a single lost packet blocks all of the HTTP requests on that connection.

All of these problems stem from the TCP transport layer, so HTTP/3 replaces the TCP beneath HTTP with UDP!

UDP sends packets with no regard for order or loss, so it suffers neither HTTP/1.1’s head-of-line blocking nor HTTP/2’s problem of one lost packet stalling everything.

UDP itself is not a reliable transport, but the QUIC protocol built on top of UDP achieves reliability comparable to TCP.

  • QUIC has its own set of mechanisms to ensure the reliability of transmission. When packet loss occurs in a stream, only this stream is blocked, and other streams are not affected.
  • TLS is upgraded to version 1.3, and the header compression algorithm is updated to QPACK.
  • Establishing an HTTPS connection takes six message exchanges: first the TCP three-way handshake, then the TLS/1.3 handshake. QUIC merges these TCP and TLS/1.3 exchanges into three, reducing the number of round trips.

So, QUIC is a pseudo-TCP + TLS + HTTP/2 multiplexing protocol on top of UDP.

QUIC is a new protocol, and many network devices do not know what QUIC is; they treat it as plain UDP, which causes new problems. That is why HTTP/3 is spreading very slowly, and it remains to be seen whether UDP can overtake TCP.

7. Optimize HTTP/1.1

The first idea is to use caching to avoid sending HTTP requests at all. After receiving the response to the first request, the client can cache it on local disk; as long as the cache has not expired, it reads the response data straight from the cache. When the cache expires, the client sends a request carrying a digest of the cached response; if the server finds that the resource has not changed, it returns a 304 response with no body, telling the client that the cached copy is still valid.
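
The validation flow above can be sketched on the server side as follows; the ETag-style MD5 digest is an illustrative choice, not something the text prescribes:

```python
import hashlib

def serve(resource, if_none_match=None):
    """Sketch of server-side cache validation with an ETag-style digest."""
    etag = hashlib.md5(resource).hexdigest()
    if if_none_match == etag:
        return 304, None, etag   # cache still valid: no body sent
    return 200, resource, etag   # send full body plus its fingerprint

page = b"<html>v1</html>"
status, body, etag = serve(page)       # first request: 200 + body
status2, body2, _ = serve(page, etag)  # revalidation: 304, no body
print(status, status2)  # 200 304
```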

The second idea is to reduce the number of HTTP requests in the following ways:

  1. Let a proxy server handle redirects that the client would otherwise process itself, reducing the number of redirect requests.
  2. Merge multiple small resources into one large resource for transfer, reducing the number of HTTP requests and the repeated transmission of headers. This also reduces the number of TCP connections, saving the network cost of TCP handshakes and slow start.
  3. Fetch resources on demand: load only what the current user can see or use, and fetch the next resources as the user scrolls, which both delays requests and reduces their number.

The third idea is to reduce the size of transmission resources by compressing response resources, so as to improve transmission efficiency. Therefore, a better compression algorithm should be selected.