What is HTTP?

HTTP is an application-layer communication protocol. HTTP/2 and earlier versions run on top of TCP, so HTTP inherits TCP's features, including connection orientation, congestion control, and flow control. HTTP itself, however, is a stateless protocol: the server does not record any state from one request to the next (and, in the simplest case, the connection is closed once the exchange is complete; more on this later). The next time the client makes a request, the server treats it as if it were brand new.

HTTP persistent and non-persistent connections

In HTTP/1.0, a client that wants to get an object has to set up a connection, and the connection is closed as soon as the server returns the response. To get another object, the client has to establish a new connection. This is a non-persistent connection.

Alternatively, a client can set up a single connection and obtain multiple objects through multiple requests over it. This is called a persistent connection. HTTP/1.1 uses persistent connections by default.

With non-persistent connections, the server has to handle a large number of connections and must reserve buffers and maintain variables for each one, which can cause significant server overhead. In addition, since HTTP is based on TCP, every new connection requires a three-way handshake, which adds latency and consumes network resources.

With a persistent connection, the client and server only need to maintain one connection. After the client establishes it, the server keeps it open so that the client can reuse it for further requests. The connection is closed in the following cases:

1. The client's request times out and the server closes the connection automatically.
2. The server has returned the data and the client has received all of it. The client knows it has received everything because a response header (Content-Length) tells it the size of the body.
3. When transferring files, the client may not know when the transfer is finished. In this case, the server notifies the client that the transfer is complete and actively closes the connection.

Persistent connections have clear advantages over non-persistent ones: a single connection is enough to obtain many resource objects, and the server only needs to maintain one set of buffers and variables per client, which frees up server resources. They also reduce the number of connection setups, so the server can respond to client requests more quickly.
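To make the difference concrete, here is a minimal sketch using only the Python standard library. It starts a throwaway local server and records the client-side TCP port the server sees for each request: the same port twice means one connection was reused, two different ports mean two connections were opened. The names EchoHandler, fetch_twice, and seen_ports are made up for this illustration, and the sketch assumes a local loopback interface is available.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_ports = []  # client port observed by the server for each request


class EchoHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 lets http.server keep the connection open by default.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        seen_ports.append(self.client_address[1])
        body = b"hello"
        self.send_response(200)
        # Content-Length tells the client where the body ends,
        # so the connection can stay open for the next request.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_twice(reuse):
    """Send two GET requests, over one connection (reuse=True) or two."""
    seen_ports.clear()
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port)
        for _ in range(2):
            conn.request("GET", "/")
            conn.getresponse().read()  # drain the body before reusing the socket
            if not reuse:
                # Non-persistent style: drop the connection after every response.
                conn.close()
                conn = http.client.HTTPConnection("127.0.0.1", port)
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return list(seen_ports)
```

Calling fetch_twice(True) should return the same port twice, while fetch_twice(False) should return two different ports, one per TCP connection (and therefore one handshake each).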

HTTP Protocol Format

Here’s a request to visit Baidu’s home page:

Line 1: GET is the request method, followed by a space; / is the request path, followed by a space; HTTP/1.1 indicates that the protocol version is 1.1.

Line 2: Host indicates the requested domain name.

Line 3: Connection indicates whether the connection should be kept open after the response is returned (keep-alive keeps it open).

Line 4: Cache-Control indicates the caching mechanism. no-cache or max-age=0 means that the client can cache resources, but the validity of a cached resource must be verified each time before it is used. An HTTP request is therefore made every time, but the download of the response body can be skipped while the cached content is still valid.

Line 5: Upgrade-Insecure-Requests signals to the server that the client prefers an encrypted and authenticated response and that it can handle the upgrade-insecure-requests CSP directive. In short, the client supports being upgraded, so the server can redirect it to the secure version of the site. The server can add a Vary header to the response so that caches do not serve that response to clients that do not support the upgrade mechanism.

Line 6: User-Agent identifies the application type, operating system, software vendor, and version of the user agent that initiated the request.

Line 7: Accept indicates the types of content the client can handle.

Line 8: Accept-Encoding indicates the content encodings (such as compression algorithms) that the client can accept.

Line 9: Accept-Language indicates the languages the client supports.

Line 10: Cookie carries the identifier that the server previously assigned to the client. Cookies can be disabled.

Line 11: DNT (Do Not Track) indicates the user’s preference regarding website tracking. It lets users state whether they care more about personal privacy or about customized content.

0 indicates that the user is willing to let the target site track their personal information. 1 indicates that the user does not want the target site to track their personal information.

Line 12: Sec-GPC (Global Privacy Control) indicates that the user does not want their personal information to be sold or shared.
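The layout described line by line above (a request line, then one header per line, then a blank line) can be sketched in a few lines of Python. This is only an illustration of the wire format: build_request and parse_request are made-up helper names, the host www.baidu.com and the header values are assumptions rather than a copy of the original capture, and real servers and clients handle many cases this ignores.

```python
def build_request(method, path, headers):
    """Serialize a request line and headers into HTTP/1.1 wire format."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Lines are separated by CRLF; an empty line ends the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_request(raw):
    """Split raw request bytes into (method, path, version, headers)."""
    head, _, _body = raw.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("ascii").split("\r\n")
    method, path, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lowercase.
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


# Illustrative values only (assumed, not taken from the original capture).
raw = build_request("GET", "/", {
    "Host": "www.baidu.com",
    "Connection": "keep-alive",
    "Cache-Control": "max-age=0",
    "DNT": "1",
})
```

Passing raw back through parse_request recovers the method, path, version, and a dictionary of headers keyed by lowercase name.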

Here is the response:

Line 1: HTTP/1.1 indicates the protocol version, 200 is the status code, and OK indicates that the request succeeded. The other fields have the same meanings as in the request, but appear here as the answer to it. Although an HTTP message can carry many fields, the meaning of each can be found here. The general HTTP message format can be summarized as follows:

Some common request methods:

Note the difference between POST and GET requests: GET requests are usually used to retrieve data, and their parameters are usually appended to the URL as a query string, so there is a practical length limit. POST is typically used to submit a form and change data on the server, and its parameters are usually carried in the request body. Of course, POST requests can also do what GET requests do. Here is some discussion of the two methods: www.w3schools.com/tags/ref_ht… Stackoverflow.com/questions/5…
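As a small sketch of where the parameters end up in each case, the following uses Python's urllib to build (but not send) one request of each kind. The URL http://example.com/search and the parameter names are placeholders chosen for this example.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"q": "http", "page": "1"}  # hypothetical parameters

# GET: the parameters are appended to the URL as a query string.
get_req = Request("http://example.com/search?" + urlencode(params))

# POST: the same parameters are encoded into the request body instead.
post_req = Request(
    "http://example.com/search",
    data=urlencode(params).encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)
```

urllib chooses the method from the presence of a body: get_req.get_method() is "GET" and post_req.get_method() is "POST". Nothing is transmitted until a request is passed to urlopen.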

Some commonly used response fields:

What are the differences between HTTP/1.0, HTTP/1.1 and HTTP/2

Short-lived connection model:

In HTTP/1.0, each connection carries only one HTTP request. Each time the browser initiates a request, it establishes a new connection and closes it when the response arrives. Since HTTP/1.0 runs on TCP, every request means establishing a new TCP connection. TCP connections are expensive to set up: the client and server must complete a handshake before any data can be sent. HTTP performance suffers further if the network is congested or bandwidth is low.

Persistent connection model:

The persistent connection model allows the browser to make multiple requests over a single connection. The Keep-Alive header can specify how long the connection should be kept open, and the server closes the connection based on that time. HTTP/1.1 uses persistent connections by default. In HTTP/1.0, the client has to set the Connection header to a value other than close (typically keep-alive) to enable persistent connections.

Flaws of the persistent connection model: even though multiple requests can reuse a connection, the connection still consumes server resources while it is idle, and idle connections leave the server more exposed to DoS attacks. The model does allow concurrency, but only by opening multiple connections: within a single HTTP/1.x connection, the browser has to wait for one request to complete before it can initiate the next.

Note that in HTTP/1.x, browsers typically open up to six parallel connections per domain. Sites sometimes spread resources across several domains to get around that limit, which can introduce cross-domain problems.

Improvements in HTTP/2

HTTP/2 introduces a new binary framing layer that dramatically improves performance by changing how data is transmitted between the client and server (in frames) without changing the application semantics of HTTP/1.x. HTTP/2 compresses request headers and multiplexes data frames from many requests over a single connection, thus achieving true concurrency. In addition, HTTP/2 supports request prioritization, flow control, and server push.

Where HTTP/1.x is a text protocol that uses newlines as delimiters, HTTP/2 splits the data into smaller messages and frames and encodes them in binary.

The figure above shows the mapping of streams, messages, and frames in an HTTP/2 connection. There are three concepts to understand here.

• Stream: a bidirectional byte stream within an established connection that can carry one or more messages.
• Message: a complete sequence of frames that maps to a logical request or response on a stream.
• Frame: the smallest unit of HTTP/2 communication. Each frame contains a frame header, which at least identifies the stream to which the frame belongs.

The relationship between these concepts can be summarized as follows:

• All communication is done over a single TCP connection, which can carry any number of bidirectional streams.
• Each stream has a unique identifier and optional priority information, and carries bidirectional messages.
• Each message is a logical HTTP message (such as a request or response) consisting of one or more frames.
• A frame is the smallest unit of communication and carries a specific type of data, such as HTTP headers or message payload. Frames from different streams can be interleaved and then reassembled based on the stream identifier in each frame header.
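The interleave-and-reassemble idea in the last bullet can be sketched with the 9-byte frame header that the HTTP/2 specification (RFC 7540) defines: a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a reserved bit followed by a 31-bit stream identifier. The sketch below is deliberately simplified and is not a working HTTP/2 implementation: it only handles DATA frames, ignores flags, header compression, and flow control, and the helper names pack_frame, iter_frames, and reassemble are invented for this example.

```python
import struct
from collections import defaultdict

DATA = 0x0  # frame type code for DATA frames


def pack_frame(frame_type, flags, stream_id, payload):
    """Prefix a payload with a 9-byte HTTP/2 frame header."""
    length = len(payload)
    header = struct.pack(
        ">BHBBI",
        length >> 16, length & 0xFFFF,  # 24-bit length split into 8 + 16 bits
        frame_type,
        flags,
        stream_id & 0x7FFFFFFF,  # the top bit is reserved and left as zero
    )
    return header + payload


def iter_frames(buf):
    """Yield (type, flags, stream_id, payload) for each frame in buf."""
    offset = 0
    while offset < len(buf):
        hi, lo, frame_type, flags, stream_id = struct.unpack_from(">BHBBI", buf, offset)
        length = (hi << 16) | lo
        stream_id &= 0x7FFFFFFF  # mask off the reserved bit
        start = offset + 9
        yield frame_type, flags, stream_id, buf[start:start + length]
        offset = start + length


def reassemble(buf):
    """Group interleaved frame payloads by stream identifier."""
    streams = defaultdict(bytes)
    for _type, _flags, stream_id, payload in iter_frames(buf):
        streams[stream_id] += payload
    return dict(streams)


# Two streams (1 and 3) whose frames are interleaved on one connection.
wire = (
    pack_frame(DATA, 0, 1, b"hel")
    + pack_frame(DATA, 0, 3, b"wor")
    + pack_frame(DATA, 0, 1, b"lo")
    + pack_frame(DATA, 0, 3, b"ld")
)
```

Here reassemble(wire) groups the payloads back into b"hello" for stream 1 and b"world" for stream 3, even though their frames arrived interleaved.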

The figure above shows HTTP/2 transferring data over a single TCP connection. The client and server can transmit several data streams independently, with their frames interleaved. The receiving end then reassembles the frames into messages according to the stream identifiers.

• Although HTTP/2 fixes head-of-line blocking at the HTTP/1.x level, it is still subject to head-of-line blocking at the TCP layer.
• If TCP window scaling is disabled, the bandwidth-delay product limits the throughput of the connection.
• When packet loss occurs, the TCP congestion window shrinks, which can reduce the throughput of the whole connection.

HTTP status codes:

• 1xx: informational

Since HTTP/1.0 does not define any 1xx status codes, servers must not send 1xx responses to HTTP/1.0 clients except under experimental conditions.

100 Continue

The server has received the request headers, and the client should continue by sending the request body. The server must send a final response after the request has been completed. To have the server check the request headers before the body is sent, the client must include Expect: 100-continue in the headers of its initial request and wait for the 100 Continue status code before sending the body.

101 Switching Protocols

The server has understood the client’s request and will use the Upgrade header to inform the client to use a different protocol to complete the request. After sending the blank line at the end of the response, the server will switch to the protocols defined in the Upgrade header.

• 2xx: success

200 OK The request was successful, and the response headers or body that the request asked for are returned with this response.

204 No Content The server successfully processed the request and is not returning any content.

• 3xx: redirection

300 Multiple Choices The requested resource has a list of alternative representations, each with its own specific address, for browser-driven negotiation. The user or browser can choose a preferred address to redirect to.

302 Found Requires the client to perform a temporary redirect.

304 Not Modified Indicates that the resource has not been modified since the version specified by the If-Modified-Since or If-None-Match request headers. In this case, the resource does not need to be downloaded again because the client still has a copy from the previous download.

305 Use Proxy The requested resource can be accessed only through the specified proxy.

• 4xx: client error

400 Bad Request The server cannot process the request because of a client error, for example incorrect syntax, a request body that is too large, or an otherwise invalid request.

401 Unauthorized The request does not carry the necessary authentication credentials.

403 Forbidden The server understood the request, but refused to perform it.

404 Not Found The request failed because the requested resource was not found on the server. Subsequent requests by the client are allowed.

408 Request Timeout The request timed out: the client did not finish sending a request within the time the server was prepared to wait. The client can resubmit the request at any time without any changes.

• 5xx: server error

Indicates that the server failed to complete an apparently valid request. These status codes mean that the server encountered an error or abnormal condition while processing the request, or that it is aware its current hardware and software resources cannot complete the request. Except when responding to a HEAD request, the server should include a body explaining the error and whether the condition is temporary or permanent.

500 Internal Server Error The server encountered an unexpected condition that prevented it from completing the request.

501 Not Implemented The server does not support a feature required by the current request, for example when it does not recognize the request method and cannot support it for any resource (such as a new feature in a Web services API).

503 Service Unavailable The server is currently unable to process requests due to temporary server maintenance or overload. The condition is temporary and will recover over time.

502 Bad Gateway The server, acting as a gateway or proxy, did not receive a valid response from the upstream server.

505 HTTP Version Not Supported The server does not support, or refuses to support, the HTTP version used in the request. This implies that the server is unable or unwilling to use the same version as the client. The response should include a body that describes why the version is not supported and which protocols the server supports.
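Since the first digit of a status code determines its class, a client can classify any code with integer division. The following minimal sketch uses the HTTPStatus enum from the Python standard library to look up the reason phrase; the describe helper and the CLASSES table are names made up for this example.

```python
from http import HTTPStatus

# The first digit of a status code determines its class.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return 'code phrase (class)' for an HTTP status code."""
    category = CLASSES.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"  # code is not registered in the standard library
    return f"{code} {phrase} ({category})"
```

For example, describe(404) gives "404 Not Found (client error)" and describe(302) gives "302 Found (redirection)".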


If anything in this article is imprecise or wrong, please point it out. Thank you very much.