Eight questions in a row about the HTTP protocol

Preface

Http is one of the most widely used protocols in the world.

This article covers the following questions

What is Http?
Why is the Http protocol stateless?
What is head-of-line blocking?
What is the difference between GET, POST and PUT?
Why introduce Https?
Why introduce Http2.0?
Why introduce Http3.0?
What happens when you enter a url in the browser?

1. What is `Http`?

Http stands for Hypertext Transfer Protocol. Its main job is to carry communication between a client and a server. We usually use it when viewing Web pages: based on the URL in the address bar, the browser fetches files and other resources from the Web server and then displays the page.

The Http protocol has the following characteristics

Connectionless: each request opens its own connection, and the connection is not kept once the request ends
Stateless: each request is independent, and nothing about it is remembered once it ends. This reduces overhead, but it is both a strength and a weakness
Flexible: the Content-Type field in the header lets Http transfer data of any type (text, images, video, and so on)
Simple and fast: to request a resource, the client only needs to send the request method and the URL. Because the protocol is so simple, Http server programs are small and communication is fast
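
To make the "request method, URL and Content-Type" idea concrete, here is a minimal sketch that builds an Http1.1 request as raw text. The helper name `build_request`, the host `example.com` and the paths are made up for illustration, and no network access is involved.

```python
def build_request(method: str, path: str, host: str, body: bytes = b"",
                  content_type: str = "") -> bytes:
    """Build a raw HTTP/1.1 request (illustrative helper, not a library API)."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        # Content-Type tells the receiver how to interpret the body bytes.
        lines.append(f"Content-Type: {content_type}")
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


# A GET needs little more than the method and the URL path.
get_request = build_request("GET", "/index.html", "example.com")

# A POST carries a body, described by Content-Type.
post_request = build_request("POST", "/notes", "example.com",
                             body=b'{"text": "hi"}',
                             content_type="application/json")
print(get_request.decode("ascii"))
```

Swapping the Content-Type value (for example to `image/png`) is all it takes to describe a different kind of body, which is the flexibility mentioned above.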

Of course, Http also has some disadvantages

Stateless: a request records no connection information. Without memory, the server cannot tell whether several requests come from the same client. If later processing needs earlier information, that information must be sent again, which can increase the amount of data transmitted on each connection
Plaintext transmission: Http packets are sent in plain text. If a man in the middle sits on the communication path, he can easily read the entire content of a request
Head-of-line blocking: with persistent connections enabled, multiple Http requests reuse one TCP connection, and that connection can only handle one request at a time. If the current request takes too long, the requests behind it are blocked

2. Why is the `Http` protocol stateless?

The Http protocol is stateless: each request is independent and the server does not store any client state. To identify the user, every request therefore has to carry identity information (such as cookies) in its headers, which increases the amount of data sent on each connection.

So why was Http designed this way?

Http was originally designed to be stateless because it was only used to browse static files, and a stateless protocol was sufficient and carried no extra burden.
As the web developed, it needed state. But was it necessary to modify the Http protocol to make it stateful? It was not. We often linger on one page for a long time before moving on to another, so maintaining state between the two page loads would be costly.
Second, older versions of Http are stateless, and Http now faces a new requirement. As is common practice in software, the way to stay compatible with historical versions is to add another layer on top of Http that provides what we want. That is why cookie, session and similar mechanisms were introduced to build stateful interactions.
At the same time, saving user state is a complex process. Http was deliberately kept simple so that it can process a large number of transactions quickly and remain scalable. So there is no need to build state management into the Http protocol itself.
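
The "extra layer" idea can be sketched in a few lines. This is only a toy, assuming a hypothetical in-memory `sessions` store and a cookie named `sid`. The handler itself remembers nothing between calls, and all identity comes from the Cookie header the client sends back.

```python
import uuid

sessions = {}  # hypothetical server-side store: session id -> user data


def handle(headers: dict) -> dict:
    """Handle one request; identity comes only from the Cookie header."""
    cookie = headers.get("Cookie", "")
    session_id = cookie[4:] if cookie.startswith("sid=") else None
    if session_id not in sessions:
        # Unknown client: create a session and ask the client to store its id.
        session_id = uuid.uuid4().hex
        sessions[session_id] = {"visits": 0}
    sessions[session_id]["visits"] += 1
    return {"Set-Cookie": f"sid={session_id}",
            "body": f"visits={sessions[session_id]['visits']}"}


first = handle({})                                 # no cookie yet
second = handle({"Cookie": first["Set-Cookie"]})   # client sends the cookie back
third = handle({})                                 # no cookie: treated as a stranger
print(first["body"], second["body"], third["body"])
```

The third call shows the cost of statelessness: without the cookie, the server has no way to know it is the same client.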

3. What is head-of-line blocking?

Http1.0 is connectionless: the client creates a connection to the server for each request/response and closes it as soon as the exchange completes. Http1.1 supports Keep-alive, so connections are persistent by default.

Persistent connections cut down the time spent on TCP handshakes and so speed up requests

But with persistent connections Http still suffers from head-of-line blocking: the TCP connection can be reused, yet the Http requests on it are still processed serially

Request 1 -> Response 1 -> Request 2 -> Response 2 -> Request 3 -> Response 3

As shown above, this is a serial first-in-first-out queue with no priorities: requests are handled strictly in the order they joined, and the one at the head goes first. If the request at the head takes too long, every request behind it can only wait. This is head-of-line blocking. There are a few ways to alleviate the problem

Open several persistent connections to one domain name. This adds more queues, so that a single slow request does not block all the others. Browsers currently allow roughly 6 to 8 concurrent connections per domain name (Chrome allows 6)
Because each domain name is limited to 6 to 8 concurrent connections, we can use multiple subdomains. Different resources are served from different subdomains that all point to the same server, so more persistent connections can run concurrently and head-of-line blocking is reduced
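
A small back-of-the-envelope simulation shows why extra connections help. The durations are invented numbers, and the round-robin assignment of requests to connections is a simplifying assumption rather than how any particular browser schedules them.

```python
def completion_times(durations, connections=1):
    """Return when each request finishes if every connection handles its
    requests strictly one after another (requests are dealt out round-robin)."""
    free_at = [0] * connections      # time at which each connection becomes idle
    finished = []
    for i, duration in enumerate(durations):
        conn = i % connections
        free_at[conn] += duration
        finished.append(free_at[conn])
    return finished


durations = [5, 1, 1]                # the first request is slow
print(completion_times(durations, connections=1))   # [5, 6, 7]
print(completion_times(durations, connections=3))   # [5, 1, 1]
```

With one connection the two fast requests finish at 6 and 7 because they wait behind the slow one. With three connections they finish at 1.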

4. What is the difference between `GET`, `POST` and `PUT`?

4.1 The difference between `GET` and `POST`

GET is used to retrieve information. It has no side effects, is idempotent, and is cacheable
POST is used to modify data on the server. It has side effects, is not idempotent, and is generally not cacheable

These are the main differences between GET and POST. Some articles say that the length of a GET URL is limited, but the HTTP protocol itself places no limit on the length of the body or the URL. The limits mostly come from browsers and servers. Leaving browsers aside, servers limit URL length for performance and security reasons: processing a long URL consumes more resources, and a limit prevents attacks built on maliciously long URLs.

4.2 The difference between `PUT` and `POST`

Some people say that POST is used to create data and PUT is used to update data. In fact, both can create data. The main difference is that PUT is idempotent and POST is not. PUT can therefore be used either to update data or to create it at a known URL, whereas POST is typically used to create data

If you POST the same data twice, two records are created. If you PUT the same data twice, only one record exists afterwards
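
The idempotency difference can be illustrated with an in-memory store. The functions `post` and `put` and the `/users` paths are hypothetical stand-ins for server handlers, not a real framework API.

```python
import itertools

store = {}                      # in-memory "database": URL path -> data
_ids = itertools.count(1)       # server-side id generator used by POST


def post(collection: str, data: dict) -> str:
    """Create a new resource under the collection; the server picks the URL."""
    path = f"{collection}/{next(_ids)}"
    store[path] = data
    return path


def put(path: str, data: dict) -> str:
    """Create or replace the resource at exactly this URL."""
    store[path] = data
    return path


post("/users", {"name": "Ann"})
post("/users", {"name": "Ann"})      # same data again: a second resource
put("/users/ann", {"name": "Ann"})
put("/users/ann", {"name": "Ann"})   # same data again: still one resource
print(sorted(store))                 # ['/users/1', '/users/2', '/users/ann']
```

Repeating the PUT leaves the store unchanged, which is why a client can safely retry a PUT after a timeout while retrying a POST may create duplicates.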

5. Why introduce `Https`?

As mentioned above, Http transmits in plaintext, which has the following major security drawbacks

Http communication uses plaintext (no encryption), so the content can be eavesdropped
The identity of the communicating party is not verified, so you may be talking to an impostor
The integrity of a message cannot be proved, so it may have been tampered with

The Https protocol was introduced so that the information exchanged between Client and Server cannot be eavesdropped on by third parties and is protected against tampering and impersonation

The main Https communication process does the following

1. The client and the server negotiate the encryption method

2. The client (Client) and the server (Server) verify each other's identity (in most deployments only the server presents a certificate)

3. The two parties securely exchange the key used for Https communication (Session Key)
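
On the client side, these steps are normally handled by a TLS library. As a rough sketch, Python's standard `ssl` module exposes the identity checks as settings on a context. The host `example.com` in the commented part is only a placeholder, and that part is left commented out because it needs network access.

```python
import ssl

# The default client context turns on the identity checks described above.
context = ssl.create_default_context()

print(context.verify_mode == ssl.CERT_REQUIRED)  # the server certificate must validate
print(context.check_hostname)                    # the certificate must match the host name

# To actually connect, one would wrap a TCP socket, for example:
#   import socket
#   with socket.create_connection(("example.com", 443)) as sock:
#       with context.wrap_socket(sock, server_hostname="example.com") as tls:
#           print(tls.version(), tls.cipher())   # negotiated protocol and cipher
```

The negotiation of the encryption method and the exchange of the session key happen inside `wrap_socket` during the handshake, so application code never handles the key directly.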

The details of Https encryption are more complex and involve certificate chain verification, man-in-the-middle attacks and other topics. I covered them in an earlier article and will not repeat them here. Interested readers can refer to: Android programmers need to understand Https and man-in-the-middle attack

6. Why introduce `Http2.0`?

The main disadvantages of Http1.1 are the following

1. Request and response headers are sent without compression
2. Headers are lengthy, and sending the same headers every time is wasteful
3. The server responds in request order. If a response is slow, the requests behind it have to wait, which is head-of-line blocking
4. There is no request priority control
5. Requests can only start from the client, and the server only responds passively

Http2.0 was introduced to address these issues and has the following major improvements over Http1.1

6.1 Header Compression

HTTP/2 compresses headers. If you send multiple requests whose headers are identical or similar, the protocol eliminates the duplication for you.

This is done by the HPACK algorithm: the client and the server each maintain a header table. Fields are stored in the table and given an index number, and once a field is in the table only its index is sent instead of the field itself, which saves bandwidth and increases speed
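
The index-table idea can be sketched as follows. This is a deliberately simplified toy and not real HPACK, which also has a predefined static table, table size limits and Huffman coding. The class name `HeaderTable` and the header values are invented for the example.

```python
class HeaderTable:
    """Toy index table shared in lockstep by sender and receiver."""

    def __init__(self):
        self.entries = []            # position in the list is the index

    def encode(self, headers):
        out = []
        for field in headers:        # field is a (name, value) tuple
            if field in self.entries:
                out.append(self.entries.index(field))   # send only the index
            else:
                self.entries.append(field)
                out.append(field)                       # send the full field once
        return out

    def decode(self, encoded):
        headers = []
        for item in encoded:
            if isinstance(item, int):
                headers.append(self.entries[item])      # look the field up
            else:
                self.entries.append(item)               # learn a new field
                headers.append(item)
        return headers


client, server = HeaderTable(), HeaderTable()
headers = [("user-agent", "demo/1.0"), ("accept", "text/html")]

first = client.encode(headers)       # first request: full fields
second = client.encode(headers)      # second request: indexes only
print(first)
print(second)                        # [0, 1]
print(server.decode(first) == headers, server.decode(second) == headers)
```

The second request sends two small integers instead of two full header fields, which is the saving that grows with long headers such as cookies.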

6.2 Multiplexing

6.2.1 Binary Framing

HTTP/2 no longer uses plaintext messages as HTTP/1.1 does. Both the header information and the data body are binary, and both are called frames (frame): header frames and data frames.

Binary is not friendly to humans, but it is friendly to computers. After receiving a message, the computer does not need to parse plaintext and can process the binary frames directly, which makes transmission more efficient

6.2.2 Multiplexing to solve head-of-line blocking

The Headers + Body message is now split into binary frames: the Headers frame carries the header fields and the Data frame carries the request body. After the split, the server no longer sees complete HTTP messages but a series of interleaved binary frames.

Both sides can send binary frames to each other. This bidirectional sequence of binary frames is called a Stream. HTTP/2 uses streams to carry the frames of many requests over one TCP connection, which is the concept of multiplexing. Frames that belong to different streams can be sent in any order, so requests do not queue behind one another and HTTP-level head-of-line blocking disappears.

For example, suppose the server receives two requests, A and B, from the client on one TCP connection. If it finds that A is time-consuming, it can send the part of A's response that is already done, then send B's complete response, and finally send the rest of A's response.
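
The A and B example can be mimicked with tagged chunks. This is only a sketch: real HTTP/2 frames have a binary header with length, type and flags, whereas here a frame is just a `(stream_id, chunk)` tuple and the stream ids 1 and 3 are arbitrary.

```python
from collections import defaultdict


def split_into_frames(stream_id, payload, size=4):
    """Cut one message into (stream_id, chunk) frames."""
    return [(stream_id, payload[i:i + size]) for i in range(0, len(payload), size)]


def reassemble(frames):
    """Group frames by stream id; order only matters within a stream."""
    streams = defaultdict(bytes)
    for stream_id, chunk in frames:
        streams[stream_id] += chunk
    return dict(streams)


a = split_into_frames(1, b"slow-response-A")   # response A, slow to produce
b = split_into_frames(3, b"fast-B")            # response B, ready early

# On the wire: the finished part of A, then all of B, then the rest of A.
wire = a[:1] + b + a[1:]
print(reassemble(wire))
```

Because every frame carries its stream id, the receiver can rebuild both responses even though B's frames arrived in the middle of A's.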

6.3 Server Push

HTTP/2 also improves the traditional "request-response" model to some extent: the server no longer only responds passively and can actively send resources to the client.

For example, when the browser has only requested the HTML, the server can proactively send static resources such as JS and CSS files that are likely to be needed, which reduces latency. This is called server push

7. Why introduce `Http3.0`?

We have barely learned Http2.0, so why is there an Http3.0 already? The reason is that Http2.0 still has some flaws

The main problem with HTTP/2 is that multiple HTTP requests are multiplexed over one TCP connection, and the underlying TCP protocol does not know how many HTTP requests there are. Once a packet is lost, TCP's retransmission mechanism is triggered, and all HTTP requests on that TCP connection must wait for the lost packet to be retransmitted.

As you can see, this is not really a problem with Http but with the transport layer, so HTTP/3 changes the protocol underneath Http from TCP to UDP!

UDP does not care about ordering or packet loss, so it has neither the head-of-line blocking of HTTP/1.1 nor the HTTP/2 problem where one lost packet stalls every request.

Everybody knows that UDP is an unreliable transport, but the QUIC protocol, which is built on UDP, provides reliability similar to TCP.

QUIC is a pseudo TCP + TLS + HTTP/2 multiplexing protocol on top of UDP. It is not described in detail here. Interested readers can refer to: This section describes the QUIC protocol connection process
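
The effect of moving ordering from the connection to the individual stream can be shown with a toy model. It is an assumption-laden sketch, not QUIC itself: packets are `(stream, seq)` tuples, and `lost` is the set of packets still waiting for retransmission.

```python
def deliverable(packets, lost, per_stream):
    """Return the packets the receiver can hand to the application right now.

    With per_stream=False everything behind a lost packet waits, as with one
    ordered TCP byte stream. With per_stream=True only the stream that lost
    a packet waits, which is the behaviour QUIC aims for.
    """
    delivered, blocked = [], set()
    for packet in packets:
        stream = packet[0]
        key = stream if per_stream else "connection"
        if packet in lost:
            blocked.add(key)             # later data under this key must wait
        elif key not in blocked:
            delivered.append(packet)
    return delivered


packets = [("A", 1), ("B", 1), ("A", 2), ("B", 2)]
lost = {("A", 1)}
print(deliverable(packets, lost, per_stream=False))   # []
print(deliverable(packets, lost, per_stream=True))    # [('B', 1), ('B', 2)]
```

In the TCP-like case the loss of one packet of request A also holds back request B. In the per-stream case B is delivered while A waits for its retransmission.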

QUIC is a new protocol, and many network devices do not know what it is and treat it as plain UDP, which causes new problems. As a result, HTTP/3 is spreading slowly, so a quick overview is enough for now.

8. What happens when you enter a `url` in the browser?

To answer this question, you need some understanding of the TCP/IP protocol family

8.1 The `TCP/IP` protocol family

An important aspect of the TCP/IP protocol family is layering. It is divided into four layers: application layer, transport layer, network layer, and data link layer. Layering has benefits. For example, if the Internet were governed by a single protocol, any design change would require replacing everything. With layering, you only need to replace the layer that changed. Once the interfaces between the layers are fixed, the internal design of each layer is free to change.

Layering also makes the design simpler. An application at the application layer only has to think about its own task, without needing to know where the other party is, what route the data takes, or whether delivery is guaranteed.

8.1.1 Application layer

The application layer determines how communication takes place when application services are provided to users. The TCP/IP protocol family comes with various common application services, for example File Transfer Protocol (FTP) and Domain Name System (DNS). The HTTP protocol is also in this layer.

8.1.2 Transport layer

The transport layer provides the application layer above it with data transfer between two computers on the network. It has two different protocols: Transmission Control Protocol (TCP) and User Datagram Protocol (UDP).

8.1.3 Network layer

The network layer handles the packets that flow across the network. A packet is the smallest unit of data transmitted over a network. This layer decides the path (the so-called route) along which packets reach the other computer. When the two computers communicate through multiple computers or network devices, the role of the network layer is to choose one route among the many options. The IP protocol is at the network layer

8.1.4 Data link layer

The data link layer handles the hardware part of connecting to the network. It includes the operating system's device drivers, the NIC (Network Interface Card), optical fiber and other physically visible parts (including all transmission media such as connectors). Everything in the hardware category falls within the scope of the link layer.

8.2 General process after entering a `url` in the browser

Parse the url entered by the user
Query the ip address for the domain name through the DNS protocol
The client initiates a request
The server accepts the request and processes it
The client receives the response
The browser renders the page
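
The first step, parsing the url, can be tried out with Python's standard `urllib.parse` module. The URL below is made up for the example.

```python
from urllib.parse import urlsplit, parse_qs

url = "https://www.example.com:8443/search?q=http&page=2#top"   # made-up URL
parts = urlsplit(url)

print(parts.scheme)            # https
print(parts.hostname)          # www.example.com
print(parts.port)              # 8443
print(parts.path)              # /search
print(parse_qs(parts.query))   # {'q': ['http'], 'page': ['2']}
print(parts.fragment)          # top

# Step 2 would then resolve parts.hostname to an IP address, for example with
# socket.getaddrinfo(parts.hostname, parts.port), which needs network access.
```

The host name is what gets handed to DNS, while the path and query are what the client later puts into the request line.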

Here we focus on how the client initiates the request and how the server receives it, which is where the TCP/IP protocol family is used

When sending data, each layer encapsulates the data, and when receiving data, each layer decapsulates it

In simple terms, the application layer sends the HTTP request, the transport layer establishes a TCP connection through the three-way handshake, the network layer addresses the data by IP address, and the data link layer and physical layer carry it to the server. The server then goes through the reverse process, unpacking the data at each layer. We do not go into much detail here. To learn more, please refer to: What happens when you type the URL into the browser and press Enter (super Detailed version)
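
Encapsulation and decapsulation can be pictured with a toy in which each layer simply prepends a text label. Real TCP, IP and Ethernet headers are binary structures with many fields, so the `[TCP]`, `[IP]` and `[ETH]` labels here are placeholders only.

```python
LAYERS = ["TCP", "IP", "ETH"]   # transport, network, data link (labels only)


def send(http_message: bytes) -> bytes:
    """Going down the stack: each layer prepends its own header."""
    data = http_message
    for layer in LAYERS:
        data = f"[{layer}]".encode() + data
    return data


def receive(frame: bytes) -> bytes:
    """Going up the stack: each layer strips the header its peer added."""
    data = frame
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        assert data.startswith(header), f"expected {layer} header"
        data = data[len(header):]
    return data


frame = send(b"GET / HTTP/1.1")
print(frame)            # b'[ETH][IP][TCP]GET / HTTP/1.1'
print(receive(frame))   # b'GET / HTTP/1.1'
```

The outermost header belongs to the lowest layer, so the receiver peels the headers off in the opposite order from the one in which the sender added them.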

Conclusion

This article reviews the key points of the Http protocol and answers the following questions

What is Http?
Why is the Http protocol stateless?
What is head-of-line blocking?
What is the difference between GET, POST and PUT?
Why introduce Https?
Why introduce Http2.0?
Why introduce Http3.0?
What happens when you enter a url in the browser?

If this article helped you, a like is welcome. Thank you ~

References

What is the difference between POST and PUT in HTTP?
What happens when you type the URL into the browser and press Enter (super Detailed version)
