Three-way handshake

About TCP

Transmission Control Protocol (TCP) is a connection-oriented, reliable, byte-stream-based transport-layer protocol. Its counterpart is the User Datagram Protocol (UDP), an unreliable transport-layer protocol.

TCP Packet Format

Background knowledge

  • SYN: the synchronization bit. SYN=1 indicates a connection request. When SYN=1 and ACK=0, the segment is a connection request. If the peer agrees to establish the connection, it sets SYN=1 and ACK=1 in its response. So SYN=1 marks either a connection request or a connection-accept segment.
  • ACK: the acknowledgment bit. ACK=1 means the acknowledgment number is valid; ACK=0 means it is invalid.
  • seq: the sequence number. Its initial value is chosen randomly.
  • ack: the acknowledgment number. It is the sequence number sent by the peer plus 1.
  • FIN: the termination bit. FIN=1 means the sender has finished sending data and wants the connection released.

A simple dialogue to get a basic picture of the three-way handshake

C: I’d like to send you a message (SYN=1, seq=100)

S: OK, I’m ready, go ahead (SYN=1, ACK=1, seq=200, ack=101)

C: OK, received (ACK=1, ack=201)

  1. The client sends a SYN segment (SYN=1) with sequence number seq=X to the server and enters the SYN_SENT state.

  2. The server replies with SYN=1 and ACK=1. The acknowledgment number is ack=X+1, the server’s own sequence number is seq=Y, and the server enters the SYN_RCVD state.

  3. After receiving this segment, the client replies with ACK=1 and ack=Y+1 and enters the ESTABLISHED state.

The segments exchanged during the handshake carry no data. Once the three-way handshake is complete, the client and server start transmitting data.
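
The numbering in the three steps can be traced with a small simulation. This is only a sketch: the `Segment` class and `three_way_handshake` function are hypothetical names made up for illustration, and nothing here touches a real socket.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A simplified TCP segment holding only the fields used in the handshake."""
    syn: int = 0       # SYN flag
    ack_flag: int = 0  # ACK flag
    seq: int = 0       # sequence number
    ack: int = 0       # acknowledgment number (meaningful only when ack_flag=1)


def three_way_handshake(client_isn: int, server_isn: int) -> list[Segment]:
    """Return the three segments exchanged for the given initial sequence numbers."""
    # Step 1: client -> server, SYN=1, seq=X (client enters SYN_SENT)
    syn = Segment(syn=1, seq=client_isn)
    # Step 2: server -> client, SYN=1, ACK=1, seq=Y, ack=X+1 (server enters SYN_RCVD)
    syn_ack = Segment(syn=1, ack_flag=1, seq=server_isn, ack=syn.seq + 1)
    # Step 3: client -> server, ACK=1, ack=Y+1 (client enters ESTABLISHED)
    ack = Segment(ack_flag=1, seq=syn.seq + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=100, server_isn=200)
for segment in segments:
    print(segment)
```

With X=100 and Y=200 this reproduces the numbers in the dialogue: the server acknowledges 101 and the client acknowledges 201.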

Why a three-way handshake

Imagine a two-way handshake. The client sends request packet A, but network delay keeps it from reaching the server in time, so the client sends the request again. The server receives the second request, establishes a connection, and waits for the client to send data. The client sends its data normally. After a while, the delayed packet A also reaches the server. The server establishes another connection and waits for the client to send data, but the client knows nothing about it, so server resources are wasted.

Once the three-way handshake has confirmed the connection, the HTTP request is sent

The HTTP request

Sending an HTTP request means constructing an HTTP request message and sending it over TCP to a specified port on the server. A request message consists of a request line, request headers, and a request body.

POST /auth/login HTTP/1.1    // request line
// request headers
Host: blog-server.hunger-valley.com
Connection: keep-alive
Content-Length: 41
Accept: application/json, text/plain, */*
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VybmFtZSI6Imh1bmdlciIsImlkIjoxLCJpYXQiOjE2MTExMjc1MjMsImV4cCI6MTYxMTM4NjcyM30.U-CkNW7WU0zprsjI23eK-0TE5wS_gD-2ZTFW8wE31FU
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36
Content-Type: application/json; charset=UTF-8
Origin: https://jirengu-inc.github.io
Referer: https://jirengu-inc.github.io/
Accept-Encoding: gzip, deflate, br
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8

{"username":"hunger","password":"123456"}    // request body
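
To make the three-part layout concrete, here is a minimal sketch that assembles a raw request by hand. The `build_request` helper is a hypothetical name used only for illustration, and only a few of the headers from the sample are included; a real client would normally leave this to an HTTP library.

```python
import json


def build_request(method: str, path: str, headers: dict[str, str], body: bytes = b"") -> bytes:
    """Assemble request line + headers + blank line + body, separated by CRLF."""
    lines = [f"{method} {path} HTTP/1.1"]       # request line
    all_headers = dict(headers)
    if body:
        # Content-Length is the size of the body in bytes.
        all_headers["Content-Length"] = str(len(body))
    lines += [f"{name}: {value}" for name, value in all_headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"      # a blank line ends the headers
    return head.encode("ascii") + body


body = json.dumps(
    {"username": "hunger", "password": "123456"}, separators=(",", ":")
).encode("utf-8")
raw = build_request(
    "POST",
    "/auth/login",
    {
        "Host": "blog-server.hunger-valley.com",
        "Content-Type": "application/json; charset=UTF-8",
    },
    body,
)
print(raw.decode("utf-8"))
```

The compact JSON body is 41 bytes long, which matches the Content-Length value in the sample request.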

The request line

It contains the request method, the URL, and the protocol version.

  • Common request methods are GET, POST, PUT, DELETE, PATCH, HEAD, OPTIONS, and TRACE. HTTP also defines CONNECT.
  • The URL is the requested address, in the form <protocol>://<host>:<port>/<path>?<parameters>.
  • The protocol version is the HTTP version number.
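
The URL components listed above can be pulled apart with Python's standard `urllib.parse` module. The URL here is a made-up example, not one taken from the article.

```python
from urllib.parse import parse_qs, urlsplit

url = "https://example.com:8443/articles/list?page=2&tag=http"  # hypothetical URL
parts = urlsplit(url)

print(parts.scheme)           # protocol
print(parts.hostname)         # host
print(parts.port)             # port
print(parts.path)             # path
print(parse_qs(parts.query))  # parameters, as a dict of lists
```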

Request header

The request headers give the server additional information about the client’s request. They carry a lot of useful information about the client environment and the request body. For example: Host gives the host name, which is what makes virtual hosts possible; Connection, for which HTTP/1.1 added keep-alive, lets one connection carry more than one request; User-Agent identifies the originator of the request and is used for compatibility handling and customization.

Request body

It can hold the data for multiple request parameters, and it may contain carriage returns and newlines as well as the request data. Not every request has a body. The example request above carries two request parameters: username and password.

Responding to the request

The application that handles requests, the web server, is installed on the server machine. Common web server products include Apache, Nginx, IIS, and Lighttpd. The web server acts as a dispatcher: based on its configuration files, it hands requests from different users to the programs on the server that handle them (CGI scripts, JSP scripts, servlets, ASP scripts, server-side JavaScript, or some other server-side technology) and returns the result of that processing as the response.

The HTTP response message also consists of three parts (response line + response header + response body).

HTTP/1.1 200 OK    // response line
// response headers
Server: nginx/1.4.6 (Ubuntu)
Date: Wed, 20 Jan 2021 07:28:09 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 406
Connection: keep-alive
X-Powered-By: Express
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, PUT, POST, DELETE, PATCH, OPTIONS
Access-Control-Allow-Headers: Content-Type, Authorization
Access-Control-Allow-Credentials: true
ETag: W/"196-Ay8U/71Rt0EbDzvYIuK2YtXe7xE"

// response body
{"status":"ok","msg":"Login successful","data":{"id":1,"username":"hunger","avatar":"https://avatars.dicebear.com/api/human/hunger.svg?mood[]=happy","createdAt":"2020-09-17T03:03:55.803Z","updatedAt":"2020-09-17T03:03:55.803Z"},"token":"Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VybmFtZSI6Imh1bmdlciIsImlkIjoxLCJpYXQiOjE2MTExMjc2ODksImV4cCI6MTYxMTM4Njg4OX0.dcO4DTvWAVYPPL5do3j9zyfa48-69j157iAiXae5yrw"}
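
Going the other way, a raw response can be split back into its three parts. The sketch below uses a hypothetical `parse_response` helper and a shortened sample response; it assumes a well-formed message whose status line has a reason phrase and ignores details such as chunked encoding and repeated headers.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.1 response into (version, status, reason, headers, body)."""
    # The first blank line separates the head (status line + headers) from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, status, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return version, int(status), reason, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/json; charset=utf-8\r\n"
    b"Connection: keep-alive\r\n"
    b"\r\n"
    b'{"status":"ok"}'
)
version, status, reason, headers, body = parse_response(raw)
print(version, status, reason)
print(headers)
print(body)
```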

Common HTTP status codes

1XX Informational response

100 Continue

This interim response indicates that everything so far is OK and that the client should continue with the request, or ignore the response if the request is already finished.

2XX Success

200 OK

The request succeeded.

201 Created

The request succeeded and a new resource was created as a result. This is typically the response returned after a POST request, or after some PUT requests.

202 Accepted

The request has been received but not yet acted upon, so there is no result yet. HTTP has no way to send an asynchronous response later to report the outcome, so this status is meant for cases where another process or server handles the request, or for batch processing.

206 Partial Content

The server has successfully processed a partial GET request. HTTP download tools such as FlashGet or Xunlei use this type of response to resume interrupted downloads or to split a large document into segments that are downloaded at the same time. The request must contain a Range header indicating the range of content the client wants, and may contain If-Range as a condition.

3XX Redirection

This type of status code indicates that the client needs to take further action to complete the request.

301 Moved Permanently

The requested resource has been permanently moved to the new location, and any future references to this resource should use one of the several URIs returned by this response. If possible, clients with link editing capabilities should automatically change the requested address to the one that is returned from the server.

302 Found

The requested resource now temporarily responds to the request from a different URI. Since such redirects are temporary, the client should continue to send future requests to the original address.

304 Not Modified

If the client sends a conditional GET request, the request is allowed, and the content of the document has not changed (since the last access or according to the conditions of the request), the server should return this status code. A 304 response must not contain a message body, so it always ends with the first blank line after the headers.

305 Use Proxy

The requested resource must be accessed through the specified proxy. The Location field gives the URI of that proxy, and the recipient needs to repeat the request through it. Only the origin server can generate a 305 response.

4XX Client error

Such status codes indicate that the client appears to have made an error that interferes with the server’s processing

400 Bad Request

  1. The current request cannot be understood by the server. The client should not resubmit this request unless it is modified.

  2. The request parameters are incorrect.

401 Unauthorized

The current request requires user authentication. The response must include a WWW-Authenticate header applicable to the requested resource, asking for the user’s credentials. The client can resubmit the request with a suitable Authorization header. If the request already included Authorization credentials, the 401 response means the server has rejected those credentials. If the 401 response contains the same authentication challenge as the previous response, and the browser has already attempted authentication at least once, the browser should show the user the entity contained in the response, since it may include relevant diagnostic information.

403 Forbidden

The server understands the request but refuses to execute it. Unlike with a 401 response, authentication will not help, and the request should not be repeated.

404 Not Found

The requested resource was not found on the server. Nothing indicates whether the condition is temporary or permanent. If the server knows the condition is permanent, it should use the 410 status code to indicate that the old resource is permanently unavailable because of some internal configuration mechanism and has no forwarding address. The 404 status code is widely used when the server does not want to reveal exactly why the request was rejected, or when no other response is suitable.

405 Method Not Allowed

The request method specified in the request line cannot be used to request the corresponding resource. The response must return an Allow header representing a list of request methods that the current resource can accept. Because the PUT and DELETE methods write to the resources on the server, most web servers do not support or do not allow these methods by default, and return 405 errors for such requests.

5XX Server error

These status codes indicate that the server failed to complete an apparently valid request: an error or exception occurred while the server was processing the request, or the server recognized that its current hardware and software resources cannot complete the processing.

500 Internal Server Error

The server encountered a situation that it did not know how to handle.

501 Not Implemented

The server does not support a feature required by the current request. This is the appropriate response when the server does not recognize the request method and cannot support it for any resource.

502 Bad Gateway

An invalid response was received from the upstream server when a server working as a gateway or proxy tried to execute the request.

503 Service Unavailable

The server is not ready to handle the request. Common causes are a server that is down for maintenance or overloaded. Along with this response, a user-friendly page explaining the problem should be sent. This response should be used for temporary conditions, and the Retry-After HTTP header should, if possible, contain the estimated time until the service is restored. The webmaster must also pay attention to the caching-related headers sent with this response, because these temporary-condition responses should normally not be cached.

504 Gateway Timeout

This error is returned when a server acting as a gateway or proxy cannot get a response in time.

505 HTTP Version Not Supported

The server does not support the HTTP protocol version used in the request.
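
Since the first digit of a status code determines its class, a lookup is easy to sketch. The `describe` function and the `CLASSES` table are illustrative names; the reason phrases come from Python's standard `http.HTTPStatus` enum.

```python
from http import HTTPStatus

# The class of a status code is given by its first digit.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return 'code phrase (class)' for an HTTP status code."""
    category = CLASSES.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:  # code is not registered in HTTPStatus
        phrase = "unknown status"
    return f"{code} {phrase} ({category})"


for code in (100, 206, 304, 404, 503):
    print(describe(code))
```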

The four-way wave

  1. The client sends a FIN with seq=u, telling the server it wants to close the connection.

  2. The server receives the FIN and sends back an ACK with seq=v and ack=u+1.

  3. The server notifies the application that the connection is closing. When the application is ready to close, it tells the server, and the server sends a FIN to the client (FIN=1, ACK=1, seq=w, ack=u+1).

  4. The client replies with an ACK (ACK=1, seq=u+1, ack=w+1).
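
The sequence and acknowledgment numbers in the four steps can be tabulated in a few lines. The `four_way_close` function is a hypothetical helper written for this illustration; it only mirrors the numbering used in the list and does not model TCP state or timers.

```python
def four_way_close(u: int, v: int, w: int) -> list[dict]:
    """Return the four segments of a connection close, using the text's u, v, w."""
    return [
        # Step 1: client asks to close.
        {"sender": "client", "flags": {"FIN"}, "seq": u},
        # Step 2: server acknowledges the client's FIN.
        {"sender": "server", "flags": {"ACK"}, "seq": v, "ack": u + 1},
        # Step 3: server is done sending and sends its own FIN.
        {"sender": "server", "flags": {"FIN", "ACK"}, "seq": w, "ack": u + 1},
        # Step 4: client acknowledges the server's FIN.
        {"sender": "client", "flags": {"ACK"}, "seq": u + 1, "ack": w + 1},
    ]


steps = four_way_close(u=1000, v=5000, w=5000)
for number, segment in enumerate(steps, start=1):
    print(number, segment)
```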

Why closing takes four steps

When a server socket in the LISTEN state receives a SYN from a client asking to establish a connection, it can send ACK and SYN in a single segment: the ACK is the reply and the SYN does the synchronization. When the server receives a FIN from the client, however, it can only send back an ACK at first, saying “OK, I see”, and then notify the application. Only after the application has finished sending all its data and decided it is ready to terminate can the server send a FIN to tell the client it is ready to disconnect. That is why the ACK and the FIN are sent separately in this direction.

What are the types of HTTP caches?

Cache flow chart

There are two types of HTTP cache: the negotiated cache and the mandatory cache.

With the mandatory cache, the browser uses its local copy without sending a request. With the negotiated cache, it sends a request to the server asking whether the resource has been updated.

Negotiated cache

ETag and If-None-Match

ETag is the entity tag of a URL, an identifier for the resource behind that URL, similar to the MD5 of a file. When the server returns a resource, it can compute a hash value or a numeric version number from the returned content and add it to the response headers, which might look something like this:

ETag: "33a64df551425fcc55e4d42a148795d9f25f89d4"

The client saves the ETag along with the returned content. On the next request it puts the ETag into the matching If-None-Match request header, which might look like this:

If-None-Match: "33a64df551425fcc55e4d42a148795d9f25f89d4"

The server then takes the If-None-Match value from the request and compares it with the ETag of the current version:

  1. If they are the same, it returns 304 (Not Modified) with no content (body), only headers, telling the browser to use its cache directly.
  2. If they differ, it returns 200 and the latest content.
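
The server-side comparison can be sketched as follows. The `make_etag` and `respond` functions are hypothetical, the SHA-1 hash is just one possible way to derive a tag, and the exact string comparison is a simplification: a real If-None-Match header may also carry a list of tags or `*`.

```python
import hashlib
from typing import Optional


def make_etag(content: bytes) -> str:
    """Derive an ETag from the body; using SHA-1 here is an assumption."""
    return '"' + hashlib.sha1(content).hexdigest() + '"'


def respond(content: bytes, if_none_match: Optional[str]):
    """Return (status, headers, body) for a GET with an optional If-None-Match."""
    etag = make_etag(content)
    if if_none_match == etag:
        # Same version: no body, the browser reuses its cached copy.
        return 304, {"ETag": etag}, b""
    # First request or changed content: send the latest body.
    return 200, {"ETag": etag}, content


status, headers, body = respond(b"hello", None)                          # first visit
status2, _, body2 = respond(b"hello", headers["ETag"])                   # unchanged
status3, _, body3 = respond(b"hello, world", headers["ETag"])            # changed
print(status, status2, status3)
```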

ETag also has a less commonly used companion request header, If-Match, which is the opposite of If-None-Match. If-None-Match means download only if there is no match. If-Match is usually used in POST or PUT requests and means submit only if there is a match. For example, while you are editing an item, others may be editing it at the same time. By the time you submit your edit, someone else may already have submitted theirs, so the ETag on the server has changed, the If-Match condition no longer holds, and the server returns a 412 Precondition Failed error. If the If-Match condition holds, 200 is returned as usual.
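
The concurrent-edit scenario can be played through with a tiny in-memory model. The `Document` class and `etag_of` helper are hypothetical names, and the SHA-1 tag is an assumption; the point is only that the second writer, still holding the old ETag, gets 412.

```python
import hashlib


def etag_of(content: bytes) -> str:
    """Derive an ETag from the content; using SHA-1 here is an assumption."""
    return '"' + hashlib.sha1(content).hexdigest() + '"'


class Document:
    """A single editable resource held in memory."""

    def __init__(self, content: bytes):
        self.content = content

    def put(self, new_content: bytes, if_match: str) -> int:
        """Apply the edit only if the caller's ETag matches the current version."""
        if if_match != etag_of(self.content):
            return 412  # Precondition Failed: someone else changed it first
        self.content = new_content
        return 200


doc = Document(b"v1")
tag_seen_by_both = etag_of(b"v1")                       # both editors loaded version v1
first = doc.put(b"v2 by alice", tag_seen_by_both)       # first submit succeeds
second = doc.put(b"v2 by bob", tag_seen_by_both)        # stale ETag is rejected
print(first, second)
```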

Summary:

An ETag is the equivalent of tagging a resource with a “unique” fingerprint. The ETag changes when the file is modified on the server. It works in a similar way to Last-Modified. In practice, this uniqueness is not strictly guaranteed.

Last-Modified & If-Modified-Since

Last-Modified and If-Modified-Since are also used together, much like ETag and If-None-Match. Whereas ETag carries a version number or hash value, Last-Modified carries the time the resource was last modified. Last-Modified goes in the response headers: when the browser requests a resource, the server includes the resource’s modification time in its response, which may look like this:

Last-Modified: Wed, 21 Oct 2000 07:28:00 GMT

If-Modified-Since goes in the request headers. When the browser requests the same resource again, it sends back the Last-Modified value it received earlier, here Wed, 21 Oct 2000 07:28:00 GMT. The server can then compare this time with the file’s actual modification time to determine whether the resource has changed. It looks like this:

If-Modified-Since: Wed, 21 Oct 2000 07:28:00 GMT

The server takes this header and compares it with the current version:

  1. If the current version was modified later than this time, the server returns 200 and the new content.
  2. If the current version’s modification time is the same as this time, that is, there is no update, the server returns 304 with no content, only headers, and the client uses its cache directly.
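
The time comparison can be sketched with the standard `email.utils` helpers, which read and write the HTTP date format. The `check_if_modified_since` function is a hypothetical name, and the timestamps are example values chosen for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def check_if_modified_since(last_modified: datetime, header_value: str) -> int:
    """Return 200 if the resource changed after the given time, else 304."""
    since = parsedate_to_datetime(header_value)
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    if last_modified.replace(microsecond=0) > since:
        return 200  # modified later: send the new content
    return 304      # not modified: the client uses its cache


# Example modification time (an assumption made for this illustration).
modified_at = datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc)
print(format_datetime(modified_at, usegmt=True))  # value for Last-Modified
print(check_if_modified_since(modified_at, "Wed, 21 Oct 2015 07:28:00 GMT"))
print(check_if_modified_since(modified_at, "Tue, 20 Oct 2015 07:28:00 GMT"))
```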

If-Modified-Since has a counterpart, If-Unmodified-Since, with the opposite meaning. Suppose the client sends If-Unmodified-Since, like this:

If-Unmodified-Since: Wed, 21 Oct 2000 07:28:00 GMT

The server takes this header and compares it with the current version:

  1. If the resource has not been updated after this time, the server returns 200 and the content.
  2. If it has been updated after this time, the condition is not met, and the server returns error code 412 (Precondition Failed).

ETag and Last-Modified priorities

ETag and Last-Modified are both negotiated-cache mechanisms that require the server to compute and compare. If both are present, which one is used? The answer is ETag, which has higher priority than Last-Modified. One design problem with Last-Modified is that its precision only goes down to the second: if a resource is modified frequently, several times within the same second, Last-Modified cannot tell the versions apart. An ETag changes every time the resource is modified, so it is more precise than Last-Modified. ETag is not free, though. If your ETag is designed as a hash value, it is computed on every request, which costs the server extra resources. Which one to use depends on your own project.

Mandatory cache

The third and fourth examples above are mandatory caching: I know that until some point in time I don’t even have to ask the server and can use the cache directly. In those two examples, Expires is a standalone header, while max-age and immutable are both Cache-Control directives.

Expires

Expires gives the expiration time of the resource. The server includes this field in its response headers:

Expires: Wed, 21 Oct 2000 07:28:00 GMT

Until that time, the client browser makes no request and uses the cached resource directly.

If the response also sets Cache-Control: max-age=<seconds until expiration>, Expires is ignored.

Cache-Control

Cache-Control is more complex and has a number of directives that can be set. max-age is just one of them. It looks like this:

Cache-Control: max-age=20000

This means the resource is served from the cache, without a request, for the next 20,000 seconds.
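
A cache could decide whether a stored response is still usable roughly like this. The `parse_max_age` and `is_fresh` helpers are hypothetical, and the parsing is deliberately minimal: it only looks for the max-age directive and ignores everything else in the header.

```python
import re
from typing import Optional


def parse_max_age(cache_control: str) -> Optional[int]:
    """Extract max-age (in seconds) from a Cache-Control value, or None if absent."""
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(cache_control: str, age_seconds: float) -> bool:
    """True while the cached response is younger than its max-age."""
    max_age = parse_max_age(cache_control)
    return max_age is not None and age_seconds < max_age


print(is_fresh("max-age=20000", age_seconds=3600))   # still within 20,000 seconds
print(is_fresh("max-age=20000", age_seconds=20000))  # expired
```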

immutable is also a Cache-Control directive, indicating that the resource never needs to be requested again, but it is experimental and does not work well across browsers. It is written as Cache-Control: immutable.

Other common directives include:

no-cache: forces the request to be submitted to the server for validation (negotiated cache validation) before the cached copy is used.

no-store: stores nothing about the client request or the server response, that is, uses no caching at all.

Cache-Control has many other directives as well; see the MDN documentation.

Expires and Cache-Control priorities

In a nutshell: if a max-age or s-maxage directive is set in the Cache-Control response header, the Expires header is ignored.
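
The priority rule can be expressed as a small function that computes when a response expires. `expiry_time` is a hypothetical helper; it handles only max-age and Expires (s-maxage, which applies to shared caches, is left out), and the timestamps are example values.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def expiry_time(headers: dict, received_at: datetime) -> Optional[datetime]:
    """Return when a response expires, letting max-age override Expires."""
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    if match:
        # max-age is relative to the time the response was received.
        return received_at + timedelta(seconds=int(match.group(1)))
    if "Expires" in headers:
        # Expires is an absolute point in time.
        return parsedate_to_datetime(headers["Expires"])
    return None  # no mandatory-cache information at all


received = datetime(2015, 10, 21, 7, 0, 0, tzinfo=timezone.utc)
both = {"Cache-Control": "max-age=60", "Expires": "Wed, 21 Oct 2015 07:28:00 GMT"}
only_expires = {"Expires": "Wed, 21 Oct 2015 07:28:00 GMT"}
print(expiry_time(both, received))          # max-age wins: 07:01:00
print(expiry_time(only_expires, received))  # falls back to Expires: 07:28:00
```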

Priority of the negotiated cache and the mandatory cache

This is easy to understand. The negotiated cache requires a request to be sent so that the server can be consulted, whereas if the mandatory cache is in effect, no request is sent at all. So the order is: check the mandatory cache first, and if it is in effect, use the cache directly; if the mandatory cache has expired, send a request to the server to negotiate whether the cached copy can still be used.

How does HTTP control caching

  • When the browser requests a resource from the server for the first time, the server responds with status code 200. The response headers contain the Cache-Control and ETag fields, and the response body is the resource itself. The browser receives the response and caches the resource locally.

  • When the browser requests the resource again, it first checks whether the cached copy has expired (based on the Cache-Control: max-age=<expiration time> value from the previous response). If it has not expired, the browser uses the cached resource directly.

  • If it has expired, the browser sends a request asking whether the resource is still usable. The request contains the If-None-Match header field, whose value is the ETag from the previous response.

  • After receiving the request, the server compares the ETag in If-None-Match with the newly computed ETag. If they match, the server returns a response with status code 304 and no body, telling the browser the resource is still usable. If they do not match, it returns a new response with status code 200 that carries Cache-Control, ETag, and the resource itself.

  • If no ETag is present, Last-Modified and If-Modified-Since are used to make a similar decision.
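
The decision flow in these bullets can be condensed into one function. This is a sketch under simplifying assumptions: `next_step` is a hypothetical name, the cache entry is a plain dict with made-up keys, and only the ETag path is modeled, not the Last-Modified fallback.

```python
from typing import Optional


def next_step(cached: Optional[dict], age_seconds: float, server_etag: str) -> str:
    """Decide how the browser obtains a resource.

    `cached` is a hypothetical cache entry of the form
    {"etag": str, "max_age": int}, or None if nothing is cached yet.
    """
    if cached is None:
        return "200 from server (first request, response is cached)"
    if age_seconds < cached["max_age"]:
        return "use cache without a request (mandatory cache hit)"
    # Expired: revalidate with If-None-Match set to the cached ETag.
    if cached["etag"] == server_etag:
        return "304 from server (negotiated cache hit, cached body reused)"
    return "200 from server (new body replaces the cached one)"


entry = {"etag": '"abc"', "max_age": 60}
print(next_step(None, 0, '"abc"'))
print(next_step(entry, 10, '"abc"'))
print(next_step(entry, 120, '"abc"'))
print(next_step(entry, 120, '"def"'))
```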

Conclusion

The HTTP caching mechanism is as follows:

  1. HTTP caching mechanisms fall into two categories: the mandatory cache and the negotiated cache.
  2. Mandatory caching means don’t ask (don’t initiate a request), just use the cache.
  3. Common mandatory-cache techniques are Expires and Cache-Control.
  4. The value of Expires is a time; until then the cache is valid and no request is needed.
  5. Cache-Control has many directives. The common max-age directive sets how long the cache is valid, in seconds; no request is needed until that time has passed.
  6. immutable is also a Cache-Control directive, indicating that the resource never needs to be requested again, but its browser compatibility is poor. For other Cache-Control directives, see the MDN documentation.
  7. Cache-Control’s max-age has higher priority than Expires.
  8. Common negotiated-cache techniques are ETag and Last-Modified.
  9. ETag is a hash value or version number of the resource; its common request header is If-None-Match.
  10. Last-Modified is the time the resource was last modified; its common request header is If-Modified-Since, and its precision is one second.
  11. ETag changes with every modification, while Last-Modified is only precise to the second, so ETag is more accurate and has higher priority, but it has to be computed, which costs the server more.
  12. If both the mandatory cache and the negotiated cache are present, check the mandatory cache first. If it is in effect, use the cache directly without sending a request. If it is not in effect, send a request and let the negotiated cache decide.

What’s so great about HTTP/2.0

Problems with HTTP/1.x

  • Pipelining: browsers have their own issues and bugs when handling pipelining and do not enable it by default. In addition, large files still cause blocking on the server side.
  • In practice keep-alive is what gets used, and requests on one connection are serial. To get parallelism, the browser opens multiple connections. By default a browser opens at most six connections per domain name, and requests beyond this limit are blocked. (This is why some websites serve static resources from multiple domain names, but too many domain names are inconvenient to manage, and resolving them takes time too.)
  • Only the client can initiate a request; the server cannot.
  • Request and response headers are large and are sent uncompressed, which is wasteful.
  • Most of the headers in each request and response are redundant and repeated.
  • Data compression is not mandatory, so data may be sent uncompressed.
  • Requests have no priority; their order is left to luck.
  • The client has to parse the HTML before it can request the resources the page needs, although the server could work that out as well.

Improvements in HTTP/2.0

  • Based on binary streams. A TCP connection is divided into several streams, each stream can carry several messages, and each message consists of several binary frames, the smallest unit.
  • Multiplexing. A single TCP connection can carry many requests at the same time, without the per-domain connection limit.
  • Requests can be prioritized.
  • HTTP headers are compressed.
  • Server push. The client sends a request for the HTML, and the server sends the HTML along with the resources the HTML needs.
  • Server hints, preload, and prefetch. For example, the browser can load a large image at an idle time because the next request may need it.

Preload vs. Server Push

  • Preload tells the browser which resource to load immediately.
<link rel="preload" href="https://example.com/images/large-background.jpg">
  • Prefetch tells the browser which resource it is likely to need next, to be loaded at idle time.
<link rel="prefetch" href="https://example.com/images/music.mp3">
