Understanding HTTP, and the HTTP caching mechanism in particular, helps you deal calmly with all kinds of tough interview questions and also raises your professional level as a web developer. It is one of the important knowledge points that front-end engineers need to master.

Hypertext Transfer Protocol (HTTP) is an application-layer protocol used in distributed, collaborative, hypermedia information systems. It can be said that HTTP is the foundation of contemporary Internet communication.

Hypertext Transfer Protocol Secure (HTTPS), often called HTTP over TLS, HTTP over SSL, or HTTP Secure, is a protocol for secure communication over a computer network. HTTPS communicates over HTTP but uses SSL/TLS to encrypt the traffic. HTTPS was developed to authenticate web servers and to protect the privacy and integrity of the exchanged data.

Features and disadvantages of HTTP

Features

Connectionless, stateless, flexible, simple and fast

  • Connectionless: Each connection handles only one request. The server disconnects from the client after it has processed the request and the client has acknowledged the response. This saves transmission time
  • Stateless: State refers to the context of the communication. Each HTTP request is independent of the others, and by default no state information is kept
  • Flexible: Data objects of any type (text, image, video, etc.) can be transmitted; the type is declared with the Content-Type header
  • Simple and fast: To request a resource, the client only needs to send the request method and URL. Because the HTTP protocol is simple, HTTP server programs are small and communication is fast

Disadvantages

Stateless, insecure, plaintext transmission, head-of-line blocking

  • Stateless: The protocol keeps no record of earlier requests. Without this memory, the server cannot tell whether several requests come from the same client. If earlier information is needed for later processing, it must be retransmitted, which can increase the amount of data sent on each connection
  • Insecure: Plaintext traffic can be eavesdropped on, the lack of identity authentication allows impersonation, and the lack of message integrity checks allows tampering
  • Plaintext transmission: The message (the header part) is sent in plaintext, exposing the information directly. Wi-Fi traps exploit exactly this: they lure you into connecting to a hotspot and then capture all of your traffic to obtain your sensitive information
  • Head-of-line blocking: When persistent connections are enabled (described below), a shared TCP connection can process only one request at a time. If that request takes too long, the other requests are blocked.

HTTP message structure

Composition: request message and response message

Request message: request line, request headers, blank line, request data

Response message: status line, response headers, blank line, response data

  • Request line: contains the request method, the request address, and the HTTP protocol version
  • Request headers: carry additional information about the client's request to the server
  • Blank line: a carriage return and line feed that tell the server there are no more headers
  • Request data: the request parameters
  • Status line: contains the HTTP protocol version, the numeric status code, and the reason phrase of the status code
  • Response headers: descriptive information that the server returns to the client
  • Response data: the content returned by the server to the client
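
As a rough illustration of the structure above, the following Python sketch splits a raw request message into its request line, headers, and body. The sample message (the /login path, example.com, the form data) and the function name parse_request are made up for illustration; a real server would use a full HTTP parser.

```python
def parse_request(raw: str) -> dict:
    """Split a raw HTTP/1.x request into request line, headers, and body."""
    # The blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line: method, request address, protocol and version.
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "target": target, "version": version,
            "headers": headers, "body": body}


# Hypothetical sample message, made up for illustration.
raw = (
    "POST /login HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 10\r\n"
    "\r\n"
    "user=alice"
)
parsed = parse_request(raw)
print(parsed["method"], parsed["target"], parsed["version"])
print(parsed["headers"])
print(parsed["body"])
```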

HTTP request methods

HTTP/1.0

  • GET: obtains data from the server
  • POST: submits a resource, usually resulting in changes to server resources
  • HEAD: obtains only the message headers

HTTP/1.1

  • GET: obtains data from the server
  • POST: submits a resource, usually resulting in changes to server resources
  • PATCH/PUT: update data
  • DELETE: deletes data
  • HEAD: similar to a GET request, except that the response contains no body; used to obtain the message headers
  • OPTIONS: lets the client query the server's capabilities, such as the request methods the server supports
  • TRACE: traces the request path; used for testing or diagnosis
  • CONNECT: asks a proxy to set up a tunnel to the target server

GET and POST

  • GET is harmless when the user navigates back in the browser, while POST submits the request again
  • GET requests are actively cached by the browser and remain in the browser history, while POST requests are not cached by default unless this is configured manually
  • GET requests have length limits on the parameters passed in the URL (the limit varies by browser), while POST has no such limit
  • GET parameters are passed in the URL, while POST parameters are placed in the request body
  • The URL generated by GET can be bookmarked, but a POST request cannot
  • GET is less secure than POST because the GET parameters are exposed directly in the URL, so GET should not be used to pass sensitive information
  • GET requests can only be URL encoded, while POST supports multiple encoding methods
  • GET accepts only ASCII characters in its parameters, while POST has no such restriction
  • GET is often said to generate one TCP packet and POST two (Firefox reportedly sends only one). For GET, the browser sends the headers and data together and the server responds with 200. For POST, the browser first sends the headers, the server responds with 100 Continue, and the browser then sends the data, to which the server responds with 200. This behavior depends on the client and is not required by the HTTP specification
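
To make the difference in where the parameters travel concrete, here is a small Python sketch that builds (but does not send) a GET and a POST request with the standard library. The URL http://example.com/search and the parameter names are assumptions chosen for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Made-up parameters for illustration.
params = {"q": "http cache", "page": "2"}
query = urlencode(params)

# GET: the parameters are part of the URL itself.
get_req = Request("http://example.com/search?" + query, method="GET")

# POST: the same parameters are placed in the request body.
post_req = Request("http://example.com/search",
                   data=query.encode("ascii"), method="POST")

print(get_req.get_method(), get_req.full_url, get_req.data)
print(post_req.get_method(), post_req.full_url, post_req.data)
```

Nothing is sent over the network here; the Request objects only describe what would be sent.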

HTTP status codes

Status code classification

| Status code | Meaning | Explanation |
| --- | --- | --- |
| 1xx | Server has received the request | The server has received the request but has not yet returned the final response to the client |
| 2xx | The request is successful, for example 200 | The client's request succeeded |
| 3xx | Redirect, such as 302 | The server does not serve the requested address itself and asks the client to request another address |
| 4xx | Client error, such as 404 | The error originates from the client, for example when it requests an address unknown to the server |
| 5xx | Server error, such as 500 | The error originates from the server, for example when an interface written on the server side is buggy |

Common status codes

| Status code | Meaning | Use |
| --- | --- | --- |
| 200 OK | Success | Typically returned for successful GET and POST requests |
| 301 Moved Permanently | Permanent redirect | Used together with the Location header; the browser handles it automatically |
| 302 Found | Temporary redirect | Used together with the Location header; the browser handles it automatically |
| 304 Not Modified | Resource not modified | The requested resource has not been modified, so the server returns no resource body with this status code. Clients that have cached the resource send a header asking the server to return it only if it was modified after a specified date |
| 404 Not Found | Resource not found | The server could not find the resource (web page) the client requested. With this code, a web designer can set up a personalized page that says "the resource you requested could not be found" |
| 403 Forbidden | No permission | The server understands the client's request but refuses to execute it |
| 500 Internal Server Error | Server error | An internal error occurred on the server |
| 504 Gateway Timeout | Gateway timeout | The server acting as a gateway or proxy did not get a response from the upstream server in time |
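
The classification by first digit can be sketched in a few lines of Python. The CLASSES mapping and the classify helper are names made up for this example; HTTPStatus comes from the Python standard library and supplies the standard reason phrases.

```python
from http import HTTPStatus

# The class of a status code is determined by its first digit.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def classify(code: int) -> str:
    """Return the class name for a three-digit HTTP status code."""
    return CLASSES[code // 100]


for code in (200, 301, 302, 304, 403, 404, 500, 504):
    print(code, HTTPStatus(code).phrase, "->", classify(code))
```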

HTTP cache

1. Introduction to caching

What is caching?

Caching is a technique for saving a copy of a resource and using it directly on the next request.

Why cache is needed:

Without caching, a large number of images and resources are loaded on every network request, which makes the page load much slower. The purpose of caching is to minimize the volume and number of network requests and make pages load faster.

What resources can be cached? Static resources (JS, CSS, images)

  • The HTML of a website should not be cached, because the HTML can be updated and templates can be replaced at any time.

  • Business data on web pages should not be cached. On message boards and in comment sections, for example, users can post at any time, so the contents of the database are updated frequently.

2. HTTP cache strategy (forced cache + negotiated cache)

Forced cache

What is a forced cache

With a forced cache (also translated as mandatory or strong cache), files are read directly from the local cache and no request is sent to the server.

Diagram

As you can see from the above figure, the browser first sends a request to the server. After receiving the request, the server returns the resource together with a Cache-Control header. This header usually sets the maximum lifetime of the cached copy.

As you can see from the above figure, the browser has now stored the Cache-Control value. When the browser needs the resource again, it checks whether the cached copy has expired according to Cache-Control. If it has not, the browser takes the resource from the local cache and uses it directly, without going through the server.

Because a forced cache has an expiration time, it will eventually become invalid. Suppose that one day the cached copy has expired and the resource can no longer be taken from the local cache. The browser then requests the server again, as in the first figure, and the server returns the resource and a Cache-Control value once more.

That is how the forced cache works.
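
The flow described above can be sketched as a tiny in-memory cache in Python. This is a simplified model, not how any particular browser is implemented: the CacheEntry class, the load function, the fake_fetch stand-in for the network, and the 60-second max-age are all assumptions made for illustration, and time is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    body: bytes
    stored_at: float  # when the response was stored, in seconds
    max_age: int      # lifetime taken from Cache-Control: max-age, in seconds


def is_fresh(entry: CacheEntry, now: float) -> bool:
    """A cached copy is usable while its age is below max-age."""
    return now - entry.stored_at < entry.max_age


def load(url, cache, now, fetch):
    """Return (body, source); only contact the server if the copy is stale."""
    entry = cache.get(url)
    if entry is not None and is_fresh(entry, now):
        return entry.body, "from cache"
    body, max_age = fetch(url)  # stands in for a real network request
    cache[url] = CacheEntry(body, now, max_age)
    return body, "from server"


requests_sent = []


def fake_fetch(url):
    # Hypothetical server: always answers with max-age=60.
    requests_sent.append(url)
    return b"console.log('hi')", 60


cache = {}
first = load("/app.js", cache, 0, fake_fetch)    # first visit
second = load("/app.js", cache, 30, fake_fetch)  # still fresh
third = load("/app.js", cache, 61, fake_fetch)   # expired, ask the server again
print(first[1], second[1], third[1], len(requests_sent))
```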

Cache-Control

What is Cache-Control?

  • It exists in the Response Headers;
  • It controls the logic of the forced cache;
  • For example: Cache-Control: max-age=31536000 (in seconds).

Cache-Control values

| Cache-Control value | Meaning |
| --- | --- |
| max-age | Sets the maximum lifetime of the cached copy |
| no-cache | The local forced cache is not used: the request is sent to the server as usual, and the server decides how to respond (for example, through the negotiated cache) |
| no-store | Simple and blunt: nothing is cached, and the resource is always fetched from the server |
| private | Only the end user's client (computer, phone, and so on) is allowed to cache |
| public | Intermediate routers and intermediate proxies are also allowed to cache |

Expires

  • Also exists in the Response Headers;
  • Also controls the expiration time of the cache (used in early versions of HTTP);
  • If Cache-Control and Expires are both present, Cache-Control takes priority over Expires.
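
The priority rule between the two headers can be illustrated with a short Python sketch. The helper names parse_cache_control and freshness_lifetime are made up for this example, header names are assumed to be spelled exactly as shown, and the parsing is deliberately simplified compared with what a real browser does.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_cache_control(value: str) -> dict:
    """Turn 'public, max-age=60' into {'public': True, 'max-age': '60'}."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') or True
    return directives


def freshness_lifetime(headers: dict, response_time: datetime):
    """Seconds the forced cache may reuse the response, or None if unspecified."""
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in cc or "no-cache" in cc:
        return 0  # never served from the forced cache
    if "max-age" in cc:
        return int(cc["max-age"])  # Cache-Control wins over Expires
    if "Expires" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        return max(0, int((expires - response_time).total_seconds()))
    return None


# Made-up response headers and response time for illustration.
now = datetime(2015, 10, 21, 7, 0, 0, tzinfo=timezone.utc)
both = {"Cache-Control": "public, max-age=31536000",
        "Expires": "Wed, 21 Oct 2015 07:28:00 GMT"}
only_expires = {"Expires": "Wed, 21 Oct 2015 07:28:00 GMT"}
print(freshness_lifetime(both, now))
print(freshness_lifetime(only_expires, now))
```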

Negotiated cache

What is a negotiated cache:

  • The negotiated cache is also known as the comparison cache.
  • It is a server-side caching policy; that is, the server decides whether the cached copy can still be used.
  • The server checks whether the resource on the client is the same as the one on the server. If it is, the server returns 304; otherwise it returns 200 and the latest resource.

Diagram

As before, a few diagrams illustrate the negotiated cache.

The figure above shows the whole process of the negotiated cache. When the client requests a resource from the server for the first time, the server returns the resource together with a resource identifier. This identifier uniquely identifies the returned resource and can be either ETag or Last-Modified, both of which are described after the diagrams.

When the browser requests the resource again, it sends the resource identifier along. The server checks the identifier to determine whether the browser's copy is the same as the resource on the server. If it is, the server returns 304 (Not Modified), meaning the resource has not changed. If it is not, the server returns 200 together with the resource and a new resource identifier. This completes the negotiation.

Suppose the negotiated cache uses Last-Modified. When the browser sends its first request, the server returns the resource together with a Last-Modified value. The browser stores this value and, on later requests, sends it back in the If-Modified-Since request header.

When the browser requests the resource again, the request headers carry the If-Modified-Since value to the server. The server compares this value with the resource's current Last-Modified value. If they are equal, it returns 304, indicating that the resource has not been modified. If not, it returns 200 along with the resource and the new Last-Modified value.

Suppose instead that the negotiated cache uses ETag. When the browser sends its first request, the server returns the resource together with an ETag value. The browser stores this value and, on later requests, sends it back in the If-None-Match request header.

When the browser requests the resource again, the request headers carry the If-None-Match value to the server. The server compares this value with the resource's current ETag. If they are equal, it returns 304, indicating that the resource has not been modified. If not, it returns 200 along with the resource and the new ETag.

With these diagrams, you should have a clearer understanding of the negotiated cache. Next, let's look at the fields that appeared in them.
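
The server side of this exchange can be sketched in Python as follows. This is a minimal model under stated assumptions: the respond and make_etag functions are hypothetical names, the ETag is simply a truncated SHA-256 hash of the body, request headers are assumed to be spelled exactly as shown, and If-None-Match is compared as a single value (real servers also handle lists of tags and weak validators).

```python
import hashlib
from email.utils import formatdate, parsedate_to_datetime


def make_etag(body: bytes) -> str:
    """Hypothetical ETag: a quoted, truncated hash of the content."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes, mtime: int):
    """Return (status, headers, body) for a GET, honoring conditional headers."""
    etag = make_etag(body)
    validators = {"ETag": etag,
                  "Last-Modified": formatdate(mtime, usegmt=True)}
    if_none_match = request_headers.get("If-None-Match")
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_none_match is not None:
        # ETag is checked first when the client sent it.
        not_modified = if_none_match == etag
    elif if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since).timestamp()
        # Unchanged if the resource was not modified after that time.
        not_modified = mtime <= since
    else:
        not_modified = False  # first request: no identifier yet
    if not_modified:
        return 304, validators, b""
    return 200, validators, body


body_v1 = b"console.log('v1')"
status1, headers1, _ = respond({}, body_v1, 1445412480)
status2, _, payload2 = respond({"If-None-Match": headers1["ETag"]},
                               body_v1, 1445412480)
status3, headers3, _ = respond({"If-None-Match": headers1["ETag"]},
                               b"console.log('v2')", 1445412999)
print(status1, status2, status3)
```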

Resource identifier

In the Response Headers, there are two resource identifiers:

  • Last-Modified: the time the resource was last modified; the corresponding request header is If-Modified-Since;
  • ETag: the unique identifier of the resource, which can be thought of as a human fingerprint; an ETag is essentially a string, and the corresponding request header is If-None-Match.

Last-Modified and ETag

  • When the Response Headers contain both Last-Modified and ETag, ETag is used preferentially;
  • Last-Modified is only accurate to the second;
  • If the resource is regenerated repeatedly without its content changing, ETag is more accurate.
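
The last point can be seen in a small Python sketch: a file that is rebuilt with identical content gets a new modification time but, if the ETag is derived from the content, the same ETag. Deriving the ETag from a content hash is an assumption made here for illustration (servers are free to compute it differently), and the two builds are made-up data.

```python
import hashlib


def content_etag(body: bytes) -> str:
    """Hypothetical ETag derived only from the content."""
    return hashlib.sha256(body).hexdigest()


# The same stylesheet generated twice; only the modification time differs.
build_1 = {"body": b"body { margin: 0 }", "mtime": 1000}
build_2 = {"body": b"body { margin: 0 }", "mtime": 2000}

last_modified_changed = build_1["mtime"] != build_2["mtime"]
etag_changed = content_etag(build_1["body"]) != content_etag(build_2["body"])

# Last-Modified suggests a change, while the content-based ETag does not.
print(last_modified_changed, etag_changed)
```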

Headers sample

As can be seen from the figure above, Last-Modified in the response headers corresponds to If-Modified-Since in the request headers, and ETag corresponds to If-None-Match in the request headers.

Flow chart

A flow chart shows the whole process of the negotiated cache.

3. Refresh operations and their impact on the cache

When we are online, the network sometimes suddenly lags. People tend to be impatient at such moments and refresh without hesitation. Refreshing, however, also affects the cache. Let's look at the different ways of refreshing and their impact on the cache.

Normal operation

Definition: entering a URL in the address bar, following a link, going forward or back, and so on.

Impact on cache: the forced cache is valid, and the negotiated cache is valid.

Manual refresh

Definition: pressing F5, clicking the refresh button, or choosing refresh from the right-click menu.

Impact on cache: the forced cache is invalid, and the negotiated cache is valid.

Forced refresh

Definition: Ctrl + F5.

Impact on cache: the forced cache is invalid, and the negotiated cache is invalid.

HTTP head-of-line blocking

When persistent connections are enabled in HTTP, a shared TCP connection can process only one request at a time. If the current request takes too long, the other requests are blocked. This is known as head-of-line blocking.

Concurrent connections

Because several persistent connections can be opened to one domain name, there are effectively more task queues, so a single task can no longer block all the others. RFC 2616 used to state that a client should open at most 2 concurrent connections, but in practice many browsers did not follow this rule, so RFC 7230 removed the limit. Browsers today typically allow 6 to 8 concurrent connections per domain name (the figures given here are 6 for Chrome and 8 for Firefox). If that is still not enough, read on.
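
A small simulation shows why extra connections help. The completion_times function and the request durations below are made up for illustration; the model simply assumes that requests are taken in order by whichever connection becomes free first, and that each connection handles one request at a time.

```python
import heapq


def completion_times(durations, connections):
    """Finish time of each request, in the order the requests were issued."""
    free_at = [0] * connections  # time at which each connection becomes idle
    heapq.heapify(free_at)
    finished = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available connection
        end = start + duration
        finished.append(end)
        heapq.heappush(free_at, end)
    return finished


# One slow request (5 time units) followed by three fast ones (1 unit each).
durations = [5, 1, 1, 1]
one_connection = completion_times(durations, 1)   # [5, 6, 7, 8]
two_connections = completion_times(durations, 2)  # [5, 1, 2, 3]
print(one_connection, two_connections)
```

With a single connection, every fast request waits behind the slow one; with a second connection, the fast requests finish without waiting for it.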

Domain sharding

One domain name allows at most 6 to 8 concurrent connections, so we can add more domain names, such as a.baidu.com, b.baidu.com, and c.baidu.com, preparing several subdomains. When we visit baidu.com, different resources can then be fetched from different subdomains, all of which point to the same server, so more persistent connections can be opened. With HTTP/2, multiplexing lets multiple requests be sent over a single TCP connection, so many resources can be loaded at once.
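
One way to spread resources over the subdomains is to pick the host from a stable hash of the resource path, as in this Python sketch. The shard_url function and the use of https are assumptions made for illustration; the host names are the examples from the text.

```python
import zlib

# Example subdomains from the text, all assumed to point to the same server.
SHARDS = ["a.baidu.com", "b.baidu.com", "c.baidu.com"]


def shard_url(path: str) -> str:
    """Map a resource path to one of the shard hosts."""
    # crc32 is stable across runs (unlike the built-in hash()), so a given
    # resource always maps to the same host and its cached copy stays usable.
    index = zlib.crc32(path.encode("utf-8")) % len(SHARDS)
    return "https://" + SHARDS[index] + path


for path in ("/img/logo.png", "/js/app.js", "/css/site.css"):
    print(shard_url(path))
```

Keeping the mapping stable matters because a resource that moved between hosts on every page load would be cached under a different URL each time.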

HTTPS

The HTTPS protocol relies on TLS/SSL for its main functions

Features of HTTPS

  • Encryption: HTTPS encrypts data to protect it from eavesdroppers. While a user is browsing a website, nobody can listen in on the information exchanged between the user and the website, or track the user's activities and access history, in order to steal user information.

  • Data integrity: Data cannot be modified by an attacker during transmission without being detected. The data sent by the user arrives at the server intact, so the server receives exactly what the user sent.

  • Authentication: This means confirming the true identity of the other party, or proving that you are who you say you are (comparable to face recognition). It prevents man-in-the-middle attacks and builds user trust.
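
On the client side, the authentication feature shows up as certificate and host name checks. As a small illustration, Python's standard ssl module enables both in its default client context; the variable name ctx is arbitrary, and no connection is made in this sketch.

```python
import ssl

# Default client-side TLS settings from the standard library.
ctx = ssl.create_default_context()

# The server must present a certificate that validates against trusted CAs...
certificate_required = ctx.verify_mode == ssl.CERT_REQUIRED
# ...and the certificate must match the host name being contacted.
hostname_checked = ctx.check_hostname

print(certificate_required, hostname_checked)
```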

Relationship between SSL and TLS

  • Transport Layer Security (TLS) and its predecessor, Secure Sockets Layer (SSL), are security protocols designed to provide security and data integrity for Internet communications.
  • When Netscape introduced the first version of its web browser, Netscape Navigator, in 1994, it also introduced the HTTPS protocol, which used SSL for encryption. This is the origin of SSL.
  • The IETF standardized SSL and published the first version of the TLS standard in 1999. This was followed by RFC 5246 (August 2008) and RFC 6176 (March 2011). The protocol is widely supported in applications such as browsers, email, instant messaging, VoIP, and network fax.

SSL/TLS

Refer to TLS/SSL for full understanding

To be continued.
