Preface

The HTTP caching mechanism is an important tool for web performance optimization. For anyone doing web development it should be part of the basic knowledge base, and for anyone who wants to become a front-end architect it is a required skill. Many front-end developers know only that the browser caches the static files it requests; why they are cached and how the caching works is much less clear to them. Here I will try to introduce the HTTP caching mechanism in simple, clear language, in the hope that it helps you understand front-end caching correctly.

Before introducing HTTP caching, let's take a quick look at HTTP messages.

HTTP messages are the blocks of data exchanged between the browser and the server. When the browser requests data from the server, it sends a request message; when the server returns data to the browser, it sends a response message. A message has two main parts:

1. The header, which carries attributes: additional information such as cookies and cache information. The rules related to caching are all carried in the header.
2. The body, which carries the data: the part that the HTTP request actually wants to transfer.
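As a rough illustration of the header/body split, the sketch below parses a hand-written HTTP/1.1 response. The function name `split_http_message` and the sample response are made up for this example, and details that real clients handle (chunked bodies, repeated headers, and so on) are ignored.

```python
from typing import Dict, Tuple


def split_http_message(raw: bytes) -> Tuple[str, Dict[str, str], bytes]:
    """Split a raw HTTP/1.1 message into start line, headers, and body."""
    # A blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lowercase.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# Hypothetical response used only for illustration.
RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Cache-Control: max-age=60\r\n"
    b"\r\n"
    b"hello"
)

start_line, headers, body = split_http_message(RAW_RESPONSE)
print(start_line)                # the status line
print(headers["cache-control"])  # cache information lives in the header
print(body)                      # the data the request actually wanted
```

The cache rules discussed in the rest of this article are all read from the `headers` part; the `body` is what ends up stored in the cache.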

Cache rules explained

To make this easier to follow, let's assume that the browser has a cache database for storing cached information.

When the client requests data for the first time, the cache database holds no corresponding cached data, so the client must request it from the server and store it in the cache database once the server responds.




HTTP caching has many rules. They can be classified by whether a request has to be sent to the server again, and I divide them into two main categories: the mandatory cache and the comparison cache.

Before introducing these two kinds of rules in detail, let's use sequence diagrams to get a brief understanding of them.





When cached data already exists, the request process based only on the mandatory cache is as follows:





When cached data already exists, the request process based only on the comparison cache is as follows:

Readers who are not familiar with the caching mechanism may ask: with the comparison cache, a request is sent to the server whether or not the cache ends up being used, so why use a cache at all? Let's set the question aside for now; the answer comes later in this article, where each caching rule is covered in detail.

We can see the difference between the two types of rules: when the mandatory cache is in effect, the browser does not need to interact with the server at all, whereas the comparison cache requires a round trip to the server whether or not it takes effect. The two types of rules can exist at the same time, and the mandatory cache has higher priority than the comparison cache. That is, the mandatory cache rule is evaluated first; if it takes effect, the cache is used directly and the comparison cache rule is never executed.
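The priority described above can be sketched as a small decision function. This is only an illustration: `CacheEntry`, `load`, and the two callbacks are hypothetical names, and the `fresh` flag stands in for whatever expiration rule the mandatory cache uses.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CacheEntry:
    body: bytes
    fresh: bool               # True while the mandatory cache rule still holds
    validator: Optional[str]  # cache identifier used by the comparison cache


def load(
    entry: Optional[CacheEntry],
    revalidate: Callable[[str], Tuple[int, bytes]],
    fetch_full: Callable[[], bytes],
) -> Tuple[bytes, str]:
    """Return (body, source), checking the mandatory cache first."""
    # 1. Mandatory cache: if it is in effect, no request is sent at all.
    if entry is not None and entry.fresh:
        return entry.body, "mandatory cache"
    # 2. Comparison cache: always a round trip, but a 304 carries no body.
    if entry is not None and entry.validator is not None:
        status, body = revalidate(entry.validator)
        if status == 304:
            return entry.body, "comparison cache"
        return body, "server"
    # 3. Nothing usable in the cache database: plain request.
    return fetch_full(), "server"
```

Note that `revalidate` is never called in the first branch, which is exactly what "higher priority" means here.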

Mandatory cache

From the above we know that with the mandatory cache, cached data can be used directly as long as it has not expired. So how does the browser determine whether cached data has expired? Recall that when the browser has no cached data and requests it from the server, the server returns the data together with the cache rules, which are carried in the response header.

For the mandatory cache, two fields in the response header indicate the expiration rule: Expires and Cache-Control.

Using Chrome's developer tools, you can clearly see what happens to a network request when the mandatory cache is in effect.





Expires

The value of Expires is the expiration time returned by the server. If the next request is made before that time, the cached data is used directly.

However, Expires dates from HTTP/1.0. Browsers now use HTTP/1.1 by default, so Expires has largely been superseded.

Another problem is that the expiration time is generated by the server, while the check is made against the client's clock. If the two clocks are out of sync, the cache-hit decision can be wrong.

For these reasons, HTTP/1.1 uses Cache-Control instead.
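To make the clock problem concrete, here is a minimal sketch of an Expires check. The function name and the sample times are assumptions chosen for illustration; the date parsing uses Python's standard `email.utils.parsedate_to_datetime`.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh_by_expires(expires_header: str, client_now: datetime) -> bool:
    """True if the client's clock says the Expires time has not been reached."""
    expires_at = parsedate_to_datetime(expires_header)
    return client_now < expires_at


# Hypothetical values: the server says the resource expires at 07:28:00 GMT.
EXPIRES = "Wed, 21 Oct 2015 07:28:00 GMT"
true_now = datetime(2015, 10, 21, 7, 30, 0, tzinfo=timezone.utc)  # 2 min past expiry
slow_clock = true_now - timedelta(minutes=5)  # client clock runs 5 minutes behind

print(is_fresh_by_expires(EXPIRES, true_now))    # correct clock: expired
print(is_fresh_by_expires(EXPIRES, slow_clock))  # slow clock: still treated as fresh
```

With the slow clock, the client keeps using data that the server already considers expired, which is the kind of cache-hit error described above.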



Cache-Control

Cache-Control is the most important rule. Common directives include private, public, no-cache, max-age, and no-store. (The HTTP specification does not define a default directive; this article treats a response that specifies neither private nor public as private.)

private: only the client may cache the response.

public: both the client and proxy servers may cache the response.

max-age=xxx: the cached content expires after xxx seconds.

no-cache: the cached data must be validated with the comparison cache before it is used (described below).

no-store: nothing is cached; neither the mandatory cache nor the comparison cache is triggered.
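The directives above can be read out of a header value with a few lines of code. This is a simplified sketch: `parse_cache_control` and `storage_policy` are hypothetical helpers, and the naive split on commas would mishandle quoted arguments that themselves contain commas.

```python
from typing import Dict, Union

Directives = Dict[str, Union[str, bool]]


def parse_cache_control(value: str) -> Directives:
    """Parse a Cache-Control header value into {directive: argument or True}."""
    directives: Directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directive names are case-insensitive; flags without "=" map to True.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives


def storage_policy(directives: Directives) -> str:
    """Summarize what the directives mean for the two cache types."""
    if "no-store" in directives:
        return "do not store"
    if "no-cache" in directives:
        return "store, but validate before every use"
    if "max-age" in directives:
        return f"use directly for {int(directives['max-age'])} seconds"
    return "no mandatory cache rule"


print(parse_cache_control("public, max-age=31536000"))
print(storage_policy(parse_cache_control("no-cache")))
```

The order of the checks in `storage_policy` mirrors the list: no-store rules out caching entirely, and no-cache forces validation even when max-age is also present.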



For example:



In the figure, Cache-Control specifies only max-age, so the response is treated as private by default, and the cache lifetime is 31536000 seconds (365 days).

In other words, if this data is requested again within 365 days, it is read straight from the cache database and used directly.
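The arithmetic and the freshness check can be written out as follows. This is a sketch under the assumption that the client records when it stored the response; `is_fresh_by_max_age` is a hypothetical helper, and real browsers take further details into account that are omitted here.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def is_fresh_by_max_age(stored_at: float, max_age: int, now: float) -> bool:
    """True while fewer than max_age seconds have passed since the response was stored."""
    return now - stored_at < max_age


MAX_AGE = 31536000
print(MAX_AGE / SECONDS_PER_DAY)  # 365.0

stored_at = 0.0  # hypothetical timestamp, in seconds, taken from the client's clock
print(is_fresh_by_max_age(stored_at, MAX_AGE, now=364 * SECONDS_PER_DAY))
print(is_fresh_by_max_age(stored_at, MAX_AGE, now=365 * SECONDS_PER_DAY))
```

Because both `stored_at` and `now` come from the same client clock, this check does not depend on the client and server clocks agreeing, unlike the Expires comparison.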



Comparison cache

As the name suggests, the comparison cache requires a comparison to determine whether the cache can be used. The first time the browser requests data, the server returns a cache identifier along with the data, and the client saves both in the cache database. When the client requests the data again, it sends the saved cache identifier to the server, and the server makes its decision based on that identifier. If the check succeeds, the server returns status code 304 to tell the client that the comparison succeeded and the cached data can be used.


First Visit:



Visit again:

Comparing the two figures, we can clearly see that when the comparison cache takes effect, the status code is 304 and both the response size and the request time drop sharply. The reason is that after the comparison, the server returns only the header and uses the status code to tell the client to use its cache; it does not need to send the message body again. This also answers the earlier question of why the comparison cache is worth using even though it always sends a request.

For the comparison cache, how the cache identifier is passed is the key thing to understand. It travels in the request and response headers. There are two kinds of identifier, which we discuss separately.


Last-Modified / If-Modified-Since

Last-Modified:

When the server responds to the request, it tells the browser when the resource was last modified.



If-Modified-Since:

When the client requests the resource again, this field tells the server the last-modified time that the server returned with the previous response.

When the server receives the request and finds the If-Modified-Since header, it compares the header value with the last modification time of the requested resource.

If the resource's last modification time is later than If-Modified-Since, the resource has been modified since then, so the server responds with status code 200 and the entire resource content.

If the resource's last modification time is earlier than or equal to If-Modified-Since, the resource has not been modified, so the server responds with HTTP 304, telling the browser to continue using its saved cache.
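The server-side comparison might look like the sketch below. The function name, the dictionary-based headers, and the sample times are assumptions for illustration; the date formatting and parsing use Python's standard `email.utils` functions.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Tuple


def respond_if_modified_since(
    resource_mtime: datetime, body: bytes, request_headers: Dict[str, str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, response headers, body) for a resource modified at resource_mtime (UTC)."""
    # HTTP dates have one-second resolution, so drop the microseconds.
    mtime = resource_mtime.replace(microsecond=0)
    response_headers = {"Last-Modified": format_datetime(mtime, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            not_modified = mtime <= parsedate_to_datetime(since)
        except (TypeError, ValueError):
            not_modified = False  # unusable date: fall back to a full response
        if not_modified:
            return 304, response_headers, b""  # header only, no body
    return 200, response_headers, body


# Hypothetical resource, last modified at 07:28:00 UTC.
mtime = datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc)
first = respond_if_modified_since(mtime, b"data", {})
again = respond_if_modified_since(
    mtime, b"data", {"If-Modified-Since": first[1]["Last-Modified"]}
)
edited = respond_if_modified_since(
    mtime + timedelta(hours=1), b"data v2", {"If-Modified-Since": first[1]["Last-Modified"]}
)
print(first[0], again[0], edited[0])
```

The second call echoes back the Last-Modified value from the first response, which is what the browser does on a repeat visit.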







ETag / If-None-Match (takes priority over Last-Modified / If-Modified-Since)

ETag:

When the server responds to a request, this field tells the browser the unique identifier of the current resource on the server (the server decides how it is generated).





If-None-Match:

When the client requests the resource again, this field tells the server the unique identifier of the data cached on the client.

When the server receives the request and finds the If-None-Match header, it compares the header value with the unique identifier of the requested resource.

If they differ, the resource has been changed, so the server responds with status code 200 and the entire resource content.

If they match, the resource has not been modified, so the server responds with HTTP 304, telling the browser to continue using its saved cache.
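A minimal sketch of the same check with an ETag follows. Since the generation rule is up to the server, hashing the body with SHA-256 is just one possible choice made for this example; the function names are hypothetical, and weak validators (the `W/` prefix) are not handled.

```python
import hashlib
from typing import Dict, Tuple


def make_etag(body: bytes) -> str:
    """One possible generation rule: a quoted, truncated hash of the content."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond_if_none_match(
    body: bytes, request_headers: Dict[str, str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, response headers, body) for the current resource content."""
    etag = make_etag(body)
    response_headers = {"ETag": etag}
    # If-None-Match may list several identifiers, or "*" for "any version".
    candidates = [t.strip() for t in request_headers.get("If-None-Match", "").split(",")]
    if etag in candidates or "*" in candidates:
        return 304, response_headers, b""  # identifiers match: header only
    return 200, response_headers, body     # changed (or first visit): full content


first = respond_if_none_match(b"version 1", {})
again = respond_if_none_match(b"version 1", {"If-None-Match": first[1]["ETag"]})
changed = respond_if_none_match(b"version 2", {"If-None-Match": first[1]["ETag"]})
print(first[0], again[0], changed[0])
```

Because the identifier here is derived from the content itself, it changes whenever the content changes, regardless of timestamps.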





Conclusion

For the mandatory cache, the server tells the browser a cache lifetime. A request made within that lifetime uses the cache directly; once the lifetime has passed, the comparison cache policy applies.

For the comparison cache, the ETag and Last-Modified values in the cached information are sent to the server with the request and verified there. If the server returns status code 304, the browser uses the cache directly.
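From the browser's side, the comparison cache step could be sketched like this. The names `conditional_headers`, `revalidate`, and the `send` callback are hypothetical; `send` stands in for whatever actually performs the HTTP request.

```python
from typing import Callable, Dict, Tuple

Headers = Dict[str, str]
Response = Tuple[int, Headers, bytes]  # (status, headers, body)


def conditional_headers(stored_headers: Headers) -> Headers:
    """Turn the identifiers saved with a cached response into request headers."""
    request_headers: Headers = {}
    if "ETag" in stored_headers:
        request_headers["If-None-Match"] = stored_headers["ETag"]
    if "Last-Modified" in stored_headers:
        request_headers["If-Modified-Since"] = stored_headers["Last-Modified"]
    return request_headers


def revalidate(
    stored_headers: Headers, stored_body: bytes, send: Callable[[Headers], Response]
) -> Tuple[bytes, bool]:
    """Return (body to use, whether it came from the cache)."""
    status, _headers, body = send(conditional_headers(stored_headers))
    if status == 304:
        return stored_body, True  # the server confirmed the cached copy
    return body, False            # the server sent fresh content
```

A caller would store the new headers and body whenever the second element of the result is `False`, so that the next request carries the updated identifiers.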




First request from browser:





When the browser requests again:

Reprint address: https://www.cnblogs.com/chenqf/p/6386163.html