Preface

Static files such as images, CSS, and JavaScript share one trait: they do not change often. Keeping a copy of these rarely changing files on the client is one way to improve the user's browsing experience. That is what client caching is all about.

The HTTP caching mechanism is an important means of web performance optimization. It should be a basic part of the knowledge of anyone engaged in web development, and it is required knowledge for anyone who wants to become a front-end architect.

Many front-end developers know that the browser caches the static files it requests, but are less clear about why they are cached and how the cache takes effect.

Here I will try to introduce the HTTP caching mechanism in simple, clear language, in the hope that it helps you understand front-end caching correctly.

Cache rules explained

HTTP caching has many rules. I classify them by whether a request must be sent to the server again, which gives two main categories: the forced cache and the comparison cache.

Before introducing these two rules in detail, let’s use sequence diagrams to give you a brief understanding of these two rules.

If cached data already exists, a request under the forced cache alone proceeds as follows:

If cached data already exists, a request under the comparison cache alone proceeds as follows:

Readers who are not familiar with the caching mechanism may ask: under the comparison cache, a request is sent to the server whether or not the cache ends up being used, so why use a cache at all?

Let's set that question aside for now. The answer will come later in this article, when each caching rule is covered in detail.

The difference between the two kinds of rule is this: while the forced cache is in effect, no interaction with the server is needed, whereas the comparison cache must interact with the server whether or not the cached copy turns out to be usable.

The two kinds of rule can coexist, and the forced cache has higher priority than the comparison cache. That is, the forced cache rule is evaluated first, and if it is in effect the cached data is used directly and the comparison cache rule is never evaluated.
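The priority between the two rules can be sketched in a few lines of Python. This is only an illustration of the flow described above, not how any real browser is implemented: CacheEntry, load, and the ask_server callback are hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    body: str
    fresh: bool      # True while the forced cache is still in effect
    validator: str   # cache identifier saved from the earlier response


def load(entry, ask_server):
    """Return (body, source) for a resource that is already cached.

    ask_server(validator) is a hypothetical callback standing in for the
    network: it returns (304, None) when the cached copy is still valid,
    or (200, new_body) when the resource has changed.
    """
    # The forced cache has priority: if it is in effect, no request is sent.
    if entry.fresh:
        return entry.body, "forced cache"

    # Otherwise fall back to the comparison cache, which always asks the server.
    status, new_body = ask_server(entry.validator)
    if status == 304:
        return entry.body, "comparison cache"
    return new_body, "server"
```

Note that ask_server is never called while entry.fresh is true, which is exactly the "comparison cache rule is never evaluated" case.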

Forced cache

From the above we know that under the forced cache, cached data that has not expired can be used directly. So how does the browser decide whether the cached data has expired?

When the browser has nothing cached and requests data from the server, the server returns the data together with the cache rules, which are carried in the response headers.

For the forced cache, two response header fields express the expiry rule: Expires and Cache-Control. With Chrome's developer tools you can see clearly what happens to a network request while the forced cache is in effect.

Expires

The value of Expires is the expiration time returned by the server. If the next request is made before that expiration time, the cached data is used directly.

However, Expires dates from HTTP 1.0. Browsers now use HTTP 1.1 by default, so its role has largely been taken over.

Another problem is that the expiration time is generated by the server, while the check is made against the client's clock. If the two clocks are out of sync, the cache hit decision can be wrong.

So HTTP 1.1 uses Cache-Control instead.
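To make the clock problem concrete, here is a minimal sketch of an Expires check in Python. The function name fresh_by_expires and the two sample clocks are assumptions made for the example; the header value is the one used later in this article.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def fresh_by_expires(expires_header, client_now):
    """True if the client's clock is still before the Expires time."""
    expires_at = parsedate_to_datetime(expires_header)
    return client_now < expires_at


expires = "Sun, 16 Oct 2016 05:43:02 GMT"
accurate_clock = datetime(2016, 10, 16, 5, 0, 0, tzinfo=timezone.utc)
fast_clock = accurate_clock + timedelta(hours=1)  # client clock runs an hour ahead

print(fresh_by_expires(expires, accurate_clock))  # True
print(fresh_by_expires(expires, fast_clock))      # False
```

The same response is treated as fresh or expired depending only on the client's clock, which is the weakness of an absolute expiration time.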

Cache-Control

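Cache-Control is the most important rule. Common values include private, public, no-cache, max-age, and no-store. private is often described as the default, but the HTTP specification does not define a default value: when neither public nor private is given, whether a proxy may also store the response depends on the other directives.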



private: The client can cache the response.
public: Both the client and proxy servers can cache the response. (Front-end developers can treat public and private as the same.)
max-age=xxx: The cached content expires after xxx seconds.
no-cache: The comparison cache must be used to validate the cached data (described below).
no-store: Nothing is cached; neither the forced cache nor the comparison cache is triggered.

For front-end development, the more that is cached the better, so no-store is basically one to say goodbye to.
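A Cache-Control header is a comma-separated list of directives like the ones above. The following Python sketch splits one into a dictionary. parse_cache_control is a hypothetical helper written for this article, and it is deliberately simplified: it does not handle quoted values that themselves contain commas.

```python
def parse_cache_control(header):
    """Split a Cache-Control header into a dict of directives.

    Directives without a value (such as no-cache) map to True.
    """
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else True
    return directives


print(parse_cache_control("private, max-age=600"))  # {'private': True, 'max-age': '600'}
print(parse_cache_control("no-store"))              # {'no-store': True}
```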

Here’s an example:

In this example, Cache-Control specifies only max-age and names neither public nor private. The cache lifetime is 31536000 seconds (365 days). In other words, if the resource is requested again within 365 days, the data is read directly from the cache database and used as is.
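The arithmetic behind that lifetime, and the freshness check it implies, can be sketched as follows. fresh_by_max_age is a hypothetical helper, and the check is simplified: it only compares the age of the stored copy with max-age and ignores time the response may already have spent in intermediate caches.

```python
MAX_AGE = 31536000  # seconds, from Cache-Control: max-age=31536000

print(MAX_AGE / (24 * 60 * 60))  # 365.0


def fresh_by_max_age(stored_at, now, max_age):
    """True while the cached copy is younger than max_age seconds.

    stored_at and now are timestamps in seconds, both read from the
    client's own clock.
    """
    return (now - stored_at) < max_age


one_day = 24 * 60 * 60
print(fresh_by_max_age(0, 364 * one_day, MAX_AGE))  # True
print(fresh_by_max_age(0, 366 * one_day, MAX_AGE))  # False
```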

If that is still unclear, here it is in plainer terms. When a client accesses a resource for the first time, the server returns the resource content together with Expires: Sun, 16 Oct 2016 05:43:02 GMT.

The server tells the browser: cache this file for me first. It will not change before this expiration time, so the next time you need it, don't come to me, just take it from the cache. Quick and easy.

The browser replies: Understood.

So when a second HTML page tries to access the resource before Sun, 16 Oct 2016 05:43:02 GMT, the browser does not fetch the file from the server and serves it from its own cache instead.

However, the browser lives on the client, and the client's clock is not reliable. Users can change the time on their machine as they like. For example, if I set my machine's time to Sun, 16 Oct 2016 05:43:03 GMT, my browser will stop using the cache and fetch the file from the server every time. So the server gets angry: I gave you an absolute time and you cannot judge expiry because your environment has been tampered with, so I will give you a relative time instead. Cache-Control: max-age=600. Now the browser simply caches the file for 10 minutes.

But what if a server sets both Expires and Cache-Control? Cache-Control only exists from HTTP 1.1 onward, and as the more advanced setting it is the one that takes precedence.
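The precedence rule can be sketched as a small Python function that prefers max-age and only falls back to Expires. expiry_time and its arguments are hypothetical names chosen for this example, and headers is assumed to be a plain dict of response headers.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def expiry_time(headers, received_at):
    """Pick the expiry time, preferring Cache-Control max-age over Expires.

    received_at is when the client stored the response. Returns None if
    neither header is usable.
    """
    for part in headers.get("Cache-Control", "").split(","):
        name, _, value = part.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return received_at + timedelta(seconds=int(value))
    if "Expires" in headers:
        return parsedate_to_datetime(headers["Expires"])
    return None


received = datetime(2016, 10, 16, 5, 0, 0, tzinfo=timezone.utc)
both = {"Expires": "Sun, 16 Oct 2016 05:43:02 GMT", "Cache-Control": "max-age=600"}
print(expiry_time(both, received))  # 2016-10-16 05:10:00+00:00
```

With both headers present, the result is ten minutes after the response was stored, not the later time given in Expires.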

Now here is a problem. I have a file that may be updated from time to time, and the server would like the client to check in now and then and ask whether its copy is out of date. If it is not, the server returns no data and simply tells the browser that its cache is still valid (304). The browser then displays the copy it stored. This is called a conditional request.

Comparison cache

The comparison cache, as the name suggests, requires a comparison to decide whether the cache can be used. The first time the browser requests data, the server returns a cache identifier along with the data, and the client saves both in its cache database.

When it requests the data again, the client sends the saved cache identifier to the server. The server checks the identifier, and if the check succeeds it returns status code 304 to tell the client that the saved data can be used. Because a 304 response carries only headers and no body, it is much smaller and faster than resending the data, which is why the comparison cache is still worthwhile even though a request is always sent.

For the comparison cache, how the cache identifier is passed is the key thing to understand. It travels in the request and response headers. There are two kinds of identifier, which we will discuss separately.

Last-Modified / If-Modified-Since

Last-Modified: when the server responds to a request, this field tells the browser when the resource was last modified.

If-Modified-Since: when the browser requests the resource again, this field tells the server the last modification time that the server returned with the previous response.

When the server receives the request and finds the If-Modified-Since header, it compares the value with the current last modification time of the requested resource.

If the resource's last modification time is later than If-Modified-Since, the resource has been modified since, and the server responds with status code 200 and the entire resource content. If the last modification time is earlier than or equal to If-Modified-Since, the resource has not been modified, and the server responds with HTTP 304, telling the browser to keep using its saved cache.
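The server-side comparison can be sketched in Python as below. respond_last_modified is a hypothetical function written for this article, not part of any web framework, and the sample date is simply reused from the Expires example for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def respond_last_modified(last_modified, if_modified_since=None):
    """Return the status code a server would send for the request.

    last_modified is the resource's modification time on the server;
    if_modified_since is the raw request header value, or None if absent.
    """
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        if last_modified <= since:
            return 304  # not modified: the browser keeps using its cache
    return 200  # modified, or a first request: send the full resource


header = "Sun, 16 Oct 2016 05:43:02 GMT"
unchanged = datetime(2016, 10, 16, 5, 43, 2, tzinfo=timezone.utc)
changed = datetime(2016, 10, 17, 0, 0, 0, tzinfo=timezone.utc)

print(respond_last_modified(unchanged, header))  # 304
print(respond_last_modified(changed, header))    # 200
```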

ETag / If-None-Match (higher priority than Last-Modified / If-Modified-Since)

The first time a client accesses a resource, the server returns the resource's content together with ETag: 1234, telling the client that this file's tag is 1234. If the resource is modified, the tag will be different.

The second time the client accesses the resource, it has a copy with ETag 1234 in its cache, so it needs to ask the server whether that copy is out of date. It therefore sends If-None-Match: 1234, telling the server: if the resource on your side is still tagged 1234, return 304 and skip the content; if not, return the resource content to me. The server compares the ETag and decides whether to return 304 or 200.
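Here is a minimal Python sketch of that exchange from the server's side. It assumes the tag is derived from a hash of the content, which is only one possible scheme; make_etag and respond_etag are hypothetical names. Real ETag header values are written in double quotes, so the sketch quotes them too.

```python
import hashlib


def make_etag(content):
    """Derive a tag from the content, so any change yields a new tag."""
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'


def respond_etag(content, if_none_match=None):
    """Return (status, headers, body) for a request with optional If-None-Match."""
    etag = make_etag(content)
    if if_none_match == etag:
        return 304, {"ETag": etag}, b""  # tag still matches: no body needed
    return 200, {"ETag": etag}, content  # first request, or content changed


# First visit: no tag yet, so the full content comes back with its ETag.
status, headers, body = respond_etag(b"version 1")
saved_tag = headers["ETag"]

# Second visit: the client sends the saved tag in If-None-Match.
print(respond_etag(b"version 1", saved_tag)[0])  # 304
print(respond_etag(b"version 2", saved_tag)[0])  # 200
```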

Types of refresh

Once you understand the cache headers above, the various kinds of refresh are easy to understand.

There are three ways to load or refresh a page:



Type the address in the browser and press Enter
F5
Ctrl+F5

Suppose a resource is served with:

Cache-Control: max-age=600 and Last-Modified: Wed, 10 Aug 2013 15:32:18 GMT. The browser then puts the resource file in its cache and decides to fetch it directly from the cache the next time it is needed.

Entering the URL in the browser

The browser finds the file in its cache, still within its max-age, so it sends no request at all and takes the file straight from the cache for display. This is the fastest case.

Now I press F5 to refresh

F5 tells the browser: don't be lazy, at least go to the server and check whether this file is out of date. So the browser sends a request carrying If-Modified-Since: Wed, 10 Aug 2013 15:32:18 GMT.

The server then finds: hey, this file has not been modified since that time, so there is no need to send anything, just return 304. The browser receives the 304 and happily takes the resource from its cache.

But now we press Ctrl+F5

Ctrl+F5 tells the browser: first delete your cached copy of this file, then go to the server and request the complete resource again. The client thus completes a forced update.
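The three cases can be summarized in a short Python sketch. It only restates the behavior described in this section as a lookup; handle and its string arguments are hypothetical, and real browsers differ in the details.

```python
def handle(action, cache_fresh):
    """Describe what the browser does for each way of loading a resource.

    action is one of "enter", "f5", "ctrl+f5"; cache_fresh says whether
    the forced cache (for example max-age) is still in effect.
    """
    if action == "ctrl+f5":
        return "discard cached copy, request full resource"
    if action == "f5" or not cache_fresh:
        return "send conditional request, use cache on 304"
    return "use cache, no request"


for action in ("enter", "f5", "ctrl+f5"):
    print(action, "->", handle(action, cache_fresh=True))
```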

By the way, ETag is actually rarely used in practice: the tag has to be computed by an algorithm, which consumes server computing resources, and server resources are precious.