Before we talk about front-end resource caching, let’s look at two scenarios:
Scenario 1:
In a business communication group:
Internal employee A: Tech team, the XX system can't be accessed.
Product manager B: @Developer C, please take a look.
Developer C: @Tester D, please help reproduce it.
Tester D gets busy…
Tester D: @Internal employee A, I can't reproduce it on my side. Could you refresh the page and try again?
Employee A: I've tried, but it still doesn't work.
Tester D: How about a hard refresh?
Employee A: What is a hard refresh?
Tester D: Ctrl + F5.
Employee A: What is Ctrl + F5?
Tester D: …
Half a day later, employee A finally learned how to hard-refresh the page…
Employee A: It's normal now. Thank you!
Scenario 2:
A new requirement goes live and product manager A runs the production acceptance… The page is blank and cannot be accessed…
In the communication group:
Product manager A: @Tester B, I can't access it?
Tester B tries a flurry of operations…
Tester B: @Product manager A, I have no problem here. Can you try again?
Product manager A skillfully hard-refreshes to clear the cache.
Product manager A: @Tester B, it's normal now.
After a while…
Product manager B: @Tester B, I can't access it.
Tester B: @Product manager B, please try a hard refresh.
Product manager B: I did, but it didn't help.
The tester scratches their head…
A few moments later…
Product manager B: Now it suddenly works…
The tester continues to scratch their head… The front end continues to scratch its head…
In both scenarios one keyword stands out: cache. A failure occurred, and clearing the cache resolved it. Does that mean the cache caused the failure? If so, can we simply not cache at all?
So what exactly is the point of caching?
1. Faster access for users, which improves user retention and in turn promotes conversion. 2
Since caching is necessary, why do these two scenarios have problems? Is it caching? Or are we using the cache incorrectly? Let’s take a look at how caching works.
Where do caches exist in general?
1. Browser
2. CDN server
3. Origin server (build cache)
What resources are typically cached?
1. HTML files 2. CSS files 3. JS files 4. Font files 5. Image files. These five are the most common; in practice there are certainly more.
How are these files cached in the browser?
1. The browser caches spontaneously 2. The browser decides whether to cache based on the attributes of the HTTP response header
Spontaneous browser caching is rare, so let's look at caching based on the HTTP protocol. Here are the cache attributes of the HTTP response header:
Property name | Value | Priority | HTTP version | Description |
---|---|---|---|---|
Expires | Date | low | 1.0 | Expiration time of the resource; it is compared against the client clock, so it is prone to deviation |
Cache-Control | max-age | high | 1.1 | How long the resource may be cached, in seconds |
– | s-maxage | high | 1.1 | Cache duration for shared caches such as a CDN; there it takes priority over max-age and Expires |
– | public | high | 1.1 | Shared caches such as a CDN are allowed to cache the resource |
– | private | high | 1.1 | Only the browser may cache the resource; CDN caching is disabled |
– | no-cache | high | 1.1 | The browser may store the resource, but must confirm with the server each time whether the resource has changed |
– | no-store | high | 1.1 | Caching the resource is prohibited entirely |
– | must-revalidate | high | 1.1 | Once the resource has expired, it must be revalidated with the server before it is used again |
Pragma | no-cache | – | 1.0 | Only for backward compatibility with HTTP/1.0 cache servers; behaves like Cache-Control: no-cache |
Last-Modified | Date | – | – | Time when the resource was last modified |
ETag | string | – | – | Resource identifier, usually an MD5 or other hash value |
Cache attributes of the HTTP request header:
Property name | Value | Priority | HTTP version | Description |
---|---|---|---|---|
Cache-Control | max-age | high | 1.1 | Maximum age, in seconds, of a cached response the client is willing to accept |
– | no-cache | high | 1.1 | A cached copy must not be used without first confirming with the server whether the resource has changed |
Pragma | no-cache | – | 1.0 | Only for backward compatibility with HTTP/1.0 cache servers; behaves like Cache-Control: no-cache |
If-Modified-Since | Date | – | – | Last-modified time of the copy held by the client |
If-None-Match | string | – | – | Resource identifier of the copy held by the client |
How does the browser cache work?
1. How does the browser force a refresh or bypass the resource cache?
Set Cache-Control: no-cache or Pragma: no-cache in the request header.
2. How do we enable the browser's strong cache?
Set one of the following attributes in the resource's response header: 1) Expires: Mon, 10 Aug 2020 06:26:14 GMT 2) Cache-Control: max-age=604800
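The strong-cache decision can be sketched as a small function. This is a minimal illustration, not how any particular browser implements it: the `isFresh` name, the lowercase header keys, and the millisecond timestamps are assumptions made for the example, and spontaneous (heuristic) caching is deliberately left out.

```javascript
// Decide whether a stored response may be reused without contacting the server.
// headers: response headers with lowercase keys (an assumption of this sketch).
// storedAt / now: timestamps in milliseconds since the epoch.
function isFresh(headers, storedAt, now) {
  const cacheControl = headers["cache-control"] || "";

  // no-store and no-cache both rule out a strong-cache hit.
  if (/(^|,)\s*(no-store|no-cache)\s*(,|$)/i.test(cacheControl)) return false;

  // Cache-Control: max-age has higher priority than Expires.
  const match = /max-age=(\d+)/i.exec(cacheControl);
  if (match) {
    const ageSeconds = (now - storedAt) / 1000;
    return ageSeconds < Number(match[1]);
  }

  // Fall back to the absolute Expires date.
  if (headers["expires"]) {
    const expiresAt = Date.parse(headers["expires"]);
    return !Number.isNaN(expiresAt) && now < expiresAt;
  }

  // No explicit freshness information: treat as not fresh in this sketch.
  return false;
}

const storedAt = Date.parse("Mon, 03 Aug 2020 06:26:14 GMT");
const oneHourLater = storedAt + 60 * 60 * 1000;
console.log(isFresh({ "cache-control": "max-age=604800" }, storedAt, oneHourLater)); // true
console.log(isFresh({ expires: "Mon, 10 Aug 2020 06:26:14 GMT" }, storedAt, oneHourLater)); // true
```

While `isFresh` returns true the browser never asks the server, which is exactly why a newly published file can stay invisible to the user.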
3. How does the negotiation cache work?
The request header carries the following attributes: 1) If-Modified-Since: Tue, 21 Jul 2020 17:21:36 GMT 2) If-None-Match: W/"5f172420-cd9a2"
The response header returns the following attributes: 1) Last-Modified: Tue, 21 Jul 2020 17:21:36 GMT 2) ETag: W/"5f172420-cd9a2"
If the If-Modified-Since time equals the Last-Modified time and the If-None-Match value equals the ETag value, the server resource has not changed: the server answers 304 Not Modified and the browser reads the resource from its local cache. If the values do not match, the server resource has changed and the new resource has to be pulled from the server again.
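The negotiation step can be sketched from the server's point of view. This is an illustrative sketch under assumptions: `respond` and the shape of the `resource` object are hypothetical, ETags are compared as plain strings, and, following the HTTP specification, If-None-Match is checked first and If-Modified-Since is only consulted when no ETag was sent.

```javascript
// Answer a (possibly conditional) request for one resource.
// requestHeaders: lowercase keys; resource: { etag, lastModified, body }.
function respond(requestHeaders, resource) {
  const ifNoneMatch = requestHeaders["if-none-match"];
  const ifModifiedSince = requestHeaders["if-modified-since"];

  let notModified = false;
  if (ifNoneMatch !== undefined) {
    // The ETag comparison takes precedence when the client sent one.
    notModified = ifNoneMatch === resource.etag;
  } else if (ifModifiedSince !== undefined) {
    // Otherwise compare modification times.
    notModified = Date.parse(resource.lastModified) <= Date.parse(ifModifiedSince);
  }

  const validators = { etag: resource.etag, "last-modified": resource.lastModified };
  return notModified
    ? { status: 304, headers: validators, body: null } // client reuses its cached copy
    : { status: 200, headers: validators, body: resource.body }; // client downloads the new copy
}

const resource = {
  etag: 'W/"5f172420-cd9a2"',
  lastModified: "Tue, 21 Jul 2020 17:21:36 GMT",
  body: "console.log('hello world!')",
};
console.log(respond({ "if-none-match": 'W/"5f172420-cd9a2"' }, resource).status); // 304
console.log(respond({ "if-none-match": 'W/"0000-old"' }, resource).status); // 200
```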
Now that we understand how the browser cache works, let's go back to scenario 1. We suspect that a strong cache was in effect and had not yet expired, while the resources and interfaces on the server had already been updated. The user was still loading the old resources, and the old resources called the new interfaces, which caused the failure. That is why the problem was fixed once the user hard-refreshed and requested the latest resources from the server. But we can't help asking: how should this problem really be solved? Why didn't the browser request the new resources when we had already published them?
Let’s look at a more detailed resource request flow:
If nothing in our request changes, users cannot get the new resource during the cache lifetime, no matter how many times we publish it. Some readers may think of adding a version number or a timestamp to the request. Let's see what happens if we use version numbers. See below:
Should the HTML not be cached? Let's look at what happens if the HTML is not cached:
Obviously, as long as the HTML is not cached and resources are requested with a version number, the problem of cached resources never updating is solved. This is the solution many traditional projects adopt. But is it the best solution? Obviously not. The version number scheme has two clear problems: 1) The version number has to be adjusted manually for every release (some will say a timestamp solves this, but a timestamp still suffers from problem 2). 2) Every release invalidates the whole cache: with 1,000 front-end static resources, changing just one invalidates the cache for the other 999, which is obviously not what we want.
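The full-invalidation problem of the version number scheme can be seen in a few lines. The `withVersion` helper, the file paths, and the version strings below are hypothetical and only serve the illustration.

```javascript
// Append the release version as a query string (hypothetical helper).
function withVersion(path, version) {
  return `${path}?v=${version}`;
}

const assets = ["/js/a.js", "/js/b.js", "/css/a.css"];

// Suppose only /js/a.js was edited between the two releases.
const release1 = assets.map((path) => withVersion(path, "1.0.0"));
const release2 = assets.map((path) => withVersion(path, "1.0.1"));

// Every URL differs, so the browser treats every file as a new resource.
const changed = release2.filter((url, i) => url !== release1[i]).length;
console.log(`${changed} of ${assets.length} URLs changed`); // 3 of 3 URLs changed
```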
So how else can we improve the scheme? Here we have to praise the power of Webpack, because with Webpack we can put the cache to much better use.
Three hash values for webpack
1. hash
Generated from the build result of the entire project; it changes whenever anything in the project changes.
2. chunkhash
Generated from the build result of a chunk; it changes only when the content of that chunk changes. Files extracted from the same chunk share the hash, so a style-only change also renames the JS file. For example:
// Before modification
// a.vue => a.fda123fd.js
<template>
<div class='red'>hello world!</div>
</template>
// a.vue => a.fda123fd.css
<style>
.red {
color: red;
}
</style>

// After modification
// a.vue => a.f123klnk.js
<template>
<div class='red'>hello world!</div>
</template>
// a.vue => a.f123klnk.css
<style>
.red {
color: #f00;
}
</style>
3. contenthash
Generated from the content of the output file itself; it stays the same as long as that file's content stays the same. For example:
// Before modification
// a.vue => a.fda123fd.js
<template>
<div class='red'>hello world!</div>
</template>
// a.vue => a.45h6j7k8.css
<style>
.red {
color: red;
}
</style>

// After modification
// a.vue => a.fda123fd.js
<template>
<div class='red'>hello world!</div>
</template>
// a.vue => a.3df4g56j.css
<style>
.red {
color: #f00;
}
</style>
Note that if the style block carries the scoped attribute, the CSS file's contenthash will still change after a recompile, even when you use contenthash and only adjusted the template or script without touching the style content.
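The idea behind contenthash can be sketched without Webpack: derive the file name from the file's own content. This is only an illustration. FNV-1a is used here as a small stand-in hash and is not what Webpack uses internally; `fnv1a`, `hashedName`, and the sample contents are assumptions made for the example.

```javascript
// FNV-1a: a small non-cryptographic hash, used only as a stand-in here.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  // Render as an unsigned 8-digit hex string.
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Build "name.<hash of content>.ext" for one output file.
function hashedName(name, ext, content) {
  return `${name}.${fnv1a(content)}.${ext}`;
}

const js = "render('hello world!')";
const cssBefore = ".red { color: red; }";
const cssAfter = ".red { color: #f00; }";

const before = { js: hashedName("a", "js", js), css: hashedName("a", "css", cssBefore) };
const after = { js: hashedName("a", "js", js), css: hashedName("a", "css", cssAfter) };

console.log(before.js === after.js); // the JS content is unchanged, so its name is unchanged
console.log(before.css === after.css); // the CSS content changed, so its name changed
```

Because the JS name is identical across the two builds, the browser keeps using its cached copy and only the CSS file is downloaded again.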
Of the three hash values, contenthash is the best choice: a new file name is generated only when the compiled content of that file changes, which lets us make full use of the cache. Adjusted scheme:
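One way to pair hashed file names with cache headers is sketched below. It assumes, as discussed earlier, that HTML is not strongly cached while content-hashed assets can be cached for a long time. The `cacheControlFor` function, the 8-hex-digit hash pattern, the one-year and ten-minute durations, and the file extensions are all assumptions of this sketch, not values taken from any particular server.

```javascript
// Matches names such as a.fda123fd.js (an 8-hex-digit hash before the extension).
const HASHED_ASSET = /\.[0-9a-f]{8}\.(js|css|woff2?|png|jpe?g|svg)$/i;

// Pick a Cache-Control response header value for a request path.
function cacheControlFor(path) {
  // HTML is the entry point: always revalidate so new asset names are picked up.
  if (path.endsWith(".html")) return "no-cache";

  // A hashed name changes whenever the content changes, so it can be cached for a long time.
  if (HASHED_ASSET.test(path)) return "public, max-age=31536000";

  // Anything else gets a short lifetime (hypothetical default).
  return "public, max-age=600";
}

console.log(cacheControlFor("/index.html")); // no-cache
console.log(cacheControlFor("/static/a.fda123fd.js")); // public, max-age=31536000
console.log(cacheControlFor("/static/logo.png")); // public, max-age=600
```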
After reading the above, you probably have a good idea of what happened in scenario 1. But what about scenario 2? Here another caching layer comes into play: the CDN server cache.
CDN server cache
CDN servers exist to get our website resources to users faster, and most user-facing websites sit behind one. Ensuring that the resources on the CDN servers are correct is therefore very important.
Diagram of CDN:
How does the CDN cache resources from the origin site?
Can the CDN cache the wrong content?
If the publishing method is not appropriate, the CDN may cache an error. For example, if the HTML page is copied before the JS resources, a user can load the new HTML while the JS it requests does not exist yet, and a CDN node may cache that failed response. So the safest way is to copy JS and other resources first, and then copy the HTML pages.
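The publishing order described above can be sketched as a simple sort step in a deploy script. `uploadOrder` and the file names are hypothetical; a real script would pass the result to whatever copy or upload command the project uses.

```javascript
// Return the files in a safe publishing order: assets first, HTML pages last.
function uploadOrder(files) {
  const isHtml = (file) => file.endsWith(".html");
  const assets = files.filter((file) => !isHtml(file));
  const pages = files.filter(isHtml);
  return [...assets, ...pages];
}

const buildOutput = ["index.html", "a.fda123fd.js", "a.45h6j7k8.css"];
console.log(uploadOrder(buildOutput));
// [ 'a.fda123fd.js', 'a.45h6j7k8.css', 'index.html' ]
```

With this order, every file a new HTML page refers to already exists by the time the page itself becomes visible.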
What if the CDN has cached the wrong content?
You can fix it by refreshing (purging) the CDN cache.
What are the CDN's back-to-origin policies?
1. Active back-to-origin: triggered by refreshing the CDN. 2. Passive back-to-origin: triggered by a user request, when the cache has expired or the resource is requested for the first time.
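The two back-to-origin policies can be sketched with a toy CDN node. This is a simplified model under assumptions: `CdnNode`, its `get` and `purge` methods, the fixed TTL, and the in-memory origin are all hypothetical, and real CDNs are configured through their own consoles or APIs.

```javascript
// A toy CDN edge node with a time-limited cache.
class CdnNode {
  constructor(fetchFromOrigin, ttlMs) {
    this.fetchFromOrigin = fetchFromOrigin;
    this.ttlMs = ttlMs;
    this.cache = new Map();
  }

  // Passive back-to-origin: fetch on a first request or when the entry has expired.
  get(path, now) {
    const entry = this.cache.get(path);
    if (entry && now < entry.expiresAt) return entry.body;
    const body = this.fetchFromOrigin(path);
    this.cache.set(path, { body, expiresAt: now + this.ttlMs });
    return body;
  }

  // Active back-to-origin: dropping the entry forces the next request to refetch.
  purge(path) {
    this.cache.delete(path);
  }
}

const origin = new Map([["/index.html", "v1"]]);
const nodeA = new CdnNode((path) => origin.get(path), 60000);
const nodeB = new CdnNode((path) => origin.get(path), 60000);

nodeA.get("/index.html", 0); // node A caches v1
origin.set("/index.html", "v2"); // a new release reaches the origin

console.log(nodeA.get("/index.html", 1000)); // v1 (node A still serves its cached copy)
console.log(nodeB.get("/index.html", 1000)); // v2 (first request on node B goes to the origin)
nodeA.purge("/index.html");
console.log(nodeA.get("/index.html", 2000)); // v2
```

Two users routed to different nodes can therefore see different versions of the same page until the stale entry expires or is purged.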
By now you probably have a sense of what happened in scenario 2. Why could some people access the page normally while others could not? Most likely because some CDN nodes had cached an error.