Preface:
The cache discussed here is the browser cache: the browser stores requested static resources (HTML, CSS, JS, images) in memory or on the computer's local disk, so that the next time it needs them it can load them directly from the local copy without requesting them from the server.
1 Advantages and disadvantages of enabling caching
1.1 Advantages:
Reduces unnecessary data transmission and saves bandwidth
Reduces server load and improves site performance
Speeds up page loading on the client
Improves the user experience
1.2 Disadvantages:
If a resource changes but the client's cached copy is not updated in time, users receive stale content; this is worse still if the old version contains bugs.
2 Cache Principles (Strong Cache and Negotiated Cache)
How does the browser decide which resources to cache and which not to? How does it decide whether to read a resource from the cache or fetch an updated copy from the server? To answer this, we first need to understand two concepts:
2.1 Strong Cache (Cache-Control)
Strong cache (also called mandatory or forced cache): when the browser requests a file, the server states in the response header how the file may be cached. The cache lifetime and cache type are controlled by the server, as follows:
Response header Cache-Control: max-age, public, private, no-cache, no-store
- max-age is the cache lifetime in seconds; for example, max-age=31536000 means 31536000 seconds (one year).
- public means the resource can be cached by browsers (clients) and by proxy servers. The proxy server is commonly nginx.
- immutable indicates that the resource never changes;
A resource marked immutable is not necessarily unchangeable in practice; the directive is set so that the browser does not contact the server when the user refreshes the page.
If you only set Cache-Control: max-age=31536000, public, this is a strong cache. Each time the user opens the page normally, the browser checks whether the cached copy has expired; if it has not, the browser reads the data from the cache.
If you set Cache-Control: max-age=31536000, public, immutable, the browser does not contact the server even when the user refreshes the page; it reads the resource directly from local disk or memory and reports status 200. See the red box above (from memory cache).
This is what the Facebook team proposed in 2015 to the IETF working group that develops the HTTP standard: they wanted a directive added to the Cache-Control response header to indicate that a resource will not change, so that browsers no longer need to send conditional requests for it.
Cache-Control: max-age=XXXX, public
Both the client and proxy servers can cache this resource;
If the client requests the resource again within XXXX seconds, it reads the cache directly (status code 200); if the user refreshes the page, it sends an HTTP request to the server.
Cache-Control: max-age=XXXX, private
Only the client can cache the resource; proxy servers do not cache it.
Within XXXX seconds the client reads the cache directly (status code 200).
Cache-Control: max-age=XXXX, immutable
If the client requests the resource again within XXXX seconds, it reads the cache directly (status code 200) and does not send an HTTP request to the server even if the user refreshes the page.
Cache-Control: no-cache
Skips the strong cache but does not prevent the negotiated cache. Normally, when a strong cache is set, the negotiated cache is used only after the strong cache has expired. With no-cache the strong cache is never used, and every request is checked with the server.
Cache-Control: no-store
No caching at all: neither the client nor the proxy server stores the response, so there is neither a strong cache nor a negotiated cache.
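The directives above can be read as a small decision procedure. The sketch below is a simplified model for illustration, not how any particular browser implements it; `parseCacheControl` and `decide` are hypothetical helper names, and `ageSeconds` stands for how long ago the response was stored.

```javascript
// Parse a Cache-Control header value into a plain object.
// "max-age=3600, public" -> { 'max-age': 3600, public: true }
function parseCacheControl(header) {
  const directives = {};
  for (const part of header.split(',')) {
    const [rawName, rawValue] = part.trim().split('=');
    if (!rawName) continue;
    const name = rawName.toLowerCase();
    directives[name] = rawValue === undefined ? true : Number(rawValue);
  }
  return directives;
}

// Decide what a cache may do with a stored response that is ageSeconds old.
// Returns 'do-not-store', 'revalidate', or 'use-cache'.
function decide(header, ageSeconds) {
  const d = parseCacheControl(header);
  if (d['no-store']) return 'do-not-store'; // nothing may be cached
  if (d['no-cache']) return 'revalidate';   // stored, but always negotiated
  if (typeof d['max-age'] === 'number' && ageSeconds < d['max-age']) {
    return 'use-cache';                     // strong cache hit
  }
  return 'revalidate';                      // expired: fall back to negotiation
}

console.log(decide('max-age=31536000, public', 60)); // use-cache
console.log(decide('no-cache', 60));                 // revalidate
```

The model ignores the public/private distinction because it describes a single client cache; a proxy would additionally refuse to store responses marked private.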
2.2 Negotiated Cache
Strong caching sets an expiration time on a resource. Each time the client needs the resource, it checks whether that time has passed, and contacts the server only if the resource has expired. Strong caching is therefore handled entirely by the client. When the client finds that the resource has expired, it sends a request to the server, and that request is where the negotiated cache comes in. The negotiated cache therefore requires interaction between the client and the server.
Negotiated cache settings, set in the response header:
ETag: "5c20abbd-e2e8"
Last-Modified: Mon, 24 Dec 2018 09:49:49 GMT
Every response carries ETag and Last-Modified in its headers. On the next request for the same resource, the browser sends these two values back in the request headers (If-None-Match and If-Modified-Since), and the server compares them with the current state of the resource to decide whether it has changed. If it has changed, the server returns the new resource with status 200 and updated ETag and Last-Modified response headers. If it has not changed, ETag and Last-Modified stay the same, the server returns status 304, and the client takes the resource from its local cache. In this setup the client negotiates with the server on every request.
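The server-side comparison described above can be sketched as a pure function. This is a simplified illustration: `validate` and the shape of the `resource` object are assumptions made for the example, and the sketch checks the ETag first and then the date when both are sent. If-None-Match and If-Modified-Since are the standard request headers that carry the ETag and Last-Modified values back to the server.

```javascript
// Decide between 304 and 200 for a conditional request.
// requestHeaders: lower-cased header names, as Node.js provides them.
// resource: { etag: string, lastModified: Date, body: string } (hypothetical shape)
function validate(requestHeaders, resource) {
  const validators = {
    ETag: resource.etag,
    'Last-Modified': resource.lastModified.toUTCString(),
  };
  const ifNoneMatch = requestHeaders['if-none-match'];
  const ifModifiedSince = requestHeaders['if-modified-since'];
  const hasValidator = ifNoneMatch !== undefined || ifModifiedSince !== undefined;

  // ETag check: passes when the client did not send one or when it matches.
  const etagOk = ifNoneMatch === undefined || ifNoneMatch === resource.etag;

  // Date check: HTTP dates have one-second resolution, so compare whole seconds.
  let dateOk = true;
  if (ifModifiedSince !== undefined) {
    const since = Math.floor(Date.parse(ifModifiedSince) / 1000);
    const modified = Math.floor(resource.lastModified.getTime() / 1000);
    dateOk = modified <= since; // an unparsable date yields NaN and fails the check
  }

  return hasValidator && etagOk && dateOk
    ? { status: 304, headers: validators, body: '' }
    : { status: 200, headers: validators, body: resource.body };
}
```

A first visit sends neither header, so the function returns 200 with the body and both validators; later visits that echo the validators back receive 304 with an empty body.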
2.2.1 Roles of ETag and Last-Modified
ETag exists mainly to cover cases that Last-Modified handles poorly:
Reason 1: some files are rewritten periodically without their content changing (only the modification time changes); in that case we do not want the client to treat the file as modified and GET it again.
Reason 2: some files are modified very frequently, for example several times within one second. If-Modified-Since works at one-second granularity, so such modifications cannot be detected (the UNIX mtime as recorded is likewise only accurate to the second).
Reason 3: some servers cannot determine exactly when a file was last modified.
In these cases an ETag gives more precise cache control, because an ETag is a unique server-side identifier for the resource, generated automatically by the server or by the developer.
Last-Modified and ETag can be used together. The server checks the ETag first; if it matches, the server goes on to compare Last-Modified, and only then decides whether to return 304.
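To make the first two cases listed above concrete, the sketch below derives an ETag from the file content and contrasts it with the one-second granularity of Last-Modified. Hashing the body with SHA-1 is only one possible scheme, chosen here for illustration; servers are free to generate ETags differently, and `contentEtag` and `lastModifiedHeader` are hypothetical helpers.

```javascript
const crypto = require('crypto');

// Build an ETag from the bytes of the resource (one possible scheme).
function contentEtag(body) {
  const hash = crypto.createHash('sha1').update(body).digest('hex');
  return `"${hash.slice(0, 16)}"`;
}

// Last-Modified is an HTTP date, which drops milliseconds.
function lastModifiedHeader(mtime) {
  return mtime.toUTCString();
}

// Case 1: the file is rewritten later with identical content.
const before = { body: 'body { color: red; }', mtime: new Date('2018-12-24T09:49:49.000Z') };
const touched = { body: 'body { color: red; }', mtime: new Date('2018-12-24T10:00:00.000Z') };
console.log(contentEtag(before.body) === contentEtag(touched.body));                 // true
console.log(lastModifiedHeader(before.mtime) === lastModifiedHeader(touched.mtime)); // false

// Case 2: the content changes again within the same second.
const edited = { body: 'body { color: blue; }', mtime: new Date('2018-12-24T09:49:49.500Z') };
console.log(lastModifiedHeader(before.mtime) === lastModifiedHeader(edited.mtime));  // true
console.log(contentEtag(before.body) === contentEtag(edited.body));                  // false
```

In case 1 only Last-Modified reports a change that did not happen; in case 2 only the ETag notices a change that did.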
2.3 Differences
Strong cache vs. negotiated cache:
Cache type | Controlling headers | How the resource is obtained | Status code | Request sent to the server |
---|---|---|---|---|
Strong cache | Expires / Cache-Control | From the cache | 200 (from cache) | No, the cache is read directly |
Negotiated cache | ETag / Last-Modified | From the cache | 304 (Not Modified) | Yes, the server tells the browser whether the cache is usable |
3. Extended knowledge:
Understanding from memory cache and from disk cache
Status | Type | Description |
---|---|---|
200 | from memory cache | No network request; the resource is in memory. Scripts, fonts, and images are generally stored in memory; |
200 | from disk cache | No network request; the resource is on disk. Non-script resources such as CSS are generally stored there; |
200 | Resource size value | The latest resource is downloaded from the server; |
304 | Message size | The server is asked, finds the resource unmodified, and the local cache is used; |
3.1 Three-level Cache Principle:
- Look in the memory cache first; if the resource is there, load it directly.
- If it is not in memory, look in the disk cache; if it is there, load it directly.
- If it is not on disk either, make a network request.
- The loaded resource is then cached to disk and to memory.
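The lookup order above can be sketched with two in-process Maps. This is a toy model rather than the browser's actual implementation: `load`, `closeBrowser`, and the `fetchFromNetwork` callback are hypothetical names, and copying a disk hit back into memory is an assumption made so that the next lookup is served from memory.

```javascript
// Two Maps stand in for the browser's memory cache and disk cache.
const memoryCache = new Map();
const diskCache = new Map();

// fetchFromNetwork is a hypothetical callback that returns the resource body.
function load(url, fetchFromNetwork) {
  // Step 1: memory first.
  if (memoryCache.has(url)) {
    return { source: 'memory cache', body: memoryCache.get(url) };
  }
  // Step 2: then disk.
  if (diskCache.has(url)) {
    const body = diskCache.get(url);
    memoryCache.set(url, body); // assumption: promote to memory for next time
    return { source: 'disk cache', body };
  }
  // Step 3: finally the network.
  const body = fetchFromNetwork(url);
  // Step 4: store what was loaded in both caches.
  diskCache.set(url, body);
  memoryCache.set(url, body);
  return { source: 'network', body };
}

// Closing the browser empties memory but leaves the disk cache in place.
function closeBrowser() {
  memoryCache.clear();
}

const fetchImage = () => 'image-bytes';
console.log(load('/logo.png', fetchImage).source); // network
console.log(load('/logo.png', fetchImage).source); // memory cache
closeBrowser();
console.log(load('/logo.png', fetchImage).source); // disk cache
console.log(load('/logo.png', fetchImage).source); // memory cache
```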
This lets us explain the following behaviour, taking an image 📷 as an example (see also flow chart 1 below):
Scene 1:
Access -> 200 -> Exit browser and reopen -> 200 (from disk cache) -> F5 refresh -> 200 (from memory cache) -> Ctrl + F5 refresh -> 200
3.2 Some conclusions (using Google Chrome as an example):
If an image is base64-encoded, it is always from memory cache.
In private (incognito) mode, almost everything is from memory cache.
4 Impact of user operations on cache
User action | Expires / Cache-Control (strong cache) | Last-Modified / ETag (negotiated cache) |
---|---|---|
Enter in the address bar | effective | effective |
Following a link | effective | effective |
Opening a new window | effective | effective |
Back button | effective | effective |
F5 refresh | invalid | effective |
Ctrl+F5 forced refresh | invalid | invalid |
Diagram 1
5. Application in projects
5.1 Back-end servers such as Node.js:
To set the strong cache and the negotiated cache:
res.setHeader('Cache-Control', 'max-age=3600, public')
res.setHeader('ETag', '"5c20abbd-e2e8"')
res.setHeader('Last-Modified', 'Mon, 24 Dec 2018 09:49:49 GMT')
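For context, here is a minimal sketch of how those three headers might sit inside a Node.js request handler built on the core `http` module. The `resource` object and `handler` function are hypothetical, the handler only compares the ETag (via If-None-Match) to keep the example short, and the port in the commented-out `listen` call is arbitrary.

```javascript
const http = require('http');

// Hypothetical in-memory resource; the ETag and date reuse the values from the text.
const resource = {
  body: 'console.log("hello");',
  etag: '"5c20abbd-e2e8"',
  lastModified: 'Mon, 24 Dec 2018 09:49:49 GMT',
};

function handler(req, res) {
  // Strong cache: fresh for one hour, cacheable by browsers and proxies.
  res.setHeader('Cache-Control', 'max-age=3600, public');
  // Validators for the negotiated cache.
  res.setHeader('ETag', resource.etag);
  res.setHeader('Last-Modified', resource.lastModified);

  if (req.headers['if-none-match'] === resource.etag) {
    res.statusCode = 304; // unchanged: the browser reuses its local copy
    res.end();
    return;
  }
  res.statusCode = 200;
  res.setHeader('Content-Type', 'application/javascript');
  res.end(resource.body);
}

const server = http.createServer(handler);
// server.listen(3000); // uncomment to serve on a port of your choice
```

With this in place the browser serves the file from its own cache for an hour, and after that sends If-None-Match so the server can answer 304 instead of resending the body.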
5.2 Nginx Configuration:
5.3 Strategy
index.html is cached by negotiation. Other static resources (JS, CSS, images) use strong cache plus negotiated cache.
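One way to express this strategy in code is a function that picks the Cache-Control value from the request path. The sketch below is an assumption-laden illustration: `cacheControlFor`, the extension list, and the one-year lifetime are choices made for the example, and the hashed file name in the usage line is hypothetical.

```javascript
const path = require('path');

// Extensions treated as long-lived static assets (an assumption for this sketch).
const STATIC_EXTENSIONS = new Set(['.js', '.css', '.png', '.jpg', '.jpeg', '.gif', '.svg', '.webp']);

// Return the Cache-Control value for a request path.
function cacheControlFor(requestPath) {
  const ext = path.extname(requestPath).toLowerCase();
  if (STATIC_EXTENSIONS.has(ext)) {
    // Strong cache for one year; ETag / Last-Modified still allow negotiation after expiry.
    return 'max-age=31536000, public';
  }
  // index.html and anything else: always negotiate with the server.
  return 'no-cache';
}

console.log(cacheControlFor('/index.html'));           // no-cache
console.log(cacheControlFor('/static/app.3f2a1c.js')); // max-age=31536000, public
```

Negotiating index.html on every request keeps the entry point current, so a long lifetime on the other assets is safest when their file names change whenever their content does.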