Preface:

The cache discussed here is the browser cache: the browser stores requested static resources (HTML, CSS, JS, images) in its memory or on the computer's local disk, so that the next time it needs them it can load them directly from local storage instead of requesting them from the server again.

1 Advantages and disadvantages of enabling caching

1.1 Advantages:

Reduces unnecessary data transfer and saves bandwidth

Reduces server load and improves site performance

Speeds up page loading on the client

Improves the user experience

1.2 Disadvantages:

If a resource changes but the client's cached copy is not updated in time, users keep seeing stale content; this is worse when the old version contains bugs.

2 Cache Principles (Strong Cache and Negotiated Cache)

How does the browser decide which resources to cache and which not to? How does it decide whether to read a resource from the cache or fetch a fresh copy from the server? To answer this, we first need to understand two concepts:

2.1 Strong Cache (Cache-Control)

Strong cache (also called forced cache): when the browser requests a file, the server declares in the response header how that file may be cached. The cache duration and cache type are controlled by the server, as follows:

Response header Cache-Control directives: max-age, public, private, immutable, no-cache, no-store

  • max-age=31536000 means the resource may be cached for 31536000 seconds (one year).
  • public means the resource can be cached by both browsers (clients) and proxy servers; nginx is commonly used as the proxy server.
  • immutable indicates that the resource will never change.

Marking a resource immutable does not mean it can literally never change; the directive is set so that the browser does not contact the server when the user refreshes the page.
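To make the directives above concrete, here is a minimal sketch of splitting a Cache-Control value into its parts. The function name parseCacheControl is hypothetical and written only for illustration; it is not a browser or Node.js API, and it ignores edge cases such as quoted values.

```javascript
// Parse a Cache-Control header value into a plain object.
// Hypothetical helper for illustration; not part of any library.
function parseCacheControl(headerValue) {
  const directives = {};
  for (const part of headerValue.split(',')) {
    const token = part.trim();
    if (token === '') continue;
    const eq = token.indexOf('=');
    if (eq === -1) {
      // Flag directive such as "public" or "immutable".
      directives[token.toLowerCase()] = true;
    } else {
      const name = token.slice(0, eq).trim().toLowerCase();
      const value = token.slice(eq + 1).trim();
      // Numeric directives such as max-age are converted to numbers.
      directives[name] = /^\d+$/.test(value) ? Number(value) : value;
    }
  }
  return directives;
}

const parsedExample = parseCacheControl('max-age=31536000, public, immutable');
console.log(parsedExample['max-age'], parsedExample.public, parsedExample.immutable);
```

With the sample header, max-age comes back as the number 31536000 and the two flags come back as true.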

If you set only Cache-Control: max-age=31536000, public, this is a strong cache. Every time the user opens the page normally, the browser checks whether the cached copy has expired; if it has not, the browser reads the data from the cache without contacting the server.
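The expiry check described above can be sketched as a simple age comparison. This is a simplified model under the assumption that only max-age matters; real browsers also take headers such as Age and Expires into account. The name isFresh and the sample timestamps are illustrative.

```javascript
// Decide whether a cached response is still fresh under max-age.
// storedAtMs is when the response was cached; both times are in milliseconds.
function isFresh(storedAtMs, maxAgeSeconds, nowMs) {
  const ageSeconds = (nowMs - storedAtMs) / 1000;
  return ageSeconds < maxAgeSeconds;
}

const ONE_YEAR_SECONDS = 31536000;
const storedAt = Date.UTC(2018, 11, 24); // 24 Dec 2018
const oneDayLater = storedAt + 24 * 60 * 60 * 1000;
const twoYearsLater = storedAt + 2 * ONE_YEAR_SECONDS * 1000;

console.log(isFresh(storedAt, ONE_YEAR_SECONDS, oneDayLater));   // true: read from cache
console.log(isFresh(storedAt, ONE_YEAR_SECONDS, twoYearsLater)); // false: ask the server
```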

If you set Cache-Control: max-age=31536000, public, immutable, a browser that supports immutable does not contact the server even when the user refreshes the page; it reads the cached copy directly from local disk or memory and reports status 200, which the developer tools show as (from memory cache).

This directive comes from a proposal the Facebook team made in 2015 to the IETF working group that develops the HTTP standard: they wanted a directive added to the Cache-Control response header to indicate that a resource never changes, so that browsers no longer have to make conditional requests for it.

Cache-Control: max-age=XXXX, public

Both the client and proxy servers can cache this resource.

If the client requests the resource again within XXXX seconds, it reads the cache directly (status code 200). If the user refreshes the page, an HTTP request is sent to the server.

Cache-Control: max-age=XXXX, private

Only the client can cache the resource; proxy servers do not cache it.

Within XXXX seconds the client reads the cache directly (status code 200).

Cache-Control: max-age=XXXX, immutable

If the client requests the resource again within XXXX seconds, it reads the cache directly (status code 200), and no HTTP request is sent to the server even if the user refreshes the page.

Cache-Control: no-cache

Skips the strong cache but does not prevent the negotiated cache. Normally, when a strong cache is set, the negotiated cache is used only after the strong cache has expired. With no-cache, the strong cache is never used, and every request is validated with the server.

Cache-Control: no-store

No caching at all: neither the client nor any intermediate server stores the response, so neither the strong cache nor the negotiated cache applies.
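The directive combinations above can be summarized in a small sketch. The function describePolicy and the property names it returns are hypothetical, and the rules are a simplification of what the text describes; real browsers and proxies apply more rules than this.

```javascript
// Summarize what a Cache-Control value allows, following the rules described above.
// Hypothetical helper; a simplification for illustration.
function describePolicy(headerValue) {
  const tokens = headerValue.toLowerCase().split(',').map((t) => t.trim());
  const has = (name) => tokens.includes(name);
  const noStore = has('no-store');
  const noCache = has('no-cache');
  return {
    // no-store forbids caching anywhere.
    browserMayStore: !noStore,
    // private restricts storage to the client; proxies must not store it.
    proxyMayStore: !noStore && !has('private'),
    // no-cache allows storing but requires validation with the server on every use.
    revalidateEveryTime: !noStore && noCache,
    // immutable: no conditional request even when the user refreshes.
    skipRevalidationOnRefresh: !noStore && !noCache && has('immutable'),
  };
}

console.log(describePolicy('max-age=3600, private'));
console.log(describePolicy('no-cache'));
console.log(describePolicy('no-store'));
```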

2.2 Negotiated Cache

A strong cache sets an expiration time on a resource. Each time the client needs the resource, it checks whether the cached copy has expired, and it asks the server only if it has. In other words, the strong cache lets the client be self-sufficient. When the client finds that the cached copy has expired, it sends a request to the server, and that request is where the negotiated cache comes in. The negotiated cache therefore requires interaction between the client and the server.

 

Negotiated cache settings, set in the response header:

etag: '5c20abbd-e2e8'
last-modified: Mon, 24 Dec 2018 09:49:49 GMT

ETag and Last-Modified are returned in the response header. On the next request for the same resource, the browser sends these values back in the request header (If-None-Match carries the ETag, If-Modified-Since carries the Last-Modified time). The server compares them with the current state of the resource to decide whether it has changed. If it has changed, the server returns the new resource with status 200, together with updated ETag and Last-Modified values. If it has not changed, ETag and Last-Modified stay the same, the server returns status 304 without a body, and the browser takes the resource from its local cache. In this mode the client negotiates with the server on every request.
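The server-side comparison can be sketched as follows. The browser returns the stored validators in the standard If-None-Match and If-Modified-Since request headers; the function checkConditional and the shape of the resource object are assumptions made for this example, and the ETag comparison is a plain string match rather than the full matching rules of the HTTP specification.

```javascript
// Server-side check for a conditional request.
// requestHeaders uses lower-case names, as Node.js does for incoming headers.
// current describes the resource as it exists on the server right now.
function checkConditional(requestHeaders, current) {
  const ifNoneMatch = requestHeaders['if-none-match'];
  const ifModifiedSince = requestHeaders['if-modified-since'];

  let unchanged = false;
  if (ifNoneMatch !== undefined) {
    // The ETag comparison takes precedence when the client sent one.
    unchanged = ifNoneMatch === current.etag;
  } else if (ifModifiedSince !== undefined) {
    // Otherwise compare modification times (one-second granularity).
    unchanged = Date.parse(current.lastModified) <= Date.parse(ifModifiedSince);
  }

  const validators = { ETag: current.etag, 'Last-Modified': current.lastModified };
  return unchanged
    ? { status: 304, headers: validators, body: null }
    : { status: 200, headers: validators, body: current.body };
}

const stylesheet = {
  etag: '"5c20abbd-e2e8"',
  lastModified: 'Mon, 24 Dec 2018 09:49:49 GMT',
  body: 'body { margin: 0 }',
};
console.log(checkConditional({ 'if-none-match': '"5c20abbd-e2e8"' }, stylesheet).status); // 304
console.log(checkConditional({ 'if-none-match': '"stale"' }, stylesheet).status);          // 200
```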

2.2.1 Why ETag is needed in addition to Last-Modified

Reason 1: some files are rewritten periodically without their content changing (only the modification time changes). In that case we do not want the client to treat the file as modified and GET it again.

Reason 2: some files are modified very frequently, for example several times within one second. If-Modified-Since can only detect changes at a granularity of one second, so such modifications go unnoticed (the UNIX mtime record is likewise only accurate to the second).

Reason 3: some servers cannot determine exactly when a file was last modified.

In these cases an ETag gives more accurate cache control, because an ETag is a unique identifier for the resource on the server side, generated automatically by the server or by the developer.
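Reason 1 can be illustrated with a content-based ETag. The sketch below assumes the ETag is a truncated SHA-1 hex digest of the file content; this is only one possible scheme, and servers differ in how they actually build ETags. The name contentEtag and the sample data are hypothetical.

```javascript
const { createHash } = require('crypto');

// Derive an ETag from the file content, so it changes only when the bytes change.
// Assumption: a truncated SHA-1 hex digest; real servers use various schemes.
function contentEtag(content) {
  const hash = createHash('sha1').update(content).digest('hex');
  return `"${hash.slice(0, 16)}"`;
}

// The same content saved at two different times: mtime differs, the ETag does not.
const firstSave = { mtime: new Date('2018-12-24T09:49:49Z'), content: 'body { margin: 0 }' };
const secondSave = { mtime: new Date('2018-12-24T09:49:50Z'), content: 'body { margin: 0 }' };

console.log(firstSave.mtime.getTime() === secondSave.mtime.getTime());             // false
console.log(contentEtag(firstSave.content) === contentEtag(secondSave.content));   // true
```

A server that validated on Last-Modified alone would resend the file after the second save, while the content-based ETag still matches.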

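Last-Modified and ETag can be used together. The server validates the ETag first, since it is the more precise validator. The HTTP specification says that If-Modified-Since is ignored when If-None-Match is present, although some server implementations go on to compare Last-Modified when the ETag matches before finally deciding whether to return 304.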

2.3 Differences

Strong cache vs. negotiated cache:

| Cache type | Controlling headers | Where the resource comes from | Status code | Request sent to the server |
|---|---|---|---|---|
| Strong cache | Expires / Cache-Control | Cache | 200 (from cache) | No; the cache is read directly |
| Negotiated cache | ETag / Last-Modified | Cache | 304 (Not Modified) | Yes; the server tells the browser whether the cache is usable |

3 Extended knowledge:

Understanding "from memory cache" and "from disk cache"

| Status | Type | Description |
|---|---|---|
| 200 | from memory cache | No network request; the resource is in memory. Scripts, fonts, and images are generally stored in memory. |
| 200 | from disk cache | No network request; the resource is on disk. Non-script resources such as CSS are generally stored on disk. |
| 200 | Resource size value | The latest resource is downloaded from the server. |
| 304 | Packet size | The server is asked, finds the resource unmodified, and the local cache is used. |

3.1 Three-level Cache Principle:

  1. Look in the memory cache first; if the resource is there, load it directly.

  2. If it is not in memory, look in the disk cache; if it is there, load it directly.

  3. If it is not on disk either, make a network request.

  4. The loaded resource is then cached to disk and to memory.
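The four steps above can be modeled with a short sketch. Two Map objects stand in for the browser's memory and disk caches, and restartBrowser models closing the browser by clearing only the memory cache; all names here are hypothetical, and promoting a disk hit into memory is an assumption of this model rather than documented browser behavior.

```javascript
// Three-level lookup: memory, then disk, then network.
// The two Maps stand in for the browser's memory and disk caches (illustration only).
function createLoader(fetchFromNetwork) {
  const memory = new Map();
  const disk = new Map();

  function load(url) {
    if (memory.has(url)) {
      return { source: 'memory cache', body: memory.get(url) };
    }
    if (disk.has(url)) {
      const cached = disk.get(url);
      memory.set(url, cached); // assumption: a disk hit is promoted into memory
      return { source: 'disk cache', body: cached };
    }
    const body = fetchFromNetwork(url);
    disk.set(url, body);   // step 4: cache to disk...
    memory.set(url, body); // ...and to memory
    return { source: 'network', body };
  }

  // Closing the browser frees memory; the disk cache survives.
  function restartBrowser() {
    memory.clear();
  }

  return { load, restartBrowser };
}

const loader = createLoader((url) => `contents of ${url}`);
console.log(loader.load('/logo.png').source); // network
loader.restartBrowser();
console.log(loader.load('/logo.png').source); // disk cache
console.log(loader.load('/logo.png').source); // memory cache
```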

With this, we can explain the following behavior, taking an image as an example (see Diagram 1 below):

Scenario 1:

Access -> 200 -> exit the browser and reopen the page -> 200 (from disk cache) -> F5 refresh -> 200 (from memory cache) -> Ctrl+F5 refresh -> 200

3.2 Some conclusions (using Google Chrome as an example):

If an image is embedded as base64, it is always loaded from memory cache.

In private (incognito) mode, almost everything is loaded from memory cache.

4 Impact of user operations on cache

| User action | Expires / Cache-Control (strong cache) | Last-Modified / ETag (negotiated cache) |
|---|---|---|
| Enter the address in the address bar | effective | effective |
| Follow a page link | effective | effective |
| Open a new window | effective | effective |
| Go back | effective | effective |
| F5 refresh | invalid | effective |
| Ctrl+F5 forced refresh | invalid | invalid |
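The table can also be read as a lookup from user action to browser behavior. The sketch below simply encodes the rows above; the action keys and the function describeAction are hypothetical names chosen for this example, and the returned strings paraphrase the text rather than quote any browser documentation.

```javascript
// The table above as a lookup: which cache layers stay effective per user action.
const cacheEffect = {
  'address bar': { strong: true, negotiated: true },
  'link': { strong: true, negotiated: true },
  'new window': { strong: true, negotiated: true },
  'back': { strong: true, negotiated: true },
  'F5': { strong: false, negotiated: true },
  'Ctrl+F5': { strong: false, negotiated: false },
};

// Describe what the browser does for a given action.
function describeAction(action) {
  const effect = cacheEffect[action];
  if (!effect) throw new Error(`unknown action: ${action}`);
  if (effect.strong) return 'read the cache if fresh, otherwise validate with the server';
  if (effect.negotiated) return 'validate with the server (200 or 304)';
  return 'download from the server again (200)';
}

for (const action of Object.keys(cacheEffect)) {
  console.log(`${action}: ${describeAction(action)}`);
}
```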

Diagram 1

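5 Application in projects

5.1 Back-end servers such as Node.js:

To set the caching headers (strong cache plus negotiated cache):

// Strong cache: fresh for one hour, cacheable by browsers and proxies
res.setHeader('Cache-Control', 'max-age=3600, public')
// Negotiated cache validators (ETag values are quoted strings in HTTP)
res.setHeader('ETag', '"5c20abbd-e2e8"')
res.setHeader('Last-Modified', 'Mon, 24 Dec 2018 09:49:49 GMT')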
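For context, the headers can be wired into a request handler along the following lines. This is a minimal sketch, assuming a single in-memory resource with the sample validator values from the text; the names scriptResource and handler are hypothetical, and only the ETag is checked on incoming requests.

```javascript
// A request handler that sets caching headers and answers conditional requests.
// The resource values are the sample ones from the text; names are hypothetical.
const scriptResource = {
  etag: '"5c20abbd-e2e8"',
  lastModified: 'Mon, 24 Dec 2018 09:49:49 GMT',
  body: 'console.log("hello")',
};

function handler(req, res) {
  res.setHeader('Cache-Control', 'max-age=3600, public');
  res.setHeader('ETag', scriptResource.etag);
  res.setHeader('Last-Modified', scriptResource.lastModified);

  if (req.headers['if-none-match'] === scriptResource.etag) {
    res.statusCode = 304; // unchanged: no body, the browser reuses its copy
    res.end();
    return;
  }
  res.statusCode = 200;
  res.end(scriptResource.body);
}

// To serve it with Node's built-in http module:
// require('http').createServer(handler).listen(3000);
```

The listen call is left as a comment so the snippet can be loaded without opening a port; port 3000 is an arbitrary choice.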

5.2 Nginx Configuration:

5.3 Strategy

index.html uses the negotiated cache only. Other static resources (JS, CSS, images) use the strong cache plus the negotiated cache.
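One way to express this strategy in code is to choose the Cache-Control value by file extension. The sketch below assumes that static assets carry a content hash in their file names, which is what makes a one-year max-age safe; the function name cacheControlFor, the extension list, and the fallback for other file types are all assumptions for illustration.

```javascript
const { extname } = require('path');

// Pick a Cache-Control value by file type, following the strategy above.
// Assumption: static assets have a content hash in their file name.
function cacheControlFor(filePath) {
  const ext = extname(filePath).toLowerCase();
  if (ext === '.html') {
    // Entry page: always validate with the server (negotiated cache only).
    return 'no-cache';
  }
  const staticExts = ['.js', '.css', '.png', '.jpg', '.jpeg', '.gif', '.svg', '.webp'];
  if (staticExts.includes(ext)) {
    // Static assets: strong cache; ETag / Last-Modified still apply after expiry.
    return 'max-age=31536000, public';
  }
  // Anything else: validate every time, as a conservative default (an assumption).
  return 'no-cache';
}

console.log(cacheControlFor('/index.html'));
console.log(cacheControlFor('/static/app.3f2a1c.js'));
```

Because index.html is always validated, it can point to newly hashed asset file names as soon as a release goes out, while the assets themselves stay cached.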