Here I mainly record what I learn day to day. I try to write summary articles by combining my notes with my own understanding. Not every detail is introduced and explained, and the content is for reference only

A few days ago I was asked about front-end optimization in an interview. I mentioned using the front-end cache and was then asked follow-up questions about caching, which I did not answer well, so I need to sort the topic out and understand it properly. Since I have only looked at client-side caching so far, the rest of this article focuses on it. The other kinds of caching are of course worth learning too

1. Cache classification

When we talk about caching we usually mean client-side caching, but there are other kinds. A rough overview:

  • CDN cache
  • DNS cache
  • Client cache
  • Service Worker cache and offline cache
  • Page cache and Ajax cache

2. The role of cache

Using the cache brings the following benefits:

  • Reduce redundant data transmission
  • Save network costs and alleviate network bottlenecks
  • Reduce the load on the origin server, so that it can respond faster and avoid being overloaded

Caching can sometimes bring unwanted results, such as stale cached data hurting real-time rendering. Caching itself is neither good nor bad. It is a tool, and what matters is how we use it

3. Client cache

There are two types of client cache: strong cache and negotiated cache

  • Strong cache: The resource is cached locally and, on later requests, served straight from the local cache without any communication with the server
  • Negotiated cache: The resource is cached locally together with its cache information. On later requests, that information is sent to the server, which decides whether the browser may use its local copy (so there is at least one round trip to the server)
    • If so, the local cache is used directly
    • If not, the latest resource is requested from the server

The procedure for client caching is as follows:

1. Browser requests resources for the first time:

  • If no local resources exist, obtain data from the server
  • Once a resource is obtained, the header content of the response determines how to cache the resource (whether to cache it, and if so, whether to use strong caching or negotiated caching).

2. The browser requests resources later:

  • The browser reads the header information stored with the cached resource and checks whether the strong cache is hit (based on the Cache-Control and Expires information). If it is, the resource, including its cached header information, is taken directly from the cache, and the request does not communicate with the server (strong caching)
  • If the strong cache is not hit, the browser sends a request to the server carrying validators taken from the header fields returned by the first response (Last-Modified as If-Modified-Since, and ETag as If-None-Match). If the server finds that the resource has not changed, it returns a new response header that updates the corresponding header information in the cache but does not return the resource content, which tells the browser to read the resource from its cache (negotiated caching)
  • Otherwise, the latest resource content is fetched from the server
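
The decision flow above can be sketched as a small function. This is only a simplified model, not how any browser is actually implemented: the entry shape `{ body, storedAt, maxAge, etag }`, the function name `resolveRequest`, and the `serverEtag` argument (standing in for whatever the server would compute) are all assumptions made for illustration.

```javascript
// Hypothetical model of a cached entry: { body, storedAt, maxAge, etag }.
// storedAt and now are millisecond timestamps, maxAge is in seconds.
// None of these names are a real browser API.
function resolveRequest(entry, now, serverEtag) {
  if (!entry) {
    // First request: nothing is cached yet, so go to the server.
    return { source: 'network', status: 200 };
  }
  const ageSeconds = (now - entry.storedAt) / 1000;
  if (ageSeconds < entry.maxAge) {
    // Strong cache hit: no communication with the server.
    return { source: 'strong cache', status: 200 };
  }
  if (entry.etag && entry.etag === serverEtag) {
    // Negotiated cache hit: the server answers with headers only.
    return { source: 'negotiated cache', status: 304 };
  }
  // The resource changed: the full content is downloaded again.
  return { source: 'network', status: 200 };
}
```

For an entry stored at time 0 with `maxAge: 60`, a request at 30 seconds is served from the strong cache, while a request at 90 seconds falls through to the ETag comparison.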

To illustrate this more concretely, here are two diagrams:

[Figure: The browser requests the resource for the first time]

[Figure: The browser requests the resource again]

4. Strong cache

The main header fields associated with strong caching are Expires and Cache-Control: max-age=number. If both Cache-Control and Expires are present, Cache-Control takes precedence over Expires

  • Expires: Defined in HTTP/1.0. The value is an absolute time as a GMT-format string, such as Tue, 14 May 2019 00:00:10 GMT. If the request is made before the Expires time, the local cache is valid; otherwise it has expired. Because the comparison uses the client's clock, a wrong local time can distort the result
  • Cache-Control: max-age=number: Defined in HTTP/1.1. Freshness is judged mainly by the max-age value, which is a relative time in seconds. The time of the first request plus the validity period set by max-age gives the resource's expiration time. If a later request is made before that expiration time, the local cache is valid; otherwise it has expired. Besides max-age, Cache-Control has several other common settings:
    • 1) no-cache: the response may be stored, but the browser must revalidate it with the server before each use, so the strong cache is skipped and the negotiated cache applies
    • 2) no-store: the browser is forbidden to cache the data at all. Every request for the resource is sent to the server, and the complete resource is downloaded each time
    • 3) public: the response can be cached by any cache, including the end user's browser and intermediate proxy servers such as CDNs
    • 4) private: the response can only be cached by the end user's browser and must not be cached by intermediate cache servers such as CDNs
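
To make the freshness check more concrete, here is a minimal sketch of how a strong-cache decision could be computed from the headers. The helper names `parseCacheControl` and `isFresh` are my own, the headers object is assumed to use lower-case keys, and details that real browsers handle (such as heuristic freshness and the Age header) are left out.

```javascript
// Parse a Cache-Control value such as "public, max-age=60" into a map.
function parseCacheControl(value) {
  const directives = {};
  for (const part of (value || '').split(',')) {
    const [name, arg] = part.trim().split('=');
    if (name) {
      directives[name.toLowerCase()] = arg === undefined ? true : arg.replace(/"/g, '');
    }
  }
  return directives;
}

// Return true when the stored response may be reused without contacting the server.
// storedAt and now are millisecond timestamps (an assumption of this sketch).
function isFresh(headers, storedAt, now) {
  const cc = parseCacheControl(headers['cache-control']);
  if (cc['no-store'] || cc['no-cache']) return false; // never reuse without asking
  if (cc['max-age'] !== undefined) {
    // max-age is relative: expiration = time stored + max-age seconds.
    // It is checked first, so it takes precedence over Expires.
    const maxAge = Number.parseInt(cc['max-age'], 10);
    return Number.isFinite(maxAge) && now - storedAt < maxAge * 1000;
  }
  if (headers['expires']) {
    // Expires is absolute: compare it with the current time.
    const expiresAt = Date.parse(headers['expires']);
    return !Number.isNaN(expiresAt) && now < expiresAt;
  }
  return false; // no explicit freshness information
}
```

With `Cache-Control: max-age=60` and an `Expires` value one hour in the future, a request made two minutes after storing is treated as expired, because max-age is consulted first.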

5. Negotiated cache

The header fields associated with the negotiated cache are Last-Modified/If-Modified-Since and ETag/If-None-Match. With the negotiated cache, the server decides whether the cached resource is still usable, so the client and server have to exchange some kind of identifier for the resource. The header fields come in pairs: if the response to the first request carries a Last-Modified or ETag field, subsequent requests carry the corresponding request field (If-Modified-Since or If-None-Match). If the response has no Last-Modified or ETag field, the request will not carry the corresponding field either

  • Last-Modified/If-Modified-Since: Both values are time strings in GMT format.
    • 1) The first time the browser requests a resource, the server returns the resource and adds a Last-Modified header to the response. This header indicates when the resource was last modified on the server
    • 2) When the browser requests the resource again, it adds an If-Modified-Since header to the request. Its value is the Last-Modified value returned by the previous response
    • 3) When the server receives the request, it compares If-Modified-Since with the time the resource was last modified on the server. If there is no change, it returns 304 Not Modified without the resource content. If there is a change, it returns the resource content as normal. With a 304 Not Modified response the server usually does not add the Last-Modified header, because the resource has not changed and the value would be the same
    • 4) When the browser receives the 304 response, it loads the resource from the cache
    • 5) If the negotiated cache is not hit, the browser loads the resource from the server, the Last-Modified header is updated with the new response, and the next request uses that new Last-Modified value as If-Modified-Since
  • ETag/If-None-Match: Both values are a unique identifying string that the server generates for each resource and that changes whenever the resource changes. The process is similar to Last-Modified/If-Modified-Since. Unlike Last-Modified, a 304 Not Modified response still carries the ETag header, because the server has regenerated the ETag to perform the check, even if it is the same ETag as before
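
On the browser side, the pairing of response and request headers can be sketched roughly as follows. The function name `conditionalHeaders` and the lower-case-keyed input object are assumptions for illustration. Browsers do this internally, so front-end code normally never writes it by hand.

```javascript
// Build the validation headers for a repeat request from the headers that
// were stored with the cached response (keys assumed to be lower-case).
function conditionalHeaders(cachedResponseHeaders) {
  const headers = {};
  if (cachedResponseHeaders['etag']) {
    headers['If-None-Match'] = cachedResponseHeaders['etag'];
  }
  if (cachedResponseHeaders['last-modified']) {
    headers['If-Modified-Since'] = cachedResponseHeaders['last-modified'];
  }
  // If the first response had neither field, nothing is added, and the
  // request is an ordinary unconditional request.
  return headers;
}
```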

Last-Modified and ETag are used in similar ways, but ETag (HTTP/1.1) was introduced to solve several problems that are hard to handle with Last-Modified:

  • Some files are rewritten periodically without their content changing (only the modification time changes). In that case we do not want the client to think the file has been modified and GET it again
  • Some files are modified very frequently, within a second (say N times in 1s), but If-Modified-Since can only check at one-second granularity, so such changes cannot be detected (the mtime recorded by UNIX may also be accurate only to the second)
  • Some servers cannot determine the last modification time of a file accurately. In this case ETag can control the cache more precisely, because an ETag is a unique identifier for the resource, generated by the server automatically or defined by the developer
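
The first problem can be illustrated with an ETag derived only from the content. This is just a sketch: `contentEtag` is a made-up helper, and the small FNV-1a style hash is a stand-in for whatever digest a real server would use (real servers often combine file size and modification time, or use a cryptographic hash).

```javascript
// FNV-1a style 32-bit hash over the string's UTF-16 code units.
// Used only as a small, dependency-free stand-in for a real digest.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// Derive the validator from the content alone, so rewriting a file with
// identical content (new modification time) leaves the ETag unchanged.
function contentEtag(content) {
  return '"' + fnv1a(content) + '"';
}
```

Two files with the same content get the same ETag no matter when they were written, while any change to the content produces a different value with high probability.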

Last-Modified and ETag can be used together. The server gives ETag priority: under the HTTP specification, If-None-Match is evaluated first and If-Modified-Since is ignored when both are sent, although some servers also compare Last-Modified before finally deciding whether to return 304
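
On the server side, the validation could look roughly like the sketch below. It assumes a hypothetical resource record `{ etag, lastModified, body }` and a request-headers object with lower-case keys, and it follows the HTTP specification's rule that If-Modified-Since is not consulted when If-None-Match is present. Weak validators (the `W/` prefix) and `*` are not handled.

```javascript
// resource is a hypothetical record: { etag, lastModified (ms timestamp), body }.
function respond(requestHeaders, resource) {
  const ifNoneMatch = requestHeaders['if-none-match'];
  const ifModifiedSince = requestHeaders['if-modified-since'];

  let notModified = false;
  if (ifNoneMatch !== undefined) {
    // The ETag check wins; If-Modified-Since is not consulted.
    notModified = ifNoneMatch.split(',').map((tag) => tag.trim()).includes(resource.etag);
  } else if (ifModifiedSince !== undefined) {
    const since = Date.parse(ifModifiedSince);
    // HTTP dates have one-second resolution, so compare whole seconds.
    notModified = !Number.isNaN(since) &&
      Math.floor(resource.lastModified / 1000) * 1000 <= since;
  }

  if (notModified) {
    // 304: headers only, the browser reads the body from its cache.
    return { status: 304, headers: { ETag: resource.etag }, body: null };
  }
  return {
    status: 200,
    headers: {
      ETag: resource.etag,
      'Last-Modified': new Date(resource.lastModified).toUTCString(),
    },
    body: resource.body,
  };
}
```

The one-second comparison in the Last-Modified branch is exactly the granularity limit described earlier: two edits within the same second look identical to If-Modified-Since, while a changed ETag still forces a 200.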

6. Strong cache and negotiated cache

7. Cache location

Cache locations can be roughly divided into three categories: Service Worker, memory cache, and disk cache, and they are checked in that order of priority. If the resource is not found in one, the browser moves on to the next. If it is not found in any of them, the browser requests it from the server

  • Service Worker: Lets us customize which resources to cache on the hard disk, how requests are matched against the cache (route matching rules), and how a cached response is read and returned. The cache is cleared in two situations: when the API is called manually, or when the browser evicts it because the capacity limit is exceeded. When a resource request is initiated, the browser first looks for the resource in the Service Worker. If the cache is not hit, the Service Worker usually calls fetch() to continue retrieving the resource, and the browser then moves on to the memory cache or disk cache for the next lookup. A resource obtained through the Service Worker's fetch() is shown as from ServiceWorker, regardless of whether it actually came from the memory cache, the disk cache, or the network

  • Memory cache: A cache held in memory. Almost all network requests are automatically added to the memory cache by the browser and kept for a short time, and the memory cache is invalidated once the browser window is closed. When reading from the memory cache, browsers ignore headers such as max-age=0 and no-cache. But if the header is set to no-store, the resource does not enter the memory cache

  • Disk cache: A cache on the hard disk that persists and allows the same resource to be reused across sessions and even across sites. Based on the HTTP header fields, it determines which cached resources are usable and which have expired. When a request hits the cache, the resource is read from the hard disk. Most cached resources are served from the disk cache
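
The lookup order can be modelled with a toy function. Each layer here is just a `Map` from URL to body, and `lookup`, `layers`, and `fetchFromNetwork` are names invented for this sketch. Real browsers also apply the freshness rules described earlier, and label anything that passed through a Service Worker as from ServiceWorker, which this model does not reproduce.

```javascript
// Check each cache layer in priority order, then fall back to the network.
// layers is a hypothetical object: { serviceWorker, memory, disk }, each a Map.
function lookup(url, layers, fetchFromNetwork) {
  const order = ['serviceWorker', 'memory', 'disk'];
  for (const name of order) {
    const cache = layers[name];
    if (cache && cache.has(url)) {
      return { from: name, body: cache.get(url) };
    }
  }
  // Not found in any layer: request the resource from the server.
  return { from: 'network', body: fetchFromNetwork(url) };
}
```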

How to check where a resource came from:

  • Chrome Developer Tools -> Network -> check the Size column
    • XXK (a size): network request
    • from memory cache
    • from disk cache
    • from ServiceWorker

8. Disabling the cache

  1. Set meta tags
<META HTTP-EQUIV="pragma" CONTENT="no-cache">
<META HTTP-EQUIV="Cache-Control" CONTENT="no-cache, must-revalidate">
<META HTTP-EQUIV="expires" CONTENT="0">
  2. Add no-cache settings to the request headers of the Ajax request
If-Modified-Since: 0
Cache-Control: no-cache
  3. Append a random number or timestamp parameter to the URL of the Ajax request
url + '?ran=' + Math.random()  // or: url + '?ran=' + new Date().getTime()
  4. Have the backend set the relevant cache headers in the response
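
For the last item, a backend might send headers along the lines of the sketch below. The function names `noCacheHeaders` and `withCacheBuster` are made up for illustration, and the exact header combination is an assumption that should be adapted to the project. The second helper shows the random or timestamp parameter approach using the standard `URL` class, which requires an absolute URL.

```javascript
// Response headers a backend could send to forbid reuse of a response.
function noCacheHeaders() {
  return {
    'Cache-Control': 'no-store, no-cache, must-revalidate',
    Pragma: 'no-cache', // for old HTTP/1.0 caches
    Expires: '0',       // an invalid date is treated as already expired
  };
}

// Append a cache-busting parameter named "ran" to an absolute URL,
// keeping any query parameters that are already there.
function withCacheBuster(url, value = Date.now()) {
  const parsed = new URL(url);
  parsed.searchParams.set('ran', String(value));
  return parsed.toString();
}
```

Using `URL` instead of string concatenation avoids producing a second `?` when the address already has a query string.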