Caching is called "cache" in English. But I have noticed an interesting phenomenon: this word is pronounced differently by different people. To fully understand caching, we should start with the pronunciation, so that we can show a bit of class when communicating with other colleagues (such as the PM).

Note: this article is a bit long, so you may want to read it in several sittings. Grab a coffee or a beer along the way.

How to pronounce cache

In foreign IT circles and most English-language videos, cache is pronounced /kæʃ/ (the same as "cash"), which is the widely accepted pronunciation. But I find that quite a few programmers in China's IT community (myself included...) pronounce it /kætʃ/ (like "catch"). Although not quite correct, it does not prevent mutual understanding. (For the sake of correctness, though, we should still move toward the right pronunciation.)

There are also niche pronunciations, such as /keɪʃ/ (like "kaysh"), and even /kæʃeɪ/ (a French-style pronunciation, with the stress at the end). Because these are so niche, they may cause communication barriers and can probably only be used smoothly in certain situations or circles.

Front-end cache/back-end cache

Without further ado, let's get to the definitions: what is a front-end cache? And what is a back-end cache, by contrast?

A basic web request consists of three steps: request, process, and response.

The back-end cache mainly focuses on the “processing” step. By reserving database connections and storing processing results, the processing time is shortened and the “response” step is entered as soon as possible. That, of course, is beyond the scope of this article.

Front-end caching can play a role in the remaining two steps: "request" and "response." In the "request" step, the browser can store results so that it can use the resource directly without sending the request at all. In the "response" step, the browser and the server cooperate to reduce the size of the response content and thereby shorten the transmission time. These are discussed below.

This article mainly covers:

  • By cache location (Memory cache, Disk cache, Service Worker, etc.)
  • Classification by invalidation policy (Cache-Control, ETag, etc.)
  • Some examples to help you understand the principles
  • The application pattern of caching

Sorted by cache location

Most of the articles I've read that discuss caching start directly with the cache-related fields in the HTTP headers, such as Cache-Control, ETag, max-age, and so on. But occasionally you hear people talk about memory cache, disk cache, etc. So how are these two systems related? Is there any overlap? (I personally consider this the greatest value of this article, since I was muddled about the two systems myself before writing it.)

In fact, the fields in the HTTP headers belong to the disk cache, which is only one of several cache locations. So, in the spirit of going from the whole to the parts, we should start with the cache locations. The fields in these protocol headers and what they do will be covered in more detail when we get to the disk cache.

We can see where a request was ultimately served from in the Size column of Chrome DevTools' Network panel: if it was an actual network request, the transferred size is shown (so many KB or MB); otherwise it shows "from memory cache", "from disk cache", or "from ServiceWorker".

Their priority order is as follows (searched from top to bottom; if found, the resource is returned; if not, the search continues):

  1. Service Worker
  2. Memory Cache
  3. Disk Cache
  4. Network request

memory cache

The memory cache is a cache in memory (as opposed to the disk cache, which lives on disk). In the usual order of an operating system, memory is read before the hard disk. The disk cache will be covered later (because it has lower priority); let's talk about the memory cache first.

Almost all network requests are automatically added to the memory cache by the browser. But precisely because there are so many of them, and because the browser cannot take up unlimited memory, the memory cache is destined to be "short-term storage." Normally, when a browser tab is closed, the memory cache of that visit is invalidated (to make room for other tabs). In extreme cases (for example, when a page's cache takes up too much memory), some entries may be evicted even before the tab is closed.

As mentioned earlier, almost all requested resources can enter the memory cache. Here we focus on two main cases:

  1. Preloader. If you are not familiar with this mechanism, here is a brief introduction, and see this article for details.

    Those of you familiar with how browsers process pages should know that when a browser opens a web page, it first requests the HTML and then parses it. When it finds JS, CSS, and other resources that need to be parsed and executed, it uses CPU resources to parse and execute them. In the old days (roughly before 2007), pages were processed in a serial fashion: "request JS/CSS – parse and execute – request the next JS/CSS – parse and execute the next one." The network is obviously idle while parsing and executing, so there is room for optimization: can we parse and execute JS/CSS while requesting the next (or next batch of) resources?

    That's what the preloader does. There is no official standard for the preloader, however, so each browser handles it slightly differently. For example, some browsers also download @import content inside CSS, or the poster of <video> elements, and so on.

    The resources retrieved by the Preloader request are stored in the memory cache for later parsing operations.

  2. Preload (yes, it is only two letters away from "preloader"). You are probably more familiar with this one, for example <link rel="preload">. These explicitly specified preloaded resources are also stored in the memory cache.

The memory cache ensures that two identical requests on a page (e.g. two <img> with the same src, or two <link> with the same href) are actually requested at most once, to avoid waste.

However, when matching the cache, in addition to matching the exact URL, the browser also matches the resource type, the domain rules of CORS, and so on. Therefore, a resource cached as a script type cannot be used to satisfy an image request, even if their src values are equal.

Browsers ignore headers such as max-age=0 and no-cache when reading from the memory cache. For example, if several images with the same src exist on a page, they will still be read from the memory cache even if they are marked as not to be cached. This is because the memory cache is only used for short periods; in most cases its lifetime is just the current page view. max-age=0 is generally interpreted as "do not use it directly the next time; validate it first", so it does not conflict with the memory cache.

However, if the site owner really does not want a resource to be cached, even in the short term, then no-store should be used. With this header, even the memory cache will not store it and therefore will not read from it. (The second example below illustrates this.)

disk cache

The disk cache, also known as the HTTP cache, is a cache stored on the hard disk, so it is persistent and actually exists in the file system. It allows the same resource to be reused across sessions, or even across sites; for example, two sites may both use the same image.

The disk cache strictly follows the various fields in the HTTP headers to determine which resources can be cached and which cannot, which resources are still usable, and which are expired and need to be re-requested. When the cache is hit, the browser reads the resource from the hard disk, which is slower than reading from memory but still much faster than a network request. The majority of cached content comes from the disk cache.

We’ll discuss caching fields in HTTP protocol headers in more detail later.

Any persistent storage faces the problem of capacity growth, and the disk cache is no exception. When the browser runs out of room, it automatically cleans up: some opaque algorithm picks the "oldest" or "most likely to be outdated" resources and removes them one by one. However, each browser uses different algorithms to decide what is "oldest" or "most likely to be outdated", which is one of the ways they differ.

Service Worker

The cache policies and cache/read/invalidate actions above are all determined internally by the browser. We can only set certain fields in the response header to tell the browser what we would like; we cannot operate the cache ourselves. It is like going to the bank to deposit or withdraw money: you can only tell the clerk how much you want to deposit or withdraw, and they go through a series of records and procedures, put the money into the vault, or take it out of the vault and hand it to you.

However, the emergence of Service workers gives us another, more flexible and direct way to operate. Using the same deposit/withdrawal example, we can now bypass the bank clerk and go to a vault (a separate vault, of course) to put money in or take money out. So we can choose which money to put (which files to cache), when to take money out (routing matching rules), and what money to take out (caching matching and returning). Of course, in reality, banks do not open such services for us.

The cache that a Service Worker operates is different from the browser's internal memory cache or disk cache. You can find this separate "vault" in Chrome DevTools under Application -> Cache Storage. Apart from its different location, this cache is persistent: when you close the tab or the browser, it will still be there the next time you open it (unlike the memory cache). There are two ways a resource in this cache can be cleared: manually calling the API cache.delete(resource), or exceeding the capacity limit, in which case the browser wipes it out entirely.
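
For reference, a minimal sketch of clearing one such entry by hand with the standard Cache API (the cache name matches the example used later in this article; the URL is just an assumption):

// Can run inside the Service Worker or in any page script on the same origin
caches.open('service-worker-test-precache').then(cache => {
  // delete() resolves to true if an entry was found and removed
  return cache.delete('/static/index.js')
})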

If the Service Worker does not hit its cache, the fetch() method is typically used to continue fetching the resource. At that point, the browser goes on to look in the memory cache and the disk cache. Note: a resource fetched through the Service Worker's fetch() method is marked as "from ServiceWorker" even if it did not hit the Service Worker cache, or even if an actual network request was made. This is illustrated in the third example below.

Network request

If a request does not find the cache in any of the three locations, the browser formally sends a network request to retrieve the content. It is easy to imagine that in order to improve the cache hit ratio of future requests, it is natural to add this resource to the cache. To be specific:

  1. The handler in the Service Worker decides whether to store it in Cache Storage (the extra cache location).
  2. Based on the relevant fields in the HTTP headers (Cache-Control, Pragma, etc.), the browser decides whether to store it in the disk cache.
  3. The memory cache stores a reference to the resource for future use.

Classification by invalidation policy

The memory cache is a black box that the browser optimizes on its own to read the cache faster; it is not controlled by the developer, nor constrained by HTTP headers. The Service Worker is an extra script written by the developer, with its own independent cache location; it also appeared relatively late and is not yet widely used. The disk cache, also called the HTTP cache (because, unlike the memory cache, it obeys the HTTP header fields), is the one we are most familiar with. Forced caching, comparison caching, and the Cache-Control field all belong to this category.

Forced cache (also called strong cache)

Forced caching means that when the client requests a resource, it first checks the cache database to see if a cached copy exists. If it exists, it is returned directly; if it does not, the real server is requested and the response is written into the cache database.

Forced caching directly reduces the number of requests, which makes it the most effective of the caching strategies. Its optimization covers all three steps of a request mentioned at the beginning of this article. If you are considering caching to optimize page performance, forced caching should be the first option.

The fields that can trigger forced caching are Cache-Control and Expires.

Expires

This is an HTTP/1.0 field that represents the cache expiration time as an absolute time (current time + cache duration), for example:

Expires: Thu, 10 Nov 2017 08:45:11 GMT

Setting this field in the response header tells the browser that it does not need to request again until it expires.
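
To make "current time + cache time" concrete, here is a tiny sketch of how such a value could be produced (the 24-hour duration is assumed purely as an example):

// "Absolute time" simply means: the moment of the response plus the intended cache duration,
// formatted as an HTTP date string.
const oneDay = 24 * 60 * 60 * 1000
console.log(new Date(Date.now() + oneDay).toUTCString())
// e.g. "Thu, 10 Nov 2017 08:45:11 GMT" (the actual output depends on when it runs)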

However, there are two drawbacks to setting this field:

  1. Because the time is absolute, if the user changes the local time on the client, the browser may judge the cache to be expired and request the resource again. In addition, even without deliberate changes, time zone differences or clock drift can make the client and server times inconsistent, causing the cache to fail.

  2. The format is error-prone. The time is written as a string; an extra space or a missing letter makes the value invalid and the whole setting ineffective.

Cache-Control

Given the shortcomings of Expires, HTTP/1.1 added the Cache-Control field to indicate the maximum time a resource cache is valid; during that period, the client does not need to send a request to the server.

The difference between the two is that the former is absolute time while the latter is relative time. As follows:

Cache-control: max-age=2592000

Here are some common values of the Cache-Control field (see MDN for a complete list):

  • max-age: the maximum valid time, as seen in the example above
  • must-revalidate: once max-age is exceeded, the browser must send a request to the server to verify whether the resource is still valid.
  • no-cache: although the literal meaning is "do not cache", it actually requires the client to cache the content; whether that cached content is used is decided by subsequent comparison (validation).
  • no-store: literally "do not store". Nothing is cached at all, neither forced caching nor comparison caching.
  • public: the content can be cached by everyone (both clients and proxy servers such as CDNs)
  • private: the content can be cached only by clients, not by proxy servers. This is the default value.

These values can be combined, for example Cache-Control: public, max-age=2592000. When combined, their priorities are as shown below (picture from developers.google.com/web/fundame…):

max-age=0 is roughly equivalent to no-cache. By the letter of the specification, an expired max-age means revalidation SHOULD be done, while no-cache means revalidation MUST be done. In reality, however, depending on the browser implementation, the two behave the same in most cases. (And max-age=0, must-revalidate is fully equivalent to no-cache.)

By the way, before HTTP/1.1, if you wanted no-cache behavior you would usually use the Pragma field, as in Pragma: no-cache (which is also the only value the Pragma field takes). But this field is only a convention among browsers, with no exact specification, so it lacks reliability. It should be treated as a compatibility field only, and is of little use in the current network environment.

To summarize: since HTTP/1.1, Expires has been gradually replaced by Cache-Control. Cache-Control uses a relative time: even if the client's clock changes, the relative duration does not change with it, which keeps the server and client consistent. Cache-Control is also much more configurable.

Cache-Control takes precedence over Expires, and to stay compatible with both HTTP/1.0 and HTTP/1.1, both fields are often set in real projects.
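
A minimal Node.js sketch of setting the two fields together (the 30-day duration and the bare http server are just assumptions for illustration):

const http = require('http')

http.createServer((req, res) => {
  const maxAge = 30 * 24 * 60 * 60  // 30 days, in seconds
  res.setHeader('Cache-Control', 'public, max-age=' + maxAge)                    // HTTP/1.1
  res.setHeader('Expires', new Date(Date.now() + maxAge * 1000).toUTCString())   // HTTP/1.0 fallback
  res.end('some cacheable content')
}).listen(3000)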

Comparison cache (also known as negotiated cache)

When the forced cache has expired (the specified time has passed), comparison caching kicks in, and the server decides whether the cached content is still valid.

The browser first asks its cache database and gets back a cache identifier. It then uses this identifier to communicate with the server. If the cache has not expired, the server returns HTTP status code 304, meaning "keep using it", and the client continues to use the cached copy. If it has expired, the server returns new data together with new cache rules, and the browser uses the new data and writes it, along with the rules, into the cache database.

In terms of the number of requests, comparison caching is the same as having no cache at all. But when the answer is 304, only a status code is returned and not the actual file content, so the saving is in the size of the response body. Its optimization covers the last of the three steps of a request mentioned at the beginning: "response." By reducing the response size, the transmission time is shortened. So it is a smaller improvement than forced caching, but still better than no caching at all.

Comparison caching can be used together with forced caching, as a fallback when the forced cache has expired. And indeed they often appear together in real projects.

Comparison caching involves two groups of fields (groups, not individual fields):

Last-Modified & If-Modified-Since

  1. The server tells the client, via the Last-Modified field, when the resource was last modified, for example:

    Last-Modified: Mon, 10 Nov 2018 09:10:11 GMT
  2. The browser records this value along with the content in the cache database.

  3. The next time the same resource is requested, the browser finds this "not sure whether it has expired" entry in its cache, and writes the previous Last-Modified value into the If-Modified-Since field of the request header.

  4. The server compares the If-Modified-Since value with its own Last-Modified. If they are equal, the resource has not been modified and the server responds with 304; otherwise it has been modified, and the server responds with status code 200 and returns the data. (A server-side sketch of this flow follows below.)
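
A rough server-side sketch of steps 1-4 in plain Node.js (the file path and port are assumptions for illustration only):

const http = require('http')
const fs = require('fs')

http.createServer((req, res) => {
  const filePath = './static/index.js'                      // example resource
  const lastModified = fs.statSync(filePath).mtime.toUTCString()

  // Step 4: compare the client's If-Modified-Since with the file's modification time
  if (req.headers['if-modified-since'] === lastModified) {
    res.writeHead(304)                                      // not modified: status only, no body
    return res.end()
  }

  // Steps 1-2 happen on the first visit: send Last-Modified along with the content
  res.writeHead(200, { 'Last-Modified': lastModified })
  res.end(fs.readFileSync(filePath))
}).listen(3000)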

But this approach has certain flaws:

  • If the resource is updated more often than once per second, the cache cannot reflect it, because the smallest time unit of Last-Modified is one second.
  • If the file is generated dynamically by the server, the modification time reported by this method is always the generation time, even though the file content may not have changed, so the cache never takes effect.

ETag & If-None-Match

To solve these problems, a new pair of fields emerged: ETag and If-None-Match.

ETag stores a special identifier for a file (usually generated by hashing its content); the server keeps track of the file's ETag. The subsequent process is the same as Last-Modified, except that the Last-Modified field and the modification time it carries are replaced by the ETag field and the file hash it carries, and If-Modified-Since becomes If-None-Match. The server compares them in the same way, returning 304 on a hit and 200 on a miss.

ETag has a higher priority than Last-Modified.
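
A matching sketch of the ETag variant, again in plain Node.js, using a content hash as the identifier (the md5 choice and the file path are just example assumptions):

const http = require('http')
const fs = require('fs')
const crypto = require('crypto')

http.createServer((req, res) => {
  const content = fs.readFileSync('./static/index.js')       // example resource
  const etag = crypto.createHash('md5').update(content).digest('hex')

  // Compare the client's If-None-Match with the current content hash
  if (req.headers['if-none-match'] === etag) {
    res.writeHead(304)
    return res.end()
  }

  res.writeHead(200, { ETag: etag })
  res.end(content)
}).listen(3000)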

Cache summary

When a browser requests a resource:

  1. Invoke the Service Worker's fetch event handler
  2. Check the memory cache
  3. Check the disk cache. Here’s another breakdown:
    1. If there is a forced cache and it has not expired, the forced cache is used and no request is sent to the server. The status code is still 200
    2. If there is a forced cache but it has expired, comparison caching is used, and the server decides whether to return 304 or 200
  4. Sends a network request and waits for a network response
  5. Store the response in the disk cache (if the HTTP header is configured to store it)
  6. Store a reference to the response in memory cache (regardless of HTTP header configuration)
  7. Store the response in the Service Worker's Cache Storage (if the Service Worker's script calls cache.put())

Some examples

Just looking at the principles is boring. Let’s write some simple web pages and use examples to understand these principles.

1. memory cache & disk cache

We write a simple index.html that references three resources: index.js, index.css and mashroom.jpg.

The server is configured with Cache-Control: max-age=86400, which means a forced cache of 24 hours. The following screenshots were all taken in Chrome's incognito mode.

  1. The first request

    No surprises: all network requests, because nothing is cached yet.

  2. Request Again (F5)

    On the second request, all three resources come from the memory cache. Because we did not close the tab, the browser applied the memory cache. (Each takes 0ms, i.e. under 1ms.)

  3. Close the tab, open a new tab and request again

    Since the tab was closed, the memory cache was emptied with it. But the disk cache is persistent, so this time all resources come from the disk cache. (Each takes about 3ms, because the files are fairly small.)

    Comparing 2 and 3, it is clear that the memory cache is still much faster than the disk cache.

2. no-cache & no-store

We add some code to index.html to accomplish two goals:

  • Request each resource twice (synchronously)
  • Add a script that requests the image asynchronously
<!-- Request every resource twice -->
<link rel="stylesheet" href="/static/index.css">
<link rel="stylesheet" href="/static/index.css">
<script src="/static/index.js"></script>
<script src="/static/index.js"></script>
<img src="/static/mashroom.jpg">
<img src="/static/mashroom.jpg">

<!-- Request the image asynchronously -->
<script>
    setTimeout(function () {
        let img = document.createElement('img')
        img.src = '/static/mashroom.jpg'
        document.body.appendChild(img)
    }, 1000)
</script>
  1. When the server response is set to Cache-Control: no-cache, we find that each of the three resources is requested only once after the page is opened.

    This suggests two things:

    • For synchronous requests, the browser automatically stores the resources of the current HTML in the memory cache. Thus images with the same src are read from the cache automatically (and do not even appear in the Network panel).

    • For asynchronous requests, the browser also reads the cache return without sending the request. But again, it will not be displayed in Network.

    In general, as described above, no-cache semantically means "do not use the cache directly for the next request; compare (validate) first"; it does not restrict the current request. So the browser can safely use the cache while processing the current page.

  2. When the server response is set to Cache-Control: no-store, the situation changes: each of the three resources is requested twice, and the image is requested a third time by the asynchronous request. (The red box marks that asynchronous request.)

    It also shows that:

    • As mentioned earlier, although the memory cache ignores HTTP headers, no-store is special: with this setting, even the memory cache does not keep a copy, so the resource has to be requested every time.

    • Asynchronous requests follow the same rules as synchronous requests. In the case of no-store, the request is still sent each time without any caching.

3. Service Worker & memory (disk) cache

We now add a Service Worker to the mix. We write a serviceWorker.js along the following lines (basically: pre-cache the 3 resources, and on fetch, match the cache first and fall back to an actual request):

// serviceWorker.js
self.addEventListener('install', e => {
  // When we know certain resources will be needed, request them ahead of time and add them to the cache.
  // This pattern is called "precaching".
  e.waitUntil(
    caches.open('service-worker-test-precache').then(cache => {
      return cache.addAll(['/static/index.js', '/static/index.css', '/static/mashroom.jpg'])
    })
  )
})

self.addEventListener('fetch', e => {
  // If the request can be found in the cache, return it; otherwise fetch it from the network and cache it.
  // This cache policy is called CacheFirst.
  e.respondWith(
    caches.open('service-worker-test-precache').then(cache => {
      return cache.match(e.request).then(matchedResponse => {
        return matchedResponse || fetch(e.request).then(fetchedResponse => {
          cache.put(e.request, fetchedResponse.clone())
          return fetchedResponse
        })
      })
    })
  )
})

The code for registering the SW is not described in detail here. In addition, we keep the server setting Cache-Control: max-age=86400. Our goal is to see which of the two takes priority.
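
(Still, for readers who have never written one, a minimal registration sketch looks roughly like this, assuming the script lives at /serviceWorker.js:)

// e.g. in sw-register.js, loaded by the page
if ('serviceWorker' in navigator) {
  window.addEventListener('load', () => {
    navigator.serviceWorker.register('/serviceWorker.js')
      .catch(err => console.error('Service Worker registration failed:', err))
  })
}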

  1. On the first visit, in addition to the regular requests, we see three extra requests initiated by the browser (or rather, by the Service Worker). These come from the precaching code.

  2. On the second visit (whether by closing the tab and reopening it, or by pressing F5 to refresh), all requests are marked as from ServiceWorker.

    "From ServiceWorker" only means that the request passed through the Service Worker; it does not tell us whether the request hit the cache or went on to fetch(). So we have to look at the subsequent Network records as well: since there are no additional requests, we conclude the cache was hit.

    The server logs also make it obvious: none of the three resources were requested again, i.e. they all hit the Service Worker's internal cache.

  3. If you change the fetch event listener code of serviceWorker.js to the following:

    // This is also called NetworkOnly caching strategy.
    self.addEventListener('fetch', e => {
      return e.respondWith(fetch(e.request))
    })

    You can see that the effect on subsequent visits is exactly the same as before the modification. (That is, the Network panel only shows a few requests marked as from ServiceWorker, and the server prints no access logs for the three resources.)

    Clearly, the Service Worker layer is not reading its own cache here, but requesting directly with fetch(). So what is actually at work is the browser's internal memory/disk cache (because the server still sets Cache-Control: max-age=86400). As for whether it is memory or disk, only the browser knows, since it does not tell us explicitly. (My guess is memory, both because it takes 0ms and because the tab was never closed.)

Browser behavior

Browser behavior refers to the cache strategies triggered by what the user does in the browser. There are three main cases:

  • Opening a web page or entering an address in the address bar: the browser looks for a match in the disk cache. If there is one, it is used; if not, a network request is sent.
  • Normal refresh (F5): since the tab is not closed, the memory cache is available and is used first (if it matches), with the disk cache as the next choice.
  • Forced refresh (Ctrl + F5): the browser does not use the cache at all, so requests are sent with the header Cache-Control: no-cache (and, for compatibility, Pragma: no-cache). The server simply returns 200 and the latest content.

The application pattern of caching

Knowing how caching works, we probably care more about how to use it in real projects to reduce load time for users, save bandwidth, and so on. Here are a few common patterns for reference.

Pattern 1: Resources that don’t change very often

Cache-Control: max-age=31536000

The common way to handle such resources is to set their Cache-Control to a very large max-age=31536000 (one year), so that subsequent browser requests to the same URL hit the forced cache. To solve the update problem, a dynamic component such as a hash or version number is added to the file name (or path); changing that dynamic component changes the referenced URL, which makes the previous forced cache unused (it is not immediately invalidated, it is simply never requested again).
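
One common way to get such dynamic file names is a bundler's content hash. A webpack-style sketch (the exact option names depend on your build tool and version):

// webpack.config.js (sketch): the hash changes whenever the file content changes,
// so the referenced URL changes and the old forced cache is simply never requested again
module.exports = {
  output: {
    filename: '[name].[contenthash].js'
  }
}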

Libraries hosted online (jquery-3.3.1.min.js, lodash.min.js, etc.) use this pattern. If public is added to the configuration, CDNs can cache them as well.

A variation of this pattern is to add a query parameter (e.g. ?v=xxx or ?_=xxx), which avoids putting the dynamic component in the file name or path and satisfies some perfectionists' preferences. Updating the parameter on every build (for example, to the build time) ensures that the browser always requests the latest content after each release.
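
A tiny build-step sketch of this variation (the template file name and the regular expression are assumptions for illustration):

// Stamp every referenced .js/.css URL with the build time, so each build produces new URLs
const fs = require('fs')

const version = Date.now()
const html = fs.readFileSync('index.template.html', 'utf8')
  .replace(/\.(js|css)"/g, `.$1?_=${version}"`)
fs.writeFileSync('index.html', html)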

Note: take special care with sw-register.js (which registers the Service Worker) and serviceWorker.js (the Service Worker itself) when Service Workers are involved. If these two files also use this pattern, you must think hard about how future updates will reach users and what the countermeasures are.

Pattern 2: Constantly changing resources

Cache-Control: no-cache

Resources here are not only static resources but also web pages themselves, such as blog posts. What characterizes such resources is that the URL cannot change while the content can (and often does). We can set Cache-Control: no-cache to force the browser to check with the server on every request whether the resource is still valid.

Since validation is involved, ETag or Last-Modified is required. These fields are added automatically by common libraries dedicated to serving static resources, such as koa-static, so developers rarely need to worry about them.

As mentioned in the section on comparison (negotiated) caching, what this pattern saves is not the number of requests but the size of the responses. Its optimization effect is therefore less dramatic than that of Pattern 1.
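
Putting patterns 1 and 2 together, a typical arrangement might look roughly like this (a plain Node.js sketch; real projects usually let their static-file middleware handle it):

const http = require('http')

http.createServer((req, res) => {
  if (req.url.startsWith('/static/')) {
    // Pattern 1: hashed file names under /static/ are safe to force-cache for a long time
    res.setHeader('Cache-Control', 'public, max-age=31536000')
  } else {
    // Pattern 2: HTML and other frequently changing resources should always be revalidated
    res.setHeader('Cache-Control', 'no-cache')
  }
  res.end('...')  // actual file serving omitted in this sketch
}).listen(3000)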

Pattern 3: Very dangerous combination of pattern 1 and 2 (counterexample)

Cache-Control: max-age=600, must-revalidate

I wonder whether some developers have taken this cue from patterns 1 and 2: pattern 2 sets no-cache, which is equivalent to max-age=0, must-revalidate. My application is not that time-sensitive, but I don't want a forced cache that lasts too long either. Can I compromise with something like max-age=600, must-revalidate?

On the surface this looks fine: resources are cached for 10 minutes, the cache is read within those 10 minutes, and the resource is revalidated with the server afterwards, combining the advantages of the two patterns. In reality, however, there is a hidden risk in production. As mentioned above, the browser's cache has an automatic cleanup mechanism that developers cannot control.

For example, suppose we have three resources: index.html, index.js, and index.css. With the configuration above, assume index.js has been cleaned out of the cache and no longer exists, while index.html and index.css are still cached. The browser then asks the server for a new index.js and renders it together with the old index.html and index.css. The risk is obvious: when resources of different versions are combined, an error is the most likely outcome.

Besides automatic cleanup, different request times for different resources can also cause problems. For example, page A requests a.js and all.css, while page B requests b.js and all.css. If the pages are visited in the order A -> B, all.css is cached earlier than b.js, and the same risk of mismatched resource versions appears when page B is visited later.


A developer friend (WD2010) asked a very good question in the comments section on Zhihu:

If I don't use must-revalidate and only set Cache-Control: max-age=600, won't the browser's automatic cache cleanup run anyway? And if it does run, won't the same situation of index.js being cleared occur just the same?

This question touches on a few small points that I would like to add:

  1. What is the difference between ‘max-age=600’ and ‘max-age=600,must-revalidate’?

    Practically, no difference. Whether or not must-revalidate is present, once max-age has expired the browser will go back to the server to verify whether the cache is still usable.

    The HTTP specification only states what must-revalidate does; it does not say how a browser should handle cache expiration when must-revalidate is absent. So this is left to the browser's implementation. (A few browsers may choose to keep using an expired cache when the origin is unreachable, but that depends on the browser.)

  2. Does ‘max-age=600’ cause problems?

    Yes. As said above, it does not matter whether must-revalidate is written; the JS/CSS version mismatch problem remains. Therefore, when a regular website needs different JS/CSS files on different pages, if you want to use max-age for forced caching, do not set too short a time.

  3. So where exactly can a shorter max-age be used?

    Since there is a version mismatch problem, there are two ways to get around it.

    1. The entire site uses the same, merged JS and CSS files. This suits small sites; otherwise the merged files would be too bloated and hurt performance. (They may still be evicted by the browser's own cleanup strategy, so a risk remains.)

    2. Resources are used independently and do not need to be combined with other files to take effect. RSS, for example, falls under this category.


Afterword

This article is a bit long, but it covers the vast majority of front-end caching, including HTTP caches, Service workers, and some Chrome optimizations (Memory caches). I encourage developers to make good use of caching, as it is often the easiest to think of and the most improved performance optimization strategy.

References

A Tale of Four Caches (but this article places the Service Worker's priority between the memory cache and the disk cache, which does not match my experiment. Could Chrome's policy have changed over the past two years?)

Caching best practices & max-age gotchas