Caching is a simple and efficient way to improve performance, and it can significantly reduce the cost of network transfer. A data request can be divided into three steps: initiating the network request, back-end processing, and the browser handling the response. Browser caching helps us optimize the first and third steps.

Cache locations

By location, there are four kinds of cache, each with its own priority. The browser checks them in the order below and makes a network request only when none of them is hit.

  1. Service Worker
  2. Memory Cache
  3. Disk Cache
  4. Push Cache
  5. Network request
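
The lookup order above can be sketched as a simple chain. This is only an illustration of the priority rule, not how any browser is implemented: the `lookup` function, the `layers` list, and the dictionary-backed stores are hypothetical names introduced here.

```python
from typing import Callable, Dict, List, Tuple

def lookup(url: str,
           layers: List[Tuple[str, Dict[str, str]]],
           fetch_from_network: Callable[[str], str]) -> Tuple[str, str]:
    """Return (source, body) from the first layer holding url, else the network."""
    for name, store in layers:
        if url in store:
            return name, store[url]
    # No cache layer was hit, so fall through to a real request.
    return "network", fetch_from_network(url)

# Hypothetical cache contents, listed in lookup priority order.
layers = [
    ("service worker", {}),
    ("memory cache", {"/app.js": "console.log('new')"}),
    ("disk cache", {"/app.js": "console.log('old')", "/logo.png": "<png bytes>"}),
    ("push cache", {}),
]
```

Here `/app.js` would be served from the memory cache even though the disk cache also holds a copy, because the memory cache is checked first.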

Service Worker

The Service Worker’s cache differs from the browser’s other built-in caching mechanisms in that it lets us freely control which files are cached, how the cache is matched, and how it is read, and the cache is persistent. When the Service Worker misses its cache, we need to call the fetch function to get the data. In other words, on a miss the data is looked up according to the normal cache priority. But whether the data then comes from the Memory Cache or from a network request, the browser reports it as retrieved from the Service Worker.

Memory Cache

The Memory Cache lives in memory, so reading from it is faster than reading from disk. Although it is fast to read, it is short-lived: it is released together with the process. Once we close the tab, the cache in memory is freed.

When we visit the page and refresh it again, we can see that a lot of the data comes from the in-memory cache.

If the in-memory cache is so fast, why can’t all the data be stored in memory? Because a computer’s memory is generally much smaller than its hard disk, the operating system has to budget memory carefully, so little of it is available for caching.

Disk Cache

The Disk Cache is stored on the hard disk. It is slower to read, but anything can be stored on disk, and it beats the Memory Cache on capacity and storage duration.

The Disk Cache has by far the widest coverage of all browser caches. Based on the fields in the HTTP headers, it determines which resources need to be cached, which can be used without being requested again, and which have expired and must be re-requested. In addition, even across sites, a resource with the same address is not requested again once it has been cached on disk. (Current browsers that partition the HTTP cache by top-level site limit this cross-site reuse.)

Push Cache

Push Cache is part of HTTP/2. It is used only when none of the three caches above is hit. It is short-lived: it exists only for the session and is released once the session ends.

Network request

If none of the caches are hit, a request must be made to retrieve the resource.

Caching strategies

Browser caching strategies generally fall into two types: strong caching and negotiated caching. Both are implemented by setting HTTP headers.

Strong cache

Strong caching can be implemented with two HTTP headers: Expires and Cache-Control. With strong caching, no request is sent to the server while the cached copy is still valid, and the status code is 200.

Expires

expires:Sat, 17 Dec 2022 03:30:46 GMT

Expires is a product of HTTP/1.0. It means the resource expires after Sat, 17 Dec 2022 03:30:46 GMT and must then be requested again. Expires is evaluated against the client’s local clock, so changing the local time can invalidate the cache.
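
The dependence on the local clock can be seen in a small sketch. The function name `is_fresh_by_expires` is an assumption made for illustration; the date parsing uses Python's standard `email.utils.parsedate_to_datetime`.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def is_fresh_by_expires(expires_header: str, local_now: datetime) -> bool:
    """True while the local clock is still before the Expires timestamp."""
    return local_now < parsedate_to_datetime(expires_header)

expires = "Sat, 17 Dec 2022 03:30:46 GMT"

# The same header gives different answers depending only on the local clock.
clock_before = datetime(2022, 12, 16, tzinfo=timezone.utc)
clock_after = datetime(2022, 12, 18, tzinfo=timezone.utc)
fresh_before = is_fresh_by_expires(expires, clock_before)  # cached copy is reused
fresh_after = is_fresh_by_expires(expires, clock_after)    # cached copy is treated as expired
```

A client whose clock is set wrongly would therefore reuse or discard the cached copy at the wrong time, which is the weakness that max-age avoids by using a relative duration.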

Cache-Control

cache-control:max-age=31536000

Cache-Control was introduced in HTTP/1.1 and takes precedence over Expires. The value above means that the resource expires after 31536000 seconds (one year) and must then be requested again.

Common directives and their roles

Directive      Role
public         The response can be cached by both the client and proxy servers
private        The response can be cached only by the client, not by shared proxy caches
max-age=30     The cache expires after 30 seconds and a new request is required
s-maxage=30    Overrides max-age with the same meaning, but applies only to proxy (shared) caches
no-store       The response is not cached at all
no-cache       The resource is cached but must be revalidated with the server before each use
max-stale=30   Request directive: an expired cached response is still accepted for up to 30 seconds
min-fresh=30   Request directive: the response must stay fresh for at least 30 more seconds
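
To make the interaction between these values concrete, here is a minimal sketch of how a cache might read a response's Cache-Control value. The helper names `parse_cache_control` and `freshness_lifetime` are hypothetical, and the parsing is deliberately simplified: it does not handle quoted values that contain commas.

```python
from typing import Dict, Optional, Union

def parse_cache_control(header: str) -> Dict[str, Union[str, bool]]:
    """Split a Cache-Control value into {name: value}, or {name: True} for flags."""
    directives: Dict[str, Union[str, bool]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if value else True
    return directives

def freshness_lifetime(header: str, shared_cache: bool = False) -> Optional[int]:
    """Seconds the response may be reused without revalidation; None if unstated."""
    d = parse_cache_control(header)
    # Simplification: no-store (never stored) and no-cache (always revalidated)
    # are both modelled as a lifetime of zero.
    if "no-store" in d or "no-cache" in d:
        return 0
    if shared_cache and "private" in d:
        return 0  # a shared proxy cache must not reuse a private response
    if shared_cache and "s-maxage" in d:
        return int(d["s-maxage"])  # s-maxage overrides max-age for proxies only
    if "max-age" in d:
        return int(d["max-age"])
    return None
```

For example, with `max-age=30, s-maxage=60` a browser would treat the response as fresh for 30 seconds, while a proxy would use 60.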

Negotiated cache

If the cache has expired, a request must be made to check whether the resource has been updated. Negotiated caching is implemented with two HTTP headers: Last-Modified and ETag. When the browser sends a request to validate the resource and the resource has not changed, the server returns a 304 status code and the browser updates the expiration time of its cached copy.

Last-Modified and If-Modified-Since

Last-Modified indicates the date on which the file was last modified on the server. If-Modified-Since sends the Last-Modified value back to the server and asks whether the resource has been updated since that date. If it has, the new resource is sent back; otherwise a 304 status code is returned.

Disadvantages of Last-Modified:

  • If the file is touched on the server (for example, opened and saved again), Last-Modified can change even though the content was not modified, so the cache is not hit and the server sends the same resource again.
  • Because Last-Modified has only one-second resolution, a file modified again within the same second looks unchanged, so the server treats the cached copy as still valid and does not return the updated resource.

Because of these two drawbacks, ETag was introduced in HTTP/1.1.

ETag and If-None-Match

An ETag is similar to a fingerprint of the file. If-None-Match sends the current ETag to the server and asks whether the resource’s ETag has changed. If it has, the server sends the new resource back. ETag has a higher priority than Last-Modified.
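
The validation rules above can be sketched from the server's side. This is an illustrative model under simplifying assumptions, not a real server: `make_etag` and `respond` are hypothetical names, the ETag is derived from a truncated SHA-256 of the body, and If-None-Match is compared as a single exact value (no lists, `*`, or weak validators).

```python
import hashlib
from email.utils import parsedate_to_datetime
from typing import Dict, Tuple

def make_etag(body: bytes) -> str:
    """Fingerprint the content; any change to the body changes the ETag."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(request_headers: Dict[str, str], body: bytes,
            last_modified: str) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, validator headers, payload) for a possibly conditional GET."""
    etag = make_etag(body)
    validators = {"ETag": etag, "Last-Modified": last_modified}
    if_none_match = request_headers.get("If-None-Match")
    since = request_headers.get("If-Modified-Since")
    if if_none_match is not None:
        # ETag has priority: If-Modified-Since is ignored when this is present.
        unchanged = if_none_match == etag
    elif since is not None:
        unchanged = parsedate_to_datetime(last_modified) <= parsedate_to_datetime(since)
    else:
        unchanged = False  # unconditional request, always send the body
    if unchanged:
        return 304, validators, b""  # the client keeps using its cached copy
    return 200, validators, body
```

If the body changes while the Last-Modified timestamp stays within the same second, a request carrying only If-Modified-Since still receives 304 in this sketch, whereas a request carrying If-None-Match receives the new body. That is the one-second-resolution drawback described above.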

If no caching policy is set, the browser falls back to a heuristic: it typically takes 10% of the difference between the Date and Last-Modified headers of the response as the cache lifetime.
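
As a worked example of this heuristic, the sketch below computes 10% of the interval between the two headers. The function name `heuristic_lifetime_seconds` and the sample header values are assumptions for illustration; actual browsers may apply additional limits.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

def heuristic_lifetime_seconds(date_header: str, last_modified_header: str) -> float:
    """Return 10% of (Date - Last-Modified) in seconds, never negative."""
    age = parsedate_to_datetime(date_header) - parsedate_to_datetime(last_modified_header)
    # Dividing the timedelta by 10 avoids floating-point rounding from * 0.1.
    return (max(age, timedelta(0)) / 10).total_seconds()

# A resource last modified 10 days before the response was generated
# is treated as fresh for 1 day.
lifetime = heuristic_lifetime_seconds(
    "Sat, 17 Dec 2022 03:30:46 GMT",   # Date
    "Wed, 07 Dec 2022 03:30:46 GMT",   # Last-Modified
)
```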

Applying cache policies in practice

Frequently changing resources

For resources that change frequently, use Cache-Control: no-cache so that the browser asks the server on every request, and let ETag or Last-Modified verify whether the cached resource is still valid. This approach does not reduce the number of requests, but it can significantly reduce the size of the response data.

Code files

This refers to code files other than HTML, because HTML files are generally not cached or are cached only very briefly. Code is now usually bundled with a build tool, so we can add a content hash to each file name; a new file name is generated only when the code changes. We can then give code files a long cache lifetime, for example Cache-Control: max-age=31536000 (one year), so that a new code file is downloaded only when the file name referenced by the HTML changes; otherwise the cache is always used.
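
A minimal sketch of this scheme follows. It is not what any particular bundler does: `hashed_name` and `cache_control_for` are hypothetical helpers, the 8-character SHA-256 prefix is an arbitrary choice, and the policy of no-cache for HTML is one reasonable reading of "not cached or cached very briefly".

```python
import hashlib
from pathlib import PurePosixPath

ONE_YEAR_SECONDS = 31536000  # 365 days

def hashed_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the extension: app.js -> app.<hash>.js."""
    path = PurePosixPath(filename)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))

def cache_control_for(filename: str) -> str:
    """HTML is revalidated on every request; hashed assets are cached for a year."""
    if PurePosixPath(filename).suffix == ".html":
        return "no-cache"
    return f"max-age={ONE_YEAR_SECONDS}"
```

Because the hash depends only on the content, rebuilding unchanged code yields the same file name and the browser keeps using its cached copy, while any code change yields a new name that the freshly fetched HTML points to.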