Background

When you open the browser's developer tools and inspect the network requests, the Size column of a resource often shows, instead of a specific numeric size, a value such as from memory cache or from disk cache.

This raises a few questions: what do these values mean, and who decides them?

The cache location

As the names suggest, these values indicate where the cached copy was read from. In order of priority, the Size column can show:

  • from Service Worker
  • from memory cache
  • from disk cache
  • A real network request (showing the exact size of the resource)

Service Worker

A Service Worker essentially acts as a proxy between the server and the client, and usually appears together with PWAs. LocalStorage and SessionStorage are only suited to caching API data, such as user information (an object) or list data (an array), whereas a Service Worker can cache static resources. It can even intercept network requests and apply different caching strategies depending on network conditions. That, however, is not the focus of this article.

memory cache

As the name implies, this caches the resource in memory. In practice, the browser keeps the resources a page has just fetched in memory. Memory is limited, however, so entries cannot stay there indefinitely, which makes this inherently a short-term cache.

Control of the memory cache lies with the browser, not with the front end or the back end.

disk cache

As opposed to the memory cache, the disk cache stores resources on the hard disk. Reading from disk is much slower than reading from memory, but it is still far better than having no cache at all.

The disk cache is controlled by the back end. How? Through HTTP response headers, which are the focus of this article.

Caching strategies

The disk cache is also called the HTTP cache because it strictly follows HTTP response header fields to decide which resources are cached and when they expire. Most cached resources live in the disk cache.

The disk cache is divided into the mandatory cache and the comparison cache.

Mandatory cache

Two HTTP response header fields control the mandatory cache:

Expires: Fri, 08 Feb 2019 05:37:33 GMT

The value of this field is the absolute time at which the resource expires. The browser compares it against the client's local clock, which can be changed arbitrarily, so this field is not reliable. Expires comes from HTTP/1.0; HTTP/1.1 supersedes it with the Cache-Control field:

Cache-Control: max-age=2592000

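The max-age directive of Cache-Control is a relative time in seconds: the maximum period for which the response stays valid, counted from when it was received. During that period, the client reads the resource directly from the hard disk. When both fields are present, Cache-Control takes precedence over Expires.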
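To make the difference between the two fields concrete, here is a minimal sketch of how a cache might decide whether a stored response is still fresh. The function name isFresh and its parameters are hypothetical and are not taken from any browser's implementation; the sketch only looks at max-age and Expires and ignores the other rules real browsers apply.

```javascript
// Hypothetical helper: decide whether a cached response is still fresh.
// headers: response headers with lower-case keys
// storedAt / now: timestamps in milliseconds (as returned by Date.now())
function isFresh(headers, storedAt, now) {
    const cacheControl = headers['cache-control'] || '';
    const match = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl);

    if (match) {
        // max-age is a relative lifetime in seconds and wins over Expires
        const ageInSeconds = (now - storedAt) / 1000;
        return ageInSeconds < Number(match[1]);
    }

    if (headers['expires']) {
        // Expires is an absolute date, compared against the client's clock
        const expiresAt = Date.parse(headers['expires']);
        return !Number.isNaN(expiresAt) && now < expiresAt;
    }

    // No explicit freshness information in this simplified model
    return false;
}
```

Because the Expires branch compares against now, which comes from the client's clock, changing the local time changes the result. The max-age branch only depends on the elapsed time between two readings of the same clock.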

Let's verify this with Node.js. The server below sets Cache-Control: max-age=2592000 on the response and prints a log line for every request it receives:

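const http = require('http');
const fs = require('fs');
const path = require('path');

const server = http.createServer((req, res) => {
    // Log every request that actually reaches the server
    console.log(`Received the request at: ${req.url}`);
    fs.readFile(path.resolve(__dirname, './image.png'), (err, file) => {
        if (err) {
            res.statusCode = 500;
            res.end(err.message);
            return;
        }

        // Let the browser cache the resource for 30 days
        res.setHeader('Cache-Control', 'max-age=2592000');
        res.end(file);
    });
}).listen(3000);

console.log('Service started at localhost:3000!');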

First Visit:

Second visit:

As you can see, on the first request the browser caches the resource on the hard disk according to the Cache-Control field in the response header. On the second request, the browser reads the resource directly from the hard disk without sending a network request to the server.

Common Cache-Control directives include:

  • max-age=xxx: the maximum validity period, in seconds
  • must-revalidate: once the max-age time has elapsed, a request must be sent to the server to verify that the resource is still valid
  • no-cache: the resource may be stored, but it must be revalidated with the server (comparison cache) before each use; similar in effect to max-age=0
  • no-store: nothing is cached at all; this is "no cache" in the real sense
  • public: the response may be cached by any cache, including proxy servers
  • private: the response may be cached only by the client, not by proxy servers. The default value
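Since several directives can appear in one header, separated by commas, it can help to see how such a value breaks down. The following is a minimal sketch; parseCacheControl is a hypothetical helper, not a Node.js API, and it splits naively on commas, so quoted values that themselves contain commas are not handled.

```javascript
// Hypothetical helper: parse a Cache-Control header value into an object.
// Directives without a value (e.g. no-store) are stored as true.
function parseCacheControl(value) {
    const directives = {};

    for (const part of (value || '').split(',')) {
        const trimmed = part.trim();
        if (!trimmed) continue;

        const eq = trimmed.indexOf('=');
        if (eq === -1) {
            directives[trimmed.toLowerCase()] = true;
        } else {
            const name = trimmed.slice(0, eq).trim().toLowerCase();
            const raw = trimmed.slice(eq + 1).trim().replace(/^"|"$/g, '');
            // Numeric values such as max-age are converted to numbers
            directives[name] = /^\d+$/.test(raw) ? Number(raw) : raw;
        }
    }

    return directives;
}

const parsed = parseCacheControl('private, max-age=600, must-revalidate');
console.log(parsed['max-age']);         // 600
console.log(parsed['must-revalidate']); // true
console.log(parsed['no-store']);        // undefined
```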

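Comparison cache

With the mandatory cache, the browser decides by itself, based on the Cache-Control field in the response header, whether a cached resource is still valid. With the comparison cache, the browser instead asks the server to confirm that its cached copy is still valid.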

Last-Modified & If-Modified-Since

The server tells the browser when the resource was last modified via the Last-Modified response header:

Last-Modified: Fri, 08 Feb 2019 15:20:04 GMT

When the resource is requested again, the browser needs to confirm with the server whether its cached copy is still valid. It does so with the If-Modified-Since field in the request header, whose value is the Last-Modified value from the response to the previous request:

If-Modified-Since: Fri, 08 Feb 2019 15:20:04 GMT

The server compares the value of the If-Modified-Since field with the last modification time of the requested resource.

If the resource has not been modified since that time (that is, its last modification time is not later than the If-Modified-Since value), the browser's cache is still valid, and the server returns a 304 status code telling the browser to use its cache. The server returns only HTTP headers, not a body (otherwise there would be no point in caching).

Otherwise, the server responds as usual with a 200 status code and the latest version of the resource.

Here is an example that modifies the Node.js code slightly:

const http = require('http');
const fs = require('fs');
const path = require('path');

const server = http.createServer((req, res) => {
    console.log(`Received the request at: ${req.url}`);

    const filename = path.resolve(__dirname, './image.png');

    fs.stat(filename, (err, stat) => {
        if (err) {
            res.statusCode = 500;
            res.end(err.message);
            return;
        }

        const lastModified = stat.mtime.toUTCString();

        if (lastModified === req.headers['if-modified-since']) {
            // Unchanged since the browser's copy: send headers only, no body
            res.writeHead(304, 'Not Modified');
            res.end();
        }
        else {
            fs.readFile(filename, (err, file) => {
                if (err) {
                    res.statusCode = 500;
                    res.end(err.message);
                    return;
                }

                res.setHeader('Last-Modified', lastModified);
                res.end(file);
            });
        }
    });
}).listen(3000);

console.log('Service started at localhost:3000!');

First request:

Second request:

Comparing the two requests shows that in addition to the status code changing to 304, the resource size has also decreased from 57.8K to 90B, which also proves that the response contains no HTTP body.
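The server example above compares the two date strings for strict equality, which is enough for a demo. A slightly more tolerant check, following the rule that the resource counts as unchanged when it has not been modified after the If-Modified-Since time, could be sketched as follows. The helper name isNotModified is hypothetical, and the sketch assumes the file's modification time is available as a Date, as with stat.mtime.

```javascript
// Hypothetical helper: should the server answer 304 for this request?
// ifModifiedSince: raw If-Modified-Since header value (may be undefined)
// mtime: Date of the file's last modification (e.g. stat.mtime)
function isNotModified(ifModifiedSince, mtime) {
    if (!ifModifiedSince) return false;

    const since = Date.parse(ifModifiedSince);
    if (Number.isNaN(since)) return false;

    // HTTP dates only have one-second precision, so drop the milliseconds
    const modifiedSeconds = Math.floor(mtime.getTime() / 1000);
    const sinceSeconds = Math.floor(since / 1000);

    return modifiedSeconds <= sinceSeconds;
}
```

Truncating to whole seconds matters because a file's modification time usually carries milliseconds, while the header value does not.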

ETag & If-None-Match

Last-Modified, like Expires, is flawed. Its precision is one second, so if a resource changes more than once within a second (at millisecond intervals, for example), the change goes undetected. And if the resource is generated dynamically, its modification time is always the current time, so it always appears to have been modified!

ETag & If-None-Match were designed to solve this problem.

The value of the ETag field is a unique identifier for the file, usually generated with a hash. The server sends it in the ETag response header, and the browser sends it back in the If-None-Match request header on the next request. The rest of the process is the same as with Last-Modified & If-Modified-Since, except that the value compared is the ETag rather than the last modification time.

The advantage of ETag is that it also works for resources that have no modification time, such as dynamically generated content or the JSON returned by RESTful APIs. The HTTP standard does not specify how ETag values are generated, however, so we have to generate them ourselves in code. Computing ETag values also costs some server performance.
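As an illustration of generating the value ourselves, here is a minimal sketch that hashes the response body and answers a conditional request. The helper names computeEtag and respondWithEtag are hypothetical, and SHA-1 is just one possible choice of hash. The sketch only handles a single exact value in If-None-Match, not lists of values, weak validators (W/ prefix), or the * wildcard.

```javascript
const crypto = require('crypto');

// Hypothetical helper: derive an ETag from the response body using a hash.
// The standard does not mandate how the value is generated.
function computeEtag(body) {
    const hash = crypto.createHash('sha1').update(body).digest('hex');
    return `"${hash}"`; // ETag values are quoted strings
}

// Hypothetical helper: build the response for a conditional request.
// reqHeaders uses lower-case keys, as Node.js does for req.headers.
function respondWithEtag(reqHeaders, body) {
    const etag = computeEtag(body);

    if (reqHeaders['if-none-match'] === etag) {
        // The client's copy is still current: send headers only
        return { status: 304, headers: { ETag: etag }, body: null };
    }

    return { status: 200, headers: { ETag: etag }, body };
}
```

Hashing the whole body on every request is where the performance cost comes from; a real server would likely cache the computed value alongside the resource.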

Priority

The mandatory cache and the comparison cache can coexist, and the mandatory cache takes precedence over the comparison cache. In practice, they are used together.

Example: add both Cache-Control and Last-Modified to the response header:

res.setHeader('Cache-Control', 'max-age=600');
res.setHeader('Last-Modified', lastModified);

First request:

Second request:

As you can see, the resource is read directly from the hard disk, even though the Last-Modified field is present.
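The precedence described in this section can be sketched as a small decision function. This is a simplified model under stated assumptions, not how any particular browser is implemented: the function name planRequest and the shape of the cache entry (storedAt, maxAge, etag, lastModified) are hypothetical.

```javascript
// Hypothetical model of a cache entry:
// { storedAt: ms timestamp, maxAge: seconds, etag?: string, lastModified?: string }
function planRequest(entry, now) {
    if (!entry) {
        // Nothing cached: a normal network request is needed
        return { action: 'fetch', headers: {} };
    }

    const ageInSeconds = (now - entry.storedAt) / 1000;
    if (ageInSeconds < entry.maxAge) {
        // Mandatory cache wins: no request is sent at all
        return { action: 'use-cache', headers: {} };
    }

    // Entry is stale: fall back to the comparison cache if validators exist
    const headers = {};
    if (entry.etag) headers['If-None-Match'] = entry.etag;
    if (entry.lastModified) headers['If-Modified-Since'] = entry.lastModified;

    const hasValidator = Object.keys(headers).length > 0;
    return { action: hasValidator ? 'revalidate' : 'fetch', headers };
}
```

With max-age=600 and a Last-Modified value stored, a request within the first ten minutes yields use-cache, and only after that does the model send a conditional request.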

Conclusion

HTTP caching is only a small part of front-end caching, yet it still involves a lot of scattered details. Ultimately, it is the browser that implements the cache, and its behavior may vary from browser to browser, so be careful in practice.

Used properly, HTTP caching can be very helpful for front-end performance optimization!