We know that HTTP requests are expensive: each new connection requires a TCP three-way handshake, and large payloads are slow to load. For both users and developers, the goal of caching is to reuse data that rarely changes, which improves loading speed and relieves pressure on the server.

Caching mechanisms

First let’s look at the network request state in the following two scenarios.

Scenario 1: Navigate forward and backward between pages

Scenario 2: Press F5 to refresh the page

Looking at the network status of each request, the results fall roughly into the following states.

  1. Memory cache: The resource (typically images and JS) is served from the in-memory cache held by the browser process.
  2. Disk cache: The resource is served from the browser's cache on the local hard disk.
  3. Status code 304: The server reports that the file has not changed, so the cached copy is used.
  4. Real network request: A full HTTP request is made, generally because the cache entry has expired, does not exist, or the resource is not suitable for caching.

As you can see, network requests have the above cache states, and we will explain the logic and implementation mechanism of these caches later.

1.1 HTTP Headers

Let’s take a look at the difference between the request and response headers for these requests.

The following HTTP headers are involved:

  1. General headers:

| Field | Description |
| --- | --- |
| Cache-Control | Controls caching behavior. See the table below for its values |
| Pragma | If the value of this field is no-cache, the client will not read the cache for this resource directly (a legacy HTTP/1.0 equivalent of Cache-Control: no-cache) |

Cache-Control values:

| Value | Description |
| --- | --- |
| private | Only the client may cache the response; the proxy layer may not |
| public | Both the client and proxy servers may cache the response |
| max-age=xxx | The cached content expires after xxx seconds |
| no-cache | The cached data must be validated with the server (comparison caching) before it is used |
| no-store | Nothing is cached; neither forced caching nor comparison caching is triggered |

  2. Request headers:

| Field | Description |
| --- | --- |
| If-Match | Checks whether the ETag matches; the request proceeds only if it does |
| If-None-Match | Checks whether the ETag differs; if it still matches, the server returns 304 |
| If-Modified-Since | Checks whether the resource has been modified since the given time; if not, the server returns 304 |
| If-Unmodified-Since | Checks whether the resource has remained unmodified since the given time; the request proceeds only if it has |

  3. Entity headers:

| Field | Description |
| --- | --- |
| ETag | Unique identifier of a resource version. A new ETag is generated when the resource changes |
| Expires | Validity period: the expiration time returned by the server |
| Last-Modified | Time the resource was last modified |
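
To make the Cache-Control values above concrete, here is a minimal sketch of how a header value could be split into its directives. The function name `parse_cache_control` is hypothetical and used only for illustration. The sketch assumes simple values such as `public, max-age=3600` and does not handle quoted arguments that themselves contain commas.

```python
from typing import Dict, Optional


def parse_cache_control(header_value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control header value into a {directive: argument} mapping.

    Directives without an argument (public, no-cache, ...) map to None.
    Simplification: quoted arguments containing commas are not supported.
    """
    directives: Dict[str, Optional[str]] = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        # Directive names are case-insensitive, so normalize them to lower case.
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


parsed = parse_cache_control("public, max-age=3600")
print(parsed.get("max-age"))   # the max-age argument, as a string
print("no-store" in parsed)    # whether storing the response is forbidden
```

Keeping the argument as a string leaves it to the caller to decide how to treat a malformed value such as `max-age=abc`.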

1.2 Browser Cache logic

The overall cache logic is as follows:

  1. The first step is to determine whether a cached copy exists and, if so, whether it has expired. This uses the Expires and Cache-Control fields in the cached response headers.
    • Cache-Control is read first. If max-age is present, the browser compares the time the response was generated with the current time to check whether the cached copy has expired. If no-cache is specified, the browser does not use the cached copy directly and instead validates it with the server. With no-store, the result of the network request is not stored locally and no cache entry is created.
    • If none of the above Cache-Control values is present, the browser compares the expiration time in Expires (a relic of HTTP/1.0) with the current time to determine whether the cached copy has expired.
  2. Second, if the cache has not expired, it is used directly. If it has expired, check whether the last cached response carries ETag or Last-Modified. If it has neither, a regular HTTP request is made; otherwise go to the next step, because these two values are what the server uses to determine whether the cached copy is still usable (that is, whether the file on the server has changed).
  3. Third, the request headers carry If-None-Match and If-Modified-Since. The main judgment logic is as follows:
    1. If-None-Match carries the ETag of the cached entity. When the request reaches the server, the server first checks whether the current ETag matches the If-None-Match value. If it matches, the server returns 304 and the browser reads the cached copy from the hard disk and displays it. If it does not match, the server returns the current entity and writes the new ETag into the response headers.
    2. If-Modified-Since carries the Last-Modified value from the response headers of the cached entity. The server uses it to determine whether the current entity has changed after this time. If it has changed, the new result is returned; otherwise 304 is returned and the browser uses the cached copy.
  4. If none of these caches is hit, the latest entity must be sent back to the browser over the network, together with its response headers.
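
The four steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not how any particular browser or server implements caching. `CachedResponse`, `is_fresh`, `plan_request` and `server_status` are hypothetical names; header names in the cache entry are assumed to be lower-cased; freshness is measured from the time the entry was stored; and ETags are compared as exact strings.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Dict, Optional, Tuple


@dataclass
class CachedResponse:
    """Hypothetical cache entry: stored response headers and storage time."""
    headers: Dict[str, str]   # header names lower-cased (assumption)
    stored_at: datetime       # timezone-aware


def is_fresh(entry: CachedResponse, now: datetime) -> bool:
    """Step 1: Cache-Control takes priority, Expires is the fallback."""
    # A no-store response is assumed never to have been stored at all.
    raw = entry.headers.get("cache-control", "").lower()
    directives = [d.strip() for d in raw.split(",")]
    if "no-cache" in directives:
        return False  # must be validated with the server before use
    for directive in directives:
        if directive.startswith("max-age="):
            try:
                max_age = int(directive.split("=", 1)[1])
            except ValueError:
                return False
            return (now - entry.stored_at).total_seconds() < max_age
    expires = entry.headers.get("expires")
    if expires:
        try:
            return now < parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            return False  # unparsable date: treat as expired
    return False


def plan_request(entry: Optional[CachedResponse],
                 now: datetime) -> Tuple[str, Dict[str, str]]:
    """Steps 1 to 4 on the client: returns (action, conditional headers)."""
    if entry is None:
        return "network", {}
    if is_fresh(entry, now):
        return "use-cache", {}
    conditional: Dict[str, str] = {}
    if "etag" in entry.headers:
        conditional["If-None-Match"] = entry.headers["etag"]
    if "last-modified" in entry.headers:
        conditional["If-Modified-Since"] = entry.headers["last-modified"]
    return ("revalidate", conditional) if conditional else ("network", {})


def server_status(request_headers: Dict[str, str], current_etag: str,
                  last_modified: datetime) -> int:
    """Step 3 on the server: 304 if the cached copy is still valid, else 200."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # When present, If-None-Match takes precedence over If-Modified-Since.
        return 304 if if_none_match == current_etag else 200
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return 200
        return 304 if last_modified <= since else 200
    return 200


now = datetime.now(timezone.utc)
stale = CachedResponse({"cache-control": "max-age=60", "etag": '"v1"'},
                       now - timedelta(minutes=5))
action, headers = plan_request(stale, now)
print(action, headers)
print(server_status(headers, '"v1"', now - timedelta(days=1)))
```

In the usage lines at the end, the entry is older than its max-age, so the client falls through to a conditional request, and the server answers according to whether its current ETag still matches.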