Caching is a simple and efficient way to optimize performance. A good caching strategy shortens the distance a page's resource requests have to travel, reduces latency, and, because cached files are reused, cuts bandwidth use and network load. We will explore the browser caching mechanism through cache locations, cache policies, and the application of cache policies in real-world scenarios.
Cache locations
There are four cache locations, each with its own priority. The browser searches them in order, and only when none of them has a match does it send a request to the network.
- 1, Service Worker
- 2, Memory Cache
- 3, Disk Cache
- 4, Push Cache
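The lookup order above can be sketched as a simple loop. This is only an illustration: each cache layer is modelled as a plain dictionary, and the names `LAYER_ORDER`, `lookup`, and `fetch_from_network` are hypothetical, not browser APIs.

```python
from typing import Callable

# Hypothetical stand-ins for the four cache locations, in priority order.
# Each layer is modelled as a plain dict from URL to response body.
LAYER_ORDER = ["service worker", "memory cache", "disk cache", "push cache"]


def lookup(url: str,
           layers: dict[str, dict[str, bytes]],
           fetch_from_network: Callable[[str], bytes]) -> tuple[str, bytes]:
    """Return (source, body), trying each cache layer before the network."""
    for name in LAYER_ORDER:
        body = layers.get(name, {}).get(url)
        if body is not None:
            return name, body
    # No layer had a match, so fall through to a real request.
    return "network", fetch_from_network(url)


layers = {
    "memory cache": {"/app.js": b"console.log('hi')"},
    "disk cache": {"/app.js": b"older copy", "/logo.png": b"\x89PNG"},
}

for url in ("/app.js", "/logo.png", "/new.css"):
    source, _ = lookup(url, layers, lambda u: b"fetched")
    print(url, "served from", source)
```

Because the loop stops at the first hit, a copy in a higher-priority layer hides any copy of the same URL further down.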
Memory Cache
The Memory Cache is an in-memory cache that holds resources already fetched by the current page, such as styles, scripts, and images. Reading from memory is faster than reading from disk, so the memory cache is efficient, but it is short-lived: it is released along with the process, so once we close the tab, the cached data in memory is freed.
Note that the memory cache does not care about the Cache-Control value in the HTTP response headers of the returned resource, and that a resource is matched not only by URL but also by Content-Type, CORS, and other characteristics.
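To make the matching rule concrete, here is a minimal sketch in which a cached resource is keyed on more than its URL. The `MemoryCacheKey` class and its fields are assumptions made for illustration; real browsers consider further characteristics and do not expose such a structure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MemoryCacheKey:
    """Hypothetical key: the URL plus the extra characteristics named above."""
    url: str
    content_type: str
    cors_mode: str  # illustrative values: "cors" or "no-cors"


memory_cache: dict[MemoryCacheKey, bytes] = {}


def put(url: str, content_type: str, cors_mode: str, body: bytes) -> None:
    memory_cache[MemoryCacheKey(url, content_type, cors_mode)] = body


def match(url: str, content_type: str, cors_mode: str) -> Optional[bytes]:
    # A hit requires every part of the key to agree, not just the URL.
    return memory_cache.get(MemoryCacheKey(url, content_type, cors_mode))


put("/logo.png", "image/png", "no-cors", b"\x89PNG")
print(match("/logo.png", "image/png", "no-cors") is not None)
print(match("/logo.png", "image/png", "cors") is not None)
```

In this sketch the second lookup misses even though the URL is identical, because the CORS mode differs.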
An important group of resources in the memory cache are those downloaded by preloader-related instructions (for example, the preload and prefetch hints described next).
Preload: preloading. A declarative fetch that forces the browser to request a resource without blocking the document's onload event.
Prefetch: prefetching. Tells the browser that the resource may be needed in the future, and leaves it to the browser to decide when to load it.
Disk Cache
The Disk Cache is a cache stored on the hard disk. It is slower to read than the Memory Cache but has a larger capacity and a longer lifetime.
Among the browser caches, the Disk Cache uses the fields in the HTTP headers to decide which resources need to be cached, which can be used without a request, and which have expired and must be requested again.
Resources are more likely to be stored on disk when they are large files, or when current memory usage is high.
Caching strategies
For the sake of performance, most interfaces should be given a well-chosen cache policy. Browser cache policies generally fall into two types, strong cache and negotiated cache, and both are implemented by setting HTTP headers.
- Each time the browser sends a request, it first looks in the browser cache for the result of that request and its cache identifier.
- Each time the browser receives a response, it stores the result and its cache identifier in the browser cache.
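The two steps above can be sketched as a pair of functions. This is a simplified model, not how any browser is implemented: the choice of which headers make up the "cache identifier" (`IDENTIFIER_HEADERS`) is an assumption for this sketch, and `look_up` and `store` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

# Response headers treated here as the "cache identifier" (an assumption
# made for this sketch).
IDENTIFIER_HEADERS = ("Cache-Control", "Expires", "Last-Modified", "ETag")


@dataclass
class CacheEntry:
    body: bytes
    identifier: dict[str, str]


browser_cache: dict[str, CacheEntry] = {}


def look_up(url: str) -> Optional[CacheEntry]:
    """Step 1: before sending a request, check the cache for a stored result."""
    return browser_cache.get(url)


def store(url: str, body: bytes, headers: dict[str, str]) -> None:
    """Step 2: after a response arrives, keep the result and its identifier."""
    identifier = {k: v for k, v in headers.items() if k in IDENTIFIER_HEADERS}
    browser_cache[url] = CacheEntry(body, identifier)


print(look_up("/app.js"))  # nothing has been stored yet
store("/app.js", b"...", {"Cache-Control": "max-age=300", "Server": "demo"})
print(look_up("/app.js").identifier)
```

Whether the stored entry may actually be used is decided by the cache policy, which reads the identifier kept alongside the body.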
Strong cache
With a strong cache hit, the browser does not send a request to the server but reads the resource directly from the cache. The request shows a 200 status code, and the Size column of the developer tools' Network panel shows "from disk cache" or "from memory cache". Strong caching is controlled by Expires or Cache-Control.
1. Expires: the cache expiration time, which specifies when the resource expires, as set by the server. Expires is a response header field from the web server that tells the browser it may read the data directly from the browser cache before the expiration time, without requesting it again.
2. Cache-Control: in HTTP/1.1, for example, Cache-Control: max-age=300 means that the cache will be hit if the resource is loaded again within 5 minutes of the time the response was correctly returned (which the browser records).
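A strong-cache freshness check can be sketched as follows. It is deliberately simplified: it ignores the Age header and any heuristic freshness a real browser may apply, and `is_fresh` is a hypothetical helper. Letting max-age win over Expires when both are present follows the HTTP caching specification rather than anything stated in this article.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def is_fresh(response_time: datetime,
             now: datetime,
             max_age: Optional[int] = None,
             expires: Optional[str] = None) -> bool:
    """Return True if a cached response may be reused without a request."""
    if max_age is not None:
        # Relative lifetime, counted from when the response was received.
        return now - response_time < timedelta(seconds=max_age)
    if expires is not None:
        # Absolute expiry time in HTTP date format.
        return now < parsedate_to_datetime(expires)
    # Neither header present: this sketch treats the entry as not fresh.
    return False


received = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(is_fresh(received, received + timedelta(minutes=4), max_age=300))
print(is_fresh(received, received + timedelta(minutes=6), max_age=300))
print(is_fresh(received, received, expires="Mon, 01 Jan 2024 12:05:00 GMT"))
```

With max-age=300, the reload after 4 minutes is served from cache and the reload after 6 minutes is not.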
Cache-Control can be set in either the request header or the response header, and several directives can be combined in one header value.
A strong cache decides purely on whether a certain time or period has elapsed, regardless of whether the file on the server has been updated, so the loaded file may not be the latest content on the server. In that case, we need the negotiated cache policy.
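To show how combined directives are read, here is a minimal parser sketch. `parse_cache_control` is a hypothetical helper; it splits on commas and does not handle quoted arguments that themselves contain commas.

```python
from typing import Optional


def parse_cache_control(value: str) -> dict[str, Optional[str]]:
    """Split a Cache-Control header value into {directive: argument or None}."""
    directives: dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directive names are case-insensitive; arguments may be quoted.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


print(parse_cache_control("public, max-age=300"))
print(parse_cache_control("no-store"))
```

A directive without an argument, such as public or no-store, maps to None, while max-age keeps its value as a string for the caller to convert.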
Negotiated cache
Negotiated caching is the process in which, once the strong cache has expired, the browser sends a request to the server carrying the cache identifier, and the server decides from that identifier whether the cache may still be used. It is implemented mainly with two HTTP headers: Last-Modified and ETag.
1. Last-Modified: time of the last modification
When the client first requests the resource, the response carries a Last-Modified header marking the time the file was last modified on the server. When the client requests the same URL again, the browser, following the HTTP protocol, sends an If-Modified-Since header asking whether the file has been modified since that time, and the server returns 304 if it has not.
Disadvantages: 1. Some files are rewritten periodically without their contents changing (only the modification time changes); in that case we do not want the client to treat the file as modified and GET it again. 2. Some servers cannot determine a file's last modification time accurately. 3. Some files are modified very frequently, for example several times within one second, which a timestamp with one-second resolution cannot capture.
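The server side of this exchange, and the one-second limitation listed above, can be sketched like this. `http_date` and `respond_last_modified` are hypothetical helpers, and the file's modification time is passed in directly instead of being read from disk.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def http_date(mtime: datetime) -> str:
    """Format a UTC modification time as an HTTP date (whole seconds only)."""
    return format_datetime(mtime.replace(microsecond=0), usegmt=True)


def respond_last_modified(mtime: datetime,
                          if_modified_since: Optional[str]) -> int:
    """Return 304 if the file is unchanged since the client's copy, else 200."""
    last_modified = mtime.replace(microsecond=0)
    if if_modified_since is not None:
        if last_modified <= parsedate_to_datetime(if_modified_since):
            return 304
    return 200


first_write = datetime(2024, 1, 1, 12, 0, 0, 200_000, tzinfo=timezone.utc)
header = http_date(first_write)  # sent to the client as Last-Modified
print(header)
print(respond_last_modified(first_write, None))    # first visit
print(respond_last_modified(first_write, header))  # revalidation, unchanged

# The file is rewritten 0.7 s later, still inside the same second.
second_write = first_write.replace(microsecond=900_000)
print(respond_last_modified(second_write, header))
```

The last call still answers 304 even though the file changed, because both writes fall within the same second and produce the same Last-Modified value.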
2. ETag: data signature (hash)
When the client first requests the resource, the response carries an ETag header identifying that version of the file. When the client requests the same URL again, the browser, following the HTTP protocol, sends an If-None-Match header carrying that ETag, and the server returns 304 if the file's current ETag still matches.
Disadvantage: if the ETag is generated on every check, validation is clearly slower than directly comparing modification times.
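A minimal sketch of ETag validation follows. Deriving the tag from a truncated SHA-256 of the content is an assumption for illustration; servers are free to generate ETags however they like. The helper names are hypothetical, and weak validators and the `*` wildcard are ignored.

```python
import hashlib
from typing import Optional


def make_etag(content: bytes) -> str:
    """Derive a quoted ETag from the content (scheme assumed for this sketch)."""
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'


def respond_etag(content: bytes,
                 if_none_match: Optional[str] = None) -> tuple[int, str, bytes]:
    """Return (status, etag, body); the body is empty on a 304."""
    etag = make_etag(content)  # recomputed on every request: the cost noted above
    if if_none_match is not None:
        client_tags = [tag.strip() for tag in if_none_match.split(",")]
        if etag in client_tags:
            return 304, etag, b""
    return 200, etag, content


status, etag, body = respond_etag(b"body { color: red }")
print(status, etag)

# Revalidation with the stored tag: unchanged content, then changed content.
print(respond_etag(b"body { color: red }", if_none_match=etag)[0])
print(respond_etag(b"body { color: blue }", if_none_match=etag)[0])
```

Because the tag depends only on the content, rewriting a file with identical bytes still validates as unchanged, which a modification time cannot guarantee.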
3. Comparison between the two:
First, ETag is superior to Last-Modified in accuracy. Last-Modified has one-second resolution: if a file changes several times within a second, its Last-Modified value does not actually change, whereas the ETag changes every time, which ensures accuracy. If the server is load-balanced, the Last-Modified values generated by different servers may also be inconsistent.
Second, in terms of performance, ETag is inferior to Last-Modified, because Last-Modified only needs to record a time, whereas ETag requires the server to compute a hash value with an algorithm.
Third, when the server validates a request, ETag takes priority: if the request carries both If-None-Match and If-Modified-Since, the ETag comparison decides the result.
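The priority rule can be sketched as a single validation function. `validate` is a hypothetical helper, the current ETag and modification time are passed in as arguments, and header lookup is case-sensitive for brevity.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def validate(current_etag: str,
             current_mtime: datetime,
             request_headers: dict[str, str]) -> int:
    """Return 304 or 200, consulting If-None-Match before If-Modified-Since."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # The ETag comparison decides alone; If-Modified-Since is not consulted.
        tags = [tag.strip() for tag in if_none_match.split(",")]
        return 304 if current_etag in tags else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        unchanged = (current_mtime.replace(microsecond=0)
                     <= parsedate_to_datetime(if_modified_since))
        return 304 if unchanged else 200

    return 200


mtime = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
request = {
    "If-None-Match": '"old-version"',
    "If-Modified-Since": "Mon, 01 Jan 2024 12:00:00 GMT",
}
# The date alone would say "unchanged", but the ETag differs, so the result is 200.
print(validate('"new-version"', mtime, request))
```

This ordering is what protects against the same-second edits that Last-Modified alone would miss.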