Browser cache

This is a repost of my own article. Original link: www.jianshu.com/p/54cc04190…

1. Why cache

Caching is a performance optimization that reduces the time of network requests and improves file reuse.

2. Cache location

The browser checks the following cache locations, listed from highest to lowest priority:

Service Worker
Memory Cache
Disk Cache
Push Cache

1. Service Worker

This cache location has a precondition: the page must be served over HTTPS, because a Service Worker can intercept requests and HTTPS is needed to keep that secure.

A Service Worker runs on an independent thread in the background of the browser. Implementing caching with it takes three steps:

Register the Service Worker.
Listen for the install event and cache the required files.
On the next visit, intercept each request and check whether a cached copy exists. If it does, read it from the cache; otherwise, request the data.

If the third step misses the Service Worker cache, fetch is called, which looks up the data in the remaining caches in priority order. Whether the data then comes from another cache location or from the network, the browser reports it as coming from the Service Worker.

Features: you are free to control which files are cached, how requests are matched against the cache, and how the cache is read. The cache is persistent.
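The three steps above can be illustrated with a small Python model. This is only a sketch of the control flow, not the browser's Service Worker API: the class name ServiceWorkerModel, its methods, and the injected fetch function are all hypothetical, and a plain dictionary stands in for the real cache storage.

```python
from typing import Callable, Dict, Iterable, Tuple


class ServiceWorkerModel:
    """Toy model of the three steps; not the browser's Service Worker API."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        # Step 1 (registration) is represented by creating the object.
        self._fetch = fetch                  # stands in for a network request
        self._cache: Dict[str, bytes] = {}   # stands in for the cache storage

    def install(self, urls: Iterable[str]) -> None:
        """Step 2: on install, pre-cache the files the page will need."""
        for url in urls:
            self._cache[url] = self._fetch(url)

    def handle_request(self, url: str) -> Tuple[bytes, str]:
        """Step 3: intercept a request; serve from cache, else fetch."""
        if url in self._cache:
            return self._cache[url], "cache"
        return self._fetch(url), "network"
```

A request for a pre-cached URL is answered without calling fetch again, while any other URL falls through to fetch.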

2. Memory Cache

This is the in-memory cache. It holds the styles, scripts, images, and other resources that the current page has already fetched.

Features: fast reads, but small capacity and a short lifetime. Once the page is closed, the cached data in memory is released.

In the Network panel of the browser's developer tools, the Size column indicates when a resource was served from the memory cache.

Related mechanism: preloader

Note: the memory cache generally ignores the Cache-Control information in the HTTP caching headers. Resources are matched not only by URL, but also by Content-Type, CORS mode, and other characteristics.

3. Disk Cache

A cache stored on the hard disk, which can hold any kind of resource.

Features: large capacity and the longest retention (entries can go a long time without expiring), but slower reads.

Unlike the memory cache, the disk cache uses the fields in the HTTP headers to decide which resources to cache, which cached resources can be used directly, and which must be requested again. If a resource with the same URL is already in the disk cache and has not expired, it is not requested again. (The HTTP caching headers are covered in the following sections.)

Which files go to the disk cache and which go to the memory cache?

Large files most likely go to disk; small files most likely go to memory.
When memory usage is high, files are stored on disk in preference to memory.

4. Push Cache

Push Cache is consulted only when none of the three caches above is hit.

Features: it exists only for the duration of a session and is released when the session ends, its cache time is very short (about 5 minutes in Chrome), and it does not strictly follow the HTTP caching headers.

3. Caching process

Each time the browser makes a request, it first looks in the browser cache for the result of that request and its cache identifier.
Each time the browser receives a response, it stores the result and the cache identifier in the browser cache.

Depending on whether an HTTP request has to be sent to the server again, the caching process is divided into strong caching and negotiated caching.

4. Strong cache

The strong cache does not send a request to the server, but reads the resource directly from the cache:

The status code is 200, and the Size column shows whether the data came from the memory cache or the disk cache.

Strong caching is implemented with two HTTP headers:

Expires (HTTP response header)
Cache-Control (HTTP response/request header)

1. Expires (HTTP/1.0)

Specifies when the cached copy expires, as an absolute point in time according to the server's clock.

Expires = request time (Date) + max-age. This header should be used in combination with Last-Modified.

Disadvantages:

It is only accurate to the second. If the resource is modified again within the same second, the change cannot be detected.
The expiry time is checked against the client's local clock, so changing the local time may invalidate the cache.
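The dependence on the local clock can be seen in a minimal Python sketch. The function name is_fresh_by_expires is hypothetical, and the current time is passed in explicitly so that the effect of a changed clock is visible. It assumes the header value is a well-formed HTTP date in GMT.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh_by_expires(expires_header: str, now: datetime) -> bool:
    """Return True while the clock value `now` is before the Expires time."""
    expires_at = parsedate_to_datetime(expires_header)  # timezone-aware for GMT dates
    return now < expires_at


expires = "Wed, 21 Oct 2015 07:28:00 GMT"
one_minute_before = datetime(2015, 10, 21, 7, 27, 0, tzinfo=timezone.utc)
one_minute_after = datetime(2015, 10, 21, 7, 29, 0, tzinfo=timezone.utc)
print(is_fresh_by_expires(expires, one_minute_before))  # True
print(is_fresh_by_expires(expires, one_minute_after))   # False
```

The same cached response is judged fresh or expired purely according to the clock value supplied by the client.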

2. Cache-Control (HTTP/1.1)

Example: Cache-Control: max-age=300 indicates that the cached copy is valid for 300 seconds (5 minutes).

public: both clients and proxy servers may cache the response.
private: only the client may cache the response; proxy servers may not.
no-cache: the client may store the response but must validate it with the server before each use, using ETag or Last-Modified. In other words, the strong cache is skipped and the negotiated cache is used.
no-store: nothing is cached; neither strong caching nor negotiated caching is used.
max-age: the number of seconds until the cached copy expires. It applies to ordinary caches.
s-maxage: also a duration in seconds, but it applies only to proxy (shared) caches. There it takes precedence over max-age and Expires.
max-stale: the maximum staleness the client will accept. The client is willing to accept a response that has been expired for up to max-stale seconds. If no value is given, the client accepts a stale response no matter how long ago it expired.
min-fresh: the minimum remaining freshness the client requires. The client only wants a response that will still be fresh for at least min-fresh seconds.
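To make the directive syntax concrete, here is a minimal Python sketch that splits a Cache-Control value into its directives. The function name parse_cache_control is hypothetical, and the parser is deliberately simplified: it assumes no directive argument contains a comma inside a quoted string.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Map each directive name (lowercased) to its argument, or None if it has none."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty items such as a trailing comma
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


print(parse_cache_control("public, max-age=300"))
# {'public': None, 'max-age': '300'}
```

Directive names are case-insensitive, which is why the sketch lowercases them before storing.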

3. Expires vs. Cache-Control

Cache-Control takes precedence over Expires: when both are present, max-age decides how long the cached copy is valid.
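The precedence rules can be sketched as a function that returns the freshness lifetime in seconds. The names freshness_lifetime and shared_cache are hypothetical, the header lookup assumes the exact capitalization shown, and only the directives discussed in this article are handled.

```python
import re
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def _directive_seconds(cache_control: str, name: str) -> Optional[int]:
    """Return the integer argument of a directive such as max-age, if present."""
    pattern = r"(?:^|,)\s*" + re.escape(name) + r'\s*=\s*"?(\d+)'
    match = re.search(pattern, cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def freshness_lifetime(headers: Mapping[str, str],
                       shared_cache: bool = False) -> Optional[int]:
    """Lifetime in seconds: s-maxage (proxies only), then max-age, then Expires."""
    cache_control = headers.get("Cache-Control", "")
    if shared_cache:
        s_maxage = _directive_seconds(cache_control, "s-maxage")
        if s_maxage is not None:
            return s_maxage
    max_age = _directive_seconds(cache_control, "max-age")
    if max_age is not None:
        return max_age
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return max(0, int((expires - date).total_seconds()))
    return None  # no explicit lifetime was given
```

For a response carrying both max-age=300 and an Expires one hour after Date, the function returns 300. With Expires alone it returns 3600.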

5. Negotiated cache

A strong cache only determines how long the cached copy is considered valid. During that period the browser does not check whether the file on the server has been updated, so the loaded file may not be the server's latest content. To find out whether the content on the server has changed, the negotiated cache is used.

Definition: after the strong cache expires, the browser sends a request to the server carrying a cache identifier, and the server decides from that identifier whether the cached copy can still be used.

Negotiation has two possible outcomes:

The negotiated cache is valid: the server returns 304 Not Modified.
The negotiated cache is invalid: the server returns 200 and the new resource.

This is implemented with two HTTP headers:

Last-Modified
ETag

1. Last-Modified

When the browser requests a resource for the first time, the server returns the resource with a Last-Modified response header, which gives the time the resource was last modified on the server. The browser caches the file together with its headers.

The next time the browser requests the resource, it finds the stored Last-Modified value and sends that time in the If-Modified-Since request header.

When the server receives the request, it compares the time in If-Modified-Since with the resource's last modification time on the server. If the resource has not been modified since that time, the server returns 304. If the last modification time is later than the time in If-Modified-Since, the server returns 200 and the new resource file.

Disadvantages:

If the resource file is opened locally, its modification time may change even though its content has not. The server then treats the resource as updated and sends the same content again.
Last-Modified only has one-second resolution. If the file is modified more than once within a second, the server does not notice and returns 304 instead of the new resource (the same limitation as Expires).
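The server-side comparison and the one-second limitation can be sketched in Python as follows. The function names last_modified_header and status_for are hypothetical, and modification times are assumed to be timezone-aware UTC datetimes.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def last_modified_header(mtime: datetime) -> str:
    """Format a modification time as an HTTP date, which has whole seconds only."""
    return format_datetime(mtime.replace(microsecond=0), usegmt=True)


def status_for(if_modified_since: Optional[str], mtime: datetime) -> int:
    """Return 304 if the resource is unchanged since the given date, else 200."""
    if if_modified_since is None:
        return 200  # first request: nothing to compare against
    since = parsedate_to_datetime(if_modified_since)
    # HTTP dates carry no sub-second part, so compare at one-second resolution.
    return 304 if mtime.replace(microsecond=0) <= since else 200


first_save = datetime(2024, 1, 1, 12, 0, 0, 100000, tzinfo=timezone.utc)
same_second = first_save.replace(microsecond=600000)  # edited again 0.5 s later
next_second = first_save.replace(second=1)

header = last_modified_header(first_save)
print(status_for(header, same_second))  # 304: the second edit goes unnoticed
print(status_for(header, next_second))  # 200
```

The edit made within the same second still yields 304, which is exactly the second disadvantage listed above.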

2. ETag

ETag is a unique identifier for the current version of the resource file. It is generated by the server and returned with the response. Whenever the resource file changes, a new ETag is generated.

The next time the browser requests the resource, it sends the previously received ETag in the If-None-Match request header. The server compares If-None-Match with the resource's current ETag to decide whether to return 304 or the resource itself.
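A minimal Python sketch of this exchange, assuming the ETag is derived from a hash of the file content. The function names make_etag and respond are hypothetical, the truncated SHA-256 digest is just one possible choice, and weak validators (W/ prefix) and the * wildcard are not handled.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(content: bytes) -> str:
    """Derive a quoted ETag from the content, so it changes when the content does."""
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'


def respond(content: bytes, if_none_match: Optional[str]) -> Tuple[int, bytes, str]:
    """Return (status, body, etag) for a request with an optional If-None-Match."""
    etag = make_etag(content)
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag in candidates:
            return 304, b"", etag  # unchanged: no body is sent
    return 200, content, etag


status, body, etag = respond(b"hello", None)       # first request: 200 with body
status_again, body_again, _ = respond(b"hello", etag)   # unchanged content
status_changed, _, new_etag = respond(b"hello!", etag)  # content has changed
```

In this sketch the second call returns 304 with an empty body, and the third returns 200 with a different ETag because the content differs.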

3. Comparison

ETag is more accurate.
Last-Modified is cheaper to produce: it is just a timestamp, whereas the ETag hash has to be computed from the content of the file.
ETag has the higher priority: when both are present, the server checks ETag first.

6. Caching mechanism

The strong cache takes precedence over the negotiated cache. If the strong cache is still valid, the cached resource is used directly. Otherwise, the negotiated cache is used and the server decides whether the cached copy can still be used.
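The overall decision can be summarized in a few lines of Python. This is a simplified sketch with hypothetical names: age and lifetime are given in seconds, and an ETag stands in for whichever validator the negotiated cache uses.

```python
from typing import Optional


def resolve(age: int, lifetime: int,
            cached_etag: Optional[str], current_etag: str) -> str:
    """Describe how a request for a previously cached resource is answered."""
    if age < lifetime:
        # Strong cache is still valid: no request reaches the server.
        return "200 (from cache)"
    if cached_etag is not None and cached_etag == current_etag:
        # Strong cache expired, but the server confirms the copy is current.
        return "304 Not Modified"
    # Strong cache expired and the resource changed (or no validator was stored).
    return "200 (new response)"


print(resolve(10, 300, '"a"', '"b"'))   # 200 (from cache)
print(resolve(400, 300, '"a"', '"a"'))  # 304 Not Modified
print(resolve(400, 300, '"a"', '"b"'))  # 200 (new response)
```

The first call shows the cost of strong caching: the server's copy has already changed, but the cached one is still served until the lifetime runs out.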

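If no caching policy is set, the browser falls back to a heuristic: the cached copy is typically treated as valid for 10% * (Date - Last-Modified), where Date and Last-Modified are the values of the corresponding response headers.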
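As a worked example of this heuristic, the following Python sketch computes the lifetime from the two headers. The function name heuristic_lifetime is hypothetical, and the 10% factor is taken from the text above. Actual browsers may apply additional limits.

```python
from email.utils import parsedate_to_datetime


def heuristic_lifetime(date_header: str, last_modified: str,
                       fraction: float = 0.1) -> int:
    """Return fraction * (Date - Last-Modified) in whole seconds, never negative."""
    date = parsedate_to_datetime(date_header)
    modified = parsedate_to_datetime(last_modified)
    return max(0, round((date - modified).total_seconds() * fraction))


# A file last modified 10 days before the response was generated
# is treated as fresh for 1 day (86400 seconds).
print(heuristic_lifetime("Thu, 11 Jan 2024 12:00:00 GMT",
                         "Mon, 01 Jan 2024 12:00:00 GMT"))  # 86400
```

The longer a file has gone unmodified, the longer the browser assumes it will stay unchanged.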

7. Actual usage scenarios

1. Frequently changing resources

Use the negotiated cache: set Cache-Control: no-cache so that the browser validates with the server on every request. When the resource has not changed, the server answers with 304 and no body, which significantly reduces the size of the response data.

2. Resources that rarely change

Use the strong cache: set max-age to a long time (such as a year), so that later requests for the same URL are served from the cache. If the file is updated before max-age expires, the browser will only pick up the change under a new URL, so add a hash, version number, or similar information to the URL.
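One way to put a hash into the URL is to derive it from the file content, so the name changes exactly when the content does. The Python sketch below assumes this approach. The function name hashed_name, the 8-character digest length, and the name.hash.ext layout are illustrative choices, not something the article prescribes.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a content hash before the file extension: app.js -> app.<hash>.js."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old_url = hashed_name("static/app.js", b"console.log('v1');")
new_url = hashed_name("static/app.js", b"console.log('v2');")
print(old_url != new_url)  # True: changed content gets a new URL
```

Because an unchanged file keeps the same name, it stays in the strong cache, while an updated file is fetched under its new URL without waiting for max-age to expire.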

8. The impact of user behavior on browser cache

1. Normal reload

If the cache is hit, the resource in the cache is used directly.

2. Hard reload

Only the shallow caches are cleared; some cached data remains.

3. Empty cache and hard reload

All caches are cleared, for this page and for other pages alike, and every resource is fetched again.