Reposted from my own earlier article; original link: www.jianshu.com/p/54cc04190…
1. Why cache
Caching is a performance optimization: it cuts network request time by reusing files that have already been fetched.
2. Cache location
The browser checks the following cache locations, listed here in order of priority:
- Service Worker
- Memory Cache
- Disk Cache
- Push Cache
1. Service Worker
This cache location has a precondition: the page must be served over HTTPS, because a Service Worker can intercept requests and HTTPS is needed to keep that secure.
A Service Worker is an independent thread running in the background of the browser. Caching with it takes three steps:
- Register the Service Worker,
- Listen for the install event and cache the required files,
- On later visits, intercept each request and check whether a cached copy exists. If it does, read it from the cache; otherwise, request the data.
If step three misses the Service Worker cache, fetch is called, and the lookup continues through the remaining cache locations in priority order. Either way, the browser reports the response as coming from the Service Worker, whether the data actually came from another cache location or from the network.
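The three steps above can be modeled outside the browser. What follows is a minimal Python simulation of the cache-first flow, not the real Service Worker API (which is a JavaScript browser API); the class name `CacheFirstWorker` and the `fetch` callback are hypothetical stand-ins introduced only for illustration.

```python
from typing import Callable, Dict, Iterable, Tuple


class CacheFirstWorker:
    """Toy stand-in for a Service Worker (hypothetical, not a browser API)."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        # Step 1: "registration" hands the worker a way to reach the network.
        self._fetch = fetch
        self._cache: Dict[str, bytes] = {}

    def install(self, urls: Iterable[str]) -> None:
        # Step 2: on install, pre-cache the files the page will need.
        for url in urls:
            self._cache[url] = self._fetch(url)

    def handle_request(self, url: str) -> Tuple[bytes, str]:
        # Step 3: intercept the request; serve from the cache on a hit,
        # otherwise fall through to fetch.
        if url in self._cache:
            return self._cache[url], "cache"
        return self._fetch(url), "network"
```

In this sketch a miss is not stored afterwards; whether to add fetched responses to the cache is one of the choices a real Service Worker leaves to the developer.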
Features: you control which files are cached, how requests are matched against the cache, and how the cache is read; the cache is also persistent.
2. Memory Cache
This is the in-memory cache. It holds the styles, scripts, images, and other resources that the current page has already fetched.
Features: fast reads, but small capacity and a short lifetime; once the page is closed, the memory cache is released.
In the browser's developer tools, the Size column of the network panel shows when a response came from the memory cache.
Related instruction: preloader
Note: the memory cache does not honor the HTTP Cache-Control header. Resources are matched not only by URL but also by Content-Type, CORS mode, and other characteristics.
3. Disk Cache
A cache stored on disk, which can hold any kind of resource.
Features: large capacity but slower reads, and the longest storage lifetime of the four locations (entries can stay for a long time without expiring).
In contrast to the memory cache, the disk cache uses the fields in the HTTP headers to decide which resources to cache, which can be used directly, and which must be requested again. If a resource with the same URL is already in the disk cache and still valid, it is not requested again. (The HTTP cache headers are covered in the following sections.)
Which files go to disk and which go to memory?
- Large files most likely go to disk; small files most likely go to memory
- When memory usage is high, files are stored on disk first
4. Push Cache
Push Cache (the HTTP/2 server push cache) is used only when none of the three caches above is hit.
Features: it exists only within a session and is released when the session ends, has a very short cache time (about 5 minutes in Chrome), and does not strictly follow the HTTP cache headers.
3. Caching process
- Each time the browser makes a request, it first looks in the browser cache for the result of that request and its cache identifier
- Each time the browser receives a response, it stores the result and the cache identifier in the browser cache
Depending on whether a request has to be sent to the server again, the caching process is divided into strong caching and negotiated caching (also called protocol caching).
Strong cache
With a strong cache hit, no request is sent to the server and the resource is read directly from the cache:
The status code is 200, and the Size column shows that the data came from the memory cache or the disk cache.
Strong cache implementation:
- Expires (HTTP Response Header)
- Cache-Control (HTTP Response/Request Header)
1. Expires (HTTP/1.0)
Specifies when the cached response expires, as an absolute point in time on the server.
Expires = Date (the time of the response) + max-age. It is usually used in combination with Last-Modified.
Disadvantages:
- It is only accurate to the second, so a change made to the resource within the same second cannot be detected
- Because the value is an absolute time, changing the client's local clock can invalidate the cache
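To make the relationship Expires = Date + max-age and the local-clock weakness concrete, here is a small Python sketch. The function names `expires_from` and `is_fresh_by_expires` are hypothetical helpers, and the dates are made-up sample values.

```python
from datetime import datetime, timedelta
from email.utils import format_datetime, parsedate_to_datetime


def expires_from(date_header: str, max_age_seconds: int) -> str:
    """Return the Expires value equal to Date + max-age, as an HTTP date."""
    date = parsedate_to_datetime(date_header)
    return format_datetime(date + timedelta(seconds=max_age_seconds), usegmt=True)


def is_fresh_by_expires(expires_header: str, local_now: datetime) -> bool:
    """The client compares the absolute Expires time with its own clock."""
    return local_now < parsedate_to_datetime(expires_header)
```

Because `is_fresh_by_expires` depends on the client's clock, moving that clock forward past the Expires time makes a response look stale even though the server intended it to remain valid.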
2. Cache-Control (HTTP/1.1)
Example: Cache-Control: max-age=300 means the cached response is valid for 300 seconds.
- public: both clients and proxy servers can cache the response
- private: only the client can cache the response; proxy servers cannot
- no-cache: the client may store the response but must not use it without first validating it with the server, using the ETag or Last-Modified field; that is, the negotiated (protocol) cache is used
- no-store: nothing is cached; neither the strong cache nor the negotiated cache is used
- max-age: the number of seconds before the cached response expires; applies to ordinary caches
- s-maxage: also a time in seconds, but only effective on proxy servers (shared caches), where it takes precedence over max-age and Expires
- max-stale: the maximum staleness the client tolerates. The client is willing to accept a response that has been stale for up to max-stale seconds. If no value is given, the client accepts a stale response no matter how long ago it expired.
- min-fresh: the minimum remaining freshness the client tolerates. The client only wants a response that will stay fresh for at least min-fresh more seconds.
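The directives above are sent as a comma-separated list. A rough Python sketch of how a cache might parse them, and how s-maxage overrides max-age for a proxy, could look like this; `parse_cache_control` and `max_age_for` are hypothetical names, and real caches handle more edge cases than this.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if arg else None
    return directives


def max_age_for(directives: Dict[str, Optional[str]], shared_cache: bool) -> Optional[int]:
    """Lifetime in seconds; a proxy (shared cache) prefers s-maxage."""
    if shared_cache and directives.get("s-maxage") is not None:
        return int(directives["s-maxage"])
    if directives.get("max-age") is not None:
        return int(directives["max-age"])
    return None
```

For a header such as `public, max-age=300, s-maxage=600`, the browser would use 300 seconds while a proxy would use 600.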
3. Expires vs. Cache-Control
Cache-Control takes precedence over Expires.
5. Negotiated cache
A strong cache fixes a period during which the cached copy is used. If the file on the server is updated during this period, the browser does not notice, so the loaded file may not be the latest content on the server. To find out whether the server's content has changed, a negotiated cache policy is used.
Definition: after the strong cache expires, the browser sends a request carrying a cache identifier to the server, and the server decides from that identifier whether the cache can be used.
Negotiation has two possible results:
- The negotiated cache is valid: the server returns 304 Not Modified
- The negotiated cache is invalid: the server returns 200 and the requested resource
This is implemented with two HTTP headers:
- Last-Modified
- Etag
1. Last-Modified
When the browser requests a resource for the first time, the server returns it with a Last-Modified response header that gives the time the resource was last modified on the server. The browser caches the file together with its headers.
The next time the browser requests the resource, it finds the stored Last-Modified value and sends that time in the If-Modified-Since request header.
When the server receives the request, it compares the time in If-Modified-Since with the resource's last modification time on the server. If the last modification time is later than the time in If-Modified-Since, the server returns 200 and the new resource file; otherwise it returns 304.
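The server-side comparison described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: `status_for_if_modified_since` is a hypothetical helper, and a real server would also handle malformed dates.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Optional


def status_for_if_modified_since(last_modified: datetime,
                                 if_modified_since: Optional[str]) -> int:
    """Return 304 if the resource is unchanged since the given time, else 200."""
    if if_modified_since is None:
        return 200  # first request: no validator to compare against
    since = parsedate_to_datetime(if_modified_since)
    # HTTP dates have one-second resolution, so drop sub-second precision.
    if last_modified.replace(microsecond=0) > since:
        return 200  # modified later than the cached copy: send the new file
    return 304      # not modified: the browser may reuse its cached copy
```

Truncating to whole seconds also shows why a second change within the same second goes unnoticed: both versions carry the same timestamp.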
Disadvantages:
- If the file is opened or re-saved without its content changing, its modification time may still be updated; the server then treats the resource as changed and sends the same content again
- Last-Modified has only one-second resolution. If the file is modified more than once within a second, the server does not notice and returns 304 instead of the new resource (the same limitation as Expires)
2. ETag
ETag is a unique identifier for the current version of a resource file, generated by the server and returned in the response. Whenever the resource file changes, the ETag is generated again.
The next time the browser requests the resource, it sends the previous ETag in the If-None-Match request header. The server compares If-None-Match with the resource's current ETag to decide whether to return the resource or 304.
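The exchange above can be illustrated with a short Python sketch. It assumes the ETag is a truncated SHA-256 hash of the file content, which is only one possible scheme (servers are free to generate ETags differently); `make_etag` and `respond` are hypothetical names.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the content (an assumed scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, str, bytes]:
    """Return (status, etag, payload) for a request with If-None-Match."""
    etag = make_etag(body)
    if if_none_match is not None:
        # The header may list several tags; ignore the weak "W/" prefix.
        tags = [t.strip() for t in if_none_match.split(",")]
        tags = [t[2:] if t.startswith("W/") else t for t in tags]
        if "*" in tags or etag in tags:
            return 304, etag, b""  # unchanged: no body is sent
    return 200, etag, body
```

Since the tag depends only on the content, re-saving an unchanged file yields the same ETag, which avoids the false "modified" result that Last-Modified can produce.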
3. Comparing the two
- ETag is more accurate
- Last-Modified is cheaper to produce, because the ETag hash has to be calculated from the content of the file
- ETag has higher priority: when both are present, the server checks ETag first
6. Caching mechanism
The strong cache takes precedence over the negotiated cache. If the strong cache is still valid, the cached resource is used directly. Otherwise, the negotiated cache is used and the server decides whether the cached copy can be used.
If no caching policy is set, the browser falls back to a heuristic: the freshness lifetime is taken as 10% of (Date - Last-Modified).
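The decision order and the 10% heuristic can be summarized in a short Python sketch. The names `freshness_lifetime` and `next_step` are hypothetical, and the logic is simplified to the rules stated in this section.

```python
from datetime import datetime, timedelta
from typing import Optional


def freshness_lifetime(max_age: Optional[int], date: datetime,
                       last_modified: Optional[datetime]) -> timedelta:
    """Explicit max-age wins; otherwise use 10% of (Date - Last-Modified)."""
    if max_age is not None:
        return timedelta(seconds=max_age)
    if last_modified is not None:
        return (date - last_modified) / 10
    return timedelta(0)  # nothing to go on: treat as immediately stale


def next_step(age: timedelta, lifetime: timedelta) -> str:
    """Strong cache first; fall back to negotiation once the copy is stale."""
    return "strong cache" if age < lifetime else "negotiate"
```

For example, a response whose Last-Modified is ten days before its Date gets a heuristic lifetime of one day.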
7. Actual usage scenarios
1. Frequently changing resources
Use the negotiated cache: set Cache-Control: no-cache so that the browser checks with the server on every request. When the resource is unchanged, the server answers 304 without a body, which significantly reduces the size of the response data.
2. Resources that are not constantly changing
Use the strong cache: set max-age to a long period (such as one year), during which requests for the same URL are served from the cache. To publish an update before max-age has expired, change the URL by adding a hash, a version number, or similar information to it.
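One common way to put a hash into the URL is to derive it from the file content, so the URL changes exactly when the file does. The Python sketch below shows the idea; `hashed_name`, the 8-character digest length, and the sample path are illustrative assumptions (build tools usually do this automatically).

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(path: str, content: bytes) -> str:
    """Insert a short content hash before the file extension."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:8]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

A file such as `static/app.js` is then served under a name of the form `static/app.<hash>.js`; new content produces a new name, so browsers fetch it even though the old URL is still within its max-age.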
8. The impact of user behavior on the browser cache
1. Normal reload
If the cache is hit, the cached resource is used directly.
2. Hard reload
Only the shallow layer of the cache is cleared; some cached data remains.
3. Empty cache and hard reload
All caches are cleared, for this page and for other pages alike, and every resource is fetched again.