Until now I could only describe the browser cache in general terms and could not explain the underlying principles. After being stumped by it in two front-end interviews, I looked up some material to make up for it and finally gained a deeper understanding. Without further ado, let's look at how the browser cache works. If anything here is wrong, please don't hesitate to point it out in the comments.
This article mainly covers caching on the browser side. The benefit of caching is self-evident: it can greatly improve page performance and the user experience.
1. Browser cache
To use a cache, the browser first has to fetch the resource once; the information returned with it then tells the client how to cache the resource. Whether strong cache or negotiated cache applies depends on the headers of that response. To get a sense of how the browser cache works, here are two diagrams.
The first time the browser requests:
When the browser makes a subsequent request:
As you can see from the figure above, there are two types of browser cache: strong cache (also called local cache) and negotiated cache.
- When a browser requests a resource, it first reads the header information stored with the cached copy to determine whether the strong cache (the Cache-Control and Expires information) is hit. If it is, the browser takes the resource, including its response headers, directly from the cache, and the request does not communicate with the server at all. In Firebug you can inspect what a strong cache hit returns, such as this strongly cached JS file viewed in a local Firebug session
- If the strong cache is not hit, the browser sends a request to the server carrying header fields derived from the first response (Last-Modified/If-Modified-Since and ETag/If-None-Match). The server uses these request headers to decide whether the negotiated cache is hit. If it is, the server returns new response headers that update the corresponding headers in the cache but does not return the resource content, which tells the browser to take the resource directly from the cache; otherwise, the latest resource content is returned
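The two steps above can be sketched from the browser's point of view. This is a minimal illustration, not how any real browser is implemented: `CachedEntry`, `plan_request` and `apply_response` are hypothetical names, and the `fresh` flag stands in for the Cache-Control / Expires check described later.

```python
from dataclasses import dataclass, field


@dataclass
class CachedEntry:
    """A cached response: the body plus the headers saved from the first request."""
    body: bytes
    headers: dict = field(default_factory=dict)
    fresh: bool = False  # assumed outcome of the Cache-Control / Expires check


def plan_request(entry):
    """Return (action, request_headers) for the next request to a resource."""
    if entry is None:
        # Nothing cached yet: plain request, the server answers 200 with the body.
        return "fetch", {}
    if entry.fresh:
        # Strong cache hit: no request leaves the browser.
        return "use_cache", {}
    # Strong cache missed: revalidate with whatever validators were stored.
    # With no stored validators this degrades to an unconditional request.
    conditional = {}
    if "ETag" in entry.headers:
        conditional["If-None-Match"] = entry.headers["ETag"]
    if "Last-Modified" in entry.headers:
        conditional["If-Modified-Since"] = entry.headers["Last-Modified"]
    return "revalidate", conditional


def apply_response(entry, status, headers, body=b""):
    """Merge the server's answer into the cache and return the entry to use."""
    if status == 304 and entry is not None:
        # Negotiated cache hit: keep the cached body, refresh the stored headers.
        entry.headers.update(headers)
        return entry
    # Otherwise (e.g. 200): replace the cached copy with the new content.
    return CachedEntry(body=body, headers=dict(headers))
```

The point of the sketch is the ordering: the strong cache is checked first and involves no network traffic, and only when it misses are the stored validators sent back to the server.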
The difference between strong cache and negotiated cache can be described in the following table:
Cache type | How the resource is obtained | Status code | Is a request sent to the server? |
Strong cache | From the cache | 200 (from cache) | No, the cache is read directly |
Negotiated cache | From the cache | 304 (Not Modified) | Yes; as the name suggests, the server tells the browser whether the cache is usable |
2. Header fields related to strong cache
As described above, strong caching serves resources directly from the cache without going through the server. Two header fields are associated with strong caching:
- Expires: this comes from the HTTP/1.0 specification. Its value is an absolute time string in GMT format, such as Wed, 10 Jun 2015 21:31:12 GMT. If the request is sent before the Expires time, the local cache is still valid; otherwise a request is sent to the server to retrieve the resource
- Cache-Control: max-age=number: this is an HTTP/1.1 header that relies mainly on the max-age value, which is relative: the number of seconds counted from when the resource was first requested. If the current request is made within max-age seconds of that first request, the cache is hit; otherwise it is not. Besides max-age, Cache-Control has several other common settings:
- no-cache: the local copy is not used directly. The browser must negotiate with the server to verify whether the response has changed. If the previous response carried an ETag, the request is validated against the server with it, and if the resource has not changed the re-download is avoided.
- no-store: forbids the browser from caching the data at all. Every time the user requests the resource, a request is sent to the server and the complete resource is downloaded.
- public: the response can be cached by everyone, including end users' browsers and intermediate proxy servers such as CDNs.
- private: the response can be cached only by the end user's browser and must not be cached by intermediate cache servers such as CDNs.
Note: If both Cache-Control and Expires are present, Cache-Control takes precedence over Expires
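As a rough illustration of the strong cache rules in this section, the sketch below decides whether a stored response can be reused without contacting the server. The function names and the `stored_at` / `now` parameters are assumptions made for the example, and it deliberately leaves out things real browsers also do, such as heuristic freshness when neither header is present.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument-or-None} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def is_strong_cache_hit(headers, stored_at, now):
    """Decide whether a stored response may be reused without a request.

    headers   -- response headers saved with the cached copy
    stored_at -- timezone-aware datetime when the response was received
    now       -- timezone-aware datetime of the current request
    """
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in cc or "no-cache" in cc:
        # no-store: nothing should be cached; no-cache: must revalidate first.
        return False
    if cc.get("max-age") is not None:
        # Cache-Control takes precedence, so Expires is not consulted here.
        try:
            max_age = int(cc["max-age"])
        except ValueError:
            return False
        return (now - stored_at) < timedelta(seconds=max_age)
    if "Expires" in headers:
        try:
            return now < parsedate_to_datetime(headers["Expires"])
        except (TypeError, ValueError):
            return False  # an unparseable date is treated as already expired
    return False
```

For example, a response stored with `Cache-Control: max-age=60` is a hit 30 seconds later and a miss after 61 seconds, regardless of what Expires says.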
3. Header fields related to negotiated cache
With negotiated cache, it is always the server that decides whether the cached resource is usable. The client and server therefore need to communicate through some kind of identifier so that the server can tell whether the requested resource may be served from the cache. This mainly involves the two pairs of header fields below, which always appear in pairs: if the response to the first request includes a Last-Modified or ETag field, subsequent requests carry the corresponding request field (If-Modified-Since or If-None-Match); if the response includes neither, the request headers will not contain the corresponding fields either.
- Last-Modified/If-Modified-Since
Both values are time strings in GMT format.
- The first time a browser requests a resource, the server returns the resource and adds a Last-Modified header to the response, indicating when the resource was last modified on the server
- When the browser requests the resource again, it adds an If-Modified-Since header to the request, whose value is the Last-Modified value returned by the previous response
- When the server receives the request, it compares If-Modified-Since with the time the resource was last modified on the server to determine whether the resource has changed. If it has not changed, the server returns 304 Not Modified without the resource content; if it has changed, the resource content is returned as normal. When the server returns a 304 Not Modified response, it typically does not add the Last-Modified header, because the resource has not changed and so Last-Modified has not changed either. This is the response header when the server returns 304
- When the browser receives the 304 response, it loads the resource from the cache
- If the negotiated cache is not hit, the browser loads the resource from the server, and the Last-Modified header is updated with that response; on the next request, If-Modified-Since carries the Last-Modified value returned the previous time
- ETag/If-None-Match: both values are unique identifier strings generated by the server for each resource, and they change whenever the resource changes. The validation process is similar to that of Last-Modified/If-Modified-Since. Unlike Last-Modified, when the server returns a 304 Not Modified response, the ETag has been regenerated, so the response headers still include it, even if it is identical to the previous one.
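The server side of both negotiation schemes can be sketched as follows. This is a simplified illustration under stated assumptions: the two function names are hypothetical, `mtime` is expected to be a timezone-aware UTC datetime, ETags are compared as exact strings (weak `W/` validators are not handled), and the 304 responses follow the header behaviour described above.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond_with_last_modified(request_headers, mtime, body):
    """Answer using Last-Modified / If-Modified-Since (one-second granularity)."""
    mtime = mtime.replace(microsecond=0)  # HTTP dates carry whole seconds only
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            if mtime <= parsedate_to_datetime(since):
                # Not changed since the client's copy: no body, no Last-Modified.
                return 304, {}, b""
        except (TypeError, ValueError):
            pass  # ignore a malformed date and send the full response
    return 200, {"Last-Modified": format_datetime(mtime, usegmt=True)}, body


def respond_with_etag(request_headers, etag, body):
    """Answer using ETag / If-None-Match."""
    raw = request_headers.get("If-None-Match", "")
    candidates = [tag.strip() for tag in raw.split(",")]
    if etag in candidates or "*" in candidates:
        # The 304 still carries the ETag, even though it is unchanged.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body
```

Each function returns a `(status, headers, body)` tuple, so a 304 can be recognised by its empty body and the absence of resource content.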
4. With Last-Modified already available, why ETag?
You might think Last-Modified is enough to let the browser know whether its locally cached copy is fresh enough, so why is ETag needed? ETag was introduced in HTTP/1.1 mainly to solve several problems that are hard to handle with Last-Modified:
- Some files are rewritten periodically without their content changing (only the modification time changes). In this case we do not want the client to think the file has been modified and GET it again;
- Some files are modified very frequently, for example several times within one second (say N times in 1s). If-Modified-Since can only check at one-second granularity, so such changes cannot be detected (or, put differently, the UNIX mtime record is only accurate to the second);
- Some servers cannot obtain the exact last modification time of a file.
In these cases, ETag allows more accurate cache control, because an ETag is a unique identifier for the resource on the server side, either generated automatically by the server or produced by the developer.
Last-Modified and ETag can be used together. The server validates the ETag first; if it matches, the server goes on to compare Last-Modified, and only then decides whether to return 304.
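To make the first two problems concrete, here is a small sketch that derives an ETag from the file content. Hashing the content is only one possible scheme and is an assumption of this example (servers are free to generate ETags differently); `make_etag` and the sample records are hypothetical.

```python
import hashlib


def make_etag(content):
    """Derive an ETag from the bytes of the resource (an assumed scheme)."""
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'


# Problem 1: the file is rewritten with identical content, so only mtime moves.
before = {"mtime": 1000, "content": b"body { color: red }"}
after = {"mtime": 2000, "content": b"body { color: red }"}
same_etag = make_etag(before["content"]) == make_etag(after["content"])  # True

# Problem 2: two different versions are written within the same second.
v1 = {"mtime": 3000, "content": b"var n = 1;"}
v2 = {"mtime": 3000, "content": b"var n = 2;"}
same_mtime = v1["mtime"] == v2["mtime"]  # True: Last-Modified cannot tell them apart
different_etag = make_etag(v1["content"]) != make_etag(v2["content"])  # True
```

In the first case a Last-Modified comparison would force a needless re-download while the ETag still matches; in the second, Last-Modified would wrongly report "not modified" while the ETag detects the change.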
5. The impact of user behavior on cache
This diagram, borrowed from the web, roughly summarizes the impact of user behavior on the cache
6. How to reload cached resources under strong cache
With a strong cache, the browser does not send a request to the server; it keeps reading the resource from the cache for as long as the configured cache time lasts. If the resource changes during that period, the browser will not get the latest version until the cache expires.
The fix is to update the resource path referenced in the page, so that the browser abandons the cached copy on its own and loads the new resource.
As shown below:
This way a new query value is generated every time the file changes, so the page references a different resource path. Because the request path has changed, the browser ignores the previously cached resource.
For details, I highly recommend Zhang Yunlong's answer on Zhihu: www.zhihu.com/question/20…
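One way to generate such a query value is to hash the file content, so the value changes exactly when the file does. The sketch below is only an illustration: the function name `versioned_url`, the parameter name `v` and the 8-character length are assumptions, and build tools usually do this step for you.

```python
import hashlib


def versioned_url(path, content, length=8):
    """Append a short content hash to the path as a query value."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    # Use "&" if the path already has a query string, "?" otherwise.
    separator = "&" if "?" in path else "?"
    return f"{path}{separator}v={digest}"


# Same content gives the same URL, so the strong cache keeps being used;
# changed content gives a new URL, so the browser requests the new file.
old_url = versioned_url("/static/app.js", b"console.log(1)")
new_url = versioned_url("/static/app.js", b"console.log(2)")
```

Because the hash depends only on the content, redeploying an unchanged file does not invalidate users' caches.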