Web caching is a technique for saving a copy of a Web resource and using that copy directly on the next request.

Web caches can be divided into the following types: browser cache, CDN cache, server cache, and database cache. Because a cached copy can be used directly without sending a request, or the client can simply confirm that the resource has not changed without the entity being retransmitted, Web caching reduces latency and speeds up page loads, reuses resources to save network bandwidth, and lowers server load by reducing the number of requests or the amount of content transferred.

This article focuses on the browser HTTP caching mechanism, which is the part most relevant to front-end development. Browser HTTP caching can be divided into strong caching and negotiated caching. The most fundamental difference between them is whether a request is sent. On a strong cache hit, no request is sent to the server (for example, "200 (from memory cache)" in Chrome). With negotiated caching, a request must be sent to the server to verify whether the cached copy is still valid. If it is, the server still responds, but instead of returning the resource entity it tells the client that the resource can be loaded from the cache (304 Not Modified). The brief flow chart is as follows:

The browser HTTP cache is controlled by header fields of HTTP messages

Fields that control the strong cache, in order of priority

  1. Pragma: Pragma is a general header field inherited from HTTP/1.0 and kept in HTTP/1.1 only for backward compatibility. Although it is a general header, its behavior in a response is not specified and depends on the browser implementation. The RFC defines only one value for it, no-cache, which tells the browser not to use the cached copy directly and to send a request to the server to check freshness. Because it has the highest priority, the strong cache cannot be hit when it is present.

  2. Cache-Control: Cache-Control is a general header field used in HTTP/1.1 to control caching. The response directives relevant to the browser cache are:

Directive | Parameter | Description
private | Optional | The response may be cached only for a single user, not by a shared cache (that is, a proxy server must not cache it)
public | None | The response may be cached by any cache, including the requesting client, proxy servers, and so on
no-cache | Optional | The cached response must be revalidated with the server before it is used
no-store | None | No part of the request or response is cached
max-age=[s] | Required | Maximum age of the response, in seconds
  • max-age (unit: seconds) sets how long the cached response stays fresh, relative to the time the request was sent. A strong cache hit is possible only if the response header sets Cache-Control with a non-zero max-age, or sets Expires to a date later than the request date. When that condition is met, the response's Cache-Control contains neither no-cache nor no-store, and the request header has no Pragma field, the strong cache is actually hit. All images below are screenshots of a refresh (Command+R).

  • no-cache means the cached copy must be validated with the server before it can be used (the negotiated cache). Whether it appears in the response header or the request header, the strong cache will not be hit. A Chrome hard reload (Command+Shift+R) adds Pragma: no-cache and Cache-Control: no-cache to the request headers.
  • no-store forbids the browser and all intermediate caches from storing any version of the response. Neither the strong cache nor the negotiated cache can be used. It is suited to private personal data or financial data.
  • public indicates that the response can be cached by the browser, CDNs, and so on.
  • private indicates that the response is intended for a private cache only and cannot be cached by CDNs and similar shared caches. If HTTP authentication is required, the response is automatically treated as private.
  3. Expires: Expires is a response header field that specifies the date/time after which the cached response is considered stale. An invalid date such as 0 means the resource has already expired. If the Cache-Control response header also sets max-age, Expires is ignored. It is also a header field inherited from HTTP/1.0 and is kept in HTTP/1.1 only for backward compatibility.
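
The priority rules above can be sketched as a single freshness check. This is only an illustration of the order described in this article, not how any browser is actually implemented. The function names `parse_cache_control` and `is_strong_cache_hit` are hypothetical, headers are assumed to be plain dicts with canonical casing, and `stored_at` and `now` are assumed to be timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument-or-None} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def is_strong_cache_hit(request_headers, response_headers, stored_at, now):
    """Return True if the stored response may be reused without any request."""
    req_cc = parse_cache_control(request_headers.get("Cache-Control", ""))
    res_cc = parse_cache_control(response_headers.get("Cache-Control", ""))

    # 1. Pragma in the request rules out a strong hit.
    if "Pragma" in request_headers:
        return False
    # 2. no-cache / no-store on either side rules out a strong hit.
    if {"no-cache", "no-store"} & (req_cc.keys() | res_cc.keys()):
        return False
    # 3. max-age wins over Expires when both are present.
    if "max-age" in res_cc:
        try:
            max_age = int(res_cc["max-age"])
        except (TypeError, ValueError):
            return False
        return now - stored_at < timedelta(seconds=max_age)
    # 4. Fall back to Expires; an invalid date such as "0" counts as expired.
    expires = response_headers.get("Expires")
    if expires:
        try:
            return now < parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            return False
    return False


if __name__ == "__main__":
    stored = datetime(2024, 1, 1, tzinfo=timezone.utc)
    later = stored + timedelta(seconds=30)
    print(is_strong_cache_hit({}, {"Cache-Control": "max-age=60"}, stored, later))
```

Freshness is measured here from the time the response was stored, which is a simplification. Real caches also take the Age and Date headers into account.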

Fields that control the negotiated cache

  1. Last-Modified/If-Modified-Since: If-Modified-Since is a request header field and can only be used in GET or HEAD requests. Last-Modified is a response header field containing the date and time at which the server believes the resource was last modified. When a request carries the If-Modified-Since header, the server compares it with the resource's Last-Modified value: if Last-Modified is earlier than or equal to If-Modified-Since, the server returns a 304 response without a body; otherwise it returns the resource.

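If-Modified-Since: <day-name>, <day> <month> <year> <hour>:<minute>:<second> GMT
Last-Modified: <day-name>, <day> <month> <year> <hour>:<minute>:<second> GMT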
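
The server-side comparison can be sketched as follows. This is a minimal illustration that assumes the server knows the resource's modification time as a UTC-aware datetime. The function name `respond_if_modified_since` and the `(status, headers, body)` return shape are made up for this example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond_if_modified_since(request_headers, last_modified, body):
    """Return (status, headers, body), honouring If-Modified-Since."""
    # HTTP dates have one-second resolution, so drop microseconds first.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    since = request_headers.get("If-Modified-Since")
    if since:
        try:
            # Not modified after the client's copy: answer 304 with no body.
            if last_modified <= parsedate_to_datetime(since):
                return 304, headers, b""
        except (TypeError, ValueError):
            pass  # unparseable date: ignore the condition and send the resource
    return 200, headers, body


if __name__ == "__main__":
    modified = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    first = respond_if_modified_since({}, modified, b"hello")
    again = respond_if_modified_since(
        {"If-Modified-Since": first[1]["Last-Modified"]}, modified, b"hello"
    )
    print(first[0], again[0])
```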

  2. ETag/If-None-Match: ETag is a response header field, generated by the server, whose value is a string (typically a hash of the entity content) that identifies the state of the resource. If-None-Match is a conditional request header. When it is added to a request for a resource, its value comes from the ETag that the server returned earlier for that resource. The server returns a 200 response with the requested resource entity if and only if no resource on the server has an ETag matching a value listed in this header; otherwise it returns a 304 response without an entity. ETag has higher priority than Last-Modified: when both are present, ETag prevails.

If-None-Match: <etag_value>
If-None-Match: <etag_value>, <etag_value>, …
If-None-Match: *

ETag values in If-None-Match are compared with the weak comparison algorithm: two files can be considered identical if their content is equivalent, not only if they are identical byte for byte. For example, two pages are considered the same if they differ only in the generation time shown in the footer.
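
A rough sketch of the If-None-Match check with weak comparison is shown below. It assumes the ETag is derived from a SHA-256 hash of the body, which is one common choice rather than something HTTP requires. The helper names (`make_etag`, `weak_match`, `respond_if_none_match`) are hypothetical, and splitting the header on commas is a simplification that would break on tags containing commas.

```python
import hashlib


def make_etag(body):
    """Build a quoted ETag from a hash of the body (an assumed scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def strip_weak_prefix(tag):
    """Remove the W/ marker that flags a weak validator."""
    return tag[2:] if tag.startswith("W/") else tag


def weak_match(a, b):
    """Weak comparison: tags match if equal once any W/ prefix is ignored."""
    return strip_weak_prefix(a) == strip_weak_prefix(b)


def respond_if_none_match(request_headers, body):
    """Return (status, headers, body), honouring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag}

    condition = request_headers.get("If-None-Match")
    if condition is not None:
        candidates = [t.strip() for t in condition.split(",") if t.strip()]
        # "*" matches any existing representation of the resource.
        if "*" in candidates or any(weak_match(etag, t) for t in candidates):
            return 304, headers, b""
    return 200, headers, body


if __name__ == "__main__":
    status, headers, _ = respond_if_none_match({}, b"page")
    revalidated = respond_if_none_match({"If-None-Match": headers["ETag"]}, b"page")
    print(status, revalidated[0])
```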

Because of how ETag works, it has some advantages over Last-Modified:

  1. In some cases, the server cannot obtain the last modification time of the resource.
  2. If the resource is modified very frequently, within a second or less, Last-Modified cannot tell the versions apart because it is only accurate to the second.

The whole process
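
The whole flow (strong cache first, then negotiated cache, then a full response) can be sketched as a toy client-side cache. This is a simplified illustration under stated assumptions, not a model of a real browser: the class `TinyCache`, the helper `max_age_of`, and the `origin` callable that stands in for the server are all hypothetical, and only max-age, no-cache, no-store, ETag, and Last-Modified are considered.

```python
from datetime import datetime, timedelta, timezone


def max_age_of(headers):
    """Return max-age in seconds, or None if absent or overridden."""
    directives = [d.strip().lower() for d in headers.get("Cache-Control", "").split(",")]
    if "no-cache" in directives or "no-store" in directives:
        return None
    for directive in directives:
        if directive.startswith("max-age="):
            try:
                return int(directive[len("max-age="):])
            except ValueError:
                return None
    return None


class TinyCache:
    """Toy cache: origin is a callable(url, request_headers) -> (status, headers, body)."""

    def __init__(self, origin):
        self.origin = origin
        self.entries = {}  # url -> (stored_at, headers, body)

    def get(self, url, now):
        """Return (source, body), where source says how the body was obtained."""
        entry = self.entries.get(url)
        request_headers = {}
        if entry:
            stored_at, headers, body = entry
            max_age = max_age_of(headers)
            # Step 1: strong cache. Still fresh, so no request is sent at all.
            if max_age is not None and now - stored_at < timedelta(seconds=max_age):
                return "strong cache", body
            # Step 2: negotiated cache. Send the validators along with the request.
            if "ETag" in headers:
                request_headers["If-None-Match"] = headers["ETag"]
            if "Last-Modified" in headers:
                request_headers["If-Modified-Since"] = headers["Last-Modified"]

        status, new_headers, new_body = self.origin(url, request_headers)
        if status == 304 and entry:
            # The server confirmed the copy: refresh its age and reuse the body.
            self.entries[url] = (now, {**entry[1], **new_headers}, entry[2])
            return "negotiated cache (304)", entry[2]
        # Step 3: full response. Store it unless no-store forbids that.
        if "no-store" not in new_headers.get("Cache-Control", "").lower():
            self.entries[url] = (now, new_headers, new_body)
        return "network (200)", new_body


if __name__ == "__main__":
    def demo_origin(url, request_headers):
        headers = {"Cache-Control": "max-age=60", "ETag": '"v1"'}
        if request_headers.get("If-None-Match") == '"v1"':
            return 304, headers, b""
        return 200, headers, b"hello"

    cache = TinyCache(demo_origin)
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    for offset in (0, 10, 61):
        print(cache.get("/index.html", start + timedelta(seconds=offset))[0])
```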
