Why is there an HTTP cache?

HTTP caching exists because of a performance bottleneck in the web front end: HTTP transfers take too long. Reducing the time spent on HTTP requests makes pages load noticeably faster and improves the user experience.

HTTP cache headers

HTTP caching comes in two kinds: strong caching (Cache-Control and Expires) and negotiated caching (ETag and Last-Modified).

The difference is that when a strong cache is hit, the resource is read directly from the cache without contacting the server, whereas a negotiated cache asks the server to confirm whether the cached resource is still valid. This is also the order of cache validation: the strong cache is checked first, and the negotiated cache is used afterwards.

The following is a brief description of the headers used by the two cache types.

Cache-Control

Cache-Control specifies how long a response remains valid. Its max-age directive expresses that period as a relative time in seconds.

Common Cache-Control values:

Value               Description
public              The response may be cached by any cache, including the client that sent the request, proxy servers, and so on.
private             The response may be cached only for a single user and must not be stored by a shared cache (that is, a proxy server cannot cache it).
no-cache            Forces the cache to submit the request to the origin server for validation before releasing the cached copy.
no-store            The cache must not store anything about the client request or the server response.
max-age=<seconds>   Sets the maximum time, in seconds, for which the response is considered fresh; after that it is expired. Unlike Expires, the time is relative to the time of the request.
must-revalidate     The cache must verify the status of a stale resource before using it, and it must not use expired resources.
proxy-revalidate    Same effect as must-revalidate, but applies only to shared caches (such as proxies) and is ignored by private caches.

For more detailed information, please refer to this link: developer.mozilla.org/zh-CN/docs/…
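
As a rough illustration of how these values are combined in one header, the sketch below splits a Cache-Control value into its directives. The function name `parse_cache_control` is hypothetical, and the parsing is deliberately simplified: it assumes no directive argument contains a comma inside quotes.

```python
from typing import Dict, Optional


def parse_cache_control(header_value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into a {directive: argument} mapping."""
    directives: Dict[str, Optional[str]] = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directive names are case-insensitive; arguments may be quoted.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


parsed = parse_cache_control("public, max-age=3600, must-revalidate")
print(parsed)
```

Directives without an argument (such as public) map to None, while max-age keeps its value as a string that the caller can convert to seconds.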

Expires

The Expires response header contains the date/time after which the response is considered expired.

The Expires value is an absolute time.

Expires: Wed, 21 Oct 2015 07:28:00 GMT

ETag

The ETag HTTP response header is an identifier for a specific version of a resource. It makes caching more efficient and saves bandwidth, because the web server does not need to send the full response if the content has not changed. ETags also help prevent simultaneous updates of a resource from overwriting each other ("mid-air collisions").

The ETag value is typically a hash generated from the content of the resource file.

ETag: "33a64df551425fcc55e4d42a148795d9f25f89d4"
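
Servers are free to choose how they generate ETags. One common approach, assumed here purely for illustration, is to hash the response body; the helper name `make_etag` is hypothetical.

```python
import hashlib


def make_etag(content: bytes) -> str:
    """Return the SHA-1 hex digest of the body, wrapped in double quotes."""
    return '"' + hashlib.sha1(content).hexdigest() + '"'


original = make_etag(b"<h1>hello</h1>")
unchanged = make_etag(b"<h1>hello</h1>")
edited = make_etag(b"<h1>hello!</h1>")

print(original == unchanged)  # True: same bytes, same ETag
print(original == edited)     # False: any content change yields a new ETag
```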

Last-Modified

Last-Modified contains the date and time at which the origin server believes the resource was last modified. It is used as a validator to determine whether a received or stored resource is the same as the current one. Because it is less accurate than ETag, it serves as a fallback mechanism.

The Last-Modified value is an absolute time.

Last-Modified: Wed, 21 Oct 2015 07:28:00 GMT
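
Expires and Last-Modified share the same date format. As a small sketch, Python's standard `email.utils` module can parse such a value and format it back; nothing here is specific to any particular server or browser.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Parse the header value into a timezone-aware datetime.
last_modified = parsedate_to_datetime("Wed, 21 Oct 2015 07:28:00 GMT")
print(last_modified.isoformat())  # 2015-10-21T07:28:00+00:00

# Format a timezone-aware UTC datetime back into the header format.
header_value = format_datetime(
    datetime(2015, 10, 21, 7, 28, tzinfo=timezone.utc), usegmt=True
)
print(header_value)  # Wed, 21 Oct 2015 07:28:00 GMT
```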

HTTP cache validation rules

The validation order of the HTTP cache

  1. Use Cache-Control, which takes precedence over Expires, to determine whether the resource has expired.
  2. If the resource has not expired, it is read directly from the cache. If it has expired, the negotiated cache is used.
  3. The negotiated cache (ETag takes precedence over Last-Modified) sends a request asking the server whether the resource has been modified, adding If-None-Match or If-Modified-Since to the request headers.
  4. If the resource has not been modified, the server returns a 304 status code, telling the client that its cached copy can be used. If the resource has been modified, the server returns a 200 status code along with the new resource.
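
The client side of these steps can be sketched as follows. `CachedEntry` and `plan_request` are hypothetical names, and the expiry time is assumed to have been computed already. The sketch sends both validators when both are stored, leaving the priority between them to the server.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachedEntry:
    expires_at: float                    # epoch seconds after which the copy is stale
    etag: Optional[str] = None
    last_modified: Optional[str] = None


def plan_request(entry: CachedEntry, now: float) -> Optional[Dict[str, str]]:
    """Return None to serve from cache, else headers for a conditional request."""
    if now < entry.expires_at:
        return None  # strong cache hit: no request is sent
    headers: Dict[str, str] = {}
    if entry.etag is not None:
        headers["If-None-Match"] = entry.etag
    if entry.last_modified is not None:
        headers["If-Modified-Since"] = entry.last_modified
    return headers


entry = CachedEntry(
    expires_at=100.0,
    etag='"abc"',
    last_modified="Wed, 21 Oct 2015 07:28:00 GMT",
)
print(plan_request(entry, now=50.0))   # None: still fresh
print(plan_request(entry, now=150.0))  # conditional request headers
```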

The flow is illustrated here:

www.cnblogs.com/xiaohuochai…

HTTP cache validation explained

The following explanation assumes that the response carries all four cache headers.

  1. To decide whether the cache has expired, first check whether Cache-Control sets max-age or s-maxage. If it does, Expires is ignored and that value is used; otherwise Expires is checked. In fact, according to the MDN documentation, Last-Modified can also be used to calculate a cache lifetime.

MDN describes it as follows:

The cache lifetime is calculated from several headers. If Cache-Control: max-age=N is specified, the cache lifetime is N. If this header is not present, which is often the case, the Expires header is checked; if it exists, the lifetime is the Expires value minus the value of the Date header. If neither max-age nor Expires is present, look for a Last-Modified header. If it exists, the cache lifetime is equal to the value of the Date header minus the Last-Modified value, divided by 10.

The cache expiration time is calculated as:

expirationTime = responseTime + freshnessLifetime - currentAge
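
The lifetime rules and the formula above can be sketched as below. The function names are hypothetical, header names are assumed to appear with exactly this capitalization, and all times are in seconds.

```python
import re
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


def freshness_lifetime(headers: Dict[str, str]) -> Optional[float]:
    """Lifetime in seconds: max-age, then Expires, then the Last-Modified heuristic."""
    match = re.search(r"\bmax-age=(\d+)", headers.get("Cache-Control", ""), re.IGNORECASE)
    if match:
        return float(match.group(1))
    if "Date" not in headers:
        return None  # this sketch needs Date for the two fallbacks
    date = parsedate_to_datetime(headers["Date"])
    if "Expires" in headers:
        return (parsedate_to_datetime(headers["Expires"]) - date).total_seconds()
    if "Last-Modified" in headers:
        modified = parsedate_to_datetime(headers["Last-Modified"])
        return (date - modified).total_seconds() / 10
    return None


def expiration_time(response_time: float, lifetime: float, current_age: float) -> float:
    """expirationTime = responseTime + freshnessLifetime - currentAge."""
    return response_time + lifetime - current_age


heuristic = freshness_lifetime({
    "Date": "Wed, 21 Oct 2015 07:28:00 GMT",
    "Last-Modified": "Wed, 21 Oct 2015 06:28:00 GMT",
})
print(heuristic)  # 360.0 (one hour since modification, divided by 10)
```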

  2. If the cache has not expired, no request is sent to the server and the resource is read directly from the cache. In the browser's network panel this shows up as status code 200 (from memory cache) or 200 (from disk cache). If the cache has expired, the negotiated cache is used.
  • The browser first tries to read from memory, then from disk.
  • 200 from memory cache: the server is not accessed. The resource was loaded earlier and cached in the browser's memory, and it is read directly from memory. When the browser is closed the data no longer exists (the memory has been freed), so when the same page is opened again, "from memory cache" will not appear.
  • 200 from disk cache: the server is not accessed. The resource was loaded at some earlier time and is read directly from disk. The data still exists after the browser is closed.
  3. The negotiated cache verifies whether the cached resource has been modified. ETag takes precedence over Last-Modified: ETag is used when present, otherwise Last-Modified. The request carries the corresponding header:
  • Requests validated by ETag carry If-None-Match.
  • Requests validated by Last-Modified carry If-Modified-Since.
  4. If the server confirms that the resource has not been modified, it returns 304 (Not Modified), telling the client to keep using its cache. Otherwise it returns 200 with the new resource.
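
The server side of the negotiation can be sketched as below. The function name `validate` is hypothetical, and the ETag comparison is a plain string match that ignores weak validators, which a real server would need to handle.

```python
from email.utils import parsedate_to_datetime
from typing import Dict


def validate(request_headers: Dict[str, str], etag: str, last_modified: str) -> int:
    """Return 304 if the client's copy is still current, else 200."""
    if "If-None-Match" in request_headers:
        # ETag takes precedence: If-Modified-Since is not consulted at all.
        candidates = [tag.strip() for tag in request_headers["If-None-Match"].split(",")]
        return 304 if etag in candidates or "*" in candidates else 200
    if "If-Modified-Since" in request_headers:
        since = parsedate_to_datetime(request_headers["If-Modified-Since"])
        return 304 if parsedate_to_datetime(last_modified) <= since else 200
    return 200  # unconditional request: send the full resource


current_etag = '"v2"'
current_modified = "Wed, 21 Oct 2015 07:28:00 GMT"

print(validate({"If-None-Match": '"v2"'}, current_etag, current_modified))  # 304
print(validate({"If-None-Match": '"v1"'}, current_etag, current_modified))  # 200
```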

HTTP caching originally relied on Expires and Last-Modified, so why were Cache-Control and ETag added?

First, consider Expires and Cache-Control. The Expires value is an exact point in time. It is compared against the Date value in the response headers, but if no Date header is returned, it is compared against the client's local time. Local time can be off for various reasons, so the cache may not expire when intended. Cache-Control's max-age is a relative value, which makes expiry control more reliable, and the other Cache-Control values give us more flexibility in controlling the cache.

As for Last-Modified and ETag, the Last-Modified value is also an exact time, accurate to the second. Using time to decide whether a resource has been modified can cause the following problems:

  1. Changes made within the same second may not be detected, so the cache may not be updated.
  2. The resource may have been re-saved without its content actually changing, yet its Last-Modified time has changed.

ETag was introduced to address these problems. It determines whether a resource has been updated by comparing a hash generated from the resource content, so its control granularity is finer than that of Last-Modified.
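
The first problem can be reproduced in a few lines. The sketch below assumes a content-hash ETag and uses a hypothetical helper, `validators`, to build both validators for two versions of a file saved within the same second.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Tuple


def validators(content: bytes, modified_at: datetime) -> Tuple[str, str]:
    """Return (ETag, Last-Modified) for one version of a resource."""
    etag = '"' + hashlib.sha1(content).hexdigest() + '"'
    # The HTTP date format has one-second resolution, so microseconds are dropped.
    return etag, format_datetime(modified_at, usegmt=True)


first = validators(
    b"body { color: red }",
    datetime(2015, 10, 21, 7, 28, 0, 100000, tzinfo=timezone.utc),
)
second = validators(
    b"body { color: blue }",
    datetime(2015, 10, 21, 7, 28, 0, 900000, tzinfo=timezone.utc),
)

print(first[1] == second[1])  # True: Last-Modified cannot tell the versions apart
print(first[0] == second[0])  # False: the ETag changes with the content
```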

Conclusion

At its core, HTTP caching trades space for time. Caching exists to reduce the number of HTTP requests and the size of HTTP transfers as much as possible, which is an important part of front-end performance optimization. Setting appropriate cache policies for page resources can improve the page experience.