Browser cache

How does the browser cache work?

First, a normal request-response flow (ignoring proxy caching for now) looks like this:

  • The browser finds no data in its cache and sends a request to the server for the resource.
  • The server responds by returning the resource and marking when it expires.
  • The browser caches the resource for reuse.
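
The flow above can be sketched as a tiny in-memory cache. This is only an illustration, not how any real browser is implemented: the class name `TinyBrowserCache`, the `fetch` callable (returning a body and a lifetime in seconds), and the injectable `clock` are all assumptions made for the example.

```python
import time


class TinyBrowserCache:
    """Hypothetical in-memory cache keyed by URL (illustration only)."""

    def __init__(self, fetch, clock=time.time):
        self._fetch = fetch   # assumed callable: url -> (body, lifetime_seconds)
        self._clock = clock   # injectable clock so the behaviour can be tested
        self._store = {}      # url -> (body, expires_at)

    def get(self, url):
        entry = self._store.get(url)
        if entry is not None and self._clock() < entry[1]:
            # A fresh copy exists: reuse it without contacting the server.
            return entry[0], "cache"
        # Nothing cached (or the copy has expired): ask the server.
        body, lifetime = self._fetch(url)
        # Store the resource together with its expiration time for reuse.
        self._store[url] = (body, self._clock() + lifetime)
        return body, "network"
```

The first `get` for a URL reports "network", repeated calls within the lifetime report "cache", and a call after the expiration time goes back to the server.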

How do servers and browsers communicate caching mechanisms?

Cache-Control

The server and the browser use the Cache-Control header field to negotiate caching rules:

  • Cache-Control: no-store, the response must not be cached at all. It is used for data that changes very frequently, such as the red envelope page.
  • Cache-Control: no-cache, the response can be cached, but it must be checked with the server before each use.
  • Cache-Control: must-revalidate, the cached copy can be used as long as it has not expired, but once it has expired it must be revalidated with the server before it is used again.
  • Cache-Control: max-age=30, which tells the browser, “This page can only be cached for 30 seconds before it expires.” (The time is counted from when the response message was created, that is, the Date field, the moment it left the server.)
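
As a rough sketch of how a client might act on these directives, the snippet below parses a Cache-Control value and decides what to do with a stored response. The function names are hypothetical, and treating a response with no max-age as already expired is a simplifying assumption (real browsers also consult Expires and heuristics).

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        # Directives without an argument (e.g. no-store) map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def cache_decision(value, age_seconds):
    """Return 'do-not-store', 'revalidate', or 'use-cache' for a response."""
    d = parse_cache_control(value)
    if "no-store" in d:
        return "do-not-store"      # never keep a copy
    if "no-cache" in d:
        return "revalidate"        # stored, but always checked with the server
    # Assumption: a missing max-age is treated as 0 (already expired).
    max_age = int(d["max-age"]) if d.get("max-age") else 0
    # Fresh copies are used directly; expired ones must be revalidated.
    return "use-cache" if age_seconds < max_age else "revalidate"
```

Here `age_seconds` stands for the time elapsed since the Date of the response, so `cache_decision("max-age=30", 10)` allows reuse while the same call with an age of 30 or more asks for revalidation.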

In addition, when the “Refresh” button is clicked, the browser adds “Cache-Control: max-age=0” to the request headers. A Ctrl+F5 “force refresh” adds “Cache-Control: no-cache” instead, which has much the same meaning as “max-age=0”. The exact handling depends on how the back-end server interprets them, but the two usually have the same effect.

If-Modified-Since & Last-Modified

“Last-Modified” is the date and time at which the origin server believes the resource was last modified. The server must supply “Last-Modified” in its first response. On the second request, the browser can then send the cached value back in “If-Modified-Since” to check whether its copy is still the latest.

The server returns the requested resource, with status code 200, only if its content has been modified after the given date and time. If the resource has not been modified since then, the server returns a 304 response with no message body and with the last modification time in the “Last-Modified” header. “If-Modified-Since” can only be used in GET or HEAD requests.
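
A minimal server-side sketch of this check, using only the Python standard library, might look like the following. The function name `respond` and the placeholder body are assumptions for illustration, and `last_modified` is expected to be a UTC-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(last_modified, if_modified_since=None):
    """Return (status, headers, body) for a GET with optional If-Modified-Since."""
    # HTTP dates have one-second resolution, so drop microseconds first.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable date: ignore the condition
        if since is not None:
            if since.tzinfo is None:
                # Treat a date without zone information as UTC.
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                # Not modified since the given time: empty 304 response.
                return 304, headers, b""
    # Modified (or no usable condition): send the full resource.
    return 200, headers, b"resource body"
```

Sending back the exact “Last-Modified” value from the first response yields 304, while an earlier date yields 200 with the body.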

If-None-Match & ETag

“ETag” is a unique identifier of a resource. It solves the problem that file changes cannot always be identified accurately by modification time.

For example, a file may be modified several times within one second, but since the modification time is measured in seconds, the new versions within that second are indistinguishable.

As another example, a file may be rewritten periodically even though its content has not actually changed. Its modification time still changes, so sending it to the browser again wastes bandwidth. Using ETag allows resource changes to be identified accurately, which lets the browser make more efficient use of the cache.

ETag can also be “strong” or “weak”.

A strong ETag requires that resources match exactly at the byte level. A weak ETag has a “W/” prefix in front of the value and only requires that the resource remain semantically unchanged, so some internal changes are allowed (such as reordering tags in HTML, or adding a few extra spaces).
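
The difference between the two shows up when ETags are compared. The sketch below implements a strong and a weak comparison over ETag strings; the helper names are hypothetical and the parsing is deliberately simplified.

```python
def parse_etag(tag):
    """Return (is_weak, opaque_value) for an ETag such as '"abc"' or 'W/"abc"'."""
    tag = tag.strip()
    is_weak = tag.startswith("W/")
    return is_weak, (tag[2:] if is_weak else tag)


def strong_match(a, b):
    """Strong comparison: both tags must be strong and identical."""
    weak_a, value_a = parse_etag(a)
    weak_b, value_b = parse_etag(b)
    return not weak_a and not weak_b and value_a == value_b


def weak_match(a, b):
    """Weak comparison: the opaque values match, ignoring any W/ prefix."""
    return parse_etag(a)[1] == parse_etag(b)[1]
```

With these helpers, `W/"1"` and `"1"` match under the weak comparison but not under the strong one, while `"1"` and `"1"` match under both.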

The server provides “ETag” in its first response, and on the second request the browser sends the cached “ETag” value in “If-None-Match”. If the resource has not changed, the server responds with “304 Not Modified”, indicating that the cache is still valid.

For example:

When the page is refreshed, the browser sends both the cache control header “max-age=0” and the conditional request header “If-None-Match”. If the cache is still valid, the server returns 304.
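
The ETag validation exchange can be sketched on the server side as follows. Deriving the ETag from a hash of the body is just one possible approach, assumed here for the example; the function names are hypothetical, and splitting the header on commas is a simplification.

```python
import hashlib


def make_etag(body):
    """Derive a strong ETag from the body bytes (assumed scheme: SHA-256 prefix)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond_with_etag(body, if_none_match=None):
    """Return (status, headers, body) for a GET with optional If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = [t.strip() for t in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so ignore any W/ prefix.
        opaque = {t[2:] if t.startswith("W/") else t for t in candidates}
        if "*" in opaque or etag in opaque:
            # The client's copy is still valid: no body is sent.
            return 304, headers, b""
    # No condition, or the resource changed: send the full body.
    return 200, headers, body
```

A first call without `if_none_match` returns 200 and the ETag; repeating the call with that ETag returns 304 until the body changes.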

Proxy cache

Client caches differ from proxy caches: a client cache is for the user’s own use, whereas a proxy cache may serve a very large number of clients.

  • Cache-Control: public, indicating that the response can be cached by any cache (including the client that sent the request, proxy servers, and so on).

  • Cache-Control: private, indicating that the response can only be cached by the client and not by a shared cache (that is, the proxy server cannot cache it).

  • Cache-Control: s-maxage, which specifies how long the resource can be cached on the proxy.

  • Cache-Control: max-stale=5, which means that an expired copy on the proxy is still acceptable, but only if it expired no more than 5 seconds ago.

  • Cache-Control: min-fresh=5, which means that the cached copy must be valid now and must remain valid for at least another 5 seconds.

  • Cache-Control: only-if-cached, indicating that the client accepts only data cached by the proxy, not a response from the origin server. If the proxy has no cached copy or the copy has expired, it should return 504 (Gateway Timeout) to the client.

  • Cache-Control: no-transform. The proxy sometimes optimizes cached data, for example by converting images to PNG or WebP format for future requests. no-transform forbids this: the proxy is not allowed to convert or transform the resource.
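
To make the proxy-side rules more concrete, here is a simplified sketch of two decisions a shared cache has to make: how long it may keep a response, and whether a stored copy satisfies a request carrying max-stale or min-fresh. All function names are hypothetical, times are in seconds, and a response without any lifetime is assumed to be already expired.

```python
def parse_directives(value):
    """Split a Cache-Control value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name.strip():
            directives[name.strip().lower()] = arg.strip() or None
    return directives


def shared_cache_lifetime(response_cc):
    """Lifetime in seconds on a proxy, or None if the proxy must not store it."""
    d = parse_directives(response_cc)
    if "private" in d or "no-store" in d:
        return None
    # On a shared cache, s-maxage takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        if d.get(name) is not None:
            return int(d[name])
    return 0  # assumption: no explicit lifetime means already expired


def proxy_can_serve(lifetime, age, request_cc=""):
    """Check whether a stored copy meets the request's freshness demands."""
    d = parse_directives(request_cc)
    remaining = lifetime - age  # negative: expired by that many seconds
    if "min-fresh" in d:
        # Must stay fresh for at least min-fresh more seconds.
        return remaining >= int(d["min-fresh"])
    if "max-stale" in d:
        limit = d["max-stale"]  # a bare max-stale accepts any staleness
        return limit is None or -remaining <= int(limit)
    return remaining > 0
```

For a copy with a 60-second lifetime, a request with `max-stale=5` is still served at age 63 but not at age 66, and a request with `min-fresh=5` is served at age 55 but not at age 56.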

Finally

Thank you for reading. If you find any mistakes, corrections are welcome!