[HTTP] caching mechanism

Preface

When a client requests a resource, the browser first checks its own cache. If it holds a copy of the requested resource, it can serve that copy directly without requesting it from the server again.

However, if the browser always used the cached copy and never contacted the server again, it could end up serving resources that are out of date. The browser cache therefore has two strategies:

1. Strong cache
2. Negotiated cache

Strong cache

Definition: as long as the current time is within the cache validity period, the response is read directly from the client cache and no request is sent to the server. Once the cache has expired, the client sends a request to the server to obtain the resource again.

Whether the strong cache applies is decided by the browser, based on two fields in the response headers:

1. Expires (HTTP/1.0)

2. Cache-Control (HTTP/1.1)

Expires

Expires is a response header returned by the server that tells the browser to retrieve data directly from the cache before the expiration date, without having to request it again. Like this:

Expires: Mon, 22 Nov 2021 08:42:00 GMT

// Indicates that the resource expires at 08:42 GMT on 22 November 2021. Once it has expired, a request is sent to the server.

Disadvantages: Expires is an absolute timestamp, so the cache lifetime is calculated incorrectly if the time is formatted incorrectly, if the server clock and the browser clock disagree, or if the time zone is not converted properly. For this reason, HTTP/1.1 added Cache-Control, which takes precedence over Expires.
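The check a client performs against Expires can be sketched roughly as follows. This is a minimal illustration, not how any particular browser implements it; the function name `is_fresh_by_expires` and the explicit `now` parameter are assumptions made for the example. Passing `now` in explicitly also makes the weakness visible: the result depends entirely on the client's clock.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh_by_expires(expires_header: str, now: datetime) -> bool:
    """Return True while `now` (client clock, timezone-aware) is before Expires."""
    try:
        expires_at = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        # An unparseable date is treated as already expired.
        return False
    if expires_at.tzinfo is None:
        # Assumption for this sketch: a date without a zone is read as UTC.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return now < expires_at


header = "Mon, 22 Nov 2021 08:42:00 GMT"
before = datetime(2021, 11, 22, 8, 0, tzinfo=timezone.utc)
after = datetime(2021, 11, 22, 9, 0, tzinfo=timezone.utc)
print(is_fresh_by_expires(header, before), is_fresh_by_expires(header, after))
```

If the client clock runs an hour slow, the second call would still report the resource as fresh, which is exactly the clock-mismatch problem described above.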

Cache-Control

Cache-Control was added in HTTP/1.1 to remedy the flaws of Expires. This field can appear in both the browser's request headers and the server's response headers.
Priority: when both Expires and Cache-Control are present, Cache-Control takes precedence over Expires.
Difference from Expires: instead of an absolute expiration date, it controls the cache with a relative lifetime, given by the max-age directive.

Cache-Control: max-age=3600

// indicates that the response will be valid within 3600 seconds, or one hour
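To make the priority rule concrete, here is a small sketch that computes how long a response may be reused. It assumes the headers are available as a plain dict and that `response_time` stands in for the moment the response was generated; the function name `freshness_lifetime` is invented for the example.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def freshness_lifetime(headers: dict, response_time: datetime) -> Optional[float]:
    """Seconds the response may be reused without contacting the server."""
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"(?:^|,)\s*max-age=(\d+)", cache_control, re.IGNORECASE)
    if match:
        # max-age wins, even if Expires is also present.
        return float(match.group(1))
    expires = headers.get("Expires")
    if expires:
        try:
            expires_at = parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            return 0.0  # invalid date: treat as already expired
        if expires_at.tzinfo is None:
            expires_at = expires_at.replace(tzinfo=timezone.utc)
        return max(0.0, (expires_at - response_time).total_seconds())
    return None  # no explicit freshness information


received = datetime(2021, 11, 22, 7, 42, tzinfo=timezone.utc)
both = {
    "Cache-Control": "public, max-age=3600",
    "Expires": "Mon, 22 Nov 2021 09:42:00 GMT",
}
print(freshness_lifetime(both, received))
```

With both headers present the result comes from max-age (one hour), even though Expires alone would have allowed two hours.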

Other configurations:

Directive	Description
max-age	Sets the maximum time the cached copy stays valid, in seconds; after that the cache is considered expired
public/private	public means that both the client and proxy servers may cache the response; private means that only the user's browser may cache it
no-cache	Skip the strong cache and send the HTTP request (i.e. go directly to the negotiated cache phase)
no-store	Nothing is cached: neither the strong cache nor the negotiated cache is used
must-revalidate	Once the cached resource has expired, the cache must verify it with the server before using it; expired resources must not be used
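The directives in the table can be read out of a header value with a few lines of code. The sketch below is deliberately simplified: it splits on commas, so it would mishandle quoted arguments that themselves contain commas, and the policy labels it returns are invented for illustration.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def cache_policy(value: str) -> str:
    """Map a Cache-Control value to one of the behaviours described above."""
    directives = parse_cache_control(value)
    if "no-store" in directives:
        return "do-not-store"           # no strong cache, no negotiated cache
    if "no-cache" in directives:
        return "revalidate-every-time"  # straight to the negotiated cache
    if "max-age" in directives:
        return "strong-cache"           # reuse until max-age seconds have passed
    return "unspecified"


print(parse_cache_control("public, max-age=3600"))
print(cache_policy("no-cache, max-age=0"))
```

Checking no-store before no-cache, and both before max-age, mirrors how restrictive each directive is.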

Disadvantages: if the strong cache lifetime is too long and the resource on the server is updated in the meantime (for example, to fix a bug), the client keeps using the old copy until the cache expires.

Negotiated cache

When the cached resource expires, that is, when the strong cache is no longer valid, the browser falls back to the negotiated cache policy.

Definition: the browser sends a request to the server with a cache tag in the request header. Based on this tag, the server decides whether the cached copy may still be used.

Cache tag classification: 1. Last-Modified 2. ETag

Last-Modified

Last-Modified: the time the resource was last modified. The server adds it to the response headers the first time the browser requests the resource.

On the next request, the browser sends the If-Modified-Since field in the request headers; its value is the Last-Modified time previously received from the server.

When the server receives a request containing If-Modified-Since, it compares the value with the last modification time of the resource on the server:

If the value in the request header is earlier than the last modification time, the resource has changed. The server returns the new resource with status 200, as in a normal HTTP request and response.
Otherwise, 304 is returned, telling the browser to use its cache directly.
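The server side of this exchange might look like the sketch below. It is only an outline of the comparison: the function names are hypothetical, and the resource's modification time is assumed to be passed in as a timezone-aware UTC datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional, Tuple


def _parse_http_date(value: str) -> Optional[datetime]:
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def respond_with_last_modified(
    if_modified_since: Optional[str], last_modified: datetime
) -> Tuple[int, dict]:
    """Return (status, headers); `last_modified` must be an aware UTC datetime."""
    # HTTP dates only carry whole seconds, so drop the microseconds first.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since:
        since = _parse_http_date(if_modified_since)
        if since is not None and last_modified <= since:
            return 304, headers  # not modified: the browser reuses its cache
    return 200, headers          # modified (or first request): send the resource


modified = datetime(2021, 11, 22, 8, 42, 0, tzinfo=timezone.utc)
status, headers = respond_with_last_modified(None, modified)
print(status, headers["Last-Modified"])
```

On the follow-up request the browser would echo `headers["Last-Modified"]` back as If-Modified-Since, and the same function would then answer 304 as long as the resource has not changed.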

ETag

ETag: a unique identifier generated by the server from the contents of the current file. The value changes whenever the contents of the file change. The server sends it to the browser in the response headers.

The browser stores the ETag value and, on the next request, sends it to the server in the request headers as If-None-Match.

When the server receives If-None-Match, it compares the value with the ETag of the resource on the server:

If they differ, the resource has been updated and the new resource is returned, just as in a normal HTTP request and response.
Otherwise, 304 is returned, telling the browser to use its cache directly.
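A rough sketch of the ETag round trip follows. How a real server derives its ETag is up to the server; hashing the body with SHA-256 and truncating the digest is just one plausible choice made for this example, and both function names are hypothetical.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the content (assumed scheme for this sketch)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond_with_etag(
    if_none_match: Optional[str], body: bytes
) -> Tuple[int, dict, bytes]:
    """Return (status, headers, body) for the current content."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        # Ignore a weak-validator prefix (W/) when comparing.
        stripped = [c[2:] if c.startswith("W/") else c for c in candidates]
        if "*" in stripped or etag in stripped:
            return 304, headers, b""  # unchanged: no body is sent
    return 200, headers, body


status, headers, body = respond_with_etag(None, b"hello")
print(status, headers["ETag"])
```

Because the tag is computed from the content, rewriting the file with identical bytes yields the same ETag and therefore still results in a 304.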

Comparing Last-Modified and ETag

1. If only the modification time of a resource changes while its contents stay the same, Last-Modified invalidates the cache, but ETag does not.
2. Last-Modified has a resolution of one second. If a file is modified several times within one second, Last-Modified does not change and the stale cache is still treated as valid, whereas ETag changes with the contents.
3. If both Last-Modified and ETag are configured, the server gives priority to ETag.
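The first two points can be reproduced with a couple of helper functions. The names below are made up for the illustration, and the content hash stands in for whatever ETag scheme a server actually uses.

```python
import hashlib
from email.utils import formatdate


def last_modified_value(mtime: float) -> str:
    """Format a file modification time as an HTTP date (whole seconds only)."""
    return formatdate(mtime, usegmt=True)


def etag_value(content: bytes) -> str:
    """Content-based tag; an assumed scheme for this sketch."""
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'


t = 1637570520.0  # an arbitrary modification time, in seconds since the epoch

# Point 1: same contents, touched a minute later.
touched = last_modified_value(t) != last_modified_value(t + 60)
same_tag = etag_value(b"body") == etag_value(b"body")

# Point 2: two different versions written within the same second.
same_date = last_modified_value(t + 0.2) == last_modified_value(t + 0.9)
new_tag = etag_value(b"version 1") != etag_value(b"version 2")

print(touched, same_tag, same_date, new_tag)
```

All four flags come out true: the date changes when nothing meaningful changed, and fails to change when the contents did, while the content-based tag tracks the contents in both cases.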

Strong cache vs. negotiated cache

Cache type	Where the resource comes from	Status code	Request sent to the server?
Strong cache	From the cache	200	No, the cache is read directly
Negotiated cache	From the cache	304	Yes, the server tells the browser whether the cache can be used

Conclusion

HTTP caching only comes into play from the second request onward.

The first time a resource is requested, the server returns the resource together with caching information in the response headers (e.g. Expires, Cache-Control, Last-Modified, ETag).
On the second request, the browser first uses Cache-Control to check whether the strong cache is available:
- If so, it uses the cache directly
- Otherwise, it enters the negotiated cache phase: it sends an HTTP request whose headers carry the If-Modified-Since or If-None-Match field, and the server checks whether the resource has been updated
  - If the resource has been updated, the new resource and a 200 status code are returned
  - Otherwise, 304 is returned, telling the browser to fetch the resource directly from the cache
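Put together, the whole decision process can be sketched from the client's point of view. This is a simplified model rather than a description of any real browser: `CachedEntry`, `load`, and the `fetch` callback (which takes request headers and returns a status, response headers, and a body) are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CachedEntry:
    body: bytes
    stored_at: float                     # when it was cached (client clock, seconds)
    max_age: float                       # from Cache-Control: max-age
    etag: Optional[str] = None
    last_modified: Optional[str] = None


# Hypothetical transport: request headers in, (status, headers, body) out.
Fetch = Callable[[dict], Tuple[int, dict, bytes]]


def load(entry: Optional[CachedEntry], now: float, fetch: Fetch) -> Tuple[str, bytes]:
    """Return (where the body came from, body)."""
    if entry is None:
        # First request: nothing cached yet, so go to the server.
        _, _, body = fetch({})
        return "network (first request)", body
    if now - entry.stored_at < entry.max_age:
        # Strong cache: still fresh, no request is sent.
        return "strong cache", entry.body
    # Negotiated cache: ask the server whether the copy is still valid.
    conditional = {}
    if entry.etag:
        conditional["If-None-Match"] = entry.etag
    if entry.last_modified:
        conditional["If-Modified-Since"] = entry.last_modified
    status, _, body = fetch(conditional)
    if status == 304:
        return "negotiated cache", entry.body
    return "network (updated)", body
```

A real client would also store the new response and its headers after a 200, and refresh the stored freshness information after a 304; those steps are left out to keep the control flow visible.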