Background: Why cache?

Caching reduces the number of requests sent to the server, cuts bandwidth usage, and improves the user experience because pages load faster.

HTTP caching is divided into strong caching and negotiation caching. Strong caching takes priority: the browser checks it first and falls back to negotiation caching only when the strong cache cannot be used.

Strong cache:

Strong cache case 1:

No cache entry exists, for example on the first request. The request is sent directly to the server.

Strong cache case 2:

The strong cache hits: the data is read from the browser cache and no request is sent to the server.

Strong cache case 3:

The strong cache has expired, so the browser falls back to the negotiation cache.
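The three cases can be sketched as a single decision function. This is only an illustration: `CacheEntry`, `decide`, and the returned strings are hypothetical names, and a real browser tracks far more state than a stored timestamp and a lifetime.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheEntry:
    stored_at: float  # seconds since the epoch when the response was stored
    max_age: float    # freshness lifetime in seconds


def decide(entry: Optional[CacheEntry], now: float) -> str:
    """Map the three strong-cache cases to an action (hypothetical helper)."""
    if entry is None:
        return "send request"   # case 1: nothing cached yet
    if now - entry.stored_at < entry.max_age:
        return "use cache"      # case 2: strong cache hit
    return "negotiate"          # case 3: expired, fall back to negotiation


print(decide(None, now=100.0))                       # send request
print(decide(CacheEntry(0.0, 600.0), now=100.0))     # use cache
print(decide(CacheEntry(0.0, 60.0), now=100.0))      # negotiate
```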

In every case, the browser first checks whether the strong cache is usable by inspecting a response header field. HTTP/1.0 uses the Expires field and HTTP/1.1 uses the Cache-Control field.

Expires is a point in time set in the response header returned by the server. Until that time, the data can be read from the cache without sending a request. The weakness of this approach is that the server clock and the client clock may differ, so the client can judge the expiration time incorrectly. This is why Cache-Control was introduced in HTTP/1.1.
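The clock problem can be shown with a small sketch. The function name `is_fresh_by_expires` and the example dates are assumptions made for illustration; only the standard library's date helpers are used.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def is_fresh_by_expires(expires_header: str, client_now: datetime) -> bool:
    """Return True if, by the client's clock, the Expires time has not passed."""
    return client_now < parsedate_to_datetime(expires_header)


# The server stamps an absolute time: valid for 60 seconds by its own clock.
server_now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
expires = format_datetime(server_now + timedelta(seconds=60), usegmt=True)

# A client whose clock agrees with the server treats the response as fresh.
print(is_fresh_by_expires(expires, server_now + timedelta(seconds=10)))  # True

# A client whose clock runs 5 minutes fast treats it as already expired.
skewed_client = server_now + timedelta(minutes=5, seconds=10)
print(is_fresh_by_expires(expires, skewed_client))  # False
```

Both clients received the response ten seconds ago, yet they disagree about whether it is still usable.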

The Cache-Control field can use max-age=seconds to set a duration that says how long the data may be reused. This is the fundamental difference from Expires: it is a relative duration rather than an absolute time. The Cache-Control field also has many other directives:

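public: allows both the client and proxy servers to cache the response.

private: only the browser may cache the response. Intermediate proxies cannot store it in a shared cache.

no-store: the response must not be cached at all.

s-maxage: sets the cache lifetime for proxy servers (shared caches), where it overrides max-age.

must-revalidate: once the cached response expires, it must be validated with the origin server before it is used.

no-cache: skips the strong cache; the browser sends an HTTP request and enters the negotiation cache. The response may still be stored, but it must be revalidated with the server before each use.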
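As a rough sketch of how these directives combine, the following parses a Cache-Control value and decides whether a stored response can be served without contacting the server. `parse_cache_control` and `can_use_strong_cache` are hypothetical helpers, and the choice to revalidate when no lifetime is given is an assumption of this sketch (real caches may apply heuristics instead).

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def can_use_strong_cache(header: str, age_seconds: int, shared: bool = False) -> bool:
    """Decide whether a stored response may be reused without a request.

    age_seconds is how long the response has been stored; shared=True
    means the cache is a proxy rather than a browser.
    """
    d = parse_cache_control(header)
    if "no-store" in d or "no-cache" in d:
        return False  # never stored, or must be revalidated first
    if shared and "private" in d:
        return False  # proxies may not reuse private responses
    # s-maxage overrides max-age, but only for shared caches
    lifetime = d.get("s-maxage") if shared and "s-maxage" in d else d.get("max-age")
    if lifetime is None:
        return False  # no explicit lifetime: this sketch chooses to revalidate
    return age_seconds < int(lifetime)


print(can_use_strong_cache("public, max-age=600", age_seconds=120))        # True
print(can_use_strong_cache("public, max-age=600", age_seconds=900))        # False
print(can_use_strong_cache("no-cache, max-age=600", age_seconds=120))      # False
print(can_use_strong_cache("max-age=60, s-maxage=600", 120, shared=True))  # True
```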

Where is the strong cache stored?

To answer this question, we need to understand the two locations, from memory cache and from disk cache:

• From memory cache: the memory cache has two features, fast reads and timeliness.
  • Fast reads: the memory cache stores the compiled and parsed files directly in the memory of the process. This occupies some of the process's memory but makes the files quick to read on the next run.
  • Timeliness: once the process is closed, its memory is emptied.
• From disk cache: the cache is written directly to a file on disk. Reading it requires I/O operations on that file, and the cached content must be parsed again, so reading is more complex and slower than reading from the memory cache.

In the browser, JS and image files are typically placed in the memory cache after being parsed, so refreshing the page only needs to read them from memory. CSS files are typically stored on disk, so they are read from the disk cache each time the page is rendered. The exact placement depends on the browser.
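The difference between the two locations can be illustrated with a toy two-tier cache. `TwoTierCache` is a hypothetical class, not how any browser is implemented; as a simplification it writes every entry to both tiers and assumes keys are safe file names.

```python
import tempfile
from pathlib import Path
from typing import Dict, Optional, Tuple


class TwoTierCache:
    """Toy cache that checks memory first, then falls back to disk."""

    def __init__(self, disk_dir: Path) -> None:
        self.memory: Dict[str, bytes] = {}  # lost when the process exits
        self.disk_dir = disk_dir            # survives process restarts

    def put(self, key: str, body: bytes) -> None:
        self.memory[key] = body
        (self.disk_dir / key).write_bytes(body)

    def get(self, key: str) -> Tuple[Optional[bytes], str]:
        if key in self.memory:
            return self.memory[key], "memory"
        path = self.disk_dir / key
        if path.exists():
            body = path.read_bytes()  # disk I/O: slower than a dict lookup
            self.memory[key] = body   # promote for the next read
            return body, "disk"
        return None, "miss"


with tempfile.TemporaryDirectory() as tmp:
    cache = TwoTierCache(Path(tmp))
    cache.put("app.js", b"console.log(1)")
    print(cache.get("app.js")[1])  # memory
    cache.memory.clear()           # simulate closing the process
    print(cache.get("app.js")[1])  # disk
```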

Negotiation cache:

Negotiation caching is the process in which, after the strong cache has become invalid, the browser sends a request to the server carrying a cache identifier, and the server decides based on that identifier whether the cache can be used. There are two main outcomes:

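Negotiation succeeds: the server returns 304 and the browser uses its cached copy.

Negotiation fails: the server returns the new resource, as in a normal HTTP response.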

As you can see, the key piece of data in negotiation caching is the identifier of the cached data. The browser carries this identifier in the request header and sends the request to the server, which decides whether the cache can be used. There are two kinds of cache identifier: Last-Modified and ETag.

Last-Modified is the time the resource was last modified. After the browser first requests the resource, the server places this field in the response header. On the next request, the browser carries an If-Modified-Since field in the request header whose value is the Last-Modified time it received from the server. If the time in the request header is earlier than the resource's last modified time on the server, the resource has been modified, and the new resource is returned as in a normal HTTP response. Otherwise, the server returns status 304, telling the browser to use its cached copy directly.
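The server-side comparison might look like the sketch below. `check_if_modified_since` is a hypothetical helper that only returns the status code; treating an unparseable or timezone-less date as "send the full response" is an assumption of this sketch.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def check_if_modified_since(last_modified: datetime, if_modified_since: Optional[str]) -> int:
    """Return 304 if the resource is unchanged since the client's copy, else 200."""
    if if_modified_since is None:
        return 200  # first request: nothing to compare against
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: ignore the header
    if since.tzinfo is None:
        return 200  # no usable timezone: ignore the header
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) > since:
        return 200  # modified after the client's copy
    return 304


mtime = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
header = format_datetime(mtime, usegmt=True)  # value the server sent as Last-Modified

print(check_if_modified_since(mtime, header))  # 304
later = datetime(2024, 1, 1, 12, 5, 0, tzinfo=timezone.utc)
print(check_if_modified_since(later, header))  # 200
```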

ETag is a unique identifier generated by the server based on the contents of the current file. This value changes whenever the contents of the file change. The server gives this value to the browser in the response header. The browser stores the ETag value and sends it to the server as If-None-Match in the request header on the next request. When the server receives If-None-Match, it compares it with the ETag of the resource on the server:

• If they are different, the resource has been updated. The server returns the new resource, as in a normal HTTP response.
• Otherwise, it returns 304, telling the browser to use the cache directly.
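A minimal sketch of the ETag flow follows. The helper names are hypothetical, and deriving the tag from a truncated SHA-256 hash is just one possible choice; servers are free to generate ETags differently. The comma split is safe here only because the generated tags never contain commas.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive an ETag from the content (the hash choice is an assumption)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def check_if_none_match(body: bytes, if_none_match: Optional[str]) -> int:
    """Return 304 if a tag sent by the client matches the current content, else 200."""
    if if_none_match is None:
        return 200
    current = make_etag(body)
    candidates = [tag.strip() for tag in if_none_match.split(",")]
    # Weak comparison: a leading "W/" is ignored when matching.
    candidates = [tag[2:] if tag.startswith("W/") else tag for tag in candidates]
    if "*" in candidates or current in candidates:
        return 304
    return 200


v1 = b"body { color: red }"
tag = make_etag(v1)  # sent to the browser in the ETag response header

print(check_if_none_match(v1, tag))                       # 304
print(check_if_none_match(b"body { color: blue }", tag))  # 200
```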

Comparing the two:

1. In accuracy, ETag is superior to Last-Modified. ETag identifies a resource by its content, so changes to the resource are detected accurately. Last-Modified fails to reflect changes accurately in two special cases:
• A resource file is edited or re-saved but its content does not change, which still invalidates the cache.
• Last-Modified has a resolution of one second. If the file changes several times within one second, Last-Modified does not reflect the change.
2. In performance, Last-Modified is superior to ETag. Last-Modified only records a point in time, whereas ETag requires generating a hash value from the file's contents.
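The two special cases can be reproduced on a temporary file. This is a sketch under assumptions: `validators` is a hypothetical helper, the timestamps are arbitrary, and the modification time is truncated to whole seconds to mimic the resolution of Last-Modified.

```python
import hashlib
import os
import tempfile
from typing import Tuple


def validators(path: str) -> Tuple[int, str]:
    """Return (mtime in whole seconds, content hash) for a file."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return int(os.stat(path).st_mtime), digest


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "style.css")
    with open(path, "w") as f:
        f.write("body { color: red }")
    os.utime(path, (1_700_000_000, 1_700_000_000))
    before = validators(path)

    # Case 1: the file is re-saved without a content change; only mtime moves.
    os.utime(path, (1_700_000_100, 1_700_000_100))
    touched = validators(path)

    # Case 2: the content changes, but within the same second.
    with open(path, "w") as f:
        f.write("body { color: blue }")
    os.utime(path, (1_700_000_100.4, 1_700_000_100.4))
    edited = validators(path)

print(touched[0] != before[0], touched[1] == before[1])  # True True
print(edited[0] == touched[0], edited[1] != touched[1])  # True True
```

In case 1 a Last-Modified check would resend an unchanged file; in case 2 it would miss a real change, while the content hash gets both right.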

In addition, if both approaches are supported, the server will prioritize ETag.
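The priority rule can be sketched as follows. `revalidate` is a hypothetical handler that returns only a status code: when If-None-Match is present, the date header is not consulted at all.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict


def revalidate(request_headers: Dict[str, str], etag: str, last_modified: datetime) -> int:
    """Return 304 or 200, consulting If-None-Match before If-Modified-Since."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # An ETag validator is present, so the date is ignored entirely.
        tags = [t.strip() for t in if_none_match.split(",")]
        return 304 if etag in tags else 200
    since_raw = request_headers.get("If-Modified-Since")
    if since_raw is not None:
        since = parsedate_to_datetime(since_raw)
        return 304 if last_modified.replace(microsecond=0) <= since else 200
    return 200


mtime = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
date_header = format_datetime(mtime, usegmt=True)

# The date says "unchanged", but the ETag differs, so the ETag wins.
both = {"If-None-Match": '"old"', "If-Modified-Since": date_header}
print(revalidate(both, '"new"', mtime))  # 200

# With only the date available, the server falls back to it.
print(revalidate({"If-Modified-Since": date_header}, '"new"', mtime))  # 304
```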