Key points

What the HTTP caching mechanism is, whether a cache is used or not, how pressing Enter in the address bar, F5, and Ctrl+F5 differ, which caching scheme is currently recommended, and so on.

HTTP caching is mainly used for static files that are not updated frequently, such as CSS, JS, and images.

HTTP cache request and response headers

Here is a brief overview of the cache-related header fields, just to make them familiar. How they are used is covered later.

1. Cache-Control

A request/response header. Cache-Control is arguably the highest-priority field governing HTTP caching, including whether to cache at all. Its common values are:

  1. no-store: nothing is cached.
  2. no-cache: the response is cached, but before using the cached copy the browser must ask the server whether it is still up to date. In other words, the cache is only used after the server confirms it is still valid (HTTP/1.1; the older HTTP/1.0 equivalent is Pragma: no-cache).
  3. max-age=x (in seconds): for x seconds after the response is cached, no new request is made. It is an HTTP/1.1 directive, similar to the HTTP/1.0 Expires header described below, but with higher priority than Expires.
  4. s-maxage=x (in seconds): for x seconds after a proxy server caches the response from the origin, the proxy does not request the origin again. It applies only to shared caches such as a CDN (more on this later).
  5. public: both the client and intermediate proxy servers (such as a CDN) may cache the response.
  6. private: only the client may cache the response.
  7. must-revalidate / proxy-revalidate: once the cached content has expired, the cache must revalidate it with the origin server before using it again. proxy-revalidate is the same rule applied only to proxy (shared) caches.
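To make the directive list above a bit more concrete, here is a minimal sketch of splitting a Cache-Control value into its directives. The function name parse_cache_control is made up for this example, and the parsing is simplified: it assumes no directive value contains a quoted comma.

```python
from typing import Optional


def parse_cache_control(header: str) -> dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: value or None}."""
    directives: dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        # Directive names are case-insensitive; values may be quoted.
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


parsed = parse_cache_control("public, max-age=60, s-maxage=300")
print(parsed)
```

Directives without a value (public, no-cache, no-store) map to None, while max-age and s-maxage keep their number of seconds as a string.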

2. Expires

A response header giving the resource's expiration time as a GMT-format date returned by the server. It is an HTTP/1.0 field and has lower priority than max-age (HTTP/1.1) when both are present.
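As a rough illustration of what a GMT-format Expires value looks like, the sketch below builds one with Python's standard library. The helper name expires_header and the one-hour lifetime are assumptions for the example only.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def expires_header(now: datetime, lifetime_seconds: int) -> str:
    """Format 'now + lifetime' as an HTTP date in GMT."""
    return format_datetime(now + timedelta(seconds=lifetime_seconds), usegmt=True)


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
value = expires_header(now, 3600)
print(value)  # Mon, 01 Jan 2024 13:00:00 GMT

# The same string can be parsed back into a timezone-aware datetime.
print(parsedate_to_datetime(value) == now + timedelta(hours=1))  # True
```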

3. Last-Modified

A response header in which the server tells the browser when the resource was last modified.

4. If-Modified-Since

A request header in which the browser tells the server the last modification time it has for the resource. The server compares it with Last-Modified.

5. ETag

A response header in which the server tells the browser an identifier for the resource.

6. If-None-Match

A request header in which the browser tells the server the identifier of its cached resource (in practice, the ETag the server gave it last time). The server compares it with the current ETag.

Using HTTP caching

1. Have the server and browser agree on a file expiration time (Expires, a GMT-format date)

A typical request dialogue

First request

Browser: Server, server, I need the file a.js now. Please find it and give it to me.

Server: I'm not putting up with you bothering me every single time. Here is the file, and let's agree on a time (Expires). Don't bother me again until that time is up.

Follow-up requests...

The browser first compares the current time with Expires to determine whether the file has passed the agreed expiration time.

If the time has not yet passed, it does not send a request and uses the local cache directly.

If the time has passed, it sends a request, and the browser-server dialogue above repeats.
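The browser-side check described above can be sketched as follows. This is only an illustration of the idea, not how any real browser is implemented, and the name can_use_local_cache is invented for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def can_use_local_cache(expires: str, now: datetime) -> bool:
    """Return True while the agreed Expires time has not passed."""
    return now < parsedate_to_datetime(expires)


expires = "Mon, 01 Jan 2024 13:00:00 GMT"
before = datetime(2024, 1, 1, 12, 30, tzinfo=timezone.utc)
after = datetime(2024, 1, 1, 13, 30, tzinfo=timezone.utc)

print(can_use_local_cache(expires, before))  # True: use the cache, no request
print(can_use_local_cache(expires, after))   # False: ask the server again
```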

Question: Suppose Expires has passed and the browser requests the server again, but a.js has not changed at all since last time. Is there any way to avoid downloading it again?

2. Add a Last-Modified / If-Modified-Since comparison on top of the file expiration time

A typical request dialogue

First request

Browser: Server, server, I need a.js now. Please find it for me, and while you're at it give me an expiration time. I promise not to bother you until the time is up!

Server: Fine. I'll give you the expiration time, and I'll also give you the file's last modification time (Last-Modified). Once Expires has passed, I'll check the modification time, and if yours matches mine, don't ask me for the file. Returns a.js + Expires + Last-Modified.

Follow-up requests...

Expires has not passed: the browser wisely uses the local cache and avoids a beating.

Expires has passed: the browser sends the file's last modification time in If-Modified-Since, and the server compares If-Modified-Since with Last-Modified.

If-Modified-Since is not equal to Last-Modified: the server finds the latest a.js and returns it again together with Expires and the new Last-Modified.

If-Modified-Since is equal to Last-Modified: the server returns status code 304, meaning the file has not been modified and the browser should keep using its local cache.

In other words, when the modification time in the request header matches the one in the response header, 304 is returned and the local cache is used.
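The server side of this comparison might look roughly like the sketch below. The function name respond_with_last_modified is hypothetical, and one detail differs from the dialogue: instead of strict equality, the sketch treats the file as unchanged whenever it is not newer than the browser's copy, which is the more common way to do the check.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def respond_with_last_modified(
    last_modified: datetime, if_modified_since: Optional[str]
) -> tuple[int, dict[str, str]]:
    """Return (status, headers) for a request that may carry If-Modified-Since."""
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        # Unchanged if the file is not newer than the browser's cached copy.
        if last_modified <= parsedate_to_datetime(if_modified_since):
            return 304, headers
    return 200, headers


mtime = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
first_status, first_headers = respond_with_last_modified(mtime, None)
second_status, _ = respond_with_last_modified(mtime, first_headers["Last-Modified"])
print(first_status, second_status)  # 200 304
```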

Problem: Expires is judged against the time on the browser side, which can be changed at will, so Expires is unreliable. In addition, Last-Modified is only accurate to the second. If a file is modified again within the same second, Last-Modified does not reflect the change, and the browser will not get the latest file (picture the extreme case).
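The one-second limitation is easy to see with a small sketch: two hypothetical edits made within the same second produce the same Last-Modified string, so a comparison based on it cannot tell them apart.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Two edits 800 ms apart, within the same second (made-up timestamps).
first_edit = datetime(2024, 1, 1, 12, 0, 0, 100_000, tzinfo=timezone.utc)
second_edit = datetime(2024, 1, 1, 12, 0, 0, 900_000, tzinfo=timezone.utc)

first_header = format_datetime(first_edit, usegmt=True)
second_header = format_datetime(second_edit, usegmt=True)

print(first_header)                   # Mon, 01 Jan 2024 12:00:00 GMT
print(first_header == second_header)  # True: the second edit is invisible
```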

3. Add ETag and If-None-Match on top of Expires + Last-Modified. Also, since Expires is unreliable, add max-age (one of the Cache-Control values) to take its place.

A typical request dialogue

First request

Browser: Server, server, you know what I want...

Server: No, I don't! Fine, here is a.js. I'll give you the expiration time as well, plus max-age=60 (in seconds), a Last-Modified for you to keep, and a unique identifier for the file content, the ETag.

Follow-up requests...

Within 60 seconds: no request is sent and the local cache is used directly. (max-age=60 means that no request is made for 60 seconds after the response has been successfully cached. This is similar to Expires, but max-age has higher priority than Expires. The differences are explained later.)
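Here is a minimal sketch of the priority rule, assuming a simplified header dictionary: if max-age is present it decides how long the cached copy may be reused, and Expires is only consulted as a fallback. The function name freshness_lifetime is chosen for this example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers: dict[str, str], response_time: datetime) -> float:
    """Seconds the response may be reused without contacting the server."""
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return float(value)  # max-age wins over Expires
    if "Expires" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        return (expires - response_time).total_seconds()
    return 0.0


received = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
both = {"Cache-Control": "max-age=60", "Expires": "Mon, 01 Jan 2024 13:00:00 GMT"}
only_expires = {"Expires": "Mon, 01 Jan 2024 13:00:00 GMT"}

print(freshness_lifetime(both, received))          # 60.0
print(freshness_lifetime(only_expires, received))  # 3600.0
```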

After 60 seconds: the browser sends the request with If-Modified-Since and If-None-Match. The server compares If-None-Match with the ETag and does not compare If-Modified-Since with Last-Modified.

If-None-Match is not equal to the ETag: the content of a.js has been modified, and the server returns the latest a.js along with the new ETag, max-age=60, Last-Modified, and Expires.

If-None-Match is equal to the ETag: the content of a.js has not changed, and 304 is returned, telling the browser to keep using its local cache.

So when the server's ETag matches If-None-Match, status code 304 is returned. If-Modified-Since and Last-Modified are also present, but they are not compared in this case.
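The ETag comparison can be sketched like this. Using an MD5 digest of the body as the identifier is just one possible choice made for this example, and the names make_etag and respond_with_etag are hypothetical. Real If-None-Match values may also list several identifiers or weak validators, which this sketch ignores.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive a content identifier from the file body (quoted, as in HTTP)."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


def respond_with_etag(
    body: bytes, if_none_match: Optional[str]
) -> tuple[int, bytes, dict[str, str]]:
    """Return (status, body, headers); only the ETag decides between 200 and 304."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if if_none_match == etag:
        return 304, b"", headers  # content unchanged, no body sent
    return 200, body, headers


status1, _, headers1 = respond_with_etag(b"console.log('v1')", None)
status2, body2, _ = respond_with_etag(b"console.log('v1')", headers1["ETag"])
status3, _, _ = respond_with_etag(b"console.log('v2')", headers1["ETag"])
print(status1, status2, status3)  # 200 304 200
```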

Problem: We can now compare the server file with the locally cached file accurately, but the schemes above still share one major flaw: as long as max-age or Expires has not passed, the browser cannot actively detect changes to files on the server.

HTTP caching scheme

1. MD5/hash-based caching

By not caching the HTML and adding an MD5 or hash identifier to the names of static files, we solve the problem that the browser cannot skip the cache expiration time and detect file changes on its own.

Why do this? How does it work?

The HTTP caching schemes discussed earlier, both the comparison of file modification times and the comparison of content identifiers between server and browser, all rest on the premise that the file path is exactly the same on both sides.

module/js/a-hash1.js and module/js/a-hash2.js are two completely different files. When the file reference changes to module/js/a-hash2.js, the browser simply requests a-hash2.js, because it is a different file from the cached one and no HTTP cache comparison is involved. This fundamentally solves the problem of the browser being unable to ask the server for a new file before the expiration time. So in each release iteration of the project, we only need to give every modified static file a different MD5 or hash identifier.

This way, every time the HTML is loaded and rendered, file changes are picked up. If a file has not changed, the local cache is still used. If the file name has changed, the file is requested and cached again.

This can be done with an output file name pattern such as [name].[chunkhash].js in the webpack configuration.
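To show the idea independently of any build tool, here is a small sketch that derives a file name from the file content. The helper hashed_name and the 8-character digest length are assumptions for illustration, not what webpack does internally.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(path: str, content: bytes, length: int = 8) -> str:
    """Insert a short content hash before the file extension."""
    p = PurePosixPath(path)
    digest = hashlib.md5(content).hexdigest()[:length]
    return str(p.with_name(f"{p.stem}-{digest}{p.suffix}"))


old_name = hashed_name("module/js/a.js", b"console.log('v1')")
new_name = hashed_name("module/js/a.js", b"console.log('v2')")

print(old_name)
print(new_name)
print(old_name != new_name)  # True: changed content gives a new URL
```

Because the name only changes when the content changes, unchanged files keep hitting the browser cache while modified files are requested under a new URL.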

CDN cache (extended reading)

1. What is a CDN

A CDN (content delivery network) is a content distribution network built on top of the existing network. It relies on edge servers deployed in many locations, together with the load balancing, content distribution, and scheduling modules of a central platform, so that users can fetch the content they need from a nearby node. This reduces network congestion and improves response speed and hit ratio. The key CDN technologies are content storage and distribution (this is the relatively official description).

We can summarize the value of a CDN as follows:

1. By offloading traffic, a CDN greatly reduces the load on the origin server.

2. Without a CDN, it is like living in a remote area and having to travel to the city center every time you want to buy a ticket. A CDN solves this cross-region access problem and fundamentally speeds up access.

2. What is CDN caching

CDN edge nodes cache the data, and when the browser sends a request, the CDN evaluates and handles it on behalf of the origin server.

A typical request dialogue

First request

Browser: Server, big brother, I need a.js.

Server (exasperated): I've handed the file to my little brother, the CDN. From now on, go to the CDN for it and don't come to me. The server returns a.js to the CDN, which caches it. The CDN then returns it to the browser, which also caches it (Cache-Control: public is used for this).

Subsequent requests...

Browser: Server, my cache time is up. Please compare the files and see whether you need to send me a new one.

CDN node: Stop, stop, what's all the shouting? My big brother is busy. Hand it over and I'll take a look. The request has been taken over by the proxy.

Case 1: The CDN node's own cached file has not expired, so it returns 304 to the browser and turns the request back.

Case 2: The CDN node finds that its own cached file has expired. To be on the safe side, it sends a request to the origin server, retrieves the latest data, and then delivers it to the browser.
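The two cases can be simulated with a toy edge node. Everything here is made up for illustration (the EdgeNode class, the ttl parameter standing in for something like s-maxage, and the fake origin function), and one refinement is added: after refetching from the origin, the node still answers 304 if the browser's identifier matches the fresh copy.

```python
from typing import Callable, Optional


class EdgeNode:
    """Toy CDN node that caches origin responses for `ttl` seconds."""

    def __init__(self, fetch_origin: Callable[[str], tuple[bytes, str]], ttl: float) -> None:
        self.fetch_origin = fetch_origin
        self.ttl = ttl
        self.store: dict[str, tuple[bytes, str, float]] = {}

    def handle(self, path: str, if_none_match: Optional[str], now: float) -> tuple[int, bytes, str]:
        entry = self.store.get(path)
        if entry is None or now - entry[2] >= self.ttl:
            # Case 2: nothing cached or cache expired, so go back to the origin.
            body, etag = self.fetch_origin(path)
            entry = (body, etag, now)
            self.store[path] = entry
        body, etag, _ = entry
        if if_none_match == etag:
            # Case 1: the browser's copy matches the node's copy.
            return 304, b"", etag
        return 200, body, etag


origin_calls: list[str] = []


def fake_origin(path: str) -> tuple[bytes, str]:
    origin_calls.append(path)
    return b"console.log('v1')", '"v1"'


edge = EdgeNode(fake_origin, ttl=300)
first = edge.handle("/a.js", None, now=0)       # miss: fetched from origin
second = edge.handle("/a.js", '"v1"', now=100)  # hit: answered by the node
calls_after_second = len(origin_calls)
third = edge.handle("/a.js", '"v1"', now=400)   # expired: origin asked again
print(first[0], second[0], third[0], len(origin_calls))  # 200 304 304 2
```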

In fact, CDN caching has the same problem as HTTP caching: until the CDN cache time expires, the browser's requests are always intercepted and it cannot get the latest file.

But let's go back to what HTTP caching is for. First, caching targets static files that are not updated frequently. Second, CDN caching brings the additional benefits of traffic offloading and faster access. I asked some colleagues about this, and what I learned is that a CDN works like a platform: you can log in and refresh the CDN cache manually, which addresses the problem that browser caches cannot be controlled by hand.

Impact of browser operations on the HTTP cache: pressing Enter in the address bar, opening a new window, F5 refresh, and Ctrl+F5 refresh

An interesting observation is that many sites set Cache-Control to no-cache.

no-cache does not mean "do not cache". The response is cached, but as a negotiated cache: the browser unconditionally asks the server whether its cached copy is the latest. If it is, the browser keeps using it. If not, the browser requests the latest file, caches it, and uses it, and the cycle repeats.

Is it still necessary to set Expires and max-age? Yes!

When we visit a page for the first time, close it, and then open it again, the second visit is still a new-window operation. If a cache time is set, the new window uses the strong cache, which avoids downloading the files again, speeds up page rendering, and improves the user experience.

Conclusion:

  1. Pressing Enter in the address bar, clicking a link to jump to a page, going forward or back, or opening a new window: Expires and max-age take effect, so the browser checks the expiration time first and uses the local cache directly if it has not passed. (Strong cache)
  2. Pressing F5 or clicking the refresh button in the browser's navigation bar: Expires and max-age are ignored and a request is forced. Last-Modified and ETag still apply in this case. (Negotiated cache)
  3. Ctrl+F5 is a forced refresh: no cached files are used and everything is requested and downloaded again, so Expires, max-age, Last-Modified, and ETag all have no effect.
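The three conclusions can be written down as a small decision function. This only encodes the behavior described in this article; the action names and the function plan_request are invented for the sketch, and actual browsers may behave differently in the details.

```python
def plan_request(action: str, fresh: bool, has_validator: bool) -> str:
    """Decide how the cached copy is treated for a given browser action.

    action: "navigate" (Enter, link, back/forward, new window),
            "reload" (F5) or "hard_reload" (Ctrl+F5).
    fresh: True while Expires / max-age has not passed.
    has_validator: True if Last-Modified or ETag was stored with the copy.
    """
    if action == "hard_reload":
        return "full request"  # every cache header is ignored
    if action == "reload":
        # Expiration time is skipped, validators are still sent.
        return "conditional request" if has_validator else "full request"
    if fresh:
        return "use local cache"  # strong cache
    return "conditional request" if has_validator else "full request"


for action in ("navigate", "reload", "hard_reload"):
    print(action, "->", plan_request(action, fresh=True, has_validator=True))
```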

In practice, though, we rarely press Enter in the address bar or jump from it, so the expiration-time check is only triggered by a few specific operations. My understanding is that this is why so many sites set Cache-Control to no-cache: the common operations do not check the expiration time, so to make sure the latest files are always obtained, the site sets no-cache.

Strong caching vs. negotiated caching (weak caching)

Strong caching: no HTTP request is made and the local cache is used directly. This applies, for example, when pressing Enter in the address bar, clicking a link, going forward or back, or opening a new window, as long as Expires or max-age is still valid.

Negotiated caching (weak caching): before using the local cache, the browser negotiates with the server to check whether the cached file is up to date. For example, with Cache-Control: no-cache set, a request is sent no matter which operation you perform. This type of caching is called negotiated caching.

The difference between max-age and Expires

  1. max-age is an HTTP/1.1 directive and Expires is an HTTP/1.0 header. In an HTTP/1.1 environment, max-age takes precedence over Expires.

  2. max-age is a relative expiration time and Expires is an absolute expiration time. With max-age, once the browser has successfully cached the file, it simply makes no further request for the given number of seconds after the successful response. Expires, by contrast, always requires the server to return an exact GMT-format date, and that date is the standard for judging whether the cache has expired, which is comparatively troublesome to get right. That is why something like max-age exists to take its place.
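The difference between relative and absolute expiration shows up when the clocks disagree. The sketch below assumes a hypothetical client whose clock runs two hours fast: the absolute Expires date already looks expired, while the relative max-age check, which only measures elapsed time, still considers the copy fresh.

```python
from datetime import datetime, timedelta, timezone

server_now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
client_now = server_now + timedelta(hours=2)  # assumption: client clock is 2 hours fast

expires = server_now + timedelta(hours=1)  # absolute time chosen by the server
max_age = 3600                             # relative lifetime in seconds

stored_at = client_now                     # client stores the response
check_time = stored_at + timedelta(minutes=10)  # client checks 10 minutes later

# Absolute check: the client's (wrong) clock is compared against Expires.
fresh_by_expires = check_time < expires

# Relative check: only the time elapsed on the client's own clock matters.
fresh_by_max_age = (check_time - stored_at).total_seconds() < max_age

print(fresh_by_expires)  # False
print(fresh_by_max_age)  # True
```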

Similarly, Pragma: no-cache and Cache-Control: no-cache do the same thing. The only difference is that one is an HTTP/1.0 header and the other is an HTTP/1.1 directive.