I have been curious about caching for a while. Recently I read a number of relevant articles, and I would like to organize what I learned and share it with you. Discussion is welcome ~

First, let's get a general idea of what caching is all about.

Browser cache

What is browser caching?

Browser caching, also known as HTTP caching, means that when we visit a website, the browser stores a local copy of the responses so that the next time we visit the same site, it can use the local copy instead of fetching everything from the server again. The server can control this caching behavior through HTTP cache headers, which reduces the load on the server, shortens response time, and can significantly improve the performance of the website.

What are the two dimensions of the browser cache?

For browser-side caches, the rules are defined in HTTP headers and in Meta tags in HTML pages. They use two dimensions, freshness and validation, to determine whether the browser can use the cached copy directly or needs to go to the origin server for a newer version.

Freshness (expiration mechanism): the period for which the cached copy is valid. The browser considers a cached copy valid and fresh enough if it satisfies one of the following conditions:

  • It carries complete expiration control headers (HTTP headers) and is still within its validity period.
  • The browser has already used this cached copy and has already checked its freshness in the current session.

In either case, the browser will fetch the copy directly from the cache and render it.

Validation (check mechanism): when the server returns a resource, it sometimes includes the resource's entity tag (ETag) in the response headers, which the browser can use as a validator when it requests the resource again. If the validators do not match, the resource has been modified or has expired, and the browser needs to fetch the resource content again.

Freshness is the first dimension: it lets the client decide whether to send a request to the server at all. For example, if the cache lifetime has not expired, the data can be read directly from the local cache.

Validation applies when the copy saved on the client has passed its cache lifetime but the server has not actually updated the resource. The client lets the server know which version it currently holds; if that version matches the one on the server, the server tells the client: “You can use your cached copy directly. Nothing has changed on my side, so I won’t send it again.”
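
The two dimensions above can be sketched as a small client-side decision function. This is only an illustration under assumptions: the entry shape `{ body, etag, expiresAt }` and the name `decideCacheAction` are hypothetical and are not how any particular browser stores its cache.

```javascript
// Hypothetical cached entry: { body, etag, expiresAt } with expiresAt in ms since epoch.
function decideCacheAction(entry, nowMs) {
  if (!entry) {
    return { action: "fetch", headers: {} }; // nothing cached: plain request
  }
  if (typeof entry.expiresAt === "number" && nowMs < entry.expiresAt) {
    return { action: "use-cache", headers: {} }; // fresh: no request is sent
  }
  if (entry.etag) {
    // Stale but it has a validator: ask the server whether the copy is still good.
    return { action: "revalidate", headers: { "If-None-Match": entry.etag } };
  }
  return { action: "fetch", headers: {} }; // stale and nothing to validate with
}

const sampleEntry = { body: "cached body", etag: '"abc123"', expiresAt: 1000 };
console.log(decideCacheAction(sampleEntry, 500).action);  // use-cache
console.log(decideCacheAction(sampleEntry, 2000).action); // revalidate
```

Freshness is checked first because it can avoid the network entirely; validation still costs a round trip, but it avoids re-downloading the body.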

What are the main HTTP header fields that are relevant to caching?

Common cache-related headers in HTTP requests and responses are Pragma, Expires, Cache-Control, Last-Modified / If-Modified-Since, and ETag / If-None-Match.

Summary of HTTP caching mechanism

Caching behavior is primarily determined by caching policy, which is set by the content owner. These policies are articulated primarily through specific HTTP headers.

When a user initiates a static resource request, the browser obtains the resource through the following steps:

  • Local cache phase: the browser first looks for the resource locally. If the resource is found and has not expired, it is used without sending an HTTP request to the server. The relevant headers are Expires and Cache-Control.
  • Negotiated cache phase: if the resource is found in the local cache but has expired, or the browser does not know whether it has expired, the browser sends an HTTP request to the server and the server evaluates it. If the requested resource has not changed on the server, the server returns 304 and the browser uses the resource it found locally. The relevant headers are Last-Modified & If-Modified-Since, ETag & If-None-Match, and ETag & Last-Modified.
  • Cache miss phase: when the server finds that the requested resource has been modified, or the request is a new one (the resource was not found locally), it returns the resource data with status 200 if the resource exists, or 404 if the resource is not on the server.
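
The server side of the negotiation and miss phases can be sketched like this. It is a simplified model, not a real server: the in-memory `resources` map, the sample file, and the name `handleRequest` are all made up for illustration.

```javascript
// Hypothetical in-memory "server": path -> { body, etag, lastModified }.
const resources = new Map([
  ["/app.js", {
    body: "console.log('hi')",
    etag: '"v1"',
    lastModified: "Wed, 01 Jan 2020 00:00:00 GMT",
  }],
]);

function handleRequest(path, requestHeaders = {}) {
  const resource = resources.get(path);
  if (!resource) {
    return { status: 404, headers: {}, body: "" }; // not on the server
  }

  const validators = { ETag: resource.etag, "Last-Modified": resource.lastModified };
  const ifNoneMatch = requestHeaders["If-None-Match"];
  const ifModifiedSince = requestHeaders["If-Modified-Since"];

  // When both validators are sent, the ETag comparison takes precedence.
  let notModified = false;
  if (ifNoneMatch !== undefined) {
    notModified = ifNoneMatch === resource.etag;
  } else if (ifModifiedSince !== undefined) {
    notModified = Date.parse(resource.lastModified) <= Date.parse(ifModifiedSince);
  }

  return notModified
    ? { status: 304, headers: validators, body: "" }             // client reuses its copy
    : { status: 200, headers: validators, body: resource.body }; // full response
}

console.log(handleRequest("/app.js", { "If-None-Match": '"v1"' }).status); // 304
console.log(handleRequest("/missing.js").status);                          // 404
```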

What are the connections and differences between cache header fields? See the contrast!

What is the priority of the HTTP cache header field?


The priorities from high to low are Pragma -> Cache-Control -> Expires.

If both the Pragma header and the Cache-Control header are present, Pragma takes effect.

If both the Pragma header and the Expires header are present, Pragma takes effect.

If both Expires and Cache-Control are present in a response, Cache-Control prevails.

Note that the HTTP/1.1 specification keeps Pragma only for backward compatibility with HTTP/1.0 caches, so new code should express its policy with Cache-Control.
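
As a rough sketch of the order described above, the function below computes how long a response may be used without contacting the server. It follows the priority as stated in this article, not any specific browser's implementation. The name `freshnessSeconds`, the lower-case header keys, and the simplification that the response arrived exactly at `receivedMs` (no Age handling) are all assumptions.

```javascript
// headers: plain object with lower-case header names.
// receivedMs: the time the response arrived, in ms since epoch.
// Returns the freshness lifetime in seconds; 0 means "go to the server".
function freshnessSeconds(headers, receivedMs) {
  // 1. Pragma: no-cache is checked first, following the order in the text.
  if ((headers["pragma"] || "").toLowerCase().includes("no-cache")) {
    return 0;
  }

  // 2. Cache-Control: no-cache / no-store disable reuse, max-age sets the lifetime.
  const cacheControl = (headers["cache-control"] || "").toLowerCase();
  if (/\bno-cache\b|\bno-store\b/.test(cacheControl)) {
    return 0;
  }
  const maxAge = cacheControl.match(/max-age=(\d+)/);
  if (maxAge) {
    return Number(maxAge[1]);
  }

  // 3. Expires is only consulted when the headers above gave no answer.
  if (headers["expires"]) {
    const expiresMs = Date.parse(headers["expires"]);
    if (!Number.isNaN(expiresMs)) {
      return Math.max(0, Math.floor((expiresMs - receivedMs) / 1000));
    }
  }
  return 0;
}

console.log(freshnessSeconds({ "pragma": "no-cache", "cache-control": "max-age=60" }, 0)); // 0
console.log(freshnessSeconds({ "cache-control": "max-age=60" }, 0));                       // 60
```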

What is the relationship between user actions and caching?

  • Impact of operations in the browser on the cache:

  • Forced refresh – When Ctrl+F5 is pressed to refresh the page, the browser bypasses all caches (local and negotiated) and requests the latest resource directly from the server.

  • Normal refresh – When F5 is pressed to refresh the page, the browser bypasses the local cache and sends the request to the server; the negotiated cache still takes effect.

  • Enter or navigation – All caches take effect when pressing Enter in the address bar or clicking the go button.

What does the Web cache do?

  • Reduce network bandwidth consumption
  • Reduce server pressure
  • Reduce network latency and speed up page loading

Caching practices:

Given the comparison of the various HTTP cache control headers and the ways a user can refresh the page, we actually use most of the header fields mentioned above when we set up HTTP caching on a project.

Conclusion:

  • Use Expires when you need to be compatible with HTTP/1.0; otherwise consider using Cache-Control directly.
  • Use ETag only when you need to handle multiple modifications within one second, or other cases that Last-Modified cannot handle; otherwise use Last-Modified.
  • For every cacheable resource, specify either Expires or Cache-Control, and either Last-Modified or ETag.
  • The number of 304 responses can be reduced by putting a version identifier in the file name and lengthening the cache time.
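
One way to put a version identifier in the file name is to derive it from the file content, so the URL changes only when the content changes. The sketch below is just an illustration: `versionedName` is a hypothetical helper, and the small FNV-1a-style hash is used only to keep the example dependency-free (build tools usually use a cryptographic digest instead).

```javascript
// FNV-1a-style 32-bit hash over UTF-16 code units (non-cryptographic).
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash.toString(16).padStart(8, "0");
}

// "app.js" + content -> "app.<hash>.js"
function versionedName(fileName, content) {
  const dot = fileName.lastIndexOf(".");
  const base = dot === -1 ? fileName : fileName.slice(0, dot);
  const ext = dot === -1 ? "" : fileName.slice(dot);
  return `${base}.${fnv1a(content)}${ext}`;
}

console.log(versionedName("app.js", "console.log(1)"));
console.log(versionedName("app.js", "console.log(2)")); // different content, different name
```

Because the name changes with the content, the old URL can be cached for a very long time without the browser ever needing to ask the server whether it is still valid.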

Server side cache

1. CDN cache

What is CDN cache?

CDN cache is also known as gateway cache or reverse proxy cache. The browser first sends a web request to the CDN gateway. Behind the gateway are one or more load-balanced origin servers, and the gateway dynamically forwards the request to an appropriate origin server according to their current load.

What is the CDN cache policy?

CDN edge node cache policies vary between service providers, but they generally follow the HTTP standard and set the cache time of data on the edge node through the Cache-Control: max-age field in the HTTP response header.

When the client requests data from a CDN node, the node checks whether its cached data has expired. If the cached data has not expired, the node returns it directly to the client. Otherwise, the node sends a back-to-origin request to pull the latest data from the origin server, updates its local cache, and returns the latest data to the client.

CDN service providers generally allow the CDN cache time to be specified along several dimensions, such as file suffix and directory, to give users finer-grained cache management.

CDN cache time has a direct impact on the “back-to-origin rate”. If the CDN cache time is too short, the data on the CDN edge nodes expires often, leading to frequent back-to-origin requests, which increases the load on the origin server and the access latency. If the CDN cache time is too long, data updates are slow to reach users. Developers need to manage cache times case by case according to the specific business.

CDN cache refresh

CDN edge nodes are transparent to developers. Unlike the browser’s Ctrl+F5 forced refresh, which invalidates the browser’s local cache, CDN edge node caches are cleared through the “cache refresh” interface provided by the CDN service provider. In this way, after updating data, developers can use the “refresh cache” function to force the cached data on the CDN nodes to expire, ensuring that clients pull the latest data on their next access.
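
The edge node behavior (expiry check, back-to-origin fetch, and cache refresh) can be modeled in a few lines. This is a toy model under assumptions: the class `EdgeNode`, its `purge` method, and the `fetchFromOrigin` callback are hypothetical names and do not correspond to any CDN provider's real API.

```javascript
class EdgeNode {
  // fetchFromOrigin: hypothetical callback, path -> { body, maxAgeSeconds }
  constructor(fetchFromOrigin) {
    this.fetchFromOrigin = fetchFromOrigin;
    this.cache = new Map(); // path -> { body, expiresAt }
    this.originHits = 0;    // number of back-to-origin requests
  }

  get(path, nowMs) {
    const cached = this.cache.get(path);
    if (cached && nowMs < cached.expiresAt) {
      return cached.body; // not expired: answer from the edge
    }
    // Expired or missing: go back to the origin and refresh the local copy.
    this.originHits += 1;
    const { body, maxAgeSeconds } = this.fetchFromOrigin(path);
    this.cache.set(path, { body, expiresAt: nowMs + maxAgeSeconds * 1000 });
    return body;
  }

  // Models the provider's "cache refresh" interface.
  purge(path) {
    this.cache.delete(path);
  }
}

let originVersion = 1;
const edge = new EdgeNode(() => ({ body: `v${originVersion}`, maxAgeSeconds: 60 }));
console.log(edge.get("/index.css", 0));      // v1 (fetched from origin)
originVersion = 2;                           // the site is updated
console.log(edge.get("/index.css", 30000));  // v1 (still cached on the edge)
edge.purge("/index.css");
console.log(edge.get("/index.css", 30000));  // v2 (after the cache refresh)
```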

What are the advantages of CDN?

  • CDN nodes solve the problem of cross-carrier and cross-region access, and access latency is greatly reduced.
  • Most requests are served at the CDN edge, which offloads traffic and reduces the load on the origin server.

What are the disadvantages of CDN?

  • When the website is updated, if the data on the CDN nodes is not updated in time, then even if the user invalidates the browser-side cache with Ctrl+F5, the CDN edge node has not yet synchronized the latest data, which leads to abnormal behavior for the user.

2. Combo service

What is the Combo service?

With the Combo service, when we finally assemble and generate the resource references for a page, we do not generate multiple independent link tags. Instead, we splice the resource addresses into a single URL path that requests an online dynamic resource-combining service, thereby reducing the number of HTTP requests.

A request to a URL such as /??file1,file2,file3,… is answered by the dynamic Combo service. The principle is very simple: it finds the corresponding files according to the URL, combines them into one file to respond to the request, and caches the result to speed up later access.
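
A minimal sketch of that principle is shown below. The function names `parseComboUrl` and `serveCombo`, the `readFile` callback, and the rule that the part before `??` is used as a common path prefix are assumptions made for illustration; a real Combo service would also validate paths and set response headers.

```javascript
// "/js/??a.js,b.js" -> ["/js/a.js", "/js/b.js"]
function parseComboUrl(url) {
  const marker = url.indexOf("??");
  if (marker === -1) return [];
  const prefix = url.slice(0, marker);
  return url
    .slice(marker + 2)
    .split(",")
    .map((name) => name.trim())
    .filter(Boolean)
    .map((name) => prefix + name);
}

// readFile: hypothetical callback, path -> file content.
// cache: combo URL -> combined body, so repeated requests skip the file reads.
function serveCombo(url, readFile, cache = new Map()) {
  if (cache.has(url)) return cache.get(url);
  const body = parseComboUrl(url).map((path) => readFile(path)).join("\n");
  cache.set(url, body);
  return body;
}

console.log(parseComboUrl("/??file1,file2,file3")); // [ '/file1', '/file2', '/file3' ]
```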

What are the defects of the Combo service?

  • Browsers have URL length limits, so you can’t merge resources indefinitely.
  • If the user jumps between two pages of the website that share common resources but whose combo URLs differ, the browser cache cannot be used to speed up access to those common resources. In addition, if any file in a combo URL changes, the cache for the entire URL is invalidated, which reduces browser cache utilization.

HTML5 caching

1. HTML5 offline app cache

Reasons to use it:

  • Users can access your application offline, which is especially important for mobile users who can’t stay connected all the time
  • Reading cached files locally usually means faster access
  • Only modified resources are loaded, which avoids repeated requests to the server for the same resource and greatly reduces the load on the server

The manifest file lists files that need to be cached.

There are a few things to note about this process:

  • If the server updates an offline resource, the manifest file must be updated before the resource can be re-downloaded by the browser. If the resource is updated without updating the manifest file, the browser does not re-download the resource, which means that the resource stored offline is still used.
  • You need to be very careful when caching the manifest file, because there could be a situation where you update the manifest file, but HTTP caching rules tell the browser that the locally cached manifest file is not out of date, in which case the browser still uses the original manifest file. So it’s best not to set caching for manifest files.
  • When the browser downloads resources from the manifest file, it downloads all resources at once. If a resource fails to download for some reason, then all updates fail and the browser will still use the original resource.
  • After the resources are updated, the new resources do not take effect until the next time the app is opened. If they need to take effect immediately, you can call window.applicationCache.swapCache(). The reason is that the browser first loads the page from the offline resources and only then checks the manifest for updates, so an update does not take effect until the next time the page is opened.

2. localStorage
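
The usual pattern for making an update take effect right away is to wait for the `updateready` event and then swap the cache. In the sketch below, the helper `enableImmediateSwap` and its `onSwapped` callback are hypothetical. The cache object is passed in as a parameter because the application cache API is deprecated and missing from current browsers.

```javascript
// appCache: an object with the window.applicationCache interface (may be undefined).
// onSwapped: called after the new cache has been swapped in.
function enableImmediateSwap(appCache, onSwapped) {
  if (!appCache) return false; // API not available in this environment

  appCache.addEventListener("updateready", () => {
    // The browser has downloaded every resource listed in the updated manifest.
    if (appCache.status === appCache.UPDATEREADY) {
      appCache.swapCache();
      onSwapped();
    }
  });
  return true;
}

// In a browser that still exposes the API, this could be wired up as:
// enableImmediateSwap(window.applicationCache, () => window.location.reload());
```

Resources that the page has already loaded are not replaced by swapCache() alone, which is why the usage comment reloads the page afterwards.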

    // Property-style access
    localStorage.fresh = "vfresh.org";
    var a = localStorage.fresh;

    // API
    // Clear the storage
    localStorage.clear();
    // Set a key and value
    localStorage.setItem("fresh", "vfresh.org");
    // Get the value of a key
    localStorage.getItem("fresh"); // returns "vfresh.org"
    // Get the name of the key at the given index (as in an Array)
    localStorage.key(0); // returns "fresh"
    // Remove a key
    localStorage.removeItem("fresh");

The main differences from sessionStorage are storage lifetime and scope: localStorage data persists until it is explicitly cleared and is shared by all pages of the same origin, while sessionStorage data lasts only for the current tab's session.

Strictly speaking, localStorage is more of a local data store, like cookies, than a cache. But in addition to the standard caching mechanisms, developers can use browser capabilities such as this one to implement a custom client-side “cache.”

A guide to localStorage pitfalls:
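
As one possible illustration of such a custom “cache”, the sketch below stores each value together with an expiry time. The helpers `setCached`, `getCached`, and `createMemoryStorage` are hypothetical names. The in-memory storage is only a stand-in so the example also runs outside a browser; in a page you would pass window.localStorage instead.

```javascript
// Store a value with a time-to-live, using any object with the Storage methods.
function setCached(storage, key, value, ttlMs, nowMs = Date.now()) {
  storage.setItem(key, JSON.stringify({ value, expiresAt: nowMs + ttlMs }));
}

// Return the value, or undefined if it is missing, expired, or unreadable.
function getCached(storage, key, nowMs = Date.now()) {
  const raw = storage.getItem(key);
  if (raw === null) return undefined;
  try {
    const { value, expiresAt } = JSON.parse(raw);
    if (nowMs >= expiresAt) {
      storage.removeItem(key); // expired: clean up so it does not take space
      return undefined;
    }
    return value;
  } catch {
    storage.removeItem(key);   // not in the expected JSON shape
    return undefined;
  }
}

// Minimal in-memory stand-in for window.localStorage.
function createMemoryStorage() {
  const map = new Map();
  return {
    getItem: (key) => (map.has(key) ? map.get(key) : null),
    setItem: (key, value) => { map.set(key, String(value)); },
    removeItem: (key) => { map.delete(key); },
  };
}

const demoStorage = createMemoryStorage();
setCached(demoStorage, "fresh", "vfresh.org", 1000, 0);
console.log(getCached(demoStorage, "fresh", 500));  // vfresh.org
console.log(getCached(demoStorage, "fresh", 1500)); // undefined
```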

Points jero must understand using localStorage

Suggestions for building cacheable sites

  • Keep the URL of the same resource stable
  • Add HTTP cache headers to CSS, JS, images, etc., and force the entry HTML not to be cached
  • Reduce your dependence on cookies
  • Reduce the use of HTTPS
  • Use GET to request dynamic CGI
  • Dynamic CGI can also be cached
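
The second suggestion can be sketched as a function that picks response headers by resource type. This is only one possible policy: the name `cacheHeadersFor`, the one-year max-age, the list of extensions, and the choice of `no-cache` (always revalidate) for the entry HTML are assumptions. The long lifetime assumes that static files carry a version in their URL.

```javascript
// path: request path, possibly with a query string.
function cacheHeadersFor(path) {
  const cleanPath = path.split("?")[0].toLowerCase();

  // Static assets: cache for a long time.
  if (/\.(css|js|png|jpe?g|gif|svg|webp)$/.test(cleanPath)) {
    return { "Cache-Control": "public, max-age=31536000" };
  }

  // Entry HTML: the browser must check with the server before each reuse.
  if (cleanPath.endsWith(".html") || cleanPath.endsWith("/")) {
    return { "Cache-Control": "no-cache" };
  }

  return {}; // everything else: leave the decision to the application
}

console.log(cacheHeadersFor("/static/app.js"));
console.log(cacheHeadersFor("/index.html"));
```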

That's the summary. Discussion is welcome ~