For a site, performance is all about user experience: the faster the site opens, the more users you retain. If your page takes 10 seconds to open, good user interaction is useless.
Cache control is a very common and important part of site performance optimization. Besides improving performance, it also pays off financially: a better cache policy means fewer requests, less traffic, and less bandwidth, which saves a good deal on server or CDN costs.
The cache control policy is the HTTP caching policy. The most effective policy is usually very simple. At the simplest level, all you need to know about HTTP caching is the Cache-Control header.
A good caching policy requires only two parts, and both are controlled by Cache-Control alone:
- Fingerprinted resources: permanent cache
- Resources without fingerprints: Check the freshness each time
The diagram is as follows:
Fingerprinted resources: permanent cache
Cache-Control: max-age=31536000
As the martial-arts saying goes, every technique can be beaten except speed. The fastest way to request a resource is not to send a request to the server at all.
- Static resources carry a hash value in their URL, which serves as the fingerprint
- Setting an expiration time of one year for a resource, i.e. max-age=31536000 (seconds), is generally treated as a permanent cache
- While the permanent cache is valid, the browser does not need to send any request to the server
Why can resources with hash values be cached permanently?
Because when the contents of the file change, a URL with a new hash value is generated. The front end will initiate a request for a new URL.
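As a rough illustration of why a changed file ends up at a new URL, here is a minimal sketch of content-based fingerprinting. The function name fingerprinted_name, the use of SHA-256, and the 8-character hash length are assumptions made for this example; real build tools choose their own hash and naming scheme.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_name(filename: str, content: bytes, length: int = 8) -> str:
    """Insert a short content hash before the file extension.

    Hypothetical helper: only the base name is returned, any directory
    part of `filename` is dropped.
    """
    digest = hashlib.sha256(content).hexdigest()[:length]
    path = PurePosixPath(filename)
    return f"{path.stem}.{digest}{path.suffix}"


# Same file name, different contents: the fingerprint (and so the URL) differs.
old_name = fingerprinted_name("main.js", b"console.log('v1')")
new_name = fingerprinted_name("main.js", b"console.log('v2')")
print(old_name, new_name)
```

Because the name depends only on the contents, an unchanged file keeps its URL and its cached copy, while an edited file is requested under a name the browser has never cached.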
Resources without fingerprints: Check the freshness each time
Cache-Control: no-cache
- Since there is no fingerprint, the freshness of the resource must be verified every time. (A copy taken straight from the cache might be stale.)
- If the cached copy is validated as the latest version, the resource is loaded from the browser’s cache
index.html is a resource without a fingerprint. If it were simply cached, how could you make sure the browser gets the fresh version once the server updates it?
Therefore, with Cache-Control: no-cache, the client checks freshness with the server on every request.
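The two-part policy can be sketched as a single function that picks the Cache-Control value from the request path. This is only a sketch: the name cache_control_for and the naming convention it detects (a dot, at least 8 lowercase hex characters, then the extension) are assumptions for illustration, not something a particular server does.

```python
import re

# Assumed convention: name.<8+ hex chars>.ext, e.g. main.3f2a9c1b.js
FINGERPRINT = re.compile(r"\.[0-9a-f]{8,}\.[A-Za-z0-9]+$")


def cache_control_for(path: str) -> str:
    """Return the Cache-Control value for a static resource path."""
    if FINGERPRINT.search(path):
        # Fingerprinted: content changes produce a new URL, so cache for a year.
        return "max-age=31536000"
    # No fingerprint: the client must revalidate before reusing its copy.
    return "no-cache"


print(cache_control_for("/static/main.3f2a9c1b.js"))  # max-age=31536000
print(cache_control_for("/index.html"))               # no-cache
```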
PS: What is the difference between no-cache and no-store?
Even when freshness is checked every time, the resource does not have to be downloaded from the server again if the copy in the browser or CDN cache is still current. This is called negotiated caching: the server answers with HTTP status code 304, meaning Not Modified.
Fortunately, you don’t need to manage or configure negotiated caching yourself, as nginx or an object storage service (OSS) will set it up automatically.
They have their own algorithms for negotiated caching, based on the Last-Modified/ETag response headers. Each time a browser requests a resource, it sends back the ETag/Last-Modified value from the server's previous response (in the If-None-Match/If-Modified-Since request headers), and the server compares it with its current ETag/Last-Modified to decide whether the content has changed.
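The negotiation can be sketched in a few lines. This is a deliberately simplified model, not what nginx or any real server implements: it uses the standard If-None-Match request header, compares a single ETag by exact string match, and ignores Last-Modified, weak validators, and lists of ETags. The function name respond is hypothetical.

```python
from typing import Dict, Mapping, Optional, Tuple


def respond(
    request_headers: Mapping[str, str],
    current_etag: str,
    body: bytes,
) -> Tuple[int, Dict[str, str], Optional[bytes]]:
    """Return (status, response headers, body) for a revalidation request."""
    headers = {"ETag": current_etag, "Cache-Control": "no-cache"}
    if request_headers.get("If-None-Match") == current_etag:
        # The client's copy is still current: no body is sent.
        return 304, headers, None
    # First visit, or the content changed: send the full resource.
    return 200, headers, body


first = respond({}, '"v1"', b"<html>v1</html>")
again = respond({"If-None-Match": '"v1"'}, '"v1"', b"<html>v1</html>")
changed = respond({"If-None-Match": '"v1"'}, '"v2"', b"<html>v2</html>")
print(first[0], again[0], changed[0])  # 200 304 200
```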
How is the ETag value generated in the HTTP response header?
At the lowest level, Last-Modified is usually derived from the file's mtime attribute in the file system. ETag provides finer granularity than Last-Modified and is generated either from a hash of the file contents or from the file's mtime and size. Of course, that comes later.
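Both ways of generating an ETag can be sketched as follows. The function names, the hexadecimal mtime-size layout, and the choice of SHA-1 are assumptions for illustration; each server defines its own exact format.

```python
import hashlib


def etag_from_mtime_size(mtime: int, size: int) -> str:
    """Cheap ETag: changes whenever the modification time or size changes."""
    return f'"{mtime:x}-{size:x}"'


def etag_from_content(content: bytes) -> str:
    """Content-based ETag: changes only when the bytes themselves change."""
    return '"' + hashlib.sha1(content).hexdigest() + '"'


print(etag_from_mtime_size(1700000000, 1024))  # "6553f100-400"
print(etag_from_content(b"hello"))
```

The mtime/size variant needs only file metadata, while the content hash requires reading the file but stays stable if a file is rewritten with identical bytes.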
Be sure to add a Cache-Control response header to your resources
I often come across sites whose resource files have no Cache-Control header. The reason is that it is unclear whose job it is to configure the caching policy, and sometimes it requires coordination between the front end and operations.
So what happens if you don’t add the Cache-Control response header?
Does the browser automatically go to the server to check freshness every time? Unfortunately, it does not. In this case the resource is still placed in the forced (strong) cache by default, so stale copies of resources without fingerprints are likely to be served. If the stale copy sits in the browser, you can force a refresh in the browser to get the latest version. However, if the stale copy sits on a CDN edge node, purging the CDN is much more complicated and may require cooperation from several parties.
What is the default forced-cache duration?
The first step is to clarify what two response headers represent:
- Date: the time at which the origin server generated the response message, which is roughly the time at which it handled the request
- Last-Modified: the time at which the static resource was last modified, i.e. its mtime
Under the LM-Factor algorithm, when a response carries no Cache-Control header, the longer ago the resource was last modified, the longer the resulting forced-cache duration.
The formula is expressed as follows, where factor is between 0 and 1:
MaxAge = (Date - LastModified) * factor
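The formula can be worked through with concrete header values. In this sketch the function name heuristic_max_age and the default factor of 0.1 are assumptions (RFC 9111 mentions 10% as a typical fraction, but each browser or CDN picks its own value), and the header values are made up for the example.

```python
from email.utils import parsedate_to_datetime


def heuristic_max_age(date: str, last_modified: str, factor: float = 0.1) -> int:
    """MaxAge = (Date - LastModified) * factor, in whole seconds."""
    age = parsedate_to_datetime(date) - parsedate_to_datetime(last_modified)
    # Guard against a Last-Modified value that lies after Date.
    return max(0, round(age.total_seconds() * factor))


# The resource was last modified 10 days before the response was generated.
max_age = heuristic_max_age(
    date="Wed, 21 Oct 2015 07:28:00 GMT",
    last_modified="Sun, 11 Oct 2015 07:28:00 GMT",
)
print(max_age)  # 86400, i.e. cached for one day without asking the server
```

So a file untouched for 10 days would be reused for about a day under this factor, which is exactly how a stale index.html can linger when no Cache-Control header is sent.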
Bundle Splitting: Minimize resource changes
Thanks to the development of single-page applications and front-end engineering, almost all resources are packaged with fingerprint information, which means that all resources can be permanently cached. The packaging strategy is shown below:
But is that all?
If all your JS resources are packaged into one file, it does have the advantage of permanent caching. But when a single line in any source file is modified, the fingerprint of the whole bundle changes and the permanent cache is invalidated.
So what we need to do now is make sure that modifying a file invalidates as little of the cache as possible. Packaging tools such as webpack have many performance optimizations built into their optimization options, but they don’t do this for you; you need to do it yourself.
At this point we can split the bundle so that resources are cached in layers. The following is a suggested scheme:
- webpack-runtime: the webpack version used by an application is fairly stable, so split the runtime out to guarantee a long-lived permanent cache
- react/react-dom: react is also updated infrequently
- vendor: commonly used third-party modules packaged together, such as lodash and classnames, which are referenced by almost every page but are updated more frequently. Rarely used third-party modules should not be put into this bundle
- pageA: page A; when the components of page A change, its cache is invalidated
- pageB: page B
- echarts: a rarely used, oversized third-party module, packaged separately
- mathjax: a rarely used, oversized third-party module, packaged separately
- jspdf: a rarely used, oversized third-party module, packaged separately
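The effect of splitting can be simulated without a bundler. This sketch hashes each chunk independently and checks which fingerprints change after an edit to page A; the function name chunk_fingerprints and the placeholder module sources are made up for the example and do not reflect how webpack computes its hashes.

```python
import hashlib
from typing import Dict, List


def chunk_fingerprints(chunks: Dict[str, List[str]]) -> Dict[str, str]:
    """Hash the module sources of each chunk independently."""
    result = {}
    for name, modules in chunks.items():
        digest = hashlib.sha256()
        for source in modules:
            digest.update(source.encode("utf-8"))
        result[name] = digest.hexdigest()[:8]
    return result


before = {
    "vendor": ["/* lodash */", "/* classnames */"],
    "pageA": ["export const A = 1"],
    "pageB": ["export const B = 1"],
}
# Only a component of page A is edited.
after = {**before, "pageA": ["export const A = 2"]}

old, new = chunk_fingerprints(before), chunk_fingerprints(after)
invalidated = [name for name in old if old[name] != new[name]]
print(invalidated)  # ['pageA']
```

Had all three chunks been packaged as a single file, the same one-line edit would have changed the only fingerprint and forced every user to download everything again.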
With the adoption of HTTP/2, and multiplexing in particular, loading the static resources of the initial page is far less sensitive to the number of resources. Therefore, for better caching and on-demand loading, many solutions also suggest packaging each third-party module as its own separate bundle.
Summary
What the interviewers are looking at