Foreword
Caching plays an important role in Web development, especially when building high-load Web systems.
For more on performance optimization, see also: HTML, CSS, and JS optimization; page loading speed optimization; and network transport layer optimization.
Recommended reading time: 5 minutes
In this article
- Caching concepts
- The role of cache
- Caching mechanisms
- Cache policy details
Caching concepts
- Classification of caches: server caches (proxy server caches, CDN caches), third party caches, browser caches, etc.
- Terms related to caching:
  - Cache hit ratio: The ratio of requests served from the cache to the total number of requests. The higher, the better.
  - Expired content: Content marked as "stale" once its validity period has passed. Expired content usually cannot be used to answer client requests; the cache must either fetch fresh content from the origin server or verify that the cached copy is still usable.
  - Validation: Checking whether expired content in the cache is still valid and, if so, refreshing its expiration time or policy.
  - Invalidation: Removing content from the cache. Content must be invalidated when the underlying resource changes.
- Note: The browser cache is the cheapest of these, because it lives on the client and consumes almost no server resources (in the extreme case, the site behaves like a set of purely static pages).
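The terms above can be made concrete with a toy in-memory cache. This is only a minimal sketch: the TinyCache class, its ttl parameter, and the "/logo.png" key are hypothetical names chosen for illustration, and time is passed in explicitly so the behavior is easy to follow.

```python
from typing import Dict, Optional, Tuple


class TinyCache:
    """Toy in-memory cache; the class and its method names are hypothetical."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl  # validity period in seconds
        self.entries: Dict[str, Tuple[str, float]] = {}  # key -> (value, expires_at)
        self.hits = 0
        self.requests = 0

    def get(self, key: str, now: float) -> Optional[str]:
        """Return the cached value, or None if it is missing or stale."""
        self.requests += 1
        entry = self.entries.get(key)
        if entry is None or now >= entry[1]:
            return None  # miss or expired content: go back to the origin
        self.hits += 1
        return entry[0]

    def put(self, key: str, value: str, now: float) -> None:
        """Store a value and (re)start its validity period."""
        self.entries[key] = (value, now + self.ttl)

    def invalidate(self, key: str) -> None:
        """Remove content from the cache, e.g. because the resource changed."""
        self.entries.pop(key, None)

    def hit_ratio(self) -> float:
        """Requests served from the cache divided by all requests."""
        return self.hits / self.requests if self.requests else 0.0


cache = TinyCache(ttl=60)
cache.get("/logo.png", now=0)        # miss: nothing cached yet
cache.put("/logo.png", "v1", now=0)
cache.get("/logo.png", now=10)       # hit: still within the validity period
cache.get("/logo.png", now=61)       # stale: the validity period has passed
print(cache.hit_ratio())             # 1 hit out of 3 requests
```

A real cache would revalidate the stale entry with the origin instead of simply treating it as a miss; that step is left out here to keep the sketch short.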
The role of cache
- Reduce network bandwidth consumption
- Reduce server load
- Reduce network latency and speed up page opening
Caching mechanisms
- The strong cache takes precedence over the negotiated cache. If the strong cache is still valid, the browser uses the cached copy directly without contacting the server. If the strong cache has expired, the browser falls back to the negotiated cache.
- With the negotiated cache, the server decides whether the cached copy can still be used. If it can, the server returns 304 and the browser keeps using its cache. If it cannot, the server returns the full resource again and the browser stores the new result in its cache.
[Figure: cache policy flow. Image source: IMWeb front end]
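The decision order described above (strong cache first, then negotiated cache, then a full fetch) can be sketched as follows. The Origin and CacheEntry classes and the load function are hypothetical stand-ins, not a real browser or server API, and the ETag is used here as the validator purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Origin:
    """Stand-in for the origin server (hypothetical)."""
    body: str
    etag: str

    def get(self, if_none_match: Optional[str]) -> Tuple[int, str]:
        # The server decides: 304 if the client's copy is still current.
        if if_none_match == self.etag:
            return 304, ""
        return 200, self.body


@dataclass
class CacheEntry:
    body: str
    etag: str
    expires_at: float  # the strong cache is valid until this time


def load(entry: Optional[CacheEntry], now: float, origin: Origin) -> Tuple[str, str]:
    """Return (where the response came from, body)."""
    if entry is not None and now < entry.expires_at:
        return "strong cache", entry.body            # no request is sent
    status, body = origin.get(entry.etag if entry else None)
    if status == 304:
        return "negotiated cache (304)", entry.body  # reuse the local copy
    return "origin (200)", body                      # caller stores the new result


entry = CacheEntry(body="v1", etag="abc", expires_at=100)
print(load(entry, now=50, origin=Origin("v1", "abc")))   # ('strong cache', 'v1')
print(load(entry, now=150, origin=Origin("v1", "abc")))  # ('negotiated cache (304)', 'v1')
print(load(entry, now=150, origin=Origin("v2", "def")))  # ('origin (200)', 'v2')
```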
HTTP headers involved in caching
Expires (strong caching mechanism)
- Value: Expires is an absolute time in GMT format. The date and time must be Greenwich Mean Time (GMT), not local time. For example: Expires: Fri, 30 Oct 1998 14:19:41 GMT
- What it does: Tells the cache how long the copy remains fresh. After that time, the cache sends a request to the origin server to check whether the document has been modified.
- Compatibility: Supported by almost all caching servers.
- Rule: The expiration time can be set based on either the last time the client fetched the copy (last access time) or the last time the document was modified on the server.
- Application:
  - Especially useful for caching static image files such as navigation bars and image buttons. Because these images rarely change, you can give them a very long expiration time, which makes your site feel much more responsive to users.
  - Also useful for pages that change on a regular schedule. For example, if you update a news page at 6 a.m. every day, you can set the copy to expire at that same time, so that caches know when to pick up the updated version without the user having to press the browser's "refresh" button.
  - The value of the Expires header must be a date and time in HTTP format. Anything else is interpreted as being in the past, so the copy is treated as already expired.
- Limitations: Although Expires is useful, it has some limitations.
  - First, because a date is involved, the clocks on the Web server and the cache server must be synchronized. If they are not, cached content either expires too early or stale content is served past its intended expiration.
  - Second, if you set the expiration to a fixed time and do not update it when the content is next returned, then once that time has passed every subsequent request goes to the origin Web server, increasing load and response time.
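As a small illustration of the Expires rules above, the sketch below builds an Expires value in HTTP date format and checks freshness against it, using only the Python standard library. The function names make_expires and is_fresh are hypothetical. Treating an unparseable value as already expired mirrors the rule stated above.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def make_expires(now: datetime, lifetime: timedelta) -> str:
    """Format an absolute expiration time as an HTTP date (always GMT)."""
    return format_datetime(now + lifetime, usegmt=True)


def is_fresh(expires_value: str, now: datetime) -> bool:
    """Return True while the copy may still be served from the cache."""
    try:
        expires_at = parsedate_to_datetime(expires_value)
    except (TypeError, ValueError):
        return False  # anything that is not an HTTP date counts as expired
    if expires_at.tzinfo is None:
        # HTTP dates are defined in GMT, so assume UTC if no zone was parsed.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return now < expires_at


now = datetime(1998, 10, 30, 13, 19, 41, tzinfo=timezone.utc)
header = make_expires(now, timedelta(hours=1))
print(header)                                       # Fri, 30 Oct 1998 14:19:41 GMT
print(is_fresh(header, now))                        # True
print(is_fresh(header, now + timedelta(hours=2)))   # False
print(is_fresh("0", now))                           # False
```

Note that is_fresh compares against the cache's own clock, which is exactly why unsynchronized clocks cause the problems listed under the limitations.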
Cache-Control (strong caching mechanism)
- Value: max-age=[seconds]
  - The maximum time for which the cached copy is considered fresh.
  - A relative time, not an absolute time.
  - In seconds: the number of seconds from the time of the request until the copy expires.
- What it does: Gives website publishers more control over their content and addresses the limitations of an absolute expiration time. It was introduced in HTTP 1.1 to compensate for the shortcomings of Expires.
- Related directives:
  - s-maxage=[seconds]: Similar to max-age, except that it applies only to shared caches (for example, proxy servers).
  - public: Marks authenticated responses as cacheable. Normally, content that is only accessible after HTTP authentication is not cached automatically.
  - no-cache: Forces the cache to submit each request to the origin server for validation before releasing a cached copy. This is useful for making sure authentication is respected (in combination with public), or for keeping content strictly up to date without giving up all the benefits of caching.
  - no-store: Instructs caches not to keep a copy under any circumstances.
  - must-revalidate: Tells caches that they must obey the freshness information you give for a copy.
  - proxy-revalidate: Similar to must-revalidate, except that it applies only to proxy caches.
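To show how a cache might read these directives, here is a minimal sketch of a Cache-Control parser. The function names are hypothetical, and the parser is deliberately simplified: it ignores quoted values that contain commas and does not implement every directive.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        name = name.strip().lower()
        if name:
            directives[name] = arg.strip().strip('"') or None
    return directives


def may_store(directives: Dict[str, Optional[str]]) -> bool:
    """no-store forbids keeping any copy."""
    return "no-store" not in directives


def freshness_seconds(directives: Dict[str, Optional[str]], shared_cache: bool) -> Optional[int]:
    """Seconds the copy may be served without revalidation, or None if unspecified."""
    if "no-cache" in directives:
        return 0  # may be stored, but must be validated before every use
    if shared_cache and directives.get("s-maxage"):
        return int(directives["s-maxage"])  # only shared caches honor s-maxage
    if directives.get("max-age"):
        return int(directives["max-age"])
    return None


d = parse_cache_control("public, max-age=600, s-maxage=3600")
print(freshness_seconds(d, shared_cache=False))  # 600
print(freshness_seconds(d, shared_cache=True))   # 3600
print(may_store(parse_cache_control("no-store")))  # False
```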
Last-Modified/If-Modified-Since (negotiated cache mechanism)
- The server usually knows when the data you requested was last modified, and HTTP provides a way for the server to send that last-modified date along with the data (the Last-Modified response header).
- When you request the same data a second (or third, or fourth) time, you can tell the server the last-modified date you received: send an If-Modified-Since header in the request, containing the date you got from the server last time.
- If the data hasn't changed since then, the server returns the special HTTP status code 304, which means "this data hasn't changed since the last time you asked for it."
- When the server sends status code 304, it does not resend the data, so you don't download the same data over and over when it hasn't been updated.
- Compatibility: All modern browsers support Last-Modified.
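The exchange above can be sketched from the server's side as follows. This is an illustrative handler only: respond_last_modified is a hypothetical function, headers are plain dictionaries rather than a real framework's request object, and the dates are made-up sample values.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond_last_modified(request_headers: dict, last_modified: datetime, body: bytes):
    """Return (status, headers, body) for a GET, honoring If-Modified-Since."""
    last_modified = last_modified.replace(microsecond=0)  # HTTP dates have 1-second resolution
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            since_dt = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            since_dt = None  # ignore a malformed header and send the full body
        if since_dt is not None and since_dt.tzinfo is not None and last_modified <= since_dt:
            return 304, headers, b""  # unchanged: the data is not resent
    return 200, headers, body


modified = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)  # sample value
status, headers, _ = respond_last_modified({}, modified, b"hello")
print(status)  # 200

# The client echoes the date it received on its next request.
conditional = {"If-Modified-Since": headers["Last-Modified"]}
print(respond_last_modified(conditional, modified, b"hello")[0])  # 304
print(respond_last_modified(conditional, modified + timedelta(minutes=5), b"hello v2")[0])  # 200
```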
ETag/If-None-Match (negotiated cache mechanism)
- Function: Avoid re-downloading data that has not changed.
- How it works:
  - ETag is a response header returned by the server when the resource was last loaded. It is a unique identifier for the resource; whenever the resource changes, the ETag is regenerated.
  - The next time the browser requests the resource, it sends the previously returned ETag value in the If-None-Match request header. The server compares the If-None-Match value sent by the client with the ETag of the resource on the server.
  - If the ETags do not match, the server responds with a normal GET 200 and sends the new resource (including, of course, the new ETag) to the client. If they match, the server returns 304 and tells the client to use its local cache.
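A minimal sketch of this flow is shown below. Deriving the ETag from a truncated SHA-256 hash of the body is an assumption made for illustration; servers are free to generate ETags in other ways. The function names are hypothetical, and the comparison is simplified: a real If-None-Match header may carry a list of ETags or "*".

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive an identifier that changes whenever the content changes."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond_with_etag(request_headers: dict, body: bytes):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""  # match: the client keeps using its local copy
    return 200, {"ETag": etag}, body     # no match: send the new resource and new ETag


status, headers, _ = respond_with_etag({}, b"hello")
print(status)  # 200

# The client sends the stored ETag back in If-None-Match.
conditional = {"If-None-Match": headers["ETag"]}
print(respond_with_etag(conditional, b"hello")[0])     # 304
print(respond_with_etag(conditional, b"hello v2")[0])  # 200
```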
Comparison of several caching strategies
Comparison of the two strong caching mechanisms: Expires vs. Cache-Control
- Difference: There is little difference between them. Expires is a product of HTTP 1.0; Cache-Control is a product of HTTP 1.1.
- Priority: If both are present, Cache-Control takes precedence over Expires. Expires is more of a fallback for environments that do not support Cache-Control.
- Shared drawback: A strong caching mechanism only checks whether the cache has passed its expiration time; it does not check whether the resource on the server has been updated. Using these two strategies alone can therefore leave the client with out-of-date resources.
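The priority rule can be sketched as a function that computes how long a response stays fresh: max-age wins when present, and Expires is only consulted as a fallback. The function name freshness_lifetime and the sample header values are hypothetical, and the Cache-Control parsing is simplified.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def freshness_lifetime(headers: dict) -> Optional[float]:
    """Seconds the response stays fresh, or None if no strong-cache header is set."""
    for part in headers.get("Cache-Control", "").split(","):
        name, _, arg = part.strip().partition("=")
        if name.strip().lower() == "max-age" and arg.strip().isdigit():
            return float(arg.strip())  # Cache-Control takes precedence
    if "Expires" in headers and "Date" in headers:
        try:
            expires = parsedate_to_datetime(headers["Expires"])
            date = parsedate_to_datetime(headers["Date"])
            return max((expires - date).total_seconds(), 0.0)
        except (TypeError, ValueError):
            return 0.0  # an invalid Expires value counts as already expired
    return None


both = {
    "Date": "Fri, 30 Oct 1998 13:19:41 GMT",
    "Expires": "Fri, 30 Oct 1998 14:19:41 GMT",
    "Cache-Control": "max-age=60",
}
print(freshness_lifetime(both))  # 60.0

expires_only = {k: v for k, v in both.items() if k != "Cache-Control"}
print(freshness_lifetime(expires_only))  # 3600.0
```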
Comparison of the two negotiated caching mechanisms: Last-Modified/If-Modified-Since vs. ETag/If-None-Match
- Accuracy: ETag is clearly better. Last-Modified has a resolution of one second, so a change made within the same second as the previous one cannot be detected, whereas the ETag changes whenever the resource changes. In addition, on load-balanced deployments, the Last-Modified values generated by individual servers may differ.
- Performance: ETag is worse than Last-Modified/If-Modified-Since. Last-Modified only records a timestamp, whereas generating an ETag usually requires computing a hash of the content.
- Priority: When both are present, the server gives ETag priority.
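The accuracy difference can be demonstrated with two versions of a resource saved within the same second. This is a sketch under assumptions: the validators function is hypothetical, the timestamps are made-up sample values, and the ETag is derived from a content hash, which is one common choice rather than a requirement.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime


def validators(body: bytes, modified_at: datetime):
    """Return the (Last-Modified, ETag) pair a server might send for this version."""
    last_modified = format_datetime(modified_at.replace(microsecond=0), usegmt=True)
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    return last_modified, etag


# Two edits 0.8 seconds apart, both inside the same wall-clock second.
first = validators(b"v1", datetime(2024, 1, 1, 12, 0, 0, 100000, tzinfo=timezone.utc))
second = validators(b"v2", datetime(2024, 1, 1, 12, 0, 0, 900000, tzinfo=timezone.utc))

print(first[0] == second[0])  # True: Last-Modified cannot tell the versions apart
print(first[1] == second[1])  # False: the ETag reflects the content change
```

The hash computation in validators is also where the extra cost of ETag comes from: the timestamp is already available, while the hash has to read the whole body.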
The impact of user behavior on caching policies
Normal caching does not apply to every operation; certain user actions bypass it.
- Address bar access and link navigation are normal user behavior and trigger the browser's caching mechanism.
- F5 refresh: the browser sets max-age=0, skips the strong cache check, and performs the negotiated cache check.
- Ctrl+F5 refresh: the browser skips both the strong cache and the negotiated cache and pulls resources directly from the server.
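The three cases above can be summarized as the request headers a browser would send in each one. This is a rough model, not a description of any specific browser: plan_request and its action names are hypothetical, and sending Cache-Control: no-cache plus Pragma: no-cache on a forced reload is an assumption about typical behavior that may differ between browsers and versions.

```python
from typing import Dict, Optional


def plan_request(action: str, is_fresh: bool, etag: Optional[str]) -> Optional[Dict[str, str]]:
    """Return the request headers to send, or None if no request is needed."""
    if action == "hard_reload":  # Ctrl+F5: bypass both caches
        return {"Cache-Control": "no-cache", "Pragma": "no-cache"}
    headers: Dict[str, str] = {}
    if etag is not None:
        headers["If-None-Match"] = etag  # lets the server answer with 304
    if action == "reload":  # F5: skip the strong cache, still negotiate
        headers["Cache-Control"] = "max-age=0"
        return headers
    if action == "navigate":  # address bar or link
        return None if is_fresh else headers
    raise ValueError("unknown action: " + action)


print(plan_request("navigate", is_fresh=True, etag='"abc"'))     # None: served from the strong cache
print(plan_request("reload", is_fresh=True, etag='"abc"'))       # conditional request with max-age=0
print(plan_request("hard_reload", is_fresh=True, etag='"abc"'))  # unconditional request
```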