Browser caching is an important optimization tool in everyday development. It saves bandwidth, speeds up loading and rendering, reduces network congestion, and improves the user experience.
1. DNS cache
A web address contains a domain name (and optionally a port). The domain name must be mapped to an IP address before a connection can be established for communication. The process of finding the IP address for a domain name is DNS resolution.
www.dnscache.com (domain name) - DNS resolution -> 11.222.33.444 (IP address)
This lookup adds latency to network requests, so the browser caches the IP address the first time it obtains it. The next time a request is made to the same domain name, the browser looks in its local cache first. If the cached entry is still valid, the browser uses that IP address directly. Otherwise, it continues resolving the domain name through the following steps:
- First, the browser searches its own DNS cache. If an entry exists, domain name resolution is complete.
- If no corresponding entry is found in the browser's own cache, it tries to read the operating system's hosts file and checks whether a mapping exists. If so, domain name resolution is complete.
- If no mapping exists in the local hosts file, it queries the local DNS server (the ISP's server, or a DNS server you configured manually). If that server has the record, the domain name is resolved.
- If the local DNS server does not have the record either, it queries the root server and continues the lookup from there on the client's behalf, which is what makes the client's request a recursive query.
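The lookup order above can be sketched as a chain of fallbacks. This is a minimal illustration, not how any particular browser implements it: `TtlCache`, `resolve`, the 60-second TTL, and the `query_local_dns` callable are all hypothetical names and values chosen for the example.

```python
import time


class TtlCache:
    """A tiny name -> (ip, expiry) store standing in for the browser's DNS cache."""

    def __init__(self):
        self._entries = {}

    def get(self, name, now):
        entry = self._entries.get(name)
        if entry is None:
            return None
        ip, expires_at = entry
        if now >= expires_at:
            # The entry is stale: drop it and report a miss.
            del self._entries[name]
            return None
        return ip

    def put(self, name, ip, ttl, now):
        self._entries[name] = (ip, now + ttl)


def resolve(name, browser_cache, hosts_file, query_local_dns, now=None, ttl=60.0):
    """Return (ip, source), trying each layer in the order described above.

    hosts_file is a plain dict standing in for the OS hosts file, and
    query_local_dns is a callable standing in for the local DNS server
    (which would itself fall back to the root servers). Both are assumptions.
    """
    now = time.time() if now is None else now

    ip = browser_cache.get(name, now)
    if ip is not None:
        return ip, "browser cache"

    if name in hosts_file:
        return hosts_file[name], "hosts file"

    ip = query_local_dns(name)
    browser_cache.put(name, ip, ttl, now)  # remember the answer for next time
    return ip, "local DNS server"
```

Passing `now` explicitly keeps the sketch deterministic: a second call within the TTL is answered from the browser cache, and a call after the TTL goes back to the DNS server.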
2. Memory cache
This is the browser's own optimization to speed up cache reads. It is not controlled by the developer, nor is it constrained by HTTP headers. Once a resource is stored in memory, the next request for it does not go over the network but is served directly from memory. When the page is closed, the resource is released from memory, so when the same page is opened again later, the request is no longer shown as "from memory cache".
When are resources put into the memory cache?
The answer is: almost all network requests are automatically added to the memory cache by the browser according to its own policies. But the memory cache is destined to be "short-term storage", both because cached resources can be large and because the browser cannot take up unlimited memory. When the amount of data grows too large, entries are dropped even if the page is not closed.
The memory cache ensures that two identical requests on a page (e.g. two images with the same src, or two links with the same href) result in at most one network request, which avoids waste.
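To make the two properties above concrete (at most one fetch per URL, and eviction when memory runs out), here is a rough sketch. Browsers do not document their memory cache policies, so the least-recently-used eviction and the names `MemoryCache`, `load`, and `fetch` are assumptions made purely for illustration.

```python
from collections import OrderedDict


class MemoryCache:
    """A size-bounded in-memory cache; LRU eviction is an assumed policy."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._items = OrderedDict()  # url -> body, least recently used first

    def get(self, url):
        body = self._items.get(url)
        if body is not None:
            self._items.move_to_end(url)  # mark as recently used
        return body

    def put(self, url, body):
        if len(body) > self.max_bytes:
            return  # too large to keep in memory at all
        if url in self._items:
            self.used_bytes -= len(self._items.pop(url))
        self._items[url] = body
        self.used_bytes += len(body)
        # Evict the oldest entries until the cache fits its budget again.
        while self.used_bytes > self.max_bytes:
            _, evicted = self._items.popitem(last=False)
            self.used_bytes -= len(evicted)


def load(url, cache, fetch):
    """Return the body for url, calling fetch (the network) only on a miss."""
    body = cache.get(url)
    if body is None:
        body = fetch(url)
        cache.put(url, body)
    return body
```

Calling `load` twice with the same URL triggers `fetch` once, while loading more data than `max_bytes` silently pushes older entries out, which mirrors the "cache can fail even if the page is not closed" behavior.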
3. HTTP cache (disk cache)
The disk cache, also known as the HTTP cache, is the most important part of the browser cache. As we have seen, the DNS cache only handles IP address lookups and works autonomously, while the memory cache is outside our control and is something of a black box. The disk cache is the part we can actually control, so most caching optimization targets it. HTTP caching is divided into strong caching and negotiated caching.
3.1 Strong cache
The fields that control it are Expires and Cache-Control, where Cache-Control takes precedence over Expires.
When a client makes a request to the server and the server wants the client to cache the resource, it adds the following to the response headers:
Cache-Control: max-age=3600
(Please cache this resource for 3600 seconds, i.e. 1 hour.)
Expires: Thu, 10 Nov 2020 08:45:11 GMT
(The cache expires at the specified time.)
Date: Thu, 30 Apr 2020 12:39:56 GMT
(The time at which the server sent this response.)
ETag: W/"121-171ca289ebf"
(This resource is numbered W/"121-171ca289ebf".)
Last-Modified: Thu, 30 Apr 2020 08:16:31 GMT
(The last time this resource was modified.)
Cache-Control and Expires were introduced in HTTP/1.1 and HTTP/1.0, respectively. To stay compatible with both HTTP/1.0 and HTTP/1.1, we set both fields in real projects.
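As a small sketch of setting both fields consistently, a server could derive Expires from the same lifetime it puts in max-age. The helper name `caching_headers` is hypothetical, and how the headers are attached to a response depends on the server framework, which is left out here.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def caching_headers(max_age_seconds, now):
    """Build Date, Cache-Control and Expires so that both cache fields agree.

    now must be a timezone-aware UTC datetime, because HTTP dates are
    always expressed in GMT.
    """
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        "Date": format_datetime(now, usegmt=True),
        "Cache-Control": f"max-age={max_age_seconds}",   # HTTP/1.1 clients
        "Expires": format_datetime(expires, usegmt=True),  # HTTP/1.0 fallback
    }


headers = caching_headers(3600, datetime(2020, 4, 30, 12, 39, 56, tzinfo=timezone.utc))
for name, value in headers.items():
    print(f"{name}: {value}")
```

Deriving both values from one lifetime avoids the two fields drifting apart, which matters because a client that understands Cache-Control ignores Expires while an older one relies on it.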
After the browser receives the response, it does the following:
- The browser caches the response body to a local file.
- The browser marks the request method and request path for this request.
- The browser marks the lifetime of this cache entry as 3600 seconds.
- The browser logs the server response time, which is 2020-04-30 12:39:56 GMT.
This record is important because it provides a basis for future browser requests to the server.
Later, when the client is about to request the same address again, it first asks itself: is what I need already in the cache?
At this point, the client searches the cache to see whether it holds the resource.
To determine whether the cached copy is still valid, it computes the expiration time as Date + max-age and checks whether that time is later than the current time. If so, the cache has not expired and is still valid; if not, the cache has expired (cache invalidation).
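The freshness check can be sketched in a few lines. This is a simplified illustration that assumes the stored response headers are available as a plain dict; the function name `is_fresh` is hypothetical, and real browsers apply further rules (such as the Age header and heuristic freshness) that are omitted here.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh(stored_headers, now):
    """Return True if the cached response is still within its lifetime.

    max-age (counted from the Date header) is preferred; Expires is only
    used when max-age is absent, matching the precedence described above.
    """
    cache_control = stored_headers.get("Cache-Control", "")
    match = re.search(r"max-age=(\d+)", cache_control, re.IGNORECASE)

    if match and "Date" in stored_headers:
        response_time = parsedate_to_datetime(stored_headers["Date"])
        expires_at = response_time + timedelta(seconds=int(match.group(1)))
    elif "Expires" in stored_headers:
        expires_at = parsedate_to_datetime(stored_headers["Expires"])
    else:
        return False  # no explicit lifetime: treat as not fresh in this sketch

    return expires_at > now


stored = {"Cache-Control": "max-age=3600", "Date": "Thu, 30 Apr 2020 12:39:56 GMT"}
print(is_fresh(stored, datetime(2020, 4, 30, 13, 0, 0, tzinfo=timezone.utc)))
```

With the example headers, the entry expires at 13:39:56 GMT, so a check at 13:00:00 finds it fresh and a check at or after 13:39:56 does not.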
3.2 Negotiated Cache
Once the browser finds that the cache is invalid, it doesn't simply delete the cached copy. Instead, holding on to a glimmer of hope, it asks the server: has my cached copy changed since it was last modified? Can I continue to use it?
The browser then issues a cached request (that is, a conditional request) to the server.
The cached request includes the following headers:
If-Modified-Since: Thu, 30 Apr 2020 08:16:31 GMT
(You once told me that this resource was last modified at 2020-04-30 08:16:31 GMT. Has it changed since then?)
If-None-Match: W/"121-171ca289ebf"
(You once told me that the number of this resource is W/"121-171ca289ebf". Has that number changed?)
Both headers are sent for compatibility with different servers: some servers only recognize If-Modified-Since, some only recognize If-None-Match, and some recognize both. In general, If-None-Match takes precedence over If-Modified-Since.
There are two possible outcomes:
Cache invalidation: the server simply gives a normal response again (status code 200 with a response body) with new cache instructions attached, and the browser caches the new content.
Cache still valid: the server returns 304 Not Modified without a response body. The response headers carry new cache instructions, and the browser updates its cached entry accordingly and keeps using the cached body.
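On the server side, the two outcomes come down to comparing the validators in the request with the resource's current ones. The following is a simplified sketch rather than any real server's implementation: `respond` and `strip_weak` are hypothetical helpers, the `max-age=3600` policy is assumed, and ETags are compared weakly (ignoring the `W/` prefix).

```python
from email.utils import parsedate_to_datetime


def strip_weak(tag):
    """Drop the W/ weak-validator prefix so tags compare by their opaque value."""
    return tag[2:] if tag.startswith("W/") else tag


def respond(request_headers, etag, last_modified, body):
    """Return (status, response_headers, body) for a possibly conditional GET."""
    response_headers = {
        "ETag": etag,
        "Last-Modified": last_modified,
        "Cache-Control": "max-age=3600",  # assumed policy, for illustration only
    }
    if_none_match = request_headers.get("If-None-Match")
    if_modified_since = request_headers.get("If-Modified-Since")

    if if_none_match is not None:
        # If-None-Match takes precedence; If-Modified-Since is then ignored.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        unchanged = "*" in candidates or any(
            strip_weak(tag) == strip_weak(etag) for tag in candidates
        )
    elif if_modified_since is not None:
        unchanged = parsedate_to_datetime(last_modified) <= parsedate_to_datetime(
            if_modified_since
        )
    else:
        unchanged = False

    if unchanged:
        return 304, response_headers, b""  # no body: the client reuses its copy
    return 200, response_headers, body
```

Note that the 304 branch still returns fresh cache headers, which is what lets the browser extend the lifetime of the copy it already has.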
Cache-Control
Cache-Control can also be set to one or more of the following values:
public: indicates that the resource is public, for example a page that looks the same to everyone. This value means little to the browser itself, but it matters in some scenarios, such as telling shared caches (proxies, CDNs) that they may store the response. This follows a common principle in the HTTP protocol: the client or server tells the other side the details, and it is up to the other side to decide whether to use them.
private: indicates that the resource is private, for example a page that each user sees differently. This value also means little to the browser itself, but it tells shared caches not to store the response. The same "I tell you, you decide" principle applies.
no-cache: tells the client that it may cache this resource but must not use the cached copy directly. After caching, every subsequent request must carry the cache validators so that the server can say whether the cached copy is still usable.
no-store: tells the client not to cache this resource at all. Every subsequent request proceeds as a normal request.
max-age: the cache lifetime in seconds, as described above.
For example, Cache-Control: public, max-age=3600 indicates that this is a public resource and asks for it to be cached for 1 hour.
Cache-Control: no-cache appears not only in response headers. In HTTP/1.1, Cache-Control: no-cache in a request header says to the server: don't rely on any cached copy, give me a normal result. This has the same function as the HTTP/1.0 header Pragma: no-cache.
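To show how a client might read these values, here is a minimal parser sketch. The function names are hypothetical, and the parser is deliberately simplified: it splits on commas, so it does not handle quoted arguments that themselves contain commas.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict.

    Directives without an argument (public, no-store, ...) map to True.
    Directive names are lowercased because they are case-insensitive.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives


def can_store(directives):
    """no-store forbids keeping any copy of the response."""
    return "no-store" not in directives


def needs_revalidation(directives):
    """no-cache allows storing, but every reuse must first be checked with the server."""
    return "no-cache" in directives


parsed = parse_cache_control("public, max-age=3600")
print(parsed, can_store(parsed), needs_revalidation(parsed))
```

The two helper predicates capture the difference that is most often confused: no-cache still permits storing the response, while no-store does not.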
Expires
In HTTP/1.0, the expiration time is specified via the Expires response header, for example:
Expires: Thu, 30 Apr 2020 23:38:38 GMT
In HTTP/1.1, Cache-Control: max-age is used for this instead.
4. To summarize
When the browser accesses a previously requested resource again, it does this:
- If the strong cache is hit, the cached copy is used.
- If the strong cache is not hit, a request is sent to the server to check whether the negotiated cache is hit.
- If the negotiated cache is hit, the server returns 304, telling the browser to use the local cache.
- Otherwise, the server returns the latest resource.
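The four steps can be tied together in one small client-side sketch. Everything here is a simplification for illustration: time is a plain number of seconds, the cache is a dict, and `send_request` is a hypothetical callable returning `(status, meta, body)` where `meta` carries an `etag` and a `max_age`.

```python
def load_resource(url, cache, now, send_request):
    """Return (body, source) following the strong cache / negotiated cache flow."""
    entry = cache.get(url)

    # Step 1: strong cache hit, no network traffic at all.
    if entry is not None and now < entry["expires_at"]:
        return entry["body"], "strong cache"

    # Step 2: ask the server, attaching the validator if we have a stale copy.
    conditional = {"If-None-Match": entry["etag"]} if entry is not None else {}
    status, meta, body = send_request(url, conditional)

    # Step 3: negotiated cache hit, reuse the stored body with a new lifetime.
    if status == 304:
        entry["expires_at"] = now + meta["max_age"]
        return entry["body"], "negotiated cache (304)"

    # Step 4: a full response replaces whatever was cached before.
    cache[url] = {
        "body": body,
        "etag": meta["etag"],
        "expires_at": now + meta["max_age"],
    }
    return body, "network (200)"
```

One consequence visible in this sketch is that a resource changed on the server is not noticed until the strong cache lifetime runs out, which is why long max-age values are usually paired with versioned URLs.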