This is the seventh day of my participation in the August More Text Challenge.
How to make the first screen load faster?
Why is it so much faster to open the page the second time?
How to keep the data from being cleared after refreshing or closing the browser?
This is mainly because some data is cached the first time the page loads. Later loads read it directly from the cache without requesting the server, so they are faster and put less pressure on the server
Whether for interviews or for performance tuning, caching is an important and essential part of front-end work. This article is a detailed summary of the topic. If it helps you, please give it a like
Caching mainly involves two aspects
The network cache is divided into three parts: the DNS cache, the HTTP cache, and the CDN cache. Some people call the HTTP cache the browser cache, but it is the same idea
Then there is local caching: the browser's local storage and offline storage, which make the first screen load faster and make the page fly
DNS cache
When a page is opened, the browser first performs a DNS query to find the IP address of the server for the domain name, and then sends the request to that address
There are a lot of flowcharts on the web, and I’ve taken two from them
The client performs the domain name lookup as a recursive query, as shown in the figure
A hit at any step ends the lookup, and the client issues only one query request during the whole process
If no forwarder is configured on the DNS server, the DNS server sends the resolution request to the 13 root servers. This part is an iterative query, as shown in the figure
The 13 roots: there are 13 root server IP addresses in the world, not 13 physical servers. With anycast, mirrors of these IP addresses can be deployed around the world, so a request does not always reach the same host
Obviously, multiple query requests are made throughout the process
After the page is opened for the first time, the resolved address record is cached on the client, so later visits at least skip the subsequent iterative queries, which is faster
HTTP cache
The idea is to store page resources obtained through HTTP requests locally, so that later loads read them directly from the cache instead of requesting the server, which gives a faster response. Look at the picture first:
Strong cache
On the first request, the server tells the browser when the resource expires through the Expires and Cache-Control fields in the response headers. When the browser requests the resource again, it checks whether the resource has expired. If it has not, the browser uses it directly without sending a request to the server. This is called strong caching
Expires
Specifies the absolute time at which the resource expires and is added to the response header when the server responds.
expires: Wed, 22 Nov 2021 08:41:00 GMT
Note: the cache may not work as intended if the server and browser clocks are inconsistent. For example, if the current time is August 1 and Expires is August 2, and the user changes the computer's clock to August 3, the cache will not be used
Cache-Control
Specifies how long the resource stays valid, in seconds. The example below means the resource can be used for 300 seconds after this response is returned successfully. After that it expires
cache-control:max-age=300
Why do you need two fields to specify the cache expiration time?
Some browsers recognize Cache-Control and some do not. A browser that does not recognize Cache-Control falls back to Expires
The difference between Expires and Cache-Control
- Expires is an HTTP/1.0 field, whereas Cache-Control is an HTTP/1.1 field
- Expires exists for compatibility and is used where HTTP/1.1 is not supported
- Cache-Control has a higher priority than Expires if both are present
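The priority rule above can be sketched in a few lines. This is only an illustration of how a cache might decide on a strong-cache hit. The function name `isFresh`, the lower-case header object, and the millisecond timestamps are assumptions made for the example, not part of any browser API.

```javascript
// Hypothetical helper: is a stored response still fresh (a strong-cache hit)?
// `headers` uses lower-case names; `storedAt` and `now` are milliseconds since the epoch.
function isFresh(headers, storedAt, now) {
  const cacheControl = headers['cache-control'] || '';
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl);
  if (match) {
    // Cache-Control: max-age wins over Expires when both are present.
    const maxAgeMs = Number(match[1]) * 1000;
    return now - storedAt < maxAgeMs;
  }
  if (headers['expires']) {
    const expiresAt = Date.parse(headers['expires']);
    // An unparseable Expires value is treated as already expired.
    return !Number.isNaN(expiresAt) && now < expiresAt;
  }
  return false; // no explicit lifetime: not a strong-cache hit in this sketch
}

const storedAt = Date.UTC(2021, 7, 1); // 1 Aug 2021 00:00:00 GMT
const headers = {
  'cache-control': 'max-age=300',
  expires: new Date(Date.UTC(2021, 7, 2)).toUTCString(),
};
console.log(isFresh(headers, storedAt, storedAt + 299 * 1000)); // true
console.log(isFresh(headers, storedAt, storedAt + 301 * 1000)); // false, max-age overrides the later Expires
```

Passing `now` in as a parameter, rather than calling `Date.now()` inside, keeps the sketch easy to test and also makes the clock-skew problem of Expires easy to reproduce.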
Cache-Control request header
Common properties
Directive (values in seconds) | Description |
---|---|
max-age=300 | Reject cached resources older than 300 seconds. A value of 0 means always fetch the latest resource |
max-stale=100 | A cached resource may still be used for up to 100 seconds after it expires |
min-fresh=50 | Only accept a cached resource that will stay fresh for at least another 50 seconds. One that expires sooner is not fresh enough |
no-cache | The cache must be validated with the server (negotiated cache) before use |
no-store | Do not cache at all |
only-if-cached | Use only the cache. A 504 error is returned if there is no cached copy |
no-transform | Proxies must not convert or transform the resource, and must not modify HTTP headers such as Content-Encoding, Content-Range, and Content-Type. In practice it is rarely useful |
The numbers of seconds are configurable. They are hard-coded here to make the examples easier to follow
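To make the three numeric request directives concrete, here is a minimal sketch of the check a cache could run before serving a stored response without contacting the server. The function name `acceptableForRequest` and the shape of the `directives` object are assumptions for illustration only.

```javascript
// Hypothetical helper: may a cache answer this request from a stored response
// without contacting the server?
// ageSec: how long the response has been stored; lifetimeSec: its freshness lifetime.
function acceptableForRequest(ageSec, lifetimeSec, directives) {
  // no-store and no-cache both force a trip to the server.
  if (directives['no-store'] || directives['no-cache']) return false;
  // max-age: refuse anything older than the requested age.
  if ('max-age' in directives && ageSec > directives['max-age']) return false;
  const remainingSec = lifetimeSec - ageSec; // negative once the response is stale
  // min-fresh: the response must stay fresh for at least this long.
  if ('min-fresh' in directives) return remainingSec >= directives['min-fresh'];
  if (remainingSec >= 0) return true; // still fresh
  // max-stale: tolerate a response that expired at most this long ago.
  const allowedStaleness = directives['max-stale'] || 0;
  return -remainingSec <= allowedStaleness;
}

console.log(acceptableForRequest(200, 600, { 'max-age': 300 }));   // true
console.log(acceptableForRequest(400, 600, { 'max-age': 300 }));   // false
console.log(acceptableForRequest(650, 600, { 'max-stale': 100 })); // true, expired only 50 seconds ago
console.log(acceptableForRequest(570, 600, { 'min-fresh': 50 }));  // false, only 30 seconds of freshness left
```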
Cache-Control response header
Common properties
Directive (values in seconds) | Description |
---|---|
max-age=300 | The cache is valid for 300 seconds |
s-maxage=500 | Valid for 500 seconds. Takes priority over max-age and applies to shared caches (such as a CDN) |
public | Can be cached by any party, including proxy servers, CDNs, and so on |
private | Can only be cached by the user's browser (private cache) |
no-cache | Check with the server whether the resource has changed before using the cache |
no-store | Do not cache |
no-transform | Same as the request directive above |
must-revalidate | Once the client cache expires, it must be revalidated with the origin server |
proxy-revalidate | Once a proxy cache expires, it must be revalidated with the origin server |
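As a rough illustration of how these response directives are read, the sketch below parses a Cache-Control value and picks the lifetime that applies to a private or a shared cache. `parseCacheControl` and `freshnessLifetimeSeconds` are hypothetical helpers, and the parser is deliberately simplified: it does not handle quoted directive values.

```javascript
// Hypothetical, simplified parser: "public, max-age=300" -> { public: true, 'max-age': 300 }
function parseCacheControl(headerValue) {
  const directives = {};
  for (const part of headerValue.split(',')) {
    const [rawName, rawValue] = part.trim().split('=');
    if (!rawName) continue;
    // Directives without a value (public, no-cache, ...) are stored as true.
    directives[rawName.toLowerCase()] = rawValue === undefined ? true : Number(rawValue);
  }
  return directives;
}

// Lifetime in seconds that a cache should apply to the response.
function freshnessLifetimeSeconds(directives, isSharedCache) {
  if (directives['no-store']) return 0;
  if (isSharedCache && directives['private']) return 0; // shared caches must not store it
  // s-maxage takes priority over max-age, but only for shared caches such as a CDN.
  if (isSharedCache && 's-maxage' in directives) return directives['s-maxage'];
  return directives['max-age'] || 0;
}

const parsed = parseCacheControl('public, max-age=300, s-maxage=500');
console.log(freshnessLifetimeSeconds(parsed, false)); // 300 (browser)
console.log(freshnessLifetimeSeconds(parsed, true));  // 500 (CDN)
```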
Disadvantages of strong caching
After the cache expires, the resource is requested again whether or not it has changed
What we want is that if the resource file has not been updated, we keep using the old resource without downloading it again, even if it has expired
This is where the negotiated cache comes in: when the strong cache has expired, the negotiated cache process determines whether the file has been updated
Negotiated cache
When the resource is requested for the first time, the server returns the expiration time described above and also adds a Last-Modified field to the response header to tell the browser when the resource was last modified
last-modified: Fri, 27 Oct 2021 08:35:57 GMT
When the browser requests the resource again, it sends that time back to the server in another field, If-Modified-Since
if-modified-since: Fri, 27 Oct 2021 08:35:57 GMT
The server then compares the If-Modified-Since time with the resource's current last-modified time. If they are the same, the file has not been updated, so the server returns status code 304 with an empty response body and the browser keeps using the expired resource. If they differ, the resource has been updated, so the server returns status code 200 with the new resource, as shown in the figure
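The comparison described above might look like this on the server side. It is a sketch only: `respondWithLastModified` and the `{ body, lastModifiedMs }` resource shape are made up for the example and do not come from any framework.

```javascript
// Hypothetical server-side check; `resource` is { body, lastModifiedMs }.
function respondWithLastModified(requestHeaders, resource) {
  // HTTP dates have one-second precision, so drop the milliseconds before comparing.
  const lastModifiedSec = Math.floor(resource.lastModifiedMs / 1000);
  const lastModified = new Date(lastModifiedSec * 1000).toUTCString();
  const since = Date.parse(requestHeaders['if-modified-since'] || '');
  if (!Number.isNaN(since) && lastModifiedSec * 1000 <= since) {
    // Not modified: empty body, the browser keeps using its cached copy.
    return { status: 304, headers: { 'last-modified': lastModified }, body: '' };
  }
  return { status: 200, headers: { 'last-modified': lastModified }, body: resource.body };
}

const page = { body: '<html>v1</html>', lastModifiedMs: Date.UTC(2021, 9, 27, 8, 35, 57) };
const first = respondWithLastModified({}, page);
const second = respondWithLastModified(
  { 'if-modified-since': first.headers['last-modified'] },
  page
);
console.log(first.status, second.status); // 200 304
```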
Disadvantages of Last-Modified/If-Modified-Since
- If the cached file is opened locally, Last-Modified changes even though the file has not been modified, so the server cannot match the cache and sends the same resource again
- Because Last-Modified only has one-second precision, if a file is modified within an undetectable amount of time, the server still considers it a hit and cannot return the correct resource
- If the resource changes periodically, for example it is modified and then changed back to its original content within one cycle, we would consider the cache from before this cycle still usable, but Last-Modified thinks otherwise
Because of these shortcomings, there is an additional pair, ETag/If-None-Match, which compares the contents of the file
ETag/If-None-Match
When a resource is requested for the first time, the server returns an ETag field in addition to the Expires, Cache-Control, and Last-Modified headers. It is a unique identifier for the current resource file, generated by the server from the file's content, so it tracks changes to the file precisely. Whenever the content of the file changes, the ETag is generated again
etag: W/"132489-1627839023000"
When the browser requests it again, it sends the file id to the server with another field, if-none-match
if-none-match: W/"132489-1627839023000"
If the server finds that the two identifiers are the same, the file has not been updated, so it returns status code 304 with an empty response body and the browser keeps using the expired resource. If they differ, the resource has been updated, so the server returns status code 200 with the new resource
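A minimal sketch of the ETag round trip follows. The hash used here is a small FNV-1a style function chosen only so the example is self-contained. It is an assumption for illustration, and real servers typically derive the ETag differently (for example from a stronger content hash, or from size and modification time). `computeEtag` and `respondWithEtag` are hypothetical names.

```javascript
// Illustrative content fingerprint (FNV-1a style over UTF-16 code units).
function computeEtag(body) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < body.length; i++) {
    hash ^= body.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // FNV prime, kept as unsigned 32-bit
  }
  return `"${body.length}-${hash.toString(16)}"`;
}

// Hypothetical server-side check against If-None-Match.
function respondWithEtag(requestHeaders, body) {
  const etag = computeEtag(body);
  if (requestHeaders['if-none-match'] === etag) {
    return { status: 304, headers: { etag }, body: '' }; // content unchanged
  }
  return { status: 200, headers: { etag }, body };
}

const firstVisit = respondWithEtag({}, 'body v1');
const revisit = respondWithEtag({ 'if-none-match': firstVisit.headers.etag }, 'body v1');
const afterUpdate = respondWithEtag({ 'if-none-match': firstVisit.headers.etag }, 'body v2');
console.log(firstVisit.status, revisit.status, afterUpdate.status); // 200 304 200
```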
The difference between Last-Modified and ETag
- ETag detects file changes more accurately than Last-Modified
- When both are used, the server gives priority to ETag/If-None-Match when validating
- In terms of performance, Last-Modified is better than ETag, because generating the ETag costs the server extra work and affects server performance. ETag therefore cannot completely replace Last-Modified and can only serve as a supplement and reinforcement
The difference between strong cache and negotiated cache
- The strong cache is checked first. If it misses, the negotiated cache is checked
- The strong cache does not send a request to the server, so the browser sometimes does not know that a resource has been updated. The negotiated cache does send a request, so the server can tell whether the resource has been updated
- The caching scheme most projects currently use:
- Negotiated cache is generally used for: HTML
- Strong cache is generally used for: css, image, js, with a hash in the file name
Heuristic cache
If the response has no Expires, no Cache-Control: max-age, and no Cache-Control: s-maxage, and it contains no other cache restrictions, the cache can use a heuristic to calculate the expiration
The cache time is typically 10% of the difference between the Date field in the response header (the time the message was created) and the Last-Modified value
max(0, Date - Last-Modified) * 10%
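The 10% rule can be worked through in code. This is a sketch of the commonly described heuristic, not the exact algorithm of any particular browser, and `heuristicLifetimeSeconds` is a hypothetical name.

```javascript
// Heuristic freshness lifetime: 10% of the time since the resource last changed.
function heuristicLifetimeSeconds(dateHeader, lastModifiedHeader) {
  const dateMs = Date.parse(dateHeader);
  const lastModifiedMs = Date.parse(lastModifiedHeader);
  if (Number.isNaN(dateMs) || Number.isNaN(lastModifiedMs)) return 0;
  // max(0, Date - Last-Modified) * 10%, converted from milliseconds to seconds
  return Math.max(0, dateMs - lastModifiedMs) / 1000 / 10;
}

// A resource last modified 10 days before the response was generated
// is treated as fresh for 1 day (86400 seconds).
const lastModified = new Date(Date.UTC(2021, 9, 17, 8, 35, 57)).toUTCString();
const responseDate = new Date(Date.UTC(2021, 9, 27, 8, 35, 57)).toUTCString();
console.log(heuristicLifetimeSeconds(responseDate, lastModified)); // 86400
```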
Cache actual usage policies
For frequently changing resources:
Use Cache-Control: no-cache. The browser then sends a request every time and uses ETag or Last-Modified to validate the resource. This does not reduce the number of requests, but it significantly reduces the size of the responses
For infrequently changing resources:
Set their Cache-Control to a very large max-age=31536000 (one year), so that the browser hits the strong cache when it later requests the same URL. To handle updates, add a hash, version number, or other changing string to the file name (or path) and change it whenever the file changes. Changing the referenced URL abandons the previous strong cache (it is not invalidated immediately, it is simply no longer used)
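The two policies above can be combined into one small routing function. The sketch assumes that hashed assets look like `app.3f9a1c2b.js` (at least 8 hex characters before the extension). That naming convention, the extension list, and the function name `cacheControlFor` are assumptions for the example and depend on your build tool.

```javascript
// Hypothetical policy: long-lived strong cache for hashed assets, revalidation for everything else.
function cacheControlFor(path) {
  // Assumption: a content hash of 8+ hex characters sits right before the extension.
  const isHashedAsset = /\.[0-9a-f]{8,}\.(?:js|css|png|jpg|svg|woff2)$/i.test(path);
  if (isHashedAsset) {
    // The URL changes whenever the content changes, so one year is safe.
    return 'public, max-age=31536000';
  }
  // HTML and unhashed files: always revalidate with ETag or Last-Modified.
  return 'no-cache';
}

console.log(cacheControlFor('/index.html'));             // no-cache
console.log(cacheControlFor('/static/app.3f9a1c2b.js')); // public, max-age=31536000
console.log(cacheControlFor('/static/app.js'));          // no-cache
```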
Cache location, and read priority
The priorities are in the following order
1. Service Worker
See another of my articles for more details
2. Memory Cache
The resource is stored in memory and the next access reads it directly from memory. For example, when a page is refreshed, much of the data comes from the memory cache. It generally holds scripts, fonts, and images.
The advantage is fast reads. The disadvantage is that the cache in memory is released as soon as the tab is closed, so capacity is limited and storage time is short
3. Disk Cache (hard disk)
The resource is stored on the hard disk and the next access reads it directly from the hard disk. Based on the fields in the HTTP headers, it determines which resources need to be cached, which can be used without a request, and which have expired and must be requested again. Even across different sites, a resource at the same address is not requested again once it has been cached on disk.
The advantage is large capacity and longer storage time, since it lives on the hard disk. The disadvantage is slower reads
4. Push Cache
This is the push cache, part of HTTP/2, and it is used only when none of the three caches above is hit. It exists only within the session and is released once the session ends, so its cache time is short, and an entry in the Push Cache can be used only once
CDN cache
When we send a request and the browser's local cache has expired, the CDN works out the shortest and fastest path for getting the content.
For example, from Guangzhou a request to a Guangzhou server gets a response much faster than a request to a Xinjiang server, so the data is requested from the nearest CDN node
The CDN node checks whether its cached data has expired. If not, it returns the cached data to the client directly, which speeds up the response. If the cache has expired, the node sends a back-to-origin request to the origin server, pulls the latest data, updates its local cache, and returns the latest data to the client.
The CDN not only solves the problem of access across carriers and regions and greatly reduces access latency, it also offloads traffic and reduces the load on the origin server
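The hit-or-back-to-origin decision of a single CDN node can be sketched as below. It is a toy model under simplifying assumptions: synchronous calls, one fixed TTL for every URL, and an injected clock. `createEdgeNode` and its parameters are hypothetical and do not correspond to any CDN product's API.

```javascript
// Toy model of one CDN edge node with a TTL cache.
// fetchFromOrigin(url) stands in for the back-to-origin request.
function createEdgeNode(fetchFromOrigin, ttlMs, now = Date.now) {
  const store = new Map(); // url -> { body, expiresAt }
  return function handle(url) {
    const entry = store.get(url);
    if (entry && now() < entry.expiresAt) {
      return { body: entry.body, source: 'edge-cache' }; // not expired: answer directly
    }
    const body = fetchFromOrigin(url); // expired or missing: go back to the origin
    store.set(url, { body, expiresAt: now() + ttlMs });
    return { body, source: 'origin' };
  };
}

let clock = 0;      // fake clock so the example is deterministic
let originHits = 0; // how many requests reached the origin server
const edge = createEdgeNode(url => { originHits++; return `content of ${url}`; }, 1000, () => clock);

console.log(edge('/logo.png').source); // origin
console.log(edge('/logo.png').source); // edge-cache
clock = 1500;                          // the cached copy has now expired
console.log(edge('/logo.png').source); // origin
console.log(originHits);               // 2
```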
The CDN you can’t help knowing
Several differences between refreshing and pressing Enter
- Using Ctrl+F5 to force-refresh the page: the local cache file is treated as expired, the strong cache and negotiated cache are both skipped, and the server is requested directly
- Clicking Refresh or pressing F5 to refresh the page: the local cache file is treated as expired, and the request carries If-Modified-Since and If-None-Match to validate freshness through the negotiated cache
- Typing a URL in the browser and pressing Enter: the browser looks in the Disk Cache. If the resource is found, it is used. If not, a network request is sent
Local storage
Cookie
The earliest local storage method to be proposed. Cookies are carried in every HTTP request so the server can tell whether multiple requests come from the same user. The characteristics are as follows:
- There are security problems: if a Cookie is intercepted, the attacker can obtain all the Session information and then simply replay the Cookie to impersonate the user. (See my other post about attacks and defenses to understand browser security: same-origin restriction, XSS, CSRF, man-in-the-middle attacks)
- Each domain can have at most 20 cookies, and each cookie cannot exceed 4 KB
- Cookies are sent whenever a new page is requested
- A Cookie's name cannot be modified after it has been created
- Cookies cannot be shared across domain names
There are two ways to share cookies across domain names
- Use the Nginx reverse proxy
- After logging in to one site, write cookies to other sites. The Session on the server is stored on a node, and the Cookie stores the Session ID
Cookie usage scenarios
- The most common use of Cookie and Session is to store the SessionId in a Cookie. Each request will carry the SessionId so that the server knows who made the request
- Can be used to count the number of clicks on the page
What are the fields in a Cookie
- Name, Size: as the names imply
- Value: saves the user login status. This value should be encrypted and must not be stored in plaintext
- Path: the path under which this Cookie can be accessed. For example, for juejin.cn/editor the Path is /editor, and only pages under /editor can read the Cookie
- HttpOnly: the Cookie cannot be accessed through JS, which reduces XSS attacks
- Secure: the Cookie is only carried in HTTPS requests
- SameSite: specifies that browsers do not carry the Cookie in cross-site requests, which reduces CSRF attacks
- Domain: the domain name. It acts as a cross-domain or Cookie whitelist that allows a subdomain to read or operate on the parent domain's cookies, which is very useful for implementing single sign-on
- Expires/Max-Age: the absolute expiry time or the number of seconds until expiry. If not set, the Cookie behaves like a Session and expires when the browser is closed
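To show how these fields fit together, the sketch below builds the value of a Set-Cookie response header from an options object. `serializeCookie` and its option names are hypothetical and written for this example. Only the attribute names in the output (Max-Age, Expires, Domain, Path, Secure, HttpOnly, SameSite) are the standard ones.

```javascript
// Hypothetical helper that assembles a Set-Cookie header value.
function serializeCookie(name, value, options = {}) {
  const parts = [`${name}=${encodeURIComponent(value)}`];
  if (options.maxAge !== undefined) parts.push(`Max-Age=${options.maxAge}`); // seconds until expiry
  if (options.expires) parts.push(`Expires=${options.expires.toUTCString()}`); // absolute expiry time
  if (options.domain) parts.push(`Domain=${options.domain}`);
  if (options.path) parts.push(`Path=${options.path}`);
  if (options.secure) parts.push('Secure');     // HTTPS only
  if (options.httpOnly) parts.push('HttpOnly'); // hidden from document.cookie
  if (options.sameSite) parts.push(`SameSite=${options.sameSite}`);
  return parts.join('; ');
}

console.log(serializeCookie('sessionId', 'abc 123', {
  maxAge: 3600,
  path: '/editor',
  secure: true,
  httpOnly: true,
  sameSite: 'Lax',
}));
// sessionId=abc%20123; Max-Age=3600; Path=/editor; Secure; HttpOnly; SameSite=Lax
```

Leaving out both `maxAge` and `expires` produces a session cookie, which matches the "expires when the browser is closed" behavior described above.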
LocalStorage
A new feature of HTML5 that stores information locally. Its capacity is much larger than a Cookie's, about 5 MB, and the storage is permanent: data stays until it is actively cleared
It is restricted by the same-origin policy, that is, pages with a different port, protocol, or host cannot access it, and when the browser is set to private mode LocalStorage cannot be read
It can be used in many scenarios, such as storing website themes or user information. It suits data that is fairly large or that rarely changes
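LocalStorage only stores strings, so values such as a theme object are usually serialized as JSON. Below is a small wrapper sketch. `createJsonStore` is a hypothetical helper, and `createMemoryStorage` is a stand-in with the same `getItem`/`setItem`/`removeItem` methods so the example also runs outside a browser. In a page you would pass `window.localStorage` instead.

```javascript
// Hypothetical wrapper: JSON in, JSON out, with a fallback for missing or corrupt entries.
function createJsonStore(storage) {
  return {
    set(key, value) {
      storage.setItem(key, JSON.stringify(value));
    },
    get(key, fallback = null) {
      const raw = storage.getItem(key); // null when the key does not exist
      if (raw === null) return fallback;
      try {
        return JSON.parse(raw);
      } catch {
        return fallback; // the stored value was not valid JSON
      }
    },
    remove(key) {
      storage.removeItem(key);
    },
  };
}

// Stand-in for window.localStorage when running outside a browser.
function createMemoryStorage() {
  const map = new Map();
  return {
    getItem: key => (map.has(key) ? map.get(key) : null),
    setItem: (key, value) => { map.set(key, String(value)); },
    removeItem: key => { map.delete(key); },
  };
}

const store = createJsonStore(createMemoryStorage());
store.set('theme', { mode: 'dark', fontSize: 14 });
console.log(store.get('theme').mode);      // dark
console.log(store.get('missing', 'none')); // none
```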
SessionStorage
SessionStorage is also a new feature of HTML5. It is mainly used to temporarily save data for the same window or tab. The data is not deleted when the page is refreshed, but it is deleted after the window or tab is closed
SessionStorage and LocalStorage are both local storage, neither can be read by crawlers, and both are subject to the same-origin policy, but SessionStorage is stricter: data is shared only within the same window of the same browser
Its API is the same as LocalStorage's: getItem, setItem, removeItem, clear, key
It is generally used for time-sensitive data, such as a site's guest login information or temporary browsing records
IndexedDB
A local database in the browser, with the following features
- Key-value storage: data is kept in internal object stores. All types of data can be stored directly, including JS objects, as key-value pairs. Each record has a corresponding primary key, and the primary key is unique
- Asynchronous: while IndexedDB is working, the user can still perform other operations. The asynchronous design prevents large reads and writes from slowing down the page
- Transactions: for example, if an error occurs partway through modifying an entire table, all the data is rolled back to its state before the modification
- Same-origin restriction: each database belongs to the domain that created it. Web pages can only access databases under their own domain
- Large storage space: generally no less than 250 MB, and possibly with no upper limit
- Binary storage is supported, such as ArrayBuffer objects and Blob objects
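The features above (object stores, primary keys, asynchronous requests, transactions) show up directly in the API. The sketch below wraps the callback-style IndexedDB requests in Promises. The database name `demo-db`, the store name `users`, the key path `id`, and the three function names are assumptions for illustration. The code is meant to run in a browser, where `indexedDB` is available.

```javascript
// Hypothetical names: database "demo-db", object store "users", primary key "id".
function openDatabase(name = 'demo-db', version = 1) {
  return new Promise((resolve, reject) => {
    const request = indexedDB.open(name, version);
    // Runs when the database is first created or its version number increases.
    request.onupgradeneeded = () => {
      const db = request.result;
      if (!db.objectStoreNames.contains('users')) {
        db.createObjectStore('users', { keyPath: 'id' }); // unique primary key
      }
    };
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

function saveUser(db, user) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction('users', 'readwrite');
    tx.objectStore('users').put(user); // JS objects are stored directly
    tx.oncomplete = () => resolve();
    // If the transaction fails or is aborted, none of its writes are kept.
    tx.onerror = () => reject(tx.error);
    tx.onabort = () => reject(tx.error);
  });
}

function loadUser(db, id) {
  return new Promise((resolve, reject) => {
    const request = db.transaction('users', 'readonly').objectStore('users').get(id);
    request.onsuccess = () => resolve(request.result); // undefined when the key is missing
    request.onerror = () => reject(request.error);
  });
}

// Usage in a browser:
// openDatabase()
//   .then(db => saveUser(db, { id: 1, name: 'Ada' }).then(() => loadUser(db, 1)))
//   .then(user => console.log(user.name));
```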
In addition to the four front-end storage methods above, there is WebSQL. Similar to SQLite, it is a true relational database that can be operated with SQL, but the results have to be converted with JS, which is more trouble
The differences between the four above
Feature | Cookie | SessionStorage | LocalStorage | IndexedDB |
---|---|---|---|---|
Storage size | 4 KB | 5 MB or more | 5 MB or more | No fixed upper limit |
Storage time | An expiry time can be set. If not set, expires when the window is closed | Cleared when the browser window is closed | Permanent | Permanent |
Scope | Same browser, all same-origin tabs | Current tab | Same browser, all same-origin tabs | |
Where it lives | Travels back and forth with requests | Local client | Local client | Local client |
Same-origin policy | Same browser, shared only by same-origin pages with the same path | Used only by the page itself | Same browser, shared only by same-origin pages | |
Offline storage
Service Worker
A Service Worker is an independent thread that runs in the browser background, outside the main JS thread, and therefore cannot access the DOM. It acts like a proxy server: it can intercept the user's requests, modify them, or respond to the user directly without contacting the server, for example when loading JS and images, which lets us use web applications offline
It is used for offline caching (to speed up first-screen loading), message push, network proxying, and similar features. A Service Worker must be served over HTTPS, because it intercepts requests and HTTPS is needed to keep that secure
Caching using a Service Worker involves three steps:
- Register the Service Worker
- Listen for the install event and cache the files you need
- On the next visit, intercept requests and return the cached data directly
// index.js: register the Service Worker
if (navigator.serviceWorker) {
  navigator.serviceWorker.register('sw.js')
    .then(registration => {
      console.log('Service worker registered successfully')
    })
    .catch(err => {
      console.log('Service worker registration failed', err)
    })
}

// sw.js: listen for the 'install' event and cache the required files in the callback
self.addEventListener('install', e => {
  // Open the cache with the specified name
  e.waitUntil(caches.open('my-cache').then(cache => {
    // Add files to the cache
    return cache.addAll(['./index.html', './index.css'])
  }))
})

// sw.js: intercept every request in the 'fetch' event and answer from the cache when possible
self.addEventListener('fetch', e => {
  // Look for a cached response that matches the request
  e.respondWith(caches.match(e.request).then(response => {
    if (response) {
      return response
    }
    // Cache miss: fall back to the network
    console.log('fetch source')
    return fetch(e.request)
  }))
})
Conclusion
Likes and support are appreciated: the hand that gives roses keeps their fragrance, and I share in the honor
Thanks for reading this far. Keep it up!