I have collected some notes on web caching strategies. If you spot any errors, please point them out.
Browser side cache rules
Caching behavior is primarily determined by caching policy, which is set by the content owner. These policies are articulated primarily through specific HTTP headers.
When a user initiates a static resource request, the browser obtains the resource through the following steps:
- Local cache stage: the browser looks for the resource in its local cache. If the resource is found and has not expired, it is used directly and no HTTP request is sent to the server.
- Negotiation (revalidation) stage: if the resource is found in the local cache but has expired, or its freshness is unknown, the browser sends an HTTP request to the server. The server checks the request, and if the resource has not changed on the server it returns 304 and the browser uses the locally cached copy.
- Cache miss stage: if the server finds that the requested resource has been modified, or the request is new (the resource was not cached before), it returns the resource data with status 200 if the resource exists, or 404 if the resource is not on the server.
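The three stages above can be sketched as a small decision function. This is only an illustration, not how any particular browser is implemented; the function names and the cache entry shape (`expiresAt`, `etag`, `lastModified`) are assumptions made for the example.

```javascript
// Hypothetical cache entry: { expiresAt: <ms since epoch>, etag?: string, lastModified?: string }
function decideCacheAction(entry, now) {
  if (!entry) {
    // Nothing cached: send a normal request and expect 200 (or 404).
    return { action: 'fetch', headers: {} };
  }
  if (typeof entry.expiresAt === 'number' && now < entry.expiresAt) {
    // Local cache stage: the copy is still fresh, no HTTP request is sent.
    return { action: 'use-local', headers: {} };
  }
  // Negotiation stage: the copy is stale (or its freshness is unknown),
  // so ask the server whether it is still valid.
  const headers = {};
  if (entry.etag) headers['If-None-Match'] = entry.etag;
  if (entry.lastModified) headers['If-Modified-Since'] = entry.lastModified;
  if (Object.keys(headers).length === 0) {
    // No validator stored, so revalidation is impossible: fetch again.
    return { action: 'fetch', headers };
  }
  return { action: 'revalidate', headers };
}

// Maps the server's status code to what happens to the cached copy.
function interpretStatus(status) {
  if (status === 304) return 'use-cached-copy';
  if (status === 200) return 'replace-cached-copy';
  return 'no-usable-copy'; // e.g. 404
}
```

For example, `decideCacheAction({ expiresAt: 500, etag: '"abc"' }, 1000)` yields a `revalidate` action carrying an `If-None-Match` header, and a 304 reply then means the cached copy is reused.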
The local cache
Local caching is controlled by setting response headers.
Expires
res.setHeader('Expires', new Date(Date.now() + 30 * 1000).toUTCString())
Cache-Control
res.setHeader('Cache-Control', 'max-age=30')
Note: Cache-Control was added in HTTP 1.1 to address a flaw in Expires: Expires is an absolute time, so it only works correctly when the client and server clocks agree. If max-age is set, it overrides Expires.
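The precedence rule can be sketched as a helper that computes how long a response stays fresh. The function name `freshnessLifetime` and the lowercase header-object shape are assumptions for this example, and the Cache-Control parsing is deliberately minimal (it only looks for max-age).

```javascript
// Returns the freshness lifetime in seconds, or null if neither header provides one.
// `headers` is assumed to be a plain object with lowercase header names.
function freshnessLifetime(headers) {
  const cacheControl = headers['cache-control'] || '';
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl);
  if (match) {
    // max-age is relative to the response time and takes precedence over Expires.
    return Number(match[1]);
  }
  if (headers.expires && headers.date) {
    // Expires is an absolute time; compare it with the server's Date header.
    const expires = Date.parse(headers.expires);
    const date = Date.parse(headers.date);
    if (!Number.isNaN(expires) && !Number.isNaN(date)) {
      return Math.max(0, (expires - date) / 1000);
    }
  }
  return null;
}
```

With both headers present, for instance `max-age=30` and an Expires value ten minutes after Date, the helper returns 30 and the Expires value is ignored.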
The complete code
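const http = require('http')
const url = require('url')
const path = require('path')
const fs = require('fs')
const mime = require('mime') // third-party package; provides mime.getType()

http.createServer(function(req, res) {
  let { pathname } = url.parse(req.url, true)
  let filepath = path.join(__dirname, pathname)
  fs.stat(filepath, (err, stat) => {
    // Treat missing paths and non-files (e.g. directories) as not found
    if (err || !stat.isFile()) {
      sendError(req, res)
    } else {
      send(req, res, filepath)
    }
  })
}).listen(8080)

function sendError(req, res) {
  res.statusCode = 404
  res.end('Not Found')
}

function send(req, res, filepath) {
  // mime.getType() returns null for unknown extensions, so fall back to a generic type
  res.setHeader('Content-Type', mime.getType(filepath) || 'application/octet-stream')
  // Let the browser reuse its local copy for 30 seconds
  res.setHeader('Expires', new Date(Date.now() + 30 * 1000).toUTCString())
  res.setHeader('Cache-Control', 'max-age=30')
  fs.createReadStream(filepath).pipe(res)
}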
Negotiated cache
Last-Modified & If-Modified-Since
- Last-Modified and If-Modified-Since are a pair of headers that date back to HTTP 1.0. The server sends Last-Modified in the response, and the browser sends the same value back in If-Modified-Since when revalidating.
- Last-Modified is the time at which the web server believes the resource was last modified, such as the last modification time of a file or the last generation time of a dynamic page.
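const http = require('http')
const url = require('url')
const path = require('path')
const fs = require('fs')
const mime = require('mime') // third-party package; provides mime.getType()

http.createServer(function(req, res) {
  let { pathname } = url.parse(req.url, true)
  let filepath = path.join(__dirname, pathname)
  fs.stat(filepath, (err, stat) => {
    if (err || !stat.isFile()) {
      res.statusCode = 404
      res.end('Not Found')
    } else {
      let ifModifiedSince = req.headers['if-modified-since']
      // Use mtime (content modification time) rather than ctime (inode change time)
      let lastModified = stat.mtime.toUTCString()
      if (ifModifiedSince === lastModified) {
        // Unchanged since the browser cached it: reply with an empty 304
        res.statusCode = 304
        res.end()
      } else {
        send(req, res, filepath, stat)
      }
    }
  })
}).listen(8080)

function send(req, res, filepath, stat) {
  res.setHeader('Content-Type', mime.getType(filepath) || 'application/octet-stream')
  res.setHeader('Last-Modified', stat.mtime.toUTCString())
  fs.createReadStream(filepath).pipe(res)
}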
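The handler above compares the two header strings for exact equality. A slightly more tolerant variant parses both dates and treats the resource as unmodified when its modification time, truncated to whole seconds because HTTP dates have one-second resolution, is not later than the If-Modified-Since time. The helper name `isNotModifiedSince` is hypothetical; this is a sketch of the comparison only.

```javascript
// ifModifiedSince: the raw request header value (string or undefined)
// mtime: a Date, for example stat.mtime from fs.stat
function isNotModifiedSince(ifModifiedSince, mtime) {
  if (!ifModifiedSince) return false;
  const since = Date.parse(ifModifiedSince);
  if (Number.isNaN(since)) return false; // unparseable date: ignore the header
  // HTTP dates carry no milliseconds, so compare at one-second resolution.
  const modifiedSeconds = Math.floor(mtime.getTime() / 1000);
  return modifiedSeconds <= Math.floor(since / 1000);
}
```

In the server this would replace the string comparison, so that a 304 is returned whenever `isNotModifiedSince(req.headers['if-modified-since'], stat.mtime)` is true.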
ETag & If-None-Match
- ETag and If-None-Match are a pair of headers that belong to HTTP 1.1.
- ETag addresses limitations of Last-Modified, such as its one-second resolution. An ETag is a unique identifier for a version of a file. Like a hash or fingerprint, each file has its own tag, and the tag changes whenever the file changes.
- The ETag mechanism is similar to optimistic locking. If the ETag sent in the request differs from the server's current ETag, the resource has been modified and the latest content must be sent to the browser.
- If a request carries both If-Modified-Since and If-None-Match, the HTTP specification requires the server to evaluate If-None-Match and ignore If-Modified-Since. The server returns 304 if the ETag matches its own; otherwise it sends the latest content to the browser.
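const http = require('http')
const url = require('url')
const path = require('path')
const fs = require('fs')
const crypto = require('crypto')
const mime = require('mime') // third-party package; provides mime.getType()

http.createServer(function(req, res) {
  let { pathname } = url.parse(req.url, true)
  let filepath = path.join(__dirname, pathname)
  fs.stat(filepath, (err, stat) => {
    if (err) {
      sendError(req, res)
    } else {
      let ifNoneMatch = req.headers['if-none-match']
      // Hash the file contents to derive the ETag
      let out = fs.createReadStream(filepath)
      let md5 = crypto.createHash('md5')
      out.on('error', function() {
        sendError(req, res) // e.g. the path is a directory
      })
      out.on('data', function(data) {
        md5.update(data)
      })
      out.on('end', function() {
        // An ETag value is a quoted string
        let etag = '"' + md5.digest('hex') + '"'
        if (ifNoneMatch === etag) {
          res.statusCode = 304
          res.end()
        } else {
          send(req, res, filepath, etag)
        }
      })
    }
  })
}).listen(8080)

function sendError(req, res) {
  res.statusCode = 404
  res.end('Not Found')
}

function send(req, res, filepath, etag) {
  res.setHeader('Content-Type', mime.getType(filepath) || 'application/octet-stream')
  res.setHeader('ETag', etag)
  fs.createReadStream(filepath).pipe(res)
}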
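When a request carries both validators, the server has to decide which one to trust. The sketch below gives If-None-Match priority and only falls back to If-Modified-Since when no ETag was sent, which is the order the HTTP specification prescribes. The names `etagMatches` and `isFresh` and the shape of `current` are assumptions for this example, and splitting the header on commas is a simplification that assumes the tags themselves contain no commas.

```javascript
// Compares an If-None-Match header (possibly a list, or "*") with the current ETag.
// Weak and strong tags are treated alike, as If-None-Match uses weak comparison.
function etagMatches(ifNoneMatch, etag) {
  if (ifNoneMatch.trim() === '*') return true;
  const strip = (tag) => tag.trim().replace(/^W\//, '');
  return ifNoneMatch.split(',').some((candidate) => strip(candidate) === strip(etag));
}

// reqHeaders: lowercase request headers; current: { etag, lastModified } of the resource.
// Returns true when the server may answer 304.
function isFresh(reqHeaders, current) {
  const ifNoneMatch = reqHeaders['if-none-match'];
  if (ifNoneMatch !== undefined) {
    // If-None-Match is present, so If-Modified-Since is ignored entirely.
    return etagMatches(ifNoneMatch, current.etag);
  }
  const ifModifiedSince = reqHeaders['if-modified-since'];
  if (ifModifiedSince !== undefined) {
    const since = Date.parse(ifModifiedSince);
    return !Number.isNaN(since) && Date.parse(current.lastModified) <= since;
  }
  return false; // no validators: send the full response
}
```

Note that a request with a stale ETag but a matching If-Modified-Since date is treated as modified here, because the ETag check decides the outcome.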
The ETag/Last-Modified process is as follows:
- The client requests page A.
- The server returns page A and adds a Last-Modified/ETag header to the response.
- The client renders the page and caches it along with the Last-Modified/ETag value.
- The client requests page A again and sends the server the Last-Modified/ETag value returned by the previous response (as If-Modified-Since/If-None-Match).
- The server checks the Last-Modified or ETag value, determines that the page has not been modified since the client's last request, and returns a 304 response with an empty body.
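The round trip above can be simulated in memory without any network. In this sketch `createServer` and `createClient` are hypothetical helpers (unrelated to Node's `http.createServer`), and only the ETag validator is modeled.

```javascript
// A stand-in "server" holding one page: { etag, body }.
function createServer(page) {
  return {
    page,
    handle(request) {
      const headers = { ETag: this.page.etag };
      if (request.headers['If-None-Match'] === this.page.etag) {
        // Unchanged since the client's copy: empty body, status 304.
        return { status: 304, headers, body: '' };
      }
      return { status: 200, headers, body: this.page.body };
    },
  };
}

// A stand-in "client" that caches the page together with its ETag.
function createClient(server) {
  let cached = null;
  return {
    get() {
      // Send the stored ETag back on repeat requests.
      const headers = cached ? { 'If-None-Match': cached.etag } : {};
      const response = server.handle({ headers });
      if (response.status === 200) {
        cached = { etag: response.headers.ETag, body: response.body };
      }
      // On 304 the body comes from the cached copy.
      return { status: response.status, body: cached.body };
    },
  };
}
```

The first `get()` receives 200 and fills the cache, a second `get()` receives 304 and serves the cached body, and replacing `server.page` with a new ETag makes the next `get()` return 200 with the new body.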
Server-side cache
CDN cache
A CDN cache is also known as a gateway cache or reverse proxy cache. The browser first sends its web request to the CDN gateway. The gateway sits in front of one or more load-balanced origin servers and dynamically forwards the request to an appropriate origin server according to their current load.
CDN cache policy
- Cache policies on CDN edge nodes vary between service providers, but they generally follow the HTTP standard and take the cache time for edge node data from the Cache-Control: max-age field in the HTTP response header.
- When a client requests data from a CDN node, the node checks whether its cached data has expired. If not, it returns the cached data directly to the client. Otherwise it sends a back-to-origin request to pull the latest data from the origin server, updates its local cache, and returns the latest data to the client.
- CDN service providers generally let you specify cache times along several dimensions, such as file suffix and directory, to give finer-grained cache management.
- CDN cache time directly affects the “back-to-origin rate”. If the cache time is too short, data on the edge nodes expires often, causing frequent back-to-origin requests that increase the load on the origin server and the access latency. If the cache time is too long, updated data is slow to reach clients. Developers need to choose cache times according to the needs of each specific service.
- CDN cache refresh: CDN edge nodes are transparent to developers. The browser's local cache can be bypassed with a Ctrl+F5 forced refresh, whereas the CDN edge node cache is cleared through the “cache refresh” interface provided by the CDN service provider. After updating data, developers can use this function to force the cached data on the CDN nodes to expire, so that clients pull the latest data on their next access.
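The edge node behavior described in this list can be sketched as a small in-memory cache. This is a toy model, not any provider's API: the class `EdgeCache`, its methods, and the suffix-to-TTL rule table are all assumptions made for illustration, and the origin is represented by a plain function.

```javascript
class EdgeCache {
  // fetchFromOrigin: (path) => body, standing in for a back-to-origin request
  // ttlRules: suffix -> cache time in seconds, e.g. { '.js': 3600 }
  constructor(fetchFromOrigin, ttlRules, defaultTtlSeconds) {
    this.fetchFromOrigin = fetchFromOrigin;
    this.ttlRules = ttlRules;
    this.defaultTtlSeconds = defaultTtlSeconds;
    this.store = new Map();
    this.hits = 0;
    this.originFetches = 0;
  }

  // Picks the cache time by file suffix, falling back to the default.
  ttlFor(path) {
    const dot = path.lastIndexOf('.');
    const suffix = dot === -1 ? '' : path.slice(dot);
    return Object.prototype.hasOwnProperty.call(this.ttlRules, suffix)
      ? this.ttlRules[suffix]
      : this.defaultTtlSeconds;
  }

  get(path, nowMs) {
    const entry = this.store.get(path);
    if (entry && nowMs < entry.expiresAt) {
      this.hits += 1; // still valid: serve from the edge
      return entry.body;
    }
    // Expired or missing: go back to the origin and refresh the local copy.
    this.originFetches += 1;
    const body = this.fetchFromOrigin(path);
    this.store.set(path, { body, expiresAt: nowMs + this.ttlFor(path) * 1000 });
    return body;
  }

  // Models the provider's "cache refresh" interface for a single path.
  purge(path) {
    this.store.delete(path);
  }

  backToOriginRate() {
    const total = this.hits + this.originFetches;
    return total === 0 ? 0 : this.originFetches / total;
  }
}
```

Shortening the TTL in `ttlRules` raises `backToOriginRate()`, while lengthening it means updated origin content stays invisible until the entry expires or `purge` is called.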
CDN advantages
- CDN nodes solve the problem of cross-operator and cross-region access, greatly reducing access latency.
- Most requests are served at the CDN edge, which offloads traffic and reduces the load on the origin server.
HTML5 Cache idea
- Users can access your application offline, which is especially important for mobile users who cannot stay connected all the time.
- Reading cached files locally usually means faster access.
- Only modified resources are loaded, which avoids repeated requests to the server for the same resource and greatly reduces the load on the server.