This article was first published on Array_Huang's technology blog, Utility first. Original address: https://segmentfault.com/a/1190000010317802. If you are interested in this series, you are welcome to subscribe here: https://segmentfault.com/blog/array_huang
Preface
A mature project inevitably goes through iterative updates. In this article, we look at how to handle the browser cache properly with the help of webpack (and this architecture).
In fact, I had wanted to write this part for a long time, but I held back for fear of embarrassing myself, because the approach I had at the time was not satisfactory. Now that webpack has moved to v2 and third-party plugins have become more plentiful, we have more tools for handling the cache.
A brief introduction to browser caching
Here's a quick look at the browser cache and at why the title stresses that the cache should go when it should go and stay when it should stay.
What is a browser cache?
The browser cache is a browser feature that saves network bandwidth and speeds up website access. It works like this:
- The user visits a web page for the first time, and the page loads various static resources (JS/CSS/images/fonts…). The browser stores these static resources, and even the page itself (the HTML file), locally.
- If a later visit requests the same static resource (matched by URL) and that resource has not expired (the server side has a series of policies that decide whether a resource has expired, such as Cache-Control, Pragma, ETag, Expires, and Last-Modified), the browser uses the locally stored copy directly instead of requesting it again.
Since webpack is only responsible for building the front-end static resources of a website and does not touch the server, this article does not discuss cache control strategies based on HTTP headers. So what are we going to talk about?
Because the browser decides whether a static resource is cached based on its URL, and the directory of a static resource is relatively fixed, the focus is obviously on the file name of the static resource. By controlling the file name, we decide whether the cache goes or stays.
What happens when the browser cache should stay but doesn't?
If the file name of a static resource changes every time a new version is deployed, the browser treats the resource as one it has never seen before. Then, even if the content of the resource is exactly the same as in the previous version, the browser has to download it again, wasting network bandwidth and slowing down page loading.
What happens when the browser cache should go but doesn't?
If the file name of a static resource does not change when a new version is deployed, the browser decides that the cached copy can be used. So even if the content of the resource has changed since the previous version, the browser doesn't notice and keeps using the old copy. What are the consequences? They can be minor or major. At best, the user still sees the old version of the resource and the release does not take effect; at worst, the site throws runtime errors, the layout breaks, and so on.
How do you control the browser cache by manipulating the file names of static resources?
webpack's file-naming options accept a set of variables (or naming rules) that let you name generated files dynamically instead of presetting a fixed name. For caching, we mainly use the [hash] and [chunkhash] variables. I already explained what these two variables mean in an earlier article in this series, "What are the common parts of webpack configuration?", so I won't repeat it here.
Here’s a summary of the use of the [hash] and [chunkhash] variables:
- With [hash]: this hash string is updated every time you build the code with webpack, so it amounts to forcing the browser cache to refresh.
- With [chunkhash]: a hash string derived from the chunk's content is inserted into the file name. In other words, as long as the content of the chunk stays the same, so does the name of the file generated from it. Therefore, the browser cache continues to be used.
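The difference between the two variables can be illustrated with a small Node.js sketch. This is only a simplified model, not webpack's actual hashing: the `md5` helper, the two naming functions, and the `index`/`cart` builds are all hypothetical names made up for the illustration.

```javascript
const crypto = require('crypto');

// Short MD5 digest of a string, standing in for webpack's hash strings.
function md5(text) {
  return crypto.createHash('md5').update(text).digest('hex').slice(0, 8);
}

// [hash]-style name: derived from the whole build, so a change anywhere renames every file.
function nameWithBuildHash(chunkName, allChunks) {
  const buildHash = md5(Object.values(allChunks).join('\n'));
  return `${chunkName}.${buildHash}.js`;
}

// [chunkhash]-style name: derived only from this chunk's own content.
function nameWithChunkHash(chunkName, allChunks) {
  return `${chunkName}.${md5(allChunks[chunkName])}.js`;
}

// Two hypothetical builds: only the "cart" page changes between them.
const buildV1 = { index: 'console.log("index v1")', cart: 'console.log("cart v1")' };
const buildV2 = { index: 'console.log("index v1")', cart: 'console.log("cart v2")' };

console.log(nameWithBuildHash('index', buildV1) === nameWithBuildHash('index', buildV2)); // false
console.log(nameWithChunkHash('index', buildV1) === nameWithChunkHash('index', buildV2)); // true
console.log(nameWithChunkHash('cart', buildV1) === nameWithChunkHash('cart', buildV2)); // false
```

In this model the untouched `index` chunk keeps its name under the [chunkhash]-style scheme, so its cached copy stays usable, while the changed `cart` chunk gets a new name.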
Which resources need browser cache handling?
In theory, every file generated by webpack needs browser cache handling, except for HTML files: their paths must stay relatively fixed, so their caching can only be handled on the server side.
JS
Under the webpack architecture, JS files come in different types and therefore need different configurations:
- Entry files: add [chunkhash] to the generated file name in the output.filename option of the webpack configuration.
- Asynchronously loaded chunks: use the output.chunkFilename option, in the same way.
- Files generated by CommonsChunkPlugin: use the filename option in CommonsChunkPlugin's configuration, in the same way. But be careful: if you use [chunkhash] here, the webpack build fails with an error. So what do you do? If you use [hash] instead, won't this common chunk be force-refreshed every time a new version goes live? The root cause is that the webpack runtime && manifest is stored in your common chunk. For a solution, see the webpack runtime && manifest section below.
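As a rough sketch of the first two items, the relevant part of a webpack configuration might look like the following. The output directory and the exact name patterns are assumptions chosen for illustration, not the settings of the author's project.

```javascript
const path = require('path');

// Sketch of the JS-related output options (directory and name patterns are hypothetical).
const config = {
  output: {
    path: path.resolve('build'),
    // Entry chunks: the file name changes only when the chunk's content changes.
    filename: '[name].[chunkhash].js',
    // Asynchronously loaded chunks: same idea, keyed by chunk id.
    chunkFilename: '[id].[chunkhash].bundle.js',
  },
};

module.exports = config;
```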
CSS
For CSS, if you use style-loader to inline the CSS directly into the page, then you only need to take care of the browser cache for the JS files that import that CSS.
If you use extract-text-webpack-plugin to package CSS into separate files, add [contenthash] to the file name rather than [chunkhash] (thanks @flying_hbt). What is this contenthash? It is a hash string generated from the content of the extracted CSS file itself. [chunkhash], by contrast, is already used as the hash string for the content of the whole chunk, and continuing to use it for the CSS file causes the problem described in the ExtractTextPlugin discussion near the end of this article.
Static resources such as images and font files
In "I heard that webpack can even pack images and fonts?", url-loader or file-loader is used to process such static resources.
For url-loader, you don't need to worry about the browser cache, because it converts static resources into data URLs rather than stand-alone files.
For file-loader, add [hash] to the file name (file-loader computes this hash from the content of the file itself). Also note that url-loader is usually configured to fall back to file-loader when a file is larger than a given size limit, so add [hash] to the file name option there as well.
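A rule along these lines could look like the sketch below. The 8 KB limit, the extension list, and the `static/` prefix are assumptions for illustration. Note that the placeholder file-loader offers in its name template is [hash], computed from the file's own content, rather than [chunkhash].

```javascript
// Sketch of a rule for images and fonts (limit, extensions and path pattern are hypothetical).
const staticAssetRule = {
  test: /\.(png|jpe?g|gif|svg|woff2?|ttf|eot)$/,
  loader: 'url-loader',
  options: {
    // Files smaller than this many bytes are inlined as data URLs, so no cache handling is needed.
    limit: 8192,
    // Larger files fall back to file-loader, which receives this name pattern;
    // [hash] here is derived from the file's own content.
    name: 'static/[name].[hash].[ext]',
  },
};

module.exports = { module: { rules: [staticAssetRule] } };
```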
webpack runtime && manifest
The runtime is the helper code that lets the bundles built by webpack run in the browser. In other words, besides your own source code and the npm libraries, the packaged files contain a bit of helper code provided by webpack.
The manifest is a lookup table that webpack uses to find the real path of a chunk; put simply, it maps chunk names to chunk paths. The manifest is usually embedded in the runtime, so you can find it by reading the runtime, though it is not easy to spot. It looks like this (excerpt from the common chunk):
    u.type = "text/javascript",
    u.charset = "utf-8",
    u.async = !0,
    u.timeout = 12e4,
    n.nc && u.setAttribute("nonce", n.nc),
    u.src = n.p + "" + e + "." + {
        0: "e6d1dff43f64d01297d3",
        1: "7ad996b8cbd7556a3e56",
        2: "c55991cf244b3d833c32",
        3: "ecbcdaa771c68c97ac38",
        4: "6565e12e7bad74df24c3",
        5: "9f2774b4601839780fc6"
    }[e] + ".bundle.js";
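For readability, the lookup performed by the last statement of that excerpt can be rewritten as the sketch below. The function name `chunkUrl` and the `/build/` public path are hypothetical; `publicPath` plays the role of `n.p` in the minified code.

```javascript
// Readable rewrite of the lookup in the minified runtime (two hashes copied from the excerpt).
const chunkHashes = {
  0: 'e6d1dff43f64d01297d3',
  1: '7ad996b8cbd7556a3e56',
};

// publicPath corresponds to n.p in the excerpt; chunkId corresponds to e.
function chunkUrl(chunkId, publicPath) {
  return publicPath + chunkId + '.' + chunkHashes[chunkId] + '.bundle.js';
}

console.log(chunkUrl(1, '/build/')); // /build/1.7ad996b8cbd7556a3e56.bundle.js
```

Because every chunk's hash is written into this table, any change to any chunk also changes the code that contains the table.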
Where does the runtime && manifest go?
So where does this runtime && manifest code end up? Normally, if no common chunk is generated with CommonsChunkPlugin, the runtime && manifest is placed in every chunk that is led by an entry file; in our multi-page application, that means one copy of the runtime && manifest in each page's bundle. Once CommonsChunkPlugin is used, the runtime && manifest moves into the common chunk.
The cache crisis that the runtime && manifest brings to the common chunk
While moving the runtime && manifest into the common chunk solves the code redundancy problem, it creates another one: since every static resource name mentioned above uses [chunkhash], even a slight change to the source code changes the name of at least one chunk, which changes the runtime && manifest, which in turn changes the common chunk. (This is probably why webpack does not allow a common chunk that contains the runtime && manifest to use [chunkhash]: the hash would change on every build anyway, so using it would be pointless.)
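The chain reaction can be modelled in a few lines of Node.js. This is a deliberately simplified model under stated assumptions: the manifest is treated as a JSON map of page hashes, and all names (`buildManifest`, `commonChunkHash`, the page contents) are hypothetical.

```javascript
const crypto = require('crypto');

function md5(text) {
  return crypto.createHash('md5').update(text).digest('hex').slice(0, 8);
}

// Simplified model: the manifest maps each page chunk to the hash of its content.
function buildManifest(pages) {
  const manifest = {};
  for (const name of Object.keys(pages)) manifest[name] = md5(pages[name]);
  return JSON.stringify(manifest);
}

const sharedCode = 'function shared() {}'; // identical in both builds
const pagesV1 = { index: 'index v1', cart: 'cart v1' };
const pagesV2 = { index: 'index v1', cart: 'cart v2' }; // one small change

// Hash of the common chunk, with or without the manifest embedded in it.
function commonChunkHash(pages, embedManifest) {
  const content = embedManifest ? sharedCode + buildManifest(pages) : sharedCode;
  return md5(content);
}

console.log(commonChunkHash(pagesV1, true) === commonChunkHash(pagesV2, true)); // false
console.log(commonChunkHash(pagesV1, false) === commonChunkHash(pagesV2, false)); // true
```

With the manifest embedded, a change to one page changes the common chunk's hash; with the manifest kept elsewhere, the common chunk's hash depends only on the shared code.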
To solve this problem (and it is a serious one: the common chunk is the largest chunk, yet it cannot benefit from the cache), we need to split the runtime && manifest out. Add another CommonsChunkPlugin after the one that generates the common chunk:
    /* Extract the common parts shared by all pages */
    new webpack.optimize.CommonsChunkPlugin({
      name: 'commons/commons', // note: chunk names must not be the same!!!
      filename: '…[chunkhash].js', // [chunkhash] can be used here
      minChunks: 4,
    }),
    /* Extract webpack's runtime code, so that a minor change to an entry file
       does not change the common chunk and invalidate a still-valid browser cache */
    new webpack.optimize.CommonsChunkPlugin({
      name: 'webpack-runtime',
      filename: 'commons/commons/webpack-runtime.[hash].js', // note: the runtime can only use [hash]
    }),
As a result, the runtime && manifest code is packaged into a chunk named webpack-runtime. How does this work? Reportedly, when CommonsChunkPlugin is used, webpack puts the runtime && manifest into the last chunk that CommonsChunkPlugin generates; if that chunk contains no other code, the runtime && manifest is naturally isolated.
Note that if you use html-webpack-plugin to generate HTML pages, you must also insert the runtime && manifest chunk into those pages; otherwise, don't blame me when the page breaks.
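One way to do this, sketched under assumptions, is to list the runtime chunk in html-webpack-plugin's `chunks` option. The helper below only builds the options object that would be passed to `new HtmlWebpackPlugin(...)`; the helper name, the page name, and the file layout are hypothetical, and the chunk names are assumed to match the CommonsChunkPlugin configuration.

```javascript
// Builds the options that would be passed to `new HtmlWebpackPlugin(...)` for one page.
// The page name and output file layout are hypothetical.
function htmlPluginOptions(pageName) {
  return {
    filename: `${pageName}/page.html`,
    // Include the runtime chunk together with the common chunk and the page's own chunk,
    // so that the generated HTML references all three.
    chunks: ['webpack-runtime', 'commons/commons', pageName],
  };
}

console.log(htmlPluginOptions('index/index').chunks);
```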
Now that the runtime && manifest lives in its own chunk, the common chunk can be named with [chunkhash]. The runtime && manifest chunk, on the other hand, gets a new name every time webpack runs a build.
Is it necessary to split the manifest out of the runtime && manifest chunk?
Yes, don't be surprised, such a trick really exists.
The rationale for splitting out the manifest is this: what actually changes inside the runtime && manifest is the manifest. Once the manifest is separated, the runtime part basically never changes, so splitting out the manifest makes even better use of the browser cache (the cached runtime can be kept).
How do you do that? There are two mainstream solutions:
- The manifest is generated as a JSON file using chunk-manifest-webpack-plugin and loaded asynchronously by Webpack.
- If you use html-webpack-plugin to generate HTML pages, you can also use inline-chunk-manifest-html-webpack-plugin (an html-webpack-plugin add-on) to write the manifest directly into the HTML page, saving an HTTP request.
I tried the second plan and it worked, but I finally gave it up. Why?
After the manifest was split out, the runtime chunk still had to be named with [hash] rather than [chunkhash], so it could not benefit from the browser cache at all. Later I came up with a compromise: drop even [hash] and hard-code a file name. That does preserve the browser cache. But then I changed my mind: the cache now stays, yet it no longer goes when it should. You might ask: you said the runtime doesn't change, so what's wrong with keeping its cache forever? True, the runtime does not change within the same webpack environment, but there is no telling what happens to it once that environment changes. When the webpack version is updated, the webpack configuration is changed, or a loader or plugin is upgraded, who can guarantee that the runtime stays the same? If the page ends up using a stale cached runtime, the whole system could crash. I really can't afford that risk, so I left things as they were.
Besides, the runtime && manifest chunk of Array-Huang/webpack-seed is only 2 KB!
Miscellaneous cache issues
Cache issues caused by module IDs
When webpack resolves dependencies between modules, it identifies each module by an ID. By default, webpack assigns integer IDs (1, 2, 3…) in the order in which modules are imported. When you add or remove a module dependency in the source code, the whole ID sequence can shift considerably. What does that mean for the browser cache? A chunk may have no substantive change at all, but because the IDs of the modules it references have changed, its file name changes too, which defeats the browser cache.
The official webPack documentation recommends that we use a plugin already built into WebPack 2: HashedModuleIdsPlugin. The official documentation for this plugin is here.
Back in the webpack 1 era there was already NamedModulesPlugin, which uses the relative path of a module directly as its module ID, so the ID does not change as long as the relative path of the module does not change. So how is HashedModuleIdsPlugin an improvement over NamedModulesPlugin?
Because the relative path of a module can be very long, using it as the ID takes up a lot of space, which drew criticism from the community. HashedModuleIdsPlugin instead hashes the module's relative path (using md5 by default) into a string of configurable length (4 characters by default) and uses that as the module ID, so it takes up very little space and can be used with confidence.
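The effect of the two ID schemes can be compared with a small sketch. It only approximates the idea (an MD5 digest of the relative path, truncated to 4 characters); the base64 encoding is an assumption, and the real plugin also deals with details such as collisions that are ignored here. The module paths are hypothetical.

```javascript
const crypto = require('crypto');

// Default-style ids: an integer assigned by the order in which modules are encountered.
function numericIds(modulePaths) {
  const ids = {};
  modulePaths.forEach((p, index) => { ids[p] = index; });
  return ids;
}

// Hashed ids: the first 4 characters of a base64 MD5 digest of the module's relative path.
function hashedIds(modulePaths) {
  const ids = {};
  modulePaths.forEach((p) => {
    ids[p] = crypto.createHash('md5').update(p).digest('base64').slice(0, 4);
  });
  return ids;
}

// Hypothetical module lists: the second build adds one module at the front.
const before = ['./src/a.js', './src/b.js'];
const after = ['./src/new.js', './src/a.js', './src/b.js'];

console.log(numericIds(before)['./src/a.js'] === numericIds(after)['./src/a.js']); // false
console.log(hashedIds(before)['./src/a.js'] === hashedIds(after)['./src/a.js']); // true
```

Adding one module shifts every integer ID after it, while the path-based ID of an unchanged module stays put.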
To generate identifiers that are preserved over builds, webpack supplies the NamedModulesPlugin (recommended for development) and HashedModuleIdsPlugin (recommended for production).
NamedModulesPlugin is recommended for development environments and HashedModuleIdsPlugin for production environments. As far as I’m concerned, HashedModuleIdsPlugin should only be used in production, and the browser cache should not be used in development.
HashedModuleIdsPlugin is easy to use; just add it to the plugins option:
    plugins: [
      // other plugins
      new webpack.HashedModuleIdsPlugin(),
    ]
Some plugins cause file changes to go undetected
Some plugins generate separate chunk files, such as CommonsChunkPlugin or ExtractTextPlugin (which extracts CSS snippets from JS and generates separate CSS files).
When these plugins compute chunk file names, they may not take into account later changes made to the code by other plugins (such as UglifyJsPlugin, which minifies and obfuscates the code), so the resulting file names do not fully reflect changes in the file content.
In addition, ExtractTextPlugin has a serious problem: the [chunkhash] it uses in the file name is taken directly from the JS chunk that references the CSS. In other words, if I change only the CSS and leave the JS alone, the generated CSS file name stays the same, which is a very serious case of the cache not going when it should. Update (2017-07-26): use [contenthash] to avoid this problem; see the CSS section above.
There is a plugin that solves this problem: webpack-plugin-hash-output.
There are other webpack plugins for hashing out there. But when they run, they don’t “see” the final form of the code, because they run before plugins like webpack.optimize.UglifyJsPlugin. In other words, if you change webpack.optimize.UglifyJsPlugin config, your hashes won’t change, creating potential conflicts with cached resources.
The main difference is that webpack-plugin-hash-output runs in the last compilation step. So any change in webpack or any other plugin that actually changes the output, will be “seen” by this plugin, and therefore that change will be reflected in the hash.
In simple terms, webpack-plugin-hash-output recomputes the MD5 hash of every file at the end of the webpack compilation, ensuring that any change in file content is reflected in the file name.
Usage is also relatively simple:
    plugins: [
      // other plugins
      new HashOutput({
        manifestFiles: 'webpack-runtime', // the chunk that contains the manifest
      }),
    ]
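To see why hashing the final output matters, consider the following simplified model. It is not how the plugin is implemented; `postProcess` merely stands in for a late build step such as minification, and its `dropComments` flag is a hypothetical option.

```javascript
const crypto = require('crypto');

function md5(text) {
  return crypto.createHash('md5').update(text).digest('hex').slice(0, 8);
}

// Stand-in for a later build step such as minification; `dropComments` is a hypothetical option.
function postProcess(code, dropComments) {
  return dropComments ? code.replace(/\/\/.*$/gm, '').trim() : code;
}

const source = 'var answer = 42; // the answer';

// Hashes taken before post-processing: blind to changes in the post-processing options.
const earlyHashA = md5(source);
const earlyHashB = md5(source);
// Hashes taken from the final output: they reflect what the browser actually downloads.
const lateHashA = md5(postProcess(source, false));
const lateHashB = md5(postProcess(source, true));

console.log(earlyHashA === earlyHashB); // true
console.log(lateHashA === lateHashB); // false
```

Changing only the post-processing option alters the delivered file, and only the hash computed at the end picks that up.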
Conclusion
Browser caching is very important, very important, very important. If something goes wrong, guess who your boss will come looking for. There are also a great many details in this area, and every one of them has to be handled properly; get one aspect wrong and the whole effort is ruined.
Sample code
This series of articles is best served with my scaffolding project on Github (laughs) : Array-Huang/webpack-seed (https://github.com/Array-Huang/webpack-seed).
Appendix: list of articles in this series (kept up to date)
- Webpack Multi-page Application Architecture Series (1): Step-by-step solutions to architecture pain points
- Webpack Multi-page Application Architecture Series (2): What are the common parts of webpack configuration?
- Webpack Multi-page Application Architecture Series (3): How to package common code to avoid duplication?
- Webpack Multi-page Application Architecture Series (4): The old jQuery plug-ins can't be thrown away, so how do we stay compatible?
- Webpack Multi-page Application Architecture Series (5): I heard that webpack can even pack Less/CSS?
- Webpack Multi-page Application Architecture Series (6): I heard that webpack can even pack images and fonts?
- Webpack Multi-page Application Architecture Series (7): Development environment and production environment, can't tell them apart?
- Webpack Multi-page Application Architecture Series (8): Coach, I want to write ES6! How does webpack integrate Babel?
- Webpack Multi-page Application Architecture Series (9): There are always unruly people who want to harm me! ESLint blocks junk code for you
- Webpack Multi-page Application Architecture Series (10): How to build a custom Bootstrap
- Webpack Multi-page Application Architecture Series (11): Pre-packaged DLLs for lightning-fast webpack compilation
- Webpack Multi-page Application Architecture Series (12): Use webpack to generate HTML pages and page templates
- Webpack Multi-page Application Architecture Series (13): Building a simple template layout system
- Webpack Multi-page Application Architecture Series (14): No more copy and paste! A common infrastructure for multiple projects
- Webpack Multi-page Application Architecture Series (15): How does the front end survive in a back-end rendering development mode?
- Webpack Multi-page Application Architecture Series (16): Make good use of the browser cache, go where you want, stay where you want