About a year ago I was refactoring a large JavaScript codebase into smaller modules, when I discovered a depressing fact about Browserify and Webpack:

“The more I modularize my code, the bigger it gets. 😕” — Nolan Lawson

Later on, Sam Saccone published some excellent research on Tumblr and Imgur’s page load performance, in which he noted:

“Over 400ms is being spent simply walking the Browserify tree.” — Sam Saccone

In this post, I’d like to demonstrate that small modules can have a surprisingly high performance cost depending on your choice of bundler and module system. Furthermore, I’ll explain why this applies not only to the modules in your own codebase, but also to the modules within dependencies, which is a rarely-discussed aspect of the cost of third-party code.

Web perf 101

The more JavaScript included on a page, the slower that page tends to be. Large JavaScript bundles cause the browser to spend more time downloading, parsing, and executing the script, all of which lead to slower load times.

Even when breaking up the code into multiple bundles — Webpack code splitting, Browserify factor bundles, etc. — the cost is merely delayed until later in the page lifecycle. Sooner or later, the JavaScript piper must be paid.

Furthermore, because JavaScript is a dynamic language, and because the prevailing CommonJS module system is also dynamic, it’s fiendishly difficult to extract unused code from the final payload that gets shipped to users. You might only need jQuery’s $.ajax, but by including jQuery, you pay the cost of the entire library.
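To see why this is so hard, consider that the argument to require() can be computed at runtime, and that exports can be accessed dynamically. The sketch below illustrates the problem; the two-feature “library” object is a stand-in for illustration, not real jQuery:

```javascript
// CommonJS is dynamic: the argument to require() can be computed at runtime,
// so a bundler must conservatively include everything it *might* resolve to.
// e.g. var impl = require('jquery/src/' + feature)  // statically unanalyzable

// Even with a static require, usage of the returned object can be dynamic.
// This stand-in "library" has two features, but only one is ever called:
var lib = {
  ajax: function () { return 'did ajax'; },
  animate: function () { return 'did animate'; }
};

var feature = ['ajax', 'animate'][Date.now() % 2]; // decided at runtime
var result = lib[feature](); // a bundler cannot tell which branch is dead
```

A static analyzer looking at this code has no safe way to drop either property, which is exactly why including jQuery means paying for all of it.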

The JavaScript community has responded to this problem by advocating the use of small modules. Small modules have a lot of aesthetic and practical benefits — easier to maintain, easier to test, easier to plug together — but they also solve the jQuery problem by promoting the inclusion of small bits of functionality rather than big “kitchen sink” libraries.

So in the “small modules” world, instead of doing:

var _ = require('lodash')
_.uniq([1, 2, 2, 3])

You might do:

var uniq = require('lodash.uniq')
uniq([1, 2, 2, 3])

Rich Harris has already articulated why the “small modules” pattern is inherently beginner-unfriendly, even though it tends to make life easier for library maintainers. However, there’s also a hidden performance cost to small modules that I don’t think has been adequately explored.

Packages vs modules

It’s important to note that, when I say “modules,” I’m not talking about “packages” in the npm sense. When you install a package from npm, it might only expose a single module in its public API, but under the hood it could actually be a conglomeration of many modules.

For instance, consider a package like is-array. It has no dependencies and only contains one JavaScript file, so it has one module. Simple enough.

Now consider a slightly more complex package like once, which has exactly one dependency: wrappy. Both packages contain one module, so the total module count is 2. So far, so good.

Now let’s consider a more deceptive example: qs. Since it has zero dependencies, you might assume it only has one module. But in fact, it has four!

You can confirm this by using a tool I wrote called browserify-count-modules, which simply counts the total number of modules in a Browserify bundle:


$ npm install qs
$ browserify node_modules/qs | browserify-count-modules
4

What’s going on here? Well, if you look at the source for qs, you’ll see that it contains four JavaScript files, representing four JavaScript modules which are ultimately included in the Browserify bundle.

This means that a given package can actually contain one or more modules. These modules can also depend on other packages, which might bring in their own packages and modules. The only thing you can be sure of is that each package contains at least one module.

Module bloat

How many modules are in a typical web application? Well, I ran browserify-count-modules on a few popular Browserify-using sites, and came up with these numbers:

For the record, my own Pokedex.org (the largest open-source site I’ve built) contains 311 modules across four bundle files.

Ignoring for a moment the raw size of those JavaScript bundles, I think it’s interesting to explore the cost of the number of modules themselves. Sam Saccone has already blown this story wide open in “The cost of transpiling es2015 in 2016”, but I don’t think his findings have gotten nearly enough press, so let’s dig a little deeper.

Benchmark time!

I put together a small benchmark that constructs a JavaScript module importing 100, 1000, and 5000 other modules, each of which merely exports a number. The parent module just sums the numbers together and logs the result:


// index.js
var total = 0
total += require('./module_0')
total += require('./module_1')
total += require('./module_2')
// etc.
console.log(total)

// module_0.js
module.exports = 0

// module_1.js
module.exports = 1

(And so on.)

I tested five bundling methods: Browserify, Browserify with the bundle-collapser plugin, Webpack, Rollup, and Closure Compiler. For Rollup and Closure Compiler I used ES6 modules, whereas for Browserify and Webpack I used CommonJS, so as not to unfairly disadvantage them (since they would need a transpiler like Babel, which adds its own overhead).

In order to best simulate a production environment, I used Uglify with the --mangle and --compress settings for all bundles, and served them gzipped over HTTPS using GitHub Pages. For each bundle, I downloaded and executed it 15 times and took the median, noting the (uncached) load time and execution time using performance.now().

Bundle sizes

Before we get into the benchmark results, it’s worth taking a look at the bundle files themselves. Here are the byte sizes (minified but ungzipped) for each bundle (chart view):

                      100 modules   1000 modules   5000 modules
browserify                   7982          79987         419985
browserify-collapsed         5786          57991         309982
webpack                      3954          39055         203052
rollup                        671           6971          38968
closure                       758           7958          43955

And the minified+gzipped sizes (chart view):

                      100 modules   1000 modules   5000 modules
browserify                   1649          13800          64513
browserify-collapsed         1464          11903          56335
webpack                       693           5027          26363
rollup                        300           2145          11510
closure                       302           2140          11789

What stands out is that the Browserify and Webpack versions are much larger than the Rollup and Closure Compiler versions. If you take a look at the code inside the bundles, it becomes clear why.

The way Browserify and Webpack work is by isolating each module into its own function scope, and then including a top-level runtime loader that locates the proper module whenever require() is called. Here’s what our Browserify bundle looks like:

(function e(t,n,r){function s(o,u){if(!n[o]){if(!t[o]){var a=typeof require=="function"&&require;if(!u&&a)return a(o,!0);if(i)return i(o,!0);var f=new Error("Cannot find module '"+o+"'");throw f.code="MODULE_NOT_FOUND",f}var l=n[o]={exports:{}};t[o][0].call(l.exports,function(e){var n=t[o][1][e];return s(n?n:e)},l,l.exports,e,t,n,r)}return n[o].exports}var i=typeof require=="function"&&require;for(var o=0;o<r.length;o++)s(r[o]);return s})

Whereas the Rollup and Closure bundles look more like what you might hand-author if you were just writing one big module. Here’s the Rollup bundle:


(function () {
  'use strict';
  var total = 0
  total += 0
  total += 1
  total += 2
  // etc.

The important thing to notice is that every module in Webpack and Browserify gets its own function scope, and is loaded at runtime when require()d from the main script. Rollup and Closure Compiler, on the other hand, just hoist everything into a single function scope (creating variables and namespacing them as necessary).

If you understand the inherent cost of functions-within-functions in JavaScript, and of looking up a value in an associative array, then youā€™ll be in a good position to understand the following benchmark results.
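To make those two costs concrete, here is a toy contrast between the strategies, vastly simplified relative to what the bundlers actually emit:

```javascript
// Browserify/Webpack style (simplified): each module is a factory function
// in a map, and require() is a runtime lookup plus a call, cached afterwards.
var moduleMap = {
  './module_0': function (module) { module.exports = 0; },
  './module_1': function (module) { module.exports = 1; }
};
var cache = {};
function fakeRequire(id) {
  if (!cache[id]) {
    cache[id] = { exports: {} };
    moduleMap[id](cache[id]); // run the module factory on first access
  }
  return cache[id].exports;
}
var totalRuntime = fakeRequire('./module_0') + fakeRequire('./module_1');

// Rollup/Closure style: the same modules collapse to hoisted variables in a
// single scope, so "importing" is just a variable reference at zero cost.
var module_0 = 0;
var module_1 = 1;
var totalHoisted = module_0 + module_1;
```

Every fakeRequire() call pays for an associative-array lookup and a function invocation, and every module body adds a closure for the engine to allocate; the hoisted version pays for none of that.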

Results

I ran this benchmark on a Nexus 5 with Android 5.1.1 and Chrome 52 (to represent a low-to-mid-range device) as well as an iPod Touch 6th generation running iOS 9 (to represent a high-end device).

Here are the results for the Nexus 5 (tabular results):

And here are the results for the iPod Touch (tabular results):

At 100 modules, the variance between all the bundlers is pretty negligible, but once we get up to 1000 or 5000 modules, the difference becomes severe. The iPod Touch is hurt the least by the choice of bundler, but the Nexus 5, being an aging Android phone, suffers a lot under Browserify and Webpack.

I also find it interesting that both Rollup and Closure’s execution cost is essentially free for the iPod, regardless of the number of modules. And in the case of the Nexus 5, the runtime costs aren’t free, but they’re still much cheaper for Rollup/Closure than for Browserify/Webpack, the latter of which chew up the main thread for several frames if not hundreds of milliseconds, meaning that the UI is frozen just waiting for the module loader to finish running.

Note that both of these tests were run on a fast Gigabit connection, so in terms of network costs, it’s really a best-case scenario. Using Chrome Dev Tools, we can manually throttle that Nexus 5 down to 3G and see the impact (tabular results):

Once we take slow networks into account, the difference between Browserify/Webpack and Rollup/Closure is even more stark. In the case of 1000 modules (which is close to Reddit’s count of 1050), Browserify takes about 400 milliseconds longer than Rollup. And that 400ms is no small potatoes, since Google and Bing have both noted that sub-second delays have an appreciable impact on user engagement.

One thing to note is that this benchmark doesn’t measure the precise execution cost of 100, 1000, or 5000 modules per se, since that will depend on your usage of require(). Inside of these bundles, I’m calling require() once per module, but if you are calling require() multiple times per module (which is the norm in most codebases) or if you are calling require() multiple times on-the-fly (i.e. require() within a sub-function), then you could see severe performance degradations.

Reddit’s mobile site is a good example of this. Even though they have 1050 modules, I clocked their real-world Browserify execution time as much worse than the “1000 modules” benchmark. When profiling on that same Nexus 5 running Chrome, I measured 2.14 seconds for Reddit’s Browserify require() function, and 197 milliseconds for the equivalent function in the “1000 modules” script. (In desktop Chrome on an i7 Surface Book I also measured it at 559ms vs 37ms, which is pretty astonishing given that we’re talking about desktop.)

This suggests that it may be worthwhile to run the benchmark again with multiple require()s per module, although in my opinion it wouldn’t be a fair fight for Browserify/Webpack, since Rollup/Closure both resolve duplicate ES6 imports into a single hoisted variable declaration, and it’s also impossible to import from anywhere but the top-level scope. So in essence, the cost of a single import for Rollup/Closure is the same as the cost of n imports, whereas for Browserify/Webpack, the execution cost will increase linearly with n require()s.
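The asymmetry can be sketched like this, where fakeRequire stands in for the bundler’s runtime loader and the counts are purely illustrative:

```javascript
// CommonJS: every require() call-site executes the loader at runtime,
// so the lookup cost is paid once per call, not once per module.
var lookups = 0;
var registry = {
  uniq: function (xs) {
    return xs.filter(function (x, i) { return xs.indexOf(x) === i; });
  }
};
function fakeRequire(id) { lookups++; return registry[id]; }

function a() { return fakeRequire('uniq')([1, 2, 2]); }
function b() { return fakeRequire('uniq')([3, 3, 4]); }
var resultA = a(); // [1, 2]
var resultB = b(); // [3, 4]
// lookups is now 2: each call-site paid for the lookup again.

// ES6 modules under Rollup/Closure: duplicate imports of the same binding
// collapse at build time into one hoisted declaration, referenced directly:
// var uniq = function (xs) { /* ... */ }  // zero lookups, however many uses
```
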

For the purposes of this analysis, though, I think it’s best to just assume that the number of modules is only a lower bound for the performance hit you might feel. In reality, the “5000 modules” benchmark may be a better yardstick for “5000 require() calls.”

Conclusions

First off, the bundle-collapser plugin seems to be a valuable addition to Browserify. If you’re not using it in production, then your bundle will be a bit larger and slower than it would be otherwise (although I must admit the difference is slight). Alternatively, you could switch to Webpack and get an even faster bundle without any extra configuration. (Note that it pains me to say this, since I’m a diehard Browserify fanboy.)

However, these results clearly show that Webpack and Browserify both underperform compared to Rollup and Closure Compiler, and that the gap widens the more modules you add. Unfortunately I’m not sure Webpack 2 will solve any of these problems, because although they’ll be borrowing some ideas from Rollup, they seem to be more focused on the tree-shaking aspects and not the scope-hoisting aspects. (Update: a better name is “inlining,” and the Webpack team is working on it.)

Given these results, I’m surprised Closure Compiler and Rollup aren’t getting much traction in the JavaScript community. I’m guessing it’s due to the fact that (in the case of the former) it has a Java dependency, and (in the case of the latter) it’s still fairly immature and doesn’t quite work out-of-the-box yet (see Calvin Metcalf’s comments for a good summary).

Even without the average JavaScript developer jumping on the Rollup/Closure bandwagon, though, I think npm package authors are already in a good position to help solve this problem. If you npm install lodash, you’ll notice that the main export is one giant JavaScript module, rather than what you might expect given Lodash’s hyper-modular nature (require('lodash/uniq'), require('lodash.uniq'), etc.). For PouchDB, we made a similar decision to use Rollup as a prepublish step, which produces the smallest possible bundle in a way that’s invisible to users.

I also created rollupify to try to make this pattern a bit easier to drop in to existing Browserify projects. The basic idea is to use ES6 imports and exports within your own project (cjs-to-es6 can help migrate), and then use require() for third-party packages. That way, you still have all the benefits of modularity within your own codebase, while exposing more-or-less one big module to your users. Unfortunately, you still pay the costs for third-party modules, but I’ve found that this is a good compromise given the current state of the npm ecosystem.

So there you have it: one horse-sized JavaScript duck is faster than a hundred duck-sized JavaScript horses. Despite this fact, though, I hope that our community will eventually realize the pickle we’re in — advocating a “small modules” philosophy that’s good for developers but bad for users — and improve our tools, so that we can have the best of both worlds.

Bonus round! Three desktop browsers

Normally I like to run performance tests on mobile devices, since that’s where you see the clearest differences. But out of curiosity, I also ran this benchmark on Chrome 52, Edge 14, and Firefox 48 on an i7 Surface Book using Windows 10 RS1. Here are the results:

Chrome 52 (tabular results)

Edge 14 (tabular results)

Firefox 48 (tabular results)

The only interesting tidbits I’ll call out in these results are:

  1. bundle-collapser is definitely not a slam-dunk in all cases.
  2. The ratio of network-to-execution time is always extremely high for Rollup and Closure; their runtime costs are basically zilch. ChakraCore and SpiderMonkey eat them up for breakfast, and V8 is not far behind.

This latter point could be extremely important if your JavaScript is largely lazy-loaded, because if you can afford to wait on the network, then using Rollup and Closure will have the additional benefit of not clogging up the UI thread, i.e. they’ll introduce less jank than Browserify or Webpack.

Update: in response to this post, JDD has opened an issue on Webpack. There’s also one on Browserify.

Update 2: Ryan Fitzer has generously added RequireJS and RequireJS with Almond to the benchmark, both of which use AMD instead of CommonJS or ES6.

Testing shows that RequireJS has the largest bundle sizes but surprisingly its runtime costs are very close to Rollup and Closure. Here are the results for a Nexus 5 running Chrome 52 throttled to 3G:
