Where did the useless code go? Slimming down your project with Rollup’s tree-shaking

Zuo Lin is a front-end development engineer in the Wedoctor Front-end Technology Department who loves life and technology amid the Internet wave.

This article uses Rollup v2.47.0.

From webpack 2.x, which gradually gained tree-shaking through plugins, to Vite, the recently popular build tool that borrows Rollup’s bundling capabilities, and Vue and React, which are known to be bundled with Rollup as well, Rollup shows up again and again. Especially when we build libraries such as function or utility libraries, Rollup is the first choice! So what is the magic that makes Rollup so popular? The answer may lie in tree-shaking!

First, understanding tree-shaking

1. What is tree-shaking?

The concept of tree-shaking has been around for a long time, but it was only taken seriously once Rollup adopted it. In the spirit of curiosity, let’s use Rollup to take a closer look at tree-shaking.

So, let’s see what tree-shaking actually does.

Among bundlers, tree-shaking was first implemented by Rich Harris’s Rollup. The official description is that it essentially eliminates useless JavaScript code. That is, when I import a module, I do not pull in all of the module’s code; I pull in only the code I need, and the useless code I don’t need is “shaken” away.

Webpack also implements tree-shaking. Specifically, in a webpack project there is an entry file, which is like the trunk of a tree, and the entry file depends on many modules, which are like the branches. In practice, although our code depends on a module, it often uses only part of that module’s functionality. With tree-shaking, the unused parts are shaken off so that useless code is removed.

So we know that tree-shaking is a way to eliminate useless code!

It should be noted, however, that tree-shaking only works with ES6 module syntax, because ES6 modules can be analyzed statically, that is, by reading the code literally without running it. It cannot handle CommonJS modules, whose dependencies are resolved dynamically and are only known once the code has been executed. We can, however, use a plugin to convert CommonJS to ES6 and then apply tree-shaking. As long as we keep thinking, there are always more solutions than difficulties.

In summary, rollup.js uses the ES module standard by default, but it can be made to support the CommonJS standard through the rollup-plugin-commonjs plugin.
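
As a rough sketch of that setup, a Rollup configuration that enables the CommonJS plugin might look like the following. The input and output paths are placeholders that do not come from this article. The plugin has since been republished as `@rollup/plugin-commonjs`, so newer projects may need that package name instead.

```typescript
// Rollup configuration file (all paths below are hypothetical)
import commonjs from 'rollup-plugin-commonjs';

export default {
	input: 'src/index.js',
	output: {
		file: 'dist/bundle.js',
		format: 'es'
	},
	// Converts CommonJS modules to ES modules so that they can be analyzed statically
	plugins: [commonjs()]
};
```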

2. Why do you need tree-shaking?

Today’s web applications can be very bulky, especially their JavaScript code, and JavaScript is expensive for browsers to process. If we can strip out the useless code and hand the browser only the code that matters, we greatly reduce its workload. Tree-shaking helps us do exactly that.

From this perspective, the tree-shaking functionality falls into the category of performance optimization.

After all, removing useless JavaScript from a web project reduces file size, which shortens the time needed to load those files and improves the user experience by cutting the time users wait for a page to open.

Second, an in-depth understanding of the tree-shaking principle

We’ve seen that the essence of tree-shaking is to eliminate useless JS code. So what counts as useless code, and how is it eliminated? Let’s unravel the mystery of DCE and find out.

1. Dead code elimination (DCE)

Useless code is so common that eliminating it has its own name: dead code elimination (DCE). In effect, the compiler works out which code does not affect the output and then removes it.

Tree-shaking is a new implementation of DCE. Unlike programs in traditional languages, JavaScript is mostly loaded over the network and then executed, and the smaller the file that is loaded, the shorter the overall time to execution, so removing unused code matters even more for JavaScript. Tree-shaking also differs from traditional DCE: traditional DCE eliminates code that can never be executed, whereas tree-shaking focuses on eliminating code that is never used.

DCE targets code that:

Will never be executed because it is unreachable
Produces a result that is never used
Only affects dead variables (written but never read)
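
The three categories can be illustrated with a small, made-up function. The names `DEBUG` and `totalWithShipping` are assumptions for this sketch only, and which of these lines a given optimizer actually removes depends on the tool.

```typescript
const DEBUG = false;

export function totalWithShipping(subtotal: number): number {
	const shipping = 5;

	// 1) Unreachable code: DEBUG is a constant false, so this branch can never run.
	if (DEBUG) {
		console.log('subtotal is', subtotal);
	}

	// 2) Unused result: the value is computed and immediately discarded.
	Math.max(subtotal, shipping);

	// 3) Dead variable: `scratch` is written but never read afterwards.
	let scratch = 0;
	scratch = subtotal * 2;

	// Only this line affects the output of the function.
	return subtotal + shipping;
}
```

Removing the three marked parts would leave the return value unchanged, which is exactly what makes them candidates for elimination.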

In traditional compiled languages, the compiler removes dead code from the AST (abstract syntax tree). So how does tree-shaking eliminate useless JavaScript code?

Tree-shaking focuses on eliminating modules that are imported but not used, and this relies on the module features of ES6. So let’s take a look at those features:

ES6 Module

import can only appear as a statement at the top level of a module
The module name in an import can only be a string constant
Import bindings are immutable
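
Because import names and module specifiers are literal text at the top level, a tool can discover them by reading the source without executing it. The sketch below shows this with a deliberately naive regular expression. `findNamedImports` is a hypothetical helper, and Rollup itself works on an Acorn AST rather than on regular expressions.

```typescript
interface StaticImport {
	names: string[];
	from: string;
}

// Deliberately naive: only handles `import { a, b } from '...'` statements.
export function findNamedImports(source: string): StaticImport[] {
	const pattern = /import\s*\{([^}]*)\}\s*from\s*['"]([^'"]+)['"]/g;
	const found: StaticImport[] = [];
	let match: RegExpExecArray | null;
	while ((match = pattern.exec(source)) !== null) {
		const names = match[1]
			.split(',')
			.map(part => part.trim())
			.filter(part => part.length > 0);
		found.push({ names, from: match[2] });
	}
	return found;
}

// The sample text is never executed; it is only read as a string.
const sampleSource = [
	"import { util1, util2 } from './util.js';",
	"import { Mixer } from './Mixer.js';",
	"console.log('entry');"
].join('\n');

export const sampleImports = findNamedImports(sampleSource);
```

A CommonJS call such as `require(someVariable)` could not be handled this way, because the module name is only known at run time.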

With these premises in mind, let’s test them in code.

2. Tree-shaking elimination

We have already described what tree-shaking is for. In the following experiments, index.js is the entry file and the bundled output is written to bundle.js. Files such as a.js and util.js are imported as dependency modules.

1) Eliminate variables

As you can see from the figure above, the variables b and c we defined are not used. They do not appear in the packaged file.
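
A minimal sketch of the kind of entry file this experiment describes might look like the following. The variable values are assumptions, and the expected bundle in the trailing comment is an approximation rather than captured output.

```typescript
// Hypothetical entry file
const a = 1;
const b = 2; // never read
const c = 3; // never read

console.log(a);

// With tree-shaking enabled, the bundle would be expected to keep roughly:
//
//   const a = 1;
//   console.log(a);
```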

2) Eliminate functions

As you can see from the figure above, the functions util1() and util2(), which are imported but never used, are not included in the bundle.

3) Eliminate classes

When a class is only imported but never called

When the class file Mixer.js is imported but none of menu’s methods or variables are used in the actual code, we can see that recent versions of Rollup do eliminate the class’s methods!

4) Side effects

However, Rollup cannot eliminate every side effect; see the related articles. Rollup has a big advantage over webpack in eliminating side effects, but it cannot help in the following cases:

2) Variables defined in the module affect global variables

The figure below shows the result clearly, and you can also try it yourself on the platform provided by the Rollup website.
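
To illustrate the global-variable case, here is a small made-up module. The names `visitCount` and `unusedHelper` are assumptions for this sketch. The top-level write changes state outside the module, so a bundler has to keep it even if nothing is imported from the module, whereas the pure function declaration can safely be dropped.

```typescript
// Hypothetical module; nothing below is taken from the article's figures.
type GlobalWithCounter = typeof globalThis & { visitCount?: number };
const globalScope = globalThis as GlobalWithCounter;

// Side effect: simply loading this module modifies a global variable.
globalScope.visitCount = (globalScope.visitCount || 0) + 1;

// No side effect: this declaration does nothing unless someone calls it.
export function unusedHelper(): string {
	return 'never called';
}
```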

Summary

As the bundling results above show, Rollup produces lean, concise bundles: from importing the dependency modules in the entry file to emitting the bundle, it keeps only the code that is needed. In other words, Rollup needs no additional configuration; as long as your code uses ES6 module syntax, you get tree-shaking. Nice!

So tree-shaking in this bundling process can be roughly understood as relying on two key steps:

ES6 module imports are analyzed statically, so exactly which code is loaded can be determined at compile time.
The program flow is analyzed to determine which variables are used or referenced, and only that code is bundled.

The core of tree-shaking lies in this program flow analysis. Based on scopes, a record object is created for each function or global object while the AST is processed; the imported identifiers are then matched against the resulting scope chain. Finally, only the matched code is bundled, and the code that is not matched is removed.
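
To make the scope-chain matching more concrete, here is a minimal sketch. `ScopeRecord`, `add`, and `markUsed` are hypothetical names invented for illustration; Rollup’s own scope classes are considerably more involved.

```typescript
// Hypothetical, heavily simplified record of the identifiers declared in one scope.
class ScopeRecord {
	private declared: Record<string, boolean> = {};

	constructor(private parent: ScopeRecord | null = null) {}

	// Register a declaration; it starts out unused.
	add(identifier: string): void {
		this.declared[identifier] = false;
	}

	// Walk up the scope chain until the identifier is found, then mark it as used.
	markUsed(identifier: string): boolean {
		if (Object.prototype.hasOwnProperty.call(this.declared, identifier)) {
			this.declared[identifier] = true;
			return true;
		}
		return this.parent !== null ? this.parent.markUsed(identifier) : false;
	}

	// Identifiers declared in this scope that were referenced at least once.
	usedNames(): string[] {
		return Object.keys(this.declared).filter(key => this.declared[key]);
	}
}

const moduleScope = new ScopeRecord();
['util1', 'util2', 'helper'].forEach(id => moduleScope.add(id));

const functionScope = new ScopeRecord(moduleScope);
functionScope.add('local');
functionScope.markUsed('local');
functionScope.markUsed('helper'); // not local, so it is resolved in the parent scope

export const keptAtModuleLevel = moduleScope.usedNames();
export const unknownIdentifier = functionScope.markUsed('missing');
```

In this toy model only `helper` survives at module level, while `util1` and `util2` were declared but never matched and would be left out.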

But at the same time, we should also pay attention to two points:

Write as little code as possible that contains side effects, such as operations that affect global variables.
Referring to a class instantiation and calling a method on that instance can also have side effects that rollup cannot handle.

So how does this process of generating records and matching identifiers during program flow analysis actually work?

Let’s dive into the source code and find out!

Third, the tree-shaking implementation process

Before we analyze how tree-shaking is implemented in this process, we need to know two things:

Tree-shaking in Rollup uses Acorn to parse source code into an AST (abstract syntax tree) and traverse it. Acorn plays a role similar to Babel’s parser, but is much more lightweight. You should also understand the basic AST workflow beforehand.
Rollup uses the magic-string tool to manipulate strings and generate source maps.

Let’s start from the source code and walk through the process in detail, following the core principles of tree-shaking:

In the rollup() phase, the source code is parsed into an AST, each node of the AST is traversed to decide whether it should be included and is marked accordingly, and the chunks are generated and finally exported.
In the generate()/write() phase, code is collected according to the marks made in the rollup() phase, and the final output code is generated.

Let’s grab the source code and start debugging!

// perf-debug.js
loadConfig().then(async config => // Get the collection configuration
	(await rollup.rollup(config)).generate( 
		Array.isArray(config.output) ? config.output[0] : config.output
	)
);

This is probably the code you will care about most when debugging. In a nutshell, it bundles the input into the output, which corresponds to the two phases above.

export async function rollupInternal(
	rawInputOptions: GenericConfigObject, // The configuration passed in
	watcher: RollupWatcher | null
): Promise<RollupBuild> {
	const { options: inputOptions, unsetOptions: unsetInputOptions } = await getInputOptions(
		rawInputOptions,
		watcher !== null
	);
	initialiseTimers(inputOptions);

	const graph = new Graph(inputOptions, watcher); // The graph contains entries and dependencies, operations, caching, etc. The AST transformation is implemented inside the instance, which is the core of rollup

	const useCache = rawInputOptions.cache !== false; // Decide from the configuration whether to use caching
	delete inputOptions.cache;
	delete rawInputOptions.cache;

	timeStart('BUILD', 1);

	try {
		// Call the plugin driver, which invokes the plugins and provides them with their context
		await graph.pluginDriver.hookParallel('buildStart', [inputOptions]);
		await graph.build();
	} catch (err) {
		const watchFiles = Object.keys(graph.watchFiles);
		if (watchFiles.length > 0) {
			err.watchFiles = watchFiles;
		}
		await graph.pluginDriver.hookParallel('buildEnd', [err]);
		await graph.pluginDriver.hookParallel('closeBundle', []);
		throw err;
	}

	await graph.pluginDriver.hookParallel('buildEnd', []);

	timeEnd('BUILD', 1);

	const result: RollupBuild = {
		cache: useCache ? graph.getCache() : undefined,
		closed: false,
		async close() {
			if (result.closed) return;

			result.closed = true;

			await graph.pluginDriver.hookParallel('closeBundle', []);
		},
		// generate - Generate new code as output by processing the marks made while traversing the abstract syntax tree
		async generate(rawOutputOptions: OutputOptions) {
			if (result.closed) return error(errAlreadyClosed());
			// The first parameter isWrite is false
			return handleGenerateWrite(
				false,
				inputOptions,
				unsetInputOptions,
				rawOutputOptions as GenericConfigObject,
				graph
			);
		},
		watchFiles: Object.keys(graph.watchFiles),
		// write - Generate new code as output by processing the marks made while traversing the abstract syntax tree
		async write(rawOutputOptions: OutputOptions) {
			if (result.closed) return error(errAlreadyClosed());
			// The first parameter isWrite is true
			return handleGenerateWrite(
				true,
				inputOptions,
				unsetInputOptions,
				rawOutputOptions as GenericConfigObject,
				graph
			);
		}
	};
	if (inputOptions.perf) result.getTimings = getTimings;
	return result;
}

Of course, this code alone does not tell us much. Let’s read through the source together to trace Rollup’s bundling process and explore how tree-shaking is actually implemented. To keep the process simple and direct, we will skip the plugin-related configuration in the source code and analyze only the core flow.

1. Module parsing

Getting the absolute file path

The resolveId() method resolves a file’s address to its absolute path. Obtaining the absolute path is all we need here, so the details are not analyzed.

export async function resolveId(
	source: string,
	importer: string | undefined,
	preserveSymlinks: boolean
) {
	// Non-entry modules that start with neither '.' nor '/' are skipped at this step
	if (importer !== undefined && !isAbsolute(source) && source[0] !== '.') return null;
	// Call path.resolve to turn a valid file path into an absolute path
	return addJsExtensionIfNecessary(
		importer ? resolve(dirname(importer), source) : resolve(source),
		preserveSymlinks
	);
}

// addJsExtensionIfNecessary() implementation
function addJsExtensionIfNecessary(file: string, preserveSymlinks: boolean) {
	let found = findFile(file, preserveSymlinks);
	if (found) return found;
	found = findFile(file + '.mjs', preserveSymlinks);
	if (found) return found;
	found = findFile(file + '.js', preserveSymlinks);
	return found;
}

// findFile() implementation
function findFile(file: string, preserveSymlinks: boolean): string | undefined {
	try {
		const stats = lstatSync(file);
		if (!preserveSymlinks && stats.isSymbolicLink())
			return findFile(realpathSync(file), preserveSymlinks);
		if ((preserveSymlinks && stats.isSymbolicLink()) || stats.isFile()) {
			const name = basename(file);
			const files = readdirSync(dirname(file));

			if (files.indexOf(name) !== -1) return file;
		}
	} catch {
		// suppress
	}
}
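
The path-resolution step can be tried in isolation with Node’s path module. The sketch below is a simplified stand-in, not Rollup’s function: `resolveRelativeImport` is a hypothetical name, the example paths are made up, and the file-existence and extension checks are left out. It uses `path.posix` so the results do not depend on the operating system.

```typescript
import { posix } from 'path';

// Hypothetical helper: bare specifiers such as 'lodash' are left to other
// resolvers (null), relative ones are resolved against the importer's directory.
export function resolveRelativeImport(source: string, importer?: string): string | null {
	if (importer !== undefined && !posix.isAbsolute(source) && source[0] !== '.') {
		return null;
	}
	return importer ? posix.resolve(posix.dirname(importer), source) : posix.resolve(source);
}

export const resolvedSibling = resolveRelativeImport('./a.js', '/project/src/index.js');
export const resolvedParent = resolveRelativeImport('../util.js', '/project/src/index.js');
export const bareSpecifier = resolveRelativeImport('lodash', '/project/src/index.js');
```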

The rollup() phase

The rollup() phase does a lot of work, including collecting and normalizing the configuration, analyzing files and compiling their source into an AST, creating modules and resolving their dependencies, and finally generating chunks. To find out exactly where tree-shaking comes into play, we need to look at the code that handles modules internally.

First, starting from the entry file’s absolute path, the module definition of the entry file is located, and all of the entry module’s dependency statements are fetched and returned.

private async fetchModule(
	{ id, meta, moduleSideEffects, syntheticNamedExports }: ResolvedId,
	importer: string | undefined, // The module that imports this module
	isEntry: boolean // Whether this is an entry module
): Promise<Module> {
	...
	// Create a Module instance
	const module: Module = new Module(
		this.graph, // The globally unique Graph, which contains entries and dependencies, operations, caching, etc
		id,
		this.options,
		isEntry,
		moduleSideEffects, // Module side effects
		syntheticNamedExports,
		meta
	);
	this.modulesById.set(id, module);
	this.graph.watchFiles[id] = true;
	await this.addModuleSource(id, importer, module);
	await this.pluginDriver.hookParallel('moduleParsed', [module.info]);
	await Promise.all([
		// Handle static dependencies
		this.fetchStaticDependencies(module),
		// Handle dynamic dependencies
		this.fetchDynamicDependencies(module)
	]);
	module.linkImports();
	// Return the current module
	return module;
}

The dependencies are then processed further in fetchStaticDependencies(module) and fetchDynamicDependencies(module) respectively, and the dependent modules are returned.

private fetchResolvedDependency(
	source: string,
	importer: string,
	resolvedId: ResolvedId
): Promise<Module | ExternalModule> {
	if (resolvedId.external) {
		const { external, id, moduleSideEffects, meta } = resolvedId;
		if (!this.modulesById.has(id)) {
			this.modulesById.set(
				id,
				new ExternalModule( // Create an ExternalModule instance
					this.options,
					id,
					moduleSideEffects,
					meta,
					external !== 'absolute' && isAbsolute(id)
				)
			);
		}

		const externalModule = this.modulesById.get(id);
		if (!(externalModule instanceof ExternalModule)) {
			return error(errInternalIdCannotBeExternal(source, importer));
		}
		// Return the external dependency module
		return Promise.resolve(externalModule);
	} else {
		// The dependency is not external: recursively fetch the module and all of its dependencies
		return this.fetchModule(resolvedId, importer, false);
	}
}

Each file is a module, and each module has a Module instance. Inside the Module instance, the module’s code is parsed into an AST by calling Acorn’s parse method.

const ast = this.acornParser.parse(code, {
	...(this.options.acorn as acorn.Options),
	...options
});

Finally, the parsed source is set on the current module, which completes the conversion from file to module, and the ESTree nodes and the various syntax-tree node types they contain are created.

setSource({
	alwaysRemovedCode,
	ast,
	code,
	customTransformCache,
	originalCode,
	originalSourcemap,
	resolvedIds,
	sourcemapChain,
	transformDependencies,
	transformFiles,
	...moduleOptions
}: TransformModuleJSON & {
	alwaysRemovedCode?: [number, number][];
	transformFiles?: EmittedFile[] | undefined;
}) {
	this.info.code = code;
	this.originalCode = originalCode;
	this.originalSourcemap = originalSourcemap;
	this.sourcemapChain = sourcemapChain;
	if (transformFiles) {
		this.transformFiles = transformFiles;
	}
	this.transformDependencies = transformDependencies;
	this.customTransformCache = customTransformCache;
	this.updateOptions(moduleOptions);

	timeStart('generate ast', 3);

	this.alwaysRemovedCode = alwaysRemovedCode || [];
	if (!ast) {
		ast = this.tryParse();
	}
	this.alwaysRemovedCode.push(...findSourceMappingURLComments(ast, this.info.code));

	timeEnd('generate ast', 3);

	this.resolvedIds = resolvedIds || Object.create(null);

	this.magicString = new MagicString(code, {
		filename: (this.excludeFromSourcemap ? null : fileName)!, // Do not include plugin helpers in the sourcemap
		indentExclusionRanges: []
	});
	for (const [start, end] of this.alwaysRemovedCode) {
		this.magicString.remove(start, end);
	}

	timeStart('analyse ast', 3);
	// AST context: wraps a number of methods such as dynamic import and export handling; skim it for an overview
	this.astContext = {
		addDynamicImport: this.addDynamicImport.bind(this), // Dynamic import
		addExport: this.addExport.bind(this),
		addImport: this.addImport.bind(this),
		addImportMeta: this.addImportMeta.bind(this),
		code,
		deoptimizationTracker: this.graph.deoptimizationTracker,
		error: this.error.bind(this),
		fileName,
		getExports: this.getExports.bind(this),
		getModuleExecIndex: () => this.execIndex,
		getModuleName: this.basename.bind(this),
		getReexports: this.getReexports.bind(this),
		importDescriptions: this.importDescriptions,
		includeAllExports: () => this.includeAllExports(true), // The include* methods mark what is kept, which decides what gets tree-shaken
		includeDynamicImport: this.includeDynamicImport.bind(this), // include...
		includeVariableInModule: this.includeVariableInModule.bind(this), // include...
		magicString: this.magicString,
		module: this,
		moduleContext: this.context,
		nodeConstructors,
		options: this.options,
		traceExport: this.getVariableForExportName.bind(this),
		traceVariable: this.traceVariable.bind(this),
		usesTopLevelAwait: false,
		warn: this.warn.bind(this)
	};

	this.scope = new ModuleScope(this.graph.scope, this.astContext);
	this.namespace = new NamespaceVariable(this.astContext, this.info.syntheticNamedExports);
	// Instantiate Program with the AST context and assign it to the ast property of the current module
	this.ast = new Program(ast, { type: 'Module', context: this.astContext }, this.scope);
	this.info.ast = ast;

	timeEnd('analyse ast', 3);
}

2. Marking whether a module can be tree-shaken

Processing of the current module continues: modules and ESTree nodes are included according to the isExecuted state and the treeshake-related configuration. An isExecuted value of true means the module has already been added and does not need to be added again. Finally, all required modules are collected according to isExecuted, which is how tree-shaking is achieved.

// Other methods such as includeVariable() and includeAllExports() are not listed one by one here
private includeStatements() {
	for (const module of [...this.entryModules, ...this.implicitEntryModules]) {
		if (module.preserveSignature !== false) {
			module.includeAllExports(false);
		} else {
			markModuleAndImpureDependenciesAsExecuted(module);
		}
	}
	if (this.options.treeshake) {
		let treeshakingPass = 1;
		do {
			timeStart(`treeshaking pass ${treeshakingPass}`, 3);
			this.needsTreeshakingPass = false;
			for (const module of this.modules) {
				// Mark according to isExecuted
				if (module.isExecuted) {
					if (module.info.hasModuleSideEffects === 'no-treeshake') {
						module.includeAllInBundle();
					} else {
						module.include(); // Mark
					}
				}
			}
			timeEnd(`treeshaking pass ${treeshakingPass++}`, 3);
		} while (this.needsTreeshakingPass);
	} else {
		for (const module of this.modules) module.includeAllInBundle();
	}
	for (const externalModule of this.externalModules) externalModule.warnUnusedImports();
	for (const module of this.implicitEntryModules) {
		for (const dependant of module.implicitlyLoadedAfter) {
			if (!(dependant.info.isEntry || dependant.isIncluded())) {
				error(errImplicitDependantIsNotIncluded(dependant));
			}
		}
	}
}
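
The do/while loop in includeStatements() can be pictured as a fixed-point iteration: passes are repeated until a full pass marks nothing new. The sketch below shows that idea with a hypothetical `SketchStatement` shape and `markIncluded` function; it does not use Rollup’s data structures.

```typescript
interface SketchStatement {
	id: string;
	dependsOn: string[]; // ids of the statements this one references
	hasSideEffects: boolean;
	included: boolean;
}

// Repeats marking passes until a pass adds nothing new; returns the pass count.
export function markIncluded(statements: SketchStatement[]): number {
	const byId: Record<string, SketchStatement> = {};
	for (const statement of statements) byId[statement.id] = statement;

	let passes = 0;
	let needsAnotherPass: boolean;
	do {
		passes++;
		needsAnotherPass = false;
		for (const statement of statements) {
			// Statements with side effects are always kept.
			if (statement.hasSideEffects && !statement.included) {
				statement.included = true;
				needsAnotherPass = true;
			}
			if (!statement.included) continue;
			// Including a statement pulls in everything it references.
			for (const depId of statement.dependsOn) {
				const dep = byId[depId];
				if (dep && !dep.included) {
					dep.included = true;
					needsAnotherPass = true;
				}
			}
		}
	} while (needsAnotherPass);
	return passes;
}

const sketchStatements: SketchStatement[] = [
	{ id: 'c', dependsOn: [], hasSideEffects: false, included: false },
	{ id: 'b', dependsOn: ['c'], hasSideEffects: false, included: false },
	{ id: 'a', dependsOn: ['b'], hasSideEffects: false, included: false },
	{ id: 'log', dependsOn: ['a'], hasSideEffects: true, included: false },
	{ id: 'unused', dependsOn: ['c'], hasSideEffects: false, included: false }
];

export const passCount = markIncluded(sketchStatements);
export const includedIds = sketchStatements
	.filter(statement => statement.included)
	.map(statement => statement.id)
	.sort();
```

Because `c` is declared before the statements that need it, a single pass is not enough here, which is the situation the needsTreeshakingPass flag exists for.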

module.include() operates on ESTree nodes. Because NodeBase’s included flag is initially false, a second condition is needed: whether the node has side effects. Whether a node has side effects depends on the implementations in the various Node subclasses that inherit from NodeBase, and on whether the node affects global state. Different types of ES nodes in Rollup implement hasEffects differently. Through continuous optimization, side effects of class references are now handled and unused classes are eliminated. This ties in with the tree-shaking elimination examples in Chapter 2.

// include() implementation
include(): void {
	const context = createInclusionContext();
	if (this.ast!.shouldBeIncluded(context)) this.ast!.include(context, false);
}
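
The idea that each node type decides for itself whether it has side effects can be sketched with a few miniature classes. All class names below are hypothetical and the logic is far simpler than Rollup’s real nodes; the sketch only shows how an "included or has effects" check can be shared in a base class while hasEffects() varies per subclass.

```typescript
// Hypothetical miniature node classes, for illustration only.
abstract class SketchNode {
	included = false;

	abstract hasEffects(): boolean;

	// A node is kept if it was already marked, or if removing it would change behavior.
	shouldBeIncluded(): boolean {
		return this.included || this.hasEffects();
	}
}

class LiteralSketch extends SketchNode {
	constructor(public value: number) {
		super();
	}
	hasEffects(): boolean {
		return false; // evaluating a literal changes nothing
	}
}

class GlobalAssignmentSketch extends SketchNode {
	constructor(public globalName: string, public value: SketchNode) {
		super();
	}
	hasEffects(): boolean {
		return true; // writing to a global is observable from outside
	}
}

class SequenceSketch extends SketchNode {
	constructor(public children: SketchNode[]) {
		super();
	}
	hasEffects(): boolean {
		return this.children.some(child => child.hasEffects());
	}
}

export const pureKept = new LiteralSketch(1).shouldBeIncluded();
export const impureKept = new SequenceSketch([
	new LiteralSketch(1),
	new GlobalAssignmentSketch('counter', new LiteralSketch(2))
]).shouldBeIncluded();
```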

3. The treeshakeNode() method

treeshakeNode() is the method in the source code that removes code that is not needed. The call sites note its purpose clearly: it prevents repeated declarations of the same variable or node, and it is driven by whether the node’s code is marked as included; nodes that are not included are tree-shaken away. A removeAnnotations() method is also provided to remove annotation comments that are no longer needed.

// Eliminate useless nodes
export function treeshakeNode(node: Node, code: MagicString, start: number, end: number) {
	code.remove(start, end);
	if (node.annotations) {
		for (const annotation of node.annotations) {
			if (!annotation.comment) {
				continue;
			}
			if (annotation.comment.start < start) {
				code.remove(annotation.comment.start, annotation.comment.end);
			} else {
				return;
			}
		}
	}
}

// Remove annotation comments
export function removeAnnotations(node: Node, code: MagicString) {
	if (!node.annotations && node.parent.type === NodeType.ExpressionStatement) {
		node = node.parent as Node;
	}
	if (node.annotations) {
		for (const annotation of node.annotations.filter((a) => a.comment)) {
			code.remove(annotation.comment!.start, annotation.comment!.end);
		}
	}
}
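
What code.remove(start, end) amounts to can be shown with a tiny stand-in for the string editor. `RangeRemover` below is a hypothetical class written for this sketch, not the magic-string API: it records removals against offsets in the original text and applies them together when the result is requested.

```typescript
// Hypothetical stand-in for the removal behavior used by treeshakeNode().
class RangeRemover {
	private removed: Array<[number, number]> = [];

	constructor(private original: string) {}

	// Record that the original range [start, end) should not be emitted.
	remove(start: number, end: number): void {
		this.removed.push([start, end]);
	}

	// Emit the original text with every recorded range skipped.
	toString(): string {
		const sorted = this.removed.slice().sort((x, y) => x[0] - y[0]);
		let result = '';
		let cursor = 0;
		for (const [start, end] of sorted) {
			if (start > cursor) result += this.original.slice(cursor, start);
			cursor = Math.max(cursor, end);
		}
		return result + this.original.slice(cursor);
	}
}

const originalCode = 'const a = 1;\nconst b = 2;\nconsole.log(a);\n';
const editor = new RangeRemover(originalCode);

// Pretend the declaration of `b` was not marked as included.
const removeFrom = originalCode.indexOf('const b');
const removeTo = originalCode.indexOf('console.log');
editor.remove(removeFrom, removeTo);

export const shakenCode = editor.toString();
```

Working with original offsets means each node can be removed using the start and end positions stored on its AST node, regardless of what has already been removed elsewhere.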

The point at which treeshakeNode() is called matters! Tree-shaking is applied before rendering, and rendering then proceeds recursively.

render(code: MagicString, options: RenderOptions, nodeRenderOptions?: NodeRenderOptions) {
	const { start, end } = nodeRenderOptions as { end: number; start: number };
	const declarationStart = getDeclarationStart(code.original, this.start);

	if (this.declaration instanceof FunctionDeclaration) {
		this.renderNamedDeclaration(
			code,
			declarationStart,
			'function',
			'(',
			this.declaration.id === null,
			options
		);
	} else if (this.declaration instanceof ClassDeclaration) {
		this.renderNamedDeclaration(
			code,
			declarationStart,
			'class',
			'{',
			this.declaration.id === null,
			options
		);
	} else if (this.variable.getOriginalVariable() !== this.variable) {
		// tree-shaking prevents repeated declarations of variables
		treeshakeNode(this, code, start, end);
		return;
		// The included flag drives tree-shaking
	} else if (this.variable.included) {
		this.renderVariableDeclaration(code, declarationStart, options);
	} else {
		code.remove(this.start, declarationStart);
		this.declaration.render(code, options, {
			isCalleeOfRenderedParent: false,
			renderedParentType: NodeType.ExpressionStatement
		});
		if (code.original[this.end - 1] !== ';') {
			code.appendLeft(this.end, ';');
		}
		return;
	}
	this.declaration.render(code, options);
}

There are several places like this where tree-shaking shines!

// Sure enough, we see "included" again.
if (!node.included) {
	treeshakeNode(node, code, start, end);
	continue;
}
...
if (currentNode.included) {
	currentNodeNeedsBoundaries
		? currentNode.render(code, options, {
				end: nextNodeStart,
				start: currentNodeStart
			})
		: currentNode.render(code, options);
} else {
	treeshakeNode(currentNode, code, currentNodeStart!, nextNodeStart);
}
...

4. Generate code (string) with chunks and write it to a file

During the generate()/write() phase, the generated code is written to a file, and the handleGenerateWrite() method internally generates the bundle instance for processing.

async function handleGenerateWrite(...) {
	...
	// Generate the Bundle instance, which is a packaged object that contains all the module information
	const bundle = new Bundle(outputOptions, unsetOptions, inputOptions, outputPluginDriver, graph);
	// Call the generate method of the bundle instance to generate the code
	const generated = await bundle.generate(isWrite);
	if (isWrite) {
		if (!outputOptions.dir && !outputOptions.file) {
			return error({
				code: 'MISSING_OPTION',
				message: 'You must specify "output.file" or "output.dir" for the build.'
			});
		}
		await Promise.all(
			// Here's the key: generate the code through chunkId and write it to a file
			Object.keys(generated).map(chunkId => writeOutputFile(generated[chunkId], outputOptions))
		);
		await outputPluginDriver.hookParallel('writeBundle', [outputOptions, generated]);
	}
	return createOutput(generated);
}

Summary

The bottom line is: start with the entry file, find all the variables it reads, find out where the variable is defined, include the definition statement, discard all irrelevant code, and get what you want.

Conclusion

This article walked through the principle of tree-shaking during bundling, based on the Rollup source code. For a simple bundling process, the source code does nothing mysterious to your code: it traverses the AST, marks what is used, collects the marked code, and emits it as the bundle, while treeshakeNode() removes nodes that are not marked as included and avoids repeated declarations.

Of course, the most important part is the internal static analysis and dependency collection. That process is complicated, but its core is walking the nodes: finding the variables the current node depends on, the variables it accesses, and the statements that declare those variables.

As a lightweight and fast bundler, Rollup is well suited to bundling function and utility libraries. Thanks to the way it handles code, its source is also much smaller than webpack’s, but I still find reading source code a tedious process…

But! If your goal is simply to understand the principles, you might as well focus only on the core code flow first and leave the finer details for later. That may make reading more enjoyable and speed up your progress through the source code!

