Any project that goes through several iterations is bound to leave a lot of useless code behind.

The core idea

  1. Starting from the entry file (the entry option in the webpack configuration), recursively collect its dependencies.
  2. Traverse all files in the project directory (usually src).
  3. Diff the two arrays to get the list of unused module files.
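
The three steps can be sketched end to end as follows. This is only a minimal sketch under stated assumptions, not the tool's actual implementation: collectDependencies, findUnused, and the in-memory graph object are hypothetical names that stand in for the parsing and file-system work described in the following sections.

```javascript
// Hypothetical in-memory dependency graph: file -> files it references.
// In a real tool these edges come from parsing each file.
const graph = {
  '/src/index.js': ['/src/a.js', '/src/b.js'],
  '/src/a.js': ['/src/b.js'],
  '/src/b.js': [],
  '/src/old.js': ['/src/a.js'],
};

// Step 1: starting from the entry, recursively collect every reachable module.
function collectDependencies(entry, getDeps, seen = new Set()) {
  if (seen.has(entry)) return seen; // already visited (also guards against cycles)
  seen.add(entry);
  for (const dep of getDeps(entry)) {
    collectDependencies(dep, getDeps, seen);
  }
  return seen;
}

// Steps 2 and 3: take the full file list and keep what the entry never reaches.
function findUnused(entry, allFiles, getDeps) {
  const used = collectDependencies(entry, getDeps);
  return allFiles.filter((file) => !used.has(file));
}

const unused = findUnused('/src/index.js', Object.keys(graph), (f) => graph[f] || []);
console.log(unused);
// prints: [ '/src/old.js' ]
```

Note that /src/old.js imports /src/a.js but is itself never imported, so it is reported as unused: only reachability from the entry counts.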

Finding dependencies

JavaScript modules may follow the AMD, CMD, CommonJS, or ES6 specification, but in the webpack ecosystem only CommonJS and ES6 are in common use. So we only need to look for dependencies in modules that use CommonJS or ES6 references.

In the CommonJS specification, we use the require function to reference other modules:

const a = require('/path/to/a');

In the ES6 specification, we use the import keyword to declare module references:

import b from '/path/to/b';

So what we need to do is extract /path/to/a and /path/to/b from the code. A regular expression can do the matching:

const exp = /import.+? from\s*['"](.+?)['"]|require\s*\(\s*['"](.+?)['"]\s*\)/g;

const requests = [];
let match;
// Group 1 captures the import path, group 2 the require path.
while ((match = exp.exec(code)) !== null) {
  requests.push(match[1] || match[2]);
}
console.log(requests);

Regular expressions are lightweight and compact, but they are not precise: the pattern also matches inside comments and string literals. We therefore chose another approach: syntax tree-based analysis.
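
To make the limitation concrete, here is a small sketch that runs a pattern of this kind over a hypothetical source string. The sample code and the variable names (code, found) are made up for illustration; only ./used is a real dependency in it.

```javascript
// A regex of the kind used for a plain-text search of import/require requests.
const exp = /import.+? from\s*['"](.+?)['"]|require\s*\(\s*['"](.+?)['"]\s*\)/g;

// Hypothetical source: one real require, one commented out, one inside a string.
const code = [
  "const used = require('./used');",
  "// const old = require('./commented-out');",
  "const text = \"require('./inside-a-string')\";",
].join('\n');

const found = [];
for (const match of code.matchAll(exp)) {
  found.push(match[1] || match[2]);
}
console.log(found);
// prints: [ './used', './commented-out', './inside-a-string' ]
```

Two of the three results are false positives, and each one would wrongly mark a file as still in use.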

Walking the syntax tree

Mature tools exist for generating JavaScript syntax trees; code minification, linting, and Babel transpilation all depend on them. Here we use acorn to generate the syntax tree and acorn-walk to traverse it.

const fs = require('fs');
const acorn = require('acorn');
const walk = require('acorn-walk');

const content = fs.readFileSync(file, 'utf8');
const tree = acorn.parse(content, {
  ecmaVersion: 'latest', // acorn 8+ expects an explicit ecmaVersion
  sourceType: 'module'
});

walk.simple(tree, {
  CallExpression: (node) => {
    // Default to {} so calls without arguments, such as foo(), do not throw.
    const { callee: { name }, arguments: [{ value: request } = {}] } = node;
    if (name === 'require') {
      // request: /path/to/a
    }
  },
  ImportDeclaration: (node) => {
    const { source: { value: request } } = node;
    // request: /path/to/b
  }
});

Resolving path expressions

Usually we reference modules either by a path relative to the current module's __dirname or by a name that is looked up in node_modules. Webpack also allows custom path aliases (resolve.alias). To find these modules, we need to resolve each module path expression to a real absolute file path.

We follow the Node.js module lookup algorithm and add alias support. The pseudocode is as follows.

Click here for the complete code

resolve(X) from module at path Y
1. If X is a core module,
    a. return X
    b. STOP
2. If X begins with '/'
    a. set Y to be the filesystem root
3. If X begins with './' or '/' or '../'
    a. RESOLVE_AS_FILE(Y + X, EXTENSIONS)
    b. RESOLVE_AS_DIRECTORY(Y + X, EXTENSIONS)
4. RESOLVE_ALIAS(X, ALIAS, EXTENSIONS)
5. RESOLVE_NODE_MODULES(X, dirname(Y), EXTENSIONS)
6. THROW "not found"

RESOLVE_AS_FILE(X, EXTENSIONS)
1. If X is a file.  STOP
2. let I = count of EXTENSIONS - 1
3. while I >= 0,
    a. If `${X}${EXTENSIONS[I]}` is a file.  STOP
    b. let I = I - 1

RESOLVE_AS_DIRECTORY(X, EXTENSIONS)
1. If X/package.json is a file,
    a. Parse X/package.json, and look for "main" field.
    b. If "main" is a falsy value, GOTO 2.
    c. let M = X + (json main field)
    d. RESOLVE_AS_FILE(M, EXTENSIONS)
    e. RESOLVE_INDEX(M, EXTENSIONS)
2. RESOLVE_INDEX(X, EXTENSIONS)

RESOLVE_INDEX(X, EXTENSIONS)
1. let I = count of EXTENSIONS - 1
2. while I >= 0,
    a. If `${X}/index${EXTENSIONS[I]}` is a file.  STOP
    b. let I = I - 1

RESOLVE_ALIAS(X, ALIAS, EXTENSIONS)
1. let PATHS = ALIAS_PATHS(X, ALIAS)
2. for each PATH in PATHS:
    a. RESOLVE_AS_FILE(PATH, EXTENSIONS)
    b. RESOLVE_AS_DIRECTORY(PATH, EXTENSIONS)

ALIAS_PATHS(X, ALIAS)
1. let PATHS = []
2. for each KEY in ALIAS:
    a. let VALUE = ALIAS[KEY]
    b. if not X starts with KEY CONTINUE
    c. let PATH = X replace KEY with VALUE
    d. PATHS = PATHS + PATH
3. return PATHS

RESOLVE_NODE_MODULES(X, START, EXTENSIONS)
1. let DIRS = NODE_MODULES_PATHS(START)
2. for each DIR in DIRS:
    a. RESOLVE_AS_FILE(DIR/X, EXTENSIONS)
    b. RESOLVE_AS_DIRECTORY(DIR/X, EXTENSIONS)

NODE_MODULES_PATHS(START)
1. let PARTS = path split(START)
2. let I = count of PARTS - 1
3. let DIRS = [GLOBAL_FOLDERS]
4. while I >= 0,
    a. if PARTS[I] = "node_modules" CONTINUE
    b. DIR = path join(PARTS[0 .. I] + "node_modules")
    c. DIRS = DIRS + DIR
    d. let I = I - 1
5. return DIRS
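
Two of the helpers above, ALIAS_PATHS and NODE_MODULES_PATHS, are pure path computations and translate to JavaScript directly. The following is a minimal sketch rather than the complete code: the function names aliasPaths and nodeModulesPaths are our own, the alias object is assumed to have the same key-to-directory shape as webpack's resolve.alias, GLOBAL_FOLDERS is left out, and alias keys are matched on whole path segments, which is slightly stricter than the plain "starts with" test in the pseudocode.

```javascript
const path = require('path');

// ALIAS_PATHS: rewrite the request with every alias key that prefixes it.
function aliasPaths(request, alias) {
  const paths = [];
  for (const [key, value] of Object.entries(alias)) {
    // Assumption: match the key exactly or as a leading path segment,
    // so the alias '@' does not capture a scoped package such as '@scope/pkg'.
    if (request !== key && !request.startsWith(key + '/')) continue;
    paths.push(value + request.slice(key.length));
  }
  return paths;
}

// NODE_MODULES_PATHS: candidate node_modules directories, from `start` up to the root.
function nodeModulesPaths(start) {
  const dirs = [];
  let dir = path.resolve(start);
  while (true) {
    // Never produce .../node_modules/node_modules.
    if (path.basename(dir) !== 'node_modules') {
      dirs.push(path.join(dir, 'node_modules'));
    }
    const parent = path.dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  return dirs;
}

console.log(aliasPaths('@/utils/a', { '@': '/project/src' }));
// prints: [ '/project/src/utils/a' ]
```

Each path returned by either helper is then passed to RESOLVE_AS_FILE and RESOLVE_AS_DIRECTORY, as in the pseudocode.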

Directory traversal

Traversing the directory is routine and can be done with the fs and path modules.

const fs = require('fs');
const path = require('path');

function readFileList(folder, filter, files = []) {
  if (!fs.existsSync(folder) || !fs.statSync(folder).isDirectory()) return files;

  fs.readdirSync(folder).forEach((file) => {
    const fullPath = path.join(folder, file);
    const stat = fs.statSync(fullPath);
    if (stat.isFile()) {
      if (typeof filter === 'function') {
        if (!filter(fullPath)) return;
      }
      files.push(fullPath);
    } else if (stat.isDirectory()) {
      readFileList(fullPath, filter, files);
    }
  });

  return files;
}
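
In practice the traversal should usually skip node_modules and keep only script files. Below is one possible variant, offered as a sketch only: listFiles and its extensions and ignore options are hypothetical names, not part of the original tool. It uses the withFileTypes option of fs.readdirSync, which returns fs.Dirent objects, so no separate statSync call is needed.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively list files under `folder` whose extension is in `extensions`,
// without descending into directories named in `ignore`.
function listFiles(folder, { extensions = ['.js'], ignore = ['node_modules'] } = {}, files = []) {
  if (!fs.existsSync(folder)) return files;

  for (const entry of fs.readdirSync(folder, { withFileTypes: true })) {
    const fullPath = path.join(folder, entry.name);
    if (entry.isDirectory()) {
      if (!ignore.includes(entry.name)) {
        listFiles(fullPath, { extensions, ignore }, files);
      }
    } else if (entry.isFile() && extensions.includes(path.extname(entry.name))) {
      // Note: symbolic links are neither isFile() nor isDirectory() here, so they are skipped.
      files.push(fullPath);
    }
  }
  return files;
}
```

For example, listFiles('./src', { extensions: ['.js', '.jsx'] }) would return the .js and .jsx files under src.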

Diffing the arrays

The simplest way to diff the two arrays and obtain the unused modules is Array.prototype.indexOf.

const modules = ['/path/to/a', '/path/to/b'];
const scripts = ['/path/to/a', '/path/to/b', '/path/to/c', '/path/to/d'];

const useless = scripts.filter((v) => modules.indexOf(v) === -1);
console.log(useless);
// prints: [ '/path/to/c', '/path/to/d' ]
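
For a large project, scanning the used-module array once per script becomes noticeable. A Set-based variant gives the same result with a single pass over each array. This is a sketch of an alternative, not the approach used above; the names difference, usedModules, and allScripts are chosen for illustration.

```javascript
// Hypothetical inputs: modules reachable from the entry, and all scripts on disk.
const usedModules = ['/path/to/a', '/path/to/b'];
const allScripts = ['/path/to/a', '/path/to/b', '/path/to/c', '/path/to/d'];

// Return the items of `all` that do not appear in `used`.
function difference(all, used) {
  const usedSet = new Set(used); // Set lookups avoid rescanning `used` for every item
  return all.filter((item) => !usedSet.has(item));
}

console.log(difference(allScripts, usedModules));
// prints: [ '/path/to/c', '/path/to/d' ]
```

Both arrays must hold paths in the same normalized absolute form; otherwise equal files will not compare as equal.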

Resources

  • acorn
  • The ESTree Spec
  • require.resolve
  • enhanced-resolve