An overview

Webpack has become one of the indispensable tools in front-end development. Its "everything is a module" philosophy, together with the continuous iteration of Webpack itself (now Webpack 4), keeps making packaging faster and more efficient in the service of front-end engineering.

I believe you are already skilled at using Webpack. It takes a configuration object, including the entry, output, plugins, and so on, and packages the entire project according to that configuration. Starting from a single JS file (this is the single-entry case; you can of course also configure multiple entry files), every file the entry depends on is processed as needed by the appropriate loaders and plugins. Thanks to this, we can freely use ES6, SCSS, LESS, and PostCSS, and the packaging tool makes sure they all run correctly in the browser. It saves a lot of time and worry.

So what is at the heart of today's packaging tools? In this article we will build a small packaging tool to explore their core principles. Some topics are only touched on, not dug into deeply; if you are interested, you can look them up.

My skills are limited, so this only covers the core principles and the simplest features of a packaging tool.

The project address

pack: see the GitHub repository

The principle

A deeper understanding of the JavaScript language, including some of its lower-level implementation details, is very helpful for understanding a good open source project, and of course for improving our own skills. JavaScript is a weakly typed interpreted language, meaning we do not need a compiler to produce an executable version ahead of time. There is still a compilation step, but most of it happens a few microseconds before the code runs, and execution starts as soon as it finishes; in other words, the code is compiled dynamically as it executes. During compilation, lexical and syntactic analysis produce a syntax tree, known as the AST (Abstract Syntax Tree), which maps the statements of the source code onto the nodes of a tree. This AST is what lets us get at the core of a packaging tool.

We're all familiar with Babel. What makes front-end programmers happy is that it lets us write ES6, ES7, ES8, and so on, and turns it all into an ES5 version that browsers can execute. At its core, Babel uses Babylon, a JavaScript lexical and syntactic parsing engine, to parse the ES6+ code we write into an AST (abstract syntax tree), then traverses that tree depth-first, changing its structure and data. Finally, ES5 code is generated from the cleaned-up and modified AST. That is the main core of how we use Babel. Here is an example of a syntax tree.

Files to convert (index.js)

    // es6 index.js
    import add from './add.js'
    let sum = add(1, 2)
    export default sum

Execution file (build.js in Node)

    // node build.js
    // Require fs and the Babylon engine
    const fs = require('fs')
    const babylon = require('babylon')

    // Path of the file to analyze
    const filePath = './index.js'
    // Read the contents of the file
    const content = fs.readFileSync(filePath, 'utf-8')
    // Generate the AST via Babylon
    const ast = babylon.parse(content, {
        sourceType: 'module'
    })
    console.log(ast)

The generated AST

    ast = {
        ...
        comments: [],
        tokens: [
            Token { type: [KeywordTokenType], value: 'import',   start: 0,  end: 6,  loc: [SourceLocation] },
            Token { type: [TokenType],        value: 'add',      start: 7,  end: 10, loc: [SourceLocation] },
            Token { type: [TokenType],        value: 'from',     start: 11, end: 15, loc: [SourceLocation] },
            Token { type: [TokenType],        value: './add.js', start: 16, end: 26, loc: [SourceLocation] },
            Token { type: [KeywordTokenType], value: 'let',      start: 27, end: 30, loc: [SourceLocation] },
            Token { type: [TokenType],        value: 'sum',      start: 31, end: 34, loc: [SourceLocation] },
            ...
            Token { type: [KeywordTokenType], value: 'export',   start: 48, end: 54, loc: [SourceLocation] },
            Token { type: [KeywordTokenType], value: 'default',  start: 55, end: 62, loc: [SourceLocation] },
            Token { type: [TokenType],        value: 'sum',      start: 63, end: 66, loc: [SourceLocation] }
        ]
    }

The example above is the parsed AST. As Babylon analyzes the source code, it reads it character by character like a scanner and builds up the syntax tree. (For more on syntax trees and Babylon, you can refer to www.jianshu.com/p/019d449a9.) By traversing the tree we can modify its properties or values, and code is then regenerated from it according to the corresponding rules. For a normal JS file the AST is often very large, even tens of thousands of lines, so very good algorithms are needed to keep this fast and efficient. babel-traverse is the library used to walk the AST; if you are interested in the algorithms, go take a look. The points above are deliberately not in-depth; as the title says, I only want to explore the principles of packaging tools. That is roughly it for the theory, so let's start the hands-on part.
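To make the traversal idea concrete, here is a toy sketch. The node shapes and the little `traverse` helper are simplified stand-ins invented for illustration, not Babylon's real AST format or babel-traverse's real API: it walks a tree depth-first and fires a visitor for every `ImportDeclaration` node, which is the same pattern we use later to collect dependencies.

```javascript
// Toy depth-first walk over a hand-written mini "AST".
// NOTE: these node shapes are simplified stand-ins, not Babylon's real ones.
const miniAst = {
    type: 'Program',
    body: [
        { type: 'ImportDeclaration', source: { value: './add.js' }, body: [] },
        { type: 'VariableDeclaration', source: null, body: [] }
    ]
}

// Visit a node, then recurse into its children
function traverse (node, visitors) {
    if (visitors[node.type]) visitors[node.type](node)
    ;(node.body || []).forEach(child => traverse(child, visitors))
}

const deps = []
traverse(miniAst, {
    // Fired once per import statement, like babel-traverse's visitor pattern
    ImportDeclaration: node => deps.push(node.source.value)
})
console.log(deps) // [ './add.js' ]
```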

Project directory

    ├── README.md
    ├── package.json
    └── src
        ├── lib
        │   ├── bundle.js    // generates the packaged output file
        │   ├── getdep.js    // collects file dependencies from the AST
        │   └── readcode.js  // reads a file's code and parses it into an AST
        └── index.js         // entry that exposes the pack method

Mind mapping

The specific implementation

Combing through the process, the key point is finding the dependencies of each file. We use a deps array to collect them, then package the dependencies layer by layer as modules. Let's do this step by step.

The process is walked through below mainly with code plus commentary.

Read file code

First, we need the path of the entry file. We read its code with Node's fs module, parse that code with Babylon to get the AST, and then use the babel-traverse library to extract from the AST the modules (paths) referenced by import statements, i.e. the dependencies. We push the relative paths of all files the current module depends on into a deps array, so that we can traverse it later when looking for further dependencies.

    const fs = require('fs')
    // The parsing engine
    const babylon = require('babylon')
    // babel-traverse walks the syntax tree
    const traverse = require('babel-traverse').default
    // Syntax conversion provided by Babel
    const { transformFromAst } = require('babel-core')

    // Read and analyze the code of one file
    const readCode = function (filePath) {
        if (!filePath) {
            throw new Error('No entry file path')
        }
        // Dependency collection for the current module
        const deps = []
        const content = fs.readFileSync(filePath, 'utf-8')
        const ast = babylon.parse(content, { sourceType: 'module' })
        // Analyze the AST to get the module information (paths)
        // The ImportDeclaration visitor is called for every import encountered
        traverse(ast, {
            ImportDeclaration: ({ node }) => {
                // Push the dependency into deps
                // If there are several dependencies, the array holds them all
                deps.push(node.source.value)
            }
        })
        // Convert ES6 to ES5
        const { code } = transformFromAst(ast, null, { presets: ['env'] })
        // Return an object holding the path, the dependencies,
        // the converted ES5 code, and a (custom) module id
        return {
            filePath,
            deps,
            code,
            id: deps.length > 0 ? deps.length - 1 : 0
        }
    }

    module.exports = readCode

I believe the code above is understandable; the comments are detailed, so I won't belabor it. Note that the babel-traverse library has very little documentation about its API, so you may have to read around to see how it is used. One thing worth emphasizing is that the function's return value is an object containing the important information about the current file (module): deps stores the paths of all the files this module depends on, obtained from the analysis, alongside the converted code. Later we will recursively traverse the dependencies of each module; that is what the dependency-collection step uses.
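For the index.js example from earlier, the returned object would look roughly like this. The values are illustrative only; the real `code` string is Babel's full ES5 output:

```javascript
// Illustrative shape of readCode's return value (values are examples)
const moduleInfo = {
    filePath: './index.js',
    deps: ['./add.js'],   // relative paths of every import found in the file
    code: '"use strict"; /* ...Babel-transformed ES5 code... */',
    id: 0                 // custom module id, rewritten later during collection
}
console.log(moduleInfo.deps) // [ './add.js' ]
```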

Dependency collection

From the file-reading method above we get back the important information about a single file (module): filePath (the file's path), deps (all of that module's dependencies), code (the converted code), and id (the module's id). We now define a second deps, an array that gathers this information object for every file (module) in the whole dependency graph. Starting from the dependencies of the single entry file, we collect the dependencies of each module, then the dependencies of those modules, and so on: we call readCode recursively in a loop, pushing the object it returns into the deps array each time. In the end we have the important information of every module in the dependency chain.

    const readCode = require('./readcode.js')
    const fs = require('fs')
    const path = require('path')

    const getDeps = function (entry) {
        // The information object of the main entry module,
        // returned by reading and analyzing the file
        const entryFileObject = readCode(entry)
        // deps is the array gathering the information object of every module;
        // it is the core data from which the whole bundle is built
        const deps = [entryFileObject ? entryFileObject : null]
        // Iterate over deps; entries pushed during the loop are visited too,
        // so the whole dependency graph gets walked.
        // For each dependency path, check whether it is a CSS or a JS file
        for (let obj of deps) {
            const dirname = path.dirname(obj.filePath)
            obj.deps.forEach(rPath => {
                const aPath = path.join(dirname, rPath)
                if (/\.css/.test(aPath)) {
                    // A CSS file is not analyzed recursively by readCode;
                    // instead we generate JS code that writes its content
                    // into a <style> tag
                    const content = fs.readFileSync(aPath, 'utf-8')
                    const code = `
                    var style = document.createElement('style')
                    style.innerText = ${JSON.stringify(content).replace(/\\r\\n/g, ' ')}
                    document.head.appendChild(style)
                    `
                    deps.push({
                        filePath: aPath,
                        reletivePaht: rPath, // (sic — this key is also read in bundle.js)
                        deps: [],
                        code,
                        id: deps.length > 0 ? deps.length : 0
                    })
                } else {
                    // A JS file is analyzed by calling readCode again
                    let obj = readCode(aPath)
                    obj.reletivePaht = rPath
                    obj.id = deps.length > 0 ? deps.length : 0
                    deps.push(obj)
                }
            })
        }
        // Return deps
        return deps
    }

    module.exports = getDeps

There may be some questions about the code above. It can look odd that we loop over deps while pushing new entries into it at the same time, and you may wonder what the deps array is ultimately for.
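The loop works because of a property of `for...of` over arrays that is easy to miss: elements pushed during iteration are visited as well, so the single entry object grows into a walk over the whole dependency graph. A minimal demonstration, using made-up module objects rather than real readCode output:

```javascript
// Elements pushed into an array during for...of are visited too,
// which is exactly what lets getDeps walk the whole dependency graph.
const queue = [{ id: 0, deps: ['./a.js', './b.js'] }]
const visited = []
for (const mod of queue) {
    visited.push(mod.id)
    mod.deps.forEach(() => {
        // Pretend each dependency is a discovered module with no further deps
        queue.push({ id: queue.length, deps: [] })
    })
}
console.log(visited) // [ 0, 1, 2 ]
```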

The output file

By now we can get every file with its corresponding dependencies, converted code, and id: yes, that is the deps array returned in the previous section. You probably still have some questions from that section, so let's go straight to the code and clear them up one by one.

    const fs = require('fs')
    // A library for compressing code
    const uglify = require('uglify-js')

    // Four parameters:
    // 1. deps: the array of all dependencies returned in the previous section
    // 2. entry: the main entry file path
    // 3. outPath: the output directory path
    // 4. isCompress: whether to compress the output code
    // Except for deps, these come from the config object passed
    // to the project's main entry method
    const bundle = function (deps, entry, outPath, isCompress) {
        let modules = ''
        deps.forEach(dep => {
            var id = dep.id
            // The key point: each module id from deps becomes a property
            // whose value is a function, and the function body is the
            // converted "code" of the module being traversed.
            // This produces one long string:
            // 0: function (...) {...},
            // 1: function (...) {...},
            // ...
            modules = modules + `${id}: function (module, exports, require) {${dep.code}}, `
        })
        // A self-executing function receives the object we just concatenated,
        // plus deps; the custom require mimics CommonJS-style modularity
        let result = `
            (function (modules, mType) {
                function require (id) {
                    var module = { exports: {} }
                    var module_id = require_moduleId(mType, id)
                    modules[module_id](module, module.exports, require)
                    return module.exports
                }
                require('${entry}')
            })({${modules}}, ${JSON.stringify(deps)});

            function require_moduleId (typelist, id) {
                var module_id
                typelist.forEach(function (item) {
                    if (id === item.filePath || id === item.reletivePaht) {
                        module_id = item.id
                    }
                })
                return module_id
            }
        `
        // Compress if requested
        if (isCompress) {
            result = uglify.minify(result, { mangle: { toplevel: true } }).code
        }
        // Write the output file
        fs.writeFileSync(outPath + '/bundle.js', result)
        console.log('[success] ./bundle.js')
    }

    module.exports = bundle

Once again, let me describe this in detail, because outputting the file involves a lot of string building. The modules string is the final string obtained by iterating through deps:

    modules = `0: function (module, exports, require) { /* converted code of module 0 */ },
               1: function (module, exports, require) { /* converted code of module 1 */ },
               ...`

If we add "{" and "}" to both ends of this string, then when it is executed as code it becomes an object. So 0, 1, 2, 3... become properties whose values are functions, and each function can be called directly through its property. The body of each function is the Babel-transformed code of one of the modules we need to package. Explanation 2: the result string.

    // The self-executing function receives the modules string above wrapped in {}
    (function (modules, mType) {
        // Define a custom require function that mimics CommonJS modularity
        function require (id) {
            // Define the module object and its exports property
            var module = { exports: {} }
            // Convert the path into an id (the conversion function is below)
            var module_id = require_moduleId(mType, id)
            // Call the function stored under that property of modules
            modules[module_id](module, module.exports, require)
            return module.exports
        }
        require('${entry}')
    })({${modules}}, ${JSON.stringify(deps)});

    // Convert a path into an id, so the function stored under the
    // id corresponding to that path can be called
    function require_moduleId (typelist, id) {
        var module_id
        typelist.forEach(function (item) {
            if (id === item.filePath || id === item.reletivePaht) {
                module_id = item.id
            }
        })
        return module_id
    }

Why does require_moduleId need to convert paths into ids? Look at what Babel produces. The ES6 source:

    import a from './a.js'
    let b = a + a
    export default b

ES5 code:

    'use strict';

    Object.defineProperty(exports, "__esModule", {
        value: true
    });

    var _a = require('./a.js');

    var _a2 = _interopRequireDefault(_a);
    function _interopRequireDefault(obj) { return obj && obj.__esModule ? obj : { default: obj }; }
    var b = _a2.default + _a2.default;
    
    exports.default = b;

Notice the line `var _a = require('./a.js');`. The argument Babel passes to require is the file's path, but the module functions in our bundle are stored under numeric id properties (0, 1, 2, 3...), each of the form `function (module, exports, require) {...}`. That is why we must map the path to the corresponding id before calling the module function: Babel has modularized our code using the CommonJS convention.

The final step

The last step is to wrap everything up by exposing a single entry function. This mimics the Webpack API: a pack method that takes a config object. It can then be run through npm/yarn scripts in package.json.

    const getDeps = require('./lib/getdep')
    const bundle = require('./lib/bundle')

    const pack = function (config) {
        if (!config.entryPath || !config.outPath) {
            throw new Error('Pack Tools: Please configure entry and exit paths')
        }
        let entryPath = config.entryPath
        let outPath = config.outPath
        let isCompress = config.isCompression || false

        let deps = getDeps(entryPath)
        bundle(deps, entryPath, outPath, isCompress)
    }

    module.exports = pack

The config passed in has only three attributes: entryPath, outPath, and isCompression.
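A hypothetical usage sketch (the file names and paths here are examples, not part of the project): a small build script that calls pack, which you could wire up as an npm script.

```javascript
// build.js — example only; adjust the paths to your own project layout
const pack = require('./src/index.js')

pack({
    entryPath: './demo/index.js', // main entry file to start the walk from
    outPath: './demo/dist',       // directory that will receive bundle.js
    isCompression: true           // run uglify-js on the generated code
})
```

Then `node build.js`, or an npm script such as `"build": "node build.js"`, would produce the bundled output in the configured directory.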


Conclusion

This is a simple implementation, meant only to explore the principles; it does not have complete functionality or stability. I hope it helps those who read it.

A packaging tool first performs lexical and syntactic analysis on our code files to generate an AST. By processing the AST it can transform the code into what we want, compatible with browsers, while collecting each file's dependencies to form a dependency chain. Finally, it walks that chain and outputs the packaged file.

Please be understanding if anything is explained improperly or incorrectly. If you have any questions, we can discuss them in the comments. And don't forget your 👍…