Getting to know the JS Parser

The Parser is one of the most important tools in Webpack. After a Loader has processed them, files of every type are converted into a string of JS code. The Parser then turns that string into an AST (abstract syntax tree), which makes it possible to analyze and transform the code programmatically. The most important use of the tree is to work out which modules the code depends on. Webpack assembles dependencies and generates its output based on the parsed information, so the Parser plays a central role and is worth a closer look.

The Parser code as a whole is fairly complex. Besides resolving dependencies it handles many other tasks, and dependency resolution itself has to support CommonJS, ES6 modules, AMD, and more, so it is easy to get lost when reading the source for the first time. To keep things simple, this article does not analyze all of it. Most of the operations work in much the same way, so once you understand one path, the rest follows easily.

This article only looks at how the Parser resolves a CommonJS dependency, using the following statement as the example:

// Normal import module operation
require('./increment')

Start parsing

Turning the code into an AST is handled by the Acorn library. The Parser's main job is to walk that AST, and the initial state passed in is used to store the results of the walk:

/** Parser.js */
class Parser {
    parse(code, initialState) {
        // Parse the code into an AST
        var ast = acornParser.parse(code);
        // The most frequent use of this context is to add dependencies to state.module
        this.state = initialState;
        // Handle parsing content
        this.walkStatements(ast.body);
        return this.state
    }
}

AST – Abstract syntax tree

The AST is a data structure used to represent source code. We can visualize the AST using an online parser. For example, the above statement will be converted to the following:
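As a rough sketch, the tree for `require('./increment')` has approximately the following shape. It is written as a plain JavaScript object in the ESTree format that Acorn produces. Position fields such as `start`, `end`, `range`, and `loc` are left out for brevity, and the variable names `ast` and `call` are only for illustration.

```javascript
// Approximate ESTree shape of `require('./increment')`.
// Position fields (start, end, range, loc) are omitted for brevity.
const ast = {
    type: "Program",
    body: [
        {
            type: "ExpressionStatement",
            expression: {
                type: "CallExpression",
                callee: { type: "Identifier", name: "require" },
                arguments: [
                    { type: "Literal", value: "./increment", raw: "'./increment'" }
                ]
            }
        }
    ]
};

// The list of statements that the Parser walks is ast.body
const call = ast.body[0].expression;
console.log(call.callee.name, call.arguments[0].value); // require ./increment
```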

Parsing the AST

As you can see, even a single line of code turns into a sizable tree. The basic idea of parsing an AST is to walk the tree and extract the data you need.

Because the AST is a deeply nested structure with many node types, traversal code makes up most of the Parser. Fortunately, this part of the code is relatively simple and easy to follow once you understand the structure of the AST.
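The walk-and-extract idea can be illustrated with a small generic walker. This is only a sketch under simplifying assumptions: it visits every property of every node instead of dispatching on node types the way Webpack's Parser does, and `collectRequires` and `sampleAst` are hypothetical names introduced for this example.

```javascript
// Hypothetical helper: collect the string argument of every `require(...)` call.
function collectRequires(node, found = []) {
    if (Array.isArray(node)) {
        // A list of nodes, e.g. Program.body or CallExpression.arguments
        node.forEach(child => collectRequires(child, found));
    } else if (node && typeof node === "object") {
        if (
            node.type === "CallExpression" &&
            node.callee.type === "Identifier" &&
            node.callee.name === "require" &&
            node.arguments.length > 0 &&
            node.arguments[0].type === "Literal" &&
            typeof node.arguments[0].value === "string"
        ) {
            found.push(node.arguments[0].value);
        }
        // Visit every property; primitives are ignored by the checks above
        Object.keys(node).forEach(key => collectRequires(node[key], found));
    }
    return found;
}

// Hand-written AST for: require('./increment'); foo(require('./example'))
const sampleAst = {
    type: "Program",
    body: [
        {
            type: "ExpressionStatement",
            expression: {
                type: "CallExpression",
                callee: { type: "Identifier", name: "require" },
                arguments: [{ type: "Literal", value: "./increment" }]
            }
        },
        {
            type: "ExpressionStatement",
            expression: {
                type: "CallExpression",
                callee: { type: "Identifier", name: "foo" },
                arguments: [
                    {
                        type: "CallExpression",
                        callee: { type: "Identifier", name: "require" },
                        arguments: [{ type: "Literal", value: "./example" }]
                    }
                ]
            }
        }
    ]
};

console.log(collectRequires(sampleAst)); // [ './increment', './example' ]
```

A type-dispatched walk is more verbose than this, but it lets each node type get its own handling, which is what the plug-in hooks rely on.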

Extracting the data is the complicated part. The Parser itself only handles scope and some core expression processing. Other capabilities, including dependency resolution, are provided by plug-ins. This makes the Parser very flexible and extensible, but it also makes the code much harder to debug. To make the analysis easier, the code below inlines the CommonJS-related plug-in logic as direct synchronous calls.
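To give a feel for the split between core and plug-ins, here is a deliberately simplified sketch. It is not Webpack's actual hook system: `MiniParser`, `onCall`, and `requirePlugin` are hypothetical names, and a plain `Map` of handlers stands in for the real hook machinery. The core only recognizes a call expression and hands it to whatever handlers were registered for that callee name.

```javascript
// Simplified stand-in for a plug-in driven parser (all names are hypothetical).
class MiniParser {
    constructor() {
        // callee name -> list of handlers registered by plug-ins
        this.hooks = { call: new Map() };
        this.state = { dependencies: [] };
    }

    // Called by plug-ins to register a handler for calls to `name`
    onCall(name, handler) {
        const handlers = this.hooks.call.get(name) || [];
        handlers.push(handler);
        this.hooks.call.set(name, handlers);
    }

    // The core only recognizes the call and delegates to registered handlers
    walkCallExpression(expression) {
        if (expression.callee.type !== "Identifier") return;
        const handlers = this.hooks.call.get(expression.callee.name) || [];
        for (const handler of handlers) {
            // A handler returning true means "handled", so later handlers are skipped
            if (handler(expression, this) === true) return;
        }
    }
}

// A plug-in that knows what to do with `require('...')`
function requirePlugin(parser) {
    parser.onCall("require", (expression, p) => {
        const arg = expression.arguments[0];
        if (arg && arg.type === "Literal" && typeof arg.value === "string") {
            p.state.dependencies.push(arg.value);
            return true;
        }
    });
}

const parser = new MiniParser();
requirePlugin(parser);
parser.walkCallExpression({
    type: "CallExpression",
    callee: { type: "Identifier", name: "require" },
    arguments: [{ type: "Literal", value: "./increment", raw: "'./increment'" }]
});
console.log(parser.state.dependencies); // [ './increment' ]
```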

Let’s take a look at the execution code:

/** Parser.js */
// iterate over all statements
walkStatements(statements) {
    for (let index = 0, len = statements.length; index < len; index++) {
        const statement = statements[index];
        this.walkStatement(statement);
    }
}
// Process a single statement by dispatching it to the handler for its type; in this example, an expression statement
walkStatement(statement) {
    switch (statement.type) {
        case "ExpressionStatement":
            this.walkExpressionStatement(statement);
            break;
        // ...
    }
}
// Handle the expression contained in the statement
walkExpressionStatement(statement) {
    this.walkExpression(statement.expression);
}
// Dispatch each type of expression to its handler; in this example, a function call expression
walkExpression(expression) {
    switch (expression.type) {
        case "CallExpression":
            this.walkCallExpression(expression)
            break;
        // ...
    }
}
// Handle function call expressions
walkCallExpression(expression) {
    // "callee": { "type": "Identifier", "name": "require" },
    // "arguments": [ { "type": "Literal", "value": "./increment", "raw": "'./increment'" } ]
    const callee = this.evaluateExpression(expression.callee);
    // Only calls of the form `require(xxx)` are handled, i.e. `require` is a plain identifier; other cases such as the member call `a.require(xxx)` are filtered out
    if (callee.isIdentifier()) {
        /** CommonJsRequireDependencyParserPlugin.js */
        // Inside the plug-in, `parser` refers to the Parser instance (`this` here)
        const parser = this;
        const param = parser.evaluateExpression(expression.arguments[0]);
        // The argument evaluates to a string
        if (param.isString()) {
            // Add the module dependency so that the required module is loaded and parsed recursively
            const dep = new CommonJsRequireDependency(param.string, param.range);
            dep.loc = expression.loc;
            dep.optional = !!parser.scope.inTry;
            parser.state.current.addDependency(dep);
            // A second dependency, which rewrites `require` in the code to `__webpack_require__`
            const headerDep = new RequireHeaderDependency(expression.callee.range);
            headerDep.loc = expression.loc;
            parser.state.current.addDependency(headerDep);
            return;
        }
        // require(1 > 0 ? './example' : './increment')
        else if (param.isConditional()) {
            // ...
        }
        // ...
    }
}
// Evaluates the value of the expression
evaluateExpression(expression) {
    let result;
    switch (expression.type) {
        case "Identifier":
            result = this.evaluateIdentifierExpression(expression);
            break;
        case "Literal":
            result = this.evaluateLiteralExpression(expression);
            break;
    }
    if (result !== undefined) {
        result.setExpression(expression);
        return result;
    }
}
// Handler for Identifier expressions, registered via this.hooks.evaluate.for("Identifier")
evaluateIdentifierExpression(expression) {
    if(expression.name === 'require') {
        /** CommonJsPlugin.js */
        let evex = new BasicEvaluatedExpression()
            .setIdentifier('require')
            .setRange(expression.range);
        return evex;
    }
}
// Handler for Literal expressions, registered via this.hooks.evaluate.for("Literal")
evaluateLiteralExpression(expression) {
    switch (typeof expression.value) {
        case "number":
            return new BasicEvaluatedExpression()
                .setNumber(expression.value)
                .setRange(expression.range);
        case "string":
            return new BasicEvaluatedExpression()
                .setString(expression.value)
                .setRange(expression.range);
        case "boolean":
            return new BasicEvaluatedExpression()
                .setBoolean(expression.value)
                .setRange(expression.range);
    }
}
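The walkthrough leans heavily on BasicEvaluatedExpression, so it may help to see roughly what such a value object could look like. The class below is a simplified stand-in written for this article, not Webpack's implementation: the name `EvaluatedExpressionSketch` and the internal `kind` field are assumptions, and only the members used in the code above are modeled. Each setter returns `this`, which is what makes the chained calls possible.

```javascript
// Simplified stand-in for BasicEvaluatedExpression (hypothetical, not Webpack's code).
class EvaluatedExpressionSketch {
    constructor() {
        this.kind = null;       // which kind of value was evaluated
        this.range = null;      // [start, end) position in the source
        this.expression = null; // the AST node this result came from
    }
    setIdentifier(name) { this.kind = "identifier"; this.identifier = name; return this; }
    setString(value) { this.kind = "string"; this.string = value; return this; }
    setNumber(value) { this.kind = "number"; this.number = value; return this; }
    setBoolean(value) { this.kind = "boolean"; this.bool = value; return this; }
    setRange(range) { this.range = range; return this; }
    setExpression(expression) { this.expression = expression; return this; }
    isIdentifier() { return this.kind === "identifier"; }
    isString() { return this.kind === "string"; }
}

// Evaluating the callee and the argument of `require('./increment')`
const calleeResult = new EvaluatedExpressionSketch()
    .setIdentifier("require")
    .setRange([0, 7]);
const paramResult = new EvaluatedExpressionSketch()
    .setString("./increment")
    .setRange([8, 21]);

console.log(calleeResult.isIdentifier(), paramResult.isString(), paramResult.string);
```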

Conclusion

After the steps above, the Parser has produced two dependencies, CommonJsRequireDependency and RequireHeaderDependency. Webpack has thus converted the file content into internal objects, which come into play later when dependencies are resolved and when the output is generated.
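To see why a dependency records a range, consider how the `require` identifier could be swapped for `__webpack_require__` at output time. The following is only an illustration of the idea, not Webpack's template code: `replaceRange` is a hypothetical helper, and the range is assumed to be `[start, end)` as reported by the parser. The module path argument belongs to the other dependency and is left untouched in this sketch.

```javascript
// Hypothetical helper: replace the text covered by `range` ([start, end)) with `replacement`.
function replaceRange(source, range, replacement) {
    return source.slice(0, range[0]) + replacement + source.slice(range[1]);
}

const source = "require('./increment')";
// Range of the callee identifier `require`, as a parser would record it
const calleeRange = [0, 7];
const output = replaceRange(source, calleeRange, "__webpack_require__");

console.log(output); // __webpack_require__('./increment')
```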

Of course, only a simple path through the Parser has been analyzed here. Some terms along the way may be unfamiliar. For now it is enough to understand what each step produces; the details will be analyzed in later articles.

References

AST online compilation and viewing