Preface

For a front-end developer, compilation feels both far away and close at hand. It feels far away because you probably never need to touch it if you only do business development. But as you dig deeper, you find that the front-end world is now full of tools that sit very close to you, such as webpack, Rollup, Babel, and even PostCSS. The recent multi-platform frameworks Taro and mpvue also rely on applied compilation principles.

I remember the compiler principles courses and textbooks from university as very hard to understand. After starting work, if your daily tasks have nothing to do with compilers, the subject inevitably becomes more and more unfamiliar. However, compiler principles are one of the foundational disciplines of computer science, and you need to master them if you want to reach a higher technical level.

While using Babel to optimize code, I gradually gained a deeper understanding of compilation principles. Inspired by earlier work (JSJS), I tried to implement a JS interpreter, that is, to use JS to interpret JS.

Just to be clear: there’s a lot more to compilation than this article covers.

Repository: github.com/jackie-gan/…

Preparation

What is a JS interpreter in the first place? Put simply, it is a program that uses JS to run JS.

To run JS with JS, the interpreter has to read and understand the JS source, and then actively execute it.

For example, how would the statement console.log('123'); be executed?

  • First, pick out the individual words in the JS statement: console, log, and '123';
  • Then work out which syntax they belong to: console.log('123'); is actually a CallExpression, and its callee is a MemberExpression;
  • Finally, look up the console object in the MemberExpression, find its log function, execute that function, and output 123;

So, to read and execute JS, the following tools are needed:

  • Acorn, a code parser that converts JS code into the corresponding AST (abstract syntax tree);
  • AST Explorer (astexplorer), for viewing the AST visually;

Implementation approach

  • Step 1: use acorn to convert the code into an AST;
  • Step 2: define a custom traverser and the node handler functions;
  • Step 3: in each handler function, execute the target code, recursing into child nodes;
  • Step 4: the interpreter's entry function processes the first AST node;

Traverser implementation

The AST produced by Acorn conforms to the ESTree specification. For example, when the statement console.log('123'); is viewed in AST Explorer, the resulting AST looks like this:
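As an abbreviated sketch, the tree for that statement has roughly the following shape, written here as a TypeScript object literal. Position fields such as start and end are omitted, and the variable name consoleLogAst is only an illustrative choice, not something from the repository:

```typescript
// Abbreviated ESTree shape for: console.log('123');
// Position fields (start, end) are left out for brevity.
const consoleLogAst = {
  type: 'Program',
  sourceType: 'script',
  body: [
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'CallExpression',
        // The thing being called: console.log
        callee: {
          type: 'MemberExpression',
          computed: false,
          object: { type: 'Identifier', name: 'console' },
          property: { type: 'Identifier', name: 'log' },
        },
        // The call arguments: a single string literal
        arguments: [{ type: 'Literal', value: '123', raw: "'123'" }],
      },
    },
  ],
};
```

Every object carries a type field, and that field is what the interpreter dispatches on.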

As you can see, the syntax tree contains different node types, so next we need to define a handler function for each node type:

const es5 = {
  Program() {},
  ExpressionStatement() {},
  BlockStatement() {},
  ThisExpression() {},
  ObjectExpression() {},
  BinaryExpression() {},
  Literal() {},
  Identifier() {},
  VariableDeclaration() {},
  ...
};

Next, we need a traverser that recursively walks the nodes of the syntax tree until the whole tree has been visited:

const visitorsMap = { ...es5 };

export function evaluate(astPath: AstPath<ESTree.Node>) {
  // Pick the handler that matches the node type and delegate to it
  const visitor = visitorsMap[astPath.node.type];

  return visitor(astPath);
}
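To make the dispatch concrete, here is a minimal, self-contained sketch that runs console.log('123'); from a hand-written AST, so no parser is required. It is not the repository's code: the names AstNode, Env, MiniPath, miniVisitors, and miniEvaluate are hypothetical, the scope is a plain object rather than a Scope class, and only the six node types needed for this one statement are handled.

```typescript
// Hypothetical, simplified types; the article itself uses ESTree node types.
type AstNode = { type: string; [key: string]: any };
type Env = Record<string, unknown>;
type Method = (...args: unknown[]) => unknown;
interface MiniPath {
  node: AstNode;
  scope: Env;
  evaluate: (path: MiniPath) => unknown;
}

const miniVisitors: Record<string, (path: MiniPath) => unknown> = {
  // Run every top-level statement in source order
  Program: ({ node, scope, evaluate }) => {
    for (const child of node.body as AstNode[]) {
      evaluate({ node: child, scope, evaluate });
    }
    return undefined;
  },
  ExpressionStatement: ({ node, scope, evaluate }) =>
    evaluate({ node: node.expression, scope, evaluate }),
  Literal: ({ node }) => node.value,
  Identifier: ({ node, scope }) => {
    if (!Object.prototype.hasOwnProperty.call(scope, node.name)) {
      throw new ReferenceError(`${node.name} is not defined`);
    }
    return scope[node.name];
  },
  // Only non-computed access (a.b) is handled in this sketch
  MemberExpression: ({ node, scope, evaluate }) => {
    const target = evaluate({ node: node.object, scope, evaluate }) as Record<string, unknown>;
    return target[node.property.name];
  },
  CallExpression: ({ node, scope, evaluate }) => {
    const callee = node.callee as AstNode;
    const evalArgs = () =>
      (node.arguments as AstNode[]).map((arg) => evaluate({ node: arg, scope, evaluate }));
    if (callee.type === 'MemberExpression') {
      // Call through the receiver so that `this` stays bound to it
      const receiver = evaluate({ node: callee.object, scope, evaluate }) as Record<string, Method>;
      return receiver[callee.property.name](...evalArgs());
    }
    const fn = evaluate({ node: callee, scope, evaluate }) as Method;
    return fn(...evalArgs());
  },
};

function miniEvaluate(path: MiniPath): unknown {
  const visitor = miniVisitors[path.node.type];
  if (!visitor) throw new Error(`No handler for node type: ${path.node.type}`);
  return visitor(path);
}

// Hand-written AST for: console.log('123');
const logProgram: AstNode = {
  type: 'Program',
  body: [{
    type: 'ExpressionStatement',
    expression: {
      type: 'CallExpression',
      callee: {
        type: 'MemberExpression',
        computed: false,
        object: { type: 'Identifier', name: 'console' },
        property: { type: 'Identifier', name: 'log' },
      },
      arguments: [{ type: 'Literal', value: '123' }],
    },
  }],
};

// A stand-in console that records its output instead of printing it
const logged: string[] = [];
const fakeConsole = { log: (...args: unknown[]) => { logged.push(args.join(' ')); } };
miniEvaluate({ node: logProgram, scope: { console: fakeConsole }, evaluate: miniEvaluate });
```

After the last line runs, logged holds the single entry '123'. Passing evaluate along inside the path object mirrors the article's AstPath, which lets each handler recurse without importing the traverser.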

Node handler function

Processing AST nodes is much like processing DOM tree nodes: once the traverser reaches a node, the node is handled according to the specification.

So far, this article only implements interpretation of ES5 code, so the nodes to handle are mainly ES5 nodes. The handling of a few node types is shown below as examples:

The Program node

As the root of the whole AST, this node only requires traversing its body property in order; the order of the nodes in body is the execution order of the JS statements.

Program: (astPath: AstPath<ESTree.Program>) => {
  const { node, scope, evaluate } = astPath;
  node.body.forEach((bodyNode) => {
    evaluate({ node: bodyNode, scope, evaluate });
  });
},

BinaryExpression node

To process a binary operation expression node, we need to first evaluate the left and right expressions, then perform the corresponding calculation according to the operator, and finally return the processing result.

BinaryExpression: (astPath: AstPath<ESTree.BinaryExpression>) => {
  const { node, scope, evaluate } = astPath;
  const leftVal = evaluate({ node: node.left, scope, evaluate });
  const rightVal = evaluate({ node: node.right, scope, evaluate });
  const operator = node.operator;

  // Map each supported operator to the function that computes it
  const calculateFunc = {
    '+': (l, r) => l + r,
    '-': (l, r) => l - r,
    '*': (l, r) => l * r,
    '/': (l, r) => l / r,
    '%': (l, r) => l % r,
    '<': (l, r) => l < r,
    '>': (l, r) => l > r,
    '<=': (l, r) => l <= r,
    '>=': (l, r) => l >= r,
    '==': (l, r) => l == r,
    '===': (l, r) => l === r,
    '!=': (l, r) => l != r,
    '!==': (l, r) => l !== r
  };

  if (calculateFunc[operator]) return calculateFunc[operator](leftVal, rightVal);
  else throw `${TAG} unknown operator: ${operator}`;
}

WhileStatement node

A WhileStatement node contains the test and body properties. The test property is the loop condition, so it is evaluated recursively; body holds the logic inside the loop, which is also evaluated recursively.

WhileStatement: (astPath: AstPath<ESTree.WhileStatement>) => {
  const { node, scope, evaluate } = astPath;
  const { test, body } = node;

  while (evaluate({ node: test, scope, evaluate })) {
    const result = evaluate({ node: body, scope, evaluate });

    if (Signal.isBreak(result)) break;
    if (Signal.isContinue(result)) continue;
    // Signal stores its payload in the `value` property
    if (Signal.isReturn(result)) return result.value;
  }
}

Note that inside a while loop, the keywords break, continue, or return may be encountered and end the loop logic early, so these keywords need extra handling.

Keyword processing

These keywords correspond to the BreakStatement, ContinueStatement, and ReturnStatement nodes. We define a keyword base class, Signal; the handler functions for these keyword nodes return Signal instances so that the upper level can act on them:

BreakStatement: () => {
  // Return the result to the upper level
  return new Signal('break');
},

ContinueStatement: () => {
  // Return the result to the upper level
  return new Signal('continue');
},

ReturnStatement: (astPath: AstPath<ESTree.ReturnStatement>) => {
  const { node, scope, evaluate } = astPath;
  // Return the result to the upper level
  return new Signal('return', node.argument ? evaluate({ node: node.argument, scope, evaluate }) : undefined);
}

The Signal base class is as follows:

type SignalType = 'break' | 'continue' | 'return';

export class Signal {
  public type: SignalType;
  public value?: any;

  constructor(type: SignalType, value?: any) {
    this.type = type;
    this.value = value;
  }

  private static check(v, t): boolean {
    return v instanceof Signal && v.type === t;
  }

  public static isContinue(v): boolean {
    return this.check(v, 'continue');
  }

  public static isBreak(v): boolean {
    return this.check(v, 'break');
  }

  public static isReturn(v): boolean {
    return this.check(v, 'return');
  }
}
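For a signal to reach the loop, any block in between has to stop at the first signal and hand it upward. The sketch below illustrates that idea in isolation. It is an assumption about how such propagation could look, not the repository's BlockStatement handler: LoopSignal, Step, runBlock, and runWhile are hypothetical names, statements are modeled as plain functions instead of AST nodes, and a 'return' signal is passed up unchanged for an enclosing function call to unwrap.

```typescript
type LoopSignalKind = 'break' | 'continue' | 'return';

// Hypothetical stand-in for the article's Signal class
class LoopSignal {
  constructor(public readonly type: LoopSignalKind, public readonly value?: unknown) {}
}

// Hypothetical stand-in for an evaluated statement: it returns a LoopSignal or nothing
type Step = () => unknown;

// Block handler sketch: run statements in order, stop at the first signal and pass it up
function runBlock(steps: Step[]): unknown {
  for (const step of steps) {
    const result = step();
    if (result instanceof LoopSignal) return result;
  }
  return undefined;
}

// While handler sketch built on top of runBlock
function runWhile(test: () => boolean, body: Step[]): unknown {
  while (test()) {
    const result = runBlock(body);
    if (result instanceof LoopSignal) {
      if (result.type === 'break') break;
      if (result.type === 'continue') continue;
      return result; // a 'return' signal keeps travelling to the enclosing function call
    }
  }
  return undefined;
}

// Models: var i = 0; while (true) { i++; if (i === 3) break; visited.push(i); }
let counter = 0;
const visited: number[] = [];
runWhile(() => true, [
  () => { counter++; },
  () => (counter === 3 ? new LoopSignal('break') : undefined),
  () => { visited.push(counter); },
]);
```

When the loop ends, visited is [1, 2] and counter is 3: the third pass hits the break signal before the push step runs.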

More node processing

Because there are so many AST node types, covering them all would make this article too long. To see how other nodes are handled, check the Git repository directly.

When handling the VariableDeclaration node, that is, a variable declaration, a question arises: where should the declared variables be stored?

This is where the concept of scope comes in.

Scope

As we all know, JS has the concepts of global scope, function scope, and block scope.

Variables defined in the global context should be stored in the global scope, while variables defined in a function context should be stored in that function's scope.

export class Scope {
  private parent: Scope | null;
  private content: { [key: string]: Var };
  public invasive: boolean;

  constructor(public readonly type: ScopeType, parent?: Scope) {
    this.parent = parent || null;
    this.content = {};  // Variables of the current scope
  }

  /**
   * `var` is stored in an upper scope: the nearest function scope, or the root
   */
  public var(rawName: string, value: any): boolean {
    let scope: Scope = this;

    // Walk up until a function scope or the root scope is reached
    while (scope.parent !== null && scope.type !== 'function') {
      scope = scope.parent;
    }

    scope.content[rawName] = new Var('var', value);
    return true;
  }

  /**
   * `const` is only defined in the current scope
   */
  public const(rawName: string, value: any): boolean {
    if (!this.content.hasOwnProperty(rawName)) {
      this.content[rawName] = new Var('const', value);
      return true;
    } else {
      // It is already defined
      return false;
    }
  }

  /**
   * `let` is only defined in the current scope
   */
  public let(rawName: string, value: any): boolean {
    if (!this.content.hasOwnProperty(rawName)) {
      this.content[rawName] = new Var('let', value);
      return true;
    } else {
      // It is already defined
      return false;
    }
  }

  /**
   * Find a variable, starting from the current scope
   */
  public search(rawName: string): Var | null {
    // 1. Search the current scope first
    if (this.content.hasOwnProperty(rawName)) {
      return this.content[rawName];
    // 2. If not found, continue in the parent scope
    } else if (this.parent) {
      return this.parent.search(rawName);
    } else {
      return null;
    }
  }

  public declare(kind: KindType, rawName: string, value: any): boolean {
    return ({
      'var': () => this.var(rawName, value),
      'const': () => this.const(rawName, value),
      'let': () => this.let(rawName, value)
    })[kind]();
  }
}

When a BlockStatement is encountered, a new Scope instance must be created, because variables declared with const and let are block-scoped and their values are stored in the current block scope.

A var declaration inside a BlockStatement, however, still has to be defined in an outer scope: we keep walking upward until a function scope is reached. Therefore, defining a var is handled as follows:

public var(rawName: string, value: any): boolean {
  let scope: Scope = this;

  // Define the variable in an upper scope: walk up to the function (or root) scope
  while (scope.parent !== null && scope.type !== 'function') {
    scope = scope.parent;
  }

  scope.content[rawName] = new Var('var', value);
  return true;
}
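The difference between the two storage rules is easiest to see in a small experiment. The following is a stripped-down sketch, not the article's Scope class: MiniScope and its methods declareVar, declareLet, and has are hypothetical names, and values are kept in a plain object without the Var wrapper.

```typescript
type MiniScopeKind = 'root' | 'function' | 'block';

class MiniScope {
  private readonly vars: Record<string, unknown> = {};

  constructor(
    public readonly kind: MiniScopeKind,
    private readonly outer: MiniScope | null = null,
  ) {}

  // `var`: climb to the nearest function scope (or the root) before storing
  declareVar(name: string, value: unknown): void {
    let target: MiniScope = this;
    while (target.outer !== null && target.kind !== 'function') {
      target = target.outer;
    }
    target.vars[name] = value;
  }

  // `let` / `const`: store in the current scope only, refusing redeclaration
  declareLet(name: string, value: unknown): boolean {
    if (Object.prototype.hasOwnProperty.call(this.vars, name)) return false;
    this.vars[name] = value;
    return true;
  }

  // Look the name up from the current scope outward to the root
  has(name: string): boolean {
    if (Object.prototype.hasOwnProperty.call(this.vars, name)) return true;
    return this.outer !== null && this.outer.has(name);
  }
}

// Models: function f() { { var a = 1; let b = 2; } }
const rootScope = new MiniScope('root');
const fnScope = new MiniScope('function', rootScope);
const blockScope = new MiniScope('block', fnScope);

blockScope.declareVar('a', 1); // lands in fnScope
blockScope.declareLet('b', 2); // stays in blockScope

const varSeenInFunction = fnScope.has('a');
const letSeenInFunction = fnScope.has('b');
const varSeenInRoot = rootScope.has('a');
```

Here varSeenInFunction is true, while letSeenInFunction and varSeenInRoot are both false: the var escaped the block but stopped at the function boundary, and the let never left the block.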

The entry function

Now that we can read and execute JS, let's define an entry function and output the result of the execution:

export function execute(code: string, externalApis: any = {}) {
  // Global scope
  const scope = new Scope('root');
  scope.const('this', null);

  for (const name of Object.getOwnPropertyNames(defaultApis)) {
    scope.const(name, defaultApis[name]);
  }

  for (const name of Object.getOwnPropertyNames(externalApis)) {
    scope.const(name, externalApis[name]);
  }

  // Module export
  const $exports = {};
  const $module = { exports: $exports };
  scope.const('module', $module);
  scope.var('exports', $exports);

  const rootNode = acorn.parse(code, {
    sourceType: 'script'
  });

  const astPath: AstPath<ESTree.Node> = {
    node: rootNode,
    evaluate,
    scope
  }

  evaluate(astPath);

  // Export the result
  const moduleExport = scope.search('module');

  return moduleExport ? moduleExport.getVal().exports : null;
}

The entry function execute takes two arguments:

  • code: the source code to run, as a string;
  • externalApis: an object holding the "built-in" objects to expose to the interpreted code;

The result of the entry function is output through a custom module.exports object.

What can be done now

The interpreter is only a prototype and can only handle simple JS for now:

  • For example, it passes the test cases written for it;
  • It can execute some simple code in environments such as mini programs;

When the bundled interpreter runs in a mini program, it behaves as follows:

  • Sample code:
const interpreter = require('./interpreter');

// Example 1:
interpreter.execute(`wx.showModal({ title: 'Example 1', success: function() { wx.showToast({ title: 'Button clicked' }); } });`, { wx: wx });

// Example 2:
interpreter.execute(`setTimeout(function() { wx.showToast({ title: 'Countdown finished' }); }, 1000);`, { wx: wx });

  • The effect is as follows:

Conclusion

Implementing a JS interpreter is also a way of understanding the JS language more deeply. The interpreter will continue to be refined, for example:

  • Handling variable hoisting;
  • Supporting more ES6+ features;
  • And so on.

This article is only a rough starting point, offered in the hope of drawing out better ideas. Feedback and discussion are welcome!