Preface

AST is an important topic, yet many of us never take the time to understand it properly. This article records my personal understanding of the AST: how source code is parsed into one, how it is compiled back into code, and where front-end build tools put it to use. It touches on the following 👇🏻 points:

  • Converting ES6 to ES5 with Babel
  • How Vue template compilation uses the AST
  • CSS preprocessing
  • Developing webpack plugins
  • Compressing code with UglifyJS

You will also learn the basics of the abstract syntax tree (AST):

  1. How does the JS engine parse code, and what is the process?
  2. What is an abstract syntax tree (AST)? In short, the token stream is converted into a tree of nested elements that represents the syntactic structure of the program.
  3. Besides parsing code, what is the AST used for in a project?
  4. What does an AST look like? (This lays the foundation for later use.)
  5. How is an AST compiled? (The process is explained step by step, and several compilation plugins are involved.)

1. What is AST

1.1 JS engine parsing process

Before the engine can execute a JS script, it first has to parse it.

JS is an interpreted language: it does not need to be compiled in advance, and the interpreter runs it on the fly. The process is roughly:

  1. Read the code and perform lexical analysis, decomposing the code into tokens
  2. Perform syntactic analysis on the tokens, organizing the code into a syntax tree
  3. Use a translator to convert the syntax tree into bytecode
  4. Use a bytecode interpreter to execute the bytecode

Modern browsers also use just-in-time (JIT) compilation, which compiles frequently executed code into machine code for speed.

1.2 The parser generates the AST

A JS parser converts JS source code into an abstract syntax tree.

The parsing process is divided into two steps:

  1. Lexical analysis (tokenizing): splits the whole code string into an array of minimal syntax units

PS: A syntax unit is the smallest unit that carries meaning in the grammar being parsed; think of it as a word in natural language.

  2. Syntactic analysis: builds on the tokens to establish and analyze the relationships between syntax units

Syntax units in JS code mainly fall into the following categories:

Keywords, identifiers, operators, numbers, strings, whitespace, comments, and others.

An online AST tool: esprima.org/demo/parse…. Compiling takes four steps.

1. Lexical analysis (scanner)

The lexical analysis stage scans the code first and generates a token stream

var name = 'jingda'
  1. The scanner reads the code character by character and uses conditional statements to decide whether each character is a letter, a digit, a space, "(", ")", ";", and so on.
  2. If it is a letter or a digit, scanning continues until a character that is neither is reached. The string found at that point is "var", which is a keyword, and the next character is a space, so {"type": "Keyword", "value": "var"} is pushed into the array.
  3. Scanning continues and finds the letters "name", followed by a space. Because the previous token was "var", this is an identifier, so {"type": "Identifier", "value": "name"} is pushed into the array.
  4. Next it finds "=" and pushes {"type": "Punctuator", "value": "="} into the array.
  5. Finally it finds 'jingda' and pushes {"type": "String", "value": "'jingda'"} into the array.

[
    {
        "type": "Keyword",
        "value": "var"
    },
    {
        "type": "Identifier",
        "value": "name"
    },
    {
        "type": "Punctuator",
        "value": "="
    },
    {
        "type": "String",
        "value": "'jingda'"
    }
]
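
To make the scanning steps concrete, here is a minimal hand-written tokenizer. It is only a sketch of the idea, not how Esprima is actually implemented: the names tokenize and KEYWORDS are my own, the keyword list is deliberately tiny, and escapes inside strings are not handled.

```javascript
// Minimal scanner sketch (an assumption-laden toy, not Esprima's code).
const KEYWORDS = new Set(['var', 'let', 'const', 'function', 'return']);

function tokenize(code) {
  const tokens = [];
  let i = 0;
  while (i < code.length) {
    const ch = code[i];
    if (/\s/.test(ch)) {
      // Whitespace separates tokens but produces none.
      i++;
    } else if (/[A-Za-z_$]/.test(ch)) {
      // Keep reading while the characters are letters, digits, _ or $.
      let value = '';
      while (i < code.length && /[A-Za-z0-9_$]/.test(code[i])) value += code[i++];
      tokens.push({ type: KEYWORDS.has(value) ? 'Keyword' : 'Identifier', value });
    } else if (/[0-9]/.test(ch)) {
      let value = '';
      while (i < code.length && /[0-9.]/.test(code[i])) value += code[i++];
      tokens.push({ type: 'Numeric', value });
    } else if (ch === '"' || ch === "'") {
      // String literal: read up to the matching quote and keep both quotes.
      let value = ch;
      i++;
      while (i < code.length && code[i] !== ch) value += code[i++];
      value += ch;
      i++;
      tokens.push({ type: 'String', value });
    } else {
      // Anything else is treated as a one-character punctuator.
      tokens.push({ type: 'Punctuator', value: ch });
      i++;
    }
  }
  return tokens;
}

console.log(tokenize("var name = 'jingda'"));
```

Running it on var name = 'jingda' yields four tokens with the same types and values as the array above.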

2. The parser generates an AST

(Use a library that parses code into an AST.)

Code to parse:

const name = "jing"

Install: npm i esprima --save

const esprima = require('esprima');
let code = 'const name = "jing"';
const ast = esprima.parseScript(code);
console.log(ast);
Script {
  type: 'Program',
  body: [
    VariableDeclaration {
      type: 'VariableDeclaration',
      declarations: [Array],
      kind: 'const'
    }
  ],
  sourceType: 'script'
}
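
To show what the parser does with the tokens, here is a tiny sketch that turns the four tokens of const name = "jing" into a tree of the same shape. The function parseVariableDeclaration and its expect helper are hypothetical names of mine; it only understands this one statement form, whereas a real parser such as Esprima handles the whole grammar.

```javascript
// Tokens for: const name = "jing" (written by hand so the sketch is self-contained).
const tokens = [
  { type: 'Keyword', value: 'const' },
  { type: 'Identifier', value: 'name' },
  { type: 'Punctuator', value: '=' },
  { type: 'String', value: '"jing"' },
];

// Hypothetical helper: parses exactly one `<kind> <id> = <string>` declaration.
function parseVariableDeclaration(tokenList) {
  let pos = 0;
  // Consume the next token, checking its type and (optionally) its value.
  function expect(type, value) {
    const token = tokenList[pos++];
    if (!token || token.type !== type || (value !== undefined && token.value !== value)) {
      throw new SyntaxError(`Unexpected token at position ${pos - 1}`);
    }
    return token;
  }
  const kind = expect('Keyword').value;
  const name = expect('Identifier').value;
  expect('Punctuator', '=');
  const raw = expect('String').value;
  return {
    type: 'Program',
    body: [{
      type: 'VariableDeclaration',
      declarations: [{
        type: 'VariableDeclarator',
        id: { type: 'Identifier', name },
        // Strip the surrounding quotes to get the string's value.
        init: { type: 'Literal', value: raw.slice(1, -1), raw },
      }],
      kind,
    }],
    sourceType: 'script',
  };
}

const ast = parseVariableDeclaration(tokens);
console.log(JSON.stringify(ast, null, 2));
```

The declarations entry that the console output above abbreviates as [Array] is the VariableDeclarator holding the Identifier and the Literal.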

3. Traverse the AST to add, delete, modify, and query nodes

Install: npm i estraverse --save

Goal: turn const name = "jing" into const jing = "name". As a first experiment, the code below sets name and value on every node it visits:

const esprima = require('esprima');
const estraverse = require('estraverse');
let code = 'const name = "jing"';
const ast = esprima.parseScript(code);
estraverse.traverse(ast, {
    enter: function (node) {
        node.name = 'team';
        node.value = "turn around FE";
    }
});
console.log(ast);

Script {
  type: 'Program',
  body: [
    VariableDeclaration {
      type: 'VariableDeclaration',
      declarations: [Array],
      kind: 'const',
      name: 'team',
      value: 'turn around FE'
    }
  ],
  sourceType: 'script',
  name: 'team',
  value: 'turn around FE'
}
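
Depth-first traversal itself is not magic. The sketch below is a hand-rolled stand-in for estraverse.traverse (my own simplified traverse function, not the library's implementation) that calls enter and leave on every object with a type property. Unlike the experiment above, the visitor checks node.type, so only the Identifier and the Literal are changed.

```javascript
// Hypothetical stand-in for estraverse.traverse: visits every object that has a `type`.
function traverse(node, visitor) {
  if (visitor.enter) visitor.enter(node);
  for (const key of Object.keys(node)) {
    const child = node[key];
    if (Array.isArray(child)) {
      child.forEach((item) => {
        if (item && typeof item.type === 'string') traverse(item, visitor);
      });
    } else if (child && typeof child.type === 'string') {
      traverse(child, visitor);
    }
  }
  if (visitor.leave) visitor.leave(node);
}

// Hand-written AST for: const name = "jing"
const ast = {
  type: 'Program',
  body: [{
    type: 'VariableDeclaration',
    kind: 'const',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'name' },
      init: { type: 'Literal', value: 'jing' },
    }],
  }],
};

const visited = [];
traverse(ast, {
  enter(node) {
    visited.push(node.type);
    // Only touch the node kinds we care about instead of every node.
    if (node.type === 'Identifier') node.name = 'jing';
    if (node.type === 'Literal') node.value = 'name';
  },
});

console.log(visited.join(' > '));
console.log(ast.body[0].declarations[0]);
```

Filtering on node.type is the usual way to keep a visitor from stamping properties onto Program or VariableDeclaration nodes, which is what happened in the output above.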

4. The generator converts the updated AST back into code

Install: npm i escodegen --save

const esprima = require('esprima');
const estraverse = require('estraverse');
const escodegen = require('escodegen');
let code = 'const name = "jingda" ';
const ast = esprima.parseScript(code);
estraverse.traverse(ast, {
    enter: function (node) {
        node.name = 'jingda';
        node.value = "name";
    }
});
const transformCode = escodegen.generate(ast);
console.log(transformCode);

➜  11.9-AST node parse1.js
const jingda = 'name';
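
Code generation is the traversal in reverse: walk the tree and concatenate strings. The following is a toy generator of my own that supports just five node types and does not escape quotes inside strings; escodegen covers the full syntax, so treat this only as an illustration of the principle.

```javascript
// Hypothetical mini generator: turns a small AST back into source text.
function generate(node) {
  switch (node.type) {
    case 'Program':
      return node.body.map(generate).join('\n');
    case 'VariableDeclaration':
      return `${node.kind} ${node.declarations.map(generate).join(', ')};`;
    case 'VariableDeclarator':
      return node.init ? `${generate(node.id)} = ${generate(node.init)}` : generate(node.id);
    case 'Identifier':
      return node.name;
    case 'Literal':
      // Strings are printed with single quotes; quotes inside the string are not escaped here.
      return typeof node.value === 'string' ? `'${node.value}'` : String(node.value);
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

// Hand-written AST for the already transformed program.
const transformedAst = {
  type: 'Program',
  body: [{
    type: 'VariableDeclaration',
    kind: 'const',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'jingda' },
      init: { type: 'Literal', value: 'name' },
    }],
  }],
};

console.log(generate(transformedAst)); // const jingda = 'name';
```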

Babel converts ES2015+ code into backwards-compatible JS syntax, so that it can run in current and older browsers or other environments without worrying about whether they support the new syntax.

In fact, many of Babel's features are implemented by modifying the AST.

With an abstract syntax tree, we can take JavaScript code apart like a childhood toy, see how it works, and reassemble it however we wish. Take this function as an example:


function add(a,b) {
    return a + b;
}

It is a function declaration object (FunctionDeclaration).

It can be divided into three parts:

  1. An id, which is its name, add
  2. Two params, which are its parameters, [a, b]
  3. A body, which is everything inside the curly braces

add is a basic Identifier object that serves as the function's unique identifier, like a person's name

{
    name: 'add',
    type: 'Identifier',
    ....
}

params, taken apart further, is an array of two Identifier objects

[
    {
       name: 'a',
       type: 'Identifier',
       ....
    },
    {
       name: 'b',
       type: 'Identifier',
       ....
    }
]

Now let's look at the body

body is a block statement (BlockStatement) representing { return a + b }

Inside the block there is a ReturnStatement, which represents return a + b;

Inside the ReturnStatement there is a BinaryExpression object, a + b

Opening up the BinaryExpression object, it has three parts: left, operator, and right

  • operator is +
  • left holds the Identifier object a
  • right holds the Identifier object b

That completes the disassembly of the simple add function
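
Putting the pieces back together, the whole tree for add can be written as a plain object. To check that the tree really carries the meaning of the function, the sketch below also "runs" it with two small helpers, evaluate and callFunction. Both are hypothetical names of mine and only understand identifiers and the + operator.

```javascript
// The complete tree for: function add(a, b) { return a + b; }
const addAst = {
  type: 'FunctionDeclaration',
  id: { type: 'Identifier', name: 'add' },
  params: [
    { type: 'Identifier', name: 'a' },
    { type: 'Identifier', name: 'b' },
  ],
  body: {
    type: 'BlockStatement',
    body: [{
      type: 'ReturnStatement',
      argument: {
        type: 'BinaryExpression',
        operator: '+',
        left: { type: 'Identifier', name: 'a' },
        right: { type: 'Identifier', name: 'b' },
      },
    }],
  },
};

// Hypothetical helper: evaluates an expression node against a table of variable values.
function evaluate(node, scope) {
  switch (node.type) {
    case 'Identifier':
      return scope[node.name];
    case 'BinaryExpression': {
      const left = evaluate(node.left, scope);
      const right = evaluate(node.right, scope);
      if (node.operator === '+') return left + right;
      throw new Error(`Unsupported operator: ${node.operator}`);
    }
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

// Hypothetical helper: binds the arguments to the params, then evaluates the return value.
function callFunction(fn, args) {
  const scope = {};
  fn.params.forEach((param, index) => { scope[param.name] = args[index]; });
  const returnStatement = fn.body.body.find((statement) => statement.type === 'ReturnStatement');
  return evaluate(returnStatement.argument, scope);
}

console.log(addAst.id.name, callFunction(addAst, [1, 2])); // add 3
```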



2. Application scenarios of AST

Applications:

  • Converting ES6 to ES5 with Babel
  • How Vue template compilation uses the AST
  • CSS preprocessing
  • Developing webpack plugins
  • Compressing code with UglifyJS

Tools we use every day rely on the AST: Babel plugins that convert ES6 to ES5, UglifyJS for compressing code, CSS preprocessors, webpack plugins, and front-end automation tools such as vue-cli.

The AST is the foundation on which interpreters and compilers do their parsing:

JS: code compression, obfuscation, and compilation

CSS: making code compatible with multiple versions

HTML: the implementation of the Virtual DOM in Vue

What else can it do?

  1. IDE error hints, code formatting, syntax highlighting, code completion, etc.
  2. JSLint and JSHint checking code for errors and style problems
  3. webpack and rollup for bundling code
  4. Converting CoffeeScript, TypeScript, JSX, etc. to native JS

2.1. AST Compilation flowchart

2.2. Babel compiles code to lower ES versions

Babel plugins work on the abstract syntax tree (AST).

Babel works in three steps: parse, transform, and generate

  1. Parse

Lexical analysis (producing the token stream) followed by syntactic analysis (turning the tokens into the nodes of the AST)

  2. Transform

The AST is received and traversed, and during the traversal nodes are added, updated, and removed. Babel performs a depth-first traversal with babel-traverse and maintains the overall state of the AST

  3. Generate

babel-generator traverses the transformed AST depth-first and builds the string that represents the transformed JS code.
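
The three steps can be sketched end to end without Babel. In this toy pipeline the parse step is stubbed with a hand-written AST, the plugin is a visitor object keyed by node type (the shape Babel plugins use, although real visitors receive a path rather than a bare node), and the transform and generate functions are my own simplified stand-ins. Note that really converting const to var also requires scope analysis, which is ignored here.

```javascript
// 1. Parse (stubbed): the AST a parser would produce for `const name = "jing";`
const ast = {
  type: 'Program',
  body: [{
    type: 'VariableDeclaration',
    kind: 'const',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'name' },
      init: { type: 'Literal', value: 'jing' },
    }],
  }],
};

// 2. Transform: a hypothetical plugin, written as a visitor keyed by node type.
const constToVar = {
  VariableDeclaration(node) {
    if (node.kind === 'const' || node.kind === 'let') node.kind = 'var';
  },
};

// Depth-first walk that calls the matching visitor method for each node.
function transform(node, visitor) {
  if (!node || typeof node.type !== 'string') return;
  if (visitor[node.type]) visitor[node.type](node);
  for (const value of Object.values(node)) {
    if (Array.isArray(value)) value.forEach((child) => transform(child, visitor));
    else if (value && typeof value === 'object') transform(value, visitor);
  }
}

// 3. Generate: build the output string from the transformed tree.
function generate(node) {
  switch (node.type) {
    case 'Program':
      return node.body.map(generate).join('\n');
    case 'VariableDeclaration':
      return `${node.kind} ${node.declarations.map(generate).join(', ')};`;
    case 'VariableDeclarator':
      return `${generate(node.id)} = ${generate(node.init)}`;
    case 'Identifier':
      return node.name;
    case 'Literal':
      return JSON.stringify(node.value);
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

transform(ast, constToVar);
console.log(generate(ast)); // var name = "jing";
```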

2.3. Vue template compilation

Schematic diagram:

Steps:

Compiling a template in Vue consists of three main steps:

  1. Parser stage: parse the code inside the template into an AST;
  2. Optimizer stage: mark the static nodes in the AST so that they are not re-rendered repeatedly (an optimization for the diff algorithm);
  3. Code generator stage: the generate function turns the optimized AST into a render function string;
export const createCompiler = createCompilerCreator(function baseCompile (
  template: string,
  options: CompilerOptions
): CompiledResult {
  // Generate the AST
  const ast = parse(template.trim(), options)
  // Optimize the AST: mark static nodes so that they are not re-rendered repeatedly
  if (options.optimize !== false) {
    optimize(ast, options)
  }
  // Generate the render function string through the generate function
  const code = generate(ast, options)
  return {
    ast,
    render: code.render,
    staticRenderFns: code.staticRenderFns
  }
})
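
The optimizer stage is the least obvious of the three, so here is a rough sketch of the idea of marking static nodes. It assumes the Vue 2 node types (1 for elements, 2 for text containing an expression, 3 for plain text) and a markStatic function of my own; Vue's real check looks at much more, such as directives, bindings, and components.

```javascript
// Simplified template AST for:
// <div><p>static text</p><span>{{ msg }}</span></div>
const templateAst = {
  type: 1,
  tag: 'div',
  children: [
    { type: 1, tag: 'p', children: [{ type: 3, text: 'static text' }] },
    { type: 1, tag: 'span', children: [{ type: 2, expression: '_s(msg)', text: '{{ msg }}' }] },
  ],
};

// Hypothetical, simplified optimizer: plain text is static, text with an
// expression is not, and an element is static only if all its children are.
function markStatic(node) {
  if (node.type === 3) {
    node.static = true;
  } else if (node.type === 2) {
    node.static = false;
  } else {
    node.children.forEach(markStatic);
    node.static = node.children.every((child) => child.static);
  }
  return node;
}

markStatic(templateAst);
console.log(templateAst.children.map((child) => `${child.tag}: ${child.static}`));
```

Here the p subtree is marked static and can be skipped on re-render, while the span (and therefore the root div) is not.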

parse() parses the template into an AST. Internally the parser is split into several parsers, such as an HTML parser, a text parser, and a filter parser; the main one is the HTML parser.

The HTML parser's job is to parse HTML, and while doing so it keeps triggering various hook functions

Here is how the source code does it:

parseHTML(template, {
  // Parse the start tag
  start (tag, attrs, unary, start, end) {},
  // Parse the end tag
  end (tag, start, end) {},
  // Parse text
  chars (text: string, start: number, end: number) {},
})

For example 🌰:

<div> I am Jing Da </div>

When the template above is parsed by the HTML parser, it triggers the start, chars, and end hook functions in turn

So the HTML parser is really a function that takes two parameters, template and options. The template is parsed one small piece at a time: each piece is cut off the front of the string once it has been handled, so the cutting has to be repeated until nothing is left

How Vue implements this: the main flow of the Vue parser that generates the AST (skeleton only, the bodies are omitted)

function parseHTML (html, options) {
  while (html) {
    if (!lastTag || !isPlainTextElement(lastTag)) {
      // Vue decides here whether the next piece is text, a comment, a conditional comment, a DOCTYPE, an end tag, or a start tag
      var textEnd = html.indexOf('<');
      if (textEnd === 0) {
        // Comment
        if (comment.test(html)) {}
        // Conditional comment
        if (conditionalComment.test(html)) {}
        // DOCTYPE
        var doctypeMatch = html.match(doctype);
        if (doctypeMatch) {}
        // End tag
        var endTagMatch = html.match(endTag);
        if (endTagMatch) {}
        // Start tag
        var startTagMatch = parseStartTag();
        if (startTagMatch) {}
      }
      var text = (void 0), rest = (void 0), next = (void 0);
      // No "<" left: the rest of the template is text
      if (textEnd < 0) {
        text = html
        html = ''
      }
      // If text exists, call the options.chars callback
      if (options.chars && text) {
        options.chars(text)
      }
    } else {
      // The parent element is script, style, or textarea
    }
  }
}
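
To see the cut-and-parse loop and the hooks working together, here is a much smaller parser that I wrote for illustration. It only understands plain start tags, end tags, and text (no attributes, comments, or self-closing tags), and the parse wrapper that builds the tree with a stack is likewise my own simplification of what Vue does.

```javascript
// Hypothetical mini HTML parser: handles only <tag>, </tag>, and text.
function parseHTML(html, options) {
  const startTag = /^<([a-zA-Z][\w-]*)>/;
  const endTag = /^<\/([a-zA-Z][\w-]*)>/;
  while (html) {
    const textEnd = html.indexOf('<');
    if (textEnd === 0) {
      const endMatch = html.match(endTag);
      if (endMatch) {
        html = html.slice(endMatch[0].length); // cut off what was just parsed
        options.end(endMatch[1]);
        continue;
      }
      const startMatch = html.match(startTag);
      if (startMatch) {
        html = html.slice(startMatch[0].length);
        options.start(startMatch[1]);
        continue;
      }
    }
    // Everything up to the next "<" (or the rest of the string) is text.
    let text;
    if (textEnd > 0) {
      text = html.slice(0, textEnd);
      html = html.slice(textEnd);
    } else {
      text = html;
      html = '';
    }
    options.chars(text);
  }
}

// Build a tree from the hooks, using a stack to track the current parent.
function parse(template) {
  const stack = [];
  let root = null;
  parseHTML(template, {
    start(tag) {
      const element = { type: 1, tag, children: [] };
      if (stack.length) stack[stack.length - 1].children.push(element);
      else if (!root) root = element;
      stack.push(element);
    },
    end() {
      stack.pop();
    },
    chars(text) {
      const parent = stack[stack.length - 1];
      if (parent && text.trim()) parent.children.push({ type: 3, text: text.trim() });
    },
  });
  return root;
}

console.log(JSON.stringify(parse('<div> I am Jing Da </div>'), null, 2));
```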

2.4. TerserPlugin: code compression as a webpack plugin

TerserPlugin: webpack.docschina.org/plugins/ter…

Compressing code reduces its size, which saves bandwidth, speeds up loading, and improves the user experience

Create a new demo.js file

function add(a,b) {return a + b; }

Terminal input:

wc -c demo.js (measures the file size in bytes)

 207 demo.js

So how do you compress code when it is too big?

The first thing to be clear about is that the only code in this file that really matters is

function add(a,b){return a+b};

Everything else, such as comments, spaces, and line breaks, is redundant. Try removing the comments and spaces:

function add(a,b) {return a + b; }
181 demo.js

The size is reduced a little.

How the AST is applied:

The process of compressing code is: code -> AST -> (transform) a smaller AST -> code, which is the same flow that Babel and ESLint follow.

Babel uses the parser Babylon, while Uglify uses UglifyJS's own parser when compressing code.
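
As a rough illustration of "a smaller AST", the sketch below folds constant arithmetic and then prints the tree without any optional whitespace. The fold and generateCompact functions are my own toy versions, working on a hand-written AST; real minifiers such as UglifyJS and Terser apply many more transformations than this.

```javascript
// Hand-written AST for: const total = 60 * 60 * 24;
const ast = {
  type: 'VariableDeclaration',
  kind: 'const',
  declarations: [{
    type: 'VariableDeclarator',
    id: { type: 'Identifier', name: 'total' },
    init: {
      type: 'BinaryExpression',
      operator: '*',
      left: {
        type: 'BinaryExpression',
        operator: '*',
        left: { type: 'Literal', value: 60 },
        right: { type: 'Literal', value: 60 },
      },
      right: { type: 'Literal', value: 24 },
    },
  }],
};

// Only + and * are folded in this sketch.
const OPERATORS = {
  '+': (a, b) => a + b,
  '*': (a, b) => a * b,
};

// Transform: replace a BinaryExpression of two numeric literals with one Literal.
function fold(node) {
  if (node.type !== 'BinaryExpression') return node;
  const left = fold(node.left);
  const right = fold(node.right);
  const bothNumbers = left.type === 'Literal' && right.type === 'Literal' &&
    typeof left.value === 'number' && typeof right.value === 'number';
  if (bothNumbers && OPERATORS[node.operator]) {
    const value = OPERATORS[node.operator](left.value, right.value);
    if (Number.isFinite(value)) return { type: 'Literal', value };
  }
  return { ...node, left, right };
}

// Generate: print the tree without any optional whitespace.
function generateCompact(node) {
  switch (node.type) {
    case 'VariableDeclaration':
      return `${node.kind} ${node.declarations
        .map((d) => `${d.id.name}=${generateCompact(d.init)}`)
        .join(',')};`;
    case 'BinaryExpression': {
      // Nested binary expressions are always parenthesized to stay correct.
      const wrap = (child) =>
        child.type === 'BinaryExpression' ? `(${generateCompact(child)})` : generateCompact(child);
      return `${wrap(node.left)}${node.operator}${wrap(node.right)}`;
    }
    case 'Identifier':
      return node.name;
    case 'Literal':
      return JSON.stringify(node.value);
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

ast.declarations[0].init = fold(ast.declarations[0].init);
console.log(generateCompact(ast)); // const total=86400;
```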

2.5. Compress code in webpack

Everything related to performance optimization lives under the optimization option. TerserPlugin is a plugin for compressing JS, built on Terser (a fork of uglify-es).

You need to install terser-webpack-plugin:

$ npm install terser-webpack-plugin --save-dev

Official example:

const TerserPlugin = require("terser-webpack-plugin");

module.exports = {
  optimization: {
    minimize: true,
    minimizer: [new TerserPlugin()],
  },
};

2.6. ESLint checks your code against rules

2.6.1. A first look

ESLint website

ESLint is a tool for identifying and reporting on patterns found in ECMAScript/JavaScript code, with the goal of making code more consistent and avoiding errors. In many ways it is similar to JSLint and JSHint, with a few exceptions:

  • ESLint uses Espree to parse JavaScript.
  • ESLint uses an AST to analyze patterns in code.
  • ESLint is completely pluggable: every rule is a plugin, and you can add more rules at runtime.

The source code is parsed into an AST, and then the AST is examined to determine whether the code conforms to the rules. Early versions of ESLint used Esprima to parse source code into an AST (current versions use Espree); any rule can then check whether the AST meets expectations, which is why ESLint is so extensible. The early source code looked like this:

var ast = esprima.parse(text, { loc: true, range: true }),
    walk = astw(ast);

walk(function(node) {
    api.emit(node.type, node);
});

return messages;
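
The "walk the tree and emit an event per node type" design can be sketched in a few lines. In this sketch the lint function is a toy engine of my own, and noVar is a hypothetical rule written in the shape ESLint rules use (a create function that returns listeners keyed by node type and reports through context.report). It runs on a hand-written AST so that no parser is needed.

```javascript
// Hypothetical rule in the spirit of ESLint's "no-var".
const noVar = {
  create(context) {
    return {
      VariableDeclaration(node) {
        if (node.kind === 'var') {
          context.report({ node, message: 'Unexpected var, use let or const instead.' });
        }
      },
    };
  },
};

// Toy engine: walks the AST and hands every node to the listeners registered for its type.
function lint(ast, rules) {
  const messages = [];
  const context = { report: (problem) => messages.push(problem.message) };
  const listeners = rules.map((rule) => rule.create(context));
  (function walk(node) {
    if (!node || typeof node.type !== 'string') return;
    listeners.forEach((listener) => {
      if (listener[node.type]) listener[node.type](node);
    });
    Object.values(node).forEach((value) => {
      if (Array.isArray(value)) value.forEach(walk);
      else if (value && typeof value === 'object') walk(value);
    });
  })(ast);
  return messages;
}

// Hand-written AST for: var a = 1; const b = 2;
const program = {
  type: 'Program',
  body: [
    {
      type: 'VariableDeclaration',
      kind: 'var',
      declarations: [{
        type: 'VariableDeclarator',
        id: { type: 'Identifier', name: 'a' },
        init: { type: 'Literal', value: 1 },
      }],
    },
    {
      type: 'VariableDeclaration',
      kind: 'const',
      declarations: [{
        type: 'VariableDeclarator',
        id: { type: 'Identifier', name: 'b' },
        init: { type: 'Literal', value: 2 },
      }],
    },
  ],
};

console.log(lint(program, [noVar]));
```

Because the engine knows nothing about individual rules, adding a new check only means adding another object with a create function.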

ESLint didn't catch on at first: it had to convert source code into an AST, which made it slower than JSHint, and JSHint already had a complete ecosystem (editor support). What really made ESLint take off was the arrival of ES6.

After ES6 was released, JSHint could not support the new syntax for a while, whereas ESLint only needed a suitable parser to be able to lint it. At that point Babel stepped in and developed babel-eslint, making ESLint the fastest lint tool to support ES6 syntax.

Why is ESLint needed

JavaScript is a dynamic, weakly typed language, which makes mistakes easy during development. Because there is no compilation step, errors in JavaScript code often only surface through debugging at run time. Tools like ESLint let programmers find problems while coding rather than during execution.

1. Avoid low-level bugs and catch likely syntax errors

Using undeclared variables, modifying const variables…

2. Prompt you to delete unnecessary code

Variables that are declared but never used, duplicate cases…

3. Make sure the code follows best practices

See the Airbnb style guide and JavaScript Standard Style

4. Unify the team's code style

Semicolons or no semicolons? Tabs or spaces?

2.6.2. Practice

Segmentfault.com/a/119000001…

2.6.3. Writing an ESLint plugin by hand

www.it610.com/article/142…

The goal of this ESLint plugin is to check whether code has been commented:

  • Every function declaration and function expression needs a comment;
  • Every interface header and field needs a comment;
  • Every enum header and field needs a comment;
  • Every type header needs a comment;
  • ...
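
As a sketch of the first requirement only, a rule that reports function declarations without a preceding comment could look roughly like this. The rule name requireFunctionComment and the message text are my own, and so is the runRule helper with its fake context (comments stored in a made-up fakeComments property), which exists only so the example runs without ESLint installed. In a real plugin ESLint supplies the context and the rule would be exported from its own file.

```javascript
// Hypothetical rule: every function declaration needs a comment in front of it.
const requireFunctionComment = {
  meta: { type: 'suggestion', schema: [] },
  create(context) {
    // Newer ESLint versions expose context.sourceCode, older ones context.getSourceCode().
    const sourceCode = context.sourceCode || context.getSourceCode();
    return {
      FunctionDeclaration(node) {
        if (sourceCode.getCommentsBefore(node).length === 0) {
          context.report({ node, message: `Function "${node.id.name}" needs a comment.` });
        }
      },
    };
  },
};

// Fake context for this sketch only: comments live in a made-up `fakeComments` property.
function runRule(rule, nodes) {
  const reports = [];
  const context = {
    sourceCode: { getCommentsBefore: (node) => node.fakeComments || [] },
    report: (problem) => reports.push(problem.message),
  };
  const listeners = rule.create(context);
  nodes.forEach((node) => {
    if (listeners[node.type]) listeners[node.type](node);
  });
  return reports;
}

const reports = runRule(requireFunctionComment, [
  {
    type: 'FunctionDeclaration',
    id: { type: 'Identifier', name: 'add' },
    fakeComments: [{ type: 'Line', value: ' adds two numbers' }],
  },
  { type: 'FunctionDeclaration', id: { type: 'Identifier', name: 'sub' } },
]);

console.log(reports);
```

Only sub is reported, because add has a comment in front of it.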

Knowledge involved

  • AST (abstract syntax tree)
  • ESLint
  • Mocha unit tests
  • Publishing to npm

Use Yeoman and generator-eslint to build the plugin's scaffolding code.

Install them first:

npm install -g yo generator-eslint

3. Some practices

3.1 Disassembling with recast (a screwdriver for manipulating the syntax tree)

  1. npm i recast -S
  2. Create a new parse.js file

3.1.1 Try 1

const recast = require("recast");
const code = `function add(a, b) {
    return a + b
}`;
// Parse the code into an AST
const ast = recast.parse(code);
console.log(ast);

➜ 11.9-AST node parse.js
{
  program: Script {
    type: 'Program',
    body: [ [FunctionDeclaration] ],
    sourceType: 'script',
    loc: {
      start: [Object],
      end: [Object],
      lines: [Lines],
      indent: 0,
      tokens: [Array]
    },
    errors: []
  },
  name: null,
  loc: {
    start: { line: 1, column: 0, token: 0 },
    end: { line: 3, column: 1, token: 13 },
    lines: Lines {
      infos: [Array],
      mappings: [],
      cachedSourceMap: null,
      cachedTabWidth: undefined,
      length: 3,
      name: null
    },
    indent: 0,
    tokens: [
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object]
    ]
  },
  type: 'File',
  comments: null,
  tokens: [
    { type: 'Keyword', value: 'function', loc: [Object] },
    { type: 'Identifier', value: 'add', loc: [Object] },
    { type: 'Punctuator', value: '(', loc: [Object] },
    { type: 'Identifier', value: 'a', loc: [Object] },
    { type: 'Punctuator', value: ',', loc: [Object] },
    { type: 'Identifier', value: 'b', loc: [Object] },
    { type: 'Punctuator', value: ')', loc: [Object] },
    { type: 'Punctuator', value: '{', loc: [Object] },
    { type: 'Keyword', value: 'return', loc: [Object] },
    { type: 'Identifier', value: 'a', loc: [Object] },
    { type: 'Punctuator', value: '+', loc: [Object] },
    { type: 'Identifier', value: 'b', loc: [Object] },
    { type: 'Punctuator', value: '}', loc: [Object] }
  ]
}

3.1.2 Try 2

const recast = require("recast");
const code = `function add(a, b) {
    return a + b
}`;
// Parse the code into an AST
const ast = recast.parse(code);
console.log(ast);
// An AST can hold a very large code file, so here we only take the
// first statement of the body, which is our add function.
const add = ast.program.body[0];
console.log(add);
{
  program: Script {
    type: 'Program',
    body: [ [FunctionDeclaration] ],
    sourceType: 'script',
    loc: {
      start: [Object],
      end: [Object],
      lines: [Lines],
      indent: 0,
      tokens: [Array]
    },
    errors: []
  },
  name: null,
  loc: {
    start: { line: 1, column: 0, token: 0 },
    end: { line: 3, column: 1, token: 13 },
    lines: Lines {
      infos: [Array],
      mappings: [],
      cachedSourceMap: null,
      cachedTabWidth: undefined,
      length: 3,
      name: null
    },
    indent: 0,
    tokens: [
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object]
    ]
  },
  type: 'File',
  comments: null,
  tokens: [
    { type: 'Keyword', value: 'function', loc: [Object] },
    { type: 'Identifier', value: 'add', loc: [Object] },
    { type: 'Punctuator', value: '(', loc: [Object] },
    { type: 'Identifier', value: 'a', loc: [Object] },
    { type: 'Punctuator', value: ',', loc: [Object] },
    { type: 'Identifier', value: 'b', loc: [Object] },
    { type: 'Punctuator', value: ')', loc: [Object] },
    { type: 'Punctuator', value: '{', loc: [Object] },
    { type: 'Keyword', value: 'return', loc: [Object] },
    { type: 'Identifier', value: 'a', loc: [Object] },
    { type: 'Punctuator', value: '+', loc: [Object] },
    { type: 'Identifier', value: 'b', loc: [Object] },
    { type: 'Punctuator', value: '}', loc: [Object] }
  ]
}
FunctionDeclaration {
  type: 'FunctionDeclaration',
  id: Identifier {
    type: 'Identifier',
    name: 'add',
    loc: {
      start: [Object],
      end: [Object],
      lines: [Lines],
      tokens: [Array],
      indent: 0
    }
  },
  params: [
    Identifier { type: 'Identifier', name: 'a', loc: [Object] },
    Identifier { type: 'Identifier', name: 'b', loc: [Object] }
  ],
  body: BlockStatement {
    type: 'BlockStatement',
    body: [ [ReturnStatement] ],
    loc: {
      start: [Object],
      end: [Object],
      lines: [Lines],
      tokens: [Array],
      indent: 0
    }
  },
  generator: false,
  expression: false,
  async: false,
  loc: {
    start: { line: 1, column: 0, token: 0 },
    end: { line: 3, column: 1, token: 13 },
    lines: Lines {
      infos: [Array],
      mappings: [],
      cachedSourceMap: null,
      cachedTabWidth: undefined,
      length: 3,
      name: null
    },
    tokens: [
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object]
    ],
    indent: 0
  }
}

3.2 Reassembling with recast.types.builders

As the simplest example, we want to take the earlier

function add(a, b){... }

declaration and change it into an anonymous function expression assigned to a variable

const add = function(a ,b){... }

How to reassemble it

  1. Create a VariableDeclaration object whose kind is const and whose content is a VariableDeclarator object that we are about to create.

  2. Create a VariableDeclarator with add.id on the left and, on the right, the FunctionExpression object that we are about to create.

  3. Create a FunctionExpression. Like the three components mentioned earlier, it has id, params, and body: id is null because the function is anonymous, params is add.params, and body is add.body.

This creates the AST object for const add = function(){}.

/*
 * @file: description
 * @author: longjing03
 * @Date: 2021-11-09 10:57:51
 * @LastEditors: longjing03
 * @LastEditTime: 2021-11-09 13:57:33
 */
const recast = require("recast");
const code = `function add(a, b) {
    return a + b
}`;
// Parse the code into an AST
const ast = recast.parse(code);
const add = ast.program.body[0];
// Import the builders for variable declarations, variable declarators, and function expressions
const { variableDeclaration, variableDeclarator, functionExpression } = recast.types.builders;
// Put the prepared components into the mold and assemble them back into the original AST object
ast.program.body[0] = variableDeclaration("const", [
    variableDeclarator(add.id, functionExpression(
        null, // A null id makes the function expression anonymous
        add.params,
        add.body
    ))
]);
// Convert the AST object back into readable JS code
const output = recast.print(ast).code;
console.log(output);
const add = function(a, b) {
    return a + b
};

recast.print is the reverse of recast.parse. Specifically:

recast.print(recast.parse(source)).code === source

recast.prettyPrint prints a prettily formatted version of the code

const output = recast.prettyPrint(ast, {tabWidth: 2}).code

console.log(output)
const add = function(a, b) {
  return a + b;
};

Now you can generate code from the AST tree

3.3 Advanced practice: modifying JS files from the command line

In addition to parse/print/builders, recast has three main features:

  • run: reads a JS file from the command line and converts it to an AST for processing.
  • TNT (recast.types.namedTypes): verifies the type of an AST object using assert() and check().
  • visit: traverses the AST to pick out the AST objects you want and modify them.
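
Until that part is written, the idea behind the type checks and visit can be sketched with hand-rolled stand-ins. Everything below (makeType, namedTypes, visit) is my own simplified imitation and not recast's implementation; in particular, the real recast.visit hands each visitor method a path object rather than the bare node.

```javascript
// Hand-rolled stand-in for a named type with check() and assert().
function makeType(name) {
  return {
    // check() answers true or false.
    check: (node) => Boolean(node) && node.type === name,
    // assert() throws when the node is not of the expected type.
    assert(node) {
      if (!this.check(node)) throw new TypeError(`Expected ${name}, got ${node && node.type}`);
      return true;
    },
  };
}

const namedTypes = {
  Identifier: makeType('Identifier'),
  FunctionDeclaration: makeType('FunctionDeclaration'),
};

// Simplified visit: calls visitor["visit" + node.type] for every node, depth first.
function visit(node, visitor) {
  if (!node || typeof node.type !== 'string') return;
  const method = visitor[`visit${node.type}`];
  if (method) method(node);
  for (const value of Object.values(node)) {
    if (Array.isArray(value)) value.forEach((child) => visit(child, visitor));
    else if (value && typeof value === 'object') visit(value, visitor);
  }
}

// Hand-written AST for: function add(a, b) { return a + b }
const addAst = {
  type: 'FunctionDeclaration',
  id: { type: 'Identifier', name: 'add' },
  params: [{ type: 'Identifier', name: 'a' }, { type: 'Identifier', name: 'b' }],
  body: {
    type: 'BlockStatement',
    body: [{
      type: 'ReturnStatement',
      argument: {
        type: 'BinaryExpression',
        operator: '+',
        left: { type: 'Identifier', name: 'a' },
        right: { type: 'Identifier', name: 'b' },
      },
    }],
  },
};

namedTypes.FunctionDeclaration.assert(addAst);   // passes silently
console.log(namedTypes.Identifier.check(addAst)); // false

const names = [];
visit(addAst, {
  visitIdentifier(node) {
    names.push(node.name);
  },
});
console.log(names);
```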

To be continued…
