The abstract syntax tree (AST) is basic and important knowledge, yet there is almost no documentation on it in Chinese.
This article walks you through the AST from the ground up and shows what it can do by building and publishing a small front-end tool.
JavaScript is like a machine that runs so smoothly we can build almost anything we want with it.
We know a great deal about the JavaScript ecosystem, but we often overlook JavaScript itself. What parts keep this machine running?
You may rarely touch the AST in day-to-day business code. But once you are no longer content to be just an engineer and want to write front-end automation tools like webpack or vue-cli, or need to modify source code in bulk, you must know the AST. It is a powerful tool, and it genuinely helps you understand the essence of JavaScript.
In the JavaScript world, you can think of the abstract syntax tree (AST) as the lowest layer. Anything further down belongs to the "dark magic" realm of transformation and compilation.
Taking JavaScript apart for the first time
As children, the moment we were handed a screwdriver and a machine was the start of one of life's most memorable, dreamlike experiences:
We took the machine apart into small parts, gears and screws, all fitted together by clever mechanical principles…
When we put it back together in a different way and the machine started running again, the world looked brand new.
By parsing code into an abstract syntax tree, we can see how the JavaScript machine works as if it were a childhood toy, and reassemble it however we like.
Now let's take apart a simple add function:
function add(a, b) {
  return a + b
}
First of all, the syntax block we are holding is a FunctionDeclaration object.
Take it apart and it breaks into three pieces:
- An id, which is its name: add
- Two params, which are its parameters: [a, b]
- A body, which is everything inside the curly braces
add is a basic Identifier object that serves as the function's unique identifier, like a person's name.
{
  name: 'add',
  type: 'Identifier',
  ...
}
Taking params apart gives an array of two Identifier objects. They cannot be taken apart any further.
[
  {
    name: 'a',
    type: 'Identifier',
    ...
  },
  {
    name: 'b',
    type: 'Identifier',
    ...
  }
]
The body is a BlockStatement object, which represents { return a + b }.
Open the BlockStatement and inside is a ReturnStatement object, which represents return a + b.
Open the ReturnStatement and inside is a BinaryExpression object, which represents a + b.
Open the BinaryExpression and it splits into three parts: left, operator and right.
- operator is +
- left holds the Identifier object a
- right holds the Identifier object b
In this way we have taken a simple add function completely apart and can represent it graphically.
Look! An abstract syntax tree really is a standard tree structure.
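To make the tree shape concrete, here is a minimal, dependency-free sketch that writes the pieces described above as one plain JavaScript object. Only the fields mentioned in the text are included (a real parser adds more, such as location data), and outline() is a hypothetical helper written for this illustration, not part of any library.

```javascript
// Hand-written tree for: function add(a, b) { return a + b }
// Only the fields discussed in the text are included.
const addAst = {
  type: "FunctionDeclaration",
  id: { type: "Identifier", name: "add" },
  params: [
    { type: "Identifier", name: "a" },
    { type: "Identifier", name: "b" },
  ],
  body: {
    type: "BlockStatement",
    body: [
      {
        type: "ReturnStatement",
        argument: {
          type: "BinaryExpression",
          operator: "+",
          left: { type: "Identifier", name: "a" },
          right: { type: "Identifier", name: "b" },
        },
      },
    ],
  },
};

// outline() (hypothetical helper) lists each node type, indented by depth.
function outline(node, depth = 0) {
  const detail = node.name || node.operator;
  const label = detail ? `${node.type} (${detail})` : node.type;
  const lines = ["  ".repeat(depth) + label];
  for (const value of Object.values(node)) {
    const children = Array.isArray(value) ? value : [value];
    for (const child of children) {
      // Anything with a string "type" field is treated as a child node.
      if (child && typeof child === "object" && typeof child.type === "string") {
        lines.push(...outline(child, depth + 1));
      }
    }
  }
  return lines;
}

console.log(outline(addAst).join("\n"));
```

Each extra level of indentation in the printed outline corresponds to one step of "opening" a node in the walkthrough above.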
So where can we look up the specifications for the Identifier, BlockStatement, ReturnStatement and BinaryExpression objects mentioned above?
Please see the AST object documentation.
A screwdriver for the AST: Recast
Enter the command:
npm i recast -S
Now you have a screwdriver for manipulating the syntax tree.
Next, you can use the screwdriver in any JS file. Let's create parse.js:
parse.js
// Here is your "screwdriver": recast
const recast = require("recast");

// Your "machine": a piece of code.
// The formatting is deliberately odd, to test whether the code structure is preserved.
const code = `
  function add(a, b) {
    return a +
      // something strange got mixed in
      b
  }
`

// Parse the machine with the screwdriver
const ast = recast.parse(code);

// We only need the first statement in the program body, which is the add function
const add = ast.program.body[0]

console.log(add)
Run node parse.js to see the structure of the add function. It matches the description above, and its properties can be looked up in the AST object documentation:
FunctionDeclaration {
  type: 'FunctionDeclaration',
  id: ...
  params: ...
  body: ...
}
You can also use console.log to look deeper inside, for example:
console.log(add.params[0])
console.log(add.body.body[0].argument.left)
recast.types.builders: making molds
Taking a machine apart and putting it back exactly as it was is no great skill.
Being able to modify it once it is apart is what really counts.
recast.types.builders provides a set of "molds" that let you easily assemble new machines.
As the simplest example, let's change the earlier declaration function add(a, b){…} into an anonymous function expression assigned to a constant: const add = function(a, b){…}
How do we modify it?
First, create a VariableDeclaration object whose kind is const and whose content is a VariableDeclarator object, created in the next step.
Second, create the VariableDeclarator, with add.id on the left and a FunctionExpression object, created in the next step, on the right.
Third, create the FunctionExpression. Like the three components described earlier, it has id, params and body: id is null because the function is anonymous, params reuses add.params, and body reuses add.body.
This produces the AST object for const add = function(){}.
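The three steps can be tried without installing anything. The sketch below uses hypothetical stand-ins for the three builders, each returning a plain object in the usual ESTree shape. They only mimic the shape of what real builders produce and do none of the validation a real library would, and the body of add is left empty for brevity.

```javascript
// Hypothetical stand-ins for three builders; each returns a plain ESTree-shaped object.
const variableDeclaration = (kind, declarations) => ({ type: "VariableDeclaration", kind, declarations });
const variableDeclarator = (id, init) => ({ type: "VariableDeclarator", id, init });
const functionExpression = (id, params, body) => ({ type: "FunctionExpression", id, params, body });

// The pieces taken from the original declaration: function add(a, b) { ... }
const add = {
  type: "FunctionDeclaration",
  id: { type: "Identifier", name: "add" },
  params: [{ type: "Identifier", name: "a" }, { type: "Identifier", name: "b" }],
  body: { type: "BlockStatement", body: [] }, // body contents omitted for brevity
};

// Assemble from the inside out: step 3, then step 2, then step 1.
const anonymous = functionExpression(null, add.params, add.body); // id is null: anonymous
const declarator = variableDeclarator(add.id, anonymous);         // add = function...
const declaration = variableDeclaration("const", [declarator]);   // const add = function...

console.log(JSON.stringify(declaration, null, 2));
```

Note that params and body are reused from the original node rather than copied, which is exactly what the third step describes.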
Add the following code after the previous parse.js code:
// Pull in the molds for variable declarations, variable declarators and function expressions
const { variableDeclaration, variableDeclarator, functionExpression } = recast.types.builders

// Put the prepared components into the molds and assemble them back into the original AST object.
ast.program.body[0] = variableDeclaration("const", [
  variableDeclarator(add.id, functionExpression(
    null, // a null id makes the function expression anonymous
    add.params,
    add.body
  ))
]);

// Turn the AST object back into readable code
const output = recast.print(ast).code;

console.log(output)
As you can see, it prints:
const add = function(a, b) {
  return a +
    // something strange got mixed in
    b
};
The last line,
const output = recast.print(ast).code;
is the reverse of recast.parse:
recast.print(recast.parse(source)).code === source
The function body is printed in its "original" form, with even the comment left unchanged.
We can also print a nicely formatted snippet:
const output = recast.prettyPrint(ast, { tabWidth: 2 }).code
The output is:
const add = function(a, b) {
  return a + b;
};
By now, do you have the feeling that "I can generate any JS code from an AST"?
Let me tell you: it is not an illusion.
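To show why generating code from a tree is not magic, here is a deliberately tiny printer written from scratch. It is a sketch that covers only the node types used so far in this article, and generate() is a hypothetical function, not Recast's printer, which handles the whole language and preserves original formatting.

```javascript
// generate() is a hypothetical, minimal printer: one case per node type.
function generate(node, indent = "") {
  switch (node.type) {
    case "Identifier":
      return node.name;
    case "BinaryExpression":
      return `${generate(node.left)} ${node.operator} ${generate(node.right)}`;
    case "ReturnStatement":
      return `${indent}return ${generate(node.argument)};`;
    case "BlockStatement": {
      // Statements inside a block are indented two more spaces.
      const inner = node.body.map((stmt) => generate(stmt, indent + "  ")).join("\n");
      return `{\n${inner}\n${indent}}`;
    }
    case "FunctionExpression": {
      const name = node.id ? " " + generate(node.id) : "";
      const params = node.params.map((p) => generate(p)).join(", ");
      return `function${name}(${params}) ${generate(node.body, indent)}`;
    }
    case "VariableDeclaration": {
      const decls = node.declarations
        .map((d) => `${generate(d.id)} = ${generate(d.init, indent)}`)
        .join(", ");
      return `${indent}${node.kind} ${decls};`;
    }
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

const id = (name) => ({ type: "Identifier", name });
// Tree for: const add = function(a, b) { return a + b; };
const tree = {
  type: "VariableDeclaration",
  kind: "const",
  declarations: [{
    type: "VariableDeclarator",
    id: id("add"),
    init: {
      type: "FunctionExpression",
      id: null,
      params: [id("a"), id("b")],
      body: {
        type: "BlockStatement",
        body: [{
          type: "ReturnStatement",
          argument: { type: "BinaryExpression", operator: "+", left: id("a"), right: id("b") },
        }],
      },
    },
  }],
};

console.log(generate(tree));
```

Every node type knows how to print itself from its children, so any tree built from supported nodes can be turned back into source text.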
Going further in practice: modifying JS files from the command line
Besides parse/print/builders, Recast has three more main features:
- run: reads a JS file named on the command line and converts it into an AST for processing.
- tnt: recast.types.namedTypes, whose assert() and check() verify the type of an AST object.
- visit: traverses the AST to find the AST objects you want and modify them.
We will learn the whole Recast toolset through a series of small exercises.
Create a sample file, say demo.js:
demo.js
function add(a, b) {
  return a + b
}

function sub(a, b) {
  return a - b
}

function commonDivision(a, b) {
  while (b !== 0) {
    if (a > b) {
      a = sub(a, b)
    } else {
      b = sub(b, a)
    }
  }
  return a
}
recast.run: reading a file from the command line
Create a new file named read.js with the following content:
read.js
#!/usr/bin/env node
const recast = require('recast')

recast.run(function(ast, printSource) {
  printSource(ast)
})
On the command line, enter:
node read demo.js
The contents of the JS file are printed to the console.
So node read can read demo.js and convert its contents into an AST object.
It also provides a printSource function that turns an AST back into source code at any time, which is handy for debugging.
recast.visit: traversing AST nodes
read.js
#!/usr/bin/env node
const recast = require('recast')

recast.run(function(ast, printSource) {
  recast.visit(ast, {
    visitExpressionStatement: function({ node }) {
      console.log(node)
      return false
    }
  });
});
recast.visit traverses the nodes of the AST object one by one.
Note:
- To work on function declarations, traverse with visitFunctionDeclaration; to work on expression statements such as assignments, use visitExpressionStatement. Any object defined in the AST object documentation can be traversed by prefixing its name with visit.
- The AST object is available as the node property of the argument passed to the visitor function.
- Every visitor function must either end with return false or use the form below; otherwise an error is reported:
#!/usr/bin/env node
const recast = require('recast')

recast.run(function(ast, printSource) {
  recast.visit(ast, {
    visitExpressionStatement: function(path) {
      const node = path.node
      printSource(node)
      this.traverse(path)
    }
  })
});
When debugging, use console.log(node) to output the AST object.
To output the source code for the AST object, use printSource(node).
Run node read demo.js on the command line to test.
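The difference between stopping at a node and continuing into its children is easier to see in a miniature. The sketch below is a hypothetical, from-scratch visit() over plain objects, not Recast's implementation. One assumption differs from Recast and is worth stating loudly: here, descending is the default when the visitor does not return false, whereas in Recast you ask for it explicitly with this.traverse(path).

```javascript
// visit() is a hypothetical miniature of the visitor idea.
// A visitor method named "visit" + node.type is called for each matching node.
// Returning false means "do not descend into this node's children".
function visit(node, visitors) {
  if (!node || typeof node.type !== "string") return;
  const method = visitors["visit" + node.type];
  if (method && method(node) === false) return;
  for (const value of Object.values(node)) {
    const children = Array.isArray(value) ? value : [value];
    for (const child of children) {
      if (child && typeof child === "object") visit(child, visitors);
    }
  }
}

const id = (name) => ({ type: "Identifier", name });
// Tree for the nested call: sub(add(a, b), c)
const call = {
  type: "CallExpression",
  callee: id("sub"),
  arguments: [
    { type: "CallExpression", callee: id("add"), arguments: [id("a"), id("b")] },
    id("c"),
  ],
};

// Returning false: traversal stops at the outer call.
const shallow = [];
visit(call, {
  visitCallExpression(node) {
    shallow.push(node.callee.name);
    return false;
  },
});

// Not returning false: traversal continues into the arguments.
const deep = [];
visit(call, {
  visitCallExpression(node) {
    deep.push(node.callee.name);
  },
});

console.log(shallow); // only the outer call is seen
console.log(deep);    // the nested call is seen too
```

This is why the choice matters in practice: a visitor that always returns false never sees nodes nested inside the ones it has already handled.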
Add the line
#!/usr/bin/env node
at the top of every file that uses recast.run(). We will come back to what it means later.
TNT: checking the type of an AST object
TNT, that is recast.types.namedTypes, is as explosive as its name suggests. It is used to check whether an AST object is of a given type.
TNT.Node.assert() is like an explosive buried in the machine: when the machine is not working properly (the type does not match), it blows the machine up (the program exits with an error).
TNT.Node.check() tests whether the type matches and returns true or false.
Node can be replaced by any AST object type, for example TNT.ExpressionStatement.check() or TNT.FunctionDeclaration.assert().
read.js
#!/usr/bin/env node
const recast = require("recast");
const TNT = recast.types.namedTypes

recast.run(function(ast, printSource) {
  recast.visit(ast, {
    visitExpressionStatement: function(path) {
      const node = path.value
      // Check whether the node is an ExpressionStatement; if it is, print a line.
      if (TNT.ExpressionStatement.check(node)) {
        console.log('this is an ExpressionStatement')
      }
      this.traverse(path);
    }
  });
});
read.js
#!/usr/bin/env node
const recast = require("recast");
const TNT = recast.types.namedTypes

recast.run(function(ast, printSource) {
  recast.visit(ast, {
    visitExpressionStatement: function(path) {
      const node = path.node
      // Assert that the node is an ExpressionStatement; a mismatch throws and the program exits with an error.
      TNT.ExpressionStatement.assert(node)
      this.traverse(path);
    }
  });
});
Run node read demo.js on the command line to test.
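The contrast between the two calls can be sketched without any library: check answers a yes/no question, while assert stays silent on success and throws on failure. defineType() below is a hypothetical helper that imitates this pair on plain objects by comparing the type field only. It is not how recast.types.namedTypes is implemented.

```javascript
// defineType() (hypothetical) builds a check/assert pair for one node type.
function defineType(typeName) {
  return {
    // check: answers the question and never throws
    check(node) {
      return Boolean(node) && node.type === typeName;
    },
    // assert: stays silent on a match, throws on a mismatch
    assert(node) {
      if (!this.check(node)) {
        throw new TypeError(`Expected ${typeName}, got ${node && node.type}`);
      }
      return true;
    },
  };
}

const ExpressionStatement = defineType("ExpressionStatement");
const node = { type: "ReturnStatement", argument: null };

console.log(ExpressionStatement.check(node)); // false

try {
  ExpressionStatement.assert(node);
} catch (err) {
  console.log(err.message); // Expected ExpressionStatement, got ReturnStatement
}
```

Use the check style when a mismatch is an ordinary case to branch on, and the assert style when a mismatch means the program should not continue.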
In practice: use the AST to modify source code and export every function
exportific.js
Now we want to rewrite every function in demo.js
into a form that can be exported. For example,
function add (a, b) {
  return a + b
}
should become
exports.add = (a, b) => {
  return a + b
}
Instead of the clumsy approach of reading the file with fs.read, replacing text with regular expressions and writing it back with fs.write, we can solve the problem elegantly with the AST.
Consult the AST object documentation.
First, we use builders to create an arrow function out of thin air:
exportific.js
#!/usr/bin/env node
const recast = require("recast");
const {
  identifier: id,
  expressionStatement,
  memberExpression,
  assignmentExpression,
  arrowFunctionExpression,
  blockStatement
} = recast.types.builders

recast.run(function(ast, printSource) {
  // An empty block scope: {}
  console.log('\n\nstep1:')
  printSource(blockStatement([]))

  // An arrow function: ()=>{}
  console.log('\n\nstep2:')
  printSource(arrowFunctionExpression([], blockStatement([])))

  // Assign the arrow function to add: add = ()=>{}
  console.log('\n\nstep3:')
  printSource(assignmentExpression('=', id('add'), arrowFunctionExpression([], blockStatement([]))))

  // Assign the arrow function to exports.add: exports.add = ()=>{}
  console.log('\n\nstep4:')
  printSource(assignmentExpression('=', memberExpression(id('exports'), id('add')),
    arrowFunctionExpression([], blockStatement([]))))
});
The code shows how we work step by step towards exports.add = ()=>{} to arrive at the concrete AST structure.
Run node exportific demo.js to see the results.
Then, in the resulting expression, replace id('add') with the name of each traversed function, replace the empty parameter list with that function's parameters, and replace blockStatement([]) with that function's block body, and every function is successfully rewritten!
We also need to note that inside commonDivision, the call to sub should be rewritten as exports.sub.
exportific.js
#!/usr/bin/env node
const recast = require("recast");
const {
  identifier: id,
  expressionStatement,
  memberExpression,
  assignmentExpression,
  arrowFunctionExpression
} = recast.types.builders

recast.run(function (ast, printSource) {
  // Names of all the functions declared in the file
  let funcIds = []
  recast.types.visit(ast, {
    // Visit every function declaration
    visitFunctionDeclaration(path) {
      // Get the name, parameters and block body of the visited function
      const node = path.node
      const funcName = node.id
      const params = node.params
      const body = node.body

      // Save the function name for the second pass
      funcIds.push(funcName.name)
      // This is the AST structure derived in the previous step
      const rep = expressionStatement(assignmentExpression('=', memberExpression(id('exports'), funcName),
        arrowFunctionExpression(params, body)))
      // Replace the original node with the new structure
      path.replace(rep)
      // Stop traversing
      return false
    }
  })

  recast.types.visit(ast, {
    // Visit every function call
    visitCallExpression(path) {
      const node = path.node;
      // If the called function is one of those declared in this file, call it through exports
      if (funcIds.includes(node.callee.name)) {
        node.callee = memberExpression(id('exports'), node.callee)
      }
      // Stop traversing
      return false
    }
  })

  printSource(ast)
})
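The reason two passes are needed (collect every declared name first, then redirect calls) can be isolated in a dependency-free sketch. It works on a hand-built, heavily simplified tree and only rewrites the calls, leaving the declarations alone. walk() and exportsDot() are hypothetical helpers for this illustration.

```javascript
// walk() (hypothetical helper) calls fn on every node in the tree.
function walk(node, fn) {
  if (!node || typeof node !== "object") return;
  if (Array.isArray(node)) {
    node.forEach((child) => walk(child, fn));
    return;
  }
  if (typeof node.type === "string") fn(node);
  Object.values(node).forEach((child) => walk(child, fn));
}

const id = (name) => ({ type: "Identifier", name });
// exportsDot("sub") builds the tree for: exports.sub
const exportsDot = (name) => ({
  type: "MemberExpression",
  object: id("exports"),
  property: id(name),
  computed: false,
});

// Simplified program: function sub() {}  function commonDivision() { sub() }
const program = {
  type: "Program",
  body: [
    { type: "FunctionDeclaration", id: id("sub"), params: [], body: { type: "BlockStatement", body: [] } },
    {
      type: "FunctionDeclaration",
      id: id("commonDivision"),
      params: [],
      body: {
        type: "BlockStatement",
        body: [
          { type: "ExpressionStatement", expression: { type: "CallExpression", callee: id("sub"), arguments: [] } },
        ],
      },
    },
  ],
};

// Pass 1: collect the names of every declared function.
const funcIds = [];
walk(program, (node) => {
  if (node.type === "FunctionDeclaration") funcIds.push(node.id.name);
});

// Pass 2: point calls to those functions at exports.<name>.
walk(program, (node) => {
  if (node.type === "CallExpression" && funcIds.includes(node.callee.name)) {
    node.callee = exportsDot(node.callee.name);
  }
});

console.log(funcIds);
console.log(program.body[1].body.body[0].expression.callee);
```

Collecting names first means a call is redirected even when the function it targets is declared later in the file than the call itself.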
All in one go: exportific, a minimal front-end tool
Everything above is still at the theoretical stage.
With a little rewriting, though, we can use Recast to produce a source code editing tool called exportific.
The code below makes two small changes:
- Adds a --help option and a --rewrite mode. With --rewrite the file is overwritten directly; by default the result is written to *.export.js.
- Replaces printSource(ast) with writeASTFile(ast, filename, rewriteMode).
exportific.js
#!/usr/bin/env node
const recast = require("recast");
const {
  identifier: id,
  expressionStatement,
  memberExpression,
  assignmentExpression,
  arrowFunctionExpression
} = recast.types.builders

const fs = require('fs')
const path = require('path')
// Take the command-line arguments
const options = process.argv.slice(2)

// If there are no arguments, or -h or --help is given, print the help text
if (options.length === 0 || options.includes('-h') || options.includes('--help')) {
  console.log(`
    Using CommonJS rules, change all functions in a .js file to exported form.

    Options: -r or --rewrite overwrites the original file directly
    `)
  process.exit(0)
}

// rewriteMode is true whenever -r or --rewrite is present
let rewriteMode = options.includes('-r') || options.includes('--rewrite')

// Get the file name
const clearFileArg = options.filter((item) => {
  return !['-r', '--rewrite', '-h', '--help'].includes(item)
})

let filename = clearFileArg[0]

const writeASTFile = function(ast, filename, rewriteMode) {
  const newCode = recast.print(ast).code
  if (!rewriteMode) {
    // In non-overwrite mode, write the new file to *.export.js
    filename = filename.split('.').slice(0, -1).concat(['export', 'js']).join('.')
  }
  // Write the new code into the file
  fs.writeFileSync(path.join(process.cwd(), filename), newCode)
}

recast.run(function (ast, printSource) {
  let funcIds = []
  recast.types.visit(ast, {
    visitFunctionDeclaration(path) {
      // Get the name, parameters and block body of the visited function
      const node = path.node
      const funcName = node.id
      const params = node.params
      const body = node.body

      funcIds.push(funcName.name)
      const rep = expressionStatement(assignmentExpression('=', memberExpression(id('exports'), funcName),
        arrowFunctionExpression(params, body)))
      path.replace(rep)
      return false
    }
  })

  recast.types.visit(ast, {
    visitCallExpression(path) {
      const node = path.node;
      if (funcIds.includes(node.callee.name)) {
        node.callee = memberExpression(id('exports'), node.callee)
      }
      return false
    }
  })

  writeASTFile(ast, filename, rewriteMode)
})
Now try it:
node exportific demo.js
You will find demo.export.js in the current directory.
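The option handling and the output file name are the two pieces of this tool that do not involve the AST at all, so they can be tried on their own. In the sketch below, parseOptions() and exportName() are hypothetical helper names. It also assumes the extension is dropped with slice(0, -1), so that only the last extension is replaced.

```javascript
// parseOptions() and exportName() are hypothetical helpers that isolate the
// two pieces of logic so they can be tried without touching the file system.
const FLAGS = ["-r", "--rewrite", "-h", "--help"];

function parseOptions(argv) {
  return {
    // Help is shown when nothing is passed or when -h / --help is present.
    help: argv.length === 0 || argv.includes("-h") || argv.includes("--help"),
    rewrite: argv.includes("-r") || argv.includes("--rewrite"),
    // The first argument that is not a flag is treated as the file name.
    filename: argv.filter((arg) => !FLAGS.includes(arg))[0],
  };
}

// demo.js -> demo.export.js (only the last extension is replaced)
function exportName(filename) {
  return filename.split(".").slice(0, -1).concat(["export", "js"]).join(".");
}

console.log(parseOptions(["demo.js", "-r"]));
console.log(exportName("demo.js"));
console.log(exportName("lib.utils.js"));
```

Keeping these as small pure functions makes the file-naming rule easy to check before any file is actually written.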
Publishing to npm
Edit the package.json file:
{
  "name": "exportific",
  "version": "0.0.1",
  "description": "exportific",
  "main": "exportific.js",
  "bin": {
    "exportific": "./exportific.js"
  },
  "keywords": [],
  "author": "wanthering",
  "license": "ISC",
  "dependencies": {
    "recast": "^0.15.3"
  }
}
Note the bin field: it maps the global exportific command to exportific.js in the current directory.
From then on, whenever you want to export the functions of a JS file, just run exportific xxx.js.
That only works locally. To share this little front-end tool with everyone, all you need to do is publish it as an npm package.
Also, be sure the following line is at the top of exportific.js:
#!/usr/bin/env node
Otherwise an error will be reported when the command is used.
Next, let's publish the npm package for real!
If you already have an npm account, log in with npm login.
If you don't have one yet, signing up at www.npmjs.com/signup is very easy.
Then run npm publish.
With no tedious steps and no review at all, you have published a handy front-end tool, exportific. Anyone can install it globally with
npm i exportific -g
Tip: when following this tutorial, do not publish under the same name as my package; change the package name.
Conclusion
We are all familiar with JavaScript, but seen through the AST, even the most ordinary JS statements take on an intricate beauty. With it you can build any JavaScript code in bulk!
When you were a child, the world was full of novel toys, and even the most ordinary things were treasures to you. Today, programming languages are the big toys in your hands, and bits and pieces of AST objects build the online world we live in.
So I have to say that being a software engineer is a happy job. The teenager from that afternoon still lives in your heart, there are always countless new things waiting to be discovered, and countless dreams waiting to be built.
Github address: github.com/wanthering….