Preface

First of all, I should explain that this article is a translation of “AST for JavaScript Developers”. I seldom take the time to translate an article, because reading and rewriting one is tiring work. But this article is so good that I couldn’t help sharing it with you.

My blog: github.com/CodeLittleP… I update it with a variety of articles from time to time, and I hope you will support it.

OK, let’s get straight to the point.

Why AST (Abstract Syntax Tree)?

If you look at the devDependencies of any major project today, you’ll see numerous tools created in recent years. To summarize, we have JavaScript transpilation, code minification, CSS preprocessors, ESLint, Prettier, and so on. Many of these JS modules are never used in production, but they play an important role in our development process. And all of these tools are, one way or another, built on the shoulders of a giant: the AST.


Let’s set a small goal: we will start by explaining what an AST is and move on to how to build one from regular code. We will briefly touch on some of the most popular use cases and tools based on AST processing. Finally, I plan to talk about my js2flowchart project, which is a good demo of using an AST. OK, let’s get started.

What is an AST (Abstract Syntax Tree)?

An AST is a hierarchical representation of a program that presents the structure of the source code according to the grammar of the programming language. Each AST node corresponds to an item of the source code.

The official definition can leave many readers stunned, so let’s look at an example:

This is a simplified picture. In fact, a real AST carries more information in each node, but that’s the general idea. From plain text we get data in a tree structure, and each item in the code corresponds to a node in the tree.
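To make the idea of "code as a tree of data" concrete, here is a small hand-written sketch. It assumes node names that loosely follow the ESTree convention and leaves out everything a real parser would add (source positions, raw text, and so on). The helper listNodeTypes is a hypothetical name used only for this illustration.

```javascript
// Hand-written, simplified AST for the source text:  const answer = 6 * 7;
// Node names loosely follow ESTree; a real parser adds location data
// and more fields to every node.
const ast = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "const",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "answer" },
          init: {
            type: "BinaryExpression",
            operator: "*",
            left: { type: "Literal", value: 6 },
            right: { type: "Literal", value: 7 },
          },
        },
      ],
    },
  ],
};

// Collect node types depth-first, indented by depth, to show that the
// data really is a tree. (Hypothetical helper, not part of any library.)
function listNodeTypes(node, depth = 0, out = []) {
  out.push("  ".repeat(depth) + node.type);
  for (const value of Object.values(node)) {
    const children = Array.isArray(value) ? value : [value];
    for (const child of children) {
      if (child && typeof child === "object" && typeof child.type === "string") {
        listNodeTypes(child, depth + 1, out);
      }
    }
  }
  return out;
}

const nodeTypes = listNodeTypes(ast);
console.log(nodeTypes.join("\n"));
```

Every piece of the one-line program ends up as its own node: the declaration, the variable name, the multiplication, and the two numbers.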

So how do you get an AST from plain text? Well, we know that compilers already do this, so let’s look at what a typical compiler does.

Building a whole compiler takes real effort, but fortunately we don’t have to go through every compiler stage down to translating a high-level language into machine code. We only need to focus on lexical analysis and syntax analysis. These two steps are the key to generating an AST from your code.

The first step, lexical analysis, is performed by the scanner (also called the lexer or tokenizer). It reads our code as a stream of characters and groups them into tokens according to predefined rules. At the same time, it removes whitespace, comments, and so on. In the end, the entire code is split into a flat list of tokens (a one-dimensional array).

The lexical analyzer reads the source code character by character, which is why the process is figuratively called scanning. When it encounters whitespace, an operator, or a special symbol, it decides that the current word is complete.
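The scanning step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any particular parser implements it: the token type names (Keyword, Identifier, Number, Punctuator), the tiny keyword list, and the function name tokenize are all chosen for this example, and only a small subset of JavaScript is recognized.

```javascript
// Keywords known to this toy scanner (an assumption; real JS has many more).
const KEYWORDS = new Set(["const", "let", "var"]);

// Minimal scanner: reads the source character by character and groups
// characters into tokens. Whitespace and // comments are dropped.
function tokenize(source) {
  const tokens = [];
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    if (/\s/.test(ch)) {
      i++; // whitespace ends a word but produces no token
      continue;
    }
    if (ch === "/" && source[i + 1] === "/") {
      while (i < source.length && source[i] !== "\n") i++; // skip line comment
      continue;
    }
    if (/[0-9]/.test(ch)) {
      let value = "";
      while (i < source.length && /[0-9]/.test(source[i])) value += source[i++];
      tokens.push({ type: "Number", value });
      continue;
    }
    if (/[A-Za-z_$]/.test(ch)) {
      let value = "";
      while (i < source.length && /[A-Za-z0-9_$]/.test(source[i])) value += source[i++];
      tokens.push({ type: KEYWORDS.has(value) ? "Keyword" : "Identifier", value });
      continue;
    }
    if ("=+-*/();".includes(ch)) {
      tokens.push({ type: "Punctuator", value: ch });
      i++;
      continue;
    }
    throw new SyntaxError(`Unexpected character "${ch}" at index ${i}`);
  }
  return tokens;
}

const tokens = tokenize("const answer = 6 * 7; // the answer");
console.log(tokens.map((t) => `${t.type}(${t.value})`).join(" "));
```

The result is a flat array: the comment and the spaces are gone, and nothing yet says how the tokens relate to each other. That is the parser’s job.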

The second step, syntax analysis, is performed by the parser. It transforms the flat list of tokens produced by lexical analysis into a tree representation. It also validates the syntax and throws a syntax error if the code is invalid.

While generating the tree, some parsers omit unnecessary tokens (such as redundant parentheses), so the resulting AST is not a 100% match of the source code, but it contains enough to know how to handle it. As an aside, a tree produced by a parser that covers 100% of the code structure is called a Concrete Syntax Tree (CST).
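Here is a rough sketch of the parsing step for arithmetic expressions only. It assumes a recursive-descent approach, which is one common technique and not necessarily what any given JS parser does, and it takes a hand-written token list as input so that the example stands on its own. The token shapes and the function name parse are assumptions made for this illustration.

```javascript
// Tokens as a scanner might produce them for:  (1 + 2) * 3
const tokens = [
  { type: "Punctuator", value: "(" },
  { type: "Number", value: "1" },
  { type: "Punctuator", value: "+" },
  { type: "Number", value: "2" },
  { type: "Punctuator", value: ")" },
  { type: "Punctuator", value: "*" },
  { type: "Number", value: "3" },
];

// Recursive-descent parser for + - * / and parentheses.
function parse(tokenList) {
  let pos = 0;
  const peek = () => tokenList[pos];
  const isPunct = (value) =>
    peek() !== undefined && peek().type === "Punctuator" && peek().value === value;

  // expression := term (("+" | "-") term)*
  function parseExpression() {
    let left = parseTerm();
    while (isPunct("+") || isPunct("-")) {
      const operator = tokenList[pos++].value;
      left = { type: "BinaryExpression", operator, left, right: parseTerm() };
    }
    return left;
  }

  // term := factor (("*" | "/") factor)*
  function parseTerm() {
    let left = parseFactor();
    while (isPunct("*") || isPunct("/")) {
      const operator = tokenList[pos++].value;
      left = { type: "BinaryExpression", operator, left, right: parseFactor() };
    }
    return left;
  }

  // factor := Number | "(" expression ")"
  function parseFactor() {
    const token = peek();
    if (token === undefined) throw new SyntaxError("Unexpected end of input");
    if (token.type === "Number") {
      pos++;
      return { type: "Literal", value: Number(token.value) };
    }
    if (isPunct("(")) {
      pos++;
      const inner = parseExpression();
      if (!isPunct(")")) throw new SyntaxError("Expected ')'");
      pos++;
      return inner; // the parentheses themselves leave no node behind
    }
    throw new SyntaxError(`Unexpected token "${token.value}"`);
  }

  const tree = parseExpression();
  if (pos < tokenList.length) throw new SyntaxError(`Unexpected token "${peek().value}"`);
  return tree;
}

const ast = parse(tokens);
console.log(JSON.stringify(ast, null, 2));
```

Notice that the parentheses from the token list do not appear anywhere in the tree. Their meaning survives only in the shape of the tree: the addition sits below the multiplication. A missing closing parenthesis, on the other hand, makes the parser throw a SyntaxError.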

What we end up with

Want to learn more about compilers? the-super-tiny-compiler is a good project: about 200 lines of code, with almost every line annotated.

Want to create your own programming language? LangSandbox is an even better project. It demonstrates how to create a programming language. Of course, material on designing programming languages is easy to find, so this project goes further. Unlike the-super-tiny-compiler, which converts Lisp into C, this project lets you write your own language, compile it into C or machine language, and run it.

Can I generate an AST directly with a third-party library? Of course you can! There are plenty of third-party libraries to choose from. You can go to AST Explorer and pick your favorite. AST Explorer is a great site where you can play with ASTs online, and it offers AST libraries for many languages besides JS.

I have to highlight one third-party library that I think is great, called Babylon (published as @babel/parser since Babel 7).

It’s used in the famous Babel, and maybe that’s why it’s so popular. Thanks to the Babel project, we can expect it to keep up with the times and support the latest JS features, so we can use it without worrying about massive refactoring when new versions of JS arrive. In addition, its API is simple and easy to use.

Ok, now that you know how to generate an AST, let’s move on and take a look at the real world use cases.

For the first use case, I want to talk about code transformation (transpiling). That’s right: Babel.

Babel is not just a ‘tool for having ES6 support’. Well, it is that, but that is far from all it does.

Babel is often associated with support for ES6/7/8, and in fact that’s why most of us use it. However, that support is just one set of plug-ins. We can also use Babel for code minification, React-related syntax transforms (e.g. JSX), Flow plug-ins, and so on.

Babel is a JavaScript compiler. At a high level, it processes code in three stages: parsing, transforming, and generation. We give Babel some JavaScript code, it modifies the code, and it returns newly generated code. So how does it change the code? That’s right! It builds the AST, traverses the tree, modifies nodes according to the applied plug-ins, and finally generates new code from the modified AST.

Let’s take a look at this process in the following demo:

As I mentioned earlier, Babel uses Babylon. So first we parse the code into an AST, then we traverse the AST and reverse all the variable names, and finally we generate code from the result. Done! As we can see, steps 1 (parsing) and 3 (generation) are pretty routine, and we would do them the same way every time. So Babel takes care of them for us, and the only step we really need to care about is the AST transformation.
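The three stages can be imitated without Babel at all. The following is only a sketch of the idea, not Babel’s implementation: stage 1 is replaced by a hand-written AST, and traverse and generate are hypothetical helpers written for this example that handle just the node types used here.

```javascript
// Stage 1 (parse) is represented by a hand-written AST for:  total = price + tax
const ast = {
  type: "AssignmentExpression",
  operator: "=",
  left: { type: "Identifier", name: "total" },
  right: {
    type: "BinaryExpression",
    operator: "+",
    left: { type: "Identifier", name: "price" },
    right: { type: "Identifier", name: "tax" },
  },
};

// Stage 2 (transform): walk the tree and call the visitor function
// registered for each node type, if there is one.
function traverse(node, visitor) {
  if (visitor[node.type]) visitor[node.type](node);
  for (const value of Object.values(node)) {
    for (const child of Array.isArray(value) ? value : [value]) {
      if (child && typeof child === "object" && typeof child.type === "string") {
        traverse(child, visitor);
      }
    }
  }
}

// Stage 3 (generate): turn the tree back into source text.
// Only the node types used above are supported, and no parentheses are
// re-inserted, which is enough for this particular tree.
function generate(node) {
  switch (node.type) {
    case "Identifier":
      return node.name;
    case "BinaryExpression":
    case "AssignmentExpression":
      return `${generate(node.left)} ${node.operator} ${generate(node.right)}`;
    default:
      throw new Error(`Unknown node type: ${node.type}`);
  }
}

const before = generate(ast);
traverse(ast, {
  Identifier(node) {
    node.name = node.name.split("").reverse().join("");
  },
});
const after = generate(ast);
console.log(before, "->", after); // total = price + tax -> latot = ecirp + xat
```

Only the small visitor object in the middle is specific to "reverse the names". Parsing and generation would look the same for any other transformation.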

When we develop a Babel plug-in, we only need to describe the “visitors” that transform the nodes of the AST.
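As a sketch of what such a plug-in looks like: a Babel plug-in is a function that returns an object with a visitor property, and each visitor method receives a path whose node property is the AST node. The plug-in below reverses identifier names, as in the demo described above. The function name and plug-in name are made up for this example, and the fakePath object is only a stand-in so the snippet can run without Babel installed; a real path carries much more than node.

```javascript
// Shape of a Babel plug-in: a function returning an object with a `visitor`.
// During traversal, Babel calls each visitor method with a "path" that
// wraps the matching node.
function reverseIdentifiersPlugin() {
  return {
    name: "reverse-identifiers", // hypothetical plug-in name
    visitor: {
      Identifier(path) {
        path.node.name = path.node.name.split("").reverse().join("");
      },
    },
  };
}

// In a real project this file would export the function, for example:
//   module.exports = reverseIdentifiersPlugin;

// Stand-in for Babel's traversal (an assumption for this demo only).
const fakePath = { node: { type: "Identifier", name: "price" } };
reverseIdentifiersPlugin().visitor.Identifier(fakePath);
console.log(fakePath.node.name); // prints "ecirp"
```

Parsing and code generation do not appear anywhere in the plug-in. That is exactly the part Babel handles for us.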

Then add it to your Babel plug-in list, either in the babel-loader configuration for webpack or under plugins in .babelrc.

If you want to learn how to create your first Babel plug-in, check out the Babel Handbook.

Moving on, the next use case I want to mention is automated code refactoring, and a wonderful tool for it: jscodeshift.

Let’s say you want to replace all the old anonymous functions with arrow functions (lambda expressions).

Your code editor probably won’t be able to do this, because it’s not a simple find-and-replace operation. That’s where jscodeshift comes in.

If you’ve heard of jscodeshift, there’s a good chance you’ve heard of codemods, too. It might be confusing at first, but that’s okay. jscodeshift is a tool for running codemods. A codemod is a piece of code that describes how the AST should be transformed, much like a Babel plug-in.
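To show why this is more than find-and-replace, here is a sketch of the core decision such a codemod has to make. It deliberately does not use the jscodeshift API; it works on a hand-written ESTree-like AST, and the helper names (childNodes, usesThisOrArguments, toArrowFunctions) are invented for this example. The assumption is that an anonymous function may only become an arrow function when it does not rely on its own this or arguments, because arrow functions do not have their own. Other cases, such as functions called with new, are ignored here.

```javascript
// Direct child nodes of an AST node (any field holding an object with a `type`).
function childNodes(node) {
  const result = [];
  for (const value of Object.values(node)) {
    for (const item of Array.isArray(value) ? value : [value]) {
      if (item && typeof item === "object" && typeof item.type === "string") result.push(item);
    }
  }
  return result;
}

// Does this subtree use `this` or `arguments`? Nested regular functions
// have their own bindings, so the search stops there.
function usesThisOrArguments(node) {
  if (node.type === "ThisExpression") return true;
  if (node.type === "Identifier" && node.name === "arguments") return true;
  if (node.type === "FunctionExpression" || node.type === "FunctionDeclaration") return false;
  return childNodes(node).some(usesThisOrArguments);
}

// Convert anonymous function expressions to arrow functions where safe.
function toArrowFunctions(node) {
  childNodes(node).forEach(toArrowFunctions);
  const convertible =
    node.type === "FunctionExpression" &&
    !node.id &&
    !node.generator &&
    ![...node.params, node.body].some(usesThisOrArguments);
  if (convertible) {
    node.type = "ArrowFunctionExpression";
    node.expression = false; // the body is still a block statement
  }
  return node;
}

// function (x) { return x * 2; }
const doubler = {
  type: "FunctionExpression", id: null,
  params: [{ type: "Identifier", name: "x" }],
  body: { type: "BlockStatement", body: [{
    type: "ReturnStatement",
    argument: {
      type: "BinaryExpression", operator: "*",
      left: { type: "Identifier", name: "x" },
      right: { type: "Literal", value: 2 },
    },
  }] },
};

// function () { return this.id; }
const handler = {
  type: "FunctionExpression", id: null, params: [],
  body: { type: "BlockStatement", body: [{
    type: "ReturnStatement",
    argument: {
      type: "MemberExpression", computed: false,
      object: { type: "ThisExpression" },
      property: { type: "Identifier", name: "id" },
    },
  }] },
};

toArrowFunctions({
  type: "Program",
  body: [
    { type: "ExpressionStatement", expression: doubler },
    { type: "ExpressionStatement", expression: handler },
  ],
});
console.log(doubler.type, handler.type);
```

The first function is converted, while the second is left alone because it reads this. A text search for the word "function" cannot tell these two cases apart; a walk over the AST can.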

So, if you want to automate the migration of your code base from an old framework version to a new one, this is a great way to do it. One example is the prop-types refactoring in React 16.

Many codemods have already been written, so you can save yourself from changing a bunch of code by hand: github.com/facebook/js… github.com/reactjs/rea…

The last use case I want to mention is Prettier, because probably every developer uses it in their daily work.

Prettier formats our code. It breaks long lines, tidies up spaces, parentheses, and so on. So it takes code as input and returns modified code as output. Sound familiar? Of course!

It’s the same idea. First, the code is parsed into an AST. Then the AST is processed, and finally code is generated from it again. However, the step in the middle is not as simple as it seems.
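To give a feel for that middle step, here is a toy printer that decides how to lay out a function call. It is a deliberately crude sketch and not Prettier’s algorithm: it assumes a simple rule (print the call on one line if it fits within the given width, otherwise put each argument on its own line), and the function name printNode is made up for this example.

```javascript
// Toy printer: prints a call on one line if it fits within `width`,
// otherwise breaks it into one argument per line.
function printNode(node, width, indent = 0) {
  switch (node.type) {
    case "Identifier":
      return node.name;
    case "Literal":
      return JSON.stringify(node.value);
    case "CallExpression": {
      const callee = printNode(node.callee, width, indent);
      // First try the flat layout, ignoring the width limit.
      const flatArgs = node.arguments.map((arg) => printNode(arg, Infinity, 0));
      const flat = `${callee}(${flatArgs.join(", ")})`;
      if (indent + flat.length <= width) return flat;
      // Too long: one argument per line, indented by two more spaces.
      const pad = " ".repeat(indent + 2);
      const lines = node.arguments.map(
        (arg) => pad + printNode(arg, width, indent + 2) + ","
      );
      return `${callee}(\n${lines.join("\n")}\n${" ".repeat(indent)})`;
    }
    default:
      throw new Error(`Unknown node type: ${node.type}`);
  }
}

// Hand-written AST for: sendEmail(recipientAddress, subjectLine, messageBody)
const call = {
  type: "CallExpression",
  callee: { type: "Identifier", name: "sendEmail" },
  arguments: [
    { type: "Identifier", name: "recipientAddress" },
    { type: "Identifier", name: "subjectLine" },
    { type: "Identifier", name: "messageBody" },
  ],
};

const wide = printNode(call, 80);
const narrow = printNode(call, 30);
console.log(wide);
console.log(narrow);
```

With a width of 80 the call stays on one line. With a width of 30 it is printed with each argument on its own line. Because the printer works from the AST, the result does not depend on how the original code happened to be laid out.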

Also, if you want to learn more about the theory behind pretty printing, there’s a paper you can dig into called “A prettier printer”.

Before we wrap up, one last thing I want to mention today is my library, js2flowchart (4.5K stars on GitHub).

As the name suggests, it takes JS code and generates SVG flowcharts from it.

This is a great example because it shows that once you have an AST, you can do anything you want with it. You don’t have to turn the AST back into a string of code; you can use it to draw a flowchart, or whatever else you want.

What are the use cases for js2flowchart? With flowcharts, you can explain or document your code, learn other people’s code through a visual representation, or create a flowchart of any process simply by describing it in plain JS syntax.

The simplest way to try it is to go to the online editor: see js-code-to-svg-flowchart.

You can also use it from code, or through the CLI, where you just point to the file you want to generate an SVG from. There is also a VS Code extension (linked in the project README).

So, what else can it do? Well, I won’t ramble on here; if you are interested, read the project’s documentation directly.

OK, so how does it work?

First, the code is parsed into the AST. Then, we iterate over the AST and generate another tree, which I call the workflow tree. It removes many tokens that are not important, but puts together key pieces such as functions, loops, conditions, etc. After that, we iterate through the workflow tree and create the shape tree. The nodes of each shape tree contain information about the visual type, location, connections in the tree, and so on. As a final step, we iterate over all shapes, generate corresponding SVG, and merge all SVG into one file.
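The pipeline described above can be sketched end to end in miniature. To be clear, this is not js2flowchart’s code or API. Every function name here (toFlowTree, toShapes, toSvg), the hand-written AST, and the layout numbers are assumptions made for this illustration, and the shapes are kept in a flat list instead of a tree for brevity.

```javascript
// 1. AST (hand-written, simplified) for:
//    function check(x) { if (x) { done(); } }
const ast = {
  type: "FunctionDeclaration",
  id: { type: "Identifier", name: "check" },
  body: { type: "BlockStatement", body: [{
    type: "IfStatement",
    test: { type: "Identifier", name: "x" },
    consequent: { type: "BlockStatement", body: [{
      type: "ExpressionStatement",
      expression: {
        type: "CallExpression",
        callee: { type: "Identifier", name: "done" },
        arguments: [],
      },
    }] },
  }] },
};

// 2. AST -> flow tree: keep only the pieces worth drawing.
function toFlowTree(node) {
  switch (node.type) {
    case "FunctionDeclaration":
      return { kind: "function", label: `function ${node.id.name}`,
               children: node.body.body.map(toFlowTree) };
    case "IfStatement":
      return { kind: "condition", label: `if (${node.test.name})`,
               children: node.consequent.body.map(toFlowTree) };
    case "ExpressionStatement":
      return { kind: "step", label: `${node.expression.callee.name}()`, children: [] };
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

// 3. Flow tree -> shapes: give every node a visual type and a position.
function toShapes(flowNode, depth = 0, shapes = []) {
  shapes.push({
    kind: flowNode.kind,
    label: flowNode.label,
    x: 20 + depth * 30,         // nesting moves a shape to the right
    y: 20 + shapes.length * 50, // each shape goes one row further down
  });
  for (const child of flowNode.children) toShapes(child, depth + 1, shapes);
  return shapes;
}

// 4. Shapes -> SVG: render each shape and merge everything into one document.
function toSvg(shapes) {
  const escapeXml = (text) => text.replace(/&/g, "&amp;").replace(/</g, "&lt;");
  const parts = shapes.map((s) =>
    `<rect x="${s.x}" y="${s.y}" width="160" height="30" fill="none" stroke="black"/>` +
    `<text x="${s.x + 8}" y="${s.y + 20}">${escapeXml(s.label)}</text>`
  );
  const height = shapes.length * 50 + 20;
  return `<svg xmlns="http://www.w3.org/2000/svg" width="300" height="${height}">${parts.join("")}</svg>`;
}

const flowTree = toFlowTree(ast);
const shapes = toShapes(flowTree);
const svg = toSvg(shapes);
console.log(svg);
```

Each stage only knows about the data structure handed to it by the previous one, which is what makes it possible to swap the final step for something other than code generation.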

At the end

Finding and sifting through material is really hard work, so I hope readers will show their support!