Hello, I’m Shanyue.
AST stands for Abstract Syntax Tree, a concept that front-end engineering cannot avoid. It underlies many engineering tasks, such as:
- How to convert TypeScript to JavaScript
- How to convert Sass/Less to CSS (Sass/Less)
- How to convert ES6+ to ES5 (Babel)
- How to format JavaScript code (ESLint/Prettier)
- How to recognize JSX in a React project (Babel)
- GraphQL, MDX, Vue SFC, etc.
Converting from one language to another essentially means operating on its AST, and the core of the process is three steps:
- Code -> AST (Parse)
- AST -> AST (Transform)
- AST -> Code (Generate)
// Code
const a = 4
// AST
{
  "type": "Program", "start": 0, "end": 11,
  "body": [{
    "type": "VariableDeclaration", "start": 0, "end": 11,
    "declarations": [{
      "type": "VariableDeclarator", "start": 6, "end": 11,
      "id": { "type": "Identifier", "start": 6, "end": 7, "name": "a" },
      "init": { "type": "Literal", "start": 10, "end": 11, "value": 4, "raw": "4" }
    }],
    "kind": "const"
  }],
  "sourceType": "module"
}
Different languages have different parsers; for example, JavaScript parsers and CSS parsers are completely different.
The same language can also have several parsers, which produce different AST formats, such as Babel and Espree.
Parsers and transformers for many languages are listed in AST Explorer.
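To make the three steps concrete, here is a minimal, dependency-free sketch. The `parse`, `transform`, and `generate` functions are hypothetical helpers written for this illustration, not the API of Babel or any other library, and they only understand a single statement of the form `const a = 4`. Position fields such as `start` and `end` are left out to keep it short.

```javascript
// Parse: Code -> AST. Only handles `const|let|var <name> = <integer>`.
function parse(code) {
  const match = /^(const|let|var)\s+([A-Za-z_$][\w$]*)\s*=\s*(\d+)$/.exec(code.trim());
  if (!match) throw new SyntaxError(`Unsupported code: ${code}`);
  const [, kind, name, raw] = match;
  return {
    type: "Program",
    body: [{
      type: "VariableDeclaration",
      kind,
      declarations: [{
        type: "VariableDeclarator",
        id: { type: "Identifier", name },
        init: { type: "Literal", value: Number(raw), raw },
      }],
    }],
  };
}

// Transform: AST -> AST. Rewrites every declaration kind to `var`,
// loosely imitating an ES6+ to ES5 conversion.
function transform(ast) {
  return {
    ...ast,
    body: ast.body.map((node) =>
      node.type === "VariableDeclaration" ? { ...node, kind: "var" } : node
    ),
  };
}

// Generate: AST -> Code. Prints variable declarations back as source text.
function generate(ast) {
  return ast.body
    .map((node) => {
      const decls = node.declarations.map((d) => `${d.id.name} = ${d.init.raw}`);
      return `${node.kind} ${decls.join(", ")};`;
    })
    .join("\n");
}

const output = generate(transform(parse("const a = 4")));
console.log(output); // var a = 4;
```

Real toolchains follow the same shape, but each stage handles the full grammar of the language rather than one statement pattern.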
Generating the AST
The step that generates the AST is called parsing, and the program that performs it is the parser. It has two stages: lexical analysis and syntactic analysis.
Lexical analysis
Lexical analysis converts the code into a token stream, which is kept as an array of tokens:
// Code
a = 3
// Token
[
  { type: { ... }, value: "a", start: 0, end: 1, loc: { ... } },
  { type: { ... }, value: "=", start: 2, end: 3, loc: { ... } },
  { type: { ... }, value: "3", start: 4, end: 5, loc: { ... } },
  ...
]
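As a rough illustration of how such a token array can be produced, the sketch below implements a tiny tokenizer. The `tokenize` function and its token type names (`name`, `num`, `punct`) are assumptions made for this example. Real parsers use richer token types and also record `loc` information.

```javascript
// Hypothetical minimal tokenizer: identifiers, integers, and a few punctuators.
function tokenize(code) {
  const tokens = [];
  // Sticky regex (`y` flag): each match must start exactly at `lastIndex`.
  const pattern = /\s+|([A-Za-z_$][\w$]*)|(\d+)|([=;+\-*\/()])/y;
  let pos = 0;
  while (pos < code.length) {
    pattern.lastIndex = pos;
    const match = pattern.exec(code);
    if (!match) {
      throw new SyntaxError(`Unexpected character at ${pos}: ${code[pos]}`);
    }
    const [text, name, num, punct] = match;
    const type =
      name !== undefined ? "name" :
      num !== undefined ? "num" :
      punct !== undefined ? "punct" :
      null; // null means whitespace, which is skipped
    if (type) {
      tokens.push({ type, value: text, start: pos, end: pos + text.length });
    }
    pos += text.length;
  }
  return tokens;
}

const tokens = tokenize("a = 3");
console.log(tokens);
// [
//   { type: 'name',  value: 'a', start: 0, end: 1 },
//   { type: 'punct', value: '=', start: 2, end: 3 },
//   { type: 'num',   value: '3', start: 4, end: 5 }
// ]
```

The `start` and `end` offsets match the ones in the token listing above, because whitespace advances the position without emitting a token.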
The Token stream after lexical analysis also has many applications, such as:
- Code linting, such as ESLint: checking whether a statement ends with a semicolon comes down to checking whether a semicolon token is present
- Syntax highlighting, such as highlight.js/Prism, which colors code based on its tokens
- Template syntax, such as EJS, which also relies on tokenizing
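The semicolon check mentioned in the list can be sketched directly on a token array. The `endsWithSemicolon` helper and the token shape are hypothetical and only illustrate the idea. An actual linter rule has to consider many more cases.

```javascript
// Hypothetical check: does the token stream of a statement end with `;`?
function endsWithSemicolon(tokens) {
  const last = tokens[tokens.length - 1];
  return last !== undefined && last.value === ";";
}

// Tokens for `a = 3` (no trailing semicolon).
const withoutSemi = [
  { value: "a", start: 0, end: 1 },
  { value: "=", start: 2, end: 3 },
  { value: "3", start: 4, end: 5 },
];

// Tokens for `a = 3;`.
const withSemi = [...withoutSemi, { value: ";", start: 5, end: 6 }];

console.log(endsWithSemicolon(withoutSemi)); // false
console.log(endsWithSemicolon(withSemi));    // true
```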
Syntax analysis
Syntactic analysis transforms the token stream into a structured AST that is easy to operate on:
{
  "type": "Program", "start": 0, "end": 5,
  "body": [{
    "type": "ExpressionStatement", "start": 0, "end": 5,
    "expression": {
      "type": "AssignmentExpression", "start": 0, "end": 5,
      "operator": "=",
      "left": { "type": "Identifier", "start": 0, "end": 1, "name": "a" },
      "right": { "type": "Literal", "start": 4, "end": 5, "value": 3, "raw": "3" }
    }
  }],
  "sourceType": "module"
}
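The step from tokens to a tree can be sketched as follows. `parseAssignment` is a hypothetical function that accepts exactly one statement shape, `<name> = <number>`, and the token type names are assumptions of this example. A real parser walks the token stream according to the full grammar of the language.

```javascript
// Hypothetical parser for a single statement shape: <name> = <number>.
function parseAssignment(tokens) {
  const [left, operator, right] = tokens;
  if (
    tokens.length !== 3 ||
    left.type !== "name" ||
    operator.value !== "=" ||
    right.type !== "num"
  ) {
    throw new SyntaxError("Expected: <name> = <number>");
  }
  const expression = {
    type: "AssignmentExpression",
    start: left.start,
    end: right.end,
    operator: "=",
    left: { type: "Identifier", start: left.start, end: left.end, name: left.value },
    right: {
      type: "Literal",
      start: right.start,
      end: right.end,
      value: Number(right.value),
      raw: right.value,
    },
  };
  return {
    type: "Program",
    start: expression.start,
    end: expression.end,
    body: [{
      type: "ExpressionStatement",
      start: expression.start,
      end: expression.end,
      expression,
    }],
    sourceType: "module",
  };
}

// Token array for `a = 3`, written by hand so the example is self-contained.
const inputTokens = [
  { type: "name", value: "a", start: 0, end: 1 },
  { type: "punct", value: "=", start: 2, end: 3 },
  { type: "num", value: "3", start: 4, end: 5 },
];

const ast = parseAssignment(inputTokens);
console.log(JSON.stringify(ast, null, 2));
```

For the input `a = 3`, the printed tree has the same node types and positions as the AST shown above.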