Last time, we completed the reactivity module and the runtime module. Another feature that is essential for daily use of Vue is the template syntax. You can write JSX instead, but both templates and JSX are eventually compiled into render functions and h calls, and there are also many directives to handle, such as v-if and v-on. All of this is implemented in the compiler module. The next step is to build the compiler module and complete the last piece of the puzzle.

A round of quick questions

Q: Why compile? A: As explained above.

Q: How does it compile? A: Vue 3 extracts data from the template string you wrote and builds an abstract syntax tree, called an AST, from it. The final compilation result is then generated from the AST.

Q: What is AST? A: Abstract Syntax Tree, more on that below.

Q: What does an AST look like? A: It’s just a tree, a tree structure.

Q: Why do you have to use AST? A: Because it’s very convenient

Q: What if I don’t want to use an AST? A: I think it is actually possible to skip the AST and process the template directly with regular expressions, but the cost-effectiveness of doing so is unacceptable, because you would have to be determined to exhaustively handle everything a user might write.

Introduction to compilation

For simplicity, the following ignores assembly language: everything other than machine language is called a high-level language. It also ignores the difference between compiling and interpreting, and refers to both as compilation.

To introduce the AST, the compilation process has to be explained first. If you are reading this, you can certainly write code, at least JS. The code we write can be seen as statements that follow certain rules (a syntax), but a computer has no way to directly recognize the code we write.

const a = 10;

This is a very simple statement, and we can understand it. The key point is that we can think, so we can grasp the meaning of the code. The computer, however, is a dumb thing: apart from raw computing power it has nothing. It can only recognize binary data and cannot directly understand what this line of code means. We write programs to tell the computer what to do and when. Without a shared language the computer does not know what to do, so a compilation process is needed. Compilation, simply put, converts the source code written by humans into binary code that the computer can recognize directly.

  • High-level languages: the languages we use in daily programming, such as JS, Java, C++, etc.
  • Machine language: the language that a computer can recognize directly, in other words a bunch of binary data

So the compiler is essentially a translator: it translates our words for the computer so that the computer can understand us and do the work.

Overview of the AST

Now that I’ve given you a rough introduction to the concept of compilation, let’s move on to the AST.

An abstract syntax tree (AST) is an abstract representation of the syntactic structure of source code. It represents the syntactic structure of the programming language in the form of a tree, and each node in the tree represents a construct in the source code.

We know from the above definition:

  • The AST is a tree structure
  • AST represents the syntactic structure of source code

Furthermore, the AST is only an intermediate product of the compilation process: it is not yet object code (machine language). So why use an AST at all?

To understand what the AST is for, you need to understand how it is generated. The other steps are omitted here; the key steps are lexical analysis and syntax analysis.

Lexical analysis

Lexical analysis is also called scanning. During this process the compiler scans the source code we wrote step by step and identifies tokens according to predefined rules. These tokens are stored in a list. The compiler also does some special handling along the way, such as removing whitespace and removing comments (some compilers keep comments as nodes instead), and so on.

// The key names and values here are arbitrary; only the general idea matters
// source code: const a = 5;
[
    { type: 'VariableDeclaration', kind: 'const' },
    { type: 'Identifier', name: 'a' },
    { type: 'OperationSymbol', kind: '=' },
    { type: 'Literal', value: '5' },
];

Our source code has now become a list of tokens, a one-dimensional array. In this process every keyword, symbol, identifier, and value in the source code is wrapped into an object according to predefined rules. These objects usually carry a lot of information; for simplicity I only wrote a type and one other field here. As you would expect, the computer can tell what each token is by looking at its type attribute. This is lexical analysis.

Syntax analysis

Syntax analysis is also called parsing. After lexical analysis we only know what each token is, but not what the code as a whole does. It is like sitting in an advanced math class: you understand every word the teacher says, but not what the words mean together. Therefore a syntax analysis step is needed to relate the tokens to one another correctly, and this is also the process that generates the AST.

The code we write usually contains a lot of nesting, such as conditionals, loops, and functions. This nested logic divides the code into blocks, and the code inside a block has to be handled before the code outside it can be completed, so recursion is a very good and very natural choice. Because of the recursion, a tree structure that represents the syntactic structure of the source code emerges naturally, as the following example shows.

const sum = a + b;

The above code is parsed as follows

The tree structure in the figure is the AST. Each node is shown in simplified form; real ASTs are more complicated. After this step there are further steps, such as error checking and optimization, which are omitted here. From this point on, further processing can be based on the AST, for example:

  • Transpiler: converts this AST into another AST, which is then printed as object code, such as Babel
  • Interpreter: either interprets the AST directly, or converts it to linear intermediate code and interprets that, such as JS
  • Compiler: converts the AST into linear intermediate code, then generates assembly language from it, and after some special processing produces machine code, such as Java
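As a rough illustration of the syntax analysis step, the sketch below turns the tokens of const sum = a + b; into a small tree. The token list is written by hand, and the node names (VariableDeclaration, BinaryExpression, and so on) are assumptions chosen for the example; it is a tiny hand-written parser, not how any particular compiler does it.

```javascript
// Hypothetical parser for statements shaped like:
//   const <name> = <operand> (+ <operand>)* ;
function parse(tokens) {
  let pos = 0;
  const peek = () => tokens[pos];
  const next = () => tokens[pos++];
  const expect = (value) => {
    const token = next();
    if (!token || token.value !== value) {
      throw new SyntaxError(`Expected "${value}"`);
    }
    return token;
  };

  // An operand is a leaf of the tree: an identifier or a literal.
  function parseOperand() {
    const token = next();
    if (!token) throw new SyntaxError('Unexpected end of input');
    if (token.type === 'Identifier') return { type: 'Identifier', name: token.value };
    if (token.type === 'Literal') return { type: 'Literal', value: Number(token.value) };
    throw new SyntaxError(`Unexpected token "${token.value}"`);
  }

  // Each "+" wraps what has been parsed so far into a new parent node.
  function parseExpression() {
    let left = parseOperand();
    while (peek() && peek().value === '+') {
      next();
      left = { type: 'BinaryExpression', operator: '+', left, right: parseOperand() };
    }
    return left;
  }

  expect('const');
  const id = next();
  if (!id || id.type !== 'Identifier') throw new SyntaxError('Expected a variable name');
  expect('=');
  const init = parseExpression();
  expect(';');
  return {
    type: 'VariableDeclaration',
    kind: 'const',
    id: { type: 'Identifier', name: id.value },
    init,
  };
}

// Hand-written tokens for: const sum = a + b;
const sumTokens = [
  { type: 'Keyword', value: 'const' },
  { type: 'Identifier', value: 'sum' },
  { type: 'Punctuator', value: '=' },
  { type: 'Identifier', value: 'a' },
  { type: 'Punctuator', value: '+' },
  { type: 'Identifier', value: 'b' },
  { type: 'Punctuator', value: ';' },
];
const sumAst = parse(sumTokens);
console.log(JSON.stringify(sumAst, null, 2));
```

The result is a nested object: the declaration node owns the identifier sum and a BinaryExpression node, which in turn owns the two operands a and b.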

Why AST

The AST is like a middleman, except that it doesn’t cost money, it costs performance. So people ask:

“Why do I have to use an AST? Can’t I just parse the string and generate the final code directly according to certain rules? That would be one step less.”

Yes, you can, but the effort and the gain are completely out of proportion. We don’t build an AST for the sake of having an AST. Once we have one, we can do a lot of things with it, for example:

  • Change the first letter of every word in the source code to uppercase
    • AST: traverse the AST, change the values of the word nodes, and regenerate the code from the AST
    • String: requires a lot of string replacement and string reading and writing
  • Change the indentation format of the source code
    • AST: traverse the AST, delete or add certain nodes according to specific rules, and generate the object code
    • String: manipulating the string directly is really, really hard to do

The AST is very, very flexible, and we can do a lot of things with it. Besides, after generating an AST most compilers perform error checking (semantic analysis) on the source code, such as correctness checks, type inference, and type resolution, as well as optimizations, such as deleting useless assignments, merging constant operations, and eliminating common subexpressions. All of these are difficult to do directly on strings, or perform poorly that way.
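One of the optimizations mentioned above, merging constant operations, is easy to sketch once the code is a tree. The node shapes below (Literal, Identifier, BinaryExpression) and the function name foldConstants are assumptions for illustration only.

```javascript
// Hypothetical node shapes: Literal { value }, Identifier { name },
// BinaryExpression { operator, left, right }.
function foldConstants(node) {
  // Leaves cannot be simplified any further.
  if (node.type !== 'BinaryExpression') return node;

  // Fold the children first, so nested constant expressions collapse bottom-up.
  const left = foldConstants(node.left);
  const right = foldConstants(node.right);

  if (left.type === 'Literal' && right.type === 'Literal') {
    switch (node.operator) {
      case '+':
        return { type: 'Literal', value: left.value + right.value };
      case '*':
        return { type: 'Literal', value: left.value * right.value };
    }
  }
  // At least one side is not a constant: keep the node, with folded children.
  return { ...node, left, right };
}

// AST for: x + (2 * 3)
const exprAst = {
  type: 'BinaryExpression',
  operator: '+',
  left: { type: 'Identifier', name: 'x' },
  right: {
    type: 'BinaryExpression',
    operator: '*',
    left: { type: 'Literal', value: 2 },
    right: { type: 'Literal', value: 3 },
  },
};
const foldedAst = foldConstants(exprAst);
console.log(JSON.stringify(foldedAst));
```

Here 2 * 3 becomes a single Literal node with the value 6, while x + ... stays a BinaryExpression because x is not known at compile time. Doing the same on a raw string would mean re-discovering operator precedence and nesting with string matching.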

In summary, using AST is not because AST is necessary, but because

  • Generating an AST from source code is quite natural
  • The AST describes the syntactic structure of the source code appropriately
  • The AST is very convenient to operate

Vue & AST

Vue takes the contents of the template string, compiles it into an AST, and then applies some optimizations to the AST, such as marking static nodes so that they are not rendered again unnecessarily. As mentioned above, code is really just a string of text, so generating the AST involves a large number of string operations. The overall flow is:

  1. Read the template string
  2. Parse the information in the string to generate an AST
  3. Optimize based on the AST
  4. Generate the object code
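The four steps above can be sketched as a toy pipeline. This is not Vue’s actual compiler: the function names (parseTemplate, markStatic, generate), the node shapes, and the ctx variable in the generated string are all hypothetical, and the parser only understands a single element wrapping text and {{ name }} interpolations.

```javascript
// Toy pipeline, NOT Vue's real compiler. All names and node shapes are made up.

// Steps 1-2: read the template string and parse it into an AST.
// Only supports one element that wraps plain text and {{ name }} interpolations.
function parseTemplate(template) {
  const match = /^<(\w+)>([\s\S]*)<\/\1>$/.exec(template.trim());
  if (!match) throw new SyntaxError('Unsupported template');
  const [, tag, content] = match;
  const children = [];
  const interpolation = /\{\{\s*(\w+)\s*\}\}/g;
  let last = 0;
  let m;
  while ((m = interpolation.exec(content)) !== null) {
    if (m.index > last) {
      children.push({ type: 'Text', content: content.slice(last, m.index) });
    }
    children.push({ type: 'Interpolation', expression: m[1] });
    last = interpolation.lastIndex;
  }
  if (last < content.length) {
    children.push({ type: 'Text', content: content.slice(last) });
  }
  return { type: 'Element', tag, children };
}

// Step 3: optimize. A node is static if nothing inside it depends on data.
function markStatic(node) {
  if (node.type === 'Text') {
    node.isStatic = true;
  } else if (node.type === 'Interpolation') {
    node.isStatic = false;
  } else {
    node.children.forEach(markStatic);
    node.isStatic = node.children.every((child) => child.isStatic);
  }
  return node;
}

// Step 4: generate object code, here just a string that calls an assumed h().
function generate(node) {
  if (node.type === 'Text') return JSON.stringify(node.content);
  if (node.type === 'Interpolation') return `String(ctx.${node.expression})`;
  const children = node.children.map(generate).join(' + ') || '""';
  return `h(${JSON.stringify(node.tag)}, null, ${children})`;
}

const templateAst = markStatic(parseTemplate('<p>Hello {{ name }}</p>'));
const renderCode = generate(templateAst);
console.log(renderCode); // h("p", null, "Hello " + String(ctx.name))
```

The text child is marked static and the interpolation is not, so the element as a whole is dynamic. A real compiler would use this kind of marking to decide which parts of the tree never need to be updated.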

The steps will be detailed in the next article

What are the front-end applications of AST

The AST is used by many front-end tools besides Vue, such as Prettier. Some other examples:

  • Babel: a typical transpiler that converts the AST of the source code into the AST of other code and then regenerates object code, such as ES6 to ES5
  • JSX: the famous JSX syntax is also compiled, and it ends up compiled into render functions
  • ESLint: ESLint also needs to parse the source code into an AST to check it against the rules
  • TypeScript: TS, which we use every day, is compiled into JS
  • V8: Chrome’s V8 engine executes JS directly; needless to say, compilation is involved
  • Syntax highlighting: the colorful code you see every day has also been through this kind of processing
  • Code completion: same as above
  • Error checking: same as above
  • ...

In fact, the heading is not very accurate. It should be “applications of compiler principles in the front end”, because the AST is only one intermediate product of the compilation process. There are other intermediate representations as well, such as reverse Polish notation, quadruples, and triples, but the AST is the most commonly used.

Conclusion

The above gives a brief introduction to the compilation process, how an AST is generated, and some of its applications. It is only background knowledge; the next article will start implementing the Vue 3 compiler module.