Motivation

I want to try writing a JavaScript parser and interpreter. JavaScript was my first programming language (unless you count HTML), and for many of us it was the stepping stone into the software industry. I had been writing JavaScript for two or three years, but as a web engineer I never thought of writing a parser myself until I started working with front-end build tools such as Webpack and Vue-cli and learned about the AST from their documentation. You may rarely touch the AST in day-to-day work, but it is worth knowing if you want to write more elegant, higher-performance code, and it is essential if you want to implement your own framework or a batch code-processing tool. The AST is a powerful concept, and it really helps you understand how JavaScript works. Before you start, you can try it out in AST Explorer: type JavaScript code in the left pane and the AST is shown in the right pane.

There is not much material on this topic online, so I want to collect what I can find, organize it, and share it with you. My own understanding is limited, so there may be gaps, and I hope you will join the discussion in the comments.

What is an AST

AST is short for Abstract Syntax Tree: a tree representation of the syntactic structure of source code.

Tokenizing and parsing

Generating an abstract syntax tree takes two stages: tokenizing and parsing. Tokenizing (lexical analysis) splits the source code into tokens, the smallest grammatical units. Parsing (syntactic analysis) then works out the relationships between those tokens and builds the tree.
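The two stages can be sketched in a few lines. This is only an illustration, not how any real JavaScript parser works: the function names tokenize and parseVariableDeclaration, the token types, and the tiny grammar (a single declaration with a numeric value) are all assumptions made for the example. The node names loosely follow the shapes you see in AST tools.

```javascript
// Assumed keyword list for this sketch only.
const KEYWORDS = new Set(['const', 'let', 'var']);

// Stage 1: tokenize. Split the source text into tokens.
function tokenize(source) {
    const tokens = [];
    // Groups: word, number, punctuation, whitespace (sticky flag: match at lastIndex only).
    const pattern = /([A-Za-z_$][\w$]*)|(\d+)|([=;])|(\s+)/y;
    let position = 0;
    while (position < source.length) {
        pattern.lastIndex = position;
        const match = pattern.exec(source);
        if (match === null) {
            throw new SyntaxError(`Unexpected character at position ${position}`);
        }
        const [text, word, number, punctuation] = match;
        if (word !== undefined) {
            tokens.push({ type: KEYWORDS.has(word) ? 'Keyword' : 'Identifier', value: word });
        } else if (number !== undefined) {
            tokens.push({ type: 'Numeric', value: number });
        } else if (punctuation !== undefined) {
            tokens.push({ type: 'Punctuator', value: punctuation });
        } // whitespace produces no token
        position += text.length;
    }
    return tokens;
}

// Stage 2: parse. Relate the tokens to each other and build a tree.
function parseVariableDeclaration(tokens) {
    let index = 0;
    const expect = (type, value) => {
        const token = tokens[index];
        if (!token || token.type !== type || (value !== undefined && token.value !== value)) {
            throw new SyntaxError(`Expected ${value || type} at token ${index}`);
        }
        index += 1;
        return token;
    };
    const kind = expect('Keyword').value;
    const name = expect('Identifier').value;
    expect('Punctuator', '=');
    const init = expect('Numeric').value;
    expect('Punctuator', ';');
    return {
        type: 'VariableDeclaration',
        kind,
        declarations: [{
            type: 'VariableDeclarator',
            id: { type: 'Identifier', name },
            init: { type: 'Literal', value: Number(init) }
        }]
    };
}

const tokens = tokenize('const answer = 42;');
console.log(tokens.map((token) => token.value)); // [ 'const', 'answer', '=', '42', ';' ]

const ast = parseVariableDeclaration(tokens);
console.log(JSON.stringify(ast, null, 2));
```

The tokenizer knows nothing about grammar, and the parser never looks at raw characters. That separation is what the two stages are about.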

Best practices

ESLint

ESLint is a pluggable tool that checks JavaScript code against a configurable set of rules and reports violations, helping developers keep their code consistent and of good quality. Its rules work on the AST of the code being checked, which is why it is mentioned here.

Vue-cli

Vue-cli relies on the AST in its development tooling.

webpack

Webpack relies on Acorn to parse source code into an AST. Webpack itself implements a JavascriptParser class that uses Acorn internally.

The parser

Most of the resources on the web about parser combinators can be obscure to beginners. That is because most of them are aimed at people familiar with the functional programming language Haskell, where parser combinators are widely used. For those unfamiliar with Haskell, it can be hard to see how a parser combinator is implemented.

The idea behind parser combinators is simple. Even a beginner can implement a parser combinator library in a high-level language like JavaScript in just a few hundred lines of code. In this post, you will build parser combinators yourself.

Start with a function that reads the first character of some input. If the first character is a, the function returns a and advances the input position by one. Otherwise it leaves the input position unchanged and returns a failure result.

Before writing your own parser, it helps to see how others implement and use parser combinators. The examples below use the arcsecond library.
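A minimal sketch of such a function might look like the following. The input shape ({ text, index }), the failure sentinel, and the name char are assumptions chosen for this illustration, not part of any library.

```javascript
// Assumed sentinel value meaning "this parser did not match".
const failure = Symbol('failure');

// Build a parser for one expected character.
// `input` is an assumed shape: { text: string, index: number }.
function char(expected) {
    return function (input) {
        const first = input.text[input.index];
        if (first === expected) {
            input.index += 1; // consume the character
            return first;
        }
        return failure; // input.index is left untouched
    };
}

const parseA = char('a');

const matching = { text: 'abc', index: 0 };
const matched = parseA(matching);
console.log(matched, matching.index); // a 1

const notMatching = { text: 'xyz', index: 0 };
const missed = parseA(notMatching);
console.log(missed === failure, notMatching.index); // true 0
```

Every larger parser in this style is built from small functions like this one, which either consume input and return a value or consume nothing and report failure.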

const A = require('arcsecond');

const stringParser = A.str('hello');
const result = stringParser.run(
    "hello"
)

console.log(result)//{ isError: false, result: 'hello', index: 5, data: null }

stringParser.run("hello") calls the parser's run method with the string to match. It returns an object with the following fields:

  • isError: whether parsing failed (false means the string was parsed successfully)
  • result: the parsed result
  • index: the position in the input reached after parsing
  • data: extra data carried along with the parse (null here)

const stringParser = A.str("hello");

const result = stringParser.run("world");

In this example the input does not contain the expected string, so the isError field of the returned object is true and the error field holds a message explaining why parsing failed.

{
  isError: true,
  error: "ParseError (position 0): Expecting string 'hello', got 'world...'",
  index: 0,
  data: null
}

Parser combinators

arcsecond provides a many method. Passing the parser A.str("hello") to A.many returns a new parser that matches zero or more consecutive occurrences of hello in a string. This is a parser combinator: a function that builds a new parser out of existing ones.

const A = require('arcsecond');

const stringParser = A.many(A.str('hello'));
const result = stringParser.run(
    "hellohelloworld"
)

console.log(result)//{ isError: false, result: [ 'hello', 'hello' ], index: 10, data: null }

When matching strings, whitespace counts too: a space is a character like any other. In the following example there is a space between the first hello and the second, so many stops after the first match and the result is just ['hello'].

const stringParser = A.many(A.str('hello'));
const result = stringParser.run(
    "hello helloworld"
)

console.log(result)//{ isError: false, result: [ 'hello' ], index: 5, data: null }
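To see why the space stops the match, and one way to tolerate it, here is a self-contained sketch. It does not use arcsecond and is not arcsecond's implementation: str and many are simplified stand-ins written for this example, lexeme is a hypothetical helper that skips leading whitespace, and the result objects only loosely imitate the shape printed above.

```javascript
// Each parser here is a function (text, index) -> result object.
const str = (expected) => (text, index) =>
    text.startsWith(expected, index)
        ? { isError: false, result: expected, index: index + expected.length }
        : { isError: true, result: null, index };

// Apply `parser` repeatedly until it fails; zero matches is still a success.
const many = (parser) => (text, index) => {
    const results = [];
    let current = index;
    while (true) {
        const next = parser(text, current);
        // Stop on failure, or if the parser made no progress (avoids an endless loop).
        if (next.isError || next.index === current) break;
        results.push(next.result);
        current = next.index;
    }
    return { isError: false, result: results, index: current };
};

// Hypothetical helper: skip leading whitespace, then run `parser`.
const lexeme = (parser) => (text, index) => {
    let start = index;
    while (start < text.length && /\s/.test(text[start])) start += 1;
    const res = parser(text, start);
    // On failure, report the original position so no input is consumed.
    return res.isError ? { ...res, index } : res;
};

const strict = many(str('hello'))('hello helloworld', 0);
console.log(strict);  // { isError: false, result: [ 'hello' ], index: 5 }

const lenient = many(lexeme(str('hello')))('hello helloworld', 0);
console.log(lenient); // { isError: false, result: [ 'hello', 'hello' ], index: 11 }
```

In the strict version the second attempt starts at the space, fails, and many returns what it has collected so far. Wrapping the inner parser is enough to change that behavior, without touching many itself.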

The examples above define a parser combinator that parses repeated occurrences of hello from the text and returns an array. To parse more than one word, say world as well as hello, we can use A.choice, which takes an array of parsers, tries each in turn, and returns the result of the first one that matches. Here we create two string parsers, one for hello and one for world.

const A = require('arcsecond');

const stringParser1 = A.str("hello");
const stringParser2 = A.str("world")

const stringParser3 = A.choice([
    stringParser1,
    stringParser2
])

const stringParser = A.many(stringParser3)

const result = stringParser.run(
    "helloworldhelloworld"
)

console.log(result)
// { isError: false, result: [ 'hello', 'world', 'hello', 'world' ], index: 20, data: null }


You can also transform the result with map, for example to convert the matched text to upper case.

const stringParser = A.str("hello").map(result => result.toUpperCase());

const result = stringParser.run("hello");
console.log(result)//{ isError: false, result: 'HELLO', index: 5, data: null }

Having seen how an existing parser library is used, we will now write a simple version ourselves.

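// The three helpers below are not shown in the original; this is one assumed
// way to define them. `input` is an object of the form { text, index }.
const failure = Symbol('failure'); // returned when a parser does not match
const inputRead = (input, length) => input.text.slice(input.index, input.index + length);
const inputAdvance = (input, length) => { input.index += length; };

const A = {
    // Match the literal string `str` at the current position.
    str: function (str) {
        return function (input) {
            const r = inputRead(input, str.length);
            if (r === str) {
                inputAdvance(input, str.length);
                return r;
            }
            return failure; // no match: the input is not advanced
        };
    },
    // Try each parser in `arr` in order and return the first success.
    choice: function (arr) {
        return function (input) {
            for (const parser of arr) {
                const res = parser(input);
                if (res !== failure) {
                    return res;
                }
            }
            return failure;
        };
    }
};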
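The map step shown earlier can be written in the same hand-rolled style. The sketch below is self-contained and rests on assumptions: a parser is a function that takes a mutable input object { text, index } and returns either a value or a failure sentinel, and the names failure, map and run are chosen for this example only. The run helper merely imitates the shape of the result objects printed earlier.

```javascript
// Assumed sentinel meaning "no match".
const failure = Symbol('failure');

// Minimal literal-string parser so the example runs on its own.
const str = (expected) => (input) => {
    if (!input.text.startsWith(expected, input.index)) return failure;
    input.index += expected.length;
    return expected;
};

// map: run `parser`, then transform a successful result with `fn`.
const map = (parser, fn) => (input) => {
    const res = parser(input);
    return res === failure ? failure : fn(res);
};

// run: wrap a plain string in an input object and report the outcome.
const run = (parser, text) => {
    const input = { text, index: 0 };
    const res = parser(input);
    return res === failure
        ? { isError: true, index: input.index }
        : { isError: false, result: res, index: input.index };
};

const shout = map(str('hello'), (s) => s.toUpperCase());

const ok = run(shout, 'hello');
console.log(ok);  // { isError: false, result: 'HELLO', index: 5 }

const bad = run(shout, 'world');
console.log(bad); // { isError: true, index: 0 }
```

Because map returns a parser with the same calling convention as the one it wraps, its output can be passed to any other combinator written in this style.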