preface

In this article, we designed a set of abstract syntax trees to represent our language, and in the next two articles, we designed and implemented a virtual machine.

But what the virtual machine executes is a function prototype (similar to parsed Java bytecode). As a last resort, we need to generate function prototypes from abstract syntax trees.

The overall structure

In the abstract syntax tree, we focus on the expression of program logic. In functional prototypes, you’re more concerned with how the virtual machine works.

So in the process of parsing the abstract syntax tree step by step, we need to consider instruction generation, register allocation, local variable declaration, constant record.

In the implementation, they are treated separately as a module.

Instruction table and constant table

The instruction table and the constant table are relatively simple. It is to record the corresponding constants of the generated instructions and attach them to the corresponding function prototype.

/ / instruction sheet
export class InstructionList {
    #list: number[] = []

    add(ins: Instruction) {
        this.#list.push(ins)
        return this.#list.length - 1
    }

    attach(proto:Proto){
        proto.code = this.#list
    }
}

/ / often scale
export class ConstantRegister {
    #list: LxValue[] = []

    regist(val: LxValue) {
        const [type, value] = val
        const idx = this.#list.findIndex(v= >
            v[0= = =type && v[1] === value
        )

        // Check whether the corresponding constant is registered
        if (idx >= 0) {
            return idx
        } else {
            this.#list.push(val)
            return this.#list.length - 1}}attach(proto:Proto){
        proto.constants = this.#list
        return this.#list
    }
}
Copy the code

Stack allocation

As mentioned earlier, we split the stack into two parts.

We logically split the stack into two parts, with the first part used to store local/temporary variables, known as registers. The latter part implements the computing part.

For the later part of the calculation, because each instruction is executed, this part will be cleared. So just write down the maximum number you need.

export class StackAllocator {
    #extra: number = 0
    
    extra(n: number) {
        this.#extra = n > this.#extra ? n : this.#extra
    }
}
Copy the code

In the previous part, it is not known when these variables will be destroyed, that is, an array is required to record the current state of the stack (whether the current index stack is in use) and two methods, apply and release.

For some call instructions, it is necessary to ensure that all values are in a contiguous register during execution. Therefore, size is also required during application, and the return value is the index of the first register.

On the one hand, it is necessary to reuse the previously released registers as much as possible, on the other hand, it is necessary to ensure that all registers applied are continuous. The application method is relatively complicated.

export class StackAllocator {
    // ...
    #slot: boolean[] = []
    
    alloc(size: number) {
        if (size <= 0) throw new Error('alloc : size error! ')
        let idx = -1
        let remains = size

        for (let i = 0; i < this.#slot.length; i++) {
            if (this.#slot[i] === true) {
                idx = -1
                remains = size
                continue
            }

            if (idx === -1) {
                idx = i
            }

            remains = remains - 1

            if (remains === 0) break
        }

        idx = idx < 0 ? this.#slot.length : idx

        this.#slot = this.#slot.concat(new Array(remains).fill(false))

        for (let i = idx; i < idx + size; i++) {
            this.#slot[i] = true
        }

        if (this.#slot.length > 256) {
            throw new Error('alloc: stackoverflow! ')}return idx
    }
    free(idx: number, size: number = 1) {
        for (
            let i = idx;
            i < idx + size && i < this.#slot.length;
            i++
        ) {
            this.#slot[i] = false}}}Copy the code

Finally, attach method records the maximum stack number and the number of registers in it in the function prototype.

export class StackAllocator {
    // ...
    attach(proto:Proto){
        proto.slotSize = this.#slot.length
        proto.maxStackSize = this.#slot.length + this.#extra
    }
}
Copy the code

Local variable scale

When you define a local variable, that variable is logged to the stack. You also need a table to keep track of variables and stack indexes.

export class LoaclVarTable {
    #table = new Map<string.number> ()define(name: string, slot: number) {
        if (this.#table.has(name))
            throw new Error(` '${name}' has already been declared`)
        else
            this.#table.set(name, slot)
    }

    get(name: string) {
        const idx = this.#table.get(name)

        if (typeofidx ! ='number') {
            throw new Error(` '${name}' is not defined`)}else {
            return idx
        }
    }
}
Copy the code

prototyping

Abstract a ProtoGenerator class for prototype generation. In addition to the several parts already done, the Protos field is required to record all subfunctions, and Argus is required to record the argument list.

The construction of a ProtoGenerator naturally requires a Func object in a lexical tree. If there are arguments in the constructor, you also need to allocate a register for each argument.

Then the Gen method is implemented to generate the corresponding function prototype.

export class ProtoGenerator {
    #constant = new ConstantRegister()
    #stack = new StackAllocator()
    #local = new LoaclVarTable()
    #instructions = new InstructionList()
    #protos: ProtoGenerator[] = []
    #argus: Argu[] = []

    constructor(fn: Func) {
        this.#argus = fn.argu
        
        if(this.#argus.length > 0) {const idx =  this.#stack.alloc(this.#argus.length)
            this.#argus.forEach((argu,i) = >{
                this.#local.define(argu.name,idx + i)
            })
        }
        // todo
    }

    gen(){
        const proto = new Proto()
        proto.numParams = this.#argus.length
        proto.protos = this.#protos.map(v= >v.gen())

        this.#constant.attach(proto)
        this.#stack.attach(proto)
        this.#instructions.attach(proto)
    
        return proto
    }
}
Copy the code

Parse syntax tree

During the construction of ProtoGenerator, the lexical tree is analyzed.

The body function is a list of statements. Add a statement method to ProtoGenerator for statement parsing.

export class ProtoGenerator {
    / /...
    constructor(fn: Func) {
        this.#argus = fn.argu
        
        if(this.#argus.length > 0) {const idx =  this.#stack.alloc(this.#argus.length)
            this.#argus.forEach((argu,i) = >{
                this.#local.define(argu.name,idx + i)
            })
        }
        
        fn.body.forEach(statement= > {
            this.statement(statement)
        });

    }
    
    statement(state: Statement){
        // todo}}Copy the code

Statement

Statement is a virtual class that actually has no instance of Statement. Each subclass of Statement also implements a method to parse.

One of the more special is Expr, Expr will return a value, before parsing Expr needs to allocate a register to store this value, after parsing is completed, release it.

export class ProtoGenerator {
    / /...
    statement(state: Statement) {
        if(state instanceof Expr){
            const slot =  this.#stack.alloc(1)
            this.expr(state,slot)
            this.#stack.free(slot)
        }else if(state instanceof Return){
            this.return(state)
        }else if(state instanceof Define){
            this.define(state)
        }else if(state instanceof Setter){
            this.setter(state)
        }else{
            throw new Error('undfined expr! ')}}}Copy the code

Expr

Expr is also a virtual class. Same operation as before in Statement.

export class ProtoGenerator {
    / /...
    expr(expr: Expr, slot: number) {
        if(expr instanceof Int){
            this.int(expr,slot)
        }else if(expr instanceof BinaryIntArith){
            this.binaryIntArith(expr,slot)
        }else if(expr instanceof Nil){
            this.nil(expr,slot)
        }else if(expr instanceof Getter){
            this.getter(expr,slot)
        }else if(expr instanceof Func){
            this.func(expr,slot)
        }else if(expr instanceof Call){
            this.call(expr,slot)
        }else{
            throw new Error('undfined expr! ')}}}Copy the code

Nil

Nil is the easiest way to do it. Just add a loadnil instruction and loadnil into the corresponding register.

Because loadnil uses an extra stack bit, you also need to call the extra method.

export class ProtoGenerator {
    // ...
    nil(_: Nil, slot: number) {
        this.#stack.extra(1)
        this.#instructions.add(ins.create(OpCode.loadNil, slot, 1.0))}}Copy the code

Int

Int goes one step further than nil, and adds the value to the constant table. Then load the corresponding register with loadk.

If the constant index is too large, 26 bits are not enough. Then call loadKX load.

export class ProtoGenerator {
    // ...
    int(int: Int, slot: number) {
        this.#stack.extra(1)
        const { val } = int
        const idx = this.#constant.regist([LxType.int, val])
        if (idx < 2支那26) {
            this.#instructions.add(ins.create(OpCode.loadK, slot, idx))
        } else {
            this.#instructions.add(ins.create(OpCode.loadKX, slot, 0))
            this.#instructions.add(ins.iax(0, idx))
        }
    }
}
Copy the code

BinaryIntArith

Binary operations have left and right operands, each of which is assigned a register. Call EXPR separately for parsing.

Call different instruction opcodes according to different arithmetic characters.

Finally, two registers are released.

If the operand is directly Int and the constant table index is less than 255, optimization is also required.

export class ProtoGenerator {
    // ...
    
    binaryIntArith(binaryArith: BinaryIntArith, slot: number) {
        
        this.#stack.extra(2)

        const { arith, a, b } = binaryArith
        let op: OpCode = OpCode.iadd

        switch (arith) {
            case 'add': op = OpCode.iadd; break;
            case 'sub': op = OpCode.isub; break;
            case 'mul': op = OpCode.imul; break;
            case 'div': op = OpCode.idiv; break;
            default: throw new Error('error binaryArith')}let slotA = 0
        if (
            a instanceof Int
            && (this.#constant.regist([LxType.int, a.val]) <= 0xff)
        ) {
            slotA = this.#constant.regist([LxType.int, a.val]) + 1 + 0xff
        } else {
            slotA = this.#stack.alloc(1)
            this.expr(a, slotA)
        }


        let slotB = 0
        if (
            b instanceof Int
            && (this.#constant.regist([LxType.int, b.val]) <= 0xff)
        ) {
            slotB = this.#constant.regist([LxType.int, b.val]) + 1 + 0xff
        } else {
            slotB = this.#stack.alloc(1)
            this.expr(b, slotB)
        }


        this.#instructions.add(ins.create(op, slot, slotA, slotB))

        if (slotA <= 0xff) { this.#stack.free(slotA) }
        if (slotB <= 0xff) { this.#stack.free(slotB) }

    }
}
Copy the code

Define

Register variables – Apply for a register slot and register it in the local variable table. The default value is then parsed and written into the register.

export class ProtoGenerator {
    // ...
    define(def: Define) {
        this.#stack.extra(1)
        const { name, defVal } = def
        const slot = this.#stack.alloc(1)
        this.expr(defVal, slot)
        this.#local.define(name, slot)
    }
}
Copy the code

Setter

Find the register index from the local variable list by name. The value is then parsed to the register by a call to EXPr.

export class ProtoGenerator {
    // ...    
    setter(set: Setter) {
        this.#stack.extra(1)
        const { name, val } = set
        const slot = this.#stack.alloc(1)
        const target = this.#local.get(name)
        this.expr(val, slot)
        this.#instructions.add(ins.create(OpCode.move, target, slot, 0))
        this.#stack.free(slot)
    }
    
}
Copy the code

Getter

Find the register index from the local variable list by name. The move instruction is called to write the value in the register to the desired register.

export class ProtoGenerator {
    // ...
    getter(get: Getter, slot: number) {
        this.#stack.extra(1)
        const { name } = get
        const source = this.#local.get(name)
        this.#instructions.add(ins.create(OpCode.move, slot, source, 0))}}Copy the code

Func

Generate a ProtoGenerator from the Func object, add it to the subfunction table, and then generate a Func instruction to load the corresponding function prototype into the register.

export class ProtoGenerator {
    // ...
    func(func: Func, slot: number) {
        const proto = new ProtoGenerator(func)
        this.#protos.push(proto)
        this.#instructions.add(ins.create(OpCode.func, slot, this.#protos.length - 1))}}Copy the code

Call

Apply for n + 1 continuous registers according to the number of parameters n. Write the called function and arguments to these registers, respectively. Then the call command is generated for invocation.

Note that the call method on the virtual machine consumes n+1 stack bits simultaneously. So you need to call the extra method to record it.

export class ProtoGenerator {
    // ...
    call(call: Call, slot: number) {
        const { fn, inputs } = call
        const target = this.#stack.alloc(inputs.length + 1)
        this.#stack.extra(inputs.length + 1)

        this.expr(fn, target)

        inputs.forEach((argu, i) = > {
            this.expr(argu, target + i + 1)})this.#instructions.add(ins.create(OpCode.call, slot, target, inputs.length))

        this.#stack.free(target,inputs.length + 1)}}Copy the code

Return

Call return to return the value of the corresponding register.

export class ProtoGenerator {
    // ...
    return(ret: Return) {
        const { val } = ret
        const slot = this.#stack.alloc(1)
        this.expr(val, slot)
        this.#instructions.add(ins.create(OpCode.return, slot, 0.0))
        this.#stack.free(slot)
    }
}
Copy the code

conclusion

Since then, four articles, a compiler, and a virtual machine have been completed.

However, the virtual machine and language, with no logical operations, no floating point numbers and strings, no looping judgments, and no objects and closures, can only do a mixture of four integers.

But never mind, the functionality can be expanded. The next article will add closures for this virtual machine.