The best time to plant a tree was ten years ago, followed by now.

0x00 several concepts interpreter instruction register stack local variable table operand stack 0x01 register or stack scenario register scheme stack scheme mixed scheme, Combine register and stack scheme (i.e., JVM scheme) simple comparison 0x02 interpreter core 0x03 simple code implementation register stack mix about the above code 0x04 summary 0x05 preview 0x06 FAQ

0x00 several concepts

The interpreter

An interpreter is a program that interprets a programming language line by line. The interpreter acts like a middleman. Every time you run a program, you have to switch to another language, so the interpreter’s programs run slowly. It does not translate the entire program at once, but runs it as soon as it translates a line of program description, then translates the next line, runs it again, and so on.

instruction

Some related concepts, assembly instructions, JVM bytecode instructions. Instructions are usually simple and describe a specific operation. For example, the assembly instruction mov &ex, 1 => puts the integer 1 in register EX. The bytecode instruction bpush 1 => puts Byte 1 at the top of the operand stack.

register

Simply put, a register is a Map. It can be accessed and valued by its key. Main operations get(key), PUT (key,value)

The stack

Push: pushes a value to the top of the stack and increases the stack size by 1. Pop: removes a value from the top of the stack and decreases the stack size by 1.

Local variable scale

The data structure inside the stack frame is an array. LLDB Array [0], array[1]. From another perspective, you can actually think of it as a special register. It’s just that key is a subscript.

The operand stack

Stack frame internal data structure, same stack.

0x01 Register or stack

Technology design outside of the business scenario is rogue. — Nicholas. Zhao four

scenario

The pseudocode is shown below.

int a = 1 + 1; 

int b = 2 + 2;

int c = 0;

int d = b - a;

int d = d - c;

println(d)

Copy the code

2. The premise of automation is to be manual. So let’s do it.

Register scheme

Generate the instruction format, inst [value | address] + um participant

mov &0 1= > the number1Put in register0 里    

add &3 &0 &1 &2=> Add the register0Register,1Register,2Add the values in, and place the result in a register3In the water.

sub &3 &0 &1 &2=> Add the register0In minus the register1Register,2And put the result into a register3In the water.

println &3=> Take out the register3Value of and output.

Copy the code

The command code generated is as follows. For ease of understanding, a comment follows a double slash. Indicates the state of a register or stack after an operation

mov &0 1        // {0:1}

mov &1 1        // {0:1, 1:1}

add &2 &0 &1    // {0:1, 1:1, 2:2}



mov &3 2        // {0:1, 1:1, 2:2, 3:2}

mov &4 2        // {0:1, 1:1, 2:2, 3:2, 4:2}

add &5 &3 &4    // {0:1, 1:1, 2:2, 3:2, 4:2, 5:4}



mov &6 0        // {0:1, 1:1, 2:2, 3:2, 4:2, 5:4, 6:0}



sub &7 &5 &2    // {0:1, 1:1, 2:2, 3:2, 4:2, 5:4, 6:0, 7:2}



sub &8 &7 &6    // {0:1, 1:1, 2:2, 3:2, 4:2, 5:4, 6:0, 7:2, 8:2} 



println &8      // {0:1, 1:1, 2:2, 3:2, 4:2, 5:4, 6:0, 7:2, 8:2} 

Copy the code

Stack solution

Generate command format inst [value]{0,1} LLDB

push 1= > will value1Push to the top of the stack. - > (.. .1)  

Add => add the two values at the top of the stack and place the result at the top of the stack. ,v1,v2) -> (.. ,v1+v2)

Sub => assume the top stack value is v2, (.. ,v1,v2) -> (.. ,v1-v2)

Swap => swap the top two elements (v1,v2) -> (v2,v1)

Swap1 => second and third positions on switch stack (v1,v2,x) -> (v2,v1,x)

Println => the top value of the stack is sent out and the result is printed. .1) - > (..)

Copy the code

The command code generated is as follows. For ease of understanding, a comment follows a double slash. Indicates the state of a register or stack after an operation

push 1  // (1)

push 1  // (1, 1)

add     // (2)



push 2  // (2, 2)

push 2  // (2, 2, 2)

add     // (2, 4)



push 0  // (2, 4, 0)



swap    // (2, 0, 4)

swap1   // (0, 2, 4)

swap    // (0, 4, 2)

sub     // (0, 2)



swap    // (2, 0)

sub     // (2)



println  // ()

Copy the code

Hybrid schemes, combining registers and stacks (i.e., JVM schemes)

e.g:

load=> Takes the value from the register and pushes it to the operand stack.

store=> The top value of the operand stack is removed from the stack and stored in a register.

Copy the code

The command code generated is as follows. For ease of understanding, a comment follows a double slash. Indicates the state of a register or stack after an operation

push 1    // (1)     {}

push 1    // (1, 1)  {}

add       // (2)     {}

store &0  // ()      {0:2}



push 2    // (2)     {0:2}

push 2    // (2, 2)  {0:2}

add       // (4)     {0:2}

store &1  // ()      {0:2, 1:4}



push 0    // (0)     {0:2, 1:4}

store &2  // ()      {0:2, 1:4, 2:0}



load &1   // (4)     {0:2, 1:4, 2:0}

load &0   // (4, 2)  {0:2, 1:4, 2:0}

sub       // (2)     {0:2, 1:4, 2:0}

store &3  // ()      {0:2, 1:4, 2:0, 3:2}



load &3   // (2)     {0:2, 1:4, 2:2, 3:2}

load &2   // (2, 0)  {0:2, 1:4, 2:2, 3:2}

sub       // (2)     {0:2, 1:4, 2:2, 3:2}

store $4  // ()      {0:2, 1:4, 2:2, 3:2, 4:2}



load $4   // (2)     {0:2, 1:4, 2:2, 3:2, 4:2}

println   // ()      {0:2, 1:4, 2:2, 3:2, 4:2}

Copy the code

Simple comparison

In simple scenarios, all three solutions meet requirements. The register scheme corresponds to the computer physical implementation. CPU processing is register-based. Advantages High performance, less data handling times, disadvantages of complex instructions. A purely stack-based scheme does not seem to have any, because there are only two operations, push POP, and data needs to be moved frequently in the case of a large number of local variables. The advantage is simple instructions. Easy to transplant. Hybrid scheme, set the strengths of the two, in the portability and efficiency of the compromise scheme.

0x02 Interpreter core

As foretold, the core of the interpreter is a loop.

do {

Get the next instruction

Explain instructions

while(and instructions);

Copy the code

The architecture diagram is as follows

mini-jvm-1

0x03 Simple code implementation

INTERPRETER-DEMO

register

// Core loop

  public void run(a) {



    List<Inst> insts = genInsts();



    int size = insts.size();

    int pc = 0;



    while (pc < size) {

      Inst inst = insts.get(pc);

      inst.execute();

      pc++;

    }

  }



// The Add directive is implemented

class AddInst implements Inst {

  public final Integer targetAddress;

  public final Integer[] sourceAddresses;



  AddInst(Integer targetAddress, Integer... sourceAddresses) {

    this.targetAddress = targetAddress;

    this.sourceAddresses = sourceAddresses;

  }



  @Override

  public void execute(a) {

    int sum = 0;

    for (Integer sourceAddress : sourceAddresses) {

      sum += RegisterDemo.REGISTER.get(sourceAddress);

    }

    RegisterDemo.REGISTER.put(targetAddress, sum);

  }

}

Copy the code

Code address: register DEMO

The stack

// Core loop

  public void run(a) {



    List<Inst> insts = genInsts();



    int size = insts.size();

    int pc = 0;



    while (pc < size) {

      Inst inst = insts.get(pc);

      inst.execute();

      pc++;

    }

  }



// The Add directive is implemented

class AddInst implements Inst {



  @Override

  public void execute(a) {

    Integer v2 = StackDemo.STACK.pop();

    Integer v1 = StackDemo.STACK.pop();

    StackDemo.STACK.push(v1 + v2);

  }

}

Copy the code

Address: stack DEMO

hybrid

// Core loop

  public void run(a) {



    List<Inst> insts = genInsts();



    int size = insts.size();

    int pc = 0;



    while (pc < size) {

      Inst inst = insts.get(pc);

      inst.execute();

      pc++;

    }

  }



// The Add directive is implemented

class AddInst implements Inst {



  @Override

  public void execute(a) {

    Integer v2 = HybridDemo.STACK.pop();

    Integer v1 = HybridDemo.STACK.pop();

    HybridDemo.STACK.push(v1 + v2);

  }

}

Copy the code

Address: mixed DEMO

About the code above

Implemented for a specific scenario, it just works. Three scenarios, each with less than 100 lines of code. Back to the previous question, is 10 minutes enough to implement a simple interpreter? Nature is enough, have interest might as well try to write.

0 x04 summary

This article discusses three options for implementing the interpreter, and implements the corresponding interpreter for a simple case. The Mini-JVM is slowly expanding from this simple core. Since the JVM chooses a hybrid solution, the focus will be solely on the hybrid solution.

!!!!!!!!! The core implementation of the interpreter is particularly important, if this article does not make the reader understand, it must be a problem of the article, welcome to submit your questions, have iterated this article.

0 x05 forecast

  1. The current interpreter has nothing to do with a javac compiled class. It should.

0x06 FAQ

  1. Why don’t we start with the interpreter when, by convention, we should start with classfile parsing? Classfile parsing is not difficult, manual work, parsed data is for the interpreter service. Essentially a static data provider. Non-core logic. Therefore, the interpreter comes first.