Illustrated JVM bytecode execution engine

As we know, ever since Java 1.0 the compiler has converted source code into bytecode. So how is that bytecode executed? That is the job of the JVM’s bytecode execution engine, which is responsible for carrying out method calls and executing code. Conceptually, all execution engines work the same way:

Input: bytecode file
Processing: bytecode parsing
Output: execution result

A physical machine’s execution engine is implemented directly in hardware. A virtual machine’s execution engine, by contrast, is implemented in software by the virtual machine itself.

The stack structure at runtime

Each thread has its own stack, and the basic elements of that stack are called stack frames. A stack frame is the data structure the virtual machine uses to support method invocation and method execution. Each stack frame contains a local variable table, an operand stack, a dynamic link, a method return address, and some additional information. The size of the local variable table and the depth of the operand stack are fully determined at compile time and written into the Code attribute of the method table. In an active thread, only the frame at the top of the stack is valid; it is called the current stack frame, and the method associated with it is called the current method. All bytecode instructions run by the execution engine operate only on the current stack frame. Note that the number of stack frames a stack can hold is limited, so a method call chain that is too deep results in a StackOverflowError. The schematic diagram of the model is as follows:
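The depth limit described above is easy to observe. Here is a minimal sketch (the class and method names are made up for illustration) that recurses until the thread’s stack is exhausted and then reports how many frames were pushed. The number depends on the JVM, the size of each frame, and the thread stack size (for example the -Xss option), so no particular value should be expected.

```java
// Sketch: push stack frames until the JVM throws StackOverflowError.
class StackDepthDemo {
    private static int depth = 0;

    // Every call pushes one more frame onto the current thread's stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns how many nested calls were made before the stack ran out.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // By the time we get here the frames above this one have been popped.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames pushed before StackOverflowError: " + measureDepth());
    }
}
```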

1. The local variable table is the storage space for variable values. It holds the method parameters and the local variables defined inside the method. Its capacity is measured in Slots [1], the minimum unit. At compile time, the maximum size of the local variable table the method needs is recorded in the max_locals item of the method’s Code attribute. Because the local variable table lives on the thread’s stack, it is private to that thread, so there are no thread-safety issues. When a method is invoked, the virtual machine uses the local variable table to pass argument values to the parameter list. For an instance method, the Slot at index 0 holds a reference to the object the method was invoked on, and the method can access this implicit parameter through the keyword this. The remaining parameters follow in parameter-list order. After the parameters have been assigned, the remaining Slots are assigned according to the order and scope of the variables defined in the method body. Recall that class variables are initialized twice: first during the “preparation” phase, when the system sets them to their zero values, and again during the “initialization” phase, when they receive the initial values the programmer defined in code. Local variables, by contrast, get no system initialization, which means a local variable must be explicitly assigned before it can be used. For example:

public void test() {
    call(2, 3);
    // ...
    call2(2, 3);
}

public void call(int i, int j) {
    int b = 2; // ...
}

public static void call2(int i, int j) {
    int b = 2; // ...
}

For convenience, assume both methods above are in the same class. The local variable table in the stack frame for call() looks roughly like this: Slot 0 holds this, Slot 1 holds i, Slot 2 holds j, and Slot 3 holds b.

The local variable table in the stack frame for call2() looks roughly like this: because call2() is static there is no this, so Slot 0 holds i, Slot 1 holds j, and Slot 2 holds b.
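Going back to the initialization rule for class variables and local variables described earlier, here is a small sketch that makes the difference visible. The class and field names are invented for illustration. The commented-out line is the interesting part: javac rejects a read of a local variable that has not been definitely assigned.

```java
class InitDemo {
    // Class variable without an initializer: it keeps the zero value from the preparation phase.
    static int classCounter;

    // Class variable with an initializer: zero after preparation, 7 after class initialization.
    static int configured = 7;

    static int localMustBeAssigned() {
        int local;
        // System.out.println(local); // does not compile: "variable local might not have been initialized"
        local = 42; // an explicit assignment is required before the first read
        return local;
    }

    public static void main(String[] args) {
        System.out.println(classCounter);          // 0
        System.out.println(configured);            // 7
        System.out.println(localMustBeAssigned()); // 42
    }
}
```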

2. The operand stack is a last-in, first-out (LIFO) stack. Bytecode instructions store data on it and fetch data from it, and its elements can be of any Java data type. As with the local variable table, the maximum depth of the operand stack is written to the max_stack item of the Code attribute at compile time. When a method starts executing, its operand stack is empty. During execution, bytecode instructions write values to and read values from the operand stack, that is, they perform push and pop operations. The data types of the elements on the operand stack must strictly match the sequence of bytecode instructions [2]; the compiler guarantees this at compile time, and it is checked again by data-flow analysis in the verification stage of class loading. Finally, when we say that the Java virtual machine’s interpreter is a stack-based execution engine, the stack in question is the operand stack.
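To get a feel for how a maximum stack depth can be known before the method ever runs, here is a rough sketch that walks a straight-line instruction sequence and tracks the operand stack depth. It is only an illustration: the class name and the EFFECT table are made up, only a handful of int instructions are covered, and branches are ignored entirely, so this is not how javac actually computes max_stack.

```java
import java.util.List;
import java.util.Map;

class MaxStackSketch {
    // For each opcode family: {values popped, values pushed}.
    private static final Map<String, int[]> EFFECT = Map.of(
        "iconst",  new int[] {0, 1},  // push a constant
        "iload",   new int[] {0, 1},  // push a copy of a local variable
        "istore",  new int[] {1, 0},  // pop into a local variable
        "iadd",    new int[] {2, 1},  // pop two ints, push their sum
        "imul",    new int[] {2, 1},  // pop two ints, push their product
        "dup",     new int[] {1, 2},  // pop one value, push it twice
        "ireturn", new int[] {1, 0}   // pop the return value
    );

    // Returns the deepest the operand stack gets while running the sequence top to bottom.
    static int maxDepth(List<String> instructions) {
        int depth = 0;
        int max = 0;
        for (String insn : instructions) {
            int[] effect = EFFECT.get(insn.split("_")[0]); // "iload_1" -> "iload"
            if (effect == null) {
                throw new IllegalArgumentException("unsupported instruction: " + insn);
            }
            if (depth < effect[0]) {
                throw new IllegalStateException("operand stack underflow at " + insn);
            }
            depth = depth - effect[0] + effect[1];
            max = Math.max(max, depth);
        }
        return max;
    }

    public static void main(String[] args) {
        List<String> code = List.of(
            "iconst_2", "istore_1", "iconst_3", "istore_2",
            "iload_1", "iload_2", "iadd", "istore_0");
        System.out.println("max operand stack depth = " + maxDepth(code)); // 2
    }
}
```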

3. Dynamic linking: each stack frame contains a reference to the method it belongs to in the runtime constant pool. This reference is held to support dynamic linking during method invocation.

4. The method return address stores the PC counter value of the caller. Once a method has started executing, there are only two ways to exit it: 1. The execution engine encounters one of the method-return bytecode instructions; this is called normal completion. 2. An exception occurs during execution and is not handled within the method, that is, no matching exception handler is found in the method’s exception table; this is called abrupt (exceptional) completion. The difference between the two is that abrupt completion never returns a value to the caller. On normal completion, the caller’s PC counter value is used as the return address. On abrupt completion, the return address is determined by the exception handler table, and this information is generally not stored in the stack frame. In essence, exiting a method is the process of popping the current stack frame off the stack.
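The two ways of exiting a method can be seen from the caller’s side with a small sketch. The class and method names here are invented for illustration. In the normal case divide() hands a value back; in the abrupt case its frame is popped without producing any value, and control resumes in the caller’s exception handler.

```java
class CompletionDemo {
    // Completes normally via ireturn, or abruptly if b == 0 (no handler inside this method).
    static int divide(int a, int b) {
        return a / b;
    }

    static String call(int a, int b) {
        try {
            int value = divide(a, b);
            return "normal completion, value = " + value;
        } catch (ArithmeticException e) {
            // divide() completed abruptly: no value was returned to this frame.
            return "abrupt completion: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(call(6, 3));
        System.out.println(call(1, 0));
    }
}
```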

5. Method invocation

The main task of method invocation is to determine which version of a method is being called (that is, which method to call); it does not yet involve running the method’s body. By the way the target is determined, invocation falls into two kinds:

Resolution: a static process in which the target method is fully determined at compile time.
Dispatch: can be static or dynamic, and by the number of cases it considers it can be single or multiple dispatch. Combining the two gives static single dispatch, static multiple dispatch, dynamic single dispatch, and dynamic multiple dispatch.

5.1 Resolution: In a Class file, the target of every method call is a symbolic reference in the constant pool. During the resolution stage of class loading, some of these symbolic references are converted into direct references. This is possible only when the unique target method can be determined at compile time and cannot change at run time. Such methods are mainly static methods and private methods: the former are bound directly to their type, and the latter cannot be accessed from outside the class, so neither can be overridden through inheritance or in any other way. The methods that meet this condition are static methods, private methods, instance constructors, and superclass methods. The virtual machine provides the following method invocation instructions:

invokestatic: invokes static methods; the unique method version is determined in the resolution stage
invokespecial: invokes <init> methods, private methods, and superclass methods; the unique method version is determined in the resolution stage
invokevirtual: invokes all virtual methods
invokeinterface: invokes interface methods
invokedynamic: resolves the method to be invoked dynamically at run time, then executes it

The dispatch logic of the first four instructions is fixed inside the virtual machine and cannot be influenced by the user, whereas invokedynamic lets user code decide which method version is invoked. Methods invoked by invokestatic and invokespecial are called non-virtual methods; all other methods are called virtual methods, except final methods, which are invoked with invokevirtual but are non-virtual because they cannot be overridden.
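To connect these instructions to everyday source code, here is a sketch whose comments note which invoke instruction javac typically emits for each call. The class names are made up, and the comments are expectations rather than guarantees: the exact output depends on the compiler version, so it is worth checking with javap -c -p. In particular, newer javac versions (targeting Java 11 and later) may emit invokevirtual rather than invokespecial for private instance methods.

```java
import java.util.function.IntSupplier;

class InvokeKinds {
    interface Greeter {
        String greet();
    }

    static class Base implements Greeter {
        public String greet() { return "base"; }
    }

    static class Derived extends Base {
        Derived() { super(); }                       // invokespecial Base.<init>

        @Override
        public String greet() {
            return super.greet() + "+derived";       // invokespecial Base.greet
        }

        private String secret() { return "secret"; }

        String callSecret() { return secret(); }     // invokespecial with older javac (see note above)
    }

    static int twice(int x) { return 2 * x; }

    static String run() {
        int a = twice(21);                           // invokestatic
        Derived d = new Derived();                   // new, then invokespecial Derived.<init>
        Base b = d;
        String viaClass = b.greet();                 // invokevirtual
        Greeter g = d;
        String viaInterface = g.greet();             // invokeinterface
        IntSupplier lambda = () -> a + 1;            // invokedynamic creates the lambda object (Java 8+)
        return a + " " + viaClass + " " + viaInterface + " " + d.callSecret() + " " + lambda.getAsInt();
    }

    public static void main(String[] args) {
        System.out.println(run()); // 42 base+derived base+derived secret 43
    }
}
```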

5.2 Dispatch: Dispatch calls are where polymorphism comes into play.

Static dispatch: any dispatch that uses the static type [3] to locate the method version to execute is called static dispatch. It happens at compile time, and its typical application is method overloading.
Dynamic dispatch: dispatch that determines the method version at run time based on the actual type [4] is called dynamic dispatch. It happens during program execution, and its typical application is method overriding.
Single dispatch: the target method is selected based on one case [5].
Multiple dispatch: the target method is selected based on more than one case.
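A short sketch makes the static/dynamic distinction concrete. It reuses the Animal and Dog names from the footnote example; the describe and sound methods are invented for illustration. The overload is picked from the static type of the argument, while the overridden method is picked from the actual type of the receiver.

```java
class DispatchDemo {
    static class Animal {
        String sound() { return "generic sound"; }
    }

    static class Dog extends Animal {
        @Override
        String sound() { return "woof"; }
    }

    // Overloads: javac chooses between these using the static type of the argument.
    static String describe(Animal a) { return "describe(Animal)"; }
    static String describe(Dog d) { return "describe(Dog)"; }

    static String[] run() {
        Animal dog = new Dog(); // static type Animal, actual type Dog
        return new String[] {
            describe(dog),       // static dispatch: static type is Animal
            describe((Dog) dog), // the cast changes the static type at the point of use
            dog.sound()          // dynamic dispatch: actual type is Dog
        };
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
        // describe(Animal)
        // describe(Dog)
        // woof
    }
}
```

Note that the overload choice takes both the receiver and the argument types into account, while the run-time choice looks only at the receiver’s actual type.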

5.3 JVM implementation of dynamic dispatch

Dynamic dispatch is used so frequently in Java that searching a class’s method metadata for the right target on every call could hurt performance. To avoid this, the JVM creates a virtual method table for each class in the method area. The table holds the actual entry address of each method. If a subclass does not override a method, the entry for that method in the subclass’s virtual method table is the same as in the parent’s table, that is, it points to the parent’s implementation. If a subclass overrides a method, the entry in the subclass’s table is replaced with the address of the subclass’s implementation. When is the virtual method table created? It is created and initialized during the linking phase of class loading: once the class’s variables have been given their initial values in preparation, the JVM also initializes the class’s method table.
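The inherit-or-replace rule for table entries can be modeled in a few lines. This is only a toy: the class name, the buildTable helper, and the use of strings such as "Parent.hello" in place of real entry addresses are all assumptions made for illustration, and a real JVM uses index-based tables rather than a map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class VTableSketch {
    // Builds a class's table: copy the parent's entries, then overwrite or append
    // an entry for every method the class declares itself.
    static Map<String, String> buildTable(String className,
                                          Map<String, String> parentTable,
                                          String... declaredMethods) {
        Map<String, String> table = new LinkedHashMap<>(parentTable);
        for (String method : declaredMethods) {
            // Overriding replaces the inherited entry in place; a new method is appended.
            table.put(method, className + "." + method);
        }
        return table;
    }

    public static void main(String[] args) {
        Map<String, String> parent = buildTable("Parent", Map.of(), "hello", "bye");
        Map<String, String> child = buildTable("Child", parent, "bye", "extra");
        System.out.println(parent); // {hello=Parent.hello, bye=Parent.bye}
        System.out.println(child);  // {hello=Parent.hello, bye=Child.bye, extra=Child.extra}
    }
}
```

Because an overridden method keeps its position in the table, a call site can use the same position for the parent and for every subclass, which is what makes the lookup cheap.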

6. Method execution

6.1 Execution description

Back in JDK 1.0, Java virtual machines only interpreted bytecode. As the technology evolved, most mainstream virtual machines came to include a just-in-time compiler. As a result, only the virtual machine itself can determine whether a given piece of code is interpreted or compiled at run time. Either way, the process broadly follows classic compiler theory, as shown in the following figure:

In Java, the javac compiler performs lexical analysis and syntax analysis to build an abstract syntax tree, then traverses that tree to generate a linear stream of bytecode instructions. All of this happens outside the virtual machine.
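The same pipeline can be sketched in miniature for arithmetic expressions. Everything here is a toy assumption: the TinyCompiler class, its grammar (integers, +, *, parentheses), and the "push N" pseudo-instruction are invented for illustration, and lexing is folded into the parser for brevity. Only iadd and imul are real JVM instruction names. The point is the shape of the process: parse into a tree, then walk the tree in post-order to emit a linear instruction stream.

```java
import java.util.ArrayList;
import java.util.List;

class TinyCompiler {
    // Abstract syntax tree: every node knows how to emit its own instructions.
    interface Node {
        void emit(List<String> out);
    }

    static final class Num implements Node {
        final int value;
        Num(int value) { this.value = value; }
        public void emit(List<String> out) { out.add("push " + value); }
    }

    static final class BinOp implements Node {
        final char op;
        final Node left;
        final Node right;
        BinOp(char op, Node left, Node right) { this.op = op; this.left = left; this.right = right; }
        public void emit(List<String> out) {
            left.emit(out);   // post-order: operands first,
            right.emit(out);
            out.add(op == '+' ? "iadd" : "imul"); // then the operator
        }
    }

    private final String src;
    private int pos = 0;

    private TinyCompiler(String src) { this.src = src.replace(" ", ""); }

    // expr := term ('+' term)*
    private Node expr() {
        Node node = term();
        while (pos < src.length() && src.charAt(pos) == '+') {
            pos++;
            node = new BinOp('+', node, term());
        }
        return node;
    }

    // term := factor ('*' factor)*
    private Node term() {
        Node node = factor();
        while (pos < src.length() && src.charAt(pos) == '*') {
            pos++;
            node = new BinOp('*', node, factor());
        }
        return node;
    }

    // factor := number | '(' expr ')'
    private Node factor() {
        if (pos < src.length() && src.charAt(pos) == '(') {
            pos++; // consume '('
            Node node = expr();
            if (pos >= src.length() || src.charAt(pos) != ')') {
                throw new IllegalArgumentException("')' expected at " + pos);
            }
            pos++; // consume ')'
            return node;
        }
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("number expected at " + pos);
        }
        return new Num(Integer.parseInt(src.substring(start, pos)));
    }

    // Parses the expression, then flattens the tree into a linear instruction list.
    static List<String> compile(String source) {
        TinyCompiler parser = new TinyCompiler(source);
        Node ast = parser.expr();
        if (parser.pos != parser.src.length()) {
            throw new IllegalArgumentException("unexpected input at " + parser.pos);
        }
        List<String> out = new ArrayList<>();
        ast.emit(out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(compile("(2 + 3) * 5")); // [push 2, push 3, iadd, push 5, imul]
    }
}
```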

6.2 Stack-based instruction sets and register-based instruction sets

The instruction stream output by the Java compiler is essentially a stack-based instruction set architecture. Most of its instructions are zero-address instructions that rely on the operand stack to do their work. The other kind is the register-based instruction set architecture, used most typically by the x86 instruction set of traditional PCs and also by the Android Dalvik virtual machine. The most direct difference between the two is that a stack-based instruction set does not depend on hardware registers, whereas a register-based instruction set that is tied to physical registers depends on the hardware. As a result, a register-based instruction set generally executes more efficiently but is less portable, while a stack-based instruction set is more portable but relatively slower, partly because it usually needs more instructions for the same operation. For example, to compute 2+3, the stack-based instruction flow (using the Java VM as an example) is:

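iconst_2  // push constant 2 onto the stack
istore_1  // pop 2 and store it in local variable slot 1
iconst_3  // push constant 3 onto the stack
istore_2  // pop 3 and store it in local variable slot 2
iload_1   // push the value in slot 1 (2) onto the stack
iload_2   // push the value in slot 2 (3) onto the stack
iadd      // pop 2 and 3, add them, and push the result 5
istore_0  // pop the result 5 and store it in local variable slot 0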

And the register-based computation:

mov eax,2 // Set the EAX register to 2
add eax,3 // Add 3 to the EAX register

Let’s use a simple example to explain the process of JVM code execution. The code example is as follows:


public class MainTest {
    public static int add() {
        int result = 0;
        int i = 2;
        int j = 3;
        int c = 5;
        return result = (i + j) * c;
    }

    public static void main(String[] args) {
        MainTest.add();
    }
}

View the bytecode using the javap tool:

{
  public MainTest();
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1 // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 2: 0

  public static int add();
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=2, locals=4, args_size=0     // max operand stack depth 2, 4 local variable slots, 0 parameters
         0: iconst_0  // result = 0: push 0
         1: istore_0  // pop the top element 0 and store it in local variable slot 0
         2: iconst_2  // i = 2: push 2
         3: istore_1  // pop the top element 2 and store it in local variable slot 1
         4: iconst_3  // j = 3: push 3
         5: istore_2  // pop the top element 3 and store it in local variable slot 2
         6: iconst_5  // c = 5: push 5
         7: istore_3  // pop the top element 5 and store it in local variable slot 3
         8: iload_1   // push a copy of the value 2 from slot 1 onto the stack
         9: iload_2   // push a copy of the value 3 from slot 2 onto the stack
        10: iadd      // pop the top two elements 2 and 3, add them, and push the result 5
        11: iload_3   // push a copy of the value 5 from slot 3 onto the stack
        12: imul      // pop the top two elements 5 and 5, multiply them, and push the result 25
        13: dup       // duplicate the top element 25 and push the copy
        14: istore_0  // pop the top element 25 and store it in local variable slot 0
        15: ireturn   // return the top element 25 to the caller
      LineNumberTable:
        line 4: 0
        line 5: 2
        line 6: 4
        line 7: 6
        line 8: 8

  public static void main(java.lang.String[]);
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=1, locals=1, args_size=1
         0: invokestatic  #2 // Method add:()I
         3: pop
         4: return
      LineNumberTable:
        line 12: 0
        line 13: 4
}

The code, operand stack, and local variable table change during execution as follows:
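These changes can be reproduced with a small tracing interpreter. This is a sketch under stated assumptions, not how a real JVM is implemented: the AddTrace class is invented for illustration, it understands only the int opcodes that add() uses, and it steps through an array of instruction names instead of decoding real bytecode. It prints the operand stack and the local variable table after every instruction.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class AddTrace {
    // The instruction sequence of MainTest.add(), one array element per instruction.
    static final String[] ADD_CODE = {
        "iconst_0", "istore_0", "iconst_2", "istore_1",
        "iconst_3", "istore_2", "iconst_5", "istore_3",
        "iload_1", "iload_2", "iadd", "iload_3",
        "imul", "dup", "istore_0", "ireturn"
    };

    static int run(String[] code, int maxLocals, boolean trace) {
        // A real frame gives locals no default value; the zeros here come from Java's int[].
        int[] locals = new int[maxLocals];
        Deque<Integer> stack = new ArrayDeque<>(); // printed with the top of the stack first
        for (int pc = 0; pc < code.length; pc++) {
            String insn = code[pc];
            String[] parts = insn.split("_");
            int arg = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
            switch (parts[0]) {
                case "iconst": stack.push(arg); break;
                case "istore": locals[arg] = stack.pop(); break;
                case "iload":  stack.push(locals[arg]); break;
                case "iadd":   stack.push(stack.pop() + stack.pop()); break;
                case "imul":   stack.push(stack.pop() * stack.pop()); break;
                case "dup":    stack.push(stack.peek()); break;
                case "ireturn": {
                    int result = stack.pop();
                    if (trace) {
                        System.out.printf("%2d: %-9s returns %d%n", pc, insn, result);
                    }
                    return result;
                }
                default:
                    throw new IllegalArgumentException("unsupported instruction: " + insn);
            }
            if (trace) {
                System.out.printf("%2d: %-9s stack=%s locals=%s%n",
                        pc, insn, stack, Arrays.toString(locals));
            }
        }
        throw new IllegalStateException("code ended without ireturn");
    }

    public static void main(String[] args) {
        int result = run(ADD_CODE, 4, true);
        System.out.println("add() = " + result); // add() = 25
    }
}
```

Every instruction in add() is one byte long, so the array index printed by the sketch happens to match the bytecode offsets in the javap listing.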

[1] Also known as a variable slot. The virtual machine specification does not specify how much memory a Slot must occupy. ↩
[2] Strict matching means that the actual types of the elements on the operand stack must be the ones the bytecode instruction expects. For example, the iadd instruction requires the two elements on top of the stack to be ints. ↩
[3] In Animal dog = new Dog();, Animal is the static type and Dog is the actual type. A static type can change only at the point of use (for example through a cast), the static type of the variable itself never changes, and the final static type is known at compile time. The actual type, by contrast, can be determined only at run time. ↩
[4] See note 3: in Animal dog = new Dog();, Dog is the actual type, which is determined only at run time. ↩
[5] Case: the receiver of a method and the method’s arguments are collectively called the cases of the method. Here’s an example:

public void dispatcher() {
    this.execute(8, 9);
}

public void execute(int pointX, int pointY) {
    // TODO
}

execute(8, 9) is called from the dispatcher() method. The receiver of the call is the object that this points to, and 8 and 9 are the method’s arguments; the receiver and the arguments together are the cases of the call. ↩
