An overview of the

This article is the first in a tutorial series on getting started with the JVM. In this post, I introduce the basic structure of the JVM and related concepts, and use a simple example to show how a Java program runs.

JVM runtime data area

As shown in the figure above, Java source code is compiled into a class file. When a Java program runs, the JVM sets aside memory to hold the data it needs at run time; this is the JVM runtime data area. The runtime data area is divided into several regions according to whether they are shared between threads or private to each thread. The shared regions are the method area and the heap; the thread-private regions are the virtual machine stack, the native method stack, and the program counter.

Method area

In short, the method area stores data from class files: information about classes loaded by the virtual machine, constants, static variables, and code compiled by the just-in-time compiler. It is a logical region defined by the virtual machine specification, and its implementation varies from VM to VM. For example, HotSpot places the method area in the permanent generation in Java 7 and in Metaspace in Java 8, and manages this area through GC.

Heap memory

Once the classes are loaded, we may need to use them to create objects. The sole purpose of the heap is to hold object instances, and almost all object instances are allocated here. The heap can be further subdivided into the old generation and the young generation (Eden, From Survivor, To Survivor).
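To see how a running JVM divides the shared areas, one option is to ask it through the standard java.lang.management API. The sketch below lists the memory pools the JVM reports, grouped into heap and non-heap. The class name MemoryPools is made up for this example, and the pool names printed depend on the JVM version and the garbage collector in use, so treat the output as informational only.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for this article, not part of the original demo.
class MemoryPools {
    // Returns the names of the memory pools of the given type reported by the running JVM.
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Heap pools hold object instances (for example the Eden, Survivor and old generation spaces).
        System.out.println("Heap pools:     " + poolNames(MemoryType.HEAP));
        // Non-heap pools include the area that backs the method area (Metaspace on HotSpot 8+).
        System.out.println("Non-heap pools: " + poolNames(MemoryType.NON_HEAP));
    }
}
```

On a HotSpot Java 8 JVM the non-heap list should include a pool named Metaspace, while on Java 7 a permanent generation pool would be expected instead.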

Program counter

The program counter is a small memory area that can be regarded as a line number indicator for the bytecode executed by the current thread: it records the position the thread has reached. A CPU core executes instructions from only one thread at a time, so each thread has its own independent program counter. After a thread switch, the thread resumes from the bytecode position recorded in its counter and continues executing the remaining bytecode.

The virtual machine stack

Each thread has its own private virtual machine stack. A thread's stack consists of multiple stack frames: a thread executes one or more methods, and each method invocation corresponds to one stack frame. A stack frame contains the local variable table, the operand stack, dynamic linking information, the method return address, and some additional information. The default maximum stack size is typically 1 MB (it depends on the platform and can be changed with the -Xss option); exceeding it causes a StackOverflowError to be thrown.
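The relationship between method calls and stack frames can be observed with unbounded recursion. The following is a minimal sketch (the class name StackDepthDemo is invented for this example): each call pushes one more frame until the thread's stack is full, at which point the JVM throws StackOverflowError. The depth reached varies with the stack size, the JVM, and the size of each frame, so no particular number should be expected.

```java
// Hypothetical demo class for this article.
class StackDepthDemo {
    private static int depth = 0;

    // Each call adds one stack frame to the current thread's virtual machine stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Recurses until the thread's stack is exhausted and returns the depth reached.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The frames are unwound back to this method, so the thread can carry on.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames before StackOverflowError: " + measureDepth());
    }
}
```

Running it with a smaller stack, for example java -Xss256k StackDepthDemo, should report a noticeably lower depth.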

Native method stack

The native method stack works much like the virtual machine stack. The virtual machine stack serves the Java methods that the VM executes, while the native method stack serves the native methods that the VM calls. As with the virtual machine stack, a StackOverflowError is thrown when its size is exceeded. The implementation is left to each virtual machine vendor.

Example demonstration

Runtime environment

  • System: Windows 10
  • JDK: jdk1.8
  • Hex file viewing tool: WinHex

Code demo

Here, we define a simple Java class, Demo1. Its main method does some simple arithmetic, and we can quickly see that the printed result is 55. But that’s not the point; we’re going to look at how a Java program runs in the JVM.

public class Demo1{
    public static void main(String[] args){
        int x = 500;
        int y = 100;
        int a = x / y;
        int b = 50;
        System.out.println(a + b);
    }
}

We locate the Java file on the command line and compile it.

javac Demo1.java

Use WinHex to view the class file in hexadecimal

A class file is a binary file, which Windows has no built-in way to display in readable form, so we can install WinHex to view it as hexadecimal bytes. (Download address: www.x-ways.net/winhex/) Download, unzip, and install WinHex, then drag the class file into its window to view the contents. The class file contains the bytecode that the Java program executes; the data is laid out as a compact binary stream in a strict format, with no delimiters. The file begins with the magic number 0xCAFEBABE (hexadecimal).
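If no hex viewer is at hand, the magic number can also be checked from Java itself. This is a small sketch under the assumption that the class file is passed as the first command-line argument; the class name MagicNumberCheck and the method hasClassFileMagic are made up for illustration. It reads the first four bytes as a big-endian int and compares them with 0xCAFEBABE.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class for this article.
class MagicNumberCheck {
    static final int CLASS_FILE_MAGIC = 0xCAFEBABE;

    // True if the data starts with the four magic bytes CA FE BA BE.
    static boolean hasClassFileMagic(byte[] data) {
        if (data == null || data.length < 4) {
            return false;
        }
        // ByteBuffer reads big-endian by default, matching the class file byte order.
        return ByteBuffer.wrap(data).getInt() == CLASS_FILE_MAGIC;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("Usage: java MagicNumberCheck <path to .class file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(args[0] + " starts with 0xCAFEBABE: " + hasClassFileMagic(data));
    }
}
```

For example, java MagicNumberCheck Demo1.class should report true for the file compiled above.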

Use javap to view class files

To read the contents of the class file more easily, we can use the javap command to disassemble it and write the output to Demo1.txt.

javap -v Demo1.class > Demo1.txt

A Demo1.txt file is generated in the directory that contains the Java file. Opening it shows the following contents:

Classfile /C:/Java Senior Engineer project /jvm_demo/ demo1.class
  Last modified 2020-8-22; size 414 bytes
  MD5 checksum ae6fa820973681b35609c75631cb255b
  Compiled from "Demo1.java"
public class Demo1
  minor version: 0
  major version: 52
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #5.#14         // java/lang/Object."<init>":()V
   #2 = Fieldref           #15.#16        // java/lang/System.out:Ljava/io/PrintStream;
   #3 = Methodref          #17.#18        // java/io/PrintStream.println:(I)V
   #4 = Class              #19            // Demo1
   #5 = Class              #20            // java/lang/Object
   #6 = Utf8               <init>
   #7 = Utf8               ()V
   #8 = Utf8               Code
   #9 = Utf8               LineNumberTable
  #10 = Utf8               main
  #11 = Utf8               ([Ljava/lang/String;)V
  #12 = Utf8               SourceFile
  #13 = Utf8               Demo1.java
  #14 = NameAndType        #6:#7          // "<init>":()V
  #15 = Class              #21            // java/lang/System
  #16 = NameAndType        #22:#23        // out:Ljava/io/PrintStream;
  #17 = Class              #24            // java/io/PrintStream
  #18 = NameAndType        #25:#26        // println:(I)V
  #19 = Utf8               Demo1
  #20 = Utf8               java/lang/Object
  #21 = Utf8               java/lang/System
  #22 = Utf8               out
  #23 = Utf8               Ljava/io/PrintStream;
  #24 = Utf8               java/io/PrintStream
  #25 = Utf8               println
  #26 = Utf8               (I)V
{
  public Demo1();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 1: 0

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=3, locals=5, args_size=1
         0: sipush        500
         3: istore_1
         4: bipush        100
         6: istore_2
         7: iload_1
         8: iload_2
         9: idiv
        10: istore_3
        11: bipush        50
        13: istore        4
        15: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
        18: iload_3
        19: iload         4
        21: iadd
        22: invokevirtual #3                  // Method java/io/PrintStream.println:(I)V
        25: return
      LineNumberTable:
        line 3: 0
        line 4: 4
        line 5: 7
        line 6: 11
        line 7: 15
        line 8: 25
}
SourceFile: "Demo1.java"

Class file information

Let’s start with the Classfile section:

Classfile /C:/Java Senior Engineer project /jvm_demo/ demo1.class
  Last modified 2020-8-22; size 414 bytes
  MD5 checksum ae6fa820973681b35609c75631cb255b
  Compiled from "Demo1.java"

It gives the path of the class file, the time it was last modified, its size in bytes, its MD5 checksum, and the Java source file it was compiled from.

Class content – Basic information

public class Demo1
  minor version: 0
  major version: 52
  flags: ACC_PUBLIC, ACC_SUPER

In this section, major version is the major version number of the class file format (JDK 5, 6, 7, and 8 correspond to 49, 50, 51, and 52 respectively), minor version is the minor version number, and flags lists the access flags. For the meanings of the access flags, see the following table.
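The version mapping and the access flags can be decoded with a few lines of code. The sketch below is illustrative only: the class ClassHeaderInfo and its methods are invented for this article. The bit masks are the values the JVM specification defines for class access flags, and only three of them are covered here. The version rule used (release = major version minus 44) follows from the mapping above and is applied only from major version 49 upward.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for this article.
class ClassHeaderInfo {
    // Class access flag masks from the JVM specification (subset).
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_FINAL = 0x0010;
    static final int ACC_SUPER = 0x0020;

    // Maps a class file major version to the Java release: 49 -> 5, 50 -> 6, 51 -> 7, 52 -> 8.
    static int javaRelease(int majorVersion) {
        if (majorVersion < 49) {
            throw new IllegalArgumentException("This sketch only handles major versions >= 49");
        }
        return majorVersion - 44;
    }

    // Lists the names of the flags (from the subset above) that are set in the bit mask.
    static List<String> flagNames(int accessFlags) {
        List<String> names = new ArrayList<>();
        if ((accessFlags & ACC_PUBLIC) != 0) names.add("ACC_PUBLIC");
        if ((accessFlags & ACC_FINAL) != 0) names.add("ACC_FINAL");
        if ((accessFlags & ACC_SUPER) != 0) names.add("ACC_SUPER");
        return names;
    }

    public static void main(String[] args) {
        System.out.println("major version 52 -> Java " + javaRelease(52));  // Java 8
        System.out.println("flags 0x0021 -> " + flagNames(0x0021));         // [ACC_PUBLIC, ACC_SUPER]
    }
}
```

The value 0x0021 is what the flags field holds for Demo1: ACC_PUBLIC (0x0001) combined with ACC_SUPER (0x0020).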

Class content – Constant pool

Constant pool:
   #1 = Methodref          #5.#14         // java/lang/Object."<init>":()V
   #2 = Fieldref           #15.#16        // java/lang/System.out:Ljava/io/PrintStream;
   #3 = Methodref          #17.#18        // java/io/PrintStream.println:(I)V
   #4 = Class              #19            // Demo1
   #5 = Class              #20            // java/lang/Object
   #6 = Utf8               <init>
   #7 = Utf8               ()V
   #8 = Utf8               Code
   #9 = Utf8               LineNumberTable
  #10 = Utf8               main
  #11 = Utf8               ([Ljava/lang/String;)V
  #12 = Utf8               SourceFile
  #13 = Utf8               Demo1.java
  #14 = NameAndType        #6:#7          // "<init>":()V
  #15 = Class              #21            // java/lang/System
  #16 = NameAndType        #22:#23        // out:Ljava/io/PrintStream;
  #17 = Class              #24            // java/io/PrintStream
  #18 = NameAndType        #25:#26        // println:(I)V
  #19 = Utf8               Demo1
  #20 = Utf8               java/lang/Object
  #21 = Utf8               java/lang/System
  #22 = Utf8               out
  #23 = Utf8               Ljava/io/PrintStream;
  #24 = Utf8               java/io/PrintStream
  #25 = Utf8               println
  #26 = Utf8               (I)V

The constant pool stores the constants that the class refers to, all of which are fixed at compile time. Refer to the following table for the meanings of the identifiers used here. Comparing the Demo1.java file with the Demo1.txt file, and with reference to the table, we can see which constants are stored in the constant pool. For example, every Java class inherits from Object by default, so the pool references the Object class and its no-argument constructor. Our code calls System.out.println(), so the pool also holds the class, field, and method constants needed to reach System, out, and println. The Utf8 entries are the string constants (class names, member names, descriptors, and attribute names) that the other entries point to; the class file stores them in a modified UTF-8 encoding, independent of the encoding of our system. (Code in #8 is the name of the attribute that holds a method's bytecode, and LineNumberTable in #9 is the name of the attribute that maps Java source line numbers to bytecode instructions. Both are emitted by default.)

Class content – constructor

  public Demo1();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 1: 0

This section describes the constructor of the class. In the Demo1 example we did not write a constructor, so we can see that the compiler adds an implicit no-argument constructor when none is defined. The flag ACC_PUBLIC indicates that it is public; see the table in the “Class content – Basic information” section above. Code is the attribute that holds the method's bytecode. “stack=1, locals=1, args_size=1” means that the depth of the operand stack is 1, the number of local variables is 1, and the number of arguments is 1. The number of local variables and arguments is 1 because every instance method, constructors included, receives this as an implicit first argument.
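The presence of the implicit constructor can also be confirmed with reflection. This is a small sketch: both class names are invented for this example, and it only checks what javap already shows, namely that a class with no constructor in its source still ends up with exactly one constructor that takes no parameters.

```java
import java.lang.reflect.Constructor;

// A class whose source declares no constructor, like Demo1.
class NoDeclaredConstructor {
}

// Hypothetical demo class for this article.
class ImplicitConstructorDemo {
    // Number of constructors present in the compiled class.
    static int constructorCount(Class<?> type) {
        return type.getDeclaredConstructors().length;
    }

    // Number of parameters of the first (here: only) constructor.
    static int firstConstructorParameterCount(Class<?> type) {
        Constructor<?>[] constructors = type.getDeclaredConstructors();
        return constructors[0].getParameterCount();
    }

    public static void main(String[] args) {
        Class<?> type = NoDeclaredConstructor.class;
        System.out.println("constructors: " + constructorCount(type));
        System.out.println("parameters:   " + firstConstructorParameterCount(type));
    }
}
```

Note that the implicit this argument counted in args_size=1 is not reported by reflection as a parameter, which is why the parameter count is 0 here.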

Let’s move on. In “0: aload_0”, the leading number is the offset of the instruction in bytes, and aload_0 loads the reference stored in local variable 0 onto the operand stack; here that is the this variable. “1: invokespecial #1” invokes the method referenced by constant pool entry #1, which is bound at compile time; here it is the no-argument constructor of Object. “4: return” returns from a void method. “LineNumberTable” is the mapping between source code lines and bytecode instructions. “line 1: 0” indicates that line 1 of the source code corresponds to the bytecode instruction at offset 0, in this case “0: aload_0”.

If you look at this, some of you might be a little confused about what the offset is, and how it’s calculated. So let me make it very simple.

  1. The offset is the distance in bytes from the start of the method's bytecode. For example, an offset of 1 means one byte past the first instruction
  2. Each opcode occupies one byte, and each of its operands occupies one or more bytes
  3. Offset of the next instruction = current offset + 1 (the opcode byte) + number of operand bytes of the current instruction

Take “1: invokespecial #1” above as an example. The current offset is 1, and #1 is an index into the constant pool. We cannot simply assume that this operand takes 1 byte; we need to consult the official Java Virtual Machine Specification, which I list in the appendix. The specification shows that invokespecial is followed by two operand bytes, which together form the constant pool index. So the offset of the next instruction = 1 + 1 + 2 = 4, that is, the offset of “4: return”. Of course, we do not normally need to calculate offsets ourselves; this is only to satisfy our curiosity with the help of the official documentation.
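The same rule can be applied to every instruction in the javap listing. The following sketch (the class OffsetCalculator and its arrays are made up for this example) takes the number of operand bytes of each instruction in Demo1's constructor and main method, as given by the JVM specification for those opcodes, and recomputes the offsets that javap prints.

```java
import java.util.Arrays;

// Hypothetical helper class for this article.
class OffsetCalculator {
    // Operand byte counts for the constructor: aload_0, invokespecial, return.
    static final int[] CONSTRUCTOR_OPERAND_BYTES = {0, 2, 0};

    // Operand byte counts for main, in order: sipush, istore_1, bipush, istore_2,
    // iload_1, iload_2, idiv, istore_3, bipush, istore, getstatic, iload_3,
    // iload, iadd, invokevirtual, return.
    static final int[] MAIN_OPERAND_BYTES = {2, 0, 1, 0, 0, 0, 0, 0, 1, 1, 2, 0, 1, 0, 2, 0};

    // Next offset = current offset + 1 opcode byte + operand bytes of the current instruction.
    static int[] offsets(int[] operandBytes) {
        int[] result = new int[operandBytes.length];
        int offset = 0;
        for (int i = 0; i < operandBytes.length; i++) {
            result[i] = offset;
            offset += 1 + operandBytes[i];
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(offsets(CONSTRUCTOR_OPERAND_BYTES)));
        // prints [0, 1, 4]
        System.out.println(Arrays.toString(offsets(MAIN_OPERAND_BYTES)));
        // prints [0, 3, 4, 6, 7, 8, 9, 10, 11, 13, 15, 18, 19, 21, 22, 25]
    }
}
```

Both results match the offsets shown in the javap output for the constructor and for main.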

Class content – main method

The following is the bytecode of the main method in the class file. Before explaining how the main method executes, we first analyze how the program as a whole runs.

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=3, locals=5, args_size=1
         0: sipush        500
         3: istore_1
         4: bipush        100
         6: istore_2
         7: iload_1
         8: iload_2
         9: idiv
        10: istore_3
        11: bipush        50
        13: istore        4
        15: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
        18: iload_3
        19: iload         4
        21: iadd
        22: invokevirtual #3                  // Method java/io/PrintStream.println:(I)V
        25: return
      LineNumberTable:
        line 3: 0
        line 4: 4
        line 5: 7
        line 6: 11
        line 7: 15
        line 8: 25

Complete program execution analysis (I)

As you can see from the figure above, the Java source code is compiled into a class bytecode file. When the JVM loads a class, it loads the class information, runtime constant pool, string constants, and so on into the method area. In the HotSpot virtual machine, the method area was stored in the permanent generation up to 1.7; since 1.8 it has been stored in Metaspace.

Complete program execution analysis (II)

Once the class is loaded, the program runs. The JVM creates threads to execute the code, and at this point space must be allocated for each thread's virtual machine stack and program counter. (The native method stack is not involved here, because our code is Java code.) Each thread has its own space, and its program counter holds the address of the bytecode instruction it is executing.

Complete program execution analysis (III)

When a thread runs, it needs a small space for its program counter to keep track of where it is in the code. It also needs its own virtual machine stack: there is one virtual machine stack per thread, each stack holds multiple stack frames, and each stack frame corresponds to one method invocation. The main method is the entry point of the program, and its stack frame contains the local variable table and the operand stack. Let’s examine the execution of the main method above.

“stack=3, locals=5, args_size=1” means that the depth of the operand stack is 3, the number of local variables is 5 (args, x, y, a, b), and the number of arguments is 1 (args).

| Offset | Bytecode instruction | Operation description | Program counter | Local variable table | Operand stack (top first) |
| --- | --- | --- | --- | --- | --- |
| 0 | sipush 500 | Push the value 500 onto the operand stack | 0 | args | 500 |
| 3 | istore_1 | Pop the top of the operand stack, here 500, and store it in local variable 1 | 3 | args, 500 | |
| 4 | bipush 100 | Push the value 100 onto the operand stack | 4 | args, 500 | 100 |
| 6 | istore_2 | Pop the top of the operand stack, here 100, and store it in local variable 2 | 6 | args, 500, 100 | |
| 7 | iload_1 | Read local variable 1, here 500, and push it onto the operand stack | 7 | args, 500, 100 | 500 |
| 8 | iload_2 | Read local variable 2, here 100, and push it onto the operand stack | 8 | args, 500, 100 | 100, 500 |
| 9 | idiv | Pop the top two int values, divide them (500 / 100), and push the result 5 | 9 | args, 500, 100 | 5 |
| 10 | istore_3 | Pop the int value on top of the stack and store it in local variable 3 | 10 | args, 500, 100, 5 | |
| 11 | bipush 50 | Push the value 50 onto the operand stack | 11 | args, 500, 100, 5 | 50 |
| 13 | istore 4 | Pop the top of the operand stack, here 50, and store it in local variable 4 | 13 | args, 500, 100, 5, 50 | |
| 15 | getstatic #2 | Get the static field referenced by constant pool entry #2, the System.out static variable, and push it | 15 | args, 500, 100, 5, 50 | #2 |
| 18 | iload_3 | Read local variable 3, here 5, and push it onto the operand stack | 18 | args, 500, 100, 5, 50 | 5, #2 |
| 19 | iload 4 | Read local variable 4, here 50, and push it onto the operand stack | 19 | args, 500, 100, 5, 50 | 50, 5, #2 |
| 21 | iadd | Pop the top two int values, add them, and push the result 55 | 21 | args, 500, 100, 5, 50 | 55, #2 |
| 22 | invokevirtual #3 | Create a new stack frame: the arguments are popped from the operand stack and passed to the new frame, and the virtual machine starts executing the frame on top of the stack. Here #3 is the System.out.println() method | 22 | args, 500, 100, 5, 50 | #2, 55 (moved to the new stack frame) |
| 25 | return | Return from a void method; main completes | 25 | args, 500, 100, 5, 50 | |
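The walkthrough above can be replayed in ordinary Java by modelling the stack frame by hand. This is only an illustration, not how the JVM is implemented: the class MainFrameSimulation is invented for this example, the local variable table is an int array with 5 slots (slot 0, which holds args in the real frame, is left unused), and the operand stack is an ArrayDeque. The getstatic and invokevirtual steps are represented by comments, since the model holds int values only.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical simulation class for this article.
class MainFrameSimulation {
    // Replays the bytecode of Demo1.main and returns the value that would be passed to println.
    static int run() {
        int[] locals = new int[5];                 // locals=5; slot 0 would hold args
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack, push/pop work on the top

        stack.push(500);                           //  0: sipush 500
        locals[1] = stack.pop();                   //  3: istore_1
        stack.push(100);                           //  4: bipush 100
        locals[2] = stack.pop();                   //  6: istore_2
        stack.push(locals[1]);                     //  7: iload_1
        stack.push(locals[2]);                     //  8: iload_2
        int divisor = stack.pop();                 //  9: idiv pops the divisor first,
        int dividend = stack.pop();                //     then the dividend,
        stack.push(dividend / divisor);            //     and pushes the quotient
        locals[3] = stack.pop();                   // 10: istore_3
        stack.push(50);                            // 11: bipush 50
        locals[4] = stack.pop();                   // 13: istore 4
        // 15: getstatic #2 would push the System.out reference here (not modelled)
        stack.push(locals[3]);                     // 18: iload_3
        stack.push(locals[4]);                     // 19: iload 4
        stack.push(stack.pop() + stack.pop());     // 21: iadd
        // 22: invokevirtual #3 would pop the argument and pass it to println
        return stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 55, the same result as Demo1
    }
}
```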

Appendix

  1. JVM bytecode instruction table (download)
  2. Official Documentation of the Java Virtual Machine Specification