Background
The third edition of Understanding the Java Virtual Machine has just been released, and it is worth reading: it adds a lot of material on newer versions of the virtual machine and reworks parts of the older content. In the spirit of review and consolidation, I decided to compile a simple class file and analyze its Java bytecode. I hope this helps you understand and consolidate your own knowledge of Java bytecode as you read this article.
Note: the environment used here is OpenJDK 12.
Compile the 1+1 code
First, we need a small program. Learning starts with the simplest case, 1+1, so the code is as follows:
package top.luozhou.test;

/**
 * @description:
 * @author: luozhou
 * @create: 2019-12-25 21:28
 **/
public class TestJava {
    public static void main(String[] args) {
        int a=1+1;
        System.out.println(a);
    }
}
After writing the Java source file, run javac TestJava.java to compile it and generate TestJava.class. Then run the disassembler command javap -verbose TestJava. The bytecode result is displayed as follows:
Compiled from "TestJava.java"
public class top.luozhou.test.TestJava
minor version: 0
major version: 56
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #5.#14 // java/lang/Object."<init>":()V
#2 = Fieldref #15.#16 // java/lang/System.out:Ljava/io/PrintStream;
#3 = Methodref #17.#18 // java/io/PrintStream.println:(I)V
#4 = Class #19 // top/luozhou/test/TestJava
#5 = Class #20 // java/lang/Object
#6 = Utf8 <init>
#7 = Utf8 ()V
#8 = Utf8 Code
#9 = Utf8 LineNumberTable
#10 = Utf8 main
#11 = Utf8 ([Ljava/lang/String;)V
#12 = Utf8 SourceFile
#13 = Utf8 TestJava.java
#14 = NameAndType #6:#7 // "<init>":()V
#15 = Class #21 // java/lang/System
#16 = NameAndType #22:#23 // out:Ljava/io/PrintStream;
#17 = Class #24 // java/io/PrintStream
#18 = NameAndType #25:#26 // println:(I)V
#19 = Utf8 top/luozhou/test/TestJava
#20 = Utf8 java/lang/Object
#21 = Utf8 java/lang/System
#22 = Utf8 out
#23 = Utf8 Ljava/io/PrintStream;
#24 = Utf8 java/io/PrintStream
#25 = Utf8 println
#26 = Utf8 (I)V
{
public top.luozhou.test.TestJava();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 8: 0
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=2, args_size=1
0: iconst_2
1: istore_1
2: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
5: iload_1
6: invokevirtual #3 // Method java/io/PrintStream.println:(I)V
9: return
LineNumberTable:
line 10: 0
line 11: 2
line 12: 9
}
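The compile and disassemble steps can also be driven from Java itself. The following is only a sketch: it assumes a full JDK (9 or later) where the javac and javap tools are exposed through java.util.spi.ToolProvider, it uses a made-up package-less class named Demo instead of top.luozhou.test.TestJava to keep the file layout simple, and JavapRunner and its method names are invented for illustration.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.spi.ToolProvider;

class JavapRunner {
    /** Compiles one package-less class in a temp directory and returns the javap -verbose text. */
    static String disassemble(String className, String source) {
        try {
            // The temp directory is left behind; a real tool would clean it up.
            Path dir = Files.createTempDirectory("javap-demo");
            Path sourceFile = dir.resolve(className + ".java");
            Files.write(sourceFile, source.getBytes(StandardCharsets.UTF_8));

            StringWriter out = new StringWriter();
            // Same as: javac -d <dir> <dir>/<className>.java
            run("javac", out, "-d", dir.toString(), sourceFile.toString());
            // Same as: javap -verbose <dir>/<className>.class
            run("javap", out, "-verbose", dir.resolve(className + ".class").toString());
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void run(String toolName, StringWriter out, String... args) {
        ToolProvider tool = ToolProvider.findFirst(toolName)
                .orElseThrow(() -> new IllegalStateException(toolName + " is not available"));
        StringWriter err = new StringWriter();
        PrintWriter outWriter = new PrintWriter(out);
        PrintWriter errWriter = new PrintWriter(err);
        int exitCode = tool.run(outWriter, errWriter, args);
        outWriter.flush();
        errWriter.flush();
        if (exitCode != 0) {
            throw new IllegalStateException(toolName + " failed: " + err);
        }
    }

    public static void main(String[] args) {
        String source = "public class Demo { public static void main(String[] a) {"
                + " int x = 1 + 1; System.out.println(x); } }";
        System.out.println(disassemble("Demo", source));
    }
}
```

Running main prints a listing with the same structure as the one above (version numbers, constant pool, and the Code attribute of each method), although the exact constant pool indexes depend on the JDK used.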
Parsing bytecode
1. Basic information
The output above omits some redundant information that does not affect the analysis. Now we can start parsing the bytecode.
minor version: 0: the minor version number; 0 means it is not used.
major version: 56: the major version number. 56 corresponds to JDK 12, which means this class file can run only on a JDK 12 or later virtual machine.
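These two numbers sit right after the magic number at the very start of every class file. Here is a minimal sketch of reading them. The class and method names are made up for illustration, and the header bytes in main are written by hand to match the values above rather than read from TestJava.class.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class ClassVersion {
    /** Returns {minor, major} read from the first 8 bytes of a class file. */
    static int[] readVersion(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            int magic = in.readInt();               // u4 magic, always 0xCAFEBABE
            if (magic != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            int minor = in.readUnsignedShort();     // u2 minor_version
            int major = in.readUnsignedShort();     // u2 major_version
            return new int[] {minor, major};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** From Java 5 (major 49) onward, the release number is major - 44. */
    static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 56};
        int[] version = readVersion(header);
        // prints "minor=0, major=56, Java 12"
        System.out.println("minor=" + version[0] + ", major=" + version[1]
                + ", Java " + javaRelease(version[1]));
    }
}
```

To try it on a real file, pass the bytes of TestJava.class (for example from Files.readAllBytes) to readVersion.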
flags: ACC_PUBLIC, ACC_SUPER
ACC_PUBLIC: an access flag indicating that the class is public.
ACC_SUPER: this flag exists to fix how the invokespecial instruction calls superclass methods. It can be thought of as a bug fix dating from Java 1.0.2 that lets invokespecial find the correct superclass method. Since Java 1.0.2, the compiler always sets the ACC_SUPER flag in the class file. If you are interested, see the link in the reference section at the end of this article.
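In the class file these flags are stored as bits of a single 16-bit access_flags value, and javap only prints their names. The sketch below decodes a few class-level flags using the bit values defined in the JVM specification. The class name AccessFlags and the decode method are made up for illustration, and 0x0021 is simply the value that ACC_PUBLIC and ACC_SUPER combine to.

```java
import java.util.ArrayList;
import java.util.List;

class AccessFlags {
    // Class-level flag bits as defined in the JVM specification.
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_FINAL = 0x0010;
    static final int ACC_SUPER = 0x0020;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT = 0x0400;

    /** Lists the names of the flags that are set in the given access_flags value. */
    static List<String> decode(int flags) {
        List<String> names = new ArrayList<>();
        if ((flags & ACC_PUBLIC) != 0) names.add("ACC_PUBLIC");
        if ((flags & ACC_FINAL) != 0) names.add("ACC_FINAL");
        if ((flags & ACC_SUPER) != 0) names.add("ACC_SUPER");
        if ((flags & ACC_INTERFACE) != 0) names.add("ACC_INTERFACE");
        if ((flags & ACC_ABSTRACT) != 0) names.add("ACC_ABSTRACT");
        return names;
    }

    public static void main(String[] args) {
        // prints "[ACC_PUBLIC, ACC_SUPER]"
        System.out.println(decode(ACC_PUBLIC | ACC_SUPER));
    }
}
```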
2. The constant pool
Next, we look at the constant pool, which you can follow along with in the complete bytecode listing above.
#1 = Methodref #5.#14 // java/lang/Object."<init>":()V
This is a method reference. #5 is the constant pool index of the class that owns the method, and #14 is the index of the method's name and type. Looking up index 5 gives:
#5 = Class #20 // java/lang/Object
This shows that #5 is the java/lang/Object class. In the same way, #14 resolves to "<init>":()V, which is the instance initialization method, that is, the constructor.
#2 = Fieldref #15.#16 // java/lang/System.out:Ljava/io/PrintStream;
This is a field reference, which in turn points to #15 and #16. It refers to the out field, of type PrintStream, in the java/lang/System class. The other constant pool entries can be analyzed in the same way, so to save space I will not go through them all and will only list a few key types.
NameAndType: an entry that pairs a name with a type descriptor. It can describe either a method or a field: in the listing above, #14 and #18 describe methods, while #16 describes the field out.
Utf8: we usually think of UTF-8 as just a character encoding, but here it means more than that. A Utf8 entry is a string stored in UTF-8 encoding. It is the most commonly used entry type in the class file, and the names and descriptors of methods, fields, classes, and so on are all stored this way. For example:
#4 = Class #19
#19 = Utf8 top/luozhou/test/TestJava
Entry #4 is a Class entry that points to #19. Entry #19 is a Utf8 entry whose stored value is top/luozhou/test/TestJava. Following this chain, we know that the class referenced at #4 is top/luozhou/test/TestJava.
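The chain-following described above can be written down as a small recursive lookup. This is only a toy model: the Entry class, the string tags, and the resolve method are invented for illustration and do not reflect how the JVM or javap store the pool. Only the entries needed for #1, #2, and #4 are copied from the listing.

```java
import java.util.HashMap;
import java.util.Map;

class ConstantPoolDemo {
    /** One constant pool entry: a tag plus either a string or indexes of other entries. */
    static final class Entry {
        final String tag;
        final String text;   // used only by Utf8 entries
        final int[] refs;    // indexes of the entries this one points to

        Entry(String tag, String text, int... refs) {
            this.tag = tag;
            this.text = text;
            this.refs = refs;
        }
    }

    static final Map<Integer, Entry> POOL = new HashMap<>();

    static {
        POOL.put(1, new Entry("Methodref", null, 5, 14));
        POOL.put(2, new Entry("Fieldref", null, 15, 16));
        POOL.put(4, new Entry("Class", null, 19));
        POOL.put(5, new Entry("Class", null, 20));
        POOL.put(6, new Entry("Utf8", "<init>"));
        POOL.put(7, new Entry("Utf8", "()V"));
        POOL.put(14, new Entry("NameAndType", null, 6, 7));
        POOL.put(15, new Entry("Class", null, 21));
        POOL.put(16, new Entry("NameAndType", null, 22, 23));
        POOL.put(19, new Entry("Utf8", "top/luozhou/test/TestJava"));
        POOL.put(20, new Entry("Utf8", "java/lang/Object"));
        POOL.put(21, new Entry("Utf8", "java/lang/System"));
        POOL.put(22, new Entry("Utf8", "out"));
        POOL.put(23, new Entry("Utf8", "Ljava/io/PrintStream;"));
    }

    /** Follows the indexes until only Utf8 strings remain, then joins them. */
    static String resolve(int index) {
        Entry entry = POOL.get(index);
        switch (entry.tag) {
            case "Utf8":
                return entry.text;
            case "Class":
                return resolve(entry.refs[0]);
            case "NameAndType":
                return resolve(entry.refs[0]) + ":" + resolve(entry.refs[1]);
            case "Methodref":
            case "Fieldref":
                return resolve(entry.refs[0]) + "." + resolve(entry.refs[1]);
            default:
                throw new IllegalArgumentException("unknown tag " + entry.tag);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(4)); // prints "top/luozhou/test/TestJava"
        System.out.println(resolve(2)); // prints "java/lang/System.out:Ljava/io/PrintStream;"
    }
}
```

The strings it produces match the comments javap prints after each entry, except that javap additionally wraps <init> in quotes.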
3. Constructor information
Next, we examine the bytecode of the constructor. When an object is created, its constructor is executed first. If you do not write a constructor, the compiler adds a no-argument constructor by default.
public top.luozhou.test.TestJava();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 8: 0
descriptor: ()V: the method takes no parameters and returns no value (void).
flags: ACC_PUBLIC: the method is public.
stack=1, locals=1, args_size=1: the maximum operand stack depth is 1, the local variable table has 1 slot, and the method is called with 1 argument.
Why are they all 1? Isn't this the default constructor? Where does an argument come from? In fact, the Java language has an unwritten rule: in any instance method, you can use this to access the object the method belongs to. If you are familiar with Python, you know that a method defined in Python always takes an explicit first parameter called self, which is the reference to the instance inside the method. Java has the same mechanism, except that the compiler handles it for you at compile time. So the 1 here is simply the this parameter.
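To make the hidden this argument more concrete, here is a small sketch that puts an instance method next to a static method taking the receiver explicitly. The class ThisParamDemo and its methods are hypothetical and not part of the article's code. If you compile it and run javap -verbose on it, both methods should report args_size=1.

```java
class ThisParamDemo {
    private final int value;

    ThisParamDemo(int value) {
        this.value = value;
    }

    // Instance method: declares no parameters, but the receiver is passed
    // implicitly and occupies local variable slot 0.
    int viaThis() {
        return this.value;
    }

    // Static method: the receiver is spelled out as an ordinary parameter,
    // much like Python's self, and also occupies slot 0.
    static int viaSelf(ThisParamDemo self) {
        return self.value;
    }

    public static void main(String[] args) {
        ThisParamDemo demo = new ThisParamDemo(42);
        System.out.println(demo.viaThis() == viaSelf(demo)); // prints "true"
    }
}
```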
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 8: 0
With the analysis above, it is clear what this constructor does.
aload_0: loads the first slot of the local variable table, namely this, onto the operand stack.
invokespecial: directly invokes the instance initialization method, here the constructor of java/lang/Object.
return: ends the method.
LineNumberTable: the line number table, which maps bytecode offsets to lines of source code. line 8: 0 means that line 8 of the source corresponds to the bytecode at offset 0. Line 8 is where the class is declared rather than a constructor in the source, because this is the default constructor generated by the compiler.
The Object constructor is invoked because Object is the parent of all classes, and a subclass constructor must call its parent's constructor first.
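The parent-first order can be observed directly in plain Java. The sketch below uses made-up classes (Parent, ImplicitChild, ExplicitChild) with a visible parent constructor in place of Object. It illustrates the rule and is not code from the article.

```java
class CtorOrderDemo {
    static final StringBuilder LOG = new StringBuilder();

    static class Parent {
        Parent() {
            LOG.append("Parent;");
        }
    }

    // No constructor is written here, so the compiler adds one that is
    // equivalent to: ImplicitChild() { super(); }
    static class ImplicitChild extends Parent {
    }

    static class ExplicitChild extends Parent {
        ExplicitChild() {
            super();                // becomes aload_0 + invokespecial Parent."<init>"
            LOG.append("Child;");
        }
    }

    /** Clears the log, runs the action, and returns what the constructors recorded. */
    static String record(Runnable action) {
        LOG.setLength(0);
        action.run();
        return LOG.toString();
    }

    public static void main(String[] args) {
        System.out.println(record(() -> new ImplicitChild())); // prints "Parent;"
        System.out.println(record(() -> new ExplicitChild())); // prints "Parent;Child;"
    }
}
```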
4. Main method information
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=2, args_size=1
0: iconst_2
1: istore_1
2: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
5: iload_1
6: invokevirtual #3 // Method java/io/PrintStream.println:(I)V
9: return
LineNumberTable:
line 10: 0
line 11: 2
line 12: 9
After the constructor analysis, the main method will look familiar. I will skip the repeated parts and focus on the Code section.
stack=2, locals=2, args_size=1: the maximum operand stack depth and the number of local variable slots are both 2, and there is 1 argument. Why? The local variable table holds the args parameter in slot 0 and the declared variable a in slot 1, so locals is 2. The operand stack is at its deepest just before println is called, when it holds both the PrintStream reference and the int value, so stack is 2. Then why is args_size 1 rather than 2, given that this is passed by default? Because main is a static method. A static method is called directly through the class name and method name, so no instance is passed in, and the only argument is args.
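The single argument counted by args_size is the String[] parameter declared in the descriptor ([Ljava/lang/String;)V. As a small sketch, the standard java.lang.invoke.MethodType class can parse the descriptors that appear in this article. The DescriptorDemo class and its describe method are made up for illustration.

```java
import java.lang.invoke.MethodType;

class DescriptorDemo {
    /** Summarizes a JVM method descriptor such as "(I)V". */
    static String describe(String descriptor) {
        // A null class loader means the system class loader is used to look up types.
        MethodType type = MethodType.fromMethodDescriptorString(descriptor, null);
        return type.parameterCount() + " parameter(s), returns "
                + type.returnType().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(describe("()V"));                    // prints "0 parameter(s), returns void"
        System.out.println(describe("(I)V"));                   // prints "1 parameter(s), returns void"
        System.out.println(describe("([Ljava/lang/String;)V")); // prints "1 parameter(s), returns void"
    }
}
```

Note that a descriptor never mentions this, which is why the constructor's ()V still comes with args_size=1 while the static main gets its 1 purely from args.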
0: iconst_2: pushes the int constant 2 onto the top of the stack.
1: istore_1: pops the int value at the top of the stack and stores it into the second local variable slot (slot 1).
2: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream; : gets the static field System.out, a PrintStream, and pushes its reference onto the stack.
5: iload_1: pushes the int in the second local variable slot onto the top of the stack.
6: invokevirtual #3 // Method java/io/PrintStream.println:(I)V: calls the println method. println takes the element at the top of the stack as its argument and prints 2.
9: return: ends the method.
LineNumberTable: the line number table, which maps source lines to bytecode offsets as described for the constructor.
line 10: 0: line 10 of the source corresponds to the bytecode at offset 0, iconst_2. Here we can see that the compiler has already computed 1+1 for us and pushes 2 directly.
line 11: 2: line 11 corresponds to offset 2, the getstatic instruction that fetches the PrintStream used for output.
line 12: 9: line 12 corresponds to the return at offset 9, the end of the method.
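The six instructions above can be replayed by hand with an explicit operand stack and local variable array. The following is a hand-written sketch, not a real interpreter: MainFrameSim and its Result class are invented names, and each instruction is simply transcribed into the equivalent stack operation so that the maximum depth can be checked against stack=2.

```java
import java.io.PrintStream;
import java.util.ArrayDeque;
import java.util.Deque;

class MainFrameSim {
    /** What println received, and the deepest operand stack observed. */
    static final class Result {
        final int printed;
        final int maxStack;

        Result(int printed, int maxStack) {
            this.printed = printed;
            this.maxStack = maxStack;
        }
    }

    static Result simulate() {
        Object[] locals = new Object[2];                 // locals=2
        locals[0] = new String[0];                       // slot 0: the args parameter
        Deque<Object> stack = new ArrayDeque<>();
        int maxStack = 0;

        stack.push(2);                                   // 0: iconst_2
        maxStack = Math.max(maxStack, stack.size());
        locals[1] = stack.pop();                         // 1: istore_1 (a = 2)
        stack.push(System.out);                          // 2: getstatic #2
        maxStack = Math.max(maxStack, stack.size());
        stack.push(locals[1]);                           // 5: iload_1
        maxStack = Math.max(maxStack, stack.size());
        int argument = (Integer) stack.pop();            // 6: invokevirtual #3 pops the int argument
        PrintStream receiver = (PrintStream) stack.pop();//    and then the receiver
        receiver.println(argument);
        return new Result(argument, maxStack);           // 9: return
    }

    public static void main(String[] args) {
        Result result = simulate();                      // prints "2"
        System.out.println("max stack depth = " + result.maxStack); // prints "max stack depth = 2"
    }
}
```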
Here I have also drawn an animation of the main method's execution, which I hope helps you understand it.
Conclusion
In this article I started from compiling the source code of 1+1 and then analyzed the generated Java bytecode, including the basic class information, the constant pool, and the method invocation process. Through this analysis we gained a basic understanding of Java bytecode and also saw one of the Java compiler's optimizations as it shows up in the compiled bytecode: for our 1+1, the bytecode assigns 2 to the variable directly instead of performing an addition, which optimizes the code and improves execution efficiency.
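This optimization is usually called constant folding, and it applies only to compile-time constant expressions. The sketch below contrasts three variants. The class and method names are made up, and the bytecode noted in the comments is what javac is expected to emit, which you can confirm by running javap -c on the compiled class.

```java
class ConstantFoldingDemo {
    static int folded() {
        int a = 1 + 1;      // constant expression: compiled to iconst_2
        return a;
    }

    static int foldedViaFinal() {
        final int x = 1;    // a final local initialized with a constant is itself a constant
        int a = x + 1;      // still a constant expression: compiled to iconst_2
        return a;
    }

    static int notFolded() {
        int x = 1;          // ordinary variable, not a compile-time constant
        int a = x + 1;      // the addition stays: iload, iconst_1, iadd
        return a;
    }

    public static void main(String[] args) {
        // All three return 2; only the generated bytecode differs.
        System.out.println(folded() + " " + foldedViaFinal() + " " + notFolded()); // prints "2 2 2"
    }
}
```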
Reference
- Bugs.openjdk.java.net/browse/JDK-…