Java code from a bytecode perspective

The next thing we need to do is become as familiar with bytecode instructions as possible, which will give us a fundamental leap in our understanding of Java code.

1. Bytecode instruction set

1.1 Instruction set

Everything that follows is about bytecode instructions. They fall into groups of instructions with similar meanings, so if we want to learn quickly it is usually enough to study one instruction from each group. Here we show all the instructions.

Let’s start by analyzing these groups of instructions. The smallest unit of code execution is the stack frame, which holds three pieces of data that are constantly changing:

  1. A reference to the runtime constant pool
  2. The local variable table
  3. The operand stack

So that means there has to be some instruction to handle the flow of data between these three.

  1. Top of the operand stack to the local variable table: the xstore family, where x varies with the data type
  2. Local variable table to the top of the operand stack: the xload family, where x varies with the data type
  3. Constants pushed onto the top of the stack (no constant pool required): the xconst family, where x varies with the data type
  4. Constant pool data to the top of the stack: the ldc family
  5. Operations on the top of the stack itself: the pop, dup, and swap instructions

These groups of instructions cover the movement of data between the three locations.

Corresponding to the code we usually write, there are also the following categories:

  1. Operations on data, with several sub-branches:

    • Floating-point arithmetic (addition, subtraction, multiplication, division, negation, and remainder on float and double)
    • Integer arithmetic (addition, subtraction, multiplication, division, negation, and remainder on int and long)
    • Logical operations (bitwise and/or on two numbers, and bit shifts)
    • Numeric type conversion
  2. Array operations

  3. Flow control (comparison and jump instructions, from which if, for, while, and switch statements are built)

  4. Object operation instructions

  5. Method call instructions

  6. Method return instructions

  7. Exception-related instructions (these can be counted as part of flow control)

  8. Synchronization-related instructions

A simple yet powerful instruction set: just a couple of hundred instructions hold up the whole Java edifice!

2. Look at the principles behind Java statements from a bytecode perspective

Since we want to look at the Java language from a different angle, let’s follow the usual path of learning Java.

Any Java primer introduces knowledge step by step. Here are the main topics (incomplete, chosen for the sake of the bytecode discussion):

  1. Basic data types
  2. Arithmetic and logical operators
  3. Process control
  4. Object-oriented programming
    1. Class initialization
    2. Class method call
  5. Exception handling mechanism
  6. Collection frameworks and generics
  7. Lambda expressions and the Stream API
  8. I/O
  9. Multithreading
  10. Reflection

So as we learn bytecode, we can follow the same order, one topic at a time.

2.1 Instruction Basis

This section demonstrates the use of most of the instructions.

2.1.1 Load and Store instructions

Load and store instructions are the most frequently used. They fall into three kinds: load, store, and constant load.

  1. Load instructions push a variable from the local variable table onto the operand stack. For example, iload_0 pushes the int variable at index 0 of the local variable table. Depending on the variable type there are also lload, fload, dload, aload, and so on, which load long, float, double, and reference variables respectively.
  2. Store instructions pop the value on top of the stack and store it in the local variable table. For example, istore_0 stores the value on top of the operand stack at index 0 of the local variable table. That slot holds an int; depending on the variable type there are also lstore, fstore, dstore, astore, and so on.
  3. Constant-load instructions: the common ones are the const, push, and ldc families. The const and push instructions push a constant value directly onto the operand stack. For example, iconst_0 pushes the integer 0, and bipush 100 pushes the int constant 100. The ldc instruction pushes a constant from the constant pool onto the operand stack. For example, ldc #10 pushes the constant at index 10 of the constant pool.

Why are there so many ways to load an int constant? To keep the bytecode compact, the instruction used depends on the range of the value n.

  • If n is in the range [-1, 5], iconst_n is used. The value is encoded in the opcode itself, so the instruction takes only one byte. For example, iconst_2 is 0x05 in hexadecimal. -1 is special and corresponds to the instruction iconst_m1 (0x02).
  • If n is in the range [-128, 127], bipush n is used, and the opcode and operand together take only two bytes. For example, if n is 100 (0x64), bipush 100 is 0x1064 in hexadecimal.
  • If n is in the range [-32768, 32767], sipush n is used, and the opcode and operand together take only three bytes. For example, if n is 1024 (0x0400), the corresponding bytecode is sipush 1024 (0x110400).
  • If n is outside these ranges, ldc is used and the integer is stored in the constant pool. For example, if n is 40000, 40000 is stored in the constant pool and the load instruction is ldc #i, where i is the index into the constant pool.
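
The selection rule above can be sketched as a small Java method. This is only an illustration of the ranges listed here, not javac’s actual code; the class name IntConstantChooser and the method choose are made up for the example.

```java
public class IntConstantChooser {
    // Hypothetical helper: returns the mnemonic a compiler would typically
    // pick for the int constant n, following the ranges described above.
    static String choose(int n) {
        if (n == -1) {
            return "iconst_m1";            // 1 byte
        }
        if (n >= 0 && n <= 5) {
            return "iconst_" + n;          // 1 byte
        }
        if (n >= Byte.MIN_VALUE && n <= Byte.MAX_VALUE) {
            return "bipush " + n;          // 2 bytes: opcode + 1-byte operand
        }
        if (n >= Short.MIN_VALUE && n <= Short.MAX_VALUE) {
            return "sipush " + n;          // 3 bytes: opcode + 2-byte operand
        }
        return "ldc";                      // value is stored in the constant pool
    }

    public static void main(String[] args) {
        for (int n : new int[] {-1, 2, 100, 1024, 40000}) {
            System.out.println(n + " -> " + choose(n));
        }
    }
}
```

The checks run from the narrowest range to the widest, so each value gets the shortest encoding that can hold it.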

Consider the following code:

public class TestByteCode {

    private int int_2_2_1 = 1;

    public void test2_1_1_1(int i,int j){
        int k = int_2_2_1 + i +j;
        System.out.println(k);
    }

    public void test2_1_1_2(){
        int i = 1;
        int j = 2;
        int k = int_2_2_1 + i +j;
        System.out.println(k);
    }

    public static void main(String[] args) {}
}

The local variable table of test2_1_1_1 has 4 slots: slot 0 holds this, and slots 1, 2, and 3 hold the i, j, and k that we declared.

Its bytecode is:

 0 aload_0
 1 getfield #2 <com/zifang/util/core/TestByteCode.int_2_2_1>
 4 iload_1
 5 iadd
 6 iload_2
 7 iadd
 8 istore_3
 9 getstatic #3 <java/lang/System.out>
12 iload_3
13 invokevirtual #4 <java/io/PrintStream.println>
16 return

Offset 0: aload_0 pushes this onto the operand stack.

Offset 1: getfield pops this, reads the int_2_2_1 field, and pushes its value onto the operand stack.

Offset 4: iload_1 pushes the value in slot 1 of the local variable table (that is, i) onto the operand stack.

Offset 5: iadd pops int_2_2_1 and i, adds them, and pushes the sum.

Offset 6: iload_2 pushes the value in slot 2 of the local variable table (j).

Offset 7: iadd adds again and pushes the sum.

Offset 8: istore_3 pops the final value and stores it in slot 3 of the local variable table (k).

Offset 9: getstatic pushes the reference held in System.out.

Offset 12: iload_3 pushes the value of k.

Offset 13: invokevirtual makes the call to println.

Offset 16: return returns from the void method.

The test2_1_1_2 method is there to illustrate the iconst_* instructions.

You can analyze it yourself.

2.1.2 Operand stack instructions

Common operand stack instructions are pop, dup, and swap. The pop instruction removes the top value from the stack. A common scenario is calling a method that returns a value which is then not used, as in the following code.

    public int test2_1_2_1(){
        int i = 1;
        int j = 2;
        int k = int_2_2_1 + i +j;
        return k;
    }

    public static void main(String[] args) {
        new TestByteCode().test2_1_2_1();
    }

The bytecode of the main method is:

 0 new #5 <com/zifang/util/core/TestByteCode>
 3 dup
 4 invokespecial #6 <com/zifang/util/core/TestByteCode.<init>>
 7 invokevirtual #7 <com/zifang/util/core/TestByteCode.test2_1_2_1>
10 pop
11 return

The other instructions work much the same way: they just operate on the operand stack itself. I’m cutting a corner here and will fill in the details later.

2.2 Operation and type conversion instructions

In Java, data can be added, subtracted, multiplied, and divided, along with other operations, and each has corresponding bytecode.

For an expression that adds an int to a float, such as 1.0f + 1, the int operand is converted to float. This is because the fadd instruction only accepts two float values, so at the bytecode level the i2f instruction converts the int to a float. (With literal operands javac folds the expression at compile time; the sequence here shows the unfolded form.)

fconst_1 // pushes 1.0f
iconst_1 // pushes 1
i2f      // converts the int 1 on top of the stack to float
fadd     // adds the two float values

boolean, char, byte, and short are distinct data types in the language, but at the JVM level they are all handled as int. No explicit conversion to int is needed, and the bytecode has no instruction for converting them to int.
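
This promotion is visible from the language side too. The sketch below uses a made-up class name (PromotionDemo) and relies only on the fact that arithmetic on byte or char operands produces an int.

```java
public class PromotionDemo {
    // Adding two bytes yields an int: there is no byte addition in the JVM,
    // so both operands are handled as int and added with iadd.
    static Object addBytes(byte a, byte b) {
        return a + b;   // boxed as Integer, not Byte
    }

    // The same holds for char: c + 1 is an int, not a char.
    static Object addToChar(char c) {
        return c + 1;
    }

    public static void main(String[] args) {
        System.out.println(addBytes((byte) 1, (byte) 2).getClass().getSimpleName());
        System.out.println(addToChar('a'));
    }
}
```

Because the result is already an int, storing it back into a byte or char variable needs an explicit cast, which compiles to i2b or i2c.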

One special instruction here is the increment instruction iinc (decrement works the same way, with a negative amount). It takes two operands, A and B: A is the index of a slot in the local variable table and B is the amount to add. The instruction operates directly on the local variable table and leaves the operand stack untouched, which leads to some surprising behavior.

  • i++ example
    public static void test_2_2_1() {
        int i = 0;
        for (int j = 0; j < 50; j++) {
            i = i++;
        }
        System.out.println(i);
    }

Looking at the bytecode, the core part is:

10 iload_0
11 iinc 0 by 1
14 istore_0

As you can see, iload_0 pushes i = 0 onto the operand stack, and then iinc increments slot 0 of the local variable table directly. At this point i in the local variable table is already 1, but istore_0 then writes the 0 from the operand stack back into the slot, so i is 0 again. It stays 0 no matter how many times the loop runs.
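
The three instructions can be replayed by hand with a tiny simulation. This is a sketch with made-up names (PostIncrementSim, runOnce) that models the local variable table as an int array and the operand stack as a Deque; it is not how the JVM is implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PostIncrementSim {
    // Replays "iload_0; iinc 0 by 1; istore_0" on a one-slot local variable table.
    static int runOnce(int initial) {
        int[] locals = {initial};                  // slot 0 holds i
        Deque<Integer> stack = new ArrayDeque<>(); // the operand stack

        stack.push(locals[0]);      // iload_0: push the old value of i
        locals[0] += 1;             // iinc 0 by 1: changes only the local slot
        locals[0] = stack.pop();    // istore_0: overwrite i with the old value

        return locals[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce(0)); // prints 0
    }
}
```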

  • ++i example

Same routine: write a method.

    public static void test_2_2_2() {
        int i = 0;
        for (int j = 0; j < 50; j++) {
            i = ++i;
        }
        System.out.println(i);
    }

The core bytecode is:

10 iinc 0 by 1
13 iload_0
14 istore_0

As you can see, the order of the bytecode differs from before. The local variable is incremented first, then loaded onto the operand stack, and then stored back into the local variable table, so the change takes effect.
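
Putting the two variants side by side makes the difference observable. The class and method names below are made up for the example; the loop bodies are the same as in the two methods above.

```java
public class IncrementDemo {
    static int postIncrementLoop() {
        int i = 0;
        for (int j = 0; j < 50; j++) {
            i = i++;   // the old value is stored back, so the increment is lost
        }
        return i;
    }

    static int preIncrementLoop() {
        int i = 0;
        for (int j = 0; j < 50; j++) {
            i = ++i;   // the incremented value is loaded and stored back
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println(postIncrementLoop()); // prints 0
        System.out.println(preIncrementLoop());  // prints 50
    }
}
```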

2.3 Process Control

Control transfer instructions branch on conditions. The familiar if-then-else, ternary expressions, for loops, and exception handling all fall into this category. The corresponding instructions include:

  • Conditional branches: ifeq, iflt, ifle, ifne, ifgt, ifge, ifnull, ifnonnull, if_icmpeq, if_icmpne, if_icmplt, if_icmpgt, if_icmple, if_icmpge, if_acmpeq, and if_acmpne
  • Compound conditional branches: tableswitch and lookupswitch
  • Unconditional branches: goto, goto_w, jsr, jsr_w, and ret

2.3.1 if branch

We write code like this:

    public int test2_3_1(int n) {
        if (n > 0) {
            return 1;
        } else {
            return 0;
        }
    }

The resulting bytecode is:

0 iload_1
1 ifle 6 (+5)
4 iconst_1
5 ireturn
6 iconst_0
7 ireturn

The key here is the ifle instruction. It compares the value on top of the operand stack with 0, jumps to the given bytecode offset if the value is less than or equal to 0, and continues with the following bytecode if it is greater than 0.

At offset 1, the value on top of the stack (the n that iload_1 loaded from slot 1 of the local variable table) is compared with 0; if it is less than or equal to 0, execution jumps to offset 6.

The other conditional instructions follow the same routine; try one yourself. Refer to the table below for the other instructions.

2.3.2 for loop

We made the following Java code:

    public void test_2_3_2(int[] c) {
        for (int i = 0; i < c.length; i++) {
            System.out.println(i);
        }
    }

Disassembling it gives:

 0 iconst_0
 1 istore_2
 2 iload_2
 3 aload_1
 4 arraylength
 5 if_icmpge 21 (+16)
 8 getstatic #3 <java/lang/System.out>
11 iload_2
12 invokevirtual #4 <java/io/PrintStream.println>
15 iinc 2 by 1
18 goto 2 (-16)
21 return

The local variable table is:

Offset 0: iconst_0 pushes 0 (the first half of the i = 0 assignment).

Offset 1: istore_2 stores that 0 into slot 2 of the local variable table (the slot of i).

Offset 2: iload_2 pushes the value of i from the local variable table onto the operand stack.

Offset 3: aload_1 pushes the reference to the array c.

Offset 4: arraylength pops the array reference and pushes its length (c is no longer on the stack).

Offset 5: if_icmpge compares the two values on top of the stack (i and the array length), that is, it checks whether i < c.length holds. If it holds, execution continues; if not, it jumps to offset 21.

Offsets 8-12 correspond to System.out.println(i);

Offset 15 increments i, then offset 18 jumps back to offset 2 to compare again.

This logical path makes up our for loop

2.3.2.1 Principle of the for-each loop

Java improves on the for loop with syntactic sugar for expressing iteration, for example:

    public void test_2_3_2_1_1() {
        int[] numbers = new int[] {1, 2, 3};
        for (int number : numbers) {
            System.out.println(number);
        }
    }

    public void test_2_3_2_1_2() {
        List<String> a = new ArrayList<>();
        a.add("a");
        a.add("b");
        a.add("c");
        for (String item : a) {
            System.out.println(item);
        }
    }

The bytecode resulting from compiling the test_2_3_2_1_1 method is

 0 iconst_3
 1 newarray 10 (int)
 3 dup
 4 iconst_0
 5 iconst_1
 6 iastore
 7 dup
 8 iconst_1
 9 iconst_2
10 iastore
11 dup
12 iconst_2
13 iconst_3
14 iastore
15 astore_1
16 aload_1
17 astore_2
18 aload_2
19 arraylength
20 istore_3
21 iconst_0
22 istore 4
24 iload 4
26 iload_3
27 if_icmpge 50 (+23)
30 aload_2
31 iload 4
33 iaload
34 istore 5
36 getstatic #3 <java/lang/System.out>
39 iload 5
41 invokevirtual #4 <java/io/PrintStream.println>
44 iinc 4 by 1
47 goto 24 (-23)
50 return

The local variable table is:

It looks a lot more complicated, but it is really just a for loop, and you can work through it with what you already know.

Decompiling the test_2_3_2_1_1 method with a decompiler gives this source:

    public void test_2_3_2_1_1() {
        int[] numbers = new int[]{1, 2, 3};
        int[] var2 = numbers;
        int var3 = numbers.length;

        for (int var4 = 0; var4 < var3; ++var4) {
            int number = var2[var4];
            System.out.println(number);
        }
    }

This is essentially an ordinary for loop: the compiler desugars the for-each syntax when it generates the bytecode.

Analyzing the test_2_3_2_1_2 method step by step in the same way gives this decompiled source:

    public void test_2_3_2_1_2() {
        List<String> a = new ArrayList();
        a.add("a");
        a.add("b");
        a.add("c");
        Iterator var2 = a.iterator();

        while (var2.hasNext()) {
            String item = (String) var2.next();
            System.out.println(item);
        }
    }

Here the loop is driven by the Iterator pattern.

The Collection interface extends Iterable, which supplies the iterator() method, so every Collection class in Java can be used with the convenient for-each loop.
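
The same desugaring applies to any class that implements Iterable, not only to collections. Here is a minimal sketch; the Countdown class and its sum helper are made up for the example.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

public class Countdown implements Iterable<Integer> {
    private final int start;

    public Countdown(int start) {
        this.start = start;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int next = start;

            @Override
            public boolean hasNext() {
                return next > 0;
            }

            @Override
            public Integer next() {
                if (next <= 0) {
                    throw new NoSuchElementException();
                }
                return next--;
            }
        };
    }

    // The for-each loop below compiles to iterator(), hasNext(), and next() calls.
    static int sum(int start) {
        int total = 0;
        for (int n : new Countdown(start)) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(3)); // 3 + 2 + 1 = 6
    }
}
```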

2.3.3 switch-case branch

2.3.3.1 switch jump

Same routine: write some Java code.

    public int test_2_3_3_1_1(int i) {
        switch (i) {
            case 100: return 0;
            case 101: return 1;
            case 104: return 4;
            default: return -1;
        }
    }

Disassembling the bytecode with javap:

 0 iload_1
 1 tableswitch 100 to 104
    100:  36 (+35)
    101:  38 (+37)
    102:  42 (+41)
    103:  42 (+41)
    104:  40 (+39)
    default:  42 (+41)
36 iconst_0
37 ireturn
38 iconst_1
39 ireturn
40 iconst_4
41 ireturn
42 iconst_m1
43 ireturn

As you can see, entries for 102 and 103 appear. The compiler adds them itself and points them at the default branch. With a contiguous range of keys, the target can be found in O(1) time by using the value as an index into the table.
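
The O(1) lookup can be pictured as plain array indexing. The sketch below hard-codes the jump targets from the listing above; the class name TableSwitchSketch and the method resolve are made up, and this is a model of the idea rather than the JVM’s implementation.

```java
public class TableSwitchSketch {
    static final int LOW = 100;
    static final int HIGH = 104;
    static final int DEFAULT_TARGET = 42;

    // One target per key from LOW to HIGH; the gaps 102 and 103 hold the default target.
    static final int[] TARGETS = {36, 38, 42, 42, 40};

    // Returns the bytecode offset to jump to for the given key.
    static int resolve(int key) {
        if (key < LOW || key > HIGH) {
            return DEFAULT_TARGET;
        }
        return TARGETS[key - LOW];   // a single index operation, no searching
    }

    public static void main(String[] args) {
        System.out.println(resolve(101)); // prints 38
        System.out.println(resolve(103)); // prints 42
    }
}
```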

So what if the case values are far apart, with large gaps between them?

    public int test_2_3_3_1_2(int i) {
        switch (i) {
            case 1: return 0;
            case 10: return 1;
            case 100: return 4;
            default: return -1;
        }
    }

The corresponding bytecode is:

 0 iload_1
 1 lookupswitch 3
	1:  36 (+35)
	10:  38 (+37)
	100:  40 (+39)
	default:  42 (+41)
36 iconst_0
37 ireturn
38 iconst_1
39 ireturn
40 iconst_4
41 ireturn
42 iconst_m1
43 ireturn

Here the lookupswitch instruction is used instead. Its keys are sorted, so the lookup can use binary search, which takes O(log n) time.
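
A sketch of that lookup, using the keys and targets from the listing above. The names LookupSwitchSketch and resolve are made up; a virtual machine is free to search the sorted keys some other way, and binary search is simply what the sorting makes possible.

```java
import java.util.Arrays;

public class LookupSwitchSketch {
    // Keys are kept sorted; TARGETS[i] is the jump target for KEYS[i].
    static final int[] KEYS = {1, 10, 100};
    static final int[] TARGETS = {36, 38, 40};
    static final int DEFAULT_TARGET = 42;

    // Returns the bytecode offset to jump to for the given key.
    static int resolve(int key) {
        int index = Arrays.binarySearch(KEYS, key);   // negative when the key is absent
        return index >= 0 ? TARGETS[index] : DEFAULT_TARGET;
    }

    public static void main(String[] args) {
        System.out.println(resolve(10)); // prints 38
        System.out.println(resolve(50)); // prints 42
    }
}
```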

Whether a switch counts as sparse is decided by javac’s cost estimate for tableswitch versus lookupswitch. With only a few cases, the algorithm may not pick the instruction you expect.

2.3.3.2 switch on a String

    public int test_2_3_3_2(String name) {
        switch (name) {
            case "吃饭1":   // hashCode() is 21885863
                return 100;
            case "吃饭2":   // hashCode() is 21885864
                return 200;
            default:
                return -1;
        }
    }

The corresponding bytecode is:

 0 aload_1
 1 astore_2
 2 iconst_m1
 3 istore_3
 4 aload_2
 5 invokevirtual #16 <java/lang/String.hashCode>
 8 lookupswitch 2
      21885863:  36 (+28)
      21885864:  50 (+42)
      default:  61 (+53)
36 aload_2
37 ldc #17 <吃饭1>
39 invokevirtual #18 <java/lang/String.equals>
42 ifeq 61 (+19)
45 iconst_0
46 istore_3
47 goto 61 (+14)
50 aload_2
51 ldc #19 <吃饭2>
53 invokevirtual #18 <java/lang/String.equals>
56 ifeq 61 (+5)
59 iconst_1
60 istore_3
61 iload_3
62 lookupswitch 2
	0:  88 (+26)
	1:  91 (+29)
	default:  95 (+33)
88 bipush 100
90 ireturn
91 sipush 200
94 ireturn
95 iconst_m1
96 ireturn


The decompiled code looks like this:

    public int test_2_3_3_2(String name) {
        byte var3 = -1;
        switch (name.hashCode()) {
        case 21885863:
            if (name.equals("吃饭1")) {
                var3 = 0;
            }
            break;
        case 21885864:
            if (name.equals("吃饭2")) {
                var3 = 1;
            }
        }

        switch (var3) {
        case 0:
            return 100;
        case 1:
            return 200;
        default:
            return -1;
        }
    }

As you can see, the string in each case label is hashed at compile time and the switch is performed on hashCode(). Because different strings can share a hash code, each case also calls equals to decide which branch really matches.
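
Collisions are easy to produce: "Aa", "BB", and "C#" all have the hash code 2112. The sketch below writes the two-step scheme out by hand for such strings; the class name StringSwitchDemo and the method classify are made up, and the return values simply mirror the example above.

```java
public class StringSwitchDemo {
    // Hand-written version of the two-step scheme: switch on the hash, then confirm with equals.
    static int classify(String s) {
        int index = -1;
        switch (s.hashCode()) {
            case 2112:                    // shared by "Aa" and "BB"
                if (s.equals("Aa")) {
                    index = 0;
                } else if (s.equals("BB")) {
                    index = 1;
                }
                break;
            default:
                break;
        }
        switch (index) {
            case 0:
                return 100;
            case 1:
                return 200;
            default:
                return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // prints true
        System.out.println(classify("Aa"));                     // prints 100
        System.out.println(classify("BB"));                     // prints 200
        System.out.println(classify("C#"));                     // prints -1: same hash, but equals fails
    }
}
```

Without the equals checks, all three strings would land in the same branch.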

2.4 Object-oriented programming

2.4.1 Object initialization instructions

In short, object initialization involves one instruction and two special methods:

  1. new
  2. <init>
  3. <clinit>

Initialization can be divided into instance initialization and class static initialization.

2.4.1.1 Initializing objects

For example:

A a = new A();

Disassembling the bytecode gives:

0: new           #2                  // class A
3: dup
4: invokespecial #3                  // Method A."<init>":()V
7: astore_1

An object creation statement always compiles to the trio new, dup, and invokespecial.

new only allocates the object instance; the object is usable only after invokespecial has called <init>.

After new, the reference to the new object sits on top of the stack. invokespecial consumes that reference when it runs <init>, which would leave nothing for astore_1 to store, so the reference on top of the stack has to be duplicated first.

dup is short for duplicate: it copies the value on top of the stack and pushes the copy.

invokespecial then takes one copy of the object reference from the top of the stack, and astore_1 stores the other.

2.4.1.2 Static initialization of classes


<clinit> is the static initialization method of a class. It runs before <init> and is never called directly.

It is triggered by four instructions: new, getstatic, putstatic, and invokestatic.

That is, creating an instance of a class, accessing a static variable, or calling a static method triggers the class’s static initialization method.

For example:

public class Initializer {
    static int a;
    static int b;
    static {
        a = 1;
        b = 2;
    }
}

Part of the bytecode is as follows:

static {};
     0: iconst_1
     1: putstatic     #2                  // Field a:I
     4: iconst_2
     5: putstatic     #3                  // Field b:I
     8: return

2.4.1.3 Common interview questions from a bytecode perspective

  1. For A a = new B();, give the output in the correct order

public class A {
    static {
        System.out.println("A init");
    }
    public A() {
        System.out.println("A Instance");
    }
}

public class B extends A {
    static {
        System.out.println("B init");
    }
    public B() {
        System.out.println("B Instance");
    }
}

The constructor of B compiles to:

public B();
     0: aload_0
     1: invokespecial #1                  // Method A."<init>":()V
     4: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
     7: ldc           #3                  // String B Instance
     9: invokevirtual #4                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
    12: return

Class initialization is triggered passively. Creating a B triggers the static initialization of class B, which in turn first triggers the static initialization of its parent class A.

The subclass constructor must first call the parent class’s constructor (the invokespecial at offset 1 above), so the order is:

A) static initialization of class A

B) static initialization of class B

C) instance initialization of class A

D) instance initialization of class B
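
The order can be checked by recording each step instead of printing it. This is a sketch with a made-up wrapper class (InitOrderDemo); A and B are nested here only so the example fits in one file, and the log entries reuse the strings from the example above.

```java
import java.util.ArrayList;
import java.util.List;

public class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    static class A {
        static { LOG.add("A init"); }
        A() { LOG.add("A Instance"); }
    }

    static class B extends A {
        static { LOG.add("B init"); }
        B() { LOG.add("B Instance"); }   // implicitly calls A() first
    }

    // Creates a B and returns the first four events recorded so far.
    static List<String> firstEvents() {
        A a = new B();
        return new ArrayList<>(LOG.subList(0, 4));
    }

    public static void main(String[] args) {
        System.out.println(firstEvents()); // prints [A init, B init, A Instance, B Instance]
    }
}
```

Static initializers run only once per class, so creating a second B would add only the two instance entries to the log.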

  2. What does B[] arr = new B[10]; output?

bipush 10
anewarray 'B'
astore 1

You can see that none of the instructions that trigger class initialization is used, so B is not initialized and nothing is printed.
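
A small experiment confirms this. The sketch uses made-up nested classes (Quiet and Loud) whose static blocks set a flag, so we can tell whether each class has been initialized.

```java
public class ArrayInitDemo {
    static boolean quietInitialized = false;
    static boolean loudInitialized = false;

    static class Quiet {
        static { quietInitialized = true; }
    }

    static class Loud {
        static { loudInitialized = true; }
    }

    // anewarray only allocates an array of references; no Quiet object is created.
    static boolean arrayCreationInitializes() {
        Quiet[] arr = new Quiet[10];
        return quietInitialized;
    }

    // Creating an instance goes through the new instruction, which triggers <clinit>.
    static boolean instanceCreationInitializes() {
        new Loud();
        return loudInitialized;
    }

    public static void main(String[] args) {
        System.out.println(arrayCreationInitializes());    // prints false
        System.out.println(instanceCreationInitializes()); // prints true
    }
}
```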

2.4.2 Method Call instruction

We have already met invokespecial; there are four more instructions just as fiddly as it is.

  • invokestatic: invokes static methods
  • invokespecial: invokes private instance methods, constructors, superclass instance methods or constructors called through the super keyword, and default methods of an implemented interface
  • invokevirtual: invokes non-private instance methods
  • invokeinterface: invokes interface methods
  • invokedynamic: invokes dynamically resolved methods

Technology exists for a purpose: to address a real need. Why does a simple method call split into five instructions?

In Java, whenever the target of a call is determined at compile time, it is static binding. Calls whose target has to be determined at run time from the actual type of the receiver are dynamic binding.

These terms are hard going, and I could not make sense of them at first. How is it done at run time? Can’t I tell just by reading the code?

Let’s rephrase it. Constructors can’t be overridden, so they are statically bound; static methods can’t be overridden, so they are statically bound too. Everything else is dynamic! Polymorphism, subclasses overriding superclass methods, and all that are dynamically bound!

So invokestatic and invokespecial handle the statically bound calls: static methods, constructors, and private methods, none of which can be overridden.

The rest are for dynamically bound methods.

2.4.2.1 invokevirtual

invokevirtual is used to invoke methods with public, protected, or package access. For example:

public class Color {
    public void printColorName() {
        System.out.println("Color name from parent");
    }
}

public class Red extends Color {
    @Override
    public void printColorName() {
        System.out.println("Color name is Red");
    }
}

public class Yellow extends Color {
    @Override
    public void printColorName() {
        System.out.println("Color name is Yellow");
    }
}

public class InvokeVirtualTest {
    private static Color yellowColor = new Yellow();
    private static Color redColor = new Red();

    public static void main(String[] args) {
        yellowColor.printColorName();
        redColor.printColorName();
    }
}

Output:

Color name is Yellow
Color name is Red

Here is the bytecode

0: getstatic     #2                  // Field yellowColor:LColor;
3: invokevirtual #3                  // Method Color.printColorName:()V
6: getstatic     #4                  // Field redColor:LColor;
9: invokevirtual #3                  // Method Color.printColorName:()V

The compiler does not rewrite the calls to Yellow.printColorName or Red.printColorName: both call sites refer to Color.printColorName, yet they end up invoking different target methods. invokevirtual dispatches on the actual type of the object (virtual method dispatch), so at compile time it is not known whether the subclass or the superclass method will end up being called.

2.4.2.2 invokeinterface

As the name suggests, invokeinterface is dedicated to calling interface methods, which are also dynamically bound.

What is the difference between invokeinterface and invokevirtual? Why not use invokevirtual to invoke interface methods?

A new concept needs to be introduced here: Java method dispatch.

When discussing method overriding (commonly called runtime polymorphism), a comparison with overloading is unavoidable.

  1. For invokevirtual, the virtual machine builds a virtual method table (vtable) for the class in the method area.

    The method table is set up during the linking phase of class loading, and it stores the entry point of each method. If a subclass inherits from a parent class without overriding one of its methods, the corresponding entry in the subclass’s table simply points to the parent’s method entry; the subclass does not generate a method of its own for the table to point to, since there would be no point in doing so. If a subclass does override a method of the parent class, the overriding method gets the same index in the subclass’s table as the overridden method has in the parent’s. The purpose is fast lookup: a method is found directly at its index, for example index 1, without having to search the parent classes one by one.

  2. For the invokeinterface instruction, the virtual machine builds a data structure called the interface method table (itable).

    When an interface method is invoked, the VM first looks in the itable’s offset table for the location of that interface’s method table, and then looks up the method implementation in the method table. invokevirtual can rely on Java’s single inheritance: a subclass’s virtual method table keeps the order of its parent’s table. That property is not available for interfaces, because a class can implement multiple interfaces.

The vtable and itable are the basis of Java polymorphism.

  • A subclass inherits its parent’s vtable. Since every Java class inherits from Object, and Object has five methods that can be overridden, the vtable of an otherwise empty Java class also has 5 entries.
  • final and static methods do not appear in the vtable, because they cannot be overridden through inheritance; private methods do not appear in the vtable either.
  • The invokeinterface instruction is used to invoke interface methods. Java supports implementing multiple interfaces through the itable, which consists of an offset table and a method table. When an interface method is called, the offset of the method table is first looked up in the offset table, and then the concrete implementation is found in the method table.
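
The "same index in parent and child" idea can be modeled with ordinary lists. This is a conceptual sketch only: the names (VtableSketch, colorTable, redTable, the slot constants) are made up, the second slot is an invented extra method, and a real virtual machine lays out its tables very differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class VtableSketch {
    // Slot indexes are fixed by the parent and reused by every subclass.
    static final int SLOT_PRINT_COLOR_NAME = 0;
    static final int SLOT_DESCRIBE = 1;      // hypothetical second method

    // The parent's table: one entry per overridable method.
    static List<Supplier<String>> colorTable() {
        List<Supplier<String>> table = new ArrayList<>();
        table.add(() -> "Color name from parent");   // slot 0
        table.add(() -> "a color");                  // slot 1
        return table;
    }

    // The child's table starts as a copy of the parent's and replaces only
    // the slot of the method it overrides, keeping the same index.
    static List<Supplier<String>> redTable() {
        List<Supplier<String>> table = new ArrayList<>(colorTable());
        table.set(SLOT_PRINT_COLOR_NAME, () -> "Color name is Red");
        return table;
    }

    // A call site only needs the slot index; the receiver's table decides the target.
    static String invoke(List<Supplier<String>> table, int slot) {
        return table.get(slot).get();
    }

    public static void main(String[] args) {
        System.out.println(invoke(redTable(), SLOT_PRINT_COLOR_NAME)); // overridden entry
        System.out.println(invoke(redTable(), SLOT_DESCRIBE));         // inherited entry
    }
}
```
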

2.4.2.3 invokedynamic

invokedynamic was introduced in JDK 7. Java itself did not use it at that time; meanwhile JVM languages such as Groovy, JRuby, and Kotlin began to flourish. Lambda expressions finally made use of the instruction in JDK 8. With invokedynamic, the call target is not known at compile time, and the real linking logic is handed over to user code.

One core class is very important here: MethodHandle.

A MethodHandle, also called a method handle or method pointer, is a class in the java.lang.invoke package that lets Java pass functions around as arguments, as other languages do. It looks a lot like reflection’s Method, but it is much lighter and benefits from JIT optimization.

Here’s a practical example of how to use MethodHandle:

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class Foo {
    public void print(String s) {
        System.out.println("hello, " + s);
    }

    public static void main(String[] args) throws Throwable {
        Foo foo = new Foo();
        // Signature of print: returns void, takes one String
        MethodType methodType = MethodType.methodType(void.class, String.class);
        MethodHandle methodHandle = MethodHandles.lookup().findVirtual(Foo.class, "print", methodType);
        // The receiver is passed as the first argument
        methodHandle.invokeExact(foo, "world");
    }
}

Output:

hello, world

The steps for calling a method through MethodHandle are:

  • Create a MethodType object. A MethodType represents a method signature; every MethodHandle has a MethodType instance that specifies the method’s return type and parameter types
  • Call the static method MethodHandles.lookup to get a MethodHandles.Lookup, then use its findStatic, findSpecial, findVirtual, and similar methods, depending on the kind of method, to find a method handle whose signature matches the MethodType
  • Once you have the method handle, you can execute it: pass the target method’s arguments to invoke or invokeExact to call the method
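
The same three steps work for a static method with findStatic. A minimal sketch follows; the class StaticHandleDemo and its add and callAdd methods are made up for the example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class StaticHandleDemo {
    static int add(int a, int b) {
        return a + b;
    }

    static int callAdd(int a, int b) {
        try {
            // Step 1: describe the signature (int, int) -> int
            MethodType type = MethodType.methodType(int.class, int.class, int.class);
            // Step 2: look up the static method by name and signature
            MethodHandle handle = MethodHandles.lookup()
                    .findStatic(StaticHandleDemo.class, "add", type);
            // Step 3: invoke it; invokeExact needs the exact argument and return types
            return (int) handle.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException("method handle call failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callAdd(2, 3)); // prints 5
    }
}
```

invokeExact is strict about types: the cast to int is part of the call’s signature, and leaving it out would make the call fail at run time with a WrongMethodTypeException.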

Languages such as Groovy are closer to scripting languages, and the difference between a scripting language and a statically typed language is that the types of the arguments are not known until the code actually runs.

Here we create a test.groovy:

def add(a, b) {
    new Exception().printStackTrace()    
    return a + b
}

Here Groovy has trouble translating the code into a .class file. What are a and b? How should the method call be compiled? The answer is invokedynamic. With invokedynamic, the call can be translated into something like this:

public static void main(String[] args) throws Throwable {
    MethodHandles.Lookup lookup = MethodHandles.lookup();
    MethodType mt = MethodType.methodType(Object.class,Object.class, Object.class);    
    CallSite callSite = IndyInterface.bootstrap(lookup, "invoke", mt,"add", 0);    
    MethodHandle mh = callSite.getTarget();        
    mh.invokeExact(obj, "hello", "world");
}

IndyInterface is the entry point provided by Groovy, so the bootstrap gives Groovy full control over how the call is executed and hands the work to the jar that Groovy ships.

Of course, invokedynamic does not only shine in other languages: it is also behind the lambda expressions introduced in Java 8.

2.5 Exception Handling Mechanism

We write some exception handling code:

    public void test2_5_1_exception() {
        throw new RuntimeException();
    }

    public void test2_5_1_handler(Exception e) {
        System.out.println("Exception caught");
    }

    public void test2_5_1() {
        try {
            test2_5_1_exception();
        } catch (RuntimeException e) {
            test2_5_1_handler(e);
        }
    }

The resulting bytecode is:

 0 aload_0
 1 invokevirtual #23 <com/zifang/util/core/TestByteCode.test2_5_1_exception>
 4 goto 13 (+9)
 7 astore_1
 8 aload_0
 9 aload_1
10 invokevirtual #24 <com/zifang/util/core/TestByteCode.test2_5_1_handler>
13 return

There is also the exception table information, which is:

In compiled bytecode, each method comes with an Exception table, and each row in the Exception table represents an Exception handler consisting of a POINTER to from, to, target, and the caught Exception type. The value of these Pointers is the bytecode index used to locate the bytecode meaning that if an exception of type type is thrown in the range of [from, to] bytecode, it will jump to the bytecode represented by target. For example, the above exception table represents: If a RuntimeException is thrown between 0 and 4 (excluding 4), jump to 7.

With more than one catch clause:

    public void test2_5_2() {
        try {
            test2_5_1_exception();
        } catch (NullPointerException e) {
            test2_5_1_handler(e);
        } catch (RuntimeException e) {
            test2_5_1_handler(e);
        }
    }

Each additional catch clause adds one more entry to the exception table.

When the program throws an exception, the Java virtual machine walks the entries of the exception table from top to bottom. When the index of the bytecode that triggered the exception falls in the [from, to) range of an entry, the JVM checks whether the thrown exception matches the type that entry catches.

  • If it matches, the Java virtual machine transfers control to the bytecode that target points to. If it does not match, it continues down the exception table.
  • If every entry has been checked and no handler matches, the exception propagates to the caller, where the same lookup is repeated. In the worst case, the virtual machine has to walk the exception tables of every method on the thread's Java stack.
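
The lookup described above can be sketched in plain Java. This is only a model for illustration, not how any real JVM is implemented: the class ExceptionTableSketch, its Entry type, and findHandler are names made up here, and the second table row in main uses a hypothetical target offset.

```java
import java.util.Arrays;
import java.util.List;

final class ExceptionTableSketch {
    /** One row of a method's exception table (illustrative model only). */
    static final class Entry {
        final int from;    // first covered bytecode index (inclusive)
        final int to;      // end of the covered range (exclusive)
        final int target;  // bytecode index of the handler
        final Class<? extends Throwable> type; // caught type; null stands for "any"

        Entry(int from, int to, int target, Class<? extends Throwable> type) {
            this.from = from;
            this.to = to;
            this.target = target;
            this.type = type;
        }
    }

    /** Returns the handler index, or -1 if the exception must propagate to the caller. */
    static int findHandler(List<Entry> table, int pc, Throwable thrown) {
        for (Entry e : table) {                       // top to bottom
            boolean inRange = e.from <= pc && pc < e.to;
            boolean typeMatches = e.type == null || e.type.isInstance(thrown);
            if (inRange && typeMatches) {
                return e.target;                      // first match wins
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // First row: the entry described for test2_5_1. Second row: hypothetical.
        List<Entry> table = Arrays.asList(
                new Entry(0, 4, 7, RuntimeException.class),
                new Entry(0, 4, 20, Throwable.class));
        System.out.println(findHandler(table, 1, new IllegalStateException())); // 7
        System.out.println(findHandler(table, 1, new Error()));                 // 20
        System.out.println(findHandler(table, 4, new RuntimeException()));      // -1
    }
}
```

Because the first matching row wins, the order of the rows matters, which is why the compiler emits them in the order of the catch clauses.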

If we add finally:

public void test2_5_3() {
    try {
        test2_5_1_exception();
    } catch (NullPointerException e) {
        test2_5_1_handler(e);
    } finally {
        test2_5_finally();
    }
}

public void test2_5_finally() {
    System.out.println("finally statement block");
}

The bytecode and exception table are obtained as follows:

 0 aload_0
 1 invokevirtual #24 <com/zifang/util/core/TestByteCode.test2_5_1_exception>
 4 aload_0
 5 invokevirtual #27 <com/zifang/util/core/TestByteCode.test2_5_finally>
 8 goto 31 (+23)
11 astore_1
12 aload_0
13 aload_1
14 invokevirtual #25 <com/zifang/util/core/TestByteCode.test2_5_1_handler>
17 aload_0
18 invokevirtual #27 <com/zifang/util/core/TestByteCode.test2_5_finally>
21 goto 31 (+10)
24 astore_2
25 aload_0
26 invokevirtual #27 <com/zifang/util/core/TestByteCode.test2_5_finally>
29 aload_2
30 athrow
31 return

As you can see, the bytecode contains three copies of the finally block. Two of them sit on the normal-completion paths, one after the try body (offsets 4 ~ 5) and one after the catch handler (offsets 17 ~ 18); the third (offsets 25 ~ 26) runs just before the exception is rethrown with athrow.

The Java compiler implements finally by copying the contents of the finally block and placing a copy at every normal exit and every exceptional exit of the try and catch blocks.

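So the bytecode above is roughly equivalent to the following source:

public void foo() {
    try {
        test2_5_1_exception();
        test2_5_finally();          // copy 1: try body completed normally
    } catch (NullPointerException e) {
        try {
            test2_5_1_handler(e);
        } catch (Throwable e2) {
            test2_5_finally();      // copy 3: the handler itself threw
            throw e2;
        }
        test2_5_finally();          // copy 2: handler completed normally
    } catch (Throwable e) {
        test2_5_finally();          // copy 3: any other exception from the try body
        throw e;
    }
}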
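
The three copies of the finally block correspond to three run-time paths. A small sketch can make them visible by recording a trace; the class FinallyPaths and its run method are made up for this illustration and are not part of the original test class.

```java
import java.util.ArrayList;
import java.util.List;

final class FinallyPaths {
    /**
     * Runs a try/catch/finally whose try body throws the given exception
     * (or nothing when toThrow is null) and returns what was executed.
     */
    static List<String> run(RuntimeException toThrow) {
        List<String> trace = new ArrayList<>();
        try {
            try {
                trace.add("try");
                if (toThrow != null) {
                    throw toThrow;
                }
            } catch (NullPointerException e) {
                trace.add("catch");
            } finally {
                trace.add("finally");
            }
        } catch (RuntimeException escaped) {
            // Only reached when the inner catch did not match.
            trace.add("rethrown");
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run(null));                        // [try, finally]
        System.out.println(run(new NullPointerException()));  // [try, catch, finally]
        System.out.println(run(new IllegalStateException())); // [try, finally, rethrown]
    }
}
```

In every path "finally" appears exactly once, and on the uncaught path it appears before the exception leaves the statement, which matches the copy placed in front of athrow.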

Well, with this foundation, try to analyze:

public int test2_5_3_1() {
    try {
        int a = 1 / 0;
        return 0;
    } catch (Exception e) {
        int b = 1 / 0;
        return 1;
    } finally {
        return 2;
    }
}

public int test2_5_3_2() {
    int i = 100;
    try {
        return i;
    } finally {
        ++i;
    }
}

public int test2_5_3_3() {
    int i = 100;
    try {
        return i;
    } finally {
        i++;
    }
}

public String test2_5_3_4() {
    String s = "hello";
    try {
        return s;
    } finally {
        s = "xyz";
    }
}

What do these methods return?
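
One way to check your answers is simply to run the methods. The sketch below copies them as static methods into a class called FinallyQuiz (a name made up here so the example is self-contained); the bodies are unchanged.

```java
final class FinallyQuiz {
    @SuppressWarnings("finally")
    static int test2_5_3_1() {
        try {
            int a = 1 / 0;      // throws ArithmeticException
            return 0;
        } catch (Exception e) {
            int b = 1 / 0;      // throws again, inside the handler
            return 1;
        } finally {
            return 2;           // discards the pending exception
        }
    }

    static int test2_5_3_2() {
        int i = 100;
        try {
            return i;           // the value 100 is saved before finally runs
        } finally {
            ++i;
        }
    }

    static int test2_5_3_3() {
        int i = 100;
        try {
            return i;
        } finally {
            i++;
        }
    }

    static String test2_5_3_4() {
        String s = "hello";
        try {
            return s;           // the reference to "hello" is saved before finally runs
        } finally {
            s = "xyz";
        }
    }

    public static void main(String[] args) {
        System.out.println(test2_5_3_1()); // 2
        System.out.println(test2_5_3_2()); // 100
        System.out.println(test2_5_3_3()); // 100
        System.out.println(test2_5_3_4()); // hello
    }
}
```

The pattern behind the results: the return value is stored in a temporary local variable slot before the copied finally code runs, so later changes to i or s do not affect it, while a return inside finally replaces whatever the try or catch block was about to do.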

2.6 The generics principle

Consider the following code:

public class Pair<T> {
    public T left;
    public T right;
    public Pair(T left, T right) {
        this.left = left;
        this.right = right;
    }
}
public void foo(Pair<String> pair) {
    String left = pair.left;
}

The bytecode obtained for foo is:

0: aload_1
1: getfield      #2                  // Field left:Ljava/lang/Object;
4: checkcast     #4                  // class java/lang/String
7: astore_2
8: return

You can see that the left field is of type Object, not String.

The checkcast instruction checks whether an object is of a given type and throws java.lang.ClassCastException if it is not.

In other words, after compilation the code above behaves like this:

public class Pair {
    public Object left;
    public Object right;
    public Pair(Object left, Object right) {
        this.left = left;
        this.right = right;
    }
}
public void foo(Pair pair) {
    String left = (String) pair.left;
}

The most important thing to understand about generics is to understand type erasure.

For example, define within a class:

public void print(List<String> list)  { }
public void print(List<Integer> list) { }

After erasure both methods have the same signature, print(List), and a class may not contain two methods with the same signature, so the code above fails to compile.
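
Type erasure can also be observed at run time. The following is a minimal sketch, assuming a nested Pair class with a left field like the one above; ErasureDemo and its helper methods are names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    static final class Pair<T> {
        T left;
        T right;
        Pair(T left, T right) {
            this.left = left;
            this.right = right;
        }
    }

    /** Both parameterizations share a single runtime class. */
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        return strings.getClass() == ints.getClass();
    }

    /** The declared type of the field 'left' as seen through reflection. */
    static Class<?> erasedFieldType() {
        try {
            return Pair.class.getDeclaredField("left").getType();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Puts an Integer into a Pair<String> through a raw type, then reads it as a String. */
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean checkcastFails() {
        Pair<String> pair = new Pair<>("a", "b");
        Pair raw = pair;
        raw.left = Integer.valueOf(42);  // allowed: the field is just an Object after erasure
        try {
            String s = pair.left;        // the compiler-inserted checkcast fails here
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());  // true
        System.out.println(erasedFieldType());   // class java.lang.Object
        System.out.println(checkcastFails());    // true
    }
}
```

Note that the ClassCastException is thrown where the value is read, not where the wrong value is stored, because that is where the checkcast sits.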

2.7 Principle of lambda expressions

2.7.1 Implementation of anonymous inner Class

    public void test2_7_1() {
        Runnable r1 = new Runnable() {
            public void run() {
                System.out.println("hello, inner class");
            }
        };
        r1.run();
    }

The resulting bytecode is:

 0 new #31 <com/zifang/util/core/TestByteCode$1>
 3 dup
 4 aload_0
 5 invokespecial #32 <com/zifang/util/core/TestByteCode$1.<init>>
 8 astore_1
 9 aload_1
10 invokeinterface #33 <java/lang/Runnable.run> count 1
15 return

Decompiled by hand, the code is equivalent to:

class TestByteCode$1 implements Runnable {
    public TestByteCode$1(TestByteCode var) {
    }

    @Override
    public void run() {
        System.out.println("hello, inner class");
    }
}

public class TestByteCode {
    public void test2_7_1() {
        Runnable r1 = new TestByteCode$1(this);
        r1.run();
    }
}

The compiler generates a separate class for the anonymous inner class; the outer class passes its own this reference to that class's constructor and then calls the resulting object.
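
Reflection gives a quick way to see this generated class. The sketch below is only an illustration: AnonymousClassDemo is a made-up class, and the anonymous Runnable deliberately reads a field of the enclosing instance so that the outer reference is really needed.

```java
import java.lang.reflect.Constructor;

final class AnonymousClassDemo {
    private final String greeting;

    AnonymousClassDemo(String greeting) {
        this.greeting = greeting;
    }

    /** Created in an instance method, so the anonymous class captures 'this'. */
    Runnable create() {
        return new Runnable() {
            @Override
            public void run() {
                System.out.println(greeting); // reads a field of the enclosing instance
            }
        };
    }

    public static void main(String[] args) {
        Runnable r = new AnonymousClassDemo("hello, inner class").create();
        Class<?> generated = r.getClass();
        System.out.println(generated.getName());           // enclosing class name plus $ and a number
        System.out.println(generated.isAnonymousClass());  // true
        for (Constructor<?> ctor : generated.getDeclaredConstructors()) {
            System.out.println(ctor);                       // takes the enclosing instance as a parameter
        }
        r.run();
    }
}
```

The constructor listed by getDeclaredConstructors is the counterpart of the invokespecial call in the bytecode above, which passes aload_0 (this) as its argument.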

2.7.2 How lambda expressions are implemented

    public void test2_7_2() {
        Runnable r1 = () -> System.out.println("hello, inner class");
        r1.run();
    }

Decompiling with javap -p -v (without -p the private method lambda$test2_7_2$0 would not appear) yields the following bytecode:

public void test2_7_2();
  descriptor: ()V
  flags: ACC_PUBLIC
  Code:
    stack=1, locals=2, args_size=1
       0: invokedynamic #34,  0             // InvokeDynamic #0:run:()Ljava/lang/Runnable;
       5: astore_1
       6: aload_1
       7: invokeinterface #33,  1           // InterfaceMethod java/lang/Runnable.run:()V
      12: return
    LineNumberTable:
      line 197: 0
      line 198: 6
      line 199: 12
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
          0      13     0  this   Lcom/zifang/util/core/TestByteCode;
          6       7     1    r1   Ljava/lang/Runnable;

private static void lambda$test2_7_2$0();
  descriptor: ()V
  flags: ACC_PRIVATE, ACC_STATIC, ACC_SYNTHETIC
  Code:
    stack=2, locals=0, args_size=0
       0: getstatic     #3                  // Field java/lang/System.out:Ljava/io/PrintStream;
       3: ldc           #44                 // String hello, inner class
       5: invokevirtual #15                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
       8: return
    LineNumberTable:
      line 197: 0

BootstrapMethods:
  0: #161 invokestatic java/lang/invoke/LambdaMetafactory.metafactory:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
    Method arguments:
      #162 ()V
      #163 invokestatic com/zifang/util/core/TestByteCode.lambda$test2_7_2$0:()V
      #162 ()V

You can see that the compiler has generated a private static method, lambda$test2_7_2$0, that holds the body of the lambda:

private static void lambda$test2_7_2$0() {
    System.out.println("hello, inner class");
}

The relevant constant pool entries are:

Constant pool:
   #34 = InvokeDynamic      #0:#164       // #0:run:()Ljava/lang/Runnable;
  #164 = NameAndType        #197:#200     // run:()Ljava/lang/Runnable;

BootstrapMethods:
  0: #161 invokestatic java/lang/invoke/LambdaMetafactory.metafactory:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
    Method arguments:
      #162 ()V
      #163 invokestatic com/zifang/util/core/TestByteCode.lambda$test2_7_2$0:()V
      #162 ()V

Here #0 is a special index that refers to row 0 of BootstrapMethods. You can see that it is a call to the static method LambdaMetafactory.metafactory(), whose return value is a java.lang.invoke.CallSite object; that object represents the real target of the method call.

public static CallSite metafactory(MethodHandles.Lookup caller,
                                   String invokedName,
                                   MethodType invokedType,
                                   MethodType samMethodType,
                                   MethodHandle implMethod,
                                   MethodType instantiatedMethodType)
        throws LambdaConversionException {
    AbstractValidatingLambdaMetafactory mf;
    mf = new InnerClassLambdaMetafactory(caller, invokedType,
                                         invokedName, samMethodType,
                                         implMethod, instantiatedMethodType,
                                         false, EMPTY_CLASS_ARRAY, EMPTY_MT_ARRAY);
    mf.validateMetafactoryArgs();
    return mf.buildCallSite();
}

  • caller: the lookup context provided by the JVM.
  • invokedName: the name of the method to invoke. In this case, invokedName is run.
  • samMethodType: the method signature (parameter types and return type) defined by the functional interface. In this case, the signature of the run method is "()void".
  • implMethod: the static method generated at compile time for the lambda expression, here invokestatic TestByteCode.lambda$test2_7_2$0.
  • instantiatedMethodType: generally the same as samMethodType or a specialization of it, in this case "()void".

The most important and most complex part is the InnerClassLambdaMetafactory constructor call:

public InnerClassLambdaMetafactory(MethodHandles.Lookup caller,
                                       MethodType invokedType,
                                       String samMethodName,
                                       MethodType samMethodType,
                                       MethodHandle implMethod,
                                       MethodType instantiatedMethodType,
                                       boolean isSerializable,
                                       Class<?>[] markerInterfaces,
                                       MethodType[] additionalBridges)
            throws LambdaConversionException {
        super(caller, invokedType, samMethodName, samMethodType,
              implMethod, instantiatedMethodType,
              isSerializable, markerInterfaces, additionalBridges);
        implMethodClassName = implDefiningClass.getName().replace('.', '/');
        implMethodName = implInfo.getName();
        implMethodDesc = implMethodType.toMethodDescriptorString();
        implMethodReturnClass = (implKind == MethodHandleInfo.REF_newInvokeSpecial)
                ? implDefiningClass
                : implMethodType.returnType();
        constructorType = invokedType.changeReturnType(Void.TYPE);
        lambdaClassName = targetClass.getName().replace('.', '/') + "$$Lambda$" + counter.incrementAndGet();
        cw = new ClassWriter(ClassWriter.COMPUTE_MAXS);
        int parameterCount = invokedType.parameterCount();
        if (parameterCount > 0) {
            argNames = new String[parameterCount];
            argDescs = new String[parameterCount];
            for (int i = 0; i < parameterCount; i++) {
                argNames[i] = "arg$" + (i + 1);
                argDescs[i] = BytecodeDescriptor.unparse(invokedType.parameterType(i));
            }
        } else {
            argNames = argDescs = EMPTY_STRING_ARRAY;
        }
    }

This is where the inner class is quietly set up. Its name follows the rule ClassName$$Lambda$n, where ClassName is the name of the class containing the lambda and n is a number that increases in order of generation. The class itself is generated with ASM and looks like this:

final class TestByteCode$$Lambda$1 implements Runnable {
    @Override
    public void run() {
        TestByteCode.lambda$test2_7_2$0();
    }
}

So overall, the execution order for invokedynamic is:

  • Where the lambda expression is declared, the compiler emits an invokedynamic instruction and generates a corresponding bootstrap method entry.
  • The first time the invokedynamic instruction is executed, the corresponding bootstrap method, LambdaMetafactory.metafactory, is called and dynamically generates the inner class.
  • The bootstrap method returns a CallSite object that links to the inner class implementing the Runnable interface.
  • The body of the lambda expression is compiled into a static method, which the dynamically generated inner class calls directly.
  • The actual call of the lambda is made with the invokeinterface instruction.

Put differently, this source:

public class TestByteCode {
    public void test2_7_2() {
        Runnable r1 = () -> System.out.println("hello, inner class");
        r1.run();
    }
}

behaves roughly like this:

public class TestByteCode {
    public void test2_7_2() {
        // This line stands for the call-site and method-handle machinery
        // that ends up calling TestByteCode$$Lambda$1#run()
    }

    private static void lambda$test2_7_2$0() {
        System.out.println("hello, inner class");
    }
}

final class TestByteCode$$Lambda$1 implements Runnable {
    @Override
    public void run() {
        TestByteCode.lambda$test2_7_2$0();
    }
}
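
To get a feel for what the invokedynamic instruction triggers, the bootstrap call can be made by hand. The sketch below calls LambdaMetafactory.metafactory directly with the same kind of arguments that javap lists under BootstrapMethods. The class MetafactoryDemo, the method lambdaBody, and the LOG buffer are made up for the example; real compiled code never does this explicitly.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class MetafactoryDemo {
    /** Records what the "lambda body" did, so the effect can be inspected. */
    static final StringBuilder LOG = new StringBuilder();

    /** Plays the role of the compiler-generated lambda$...$0 method. */
    private static void lambdaBody() {
        LOG.append("hello, inner class");
    }

    /** Does by hand what the first execution of invokedynamic would do. */
    static Runnable link() {
        try {
            MethodHandles.Lookup caller = MethodHandles.lookup();
            MethodType samMethodType = MethodType.methodType(void.class);      // ()void
            MethodType invokedType = MethodType.methodType(Runnable.class);    // ()Runnable
            MethodHandle implMethod =
                    caller.findStatic(MetafactoryDemo.class, "lambdaBody", samMethodType);
            CallSite site = LambdaMetafactory.metafactory(
                    caller, "run", invokedType, samMethodType, implMethod, samMethodType);
            // The call site's target is a factory that produces the Runnable.
            return (Runnable) site.getTarget().invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Runnable r = link();
        r.run();
        System.out.println(LOG);                    // hello, inner class
        System.out.println(r.getClass().getName()); // name of the generated class (varies by JDK)
    }
}
```

The six arguments line up with the parameter list of metafactory shown earlier: the lookup, the interface method name, the call site type, and then the three method arguments from the BootstrapMethods table.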

2.8 Implementation principle of synchronized

synchronized is impossible to avoid in concurrent programming.

When we use the synchronized keyword on an object:

private Object lock = new Object();

public void foo() {
    synchronized (lock) {
        bar();
    }
}

public void bar() {}

After decompiling:

public void foo();
    Code:
       0: aload_0
       1: getfield      #3                  // Field lock:Ljava/lang/Object;
       4: dup
       5: astore_1
       6: monitorenter
       7: aload_0
       8: invokevirtual #4                  // Method bar:()V
      11: aload_1
      12: monitorexit
      13: goto          21
      16: astore_2
      17: aload_1
      18: monitorexit
      19: aload_2
      20: athrow
      21: return
    Exception table:
       from    to  target type
           7    13    16   any
          16    19    16   any

  • 0 ~ 5: Pushes the lock object onto the stack, uses the dup instruction to duplicate the top of the stack, and stores one copy in local variable slot 1. One reference to the lock object is still left on the stack
  • 6: monitorenter pops that reference and acquires its monitor, which starts the synchronized region
  • 7 ~ 8: Call the bar() method
  • 11 ~ 12: Pushes the lock object onto the stack and calls monitorexit to release the lock
  • 16 ~ 20: The exception path named in the exception table: stores the exception, releases the lock with monitorexit, and rethrows the exception with athrow

When we use synchronized to modify a method:

synchronized public void testMe() {}

// Corresponding bytecode
public synchronized void testMe();
descriptor: ()V
flags: ACC_PUBLIC, ACC_SYNCHRONIZED

The JVM does not use special bytecode to invoke synchronized methods. When the JVM resolves the symbolic reference to a method, it checks whether the method is synchronized (that is, whether the ACC_SYNCHRONIZED flag is set). If so, the executing thread first acquires the lock: for an instance method, the lock on the instance; for a class (static) method, the lock on the Class object. When the synchronized method completes, the lock is released, whether the method returns normally or exits with an exception.
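
These rules can be checked from plain Java with Thread.holdsLock and the reflection Modifier API. The class SynchronizedDemo and its method names are invented for this sketch.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class SynchronizedDemo {
    private final Object lock = new Object();

    /** Synchronized block: monitorenter and monitorexit around the body. */
    boolean holdsLockInsideBlock() {
        synchronized (lock) {
            return Thread.holdsLock(lock);
        }
    }

    /** Throws inside the block, then reports whether the lock is still held. */
    boolean holdsLockAfterException() {
        try {
            synchronized (lock) {
                throw new IllegalStateException("boom");
            }
        } catch (IllegalStateException e) {
            // The exception-table handler already ran monitorexit.
            return Thread.holdsLock(lock);
        }
    }

    /** Synchronized instance method: locks the instance. */
    synchronized boolean instanceMethodHoldsThis() {
        return Thread.holdsLock(this);
    }

    /** Synchronized static method: locks the Class object. */
    static synchronized boolean staticMethodHoldsClass() {
        return Thread.holdsLock(SynchronizedDemo.class);
    }

    /** True when the named no-argument method carries the synchronized flag. */
    static boolean hasSynchronizedFlag(String methodName) {
        try {
            Method m = SynchronizedDemo.class.getDeclaredMethod(methodName);
            return Modifier.isSynchronized(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        SynchronizedDemo demo = new SynchronizedDemo();
        System.out.println(demo.holdsLockInsideBlock());                 // true
        System.out.println(demo.holdsLockAfterException());              // false
        System.out.println(demo.instanceMethodHoldsThis());              // true
        System.out.println(staticMethodHoldsClass());                    // true
        System.out.println(hasSynchronizedFlag("instanceMethodHoldsThis")); // true
        System.out.println(hasSynchronizedFlag("holdsLockInsideBlock"));    // false
    }
}
```

The last two lines show the difference between the two forms: a synchronized method is marked with a flag, while a synchronized block leaves no flag on the method and lives entirely in the bytecode.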

2.10 Bytecode and reflection

When the same method is invoked through reflection more than a certain number of times, a new class is generated with ASM to make the call. More on that later. I’m tired.

3. Summary

No summary. I’m tired