JVM: Bytecode execution engine

This chapter is excerpted from Chapter 8 of Getting Into the JVM Virtual Machine. It can be roughly divided into three parts:

Understanding that Java runs on a stack-based bytecode engine, and the concept of the stack frame.
Understanding that Java is a “static multi-dispatch, dynamic single-dispatch” language, which affects how Java dispatches methods when overloading or overriding occurs.
Learning about the java.lang.invoke package, which was introduced to add dynamic language support to the JVM, as well as the invokedynamic instruction, including a quick look at how Java uses this instruction to implement Lambda expressions.

1. Stack-based bytecode engine

The bytecode instruction stream generated by the javac compiler largely follows a stack-based Instruction Set Architecture (ISA). Most of its instructions are zero-address instructions, so they rely on the operand stack. The alternative is a register-based instruction set architecture, which relies on registers.

For example, for the same “1 + 1” arithmetic problem, the two instruction set architectures will have different implementations. The first is a stack-based instruction set:

iconst_1
iconst_1
iadd
istore_0

The two iconst_1 instructions each push the constant 1 onto the stack, iadd pops the two values at the top of the stack, adds them, and pushes the sum, and finally istore_0 pops the result and stores it in slot 0 of the local variable table. The second is a register-based instruction set:

mov eax,1
add eax,1 

The first instruction stores the value 1 in the EAX register. The second instruction adds 1 to it and keeps the result in EAX. Two-address instructions like these are the mainstay of the x86 instruction set, a typical register-based instruction set architecture.

Compared with a register-based architecture, a stack-based instruction set architecture is highly portable, which matches the original design goal of Java. Of course, this comes at the expense of some performance. First, a stack-based instruction set needs more instructions to do the same work (for example, pushing operands onto the stack and popping them off). Second, the stack lives in memory, so frequent stack access means frequent memory access, and the speed gap between processor and memory becomes a performance bottleneck.
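To make the stack-based sequence above concrete, here is a minimal sketch of a toy interpreter for just the three opcodes used in the “1 + 1” example. The class name ToyStackMachine and its run method are made up for illustration. A real JVM decodes binary opcodes rather than strings, so this only mirrors the data flow between the operand stack and the local variable table.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyStackMachine {
    /** Runs a tiny subset of JVM-like opcodes and returns the local variable table. */
    static int[] run(String... program) {
        Deque<Integer> operandStack = new ArrayDeque<>(); // working area for operands
        int[] locals = new int[1];                        // local variable table with a single slot
        for (String op : program) {
            switch (op) {
                case "iconst_1":
                    operandStack.push(1);                 // push the constant 1
                    break;
                case "iadd":
                    // pop two operands and push their sum
                    operandStack.push(operandStack.pop() + operandStack.pop());
                    break;
                case "istore_0":
                    locals[0] = operandStack.pop();       // pop the top value into slot 0
                    break;
                default:
                    throw new IllegalArgumentException("unknown opcode: " + op);
            }
        }
        return locals;
    }

    public static void main(String[] args) {
        int[] locals = run("iconst_1", "iconst_1", "iadd", "istore_0");
        System.out.println("local 0 = " + locals[0]);
    }
}
```

Note that none of the instructions names an operand address: every one of them works implicitly on the top of the stack, which is what “zero-address” means here.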

2. The stack frame

The execution of a function (or method) involves the following actions: fetching data, computing, and returning a value or throwing an exception. The stack frame is the data structure the virtual machine uses to invoke and execute methods. It is roughly divided into several areas that correspond to these actions:

Data is the raw material of computation. The data for a function comes from its list of arguments or from internally defined local variables, and the stack frame needs a small space to store them. This space is called the local variable table. Computing, in turn, needs a working area. This space is called the operand stack. It can be thought of as a workbench, with each stack frame having its own. How much space each stack frame needs for these two parts is calculated in advance by the javac compiler when it analyzes the source code.

So, who is responsible for executing the bytecode instruction stream? The answer is the JVM’s execution engine, which translates bytecode instructions into native machine instructions for the platform at hand. Each stack frame can be traced back to a method of some class (each frame holds a reference to its method in the runtime constant pool). Through this dynamic link, the execution engine locates the method the frame belongs to, then fetches and executes the bytecode instructions stored in its Code attribute. The dynamic-linking nature of Java is picked up again in the method invocation section below, which explains why it is called dynamic.

When the function completes, it returns, which requires a method return address. In a chain of calls, when a stack frame Fn returns, the thread restores the local variable table and operand stack of the previous frame Fn-1 (if Fn returns a value, it is pushed onto that operand stack), adjusts the PC counter to point to the instruction after the call, and so on. Put simply, the frame of Fn is popped after the call, its return value (if any) is handed over, and the calling context of Fn-1 is restored.

More vividly, the process can be likened to cooking (one dish, one cutting board, one vegetable basket, one plate, very elaborate).

Within a time slice, the CPU only executes the stack frame at the top of a thread’s stack. This frame is called the Current Stack Frame, and the method associated with it is called the Current Method.

Of course, the JVM makes a few optimizations. For example, two adjacent stack frames may let part of one frame’s operand stack overlap with the other’s local variable table, which saves space and lets the frames share data without copying values.

There are other details: the local variable table stores data in slots rather than the usual bytes. Java specifies that the “short” data types occupy 1 slot, while long and double occupy 2 slots. However, the Java Virtual Machine Specification does not define how many bytes a slot is. In addition, because variable slots can be reused, the total size of the local variable table is not simply the sum of all variables (the actual size can be somewhat smaller).
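The call-and-return flow described in this section can be sketched with a toy model. Everything here (the ToyFrames and Frame names, the fixed add(1, 2) scenario) is a hypothetical simplification. Real frames also carry the dynamic link and the return address, and they are managed by the JVM rather than by user code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyFrames {
    /** A heavily simplified stack frame: a local variable table plus an operand stack. */
    static final class Frame {
        final String method;
        final int[] locals;
        final Deque<Integer> operands = new ArrayDeque<>();

        Frame(String method, int maxLocals) {
            this.method = method;
            this.locals = new int[maxLocals];
        }
    }

    /** Simulates main() calling add(1, 2) and storing the result in its own slot 0. */
    static int simulate() {
        Deque<Frame> threadStack = new ArrayDeque<>();

        Frame caller = new Frame("main", 1);
        threadStack.push(caller);               // main is the current frame

        Frame callee = new Frame("add", 2);
        callee.locals[0] = 1;                   // arguments land in the callee's local slots
        callee.locals[1] = 2;
        threadStack.push(callee);               // add is now the current frame

        // Body of add: load both locals, add them, leave the sum on the operand stack.
        callee.operands.push(callee.locals[0]);
        callee.operands.push(callee.locals[1]);
        callee.operands.push(callee.operands.pop() + callee.operands.pop());
        int returnValue = callee.operands.pop();

        threadStack.pop();                      // discard the callee's frame
        Frame current = threadStack.peek();     // the caller is the current frame again
        current.operands.push(returnValue);     // the return value goes onto the caller's operand stack
        current.locals[0] = current.operands.pop();
        return current.locals[0];
    }

    public static void main(String[] args) {
        System.out.println(simulate());
    }
}
```

The sizes passed to the Frame constructor play the role of the values javac computes ahead of time for each method.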

3. Method calls

There is only one important point to understand in this section: Java is a static multi-dispatch, dynamic single-dispatch language.

Because of concepts such as polymorphism, overloading, and overriding, a method call can actually be “ambiguous”, so the process of calling a method is not as simple as one might expect. These “ambiguous” methods are called “virtual” methods, a name borrowed from C++. The bytecode instruction responsible for calling virtual methods is invokevirtual.

Which method a call actually invokes is decided case by case: sometimes at compile time, but mostly at run time (the earlier the better, of course). Because of this uncertainty, stack frames use “dynamic links” rather than “static links”. This certainly adds complexity to Java method calls, but the benefit is flexibility.

However, some methods really cannot change at run time. The methods that can be determined at compile time are static methods (called with invokestatic) and private methods, instance constructors, and parent-class methods (all three called with invokespecial). In addition, there are methods decorated with the final keyword (although they are still called with the invokevirtual instruction). These five categories of “unambiguous” methods are called “non-virtual methods”. The process of handling them is called Resolution.
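A small experiment makes the difference between resolution and dispatch visible. In this sketch (the ResolutionDemo, kind and hi names are made up for illustration), a static method is bound through the static type of the reference, while an ordinary instance method is selected by the actual type of the object.

```java
class ResolutionDemo {
    static class Father {
        static String kind() { return "Father.kind"; }   // static method: resolved at compile time
        String hi() { return "Father.hi"; }              // instance method: virtual
    }

    static class Son extends Father {
        static String kind() { return "Son.kind"; }      // hides Father.kind, does not override it
        @Override
        String hi() { return "Son.hi"; }                 // overrides Father.hi
    }

    static String run() {
        Father f = new Son();
        // Calling a static method through a reference is legal but discouraged;
        // it is done here only to show that the static type (Father) decides.
        return f.kind() + ", " + f.hi();
    }

    public static void main(String[] args) {
        System.out.println(run()); // Father.kind, Son.hi
    }
}
```

Because kind() is static, the compiler binds the call to Father.kind once and for all; hi() goes through invokevirtual and ends up in Son.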

All other methods, which are called with the invokevirtual or invokeinterface instructions, require Dispatch. Yes, the “normal methods” defined in Java are virtual functions, but because dispatch is Java’s default behavior, we do not care whether they are “virtual” when programming. Dispatch itself can be divided into four quadrants along two axes: static/dynamic and single/multiple.

Java occupies two of these quadrants: static multiple dispatch and dynamic single dispatch. In other sources, static dispatch is referred to as Method Overload Resolution, meaning that static dispatch is sometimes classified under resolution. This article follows the tradition of Getting Into the JVM Virtual Machine and treats it as dispatch.

3.1 Static Type and Dynamic Type

There are a few more concepts to mention here. Assuming Father is the parent class of Son, we have an upcast object:

// son is an upcast object.
Father son = new Son();

The upcast object is a typical use case when learning about Java dynamic binding (called dynamic dispatch in this article). In this assignment, Father is the “static type” of the variable, which is fixed at compile time. But son clearly refers to an instance of type Son, which is called the “actual type” of the variable. The static type can be changed by casting up or down.

If you call one of son’s methods, hi(), that method is exactly such an unknowable “virtual method.”

// son is an upcast object.
Father son = new Son();

// Assume hi() is declared in Father and inherited (and possibly overridden) by Son.
son.hi();

The context is so simple that, for those of us used to thinking in Java, hi() is obviously an instance method of the Son class. But from the compiler’s point of view, that cannot be asserted until the program actually runs. The following code makes this more obvious:

Father son;
son = new Random().nextInt(100) % 2 == 0 ? new Son() : new Father();

// son is assigned only once, so every iteration calls the same method,
// but which one that is may differ from run to run.
for (int i = 0; i <= 100; i++)
{
    // Should Father's hi() be called, or Son's?
    // No one knows what the result will be unless the code is run.
    son.hi();
}

At this point, no one can say whether the receiver of hi() is a Father or a Son. hi() seems to be stuck in a “superposition” until the program is actually run.
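The distinction between static type and actual type can be observed directly. In this sketch (TypeDemo and describe are illustrative names, and hi() is assumed to be overridden in Son), the parameter always has the static type Father, yet getClass() and the method that actually runs both follow the actual type.

```java
class TypeDemo {
    static class Father {
        String hi() { return "Father"; }
    }

    static class Son extends Father {
        @Override
        String hi() { return "Son"; }
    }

    /** The static type of ref is always Father; its actual type is only known at run time. */
    static String describe(Father ref) {
        return ref.getClass().getSimpleName() + " -> " + ref.hi();
    }

    public static void main(String[] args) {
        Father son = new Son();                     // static type Father, actual type Son
        System.out.println(describe(son));          // Son -> Son
        System.out.println(describe(new Father())); // Father -> Father
    }
}
```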

3.2 Single dispatch and multiple dispatch

The receiver of a method (in Go, whatever a function is bound to is that function’s receiver, which answers “whose method is it”, and the same applies here) and the parameters of the method are collectively referred to as the “cases” of the method. The name comes from the book, but if you are not comfortable with it, you can read it as “variable.” When a method call is “ambiguous” (that is, when overloading or overriding is involved), the dispatch is called “single dispatch” if the target method is selected on the basis of only one case. If, on the other hand, several cases are considered in choosing the most appropriate method, the dispatch is called “multiple dispatch”.

3.3 Static dispatch and Dynamic Dispatch

In layman’s terms, static dispatch is dispatch that can be determined at compile time, and dynamic dispatch is dispatch that can only be determined at run time.

These two concepts are closely related to static and actual types. Their typical applications are overloading (static dispatch) and overriding (dynamic dispatch), respectively.

First, static dispatch. Static dispatch means that the dispatch of a method depends only on static types. And because static types are known at compile time, static dispatch is known at compile time, too. It follows that static dispatch is not actually the responsibility of the JVM, but of the javac compiler.

Dynamic dispatch then means that the dispatch of a method depends on the actual type of the method receiver. The actual type is known only at run time, so dynamic dispatch can only occur at run time. It follows that dynamic dispatch is what the JVM does at run time.

These concepts are a prerequisite for understanding the phrase “Java is static multiple dispatch, dynamic single dispatch.” It may sound abstract, but a few experiments lead us to this conclusion.

3.4 Static Multiple Dispatch

First, Java static dispatch occurs during overloading.

public class StaticDispatcher {

    static private class Father{}
    static private class Son extends Father{}

    // receive is only overloaded here, not overridden.
    public void receive(Son a){
        System.out.println("receive(Son)");
    }

    public void receive(Father a){
        System.out.println("receive(Father)");
    }

    public static void main(String[] args) {
        StaticDispatcher staticDispatcher = new StaticDispatcher();

        Father a = new Son();
        // the static type of a is Father, so receive(Father) is called
        staticDispatcher.receive(a);

        Son b = new Son();
        // the static type of b is Son, so receive(Son) is called
        staticDispatcher.receive(b);
    }
}

Java’s behavior here is called “static multiple dispatch” because the compiler decides on the basis of two cases:

It determines the static type of the argument being passed.
It checks whether the method receiver (by its static type) has a method that accepts exactly that static type. If not, it checks for a method that accepts a parent of that type. If there is still none, it looks for suitable methods in the receiver’s supertypes, and so on.

Based on these two cases, the compiler can select the most appropriate of all the overloaded methods during compilation using static types alone, hence the name “static multiple dispatch”. For example, suppose the StaticDispatcher::receive(Son) method is commented out. The code still compiles and runs successfully, and the result is:

receive(Father)
receive(Father)

Evidently, when there is no receive(Son) method, the compiler can still dispatch the second call, staticDispatcher.receive(b), to the receive(Father) method.
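Since overload selection depends only on static types, changing the static type of the argument expression with a cast changes the selected overload, even though the object stays the same. A minimal sketch, with OverloadDemo as a made-up name and the methods returning strings instead of printing so the result is easy to check:

```java
class OverloadDemo {
    static class Father {}
    static class Son extends Father {}

    String receive(Father a) { return "receive(Father)"; }
    String receive(Son a)    { return "receive(Son)"; }

    static String run() {
        OverloadDemo receiver = new OverloadDemo();
        Father a = new Son();                             // actual type Son, static type Father
        String byDeclaredType = receiver.receive(a);      // compiler sees Father
        String byCast = receiver.receive((Son) a);        // compiler sees Son after the cast
        return byDeclaredType + ", " + byCast;
    }

    public static void main(String[] args) {
        System.out.println(run()); // receive(Father), receive(Son)
    }
}
```

The object referred to by a never changes; only the type the compiler sees does.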

3.5 Dynamic Single Dispatch

Java’s dynamic single dispatch typically occurs when an overridden method is called on an upcast object.

public class BaseDynamicDispatcher {
    // 1
    public void receive(Object a){
        System.out.println("BaseDynamicDispatcher::receive(Object)");
    }
}

class DynamicDispatcher extends BaseDynamicDispatcher {
    // 2
    public void receive(Object a){
        System.out.println("DynamicDispatcher::receive(Object)");
    }

    // 3: note that this method is not inherited; it is an overload added by DynamicDispatcher.
    public void receive(String a){
        System.out.println("DynamicDispatcher::receive(String)");
    }

    public static void main(String[] args) {
        BaseDynamicDispatcher baseDynamicDispatcher = new DynamicDispatcher();
        baseDynamicDispatcher.receive("aString");
    }
}

Because the static type of baseDynamicDispatcher is BaseDynamicDispatcher, the compiler settles on the receive(Object) signature, and at run time the JVM only chooses between methods 1 and 2, depending solely on the actual type of the method receiver. receive(String) is not a method of the parent class BaseDynamicDispatcher, so it is ignored during dynamic dispatch. The program therefore prints DynamicDispatcher::receive(Object).

In other words, Java behaves as if it “ignores” the actual type of the argument, even though method 3 seems to us the most appropriate choice. This is why dynamic dispatch in Java is called dynamic single dispatch.

An example of dynamic multiple dispatch is the Groovy language, which also runs on the JVM. Unlike Java, Groovy “pinpoints” method 3 for the same code. Dynamic multiple dispatch lets Groovy always choose the most “appropriate” method to call, in keeping with its “flexible dynamic language” style, at the expense of a more complicated call process. I covered this in my Groovy column: Understanding Dynamic Languages with Groovy (juejin.cn), Section 3.5: Method polymorphism with Groovy.

So, is there a way to get baseDynamicDispatcher to dispatch to the receive(String) method? There are two ideas:

If the static type of baseDynamicDispatcher stays the same, move the receive(String) method up into the parent class BaseDynamicDispatcher. This idea relies on the static multiple dispatch described above.
Declare the static type of baseDynamicDispatcher as DynamicDispatcher, or cast to it at call time, so that the premise of dynamic single dispatch is satisfied.
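Both ideas can be tried out in a few lines. The sketch below uses shortened, made-up class names (Base, Derived, Base2, Derived2) and returns strings instead of printing. For the first idea it additionally assumes that the subclass overrides the receive(String) method that was moved into the parent, to show that dynamic dispatch still applies afterwards.

```java
class DispatchFixDemo {
    // Original shape: only the subclass knows receive(String).
    static class Base {
        String receive(Object a) { return "Base::receive(Object)"; }
    }

    static class Derived extends Base {
        @Override
        String receive(Object a) { return "Derived::receive(Object)"; }
        String receive(String a) { return "Derived::receive(String)"; }
    }

    // Idea 1: the parent class declares receive(String) as well.
    static class Base2 {
        String receive(Object a) { return "Base2::receive(Object)"; }
        String receive(String a) { return "Base2::receive(String)"; }
    }

    static class Derived2 extends Base2 {
        @Override
        String receive(String a) { return "Derived2::receive(String)"; }
    }

    static String unchanged() {
        Base b = new Derived();
        return b.receive("aString");               // compiler picks receive(Object)
    }

    static String viaParentOverload() {
        Base2 b = new Derived2();
        return b.receive("aString");               // compiler picks receive(String)
    }

    // Idea 2: change the static type of the receiver with a cast.
    static String viaCast() {
        Base b = new Derived();
        return ((Derived) b).receive("aString");   // compiler now sees Derived's overloads
    }

    public static void main(String[] args) {
        System.out.println(unchanged());
        System.out.println(viaParentOverload());
        System.out.println(viaCast());
    }
}
```

In all three calls the JVM only dispatches on the receiver’s actual type; what changes is which signature the compiler selected beforehand.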

Resources: Java polymorphism (overloading and overriding in detail, static dispatch and dynamic dispatch) - CSDN blog

JVM counter (PC Register) – Blind Scavenger – Cnblogs.com

The Understanding of dynamic connection in Stack Frames – Tom cat Xiao Qi – Cnblogs.com

4. On dynamic languages

A key feature of a dynamically typed language is that most type checking is deferred to run time rather than done at compile time. Languages that do this include Python, JavaScript, Ruby, and so on. Java and C++, by contrast, are typical static languages that do type checking at compile time.

Dynamic languages are more flexible than static languages. In exchange, they require developers to follow a set of “self-imposed constraints” to ensure that the program does not fail during execution. Static languages make the code a little more verbose, but the benefit is that the compiler can check for potential problems on the developer’s behalf and prevent some run-time accidents.

Back to Java. The first version of the Java Virtual Machine Specification promised, “In the future, we will extend the Java Virtual Machine appropriately to better support other languages running on the Java Virtual Machine.” In fact, quite a few languages run on top of the JVM, both static and dynamic.

However, the JVM started out as a platform for a static language, Java, and many tasks were left to compile time, so JVM support for dynamic languages was poor in the early days (before JDK 7). As a result, the java.lang.invoke package appeared in JDK 7 as part of the JSR-292 proposal, and after many years the JVM finally gained a new bytecode instruction, invokedynamic, to support dynamic languages (or user-controlled dispatch) at the lowest level.

4.1 java.lang.invoke & MethodHandle

java.lang.invoke is designed to provide a new mechanism for determining the target method dynamically, at the level of user code, in addition to method dispatch by symbolic references alone.

Java, for example, treats objects as first-class citizens and long had no way to pass a function (expression) around directly. This limitation was only eased with the introduction of Lambda expressions. The java.lang.invoke package provides MethodHandle to make up for Java’s lack of a “function pointer” concept.

The following code shows an example that uses MethodHandle to look up, at run time and based on the actual type of obj, a “handle” to the println() method and call it:

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodType;
import java.util.Random;

import static java.lang.invoke.MethodHandles.lookup;

public class MethodHandleTest {

    static private class FakePrintStream {
        public void println(String any) {
            System.out.println(any);
        }
    }

    public static void main(String[] args) throws Throwable {

        // Regardless of the actual type of obj, this code works as long as a "void println(String)" method is found.
        Object obj = new Random().nextInt(100) % 2 == 0 ? System.out : new FakePrintStream();

        MethodHandle methodHandle = bindPrintln(obj);
        // Equivalent to calling System.out.println(...)
        methodHandle.invoke("invoke System.out::println");
    }

    /**
     * Searches the receiver for a println method and extracts a "handle" to it.
     * @param receiver the object to search
     * @return a method handle bound to the receiver
     * @throws NoSuchMethodException
     * @throws IllegalAccessException
     */
    static MethodHandle bindPrintln(Object receiver) throws NoSuchMethodException, IllegalAccessException {

        // Defines the signature of the target method: the first argument is the return type, followed by the parameter types.
        MethodType mt = MethodType.methodType(void.class, String.class);

        /*
         * findVirtual(...) looks up a virtual method; the lookup methods are named after bytecode instructions (findStatic, findSpecial, and so on).
         * bindTo(...) binds the method to the instance, which is equivalent to fixing its "this".
         */
        return lookup().findVirtual(receiver.getClass(), "println", mt).bindTo(receiver);
    }
}

This code is a manual implementation of the invokevirtual instruction, only this time at the level of user code. From a Java perspective, the same thing can be done with Reflection. The main differences between the two are:

Reflection simulates a call at the Java code level, while MethodHandle simulates a call at the bytecode level.
Reflection can extract all kinds of information about a method (heavyweight), while MethodHandle focuses only on invoking the method (lightweight).
MethodHandle can, in theory, benefit from the JVM’s optimizations for method invocation instructions, whereas reflection cannot.
Reflection can bypass access modifiers, which makes it potentially less safe (access beyond intended boundaries, forced calls, and so on).

In a word, Reflection is easy to use but slightly slower. MethodHandle is harder to use (you do not get extra information about the method) but runs faster. Stepping outside the “Java perspective”, MethodHandle is designed to serve every language on the JVM, of which Java is just one.
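To make the comparison concrete, here is a minimal sketch that calls String.length() both ways. The class and method names (ReflectVsHandle, viaReflection, viaMethodHandle) are made up for illustration, and the sketch says nothing about relative speed; it only shows the shape of the two APIs.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

class ReflectVsHandle {
    /** Reflection: look up a Method object, then invoke it with boxed arguments and result. */
    static int viaReflection(String s) {
        try {
            Method m = String.class.getMethod("length");
            return (Integer) m.invoke(s);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** MethodHandle: look up a typed handle, then call it with an exactly matching signature. */
    static int viaMethodHandle(String s) {
        try {
            MethodHandle mh = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
            // invokeExact requires the call site to match the handle type (String)int exactly.
            return (int) mh.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaReflection("hello"));
        System.out.println(viaMethodHandle("hello"));
    }
}
```

The Method object can also be queried for annotations, modifiers, and parameter details, whereas the MethodHandle exposes little more than its type.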

At the same time, this coding style can be called “capability-based” design: whether obj is a PrintStream or some other type defined by the user, methodHandle always runs correctly as long as the program can find a void println(String) method, without any interface being defined to constrain it. obj can also be described as “duck typed”, a design style that clearly suits dynamic languages, where nothing needs to be spelled out in advance.

4.2 invokedynamic and Lambda expressions

For further information on Lambda and invokedynamic, see the following excellent articles:

the JVM instructions invokestatic, invokespecial, invokeinterface, invokevirtual, invokedynamic_helloworld column – CSDN blog

Understand invokedynamic | DouO ‘s Blog (dourok. Info)

Invokedynamic: Java’s Secret Weapon – Zhihu (zhihu.com)

Java language dynamic -invokedynamic_ program development log [learning never ends] -CSDN blog _invokedynamic

The invokedynamic bytecode was likewise introduced to solve the “rigid method dispatch strategy” imposed by the previous four invoke* instructions. Like MethodHandle, it aims to move the work of method dispatch from the bytecode level to the user code level. The general idea can be summarized as follows: each invokedynamic instruction is associated with a Bootstrap Method (BSM), and how the instruction dispatches depends on what the bound BSM does, which itself can be designed by the user at the code level.

In this way, the decision about which method to find is handed over from the virtual machine to the user, including language designers. For example, in Groovy the invokedynamic instruction is the cornerstone of its dynamic multiple dispatch feature. In Java, this instruction is used for Lambda expressions. Unlike the original book, this section focuses on how lambdas are dispatched using invokedynamic. Here is some sample code:

public static void main(String[] args) {

    Function<Integer, Integer> f;
    Function<Integer, Integer> _x2 = i -> 2 * i;
    Function<Integer, Integer> _x3 = i -> 3 * i;

    f = new Random().nextInt(100) % 2 == 0 ? _x2 : _x3;
}

Since JDK 8, javac desugars every Lambda expression in the source code at compile time: its logic is compiled into a private method of the current class (referred to below as the desugared method) named lambda$<func>$<X>, where func is the name of the method in which the lambda expression appears and X is the sequence number of the lambda expression. Parsing the above code with the javap -c -v -p <.class> command yields the private methods that the _x2 and _x3 expressions are desugared into:

  private static java.lang.Integer lambda$main$1(java.lang.Integer);
    descriptor: (Ljava/lang/Integer;)Ljava/lang/Integer;
    flags: ACC_PRIVATE, ACC_STATIC, ACC_SYNTHETIC
    Code:
      stack=2, locals=1, args_size=1
         0: iconst_3
         1: aload_0
         2: invokevirtual #8                  // Method java/lang/Integer.intValue:()I
         5: imul
         6: invokestatic  #5                  // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
         9: areturn
      LineNumberTable:
        line 13: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      10     0     i   Ljava/lang/Integer;

  private static java.lang.Integer lambda$main$0(java.lang.Integer);
    descriptor: (Ljava/lang/Integer;)Ljava/lang/Integer;
    flags: ACC_PRIVATE, ACC_STATIC, ACC_SYNTHETIC
    Code:
      stack=2, locals=1, args_size=1
         0: iconst_2
         1: aload_0
         2: invokevirtual #8                  // Method java/lang/Integer.intValue:()I
         5: imul
         6: invokestatic  #5                  // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
         9: areturn
      LineNumberTable:
        line 12: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      10     0     i   Ljava/lang/Integer;
}

In addition, the places where the Lambda expressions appear are replaced by invokedynamic instructions, also known as dynamic call sites. Note the bytecode instructions at offsets 2 and 8:

     0: aconst_null
     1: astore_1
     2: invokedynamic #2,  0              // InvokeDynamic #0:apply:()Ljava/util/function/Function;
     7: astore_2
     8: invokedynamic #3,  0              // InvokeDynamic #1:apply:()Ljava/util/function/Function;
    13: astore_3

As mentioned above, each invokedynamic instruction is associated with a Bootstrap Method (BSM). The BSMs numbered 0 and 1, which bootstrap _x2 and _x3 respectively, can be seen at the bottom of the javap output. BootstrapMethods is essentially an attribute table.

BootstrapMethods:
  0: #37 invokestatic java/lang/invoke/LambdaMetafactory.metafactory:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
    Method arguments:
      #38 (Ljava/lang/Object;)Ljava/lang/Object;
      #39 invokestatic forJava/InvokeDynamicTest.lambda$main$0:(Ljava/lang/Integer;)Ljava/lang/Integer;
      #40 (Ljava/lang/Integer;)Ljava/lang/Integer;
  1: #37 invokestatic java/lang/invoke/LambdaMetafactory.metafactory:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
    Method arguments:
      #38 (Ljava/lang/Object;)Ljava/lang/Object;
      #42 invokestatic forJava/InvokeDynamicTest.lambda$main$1:(Ljava/lang/Integer;)Ljava/lang/Integer;
      #40 (Ljava/lang/Integer;)Ljava/lang/Integer;

At run time, the BSM is invoked when the execution engine encounters an invokedynamic instruction. From the content above, we can infer that bootstrapping Lambda expressions is the job of the java.lang.invoke.LambdaMetafactory::metafactory method. The method takes six arguments: the first three are fixed (the JVM supplies them to every bootstrap method), and the last three come from the “Method arguments” entries above.
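The six arguments become easier to read when the bootstrap method is called by hand. The following sketch does in user code roughly what the JVM does when it links one of the call sites above: it passes a lookup, the interface method name, the call site type, and the three “Method arguments” to LambdaMetafactory.metafactory, then asks the resulting CallSite for a Function. The class MetafactoryDemo and the method twice are hypothetical stand-ins for the compiler-generated lambda$main$0.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Function;

class MetafactoryDemo {
    /** Stands in for the desugared lambda body (i -> 2 * i). */
    static Integer twice(Integer i) {
        return 2 * i;
    }

    @SuppressWarnings("unchecked")
    static Function<Integer, Integer> link() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle impl = lookup.findStatic(MetafactoryDemo.class, "twice",
                    MethodType.methodType(Integer.class, Integer.class));
            CallSite site = LambdaMetafactory.metafactory(
                    lookup,                                             // 1: caller's lookup context
                    "apply",                                            // 2: name of the interface method
                    MethodType.methodType(Function.class),              // 3: call site type, ()Function
                    MethodType.methodType(Object.class, Object.class),  // 4: erased signature of apply
                    impl,                                               // 5: handle to the implementation
                    MethodType.methodType(Integer.class, Integer.class) // 6: specialized signature
            );
            // The call site's target is a factory that produces the Function instance.
            return (Function<Integer, Integer>) site.getTarget().invoke();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(link().apply(21));
    }
}
```

Arguments 4 to 6 correspond to constant pool entries #38, #39 and #40 in the javap output.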

First, it constructs an anonymous class that wraps the private method the Lambda was desugared into. The class is generated in memory by weaving bytecode with ASM and is then loaded directly through Unsafe, without being written to a file. However, you can have it dumped to a file at run time with the following virtual machine parameter:

-Djdk.internal.lambda.dumpProxyClasses=<path>

Take one of the output results as an example:

import java.lang.invoke.LambdaForm.Hidden;
import java.util.function.Function;

// $FF: synthetic class
final class InvokeDynamicTest$$Lambda$1 implements Function {
    // private constructor
    private InvokeDynamicTest$$Lambda$1() {
    }

    @Hidden
    public Object apply(Object var1) {
        // Delegates to the desugared method generated in the original class and returns its result.
        return InvokeDynamicTest.lambda$main$0((Integer)var1);
    }
}

Finally, the BSM returns a CallSite object that internally encapsulates the InvokeDynamicTest$$Lambda$1::apply method. As this shows, Java Lambda expressions are far more than syntactic sugar over functional interfaces.

Here’s a very brief summary of the process:

The compiler desugars the Lambda into a private method and replaces the place where the Lambda appears with an invokedynamic instruction whose CallSite is linked by the LambdaMetafactory BSM.
At run time, that CallSite is linked to a call to the corresponding desugared method.
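Both halves of this summary leave traces that can be inspected at run time. The sketch below (LambdaShapeDemo is a made-up name) uses reflection to look for a synthetic method whose name starts with lambda$ and prints the name of the class generated for the lambda. The exact names are an implementation detail of javac and of the JDK in use, so treat the output as something to observe rather than rely on.

```java
import java.lang.reflect.Method;
import java.util.function.Function;

class LambdaShapeDemo {
    static final Function<Integer, Integer> DOUBLE = i -> 2 * i;

    /** True if the compiler left a synthetic lambda$... method in this class. */
    static boolean hasDesugaredMethod() {
        for (Method m : LambdaShapeDemo.class.getDeclaredMethods()) {
            if (m.isSynthetic() && m.getName().startsWith("lambda$")) {
                return true;
            }
        }
        return false;
    }

    /** Name of the class that the bootstrap method generated for the lambda. */
    static String lambdaClassName() {
        return DOUBLE.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("desugared method present: " + hasDesugaredMethod());
        System.out.println("generated class: " + lambdaClassName());
    }
}
```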

So why did Java choose such a roundabout route through invokedynamic? One reason is that if Java later comes up with a better way to bootstrap calls to desugared Lambda expressions, only the code behind the invokedynamic “hook” needs to be replaced, and optimization can be concentrated at the BSM level. This keeps the logic from being hard-wired, so that later updates do not have to be “torn down and rebuilt” from the ground up. For more information, see this Zhihu discussion: Why are Java 8 Lambda expressions based on invokedynamic? - Zhihu (zhihu.com)

4.3 * Practicing the invokedynamic instruction

The best way to learn about the invokedynamic bytecode instruction is to use it. Note: the code in this part is excerpted from a CSDN blog post: JVM Invokedynamic call instruction _FEATHER (Feather)-CSDN blog. I briefly walk through the general process based on the original text.

For now, however, user Java code cannot make javac emit invokedynamic bytecode directly. If we want to use invokedynamic for dynamic dispatch ourselves, we need the ASM tool to emit the instruction and bind a BSM to it. Developers who want to specialize in this area therefore need to be familiar with at least JVM bytecode instructions and the ASM framework for manipulating bytecode.

Our goal here is just to get a better understanding of how invokedynamic works, so I won’t go into the details of ASM below. Here is an example:

Prepare a Horse POJO with the race() method.

public class Horse {
    public void race(){
        System.out.println("horse is running...");
    }
}

The second step is to prepare the Match class, which has a startRace method that expects to receive an object o and call its race() method. At the source level, however, it has no implementation yet.

package forJava.inDyT;

import java.lang.invoke.*;
public class Match {
    // o, the receiver of the race() call, is expected to have a race() method.
    public static void startRace(Object o){
        // ASM will be used to embed the following bytecode in this method,
        // the equivalent of the Java code o.race(). In bytecode, o is first pushed with the aload instruction.
        // Note that invokedynamic actually invokes the BSM; the dynamic call to race() goes through the call site the BSM returns.

        // aload o
        // invokedynamic race()
    }
}

Prepare the RunApp class, which is only used to test your program and calls the Match::startRace() method, so don’t worry too much about it.

public class RunApp {
    public static void main(String[] args) {
        Match.startRace(new Horse());
    }
}

Our goal is to use the invokedynamic instruction to dynamically invoke the Horse::race() method. To keep the example simple, part of the program’s logic is hard-coded.

Since there is nothing inside the Match::startRace method yet, running RunApp now displays nothing. By analogy: the BSM for Lambda expressions is provided by java.lang.invoke.LambdaMetafactory, and javac associates it with the invokedynamic instruction directly during compilation. Obviously, if we want to use the invokedynamic instruction ourselves, we need our own BSM. Its implementation is placed inside Match here, but that is an arbitrary choice (as long as the later ASM step can find it, there is no requirement on where the BSM lives).

For simplicity, the BSM looks up the virtual method directly on Horse.class and returns it as the call site, without much flexibility. The complete Match class is defined as follows:

package forJava.inDyT;

import java.lang.invoke.*;
public class Match {
    public static void startRace(Object o){}

    /**
     * Bootstrap method (BSM).
     *
     * @param lookup           lookup instance
     * @param targetMethodName name of the target method
     * @param methodType       type of the method handle the call site is linked to
     * @return the call site
     */
    public static CallSite bootstrap(MethodHandles.Lookup lookup, String targetMethodName, MethodType methodType)
        throws NoSuchMethodException, IllegalAccessException {
        // Create the method handle.
        // Horse.class is hard-coded for simplicity.
        MethodHandle methodHandle = lookup.findVirtual(Horse.class, targetMethodName, MethodType.methodType(void.class));

        // Create the call site. asType() derives an adapter from the original method handle
        // so that its type matches the type requested by the call site.
        return new ConstantCallSite(methodHandle.asType(methodType));
    }
}
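Before any bytecode is rewritten, the behaviour of this BSM can be checked in plain Java by calling it by hand, roughly the way the JVM does when it links an invokedynamic call site for the first time. The sketch below is only an approximation under stated assumptions: ManualLinkSketch is a hypothetical class, its nested Horse is a stand-in for the chapter's Horse class, and the bootstrap logic is repeated so the example is self-contained.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class ManualLinkSketch {
    // Hypothetical stand-in for the Horse class used in this chapter.
    static class Horse {
        static int raceCount = 0;

        public void race() {
            raceCount++;
            System.out.println("horse is running...");
        }
    }

    // Same logic as Match.bootstrap, repeated here to keep the sketch self-contained.
    static CallSite bootstrap(MethodHandles.Lookup lookup, String name, MethodType type)
            throws NoSuchMethodException, IllegalAccessException {
        MethodHandle handle = lookup.findVirtual(Horse.class, name, MethodType.methodType(void.class));
        return new ConstantCallSite(handle.asType(type));
    }

    // Emulates "aload obj; invokedynamic race:(Ljava/lang/Object;)V" without bytecode rewriting.
    static void startRace(Object o) {
        try {
            // The call site type: the receiver is passed as an explicit Object argument.
            MethodType callSiteType = MethodType.methodType(void.class, Object.class);
            CallSite site = bootstrap(MethodHandles.lookup(), "race", callSiteType);
            site.dynamicInvoker().invoke(o);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        startRace(new Horse());
    }
}
```

Unlike a real invokedynamic call site, which is linked once and then reused, this sketch calls the bootstrap method on every invocation of startRace; it illustrates the linkage step, not its caching.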

Ordinary Java source cannot make javac emit an invokedynamic instruction bound to a custom BSM; this has to be done with a tool such as the ASM framework. The whole program therefore runs in two steps. First, run the core class ASMHelper, which inserts the bytecode instructions through ASM (binding the BSM) and writes the resulting class bytes to the class file output path (for example, the default output path of IDEA is the target/classes/.. folder). Then start the RunApp main program, which loads the Match class file modified by ASM and produces the correct result.

Following this logic, the core class ASMHelper is implemented as follows. If you are only interested in the invokedynamic instruction, you need not dwell on the implementation details, since most of the code consists of ASM-related operations:

package forJava.inDyT;
// Note: the author is running JDK 1.8. On later JDK versions, this internal package has to be made accessible from the java.base module.
import jdk.internal.org.objectweb.asm.*;
import java.io.IOException;
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.nio.file.Files;
import java.nio.file.Paths;
public class ASMHelper implements Opcodes {
    /**
     * 1. Custom class visitor: MyClassVisitor.
     * visitMethod() receives the visit event for each method and can substitute a custom MethodVisitor when a condition is met.
     */
    static class MyClassVisitor extends ClassVisitor {
        public MyClassVisitor(int api, ClassVisitor cv) {
            super(api, cv);
        }
        @Override
        public MethodVisitor visitMethod(int access, String name, String descriptor, String signature, String[] exceptions) {
            MethodVisitor visitor = super.visitMethod(access, name, descriptor, signature, exceptions);

            // Search for the method named startRace among all methods and return its MethodVisitor.
            if ("startRace".equals(name)) {
                return new MyMethodVisitor(ASM5, visitor);
            }
            return visitor;
        }
    }

    /**
     * 2. Custom method visitor.
     */
    static class MyMethodVisitor extends MethodVisitor {

        // Bootstrap method (BSM)
        private static final String BOOTSTRAP_CLASS_NAME = Match.class.getName().replace('.', '/');
        private static final String BOOTSTRAP_METHOD_NAME = "bootstrap";
        private static final String BOOTSTRAP_METHOD_DESC = MethodType
                .methodType(CallSite.class, MethodHandles.Lookup.class, String.class, MethodType.class)
                .toMethodDescriptorString();

        // Target method
        private static final String TARGET_METHOD_NAME = "race";
        // Note that race() itself takes no arguments; Ljava/lang/Object; denotes the receiver of the method ("this").
        private static final String TARGET_METHOD_DESC = "(Ljava/lang/Object;)V";

        private MethodVisitor mv;
        public MyMethodVisitor(int api, MethodVisitor mv) {
            super(api, null);
            this.mv = mv;
        }

        @Override
        public void visitCode() {
            mv.visitCode();
            // 1. Generate bytecode in startRace(): aload obj
            mv.visitVarInsn(ALOAD, 0); // local variable instruction

            // 2. Handle to the bootstrap method of the Match class
            Handle handle = new Handle(H_INVOKESTATIC,
                    BOOTSTRAP_CLASS_NAME,
                    BOOTSTRAP_METHOD_NAME,
                    BOOTSTRAP_METHOD_DESC);
            /*
             * 3. Generate the invokedynamic instruction.
             *    - Bind the call site produced by Match.bootstrap() to the invokedynamic instruction.
             *    - The method handle of the target method is linked to the call site as well,
             *      so that subsequent calls go to it directly.
             */
            mv.visitInvokeDynamicInsn(TARGET_METHOD_NAME, TARGET_METHOD_DESC, handle);

            // 4. Return from the method
            mv.visitInsn(RETURN);
            mv.visitMaxs(1, 1);
            mv.visitEnd();
        }
    }

    /**
     * Modifies the bytecode of Match.class by adding the invokedynamic instruction.
     */
    public static void main(String[] args) throws IOException {
        // 1. Class reader: loads the raw bytes of Match and turns them into visit events.
        // forJava.inDyT is the package name; the argument is the fully qualified name of the class.
        ClassReader cr = new ClassReader("forJava.inDyT.Match");

        // 2. Class writer.
        ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_FRAMES);

        // 3. The layer between reader and writer intercepts the visit events. When the target method
        // is found, its visitor is replaced with the custom MethodVisitor before the events reach the writer.
        ClassVisitor cv = new ASMHelper.MyClassVisitor(ASM5, cw);
        cr.accept(cv, ClassReader.SKIP_FRAMES);
        // 4. Convert the writer's data to a byte array and write it to the class file.
        // IDEA compiles into the corresponding directory under target/classes/... The Match.class
        // produced by ASM replaces the one compiled by the IDE.
        Files.write(Paths.get("C:\\Users\\i\\IdeaProjects\\groovyInJdk11\\target\\classes\\forJava\\inDyT\\Match.class"), cw.toByteArray());
    }
}
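The absolute path in the last statement only exists on the author's machine. One possible way to avoid hard-coding it, assuming the classes are loaded from a plain output directory (not from a jar), is to derive the class file location from the class itself. ClassFileLocator and its methods are hypothetical helpers and not part of the original program.

```java
import java.net.URISyntaxException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.CodeSource;

class ClassFileLocator {
    // Turns a binary class name such as "forJava.inDyT.Match" into "forJava/inDyT/Match.class".
    static String relativePath(String binaryName) {
        return binaryName.replace('.', '/') + ".class";
    }

    // Resolves the class file against the directory the class was loaded from.
    // Assumption: the code source is a directory such as target/classes.
    static Path classFileOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null) {
            throw new IllegalStateException("no code source available for " + type.getName());
        }
        try {
            Path root = Paths.get(source.getLocation().toURI());
            return root.resolve(relativePath(type.getName()));
        } catch (URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(classFileOf(ClassFileLocator.class));
    }
}
```

With such a helper, the write call could take classFileOf(Match.class) instead of the literal Windows path.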

After the project is compiled, first run the main method of ASMHelper to overwrite Match.class. If inspecting the resulting class file with the javap -v command shows the following excerpt, the ASM step has completed successfully:

public static void startRace(java.lang.Object);
  descriptor: (Ljava/lang/Object;)V
  flags: ACC_PUBLIC, ACC_STATIC
  Code:
    stack=1, locals=1, args_size=1
       0: aload_0
       1: invokedynamic #26,  0   // InvokeDynamic #0:race:(Ljava/lang/Object;)V
       6: return
.....
BootstrapMethods:
  0: #23 invokestatic forJava/inDyT/Match.bootstrap:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
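The two descriptors in this javap output can be cross-checked against the java.lang.invoke API: a MethodType renders itself in exactly this descriptor syntax. The short sketch below uses a hypothetical class name, DescriptorSketch, and rebuilds both strings.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class DescriptorSketch {
    // Type of the invokedynamic call site: takes the receiver as Object, returns void.
    static String callSiteDescriptor() {
        return MethodType.methodType(void.class, Object.class).toMethodDescriptorString();
    }

    // Type of the bootstrap method: (Lookup, String, MethodType) -> CallSite.
    static String bootstrapDescriptor() {
        return MethodType
                .methodType(CallSite.class, MethodHandles.Lookup.class, String.class, MethodType.class)
                .toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(callSiteDescriptor());
        System.out.println(bootstrapDescriptor());
    }
}
```

The first string matches the descriptor of the invokedynamic instruction, the second the entry in the BootstrapMethods table.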

Instructions 0 and 1 in the Code attribute of startRace were generated by ASM, and the BootstrapMethods table at the end of the class file points to the BSM we implemented ourselves. Now run RunApp, and the console will display:

horse is running...

Clearly, obj.race() is never called explicitly anywhere at the source level; the call is made by the invokedynamic instruction together with the BSM.