Background

Let’s look at code from a class like this:

  public static void main(String[] args) {
    OverWriteCode object = new OverWriteCode();
    object.invoke(null, 1);
    object.invoke(null, 1, 2);
    // The first invoke method can only be called by bypassing the varargs syntactic sugar and passing the array explicitly
    object.invoke(null, new Object[]{1});
  }

  public void invoke(Object obj, Object... args) {
    System.out.println("print invoke1");
  }

  public void invoke(String s, Object obj, Object... args) {
    System.out.println("print invoke2");
  }

The code above defines two overloaded methods with the same name: the first accepts an Object followed by a variable-length parameter declared as Object...; the second accepts a String, an Object, and a variable-length parameter declared as Object....

Executing the above code results in the following:

print invoke2
print invoke2
print invoke1

In general, we discourage overloading variable-length parameter methods because the Java compiler may not be able to decide which target method to call.

But as we see in the results, the Java compiler recognizes the second method directly. Why? With that in mind, let’s look at how the Java virtual machine recognizes target methods.

Before that, let’s warm up. If you have learned the basics of Java, you have heard of overloading and overriding. Both are used widely, and almost any source code you read contains them.

Overloading and overriding

overloading

In a Java program, a class that declares more than one method with the same name and the same parameter types does not compile. If methods in the same class share a name but have different parameter lists, the relationship between them is called overloading.

Overloaded methods are identified at compile time. For each method call, the Java compiler selects overloaded methods based on the declared type of the argument passed in (note the distinction from the actual type).

The selection process is divided into three stages:

  1. Select an overloaded method without allowing auto-boxing, auto-unboxing, or variable-length parameters;
  2. If no suitable method is found in stage 1, select an overloaded method with auto-boxing and auto-unboxing allowed, but without variable-length parameters;
  3. If no suitable method is found in stage 2, select an overloaded method with auto-boxing, auto-unboxing, and variable-length parameters all allowed.
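The three stages can be observed with a minimal sketch. The class and method names (OverloadPhases, f, g, h) are hypothetical and not part of the original test code; each method returns a label instead of printing, so the chosen overload is easy to check.

```java
class OverloadPhases {
    // f: all three kinds of overload exist
    static String f(long x)    { return "long"; }
    static String f(Integer x) { return "Integer"; }
    static String f(int... x)  { return "int..."; }

    // g: no overload is applicable without boxing
    static String g(Integer x) { return "Integer"; }
    static String g(int... x)  { return "int..."; }

    // h: only the variable-length overload exists
    static String h(int... x)  { return "int..."; }

    public static void main(String[] args) {
        System.out.println(f(1)); // stage 1: widening int to long needs no boxing -> long
        System.out.println(g(1)); // stage 2: boxing int to Integer -> Integer
        System.out.println(h(1)); // stage 3: 1 is wrapped in an int[] -> int...
    }
}
```

Note that f(1) picks the long overload even though an Integer overload looks like a closer match: boxing is not even considered while stage 1 still finds a candidate.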

If the Java compiler finds more than one suitable method for the same phase, it will choose the one that is most appropriate, and one of the keys to determining the appropriateness is the inheritance relationship of formal parameter types.

In our initial test case, the null in invoke(null, 1) matches both the formal parameter declared as Object in the first method and the formal parameter declared as String in the second method. Since String is a subclass of Object, the Java compiler considers the second method more appropriate.
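The role of the inheritance relationship can be isolated in a smaller sketch. The class NullArgument and its describe methods are made up for illustration; only the parameter types matter.

```java
class NullArgument {
    static String describe(Object o)        { return "Object"; }
    static String describe(CharSequence cs) { return "CharSequence"; }
    static String describe(String s)        { return "String"; }

    public static void main(String[] args) {
        // null fits all three parameter types; String is the most specific of them
        System.out.println(describe(null));                // String
        // A cast changes the declared type of the argument, and with it the choice
        System.out.println(describe((CharSequence) null)); // CharSequence
        System.out.println(describe((Object) null));       // Object
    }
}
```

If an overload taking a type unrelated to String were added (for example Integer), describe(null) would no longer compile, because neither String nor Integer is a subtype of the other.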

Primitive types were not considered above. The following case study shows how the javac compiler chooses a “more appropriate” method when the parameter types of the overloads include both primitive and wrapper types.

import java.io.Serializable;

public class OverloadTest {

  public static void sayHello(Object arg) {
    System.out.println("hello Object");
  }

  public static void sayHello(int arg) {
    System.out.println("hello int");
  }

  public static void sayHello(long arg) {
    System.out.println("hello long");
  }

  public static void sayHello(float arg) {
    System.out.println("hello float");
  }

  public static void sayHello(double arg) {
    System.out.println("hello double");
  }

  public static void sayHello(Character arg) {
    System.out.println("hello Character");
  }

  public static void sayHello(char arg) {
    System.out.println("hello char");
  }

  public static void sayHello(char... arg) {
    System.out.println("hello char ...");
  }

  public static void sayHello(Serializable arg) {
    System.out.println("hello Serializable");
  }

  public static void main(String[] args) {
    sayHello('a');
  }
}

The output of the above code is:

hello char

If sayHello(char arg) is commented out, the output becomes:

hello int

An automatic type conversion occurs. Besides being a character, ‘a’ also represents the number 97 (the Unicode value of the character ‘a’ is decimal 97), so the overload taking int is also applicable. Comment out the sayHello(int arg) method as well, and the output becomes:

hello long

Two automatic type conversions occur: ‘a’ is converted to the integer 97 and then to the long 97L, which matches the overload taking long. Automatic conversion only goes upward, in this case in the order char > int > long > float > double. It never matches overloads taking byte or short, because converting char to byte or short is not safe. Continue by commenting out the three methods taking long, float, and double, and the output becomes:

hello Character

Auto-boxing occurs: ‘a’ is boxed into its wrapper type java.lang.Character, so the overload taking Character is matched. Comment out that method too, and the output becomes:

hello Serializable

This result may look surprising. Why is Serializable printed? Because java.lang.Character implements java.io.Serializable. After auto-boxing, no overload taking the wrapper class is found, but one taking an interface implemented by the wrapper class is, so another automatic conversion occurs. A char can be converted to int, but a Character is never converted to Integer; it can only be safely converted to an interface it implements or to a parent class.

java.lang.Character also implements another interface, java.lang.Comparable<Character>. If we add an overload taking Comparable<Character> to the test code, it has the same priority as the Serializable overload. The compiler cannot decide which type to convert to, reports the call as ambiguous (“Type Ambiguous”), and refuses to compile. The call must then state the static type of the literal explicitly, for example sayHello((Serializable) ‘a’), in order to compile.

Because javac rejects this ambiguous call, the only way around it without a cast is to bypass the compiler and generate the call with a bytecode tool, in which case the call site itself has to name one of the two method descriptors.
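The ambiguity and its fix can be reproduced in isolation. This is a sketch with a hypothetical class name (AmbiguityDemo); the two overloads mirror the Serializable and Comparable<Character> overloads discussed above and return labels instead of printing.

```java
import java.io.Serializable;

class AmbiguityDemo {
    static String sayHello(Serializable arg)          { return "Serializable"; }
    static String sayHello(Comparable<Character> arg) { return "Comparable"; }

    public static void main(String[] args) {
        // sayHello('a');  // rejected by javac: neither overload is more specific
        // A cast fixes the static type of the argument, so only one overload applies
        System.out.println(sayHello((Serializable) 'a'));          // Serializable
        System.out.println(sayHello((Comparable<Character>) 'a')); // Comparable
    }
}
```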

If you comment out the sayHello(Serializable arg) method, the output will be:

hello Object

Here char is boxed to Character and then converted to a parent type. If no more specific parent type has a matching overload, the search ends at Object, because Object is the parent of all classes. This rule applies even when a method call passes null as the argument.

One thing to note: Character in our test code only implements these two interfaces. If the wrapper type also extended a class other than Object, that class and the interfaces would have the same priority.

Comment out sayHello(Object arg) as well, and the output becomes:

hello char ...

As you can see, the variable-length parameter has the lowest overload priority; here the character ‘a’ is treated as an element of a char[] array. Note that conversions that work for a single parameter do not apply to the array as a whole: a char[] argument, for example, is never converted to an int[] parameter. In addition, the parameter types (char... arg) and (Character... arg) have the same priority for the compiler, so a call that matches both cannot be resolved.

If the argument types of overloaded methods include primitive data types, wrapper types, interfaces, parent classes, and so on, they take precedence in the following order:

  1. First match the exact primitive type;
  2. If no overload takes the exact primitive type, widen the primitive and match again. Widening follows the order byte > short > int > long > float > double, and char widens to int;
  3. If no primitive type matches, auto-box the argument and match the corresponding wrapper type;
  4. If the wrapper type is not found, match an interface implemented by the wrapper type or its immediate parent class, which have the same priority;
  5. If that still fails, match higher parent classes until Object is reached;
  6. If all else fails, match the variable-length parameter.
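The order above can be checked pairwise in one program. This is only a sketch: the class PriorityDemo and the method names a to d are hypothetical, and each group of overloads pits two or three neighboring rules against each other for the argument 'a'.

```java
import java.io.Serializable;

class PriorityDemo {
    // widening beats boxing; short is never a candidate for a char argument
    static String a(short x)     { return "short"; }
    static String a(long x)      { return "long"; }
    static String a(Character x) { return "Character"; }

    // the wrapper type beats an interface it implements
    static String b(Character x)    { return "Character"; }
    static String b(Serializable x) { return "Serializable"; }

    // an implemented interface beats Object
    static String c(Serializable x) { return "Serializable"; }
    static String c(Object x)       { return "Object"; }

    // Object beats the variable-length parameter
    static String d(Object x)  { return "Object"; }
    static String d(char... x) { return "char..."; }

    public static void main(String[] args) {
        System.out.println(a('a')); // long
        System.out.println(b('a')); // Character
        System.out.println(c('a')); // Serializable
        System.out.println(d('a')); // Object
    }
}
```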

Bypassing the compiler with bytecode tools

By the definition of overloading, the javac compiler does not allow methods with the same name and parameter types in the same class. This limitation can be circumvented with bytecode tools: after compilation, we can add to the class file a method with the same name and parameter types but a different return type. When a class containing several methods with the same name, the same parameter types, and different return types appears on the user’s classpath, how does the Java compiler decide which one to call? The current version of the Java compiler simply selects the first method whose name and parameter types match. It then uses the return type of the selected method to decide whether the call compiles and whether a value conversion is needed.

Let’s take a look at an example that uses Javassist to dynamically process bytecode.

There are a number of technologies in the Java ecosystem that can handle bytecode dynamically. Two of the most popular are ASM and Javassist.

  • ASM: Operates directly on bytecode instructions and executes efficiently. However, it exposes JVM-level operations and instructions, so users must know the class file format and the bytecode instruction set, which sets a high bar.

  • Javassist: Provides a higher-level API that executes less efficiently but requires no knowledge of bytecode instructions. It is simple and quick to use and asks little of the user.

For ease of use, the Javassist tool is chosen.

Start by creating a User class.

package com.msdn.java.hotspot.byteCode;

public class User {

  public String study() {
    System.out.println("study day by day");
    return "i love studying";
  }
}

Then create a test class

package com.msdn.java.hotspot.byteCode;

import javassist.ClassPool;
import javassist.CtClass;
import javassist.CtMethod;
import javassist.Loader;
import javassist.Translator;

public class OverWriteCode {


  public static void main(String[] args) {
    updateUserClass();
  }

  public static void updateUserClass() {
    try {
      // Get the ClassPool
      ClassPool pool = ClassPool.getDefault();
      // Get User class
      CtClass ctClass = pool.get("com.msdn.java.hotspot.byteCode.User");
      CtMethod cm = ctClass.getDeclaredMethod("study", null);
      cm.setBody("{" + "System.out.println(\" hello: \");" + ""
          + "return \"123\"; }");

      ctClass.addMethod(CtMethod.make("public void study() {\n"
          + " System.out.println(\"study\"); \n"
          + "}", ctClass));
      // Write the modified class out as a .class file
      ctClass.writeFile("../path/");

      Translator translator = new Translator() {
        @Override
        public void start(ClassPool classPool) {
          System.out.println("start");
        }

        @Override
        public void onLoad(ClassPool classPool, String paramString) {
          System.out.println("onLoad:" + paramString); //com.msdn.java.hotspot.byteCode.User
          new User().study(); // Call the method of the original class
        }
      };
      Loader classLoader = new Loader(pool); // Javassist provides a ClassLoader
      classLoader.addTranslator(pool, translator); // Listen for the life cycle of the ClassLoader

      Class uClass = classLoader.loadClass("com.msdn.java.hotspot.byteCode.User");
      Object instance = uClass.newInstance();
      uClass.getDeclaredMethod("study").invoke(instance);
    } catch (Exception ex) {
      ex.printStackTrace();
    }
  }
}

Executing the above code results in the following:

start
onLoad:com.msdn.java.hotspot.byteCode.User
study day by day
 hello: 

Let’s take a look at the generated User.class file, which looks like this:

You can verify the statement above against this output and the contents of the User.class file.

There are two ways to modify a class: modify a class that has not been loaded yet, or modify a class that has already been loaded.

1. Modify a class that has not been loaded. Brief code:

// Get the ClassPool
ClassPool pool = ClassPool.getDefault();
CtClass ctClass = pool.get("com.msdn.java.hotspot.byteCode.User");
CtMethod cm = ctClass.getDeclaredMethod("study", null);
cm.setBody("{" + "System.out.println(\" hello: \");" + ""
           + "return \"123\"; }");
// Write the modified class out as a .class file
ctClass.writeFile("../path/");
// Convert to bytecode and load the class with the current thread's context class loader
ctClass.toClass();
new User().study();

2. Modify a class that has already been loaded. The same class cannot be loaded twice by the same ClassLoader, so take care when exporting the CtClass. Calling ctClass.toClass() loads the class, so the solution is to specify a ClassLoader that has not yet loaded it.

The implementation code is shown in the OverWriteCode file.

In addition to methods in the same class, overloading can also apply to methods inherited from that class. If a subclass defines a method with the same name as a nonprivate method of the parent class, and the two methods have different parameter types, then both methods constitute overloading in a subclass.

public class Person {

  public void eat(String msg) {
    System.out.println("eat " + msg);
  }
}

public class Man extends Person {

  public void eat(String msg, int num) {
    System.out.println("eat " + msg + "Cost" + num);
  }
}

Summary of knowledge about overloading

  • Method names must be the same and parameter lists must differ (in the number, types, or order of parameters).
  • Methods can have the same or different return types.
  • Overloading occurs within a class or between a class and its subclass.
  • Overloading implements compile-time polymorphism.

overriding

If a subclass defines a method with the same name as a non-private method of the parent class, and the two methods have the same parameter types, the relationship between the two methods is called overriding.

Note, however, that if both methods are static, the method in the subclass hides the method in the parent class: static methods cannot be overridden, only hidden. The reason is that overriding relies on dynamic binding at run time, whereas static methods are bound statically at compile time. If neither method is static and neither is private, the subclass’s method overrides the method in the parent class.
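The difference between hiding and overriding shows up when both kinds of method are called through a reference of the parent type. A minimal sketch, with hypothetical names (HidingDemo, kind, name):

```java
class HidingDemo {
    static class Parent {
        static String kind() { return "Parent.kind"; }
        String name()        { return "Parent.name"; }
    }

    static class Child extends Parent {
        static String kind() { return "Child.kind"; }     // hides Parent.kind
        @Override String name() { return "Child.name"; }  // overrides Parent.name
    }

    static String viaParentRef() {
        Parent p = new Child();
        // The static call is bound to the declared type Parent at compile time;
        // the instance call is dispatched on the actual type Child at run time.
        return p.kind() + " / " + p.name();
    }

    public static void main(String[] args) {
        System.out.println(viaParentRef()); // Parent.kind / Child.name
    }
}
```

Calling a static method through an instance compiles (with a warning) and is done here only to make the contrast visible; normal code would write Parent.kind().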

The following code looks like this:

public class DynamicDispatch {

  static abstract class Human {

    protected abstract void sayHello();
  }

  static class Man extends Human {

    @Override
    protected void sayHello() {
      System.out.println("man say hello");
    }
  }

  static class Woman extends Human {

    @Override
    protected void sayHello() {
      System.out.println("woman say hello");
    }
  }

  public static void main(String[] args) {
    Human man = new Man();
    Human woman = new Woman();
    man.sayHello();
    woman.sayHello();
    man = new Woman();
    man.sayHello();
  }
}

The output is:

man say hello
woman say hello
woman say hello

The above results should not surprise you, and this is what we know as polymorphism. Which method to call is not determined at compile time. We can look at the bytecode file as follows:

0: new           #2                  // class com/msdn/java/hotspot/byteCode/method/DynamicDispatch$Man
3: dup
4: invokespecial #3                  // Method com/msdn/java/hotspot/byteCode/method/DynamicDispatch$Man."<init>":()V
7: astore_1
8: new           #4                  // class com/msdn/java/hotspot/byteCode/method/DynamicDispatch$Woman
11: dup
12: invokespecial #5                  // Method com/msdn/java/hotspot/byteCode/method/DynamicDispatch$Woman."<init>":()V
15: astore_2
16: aload_1
17: invokevirtual #6                  // Method com/msdn/java/hotspot/byteCode/method/DynamicDispatch$Human.sayHello:()V
20: aload_2
21: invokevirtual #6                  // Method com/msdn/java/hotspot/byteCode/method/DynamicDispatch$Human.sayHello:()V
24: new           #4                  // class com/msdn/java/hotspot/byteCode/method/DynamicDispatch$Woman
27: dup
28: invokespecial #5                  // Method com/msdn/java/hotspot/byteCode/method/DynamicDispatch$Woman."<init>":()V
31: astore_1
32: aload_1
33: invokevirtual #6                  // Method com/msdn/java/hotspot/byteCode/method/DynamicDispatch$Human.sayHello:()V
36: return

Although the symbolic reference for each sayHello call points to Human, the actual type of the receiver differs at run time, which produces different behavior. The reason is that invokevirtual first determines the actual type of the receiver at run time. It therefore does not simply resolve the symbolic reference in the constant pool to a direct reference once; it selects the method version based on the receiver’s actual type. This process is the essence of method overriding in the Java language.

The root of this polymorphism is the execution logic of invokevirtual for virtual methods (described later). Does polymorphism then apply to fields as well? No: field accesses do not use invokevirtual; they use the getfield instruction. Let’s start with a case study:

public class FieldHasNoPolymorphic {

  static class Father {
    public int money = 1;

    public Father() {
      money = 2;
      showMeTheMoney();
    }

    public void showMeTheMoney() {
      System.out.println("I am Father, i have $" + money);
    }
  }

  static class Son extends Father {
    public int money = 3;

    public Son() {
      money = 4;
      showMeTheMoney();
    }

    public void showMeTheMoney() {
      System.out.println("I am Son, i have $" + money);
    }
  }

  public static void main(String[] args) {
    Father gay = new Son();
    System.out.println("This gay has $" + gay.money);
  }
}

The output is:

I am Son, i have $0
I am Son, i have $4
This gay has $2

The first two lines both print “I am Son”. When a Son is created, the Father constructor runs first, and its call to showMeTheMoney() is a virtual method call, so the version actually executed is Son::showMeTheMoney(). That method reads the money field of the subclass, which is still 0 because it is not initialized until the Son constructor runs. The Son constructor then assigns 4 and prints it. The last statement of main() accesses money through the static type Father, so it prints 2. The bytecode shows this as well:

 getfield      #9                  // Field com/msdn/java/hotspot/byteCode/method/FieldHasNoPolymorphic$Father.money:I

Of course, if you implement the getter method for money and then use the getter method to get money later, then you can access money in the subclass.
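A small sketch of the getter variant, using hypothetical classes modeled on Father and Son (constructors omitted to keep it short): the field read follows the declared type, while the getter call is dispatched on the actual type.

```java
class FieldVsGetter {
    static class Father {
        public int money = 2;
        public int getMoney() { return money; }            // reads Father.money
    }

    static class Son extends Father {
        public int money = 4;                              // hides Father.money
        @Override public int getMoney() { return money; }  // reads Son.money
    }

    public static void main(String[] args) {
        Father f = new Son();
        System.out.println(f.money);         // 2: field chosen by the declared type Father
        System.out.println(f.getMoney());    // 4: method chosen by the actual type Son
        System.out.println(((Son) f).money); // 4: the declared type is now Son
    }
}
```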

Conclusion:

  • The method name and parameter list must be the same; the return type must be the same as, or a subtype of, the original return type.
  • An overriding method cannot be less accessible than the original method (that is, access cannot be narrowed).
  • An overriding method cannot throw broader checked exceptions than the original method.
  • Overriding occurs between subclasses and superclasses.
  • Overriding implements runtime polymorphism.
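The first three rules can be seen together in one override. This is a sketch with made-up names (OverrideRules, Base, Derived, value), not code from the original example.

```java
import java.io.IOException;

class OverrideRules {
    static class Base {
        protected Number value() throws IOException { return 1; }
    }

    static class Derived extends Base {
        // Wider access (public), narrower return type (Integer), no checked exception
        @Override
        public Integer value() { return 2; }
    }

    static Number callThroughBase() {
        Base b = new Derived();
        try {
            return b.value(); // dispatched to Derived.value at run time
        } catch (IOException e) {
            // Required by the declared type Base, even though Derived never throws it
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callThroughBase());         // 2
        System.out.println(new Derived().value() + 1); // 3, no try/catch needed through Derived
    }
}
```

Reversing any of the three changes (for example declaring Derived.value as private, as returning Object, or as throwing Exception) makes javac reject the override.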

Static and dynamic binding

We’ve spent a lot of time on overloading and overriding, and both require methods to be uniquely identifiable. So how does the Java virtual machine identify methods once javac has compiled the code?

The Java virtual machine identifies a method by its class name, method name, and method descriptor. The method descriptor consists of the method’s parameter types and return type. If several methods with the same name and descriptor appear in the same class, the Java virtual machine reports an error during the verification phase of class loading.
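To get a feel for what a descriptor looks like, the standard java.lang.invoke.MethodType class can print one for a given return type and parameter types. The helper name descriptor and the class DescriptorDemo are hypothetical.

```java
import java.lang.invoke.MethodType;

class DescriptorDemo {
    // Builds the JVM descriptor string for a return type and parameter types
    static String descriptor(Class<?> returnType, Class<?>... params) {
        return MethodType.methodType(returnType, params).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(descriptor(int.class, int.class));     // (I)I
        System.out.println(descriptor(void.class, String.class)); // (Ljava/lang/String;)V
        // Same (empty) parameter list, different return types: two distinct descriptors
        System.out.println(descriptor(String.class));             // ()Ljava/lang/String;
        System.out.println(descriptor(void.class));               // ()V
    }
}
```

The last two lines correspond to the two study() methods from the Javassist experiment: the parameter lists are identical, but the descriptors differ because the return type is part of the descriptor.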

In Java source code, two methods in one class with the same name and parameter list cause a compile-time error, even if their return types differ. With Javassist, however, we can modify the bytecode so that both methods exist at the same time, and the JVM executes the code without a problem.

The Java virtual machine also decides whether one method overrides another based on method descriptors. That is, if a subclass defines a method with the same name as a non-private, non-static method of the parent class, the Java virtual machine treats it as an override only if the parameter types and return types of the two methods are the same. (The Java language is more lenient: since Java 5, an overriding method may declare a return type that is a subtype of the parent method’s return type.)

For cases that are overrides in the Java language but not in the Java virtual machine, the compiler implements the override semantics by generating bridge methods.

Take a look at an example:

public class Parent<T> {

  public void sayHello(T value) {
    System.out.println("This is Parent Class, value is " + value);
  }
}

public class Child extends Parent<String> {

  public void sayHello(String value) {
    System.out.println("This is Child class, value is " + value);
  }

  public static void main(String[] args) {
    Child child = new Child();
    Parent<String> object = child;
    object.sayHello("Java");
  }
}

Then run the following command:

javac Child.java Parent.java 
javap -v -c Child 

You can see a method like this:

public void sayHello(java.lang.Object);
  descriptor: (Ljava/lang/Object;)V
  flags: ACC_PUBLIC, ACC_BRIDGE, ACC_SYNTHETIC
  Code:
    stack=2, locals=2, args_size=2
       0: aload_0
       1: aload_1
       2: checkcast     #13                 // class java/lang/String
       5: invokevirtual #14                 // Method sayHello:(Ljava/lang/String;)V
       8: return
    LineNumberTable:
      line 8: 0

Because of type erasure, the type parameter T is replaced with Object, and the compiler generates a bridge method to preserve polymorphism.
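The bridge method can also be observed without javap, through reflection. The sketch below uses nested stand-ins for Parent and Child inside a hypothetical BridgeDemo class and lists every sayHello declared in Child together with its bridge flag.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class BridgeDemo {
    static class Parent<T> {
        public void sayHello(T value) { }
    }

    static class Child extends Parent<String> {
        @Override
        public void sayHello(String value) { }
    }

    // One entry per sayHello declared in Child: "<parameter type> bridge=<flag>"
    static List<String> describe() {
        List<String> out = new ArrayList<>();
        for (Method m : Child.class.getDeclaredMethods()) {
            if (m.getName().equals("sayHello")) {
                out.add(m.getParameterTypes()[0].getSimpleName() + " bridge=" + m.isBridge());
            }
        }
        Collections.sort(out); // reflection order is unspecified, so sort for stable output
        return out;
    }

    public static void main(String[] args) {
        System.out.println(describe()); // [Object bridge=true, String bridge=false]
    }
}
```

Only one sayHello was written in the source of Child, yet the class file declares two: the hand-written sayHello(String) and the compiler-generated sayHello(Object) that forwards to it.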

The Javac compiler and the Java virtual machine handle overloading and overwriting differently.

  • javac reports an error if two methods in a class have the same name and parameter list. The Java virtual machine’s check is looser: an error is reported during verification only if both the method name and the full method descriptor (including the return type) are the same.

  • The Java virtual machine treats a subclass method as an override if it has the same method name and method descriptor as a method of the parent class. For cases that are overrides in the Java language but not in the Java virtual machine (that is, the method descriptors differ), the compiler implements the override semantics by generating bridge methods.

The javac compiler corresponds to the compile phase and the Java virtual machine to the run phase; overloading is resolved at compile time and overriding at run time. Overloading is therefore sometimes called static binding or compile-time polymorphism, and overriding dynamic binding. This is not entirely accurate in the context of the Java virtual machine, because an overloaded method in a class can still be overridden by a subclass. The Java compiler therefore compiles all calls to non-private instance methods as calls that need dynamic binding.

For this reason the compiler uses dynamic binding for methods that are non-static, non-private, and non-final (final methods can be bound statically because they cannot be overridden). In summary, calls to methods that a subclass cannot override (static methods, private methods, final methods) can be bound statically, while calls to methods that a subclass may override, where the run time must determine the receiver’s type before choosing the method, are bound dynamically.

As we’ve looked at bytecode files several times before, it’s worth taking a look at the five call-related instructions in Java bytecode.

  1. invokestatic: used to invoke static methods.
  2. invokespecial: used to invoke private instance methods, constructors, instance methods or constructors of the parent class via the super keyword, and default methods of implemented interfaces.
  3. invokevirtual: used to invoke non-private instance methods.
  4. invokeinterface: used to invoke interface methods.
  5. invokedynamic: used to invoke dynamic methods.

invokestatic and invokespecial are statically bound, while invokevirtual and invokeinterface are dynamically bound. invokedynamic is more complex and will be covered in subsequent chapters.

Virtual method call

We mentioned the term virtual methods earlier, which is briefly introduced here and will be described later when we talk about method inlining.

Only static methods called with invokestatic, and private methods, instance constructors, and superclass methods called with invokespecial, are resolved at compile time, together with final methods (even though they are called with invokevirtual). For these five kinds of methods, the symbolic reference is resolved to a direct reference to the method when the class is loaded. These methods are collectively called “non-virtual methods”; all other methods are called “virtual methods”.

So what about final methods? A final method is still called with invokevirtual. A method declared with the final keyword cannot be overridden, so the Java virtual machine can determine the target without looking at the receiver’s type. If a virtual method call points to a method marked final, the Java virtual machine can statically bind the target method of that call.

Symbolic references in call instructions

During compilation, we do not know the exact memory address of the target method. Therefore, the Java compiler temporarily represents the target method with symbolic references. This symbolic reference includes the name of the class or interface of the target method, as well as the method name and method descriptor of the target method.

Symbolic references are stored in the constant pool of the class file. Depending on whether the target method is an interface method, these references can be divided into interface symbolic references and non-interface symbolic references.

Before the bytecode using symbolic references can be executed, the Java virtual machine needs to parse these symbolic references and replace them with actual references.

For non-interface symbolic references, assuming that the symbolic reference refers to a class C, the Java virtual machine does the following.

  1. Find methods in C that match names and descriptors.
  2. If not, continue searching in C’s parent class until you reach the Object class.
  3. If not, search the interface implemented directly or indirectly by C, and the target method that results from this search must be non-private and non-static. In addition, if the target method is in the indirectly implemented interface, there is no other qualified target method between C and the interface. If there are multiple qualified target methods, one of them is returned arbitrarily.
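Steps 1 and 2 can be mimicked with reflection. This is only a sketch under stated assumptions: resolve is a hypothetical helper, it covers the class and superclass search but not step 3, and reflection matches on name and parameter types only, whereas the JVM matches the full descriptor including the return type.

```java
import java.lang.reflect.Method;

class ResolutionSketch {
    static class Person {
        public static int eat(int num) { return num; }
    }

    static class Man extends Person { } // declares no eat method of its own

    // Steps 1 and 2: look in c, then in each superclass up to Object
    static Method resolve(Class<?> c, String name, Class<?>... params) {
        for (Class<?> k = c; k != null; k = k.getSuperclass()) {
            try {
                return k.getDeclaredMethod(name, params);
            } catch (NoSuchMethodException e) {
                // not declared in k; keep walking up the hierarchy
            }
        }
        return null; // step 3 (superinterfaces) is left out of this sketch
    }

    public static void main(String[] args) {
        Method m = resolve(Man.class, "eat", int.class);
        System.out.println(m.getDeclaringClass().getSimpleName()); // Person
    }
}
```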

We’ve seen before that when a static method of a subclass is called in an inheritance relationship, the static method of the subclass hides (distinguish from overrides) static methods of the same name and descriptor in the parent class.

Take a look at a small case:

public class Person {
  public static int eat(int num) {
    System.out.println("person eat");
    return num;
  }
}

public class Man extends Person {
  public static int eat(int num) {
    System.out.println("man eat");
    return num;
  }

  public static void main(String[] args) {
    Man.eat(12);
  }
}

For example, when we call the eat method, its symbolic reference is “com/msdn/java/hotspot/byteCode/Man.eat:(I)I”, so the search starts in the Man class. The method descriptor of eat is (I)I: the parameter is an int and the return type is also int. The matching method is found in Man, so it is returned directly.

If you delete the eat method from the Man class in the code above, recompile, run, and inspect the bytecode, you can see that the symbolic reference still names Man. The search starts in Man, finds nothing, continues in the parent class Person, and finally locates the eat method there.

For an interface symbol reference, assume that the interface to which the symbol reference refers is I, then the Java virtual machine performs the following steps to search for it.

  1. Find a method in I that matches the name and descriptor.
  2. If not found, search the public instance methods of the Object class (such as the toString method).
  3. If still not found, search the superinterfaces of I. The requirements on the result of this step are the same as those of step 3 for non-interface symbolic references.

For step 3 above, let’s look at an example.

public interface Rent {

  default void rent() {
    System.out.println("parent rent");
  }
}

public interface RentChild extends Rent {

  void say();
}

public class Host implements RentChild {

  @Override
  public void say() {
    System.out.println("say a few words");
  }

  @Override
  public void rent() {
    System.out.println("child rent");
  }
}

public class Tenant {

  public void rent(RentChild rent) {
    rent.rent();
    System.out.println("Cost me a fortune");
    System.out.println(rent.toString());
  }

  public static void main(String[] args) {
    Tenant tenant = new Tenant();
    RentChild rentChild = new Host();
    tenant.rent(rentChild);
  }
}

If the rent() implementation in Host is deleted, the rent() method of the Rent interface is called instead.
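Both situations can be put side by side in one sketch. The nested types below are simplified stand-ins for Rent, RentChild, and Host (the say() method is dropped, and BusyHost is a made-up second implementation); rent() returns a label instead of printing.

```java
class DefaultLookup {
    interface Rent {
        default String rent() { return "parent rent"; }
    }

    interface RentChild extends Rent { }

    // No rent() of its own: the call ends up at the default method in Rent
    static class Host implements RentChild { }

    // Declares rent(): this implementation wins over the default method
    static class BusyHost implements RentChild {
        @Override
        public String rent() { return "child rent"; }
    }

    // The call site only knows the interface RentChild, which itself declares no rent()
    static String call(RentChild r) {
        return r.rent();
    }

    public static void main(String[] args) {
        System.out.println(call(new Host()));     // parent rent
        System.out.println(call(new BusyHost())); // child rent
    }
}
```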

After the above parsing steps, symbolic references are parsed into actual references. For a method call that can be statically bound, the actual reference is a pointer to the method. For method calls that need to be dynamically bound, the actual reference is the index of a method table.

We’ll talk more about what a method table means next.

Method table

As mentioned in the linking part of the class loading mechanism, the preparation phase not only allocates memory for static fields but also constructs the method table associated with the class.

The method table is the key to dynamic binding in the Java virtual machine. The following uses the virtual method table (vtable) used by invokevirtual as an example of how a method table works. The interface method table (itable) used by invokeinterface is a bit more complicated, but the principle is similar.

A method table is essentially an array, with each array element pointing to a non-private instance method of the current class and its ancestors.

These methods may be concrete, executable methods, or abstract methods with no corresponding bytecode. The method table satisfies two characteristics: first, the subclass method table contains all the methods in the parent class method table; Second, a subclass method has the same index value in the method table as the parent method it overrides.

We know that symbolic references in method call instructions are resolved to actual references before execution. For statically bound method calls, the actual reference will point to the specific target method. In the case of dynamically bound method calls, the actual reference is the index value of the method table (not really just the index value).

Let’s look at a simple example:

public abstract class Passenger {

  abstract void say();

  @Override
  public String toString() {
    return "Passenger";
  }
}

public class ForeignerPassenger extends Passenger {

  @Override
  void say() {
    System.out.println("say hello");
  }
}

public class ChinesePassenger extends Passenger {

  @Override
  void say() {
    System.out.println("Say hello");
  }

  void eat() {
    System.out.println("Eat steamed stuffed bun");
  }

  public static void main(String[] args) {
    Passenger passenger = new ChinesePassenger();
    passenger.say();
  }
}

The method tables of the classes in the example above are laid out as follows.

The toString method exists in the method table of every class. If the index of toString in the parent class's table is 0, its index in each subclass's table is also 0.
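The layout can be modelled with a toy table builder. This is a simplified sketch, not how the JVM stores its tables: entries are plain strings of the form Owner.method, the methods inherited from Object other than toString are ignored, and VtableSketch, extend and indexOf are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class VtableSketch {
    // Builds a subclass table: copy the parent's slots, overwrite overridden
    // methods in place (same index), and append methods that are new.
    static List<String> extend(List<String> parent, String owner, String... methods) {
        List<String> table = new ArrayList<>(parent);
        for (String method : methods) {
            int slot = indexOf(table, method);
            String entry = owner + "." + method;
            if (slot >= 0) {
                table.set(slot, entry);
            } else {
                table.add(entry);
            }
        }
        return table;
    }

    // Returns the slot holding the given method name, or -1 if absent.
    static int indexOf(List<String> table, String method) {
        for (int i = 0; i < table.size(); i++) {
            if (table.get(i).endsWith("." + method)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> passenger = extend(new ArrayList<>(), "Passenger", "toString", "say");
        List<String> foreigner = extend(passenger, "ForeignerPassenger", "say");
        List<String> chinese = extend(passenger, "ChinesePassenger", "say", "eat");
        System.out.println(passenger); // [Passenger.toString, Passenger.say]
        System.out.println(foreigner); // [Passenger.toString, ForeignerPassenger.say]
        System.out.println(chinese);   // [Passenger.toString, ChinesePassenger.say, ChinesePassenger.eat]
    }
}
```

In this model say keeps index 1 in every table, which is what lets a call site compiled against Passenger use one fixed index for any subclass.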

In fact, dynamic binding using a method table has only a few more memory dereference operations than static binding:

1. Access data on the stack, which holds a reference to an object instance in the heap;

2. An object in heap memory is divided into three regions: the object header, the instance data, and the alignment padding. The object header consists of two parts: runtime data and a type pointer. Once we have the type pointer, we follow it to the class information in the method area (which stores the class information loaded by the virtual machine, static variables, constants, code compiled by the just-in-time compiler, and other data);

3. Obtain the method table of the class, and finally find the corresponding method from the method table according to the index.
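The three steps above can be sketched as a toy model. Everything here is an assumption made for illustration (the Obj, Klass and Method types, and the fixed slot number); a real JVM performs these steps on raw memory rather than on Java objects.

```java
class DispatchSketch {
    interface Method { String invoke(); }

    // Stand-in for the class information in the method area.
    static final class Klass {
        final Method[] vtable;
        Klass(Method... vtable) { this.vtable = vtable; }
    }

    // Stand-in for a heap object: its header carries the type pointer.
    static final class Obj {
        final Klass typePointer;
        Obj(Klass typePointer) { this.typePointer = typePointer; }
    }

    static final int SAY_SLOT = 1; // index assumed to be fixed when the call site was resolved

    static String invokeVirtual(Obj ref, int slot) {
        Klass klass = ref.typePointer;      // steps 1 and 2: reference -> header -> class information
        Method target = klass.vtable[slot]; // step 3: index into the method table
        return target.invoke();
    }

    public static void main(String[] args) {
        Klass foreigner = new Klass(() -> "Passenger", () -> "say hello");
        Klass chinese = new Klass(() -> "Passenger", () -> "Say hello", () -> "Eat steamed stuffed bun");
        System.out.println(invokeVirtual(new Obj(foreigner), SAY_SLOT)); // say hello
        System.out.println(invokeVirtual(new Obj(chinese), SAY_SLOT));   // Say hello
    }
}
```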

These memory dereference operations are relatively inexpensive compared to creating and initializing Java stack frames. That does not mean virtual method calls have no noticeable impact on performance, however. Just-in-time compilation has two optimizations that perform better than a method table lookup: inline caching and method inlining.

Inline cache

Inline caching is an optimization technique to speed up dynamic binding. It can cache the dynamic type of the caller in a virtual method call, as well as the target method corresponding to that type. During subsequent execution, if a cached type is encountered, the inline cache directly calls the target method for that type. If no cached type is encountered, inline caching degrades to using dynamic binding based on method tables.

Put simply, it works like any other cache. It records the dynamic type of the caller and the target method found for that type, and the next time the virtual method is called it checks the cache first.
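A minimal sketch of this idea for a single call site follows. It is a model, not HotSpot code: the map stands in for the slower method table path, and InlineCacheSketch, register and call are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class InlineCacheSketch {
    // Stand-in for the method table path: dynamic type -> target method.
    private final Map<Class<?>, Supplier<String>> methodTable = new HashMap<>();
    private Class<?> cachedType;           // the one dynamic type this call site remembers
    private Supplier<String> cachedTarget; // the target method for that type
    int hits;
    int misses;

    void register(Class<?> type, Supplier<String> target) {
        methodTable.put(type, target);
    }

    String call(Object caller) {
        Class<?> type = caller.getClass();
        if (type == cachedType) {          // cache hit: call the cached target directly
            hits++;
            return cachedTarget.get();
        }
        misses++;                          // cache miss: fall back to the table lookup
        Supplier<String> target = methodTable.get(type);
        if (cachedType == null) {          // the first type seen fills the cache
            cachedType = type;
            cachedTarget = target;
        }
        return target.get();
    }

    public static void main(String[] args) {
        InlineCacheSketch site = new InlineCacheSketch();
        site.register(String.class, () -> "String target");
        site.register(Integer.class, () -> "Integer target");
        site.call("a"); // miss, fills the cache with String
        site.call("b"); // hit
        site.call(1);   // miss, served by the table lookup
        System.out.println(site.hits + " hits, " + site.misses + " misses"); // 1 hits, 2 misses
    }
}
```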

Three terms come up frequently when discussing optimizations for polymorphism.

  • Monomorphic refers to a situation where there is only one state.
  • Polymorphic refers to a situation with a limited number of states. Bimorphic (two states) is one kind of polymorphic.
  • Megamorphic refers to a situation with many more states. Usually a specific number is used to separate polymorphic from megamorphic: below this value the situation is called polymorphic, otherwise it is called megamorphic.
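The terms can be made concrete by counting the distinct dynamic types seen at one call site. In this sketch the cut-off of 5 is an arbitrary value picked for illustration, not a JVM setting, and CallSiteProfile is a hypothetical class.

```java
import java.util.HashSet;
import java.util.Set;

class CallSiteProfile {
    static final int MEGAMORPHIC_THRESHOLD = 5; // hypothetical cut-off, for illustration only
    private final Set<Class<?>> seenTypes = new HashSet<>();

    void record(Object caller) {
        seenTypes.add(caller.getClass());
    }

    String state() {
        int n = seenTypes.size();
        if (n == 0) return "no calls yet";
        if (n == 1) return "monomorphic";
        if (n == 2) return "bimorphic"; // a special case of polymorphic
        return n < MEGAMORPHIC_THRESHOLD ? "polymorphic" : "megamorphic";
    }

    public static void main(String[] args) {
        CallSiteProfile site = new CallSiteProfile();
        site.record("a");
        System.out.println(site.state()); // monomorphic
        site.record(1);
        System.out.println(site.state()); // bimorphic
        site.record(1.0);
        System.out.println(site.state()); // polymorphic
        site.record('c');
        site.record(1L);
        System.out.println(site.state()); // megamorphic
    }
}
```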

Correspondingly, there are monomorphic inline caches, polymorphic inline caches, and megamorphic inline caches.

A monomorphic inline cache is easy to understand: it caches only one dynamic type and its corresponding target method. A polymorphic inline cache, on the other hand, caches multiple dynamic types (subclasses) and their target methods. With many subclasses, every polymorphic inline cache would have to store data for several types, while in practice the monomorphic case covers far more call sites. To save memory, the Java virtual machine uses only monomorphic inline caches.

In a polymorphic inline cache, the more frequently seen dynamic types (that is, the hot data) are generally placed first, since the cached types are checked one after another.

Once monomorphic inline caching is chosen, one question remains. A cache takes up space, so it cannot record every new type it encounters. How does the JVM handle a method call that misses the cache?

For the content in the inline cache, we have two options.

1. Replace the record in the monomorphic inline cache. This approach is similar to a data cache in the CPU and relies on data locality: for a period of time after the replacement, the dynamic type of the caller should stay the same, so that the inline cache is used effectively.

The problem with this choice shows up in the worst case, where method calls alternate between two different caller types and every call replaces the inline cache. There is then only the overhead of writing the cache and none of the performance gain from reading it. This reminds me of the change buffer, the write cache in MySQL, which suits workloads with many writes and few reads. If you query the data immediately after writing it, the buffer not only fails to deliver its value but also adds maintenance cost.

2. Degrade to the megamorphic state. This is the approach the JVM takes. In other words, it gives up the optimization opportunity and simply goes through the method table on every call, so the extra overhead of writing the cache never arises.
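The trade-off between the two options can be seen by counting cache hits and cache writes for a sequence of caller types. This is a toy simulation under the assumptions of the text (one cached type per call site); MissPolicySketch, Policy and simulate are hypothetical names.

```java
class MissPolicySketch {
    enum Policy { REPLACE, DEGRADE }

    // Returns {hits, cacheWrites} for the given sequence of caller types at one call site.
    static int[] simulate(Policy policy, Class<?>... callerTypes) {
        Class<?> cached = null;
        boolean megamorphic = false;
        int hits = 0;
        int writes = 0;
        for (Class<?> type : callerTypes) {
            if (megamorphic) {
                continue;           // always goes through the method table, cache untouched
            }
            if (type == cached) {
                hits++;
                continue;
            }
            if (cached == null || policy == Policy.REPLACE) {
                cached = type;      // fill or overwrite the cache
                writes++;
            } else {
                megamorphic = true; // option 2: stop caching at this call site
            }
        }
        return new int[] {hits, writes};
    }

    public static void main(String[] args) {
        Class<?>[] alternating = {
            String.class, Integer.class, String.class, Integer.class, String.class, Integer.class
        };
        int[] replace = simulate(Policy.REPLACE, alternating);
        int[] degrade = simulate(Policy.DEGRADE, alternating);
        System.out.println("replace: hits=" + replace[0] + ", writes=" + replace[1]); // replace: hits=0, writes=6
        System.out.println("degrade: hits=" + degrade[0] + ", writes=" + degrade[1]); // degrade: hits=0, writes=1
    }
}
```

With alternating types, replacing pays for a write on every call and never gets a hit, while degrading pays for one write and then leaves the cache alone.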

Although inline caching has the word "inline" in its name, it does not inline the target method. To be clear, any method call has a fixed overhead unless it is inlined. This overhead comes from saving the program's current execution position in the calling method, and from creating, pushing, and popping the stack frame for the method being called.

reference

Introduction to Javassist bytecode

Using Javassist

Implementing dynamic aspects with Javassist and Javaagent

Analyzing the nature of method calls from a bytecode perspective

“Symbolic Reference to Direct Reference”

In-depth Dismantling of Java Virtual Machines by Zheng Yudi, Geek Time

In-depth Understanding of the Java Virtual Machine