Gradle plugins + ASM in practice. This topic is one of the questions I often ask when interviewing candidates, because many features, such as plug-in frameworks and hot fixes, performance optimization, and coverage statistics, are hard to implement without it. It is true that small companies rarely use it. Whether to learn it depends on your own situation; the real question is not whether you will use it, but whether you are willing to be the first to try. We will discuss it from the following three aspects:

1. Class file bytecode structure

1.1 Class bytecode example

Let's start with a very simple HelloWorld.java:

public class HelloWorld {
    public HelloWorld() {
    }

    public static void main(String[] args) {
        System.out.println("Hello World!");
    }
}

Open the generated HelloWorld.class file in a hex editor and it looks like this:

cafe babe 0000 0033 0022 0a00 0600 1409
0015 0016 0800 170a 0018 0019 0700 1a07
001b 0100 063c 696e 6974 3e01 0003 2829
5601 0004 436f 6465 0100 0f4c 696e 654e
756d 6265 7254 6162 6c65 0100 124c 6f63
616c 5661 7269 6162 6c65 5461 626c 6501
0004 7468 6973 0100 264c 636f 6d2f 6578
616d 706c 652f 6d79 6170 706c 6963 6174
696f 6e2f 4865 6c6c 6f57 6f72 6c64 3b01
0004 6d61 696e 0100 1628 5b4c 6a61 7661
2f6c 616e 672f 5374 7269 6e67 3b29 5601
0004 6172 6773 0100 135b 4c6a 6176 612f
6c61 6e67 2f53 7472 696e 673b 0100 0a53
6f75 7263 6546 696c 6501 000f 4865 6c6c
6f57 6f72 6c64 2e6a 6176 610c 0007 0008
0700 1c0c 001d 001e 0100 0c48 656c 6c6f
2057 6f72 6c64 2107 001f 0c00 2000 2101
0024 636f 6d2f 6578 616d 706c 652f 6d79
6170 706c 6963 6174 696f 6e2f 4865 6c6c
6f57 6f72 6c64 0100 106a 6176 612f 6c61
6e67 2f4f 626a 6563 7401 0010 6a61 7661
2f6c 616e 672f 5379 7374 656d 0100 036f
7574 0100 154c 6a61 7661 2f69 6f2f 5072
696e 7453 7472 6561 6d3b 0100 136a 6176
612f 696f 2f50 7269 6e74 5374 7265 616d
0100 0770 7269 6e74 6c6e 0100 1528 4c6a
6176 612f 6c61 6e67 2f53 7472 696e 673b
2956 0021 0005 0006 0000 0000 0002 0001
0007 0008 0001 0009 0000 002f 0001 0001
0000 0005 2ab7 0001 b100 0000 0200 0a00
0000 0600 0100 0000 0a00 0b00 0000 0c00
0100 0000 0500 0c00 0d00 0000 0900 0e00
0f00 0100 0900 0000 3700 0200 0100 0000
09b2 0002 1203 b600 04b1 0000 0002 000a
0000 000a 0002 0000 000c 0008 000d 000b
0000 000c 0001 0000 0009 0010 0011 0000
0001 0012 0000 0002 0013 

Boy, how is anyone supposed to read that? Running javap -verbose HelloWorld.class gives a much more readable view:

  Last modified 2021-1-7; size 586 bytes
  MD5 checksum bf91e508b76a0dc7d4c0250b0e55f75b
  Compiled from "HelloWorld.java"
public class com.example.myapplication.HelloWorld
  minor version: 0
  major version: 51
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #6.#20         // java/lang/Object."<init>":()V
   #2 = Fieldref           #21.#22        // java/lang/System.out:Ljava/io/PrintStream;
   #3 = String             #23            // Hello World!
   #4 = Methodref          #24.#25        // java/io/PrintStream.println:(Ljava/lang/String;)V
   #5 = Class              #26            // com/example/myapplication/HelloWorld
   #6 = Class              #27            // java/lang/Object
   #7 = Utf8               <init>
   #8 = Utf8               ()V
   #9 = Utf8               Code
  #10 = Utf8               LineNumberTable
  #11 = Utf8               LocalVariableTable
  #12 = Utf8               this
  #13 = Utf8               Lcom/example/myapplication/HelloWorld;
  #14 = Utf8               main
  #15 = Utf8               ([Ljava/lang/String;)V
  #16 = Utf8               args
  #17 = Utf8               [Ljava/lang/String;
  #18 = Utf8               SourceFile
  #19 = Utf8               HelloWorld.java
  #20 = NameAndType        #7:#8          // "<init>":()V
  #21 = Class              #28            // java/lang/System
  #22 = NameAndType        #29:#30        // out:Ljava/io/PrintStream;
  #23 = Utf8               Hello World!
  #24 = Class              #31            // java/io/PrintStream
  #25 = NameAndType        #32:#33        // println:(Ljava/lang/String;)V
  #26 = Utf8               com/example/myapplication/HelloWorld
  #27 = Utf8               java/lang/Object
  #28 = Utf8               java/lang/System
  #29 = Utf8               out
  #30 = Utf8               Ljava/io/PrintStream;
  #31 = Utf8               java/io/PrintStream
  #32 = Utf8               println
  #33 = Utf8               (Ljava/lang/String;)V
{
  public com.example.myapplication.HelloWorld();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 10: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       5     0  this   Lcom/example/myapplication/HelloWorld;

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=2, locals=1, args_size=1
         0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
         3: ldc           #3                  // String Hello World!
         5: invokevirtual #4                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
         8: return
      LineNumberTable:
        line 12: 0
        line 13: 8
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       9     0  args   [Ljava/lang/String;
}

1.2 Class file structure

A .class file is a binary stream organized in units of 8-bit bytes. The data items are packed tightly, in strict order, with no delimiters between them, so almost everything stored in the file is data the program needs. For the specific contents, you can refer to the following table.

Reading the table alongside the javap -verbose output above makes it easier to follow.
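To make the layout a little more concrete, here is a minimal sketch that decodes the first few fields of a class file: the magic number, the minor and major version, and the constant pool count. The class name ClassHeaderDemo and the method readHeader are made up for this illustration; the ten input bytes are the start of the hex dump shown earlier.

```java
import java.nio.ByteBuffer;

public class ClassHeaderDemo {
    /** Returns {minor_version, major_version, constant_pool_count}. */
    static int[] readHeader(byte[] classBytes) {
        // Class files are big-endian, which is also ByteBuffer's default order.
        ByteBuffer in = ByteBuffer.wrap(classBytes);
        if (in.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        int minor = in.getShort() & 0xFFFF;   // u2 minor_version
        int major = in.getShort() & 0xFFFF;   // u2 major_version
        int cpCount = in.getShort() & 0xFFFF; // u2 constant_pool_count
        return new int[] {minor, major, cpCount};
    }

    public static void main(String[] args) {
        // First ten bytes of the dump: cafe babe 0000 0033 0022
        byte[] head = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            0x00, 0x00, 0x00, 0x33, 0x00, 0x22
        };
        int[] h = readHeader(head);
        // prints: minor=0 major=51 constant_pool_count=34
        System.out.println("minor=" + h[0] + " major=" + h[1]
                + " constant_pool_count=" + h[2]);
    }
}
```

The constant pool count is 34 while javap lists entries #1 to #33, because the count is one larger than the number of entries (index 0 is reserved).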

2. Loading mechanism of JVM classes

2.1 Loading time of classes

The Java Virtual Machine Specification does not say when a class must be loaded, but it does say when a class must be initialized. There are five cases in which a class must be initialized immediately:

  • When the virtual machine encounters one of four bytecode instructions (new, getstatic, putstatic, or invokestatic) and the class has not been initialized yet, initialization must be triggered. The most common Java code that produces these instructions: instantiating an object with the new keyword, reading or setting a static field of a class (except static fields that are final and whose values were placed in the constant pool at compile time), and calling a static method of a class.
  • When a reflective call is made on a class through the methods of the java.lang.reflect package.
  • When a class is initialized and its parent class has not been initialized yet, the parent's initialization must be triggered first.
  • When the virtual machine starts, the user specifies a main class to execute (the class containing the main() method), and the virtual machine initializes this main class first.
  • When using the dynamic language support of JDK 1.7, if the final resolution result of a java.lang.invoke.MethodHandle instance is a REF_getStatic, REF_putStatic, or REF_invokeStatic method handle and the class that the handle refers to has not been initialized, it must be initialized first.
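A small experiment makes the first and third cases visible. The sketch below uses made-up classes (InitDemo, Parent, Child) and records which static blocks have run: reading a compile-time constant triggers nothing, while reading an ordinary static field initializes the parent first and then the child.

```java
import java.util.ArrayList;
import java.util.List;

public class InitDemo {
    // Records which static initializers have run, in order.
    static final List<String> EVENTS = new ArrayList<>();
    private static List<String> report;

    static class Parent {
        static { EVENTS.add("Parent"); }
    }

    static class Child extends Parent {
        static final String NAME = "child"; // compile-time constant, inlined by javac
        static int counter = 1;             // ordinary static field, read via getstatic
        static { EVENTS.add("Child"); }
    }

    /** Runs the experiment once and caches the result (a class initializes only once). */
    static synchronized List<String> observe() {
        if (report == null) {
            report = new ArrayList<>();
            String name = Child.NAME;    // no initialization: the constant is inlined
            report.add("after constant read: " + EVENTS);
            int counter = Child.counter; // getstatic: initializes Parent, then Child
            report.add("after static read: " + EVENTS);
        }
        return report;
    }

    public static void main(String[] args) {
        // prints:
        // after constant read: []
        // after static read: [Parent, Child]
        for (String line : observe()) {
            System.out.println(line);
        }
    }
}
```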

2.2 Class loading process

The class loading process is roughly divided into five steps: loading, verification, preparation, resolution, and initialization. Early on I made a serious mistake: I memorized these steps by rote for interviews, soon forgot them, and was often at a loss when similar problems came up in development. So I hope you will understand them thoroughly, once and for all:

2.2.1 Loading
  • Obtain the binary byte stream that defines a class by its fully qualified name.
  • Convert the static storage structure represented by the byte stream into run-time data structures in the method area.
  • Generate a java.lang.Class object in memory that represents the class and serves as the access point to the class's data in the method area.
  • The specification does not say where the binary byte stream must come from. We can read it from a .class file, from an APK, ZIP, or JAR archive, from a database, from the network, or even generate it ourselves at run time.
  • After the java.lang.Class object representing the class is instantiated in memory, the specification does not require it to live on the Java heap. Some virtual machines, such as HotSpot (at least in older versions), place the Class object in the method area. A given ClassLoader instantiates only one Class object per class.
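As an illustration of "generate it ourselves at run time", the following sketch writes the bytes of an empty class by hand and hands them to ClassLoader.defineClass. Everything here is an assumption made for the demo: the names GeneratedClassDemo, BytesLoader, and Generated are hypothetical, and the generated class has no fields or methods at all, so it can be loaded but not instantiated.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class GeneratedClassDemo {
    /** Builds the bytes of an empty public class in the default package. */
    static byte[] emptyClassBytes(String name) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(0xCAFEBABE);                 // magic
            out.writeShort(0);                        // minor_version
            out.writeShort(51);                       // major_version, as in the dump above
            out.writeShort(5);                        // constant_pool_count = entries + 1
            out.writeByte(7); out.writeShort(2);      // #1 Class -> #2
            out.writeByte(1); out.writeUTF(name);     // #2 Utf8: this class name
            out.writeByte(7); out.writeShort(4);      // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object"); // #4 Utf8
            out.writeShort(0x0021);                   // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                        // this_class  = #1
            out.writeShort(3);                        // super_class = #3
            out.writeShort(0);                        // interfaces_count
            out.writeShort(0);                        // fields_count
            out.writeShort(0);                        // methods_count
            out.writeShort(0);                        // attributes_count
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** A loader whose only job is to turn a byte array into a Class object. */
    static final class BytesLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    static Class<?> load(String name) {
        return new BytesLoader().define(name, emptyClassBytes(name));
    }

    public static void main(String[] args) {
        Class<?> c = load("Generated");
        System.out.println(c.getName() + " extends " + c.getSuperclass().getName());
    }
}
```

The virtual machine does not care that these bytes never existed as a file; it only checks that they follow the class file format.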
2.2.2 Verification
  • File format verification: checks whether the binary byte stream conforms to the .class file format and whether the file can be handled by the current virtual machine (version check). Only after this check passes is the byte stream stored in the method area, and only then do the next three verifications take place, all of which work on the storage structure in the method area.
  • Metadata verification: performs semantic checks on the class metadata to ensure that nothing in it violates the Java language specification.
  • Bytecode verification: the most complex part of the whole verification process. After metadata verification has checked the data types in the metadata, bytecode verification analyzes the method bodies of the class to ensure that the methods will not harm the virtual machine at run time.
  • Symbolic reference verification: takes place during resolution, the third stage of linking, and ensures that resolution can proceed correctly. It checks the other classes that the class refers to: whether the class can be found by its fully qualified name, whether the referenced fields and methods exist in those classes, and whether they are accessible.
2.2.3 Preparation
  • Only class variables (static variables) are allocated memory in the method area during preparation. Instance variables are not; they are allocated on the Java heap together with the object when it is instantiated.
  • Class variables are initialized to the zero value of their type. For example, given a class variable val declared with a non-zero initial value, val is 0 after the preparation phase, not the declared value; it is assigned the declared value only during the initialization phase.
  • For constants, the value is stored in the ConstantValue attribute of the field table at compile time, so after the preparation phase the constant already holds the value specified by ConstantValue.
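The zero value left behind by the preparation phase can actually be observed from Java code. In this sketch (the names PreparationDemo and Holder, and the values 123 and 456, are chosen only for illustration), an initializer that runs before val is assigned still sees 0, while the constant is already available because javac records it in a ConstantValue attribute and also inlines it where it is used.

```java
public class PreparationDemo {
    static class Holder {
        // Runs first in <clinit>: val has not been assigned yet,
        // so the zero value from the preparation phase is what peek() reads.
        static int seenBeforeInit = peek();
        static int constSeenBeforeInit = peekConst();

        static int val = 123;         // assigned during initialization
        static final int CONST = 456; // compile-time constant (ConstantValue)

        static int peek() { return val; }
        static int peekConst() { return CONST; } // javac inlines 456 here
    }

    /** Returns {seenBeforeInit, val, constSeenBeforeInit}. */
    static int[] observe() {
        return new int[] {Holder.seenBeforeInit, Holder.val, Holder.constSeenBeforeInit};
    }

    public static void main(String[] args) {
        int[] r = observe();
        // prints: before=0 after=123 const=456
        System.out.println("before=" + r[0] + " after=" + r[1] + " const=" + r[2]);
    }
}
```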
2.2.4 Resolution
  • The specification does not say when the resolution phase must take place. It only requires that the symbolic references used by 16 instructions, such as anewarray, new, putfield, putstatic, getfield, and getstatic, are resolved before those instructions are executed. So a virtual machine can resolve a class right after it has been loaded, or wait until just before one of these instructions is executed.
  • Resolving the same symbolic reference several times is common. Except for the invokedynamic instruction, a virtual machine implementation can cache the result of the first resolution and simply reuse it for later resolutions of the same symbolic reference.
  • Resolution covers seven kinds of symbolic references: class or interface, field, class method, interface method, method type, method handle, and call site specifier.
2.2.5 Initialization
  • The class constructor <clinit>() is generated by the compiler, which automatically collects all class variable assignments and static blocks (static {}) in the class and merges them. They are collected in the order in which they appear in the source file. A static block can access class variables defined before it; class variables defined after it can be assigned in the block but not read.
  • Unlike the instance constructor <init>(), the class constructor <clinit>() does not need to call the parent's class constructor explicitly. The virtual machine guarantees that the parent's <clinit>() has finished before the subclass's <clinit>() runs. So the first <clinit>() executed in the virtual machine is that of java.lang.Object.
  • Because the parent's class constructor runs first, the static {} blocks of the parent class run before the static {} blocks of the child class.
  • A class constructor <clinit>() is not mandatory. If a class has no class variable assignments and no static {} blocks, it has no <clinit>().
  • An interface cannot contain static {} blocks, but it can have class variables, so an interface can also have a class constructor <clinit>(). It differs from a class's, though: running an interface's <clinit>() does not require running the parent interface's <clinit>() first; the parent interface is initialized only when a variable defined in it is used. Likewise, initializing a class that implements the interface does not run the interface's <clinit>().
  • The virtual machine ensures that a class's <clinit>() method is properly locked and synchronized in a multithreaded environment. If several threads try to initialize a class at the same time, only one thread executes <clinit>(), and the other threads block until that thread has finished.
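These rules are easy to check with a small experiment. The sketch below (ClinitDemo, Base, and Derived are hypothetical names) logs the order in which static initializers run and lets several threads trigger initialization at the same moment, to confirm that the static block still runs only once.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class ClinitDemo {
    static final List<String> ORDER = Collections.synchronizedList(new ArrayList<String>());
    static final AtomicInteger DERIVED_BLOCK_RUNS = new AtomicInteger();

    static class Base {
        static { ORDER.add("Base static block"); }
    }

    static class Derived extends Base {
        static int a = log("Derived.a initializer");
        static {
            ORDER.add("Derived static block");
            DERIVED_BLOCK_RUNS.incrementAndGet();
        }
        static int b = log("Derived.b initializer");

        static int log(String what) { ORDER.add(what); return 0; }
        static void touch() { } // calling it compiles to invokestatic
    }

    /** Lets several threads trigger initialization at once; returns the observed order. */
    static List<String> initializeFromManyThreads(int threadCount) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                Derived.touch(); // only one thread actually runs <clinit>()
            });
            threads[i].start();
        }
        start.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new ArrayList<>(ORDER);
    }

    public static void main(String[] args) {
        System.out.println(initializeFromManyThreads(8));
        System.out.println("Derived static block ran " + DERIVED_BLOCK_RUNS.get() + " time(s)");
    }
}
```

The log shows the parent's static block first, then the child's initializers in source order, no matter how many threads race to get there.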

2.3 Parent delegation model

For the parent delegation model, take a look at the source code of ClassLoader. Our company's Shadow uses exactly this mechanism to load plug-in classes, and Shadow was the first source code I studied after joining the company. Shadow is an Android plug-in framework developed in-house by Tencent and tested online by hundreds of millions of users. Shadow not only open-sources the key code of its plug-in technology, but also shares all the design needed for production deployment. Compared with other plug-in frameworks on the market, Shadow has the following features:

  • Reuses the source code of independently installed apps: the plug-in app's source code can also be installed and run as a normal app.
  • Zero-reflection, hack-free plug-in technology: in theory it needs no compatibility work for any system version and makes no hidden API calls, so it does not conflict at all with Google's policy of restricting access to non-SDK interfaces.
  • Fully dynamic plug-in framework: it is hard to get a plug-in framework perfect in one go, so Shadow makes the framework implementation itself dynamic, turning the framework code into part of the plug-in. Plug-in iteration is no longer held back by an older framework version packaged into the host.
  • Minimal host increment: thanks to the fully dynamic implementation, the amount of code that actually goes into the host app is tiny (15KB, around 160 methods).
  • Kotlin implementation: the core code of core.loader and core.transform is written entirely in Kotlin, which keeps it concise and easy to maintain.

protected Class<?> loadClass(String className, boolean resolve) throws ClassNotFoundException {
    // Check whether the class has already been loaded
    Class<?> clazz = findLoadedClass(className);
    if (clazz == null) {
        ClassNotFoundException suppressed = null;
        try {
            // Delegate to the parent loader first
            clazz = parent.loadClass(className, false);
        } catch (ClassNotFoundException e) {
            suppressed = e;
        }
        if (clazz == null) {
            try {
                // The parent could not load it, so look for it ourselves
                clazz = findClass(className);
            } catch (ClassNotFoundException e) {
                e.addSuppressed(suppressed);
                throw e;
            }
        }
    }
    return clazz;
}
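The effect of this parent-first order can be observed with a loader that merely records when its own findClass is consulted. This is a sketch built on the JDK's java.lang.ClassLoader, which follows the same delegation order; RecordingLoader and the class name no.such.Plugin are invented for the demo.

```java
import java.util.ArrayList;
import java.util.List;

public class DelegationDemo {
    /** A loader that never finds anything itself, but remembers what it was asked for. */
    static final class RecordingLoader extends ClassLoader {
        final List<String> findClassCalls = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            findClassCalls.add(name); // reached only after the parent chain failed
            throw new ClassNotFoundException(name);
        }
    }

    static List<String> run() {
        RecordingLoader loader = new RecordingLoader(DelegationDemo.class.getClassLoader());
        List<String> out = new ArrayList<>();
        try {
            // The parent chain can load this, so findClass is never consulted.
            Class<?> s = loader.loadClass("java.lang.String");
            out.add("same String class: " + (s == String.class));
        } catch (ClassNotFoundException e) {
            out.add("unexpected: " + e.getMessage());
        }
        try {
            loader.loadClass("no.such.Plugin"); // hypothetical name, nobody can load it
        } catch (ClassNotFoundException e) {
            out.add("not found: " + e.getMessage());
        }
        out.add("findClass calls: " + loader.findClassCalls);
        return out;
    }

    public static void main(String[] args) {
        // prints:
        // same String class: true
        // not found: no.such.Plugin
        // findClass calls: [no.such.Plugin]
        for (String line : run()) {
            System.out.println(line);
        }
    }
}
```

A plug-in framework hooks in at exactly this point: whatever the parent chain cannot provide ends up in findClass, where the plug-in's own classes can be supplied.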

3. JVM execution engine

Now that we know what is in a .class file and how it is resolved and loaded, the last step is to understand how bytecode instructions are executed. Before that, we need to understand two concepts: what is a stack frame, and what is dispatch?

3.1 Stack frames

A stack frame is the data structure that supports method invocation and method execution in the virtual machine. It is an element of the virtual machine stack in the run-time data area. Each stack frame stores a method's local variable table, operand stack, dynamic linking information, method return address, and some additional information. Every method call, from invocation to completion, corresponds to one stack frame being pushed onto and popped off the virtual machine stack. When the program is compiled, the size of the local variable table and the maximum depth of the operand stack are fully determined and written into the Code attribute of the method table, so the amount of memory a stack frame needs is not affected by run-time data; it depends only on the specific virtual machine implementation. The chain of method calls in a thread can be long, with many methods in progress at once. For the execution engine, only the frame at the top of the stack of the active thread is valid. It is called the current stack frame, and the method associated with it is called the current method. All bytecode instructions run by the execution engine operate only on the current stack frame.
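The frames themselves are not directly accessible from Java code, but a stack trace gives a rough view of them: one element per active method, with the current method on top. A minimal sketch, with made-up method names:

```java
import java.util.ArrayList;
import java.util.List;

public class StackFrameDemo {
    static List<String> outer() { return middle(); }

    static List<String> middle() { return inner(); }

    static List<String> inner() {
        // Element 0 describes the current method; each following element is its caller.
        StackTraceElement[] frames = new Throwable().getStackTrace();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 3 && i < frames.length; i++) {
            names.add(frames[i].getMethodName());
        }
        return names;
    }

    public static void main(String[] args) {
        // prints: [inner, middle, outer]
        System.out.println(outer());
    }
}
```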

3.2 Dispatch

Dispatch can be static or dynamic. Once we understand it, we can see how polymorphism, such as overloading and overriding, is implemented in Java. The Java virtual machine identifies a method by its class name, method name, and method descriptor. The first two need little explanation; the method descriptor consists of the method's parameter types and return type. If several methods with the same name and the same descriptor appear in one class, the Java virtual machine reports an error during the verification phase of class loading.

As you can see, the Java virtual machine differs from the Java language here: it does not forbid methods with the same name and parameter types but different return types from appearing in the same class. For bytecode that calls such methods, the method descriptor attached to the call includes the return type, so the Java virtual machine can identify the target method precisely.
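Descriptors are easy to inspect from Java. The sketch below uses java.lang.invoke.MethodType to print the descriptor for a few signatures; the helper name descriptor is made up. Note how two methods that differ only in return type get different descriptors.

```java
import java.lang.invoke.MethodType;

public class DescriptorDemo {
    /** Builds the JVM method descriptor for the given return and parameter types. */
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(descriptor(int.class, int.class, int.class));  // (II)I
        System.out.println(descriptor(long.class, int.class, int.class)); // (II)J
        System.out.println(descriptor(void.class, String[].class));       // ([Ljava/lang/String;)V
    }
}
```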

Static dispatch (static binding) means the target method can be identified directly at resolution time, while dynamic dispatch (dynamic binding) means the target method has to be identified at run time based on the dynamic type of the receiver. The Java virtual machine has no concept of overloading, because which overloaded method to execute is already decided at compile time. If we have to distinguish them, overloading is called static binding or compile-time polymorphism, and overriding is called dynamic binding. The Java virtual machine has five instructions for invoking methods:

  • invokestatic: invokes static methods.
  • invokespecial: invokes private instance methods, constructors, instance methods or constructors of the parent class reached through the super keyword, and default methods of implemented interfaces.
  • invokevirtual: invokes non-private instance methods.
  • invokeinterface: invokes interface methods.
  • invokedynamic: invokes dynamic methods.
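The difference between the two kinds of dispatch shows up in a classic little example. The classes Human and Man and the greet overloads are invented for this sketch: the overload is picked from the static type of the argument at compile time, while the overridden method is picked from the dynamic type of the receiver at run time.

```java
public class DispatchDemo {
    static class Human {
        String name() { return "human"; }
    }

    static class Man extends Human {
        @Override
        String name() { return "man"; }
    }

    // Two overloads: the compiler chooses between them.
    static String greet(Human h) { return "greet(Human)"; }
    static String greet(Man m) { return "greet(Man)"; }

    static String[] run() {
        Human h = new Man(); // static type Human, dynamic type Man
        return new String[] {
            greet(h),       // overload chosen by static type
            h.name(),       // override chosen by dynamic type (invokevirtual)
            greet((Man) h)  // the cast changes the static type
        };
    }

    public static void main(String[] args) {
        // prints: greet(Human), man, greet(Man)
        System.out.println(String.join(", ", run()));
    }
}
```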

3.3 Example

With these two concepts in mind, we need to look at a concrete example:

public class HelloWorld {
    public static void main(String[] args) {
        int num1 = 100;
        int num2 = 200;
        int sum = sum(num1, num2);
        System.out.println("sum = " + sum);
    }

    private static final int sum(int num1, int num2) {
        return num1 + num2;
    }
}

Output of javap -verbose HelloWorld.class:

public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=3, locals=4, args_size=1
         0: bipush        100
         2: istore_1
         3: sipush        200
         6: istore_2
         7: iload_1
         8: iload_2
         9: invokestatic  #2                  // Method sum:(II)I
        12: istore_3
        13: getstatic     #3                  // Field java/lang/System.out:Ljava/io/PrintStream;
        16: new           #4                  // class java/lang/StringBuilder
        19: dup
        20: invokespecial #5                  // Method java/lang/StringBuilder."<init>":()V
        23: ldc           #6                  // String sum =
        25: invokevirtual #7                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        28: iload_3
        29: invokevirtual #8                  // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
        32: invokevirtual #9                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        35: invokevirtual #10                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        38: return
      LineNumberTable:
        line 12: 0
        line 13: 3
        line 14: 7
        line 15: 13
        line 16: 38
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      39     0  args   [Ljava/lang/String;
            3      36     1  num1   I
            7      32     2  num2   I
           13      26     3   sum   I

This is important to understand. Later on we will do some fairly mechanical operations with ASM, but knowing how to write them, and why they are written that way, depends on understanding each instruction. We need to know what each instruction means: for example, bipush 100 pushes the constant 100 onto the operand stack, and istore_1 pops the value just pushed and stores it into slot 1 of the local variable table. We need to know exactly how the operand stack and the local variable table of the current frame change as each instruction runs.

This article is mostly theory in text form, so please be patient; once you understand it, it is actually quite simple. The material would normally fill three or four lectures, but I have condensed it into one or two. Since your levels differ, many of you may feel it was not covered in enough depth. In that case, find some additional articles to help you, but the overall direction is definitely this one.
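One way to practice this is to replay the instructions by hand. The sketch below is a toy model, not how a real virtual machine is implemented: a made-up Frame class holds a local variable table and an operand stack, and runMain walks through offsets 0 to 12 of the listing above. The body of sum is not shown in the javap output, so the sequence iload_0, iload_1, iadd, ireturn used here is an assumption about what javac typically emits for it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class FrameSimDemo {
    /** Toy stack frame: a local variable table plus an operand stack. */
    static final class Frame {
        final int[] locals;
        final Deque<Integer> stack = new ArrayDeque<>();

        Frame(int maxLocals) { locals = new int[maxLocals]; }

        void push(int value) { stack.push(value); }          // bipush / sipush
        void store(int slot) { locals[slot] = stack.pop(); } // istore_<slot>
        void load(int slot) { stack.push(locals[slot]); }    // iload_<slot>
    }

    /** Callee frame for sum(II)I, assuming iload_0, iload_1, iadd, ireturn. */
    static int sum(int num1, int num2) {
        Frame f = new Frame(2);
        f.locals[0] = num1; // arguments arrive in the callee's local slots
        f.locals[1] = num2;
        f.load(0);
        f.load(1);
        f.push(f.stack.pop() + f.stack.pop()); // iadd
        return f.stack.pop();                  // ireturn
    }

    /** Replays offsets 0..12 of main and returns local slot 3 (the variable sum). */
    static int runMain() {
        Frame f = new Frame(4);       // locals=4 in the javap output
        f.push(100); f.store(1);      // 0: bipush 100   2: istore_1
        f.push(200); f.store(2);      // 3: sipush 200   6: istore_2
        f.load(1); f.load(2);         // 7: iload_1      8: iload_2
        int arg2 = f.stack.pop();     // 9: invokestatic pops the arguments...
        int arg1 = f.stack.pop();
        f.push(sum(arg1, arg2));      //    ...and pushes the callee's result
        f.store(3);                   // 12: istore_3
        return f.locals[3];
    }

    public static void main(String[] args) {
        System.out.println("sum = " + runMain()); // prints: sum = 300
    }
}
```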

Video address: pan.baidu.com/s/1ozvNawIJ…

Video password: Q9KJ