Bytecode is the foundation of ASM, and an understanding of bytecode is a prerequisite for proficient use of ASM.

The class file format

Class files are what the Java virtual machine loads and executes directly. Their internal layout follows a fixed format, and each class file holds the definition of exactly one class or interface.

Each class file is a stream of 8-bit bytes. Its contents are listed below and are stored in this strict order.

ClassFile { 
    u4 magic; 
    u2 minor_version; 
    u2 major_version; 
    u2 constant_pool_count; 
    cp_info constant_pool[constant_pool_count-1]; 
    u2 access_flags; 
    u2 this_class; 
    u2 super_class; 
    u2 interfaces_count; 
    u2 interfaces[interfaces_count]; 
    u2 fields_count; 
    field_info fields[fields_count]; 
    u2 methods_count; 
    method_info methods[methods_count]; 
    u2 attributes_count; 
    attribute_info attributes[attributes_count]; 
}

In the Class file structure, the above items have the following meanings.

Name: Meaning
magic: The magic number, used to decide whether the file is a class file the virtual machine can accept. The value is always 0xCAFEBABE.
minor_version, major_version: The minor and major version of the class file. Different versions of the virtual machine support different class file versions.
constant_pool_count: Equal to the number of entries in the constant pool table plus one.
constant_pool: A table that contains all string constants, class or interface names, field names, and other constants referenced in the class file structure and its substructures.
access_flags: A mask of flags that describes the access permissions and properties of the class or interface, including ACC_PUBLIC, ACC_FINAL, ACC_SUPER, and so on.
this_class: The class index, which must be a valid index of an entry in the constant pool table.
super_class: Must be 0 or a valid index of an entry in the constant pool table. If it is 0, the class must be java.lang.Object, the only class without a parent class.
interfaces_count: Interface counter, the number of direct superinterfaces of the current class or interface.
interfaces[]: Interface table, in which each member must be a valid index of an entry in the constant pool table.
fields_count: Field counter, the number of members of the fields table in the current class file. Each member is a field_info.
fields: Field table, each member of which is a complete field_info structure describing one field declared in the current class or interface. Fields inherited from a parent class or parent interface are not included.
methods_count: Method counter, the number of members of the methods table in the current class file.
methods: Method table, each member of which is a complete method_info structure. It covers all methods declared in the class or interface, including instance methods, class methods, and class or interface initialization methods.
attributes_count: Attribute counter, the number of members of the attributes table in the current class file.
attributes: Attribute table, each member of which is an attribute_info. Attributes at this level include InnerClasses, EnclosingMethod, Synthetic, Signature, the annotation attributes, and so on.

The content above was collected from the web; I do not know its original source.
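As a rough illustration of the fixed layout, the sketch below reads only the first four items of the structure (magic, minor_version, major_version, constant_pool_count) from a byte array. The names ClassHeaderReader and readHeader are made up for this example; they are not part of ASM or the JDK, and the sketch deliberately stops before the constant pool.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

final class ClassHeaderReader {
    /** Returns {magic, minor_version, major_version, constant_pool_count}. */
    static int[] readHeader(byte[] classBytes) {
        // ByteBuffer is big-endian by default, which matches the class file format.
        ByteBuffer buffer = ByteBuffer.wrap(classBytes);
        int magic = buffer.getInt();                        // u4 magic
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        int minor = buffer.getShort() & 0xFFFF;             // u2 minor_version
        int major = buffer.getShort() & 0xFFFF;             // u2 major_version
        int constantPoolCount = buffer.getShort() & 0xFFFF; // u2 constant_pool_count
        return new int[] {magic, minor, major, constantPoolCount};
    }

    public static void main(String[] args) throws IOException {
        // Try to read the compiled form of this very class from the classpath.
        try (InputStream in =
                 ClassHeaderReader.class.getResourceAsStream("ClassHeaderReader.class")) {
            if (in == null) {
                System.out.println("class file not found on the classpath");
                return;
            }
            int[] header = readHeader(in.readAllBytes());
            System.out.printf("magic=0x%08X minor=%d major=%d constant_pool_count=%d%n",
                    header[0], header[1], header[2], header[3]);
        }
    }
}
```

The major version printed depends on the compiler target, for example 52 for Java 8.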

Bytecode is very different from Java code.

  • A bytecode file describes exactly one class, whereas a Java file can contain several. When a Java file declares a class that contains an inner class, it is compiled into two class files, with a “$” in the inner class’s file name. The outer class file contains a reference to its inner class, and the inner class contains a reference back to its enclosing class
  • Bytecode files contain no comments, only the class itself with its fields, methods, and attributes
  • Bytecode files contain no package or import sections, so all type names must be fully qualified
  • Bytecode files also contain a constant pool, generated at compile time. The constant pool is essentially an array that stores all numeric constants, string constants, and type names used in the class. Each constant is defined only once in the constant pool, and every other part of the class file refers to it by index
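The naming of nested classes can be checked from plain Java with reflection. This is only a small sketch; the class names OuterDemo and Inner are invented for the example, and the printed names assume the file is compiled in the default package.

```java
final class OuterDemo {
    // Compiled into its own file, OuterDemo$Inner.class.
    static class Inner {}

    public static void main(String[] args) {
        // The binary name of the nested class carries the "$" separator.
        System.out.println(Inner.class.getName());            // OuterDemo$Inner
        // The nested class records which class encloses it.
        System.out.println(Inner.class.getEnclosingClass());  // class OuterDemo
        // Names are always fully qualified; there is no import section to shorten them.
        System.out.println(String.class.getName());           // java.lang.String
    }
}
```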

The execution of bytecode

The Java virtual machine is stack based: instructions take their operands from an operand stack and push their results back onto it, where a register-based CPU would use registers. To compute “a + b”, for example, “a” is pushed onto the stack, then “b” is pushed, and finally an add instruction (such as IADD for ints) pops both values, adds them, and pushes the result “a + b” back onto the stack.
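The push, push, add sequence can be imitated with an ordinary Deque. This is a toy model of the operand stack, not how the JVM is implemented, and OperandStackDemo is a name chosen for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class OperandStackDemo {
    /** Mimics "load a, load b, add, return" on a hand-rolled operand stack. */
    static int add(int a, int b) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(a);              // like ILOAD: push a
        stack.push(b);              // like ILOAD: push b
        int right = stack.pop();    // like IADD: pop both operands...
        int left = stack.pop();
        stack.push(left + right);   // ...and push their sum
        return stack.pop();         // like IRETURN: hand back the top of the stack
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // prints 5
    }
}
```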

Type descriptor

Every type used in Java code has a corresponding representation in bytecode, called a type descriptor.

Java type          Type descriptor
boolean            Z
char               C
byte               B
short              S
int                I
float              F
long               J
double             D
Object             Ljava/lang/Object;
int[]              [I
Object[][]         [[Ljava/lang/Object;
void               V
Reference types    L<internal name>;
  • The descriptors of Java’s primitive types are single characters, such as Z for boolean and C for char
  • The descriptor of a class type is the fully qualified name of the class written with “/” instead of “.”, preceded by the character L and followed by a “;”. For example, the type descriptor of String is Ljava/lang/String;
  • The descriptor of an array type is an opening square bracket followed by the descriptor of the array’s element type. Multidimensional arrays use one square bracket per dimension

With these rules in mind, it is relatively easy to read the types of parameters in bytecode.
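The rules are simple enough to write down as a small recursive function. The helper below is a hand-written sketch for illustration (TypeDescriptors and descriptorOf are invented names); ASM has its own Type class for this job.

```java
final class TypeDescriptors {
    /** Builds the type descriptor of a Class object by following the rules above. */
    static String descriptorOf(Class<?> type) {
        if (type.isArray()) {
            // One "[" per dimension, then the element type's descriptor.
            return "[" + descriptorOf(type.getComponentType());
        }
        if (type == boolean.class) return "Z";
        if (type == char.class) return "C";
        if (type == byte.class) return "B";
        if (type == short.class) return "S";
        if (type == int.class) return "I";
        if (type == float.class) return "F";
        if (type == long.class) return "J";
        if (type == double.class) return "D";
        if (type == void.class) return "V";
        // Reference type: L + name with "/" instead of "." + ";"
        return "L" + type.getName().replace('.', '/') + ";";
    }

    public static void main(String[] args) {
        System.out.println(descriptorOf(int.class));        // I
        System.out.println(descriptorOf(String.class));     // Ljava/lang/String;
        System.out.println(descriptorOf(int[].class));      // [I
        System.out.println(descriptorOf(Object[][].class)); // [[Ljava/lang/Object;
    }
}
```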

Method descriptor

A method descriptor (often loosely called a method signature) is a single string built from type descriptors that describes the parameter types and return type of a method.

The method descriptor begins with an open parenthesis, followed by the type descriptor of each parameter, followed by a close parenthesis, followed by the type descriptor of the return type, which is V if the method returns void. Note that the method descriptor contains neither the method name nor the parameter names.

Java method declaration      Method descriptor            Description
void m(int i, float f)       (IF)V                        Takes an int and a float and returns no value
int m(Object o)              (Ljava/lang/Object;)I        Takes an Object and returns an int
int[] m(int i, String s)     (ILjava/lang/String;)[I      Takes an int and a String and returns an int[]
Object m(int[] i)            ([I)Ljava/lang/Object;       Takes an int[] and returns an Object
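The rows of the table can be reproduced without ASM: the JDK's java.lang.invoke.MethodType can print the descriptor for a given return type and parameter list. The wrapper class MethodDescriptors and its descriptorOf method are just names for this sketch.

```java
import java.lang.invoke.MethodType;

final class MethodDescriptors {
    /** Returns the method descriptor for the given return type and parameter types. */
    static String descriptorOf(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // void m(int i, float f)
        System.out.println(descriptorOf(void.class, int.class, float.class));    // (IF)V
        // int m(Object o)
        System.out.println(descriptorOf(int.class, Object.class));               // (Ljava/lang/Object;)I
        // int[] m(int i, String s)
        System.out.println(descriptorOf(int[].class, int.class, String.class));  // (ILjava/lang/String;)[I
        // Object m(int[] i)
        System.out.println(descriptorOf(Object.class, int[].class));             // ([I)Ljava/lang/Object;
    }
}
```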

Bytecode example

So let’s see what this simple code looks like as bytecode.

Using ASMPlugin, let’s take a look at the generated bytecode, as shown below.

As you can see, there are two main parts: <init> (the constructor) and onCreate.

Each time a Java method is invoked, the Java virtual machine allocates a “stack frame” for it. The stack frame holds all the data the method needs while it runs, namely its local variables and its operand stack.

For an instance method, local variable 0 is “this”, followed by any arguments passed to the method.
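To make the numbering concrete, the sketch below computes the local variable index of each parameter. It relies on two facts of the JVM: static methods have no “this” in slot 0, and long and double values occupy two slots each. LocalSlots and parameterSlots are names invented for this example.

```java
import java.util.Arrays;

final class LocalSlots {
    /** Returns the local variable index assigned to each parameter, in order. */
    static int[] parameterSlots(boolean isStatic, Class<?>... parameterTypes) {
        int[] slots = new int[parameterTypes.length];
        int next = isStatic ? 0 : 1; // slot 0 holds "this" for instance methods
        for (int i = 0; i < parameterTypes.length; i++) {
            slots[i] = next;
            Class<?> type = parameterTypes[i];
            // long and double take two slots, everything else takes one.
            next += (type == long.class || type == double.class) ? 2 : 1;
        }
        return slots;
    }

    public static void main(String[] args) {
        // Instance method m(int, long, String): this=0, int=1, long=2..3, String=4
        System.out.println(Arrays.toString(
                parameterSlots(false, int.class, long.class, String.class))); // [1, 2, 4]
        // Static method m(int, String): int=0, String=1
        System.out.println(Arrays.toString(
                parameterSlots(true, int.class, String.class)));              // [0, 1]
    }
}
```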

There are many instructions in bytecode, and the following are some of the more common ones.

  • ALOAD 0: one of the LOAD family of instructions. It pushes local variable 0 onto the operand stack, which in an instance method is the equivalent of “this” in source code. The A indicates that the value is a reference type. ALOAD, ILOAD, LLOAD, FLOAD, and DLOAD are the load instructions for the different data types
  • INVOKESPECIAL: one of the family of invoke instructions. It calls constructors, private methods, and superclass methods. The target must be specified in full: owner class, method name, and method descriptor
  • INVOKEVIRTUAL: unlike INVOKESPECIAL, it calls an instance method through an object reference, and the method that actually runs is chosen by the runtime class of that object
  • INVOKESTATIC: calls a static method of a class
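The comments in the following sketch mark which invoke instruction javac emits for a few ordinary Java calls. The classes Animal, Dog, and InvokeKindsDemo are invented for illustration; running javap -c on the compiled classes should show the instructions named in the comments.

```java
final class InvokeKindsDemo {
    static String label(Animal animal) {
        // INVOKEVIRTUAL Animal.describe ()Ljava/lang/String;
        // The method that runs is picked by the runtime class of "animal".
        return animal.describe();
    }

    public static void main(String[] args) {
        Animal pet = new Dog();         // NEW, DUP, then INVOKESPECIAL Dog.<init> ()V
        String text = label(pet);       // INVOKESTATIC InvokeKindsDemo.label
        System.out.println(text);       // prints dog/animal
    }
}

class Animal {
    String describe() {
        return "animal";
    }
}

class Dog extends Animal {
    Dog() {
        super();                        // INVOKESPECIAL Animal.<init> ()V
    }

    @Override
    String describe() {
        // super.describe() is an INVOKESPECIAL: it bypasses virtual dispatch.
        return "dog/" + super.describe();
    }
}
```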

You don’t have to memorize every instruction. As long as you can follow them when you see them in generated code, that is enough, because what we need to do is modify existing bytecode, not write it from scratch.

  • For Java source files: a class with only one method compiles to two methods, the extra one being the default constructor
  • For Kotlin source files: a class with only one method compiles to four methods, including the default constructor, methods synthesized by Kotlin, and a default function that clears memory on exit
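The Java half of this observation can be checked with reflection: a class that declares one method and no constructor still reports one constructor after compilation. This is a minimal sketch with made-up class names, and it covers only the Java case, not Kotlin.

```java
final class DefaultConstructorDemo {
    static int methodCount(Class<?> type) {
        return type.getDeclaredMethods().length;
    }

    static int constructorCount(Class<?> type) {
        return type.getDeclaredConstructors().length;
    }

    public static void main(String[] args) {
        System.out.println(methodCount(OneMethod.class));      // 1, the hello() method
        System.out.println(constructorCount(OneMethod.class)); // 1, added by the compiler
    }
}

// Declares exactly one method and no constructor in source.
class OneMethod {
    void hello() {}
}
```

Reflection reports the constructor separately from methods, whereas in the class file it is stored in the methods table under the name <init>.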

ASM Code

Here is the same example as above, this time expressed as ASM code.

The default constructor.

onCreate:

There is some generated code, such as:

    Label label0 = new Label();
    methodVisitor.visitLabel(label0);
    methodVisitor.visitLineNumber(9, label0);
    methodVisitor.visitLocalVariable("this", "Lcom/yw/asmtest/MainActivity;", null, label0, label4, 0);

These calls only write debug information (line numbers and the local variable table), so we don’t have to worry about them.

The rest of the code is what we need in ASM.

I’d like to recommend my website, xuyisheng.top, which focuses on Android, Kotlin, and Flutter. You are welcome to visit.