ASM introduction
ASM is a general-purpose Java bytecode manipulation and analysis framework. It can be used to modify existing classes or to generate classes dynamically, directly in binary form. ASM provides some common bytecode transformation and analysis algorithms from which you can build custom, complex transformation and code analysis tools. It offers similar functionality to other Java bytecode frameworks, but with a focus on performance. Because it is designed and implemented to be as small and fast as possible, it is well suited for use in dynamic systems (and can of course also be used statically, for example in compilers).
ASM is used in many projects, including the following:
- OpenJDK, to generate lambda call sites, and in the Nashorn compiler;
- The Groovy and Kotlin compilers;
- Cobertura and Jacoco, to instrument classes in order to measure code coverage;
- CGLIB, to dynamically generate proxy classes;
- Gradle, to generate classes at run time.
For more information, see asm.ow2.io/
IDE plug-ins
ASM works directly with bytecode. If you are not familiar with the bytecode instruction set, ASM code can be difficult to write. The following IDE plug-ins help:
- IDEA: ASM Bytecode Outline
- Eclipse: BytecodeOutline
Taking IDEA as an example, right-click in the corresponding class and choose Show Bytecode Outline.
The panel contains three tabs:
- Bytecode: the bytecode corresponding to the class;
- ASMified: the ASM code that generates that bytecode;
- Groovified: the bytecode instructions corresponding to the class;
ASM API
The ASM library provides two APIs for generating and transforming compiled classes. One is the core API, which represents classes in event-based form. The other is the tree API, which represents classes in object-based form. This is comparable to the two ways of parsing XML files, SAX and DOM: the core API corresponds to SAX, and the tree API corresponds to DOM. Each model has its own pros and cons:
- The event-based API is faster and requires less memory than the object-based API, but class transformations can be more difficult to implement with it;
- The object-based API loads the entire class into memory;
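The SAX/DOM comparison can be made concrete with the XML parsers that ship with the JDK. The sketch below is only an analogy and does not use ASM; the XML snippet and the class name SaxVersusDom are made up for illustration. The SAX variant receives one callback per element, much like the core API delivers events, while the DOM variant first builds the whole document in memory, much like the tree API.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxVersusDom {
    // Hypothetical XML standing in for a class with one field and one method.
    static final String XML =
            "<class name=\"Demo\"><field name=\"id\"/><method name=\"run\"/></class>";

    private static InputStream input() {
        return new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8));
    }

    // Event-based: the parser pushes one event per element and keeps no tree.
    static List<String> elementsViaSax() {
        List<String> seen = new ArrayList<>();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(input(), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes attrs) {
                    seen.add(qName);
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    // Object-based: the whole document is loaded first, then walked as a tree.
    static List<String> elementsViaDom() {
        List<String> seen = new ArrayList<>();
        try {
            Node root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(input()).getDocumentElement();
            collect(root, seen);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    private static void collect(Node node, List<String> out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.add(node.getNodeName());
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collect(child, out);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementsViaSax());
        System.out.println(elementsViaDom());
    }
}
```

Both variants report the same elements; the difference lies in when the data is available and how much of it is held in memory at once.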
ASM libraries are organized in packages that are distributed in separate JAR files:
- org.objectweb.asm and org.objectweb.asm.signature packages: define the event-based API and provide the class parser and writer components; contained in asm.jar;
- org.objectweb.asm.util package: provides a variety of tools based on the core API that can be used during the development and debugging of ASM applications; contained in asm-util.jar;
- org.objectweb.asm.commons package: provides several useful predefined class transformers, mainly based on the core API; contained in asm-commons.jar;
- org.objectweb.asm.tree package: defines the object-based API and provides tools for converting between the event-based and object-based representations; contained in asm-tree.jar;
- org.objectweb.asm.tree.analysis package: provides a class analysis framework and several predefined class analyzers, based on the tree API; contained in asm-analysis.jar;
Core API
Before studying the core API, it is worth looking at the visitor pattern, which is how ASM processes and analyzes bytecode.
Visitor pattern
The visitor pattern suggests putting new behavior into a separate class, called a visitor, rather than integrating it into existing classes. The original object on which the operation is performed is passed as a parameter to the visitor's methods, giving those methods access to any necessary data that the object contains. Common application scenarios:
- When you need to perform an operation on all elements of a complex object structure, such as a tree of objects;
- When you want to keep auxiliary behavior out of the business logic;
- When a behavior makes sense only in some classes of a class hierarchy and not in others;
Bytecode is exactly such a complex object structure. SQL parsing in Sharding-JDBC also uses the visitor pattern. What these cases have in common is data with a stable structure and a fixed syntax.
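A minimal sketch of the pattern in plain Java may help before moving on to ASM. All names here (VisitorDemo, Member, Counter and so on) are hypothetical and unrelated to the ASM API; the point is only that the element classes stay unchanged while new behavior, in this case counting, lives in the visitor.

```java
import java.util.Arrays;
import java.util.List;

class VisitorDemo {
    // The visitor declares one method per kind of element it can visit.
    interface Visitor {
        void visitField(String name);
        void visitMethod(String name);
    }

    // Elements only know how to hand themselves to a visitor.
    interface Member {
        void accept(Visitor visitor);
    }

    static final class Field implements Member {
        final String name;
        Field(String name) { this.name = name; }
        @Override public void accept(Visitor visitor) { visitor.visitField(name); }
    }

    static final class Method implements Member {
        final String name;
        Method(String name) { this.name = name; }
        @Override public void accept(Visitor visitor) { visitor.visitMethod(name); }
    }

    // New behavior is added as a visitor, without touching Field or Method.
    static final class Counter implements Visitor {
        int fields;
        int methods;
        @Override public void visitField(String name) { fields++; }
        @Override public void visitMethod(String name) { methods++; }
    }

    // Walks the structure and returns {fieldCount, methodCount}.
    static int[] count(List<Member> members) {
        Counter counter = new Counter();
        for (Member member : members) {
            member.accept(counter);
        }
        return new int[] {counter.fields, counter.methods};
    }

    public static void main(String[] args) {
        int[] result = count(Arrays.asList(new Field("id"), new Method("query"), new Method("save")));
        System.out.println(result[0] + " field(s), " + result[1] + " method(s)");
    }
}
```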
More references: Visitor pattern
Classes
The visitor pattern has two core roles: the independent visitor, and the event producer that accepts the visitor. In ASM these correspond to two core classes: ClassVisitor and ClassReader.
ClassVisitor
The ASM API for generating and transforming compiled classes is based on the ClassVisitor abstract class, each method in which corresponds to a class file structure of the same name:
// API outline only: method bodies are omitted
public abstract class ClassVisitor {
    public ClassVisitor(int api);
    public ClassVisitor(int api, ClassVisitor cv);
    public void visit(int version, int access, String name, String signature, String superName, String[] interfaces);
    public void visitSource(String source, String debug);
    public void visitOuterClass(String owner, String name, String desc);
    public AnnotationVisitor visitAnnotation(String desc, boolean visible);
    public void visitAttribute(Attribute attr);
    public void visitInnerClass(String name, String outerName, String innerName, int access);
    public FieldVisitor visitField(int access, String name, String desc, String signature, Object value);
    public MethodVisitor visitMethod(int access, String name, String desc, String signature, String[] exceptions);
    public void visitEnd();
}
Sections whose content can be of arbitrary length and complexity are visited through auxiliary visitor classes, which the corresponding methods return: AnnotationVisitor, FieldVisitor and MethodVisitor. See the Java Virtual Machine Specification for details of the class file structure.
All of the above methods are called by the event producer, ClassReader, which also supplies all of their arguments. The methods must be called in the following order:
visit visitSource? visitOuterClass? ( visitAnnotation | visitAttribute )* ( visitInnerClass | visitField | visitMethod )* visitEnd
That is, visit is called first, followed by at most one call to visitSource, followed by at most one call to visitOuterClass, followed by any number of calls to visitAnnotation and visitAttribute in any order, followed by any number of calls to visitInnerClass, visitField and visitMethod in any order, and finally a single call to visitEnd.
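The ordering rule reads like a regular expression, so one way to get a feel for it is to check recorded call sequences against an equivalent pattern. This is only an illustrative sketch in plain Java: VisitOrderCheck is a hypothetical helper, not part of ASM.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class VisitOrderCheck {
    // The call-order grammar, with every method name followed by one space.
    private static final Pattern ORDER = Pattern.compile(
            "visit (visitSource )?(visitOuterClass )?"
            + "((visitAnnotation|visitAttribute) )*"
            + "((visitInnerClass|visitField|visitMethod) )*"
            + "visitEnd ");

    // Returns true if the sequence of method names respects the grammar.
    static boolean isValid(List<String> calls) {
        StringBuilder joined = new StringBuilder();
        for (String call : calls) {
            joined.append(call).append(' ');
        }
        return ORDER.matcher(joined).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid(Arrays.asList("visit", "visitSource", "visitField", "visitMethod", "visitEnd")));
        System.out.println(isValid(Arrays.asList("visit", "visitField", "visitSource", "visitEnd")));
    }
}
```

The second sequence is rejected because visitSource may only appear directly after visit.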
ClassReader
The main function of this class is to read the bytecode file and notify the ClassVisitor of the read data. The bytecode file can be passed in a variety of ways:
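- public ClassReader(final InputStream inputStream): reads the class from a byte stream;
- public ClassReader(final String className): reads the class with the given fully qualified name, which is located as a resource on the class path;
- public ClassReader(final byte[] classFile): reads the class from a byte array holding the class file;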
The common usage modes are as follows:
ClassReader classReader = new ClassReader("com/zh/asm/TestService");
ClassWriter classVisitor = new ClassWriter(ClassWriter.COMPUTE_MAXS);
classReader.accept(classVisitor, 0);
The accept method of ClassReader takes the visitor and a parsingOptions argument, which can combine the following flags:
- SKIP_CODE: skips the compiled code (useful if you only need the class structure);
- SKIP_DEBUG: does not visit debug information and does not create artificial labels for it;
- SKIP_FRAMES: skips the stack map frames;
- EXPAND_FRAMES: expands (decompresses) the stack map frames;
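To make "reading the bytecode file" a little more tangible, the sketch below reads only the first eight bytes of a class file with plain JDK streams: the magic number and the version. It does not use ASM, and ClassHeaderReader is a hypothetical name. Locating the file as a resource named after the class is assumed here to be similar to what the String constructor of ClassReader does.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassHeaderReader {
    // Returns {magic, minorVersion, majorVersion} of the class file of the given class.
    static int[] readHeader(Class<?> type) {
        // A class file is found as a resource named after the internal class name.
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalArgumentException("No class file found for " + type.getName());
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();            // always 0xCAFEBABE
            int minor = data.readUnsignedShort();  // minor_version
            int major = data.readUnsignedShort();  // major_version
            return new int[] {magic, minor, major};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader(Object.class);
        System.out.printf("magic=%08X minor=%d major=%d%n", header[0], header[1], header[2]);
    }
}
```

ClassReader goes on from there to parse the constant pool, fields, methods and attributes, and reports each of them to the visitor.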
ClassWriter
ClassWriter, used in the example above, extends ClassVisitor and generates classes in binary form. It can also be used on its own, as follows:
// Assumes: import static org.objectweb.asm.Opcodes.*;
ClassWriter cw = new ClassWriter(0);
cw.visit(V1_5, ACC_PUBLIC + ACC_ABSTRACT + ACC_INTERFACE, "pkg/Comparable", null, "java/lang/Object", new String[]{"pkg/Mesurable"});
cw.visitField(ACC_PUBLIC + ACC_FINAL + ACC_STATIC, "LESS", "I", null, Integer.valueOf(-1)).visitEnd();
cw.visitField(ACC_PUBLIC + ACC_FINAL + ACC_STATIC, "EQUAL", "I", null, Integer.valueOf(0)).visitEnd();
cw.visitField(ACC_PUBLIC + ACC_FINAL + ACC_STATIC, "GREATER", "I", null, Integer.valueOf(1)).visitEnd();
cw.visitMethod(ACC_PUBLIC + ACC_ABSTRACT, "compareTo", "(Ljava/lang/Object;)I", null, null).visitEnd();
cw.visitEnd();
byte[] b = cw.toByteArray();
// Write the generated class to a file; the stream is closed automatically
try (FileOutputStream fileOutputStream = new FileOutputStream(new File("F:/asm/Comparable.class"))) {
    fileOutputStream.write(b);
}
ClassWriter generates the class, toByteArray returns it as a byte array, and FileOutputStream writes it to a file. Decompiling that file gives the following:
package pkg;
public interface Comparable extends Mesurable {
int LESS = -1;
int EQUAL = 0;
int GREATER = 1;
int compareTo(Object var1);
}
A flags argument is required to instantiate ClassWriter. The options are:
- COMPUTE_MAXS: the sizes of the local variables and of the operand stack are computed for you. You must still call visitMaxs, but you can pass any arguments: they are ignored and recomputed. With this option you still have to compute the frames yourself;
- COMPUTE_FRAMES: everything is computed automatically. You no longer need to call visitFrame, but you must still call visitMaxs (its arguments are ignored and recomputed);
- 0: nothing is computed automatically. You must compute the frames and the sizes of the local variables and operand stack yourself;
The above uses ClassWriter on its own, but it is more useful to combine the three core classes.
Conversion operations
A ClassVisitor is introduced between the class reader and the class writer, and the code structure is roughly as follows:
ClassReader classReader = new ClassReader("com/zh/asm/TestService");
ClassWriter classWriter = new ClassWriter(ClassWriter.COMPUTE_MAXS);
// processing
ClassVisitor classVisitor = new AddFieldAdapter(classWriter...);
classReader.accept(classVisitor, 0);
In this architecture the ClassReader produces events, the adapter in the middle forwards or modifies them, and the ClassWriter turns the resulting events back into a byte array.
The adapter below adds a field. It overrides the visitEnd method and emits the new field there:
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.FieldVisitor;
import static org.objectweb.asm.Opcodes.ASM4;

public class AddFieldAdapter extends ClassVisitor {
    private int fAcc;
    private String fName;
    private String fDesc;
    // Whether a field with the same name already exists
    private boolean isFieldPresent;

    public AddFieldAdapter(ClassVisitor cv, int fAcc, String fName, String fDesc) {
        super(ASM4, cv);
        this.fAcc = fAcc;
        this.fName = fName;
        this.fDesc = fDesc;
    }

    @Override
    public FieldVisitor visitField(int access, String name, String desc, String signature, Object value) {
        // Remember whether the field already exists; if not, it is added in visitEnd
        if (name.equals(fName)) {
            isFieldPresent = true;
        }
        return cv.visitField(access, name, desc, signature, value);
    }

    @Override
    public void visitEnd() {
        if (!isFieldPresent) {
            FieldVisitor fv = cv.visitField(fAcc, fName, fDesc, null, null);
            if (fv != null) {
                fv.visitEnd();
            }
        }
        cv.visitEnd();
    }
}
Because of the order in which ClassVisitor methods are called, visitField is called once for each field in the class. Each call checks whether the field to be added already exists and records the result in the isFieldPresent flag. visitEnd, which is called last, then decides whether the new field needs to be added. The adapter is created as follows:
ClassVisitor classVisitor = new AddFieldAdapter(classWriter, ACC_PUBLIC + ACC_FINAL + ACC_STATIC, "id", "I");
This adds a public static final int id field. Writing the resulting byte array to a class file and decompiling it gives:
public class TestService {
    public static final int id;
    ...
}
Utility classes
In addition to the above core classes, ASM also provides some utility classes:
- Type: a Type object represents a Java type. It can be constructed either from a type descriptor or from a Class object. The Type class also contains static variables that represent the primitive types;
- TraceClassVisitor: extends the ClassVisitor class and builds a textual representation of the visited class. Use TraceClassVisitor to get a readable trace of what is actually generated;
- CheckClassAdapter: the ClassWriter class does not check that its methods are called in the right order and with valid arguments, so it is possible to generate invalid classes that are rejected by the Java virtual machine verifier. To detect some of these errors as early as possible, use the CheckClassAdapter class;
- ASMifier: this class provides an alternative backend for the TraceClassVisitor tool (which by default uses a Textifier backend, producing a textual listing). With this backend, each method of TraceClassVisitor prints the Java code that was used to call it.
Methods
As introduced above, ClassVisitor handles the complex parts of a class through the auxiliary visitor classes it returns: AnnotationVisitor, FieldVisitor and MethodVisitor. Before introducing MethodVisitor, take a look at the Java Virtual Machine execution model.
Execution model
Each time a method is invoked, the Java virtual machine creates a stack frame that stores the local variable table, the operand stack, dynamic linking information, the method return address, and so on. The lifetime of a method call, from invocation to completion, corresponds to one stack frame being pushed onto and then popped off the virtual machine stack. Each frame has two main parts:
- Local variable table: contains variables that can be accessed by their index, in random order;
- Operand stack: the stack of values used as operands by bytecode instructions;
Consider an execution stack with 3 frames:
Frame 1: contains 3 local variables; its operand stack has a maximum size of 4 and currently holds 2 values;
Frame 2: contains 2 local variables; its operand stack has a maximum size of 3 and currently holds 2 values;
Frame 3: contains 4 local variables; its operand stack has a maximum size of 2 and currently holds 2 values;
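The maximum operand stack size of a frame is a property of the method's code. As a rough illustration, for straight-line code without branches it is simply the highest depth reached while applying each instruction's net effect on the stack. The sketch below is a simplification under that assumption; MaxStackSketch is a hypothetical helper, and ASM's own computation also has to follow the control flow.

```java
class MaxStackSketch {
    // stackDeltas[i] is the net change in operand stack depth caused by instruction i.
    static int maxStack(int[] stackDeltas) {
        int depth = 0;
        int max = 0;
        for (int delta : stackDeltas) {
            depth += delta;
            if (depth < 0) {
                throw new IllegalArgumentException("operand stack underflow");
            }
            max = Math.max(max, depth);
        }
        return max;
    }

    public static void main(String[] args) {
        // System.out.println("start"); return;
        // GETSTATIC +1, LDC +1, INVOKEVIRTUAL println(String) -2, RETURN 0
        System.out.println(maxStack(new int[] {1, 1, -2, 0}));
    }
}
```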
Bytecode instructions
A bytecode instruction consists of an opcode that identifies the instruction and a fixed number of arguments:
- Opcode: an unsigned byte value, identified by a mnemonic. For example, the opcode value 0 is designated by the mnemonic NOP and corresponds to an instruction that does nothing.
- Arguments: static values that define the exact behavior of the instruction. They are given immediately after the opcode.
Bytecode instructions fall into two categories:
- A small set of instructions transfers values from the local variables to the operand stack, and vice versa;
- The other instructions only act on the operand stack: they pop some values from the stack, compute a result from those values, and push it back onto the stack;
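The two categories can be mimicked with a toy frame in plain Java. This is a sketch for intuition only, not how the JVM is implemented: TinyFrame and its methods are hypothetical names that imitate the ILOAD, ISTORE and IADD instructions on int values.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TinyFrame {
    final int[] locals;                            // local variable table
    final Deque<Integer> stack = new ArrayDeque<>(); // operand stack

    TinyFrame(int maxLocals) {
        locals = new int[maxLocals];
    }

    // Category 1: transfer between local variables and the operand stack.
    void iload(int index)  { stack.push(locals[index]); }
    void istore(int index) { locals[index] = stack.pop(); }

    // Category 2: work on the operand stack only (pop two values, push the sum).
    void iadd() {
        int right = stack.pop();
        int left = stack.pop();
        stack.push(left + right);
    }

    // Runs "ILOAD 0, ILOAD 1, IADD, ISTORE 2" with a and b in locals 0 and 1.
    static int add(int a, int b) {
        TinyFrame frame = new TinyFrame(3);
        frame.locals[0] = a;
        frame.locals[1] = b;
        frame.iload(0);
        frame.iload(1);
        frame.iadd();
        frame.istore(2);
        return frame.locals[2];
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));
    }
}
```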
Local variable instructions:
- ILOAD: loads a boolean, byte, char, short, or int local variable;
- LLOAD, FLOAD, DLOAD: load a long, float, or double value, respectively;
- ALOAD: loads any non-primitive value, that is, object and array references;
Operand stack instructions:
- ISTORE: pops a boolean, byte, char, short, or int value from the operand stack and stores it in the local variable designated by its index i;
- LSTORE, FSTORE, DSTORE: pop a long, float, or double value, respectively;
- ASTORE: pops any non-primitive value;
- GETFIELD, PUTFIELD: GETFIELD owner name desc pops an object reference and pushes the value of its name field. PUTFIELD owner name desc pops a value and an object reference, and stores the value in its name field. In both cases the object must be of type owner, and its field must be of type desc. GETSTATIC and PUTSTATIC are similar instructions, but for static fields;
- INVOKEVIRTUAL, INVOKESTATIC, INVOKESPECIAL, INVOKEINTERFACE, INVOKEDYNAMIC: INVOKEVIRTUAL owner name desc invokes the name method defined in class owner, whose method descriptor is desc. INVOKESTATIC is used for static methods, INVOKESPECIAL for private methods and constructors, and INVOKEINTERFACE for methods defined in interfaces. Finally, for Java 7 classes, INVOKEDYNAMIC is used for the new dynamic method invocation mechanism.
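The owner and desc arguments use internal names and descriptors, which are easy to get wrong by hand. As a hedged aid, the JDK's own java.lang.invoke.MethodType can produce the same strings; DescriptorDemo and its helper methods below are hypothetical names, and ASM's Type class offers comparable facilities.

```java
import java.lang.invoke.MethodType;

class DescriptorDemo {
    // Method descriptor, e.g. (Ljava/lang/String;)V for void m(String).
    static String methodDescriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    // Field descriptor: the return part of a no-argument method descriptor.
    static String fieldDescriptor(Class<?> type) {
        return MethodType.methodType(type).toMethodDescriptorString().substring(2);
    }

    // Internal name used for the owner argument (top-level classes).
    static String internalName(Class<?> type) {
        return type.getName().replace('.', '/');
    }

    public static void main(String[] args) {
        System.out.println(internalName(java.io.PrintStream.class));
        System.out.println(fieldDescriptor(java.io.PrintStream.class));
        System.out.println(methodDescriptor(void.class, String.class));
    }
}
```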
MethodVisitor
The ASM API for generating and transforming compiled methods is based on the MethodVisitor abstract class, which is returned by the visitMethod method of ClassVisitor. This class defines one method per bytecode instruction category, where the categories are based on the number and type of the instructions' arguments. These methods must be called in the following order:
visitAnnotationDefault? ( visitAnnotation | visitParameterAnnotation | visitAttribute )* ( visitCode ( visitTryCatchBlock | visitLabel | visitFrame | visitXxxInsn | visitLocalVariable | visitLineNumber )* visitMaxs )? visitEnd
Let’s look at an example of transforming an existing method to add a start and end log to the method.
- Prepare the class to be transformed; logging will be added before and after the body of its query method:
public class TestService {
    public void query(int param) {
        System.out.println("service handle...");
    }
}
- Override visitMethod in ClassVisitor:
public class MyClassVisitor extends ClassVisitor implements Opcodes {
    public MyClassVisitor(ClassVisitor cv) {
        super(ASM5, cv);
    }

    @Override
    public MethodVisitor visitMethod(int access, String name, String desc, String signature, String[] exceptions) {
        MethodVisitor methodVisitor = cv.visitMethod(access, name, desc, signature, exceptions);
        // Wrap every method except the constructor
        if (!name.equals("<init>") && methodVisitor != null) {
            methodVisitor = new MyMethodVisitor(methodVisitor);
        }
        return methodVisitor;
    }
}
The <init> method (the constructor) is filtered out; every other method is wrapped in MyMethodVisitor, which overrides methods of MethodVisitor;
- Extend MethodVisitor:
public class MyMethodVisitor extends MethodVisitor implements Opcodes {
    public MyMethodVisitor(MethodVisitor mv) {
        // Same API level as MyClassVisitor
        super(Opcodes.ASM5, mv);
    }

    @Override
    public void visitCode() {
        super.visitCode();
        // Print "start" at the beginning of the method
        mv.visitFieldInsn(GETSTATIC, "java/lang/System", "out", "Ljava/io/PrintStream;");
        mv.visitLdcInsn("start");
        mv.visitMethodInsn(INVOKEVIRTUAL, "java/io/PrintStream", "println", "(Ljava/lang/String;)V", false);
    }

    @Override
    public void visitInsn(int opcode) {
        if ((opcode >= Opcodes.IRETURN && opcode <= Opcodes.RETURN) || opcode == Opcodes.ATHROW) {
            // Print "end" before the method returns or throws
            mv.visitFieldInsn(GETSTATIC, "java/lang/System", "out", "Ljava/io/PrintStream;");
            mv.visitLdcInsn("end");
            mv.visitMethodInsn(INVOKEVIRTUAL, "java/io/PrintStream", "println", "(Ljava/lang/String;)V", false);
        }
        mv.visitInsn(opcode);
    }
}
visitCode is called before the method's instructions are visited, so it is the place to insert the start log. visitInsn has to check whether the opcode is a method return: a method normally executes a return instruction such as RETURN just before it returns, which reaches the visitor as mv.visitInsn(RETURN) and can be recognized from the opcode;
- View the newly generated class file after decompiling:
public class TestService {
    public TestService() {
    }

    public void query(int var1) {
        System.out.println("start");
        System.out.println("service handle...");
        System.out.println("end");
    }
}
Utility classes
Some utility classes are also provided for working with methods:
- LocalVariablesSorter: this method adapter renumbers the local variables used in a method in the order in which they appear in the method, and provides a newLocal method to create new local variables;
- AdviceAdapter: this method adapter is an abstract class that can be used to insert code at the beginning of a method and just before any RETURN or ATHROW instruction. Its main advantage is that it also works for constructors, where code must not be inserted at the very beginning of the constructor but after the call to the super constructor.
Usage scenarios
ASM is used in many projects. Here are two common usage scenarios: AOP and replacing reflection.
AOP
Aspect-oriented programming (AOP) is mainly used to solve system-level concerns in program development, such as logging, transactions, and permissions. Its key technique is the proxy, either dynamic or static, and there are several ways to implement it:
- AspectJ: static weaving, based on static proxies;
- JDK dynamic proxy: has two core classes, Proxy and InvocationHandler;
- Cglib dynamic proxy: wraps ASM and can dynamically generate new classes at run time; it is more powerful than the JDK dynamic proxy;
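For comparison with the ASM-based logging example earlier, here is a minimal sketch of the JDK dynamic proxy variant, using only Proxy and InvocationHandler. The names LoggingProxyDemo, QueryService and LOG are made up for illustration; the log is collected in a list rather than printed so that it can be inspected.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class LoggingProxyDemo {
    // JDK dynamic proxies can only proxy interfaces.
    interface QueryService {
        String query(int param);
    }

    // Collected log lines, kept in memory for this sketch.
    static final List<String> LOG = new ArrayList<>();

    // Wraps the target so that every call is logged before and after.
    static QueryService withLogging(QueryService target) {
        InvocationHandler handler = (proxy, method, args) -> {
            LOG.add("start " + method.getName());
            try {
                return method.invoke(target, args);
            } finally {
                LOG.add("end " + method.getName());
            }
        };
        return (QueryService) Proxy.newProxyInstance(
                QueryService.class.getClassLoader(),
                new Class<?>[] {QueryService.class},
                handler);
    }

    public static void main(String[] args) {
        QueryService service = withLogging(param -> "service handle " + param);
        System.out.println(service.query(1));
        System.out.println(LOG);
    }
}
```

Unlike the bytecode transformation, the original class is left untouched here; the logging lives in a separate proxy object that callers must use instead of the target.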
The Cglib dynamic proxy relies on ASM, and we have seen the bytecode enhancements of ASM in the example above.
Replacing reflection
FastJson is known for its speed, which comes in part from using ASM instead of Java reflection. There is also the ReflectASM package, which is designed specifically to replace Java reflection.
ReflectASM is a very small Java library that provides high-performance reflection through code generation. It generates access classes for getting and setting fields and calling methods; these access classes use bytecode manipulation instead of Java's reflection and are therefore very fast.
Take a look at a simple usage of ReflectASM:
TestBean testBean = new TestBean(1, "zhaohui", 18);
MethodAccess methodAccess = MethodAccess.get(TestBean.class);
String[] mns = methodAccess.getMethodNames();
// Invoke every method of the bean by name
for (int i = 0; i < mns.length; i++) {
    System.out.println(methodAccess.invoke(testBean, mns[i]));
}
Internally, ASM is used to generate a temporary TestBeanMethodAccess class that overrides the invoke method. Decompiled, that method looks like this:
public Object invoke(Object var1, int var2, Object... var3) {
    TestBean var4 = (TestBean)var1;
    switch(var2) {
    case 0:
        return var4.getName();
    case 1:
        return var4.getId();
    case 2:
        return var4.getAge();
    default:
        throw new IllegalArgumentException("Method not found: " + var2);
    }
}
As you can see, invoke consists of ordinary direct method calls, which is faster than going through Java reflection.
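The difference between the two approaches can be sketched by hand without ReflectASM. In the example below, invokeByIndex is a hand-written stand-in for what a generated access class does, and invokeByReflection is the java.lang.reflect equivalent. AccessorDemo and this TestBean are hypothetical, modelled on the snippet above; no timing claims are made here.

```java
import java.lang.reflect.Method;

class AccessorDemo {
    public static class TestBean {
        private final int id;
        private final String name;
        private final int age;

        public TestBean(int id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }

        public int getId() { return id; }
        public String getName() { return name; }
        public int getAge() { return age; }
    }

    // Index-based dispatch with ordinary method calls, as in the decompiled code.
    static Object invokeByIndex(TestBean bean, int index) {
        switch (index) {
            case 0: return bean.getName();
            case 1: return bean.getId();
            case 2: return bean.getAge();
            default: throw new IllegalArgumentException("Method not found: " + index);
        }
    }

    // The reflective equivalent: look the method up by name, then invoke it.
    static Object invokeByReflection(TestBean bean, String methodName) {
        try {
            Method method = TestBean.class.getMethod(methodName);
            return method.invoke(bean);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TestBean bean = new TestBean(1, "zhaohui", 18);
        System.out.println(invokeByIndex(bean, 0));
        System.out.println(invokeByReflection(bean, "getName"));
    }
}
```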
Reference documentation
asm4-guide.pdf
ASM4 Manual Chinese version