# 1. Preface

As the saying goes, "any technology divorced from business is a castle in the air." What first pushed me to look into bytecode instrumentation (the "piling" or "weaving" of extra code into compiled classes) was integrating a third-party statistics SDK (I won't name it here). Their pitch was that developers only need to apply their Gradle plugin to get full client-side statistics (page-open speed, method timing, all kinds of click events, and so on) without writing any tracking code by hand. At the time the delivery schedule was tight, so there was no time to study their implementation. During the Spring Festival holiday I couldn't hold back my curiosity any longer; by reading up and decompiling their code I finally traced the technique to its source: bytecode instrumentation. My company also happens to be building its own statistics system, and to avoid intruding on the existing project architecture I plan to implement it with bytecode instrumentation. So the purpose of this article is to share the pitfalls I hit during the pre-research phase, so that others can avoid them.

A brief description of what we’re going to do

Put simply, we want full client-side statistics without hand-written tracking code (so-called "buried points"). Statistics is a broad term here; common scenarios include:

  • Page-open events for an Activity or Fragment
  • Click events of all kinds, including but not limited to Click, LongClick, and TouchEvent
  • During debugging, the execution time of every method, including methods of integrated third-party SDKs
  • To be added

What do you need to know to implement these features?

  • Aspect Oriented Programming (AOP)
  • Android Packaging Process
  • Custom Gradle plugins
  • Bytecode weaving
  • Statistics code tailored to your own business
  • That's all for now…
2. Exploring the technical points
=================================

2.1 Technical point: what is AOP

AOP (short for Aspect-Oriented Programming) is a programming paradigm that is usually contrasted with OOP (Object-Oriented Programming). What we ultimately want is statistics, and large-scale, repetitive statistics code is a textbook use case for AOP. So it is worth understanding what AOP is and why to use it.

Let's start with the familiar object-oriented programming. Its hallmark features are inheritance, polymorphism, and encapsulation. Encapsulation means distributing functionality across different objects, which in software design is often called assignment of responsibilities. In effect, different classes are given different methods, so the code is spread across individual classes. The advantage is lower complexity and reusable classes.

But an inherent drawback of OOP is that spreading code out also increases duplication. Suppose I want to add logging statistics to every module in the project. With OOP thinking, we have to add statistics code to each module; with AOP thinking, we abstract the statistics into an aspect and write the statistics code only once, in the aspect.

On the server side AOP is already widely used, most notably in Spring, a landmark framework. My own first exposure to AOP came while teaching myself Spring. The most common way to implement AOP there is through proxies.
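To make the proxy idea concrete, here is a minimal sketch using the JDK's own `java.lang.reflect.Proxy`. It is not how Spring is implemented internally and it is not part of the demo project; `OrderService`, `SimpleOrderService`, and `withTiming` are hypothetical names made up for the illustration. The point is only that the timing statistics live in one place (the aspect) while the business class contains none.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

final class TimingAspectDemo {

    /** Hypothetical business interface; JDK proxies work on interfaces. */
    interface OrderService {
        String placeOrder(String item);
    }

    /** Plain implementation with no statistics code in it. */
    static final class SimpleOrderService implements OrderService {
        @Override
        public String placeOrder(String item) {
            return "ordered:" + item;
        }
    }

    /** Collected statistics, kept in memory so the demo is easy to inspect. */
    static final List<String> LOG = new ArrayList<>();

    /** Wraps a target so that every interface call is recorded with its duration. */
    static <T> T withTiming(T target, Class<T> iface) {
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            long start = System.nanoTime();
            try {
                return method.invoke(target, methodArgs);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the target method itself threw
            } finally {
                long micros = (System.nanoTime() - start) / 1_000;
                LOG.add(method.getName() + " took " + micros + "us");
            }
        };
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        OrderService service = withTiming(new SimpleOrderService(), OrderService.class);
        System.out.println(service.placeOrder("book"));
        System.out.println(LOG);
    }
}
```

A proxy only intercepts calls made through the interface at runtime. Bytecode instrumentation, the subject of this article, reaches the same goal by changing the compiled classes themselves, so it also works for classes we do not construct ourselves.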

2.2 Technical point: the Android packaging process

Since we want to use bytecode instrumentation to do away with hand-written tracking code, we need to understand Android's packaging process. How else would we know when the build produces the class files for us to instrument? The packaging flow chart on the official website is not very intuitive, so let's look at a more intuitive version of the build process.

After the "Java Compiler" step, the build produces .class files. These class files are then converted by the dex step into .dex files that Android can run. Since we want to instrument bytecode, we must hook the packaging process: scan and reweave the class bytecode before the dex step, then hand the woven class files on to the dex step. That is how tracking without hand-written code is achieved. So how do we know when the "Java Compiler" step has finished? This brings us to the next technical point: custom Gradle plugins.

2.3 Technical point: custom Gradle plugins

Continuing from section 2.2: how do we find out that the packaging system has finished the "Java Compiler" step? And even if we know the class files have been generated, how do we hook into the process so that the "dex" step runs only after our custom bytecode weaving? Starting with Android Gradle Plugin 1.5.0, Google officially provides the Transform API as the entry point for bytecode instrumentation. Put plainly, by writing a custom Gradle plugin and overriding the transform method of the Transform class, you get a callback after the "Java Compiler" step ends and before the "dex" step begins. That is the ideal moment to reweave the bytecode.

Resources on writing a custom Gradle plugin:

  • Writing a custom Gradle plugin in Android Studio
  • Official documentation on custom Gradle plugins
  • Embracing Android Studio, part 5: Gradle plugin development

Since this article focuses on the overall flow of bytecode instrumentation and aims to cover the technical points involved at a broad level, I won't expand on custom plugins here. The resources above are basically enough to get through the process of writing a custom Gradle plugin. If you run into trouble writing one, please contact me; if there is demand, I can write a separate tutorial on custom Gradle plugins. My email address is given at the end of the article.

Some resources to consider about Transform:

  • The official documentation
  • Didi's plugin framework VirtualAPK: its virtualapk-gradle-plugin project uses this instrumentation entry point to strip host resources out of the plugin's resources, preventing resource conflicts between the host APK and the plugin APK. See the StripClassAndResTransform class in that project.

2.4 Technical point: bytecode weaving

Knowledge of bytecode is the core technical point of this article.

2.4.1 What is Bytecode

Java bytecode is the instruction format executed by the Java Virtual Machine. Bytecode is stored in class files, which are produced by compiling source code with the javac command. A class file contains the JVM instructions and the symbol table, plus some other ancillary information. A class file is a binary stream based on 8-bit bytes. The data items are packed tightly in strict order with no separators in between, so almost everything stored in a class file is data the program needs to run.

JVM implementations differ from vendor to vendor, but all of them strictly follow the constraints of the Java Virtual Machine Specification, so a valid bytecode file can be executed correctly by any vendor's virtual machine. In the words of the book "Understanding the Java Virtual Machine": "Compiling code to bytecode rather than to native machine code is a small step in the evolution of storage formats, but a big step in the evolution of programming languages."
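To illustrate the "tightly packed binary stream" idea, here is a small sketch that reads the first eight bytes of a class file: a 4-byte magic number followed by two 2-byte version fields, all big-endian. The class name `ClassFileHeader` is my own, and the sketch deliberately stops before the constant pool.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ClassFileHeader {
    final int magic;
    final int minorVersion;
    final int majorVersion;

    private ClassFileHeader(int magic, int minorVersion, int majorVersion) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    /** Parses the fixed-size header at the start of a class file. */
    static ClassFileHeader parse(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            int magic = in.readInt();            // u4 magic, always 0xCAFEBABE
            int minor = in.readUnsignedShort();  // u2 minor_version
            int major = in.readUnsignedShort();  // u2 major_version
            if (magic != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            return new ClassFileHeader(magic, minor, major);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // First 8 bytes of a class compiled for Java 8 (major version 52).
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        ClassFileHeader h = parse(header);
        System.out.printf("magic=%08X major=%d minor=%d%n",
                h.magic, h.majorVersion, h.minorVersion);
    }
}
```

There are no field names or separators in the stream; the reader has to know the order and width of every item, which is exactly why libraries such as ASM exist.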

2.4.2 Content of bytecode

This diagram is an overview of Java bytecode. There are 10 parts in total, including the magic number, version number, constant pool, field table, and so on. Again, this article won't go into each of them; please refer to this blog post, and if you can, read the book "Understanding the Java Virtual Machine". I've read it twice now and every reading brings new insights. I recommend it to everyone; it is very good for your own growth.

A few important things about bytecode:

Fully qualified name

Class files refer to a class by its fully qualified name. The fully qualified name is easy to understand: take the class name and replace every "." with "/".

For example,

    android.widget.TextView

has the fully qualified name

    android/widget/TextView
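As a tiny sketch of that rule (the helper name `toInternalName` is mine), the conversion is a plain character replacement. Nested classes keep the `$` that already appears in their binary name.

```java
final class InternalNames {

    /** Converts a binary name such as "android.widget.TextView" to the class-file form. */
    static String toInternalName(String binaryName) {
        return binaryName.replace('.', '/');
    }

    /** Convenience overload for classes that are available on the classpath. */
    static String toInternalName(Class<?> type) {
        return toInternalName(type.getName());
    }

    public static void main(String[] args) {
        System.out.println(toInternalName("android.widget.TextView")); // android/widget/TextView
        System.out.println(toInternalName(java.util.Map.Entry.class)); // java/util/Map$Entry
    }
}
```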

Descriptors

A descriptor describes the data type of a field, or the parameter list (number, types, and order) and return value of a method. Under the descriptor rules, each primitive type (byte, char, double, float, int, long, short, boolean) and void, which stands for "no return value", is represented by a single uppercase character. An object type is represented by the character "L" followed by the fully qualified name of the class and a closing ";" that marks the end of the name. See the following table:

Character   Meaning
B           primitive type byte
C           primitive type char
D           primitive type double
F           primitive type float
I           primitive type int
J           primitive type long
S           primitive type short
Z           primitive type boolean
V           special type void
L           object type, for example Ljava/lang/Object;

For array types, each dimension is represented by one leading "[" character. For example, a two-dimensional array of String and a one-dimensional array of int:

    java.lang.String[][]   is represented as   [[Ljava/lang/String;
    int[]                  is represented as   [I

When a descriptor describes a method, the parameter list comes first and the return value second. The parameter types are placed, in order, inside a pair of parentheses "()". A few examples:

    void init()                   is described as   ()V
    void setText(String s)        is described as   (Ljava/lang/String;)V
    java.lang.String toString()   is described as   ()Ljava/lang/String;
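The rules above are mechanical enough to write down in a few lines. The following sketch builds descriptors from `Class` objects; the class `Descriptors` and its method names are my own. The JDK already ships the same logic for methods in `java.lang.invoke.MethodType#toMethodDescriptorString`, which `main` uses as a cross-check.

```java
import java.lang.invoke.MethodType;

final class Descriptors {

    /** Descriptor of a single type, following the table above. */
    static String of(Class<?> type) {
        if (type.isArray()) {
            return "[" + of(type.getComponentType()); // one "[" per dimension
        }
        if (type == void.class) return "V";
        if (type == boolean.class) return "Z";
        if (type == byte.class) return "B";
        if (type == char.class) return "C";
        if (type == short.class) return "S";
        if (type == int.class) return "I";
        if (type == long.class) return "J";
        if (type == float.class) return "F";
        if (type == double.class) return "D";
        // Object type: "L" + fully qualified name with "/" + ";"
        return "L" + type.getName().replace('.', '/') + ";";
    }

    /** Method descriptor: parameter types in parentheses, then the return type. */
    static String ofMethod(Class<?> returnType, Class<?>... parameterTypes) {
        StringBuilder sb = new StringBuilder("(");
        for (Class<?> parameterType : parameterTypes) {
            sb.append(of(parameterType));
        }
        return sb.append(')').append(of(returnType)).toString();
    }

    public static void main(String[] args) {
        System.out.println(of(String[][].class));               // [[Ljava/lang/String;
        System.out.println(of(int[].class));                    // [I
        System.out.println(ofMethod(void.class));               // ()V
        System.out.println(ofMethod(void.class, String.class)); // (Ljava/lang/String;)V
        System.out.println(ofMethod(String.class));             // ()Ljava/lang/String;
        // Cross-check against the JDK's own implementation.
        System.out.println(MethodType.methodType(void.class, String.class)
                .toMethodDescriptorString());                   // (Ljava/lang/String;)V
    }
}
```

Note where the ";" sits in a method descriptor: it closes each object type inside the parentheses, and nothing follows the return type "V".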

2.4.3 The VM bytecode execution engine

The execution engine is one of the core components of a virtual machine. To keep this post focused on the problem at hand, I will again avoid a lengthy discussion of the details. Let's concentrate on the Java runtime memory layout:

VM memory can be divided into heap memory and stack memory. Heap memory is shared by all threads, while stack memory is private to each thread. The following figure shows the runtime data areas of the VM.

Here we focus on stack memory. The Java virtual machine stack is thread-private and describes the memory model of Java method execution: every time a method is invoked, a stack frame is created to hold the local variable table, the operand stack, the dynamic link, the method return address, and some additional information. The lifetime of each method call, from invocation to completion, corresponds to one stack frame being pushed onto and popped off the virtual machine stack. The size of the local variable table and the maximum depth of the operand stack are already recorded in the Code attribute of the class file. So the amount of memory a stack frame needs is not affected by runtime data; it depends only on the specific virtual machine implementation. The chain of method calls in a thread can be very long, with many stack frames. For an executing thread, only the frame at the top of the stack is active. It is called the current stack frame, and the method associated with it is called the current method.

To explain the concepts in the diagram above:

  • Local variable table: a set of variable slots used to store the method's parameters and the local variables defined inside the method. Its capacity is measured in slots, the smallest unit, and slot indexes run from 0 up to the maximum slot number of the table. In a non-static method, slot 0 holds "this", the reference to the object the method was called on. The method's parameters are allocated from slot 1 onward, and the local variables defined in the method body are assigned after them. Take Android's click callback as an example:

    public void onClick(View v) {}

The local variable table of this method has these slots:

Slot   Value
0      this
1      View v

  • Operand stack: also called the operation stack, a last-in, first-out stack. When a method starts executing, its operand stack is empty. During execution, bytecode instructions push values onto the operand stack and pop them off again. For example, the bytecode instruction iadd (add two ints) requires that the two elements nearest the top of the operand stack are ints. When it executes, the two ints are popped and added, and the sum is pushed back onto the stack. For the full list of bytecode instructions, see Wikipedia or the article by the Chinese blogger Bazhang.
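To see the local variable table and the operand stack working together, here is a toy interpreter for five int instructions. It is only a teaching sketch under my own naming (`MiniFrame`, `Insn`, `Op`); a real JVM frame has many more instructions and typed slots. The program in `add` is the instruction sequence javac typically emits for `static int add(int a, int b) { int sum = a + b; return sum; }`. The method is static, so the parameters start at slot 0 instead of slot 1.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class MiniFrame {

    enum Op { ILOAD, ISTORE, ICONST, IADD, IRETURN }

    /** One instruction plus an optional operand (slot index or constant). */
    static final class Insn {
        final Op op;
        final int operand;

        private Insn(Op op, int operand) {
            this.op = op;
            this.operand = operand;
        }

        static Insn of(Op op) {
            return new Insn(op, 0);
        }

        static Insn of(Op op, int operand) {
            return new Insn(op, operand);
        }
    }

    /** Executes the code in a fresh frame with maxLocals slots and returns the IRETURN value. */
    static int run(int maxLocals, int[] args, Insn... code) {
        int[] locals = new int[maxLocals];
        System.arraycopy(args, 0, locals, 0, args.length); // parameters fill the first slots
        Deque<Integer> stack = new ArrayDeque<>();          // operand stack, empty at entry
        for (Insn insn : code) {
            switch (insn.op) {
                case ILOAD:
                    stack.push(locals[insn.operand]);
                    break;
                case ISTORE:
                    locals[insn.operand] = stack.pop();
                    break;
                case ICONST:
                    stack.push(insn.operand);
                    break;
                case IADD: {
                    int right = stack.pop();
                    int left = stack.pop();
                    stack.push(left + right); // the result goes back onto the stack
                    break;
                }
                case IRETURN:
                    return stack.pop();
                default:
                    throw new IllegalStateException("unknown instruction");
            }
        }
        throw new IllegalStateException("method ended without IRETURN");
    }

    /** iload_0, iload_1, iadd, istore_2, iload_2, ireturn */
    static int add(int a, int b) {
        return run(3, new int[] {a, b},
                Insn.of(Op.ILOAD, 0), Insn.of(Op.ILOAD, 1), Insn.of(Op.IADD),
                Insn.of(Op.ISTORE, 2), Insn.of(Op.ILOAD, 2), Insn.of(Op.IRETURN));
    }

    public static void main(String[] args) {
        System.out.println(add(2, 40)); // 42
    }
}
```

The `ALOAD 0` that appears later in TraceVisitor is the reference-typed cousin of `ILOAD 0`: it pushes slot 0, which holds "this" in a non-static method, onto the operand stack.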

2.4.4 Introduction to ASM in bytecode weaving

With the background above in place, we finally reach the last step: how do you actually weave bytecode? I chose a powerful open-source library called ASM.

What is ASM?

ASM is a Java bytecode manipulation framework. It can be used to dynamically generate classes or enhance the functionality of existing classes. ASM can generate binary class files directly or dynamically change class behavior before the class is loaded into the Java Virtual Machine. Java classes are stored in strictly formatted .class files that have enough metadata to parse all the elements in the class: class names, methods, attributes, and Java bytecode (instructions). After reading information from class files, ASM can change class behavior, analyze class information, and even generate new classes based on user requirements.

Why was ASM chosen for bytecode weaving?

Since others have already run the experiment, I did not benchmark the bytecode weaving libraries myself. Here are the results measured by the NetEase Lede team:

Framework   First run   Subsequent runs
Javassist   257         5.2
BCEL        473         5.5
ASM         62.4        1.1

As the table shows, ASM is the most efficient. The price of that efficiency is that its API sits much closer to the bytecode level, which is why the virtual machine knowledge above matters all the more.

I won't describe the library in more detail here; these resources are worth consulting:

  • The official documentation
  • IBM developerWorks article analyzing ASM

To get started with ASM quickly, I recommend a plugin called ASM Bytecode Outline; thanks to Bazhang's article for pointing it out. That's all on ASM; for more, refer to the project code or study the documentation yourself.

3. The project in practice
==========================

We will instrument onCreate and onDestroy of every Activity in the app, treating Activity startup as the aspect. I suggest cloning the demo project first.

3.1 Creating Gradle plug-ins

With the resources from section 2.3, I'm sure you can create a Gradle plugin and get the whole process running in no time. If it doesn't work, refer to the project source code.

Points to note:

Note 1:

The repository address used by the project must be changed to a path on your own machine, otherwise compilation will fail. The places to change are the LOCAL_REPO_URL property in traceplugin/gradle.properties

and the maven address in the project's build.gradle file.

3.2 Completing the custom plugin: adding the scanning and modification logic

TracePlugin.groovy in the demo project is the entry point for the scan. By overriding the transform method, we get the hook where class file processing is handed over to ASM.

    // Imports assumed from the APIs used below
    import com.android.build.api.transform.*
    import com.android.build.gradle.AppExtension
    import com.android.build.gradle.internal.pipeline.TransformManager
    import org.apache.commons.codec.digest.DigestUtils
    import org.apache.commons.io.FileUtils
    import org.apache.commons.io.IOUtils
    import org.gradle.api.Plugin
    import org.gradle.api.Project
    import org.objectweb.asm.ClassReader
    import org.objectweb.asm.ClassVisitor
    import org.objectweb.asm.ClassWriter

    import java.util.jar.JarEntry
    import java.util.jar.JarFile
    import java.util.jar.JarOutputStream
    import java.util.zip.ZipEntry

    import static org.objectweb.asm.ClassReader.EXPAND_FRAMES

    public class TracePlugin extends Transform implements Plugin<Project> {

        void apply(Project project) {
            def android = project.extensions.getByType(AppExtension)
            // Register this plugin as a transform: the instrumentation entry point
            android.registerTransform(this)
        }

        @Override
        public String getName() {
            return "TracePlugin"
        }

        @Override
        public Set<QualifiedContent.ContentType> getInputTypes() {
            return TransformManager.CONTENT_CLASS
        }

        @Override
        public Set<QualifiedContent.Scope> getScopes() {
            return TransformManager.SCOPE_FULL_PROJECT
        }

        @Override
        public boolean isIncremental() {
            return false
        }

        @Override
        void transform(Context context, Collection<TransformInput> inputs,
                       Collection<TransformInput> referencedInputs,
                       TransformOutputProvider outputProvider, boolean isIncremental)
                throws IOException, TransformException, InterruptedException {
            println '//===============TracePlugin visit start===============//'
            // Delete the previous output
            if (outputProvider != null) {
                outputProvider.deleteAll()
            }
            // Traverse the TransformInputs
            inputs.each { TransformInput input ->
                // Traverse the DirectoryInputs of this input
                input.directoryInputs.each { DirectoryInput directoryInput ->
                    if (directoryInput.file.isDirectory()) {
                        // Walk the directory recursively
                        directoryInput.file.eachFileRecurse { File file ->
                            def name = file.name
                            // Instrument class files, skipping R classes and BuildConfig
                            if (name.endsWith(".class") && !name.startsWith("R\$") &&
                                    !"R.class".equals(name) && !"BuildConfig.class".equals(name)) {
                                ClassReader classReader = new ClassReader(file.bytes)
                                ClassWriter classWriter = new ClassWriter(classReader, ClassWriter.COMPUTE_MAXS)
                                def className = name.substring(0, name.length() - ".class".length())
                                ClassVisitor cv = new TraceVisitor(className, classWriter)
                                classReader.accept(cv, EXPAND_FRAMES)
                                byte[] code = classWriter.toByteArray()
                                // Overwrite the class file in place with the woven bytecode
                                FileOutputStream fos = new FileOutputStream(
                                        file.parentFile.absolutePath + File.separator + name)
                                fos.write(code)
                                fos.close()
                            }
                        }
                    }
                    // After processing the input, pass the output on to the next task
                    def dest = outputProvider.getContentLocation(directoryInput.name,
                            directoryInput.contentTypes, directoryInput.scopes, Format.DIRECTORY)
                    FileUtils.copyDirectory(directoryInput.file, dest)
                }

                input.jarInputs.each { JarInput jarInput ->
                    // Rename the output: files with the same name would overwrite each other
                    def jarName = jarInput.name
                    def md5Name = DigestUtils.md5Hex(jarInput.file.getAbsolutePath())
                    if (jarName.endsWith(".jar")) {
                        jarName = jarName.substring(0, jarName.length() - 4)
                    }

                    File tmpFile = null
                    if (jarInput.file.getAbsolutePath().endsWith(".jar")) {
                        JarFile jarFile = new JarFile(jarInput.file)
                        Enumeration enumeration = jarFile.entries()
                        tmpFile = new File(jarInput.file.getParent() + File.separator + "classes_trace.jar")
                        // Remove a leftover file from the previous build
                        if (tmpFile.exists()) {
                            tmpFile.delete()
                        }
                        JarOutputStream jarOutputStream = new JarOutputStream(new FileOutputStream(tmpFile))
                        // Processor entries already written, used to skip duplicates
                        ArrayList<String> processorList = new ArrayList<>()
                        while (enumeration.hasMoreElements()) {
                            JarEntry jarEntry = (JarEntry) enumeration.nextElement()
                            String entryName = jarEntry.getName()
                            ZipEntry zipEntry = new ZipEntry(entryName)
                            //println "MeetyouCost entryName :" + entryName
                            InputStream inputStream = jarFile.getInputStream(jarEntry)
                            // Instrument class files, skipping R classes and BuildConfig
                            if (entryName.endsWith(".class") && !entryName.contains("R\$") &&
                                    !entryName.contains("R.class") && !entryName.contains("BuildConfig.class")) {
                                jarOutputStream.putNextEntry(zipEntry)
                                ClassReader classReader = new ClassReader(IOUtils.toByteArray(inputStream))
                                ClassWriter classWriter = new ClassWriter(classReader, ClassWriter.COMPUTE_MAXS)
                                def className = entryName.substring(0, entryName.length() - ".class".length())
                                ClassVisitor cv = new TraceVisitor(className, classWriter)
                                classReader.accept(cv, EXPAND_FRAMES)
                                byte[] code = classWriter.toByteArray()
                                jarOutputStream.write(code)
                            } else if (entryName.contains("META-INF/services/javax.annotation.processing.Processor")) {
                                if (!processorList.contains(entryName)) {
                                    processorList.add(entryName)
                                    jarOutputStream.putNextEntry(zipEntry)
                                    jarOutputStream.write(IOUtils.toByteArray(inputStream))
                                } else {
                                    println "duplicate entry:" + entryName
                                }
                            } else {
                                // Everything else is copied unchanged
                                jarOutputStream.putNextEntry(zipEntry)
                                jarOutputStream.write(IOUtils.toByteArray(inputStream))
                            }
                            jarOutputStream.closeEntry()
                        }
                        jarOutputStream.close()
                        jarFile.close()
                    }

                    // Pass the processed jar on to the next task
                    def dest = outputProvider.getContentLocation(jarName + md5Name,
                            jarInput.contentTypes, jarInput.scopes, Format.JAR)
                    if (tmpFile == null) {
                        FileUtils.copyFile(jarInput.file, dest)
                    } else {
                        FileUtils.copyFile(tmpFile, dest)
                        tmpFile.delete()
                    }
                }
            }
            println '//===============TracePlugin visit end===============//'
        }
    }
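The jar branch of the plugin can be tried out without Gradle or ASM. The sketch below is my own simplification, not code from the demo project: it works on in-memory archives with `java.util.zip`, filters on the simple file name (the same rule the plugin applies to directory inputs), and takes the class transformation as a `UnaryOperator<byte[]>`, which in the real plugin would be the ClassReader/ClassWriter round trip. The names `JarRewriter`, `read`, `write`, and `rewrite` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class JarRewriter {

    /** Instrument class files only, skipping R classes and BuildConfig. */
    static boolean shouldInstrument(String entryName) {
        String simpleName = entryName.substring(entryName.lastIndexOf('/') + 1);
        return simpleName.endsWith(".class")
                && !simpleName.startsWith("R$")
                && !simpleName.equals("R.class")
                && !simpleName.equals("BuildConfig.class");
    }

    /** Reads every entry of a jar (a zip archive) into memory, keeping the order. */
    static Map<String, byte[]> read(byte[] jarBytes) {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(jarBytes))) {
            ZipEntry entry;
            byte[] chunk = new byte[8192];
            while ((entry = in.getNextEntry()) != null) {
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                int n;
                while ((n = in.read(chunk)) != -1) {
                    content.write(chunk, 0, n);
                }
                entries.put(entry.getName(), content.toByteArray()); // duplicates collapse
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }

    /** Writes the entries back out as a new archive. */
    static byte[] write(Map<String, byte[]> entries) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Map.Entry<String, byte[]> e : entries.entrySet()) {
                zip.putNextEntry(new ZipEntry(e.getKey()));
                zip.write(e.getValue());
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Applies classTransformer to instrumentable classes and copies everything else. */
    static byte[] rewrite(byte[] jarBytes, UnaryOperator<byte[]> classTransformer) {
        Map<String, byte[]> result = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> e : read(jarBytes).entrySet()) {
            byte[] content = shouldInstrument(e.getKey())
                    ? classTransformer.apply(e.getValue())
                    : e.getValue();
            result.put(e.getKey(), content);
        }
        return write(result);
    }

    public static void main(String[] args) {
        Map<String, byte[]> input = new LinkedHashMap<>();
        input.put("com/example/Foo.class", new byte[] {1});
        input.put("com/example/R$id.class", new byte[] {2});
        // Stand-in transformer; the plugin would run ASM here.
        byte[] rewritten = rewrite(write(input), original -> new byte[] {42});
        for (Map.Entry<String, byte[]> e : read(rewritten).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue()[0]);
        }
    }
}
```

Keeping the transformer as a parameter makes the archive handling easy to test on its own, separately from the ASM visitor.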

The TracePlugin.groovy file above hands the bytecode over to ASM. How is the bytecode actually modified? Create a visitor class that extends ClassVisitor:

  • Override the visit method to filter which classes need to be instrumented, for example all classes that extend Activity.
  • Override the visitMethod method to filter which methods of the current class need to be instrumented, for example all onCreate methods. See the comments in the code for details:
    import org.objectweb.asm.AnnotationVisitor;
    import org.objectweb.asm.ClassVisitor;
    import org.objectweb.asm.MethodVisitor;
    import org.objectweb.asm.Opcodes;
    import org.objectweb.asm.commons.AdviceAdapter;

    public class TraceVisitor extends ClassVisitor {

        /** Class name */
        private String className;

        /** Superclass name */
        private String superName;

        /** Interfaces implemented by this class */
        private String[] interfaces;

        public TraceVisitor(String className, ClassVisitor classVisitor) {
            super(Opcodes.ASM5, classVisitor);
        }

        /**
         * Callback when ASM visits a method of the class.
         *
         * @param name method name
         * @param desc method descriptor
         */
        @Override
        public MethodVisitor visitMethod(final int access, final String name,
                                         final String desc, final String signature, String[] exceptions) {
            MethodVisitor methodVisitor = cv.visitMethod(access, name, desc, signature, exceptions);

            methodVisitor = new AdviceAdapter(Opcodes.ASM5, methodVisitor, access, name, desc) {

                private boolean isInject() {
                    // Intercept the method if the superclass is AppCompatActivity.
                    // In a real application you could match your own BaseActivity instead.
                    // superName is null for classes without a superclass, so check it first.
                    return superName != null && superName.contains("AppCompatActivity");
                }

                @Override
                public void visitCode() {
                    super.visitCode();
                }

                @Override
                public AnnotationVisitor visitAnnotation(String desc, boolean visible) {
                    return super.visitAnnotation(desc, visible);
                }

                @Override
                public void visitFieldInsn(int opcode, String owner, String name, String desc) {
                    super.visitFieldInsn(opcode, owner, name, desc);
                }

                /** Called at method entry: weave in the call to TraceUtil. */
                @Override
                protected void onMethodEnter() {
                    if (isInject()) {
                        if ("onCreate".equals(name)) {
                            mv.visitVarInsn(ALOAD, 0);
                            mv.visitMethodInsn(INVOKESTATIC,
                                    "will/github/com/androidaop/traceutils/TraceUtil",
                                    "onActivityCreate",
                                    "(Landroid/app/Activity;)V", false);
                        } else if ("onDestroy".equals(name)) {
                            mv.visitVarInsn(ALOAD, 0);
                            mv.visitMethodInsn(INVOKESTATIC,
                                    "will/github/com/androidaop/traceutils/TraceUtil",
                                    "onActivityDestroy",
                                    "(Landroid/app/Activity;)V", false);
                        }
                    }
                }

                /** Called at method exit. */
                @Override
                protected void onMethodExit(int opcode) {
                    super.onMethodExit(opcode);
                }
            };
            return methodVisitor;
        }

        /**
         * Callback when ASM enters a class.
         *
         * @param name      class name
         * @param superName superclass name
         */
        @Override
        public void visit(int version, int access, String name, String signature,
                          String superName, String[] interfaces) {
            super.visit(version, access, name, signature, superName, interfaces);
            this.className = name;
            this.superName = superName;
            this.interfaces = interfaces;
        }
    }

Note:

If you are not comfortable writing ASM code, don't forget the ASM Bytecode Outline plugin. The code inside onMethodEnter in TraceVisitor.java above was copied directly from ASM Bytecode Outline. The plugin is introduced in section 2.4.4.

3.3 Completing the statistics utility to collect the final data

In the demo project, the TraceUtil.java class under app/ does the statistics. In the demo I simply pop up a Toast when onCreate and onDestroy are executed. The TraceUtil.java code is as follows:

    import android.app.Activity;
    import android.widget.Toast;

    /**
     * Created by will on 2018/3/9.
     */
    public class TraceUtil {
        private static final String TAG = "TraceUtil";

        /**
         * Called when an Activity executes onCreate.
         *
         * @param activity the Activity being created
         */
        public static void onActivityCreate(Activity activity) {
            Toast.makeText(activity,
                    activity.getClass().getName() + " call onCreate",
                    Toast.LENGTH_LONG).show();
        }

        /**
         * Called when an Activity executes onDestroy.
         *
         * @param activity the Activity being destroyed
         */
        public static void onActivityDestroy(Activity activity) {
            Toast.makeText(activity,
                    activity.getClass().getName() + " call onDestroy",
                    Toast.LENGTH_LONG).show();
        }
    }

At this point you may ask: when do TraceUtil's onActivityCreate and onActivityDestroy actually get executed? Through the calls woven in by TraceVisitor's visitMethod, of course.
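It can help to picture the woven result at source level. The sketch below shows what an instrumented Activity effectively behaves like after weaving. To keep it runnable on a plain JVM, `Activity` and `TraceUtil` here are simplified stand-ins written for this illustration (the real ones are `android.app.Activity` and the demo's TraceUtil, which shows a Toast), and `MainActivity` is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.List;

final class WovenEquivalentDemo {

    /** Stand-in for android.app.Activity. */
    static class Activity {
        protected void onCreate() {}
        protected void onDestroy() {}
    }

    /** Stand-in for TraceUtil: records events instead of showing a Toast. */
    static final class TraceUtil {
        static final List<String> EVENTS = new ArrayList<>();

        static void onActivityCreate(Activity activity) {
            EVENTS.add(activity.getClass().getSimpleName() + " call onCreate");
        }

        static void onActivityDestroy(Activity activity) {
            EVENTS.add(activity.getClass().getSimpleName() + " call onDestroy");
        }
    }

    /** What the class effectively looks like after weaving. */
    static class MainActivity extends Activity {
        @Override
        protected void onCreate() {
            TraceUtil.onActivityCreate(this); // woven in: ALOAD 0 + INVOKESTATIC
            super.onCreate();                 // the original method body starts here
        }

        @Override
        protected void onDestroy() {
            TraceUtil.onActivityDestroy(this); // woven in: ALOAD 0 + INVOKESTATIC
            super.onDestroy();
        }
    }

    public static void main(String[] args) {
        MainActivity activity = new MainActivity();
        activity.onCreate();
        activity.onDestroy();
        System.out.println(TraceUtil.EVENTS);
    }
}
```

The source file of the real Activity never contains these calls; they exist only in the class files that reach the dex step.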

Run the demo and enjoy

Project code

Running the project shows that the statistics code has been injected successfully.

4. Other tips
=============

  • The instrumentation above applies across the whole application. What if we want to instrument only one method? Say I only want to instrument onCreate in MainActivity and not in other activities. Custom annotations can solve this: define an annotation, put it on the methods you want to track, and override the visitAnnotation method of ASM's MethodVisitor (for example the AdviceAdapter used in TraceVisitor) to decide whether to instrument. For how to define custom annotations, see my blog post.
  • What if you want several kinds of instrumentation? For example, I want to track both Activity lifecycle methods and View click events. To be honest, I don't have much experience here and my approach is fairly crude: in the ClassVisitor I check the name of the current class, the name of its superclass, the interfaces it implements, and the names of its methods, which gets rather bloated. If you have a better idea, please leave a comment or contact me.
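One possible way to tame the bloat from the last tip, offered only as a sketch and not something the demo project does, is to move the checks into a table of rules and let the visitor ask the table which hook to weave in. Everything here is hypothetical naming (`HookRules`, `Rule`, `findHook`, and the `onViewClick` hook); the inputs are the same values ASM passes to `visit` and `visitMethod`.

```java
import java.util.ArrayList;
import java.util.List;

final class HookRules {

    /** One rule: which class and method to match, and which static hook to call. */
    static final class Rule {
        final String superNamePart; // must occur in the superclass name; null = any
        final String interfaceName; // interface the class must implement; null = any
        final String methodName;
        final String methodDesc;
        final String hookMethod;

        Rule(String superNamePart, String interfaceName,
             String methodName, String methodDesc, String hookMethod) {
            this.superNamePart = superNamePart;
            this.interfaceName = interfaceName;
            this.methodName = methodName;
            this.methodDesc = methodDesc;
            this.hookMethod = hookMethod;
        }

        boolean matches(String superName, String[] interfaces, String name, String desc) {
            if (!methodName.equals(name) || !methodDesc.equals(desc)) {
                return false;
            }
            if (superNamePart != null
                    && (superName == null || !superName.contains(superNamePart))) {
                return false;
            }
            if (interfaceName == null) {
                return true;
            }
            if (interfaces != null) {
                for (String candidate : interfaces) {
                    if (interfaceName.equals(candidate)) {
                        return true;
                    }
                }
            }
            return false;
        }
    }

    static final List<Rule> RULES = new ArrayList<>();

    static {
        RULES.add(new Rule("AppCompatActivity", null,
                "onCreate", "(Landroid/os/Bundle;)V", "onActivityCreate"));
        RULES.add(new Rule("AppCompatActivity", null,
                "onDestroy", "()V", "onActivityDestroy"));
        // Hypothetical click hook; it would need a matching method in TraceUtil.
        RULES.add(new Rule(null, "android/view/View$OnClickListener",
                "onClick", "(Landroid/view/View;)V", "onViewClick"));
    }

    /** Returns the hook method to weave in, or null if the method is not instrumented. */
    static String findHook(String superName, String[] interfaces, String name, String desc) {
        for (Rule rule : RULES) {
            if (rule.matches(superName, interfaces, name, desc)) {
                return rule.hookMethod;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(findHook("android/support/v7/app/AppCompatActivity",
                new String[0], "onCreate", "(Landroid/os/Bundle;)V"));
        System.out.println(findHook("java/lang/Object",
                new String[] {"android/view/View$OnClickListener"},
                "onClick", "(Landroid/view/View;)V"));
    }
}
```

With a table like this, adding a new kind of statistic means adding one rule rather than another branch inside the visitor. Matching on the descriptor as well as the name also avoids instrumenting an unrelated overload.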

Closing remarks

Because this post touches on many topics, some parts are not fully developed and the writing is rough in places. If you find any problems, please point them out so we can learn and improve together.

References

  • Bytecode instrumentation for Android AOP
  • ASM 4.0: A Java bytecode engineering library
  • Hands-on: adding bytecode to insert code into a method body
  • ASM in practice: measuring method execution time
  • A powerful tool for AOP: an introduction to ASM 3.0
  • Dynamically modifying the bytecode of Java methods with ASM
  • A first implementation of Android ASM instrumentation
  • Most important of all: "Understanding the Java Virtual Machine"

About Me

Contact   Value
mail      [email protected]
wechat    W2006292
github    https://github.com/weixinjie
blog      https://juejin.im/user/57673c83207703006bb92bf6