Core pages on our app's main flows repeatedly regressed in fluency across version iterations, often because developers paid little attention to jank; fundamentally, we lacked the necessary monitoring and sustainable optimization means for fluency. This series summarizes our practice of monitoring and optimizing App fluency, and I hope it can serve as a small reference for you.

Of course, optimizing App memory to reduce memory churn can also significantly improve smoothness. For memory optimization, refer to the previous article: Practicing App Memory Optimization: How to Do Memory Analysis and Optimization in Order

The whole series will consist of the following parts:

  1. Analyzing jank and the View drawing process

    This part covers a lot of material; it mainly analyzes the whole process at the source-code level, and it is the foundation for our fluency monitoring and optimization

  2. How do I monitor and display the real-time frame rate during the Debug phase

    Based on the above principles, we design a tool that displays the real-time frame rate, which helps us detect problems effectively during development

  3. How to automate fluency tests

    Implement a fluency UI automation test, run the UI automation before going live and email a fluency report to the relevant staff

  4. On-line user fluency monitoring scheme

    It reflects real users' perceived fluency in real time, and the large volume of online data is sensitive enough to reveal fluency changes between released versions

  5. Implement a tool to facilitate troubleshooting of time-consuming methods

    Use a custom Gradle Plugin + ASM instrumentation to find time-consuming methods quickly and accurately, and optimize them in a targeted way

  6. Share some experiences to improve app fluency

    Share some low-cost, high-benefit solutions to improve fluency


   

If you want to do a good job, you must first sharpen your tools. So today I will first share a convenient tool for quickly locating time-consuming methods while optimizing page fluency. After all, keeping the main thread smooth and avoiding the execution of time-consuming methods on the main thread is always the most direct way to reduce jank. As long as we can identify time-consuming methods quickly and conveniently, we can optimize them in a targeted way.

Implement a tool to facilitate troubleshooting of time-consuming methods

The familiar tools for investigating Android jank include TraceView, Systrace, and so on, and they generally fall into two modes: instrument and sample. Whichever mode such a tool uses, it has shortcomings. Instrument mode records the call flow of every function; the information is rich, but the overhead is so large that the measured times no longer match reality. Sample mode, on the other hand, analyzes by sampling, so the richness of the information drops. Systrace, for example, is a sample-type tool and can only monitor the time spent in certain system calls.

Besides the well-known tools above, Jake Wharton also implemented a method-timing tool, Hugo. It is annotation-driven: add a specific annotation to a method, and it prints that method's elapsed time and other information. But if we want to screen for high-cost methods across the whole app, annotating every single method is clearly impractical.

So what kind of tool do we need for jank optimization?

  • It should conveniently collect the elapsed time of all methods
  • It should have little impact on performance, so the measured times accurately reflect each method's real cost
  • It should support duration filtering, thread filtering, method-name search, and similar functions, so high-cost main-thread methods can be found quickly

The first idea for implementing such a tool is compile-time instrumentation: instrument every method during compilation so that timing code runs at each method's entry and exit, counting each method's cost in a way that barely affects performance. Once the timing data for every method is collected, we implement a UI to display it, with duration filtering, thread filtering, and method-name search, so that high-cost main-thread methods can be found quickly and optimized in a targeted way.
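
To make the idea concrete, here is a minimal plain-Java sketch of what a method conceptually looks like after instrumentation, together with a hypothetical stand-in for the injected timing class. All names here are illustrative, and the real tool injects the calls at the bytecode level rather than via try/finally:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the injected timing ("beat") class.
class TraceBeat {
    private static final ThreadLocal<Deque<Long>> STARTS =
            ThreadLocal.withInitial(ArrayDeque::new);

    static void start(String method) {
        STARTS.get().push(System.nanoTime());   // injected at method entry
    }

    static long end(String method) {            // injected at every method exit
        return (System.nanoTime() - STARTS.get().pop()) / 1_000_000;
    }
}

public class InstrumentedShape {
    // What an instrumented method conceptually becomes: the original body,
    // bracketed by start()/end() calls (try/finally used here only for clarity).
    static int fib(int n) {
        TraceBeat.start("InstrumentedShape.fib");
        try {
            return n < 2 ? n : fib(n - 1) + fib(n - 2);
        } finally {
            TraceBeat.end("InstrumentedShape.fib");
        }
    }

    public static void main(String[] args) {
        System.out.println(fib(10)); // prints 55
    }
}
```

A per-thread stack of start timestamps keeps nested and recursive calls paired correctly, which is why the sketch uses a ThreadLocal Deque rather than a single variable.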

1. Effect preview

Let’s take a look at the final implementation preview:

It outputs the elapsed time of every method, highlights high-cost methods in red, and supports duration filtering, thread filtering, method-name search, and so on. For example, filtering for main-thread methods that take more than 50ms makes them very easy to find.

Detailed integration and usage documentation can be found at: MethodTraceMan

2. Technology selection

In fact, instrumentation is everywhere in everyday development; it helps us implement many complex features reliably. For example, ButterKnife, Protocol Buffers, and others generate code during compilation. There are many different instrumentation technologies: ButterKnife uses APT to process Java files at the start of compilation, while AspectJ, ASM, and others operate on bytecode after the Java files have been compiled into .class files. There are also frameworks that manipulate dex files after the bytecode has been compiled into dex. Since our requirement is to add timing at the entry and exit of every method at compile time, the technical selection is locked to operating on bytecode files. So let's compare the two bytecode instrumentation frameworks, AspectJ and ASM:

1. AspectJ

AspectJ is a mature bytecode-manipulation framework. Its advantage is ease of use: it can be integrated into a project without any knowledge of bytecode, and simple rules are enough to complete the instrumentation. For example, instrumenting the entry and exit of all methods is as simple as:

@Before("execution(* *(..))")
public void beforeMethod(JoinPoint joinPoint) {
    // TODO: record the start time here
}

@After("execution(* *(..))")
public void afterMethod(JoinPoint joinPoint) {
    // TODO: record the end time and compute the cost here
}

Of course, AspectJ's advantages come with downsides. Because it is rule-based, its pointcuts are relatively fixed, so it offers less freedom in manipulating bytecode and less control during development. We want to insert only the code we need, but AspectJ generates extra wrapper code, which affects both performance and package size.

2. ASM

The ASM bytecode framework is extremely powerful and can achieve basically anything: its freedom in manipulating bytecode and the degree of control it gives are both high. It is harder to get started with than AspectJ and requires a decent understanding of Java bytecode, but ASM provides a visitor pattern for accessing bytecode files, which makes many bytecode operations relatively simple to implement. At the same time, ASM injects exactly and only the code we want, generating no extra wrapper code, so its performance impact is minimal.

With that comparison done, here is a quick introduction to Java bytecode:

Java bytecode

For example, let's write a simple Test.java file:

public class Test {
    private int m = 1;

    public int add() {
        int j = 2;
        int k = m + j;
        return k;
    }
}

Then we compile it with javac -g Test.java to get Test.class; opened in a hex-capable editor it looks like this:

You can see a bunch of hexadecimal numbers, but they are grouped in strict order into ten parts: the magic number cafe babe, the Java version number, the constant pool, the access flags, the current class index, the superclass index, the interface indexes, the field table, the method table, and additional attributes. These parts, expressed in hexadecimal and tightly joined together, form the class bytecode file seen above.
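
This fixed header layout can be sanity-checked in a few lines of plain Java: a small sketch (assuming the class is loaded from an ordinary .class file on the classpath) that reads its own class file and decodes the fields that open the structure:

```java
import java.io.DataInputStream;
import java.io.InputStream;

// Reads this class's own .class file and decodes the first three fields of
// the structure described above: magic number, minor version, major version.
public class ClassHeader {
    public static String readHeader() throws Exception {
        try (InputStream in = ClassHeader.class.getResourceAsStream("ClassHeader.class");
             DataInputStream data = new DataInputStream(in)) {
            int magic = data.readInt();           // always 0xCAFEBABE
            int minor = data.readUnsignedShort();
            int major = data.readUnsignedShort(); // e.g. 52 = Java 8, 61 = Java 17
            return String.format("magic=%08X major=%d minor=%d", magic, major, minor);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readHeader()); // e.g. magic=CAFEBABE major=61 minor=0
    }
}
```

Right after the version comes the constant pool, whose entries are variable-length, which is why real parsers (and ASM itself) walk the file sequentially rather than by fixed offsets.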

Of course, the hexadecimal file above is not readable, so we can disassemble it with javap -verbose Test. If you are interested, you can match the output against the ten parts above. Since bytecode instrumentation usually concerns the method table, let's focus on it. Here is the disassembled add() method:

As you can see, there are three parts:

  1. Code: this section holds the method's JVM instruction opcodes and is the most important part, because the logic in our method is actually carried out by these opcodes. You can see that our add method is implemented with nine opcodes. Instrumentation mainly operates on this part: as long as you can modify the instructions, you can manipulate any code.
  2. LineNumberTable: the table of line numbers, mapping Java source lines to instruction offsets. For example, our add method's source spans three lines, line 10, line 11, and line 12 in the figure above, each corresponding to a range of instruction offsets. With this mapping we can implement features such as debugging: when an instruction executes, we can locate the source position it corresponds to.
  3. LocalVariableTable: the table of local variables, including this and the method's locals. As the figure shows, the add method has three local variables: this, j, and k.

Since the JVM instruction set is stack-based, and we have seen that the add method's logic compiles into nine instruction opcodes in the class file, let's look at how these opcodes cooperate with the operand stack, the local variable table, and the constant pool to execute the add method's logic:

Execute 9 instruction opcodes in sequence:

  • 0: push the constant 2
  • 1: store 2 into j in the local variable table
  • 2, 3: fetch the field m (via its constant pool reference) and push it
  • 6: push j from the local variable table
  • 7, 8: add m and j, then store the result into k in the local variable table
  • 9, 10: push k from the local variable table and return it
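
The steps above can be sketched as a toy simulation in plain Java, with an explicit operand stack and local-variable slots standing in for the JVM's. This is purely illustrative; the real execution happens inside the VM:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A toy walkthrough of the nine opcodes compiled from Test.add(), simulated
// with an explicit operand stack and local-variable slots (field m is passed
// in as a parameter here, since we have no real object).
public class AddOpcodeWalkthrough {
    public static int simulateAdd(int m) {
        Deque<Integer> operandStack = new ArrayDeque<>();
        int[] locals = new int[3];          // slot 0 = this (unused here), 1 = j, 2 = k

        operandStack.push(2);               // 0: iconst_2          push the constant 2
        locals[1] = operandStack.pop();     // 1: istore_1          store it into j
        operandStack.push(m);               // 2, 3: aload_0 + getfield  push field m
        operandStack.push(locals[1]);       // 6: iload_1           push j
        int b = operandStack.pop();
        int a = operandStack.pop();
        operandStack.push(a + b);           // 7: iadd              add the top two values
        locals[2] = operandStack.pop();     // 8: istore_2          store the sum into k
        operandStack.push(locals[2]);       // 9: iload_2           push k
        return operandStack.pop();          // 10: ireturn          return top of stack
    }

    public static void main(String[] args) {
        System.out.println(simulateAdd(1)); // matches Test.add() with m = 1, prints 3
    }
}
```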

All right, that is the brief introduction to Java bytecode; it gives us a basic picture of the structure of a bytecode file and how our code executes after compilation. Instrumentation essentially works by manipulating these instructions. Once you can use the ASM API to visit and operate on bytecode, you have great freedom to generate, modify, and transform it, which enables very powerful functionality.

Gradle Plugin + Transform

As for the choice of instrumentation framework, the comparison above settles on ASM. However, ASM is only responsible for manipulating bytecode; we also need to intervene in the compilation process by defining a Gradle Plugin, so that we can obtain all class files and JAR packages during compilation, iterate over them, and modify their bytecode with ASM to achieve the instrumentation.

Our first idea for intervening in the compilation process is to hook the class-to-dex step: obtain all the class files before they are converted to dex, instrument these bytecode files with ASM, and then feed the processed files to the transformClassesWithDex task as its input. The advantage of this approach is that it is easy to reason about: because we grab the bytecode just before transformClassesWithDex, we know for certain that we are operating on the final bytecode. The downside is that if the project enables obfuscation, the bytecode obtained just before transformClassesWithDex is already obfuscated, so we would need the mapping file to find the correct places to instrument with ASM.

Fortunately, Gradle provides another way to intervene in the compilation pipeline: Transform. In fact, looking at the source of the Gradle build process, we find that several familiar features are implemented as Transforms. One more point concerns obfuscation: as mentioned above, hooking the transformClassesWithDex task runs into trouble when obfuscation is enabled. Does a Transform have the same problem? Let's look at the Gradle source code for the answer:

We start from the createCompileTask() method of the com.android.build.gradle.internal.TaskManager class, which is obviously a method that creates compile tasks:

protected void createCompileTask(@NonNull VariantScope variantScope) {
        // Create a task to compile Java files into class files
        JavaCompile javacTask = createJavacTask(variantScope);
        addJavacClassesStream(variantScope);
        setJavaCompilerTask(javacTask, variantScope);
        
        // Create some additional tasks to perform after compiling into class files, such as some Transform, etc
        createPostCompilationTasks(variantScope);
    }

Next, look at the createPostCompilationTasks() method. It is fairly long, so only the important parts are quoted below:

public void createPostCompilationTasks(@NonNull final VariantScope variantScope) {
    ......
    TransformManager transformManager = variantScope.getTransformManager();
    ......
    // ----- External Transforms ----- (our custom Transforms are applied here)
    // apply all the external transforms.
    List<Transform> customTransforms = extension.getTransforms();
    List<List<Object>> customTransformsDependencies = extension.getTransformsDependencies();
    ......
    // ----- Minify next ----- (this is the Transform that obfuscates the code)
    CodeShrinker shrinker = maybeCreateJavaCodeShrinkerTransform(variantScope);
    ......
}

Many other Transforms in this method are omitted here. Let's focus only on our custom registered Transforms and the obfuscation Transform. From the code above, our custom Transforms are added to the TransformManager before the obfuscation Transform, so they execute before obfuscation happens. In other words, instrumenting code through a custom Transform is not affected by obfuscation.

So our final technical solution is Gradle Plugin + Transform + ASM. Below we walk through the concrete implementation of this solution.

3. Concrete implementation

Only the key implementation steps are picked out here; for details, see the source code — the project's GitHub address is given at the end of the article.

Customize Gradle Plugin

A custom Gradle plugin extends the Plugin class and does its setup in the apply method. Our apply method is very simple: create a custom extension for configuration, then register our custom Transform:

@Override
void apply(Project project) {
    println '*****************MethodTraceMan Plugin apply*********************'
    project.extensions.create("traceMan", TraceManConfig)

    def android = project.extensions.getByType(AppExtension)
    android.registerTransform(new TraceManTransform(project))
}

Custom Transform implementation

Here we create an extension named traceMan, through which users of the plugin can configure the instrumentation scope, whether instrumentation is enabled, and so on, according to their own needs.
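
For illustration, a consuming module's configuration might look roughly like this. This is a sketch based on the open and output fields used in this article; the plugin id and the scope property are assumptions, so check the MethodTraceMan documentation for the real names:

```groovy
// app/build.gradle -- illustrative only; the plugin id and the
// traceConfigFile property are assumptions, not confirmed names.
apply plugin: 'method-trace-man'

traceMan {
    open = true                                        // enable instrumentation for this build
    output = "${buildDir}/traceman_output"             // where timing data is written
    traceConfigFile = "${projectDir}/traceman.config"  // which packages/classes to instrument
}
```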

Let’s take a look at the implementation of TraceManTransform:

public void transform(TransformInvocation transformInvocation) throws TransformException, InterruptedException, IOException {
        println '[MethodTraceMan]: transform()'
        def traceManConfig = project.traceMan
        String output = traceManConfig.output
        if (output == null || output.isEmpty()) {
            traceManConfig.output = project.getBuildDir().getAbsolutePath() + File.separator + "traceman_output"
        }

        if (traceManConfig.open) {
            // Read the configuration
            Config traceConfig = initConfig()
            traceConfig.parseTraceConfigFile()

            Collection<TransformInput> inputs = transformInvocation.inputs
            TransformOutputProvider outputProvider = transformInvocation.outputProvider
            if (outputProvider != null) {
                outputProvider.deleteAll()
            }

            // Iterate over the class directories and the JAR packages
            inputs.each { TransformInput input ->
                input.directoryInputs.each { DirectoryInput directoryInput ->
                    traceSrcFiles(directoryInput, outputProvider, traceConfig)
                }

                input.jarInputs.each { JarInput jarInput ->
                    traceJarFiles(jarInput, outputProvider, traceConfig)
                }
            }
        }
    }

Instrumenting with ASM

Next, let's look at how the ASM visitor pattern is used to instrument each class file found during the traversal:

static void traceSrcFiles(DirectoryInput directoryInput, TransformOutputProvider outputProvider, Config traceConfig) {
        if (directoryInput.file.isDirectory()) {
            directoryInput.file.eachFileRecurse { File file ->
                def name = file.name
                // Decide from the configured instrumentation scope whether this class file needs processing
                if (traceConfig.isNeedTraceClass(name)) {
                    // Use the ASM API to visit the class file
                    ClassReader classReader = new ClassReader(file.bytes)
                    ClassWriter classWriter = new ClassWriter(classReader, ClassWriter.COMPUTE_MAXS)
                    ClassVisitor cv = new TraceClassVisitor(Opcodes.ASM5, classWriter, traceConfig)
                    classReader.accept(cv, EXPAND_FRAMES)
                    byte[] code = classWriter.toByteArray()
                    FileOutputStream fos = new FileOutputStream(
                            file.parentFile.absolutePath + File.separator + name)
                    fos.write(code)
                    fos.close()
                }
            }
        }

        // Copy the processed output so the next task can consume it as input
        def dest = outputProvider.getContentLocation(directoryInput.name,
                directoryInput.contentTypes, directoryInput.scopes,
                Format.DIRECTORY)
        FileUtils.copyDirectory(directoryInput.file, dest)
    }

As you can see, the class file is ultimately processed in the TraceClassVisitor class. Let’s take a look at TraceClassVisitor:

class TraceClassVisitor(api: Int, cv: ClassVisitor?, var traceConfig: Config) : ClassVisitor(api, cv) {

    private var className: String? = null
    private var isABSClass = false
    private var isBeatClass = false
    private var isConfigTraceClass = false

    override fun visit(
        version: Int,
        access: Int,
        name: String?,
        signature: String?,
        superName: String?,
        interfaces: Array<String>?
    ) {
        super.visit(version, access, name, signature, superName, interfaces)

        this.className = name

        // Abstract class or interface
        if (access and Opcodes.ACC_ABSTRACT > 0 || access and Opcodes.ACC_INTERFACE > 0) {
            this.isABSClass = true
        }

        // Is this the class the injected timing code belongs to?
        val resultClassName = name?.replace(".", "/")
        if (resultClassName == traceConfig.mBeatClass) {
            this.isBeatClass = true
        }

        // Is this a class configured for instrumentation?
        name?.let { className ->
            isConfigTraceClass = traceConfig.isConfigTraceClass(className)
        }
    }

    override fun visitMethod(
        access: Int,
        name: String?,
        desc: String?,
        signature: String?,
        exceptions: Array<String>?
    ): MethodVisitor {
        val isConstructor = MethodFilter.isConstructor(name)
        // Skip instrumentation for abstract classes/interfaces, the beat class itself,
        // classes outside the configured scope, and constructors
        return if (isABSClass || isBeatClass || !isConfigTraceClass || isConstructor) {
            super.visitMethod(access, name, desc, signature, exceptions)
        } else {
            // Instrument the method in TraceMethodVisitor
            val mv = cv.visitMethod(access, name, desc, signature, exceptions)
            TraceMethodVisitor(api, mv, access, name, desc, className, traceConfig)
        }
    }
}

Take a look at TraceMethodVisitor again:

override fun onMethodEnter() {
    super.onMethodEnter()
    // When the method enters, inject an instruction to call the timing method start()
    mv.visitLdcInsn(generatorMethodName())
    mv.visitMethodInsn(INVOKESTATIC, traceConfig.mBeatClass, "start", "(Ljava/lang/String;)V", false)
}

override fun onMethodExit(opcode: Int) {
    // When the method exits, inject an instruction to call the timing method end()
    mv.visitLdcInsn(generatorMethodName())
    mv.visitMethodInsn(INVOKESTATIC, traceConfig.mBeatClass, "end", "(Ljava/lang/String;)V", false)
}

In this way, for every method within the configured instrumentation scope, TraceMan.start() is called when the method enters and TraceMan.end() when it exits, performing the timing. The TraceMan class itself is also configurable, meaning you can configure which class's methods are called at method entry and exit.

Inside TraceMan.start() and TraceMan.end() we can compute each method's cost and output the timing of all methods; see the TraceMan class in the source code for the implementation.
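
On the analysis side, filtering the recorded samples is straightforward. Here is a minimal sketch of how a "main thread over 50ms" screen could work; the type and field names are illustrative, not MethodTraceMan's actual API:

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: filter recorded method costs the way the tool's UI does.
public class CostFilter {
    public record MethodCost(String thread, String method, long costMs) {}

    // Keep only methods on the given thread whose cost exceeds the threshold,
    // slowest first -- i.e. the "main thread > 50ms" screen.
    public static List<MethodCost> slowOn(String thread, long thresholdMs, List<MethodCost> all) {
        return all.stream()
                .filter(c -> c.thread().equals(thread) && c.costMs() > thresholdMs)
                .sorted(Comparator.comparingLong(MethodCost::costMs).reversed())
                .collect(Collectors.toList());
    }

    public static List<MethodCost> sampleData() {
        return List.of(
                new MethodCost("main", "MainActivity.onCreate", 120),
                new MethodCost("main", "Adapter.onBindViewHolder", 8),
                new MethodCost("worker", "ImageLoader.decode", 300),
                new MethodCost("main", "HomeFragment.parseJson", 65));
    }

    public static void main(String[] args) {
        for (MethodCost c : slowOn("main", 50, sampleData())) {
            System.out.println(c.method() + " " + c.costMs() + "ms");
        }
    }
}
```

With the sample data above, only MainActivity.onCreate and HomeFragment.parseJson survive the filter; the worker-thread method is excluded regardless of its cost.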

4. UI display

Through the instrumentation and timing logic above, we can already obtain timing statistics for every method. To make the tool easy to use, we also implement a UI so that the method timing data can be displayed in the browser in real time, with support for duration filtering, thread filtering, method-name search, and other functions.

We use React to implement the display UI and run a small server on the phone, so the browser can open the UI through an address, with data transmitted over a socket. The injected code produces the method timing data, and the React UI receives, processes, and presents it.
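
As a rough sketch of the on-device side (the real project transmits over a socket; the endpoint path and JSON shape here are assumptions), a minimal handler serving the timing data to a browser could look like this, using the JDK's built-in com.sun.net.httpserver:

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Sketch of an on-device endpoint handing collected timing data to the
// browser UI as JSON. Endpoint path and payload shape are assumptions.
public class TraceDataServer {
    static String tracesAsJson() {
        return "[{\"thread\":\"main\",\"method\":\"MainActivity.onCreate\",\"costMs\":120}]";
    }

    public static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/traces", (HttpExchange ex) -> {
            byte[] body = tracesAsJson().getBytes(StandardCharsets.UTF_8);
            ex.getResponseHeaders().add("Content-Type", "application/json");
            ex.sendResponseHeaders(200, body.length);
            try (OutputStream os = ex.getResponseBody()) { os.write(body); }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer s = start(0); // port 0 = pick a free port
        System.out.println("serving on port " + s.getAddress().getPort());
        s.stop(0);
    }
}
```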

The UI display part of the implementation is relatively routine and will not be expanded on here; interested readers can look at the source code.

The project's source code and detailed integration and usage documentation are maintained on GitHub: MethodTraceMan

5. Summary

The above is a tool we built while optimizing fluency to help us locate problems quickly, along with a brief share of the related technical background; I hope it offers some ideas to readers who are also struggling with page fluency. The remaining parts of the series will cover: how an Android View is drawn, frame-rate and fluency monitoring, frame-rate automation testing, practical tips for fluency optimization, and more. Of course, there is still a lot of work to do on jank and fluency monitoring and optimization. Our main goal is to form a closed loop from monitoring, to troubleshooting tools, to jank solutions, so that fluency issues between version iterations are controllable, discoverable, and easy to solve. That is the direction we are working toward.