Introduction

My work required me to track down some performance problems. Tools such as Profiler and Systrace exist, but they are not convenient for monitoring performance in real time, so I decided to write a small tool that does. After studying articles by more experienced developers, I finished an open source library for real-time performance detection. It achieves roughly the effect I wanted, so I am recording a summary here.

The open source library is at:

Github.com/XanderWang/…

If you find it useful, a star on the repository would be much appreciated.

This performance check library can detect the following problems:

  • UI thread block detection.
  • App FPS detection.
  • Thread creation and startup monitoring and thread pool creation monitoring.
  • IPC (inter-process communication) monitoring.

It also provides the following features:

  • Print detected problems through Logcat in real time.
  • Save detected information to a file.
  • Provide an interface for uploading the information files.

Integration guide

1. Add the following to build.gradle in the app module directory.

dependencies {
  // Base dependencies must be added
  debugImplementation 'io.github.xanderwang:performance:0.3.1'
  releaseImplementation 'io.github.xanderwang:performance-noop:0.3.1'

  // Hook package must be added
  debugImplementation 'io.github.xanderwang:hook:0.3.1'

  // Pick one of the following hook schemes. If it fails at runtime, switch to the other one. If that also fails, please open an issue.
  // SandHook scheme (recommended). If an error occurs, replace it with the Epic library.
  debugImplementation 'io.github.xanderwang:hook-sandhook:0.3.1'

  // Epic scheme. If an error occurs, replace it with SandHook.
  // debugImplementation 'io.github.xanderwang:hook-epic:0.3.1'
}

2. Add initialization code similar to the following to the Application class of the app project.

Java initialization example

  private void initPERF(final Context context) {
    final PERF.LogFileUploader logFileUploader = new PERF.LogFileUploader() {
      @Override
      public boolean upload(File logFile) {
        return false;
      }
    };
    PERF.init(new PERF.Builder()
        .checkUI(true, 100) // Check UI thread blocks
        .checkIPC(true) // Check IPC calls
        .checkFps(true, 1000) // Check FPS
        .checkThread(true) // Check threads and thread pools
        .globalTag("test_perf") // Global Logcat tag for easy filtering
        .cacheDirSupplier(new PERF.IssueSupplier<File>() {
          @Override
          public File get() {
            // Directory where issue files are saved
            return context.getCacheDir();
          }
        })
        .maxCacheSizeSupplier(new PERF.IssueSupplier<Integer>() {
          @Override
          public Integer get() {
            // Maximum storage space that issue files may occupy
            return 10 * 1024 * 1024;
          }
        })
        .uploaderSupplier(new PERF.IssueSupplier<PERF.LogFileUploader>() {
          @Override
          public PERF.LogFileUploader get() {
            // Issue file upload interface
            return logFileUploader;
          }
        })
        .build());
  }

Kotlin initialization example

  private fun doUpload(log: File): Boolean {
    return false
  }

  private fun initPERF(context: Context) {
    PERF.init(PERF.Builder()
        .checkUI(true, 100) // Check UI thread blocks
        .checkIPC(true) // Check IPC calls
        .checkFps(true, 1000) // Check FPS
        .checkThread(true) // Check threads and thread pools
        .globalTag("test_perf") // Global Logcat tag for easy filtering
        .cacheDirSupplier { context.cacheDir } // Directory where issue files are saved
        .maxCacheSizeSupplier { 10 * 1024 * 1024 } // Maximum storage space that issue files may occupy
        .uploaderSupplier { // Issue file upload interface implementation
          PERF.LogFileUploader { logFile -> doUpload(logFile) }
        }
        .build()
    )
  }

Major updates

  • 0.3.1 Added detection of images set on an ImageView that are larger than the view's actual size.
  • 0.3.0 Switched publishing of the library artifacts to MavenCentral.
  • 0.2.0 Added monitoring of thread time consumption and of thread priority changes.
  • 0.1.12 Added thread name collection when monitoring thread creation. Also integrated the Startup library for the necessary initialization, and fixed the configuration file not being found when MultiDex is used.
  • 0.1.11 Improved the encapsulation of the hook schemes. With the SandHook open source library, IPC calls can be detected based on how long they take.
  • 0.1.10 Changed the default FPS detection interval from 2 s to 1 s and added support for a custom interval.
  • 0.1.9 Improved monitoring of thread pool creation.
  • 0.1.8 Initial release with the basic functions.

Using this library directly in production is not recommended. While writing the library and testing the hooks, I ran into different problems on different devices and ROMs. It is better to use it first for offline self-testing.

How it works

UI thread block detection principle

This mainly follows the AndroidPerformanceMonitor library, which monitors how the UI thread's Looper processes each Message.

Before Looper starts processing a Message, a delayed task is scheduled on an asynchronous thread to collect information later. If the Message finishes within the specified time, the delayed task is cancelled once the Message has been processed, which means the UI thread was not blocked. If the Message does not finish within the specified time, the UI thread is blocked, and the asynchronous thread gets to run the delayed task. If that task prints the UI thread's method call stack, we can see what the UI thread is doing. This is the basic principle of UI thread block detection.
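The delayed-task idea described above can be sketched in plain Java. This is only an illustration under assumptions: BlockWatchdog and all of its methods are names made up for this sketch and are not the library's API, and Thread.sleep stands in for the work done while handling one Message. In an Android app, begin() and end() would be called from whatever callback brackets the processing of each Message.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical watchdog: captures the watched thread's stack if one unit of work runs too long. */
final class BlockWatchdog {
    private final Thread watched;
    private final long thresholdMs;
    // The "asynchronous thread" from the text: a single daemon thread that runs delayed tasks.
    private final ScheduledExecutorService monitor =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "block-watchdog");
                t.setDaemon(true);
                return t;
            });
    private final AtomicReference<StackTraceElement[]> lastBlockTrace = new AtomicReference<>();
    private volatile ScheduledFuture<?> pending;

    BlockWatchdog(Thread watched, long thresholdMs) {
        this.watched = watched;
        this.thresholdMs = thresholdMs;
    }

    /** Call right before the watched thread starts handling a message. */
    void begin() {
        pending = monitor.schedule(
                () -> lastBlockTrace.set(watched.getStackTrace()),
                thresholdMs, TimeUnit.MILLISECONDS);
    }

    /** Call right after the message is handled; cancels the delayed task if it has not run yet. */
    void end() {
        ScheduledFuture<?> task = pending;
        if (task != null) {
            task.cancel(false);
        }
    }

    /** Stack captured by the last delayed task that actually ran, or null if none ran. */
    StackTraceElement[] lastBlockTrace() {
        return lastBlockTrace.get();
    }

    /** Runs one simulated message under the watchdog and reports whether it was flagged. */
    static boolean isFlaggedAsBlock(long thresholdMs, long workMs) {
        BlockWatchdog dog = new BlockWatchdog(Thread.currentThread(), thresholdMs);
        dog.begin();
        try {
            Thread.sleep(workMs); // stands in for handling one Message
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        dog.end();
        boolean flagged = dog.lastBlockTrace() != null;
        dog.monitor.shutdownNow();
        return flagged;
    }

    public static void main(String[] args) {
        System.out.println("slow message flagged: " + isFlaggedAsBlock(50, 300));
        System.out.println("fast message flagged: " + isFlaggedAsBlock(2000, 10));
    }
}
```

The stack is read from the monitor thread rather than from the watched thread itself, because a blocked thread cannot report on its own state.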

However, this scheme has a drawback: it cannot handle input events from InputManager, such as remote control key events on a TV device. Analyzing the call chain of a key event shows that each key event ends up in the dispatchKeyEvent method of the DecorView class and does not go through Looper's Message processing. As a result, the AndroidPerformanceMonitor library cannot accurately detect UI blocks in a TV app. To cover key handling on TV, a new hook point is needed, and that is the dispatchKeyEvent method of the DecorView class mentioned above.

How can we intervene in the dispatchKeyEvent method of the DecorView class? We can hook this method through the Epic library. If the hook succeeds, we receive a callback before and after dispatchKeyEvent is called. Before the call, we schedule a delayed task on the asynchronous thread; after the call, we cancel it. If dispatchKeyEvent takes less than the specified time threshold, the delayed task is cancelled before it runs. If dispatchKeyEvent takes longer than the threshold, the UI thread is blocked, and the asynchronous thread runs the delayed task to collect the necessary information.

That is the revised UI thread block detection principle. The current implementation is still fairly rough. A follow-up plan is to follow AndroidPerformanceMonitor and also print CPU, memory, and other information.

The final log print result is as follows:

com.xander.performace.demo W/demo_Issue: =================================================
    type: UI BLOCK
    msg: UI BLOCK
    create time: 2021-01-13 11:24:41
    trace:
    	java.lang.Thread.sleep(Thread.java:-2)
    	java.lang.Thread.sleep(Thread.java:442)
    	java.lang.Thread.sleep(Thread.java:358)
    	com.xander.performance.demo.MainActivity.testANR(MainActivity.kt:49)
    	java.lang.reflect.Method.invoke(Method.java:-2)
    	androidx.appcompat.app.AppCompatViewInflater$DeclaredOnClickListener.onClick(AppCompatViewInflater.java:397)
    	android.view.View.performClick(View.java:7496)
    	android.view.View.performClickInternal(View.java:7473)
    	android.view.View.access$3600(View.java:831)
    	android.view.View$PerformClick.run(View.java:28641)
    	android.os.Handler.handleCallback(Handler.java:938)
    	android.os.Handler.dispatchMessage(Handler.java:99)
    	android.os.Looper.loop(Looper.java:236)
    	android.app.ActivityThread.main(ActivityThread.java:7876)
    	java.lang.reflect.Method.invoke(Method.java:-2)
    	com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:656)
    	com.android.internal.os.ZygoteInit.main(ZygoteInit.java:967)

FPS detection principle

FPS detection relies on how Android draws the screen, so here is a brief description of that process.

The system sends a VSync signal about every 16 ms (on a 60 Hz display). If the application has registered for the VSync signal, it receives a callback when the signal arrives and begins drawing. If the preparation work, such as CPU data preparation and GPU rasterization, is completed within 16 ms, the frame can be drawn before the next VSync signal arrives. No frames are dropped and the interface is smooth. If the work is not finished within 16 ms, the image takes longer to display and a frame is dropped. If many frames are dropped, the interface feels janky.

The principle of FPS detection is quite simple: count how many frames are drawn over a period of time, such as 1 s, and the FPS can be calculated. So how do we know how many frames the application draws in 1 s? This depends on listening for the VSync signal.

Before preparing to draw, a synchronization barrier is placed in the UI thread's MessageQueue, so that the UI thread processes only asynchronous messages until the barrier is removed. Before refreshing, the application registers a VSync listener. When the VSync signal arrives, the system notifies the application, which places an asynchronous Message in the UI thread's MessageQueue. Because of the synchronization barrier, the UI thread processes this asynchronous Message first. What this Message does is start ViewRootImpl's familiar measure, layout, and draw passes.

We can register a VSync listener through Choreographer. About 16 ms later we receive the VSync signal, and a synchronous message is put into the MessageQueue. In it we do no special processing: we just increment a counter and then listen for the next VSync signal. This tells us how many VSync signals we received in 1 s, from which the frame rate can be calculated.

Why does the number of VSync signals received equal the frame rate?

Looper processes Messages serially, that is, it finishes one Message before it starts the next. The drawing task is an asynchronous Message and is executed first. Only after the drawing task has finished is the VSync counting task described above executed. If the time taken by the counting task itself is ignored, the number of VSync signals counted can be treated as roughly the number of frames drawn in that period. The frame rate can then be calculated from the length of the period and the number of VSync signals.
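The counting step described above can be sketched in plain Java. This is a minimal illustration, not the library's implementation: FpsCounter and its methods are hypothetical names, and the frame timestamps are synthetic. In an Android app, onFrame would be fed from the per-frame callback registered through Choreographer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical counter: counts frame callbacks and reports FPS once per interval. */
final class FpsCounter {
    private final long intervalNanos;
    private final List<Integer> reports = new ArrayList<>();
    private long windowStartNanos = -1;
    private int framesInWindow;

    FpsCounter(long intervalMs) {
        this.intervalNanos = intervalMs * 1_000_000L;
    }

    /** Call once per frame callback with the frame timestamp in nanoseconds. */
    void onFrame(long frameTimeNanos) {
        if (windowStartNanos < 0) {
            // The first callback only opens the measuring window.
            windowStartNanos = frameTimeNanos;
            return;
        }
        framesInWindow++;
        long elapsed = frameTimeNanos - windowStartNanos;
        if (elapsed >= intervalNanos) {
            // frames / seconds, using the real elapsed time rather than the nominal interval
            reports.add((int) Math.round(framesInWindow * 1e9 / elapsed));
            windowStartNanos = frameTimeNanos;
            framesInWindow = 0;
        }
    }

    /** FPS values reported so far, one per completed interval. */
    List<Integer> reports() {
        return reports;
    }

    /** Feeds evenly spaced synthetic frames and returns the first reported FPS, or -1 if none. */
    static int simulate(long frameIntervalNanos, int frameCount) {
        FpsCounter counter = new FpsCounter(1000);
        for (int i = 0; i < frameCount; i++) {
            counter.onFrame(i * frameIntervalNanos);
        }
        return counter.reports().isEmpty() ? -1 : counter.reports().get(0);
    }

    public static void main(String[] args) {
        System.out.println("frames every ~16.7 ms: " + simulate(16_666_667L, 70) + " FPS");
        System.out.println("frames every ~33.3 ms: " + simulate(33_333_333L, 40) + " FPS");
    }
}
```

Dividing by the measured elapsed time instead of a fixed 1 s keeps the result sensible when a slow frame stretches the window.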

The final log print result is as follows:

com.xander.performace.demo W/demo_FPSTool: APP FPS is: 54 Hz
com.xander.performace.demo W/demo_FPSTool: APP FPS is: 60 Hz
com.xander.performace.demo W/demo_FPSTool: APP FPS is: 60 Hz

Thread creation and startup monitoring and thread pool creation monitoring

Thread and thread pool monitoring mainly tracks where threads and thread pools are created and where they are started. With this information we can judge whether they are created and started at a reasonable time, and derive an optimization plan from that.

A simple approach is to make every thread and thread pool in the application code inherit from a common thread base class and a common thread pool base class. Printing the method call stack in the constructor and the start method of these base classes then shows where each thread or thread pool is created and started.
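The base class approach can be sketched as follows. TracedThread, TracedThreadPool, and TraceDemo are hypothetical names used only for this illustration. The sketch stores the call stacks instead of printing them so that they can be inspected afterwards.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical thread base class: remembers where each thread was created and started. */
class TracedThread extends Thread {
    // Captured while the constructor runs, so it points at the creating code.
    private final StackTraceElement[] createTrace = new Throwable().getStackTrace();
    private volatile StackTraceElement[] startTrace;

    TracedThread(Runnable target, String name) {
        super(target, name);
    }

    @Override
    public void start() {
        startTrace = new Throwable().getStackTrace(); // records who started this thread
        super.start();
    }

    StackTraceElement[] createTrace() { return createTrace; }

    StackTraceElement[] startTrace() { return startTrace; }
}

/** Hypothetical thread pool base class: remembers where the pool was created. */
class TracedThreadPool extends ThreadPoolExecutor {
    private final StackTraceElement[] createTrace = new Throwable().getStackTrace();

    TracedThreadPool(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    StackTraceElement[] createTrace() { return createTrace; }
}

final class TraceDemo {
    /** Creates a thread so that this method name shows up in its creation trace. */
    static TracedThread createWorker() {
        return new TracedThread(() -> { }, "worker");
    }

    /** Returns true if any frame of the trace belongs to the given method name. */
    static boolean mentions(StackTraceElement[] trace, String methodName) {
        for (StackTraceElement frame : trace) {
            if (frame.getMethodName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        TracedThread worker = createWorker();
        worker.start();
        for (StackTraceElement frame : worker.createTrace()) {
            System.out.println("created at: " + frame);
        }
    }
}
```

This only works for code that actually extends the base classes, which is why the text goes on to consider bytecode rewriting and hooking.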

Making all threads and thread pools inherit from the same base class can be done at compile time with a build plugin: a custom Transform that uses ASM to edit the generated bytecode and change the inheritance relationship. However, this approach has a steep learning curve and is not suitable for beginners.

Besides this approach there is another one: hooking. By hooking the constructors and start methods of threads and thread pools, we can add our own logic before and after them, such as printing the current method call stack. This is the basic principle of thread and thread pool monitoring.

Thread pool monitoring is not too difficult. A thread pool is usually ThreadPoolExecutor or a subclass of it, so hooking the ThreadPoolExecutor constructors is enough to monitor thread pool creation. Monitoring thread pool execution mainly means hooking the execute method of ThreadPoolExecutor.

Monitoring thread creation and execution is a little trickier. A thread may be created by a thread pool, in which case its creation and execution should be attributed to that pool, so I needed to find the link between a thread and its thread pool. One library I looked at seemed to associate them through the ThreadGroup, and I planned to do the same until I realized that the ThreadFactory a pool uses to create threads does not necessarily pass in a ThreadGroup. The key turned out to be Worker, an inner class of ThreadPoolExecutor. Because it is an inner class, its actual constructor receives an instance of the outer class, ThreadPoolExecutor, which gives the association between Worker and ThreadPoolExecutor. Worker also implements Runnable, and when it creates a Thread through the ThreadFactory it passes itself as the Runnable, which gives the association between Worker and Thread. Combining the two yields the association between a ThreadPoolExecutor and its threads. This is how threads and thread pools are monitored.
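The link between Worker and Thread can be observed without any hooking, using a custom ThreadFactory. The sketch below is an illustration only: RecordingThreadFactory is a hypothetical name, and it shows just one half of the association. Going from the Worker to its ThreadPoolExecutor means reading the compiler-generated outer-instance field by reflection, which recent JDKs restrict, so that part is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical factory: records which Runnable each pool thread was created for. */
final class RecordingThreadFactory implements ThreadFactory {
    final Map<Thread, Runnable> threadToRunnable = new ConcurrentHashMap<>();

    @Override
    public Thread newThread(Runnable r) {
        Thread t = new Thread(r, "pool-thread-" + threadToRunnable.size());
        // r is the pool's internal Worker, not the task that was submitted.
        threadToRunnable.put(t, r);
        return t;
    }

    /** Runs one task in a single-thread pool and returns the class name of the Runnable the factory saw. */
    static String observedRunnableClass() {
        RecordingThreadFactory factory = new RecordingThreadFactory();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(), factory);
        pool.execute(() -> { }); // the first execute() makes the pool create a Worker and a Thread
        pool.shutdown();
        return factory.threadToRunnable.values().iterator().next().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("Runnable passed to the factory: " + observedRunnableClass());
    }
}
```

The submitted lambda never reaches the factory directly. The pool wraps it in a Worker first, which is what makes the Worker a usable join point between the pool and the thread.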

The final log print result is as follows:

com.xander.performace.demo W/demo_Issue: =================================================
    type: THREAD
    msg: THREAD POOL CREATE
    create time: 2021-01-13 11:23:47
    create trace:
    	com.xander.performance.StackTraceUtils.list(StackTraceUtils.java:39)
    	com.xander.performance.ThreadTool$ThreadPoolExecutorConstructorHook.afterHookedMethod(ThreadTool.java:158)
    	de.robv.android.xposed.DexposedBridge.handleHookedArtMethod(DexposedBridge.java:265)
    	me.weishu.epic.art.entry.Entry64.onHookObject(Entry64.java:64)
    	me.weishu.epic.art.entry.Entry64.referenceBridge(Entry64.java:239)
    	java.util.concurrent.Executors.newSingleThreadExecutor(Executors.java:179)
    	com.xander.performance.demo.MainActivity.testThreadPool(MainActivity.kt:38)
    	java.lang.reflect.Method.invoke(Method.java:-2)
    	androidx.appcompat.app.AppCompatViewInflater$DeclaredOnClickListener.onClick(AppCompatViewInflater.java:397)
    	android.view.View.performClick(View.java:7496)
    	android.view.View.performClickInternal(View.java:7473)
    	android.view.View.access$3600(View.java:831)
    	android.view.View$PerformClick.run(View.java:28641)
    	android.os.Handler.handleCallback(Handler.java:938)
    	android.os.Handler.dispatchMessage(Handler.java:99)
    	android.os.Looper.loop(Looper.java:236)
    	android.app.ActivityThread.main(ActivityThread.java:7876)
    	java.lang.reflect.Method.invoke(Method.java:-2)
    	com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:656)
    	com.android.internal.os.ZygoteInit.main(ZygoteInit.java:967)

IPC (inter-process communication) monitoring principle

The details of inter-process communication, that is, the Binder mechanism, are beyond the scope of this library and are not explained here.

The way to detect inter-process communication is similar to the thread detection described earlier: find something that all inter-process calls have in common and intercept it. When the application makes an inter-process call, we print the call stack and then let the original call continue. That achieves the purpose of IPC monitoring.
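The "print the call stack, then continue with the original call" pattern can be illustrated in plain Java with a dynamic proxy. Note that this swaps in a different technique: the library hooks methods at runtime, whereas this sketch wraps an interface with java.lang.reflect.Proxy purely to show the interception pattern. RemoteService and CallTracer are hypothetical names; RemoteService stands in for an AIDL-generated interface.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical remote interface standing in for an AIDL-generated one. */
interface RemoteService {
    int add(int a, int b);
}

final class CallTracer {
    /** Names of intercepted methods, in call order. */
    static final List<String> intercepted = new ArrayList<>();

    /** Wraps target so that every interface call is recorded before being forwarded. */
    static <T> T trace(Class<T> iface, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            // Interception step: record the call and capture where it came from.
            intercepted.add(method.getName());
            StackTraceElement[] callSite = new Throwable().getStackTrace();
            System.out.println("call to " + method.getName()
                    + " (stack depth " + callSite.length + ")");
            try {
                // Then continue with the original call.
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the real implementation threw
            }
        };
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        RemoteService service = trace(RemoteService.class, (a, b) -> a + b);
        System.out.println("result: " + service.add(2, 3));
    }
}
```

A proxy like this only sees calls made through the wrapped reference, so it cannot cover IPC calls made inside the framework. That limitation is the reason a shared low-level method has to be found and hooked instead.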

How to find that common point, the place to intercept, is the focus of this section.

Inter-process communication requires Binder, so that is where to start.

After writing an AIDL demo, I found that in the generated code interface A inherits from IInterface, and that it contains an abstract inner class Stub, which inherits from Binder and implements interface A. The Stub class in turn has an inner class Proxy, which implements interface A and holds an IBinder instance.

When using AIDL, we call the Stub class's asInterface method. If the IBinder instance passed in is itself an implementation of interface A, it is cast to interface A. Otherwise a Proxy instance is created and the IBinder is handed to it. Normally the IBinder instance comes from the ServiceConnection callback and is a BinderProxy instance, so asInterface creates a Proxy. Looking at how Proxy implements the interface methods shows that each of them calls BinderProxy's transact method. So BinderProxy's transact method is a good place to start.

Originally I planned to hook the transact method of the BinderProxy class to do IPC detection. However, the Epic library is not stable here and fails when hooking methods that take a Parcel argument. Since I could not resolve this for the time being, I had to find a new entry point. Looking at the code generated for the AIDL demo, each IPC call also calls Parcel's readException method in addition to BinderProxy's transact method, so I decided to hook readException instead to achieve IPC monitoring.

The final log print result is as follows:

com.xander.performace.demo W/demo_Issue: =================================================
    type: IPC
    msg: IPC
    create time: 2021-01-13 11:25:04
    trace:
    	com.xander.performance.StackTraceUtils.list(StackTraceUtils.java:39)
    	com.xander.performance.IPCTool$ParcelReadExceptionHook.beforeHookedMethod(IPCTool.java:96)
    	de.robv.android.xposed.DexposedBridge.handleHookedArtMethod(DexposedBridge.java:229)
    	me.weishu.epic.art.entry.Entry64.onHookVoid(Entry64.java:68)
    	me.weishu.epic.art.entry.Entry64.referenceBridge(Entry64.java:220)
    	me.weishu.epic.art.entry.Entry64.voidBridge(Entry64.java:82)
    	android.app.IActivityManager$Stub$Proxy.getRunningAppProcesses(IActivityManager.java:7285)
    	android.app.ActivityManager.getRunningAppProcesses(ActivityManager.java:3684)
    	com.xander.performance.demo.MainActivity.testIPC(MainActivity.kt:55)
    	java.lang.reflect.Method.invoke(Method.java:-2)
    	androidx.appcompat.app.AppCompatViewInflater$DeclaredOnClickListener.onClick(AppCompatViewInflater.java:397)
    	android.view.View.performClick(View.java:7496)
    	android.view.View.performClickInternal(View.java:7473)
    	android.view.View.access$3600(View.java:831)
    	android.view.View$PerformClick.run(View.java:28641)
    	android.os.Handler.handleCallback(Handler.java:938)
    	android.os.Handler.dispatchMessage(Handler.java:99)
    	android.os.Looper.loop(Looper.java:236)
    	android.app.ActivityThread.main(ActivityThread.java:7876)
    	java.lang.reflect.Method.invoke(Method.java:-2)
    	com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:656)
    	com.android.internal.os.ZygoteInit.main(ZygoteInit.java:967)

Contact me

  • Mail

[email protected]

  • WeChat

References:

  1. epic
  2. SandHook
  3. AndroidPerformanceMonitor
  4. Interviewer: How do you monitor the FPS of your application?
  5. Android Jank Optimization (Part 2)