The source of Caton
Through the screen rendering mechanism we know :Android screen rendering is achieved through Vsync, the software layer calculates the data, puts it into the buffer, and the hardware layer reads the data from the buffer and draws it on the screen. The rendering cycle is 16ms, so we can see the constantly changing picture.
If it goes beyond 16ms, a lag will occur, of course, in the software layer (if it happens in the hardware layer, just change the device). So, the computation time of the software layer needs to be less than 16ms, so where does this computation take place?
It’s in the Handler, in the UI Handler, to be more precise.
Android processes interact with the Binder and threads communicate with handlers.
After the software layer receives the vsync signal from the hardware layer, the Java layer will send a message to the UI Handler to calculate the view data, that is, to perform the measurement layout drawing, which is shown in the code layer: Execute the performTraversals() function in ViewRootImpl, which we mentioned in the View measurement layout drawing.
So, the view’s data is being evaluated in the UI Handler, so if something else is being done in the UI Handler, and it takes a long time to execute, it indirectly causes the stackage to occur, and we need to find it and kill it.
So how do you find them?
As we know, the message processing of Handler is distributed through Looper, so we can get the UI Looper first, implant the detection code before and after the message execution, and add the detection logic, then we can analyze and figure out the time spent in the message execution.
// Get the UI Looper
Looper uiLooper = Looper.getMainLooper();
// Record the time before the message is sent
void preHandle(a){
time = System.currentTimeMillis();
}
// The processing logic after the message is distributed, calculates the time difference and prompts
void postHandle(a){
long delay = System.currentTimeMillis() - time;
if(delay > 16) {
// If you think you are stuck, you can do something like print the current thread stack}}// Insert the above two methods before and after the uiLooper message distribution (if there is one)
uiLooper.xxxxxxx();
Copy the code
So how do you integrate these two functions into a uiLooper? There are already apis available in Looper.
How to detect application stutter
According to the above, we only need to record the time before message execution, then calculate the time difference, and compare the time difference with the lag threshold we passed in. If the lag threshold is greater than this value, it is considered that the lag has occurred, and then dump the main thread stack and show it to the developer.
So how do you find the pre – and post-message insertion points?
Looper itself provides a method to set the log printing class:
/**
* Control logging of messages as they are processed by this Looper. If
* enabled, a log message will be written to <var>printer</var>
* at the beginning and ending of each message dispatch, identifying the
* target Handler and message contents.
*
* @param printer A Printer object that will receive log messages, or
* null to disable message logging.
*/
public void setMessageLogging(@Nullable Printer printer) {
mLogging = printer;
}
Copy the code
This printer will be used to print logs before and after message execution. The code is in the loop() function of Looper as follows:
// The message is printed before execution
// This must be in a local variable, in case a UI event sets the logger
final Printer logging = me.mLogging;
if(logging ! =null) {
logging.println(">>>>> Dispatching to " + msg.target + "" +
msg.callback + ":" + msg.what);
}
// The message is printed after execution
if(logging ! =null) {
logging.println("<<<<< Finished to " + msg.target + "" + msg.callback);
}
Copy the code
Using this principle, we can pass in a custom Printer, then duplicate the println() method, then calculate the time difference between before and after message execution, and if it is greater than the target value (say, 500ms), we think that a lag has occurred.
So, how do you tell if println() is called before or after message?
We can judge by using the message contents printed in Println, for example, >>>>> Dispatching to…. is displayed before the command is executed <<<<< Finished to
public void println(String msg) {
if(msg.startsWith(">>>>> Dispatching to")) {
// This is before execution
}else {
// This is after execution}}Copy the code
But this is too low, and string matching is inherently inefficient, so we can use another method.
Since the log is printed in pairs before and after message execution, we can define a Boolean value that indicates whether the log is printed before message execution and changes the value each time the log is printed. Such as:
// Is printed before message execution, since the first print must be done before message execution, so the initial value is true
private boolean isPre = true;
public void println(String msg) {
if(isPre) {
// The main thread stack is dumped before message is executed
}else {
// We can stop the stack of the dump thread by printing after message execution
}
// Change the value once, this time before message execution, the next time must be after execution; This time is after the execution, the next time must be before the executionisPre = ! isPre; }Copy the code
Ok, now that we know the core principles, let’s look at the implementation of the existing project code BlockCanary.
BlockCanary
Simple to use
- 1 Adding a Dependency
dependencies {
// Use it in both debug and release versions
compile 'com. Making. Markzhai: blockcanary - android: 1.5.0'
// Only for debug
/ / debugCompile 'com. Making. Markzhai: blockcanary - android: 1.5.0'
/ / releaseCompile 'com. Making. Markzhai: blockcanary - no - op: 1.5.0'
}
Copy the code
- 2 Code Integration
First, define an AppBlockCanaryContext that inherits BlockCanaryContext. We need to override several methods in the AppBlockCanaryContext.
public class AppBlockCanaryContext extends BlockCanaryContext {
private static final String TAG = "AppBlockCanaryContext";
/** * returns an identifier that can be passed in as app_name, version number, channel, etc. */
public String provideQualifier(a) {
return "my_app" + BuildConfig.VERSION_CODE;
}
/** * returns a user ID to identify */
public String provideUid(a) {
return "10086";
}
/** * Returns the network type, such as :2G, 3G, 4G, wifi, etc. */
public String provideNetworkType(a) {
return "wifi";
}
/**
* Config monitor duration, after this time BlockCanary will stop, use
* with {@code BlockCanary}'s isMonitorDurationEnd
*
* @return monitor last duration (in hour)
*/
public int provideMonitorDuration(a) {
return -1;
}
/** * returns the threshold you think is stuck in milliseconds, which should be passed in different sizes according to the performance of different devices */
public int provideBlockThreshold(a) {
return 500;
}
/** ** Dump interval of the thread. When stall occurs, the main thread is dumped at intervals
public int provideDumpInterval(a) {
return provideBlockThreshold();
}
/** * Path for saving logs */
public String providePath(a) {
return "/blockcanary_log/"
}
/** * Whether a notification is displayed immediately */
public boolean displayNotification(a) {
return true;
}
/** * the card will be called immediately, you can print the log here, or upload to your own server */
public void onBlock(Context context, BlockInfo blockInfo) {
Log.d(TAG, "onBlock: "+ blockInfo); }}Copy the code
Call onCreate() from Application:
public class MainApplication extends Application {
@Override
public void onCreate(a) {
// Pass in the AppBlockCanaryContext we created above
BlockCanary.install(this.newAppBlockCanaryContext()).start(); }}Copy the code
When caton occurs, we are notified and can see our own printed log in Logcat.
This article focuses on source code analysis. The full use can be found on Github.
Source code analysis
BlockCanary. Install ():
public static BlockCanary install(Context context, BlockCanaryContext blockCanaryContext) {
// Initialize BlockCanaryContext
BlockCanaryContext.init(context, blockCanaryContext);
Blockcanarycontext.get () is the argument we passed in, and displayNotification() is the argument we defined above whether to display the notification
setEnabled(context, DisplayActivity.class, BlockCanaryContext.get().displayNotification());
// Create a singleton and return it
return get();
}
// Implementation of the get() function
public static BlockCanary get(a) {
if (sInstance == null) {
synchronized (BlockCanary.class) {
if (sInstance == null) {
sInstance = newBlockCanary(); }}}return sInstance;
}
// BlockCanary construction
private BlockCanary(a) {
BlockCanaryInternals.setContext(BlockCanaryContext.get());
// Create a core class, which includes log analysis, stack dump, and CPU collection
mBlockCanaryCore = BlockCanaryInternals.getInstance();
// Core code to add the interceptor, which is the AppBlockCanaryContext we passed in
// When caton is detected, its onBlock() function is called
mBlockCanaryCore.addBlockInterceptor(BlockCanaryContext.get());
// Return if no notification is displayed
if(! BlockCanaryContext.get().displayNotification()) {return;
}
// Otherwise add a service to pop up notifications
mBlockCanaryCore.addBlockInterceptor(new DisplayService());
}
Copy the code
Key methods in BlockCanaryContext:
static void init(Context context, BlockCanaryContext blockCanaryContext) {
/ / save the Context
sApplicationContext = context;
// Save the BlockCanaryContext argument, which is our custom AppBlockCanaryContext
sInstance = blockCanaryContext;
}
public static BlockCanaryContext get(a) {
if (sInstance == null) {
throw new RuntimeException("BlockCanaryContext null");
} else {
// Returns the arguments saved above
returnsInstance; }}Copy the code
Now, let’s go back to the main line logic and look at blockCanary. Start ():
/ / testing
public void start(a) {
// Add a Boolean value to prevent repeated processing
if(! mMonitorStarted) { mMonitorStarted =true;
// Set the value of this parameterLooper.getMainLooper().setMessageLogging(mBlockCanaryCore.monitor); }}// Stop the test
public void stop(a) {
if (mMonitorStarted) {
mMonitorStarted = false;
/ / remove the Printer
Looper.getMainLooper().setMessageLogging(null);
// Stop stack dump
mBlockCanaryCore.stackSampler.stop();
// Stop CPU collectionmBlockCanaryCore.cpuSampler.stop(); }}Copy the code
MBlockCanaryCore: mBlockCanaryCore: mBlockCanaryCore: mBlockCanaryCore: mBlockCanaryCore: mBlockCanaryCore: mBlockCanaryCore: mBlockCanaryCore
mBlockCanaryCore = BlockCanaryInternals.getInstance();
Copy the code
// this is a singleton
static BlockCanaryInternals getInstance(a) {
if (sInstance == null) {
synchronized (BlockCanaryInternals.class) {
if (sInstance == null) {
sInstance = newBlockCanaryInternals(); }}}return sInstance;
}
// look at the constructor
public BlockCanaryInternals(a) {
// stack dump, the first argument is the UI thread, and the second argument is the dump interval we set
stackSampler = new StackSampler(Looper.getMainLooper().getThread(),sContext.provideDumpInterval());
// CPU collector, the parameter is the dump interval we set
cpuSampler = new CpuSampler(sContext.provideDumpInterval());
// Set log printing class Printer
setMonitor(new LooperMonitor(new LooperMonitor.BlockListener() {
// This method is executed when caton is detected
@Override
public void onBlockEvent(long realTimeStart, long realTimeEnd, long threadTimeStart, long threadTimeEnd) {
// Get stack information
ArrayList<String> threadStackEntries = stackSampler.getThreadStackEntries(realTimeStart, realTimeEnd);
if(! threadStackEntries.isEmpty()) {// Construct BlockInfo and call it back to the interceptor, thus calling it inside onBlock() in our AppBlockCanaryCotnext
BlockInfo blockInfo = BlockInfo.newInstance()
.setMainThreadTimeCost(realTimeStart, realTimeEnd, threadTimeStart, threadTimeEnd)
.setCpuBusyFlag(cpuSampler.isCpuBusy(realTimeStart, realTimeEnd)) // Pass in the CPU information dumped to
.setRecentCpuRate(cpuSampler.getCpuRateInfo()) // Pass in the CPU information dumped to
.setThreadStackEntries(threadStackEntries) // Pass in the stack information dumped to
.flushString();
// Save the delay information
LogWriter.save(blockInfo.toString());
// If there is an interceptor, its onBlock() method is executed. Remember the interceptor we added earlier
if(mInterceptorChain.size() ! =0) {
for (BlockInterceptor interceptor : mInterceptorChain) {
interceptor.onBlock(getContext().provideContext(), blockInfo);
}
}
}
}
},
getContext().provideBlockThreshold(), // We set the holdup threshold
getContext().stopWhenDebugging())); // Whether to stop in debug mode, the default return true, because debug mode is common
LogWriter.cleanObsolete();
}
Copy the code
Next we’ll look at the LooperMonitor class:
Println () = println()
class LooperMonitor implements Printer {
// The parameters are: the immediate delay callback, the delay threshold, and whether to stop in debug mode
public LooperMonitor(BlockListener blockListener, long blockThresholdMillis, boolean stopWhenDebugging) {
if (blockListener == null) {
throw new IllegalArgumentException("blockListener should not be null.");
}
mBlockListener = blockListener;
mBlockThresholdMillis = blockThresholdMillis;
mStopWhenDebugging = stopWhenDebugging;
}
// The core function
@Override
public void println(String x) {
// Stops in debug mode
if (mStopWhenDebugging && Debug.isDebuggerConnected()) {
return;
}
// Use a Boolean value to determine whether it is before or after execution
if(! mPrintingStarted) {// Record the start time
mStartTimestamp = System.currentTimeMillis();
mStartThreadTimestamp = SystemClock.currentThreadTimeMillis();
mPrintingStarted = true;
// Start to dump stack and CPU information
startDump();
} else {
// Record the end time
final long endTime = System.currentTimeMillis();
mPrintingStarted = false;
// Detect stalling and callback
if (isBlock(endTime)) {
notifyBlockEvent(endTime);
}
/ / stop the dumpstopDump(); }}// Whether it is stuck
private boolean isBlock(long endTime) {
// The time difference is greater than the threshold we pass in
return endTime - mStartTimestamp > mBlockThresholdMillis;
}
// Caton callback
private void notifyBlockEvent(final long endTime) {
final long startTime = mStartTimestamp;
final long startThreadTime = mStartThreadTimestamp;
final long endThreadTime = SystemClock.currentThreadTimeMillis();
HandlerThreadFactory.getWriteLogThreadHandler().post(new Runnable() {
@Override
public void run(a) {
// Here is the callback tomBlockListener.onBlockEvent(startTime, endTime, startThreadTime, endThreadTime); }}); }/ / the dump
private void startDump(a) {
// dump stack information
if (null! = BlockCanaryInternals.getInstance().stackSampler) { BlockCanaryInternals.getInstance().stackSampler.start(); }// Dump CPU information
if (null != BlockCanaryInternals.getInstance().cpuSampler) {
BlockCanaryInternals.getInstance().cpuSampler.start();
}
}
/ / end dump
private void stopDump(a) {
if (null! = BlockCanaryInternals.getInstance().stackSampler) { BlockCanaryInternals.getInstance().stackSampler.stop(); }if (null! = BlockCanaryInternals.getInstance().cpuSampler) { BlockCanaryInternals.getInstance().cpuSampler.stop(); }}}Copy the code
BlockCanary. Start () creates LooperMonitor and stackSampler and cpuSampler classes to grab stack and CPU information. Dump will start and record the start time. When message is finished, dump will stop and record the end time. Then the difference between the end time and the start time will be used. BlockInfo is constructed from the dumped stack information and CPU information and passed to the developer via callbacks.
Now let’s take a look at the code that dumps stack and CPU information, starting with their parent class AbstractSampler, and the entry function start() inside:
abstract class AbstractSampler {
protected AtomicBoolean mShouldSample = new AtomicBoolean(false);
// This is the entry function
public void start(a) {
// Use an atomic variable to avoid repeated startup
if (mShouldSample.get()) {
return;
}
mShouldSample.set(true);
// Remove the previous one
HandlerThreadFactory.getTimerThreadHandler().removeCallbacks(mRunnable);
// Post new. Note that the second parameter is 0.8 times the dump interval we set in AppBlockCanaryContextHandlerThreadFactory.getTimerThreadHandler().postDelayed(mRunnable,BlockCanaryInternals.getInstance().getSampleDelay()); }// Corresponding stop function
public void stop(a) {
if(! mShouldSample.get()) {return;
}
mShouldSample.set(false); HandlerThreadFactory.getTimerThreadHandler().removeCallbacks(mRunnable); }}Copy the code
We see that it does this by Posting a runnable, and then we look at the runnable:
protected long mSampleInterval;
private Runnable mRunnable = new Runnable() {
@Override
public void run(a) {
// Core function: doSample();
doSample();
if (mShouldSample.get()) {
// Post out againHandlerThreadFactory.getTimerThreadHandler().postDelayed(mRunnable, mSampleInterval); }}};Copy the code
As you can see, doSample() is called in a loop, depending on the dump interval we set inside AppBlockCanaryContext.
So let’s look at the core function doSample(), which is an overloaded function, and look at the implementation in StackSampler first
@Override
protected void doSample(a) {
private static final int DEFAULT_MAX_ENTRY_COUNT = 100;
private int mMaxEntryCount = DEFAULT_MAX_ENTRY_COUNT;
private static final LinkedHashMap<Long, String> sStackMap = new LinkedHashMap<>();
StringBuilder stringBuilder = new StringBuilder();
// Iterate over the current thread's StackTrace to generate a String
for (StackTraceElement stackTraceElement : mCurrentThread.getStackTrace()) {
stringBuilder
.append(stackTraceElement.toString())
.append(BlockInfo.SEPARATOR);
}
// Add each StackTrace to the sStackMap using lRU
synchronized (sStackMap) {
// mMaxEntryCount defaults to 100
if (sStackMap.size() == mMaxEntryCount && mMaxEntryCount > 0) {
sStackMap.remove(sStackMap.keySet().iterator().next());
}
// Key is the current time value, and value is the StackTrace to which this dump is basedsStackMap.put(System.currentTimeMillis(), stringBuilder.toString()); }}Copy the code
The core logic is to get the current thread’s StackTrace and save it to the map, up to the last 100, where key is the time value and value is StackTrace.
Remember how we got the stack information in onBlockEvent()? Yes, yes
stackSampler.getThreadStackEntries(realTimeStart, realTimeEnd)
Copy the code
It is implemented in StackSampler as follows:
// Find the result between startTime and endTime in the sStackMap we saved above and return it in the List.
public ArrayList<String> getThreadStackEntries(long startTime, long endTime) {
ArrayList<String> result = new ArrayList<>();
synchronized (sStackMap) {
for (Long entryTime : sStackMap.keySet()) {
if(startTime < entryTime && entryTime < endTime) { result.add(BlockInfo.TIME_FORMATTER.format(entryTime) + BlockInfo.SEPARATOR + BlockInfo.SEPARATOR + sStackMap.get(entryTime)); }}}return result;
}
Copy the code
The implementation is as simple as looking in a sStackMap for results between startTime and endTime, and storing the results as a List to return.
Now let’s look at the implementation of doSample() in CpuSampler:
@Override
protected void doSample(a) {
BufferedReader cpuReader = null;
BufferedReader pidReader = null;
try {
// Read the "/proc/stat" file
cpuReader = new BufferedReader(new InputStreamReader(
new FileInputStream("/proc/stat")), BUFFER_SIZE);
// Get the CPU rate from the "/proc/stat" file
String cpuRate = cpuReader.readLine();
if (cpuRate == null) {
cpuRate = "";
}
// Get the process ID
if (mPid == 0) {
mPid = android.os.Process.myPid();
}
// Obtain the /proc/mpid/stat file corresponding to the process based on the process ID
pidReader = new BufferedReader(new InputStreamReader(
new FileInputStream("/proc/" + mPid + "/stat")), BUFFER_SIZE);
// To get the CPU rate of the process
String pidCpuRate = pidReader.readLine();
if (pidCpuRate == null) {
pidCpuRate = "";
}
// Parse the data
parse(cpuRate, pidCpuRate);
} catch (Throwable throwable) {
Log.e(TAG, "doSample: ", throwable);
} finally {
try {
if(cpuReader ! =null) {
cpuReader.close();
}
if(pidReader ! =null) { pidReader.close(); }}catch (IOException exception) {
Log.e(TAG, "doSample: ", exception); }}}Copy the code
Parse () : /proc/mpid/stat () : /proc/mpid/stat () : /mpid/stat ()
// Save a maximum of 10 data items
private static final int MAX_ENTRY_COUNT = 10;
// Used to store CPU information
private final LinkedHashMap<Long, String> mCpuInfoEntries = new LinkedHashMap<>();
private void parse(String cpuRate, String pidCpuRate) {
// Convert to an array
String[] cpuInfoArray = cpuRate.split("");
if (cpuInfoArray.length < 9) {
return;
}
// Parse the subscripts one by one
long user = Long.parseLong(cpuInfoArray[2]);
long nice = Long.parseLong(cpuInfoArray[3]);
long system = Long.parseLong(cpuInfoArray[4]);
long idle = Long.parseLong(cpuInfoArray[5]);
long ioWait = Long.parseLong(cpuInfoArray[6]);
long total = user + nice + system + idle + ioWait
+ Long.parseLong(cpuInfoArray[7])
+ Long.parseLong(cpuInfoArray[8]);
String[] pidCpuInfoList = pidCpuRate.split("");
if (pidCpuInfoList.length < 17) {
return;
}
long appCpuTime = Long.parseLong(pidCpuInfoList[13])
+ Long.parseLong(pidCpuInfoList[14])
+ Long.parseLong(pidCpuInfoList[15])
+ Long.parseLong(pidCpuInfoList[16]);
// Convert data to String and save
if(mTotalLast ! =0) {
StringBuilder stringBuilder = new StringBuilder();
long idleTime = idle - mIdleLast;
long totalTime = total - mTotalLast;
stringBuilder
.append("cpu:")
.append((totalTime - idleTime) * 100L / totalTime)
.append("%")
.append("app:")
.append((appCpuTime - mAppCpuTimeLast) * 100L / totalTime)
.append("%")
.append("[")
.append("user:").append((user - mUserLast) * 100L / totalTime)
.append("%")
.append("system:").append((system - mSystemLast) * 100L / totalTime)
.append("%")
.append("ioWait:").append((ioWait - mIoWaitLast) * 100L / totalTime)
.append("%]");
// Store data in mCpuInfoEntries. Key is the current time value and is the LRU policy used
synchronized (mCpuInfoEntries) {
mCpuInfoEntries.put(System.currentTimeMillis(), stringBuilder.toString());
if (mCpuInfoEntries.size() > MAX_ENTRY_COUNT) {
for (Map.Entry<Long, String> entry : mCpuInfoEntries.entrySet()) {
Long key = entry.getKey();
mCpuInfoEntries.remove(key);
break; }}}}// Update the data for the next round
mUserLast = user;
mSystemLast = system;
mIdleLast = idle;
mIoWaitLast = ioWait;
mTotalLast = total;
mAppCpuTimeLast = appCpuTime;
}
Copy the code
The logic here is similar to StackSample: obtain CPU information and store it in mCpuInfoEntries. Key is the current time value and value is the String corresponding to CPU information, which is also the Lru policy.
Remember how we got the CPU information in onBlockEvent()? Yes, through cpuSampler. GetCpuRateInfo (), its implementation is as follows:
// Get CPU rate information
public String getCpuRateInfo(a) {
StringBuilder sb = new StringBuilder();
synchronized (mCpuInfoEntries) {
// Returns by directly iterating through mCpuInfoEntries and writing to String
for (Map.Entry<Long, String> entry : mCpuInfoEntries.entrySet()) {
long time = entry.getKey();
sb.append(BlockInfo.TIME_FORMATTER.format(time))
.append(' ') .append(entry.getValue()) .append(BlockInfo.SEPARATOR); }}return sb.toString();
}
Copy the code
This directly converts the CPU information we have stored in mCpuInfoEntries to a String.
IsCpuBusy () is the key logic of isCpuBusy().
conclusion
The core logic of BlockCanary is simple:
- 1 through
Looper
To provide thesetMessageLogging(Printer)
The function passes in a customLooperMonitor
. - 2 in the
Message
Start before executiondump
Thread stack sumcpu
Information. - 3 in
Message
Stop after executiondump
And use the time difference to judge whether there is a lag. - 4 If there is a deadlock, it will
dump
The data is parsed and passed to the developer via callbacks. - 5 developers can use this data to analyze the causes of caton.