The common practice for lag monitoring on iOS is to measure how long the main thread's RunLoop spends in each iteration. When the elapsed time exceeds a threshold, the call stack at the time of the lag is dumped.

The main thread can lag for many reasons, and a lagging main thread makes the app slow to respond. The approach most widely used in the industry today is to monitor lag through the RunLoop: when a threshold is exceeded, the main thread's call stack is dumped.

iOS RunLoop basics

Dai Ming's diagram summarizes the RunLoop flow:

[Figure: RunLoop flow diagram by Dai Ming]

For the details, refer to the CFRunLoop source code. Its core logic can be abstracted as follows:

    {
        /// Notify observers: about to enter the run loop.
        /// An observer creates the AutoreleasePool here: _objc_autoreleasePoolPush();
        __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__(kCFRunLoopEntry);
        do {
            /// Notify observers: a Timer callback is about to fire.
            __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__(kCFRunLoopBeforeTimers);
            /// Notify observers: a Source0 (non-port-based) callback is about to fire.
            __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__(kCFRunLoopBeforeSources);
            __CFRUNLOOP_IS_CALLING_OUT_TO_A_BLOCK__(block);

            /// Fire the Source0 (non-port-based) callbacks.
            __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__(source0);
            __CFRUNLOOP_IS_CALLING_OUT_TO_A_BLOCK__(block);

            /// Notify observers: about to sleep.
            /// An observer releases and recreates the AutoreleasePool here:
            /// _objc_autoreleasePoolPop(); _objc_autoreleasePoolPush();
            __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__(kCFRunLoopBeforeWaiting);

            /// Sleep and wait for a message.
            mach_msg() -> mach_msg_trap();

            /// Notify observers: the thread has been woken up.
            __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__(kCFRunLoopAfterWaiting);

            /// If woken by a Timer, fire the Timer callback.
            __CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION__(timer);

            /// If woken by dispatch, run the blocks queued on the main queue.
            __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__(dispatched_block);

            /// If woken by a Source1 (port-based) event, handle that event.
            __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__(source1);
        } while (...);

        /// Notify observers: about to exit the run loop.
        /// An observer releases the AutoreleasePool here: _objc_autoreleasePoolPop();
        __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__(kCFRunLoopExit);
    }

The RunLoop reports the following activities (states) to its observers:

    typedef CF_OPTIONS(CFOptionFlags, CFRunLoopActivity) {
        kCFRunLoopEntry         = (1UL << 0), // about to enter the loop
        kCFRunLoopBeforeTimers  = (1UL << 1), // about to fire Timer callbacks
        kCFRunLoopBeforeSources = (1UL << 2), // about to fire Source0 callbacks
        kCFRunLoopBeforeWaiting = (1UL << 5), // about to sleep, waiting for a mach_port message
        kCFRunLoopAfterWaiting  = (1UL << 6), // woken up after receiving a mach_port message
        kCFRunLoopExit          = (1UL << 7), // about to exit the loop
        kCFRunLoopAllActivities = 0x0FFFFFFFU // all state changes
    };

The conventional solution is to register an observer for these RunLoop state changes. If the main thread stays in a working state for longer than a given threshold, the main thread is considered to be lagging.
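
The observer side of this idea can be sketched in portable C. This is a minimal illustration under stated assumptions, not Matrix's code: the loop_activity enum only mirrors the CFRunLoopActivity bit values so that the sketch compiles without CoreFoundation, and loop_state, loop_state_update, and loop_is_stalled are hypothetical names. A real implementation would call loop_state_update from a CFRunLoopObserver callback and would need atomic or locked access to the shared state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the CFRunLoopActivity flags. */
typedef enum {
    ACT_ENTRY          = 1u << 0,
    ACT_BEFORE_TIMERS  = 1u << 1,
    ACT_BEFORE_SOURCES = 1u << 2,
    ACT_BEFORE_WAITING = 1u << 5,
    ACT_AFTER_WAITING  = 1u << 6,
    ACT_EXIT           = 1u << 7
} loop_activity;

/* State shared between the observer callback and the monitor thread. */
typedef struct {
    bool     busy;          /* true while the loop is handling work */
    uint64_t busy_since_us; /* when the current busy period started */
} loop_state;

/* Called on every activity change reported by the run loop. */
void loop_state_update(loop_state *s, loop_activity act, uint64_t now_us) {
    assert(s != NULL);
    switch (act) {
    case ACT_BEFORE_TIMERS:
    case ACT_BEFORE_SOURCES:
    case ACT_AFTER_WAITING:
        if (!s->busy) {              /* a new busy period starts */
            s->busy = true;
            s->busy_since_us = now_us;
        }
        break;
    case ACT_BEFORE_WAITING:
    case ACT_EXIT:
        s->busy = false;             /* the loop goes idle */
        break;
    case ACT_ENTRY:
        break;
    }
}

/* Called from the monitor thread: has the loop been busy for too long? */
bool loop_is_stalled(const loop_state *s, uint64_t now_us, uint64_t threshold_us) {
    assert(s != NULL);
    return s->busy && now_us - s->busy_since_us > threshold_us;
}
```

Tracking only the start of a busy period keeps the monitor's check to a single subtraction, which matters because the check runs repeatedly on a background thread.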

Matrix's main-thread lag monitoring scheme

Matrix takes a similar approach:

  1. Register two observers on the main RunLoop, one that runs before all other observers and one that runs after them, to capture when the main thread starts and finishes its work.
  2. When the main thread stays in the running state longer than a given threshold, it is considered to be lagging and is marked as such.
  3. The default timeout threshold for the main RunLoop is 2 seconds, and the check period of the child thread is 1 second. Every second, the child thread checks the running state of the main thread.
  4. If the main thread's RunLoop is found to have been running for more than 2 seconds, it is considered to be lagging and a snapshot of the current threads is taken.
  5. In addition, CPU usage above 80% is also treated as a lag.
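
The thresholds in the list above amount to a small classification step, sketched here in C. The 2-second and 80% values come from the text; the lag_type enum and classify_lag function are hypothetical names for illustration, and Matrix's real check method distinguishes more cases than this sketch does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type; Matrix uses its own EDumpType values. */
typedef enum { LAG_NONE, LAG_MAIN_THREAD_BLOCK, LAG_CPU_BLOCK } lag_type;

#define RUNLOOP_TIMEOUT_US 2000000u /* 2 s main RunLoop threshold */
#define CPU_LIMIT_PERCENT  80.0     /* CPU usage threshold */

/* busy_us: how long the main RunLoop has been running without going idle.
 * cpu_percent: current CPU usage of the app. */
lag_type classify_lag(uint64_t busy_us, double cpu_percent) {
    assert(cpu_percent >= 0.0);
    if (busy_us > RUNLOOP_TIMEOUT_US) {
        return LAG_MAIN_THREAD_BLOCK;
    }
    if (cpu_percent > CPU_LIMIT_PERCENT) {
        return LAG_CPU_BLOCK;
    }
    return LAG_NONE;
}
```

The monitor thread would call such a function once per check period (1 second by default) and dump a snapshot only when the result is not LAG_NONE.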

In Matrix, the observers are registered in WCBlockMonitorMgr:

    - (void)addRunLoopObserver {
        NSRunLoop *curRunLoop = [NSRunLoop currentRunLoop];

        // context->info is not retained by the observer, so self is passed with __bridge
        CFRunLoopObserverContext context = { 0, (__bridge void *)self, NULL, NULL, NULL };

        // The first observer (order LONG_MIN, runs before all other observers)
        CFRunLoopObserverRef beginObserver = CFRunLoopObserverCreate(kCFAllocatorDefault, kCFRunLoopAllActivities, YES, LONG_MIN, &myRunLoopBeginCallback, &context);
        CFRetain(beginObserver);
        m_runLoopBeginObserver = beginObserver;

        // The last observer (order LONG_MAX, runs after all other observers)
        CFRunLoopObserverRef endObserver = CFRunLoopObserverCreate(kCFAllocatorDefault, kCFRunLoopAllActivities, YES, LONG_MAX, &myRunLoopEndCallback, &context);
        CFRetain(endObserver);
        m_runLoopEndObserver = endObserver;

        // Add the observers to the common modes
        CFRunLoopRef runloop = [curRunLoop getCFRunLoop];
        CFRunLoopAddObserver(runloop, beginObserver, kCFRunLoopCommonModes);
        CFRunLoopAddObserver(runloop, endObserver, kCFRunLoopCommonModes);

        // for InitializationRunLoopMode
        CFRunLoopObserverContext initializationContext = { 0, (__bridge void *)self, NULL, NULL, NULL };
        m_initializationBeginRunloopObserver = CFRunLoopObserverCreate(kCFAllocatorDefault, kCFRunLoopAllActivities, YES, LONG_MIN, &myInitializetionRunLoopBeginCallback, &initializationContext);
        CFRetain(m_initializationBeginRunloopObserver);

        // The end observer
        m_initializationEndRunloopObserver = CFRunLoopObserverCreate(kCFAllocatorDefault, kCFRunLoopAllActivities, YES, LONG_MAX, &myInitializetionRunLoopEndCallback, &initializationContext);
        CFRetain(m_initializationEndRunloopObserver);

        // UIInitializationRunLoopMode is the first mode the app enters right after launch
        // and is not used again once launch completes. Observing it separately makes it
        // possible to monitor the app's launch time.
        CFRunLoopAddObserver(runloop, m_initializationBeginRunloopObserver, (CFRunLoopMode) @"UIInitializationRunLoopMode");
        CFRunLoopAddObserver(runloop, m_initializationEndRunloopObserver, (CFRunLoopMode) @"UIInitializationRunLoopMode");
    }

    // The callback functions record a timestamp for each state.

As you can see, Matrix registers several observers on the RunLoop and records a timestamp whenever the RunLoop reaches one of the key states. A child thread then checks at fixed intervals whether the main thread is lagging. If it is, the child thread dumps a snapshot of the main thread. The logic is as follows:

    - (void)threadProc {
        ...
        while (YES) {
            @autoreleasepool {
                if (g_bMonitor) {
                    // Check whether the main thread is lagging
                    EDumpType dumpType = [self check];
                    if (m_bStop) {
                        break;
                    }
                    // Handle the lag according to its type, e.g. main thread blocked,
                    // high CPU usage, or more than 64 threads
                    if (dumpType != EDumpType_Unlag) {
                        if (EDumpType_BackgroundMainThreadBlock == dumpType || EDumpType_MainThreadBlock == dumpType) {
                            if (g_CurrentThreadCount > 64) {
                                // Too many threads
                                dumpType = EDumpType_BlockThreadTooMuch;
                                [self dumpFileWithType:dumpType];
                            } else {
                                // The main thread is blocked: dump the lagging main thread stack
                                // ...
                                m_potenHandledLagFile = [self dumpFileWithType:dumpType];
                            }
                        } else if (EDumpType_CPUBlock == dumpType) {
                            // ... lag caused by high CPU usage
                        } else {
                            m_potenHandledLagFile = [self dumpFileWithType:dumpType];
                        }
                    }
                    // ...
                }

                for (int nCnt = 0; nCnt < m_nIntervalTime && !m_bStop; nCnt++) {
                    if (g_MainThreadHandle && g_bMonitor) {
                        int intervalCount = g_CheckPeriodTime / g_PerStackInterval;
                        if (intervalCount <= 0) {
                            usleep(g_CheckPeriodTime);
                        } else {
                            for (int index = 0; index < intervalCount && !m_bStop; index++) {
                                // Sleep for one sampling interval
                                usleep(g_PerStackInterval);
                                // ...
                                // At each interval, cache the main thread's stack frames
                                [WCGetMainThreadUtil getCurrentMainThreadStack:^(NSUInteger pc) {
                                    stackArray[nSum] = (uintptr_t)pc;
                                    nSum++;
                                } withMaxEntries:g_StackMaxCount withThreadCount:g_CurrentThreadCount];
                                [m_pointMainThreadHandler addThreadStack:stackArray andStackCount:nSum];
                            }
                        }
                    } else {
                        usleep(g_CheckPeriodTime);
                    }
                }

                if (m_bStop) {
                    break;
                }
            }
        }
    }

As you can see, the method does three main things:

  1. Detects the type of lag
  2. Controls the check interval with the annealing algorithm
  3. Records the main thread stack

More specifically, the process is as follows:

  1. The child thread runs a while loop, which keeps it alive as a resident thread.
  2. In the child thread, the check method uses the state timestamps recorded by the RunLoop observers to decide whether a lag has occurred.
  3. If a lag is found, its type is determined and each cause is handled differently.
  4. The main thread's call stack information is updated at the interval produced by the annealing algorithm.
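
The text above does not spell out the annealing schedule. The sketch below assumes one plausible reading, which is how the Matrix-iOS documentation is usually summarized: while the same lag stack keeps recurring, nothing new is dumped and the check interval grows along the Fibonacci sequence, and the interval resets once the lag disappears or a different stack shows up. The backoff type and its functions are hypothetical names, and comparing stacks by their top frame only is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t  base_s;   /* normal check period, in seconds */
    uint32_t  prev;     /* previous Fibonacci term */
    uint32_t  curr;     /* current Fibonacci term, used as interval multiplier */
    uintptr_t last_top; /* top frame of the last dumped stack, 0 if none */
} backoff;

static void backoff_reset(backoff *b) {
    b->prev = 0;
    b->curr = 1;
    b->last_top = 0;
}

void backoff_init(backoff *b, uint32_t base_s) {
    assert(b != NULL && base_s > 0);
    b->base_s = base_s;
    backoff_reset(b);
}

/* Feed the result of one check. Returns true if a new dump should be written. */
bool backoff_on_check(backoff *b, bool lagging, uintptr_t top_frame) {
    if (!lagging) {                  /* no lag: back to the normal period */
        backoff_reset(b);
        return false;
    }
    if (top_frame != b->last_top) {  /* a different lag: dump it, restart the schedule */
        backoff_reset(b);
        b->last_top = top_frame;
        return true;
    }
    uint32_t next = b->prev + b->curr; /* same lag again: skip the dump, wait longer */
    b->prev = b->curr;
    b->curr = next;
    return false;
}

/* Interval to wait before the next check, in seconds. */
uint32_t backoff_interval_s(const backoff *b) {
    return b->base_s * b->curr;
}
```

With a base period of 1 second, a lag that keeps producing the same stack is checked after 1, 1, 2, 3, 5, ... seconds, so a long freeze does not flood the disk with identical dumps.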

Extracting the time-consuming stack at the moment of a lag

The most obvious approach is to dump the main thread's call stack at the moment a lag is detected. However, this does not reliably capture the method that actually took the time. Because checks happen at intervals, the instantaneous main thread call stack captured when the lag is detected may no longer contain the time-consuming method.

Matrix therefore caches the most recent consecutive call stack snapshots and compares them to find the stack that really consumed the time. For the detailed logic, see the Matrix-iOS lag monitoring documentation.

The idea is as follows:

  1. The lag monitor periodically captures the main thread stack and stores it in a circular queue in memory.
  2. When a main thread lag is detected, the most recent time-consuming stack is found by walking back through the stacks saved in the circular queue. The process is as follows:
    1. The function at the top of the stack is used as the feature: stacks with the same top function are treated as the same stack.
    2. Stacks are captured at a fixed interval, so the number of times a stack repeats approximates how long it ran. The more repetitions, the longer the time.
    3. Several stacks may have the same repeat count. In that case the most recent one is taken as the most time-consuming stack.
  3. After the time-consuming stack is found, KSCrash is used to dump the stack information to a file.
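
The selection rule in the list above can be sketched in C as follows. This is an illustration under stated assumptions, not Matrix's implementation: stack_ring, ring_push, and ring_most_costly are hypothetical names, the capacities are arbitrary, and "repeats" is interpreted as consecutive samples that share the same top frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_CAP   10 /* number of cached samples (arbitrary) */
#define MAX_FRAMES 32 /* frames kept per sample (arbitrary) */

typedef struct {
    uintptr_t frames[MAX_FRAMES]; /* frames[0] is the top of the stack */
    size_t    count;
} stack_sample;

typedef struct {
    stack_sample slots[RING_CAP];
    size_t next; /* slot that the next sample overwrites */
    size_t size; /* number of valid samples */
} stack_ring;

/* Store one sample, overwriting the oldest once the ring is full. */
void ring_push(stack_ring *r, const uintptr_t *frames, size_t count) {
    assert(r != NULL && count > 0 && count <= MAX_FRAMES);
    stack_sample *s = &r->slots[r->next];
    memcpy(s->frames, frames, count * sizeof *frames);
    s->count = count;
    r->next = (r->next + 1) % RING_CAP;
    if (r->size < RING_CAP) {
        r->size++;
    }
}

/* Return the sample judged most time-consuming, or NULL if the ring is empty. */
const stack_sample *ring_most_costly(const stack_ring *r) {
    const stack_sample *best = NULL;
    size_t best_run = 0, run = 0;
    uintptr_t prev_top = 0;
    size_t oldest = (r->next + RING_CAP - r->size) % RING_CAP;

    for (size_t i = 0; i < r->size; i++) { /* walk from oldest to newest */
        const stack_sample *s = &r->slots[(oldest + i) % RING_CAP];
        uintptr_t top = s->frames[0];
        run = (i > 0 && top == prev_top) ? run + 1 : 1;
        prev_top = top;
        if (run >= best_run) {             /* >= lets the newer stack win ties */
            best_run = run;
            best = s;
        }
    }
    return best;
}
```

Because samples are taken at a fixed interval, the run length multiplied by that interval gives a rough estimate of how long the selected stack was on the main thread.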

References

Advanced: iOS performance optimization series

Matrix-iOS

Cloud.tencent.com/developer/a…

Cloud.tencent.com/developer/a…

juejin.cn/post/684490…

Blog.ibireme.com/2015/05/18/…

Time.geekbang.org/column/arti…