ANR basics

The monitoring principle of ANR (Application Not Responding) is essentially a message mechanism. A delay message is set and the ANR is triggered if the timeout is Not removed. The specific logical processing is performed on the system server, including sending timeout messages, removing timeout messages, processing timeout messages and displaying ANR pop-up boxes. For app, ANR is triggered when the main thread is blocked.

classification

In many cases, the ANR has nothing to do with the component being executed (ContentProvider, Broadcast, Service) itself.

  1. Service ANR: Foreground 20s, background 200s; StartForeground timeout 10 s
  2. Broadcast ANR: 10s in the foreground and 60s in the background
  3. Input ANR: The button or touch event will not respond within 5s
  4. ContentProvider ANR: 10s, rare

Common reasons for

  1. Main thread Binder time
  2. Main thread I/O time
  3. A deadlock
  4. The lock time of the main thread
  5. Slow activity Resume/slow focus window switch (for Input ANR)
  6. The main thread has too many messages to execute, causing the corresponding component message execution delay.
  7. The execution of messages in front of the main thread takes time, causing subsequent messages to delay.
  8. It depends on the state of the phone, memory/CPU, etc
  9. StartForeground not executed in onStartCommand of Service

How to analyze

steps

  1. Determine basic information about anR

First search am_anr to determine anR occurrence time, type, timeout, process, etc.

03-04 13:45:03.781 1000 1199 14667 I am_anr: [0, 0290 com. Android systemui, 551042573, Input dispatching timed out (StatusBar, b49ba2b StatusBar (server) is not responding. Waited 8001ms for MotionEvent(action=DOWN))]

  1. View the traces file

Check the traces file for the corresponding process and point in time in /data/anr. If it is clear that the main thread is Blocked and takes a long time to cut utM/STM, use the stack to determine the cause.

  1. View the event logs and main logs of the corresponding time range

In many cases, however, the stack at which the ANR occurred is not properly captured, so additional logs are needed to determine the cause. Search “ANR in” to search pid information (within 8s after 03-04 13:45:03.781)

The key word

  1. Slow Operation /am_lifecycle_sample
  2. Slow Looper main
  3. Slow Binder /binder_sample
  4. dvm_lock_sample

Common instance

The I/o time consuming

The I/O time takes a long time to execute 110 messages, which causes that the messages cannot be executed in a long time

06-30 15:11:11. 411 1000 1895 8935 I am_anr: [1, 0583 com. Android. Settings, 952745541, Broadcast of Intent { act=android.hardware.usb.action.USB_STATE flg=0x31000010 CMP = com. Android. Settings /. Connecteddevice. Usb. UsbModeChooserReceiver (from the extras)}] 06-30 15:11:16. 453 1000 5831 5831 I Am_lifecycle_sample: [0, com. Android Settings, 110149, 98] 06-30 15:11:16. 454, 1000, 5831, 5831 W stars: PerfMonitor longMsg :seq=2 plan=15:11:01.454 late=2ms wall=14998ms running=77ms runnable=55ms IO =14144ms H = android. App. ActivityThread $h w = 110 06-30 15:11:16 procState = 2. 469, 1000, 5831, 5831 D UsbModeChooserReceiver: action = android.hardware.usb.action.USB_STATE

The startFrouground of Service is used improperly

StartForeground normally needs to be called in the onStartCommand method, not in the onCreate function of the Service. Because if your onStartCommand returns START_NOT_STICKY, the onCreate method will not be called if the process is killed and restarted, but ANR will still be used.

After the process crash, the Service will not be started immediately. Instead, the Service will be started by calculating the delay time. Here the delay time is 6268ms, and the SMS process will be started by MxActivateService after 6s in the log

05-31 19:35:17.316 1760 8998 I am_proc_died: [0279-08, com. Android. MMS, 200, 4] 05-31 19:35:17. 406 1760 8998 I am_schedule_service_restart [0, com. Android. MMS/com. Xiaomi. MMS. Transaction. MxActivateService, 6268] 05-31 19:35:23. 812 1760 1823 I am_proc_start: [0162 06101-04, com. Android. MMS, service, \ {com. Android. MMS/com. Xiaomi. MMS. Transaction. MxActivateService}] 05-31 19:35:24.031 1760 3001 I am_proc_bound: [0,16206,com.android.mms]

The timeout duration of startStartForeground ANR is 10s, and the start time is when the System server post app executes the onStartCommand of Service. ANR occurs if the startForeground method is not called within 10 seconds after onStartCommand is executed.

05-31 19:35:24.500 1760 16269 I am_anr: [06, 0162 com. Android. MMS, 952745573, the Context, startForegroundService () did not then call Service. StartForeground () : ServiceRecord{3201640 u0 com.android.mms/com.xiaomi.mms.transaction.MxActivateService}]

The lock time of the main thread

Within 8s after ANR occurred, the home main thread had an equal lock for 11s. As a result, the input event was not processed in time.

02-07 16:24:51. 941 1000 1237 8731 I am_anr: [5, 0185 com. Beautiful miui. Home, 953794117, Input dispatching timed out (com.miui.home/com.miui.home.recents.settings.NavigationBarTypeActivity, 1d271ca com.miui.home/com.miui.home.recents.settings.NavigationBarTypeActivity (server) is not responding. Waited 8000ms For FocusEvent(hasFocus=false)] 02-07 16:24:52.566 10074 1855 1855 I DVM_lock_SAMPLE: [com.miui.home,1,main,11259,Launcher.java,7312,void com.miui.home.launcher.Launcher.bindAppsChanged(java.util.List, java.util.List, java.util.List),LauncherModel.java,1468,void com.miui.home.launcher.LauncherModel$LoaderTask.run(),100]

Binder time-consuming

App queries the Content provider of another process, binder takes more than 5s

06-25 16:30:49. 584 1000 1894 16604 I am_anr: [1, 0617 com. Beautiful miui. Cloudservice, 818429509, Input dispatching timed out (com.miui.cloudservice/com.miui.cloudservice.ui.MiCloudMainActivity, 7cfa4a0 com.miui.cloudservice/com.miui.cloudservice.ui.MiCloudMainActivity(server) is not responding. Waited 5002ms for FocusEvent(hasFocus=false)] 06-25 16:30:49.872 10141 6171 6171 W BpBinder: PerfMonitor binderTransact: Time = 5416 ms interface = android. Content. 06-25 16:30:49 IContentProvider code = 1, 893, 10141, 6171, 6171 W Activity: Slow Operation: Activity com.miui.cloudservice/.ui.ShareLocationProxyActivity onCreate took 5440ms

The pid of the peer process is 7467, and isLocked holds the lock for more than 5s. As a result, isOpen isLocked

outgoing transaction: from 6171: 6171 to 7467:14336 context:binder code: 1 duration: 5.12s 06-25 16:30:49.871 9810 7467 14336 W Aomi. Finddevic: Long monitor contention with owner IntentService[FindDeviceAsyncControlService] (16551) at boolean com.xiaomi.finddevice.v2.FindDeviceStatusManagerInternal.isLocked(android.content.Context(FindDeviceStatusManagerInterna L.j ava: 85) waiters = 1 in Boolean com. Xiaomi. Finddevice. V2. FindDeviceStatusManagerInternal. IsOpen () for 5.415 s

A complex scenario

The app main thread equal-lock 0x083A79BD is Blocked

"main" prio=5 tid=1 Blocked // The main thread state is Blocked
  | group="main" sCount=1 dsCount=0 flags=1 obj=0x70d8db70 self=0xecabae00
  | sysTid=15494 nice=-10 cgrp=default sched=0/0 handle=0xed020470
  | state=S schedstat=( 391677894 98428857 812 ) utm=30 stm=8 core=4 HZ=100
  | stack=0xff64f000-0xff651000 stackSize=8192KB
  | held mutexes=
  at android.app.ActivityThread.acquireProvider(ActivityThread.java:7237)
  - waiting to lock <0x083a79bd> (a android.app.ActivityThread$ProviderKey) held by thread 59 / / 0 x083a79bd lock
  at android.app.ContextImpl$ApplicationContentResolver.acquireUnstableProvider(ContextImpl.java:2954)
  at android.content.ContentResolver.acquireUnstableProvider(ContentResolver.java:2509)
Copy the code

The lock 0x083A79BD is held by the ad-SingleThread thread, which is currently waiting for a response from the Binder peer

"Ad-SingleThread" prio=5 tid=59 Native
  | group="main" sCount=1 dsCount=0 flags=1 obj=0x151c8298 self=0xcd7c2800
  | sysTid=15697 nice=0 cgrp=default sched=0/0 handle=0xb3be61c0
  | state=S schedstat=( 6271455 10639325 58 ) utm=0 stm=0 core=5 HZ=100
  | stack=0xb3ae3000-0xb3ae5000 stackSize=1040KB
  | held mutexes=
  native: #00 pc 0009b220  /apex/com.android.runtime/lib/bionic/libc.so (__ioctl+8)
  native: #01 pc 00069127  /apex/com.android.runtime/lib/bionic/libc.so (ioctl+26)
  native: #02 pc 00039f1b  /system/lib/libbinder.so (android::IPCThreadState::talkWithDriver(bool)+238)
  native: #03 pc 0003ab8b  /system/lib/libbinder.so (android::IPCThreadState::waitForResponse(android::Parcel*, int*) +86) // etc binder peer end returns
  native: #04 pc 0003a92b  /system/lib/libbinder.so (android::IPCThreadState::transact(int, unsigned int, android::Parcel const&, android::Parcel*, unsigned int) +126)
  native: #05 pc 00035563  /system/lib/libbinder.so (android::BpBinder::transact(unsigned int, android::Parcel const&, android::Parcel*, unsigned int) +142)
  native: #06 pc 000cc93d  /system/lib/libandroid_runtime.so (android_os_BinderProxy_transact(_JNIEnv*, _jobject*, int, _jobject*, _jobject*, int) +216)
  at android.os.BinderProxy.transactNative(Native method)
  at android.os.BinderProxy.transact(BinderProxy.java:595)
  at android.app.IActivityManager$Stub$Proxy.getContentProvider(IActivityManager.java:6002) // Communicate with AMS Binder
  at android.app.ActivityThread.acquireProvider(ActivityThread.java:7264)
  - locked <0x006b0aac> (a android.app.ActivityThread$ProviderAcquiringCount)
  - locked <0x083a79bd> (a android.app.ActivityThread$ProviderKey) / / 0 x083a79bd a lock
  at android.app.ContextImpl$ApplicationContentResolver.acquireUnstableProvider(ContextImpl.java:2954)
Copy the code

Check the System Server process in traces and search getContentProvider for the corresponding binder thread

"Binder:1606_C" prio=5 tid=162 TimedWaiting // The thread is in wait state
  | group="main" sCount=1 dsCount=0 flags=1 obj=0x137a04b0 self=0xb4000071741c4000
  | sysTid=2842 nice=0 cgrp=default sched=0/0 handle=0x7160821cc0
  | state=S schedstat=( 159202795207 108681383188 879023 ) utm=10372 stm=5548 core=5 HZ=100 // The utM is more than 10s, which is abnormal
  | stack=0x716072a000-0x716072c000 stackSize=995KB
  | held mutexes=
  at java.lang.Object.wait(Native method)
  - waiting on <0x008a756c> (a com.android.server.am.ContentProviderRecord) // wait
  at java.lang.Object.wait(Object.java:442)
  at com.android.server.am.ActivityManagerService.getContentProviderImpl(ActivityManagerService.java:8014) 
  - locked <0x008a756c> (a com.android.server.am.ContentProviderRecord) / / 0 x008a756c a lock
  at com.android.server.am.ActivityManagerService.getContentProvider(ActivityManagerService.java:8148) // The binder thread executes getContentProvider
  at android.app.IActivityManager$Stub.onTransact(IActivityManager.java:2523)
  at com.android.server.am.ActivityManagerService.onTransact(ActivityManagerService.java:3147)
  at android.os.Binder.execTransactInternal(Binder.java:1160)
  at android.os.Binder.execTransact(Binder.java:1129)
Copy the code

Find the code corresponding to the stack above in AMS. This code is waiting for the release of the corresponding provider. If the provider process is not started, the process will be pulled first. The timeout period is 20s.

// Wait for the provider to be published...
7991        final long timeout =
7992                SystemClock.uptimeMillis() + ContentResolver.CONTENT_PROVIDER_READY_TIMEOUT_MILLIS;
7993        boolean timedOut = false;
7994        synchronized (cpr) { / / the lock.8007                try {
8008                    final long wait = Math.max(0L, timeout - SystemClock.uptimeMillis());
8012                    if(conn ! =null) {
8013                        conn.waiting = true;
8014                    }
8015                    cpr.wait(wait); // Wait for the lock
8016                    if (cpr.provider == null) {
8017                        timedOut = true;
8018                        break;
8019                    }
8020                } catch (InterruptedException ex) {
8021.Copy the code

Combined with the event log of the peer process (where the Content Provider resides) at the corresponding point in time

06-23 21:00:05. 484 1000 1606 1749 I am_proc_start [0152, 87100, com. Beautiful miui. Securitycenter. Remote, content The provider, {com. Beautiful miui. Securitycenter/com. Beautiful miui. Idprovider. Idprovider}] 06-23 21:00:05. 555 1000 1606 2891 I am_proc_bound: [0152, 87, com. Beautiful miui. Securitycenter. Remote] 06-23 21:00:05. 656 1000 1606 2891 I am_kill: [0152, 87, com. Beautiful miui. Securitycenter. Remote, 100, the stop com. Beautiful miui. Securitycenter due to the from Process: com. Beautiful miui. Securitycenter. 06-23 21:00:05 remote]. 688, 1000, 1606, 1748 I am_wtf: [0615 5, system_server, 1, ActivityManager, a Timeout waiting for the provider com. Beautiful miui. Securitycenter / 1000 for the provider Com. Beautiful miui. Idprovider providerRunning = false caller = com. Beautiful miui. 06-23 21:00:16 systemAdSolution / 10111]. 592, 10102, 15899, 15950 W BpBinder: PerfMonitor binderTransact : Time = 5464 ms interface = android. App. 06-23 21:00:16 IActivityManager code = 22, 591, 10153, 3966, 4704 W BpBinder: PerfMonitor binderTransact : Time =10529ms interface=android.app**.IActivityManager** code=22 06-23 21:00:16.592 10153 3966 4704 E ActivityThread: Failed to find provider info for com.miui.idprovider

In conclusion: When the ContentProvider pulls up the process, the process is killed just before the Publish Provider is published. Therefore, getContentProviderImpl takes time to wait, which leads to bINDEt time, app terminal thread lock holding time, and main thread lock time, and ANR occurs.

ANR principle

Service ANR

Before service processes are executed, the System server sends a delay message to the main Handler. If the APP segment does not notify the System Server segment to remove the delay message in the corresponding delay message, the app segment executes a timeout message and triggers anR. SERVICE_TIMEOUT_MSG delay The delay is 20s for foreground service and 200s for background service


void scheduleServiceTimeoutLocked(ProcessRecord proc) {
       if (proc.executingServices.size() == 0 || proc.thread == null) {
           return;
       }
       Message msg = mAm.mHandler.obtainMessage(
               ActivityManagerService.SERVICE_TIMEOUT_MSG);
       msg.obj = proc;
       mAm.mHandler.sendMessageDelayed(msg,
               proc.execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
   }
Copy the code

The ANR of a service is summarized as three processes in the following table:

Cause/Process ActiveServices Service timeout
create realStartServiceLocked onCreate 20s/200s
start sendServiceArgs onStartCommand 20s/200s
foreground sendServiceArgs onStartCommand 10s

Create a service

Sending delay messages

Send delay message in ActiveService realStartServiceLocked method, foreground 20s, background 200s; Then notify the APP to execute the create Service process.

    private void realStartServiceLocked(ServiceRecord r, ProcessRecord app,
            IApplicationThread thread, int pid, UidRecord uidRecord, boolean execInFg,
            boolean enqueueOomAdj) throws RemoteException {...// Send delay message first
        bumpServiceExecutingLocked(r, execInFg, "create".null /* oomAdjReason */); .boolean created = false;
        try{...// Bind app to execute Service onCreate
            thread.scheduleCreateService(r, r.serviceInfo,
                    mAm.compatibilityInfoForPackage(r.serviceInfo.applicationInfo),
                    app.mState.getReportedProcState());
            r.postNotification();
            created = true; . }catch (DeadObjectException e) {
Copy the code

Remove delay messages

After receiving the system Server’s Binder callback to the scheduleCreateService, the APP sends a message to the Handler of the main thread. After the onCreate method of the Service is executed, the app sends a message to the Handler of the main thread. By binder serviceDoneExecuting told system, server has been completed in removing MainHandler related delay news.

Start the service

Sending delay messages

The system server posts a SERVICE_TIMEOUT_MSG delay message to MainHandler before the start service. The sendServiceArgsLocked method is used by the system server. The corresponding app side is the onStartCommand method.

When startService is created, call sendServiceArgsLocked if service exists. Otherwise, wait until service create before calling the method.

    private final void sendServiceArgsLocked(ServiceRecord r, boolean execInFg,
            boolean oomAdjusted) throws TransactionTooLargeException {
        ArrayList<ServiceStartArgs> args = new ArrayList<>();

        while (r.pendingStarts.size() > 0) {
            ServiceRecord.StartItem si = r.pendingStarts.remove(0); .// Send a timeout message related to start
            bumpServiceExecutingLocked(r, execInFg, "start".null /* oomAdjReason */);
            if(r.fgRequired && ! r.fgWaiting) {if(! r.isForeground) {// If the foreground is service and isForeground is not assigned, the foreground timeout message is sent
                    scheduleServiceForegroundTransitionTimeoutLocked(r);
                } else {
                    r.fgRequired = false; }}...try {
        // Binder notifies app to execute onStartCommand
            r.app.getThread().scheduleServiceArgs(r, slice);
        } catch (TransactionTooLargeException e) {
Copy the code

Remove delay messages

After receiving the System Server’s Binder callback, the app sends a message to the main thread Handler and waits for the message to be executedonStartCommandMethod is executed through binderserviceDoneExecutingNotifies the system server that the execution is complete and removes the delay messages in the MainHandler.

The foreground service calls startForeground timeout

Sending delay messages

Call startForegroundService to start the foreground Service, Post a SERVICE_FOREGROUND_TIMEOUT_MSG message of delay 10s to MainHandler before onStartCommand is invoked

If the client is not invoked within 10 secondsstartForeground, triggers ANR

Remove delay messages

The client calls startForeground

Service. StartForeground – AMS. SetServiceForeground – ActiveServices. SetServiceForegroundLocked – > SetServiceForegroundInnerLocked – MainHandler. RemoveMessages

Execute a timeout message, triggering ANR

  • If binder time-consuming (relatively rare);
  • It takes time for the main thread of app to execute some messages, resulting in service related messages not being executed.
  • Or the periodic methods onCreate and onStartCommand of the Service itself perform time-consuming operations.

If the APP fails to call serviceDoneExecuting by binder within 20s (Foreground Service), 200s (background Service) or 10s (startForeground service), MainHandler will execute the delay message. And call appNotResponding method, resulting in ANR phenomenon.

MainHandler. – > ActiveServices serviceTimeout (serviceForegroundTimeout) – > ProcessRecord. AppNotResponding

Broacast timeout

  1. Sending delay messages

The foreground broadcasts for 10s and the background broadcasts for 60s

BroadcastHandler – BroadcastQueue. ProcessNextBroadcast – processNextBroadcastLocked – setBroadcastTimeoutLocked – > BroadcastHandler sendMessageAtTime

  1. Remove delay messages

Binder to AMS. FinishReceiver – BroadcastQueue. ProcessNextBroadcastLocked – cancelBroadcastTimeoutLocked – BroadcastHandler.removeMessages

  1. Execute delay message

BroadcastHandler – BroadcastQueue. BroadcastTimeoutLocked – AppNotResponding. The run – > ProcessRecord. AppNotResponding

ContentProvider timeout

  1. Sending delay messages

Binder to AMS. AttachApplication – attachApplicationLocked – MainHandler. SendMessageDelayed

static final int CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG = 57;      
if(providers ! =null && checkAppInLaunchingProvidersLocked(app)) {
    Message msg = mHandler.obtainMessage(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG);
    msg.obj = app;
    mHandler.sendMessageDelayed(msg, CONTENT_PROVIDER_PUBLISH_TIMEOUT);
}
Copy the code
  1. Remove delay messages

Binder to AMS. PublishContentProviders – MainHandler. RemoveMessages

  1. Execute delay message

Kill the process directly and clean up the associated information, Won’t play ANR MainHandler window – > processContentProviderPublishTimedOutLocked – cleanupAppInLaunchingProvidersLocked – play ProcessList.removeProcessLocked

Input the timeout

void InputDispatcher::processNoFocusedWindowAnrLocked() {
    // Check if the application that we are waiting for is still focused.
    sp<InputApplicationHandle> focusedApplication =
            getValueByKey(mFocusedApplicationHandlesByDisplay, mAwaitedApplicationDisplayId);
    if(focusedApplication == nullptr || focusedApplication->getApplicationToken() ! = mAwaitedFocusedApplication->getApplicationToken()) {// Unexpected because we should have reset the ANR timer when focused application changed
        ALOGE("Waited for a focused window, but focused application has already changed to %s",
              focusedApplication->getName().c_str());
        return; // The focused application has changed.
    }
 
    const sp<InputWindowHandle>& focusedWindowHandle =
            getFocusedWindowHandleLocked(mAwaitedApplicationDisplayId);
    if(focusedWindowHandle ! = nullptr) {return; // We now have a focused window. No need for ANR.
    }
    onAnrLocked(mAwaitedFocusedApplication);
}
Copy the code

The stack is reported as follows:

inputDispatchingTimedOut:20657, ActivityManagerService (com.android.server.am)
inputDispatchingTimedOut:20309, ActivityManagerService$LocalService (com.android.server.am)
keyDispatchingTimedOut:6334, ActivityRecord (com.android.server.wm)
notifyANRInner:261, InputManagerCallback (com.android.server.wm)
notifyANR:179, InputManagerCallback (com.android.server.wm)
notifyANR:2300, InputManagerService (com.android.server.input)

notifyANR:535DoNotifyAnrLockedInterruptible, InputDispatcher (inputflinger dispatcher) :4902OnAnrLocked, InputDispatcher (inputflinger dispatcher) :4833ProcessNoFocusedWindowAnrLocked, InputDispatcher (inputflinger dispatcher) :535, InputDispatcher (inputflinger dispatcher)Copy the code