
ANR stands for Application Not Responding: the application has stopped responding. Many people's first reaction is to analyze an ANR by combining the trace file under data/anr/trace.txt with the system log. But is this information enough to pinpoint the cause of an ANR?

When an ANR occurs

Note that time-consuming operations in an Activity's onCreate method do not by themselves trigger an ANR (the source code contains no corresponding timeout monitor for it).

System ANR monitoring

"Toutiao ANR Optimization Practice series – Design Principles and Influencing Factors" introduces the ANR monitoring process using broadcasts as the example. Here, a Service is used as the example instead.

Starting a service begins with a call to ActiveServices#realStartServiceLocked:

private final void realStartServiceLocked(ServiceRecord r,
        ProcessRecord app, boolean execInFg) throws RemoteException {
    ...
    // Post a message that starts the timer for service creation
    bumpServiceExecutingLocked(r, execInFg, "create");
    ...
    // Ask the target application to create the service through the ApplicationThread proxy
    app.thread.scheduleCreateService(r, r.serviceInfo,
            mAm.compatibilityInfoForPackageLocked(r.serviceInfo.applicationInfo),
            app.repProcState);
}

ActivityThread#scheduleCreateService()

public final void scheduleCreateService(IBinder token,
        ServiceInfo info, CompatibilityInfo compatInfo, int processState) {
    updateProcessState(processState, false);
    CreateServiceData s = new CreateServiceData();
    s.token = token;
    s.info = info;
    s.compatInfo = compatInfo;
    sendMessage(H.CREATE_SERVICE, s);
}

void sendMessage(int what, Object obj) {
    sendMessage(what, obj, 0, 0, false);
}

private void sendMessage(int what, Object obj, int arg1, int arg2, boolean async) {
    if (DEBUG_MESSAGES) Slog.v(
        TAG, "SCHEDULE " + what + " " + mH.codeToString(what)
        + ": " + arg1 + " / " + obj);
    Message msg = Message.obtain();
    msg.what = what;
    msg.obj = obj;
    msg.arg1 = arg1;
    msg.arg2 = arg2;
    if (async) {
        msg.setAsynchronous(true);
    }
    mH.sendMessage(msg);
}

As described in "App Jank Series I: Handler Synchronization Barrier", the message sent here is a synchronous message. If the current message queue contains time-consuming tasks or a large backlog of messages, the subsequent steps may not be scheduled in time.
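To make the backlog effect concrete, here is a minimal sketch in plain Java. It is a toy model in virtual time, not android.os.Looper; the class QueueDelaySketch, the Msg type, and the message names and costs are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a single-threaded message loop in virtual time; not android.os.Looper. */
final class QueueDelaySketch {

    /** Hypothetical message: a label plus the time its handler occupies the thread. */
    static final class Msg {
        final String name;
        final long costMs;

        Msg(String name, long costMs) {
            this.name = name;
            this.costMs = costMs;
        }
    }

    /**
     * For messages that are all enqueued at time 0 and handled in FIFO order,
     * returns how long each one waits before its handler starts.
     */
    static List<Long> dispatchDelays(List<Msg> pending) {
        List<Long> delays = new ArrayList<>();
        long now = 0;
        for (Msg m : pending) {
            delays.add(now);   // time already spent on earlier messages
            now += m.costMs;   // the thread is busy until this handler returns
        }
        return delays;
    }

    public static void main(String[] args) {
        List<Msg> pending = Arrays.asList(
                new Msg("slowTask", 3_000),
                new Msg("doFrame", 16),
                new Msg("CREATE_SERVICE", 5));
        // CREATE_SERVICE itself needs 5 ms but only starts after 3016 ms.
        System.out.println(dispatchDelays(pending)); // [0, 3000, 3016]
    }
}
```

In this model the CREATE_SERVICE handler is cheap, yet it starts late because of the work queued ahead of it.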

The process of creating the service is eventually handled by ActivityThread#handleCreateService.

private void handleCreateService(CreateServiceData data) {
    // If we are getting ready to gc after going to the background, well
    // we are back active so skip it.
    unscheduleGcIdler();

    LoadedApk packageInfo = getPackageInfoNoCheck(
            data.info.applicationInfo, data.compatInfo);
    Service service = null;
    try {
        java.lang.ClassLoader cl = packageInfo.getClassLoader();
        // Create the Service object
        service = packageInfo.getAppFactory()
                .instantiateService(cl, data.info.name, data.intent);
    } catch (Exception e) {
        if (!mInstrumentation.onException(service, e)) {
            throw new RuntimeException(
                "Unable to instantiate service " + data.info.name
                + ": " + e.toString(), e);
        }
    }

    try {
        if (localLOGV) Slog.v(TAG, "Creating service " + data.info.name);

        ContextImpl context = ContextImpl.createAppContext(this, packageInfo);
        context.setOuterContext(service);

        Application app = packageInfo.makeApplication(false, mInstrumentation);
        service.attach(context, this, data.info.name, data.token, app,
                ActivityManager.getService());
        // Call the Service lifecycle method onCreate
        service.onCreate();
        mServices.put(data.token, service);
        try {
            // Notify ActivityManagerService so that it cancels the ANR timeout
            ActivityManager.getService().serviceDoneExecuting(
                    data.token, SERVICE_DONE_EXECUTING_ANON, 0, 0);
        } catch (RemoteException e) {
            throw e.rethrowFromSystemServer();
        }
    } catch (Exception e) {
        if (!mInstrumentation.onException(service, e)) {
            throw new RuntimeException(
                "Unable to create service " + data.info.name
                + ": " + e.toString(), e);
        }
    }
}

Arming and disarming the ANR detection is done mainly by bumpServiceExecutingLocked and serviceDoneExecuting. The implementation is as follows:

private final void bumpServiceExecutingLocked(ServiceRecord r, boolean fg, String why) {
    ...
    scheduleServiceTimeoutLocked(r.app);
    ...
}

void scheduleServiceTimeoutLocked(ProcessRecord proc) {
    if (proc.executingServices.size() == 0 || proc.thread == null) {
        return;
    }
    Message msg = mAm.mHandler.obtainMessage(
            ActivityManagerService.SERVICE_TIMEOUT_MSG);
    msg.obj = proc;
    mAm.mHandler.sendMessageDelayed(msg,
            proc.execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
}

As you can see, service ANR detection works by having the mHandler of ActivityManagerService send a delayed message. If that message has not been removed by the time the delay expires, an ANR is raised. ActivityManagerService#serviceDoneExecuting eventually calls ActiveServices#serviceDoneExecutingLocked, which removes the message.
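The same arm-then-disarm pattern can be sketched outside Android with a ScheduledExecutorService. This is only an analogy under stated assumptions: TimeoutMonitorSketch, arm, done, the key names, and the 300 ms timeout are hypothetical and do not appear in the framework, which uses a Handler and message removal instead.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Plain-Java stand-in for "post a delayed message, remove it on completion". */
final class TimeoutMonitorSketch implements AutoCloseable {

    /** One armed timer; identity distinguishes it from a later re-arm of the same key. */
    private static final class Armed {
        ScheduledFuture<?> future;
    }

    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();
    private final Map<String, Armed> pending = new HashMap<>();
    private final Consumer<String> onTimeout;

    TimeoutMonitorSketch(Consumer<String> onTimeout) {
        this.onTimeout = onTimeout;
    }

    /** Arms the timer for a key, analogous to sending the delayed timeout message. */
    synchronized void arm(String key, long timeoutMs) {
        if (pending.containsKey(key)) {
            throw new IllegalStateException("already armed: " + key);
        }
        Armed armed = new Armed();
        armed.future = timer.schedule(() -> fire(key, armed), timeoutMs, TimeUnit.MILLISECONDS);
        pending.put(key, armed);
    }

    /** Disarms the timer; returns false if it already fired or was never armed. */
    synchronized boolean done(String key) {
        Armed armed = pending.remove(key);
        if (armed == null) {
            return false;
        }
        armed.future.cancel(false);
        return true;
    }

    private void fire(String key, Armed armed) {
        synchronized (this) {
            if (!pending.remove(key, armed)) {
                return; // done() got there first, so this is not a timeout
            }
        }
        onTimeout.accept(key);
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }

    /** One key is disarmed in time, the other never is; returns the keys that timed out. */
    static List<String> demo() {
        List<String> timedOut = new CopyOnWriteArrayList<>();
        CountDownLatch fired = new CountDownLatch(1);
        try (TimeoutMonitorSketch monitor = new TimeoutMonitorSketch(key -> {
            timedOut.add(key);
            fired.countDown();
        })) {
            monitor.arm("fastService", 300);
            monitor.done("fastService");      // completion reported: timer removed
            monitor.arm("slowService", 300);  // completion never reported
            fired.await(5, TimeUnit.SECONDS); // wait for the timeout callback
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return timedOut;
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + demo());
    }
}
```

Only the key that never calls done is reported, which is the behavior the delayed message gives the system: the timer does not care what the service was doing, only whether it was disarmed in time.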

The timeout logic of a Service is similar to that of a broadcast:

Image from: Toutiao ANR Optimization Practice series – Design Principles and Influencing Factors

From this we can conclude:

System services (AMS, InputService) send messages that carry a timeout attribute, such as Service, Receiver, and Input events, to the target process through Binder or another IPC mechanism, and then start asynchronous timeout monitoring. This monitoring is black-box: it does not check whether the execution of the message itself really timed out. Regardless of whether the message has actually started executing, or how long the execution really takes, if the server has not received the completion notice before the monitoring deadline arrives, the operation is judged to have timed out.
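A small sketch of that verdict, in plain Java. The class BlackBoxVerdictSketch, the method judgedTimedOut, and the numbers are assumptions chosen for illustration; the real monitor never sees the queue wait and the handler cost separately.

```java
/** Sketch of the verdict a black-box timeout monitor reaches; values are illustrative. */
final class BlackBoxVerdictSketch {

    /**
     * The monitor only knows when the message was sent and whether the
     * completion notice came back before the deadline. Queue wait and handler
     * cost are invisible to it; they are separate parameters here only so the
     * scenarios can be compared.
     */
    static boolean judgedTimedOut(long queueWaitMs, long handlerCostMs, long timeoutMs) {
        long completionNoticeAtMs = queueWaitMs + handlerCostMs; // measured from send time
        return completionNoticeAtMs > timeoutMs;
    }

    public static void main(String[] args) {
        long timeoutMs = 20_000; // assumed deadline for this example
        // The handler really is slow: it is blamed, and rightly so.
        System.out.println(judgedTimedOut(0, 25_000, timeoutMs));     // true
        // The handler needs 20 ms but sat behind a long backlog: blamed anyway.
        System.out.println(judgedTimedOut(19_990, 20, timeoutMs));    // true
        // A slow-ish handler with a short wait still finishes inside the window.
        System.out.println(judgedTimedOut(5_000, 14_000, timeoutMs)); // false
    }
}
```

The second scenario is the one that makes ANR analysis hard: the component named in the ANR report may have done almost no work itself.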

Capturing the ANR

In the case of a Service, ActiveServices#serviceTimeout is triggered if the ANR monitoring message is not removed within the specified time. In this method, once the ANR reason has been collected, AppErrors#appNotResponding is called:

final void appNotResponding(ProcessRecord app, ActivityRecord activity,
            ActivityRecord parent, boolean aboveSystem, final String annotation) {
       // Code omitted
       // Check for cases such as: system shutting down, ANR already being handled, app crashing, process killed by ActivityManagerService
        synchronized (mService) {
            // PowerManager.reboot() can block for a long time, so ignore ANRs while shutting down.
            if (mService.mShuttingDown) {
                Slog.i(TAG, "During shutdown skipping ANR: " + app + " " + annotation);
                return;
            } else if (app.notResponding) {
                Slog.i(TAG, "Skipping duplicate ANR: " + app + " " + annotation);
                return;
            } else if (app.crashing) {
                Slog.i(TAG, "Crashing app skipping ANR: " + app + " " + annotation);
                return;
            } else if (app.killedByAm) {
                Slog.i(TAG, "App already killed by AM skipping ANR: " + app + " " + annotation);
                return;
            } else if (app.killed) {
                Slog.i(TAG, "Skipping died app ANR: " + app + " " + annotation);
                return;
            }

            // In case we come through here for the same app before completing
            // this one, mark as anring now so we will bail out.
            app.notResponding = true;

            // Log the ANR to the event log.
            EventLog.writeEvent(EventLogTags.AM_ANR, app.userId, app.pid,
                    app.processName, app.info.flags, annotation);

            // Dump thread traces as quickly as we can, starting with "interesting" processes.
            // As the comment above says, traces should be collected as quickly as possible, so the ANR process goes first
            firstPids.add(app.pid);

            // Don't dump other PIDs if it's a background ANR
            isSilentANR = !showBackground && !isInterestingForBackgroundTraces(app);
            if (!isSilentANR) {
                int parentPid = app.pid;
                if (parent != null && parent.app != null && parent.app.pid > 0) {
                    parentPid = parent.app.pid;
                }
                if (parentPid != app.pid) firstPids.add(parentPid);

                if (MY_PID != app.pid && MY_PID != parentPid) firstPids.add(MY_PID);

                for (int i = mService.mLruProcesses.size() - 1; i >= 0; i--) {
                    ProcessRecord r = mService.mLruProcesses.get(i);
                    if (r != null && r.thread != null) {
                        int pid = r.pid;
                        if (pid > 0 && pid != app.pid && pid != parentPid && pid != MY_PID) {
                            if (r.persistent) {
                                firstPids.add(pid);
                                if (DEBUG_ANR) Slog.i(TAG, "Adding persistent proc: " + r);
                            } else if (r.treatLikeActivity) {
                                firstPids.add(pid);
                                if (DEBUG_ANR) Slog.i(TAG, "Adding likely IME: " + r);
                            } else {
                                lastPids.put(pid, Boolean.TRUE);
                                if (DEBUG_ANR) Slog.i(TAG, "Adding ANR proc: " + r);
                            }
                        }
                    }
                }
            }
        }

        // Log the ANR to the main log.
        // This is the ANR log information we usually see
        StringBuilder info = new StringBuilder();
        info.setLength(0);
        info.append("ANR in ").append(app.processName);
        if (activity != null && activity.shortComponentName != null) {
            info.append(" (").append(activity.shortComponentName).append(")");
        }
        info.append("\n");
        info.append("PID: ").append(app.pid).append("\n");
        if (annotation != null) {
            info.append("Reason: ").append(annotation).append("\n");
        }
        if (parent != null && parent != activity) {
            info.append("Parent: ").append(parent.shortComponentName).append("\n");
        }

        ProcessCpuTracker processCpuTracker = new ProcessCpuTracker(true);

        // don't dump native PIDs for background ANRs unless it is the process of interest
        String[] nativeProcs = null;
        if (isSilentANR) {
            for (int i = 0; i < NATIVE_STACKS_OF_INTEREST.length; i++) {
                if (NATIVE_STACKS_OF_INTEREST[i].equals(app.processName)) {
                    nativeProcs = new String[] { app.processName };
                    break;
                }
            }
        } else {
            nativeProcs = NATIVE_STACKS_OF_INTEREST;
        }

        int[] pids = nativeProcs == null ? null : Process.getPidsForCommands(nativeProcs);
        ArrayList<Integer> nativePids = null;

        if (pids != null) {
            nativePids = new ArrayList<Integer>(pids.length);
            for (int i : pids) {
                nativePids.add(i);
            }
        }

        // For background ANRs, don't pass the ProcessCpuTracker to
        // avoid spending 1/2 second collecting stats to rank lastPids.
        // Collect trace information
        File tracesFile = ActivityManagerService.dumpStackTraces(
                true, firstPids,
                (isSilentANR) ? null : processCpuTracker,
                (isSilentANR) ? null : lastPids,
                nativePids);

        String cpuInfo = null;
        if (ActivityManagerService.MONITOR_CPU_USAGE) {
            mService.updateCpuStatsNow();
            synchronized (mService.mProcessCpuTracker) {
                cpuInfo = mService.mProcessCpuTracker.printCurrentState(anrTime);
            }
            info.append(processCpuTracker.printCurrentLoad());
            info.append(cpuInfo);
        }

        info.append(processCpuTracker.printCurrentState(anrTime));
        // Print the log
        Slog.e(TAG, info.toString());
        if (tracesFile == null) {
            // There is no trace file, so dump (only) the alleged culprit's threads to the log
            Process.sendSignal(app.pid, Process.SIGNAL_QUIT);
        }

        StatsLog.write(StatsLog.ANR_OCCURRED, app.uid, app.processName,
                activity == null ? "unknown": activity.shortComponentName, annotation,
                (app.info != null) ? (app.info.isInstantApp()
                        ? StatsLog.ANROCCURRED__IS_INSTANT_APP__TRUE
                        : StatsLog.ANROCCURRED__IS_INSTANT_APP__FALSE)
                        : StatsLog.ANROCCURRED__IS_INSTANT_APP__UNAVAILABLE,
                app != null ? (app.isInterestingToUserLocked()
                        ? StatsLog.ANROCCURRED__FOREGROUND_STATE__FOREGROUND
                        : StatsLog.ANROCCURRED__FOREGROUND_STATE__BACKGROUND)
                        : StatsLog.ANROCCURRED__FOREGROUND_STATE__UNKNOWN);
        mService.addErrorToDropBox("anr", app, app.processName, activity, parent, annotation,
                cpuInfo, tracesFile, null);

        if (mService.mController != null) {
            try {
                // 0 == show dialog, 1 = keep waiting, -1 = kill process immediately
                int res = mService.mController.appNotResponding(
                        app.processName, app.pid, info.toString());
                if (res != 0) {
                    if (res < 0 && app.pid != MY_PID) {
                        app.kill("anr", true);
                    } else {
                        synchronized (mService) {
                            mService.mServices.scheduleServiceTimeoutLocked(app);
                        }
                    }
                    return;
                }
            } catch (RemoteException e) {
                mService.mController = null;
                Watchdog.getInstance().setActivityController(null);
            }
        }

        synchronized (mService) {
            // Code omitted
            // Set the app's notResponding state, and look up the errorReportReceiver
            makeAppNotRespondingLocked(app,
                    activity != null ? activity.shortComponentName : null,
                    annotation != null ? "ANR " + annotation : "ANR",
                    info.toString());

            // Bring up the infamous App Not Responding dialog
            Message msg = Message.obtain();
            msg.what = ActivityManagerService.SHOW_NOT_RESPONDING_UI_MSG;
            msg.obj = new AppNotRespondingDialog.Data(app, activity, aboveSystem);
            // Show the ANR dialog
            mService.mUiHandler.sendMessage(msg);
        }
    }
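To show what the "ANR in ..." header built by the StringBuilder code above looks like, here is a minimal sketch that repeats only that formatting logic in plain Java. The class AnrHeaderSketch, the method anrHeader, and the process name, pid, and reason passed in main are made-up values, not output captured from a device.

```java
/** Sketch of how the "ANR in ..." header is assembled; argument values are made up. */
final class AnrHeaderSketch {

    /** Repeats the header formatting shown above; component and reason may be null. */
    static String anrHeader(String processName, String component, int pid, String reason) {
        StringBuilder info = new StringBuilder();
        info.append("ANR in ").append(processName);
        if (component != null) {
            info.append(" (").append(component).append(")");
        }
        info.append("\n");
        info.append("PID: ").append(pid).append("\n");
        if (reason != null) {
            info.append("Reason: ").append(reason).append("\n");
        }
        return info.toString();
    }

    public static void main(String[] args) {
        // Hypothetical process, pid, and reason, chosen only for illustration.
        System.out.print(anrHeader("com.example.app", null, 1234,
                "executing service com.example.app/.SyncService"));
    }
}
```

In the real method this header is followed by the CPU load and per-process CPU usage, which the sketch leaves out.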

When the system handles an ANR, it generally goes through the following steps:

  1. Determine whether the ANR is genuine and should be reported
  2. Collect information about the key processes
  3. Collect the ANR logs
  4. Dump the trace file
  5. Display the ANR dialog
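The first two steps in the list above can be sketched in plain Java as follows. This is a simplified illustration, not the framework code: Proc is a hypothetical stand-in for ProcessRecord, the pids are invented, the parent-activity pid handled by the original is left out, and a LinkedHashSet replaces the original's explicit pid comparisons.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Plain-Java sketch of steps 1 and 2; Proc is a hypothetical stand-in for ProcessRecord. */
final class AnrTriageSketch {

    static final class Proc {
        final int pid;
        final boolean persistent;
        final boolean treatLikeActivity;
        boolean notResponding;
        boolean crashing;
        boolean killed;

        Proc(int pid, boolean persistent, boolean treatLikeActivity) {
            this.pid = pid;
            this.persistent = persistent;
            this.treatLikeActivity = treatLikeActivity;
        }
    }

    /** Step 1: returns why the ANR is skipped, or null after marking it as in progress. */
    static String skipReasonOrMark(boolean shuttingDown, Proc app) {
        if (shuttingDown) return "shutting down";
        if (app.notResponding) return "duplicate ANR";
        if (app.crashing) return "app is crashing";
        if (app.killed) return "app already killed";
        app.notResponding = true; // later reports for the same app bail out early
        return null;
    }

    /**
     * Step 2: pids whose stacks are dumped first. The parent-activity pid that
     * the original also adds is left out to keep the sketch short.
     */
    static List<Integer> firstPids(Proc app, int systemPid, List<Proc> lru) {
        Set<Integer> first = new LinkedHashSet<>(); // keeps order, drops duplicates
        first.add(app.pid);
        first.add(systemPid);
        for (int i = lru.size() - 1; i >= 0; i--) { // same direction as the original loop
            Proc r = lru.get(i);
            if (r.pid > 0 && (r.persistent || r.treatLikeActivity)) {
                first.add(r.pid);
            }
        }
        return new ArrayList<>(first);
    }

    /** Hypothetical process list: one ordinary, one persistent, one IME-like process. */
    static List<Integer> demo() {
        Proc app = new Proc(1234, false, false);
        List<Proc> lru = Arrays.asList(
                new Proc(2000, false, false),
                new Proc(2001, true, false),
                new Proc(2002, false, true),
                app);
        if (skipReasonOrMark(false, app) != null) {
            return new ArrayList<>();
        }
        return firstPids(app, 1000, lru);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [1234, 1000, 2002, 2001]
    }
}
```

The ordering matters because trace collection is time-limited: the process that triggered the ANR is dumped before anything else, and ordinary processes are left for the lower-priority list.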