How iOS memory aborts (Jetsam) work

Apple recently open-sourced the XNU kernel code used by iOS, and I have recently started working on stability and performance for our mobile apps, so I took the chance to read through it. I want to start by analyzing how Jetsam, the mechanism behind memory-related aborts, works.

What is Jetsam

Some readers may not be familiar with Jetsam. On the phone, go to Settings -> Privacy -> Analytics to look at the system logs, and you will find many logs whose names begin with JetsamEvent. Opening one of these logs usually shows data such as memory size and CPU time.

JetsamEvents occur mainly because memory is limited: iOS devices have no swap area, so when memory runs short the kernel has to kill some low-priority or high-memory processes. A JetsamEvent is the record the system writes after an app is killed this way.

To some extent, a JetsamEvent is another kind of crash event. However, because of the limits on which signals can be caught on iOS, an app that is killed for memory reasons cannot catch its own termination. For this reason, many teams in the industry record flags of their own to detect these so-called abort events and collect data. But this kind of abort collection is typically just a count, without a detailed stack.
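
The flag-based approach mentioned above can be sketched roughly as follows. This is a minimal illustration in C, not any particular team's implementation: the names launch_flags_t and classify_previous_exit, and the specific set of flags, are assumptions made for this example. The idea is to persist a few markers during a run and, on the next launch, rule out every exit that can be explained; whatever is left is counted as a suspected memory abort.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical markers persisted during the previous run. */
typedef struct {
    bool was_running;        /* set at launch; a kill never clears it */
    bool exited_cleanly;     /* set when the app terminated normally */
    bool crash_reported;     /* set by the crash handler (signal/exception) */
    bool app_or_os_upgraded; /* version changed between the two launches */
    bool was_foreground;     /* app was in the foreground when last seen */
} launch_flags_t;

typedef enum {
    PREV_FIRST_LAUNCH,
    PREV_CLEAN_EXIT,
    PREV_CRASH,
    PREV_UPGRADE,
    PREV_SUSPECTED_FOREGROUND_ABORT,
    PREV_SUSPECTED_BACKGROUND_ABORT
} prev_exit_t;

/* Rule out every explainable exit; what remains is a suspected abort. */
prev_exit_t classify_previous_exit(const launch_flags_t *f)
{
    assert(f != NULL);
    if (!f->was_running)       return PREV_FIRST_LAUNCH;
    if (f->exited_cleanly)     return PREV_CLEAN_EXIT;
    if (f->crash_reported)     return PREV_CRASH;
    if (f->app_or_os_upgraded) return PREV_UPGRADE;
    return f->was_foreground ? PREV_SUSPECTED_FOREGROUND_ABORT
                             : PREV_SUSPECTED_BACKGROUND_ABORT;
}
```

As the paragraph above notes, a scheme like this only yields a count per category; it carries no stack or memory detail about why the process was killed.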

Exploring the source code

macOS/iOS is derived from BSD. The kernel is Mach, but the interfaces exposed to upper layers are generally BSD interfaces wrapped around Mach. Mach is a microkernel architecture, and the real virtual memory management happens there; BSD provides the relatively high-level memory management interfaces, and the common Jetsam events are also generated by BSD. So let's explore the principle using the bsd_init function as the entry point.

bsd_init is basically about initializing the various subsystems, such as virtual memory management.

The memory-related steps include the following:

1. Initialize the BSD memory zone, which is built on top of the Mach kernel zone:

    kmeminit();

2. A feature unique to iOS: the resident monitoring thread for memory and process freezing:

    #if CONFIG_FREEZE
    #ifndef CONFIG_MEMORYSTATUS
        #error "CONFIG_FREEZE defined without matching CONFIG_MEMORYSTATUS"
    #endif
        /* Initialise background freezing */
        bsd_init_kprintf("calling memorystatus_freeze_init\n");
        memorystatus_freeze_init();
    #endif

3. Also unique to iOS: Jetsam, the resident monitoring thread for low-memory events:

    #if CONFIG_MEMORYSTATUS
        /* Initialize kernel memory status notifications */
        bsd_init_kprintf("calling memorystatus_init\n");
        memorystatus_init();
    #endif /* CONFIG_MEMORYSTATUS */

The last two steps call interfaces exposed by kern_memorystatus.c, which start two kernel threads at the highest priority to monitor memory usage.

Let's look at the CONFIG_FREEZE feature first. When it is enabled, the kernel freezes a process instead of killing it.

Freezing is carried out by a kernel thread, memorystatus_freeze_thread. When it is signaled, the thread calls memorystatus_freeze_top_process to do the freezing.

Process freezing involves other concepts from macOS that would need to be discussed, and untangling them is a fairly big topic, so I will cover it in a separate article.

Getting back to iOS aborts, we only need to focus on memorystatus_init. With the code that is irrelevant to the platform removed, it looks like this:

__private_extern__ void
memorystatus_init(void)
{
    thread_t thread = THREAD_NULL;
    kern_return_t result;
    int i;

    /* Init buckets */
    // Note 1: the priority array; each entry holds a list of processes at the same priority
    for (i = 0; i < MEMSTAT_BUCKET_COUNT; i++) {
        TAILQ_INIT(&memstat_bucket[i].list);
        memstat_bucket[i].count = 0;
    }
    memorystatus_idle_demotion_call = thread_call_allocate((thread_call_func_t)memorystatus_perform_idle_demotion, NULL);

#if CONFIG_JETSAM

    nanoseconds_to_absolutetime((uint64_t)DEFERRED_IDLE_EXIT_TIME_SECS * NSEC_PER_SEC, &memorystatus_sysprocs_idle_delay_time);
    nanoseconds_to_absolutetime((uint64_t)DEFERRED_IDLE_EXIT_TIME_SECS * NSEC_PER_SEC, &memorystatus_apps_idle_delay_time);

    /* Apply overrides */
    // Note 2: read a series of kernel parameters
    PE_get_default("kern.jetsam_delta", &delta_percentage, sizeof(delta_percentage));
    if (delta_percentage == 0) {
        delta_percentage = 5;
    }
    assert(delta_percentage < 100);
    PE_get_default("kern.jetsam_critical_threshold", &critical_threshold_percentage, sizeof(critical_threshold_percentage));
    assert(critical_threshold_percentage < 100);
    PE_get_default("kern.jetsam_idle_offset", &idle_offset_percentage, sizeof(idle_offset_percentage));
    assert(idle_offset_percentage < 100);
    PE_get_default("kern.jetsam_pressure_threshold", &pressure_threshold_percentage, sizeof(pressure_threshold_percentage));
    assert(pressure_threshold_percentage < 100);
    PE_get_default("kern.jetsam_freeze_threshold", &freeze_threshold_percentage, sizeof(freeze_threshold_percentage));
    assert(freeze_threshold_percentage < 100);

    if (!PE_parse_boot_argn("jetsam_aging_policy", &jetsam_aging_policy,
            sizeof (jetsam_aging_policy))) {

        if (!PE_get_default("kern.jetsam_aging_policy", &jetsam_aging_policy,
                sizeof(jetsam_aging_policy))) {

            jetsam_aging_policy = kJetsamAgingPolicyLegacy;
        }
    }

    if (jetsam_aging_policy > kJetsamAgingPolicyMax) {
        jetsam_aging_policy = kJetsamAgingPolicyLegacy;
    }

    switch (jetsam_aging_policy) {

        case kJetsamAgingPolicyNone:
            system_procs_aging_band = JETSAM_PRIORITY_IDLE;
            applications_aging_band = JETSAM_PRIORITY_IDLE;
            break;

        case kJetsamAgingPolicyLegacy:
            /*
             * Legacy behavior where some daemons get a 10s protection once
             * AND only before the first clean->dirty->clean transition before
             * going into IDLE band.
             */
            system_procs_aging_band = JETSAM_PRIORITY_AGING_BAND1;
            applications_aging_band = JETSAM_PRIORITY_IDLE;
            break;

        case kJetsamAgingPolicySysProcsReclaimedFirst:
            system_procs_aging_band = JETSAM_PRIORITY_AGING_BAND1;
            applications_aging_band = JETSAM_PRIORITY_AGING_BAND2;
            break;

        case kJetsamAgingPolicyAppsReclaimedFirst:
            system_procs_aging_band = JETSAM_PRIORITY_AGING_BAND2;
            applications_aging_band = JETSAM_PRIORITY_AGING_BAND1;
            break;

        default:
            break;
    }

    /*
     * The aging bands cannot overlap with the JETSAM_PRIORITY_ELEVATED_INACTIVE
     * band and must be below it in priority. This is so that we don't have to make
     * our 'aging' code worry about a mix of processes, some of which need to age
     * and some others that need to stay elevated in the jetsam bands.
     */
    assert(JETSAM_PRIORITY_ELEVATED_INACTIVE > system_procs_aging_band);
    assert(JETSAM_PRIORITY_ELEVATED_INACTIVE > applications_aging_band);

    /* Take snapshots for idle-exit kills by default? First check the boot-arg... */
    if (!PE_parse_boot_argn("jetsam_idle_snapshot", &memorystatus_idle_snapshot, sizeof (memorystatus_idle_snapshot))) {
            /* ...no boot-arg, so check the device tree */
            PE_get_default("kern.jetsam_idle_snapshot", &memorystatus_idle_snapshot, sizeof(memorystatus_idle_snapshot));
    }

    memorystatus_delta = delta_percentage * atop_64(max_mem) / 100;
    memorystatus_available_pages_critical_idle_offset = idle_offset_percentage * atop_64(max_mem) / 100;
    memorystatus_available_pages_critical_base = (critical_threshold_percentage / delta_percentage) * memorystatus_delta;
    memorystatus_policy_more_free_offset_pages = (policy_more_free_offset_percentage / delta_percentage) * memorystatus_delta;

    /* Jetsam Loop Detection */
    if (max_mem <= (512 * 1024 * 1024)) {
        /* 512 MB devices */
        memorystatus_jld_eval_period_msecs = 8000;    /* 8000 msecs == 8 second window */
    } else {
        /* 1GB and larger devices */
        memorystatus_jld_eval_period_msecs = 6000;    /* 6000 msecs == 6 second window */
    }

    memorystatus_jld_enabled = TRUE;

    /* No contention at this point */
    memorystatus_update_levels_locked(FALSE);

#endif /* CONFIG_JETSAM */

    memorystatus_jetsam_snapshot_max = maxproc;
    memorystatus_jetsam_snapshot = 
        (memorystatus_jetsam_snapshot_t*)kalloc(sizeof(memorystatus_jetsam_snapshot_t) +
        sizeof(memorystatus_jetsam_snapshot_entry_t) * memorystatus_jetsam_snapshot_max);
    if (!memorystatus_jetsam_snapshot) {
        panic("Could not allocate memorystatus_jetsam_snapshot");
    }

    nanoseconds_to_absolutetime((uint64_t)JETSAM_SNAPSHOT_TIMEOUT_SECS * NSEC_PER_SEC, &memorystatus_jetsam_snapshot_timeout);

    memset(&memorystatus_at_boot_snapshot, 0, sizeof(memorystatus_jetsam_snapshot_t));

    result = kernel_thread_start_priority(memorystatus_thread, NULL, 95 /* MAXPRI_KERNEL */, &thread);
    if (result == KERN_SUCCESS) {
        thread_deallocate(thread);
    } else {
        panic("Could not create memorystatus_thread");
    }
}

A few points are worth noting here:

  • The kernel has a priority distribution for all processes, maintained through an array, where each entry is a list of processes. The size of this array is JETSAM_PRIORITY_MAX + 1. Its structure is defined as follows:

    typedef struct memstat_bucket {
        TAILQ_HEAD(, proc) list;
        int count;
    } memstat_bucket_t;

    The structure is very straightforward.

  • Threads have different priorities under Mach. MAXPRI_KERNEL is the highest priority in the range available to kernel threads. The full set of levels is as follows:


    // Real-time threads with the highest priority (it is not clear who uses them)
     * 127  Reserved (real-time)
     *          A
     *          +
     *      (32 levels)
     *          +
     *          V
     *  96  Reserved (real-time)
    // MAXPRI_KERNEL
     *  95  Kernel mode only
     *          A
     *          +
     *      (16 levels)
     *          +
     *          V
     *  80  Kernel mode only
    // The priorities allocated to the operating system
     *  79  System high priority
     *          A
     *          +
     *      (16 levels)
     *          +
     *          V
     *  64  System high priority
     *  63  Elevated priorities
     *          A
     *          +
     *      (12 levels)
     *          +
     *          V
     *  52  Elevated priorities
     *  51  Elevated priorities (incl. BSD +nice)
     *          A
     *          +
     *      (20 levels)
     *          +
     *          V
     *  32  Elevated priorities (incl. BSD +nice)
     *  31  Default (default base for threads)
     *  30  Lowered priorities (incl. BSD -nice)
     *          A
     *          +
     *      (20 levels)
     *          +
     *          V
     *  11  Lowered priorities (incl. BSD -nice)
     *  10  Lowered priorities (aged pri's)
     *          A
     *          +
     *      (11 levels)
     *          +
     *          V
     *   0  Lowered priorities (aged pri's / idle)

  • As the table above shows, threads of user-mode applications cannot have a higher priority than those of the operating system and the kernel. Thread priorities also differ between user-mode applications: foreground applications take precedence over background ones, and SpringBoard has the highest priority among apps on iOS.
  • Of course, thread priorities are not static. Mach adjusts them dynamically based on each thread's CPU utilization and the overall system load. A thread that consumes too much CPU has its priority lowered, and a thread that is starved of CPU has its priority raised. However, a thread cannot move outside the priority range it belongs to.
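
To make the priority-bucket array from the first point more concrete, here is a small user-space sketch built on the TAILQ macros from <sys/queue.h>. It is only an illustration: struct sketch_proc is a stand-in for the kernel's struct proc, SKETCH_PRIORITY_MAX is an arbitrary value rather than the kernel's JETSAM_PRIORITY_MAX, and treating index 0 as the lowest priority (the first place to look for a kill candidate) is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

#define SKETCH_PRIORITY_MAX 9  /* hypothetical; not the kernel's value */
#define SKETCH_BUCKET_COUNT (SKETCH_PRIORITY_MAX + 1)

/* Stand-in for the kernel's struct proc. */
struct sketch_proc {
    int pid;
    int priority;                   /* index of the bucket it lives in */
    TAILQ_ENTRY(sketch_proc) link;  /* linkage within that bucket's list */
};

/* Same shape as memstat_bucket_t: a list head plus a counter. */
typedef struct sketch_bucket {
    TAILQ_HEAD(, sketch_proc) list;
    int count;
} sketch_bucket_t;

static sketch_bucket_t buckets[SKETCH_BUCKET_COUNT];

/* Mirrors the "Init buckets" loop in memorystatus_init. */
void sketch_init(void)
{
    for (int i = 0; i < SKETCH_BUCKET_COUNT; i++) {
        TAILQ_INIT(&buckets[i].list);
        buckets[i].count = 0;
    }
}

/* Append a process to the bucket matching its priority. */
void sketch_insert(struct sketch_proc *p)
{
    assert(p != NULL);
    assert(p->priority >= 0 && p->priority < SKETCH_BUCKET_COUNT);
    TAILQ_INSERT_TAIL(&buckets[p->priority].list, p, link);
    buckets[p->priority].count++;
}

/* Return the first process in the lowest non-empty bucket, or NULL. */
struct sketch_proc *sketch_first_candidate(void)
{
    for (int i = 0; i < SKETCH_BUCKET_COUNT; i++) {
        struct sketch_proc *p = TAILQ_FIRST(&buckets[i].list);
        if (p != NULL) {
            return p;
        }
    }
    return NULL;
}
```

Keeping one list per priority means a scan for the next candidate only has to find the first non-empty bucket, and the per-bucket count makes it cheap to ask how many processes sit in a given band.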

So, with the preliminaries out of the way, how does Apple handle Jetsam events?

result = kernel_thread_start_priority(memorystatus_thread, NULL, 95 /* MAXPRI_KERNEL */, &thread);

Apple's approach is very simple. As mentioned above, the BSD layer creates a kernel thread at the highest kernel priority, VM_memorystatus. This thread maintains two lists: one is the priority-based process list mentioned earlier, and the other is the memory snapshot list, memorystatus_jetsam_snapshot, which records the memory pages consumed by each process.

This resident thread receives memory pressure notifications from the kernel's pageout daemon and forwards the event to each app process through a kernel call. The event eventually surfaces at the UI layer as the global memory warning we usually receive, or as didReceiveMemoryWarning in each view controller.

Of course, our own app does not actively register for this memory warning event; libdispatch does all of this for us under the hood. If you are interested, dig into _dispatch_source_type_memorypressure and __dispatch_source_type_memorystatus.

So under what circumstances does memory pressure arise? Let's take a look at the memorystatus_action_needed function:

static boolean_t
memorystatus_action_needed(void)
{
#if CONFIG_EMBEDDED
    return (is_reason_thrashing(kill_under_pressure_cause) ||
            is_reason_zone_map_exhaustion(kill_under_pressure_cause) ||
           memorystatus_available_pages <= memorystatus_available_pages_pressure);
#else /* CONFIG_EMBEDDED */
    return (is_reason_thrashing(kill_under_pressure_cause) ||
            is_reason_zone_map_exhaustion(kill_under_pressure_cause));
#endif /* CONFIG_EMBEDDED */
}

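To summarize, action is needed in any of the following cases:

  • The system is thrashing (is_reason_thrashing).
  • The Mach zone map is exhausted (is_reason_zone_map_exhaustion); this involves the virtual memory management of the Mach kernel.
  • On embedded configurations, the number of available pages (memorystatus_available_pages) has fallen to or below a threshold.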
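
As a rough worked example of how a page threshold like this comes out of the percentages read in memorystatus_init, the sketch below repeats the arithmetic from that function in plain C. Several things here are assumptions made for illustration: the 16 KB page size, the threshold percentages of 15 and 12, and the idea that the pressure threshold is derived with the same pattern as memorystatus_available_pages_critical_base. Only the default delta of 5 percent and the two formulas themselves come from the code shown earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size for illustration only; real devices vary. */
#define SKETCH_PAGE_SIZE 16384ULL

/* Bytes to pages, in the spirit of atop_64(). */
static uint64_t bytes_to_pages(uint64_t bytes)
{
    return bytes / SKETCH_PAGE_SIZE;
}

/* memorystatus_delta = delta_percentage * atop_64(max_mem) / 100 */
uint64_t sketch_delta_pages(uint64_t max_mem_bytes, unsigned delta_percentage)
{
    assert(delta_percentage > 0 && delta_percentage < 100);
    return delta_percentage * bytes_to_pages(max_mem_bytes) / 100;
}

/*
 * threshold = (threshold_percentage / delta_percentage) * delta
 * The integer division happens first, so the percentage is effectively
 * rounded down to a multiple of delta_percentage.
 */
uint64_t sketch_threshold_pages(unsigned threshold_percentage,
                                unsigned delta_percentage,
                                uint64_t delta_pages)
{
    assert(threshold_percentage < 100);
    assert(delta_percentage > 0);
    return (uint64_t)(threshold_percentage / delta_percentage) * delta_pages;
}

/* Same shape as the CONFIG_EMBEDDED branch of memorystatus_action_needed. */
bool sketch_action_needed(bool thrashing, bool zone_map_exhausted,
                          uint64_t available_pages, uint64_t pressure_pages)
{
    return thrashing || zone_map_exhausted || available_pages <= pressure_pages;
}
```

Under these assumptions, a 1 GB device has 65536 pages, so a 5 percent delta is 3276 pages. A hypothetical 15 percent threshold becomes 3 * 3276 = 9828 pages, while a 12 percent threshold is rounded down to 2 * 3276 = 6552 pages because of the integer division.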

In these cases, the kernel gets ready to kill processes. However, there is a particularly interesting piece of code in this handling. Let's look at the function memorystatus_act_aggressive:

if ( (jld_bucket_count == 0) || 
     (jld_now_msecs > (jld_timestamp_msecs + memorystatus_jld_eval_period_msecs))) {

    /* 
     * Refresh evaluation parameters 
     */
    jld_timestamp_msecs     = jld_now_msecs;
    jld_idle_kill_candidates = jld_bucket_count;
    *jld_idle_kills         = 0;
    jld_eval_aggressive_count = 0;
    jld_priority_band_max    = JETSAM_PRIORITY_UI_SUPPORT;
}

This code is clearly a condition based on a time window: it is true when there are no candidates, or when more than memorystatus_jld_eval_period_msecs have passed since the last timestamp. As I read it, if this condition is not satisfied, the subsequent kill is not reached. The window, memorystatus_jld_eval_period_msecs, is set as follows:

/* Jetsam Loop Detection */
if (max_mem <= (512 * 1024 * 1024)) {
    /* 512 MB devices */
    memorystatus_jld_eval_period_msecs = 8000;    /* 8000 msecs == 8 second window */
} else {
    /* 1GB and larger devices */
    memorystatus_jld_eval_period_msecs = 6000;    /* 6000 msecs == 6 second window */
}

The window depends on the device's physical memory, but either way it looks like there are at least six seconds left for us to do something.
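
The window logic can be restated as two small functions. This is a user-space sketch with hypothetical names (sketch_jld_eval_period_msecs, sketch_window_elapsed); it copies only the two conditions quoted above and says nothing about the rest of memorystatus_act_aggressive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 8-second window on devices with 512 MB or less, 6 seconds otherwise. */
uint64_t sketch_jld_eval_period_msecs(uint64_t max_mem_bytes)
{
    return (max_mem_bytes <= 512ULL * 1024 * 1024) ? 8000 : 6000;
}

/*
 * Same shape as the quoted condition: true when there are no candidates,
 * or when the current time is past the end of the evaluation window.
 */
bool sketch_window_elapsed(int bucket_count, uint64_t now_msecs,
                           uint64_t timestamp_msecs, uint64_t period_msecs)
{
    assert(bucket_count >= 0);
    return bucket_count == 0 || now_msecs > timestamp_msecs + period_msecs;
}
```

Note that the comparison is strict: with a 6000 ms window that started at time 0, the condition is still false at exactly 6000 ms and becomes true at 6001 ms.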

Of course, once the time-window requirement is met, the kernel looks for killable targets using the priority process lists we mentioned:

proc_list_lock();
switch (jetsam_aging_policy) {
case kJetsamAgingPolicyLegacy:
    bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
    jld_bucket_count = bucket->count;
    bucket = &memstat_bucket[JETSAM_PRIORITY_AGING_BAND1];
    jld_bucket_count += bucket->count;
    break;
case kJetsamAgingPolicySysProcsReclaimedFirst:
case kJetsamAgingPolicyAppsReclaimedFirst:
    bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
    jld_bucket_count = bucket->count;
    bucket = &memstat_bucket[system_procs_aging_band];
    jld_bucket_count += bucket->count;
    bucket = &memstat_bucket[applications_aging_band];
    jld_bucket_count += bucket->count;
    break;
case kJetsamAgingPolicyNone:
default:
    bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
    jld_bucket_count = bucket->count;
    break;
}

bucket = &memstat_bucket[JETSAM_PRIORITY_ELEVATED_INACTIVE];
elevated_bucket_count = bucket->count;

Note that Jetsam does not necessarily kill just one process; it may kill N processes in a row.

if (memorystatus_avail_pages_below_pressure()) {
    /*
     * Still under pressure.
     * Find another pinned processes.
     */
    continue;
} else {
    return TRUE;
}

The actual kill eventually goes through memorystatus_do_kill -> jetsam_do_kill.
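
The "keep killing while still under pressure" behaviour can be sketched as a simple loop. This is an illustration only: sketch_victim_t and sketch_kill_until_relieved are hypothetical names, the candidates are assumed to be ordered lowest priority first, and adding a victim's pages back to the available count is a stand-in for the real kill path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int pid;
    uint64_t resident_pages;  /* pages assumed to be freed by killing it */
} sketch_victim_t;

/*
 * Walk the candidates in order and "kill" them one by one until the
 * available page count rises above the pressure threshold.
 * Returns the number of processes killed.
 */
size_t sketch_kill_until_relieved(const sketch_victim_t *candidates, size_t n,
                                  uint64_t *available_pages,
                                  uint64_t pressure_pages)
{
    assert(candidates != NULL && available_pages != NULL);
    size_t killed = 0;
    for (size_t i = 0; i < n; i++) {
        if (*available_pages > pressure_pages) {
            break;  /* no longer under pressure */
        }
        /* Stand-in for the real kill: reclaim the victim's pages. */
        *available_pages += candidates[i].resident_pages;
        killed++;
    }
    return killed;
}
```

For example, with 50 pages available, a threshold of 300 pages, and candidates holding 100, 200 and 300 pages, the loop kills the first two (50 -> 150 -> 350 pages available) and then stops, leaving the third process alone.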

Other

Apple has disabled reading some kernel parameters through the sysctlbyname and sysctl system calls. For example:

"kern.jetsam_delta"
"kern.jetsam_critical_threshold"
"kern.jetsam_idle_offset"
"kern.jetsam_pressure_threshold"
"kern.jetsam_freeze_threshold"
"kern.jetsam_aging_policy"

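However, I tried reading the machine's boot time through kern.boottime, as shown in the following code:

#import <Foundation/Foundation.h>
#include <stdlib.h>
#include <string.h>
#include <sys/sysctl.h>

// Ask for the size of the value first, then fetch the value itself.
size_t size;
sysctlbyname("kern.boottime", NULL, &size, NULL, 0);
char *boot_time = malloc(size);
sysctlbyname("kern.boottime", boot_time, &size, NULL, 0);
// Copy the first four bytes of the result, which are read here as the boot timestamp in seconds.
uint32_t timestamp = 0;
memcpy(&timestamp, boot_time, sizeof(uint32_t));
free(boot_time);
NSDate *bootTime = [NSDate dateWithTimeIntervalSince1970:timestamp];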

Finally

Having studied the underlying principles, I immediately had some ideas for solving our company's abort problem. I wrote a demo to test the idea, and it works. Stay tuned for good news.