The original is from my blog

What’s New

  • This page was last updated 22 November 2020
  • First updated November 22, 2020

Preface

Ask a reasonably experienced iOS developer how an app starts running, and they will probably say it starts with the main function. Ask who calls main, and they will probably know that iOS apps are started by a process called SpringBoard, the home-screen application that ships with the iPhone. You open an app by tapping its icon on the home screen. Because iOS runs only one user app in the foreground at a time and does not allow an app to start another process directly, most iOS developers never learn the details of how one process starts another.

Mac developers are in a different position. The desktop allows many processes at once, so they regularly deal with parent-child process relationships, and anyone working on Mac antivirus or security-control software also needs to understand, at the kernel level, how one process starts another. Tracing process startup back to its source, and seeing how a process is created and begins to run, deepens our understanding of OS X and iOS, and that knowledge in turn serves app development at the application layer.

Nature of the program

Processes and threads are abstract concepts, but they are also real things. From a programming point of view, each process and thread has a corresponding structure in the kernel (yes, a C struct), and the kernel keeps process information in a table. For example, when we call fork, one piece of code inside the kernel checks the current number of processes to see whether the limit on how many may be created has been exceeded. Below is an excerpt from the kernel's implementation of fork; you can see that a process is indeed represented by a structure.

All processes are created in the kernel, and all source code analyzed in this article comes from the XNU kernel.

int
fork1(proc_t parent_proc, thread_t *child_threadp, int kind, coalition_t *coalitions)
{
    // Omit the other code
    /*
     * Increment the count of procs running with this uid. Don't allow
     * a nonprivileged user to exceed their current limit, which is
     * always less than what an rlim_t can hold.
     * (locking protection is provided by list lock held in chgproccnt)
     */
    count = chgproccnt(uid, 1);
    // Omit the other code
}

From the CPU's point of view, none of these concepts exist. The CPU only knows how to execute instructions one after another: operating on registers, reading data from a memory address, or writing data from memory to disk (through a driver).

Process from generation to run

A process going from creation to execution, like a universe going from creation to life, is amazing and hard to grasp. Fortunately, processes are man-made, so we can read the documentation and source code to find the answers. This is still difficult, and I cannot guarantee that this article is 100% correct, so treat it as a summary.

Start by turning on the power

I have not studied this part in depth, so here is the simple version. When the hardware is powered on, the CPU starts executing instructions one by one from a fixed location in the hardware ROM. Those instructions load and run the EFI firmware (a binary program) stored in the ROM. EFI then boots the OS X or iOS kernel (also a binary). There is a great deal to this stage, and books spend hundreds of pages on it. The end result is that the kernel initializes every basic facility of the operating system, such as the file system, virtual memory, and the network protocol stack. By this point the abstraction of a process also exists, and the kernel itself is a process.

The first process in user mode is launchd

We have skipped booting and kernel loading, so arriving at launchd is like jumping straight from the Big Bang to the formation of the Earth 🙂

Once the kernel is loaded, a thread is assigned to execute the bsdinit_task function, which eventually loads the launchd binary and switches to user mode to run the launchd process. At this point, user mode finally has its first process.

Below is the source of the kernel function bsdinit_task, written in C and executed in kernel mode. You can see that the launchd process is initially named init and only later becomes launchd :)

void
bsdinit_task(void)
{
    proc_t p = current_proc();

    process_name("init", p);
    // Omit the other code
    bsd_init_kprintf("bsd_do_post - done");

    load_init_program(p);
    lock_trace = 1;
}

How launchd is created

The code above does not show how the kernel process creates the launchd process. It is worth explaining that in detail here, because understanding it will help you understand how later user app processes, such as WeChat and Taobao, are created.

After the kernel is loaded, the file system has been initialized. The launchd image file, that is, its binary executable, is located at the following path, which is hardcoded in the kernel source.

➜ ~ ll /sbin/launchd
-rwxr-xr-x 1 root wheel 378K 9 22 08:30 /sbin/launchd

The load_init_program function is called at the end of the bsdinit_task function.

static const char * init_programs[] = {
#if DEBUG
    "/usr/local/sbin/launchd.debug",
#endif
#if DEVELOPMENT || DEBUG
    "/usr/local/sbin/launchd.development",
#endif
    "/sbin/launchd"
};

void
load_init_program(proc_t p)
{
    uint32_t i;
    int error;
    vm_map_t map = current_map();
    // Omit the other code

    error = ENOENT;
    for (i = 0; i < sizeof(init_programs) / sizeof(init_programs[0]); i++) {
        printf("load_init_program: attempting to load %s\n", init_programs[i]);
        error = load_init_program_at_path(p, (user_addr_t)scratch_addr, init_programs[i]);
        if (!error) {
            return;
        } else {
            printf("load_init_program: failed loading %s: errno %d\n", init_programs[i], error);
        }
    }
}

You can see that the launchd binary paths are hardcoded in a global constant array. The load_init_program_at_path function is then called, and it ultimately calls execve:

int
execve(proc_t p, struct execve_args *uap, int32_t *retval)
{
    struct __mac_execve_args muap;
    int err;

    memoryshot(VM_EXECVE, DBG_FUNC_NONE);

    muap.fname = uap->fname; // Binary file path
    muap.argp = uap->argp; // Arguments
    muap.envp = uap->envp; // Environment variables are very important
    muap.mac_p = USER_ADDR_NULL;
    err = __mac_execve(p, &muap, retval);

    return err;
}

The kernel function execve has a corresponding system call, which is documented in its man page:

// execve() transforms the calling process into a new process.
// The system call function is signed as follows
int execve(const char *path, char *const argv[], char *const envp[]);

This means that the process calling the function is transformed into the specified new process. The system call ends up in the kernel function of the same name. A small experiment shows that the function does not return when the call succeeds: execution goes straight into the main function of the target program, which then runs.

// The target program
#include <stdio.h>

int main(int argc, const char * argv[]) {
    // insert code here...
    printf("Hello, Jue jin!\n");
    return 0;
}

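// Demo program
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/wait.h>

int main(void) {
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork");
        return 1;
    }
    if (pid == 0) {
        // fork() returns 0 in the child: replace it with the target program
        char *const argv[] = {"./TestHelloWorld", NULL};
        char *const envp[] = {NULL};
        int ret = execve("./TestHelloWorld", argv, envp);
        // Reached only if execve fails
        printf("execve ret: %d\n", ret);
        return 1;
    }
    // Parent: wait for the child to finish
    waitpid(pid, NULL, 0);
    printf("%s\n", "parent");
    return 0;
}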

Compile the target program as TestHelloWorld, place it in the same directory as the demo program, and run the demo program. You will see "Hello, Jue jin!" printed, but not the demo program's "execve ret:" line. The internals of execve are covered later.

Environment variables deserve emphasis. A child process normally inherits its environment variables from its parent, and many programs depend on environment variables to work. Xcode debugging is one example: run a program in Xcode and Activity Monitor shows that the program's parent process is Xcode's debugserver process.

The program itself is simple, yet several dylibs are injected into it, and that is the power of environment variables. Before loading the program, the dyld loader loads the dynamic libraries named by certain flags in the environment variables, which is what lets us use the debugging features Xcode provides.

sudo launchctl procinfo 39203

The launchctl tool lets you view process information, including the array of environment variables (only one is shown here to save space):

environment vector = {
    DYLD_INSERT_LIBRARIES => /Applications/Xcode.app/Contents/Developer/usr/lib/libBacktraceRecording.dylib:/Applications/Xcode.app/Contents/Developer/usr/lib/libMainThreadChecker.dylib:/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/Library/Debugger/libViewDebuggerSupport.dylib
}

This is the most important environment variable for Xcode debugging. How it is processed is a dyld detail.

After launchd is created

The launchd process is the first user-mode process, with PID 1. At this point no process with a user interface exists yet. launchd next starts the daemons and agents that Mac OS X and iOS require.

launchd scans several predefined directories to decide which daemons and agents to start. On the Mac, daemons are processes started before a user logs in, and processes started after login are called agents. iOS has no concept of user login, so on iOS these processes are all daemons (including the home screen). The directories are:

/System/Library/LaunchDaemons   # system daemon files
/System/Library/LaunchAgents    # system agent files
/Library/LaunchDaemons
/Library/LaunchAgents
~/Library/LaunchAgents

Each of these directories contains plist files. A plist file describes how to start a process, the path to the process's binary, and so on. The full plist format has many details and is omitted here; below are some examples.

Mac OS X launches the Dock process, Finder process, and so on after the user is logged in. The Finder process’s plist file is as follows:

~ cat /System/Library/LaunchAgents/com.apple.Finder.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>POSIXSpawnType</key>
    <string>App</string>
    <key>RunAtLoad</key>
    <false/>
    <key>KeepAlive</key>
    <dict>
        <key>SuccessfulExit</key>
        <false/>
        <key>AfterInitialDemand</key>
        <true/>
    </dict>
    <key>Label</key>
    <string>com.apple.Finder</string>
    <key>Program</key>
    <string>/System/Library/CoreServices/Finder.app/Contents/MacOS/Finder</string>
    <key>CFBundleIdentifier</key>
    <string>com.apple.finder</string>
    <key>ThrottleInterval</key>
    <integer>1</integer>
</dict>
</plist>

Mac OS X and iOS share many daemons and also have many that differ. Among them, the plist file for SpringBoard, the iOS home-screen process, is in the following directory:

/System/Library/LaunchDaemons

The launchd process on iOS likewise starts many other system processes at startup, no fewer than on Mac OS X.

How launchd starts other processes has not been explained yet. That comes later, so skip it for now.

Finder & SpringBoard processes

In iOS, an App can open another App using the following API:

// UIApplication
- (void)openURL:(NSURL *)url
        options:(NSDictionary<UIApplicationOpenExternalURLOptionsKey, id> *)options
completionHandler:(void (^)(BOOL success))completion;

Whether SpringBoard also uses this API can only be determined by decompiling SpringBoard.

Similarly, on Mac OS X you can open other processes with the NSWorkspace class. We usually open an app from the Finder, and the Finder uses the NSWorkspace API.

Take iOS as an example. Once the SpringBoard process is running, the user can see the home screen. When the user taps an app icon, such as WeChat, SpringBoard tells the launchd process, by some mechanism, to open WeChat.

In fact, when a Mac program uses the NSWorkspace class to open another app, you can see that launchd is the parent process of that app as well. This part of the API is not open source, so I will skip it here. The point to remember is that when a user opens an app, SpringBoard or Finder tells the launchd process, by some mechanism, to open it.

The launchd process launches the user App

When launchd launches another process, it makes the fork() system call, which enters kernel mode and clones the launchd process, and then makes the execve system call, passing in the path of the target app's binary. Inside execve there is a series of further calls. See the following call diagram:

This part is entirely open source, and the chain eventually calls the load_dylinker function. The source of the load_dylinker kernel function follows:

#define DEFAULT_DYLD_PATH "/usr/lib/dyld"

#if (DEVELOPMENT || DEBUG)
extern char dyld_alt_path[];
extern int use_alt_dyld;
#endif

static load_return_t
load_dylinker(
    struct dylinker_command *lcp,
    integer_t               archbits,
    vm_map_t                map,
    thread_t                thread,
    int                     depth,
    int64_t                 slide,
    load_result_t           *result,
    struct image_params     *imgp
    )
{
    const char              *name;
    struct vnode            *vp = NULLVP;   /* set by get_macho_vnode() */
    struct mach_header      *header;
    off_t                   file_offset = 0; /* set by get_macho_vnode() */
    off_t                   macho_size = 0; /* set by get_macho_vnode() */
    load_result_t           *myresult;
    kern_return_t           ret;
    struct macho_data       *macho_data;
    struct {
        struct mach_header      __header;
        load_result_t           __myresult;
        struct macho_data       __macho_data;
    } *dyld_data;

    if (lcp->cmdsize < sizeof(*lcp) || lcp->name.offset >= lcp->cmdsize) {
        return LOAD_BADMACHO;
    }
    // Find the path of the dyld dynamic linker from the target macho file.
    name = (const char *)lcp + lcp->name.offset;

    /* Check for a proper null terminated string. */
    size_t maxsz = lcp->cmdsize - lcp->name.offset;
    size_t namelen = strnlen(name, maxsz);
    if (namelen >= maxsz) {
        return LOAD_BADMACHO;
    }

#if (DEVELOPMENT || DEBUG)

    /*
     * rdar://23680808
     * If an alternate dyld has been specified via boot args, check
     * to see if PROC_UUID_ALT_DYLD_POLICY has been set on this
     * executable and redirect the kernel to load that linker.
     */

    if (use_alt_dyld) {
        int policy_error;
        uint32_t policy_flags = 0;
        int32_t policy_gencount = 0;

        policy_error = proc_uuid_policy_lookup(result->uuid, &policy_flags, &policy_gencount);
        if (policy_error == 0) {
            if (policy_flags & PROC_UUID_ALT_DYLD_POLICY) {
                name = dyld_alt_path;
            }
        }
    }
#endif

#if !(DEVELOPMENT || DEBUG)
    if (0 != strcmp(name, DEFAULT_DYLD_PATH)) {
        return LOAD_BADMACHO;
    }
#endif

    /* Allocate wad-of-data from heap to reduce excessively deep stacks */

    MALLOC(dyld_data, void *, sizeof(*dyld_data), M_TEMP, M_WAITOK);
    header = &dyld_data->__header;
    myresult = &dyld_data->__myresult;
    macho_data = &dyld_data->__macho_data;

    ret = get_macho_vnode(name, archbits, header,
        &file_offset, &macho_size, macho_data, &vp);
    if (ret) {
        goto novp_out;
    }

    *myresult = load_result_null;
    myresult->is_64bit_addr = result->is_64bit_addr;
    myresult->is_64bit_data = result->is_64bit_data;
    // Parse the macho file with a recursive call
    ret = parse_machfile(vp, map, thread, header, file_offset,
        macho_size, depth, slide, 0, myresult, result, imgp);

    if (ret == LOAD_SUCCESS) {
        if (result->threadstate) {
            /* don't use the app's threadstate if we have a dyld */
            kfree(result->threadstate, result->threadstate_sz);
        }
        result->threadstate = myresult->threadstate;
        result->threadstate_sz = myresult->threadstate_sz;

        result->dynlinker = TRUE;
        result->entry_point = myresult->entry_point;
        result->validentry = myresult->validentry;
        result->all_image_info_addr = myresult->all_image_info_addr;
        result->all_image_info_size = myresult->all_image_info_size;
        if (myresult->platform_binary) {
            result->csflags |= CS_DYLD_PLATFORM;
        }
    }

    struct vnode_attr va;
    VATTR_INIT(&va);
    VATTR_WANTED(&va, va_fsid64);
    VATTR_WANTED(&va, va_fsid);
    VATTR_WANTED(&va, va_fileid);
    int error = vnode_getattr(vp, &va, imgp->ip_vfs_context);
    if (error == 0) {
        imgp->ip_dyld_fsid = vnode_get_va_fsid(&va);
        imgp->ip_dyld_fsobjid = va.va_fileid;
    }

    vnode_put(vp);
novp_out:
    FREE(dyld_data, M_TEMP);
    return ret;
}

A load command in the Mach-O file (LC_LOAD_DYLINKER, represented by struct dylinker_command) records the path of the dynamic loader that the binary needs.

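As the code shows, in release builds of the kernel (that is, when neither DEVELOPMENT nor DEBUG is defined) the path must be exactly /usr/lib/dyld, otherwise the load fails with LOAD_BADMACHO. Only development and debug kernels can redirect to an alternate dyld.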

Once the dyld binary has been located, parse_machfile is called recursively to parse and load dyld, which is itself in Mach-O format. So when a binary is executed, the kernel parses not only that Mach-O binary but also the dyld binary along with it.

After dyld has been parsed, load_dylinker returns a load_return_t status and fills in the load_result_t structure passed in as result, which holds the address of dyld's entry instruction, entry_point.

As the call diagram above shows, once entry_point has been obtained, the thread_setentrypoint() function is finally called to store entry_point in the thread's instruction-pointer register state.

There are multiple implementations of this function that correspond to multiple CPU architectures:

intel:

/*
 * thread_setentrypoint:
 *
 * Sets the user PC into the machine
 * dependent thread state info.
 */
void
thread_setentrypoint(thread_t thread, mach_vm_address_t entry)
{
    pal_register_cache_state(thread, DIRTY);
    if (thread_is_64bit_addr(thread)) {
        x86_saved_state64_t     *iss64;

        iss64 = USER_REGS64(thread);

        iss64->isf.rip = (uint64_t)entry;
    } else {
        x86_saved_state32_t     *iss32;

        iss32 = USER_REGS32(thread);

        iss32->eip = CAST_DOWN_EXPLICIT(unsigned int, entry);
    }
}

arm64:

/*
 * Routine: thread_setentrypoint
 *
 */
void
thread_setentrypoint(thread_t         thread,
    mach_vm_offset_t entry)
{
    struct arm_saved_state *sv;

    sv = get_user_regs(thread);

    set_saved_state_pc(sv, entry);

    return;
}

Once entry_point has been stored in the instruction register, the next instruction the CPU executes is the one at entry_point. This touches on assembly: the CPU uses the instruction register (called PC, EIP, or RIP depending on the architecture) to decide where to fetch the next instruction from. For more on assembly, please look up material online.

Entering dyld

With the instruction register set to dyld's entry address, execution enters dyld. Plenty of material covers how dyld works, so I will not go into it here. After initialization work such as setting up the Objective-C runtime, dyld finds the entry_point of the target Mach-O file (for example, WeChat's Mach-O executable). The entry_point of the target Mach-O is the well-known main function.

The main function

I don’t need to talk about this part, everybody understands.

Conclusion

When the user taps an app, it is the launchd process that runs it. launchd calls fork and then execve. On success execve does not return to its caller; it first loads the dyld binary and then hands control to the dyld loader, which in turn calls the app's main. That is when the program starts to run.

Description of technical terms

Several terms appear repeatedly in this article. Here is what they mean:

  1. Binary file: a Mach-O file, which can be an executable or a loader such as dyld.
  2. entry_point: the address of a program's entry instruction once the program has been loaded into memory. What is stored at this address is CPU instructions.

Recommended reading

How does a macOS kernel App work

OS kernel loaded Mach-O process analysis | MRH share of learning

Download XNU source code