Every introduction to a programming language begins with a “Hello World” demo that you can write almost without thinking. Like this:

Anyone familiar with iOS development will recognize this as Objective-C.

Today we start from this familiar code and trace everything that happens before “Hello World” is printed.

main function

As we all know, the main function is the entry point of our program, so let’s start there.

With the location settled, set a breakpoint in main and you get the call stack below:

Clearly, a start function runs before main is called. So what is this start function? Where does it come from, and how does it work?

Before answering that, let’s add one more ingredient: a + (void)load {} method on a custom class.

+load method

Running the code again makes it easy to verify that the +load method is called before main, and gives the following call stack (the bt command prints a more detailed stack):

The stack above clearly shows that everything starts with a pile of dyld frames. So what is dyld?

dyld (the dynamic link editor) is Apple’s dynamic linker, which loads and links all libraries and executables. It is an important part of Apple’s operating systems. Once the kernel has prepared the process, dyld is responsible for the rest of the work. Its code is open source and can be downloaded from Apple’s open source site.

Note: many descriptions of the dyld startup process found online are based on older versions. The stack shown above differs a little from those versions, but the overall flow is basically the same. This article is based on the latest dyld4 source.

Continuing: once dyld has finished loading the libraries and the executable and completed its other preparations, dyld4::RuntimeState::notifyObjCInit triggers the load_images function in libobjc.A.dylib, which calls our custom +[Person load] method. Only after that is main reached.

Why put it that way? The word notify should look familiar to iOS developers: it means notification. dyld sends a notification to whoever registered for it, and the receiver here is the load_images function in libobjc.A.dylib, which is the Runtime library we talk about all the time. The Runtime code is also part of Apple’s official open source releases, so we can test this hypothesis directly.

As a side note, the Runtime source can be built and run, although that normally takes a fair amount of configuration. A preconfigured, runnable copy is available: after downloading it, select the KCObjcBuild target and you can debug the source directly, which is very convenient.

load_images function

Back to the point: how do we test this? Searching for load_images in the Runtime source locates it easily:

With the definition of load_images found, set a breakpoint right there to verify (this is where being able to build the source pays off):

The call stack at this breakpoint matches our guess.

Next, the notification must be registered somewhere, otherwise how could the load_images function above ever be called? Searching the source for references to load_images gives:

What do we see in the picture above? Yes, _dyld_objc_notify_register is where the notification is registered.

For the call flow above to work, the notification has to be registered before it is triggered, otherwise nothing would make sense. Set another breakpoint to verify:

Sure enough, the stack even shows the complete call path into _objc_init, which is a pleasant surprise.
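The register-first, notify-later relationship can be sketched as a toy model. This is only an illustration under stated assumptions: ToyLoader, registerInitCallback, notifyInit and simulateLaunch are hypothetical names invented for this sketch, and the real _dyld_objc_notify_register and notifyObjCInit have different signatures and do far more work.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy loader; the names are illustrative and are not dyld's API.
class ToyLoader {
public:
    using InitCallback = std::function<void(const std::string& imagePath)>;

    // Plays the role of the registration step: the runtime hands the
    // loader a callback that the loader will invoke later.
    void registerInitCallback(InitCallback cb) {
        assert(cb && "callback must be callable");
        callback_ = std::move(cb);
    }

    // Plays the role of the notification step: forwards to the registered
    // callback. Returns false when nobody has registered yet.
    bool notifyInit(const std::string& imagePath) const {
        if (!callback_) return false;
        callback_(imagePath);
        return true;
    }

private:
    InitCallback callback_;
};

// Hypothetical driver that records the order of events during a "launch".
std::vector<std::string> simulateLaunch() {
    std::vector<std::string> log;
    ToyLoader loader;

    // Notifying before registration reaches no one.
    if (!loader.notifyInit("TooEarly")) log.push_back("no receiver yet");

    // The runtime registers its handler (stand-in for load_images).
    loader.registerInitCallback([&log](const std::string& path) {
        log.push_back("load_images(" + path + ")");
    });

    // Now the notification is delivered, and only afterwards does main run.
    loader.notifyInit("HelloWorld");
    log.push_back("main");
    return log;
}
```

Calling simulateLaunch() returns the events in order. The first notification is dropped because nothing was registered yet, which is why the registration in _objc_init has to show up earlier in the real stack than the call to load_images.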

_objc_init function

Tidied up, the call path that leads to _objc_init looks like this:

-> dyld`start 
-> dyld`dyld4::prepare() 
-> dyld`dyld4::APIs::runAllInitializersForMain() 
-> dyld`dyld4::Loader::findAndRunAllInitializers() 
-> dyld`dyld3::MachOAnalyzer::forEachInitializer() 
-> dyld`dyld3::MachOFile::forEachSection() const 
-> dyld`dyld3::MachOFile::forEachLoadCommand() 
-> dyld3::MachOFile::forEachSection() 
-> dyld3::MachOAnalyzer::forEachInitializer() 
-> dyld4::Loader::findAndRunAllInitializers() 
-> libSystem.B.dylib`libSystem_initializer 
-> libdispatch.dylib`libdispatch_init 
-> libdispatch.dylib`_os_object_init 
-> libobjc.A.dylib`_objc_init  

The dyld functions are in the dyld open source project mentioned earlier.

The code for libSystem.B.dylib is available in Apple’s official open source releases.

The libdispatch.dylib code is also available in Apple’s official open source releases.

libobjc.A.dylib is the Runtime code mentioned above.

Execution path

So far, we have worked out the basic sequence that leads up to main:

dyld start -> _objc_init runs (registers load_images) -> load_images is triggered -> +load methods are called -> main is called -> Hello World is printed

As you can see, a great deal happens before main is called that we never noticed, most of it handled by dyld. Let’s call it the loading process.

But how exactly does main get invoked? That is still not clear, so we keep reading the source. Which source? Judging from the earlier stacks, we need to look at start and prepare in dyld.
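That user code can run before main is not specific to Objective-C. As a rough analogy only, the sketch below uses the GCC/Clang `__attribute__((constructor))` extension, which marks a function as an image initializer that runs before main is entered. It does not show how +load is implemented (that goes through load_images, as described above); events, runBeforeMain and appBody are hypothetical names for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Event log stored in a function-local static, so it is constructed on
// first use and is safe to touch from code that runs before main.
std::vector<std::string>& events() {
    static std::vector<std::string> log;
    return log;
}

// GCC/Clang extension: this function is run as an initializer of the image,
// before main is entered. It is an analogy for +load, not its mechanism.
__attribute__((constructor))
static void runBeforeMain() {
    events().push_back("initializer");
}

// Hypothetical stand-in for the body of main. Returns the number of events.
int appBody() {
    // By the time the main body runs, the initializer has already logged.
    assert(!events().empty() && events().front() == "initializer");
    events().push_back("main");
    return static_cast<int>(events().size());
}
```

If appBody() is called from main, the log already holds the initializer entry when the main body starts, which is the same ordering the call stacks show for +load and main.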

dyld calls the main function

Searching globally for prepare in the dyld source turns up the following location:

This happens to be a call inside the start function, which matches the stack we saw earlier, so we are almost certainly in the right place.

From here we can make a bold guess and then verify it carefully:

// load all dependents of program and bind them together
MainFunc appMain = prepare(state, dyldMA);

// now make all dyld Allocated data structures read-only
state.decWritable();

// call main() and if it returns, call exit() with the result
// Note: this is organized so that a backtrace in a program's main thread shows just "start" below "main"
int result = appMain(state.config.process.argc, state.config.process.argv, state.config.process.envp, state.config.process.apple);

From the names and comments above it is easy to see that prepare() loads the dependents and returns the entry point of main (a MainFunc), and that appMain() is the call to main. Its parameters also resemble the parameters of our own main.

We can verify this further by looking inside prepare():
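The pattern in that excerpt, getting a function pointer back from a setup step and then calling it with the process arguments, can be sketched in isolation. The MainFunc typedef below simply mirrors the four arguments at the call site (its exact spelling in dyld is an assumption here), and toyMain, toyPrepare and toyStart are hypothetical stand-ins rather than dyld code.

```cpp
#include <cassert>

// Same shape as the call in start: argc, argv, envp, apple.
typedef int (*MainFunc)(int argc, const char* const argv[],
                        const char* const envp[], const char* const apple[]);

// Hypothetical program entry; stands in for the application's main.
static int toyMain(int argc, const char* const argv[],
                   const char* const[], const char* const[]) {
    return (argc > 0 && argv[0] != nullptr) ? 0 : 1;
}

// Hypothetical stand-in for prepare(): do the setup work, then hand back
// the entry point instead of calling it.
static MainFunc toyPrepare() {
    return &toyMain;
}

// Hypothetical stand-in for start(): obtain the entry point, then call it
// and pass its result on.
int toyStart(int argc, const char* const argv[],
             const char* const envp[], const char* const apple[]) {
    MainFunc appMain = toyPrepare();
    assert(appMain != nullptr);
    return appMain(argc, argv, envp, apple);
}
```

Splitting the work this way keeps the call to the entry point in the outermost function, which fits the comment in the excerpt about a backtrace showing just start below main.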

    // run all initializers
    state.runAllInitializersForMain();

    // notify we are about to call main
    notifyMonitoringDyldMain();
    if ( dyld3::kdebug_trace_dyld_enabled(DBG_DYLD_TIMING_LAUNCH_EXECUTABLE) ) {
        dyld3::kdebug_trace_dyld_duration_end(launchTraceID, DBG_DYLD_TIMING_LAUNCH_EXECUTABLE, 0, 0, 4);
    }
    ARIADNEDBG_CODE(220, 1);

    MainFunc result;
    if ( state.config.security.skipMain ) {
        return &fake_main;
    }
    else if ( state.config.process.platform == dyld3::Platform::driverKit ) {
        result = state.mainFunc();
        if ( result == 0 )
            halt("DriverKit main entry point not set");
#if __has_feature(ptrauth_calls)
        // HACK: DriverKit signs the pointer with a diversity different than dyld expects when calling the pointer.
        result = (MainFunc)__builtin_ptrauth_strip((void*)result, ptrauth_key_function_pointer);
        result = (MainFunc)__builtin_ptrauth_sign_unauthenticated((void*)result, 0, 0);
#endif
    }
    else {
        // find entry point for main executable
        uint64_t entryOffset;
        bool     usesCRT;
        if ( !state.config.process.mainExecutable->getEntry(entryOffset, usesCRT) )
            halt("main executable has no entry point");
        result = (MainFunc)((uintptr_t)state.config.process.mainExecutable + entryOffset);
        if ( usesCRT ) {
            // main executable uses LC_UNIXTHREAD, dyld needs to cut back kernel arg stack and jump to "start"
#if SUPPPORT_PRE_LC_MAIN
            // backsolve for KernelArgs (original stack entry point in _dyld_start)
            const KernelArgs* kernArgs = (KernelArgs*)(&state.config.process.argv[-2]);
            gotoAppStart((uintptr_t)result, kernArgs);
#else
            halt("main executable is missing LC_MAIN");
#endif
        }
#if __has_feature(ptrauth_calls)
        result = (MainFunc)__builtin_ptrauth_sign_unauthenticated((void*)result, 0, 0);
#endif
    }

runAllInitializersForMain() also appears in the earlier stack. Its name is self-explanatory: it runs all initializers before main is called.

The assignments to result that follow are the lookup of the entry point of main.
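In the common branch of that lookup, the entry point is the address where the main executable was loaded plus an offset reported by getEntry(). A minimal sketch of that arithmetic follows; ToyImage and resolveEntry are hypothetical, and the real getEntry() also reports usesCRT, which is left out here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified image description: only what the sketch needs.
struct ToyImage {
    std::uintptr_t loadAddress;  // where the image was mapped in memory
    std::uint64_t  entryOffset;  // offset of main from the start of the image
    bool           hasEntry;     // false models a missing entry point

    // Mirrors the shape of getEntry(): report the offset through an
    // out-parameter and signal success with the return value.
    bool getEntry(std::uint64_t& offset) const {
        if (!hasEntry) return false;
        offset = entryOffset;
        return true;
    }
};

// Returns the absolute entry address, or 0 if the image has no entry point.
std::uintptr_t resolveEntry(const ToyImage& image) {
    assert(image.loadAddress != 0);
    std::uint64_t offset = 0;
    if (!image.getEntry(offset)) return 0;
    return image.loadAddress + static_cast<std::uintptr_t>(offset);
}
```

Storing an offset rather than an absolute address means the result is still correct wherever the image ends up being mapped. Where the sketch returns 0 for a missing entry point, dyld calls halt().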

Conclusion

In summary, the process before main is called is now clearer. It begins in dyld’s start function, which uses prepare() to get everything ready for main: loading and linking all dynamic libraries and the executable, running initializers, and so on. prepare() finds the entry point of main and returns it to start, which then calls main.

Of course, some questions remain: what exactly happens inside _objc_init, what load_images is for, and whether appMain is the very main function from the screenshot at the start. (At first glance the number of arguments does not match, so there is more to explain there.)

Interested readers can keep digging; we will continue the discussion in a later article.