Preface
The previous articles are collected in the OC Underlying Principles series. The object part gave us a fairly complete understanding of OC objects, the class part gave us a clear picture of OC classes, and the message-sending part covered method lookup and forwarding in depth. An application is made up of exactly these things: objects, classes, and method calls (message sends). In this article we look at how an App is loaded.
Explore the App startup process
When exploring the App startup process, the first thing that comes to mind is verification through code
Verify App startup sequence by code
Prepare the following code
    // ViewController
    @implementation ViewController

    + (void)load {
        NSLog(@"%s", __func__);
    }

    - (void)viewDidLoad {
        [super viewDidLoad];
    }

    @end

    // main.m
    // The constructor attribute makes kcFunc run before main().
    __attribute__((constructor)) void kcFunc() {
        printf("%s\n", __func__);
    }

    int main(int argc, char * argv[]) {
        NSLog(@"enter main function");
        NSString * appDelegateClassName;
        @autoreleasepool {
            // Setup code that might create autoreleased objects goes here.
            appDelegateClassName = NSStringFromClass([AppDelegate class]);
        }
        return UIApplicationMain(argc, argv, nil, appDelegateClassName);
    }
We add a +load method to ViewController and, in main.m, a C++ static function (kcFunc, marked with the constructor attribute).
We also add a log to the main function, run the project, and check the print order: load is printed first, then kcFunc, and the log from main last. So the question is: we all know that main is the entry point of our App, yet the load method runs first, the C++ static function second, and main last. Why is that?
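To see the same ordering outside of an iOS project, here is a minimal C sketch. It assumes a GCC or Clang toolchain, since the constructor attribute is a compiler extension. The names record_event, early_init, and note_main_entered are made up for illustration and do not come from the project above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical log of startup events, filled in as each stage runs. */
#define MAX_EVENTS 8
static const char *g_events[MAX_EVENTS];
static int g_event_count = 0;

static void record_event(const char *name) {
    assert(g_event_count < MAX_EVENTS);
    g_events[g_event_count++] = name;
    printf("%s\n", name);
}

/* Runs while the loader initializes the image, before main() is entered.
   This is the same attribute that kcFunc uses. */
__attribute__((constructor))
static void early_init(void) {
    record_event("constructor");
}

/* Call this as the first statement of main() to log when main starts. */
void note_main_entered(void) {
    record_event("main");
}
```

If note_main_entered() is the first statement of main, the log holds the constructor entry first and the main entry second, which matches what we observed for kcFunc and main.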
Background: static libraries, dynamic libraries, and executables
We know that an App relies on many libraries at the bottom layer while it runs, both static and dynamic.
- Suffixes: static libraries usually end in .a, .lib, or .framework, while dynamic libraries end in .dylib, .tbd, .so, or .framework.
- Static libraries: at link time a static library is copied in full into the executable, so every program that uses it carries its own redundant copy.
- Dynamic libraries: nothing is copied at link time. The system loads the library into memory dynamically when the program runs, loads it only once, and shares it among multiple programs, which saves memory.
Note: the static and dynamic libraries discussed here are mainly Apple's official ones; our own private dynamic libraries are not made public.
One more point to fill in. We know that when a project is built, the .h and .m files are first preprocessed and then compiled. After compilation comes assembly, and after assembly comes linking (which combines all the code into our program). The final result is an executable file.
The project above produces such an executable. So at launch the dependency libraries have to be loaded, and the code from our .h and .m files has to be loaded. Who decides the order in which these things are loaded? That is today's topic: dyld, the dynamic linker.
It determines the order in which the content is loaded.
In outline: the App's images are handed to dyld, which reads them into memory, starts the main program, links, and initializes some necessary pieces (runtime, libSystem init, OS_init).
So let's study dyld.
Explore dyld
Analyze the main program initialization process
First we download the dyld source code from the official website. Opening it, we see a lot of content, so where do we start? We add a breakpoint in the load method, run the project, and when the breakpoint is hit we print the call stack with bt (this is the general approach to remember).
The stack shows that the very first dyld frame is _dyld_start.
We then search globally for _dyld_start to find its implementation, and look at the x86 version (there are versions for different architectures, such as ARM and i386). It is all assembly, which is hard to read, so what do we do? We rely on the comments, which are quite detailed.
From the comments we see that the assembly calls into dyldbootstrap, so we search for dyldbootstrap. There is a lot of content there; scrolling down, we find the start method. Much of it we don't need to know; we only need to focus on the important calls.
macho_header: the header of the Mach-O file. What we see when we open the executable in MachOView (the "rotten apple" tool) is the content of this Mach-O file.
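To make the header a little more concrete, here is a hedged C sketch of reading it. The struct is a local copy of the 64-bit header layout from <mach-o/loader.h>, redefined so the sketch builds on any platform. The sketch_ names are invented for this example and are not dyld's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the 64-bit Mach-O header fields. */
typedef struct {
    uint32_t magic;       /* identifies the file as 64-bit Mach-O */
    int32_t  cputype;
    int32_t  cpusubtype;
    uint32_t filetype;    /* executable, dylib, and so on */
    uint32_t ncmds;       /* number of load commands that follow */
    uint32_t sizeofcmds;  /* total size of those load commands */
    uint32_t flags;
    uint32_t reserved;
} sketch_mach_header_64;

#define SKETCH_MH_MAGIC_64 0xfeedfacfu  /* value of MH_MAGIC_64 */
#define SKETCH_MH_EXECUTE  0x2u         /* value of MH_EXECUTE */

/* Copies the header out of buf. Returns 1 if the buffer is large enough
   and starts with the 64-bit magic in host byte order, otherwise 0. */
int sketch_read_header(const void *buf, size_t len, sketch_mach_header_64 *out) {
    assert(buf != NULL && out != NULL);
    if (len < sizeof(*out)) {
        return 0;
    }
    memcpy(out, buf, sizeof(*out));
    return out->magic == SKETCH_MH_MAGIC_64;
}
```

The ncmds and sizeofcmds fields are what a loader uses next: they say how many load commands follow the header and how many bytes they occupy.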
From start we click into the _main function. It is about 700 lines long, so again we only pick out the important parts. Because the function returns result, we look for the places where result is assigned.
Searching for result, we find that every assignment goes through sMainExecutable, so sMainExecutable is the key. We then search for sMainExecutable and find the code that initializes it: a call to instantiateFromLoadedImage (the comment above it in the source is worth reading). How is it initialized? Through ImageLoader, so let's look at instantiateMainExecutable. The most important call in it is sniffLoadCommands, so we search for sniffLoadCommands. Here's a trick: the first parameter of sniffLoadCommands is mh, and it is declared const,
so we can include that in the search and jump straight to the definition. In it we see some of the same content as the Load Commands shown in MachOView. In fact, this method works out the layout of the whole main program. Going back to instantiateFromLoadedImage: it essentially creates an ImageLoader, adds it to the loaded images, and returns it. So sMainExecutable is the main program image that has just been created.
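The kind of pass that sniffLoadCommands makes over the load commands can be sketched in C as follows. This is a simplified illustration, not dyld's code: it only counts segments and dylib loads, the struct mirrors load_command from <mach-o/loader.h>, and every sketch_ name is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Every load command starts with these two fields. */
typedef struct {
    uint32_t cmd;      /* command type */
    uint32_t cmdsize;  /* size of this command in bytes, including this header */
} sketch_load_command;

#define SKETCH_LC_SEGMENT_64 0x19u  /* value of LC_SEGMENT_64 */
#define SKETCH_LC_LOAD_DYLIB 0xcu   /* value of LC_LOAD_DYLIB */

typedef struct {
    uint32_t seg_count;
    uint32_t lib_count;
} sketch_counts;

/* Walks ncmds commands stored in cmds[0..size) and counts segments and
   dylib loads. Returns 0 if a command is malformed or runs past the
   buffer, otherwise 1. */
int sketch_sniff(const uint8_t *cmds, size_t size, uint32_t ncmds, sketch_counts *out) {
    size_t offset = 0;
    assert(cmds != NULL && out != NULL);
    out->seg_count = 0;
    out->lib_count = 0;
    for (uint32_t i = 0; i < ncmds; i++) {
        sketch_load_command lc;
        if (size - offset < sizeof lc) {
            return 0;
        }
        memcpy(&lc, cmds + offset, sizeof lc);
        if (lc.cmdsize < sizeof lc || lc.cmdsize > size - offset) {
            return 0;
        }
        if (lc.cmd == SKETCH_LC_SEGMENT_64) {
            out->seg_count++;
        } else if (lc.cmd == SKETCH_LC_LOAD_DYLIB) {
            out->lib_count++;
        }
        offset += lc.cmdsize;  /* the next command starts right after this one */
    }
    return 1;
}
```

Stepping by cmdsize is the important detail: load commands have different lengths, so the only way to reach the next one is to add the size recorded in the current one.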
Analyze the _main function process
Now that we've looked at the initialization of the main program, let's look at what the _main function does, in order:
- Determine version information.
- Determine the current platform.
- Determine the current Mach-O address.
- Set crash and log addresses and context information.
- Set the environment variables. envp is a parameter of _main and is the array of all environment variables.
- Process the shared cache.
- Initialize the main program (described above).
- Insert dynamic libraries.
- Link the main program.
- Link all our images (from these two steps we know that the main program is linked first, and then the other images).
- Bind weak symbols.
- Run all initializers.
- Find the main function entry.
Here’s a summary of the process:
Analyze the initializeMainExecutable initialization process
Above we saw that the _main function runs all initializers, so let's look at how that is implemented. We click into initializeMainExecutable and then search globally for runInitializers (inside initializeMainExecutable, the name runInitializers tells us it is the initialization call, so that is what we search for).
Reading runInitializers line by line, we discover that processInitializers does the real work, so we search for processInitializers. In it, the important call is recursiveInitialization. It is recursive for a reason: our dynamic or static libraries pull in other libraries, so the images form a multi-level dependency structure that has to be initialized recursively.
We search for recursiveInitialization. There is a lot of code in this method, so which parts do we need to investigate? (recursiveSpinLock(lock_info); is a recursive lock.)
dependentImage: this part walks the images that the current image depends on and makes sure they are handled first, because the dynamic libraries we talked about earlier may depend on other libraries.
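The dependencies-first idea can be illustrated with a small C sketch. It is a simplified model under assumptions of my own: images are plain structs, and "running the initializers" is replaced by appending the image name to a log. None of the sketch_ names come from dyld.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPS 4
#define MAX_LOG 16

/* Hypothetical stand-in for a loaded image and the images it depends on. */
typedef struct sketch_image {
    const char *name;
    struct sketch_image *deps[MAX_DEPS];
    int dep_count;
    int initialized;
} sketch_image;

/* Order in which images were initialized. */
static const char *g_init_log[MAX_LOG];
static int g_init_count = 0;

/* Initializes everything img depends on, then img itself. */
void sketch_recursive_init(sketch_image *img) {
    assert(img != NULL);
    if (img->initialized) {
        return;
    }
    img->initialized = 1;  /* mark first so that dependency cycles terminate */
    for (int i = 0; i < img->dep_count; i++) {
        sketch_recursive_init(img->deps[i]);
    }
    assert(g_init_count < MAX_LOG);
    g_init_log[g_init_count++] = img->name;  /* stands in for running initializers */
}
```

With an App image that depends on libobjc, which in turn depends on libSystem, calling sketch_recursive_init on the App logs libSystem, then libobjc, then App. The lowest-level library is always initialized first.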
For example, when we were working with the objc source code, our KCObjc target depended on objc; that is the kind of dependency meant here. This is background rather than the focus.
The key call is notifySingle. Lines 1595 and 1603 both call it, so let's look at the notifySingle code. Again there is a lot of code, and we have to learn to find the important part, which here is the call to sNotifyObjCInit. A global search shows the assignment sNotifyObjCInit = init; inside registerObjCNotifiers, so whoever calls registerObjCNotifiers supplies the value of sNotifyObjCInit. To find out what sNotifyObjCInit actually is, we therefore need to find the caller of registerObjCNotifiers. That is the line of reasoning.
Searching for registerObjCNotifiers, we find that it is called from _dyld_objc_notify_register, and _dyld_objc_notify_register is in turn called in the objc source code, inside the _objc_init method. So _dyld_objc_notify_register works like a callback registration: when _objc_init runs, it hands its functions over to dyld, and dyld calls them later. It follows that some initializer must run _objc_init before dyld can use sNotifyObjCInit.
Going back to recursiveInitialization, let's look at doInitialization, which is the initialization call. Clicking into it, we find that ImageLoaderMachO calls doImageInit, so let's look at doImageInit. There is still a lot of code, so look at line 2258: it obtains the function to call by offsetting a memory address. As an example, once we have the memory address of main, we can keep offsetting from that address to reach other functions. Line 2258 simply keeps offsetting to obtain each initialization function and call it.
libSystem must be initialized first, otherwise an error is reported. We have now studied the initialization process, which ends with initialization functions being called through offset memory addresses. So let's verify that.
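The register-then-notify relationship described above can be sketched in plain C with a function pointer. This is only a model of the pattern: s_notify_init plays the role of sNotifyObjCInit, sketch_register_notifier the role of registerObjCNotifiers, and sketch_notify_single the role of notifySingle. The callback sketch_runtime_init is a hypothetical stand-in for whatever the runtime registers.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*sketch_init_callback)(const char *image_path);

/* Empty until the runtime side registers its callback. */
static sketch_init_callback s_notify_init = NULL;

/* The runtime calls this once, during its own initialization. */
void sketch_register_notifier(sketch_init_callback init) {
    assert(init != NULL);
    s_notify_init = init;
}

/* The loader calls this for each image. Returns 1 if a callback was
   registered and has been invoked, otherwise 0. */
int sketch_notify_single(const char *image_path) {
    if (s_notify_init == NULL) {
        return 0;
    }
    s_notify_init(image_path);
    return 1;
}

/* Hypothetical runtime-side callback; it only counts its invocations. */
static int g_runtime_init_calls = 0;
void sketch_runtime_init(const char *image_path) {
    (void)image_path;
    g_runtime_init_calls++;
}
```

Before registration, sketch_notify_single does nothing. After sketch_register_notifier(sketch_runtime_init), every notification reaches the runtime's function. This is why the library that performs the registration has to be initialized before the images that rely on the callback.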
Verify the initializeMainExecutable process
Let's set a breakpoint on _objc_init in the objc source code, run, and when the breakpoint is hit, look at the stack. The stack shows libSystem_initializer; to read its code we download the Libsystem source. The version we downloaded is Libsystem-1252.250.1. We open the source, search for libSystem_initializer, and find the code. It is long (only part of it is shown), so let's read down through it.
Line 149 initializes the kernel, line 151 does the platform-specific initialization, line 153 initializes threads (GCD is initialized after this), line 155 initializes libc, and line 158 initializes malloc.
Line 168 initializes dyld (_dyld_start only bootstraps dyld; dyld is also a library that needs initializing) and line 170 initializes libdispatch. We also saw the libdispatch initialization call in the stack above.
libdispatch_init lives in libdispatch.dylib, so we search for libdispatch_init in the libdispatch source (libdispatch-1173.40.5). Moving down, we see some _dispatch_thread creation work (GCD-related). At the end we find the call to _os_object_init, so we search for it. Line 75 is _objc_init(); in other words, _os_object_init calls _objc_init to initialize the runtime. As described above, _objc_init calls _dyld_objc_notify_register, which assigns sNotifyObjCInit.
Going back to recursiveInitialization: once the doInitialization call has run _objc_init and the callback is registered, notifySingle can jump to sNotifyObjCInit, and sNotifyObjCInit() is executed.
Summarize the process
Let’s use a flowchart to look at the entire initializeMainExecutable process
Explain the App startup process
At the start of the article we saw that when the App launched, the load method ran first, then the C++ static function, and then the main function. Let's explain why.
Verify that load is called first
As analyzed above, the initializers run first, and on the objc side _objc_init is where things start. It ends with a call to _dyld_objc_notify_register, passing in load_images.
So what does load_images do? We see that it calls call_load_methods at the end, so what does call_load_methods contain? Lines 350 to 360 are a do-while loop, and inside the loop line 353 calls call_class_loads.
So let's look at the code inside call_class_loads. Lines 196-205 are a for loop, and on line 204 we see that the load method is executed.
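The loop structure just described can be modeled in C as follows. This is a simplified sketch, not the objc source: it handles only a list of pending class load functions and leaves out categories, and all of the sketch_ names and the two sample load functions are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*sketch_load_fn)(void);

#define MAX_PENDING 8
static sketch_load_fn g_pending[MAX_PENDING];  /* classes waiting for +load */
static int g_pending_used = 0;

void sketch_add_loadable(sketch_load_fn fn) {
    assert(fn != NULL && g_pending_used < MAX_PENDING);
    g_pending[g_pending_used++] = fn;
}

/* Detaches the current list, then calls each entry with a for loop.
   Entries added while a load function runs are left for the next pass. */
static void sketch_call_class_loads(void) {
    sketch_load_fn batch[MAX_PENDING];
    int used = g_pending_used;
    for (int i = 0; i < used; i++) {
        batch[i] = g_pending[i];
    }
    g_pending_used = 0;
    for (int i = 0; i < used; i++) {
        batch[i]();
    }
}

/* Repeats until no load functions are pending. Returns the pass count. */
int sketch_call_load_methods(void) {
    int passes = 0;
    do {
        sketch_call_class_loads();
        passes++;
    } while (g_pending_used > 0);
    return passes;
}

/* Two hypothetical +load bodies; the first registers the second. */
static int g_calls = 0;
void sketch_load_b(void) { g_calls += 10; }
void sketch_load_a(void) { g_calls += 1; sketch_add_loadable(sketch_load_b); }
```

The do-while is there because running one batch of load functions can make more of them pending. The outer loop keeps going until a pass finishes with nothing left to call.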
So the load methods are called from load_images, the callback that was handed to dyld through _dyld_objc_notify_register.
Now let's see where C++ static functions are called.
Validate C++ static method calls
Above we only analyzed doImageInit in ImageLoaderMachO, but right after it comes doModInitFunctions. Clicking into ImageLoaderMachO's doModInitFunctions, we see that this method calls all the C++ static functions in the image. Let's verify: set a breakpoint in the C++ static function we wrote earlier, run the code, and print the stack with bt.
Frame #1 is doModInitFunctions, which shows that C++ static functions are executed by doModInitFunctions.
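Calling initializers by stepping through memory can be pictured with a table of function pointers. The C sketch below is an assumption-laden model rather than dyld's implementation: g_mod_init_funcs stands in for the section of initializer pointers inside an image, and the two sample initializers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*sketch_initializer)(void);

/* Records the call order as decimal digits, e.g. 12 means one then two. */
static int g_trace = 0;
void sketch_init_one(void) { g_trace = g_trace * 10 + 1; }
void sketch_init_two(void) { g_trace = g_trace * 10 + 2; }

/* Hypothetical table standing in for an image's initializer pointers. */
static const sketch_initializer g_mod_init_funcs[] = {
    sketch_init_one,
    sketch_init_two,
};

/* Steps a pointer through the table and calls each non-NULL entry.
   Returns the number of initializers that were called. */
size_t sketch_run_initializers(const sketch_initializer *funcs, size_t count) {
    size_t called = 0;
    assert(funcs != NULL);
    for (const sketch_initializer *p = funcs; p != funcs + count; p++) {
        if (*p != NULL) {
            (*p)();
            called++;
        }
    }
    return called;
}
```

Each step moves the pointer by the size of one function pointer, which is the "keep offsetting the address" idea from earlier: the loader does not need to know the functions by name, only where the table starts and how many entries it has.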
Verifying the main function
At this point, let's switch to the assembly view. When kcFunc finishes and we keep stepping, execution goes on to main. The call to main is compiled in and hard-coded: the name main is looked up and its address is loaded. We can verify this by changing the name of the main function: rename main to ll_main and run it, and it no longer works. In the _dyld_start assembly for each architecture (i386, arm64, and so on), the jump to main is fixed at the lowest level.
Conclusion
In this article we mainly covered how a program starts running, focusing on the dyld process. We then went through initializeMainExecutable in detail and explained why the call order is load first and main last. There is a lot here that takes time to digest. If anything is wrong, please point it out! The point this article repeats most often is to find the important methods, and in later explorations we must be able to tell which methods matter. First check whether the method returns a value, then find where that value is assigned (reading the comments along the way to confirm). If there is no return value, work out what the method is for, then go through the code and the comments, and if necessary follow the calls it makes to confirm your understanding.