Application loading
Libraries: binary files containing executable code that the system can load into memory. There are two kinds of library: static libraries (.a, .lib) and dynamic libraries (.so, .dll, .dylib, .framework, ...). Static libraries are linked into the executable at build time, so the same code may be copied into several binaries. Dynamic libraries are loaded only when they are needed, and a single copy in memory is shared rather than duplicated. Sharing reduces package size, which is why Apple's system libraries are all dynamic.
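To make "loaded only when needed" concrete, here is a minimal sketch using the POSIX dlopen/dlsym API, which asks the dynamic loader for a symbol at run time instead of at link time. The helper name call_strlen_dynamically is made up for illustration, and the sketch assumes a POSIX system where the C library is itself a dynamically loaded image (older glibc versions may need -ldl when linking).

```c
#include <assert.h>
#include <dlfcn.h>   /* dlopen, dlsym, dlclose (POSIX) */
#include <stddef.h>

/* Function pointer type matching strlen's signature. */
typedef size_t (*strlen_fn)(const char *);

/* Hypothetical helper: look up strlen at run time instead of link time.
 * Returns (size_t)-1 if the lookup fails. */
size_t call_strlen_dynamically(const char *s)
{
    assert(s != NULL);

    /* NULL asks for a handle to the main program and the images that
     * were loaded along with it, which includes the C library. */
    void *handle = dlopen(NULL, RTLD_LAZY);
    if (handle == NULL)
        return (size_t)-1;

    strlen_fn fn = (strlen_fn)dlsym(handle, "strlen");
    size_t n = (fn != NULL) ? fn(s) : (size_t)-1;

    dlclose(handle);
    return n;
}
```

A static library offers no equivalent of this lookup: its code is already copied into the executable by the time the program starts.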
The build process
Executable file
Build any project and, once the build succeeds, click the product and choose Show in Finder -> Show Package Contents. The file with the black icon is the Mach-O executable. To run it, drag the Mach-O file straight into the terminal. iOS projects need authorization to open the simulator, while macOS projects run directly.
Dyld Dynamic linker
Image: a library that has been mapped into memory is called an image.
Dyld Load process
To explore, open the objc source code and go to the _objc_init function. The comment above it tells us two things: 1. it registers the image notifier with dyld; 2. it is called by libSystem before library initialization time.
/***********************************************************************
* _objc_init
* Bootstrap initialization. Registers our image notifier with dyld.
* Called by libSystem BEFORE library initialization time
**********************************************************************/
void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;

    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    runtime_init();
    exception_init();
#if __OBJC2__
    cache_t::init();
#endif
    _imp_implementationWithBlock_init();

    _dyld_objc_notify_register(&map_images, load_images, unmap_image);

#if __OBJC2__
    didCallDyldNotifyRegister = true;
#endif
}
When we put a breakpoint on the main function of the iOS project, we find that the load method of ViewController has already been called before main. In other words, load methods run before main.
Focusing on the load method, put a breakpoint in it and look at which functions were called before load. The call stack is a stack structure (first in, last out), so we analyze it from the bottom up. The stack shows the dyld library, so we download the latest dyld source from Apple's open source site (opensource) to analyze it. The library has many low-level dependencies, so we cannot build and run it for now, but that does not stop us from reading it.
1._dyld_start
2.dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*)
3.dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*)
4.dyld::useSimulatorDyld(int, macho_header const*, char const*, int, char const**, char const**, char const**, unsigned long*, unsigned long*)
5.dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*)
6.dyld::initializeMainExecutable()
7.ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&)
8.ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&)
9.dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*)
10.load_images
We open the dyld source and follow the stack from method 1 to method 2. Search for dyldbootstrap and find the start function in that namespace; it corresponds to method 2 in the console stack. The last line of start returns dyld::_main, which is method 3 above. At this point the dyldbootstrap stage is finished and dyld's _main function is called.
_main is about a thousand lines long and it is hard to know where to start, so we start from its return value, result. result is assigned in only a few places. Look at the assignment under the comment “Find the entry point to the main executable”, which reads the entry point from sMainExecutable. Searching for sMainExecutable then shows what _main does with it, step by step:
1. Initialize the image loader for the main executable, instantiating the main program
2. Load any inserted libraries
3. Link the main program
4. Link the inserted libraries (done after the main program is linked)
5. Weak-bind the main program (after all images are linked)
6. Run all initializers; this corresponds to method 6 in the console stack
6.1 Run the initializers of the main executable and of every library it brings in; this corresponds to method 7 in the console stack
6.2 Image loading and initialization. Look at ImageLoader::processInitializers. Its comment notes that the initializers of upward-linked dylibs could otherwise be run too soon: to handle dangling dylibs that are linked upward but not downward, all upward-linked dylibs have their initialization delayed until the recursion through the downward-linked dylibs has completed. The comment at line 602 says that it calls recursive init on every image in the image list, building a new list of uninitialized upward dependencies. Search globally for recursiveInitialization(const to reach the recursive initialization method. Its comments say that it first initializes the libraries the image depends on and then lets objc know that we are about to initialize this image. The function used for that is notifySingle, which corresponds to method 9 in the console stack.
6.3 Find the notifySingle function. It notifies objc that the image is being initialized through static _dyld_objc_notify_init sNotifyObjCInit. Searching for where a _dyld_objc_notify_init value is assigned to it leads to a familiar name, _dyld_objc_notify_register. That is exactly the call we saw in the _objc_init source, so at this point we have stepped out of dyld into objc and the trace forms a closed loop.
7. Notify any monitoring processes that this process is about to enter main
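The ordering in step 6.2 can be sketched with a toy model. This is only an illustration under stated assumptions: the Image struct, the event log and every function name below are hypothetical stand-ins, not dyld's ImageLoader API. It shows dependencies being initialized first, then the objc notification (the point where load_images runs), then the image's own initializers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEPS 4

/* Hypothetical stand-in for a dyld image; not dyld's real ImageLoader. */
typedef struct Image {
    const char   *name;
    struct Image *deps[MAX_DEPS];  /* libraries this image links against */
    int           dep_count;
    int           initialized;     /* guards against initializing twice */
} Image;

/* Event log so the order can be inspected afterwards. */
static char init_log[256];

static void log_event(const char *what, const char *name)
{
    size_t used = strlen(init_log);
    snprintf(init_log + used, sizeof init_log - used, "%s(%s) ", what, name);
}

/* Dependencies first, then the objc notification, then the image's
 * own initializers. */
void recursive_initialization(Image *img)
{
    assert(img != NULL);
    if (img->initialized)
        return;
    img->initialized = 1;

    for (int i = 0; i < img->dep_count; i++)
        recursive_initialization(img->deps[i]);

    log_event("notify", img->name);  /* stands in for notifySingle */
    log_event("init", img->name);    /* stands in for the image's initializers */
}

const char *initialization_log(void) { return init_log; }
```

For an App image that depends on libSystem, the log lists libSystem's two events before App's, and within each image the notification precedes the initializers.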
_dyld_objc_notify_register(&map_images, load_images, unmap_image)
The first argument, map_images, covers class loading: protocols, properties, ro/rw, class initialization, lazy loading, and so on. The second, load_images, gathers and calls the load methods. We need to find out where these three arguments are stored and when each of them is called.
// _dyld_objc_notify_register
void _dyld_objc_notify_register(_dyld_objc_notify_mapped mapped,
                                _dyld_objc_notify_init init,
                                _dyld_objc_notify_unmapped unmapped)
{
    dyld::registerObjCNotifiers(mapped, init, unmapped);
}
Go into dyld::registerObjCNotifiers and look at the assignments in it:
// _dyld_objc_notify_init
void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
// record functions to call
sNotifyObjCMapped = mapped;
sNotifyObjCInit = init;
sNotifyObjCUnmapped = unmapped;
// ...
}
Timing of the sNotifyObjCMapped call
A global search for sNotifyObjCMapped leads to the static void notifyBatchPartial function, which contains this call:
(*sNotifyObjCMapped)(objcImageCount, paths, mhs);
Next, search for the callers of notifyBatchPartial. One of them is registerObjCNotifiers itself, right after the assignments:
void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
// record functions to call
sNotifyObjCMapped = mapped;
sNotifyObjCInit = init;
sNotifyObjCUnmapped = unmapped;
// call 'mapped' function with all images mapped so far
try {
    notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
}
catch (const char* msg) {
    // ignore request to abort during registration
}
}
That is, as soon as sNotifyObjCMapped has been assigned, (*sNotifyObjCMapped) is called through notifyBatchPartial for all the images mapped so far.
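The "store the callback, then immediately replay what has already happened" pattern can be sketched on its own. The names below (image_mapped, register_mapped_notifier, count_images) and the simplified callback signature are assumptions made for illustration; dyld's real _dyld_objc_notify_mapped also receives the Mach-O headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified callback type. */
typedef void (*mapped_callback)(unsigned count, const char *const paths[]);

static mapped_callback s_notify_mapped;    /* plays the role of sNotifyObjCMapped */
static const char     *s_loaded_paths[8];  /* images mapped so far */
static unsigned        s_loaded_count;

/* Record an image as mapped and notify the observer if one is registered. */
void image_mapped(const char *path)
{
    assert(s_loaded_count < 8);
    s_loaded_paths[s_loaded_count++] = path;
    if (s_notify_mapped != NULL)
        s_notify_mapped(1, &s_loaded_paths[s_loaded_count - 1]);
}

/* Store the callback, then replay every image that was mapped before
 * the observer registered. */
void register_mapped_notifier(mapped_callback mapped)
{
    s_notify_mapped = mapped;
    if (mapped != NULL && s_loaded_count > 0)
        mapped(s_loaded_count, s_loaded_paths);
}

/* Example observer that only counts the images it is told about. */
static unsigned s_seen;

void count_images(unsigned count, const char *const paths[])
{
    (void)paths;
    s_seen += count;
}

unsigned images_seen(void) { return s_seen; }
```

The replay matters because the observer registers late: without it, images mapped before registration would never be reported.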
The call time of sNotifyObjCInit
A global search for sNotifyObjCInit finds two call sites. The first is in the registerObjCNotifiers function:
void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
    // record functions to call
    sNotifyObjCMapped = mapped;
    sNotifyObjCInit = init;
    sNotifyObjCUnmapped = unmapped;
    // ...
    (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
}
That is, after sNotifyObjCInit is assigned, (*sNotifyObjCInit) is also called right away. The second call site is in static void notifySingle.
Why is load called before C++ functions?
From the above we know that (*sNotifyObjCInit) is called in notifySingle, and from the order of the call stack we know that notifySingle is called during the recursive initialization of the images. Given this order, it would be natural to assume that C++ functions are always called before the load method, but is that really the case?
1. Write a C++ function in the file that contains main
__attribute__((constructor)) void testFunc(void) {
printf("Coming: %s \n",__func__);
}
2. Implement the load() method in Person
+ (void)load {
printf("Coming: %s \n",__func__);
}
3. Implement a C++ function in the objc source code as well
Put a breakpoint on main and run. The observed call order is: C++ functions in the objc image > load > C++ functions in the current project.
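The "before main" half of this experiment can be checked in plain C. This is a minimal sketch that assumes a GCC or Clang toolchain, since __attribute__((constructor)) is a compiler extension. The counter and the helper names are made up for illustration, and the sketch says nothing about the ordering relative to +load, which needs the Objective-C runtime.

```c
#include <assert.h>
#include <stdio.h>

static int constructor_calls;  /* how many times the constructor has run */

/* GCC/Clang extension: run this function while the image is being
 * initialized, before main() is entered. */
__attribute__((constructor))
static void test_func(void)
{
    constructor_calls++;
    printf("Coming: %s\n", __func__);
}

/* Number of constructor runs observed so far. */
int constructor_call_count(void)
{
    return constructor_calls;
}

/* Call from main(): aborts unless the constructor has already run
 * exactly once. */
void expect_constructor_ran(void)
{
    assert(constructor_calls == 1);
}
```

Calling expect_constructor_ran() as the first statement of main passes, because the loader has already run the constructor by then.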
Introducing libSystem
From the above analysis, we know that dyld notifies objc about loaded images through the notifiers registered with _dyld_objc_notify_register, and that this registration is made in _objc_init. So we set a breakpoint directly in the objc source and read the call order from the call stack: _dyld_start -> ... -> _os_object_init -> _objc_init. There is one step in between that we have not analyzed yet, _os_object_init. It belongs to the libdispatch library, which we download directly from opensource.
libdispatch/libSystem
Open the libdispatch source and search for _os_object_init. Internally it calls _objc_init (this is the objc function, not an init function of libdispatch). _os_object_init is in turn called from libdispatch_init, which is called from libSystem_initializer. The chain is:
libSystem.B.dylib libSystem_initializer
-> libdispatch.dylib libdispatch_init
-> libdispatch.dylib _os_object_init
-> libobjc.A.dylib _objc_init
In summary, this confirms once again that the four frames circled in the call stack in the figure above involve the libdispatch and libSystem libraries.
Dyld Load process
Open dyld again and search for doModInitFunctions. It contains a comment saying that the libSystem initializer must run first, so we can guess that this is where the libSystem library gets initialized. doModInitFunctions is called from ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&), which in turn is called from the recursive image initialization described in 6.2 above. Thus the whole analysis forms a closed loop.
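The "libSystem initializer must run first" rule can be sketched as a guard in front of an initializer table. This is a simplified illustration, not dyld's code: the flag, the table layout, the is_lib_system parameter and all function names are assumptions.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*initializer_fn)(void);

/* Plays the role of dyld's "libSystem has been initialized" flag. */
static int lib_system_initialized;
static int initializers_run;

/* Hypothetical initializers for two images. */
void lib_system_initializer(void) { lib_system_initialized = 1; initializers_run++; }
void app_initializer(void)        { initializers_run++; }

/* Run an image's initializer table in order.
 * Returns 0 on success, or -1 if an ordinary image tries to run its
 * initializers before libSystem's initializer has run. */
int do_mod_init_functions(const initializer_fn *table, size_t count,
                          int is_lib_system)
{
    assert(table != NULL || count == 0);
    if (!is_lib_system && !lib_system_initialized)
        return -1;  /* libSystem initializer must run first */
    for (size_t i = 0; i < count; i++)
        table[i]();
    return 0;
}

int initializers_run_so_far(void) { return initializers_run; }
```

Because images are initialized dependencies-first and libSystem sits at the bottom of the dependency graph, the guard should never fail in a normally linked program.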
supplement
image list
To view the images used by the current project, run the image list command in the LLDB console. It also shows the local path of each library.
The difference between dyld 2 and dyld 3
In iOS 13, the new dyld 3 is adopted across the system to replace dyld 2. Before we start, here is a screenshot from WWDC 2017 session 413 that shows the difference between dyld 2 and dyld 3.
dyld2
According to the figure above and the dyld source code, the main workflow of dyld 2 is:
- Initialize dyld. The main code is in dyldbootstrap::start, which then runs dyld::_main. dyld::_main holds most of the code and is the core of the loading process.
- Check and prepare the environment, such as getting the binary path, checking the environment variables, and parsing the image header of the main binary.
- Instantiate the image loader of the main binary and verify that the versions of the main binary and dyld match.
- Check whether the shared cache is mapped. If not, map the shared cache first.
- Check DYLD_INSERT_LIBRARIES and load the inserted dynamic libraries (instantiating an image loader for each).
- Perform the link operation. This is a complex process that recursively loads all the dependent dynamic libraries (sorted so that dependencies always come first) and performs symbol binding, that is, the rebase and bind operations.
- Run the initializers. Objective-C +load methods and C constructor functions are executed at this stage.
- Read the LC_MAIN load command of the Mach-O file to get the program's entry address, and call the main function.
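The rebase and bind operations in the link step can be illustrated with a toy model. This is a sketch under simplifying assumptions: the ExportedSymbol table and both function names are hypothetical, and real dyld reads this information from compressed opcode streams in the Mach-O file rather than from C arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Rebase: pointers stored inside the image were computed for its
 * preferred load address, so add the slide (actual - preferred). */
void rebase_pointers(uintptr_t *slots, size_t count, intptr_t slide)
{
    assert(slots != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        slots[i] += (uintptr_t)slide;
}

/* Hypothetical export entry of some other loaded image. */
typedef struct {
    const char *name;
    uintptr_t   address;
} ExportedSymbol;

/* Bind: point an import slot at the address of the named symbol.
 * Returns 0 on success, -1 if the symbol is not exported. */
int bind_symbol(uintptr_t *slot, const char *name,
                const ExportedSymbol *exports, size_t export_count)
{
    assert(slot != NULL && name != NULL);
    for (size_t i = 0; i < export_count; i++) {
        if (strcmp(exports[i].name, name) == 0) {
            *slot = exports[i].address;
            return 0;
        }
    }
    return -1;
}
```

Rebasing touches pointers that refer to the image itself, while binding fills in pointers that refer to other images, which is why binding needs a symbol lookup and rebasing only needs the slide.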
dyld3
dyld 3 was not new at WWDC19. It was introduced in iOS 11 back in 2017 to optimize the loading of system libraries. From iOS 13 it is also used to launch third-party apps, completely replacing dyld 2. Since the dyld 3 code is not open source, the improvements can currently only be learned from official disclosures. The best thing about dyld 3 is that part of its work is done out of process and cached, so by the time you open the app a lot of the work has already been done.
Dyld 3 contains three components:
- An out-of-process Mach-O analyzer/compiler
In dyld 2's loading process, the Parse mach-o headers and Find dependencies steps are a security risk (they can be attacked by modifying the Mach-O header or by adding an illegal @rpath), and the Perform symbol lookups step costs a lot of CPU time even though, as long as the library file is unchanged, a symbol is always at the same offset in the library. In dyld 3 these steps are done ahead of time and their results are written to a file, which forms a launch closure.
- An in-process engine that executes the launch closure
It verifies that the launch closure is correct, maps the dylibs, and jumps to main. At this point it no longer has to parse the Mach-O headers or perform symbol lookups, which saves a lot of time.
- A launch closure cache
Launch closures for system applications are built directly into the shared cache. For third-party applications, the closure is generated when the app is installed or updated, which ensures that it is ready before the app is opened. Overall, dyld 3 handles many time-consuming operations ahead of time, which greatly improves startup time.
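The idea behind the launch closure cache, reusing a symbol's offset as long as the library is unchanged, can be sketched as a tiny cache. Everything here is hypothetical: the LaunchClosure layout, the fingerprint field and the fake slow lookup are illustrative only and do not reflect dyld 3's actual closure format, which the text above does not describe.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified closure: one library, one
 * pre-resolved symbol. */
typedef struct {
    uint64_t library_fingerprint;  /* identifies the exact library build */
    char     symbol[32];
    uint32_t symbol_offset;        /* offset of the symbol in the library */
    int      valid;
} LaunchClosure;

static int slow_lookups;  /* counts how often the expensive path runs */

/* Stand-in for walking the symbol table; the offset is made up. */
static uint32_t slow_symbol_lookup(const char *symbol)
{
    slow_lookups++;
    return (uint32_t)strlen(symbol) * 16u;
}

/* Use the cached offset when the closure matches this library build;
 * otherwise do the slow lookup and refresh the closure. */
uint32_t resolve_symbol(LaunchClosure *closure, uint64_t fingerprint,
                        const char *symbol)
{
    assert(closure != NULL && symbol != NULL);
    if (closure->valid &&
        closure->library_fingerprint == fingerprint &&
        strcmp(closure->symbol, symbol) == 0)
        return closure->symbol_offset;

    closure->library_fingerprint = fingerprint;
    strncpy(closure->symbol, symbol, sizeof closure->symbol - 1);
    closure->symbol[sizeof closure->symbol - 1] = '\0';
    closure->symbol_offset = slow_symbol_lookup(symbol);
    closure->valid = 1;
    return closure->symbol_offset;
}

int slow_lookup_count(void) { return slow_lookups; }
```

The fingerprint check plays the role of the "verify that the launch closure is correct" step: a changed library invalidates the cached offset and forces a fresh lookup.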