Tags: App launch, dyld
This section describes the dyld loading process, that is, what happens before the main function runs.
Preparation: dyld source, libdispatch-1271.120.2 source, Libsystem-1292.60.1 source, objc4-818.2 source
1. dyld
1.1 Introduction
dyld (the dynamic link editor) is Apple's dynamic linker and an important part of Apple's operating systems. After an app is compiled and packaged into an executable file in Mach-O format, dyld is responsible for linking and loading the program.
- dyld is involved throughout app startup, including loading the dependent libraries and the main program. Anyone who wants to optimize performance or startup time inevitably has to deal with dyld.
1.2 History of dyld
1.2.1 dyld 1.0, 1996-2004
- dyld 1 shipped in NeXTStep 3.3; before that, NeXT used static binaries. dyld 1 did relatively little. It was written before C++ dynamic libraries were widely used on the system. C++ has features, such as the way its initializers work, that behave well in a static environment but can degrade performance in a dynamic one, so large C++ dynamic libraries forced dyld to do a lot of work and made it slow.
- Before the release of macOS 10.0 (Cheetah), one more feature was added: prebinding. Prebinding tries to find a fixed address for every dylib and application. dyld then loads everything at those addresses. If that succeeds, it edits all the dylib and program binaries so that the precomputed results are stored in them. The next time everything is loaded at the same addresses, no further work is needed, which speeds up launch considerably. But it also means that binaries are edited at launch, which is not ideal, at least from a security standpoint.
1.2.2 dyld 2, 2004-2017
Since its release in 2004, dyld 2 has gone through several iterations. Many features that are now taken for granted, such as ASLR, code signing, and the shared cache, were introduced in dyld 2.
dyld 2.0 (2004-2007)
- dyld 2 was introduced in 2004 with macOS Tiger.
- dyld 2 is a complete rewrite of dyld 1. It correctly supports C++ initializer semantics and extends the Mach-O format accordingly, which gives efficient C++ library support.
- dyld 2 has full implementations of dlopen and dlsym (used mainly to load libraries dynamically and look up functions) with correct semantics, so the older APIs were deprecated.
  - dlopen: opens a library and returns a handle
  - dlsym: looks up the value of a symbol in an opened library
  - dlclose: closes a handle
  - dlerror: returns a string describing the last error from dlopen, dlsym, or dlclose
- The design goal of dyld 2 was launch speed, so it performs only limited sanity checking, largely because there was less malicious software at the time.
- dyld 2 also had some security issues, so a number of features were later reworked to improve its security on the platform.
- Because launch speed improved so much, the amount of prebinding work could be reduced. Unlike the earlier scheme, which edited program data, only the system libraries are prebound, and only when software is updated. This is why a software update may show text like "Optimizing system performance": at the time, that was prebinding being redone for the update. Today the same message covers many other optimizations. That was dyld 2.
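To make the four dlopen-family calls listed above concrete, here is a minimal sketch in C++. It assumes a POSIX system with <dlfcn.h>; the function name callStrlenViaDlsym is made up for this illustration, and looking up strlen through the main program's handle is just a convenient example, not something the dyld sources do.

```cpp
#include <dlfcn.h>   // dlopen, dlsym, dlclose, dlerror
#include <cstddef>
#include <cstdio>

// Hypothetical helper: looks up the C function "strlen" through the dynamic
// loader and calls it. Returns -1 if the handle or symbol cannot be obtained.
long callStrlenViaDlsym(const char* text) {
    // Passing nullptr asks for a handle to the main program together with
    // the libraries that were loaded along with it.
    void* handle = dlopen(nullptr, RTLD_LAZY);
    if (handle == nullptr) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    dlerror();  // clear any stale error state before calling dlsym
    void* symbol = dlsym(handle, "strlen");
    if (symbol == nullptr) {
        std::fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    // dlsym returns an untyped address; the caller must know the real type.
    using StrlenFn = std::size_t (*)(const char*);
    StrlenFn fn = reinterpret_cast<StrlenFn>(symbol);
    long length = static_cast<long>(fn(text));

    dlclose(handle);  // release the handle once it is no longer needed
    return length;
}
```

Calling callStrlenViaDlsym("dyld") goes through all four steps: open, look up, call, close. On older Linux systems the program may need to be linked with -ldl.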
dyld 2.x (2007-2017)
- Many improvements were made between 2004 and 2017, and the performance of dyld 2 improved significantly.
- First, support for many more architectures and platforms was added.
  - dyld 2 originally shipped on PowerPC; x86, x86_64, arm, arm64, and a number of derived architectures have been added since.
  - iOS, tvOS, and watchOS were also launched, and each of them required new dyld features.
- Security was strengthened in several ways.
  - Code signing was added.
  - ASLR (Address Space Layout Randomization) was added: each time a library is loaded, it may be placed at a different address.
  - Bounds checking was added: the fields of the Mach-O header are bounds-checked so that malicious binary data cannot be injected.
- Performance was improved.
  - Prebinding was eliminated and replaced by the shared cache.
ASLR
- ASLR is a computer security technique that guards against the exploitation of memory corruption vulnerabilities. By randomizing where the key data areas of a process are placed in its address space, ASLR prevents an attacker from reliably jumping to a particular location in memory.
- Linux added ASLR in kernel version 2.6.12.
- Apple introduced random address offsets in Mac OS X Leopard 10.5 (released in October 2007), but that implementation did not provide the full protection that ASLR defines. Mac OS X Lion 10.7 brought ASLR support to all applications.
- Apple introduced ASLR on iOS in iOS 4.3.
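The arithmetic behind ASLR is small: an image is linked for one address, loaded at another, and the difference is the slide (the dyld sources quoted later in this article use names such as getSlide and fSlide for it). The sketch below only illustrates that arithmetic; both helper names are hypothetical and the addresses in the usage note are invented.

```cpp
#include <cstdint>

// Hypothetical helper: where something linked at `linkedAddress` ends up
// once the whole image has been shifted by `slide`.
std::uint64_t runtimeAddress(std::uint64_t linkedAddress, std::uint64_t slide) {
    return linkedAddress + slide;
}

// Hypothetical helper: recover the slide from the address the image was
// actually loaded at and the address it was linked for.
std::uint64_t slideOf(std::uint64_t actualLoadAddress, std::uint64_t preferredLoadAddress) {
    return actualLoadAddress - preferredLoadAddress;
}
```

For example, an image linked at 0x100000000 and loaded at 0x104ca0000 has a slide of 0x4ca0000, and every address stored in the file has to be adjusted by that amount before it is used. Because the slide changes from launch to launch, an attacker cannot hard-code the adjusted addresses.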
Bounds checking
- Significant bounds checks were added to many of the Mach-O header fields, which prevents the injection of malicious binary data.
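As a rough illustration of what such a bounds check looks like, the sketch below walks a list of load commands and rejects any command whose declared size would run outside the buffer. It is not dyld's code: LoadCommandHeader is a simplified stand-in for the two leading fields of a Mach-O load command, and loadCommandsInBounds and makeCommand are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified stand-in for the start of a Mach-O load command: every command
// begins with its type and its total size in bytes.
struct LoadCommandHeader {
    std::uint32_t cmd;
    std::uint32_t cmdsize;
};

// Walks `count` commands stored back to back in `buffer` and returns true
// only if every command lies completely inside the buffer.
bool loadCommandsInBounds(const std::vector<std::uint8_t>& buffer, std::uint32_t count) {
    std::size_t offset = 0;  // invariant: offset <= buffer.size()
    for (std::uint32_t i = 0; i < count; ++i) {
        // The fixed-size header itself must fit in what is left.
        if (buffer.size() - offset < sizeof(LoadCommandHeader)) return false;
        LoadCommandHeader header;
        std::memcpy(&header, buffer.data() + offset, sizeof(header));
        // A command cannot be smaller than its header or run past the end.
        if (header.cmdsize < sizeof(LoadCommandHeader)) return false;
        if (header.cmdsize > buffer.size() - offset) return false;
        offset += header.cmdsize;
    }
    return true;
}

// Builds a buffer of `actualSize` bytes (at least 8) holding one command
// that claims to be `declaredSize` bytes long.
std::vector<std::uint8_t> makeCommand(std::uint32_t cmd, std::uint32_t declaredSize,
                                      std::size_t actualSize) {
    std::vector<std::uint8_t> bytes(actualSize, 0);
    LoadCommandHeader header{cmd, declaredSize};
    std::memcpy(bytes.data(), &header, sizeof(header));
    return bytes;
}
```

A command that declares 64 bytes inside a 16-byte buffer, or a header that claims more commands than the buffer holds, is rejected before anything is read from the out-of-range region.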
Shared cache
- The shared cache was first introduced in iOS 3.1 and macOS Snow Leopard, and it completely replaced prebinding.
- The shared cache is a single file that contains most of the system dylibs. Because the dylibs are merged into one file, they can be optimized together.
  - All the text segments (__TEXT) and data segments (__DATA) are rearranged and the entire symbol table is rewritten, which reduces the file size and means only a small number of regions have to be mapped in each process. Binary segments can be packed together, which saves a lot of RAM.
  - In essence it is a prelinker for dylibs. The RAM savings are significant: roughly 500 MB to 1 GB on a typical running iOS system.
  - Data structures that dyld and the Objective-C runtime use at run time can also be pre-generated, so that work does not have to be done at launch. This saves more RAM and time.
- On macOS the shared cache is generated locally. Running dyld against the shared cache greatly improves system performance.
1.2.3 dyld 3, 2017-present
- dyld 3, announced at WWDC 2017, is a brand-new dynamic linker that completely rethinks dynamic linking. It was announced as the default for most macOS system applications, and as the default for all system applications on Apple's 2017 OS platforms.
- dyld 3 first appeared in iOS 11 in 2017, where it was used mainly to optimize the system libraries.
- In iOS 13, dyld 3 fully replaced dyld 2. Because dyld 3 is fully compatible with dyld 2 and the API is the same, in most cases developers do not need to make any changes for the transition.
2. Using bt to view the stack: where does app launch start?
Set a breakpoint in the +load method and print the stack with bt to see where the app starts. Running the program shows that it starts from _dyld_start in dyld.
int main(int argc, char * argv[]) {
    NSString * appDelegateClassName;
    @autoreleasepool {
        // Setup code that might create autoreleased objects goes here.
        appDelegateClassName = NSStringFromClass([AppDelegate class]);
    }
    return UIApplicationMain(argc, argv, nil, appDelegateClassName);
}

__attribute__((constructor)) void ypyFunc(){
    printf("Coming: %s \n",__func__);
}

@interface ViewController ()
@end

@implementation ViewController
+ (void)load{
    NSLog(@"%s",__func__);
}
- (void)viewDidLoad {
    [super viewDidLoad];
    // Do any additional setup after loading the view.
}
@end
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000104ca5f24 002-Application load analysis`+[ViewController load](self=ViewController, _cmd="load") at ViewController.m:17:5
    frame #1: 0x00000001aafd735c libobjc.A.dylib`load_images + 984
    frame #2: 0x0000000104e0a190 dyld`dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*) + 448
    frame #3: 0x0000000104e1a0d8 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 512
    frame #4: 0x0000000104e18520 dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 184
    frame #5: 0x0000000104e185e8 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 92
    frame #6: 0x0000000104e0a658 dyld`dyld::initializeMainExecutable() + 216
    frame #7: 0x0000000104e0eeb0 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 4400
    frame #8: 0x0000000104e09208 dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 396
    frame #9: 0x0000000104e09038 dyld`_dyld_start + 56
(lldb)
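The backtrace shows +load being reached from dyld's initializer phase, before main has been entered. The same is true of functions marked __attribute__((constructor)), such as ypyFunc in the test program. The sketch below is a small stand-alone way to observe that ordering with GCC or Clang; launchLog and earlyHook are names invented for this example.

```cpp
#include <string>
#include <vector>

// Records what has run so far. A function-local static is used so that the
// log is valid no matter when during startup it is first touched.
std::vector<std::string>& launchLog() {
    static std::vector<std::string> log;
    return log;
}

// Runs while the image is being initialized by the dynamic loader,
// that is, before main() is entered.
__attribute__((constructor)) static void earlyHook() {
    launchLog().push_back("constructor");
}
```

If main prints or inspects launchLog() as its first statement, the "constructor" entry is already there, which matches the order seen in the stack above.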
3. _dyld_start process analysis
As the stack shows, _dyld_start calls dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue), which is a C++ function.
3.1 dyldbootstrap::start source code
Search the source for dyldbootstrap to find the namespace, then find the start function in that file. The core of the function is its return value, which is the result of calling dyld's _main function. macho_header is the Mach-O header, because the files dyld loads are Mach-O files. Mach-O is the executable file format and consists of four parts: the Mach-O header, the load commands, the sections, and other data. The contents of an executable can be inspected with MachOView.
//
// This is code to bootstrap dyld. This work is normally done for a program by dyld and crt.
// In dyld we have to do this manually.
//
uintptr_t start(const dyld3::MachOLoaded* appsMachHeader, int argc, const char* argv[],
                const dyld3::MachOLoaded* dyldsMachHeader, uintptr_t* startGlue)
{
    // Emit kdebug tracepoint to indicate dyld bootstrap has started <rdar://46878536>
    dyld3::kdebug_trace_dyld_marker(DBG_DYLD_TIMING_BOOTSTRAP_START, 0, 0, 0, 0);

    // if kernel had to slide dyld, we need to fix up load sensitive locations
    // we have to do this before using any global variables
    rebaseDyld(dyldsMachHeader);

    // kernel sets up env pointer to be just past end of argv array
    const char** envp = &argv[argc+1];

    // kernel sets up apple pointer to be just past end of envp array
    const char** apple = envp;
    while(*apple != NULL) { ++apple; }
    ++apple;

    // set up random value for stack canary
    __guard_setup(apple);

#if DYLD_INITIALIZER_SUPPORT
    // run all C++ initializers inside dyld
    runDyldInitializers(argc, argv, envp, apple);
#endif

    _subsystem_init(apple);

    // now that we are done bootstrapping dyld, call dyld's main
    uintptr_t appsSlide = appsMachHeader->getSlide();
    return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}
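The envp and apple lines in start rely on the kernel placing three NULL-terminated arrays back to back: argv, then envp, then the apple strings. The following sketch reproduces just that pointer walk on a hand-built array. LaunchVectors, locateVectors, and kSampleBlock are hypothetical names, and the strings in the sample are invented.

```cpp
#include <cstddef>

// The two pointers the walk produces.
struct LaunchVectors {
    const char** envp;
    const char** apple;
};

// Given memory laid out as [argv..., NULL, envp..., NULL, apple..., NULL],
// locate the envp and apple arrays with the same steps start() uses.
LaunchVectors locateVectors(int argc, const char* argv[]) {
    const char** envp = &argv[argc + 1];    // skip the argv entries and their NULL
    const char** apple = envp;
    while (*apple != nullptr) { ++apple; }  // walk to the NULL that ends envp
    ++apple;                                // step past that NULL
    return {envp, apple};
}

// Sample block for illustration only: two arguments, two environment
// strings, one apple string.
const char* kSampleBlock[] = {
    "app", "-v", nullptr,
    "HOME=/tmp", "LANG=C", nullptr,
    "executable_path=/tmp/app", nullptr
};
```

With argc equal to 2, locateVectors(2, kSampleBlock) returns envp pointing at "HOME=/tmp" (index 3) and apple pointing at "executable_path=/tmp/app" (index 6).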
3.2 Source code analysis of the dyld::_main function
The implementation of dyld::_main is very long, about 600 lines. If you are not familiar with dyld's loading process, you can work backwards from the return value of _main. The _main function does the following:
- [Step 1: Environment configuration (environment, platform, version, path, host information)]: set values from environment variables and obtain the current architecture
- [Step 2: Load the shared cache]: check whether the shared cache is enabled and map it into the shared region (for example UIKit and CoreFoundation)
- [Step 3: Initialize the main program]: call instantiateFromLoadedImage to instantiate an ImageLoader object
- [Step 4: Insert dynamic libraries]: iterate over the DYLD_INSERT_LIBRARIES environment variable and call loadInsertedDylib to load each library
- [Step 5: Link the main program]
- [Step 6: Link the dynamic libraries]
- [Step 7: Weak symbol binding]
- [Step 8: Run the initializers]
- [Step 9: Find the main program entry point]: read the entry from LC_MAIN in the load commands; if it is not present, read LC_UNIXTHREAD. This brings us to the main function familiar from everyday development
The analysis below focuses on [Step 3] and [Step 8].
3.2.1 [Step 3: Main program initialization]
- sMainExecutable is the variable that represents the main program. Looking at where it is assigned, it is initialized by instantiateFromLoadedImage
// instantiate ImageLoader for main executable
// Step 3: Initialize the main program
sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
gLinkContext.mainExecutable = sMainExecutable;
gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);

instantiateFromLoadedImage: initializing the main program
- In the instantiateFromLoadedImage source, an ImageLoader instance is created by calling instantiateMainExecutable
// The kernel maps in main executable before dyld gets control. We need to
// make an ImageLoader* for the already mapped in main executable.
static ImageLoaderMachO* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
{
    // try mach-o loader
//  if ( isCompatibleMachO((const uint8_t*)mh, path) ) {
        ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
        addImage(image);
        return (ImageLoaderMachO*)image;
//  }
//  throw "main executable not a known format";
}

- In the instantiateMainExecutable source, the main job is to create an image for the main executable and return it as an ImageLoader object, which is the main program. The sniffLoadCommands function reads the load commands of the Mach-O file and performs various checks on them
// create image for main executable
ImageLoader* ImageLoaderMachO::instantiateMainExecutable(const macho_header* mh, uintptr_t slide, const char* path, const LinkContext& context)
{
    //dyld::log("ImageLoader=%ld, ImageLoaderMachO=%ld, ImageLoaderMachOClassic=%ld, ImageLoaderMachOCompressed=%ld\n",
    //    sizeof(ImageLoader), sizeof(ImageLoaderMachO), sizeof(ImageLoaderMachOClassic), sizeof(ImageLoaderMachOCompressed));
    bool compressed;
    unsigned int segCount;
    unsigned int libCount;
    const linkedit_data_command* codeSigCmd;
    const encryption_info_command* encryptCmd;
    sniffLoadCommands(mh, path, false, &compressed, &segCount, &libCount, context, &codeSigCmd, &encryptCmd);
    // instantiate concrete class based on content of load commands
    if ( compressed )
        return ImageLoaderMachOCompressed::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
    else
#if SUPPORT_CLASSIC_MACHO
        return ImageLoaderMachOClassic::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
#else
        throw "missing LC_DYLD_INFO load command";
#endif
}
3.2.2 [Step 8: Run the initializers]
- In the initializeMainExecutable source, the main work is a loop that calls runInitializers on every root image, followed by the main executable
void initializeMainExecutable()
{
    // record that we've reached this step
    gLinkContext.startedInitializingMainExecutable = true;

    // run initializers for any inserted dylibs
    ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
    initializerTimes[0].count = 0;
    const size_t rootCount = sImageRoots.size();
    if ( rootCount > 1 ) {
        for(size_t i=1; i < rootCount; ++i) {
            sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
        }
    }

    // run initializers for main executable and everything it brings up
    sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);

    // register cxa_atexit() handler to run static terminators in all loaded images when this process exits
    if ( gLibSystemHelpers != NULL )
        (*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);

    // dump info if requested
    if ( sEnv.DYLD_PRINT_STATISTICS )
        ImageLoader::printStatistics((unsigned int)allImagesCount(), initializerTimes[0]);
    if ( sEnv.DYLD_PRINT_STATISTICS_DETAILS )
        ImageLoaderMachO::printStatisticsDetails((unsigned int)allImagesCount(), initializerTimes[0]);
}

- A global search for runInitializers(cons finds the following source, whose core is the call to processInitializers
void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
    uint64_t t1 = mach_absolute_time();
    mach_port_t thisThread = mach_thread_self();
    ImageLoader::UninitedUpwards up;
    up.count = 1;
    up.imagesAndPaths[0] = { this, this->getPath() };
    processInitializers(context, thisThread, timingInfo, up);
    context.notifyBatch(dyld_image_state_initialized, false);
    mach_port_deallocate(mach_task_self(), thisThread);
    uint64_t t2 = mach_absolute_time();
    fgTotalInitTime += (t2 - t1);
}
- In the processInitializers source, every image in the image list is initialized recursively by calling recursiveInitialization
// <rdar://problem/14412057> upward dylib initializers can be run too soon
// To handle dangling dylibs which are upward linked but not downward, all upward linked dylibs
// have their initialization postponed until after the recursion through downward dylibs
// has completed.
void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
                                      InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
    uint32_t maxImageCount = context.imageCount()+2;
    ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
    ImageLoader::UninitedUpwards& ups = upsBuffer[0];
    ups.count = 0;
    // Calling recursive init on all images in images list, building a new list of
    // uninitialized upward dependencies.
    for (uintptr_t i=0; i < images.count; ++i) {
        images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
    }
    // If any upward dependencies remain, init them.
    if ( ups.count > 0 )
        processInitializers(context, thisThread, timingInfo, ups);
}

- A global search for recursiveInitialization(cons finds the function, whose source is as follows
void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
                                          InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
    recursive_lock lock_info(this_thread);
    recursiveSpinLock(lock_info); // take the recursive lock

    if ( fState < dyld_image_state_dependents_initialized-1 ) {
        uint8_t oldState = fState;
        // break cycles
        fState = dyld_image_state_dependents_initialized-1;
        try {
            // initialize lower level libraries first
            for(unsigned int i=0; i < libraryCount(); ++i) {
                ImageLoader* dependentImage = libImage(i);
                if ( dependentImage != NULL ) {
                    // don't try to initialize stuff "above" me yet
                    if ( libIsUpward(i) ) {
                        uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
                        uninitUps.count++;
                    }
                    else if ( dependentImage->fDepth >= fDepth ) {
                        dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
                    }
                }
            }

            // record termination order
            if ( this->needsTermination() )
                context.terminationRecorder(this);

            // let objc know we are about to initialize this image
            uint64_t t1 = mach_absolute_time();
            fState = dyld_image_state_dependents_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);

            // initialize this image
            bool hasInitializers = this->doInitialization(context);

            // let anyone know we finished initializing this image
            fState = dyld_image_state_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_initialized, this, NULL);

            if ( hasInitializers ) {
                uint64_t t2 = mach_absolute_time();
                timingInfo.addTime(this->getShortName(), t2-t1);
            }
        }
        catch (const char* msg) {
            // this image is not initialized
            fState = oldState;
            recursiveSpinUnLock();
            throw;
        }
    }

    recursiveSpinUnLock(); // release the recursive lock
}
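The essential idea in recursiveInitialization is "dependencies first, and mark the image before recursing so that cycles terminate". The sketch below shows only that idea on a tiny hand-built graph. It is a simplification under stated assumptions: the Image struct, recursiveInit, and exampleOrder are hypothetical, and the upward-dependency list, the depth check, locking, and the objc notifications of the real function are all left out.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified image: a name, the images it depends on, and a
// flag that plays the role of fState.
struct Image {
    std::string name;
    std::vector<Image*> dependencies;
    bool initialized;
};

// Initializes every dependency before the image itself and appends names to
// `order` in the sequence the initializers would run.
void recursiveInit(Image& image, std::vector<std::string>& order) {
    if (image.initialized) return;  // already handled
    image.initialized = true;       // mark first so a cycle cannot recurse forever
    for (Image* dependency : image.dependencies) {
        recursiveInit(*dependency, order);
    }
    order.push_back(image.name);    // "run" this image's own initializers last
}

// Builds app -> {UIKit, Foundation}, UIKit -> {Foundation},
// Foundation -> {libSystem} and returns the resulting order.
std::vector<std::string> exampleOrder() {
    Image libSystem{"libSystem", {}, false};
    Image foundation{"Foundation", {&libSystem}, false};
    Image uikit{"UIKit", {&foundation}, false};
    Image app{"app", {&uikit, &foundation}, false};
    std::vector<std::string> order;
    recursiveInit(app, order);
    return order;
}
```

In this graph the resulting order is libSystem, Foundation, UIKit, app: the lowest-level library is initialized first and the main program last, and Foundation is initialized only once even though two images depend on it.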
Two parts of this function need to be explored: the notifySingle function and the doInitialization function. We start with notifySingle.
3.2.2.1 notifySingle function
- A global search for notifySingle( finds the function. Its key line is (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
{
    //dyld::log("notifySingle(state=%d, image=%s)\n", state, image->getPath());
    std::vector<dyld_image_state_change_handler>* handlers = stateToHandlers(state, sSingleHandlers);
    if ( handlers != NULL ) {
        dyld_image_info info;
        info.imageLoadAddress = image->machHeader();
        info.imageFilePath = image->getRealPath();
        info.imageFileModDate = image->lastModified();
        for (std::vector<dyld_image_state_change_handler>::iterator it = handlers->begin(); it != handlers->end(); ++it) {
            const char* result = (*it)(state, 1, &info);
            if ( (result != NULL) && (state == dyld_image_state_mapped) ) {
                //fprintf(stderr, " image rejected by handler=%p\n", *it);
                // make copy of thrown string so that later catch clauses can free it
                const char* str = strdup(result);
                throw str;
            }
        }
    }
    if ( state == dyld_image_state_mapped ) { // the image has been mapped
        // <rdar://problem/7008875> Save load addr + UUID for images from outside the shared cache
        // <rdar://problem/50432671> Include UUIDs for shared cache dylibs in all image info when using private mapped shared caches
        if ( !image->inSharedCache() || (gLinkContext.sharedRegionMode == ImageLoader::kUsePrivateSharedRegion) ) {
            dyld_uuid_info info;
            if ( image->getUUID(info.imageUUID) ) {
                info.imageLoadAddress = image->machHeader();
                addNonSharedCacheImageUUID(info);
            }
        }
    }
    if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
        uint64_t t0 = mach_absolute_time();
        dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
        (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        uint64_t t1 = mach_absolute_time();
        uint64_t t2 = mach_absolute_time();
        uint64_t timeInObjC = t1-t0;
        uint64_t emptyTime = (t2-t1)*100;
        if ( (timeInObjC > emptyTime) && (timingInfo != NULL) ) {
            timingInfo->addTime(image->getShortName(), timeInObjC);
        }
    }
    // mach message csdlc about dynamically unloaded images
    if ( image->addFuncNotified() && (state == dyld_image_state_terminated) ) {
        notifyKernel(*image, false);
        const struct mach_header* loadAddress[] = { image->machHeader() };
        const char* loadPath[] = { image->getPath() };
        notifyMonitoringDyld(true, 1, loadAddress, loadPath);
    }
}
- A global search for sNotifyObjCInit finds no implementation, only an assignment, in registerObjCNotifiers
void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
    // record functions to call
    sNotifyObjCMapped = mapped;
    sNotifyObjCInit = init; // the key line
    sNotifyObjCUnmapped = unmapped;

    // call 'mapped' function with all images mapped so far
    try {
        notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
    }
    catch (const char* msg) {
        // ignore request to abort during registration
    }

    // <rdar://problem/32209809> call 'init' function on all images already init'ed (below libSystem)
    for (std::vector<ImageLoader*>::iterator it=sAllImages.begin(); it != sAllImages.end(); it++) {
        ImageLoader* image = *it;
        if ( (image->getState() == dyld_image_state_initialized) && image->notifyObjC() ) {
            dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
            (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        }
    }
}

- registerObjCNotifiers is called from _dyld_objc_notify_register. Note: the callers of _dyld_objc_notify_register have to be searched for in the libobjc source
void _dyld_objc_notify_register(_dyld_objc_notify_mapped mapped,
                                _dyld_objc_notify_init init,
                                _dyld_objc_notify_unmapped unmapped)
{
    dyld::registerObjCNotifiers(mapped, init, unmapped);
}

- Searching objc4-818.2 for _dyld_objc_notify_register shows that it is called in _objc_init. So the value assigned to sNotifyObjCInit is load_images in objc, and load_images calls all the +load methods. In other words, notifySingle invokes a callback
/***********************************************************************
* _objc_init
* Bootstrap initialization. Registers our image notifier with dyld.
* Called by libSystem BEFORE library initialization time
**********************************************************************/
void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;

    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    runtime_init();
    exception_init();
#if __OBJC2__
    cache_t::init();
#endif
    _imp_implementationWithBlock_init();

    _dyld_objc_notify_register(&map_images, load_images, unmap_image); // the key line

#if __OBJC2__
    didCallDyldNotifyRegister = true;
#endif
}
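The relationship between dyld and libobjc here is a plain callback registration: one side stores a function pointer, the other side supplies it, and the stored pointer is invoked when an image reaches the right state. The sketch below shows that pattern in isolation. All names in it (InitCallback, registerInitCallback, notifyImageReady, recordImage, demoNotifications) are hypothetical, and unlike the real registerObjCNotifiers it does not replay notifications for images that were initialized before registration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Signature of the hypothetical "image is ready" callback.
using InitCallback = void (*)(const char* path);

// Loader side: remembers the callback, in the role of sNotifyObjCInit.
static InitCallback sInitCallback = nullptr;

void registerInitCallback(InitCallback callback) {
    sInitCallback = callback;
}

// Loader side: called for each image; forwards to the callback if one
// has been registered, otherwise does nothing.
void notifyImageReady(const char* path) {
    if (sInitCallback != nullptr) {
        (*sInitCallback)(path);
    }
}

// Runtime side: stands in for load_images and records what it was told.
std::vector<std::string>& loadedImages() {
    static std::vector<std::string> images;
    return images;
}

void recordImage(const char* path) {
    loadedImages().push_back(path);
}

// A notification sent before registration is dropped; one sent afterwards
// is delivered. Returns how many notifications reached the runtime side.
std::size_t demoNotifications() {
    loadedImages().clear();
    registerInitCallback(nullptr);
    notifyImageReady("/early.dylib");   // nobody registered yet
    registerInitCallback(&recordImage);
    notifyImageReady("/app");           // delivered to recordImage
    return loadedImages().size();
}
```

The null check in notifyImageReady mirrors the sNotifyObjCInit != NULL test in notifySingle: until the runtime has registered, the loader has nothing to call.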
How the +load methods are called
Next we go into the source of load_images and look at its implementation to confirm that all +load methods are called from load_images
- From _objc_init in the objc source, go to the implementation of load_images
void
load_images(const char *path __unused, const struct mach_header *mh)
{
    if (!didInitialAttachCategories && didCallDyldNotifyRegister) {
        didInitialAttachCategories = true;
        loadAllCategories();
    }

    // Return without taking locks if there are no +load methods here.
    if (!hasLoadMethods((const headerType *)mh)) return;

    recursive_mutex_locker_t lock(loadMethodLock);

    // Discover load methods
    {
        mutex_locker_t lock2(runtimeLock);
        prepare_load_methods((const headerType *)mh);
    }

    // Call +load methods (without runtimeLock - re-entrant)
    call_load_methods();
}

- In the call_load_methods source, the core is a do-while loop that calls the +load methods
/***********************************************************************
* call_load_methods
* Call all pending class and category +load methods.
* Class +load methods are called superclass-first.
* Category +load methods are not called until after the parent class's +load.
*
* This method must be RE-ENTRANT, because a +load could trigger
* more image mapping. In addition, the superclass-first ordering
* must be preserved in the face of re-entrant calls. Therefore,
* only the OUTERMOST call of this function will do anything, and
* that call will handle all loadable classes, even those generated
* while it was running.
*
* The sequence below preserves +load ordering in the face of
* image loading during a +load, and make sure that no
* +load method is forgotten because it was added during
* a +load call.
* Sequence:
* 1. Repeatedly call class +loads until there aren't any more
* 2. Call category +loads ONCE.
* 3. Run more +loads if:
*    (a) there are more classes to load, OR
*    (b) there are some potential category +loads that have
*        still never been attempted.
* Category +loads are only run once to ensure "parent class first"
* ordering, even if a category +load triggers a new loadable class
* and a new loadable category attached to that class.
*
* Locking: loadMethodLock must be held by the caller
*   All other locks must not be held.
**********************************************************************/
void call_load_methods(void)
{
    static bool loading = NO;
    bool more_categories;

    loadMethodLock.assertLocked();

    // Re-entrant calls do nothing; the outermost call will finish the job.
    if (loading) return;
    loading = YES;

    void *pool = objc_autoreleasePoolPush();

    do {
        // 1. Repeatedly call class +loads until there aren't any more
        while (loadable_classes_used > 0) {
            call_class_loads();
        }

        // 2. Call category +loads ONCE
        more_categories = call_category_loads();

        // 3. Run more +loads if there are classes OR more untried categories
    } while (loadable_classes_used > 0 || more_categories);

    objc_autoreleasePoolPop(pool);

    loading = NO;
}
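The shape of call_load_methods (a re-entrancy guard around a loop that drains class loads, then runs category loads, then repeats if anything new appeared) can be tried out in miniature. The sketch below is a hypothetical model, not the runtime's code: LoadScheduler and exampleLoadOrder are invented, std::function stands in for the +load IMPs, and callCategoryLoads simply reports whether any category load ran, which is a simplification of what call_category_loads returns.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical miniature of the +load scheduler.
struct LoadScheduler {
    std::vector<std::function<void()>> pendingClassLoads;
    std::vector<std::function<void()>> pendingCategoryLoads;
    bool loading = false;

    // Detach the current class list first, then call each entry, so that
    // loads queued during a call land in a fresh list.
    void callClassLoads() {
        std::vector<std::function<void()>> detached;
        detached.swap(pendingClassLoads);
        for (auto& load : detached) load();
    }

    // Call each pending category load once; report whether any ran.
    bool callCategoryLoads() {
        std::vector<std::function<void()>> detached;
        detached.swap(pendingCategoryLoads);
        for (auto& load : detached) load();
        return !detached.empty();
    }

    void callLoadMethods() {
        if (loading) return;  // re-entrant calls do nothing
        loading = true;
        bool moreCategories;
        do {
            while (!pendingClassLoads.empty()) callClassLoads();
            moreCategories = callCategoryLoads();
        } while (!pendingClassLoads.empty() || moreCategories);
        loading = false;
    }
};

// One class whose +load makes a second class loadable and re-enters the
// scheduler, plus one category. Returns the order in which loads ran.
std::vector<std::string> exampleLoadOrder() {
    LoadScheduler scheduler;
    std::vector<std::string> order;
    scheduler.pendingClassLoads.push_back([&] {
        order.push_back("+[A load]");
        scheduler.pendingClassLoads.push_back([&] { order.push_back("+[B load]"); });
        scheduler.callLoadMethods();  // ignored: the outer call is still running
    });
    scheduler.pendingCategoryLoads.push_back([&] { order.push_back("+[A(Extra) load]"); });
    scheduler.callLoadMethods();
    return order;
}
```

In this model the loads run as +[A load], +[B load], +[A(Extra) load]: the class queued during A's load is still picked up by the outermost call, and the category load runs only after all class loads are done.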
- In the call_class_loads source, the load method called here is the class +load method mentioned earlier, which confirms the analysis
/***********************************************************************
* call_class_loads
* Call all pending class +load methods.
* If new classes become loadable, +load is NOT called for them.
*
* Called only by call_load_methods().
**********************************************************************/
static void call_class_loads(void)
{
    int i;

    // Detach current loadable list.
    struct loadable_class *classes = loadable_classes;
    int used = loadable_classes_used;
    loadable_classes = nil;
    loadable_classes_allocated = 0;
    loadable_classes_used = 0;

    // Call all +loads for the detached list.
    for (i = 0; i < used; i++) {
        Class cls = classes[i].cls;
        load_method_t load_method = (load_method_t)classes[i].method;
        if (!cls) continue;

        if (PrintLoading) {
            _objc_inform("LOAD: +[%s load]\n", cls->nameForLogging());
        }
        (*load_method)(cls, @selector(load));
    }

    // Destroy the detached list.
    if (classes) free(classes);
}

So load_images calls all the +load methods, and the source analysis above matches the printed stack.
[Summary] The call chain leading to +load is: _dyld_start --> dyldbootstrap::start --> dyld::_main --> dyld::initializeMainExecutable --> ImageLoader::runInitializers --> ImageLoader::processInitializers --> ImageLoader::recursiveInitialization --> dyld::notifySingle (invokes a callback) --> sNotifyObjCInit --> load_images (libobjc.A.dylib)
So the question is: when is _objc_init called? Read on.
3.2.2.2 doInitialization function
- Tracing from _objc_init in objc leads no further, so we go back to the implementation of recursiveInitialization shown in section 3.2.2 and find one function we skipped: doInitialization, which is called right after notifySingle
-
Enter the source implementation of the doInitialization function, which also needs to be divided into two parts, one is the doImageInit function, and the other is the doModInitFunctions
bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
    CRSetCrashLogMessage2(this->getPath());

    // mach-o has -init and static initializers
    doImageInit(context);
    doModInitFunctions(context);

    CRSetCrashLogMessage2(NULL);

    return (fHasDashInit || fHasInitializers);
}
- Enter the source of doImageInit. Its core is a for loop over the load commands that calls each -init function. Note that the libSystem initializer must have run first.
void ImageLoaderMachO::doImageInit(const LinkContext& context)
{
    if ( fHasDashInit ) {
        const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
        const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
        const struct load_command* cmd = cmds;
        for (uint32_t i = 0; i < cmd_count; ++i) {
            switch (cmd->cmd) {
                case LC_ROUTINES_COMMAND:
                    Initializer func = (Initializer)(((struct macho_routines_command*)cmd)->init_address + fSlide);
#if __has_feature(ptrauth_calls)
                    func = (Initializer)__builtin_ptrauth_sign_unauthenticated((void*)func, ptrauth_key_asia, 0);
#endif
                    // <rdar://problem/8543820&9228031> verify initializers are in image
                    if ( ! this->containsAddress(stripPointer((void*)func)) ) {
                        dyld::throwf("initializer function %p not in mapped image for %s\n", func, this->getPath());
                    }
                    // the libSystem initializer has the highest priority
                    if ( ! dyld::gProcessInfo->libSystemInitialized ) {
                        // <rdar://problem/17973316> libSystem initializer must run first
                        dyld::throwf("-init function in image (%s) that does not link with libSystem.dylib\n", this->getPath());
                    }
                    if ( context.verboseInit )
                        dyld::log("dyld: calling -init function %p in %s\n", func, this->getPath());
                    {
                        dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
                        func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
                    }
                    break;
            }
            cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
        }
    }
}
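The loop in doImageInit walks a packed list of variable-length load commands: every command begins with a type and its own size, and the cursor advances by cmdsize. A minimal sketch of that traversal over an in-memory buffer follows. CommandHeader, appendCommand, and countCommands are hypothetical helpers written for this illustration, and the command type values used with them are arbitrary numbers, not real LC_* constants.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified stand-in for the common prefix of every Mach-O load command.
struct CommandHeader {
    std::uint32_t cmd;      // command type
    std::uint32_t cmdsize;  // total size of this command in bytes, header included
};

// Appends one command with `payloadBytes` of zeroed payload to `buffer`.
void appendCommand(std::vector<std::uint8_t>& buffer, std::uint32_t type, std::uint32_t payloadBytes) {
    CommandHeader header;
    header.cmd = type;
    header.cmdsize = static_cast<std::uint32_t>(sizeof(CommandHeader)) + payloadBytes;
    const std::uint8_t* raw = reinterpret_cast<const std::uint8_t*>(&header);
    buffer.insert(buffer.end(), raw, raw + sizeof(header));
    buffer.insert(buffer.end(), static_cast<std::size_t>(payloadBytes), std::uint8_t(0));
}

// Counts the commands of type `wanted`, advancing by cmdsize in the same way
// as `cmd = (char*)cmd + cmd->cmdsize` in the dyld loop.
int countCommands(const std::vector<std::uint8_t>& buffer, std::uint32_t commandCount, std::uint32_t wanted) {
    int matches = 0;
    std::size_t offset = 0;
    for (std::uint32_t i = 0; i < commandCount; ++i) {
        if (offset + sizeof(CommandHeader) > buffer.size())
            break;                                   // ran off the end of the buffer
        CommandHeader header;
        std::memcpy(&header, buffer.data() + offset, sizeof(header));
        if (header.cmd == wanted)
            ++matches;
        if (header.cmdsize < sizeof(CommandHeader))
            break;                                   // malformed size, stop walking
        offset += header.cmdsize;
    }
    return matches;
}
```

The bounds checks are an addition for the sketch; the dyld loop trusts ncmds and cmdsize from the already validated Mach-O header.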
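- Enter the source of doModInitFunctions. This method runs all of the image's C++ (Cxx) initializers, that is, the function pointers stored in sections of type S_MOD_INIT_FUNC_POINTERS. You can verify this from the test program's stack: add a breakpoint in the C++ function and inspect the backtrace.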
void ImageLoaderMachO::doModInitFunctions(const LinkContext& context)
{
    if ( fHasInitializers ) {
        const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
        const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
        const struct load_command* cmd = cmds;
        for (uint32_t i = 0; i < cmd_count; ++i) {
            if ( cmd->cmd == LC_SEGMENT_COMMAND ) {
                const struct macho_segment_command* seg = (struct macho_segment_command*)cmd;
                const struct macho_section* const sectionsStart = (struct macho_section*)((char*)seg + sizeof(struct macho_segment_command));
                const struct macho_section* const sectionsEnd = &sectionsStart[seg->nsects];
                for (const struct macho_section* sect=sectionsStart; sect < sectionsEnd; ++sect) {
                    const uint8_t type = sect->flags & SECTION_TYPE;
                    if ( type == S_MOD_INIT_FUNC_POINTERS ) {
                        ....
                    }
                    else if ( type == S_INIT_FUNC_OFFSETS ) {
                        ....
                    }
                }
            }
            cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
        }
    }
}
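To see the kind of function doModInitFunctions ends up calling, here is a small sketch with two pre-main initializers: a function marked with the GCC/Clang constructor attribute and a C++ global object with a non-trivial constructor. All names here are made up for the illustration. The sketch assumes a GCC or Clang toolchain and deliberately makes no claim about the relative order of the two initializers.

```cpp
// Plain ints are constant-initialized, so they hold valid values before any
// initializer function runs.
static int g_step = 0;
static int g_constructorStep = 0;   // step at which the attribute constructor ran
static int g_globalObjectStep = 0;  // step at which the C++ global's constructor ran

// GCC/Clang extension: ask the toolchain to run this function before main().
__attribute__((constructor))
static void earlyFunction() {
    g_constructorStep = ++g_step;
}

// A C++ global with a non-trivial constructor also needs a pre-main initializer.
struct Probe {
    Probe() { g_globalObjectStep = ++g_step; }
};
static Probe g_probe;

// True once both initializers have run; expected to hold on entry to main().
bool initializersRanBeforeMain() {
    return g_constructorStep > 0 && g_globalObjectStep > 0;
}
```

Setting a breakpoint inside earlyFunction or the Probe constructor in an app target is one way to reproduce the backtrace through doModInitFunctions described above.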
- At this point we still have not found the call to _objc_init. Give up? Of course not: we can set a symbolic breakpoint on _objc_init and look at the stack that leads up to the call.
- Add a symbolic breakpoint on _objc_init, run the program, and inspect the stack once the breakpoint is hit.
- Search for libSystem_initializer in the libSystem source (libsystem-1292.60.1) to see its implementation. The libSystem_initializer source:
// libsyscall_initializer() initializes all of libSystem.dylib
// <rdar://problem/4892197>
__attribute__((constructor))
static void
libSystem_initializer(int argc,
                      const char* argv[],
                      const char* envp[],
                      const char* apple[],
                      const struct ProgramVars* vars)
{
    ...
    _libSystem_ktrace0(ARIADNE_LIFECYCLE_libsystem_init | DBG_FUNC_START);

    __libkernel_init(&libkernel_funcs, envp, apple, vars);
    _libSystem_ktrace_init_func(KERNEL);

    __libplatform_init(NULL, envp, apple, vars);
    _libSystem_ktrace_init_func(PLATFORM);

    __pthread_init(&libpthread_funcs, envp, apple, vars);
    _libSystem_ktrace_init_func(PTHREAD);

    _libc_initializer(&libc_funcs, envp, apple, vars);
    _libSystem_ktrace_init_func(LIBC);

    // TODO: Move __malloc_init before __libc_init after breaking malloc's upward link to Libc
    // Note that __malloc_init() will also initialize ASAN when it is present
    __malloc_init(apple);
    _libSystem_ktrace_init_func(MALLOC);

#if TARGET_OS_OSX
    /* <rdar://problem/9664631> */
    __keymgr_initializer();
    _libSystem_ktrace_init_func(KEYMGR);
#endif

    _dyld_initializer(); // dyld initialization
    _libSystem_ktrace_init_func(DYLD);

    libdispatch_init(); // dispatch initialization
    _libSystem_ktrace_init_func(LIBDISPATCH);

#if !TARGET_OS_DRIVERKIT
    _libxpc_initializer();
    _libSystem_ktrace_init_func(LIBXPC);

#if CURRENT_VARIANT_asan
    setenv("DT_BYPASS_LEAKS_CHECK", "1", 1);
#endif
#endif // !TARGET_OS_DRIVERKIT

    // must be initialized after dispatch
    _libtrace_init();
    _libSystem_ktrace_init_func(LIBTRACE);

#if !TARGET_OS_DRIVERKIT
#if defined(HAVE_SYSTEM_SECINIT)
    _libsecinit_initializer();
    _libSystem_ktrace_init_func(SECINIT);
#endif

#if defined(HAVE_SYSTEM_CONTAINERMANAGER)
    _container_init(apple);
    _libSystem_ktrace_init_func(CONTAINERMGR);
#endif

    __libdarwin_init();
    _libSystem_ktrace_init_func(DARWIN);
#endif // !TARGET_OS_DRIVERKIT

    __stack_logging_early_finished(&malloc_funcs);
    ...
}
- According to the stack shown earlier, libSystem_initializer calls the libdispatch_init function. Its source is in the open-source libdispatch library, so search for libdispatch_init in libdispatch-1271.120.2:
DISPATCH_EXPORT DISPATCH_NOTHROW
void
libdispatch_init(void)
{
    dispatch_assert(sizeof(struct dispatch_apply_s) <=
            DISPATCH_CONTINUATION_SIZE);

    if (_dispatch_getenv_bool("LIBDISPATCH_STRICT", false)) {
        _dispatch_mode |= DISPATCH_MODE_STRICT;
    }

#if DISPATCH_DEBUG || DISPATCH_PROFILE
#if DISPATCH_USE_KEVENT_WORKQUEUE
    if (getenv("LIBDISPATCH_DISABLE_KEVENT_WQ")) {
        _dispatch_kevent_workqueue_enabled = false;
    }
#endif
#endif

#if HAVE_PTHREAD_WORKQUEUE_QOS
    dispatch_qos_t qos = _dispatch_qos_from_qos_class(qos_class_main());
    _dispatch_main_q.dq_priority = _dispatch_priority_make(qos, 0);
#if DISPATCH_DEBUG
    if (!getenv("LIBDISPATCH_DISABLE_SET_QOS")) {
        _dispatch_set_qos_class_enabled = 1;
    }
#endif
#endif

#if DISPATCH_USE_THREAD_LOCAL_STORAGE
    _dispatch_thread_key_create(&__dispatch_tsd_key, _libdispatch_tsd_cleanup);
#else
    _dispatch_thread_key_create(&dispatch_priority_key, NULL);
    _dispatch_thread_key_create(&dispatch_r2k_key, NULL);
    _dispatch_thread_key_create(&dispatch_queue_key, _dispatch_queue_cleanup);
    _dispatch_thread_key_create(&dispatch_frame_key, _dispatch_frame_cleanup);
    _dispatch_thread_key_create(&dispatch_cache_key, _dispatch_cache_cleanup);
    _dispatch_thread_key_create(&dispatch_context_key, _dispatch_context_cleanup);
    _dispatch_thread_key_create(&dispatch_pthread_root_queue_observer_hooks_key, NULL);
    _dispatch_thread_key_create(&dispatch_basepri_key, NULL);
#if DISPATCH_INTROSPECTION
    _dispatch_thread_key_create(&dispatch_introspection_key , NULL);
#elif DISPATCH_PERF_MON
    _dispatch_thread_key_create(&dispatch_bcounter_key, NULL);
#endif
    _dispatch_thread_key_create(&dispatch_wlh_key, _dispatch_wlh_cleanup);
    _dispatch_thread_key_create(&dispatch_voucher_key, _voucher_thread_cleanup);
    _dispatch_thread_key_create(&dispatch_deferred_items_key, _dispatch_deferred_items_cleanup);
#endif
    pthread_key_create(&_os_workgroup_key, _os_workgroup_tsd_cleanup);

#if DISPATCH_USE_RESOLVERS // rdar://problem/8541707
    _dispatch_main_q.do_targetq = _dispatch_get_default_queue(true);
#endif

    _dispatch_queue_set_current(&_dispatch_main_q);
    _dispatch_queue_set_bound_thread(&_dispatch_main_q);

#if DISPATCH_USE_PTHREAD_ATFORK
    (void)dispatch_assume_zero(pthread_atfork(dispatch_atfork_prepare,
            dispatch_atfork_parent, dispatch_atfork_child));
#endif
    _dispatch_hw_config_init();
    _dispatch_time_init();
    _dispatch_vtable_init();
    _os_object_init(); // the key call
    _voucher_init();
    _dispatch_introspection_init();
}
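- Enter the source of _os_object_init: it calls _objc_init directly. Combined with the earlier analysis, this closes the loop. _objc_init calls _dyld_objc_notify_register, and its second parameter (load_images in objc4) is stored in sNotifyObjCInit; later, sNotifySingle --> sNotifyObjCInit (= parameter 2) --> the call to sNotifyObjCInit().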
void
_os_object_init(void)
{
    _objc_init(); // the key call
    Block_callbacks_RR callbacks = {
        sizeof(Block_callbacks_RR),
        (void (*)(const void *))&objc_retain,
        (void (*)(const void *))&objc_release,
        (void (*)(const void *))&_os_objc_destructInstance
    };
    _Block_use_RR2(&callbacks);
#if DISPATCH_COCOA_COMPAT
    const char *v = getenv("OBJC_DEBUG_MISSING_POOLS");
    if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
    v = getenv("DISPATCH_DEBUG_MISSING_POOLS");
    if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
    v = getenv("LIBDISPATCH_DEBUG_MISSING_POOLS");
    if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
#endif
}
A simple way to think about it: sNotifySingle is where the notification is added, like addObserver; the call to _dyld_objc_notify_register inside _objc_init is like sending the notification, that is, a push; and sNotifyObjCInit is the notification's handler, the selector.
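The closed loop between dyld and the Objective-C runtime is a register-then-call-back arrangement, and a tiny model makes the control flow easier to follow. Everything below is a hypothetical simplification: the function names, the single-string callback signature, and the global variables are invented for the sketch and only play the roles noted in the comments. The real dyld callback takes different parameters.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified callback signature for this sketch.
using InitCallback = void (*)(const char* imagePath);

static InitCallback g_notifyInit = nullptr;      // plays the role of sNotifyObjCInit
static std::vector<std::string> g_loadedImages;  // what the handler has observed

// Plays the role of _dyld_objc_notify_register: it only remembers the handler.
void registerInitCallback(InitCallback callback) {
    g_notifyInit = callback;
}

// Plays the role of notifySingle: it invokes the handler if one is registered.
void notifyImageInitializing(const char* imagePath) {
    if (g_notifyInit != nullptr)
        g_notifyInit(imagePath);
}

// Plays the role of the handler passed as parameter 2.
void recordImage(const char* imagePath) {
    g_loadedImages.push_back(imagePath);
}

// Plays the role of _objc_init: the runtime registers its handler with the loader.
void runtimeInit() {
    registerInitCallback(&recordImage);
}
```

In this model, a notification sent before runtimeInit() is silently dropped, which mirrors why the runtime must be initialized (through libSystem) before other images are initialized.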
[Summary] _objc_init call chain: _dyld_start --> dyldbootstrap::start --> dyld::_main --> dyld::initializeMainExecutable --> ImageLoader::runInitializers --> ImageLoader::processInitializers --> ImageLoader::recursiveInitialization --> doInitialization --> libSystem_initializer (libSystem.B.dylib) --> libdispatch_init --> _os_object_init (libdispatch.dylib) --> _objc_init (libobjc.A.dylib)
3.2.3 Step 9: Find the main entry function
- Debug at the assembly level: you can see that execution first arrives at the +[ViewController load] method.
- Continue, and execution reaches the C++ function ypyFunc.
- Click Step Over and keep running. Once the whole flow has finished, control returns to _dyld_start, which then calls the main() function; the assembly assigns main's parameters before the jump. In the dyld source, this is the "LC_MAIN case, set up stack for call to main()" part of _dyld_start:
#if __arm64__ && !TARGET_OS_SIMULATOR
    .text
    .align 2
    .globl __dyld_start
__dyld_start:
    mov     x28, sp
    and     sp, x28, #~15           // force 16-byte alignment of stack
    mov     x0, #0
    mov     x1, #0
    stp     x1, x0, [sp, #-16]!     // make aligned terminating frame
    mov     fp, sp                  // set up fp to point to terminating frame
    sub     sp, sp, #16             // make room for local variables
#if __LP64__
    ldr     x0, [x28]               // get app's mh into x0
    ldr     x1, [x28, #8]           // get argc into x1 (kernel passes 32-bit int argc as 64-bits on stack to keep alignment)
    add     x2, x28, #16            // get argv into x2
#else
    ldr     w0, [x28]               // get app's mh into x0
    ldr     w1, [x28, #4]           // get argc into x1 (kernel passes 32-bit int argc as 64-bits on stack to keep alignment)
    add     w2, w28, #8             // get argv into x2
#endif
    adrp    x3,___dso_handle@page
    add     x3,x3,___dso_handle@pageoff // get dyld's mh in to x4
    mov     x4,sp                   // x5 has &startGlue

    // call dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue)
    bl      __ZN13dyldbootstrap5startEPKN5dyld311MachOLoadedEiPPKcS3_Pm
    mov     x16,x0                  // save entry point address in x16
#if __LP64__
    ldr     x1, [sp]
#else
    ldr     w1, [sp]
#endif
    cmp     x1, #0
    b.ne    Lnew

    // LC_UNIXTHREAD way, clean up stack and jump to result
#if __LP64__
    add     sp, x28, #8             // restore unaligned stack pointer without app mh
#else
    add     sp, x28, #4             // restore unaligned stack pointer without app mh
#endif
#if __arm64e__
    braaz   x16                     // jump to the program's entry point
#else
    br      x16                     // jump to the program's entry point
#endif

    // LC_MAIN case, set up stack for call to main()
Lnew:
    mov     lr, x1                  // simulate return address into _start in libdyld.dylib
#if __LP64__
    ldr     x0, [x28, #8]           // main param1 = argc
    add     x1, x28, #16            // main param2 = argv
    add     x2, x1, x0, lsl #3
    add     x2, x2, #8              // main param3 = &env[0]
    mov     x3, x2
Lapple:
    ldr     x4, [x3]
    add     x3, x3, #8
#else
    ldr     w0, [x28, #4]           // main param1 = argc
    add     x1, x28, #8             // main param2 = argv
    add     x2, x1, x0, lsl #2
    add     x2, x2, #4              // main param3 = &env[0]
    mov     x3, x2
Lapple:
    ldr     w4, [x3]
    add     x3, x3, #4
#endif
    cmp     x4, #0
    b.ne    Lapple                  // main param4 = apple
#if __arm64e__
    braaz   x16
#else
    br      x16
#endif

#endif // __arm64__ && !TARGET_OS_SIMULATOR
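The pointer arithmetic in the Lnew and Lapple blocks can be restated in C++ to show where main's four parameters come from. This is a sketch of the 64-bit layout only: StartupVectors and locateVectors are hypothetical names, and the function works on any NULL-terminated argv, envp, and apple arrays laid out back to back, as the assembly assumes the kernel left them on the stack.

```cpp
// The three pointer vectors that main() receives besides argc.
struct StartupVectors {
    const char** argv;
    const char** envp;
    const char** apple;
};

// Mirrors the Lnew/Lapple arithmetic: envp starts one slot past argv's NULL
// terminator, and apple starts one slot past envp's NULL terminator.
StartupVectors locateVectors(int argc, const char** argv) {
    StartupVectors vectors;
    vectors.argv = argv;
    vectors.envp = argv + argc + 1;   // add x2, x1, x0, lsl #3 ; add x2, x2, #8
    const char** cursor = vectors.envp;
    while (*cursor != nullptr)        // Lapple: scan forward to envp's NULL
        ++cursor;
    vectors.apple = cursor + 1;       // first slot after envp's NULL terminator
    return vectors;
}
```

For example, with a fake stack of {"app", "-v", NULL, "HOME=/tmp", NULL, "executable_path=/bin/app", NULL} and argc = 2, envp lands on index 3 and apple on index 5.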
The _dyld_start assembly is the part of dyld that hands control to main.
Note: if the main function is renamed, an error is reported.
To sum up, the final dyld loading process is shown in the figure below. It also answers the earlier question of why the call order is load --> Cxx --> main.