Tag: App start dyld


This article describes the dyld loading process: what happens before the main function runs.


Preparation: the dyld source, the libdispatch-1271.120.2 source, the Libsystem-1292.60.1 source, and the objc4-818.2 source.

1. dyld

1.1 Introduction

  • dyld (the dynamic link editor) is Apple's dynamic linker and an important part of Apple's operating systems. After an app is compiled and packaged into an executable in the Mach-O format, dyld is responsible for linking and loading the program
  • dyld is involved throughout app startup, including loading the dependent libraries and the main program. Any work on performance or startup optimization inevitably has to deal with dyld

1.2 History of dyld

1.2.1 dyld 1.0 (1996-2004)

  • dyld 1 was included in NeXTStep 3.3. Before that, NeXT used static binaries.
  • dyld 1 was written before C++ dynamic libraries were widely used in systems. C++ has features, such as the way its initializers run, that work well in a static environment but can degrade performance in a dynamic one. As a result, large C++ dynamic libraries made dyld do a lot of work and slowed it down
  • Before the release of macOS 10.0 (Cheetah), one more feature was added: prebinding. Prebinding finds fixed addresses for every dylib and for the application. dyld tries to load everything at those addresses. If that succeeds, the binaries of all the dylibs and the program are edited so that they contain all of the precomputed results. The next time everything is loaded at the same addresses, no further work is needed, which speeds things up a lot. But it also means that binary data is edited on every launch, which is undesirable, at least from a security standpoint.

1.2.2 dyld 2 (2004-2017)

Since its release in 2004, dyld 2 has gone through several iterations. Common features such as ASLR, code signing, and the shared cache were all introduced in dyld 2

dyld 2.0 (2004-2007)

  • dyld 2 was introduced in macOS Tiger in 2004
  • dyld 2 is a complete rewrite of dyld 1. It correctly supports C++ initializer semantics and extends the Mach-O format, with dyld updated to match, which made efficient C++ library support possible.
  • dyld 2 has complete implementations of dlopen and dlsym (used mainly to load libraries dynamically and call functions) with correct semantics, so the older APIs were deprecated
    • dlopen: Opens a library and gets the handle
    • dlsym: Looks for the value of the symbol in the open library
    • dlclose: Closes the handle.
    • dlerror: Returns a string describing the last error from a call to dlopen, dlsym, or dlclose.
  • dyld 2 was designed for launch speed, so it performed only limited sanity checking, mainly because malicious programs were less common at the time
  • dyld also had some security issues, so some features were later reworked to improve the security of dyld on the platform
  • Because launch speed improved so much, the prebinding work could be reduced. Instead of editing program data, only the system libraries are edited, and only when software is updated. That is why you may see a message like "Optimizing system performance" during a software update: prebinding is being updated. The same step is now used for all kinds of optimizations, but prebinding was its original purpose. That was dyld 2
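
The four calls listed above fit together as open, look up, use, close. The following is a minimal sketch, not dyld's own code: the function name strlenViaDlsym is hypothetical, and strlen is chosen only because the C library is already loaded in every process, so no library path has to be assumed.

```cpp
#include <dlfcn.h>   // dlopen, dlsym, dlclose, dlerror
#include <cstddef>

// Signature of the C function we expect to find under the name "strlen".
using StrlenFn = std::size_t (*)(const char*);

// Hypothetical helper: looks up strlen at run time and calls it on `text`.
// Returns -1 if the handle or the symbol cannot be resolved.
long strlenViaDlsym(const char* text) {
    // A null path asks for a handle to the main program and the
    // libraries that were loaded together with it.
    void* handle = dlopen(nullptr, RTLD_NOW);
    if (handle == nullptr) return -1;

    dlerror();  // clear any stale error before the lookup
    void* symbol = dlsym(handle, "strlen");
    const char* error = dlerror();

    long result = -1;
    if (error == nullptr && symbol != nullptr) {
        StrlenFn fn = reinterpret_cast<StrlenFn>(symbol);
        result = static_cast<long>(fn(text));
    }
    dlclose(handle);  // release the handle obtained from dlopen
    return result;
}
```

Calling strlenViaDlsym("dyld") goes through the dynamic lookup instead of a direct link-time reference to strlen. On some older Linux toolchains the program may have to be linked with -ldl.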

dyld 2.x (2007-2017)

  • Many improvements were made between 2004 and 2017, and the performance of dyld 2 improved significantly
  • First, many architectures and platforms were added.
    • dyld 2 first shipped on PowerPC; since then x86, x86_64, arm, arm64, and a number of derivative platforms have been added.
    • iOS, tvOS, and watchOS were also launched, all of which required new dyld capabilities
  • Security was improved in several ways
    • Code signing was added
    • ASLR (Address Space Layout Randomization): each time a library is loaded, it may be placed at a different address
    • Bounds checking: bounds checks on the Mach-O header were added to prevent the injection of malicious binary data
  • Performance was improved
    • Prebinding was removed and replaced by the shared cache

ASLR

  • ASLR is a computer security technique that prevents memory corruption vulnerabilities from being exploited. By randomizing where the key data areas of a process are placed in its address space, ASLR prevents an attacker from reliably jumping to a specific location in memory to exploit a function
  • Linux added ASLR in kernel version 2.6.12
  • Apple introduced random address offsets in Mac OS X Leopard 10.5 (released in October 2007), but that implementation did not provide the full protection that ASLR is defined to offer. Mac OS X Lion 10.7 provided ASLR support for all applications.
  • Apple introduced ASLR in iOS 4.3.
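
The effect of ASLR on addresses can be described with a single offset, usually called the slide: the difference between where an image was actually loaded and where it was linked to load. The sketch below shows only that arithmetic. The function names computeSlide and applySlide are hypothetical, and any addresses passed to them are illustrative values, not taken from a real process.

```cpp
#include <cstdint>

// Slide = address where the image actually landed
//       - address the image was linked to be loaded at.
constexpr std::uint64_t computeSlide(std::uint64_t actualLoadAddress,
                                     std::uint64_t preferredLoadAddress) {
    return actualLoadAddress - preferredLoadAddress;
}

// A link-time address becomes a run-time address by adding the slide.
constexpr std::uint64_t applySlide(std::uint64_t linkTimeAddress,
                                   std::uint64_t slide) {
    return linkTimeAddress + slide;
}
```

For example, if an image linked at 0x100000000 is loaded at 0x104ca0000, the slide is 0x4ca0000, and a function linked at 0x100005f24 is found at 0x104ca5f24 at run time. Because the slide changes from launch to launch, an attacker cannot hard-code that run-time address.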

Bounds checking

  • Bounds checks were added for many parts of the Mach-O header, which prevents the injection of malicious binary data

Shared cache

  • The shared cache was first introduced in iOS 3.1 and macOS Snow Leopard, where it completely replaced prebinding
  • The shared cache is a single file that contains most of the system dylibs. Because these dylibs are merged into one file, they can be optimized.
    • All of the text segments (__TEXT) and data segments (__DATA) are rearranged and the entire symbol table is rewritten to reduce the size of the file, so that only a small number of regions have to be mapped into each process. This also allows binary data segments to be packed, which saves a lot of RAM
    • In essence it is a dylib prelinker. The RAM savings are significant, on the order of 500 MB to 1 GB when running ordinary iOS programs
    • Data structures can also be pregenerated for dyld and Objective-C to use at run time. That work then does not have to be done when a program starts, which saves even more RAM and time
  • On macOS the shared cache is generated locally; regenerating the dyld shared cache greatly improves system performance

1.2.3 dyld 3 (2017-present)

  • dyld 3, announced at WWDC 2017, is a new dynamic linker that completely changes the concept of dynamic linking. It was announced as the default for most macOS applications and as the default on all of Apple's 2017 operating systems.
  • dyld 3 first appeared in iOS 11 in 2017, where it was used mainly to optimize the system libraries.
  • In iOS 13, iOS fully adopted dyld 3 in place of dyld 2. Because dyld 3 is fully compatible with dyld 2 and the API is the same, in most cases developers need no additional adaptation for a smooth transition.

2. Using bt to view the stack and find where app launch starts

Put a breakpoint on the +load method and use bt to print the stack and see where the app starts. Running the program shows that it starts from _dyld_start in dyld

int main(int argc, char * argv[]) {
    NSString * appDelegateClassName;
    
    
    @autoreleasepool {
        // Setup code that might create autoreleased objects goes here.
        appDelegateClassName = NSStringFromClass([AppDelegate class]);
    }
    return UIApplicationMain(argc, argv, nil, appDelegateClassName);
}

__attribute__((constructor)) void ypyFunc(){
    printf("Coming: %s \n",__func__);
}

@interface ViewController ()

@end

@implementation ViewController

+ (void)load{
    NSLog(@"%s",__func__);
}

- (void)viewDidLoad {
    [super viewDidLoad];
    // Do any additional setup after loading the view.
}

@end
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000104ca5f24 002-Application load analysis`+[ViewController load](self=ViewController, _cmd="load") at ViewController.m:17:5
    frame #1: 0x00000001aafd735c libobjc.A.dylib`load_images + 984
    frame #2: 0x0000000104e0a190 dyld`dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*) + 448
    frame #3: 0x0000000104e1a0d8 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 512
    frame #4: 0x0000000104e18520 dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 184
    frame #5: 0x0000000104e185e8 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 92
    frame #6: 0x0000000104e0a658 dyld`dyld::initializeMainExecutable() + 216
    frame #7: 0x0000000104e0eeb0 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 4400
    frame #8: 0x0000000104e09208 dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 396
    frame #9: 0x0000000104e09038 dyld`_dyld_start + 56
(lldb)
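
The sample above also declares a function with __attribute__((constructor)). Like +load, such a function runs while the image is being initialized, before main is entered. The following is a minimal sketch of that ordering using the GCC/Clang attribute; the names gInitCounter, earlyHook, and constructorAlreadyRan are hypothetical.

```cpp
#include <cstdio>

static int gInitCounter = 0;       // incremented by each hook as it runs
static int gConstructorRanAt = 0;  // position at which the constructor ran

// GCC/Clang extension: this function is run as part of image
// initialization, before main() is entered.
__attribute__((constructor)) static void earlyHook() {
    gConstructorRanAt = ++gInitCounter;
    std::printf("Coming: %s\n", __func__);
}

// Returns true when the constructor was the first hook to run.
// Any code reached from main() sees true here.
bool constructorAlreadyRan() {
    return gConstructorRanAt == 1;
}
```

If main calls constructorAlreadyRan(), the result is true, and the "Coming: earlyHook" line has already been printed by then.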

3. _dyld_start process analysis

_dyld_start calls dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue), which is a C++ function

3.1 dyldbootstrap::start source code

In the source, search for dyldbootstrap to find the namespace, then look for the start function in that file. The core of this function is its return value, which is a call to dyld's _main function. Here macho_header is the Mach-O header, and the files that dyld loads are Mach-O files. Mach-O is the executable file format and consists of four parts: the Mach-O header, the load commands, the sections, and other data. You can inspect an executable's information with MachOView

//
// This is code to bootstrap dyld. This work in normally done for a program by dyld and crt.
// In dyld we have to do this manually.
//
uintptr_t start(const dyld3::MachOLoaded* appsMachHeader, int argc, const char* argv[],
				const dyld3::MachOLoaded* dyldsMachHeader, uintptr_t* startGlue)
{

    // Emit kdebug tracepoint to indicate dyld bootstrap has started <rdar://46878536>
    dyld3::kdebug_trace_dyld_marker(DBG_DYLD_TIMING_BOOTSTRAP_START, 0, 0, 0, 0);

	// if kernel had to slide dyld, we need to fix up load sensitive locations
	// we have to do this before using any global variables
    rebaseDyld(dyldsMachHeader);

	// kernel sets up env pointer to be just past end of agv array
	const char** envp = &argv[argc+1];
	
	// kernel sets up apple pointer to be just past end of envp array
	const char** apple = envp;
	while(*apple != NULL) { ++apple; }
	++apple;

	// set up random value for stack canary
	__guard_setup(apple);

#if DYLD_INITIALIZER_SUPPORT
	// run all C++ initializers inside dyld
	runDyldInitializers(argc, argv, envp, apple);
#endif

	_subsystem_init(apple);

	// now that we are done bootstrapping dyld, call dyld's main
	uintptr_t appsSlide = appsMachHeader->getSlide();
	return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}
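
The pointer arithmetic that start uses to find envp and apple relies on three NULL-terminated arrays being laid out back to back: argv, then envp, then apple. The sketch below repeats that walk on a hand-built array. StartupVectors, locateVectors, and kSampleBlock are hypothetical names, and every string in the sample is made up.

```cpp
#include <cstddef>

struct StartupVectors {
    const char** envp;   // first environment string
    const char** apple;  // first Apple-specific string
};

// Assumes argv is followed in memory by:
//   NULL, envp entries..., NULL, apple entries..., NULL
StartupVectors locateVectors(int argc, const char** argv) {
    const char** envp = &argv[argc + 1];     // skip argv[argc], the terminator
    const char** apple = envp;
    while (*apple != nullptr) { ++apple; }   // walk to the NULL that ends envp
    ++apple;                                 // step past that NULL
    StartupVectors result = {envp, apple};
    return result;
}

// Hand-built stand-in for the block the kernel sets up (values are made up).
static const char* kSampleBlock[] = {
    "/path/to/App", "-verbose", nullptr,       // argv[0..1] and terminator
    "HOME=/var/mobile", "LANG=C", nullptr,     // envp and terminator
    "executable_path=/path/to/App", nullptr    // apple and terminator
};
```

With argc equal to 2, locateVectors(2, kSampleBlock) returns envp pointing at the "HOME=..." entry and apple pointing at the "executable_path=..." entry.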

3.2 Source code analysis of the dyld::_main function

The implementation of dyld::_main is very long, about 600 lines. If you are not familiar with the dyld loading process, you can work backwards from the return value of _main. The _main function does the following:

  • [Step 1: Set up the conditions: environment, platform, version, path, host information]: set values from the environment variables and get the current running architecture

  • [Step 2: Load the shared cache]: check whether the shared cache is enabled and map it into the shared region (this is where libraries such as UIKit and CoreFoundation live)

  • [Step 3: Initialize the main program]: call instantiateFromLoadedImage to instantiate an ImageLoader object

  • [Step 4: Insert dynamic libraries]: iterate over the DYLD_INSERT_LIBRARIES environment variable and call loadInsertedDylib to load each library

  • [Step 5: Link the main program]

  • [Step 6: Link the dynamic libraries]

  • [Step 7: Bind weak references]

  • [Step 8: Run the initializers]

  • [Step 9: Find the main program entry point]: read the entry point from the LC_MAIN load command; if there is none, read LC_UNIXTHREAD. This brings us to the main function familiar from daily development

The following analysis focuses on [Step 3] and [Step 8].

3.2.1 Step 3: Main program initialization

  • sMainExecutable is the variable that represents the main program. Looking at where it is assigned shows that it is initialized by the instantiateFromLoadedImage function

    // instantiate ImageLoader for main executable
    // Step 3: Initialize the main program
    		sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
    		gLinkContext.mainExecutable = sMainExecutable;
    		gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);

    instantiateFromLoadedImage: initializing the main program

  • Enter the instantiateFromLoadedImage source code. It creates an ImageLoader instance by calling instantiateMainExecutable

    // The kernel maps in main executable before dyld gets control. We need to
    // make an ImageLoader* for the already mapped in main executable.
    static ImageLoaderMachO* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
    {
    	// try mach-o loader
    //	if ( isCompatibleMachO((const uint8_t*)mh, path) ) {
    		ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
    		addImage(image);
    		return (ImageLoaderMachO*)image;
    //	}
    	
    //	throw "main executable not a known format";
    }
  • Enter the instantiateMainExecutable source code. Its main job is to create an image for the main executable and return it as an ImageLoader object, which is the main program. The sniffLoadCommands function reads the load commands of the Mach-O file and performs various checks on them

    // create image for main executable
    ImageLoader* ImageLoaderMachO::instantiateMainExecutable(const macho_header* mh, uintptr_t slide, const char* path, const LinkContext& context)
    {
    	//dyld::log("ImageLoader=%ld, ImageLoaderMachO=%ld, ImageLoaderMachOClassic=%ld, ImageLoaderMachOCompressed=%ld\n",
    	//	sizeof(ImageLoader), sizeof(ImageLoaderMachO), sizeof(ImageLoaderMachOClassic), sizeof(ImageLoaderMachOCompressed));
    	bool compressed;
    	unsigned int segCount;
    	unsigned int libCount;
    	const linkedit_data_command* codeSigCmd;
    	const encryption_info_command* encryptCmd;
    	sniffLoadCommands(mh, path, false, &compressed, &segCount, &libCount, context, &codeSigCmd, &encryptCmd);
    	// instantiate concrete class based on content of load commands
    	if ( compressed ) 
    		return ImageLoaderMachOCompressed::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
    	else
    #if SUPPORT_CLASSIC_MACHO
    		return ImageLoaderMachOClassic::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
    #else
    		throw "missing LC_DYLD_INFO load command";
    #endif
    }
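
The role sniffLoadCommands plays in the code above can be illustrated with a much smaller scan: walk the load commands by their cmd and cmdsize fields, count segments and libraries, and mark the image as compressed when an LC_DYLD_INFO or LC_DYLD_INFO_ONLY command is present. This is a simplified sketch, not dyld's implementation: it omits the validation the real function performs, it recognizes only a few command types, and sniff, SniffResult, appendCommand, and sniffSample are hypothetical names. The numeric constants are the values defined in <mach-o/loader.h>.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::uint32_t kReqDyld      = 0x80000000u;      // LC_REQ_DYLD
constexpr std::uint32_t kLoadDylib    = 0xc;              // LC_LOAD_DYLIB
constexpr std::uint32_t kSegment64    = 0x19;             // LC_SEGMENT_64
constexpr std::uint32_t kDyldInfo     = 0x22;             // LC_DYLD_INFO
constexpr std::uint32_t kDyldInfoOnly = 0x22 | kReqDyld;  // LC_DYLD_INFO_ONLY

// Every load command begins with these two fields.
struct LoadCommandHeader {
    std::uint32_t cmd;
    std::uint32_t cmdsize;
};

struct SniffResult {
    bool compressed;
    unsigned segCount;
    unsigned libCount;
};

// Walks up to ncmds load commands stored back to back in `commands`.
SniffResult sniff(const std::vector<std::uint8_t>& commands, std::uint32_t ncmds) {
    SniffResult result = {false, 0, 0};
    std::size_t offset = 0;
    for (std::uint32_t i = 0; i < ncmds; ++i) {
        LoadCommandHeader lc;
        if (commands.size() - offset < sizeof(lc)) break;  // truncated input
        std::memcpy(&lc, commands.data() + offset, sizeof(lc));
        // Stop on a malformed size instead of reading out of bounds.
        if (lc.cmdsize < sizeof(lc) || lc.cmdsize > commands.size() - offset) break;
        switch (lc.cmd) {
            case kDyldInfo:
            case kDyldInfoOnly: result.compressed = true; break;
            case kSegment64:    ++result.segCount;        break;
            case kLoadDylib:    ++result.libCount;        break;
            default:                                      break;
        }
        offset += lc.cmdsize;  // the next command starts right after this one
    }
    return result;
}

// Appends a zero-filled command of `cmdsize` bytes (at least 8) with the given type.
static void appendCommand(std::vector<std::uint8_t>& buffer,
                          std::uint32_t cmd, std::uint32_t cmdsize) {
    LoadCommandHeader lc = {cmd, cmdsize};
    const std::size_t start = buffer.size();
    buffer.resize(start + cmdsize, 0);
    std::memcpy(buffer.data() + start, &lc, sizeof(lc));
}

// Builds a made-up command list: two segments, one dyld info, one dylib.
SniffResult sniffSample() {
    std::vector<std::uint8_t> commands;
    appendCommand(commands, kSegment64, 72);
    appendCommand(commands, kSegment64, 72);
    appendCommand(commands, kDyldInfoOnly, 48);
    appendCommand(commands, kLoadDylib, 56);
    return sniff(commands, 4);
}
```

For the sample list, the scan reports a compressed image with two segments and one library, which in the code above would select ImageLoaderMachOCompressed.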

3.2.2 Step 8: Execute the initialization methods

  • Enter the initializeMainExecutable source code. It mainly loops over the image roots and calls runInitializers on each of them

    void initializeMainExecutable()
    {
    	// record that we've reached this step
    	gLinkContext.startedInitializingMainExecutable = true;
    
    	// run initialzers for any inserted dylibs
    	ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
    	initializerTimes[0].count = 0;
    	const size_t rootCount = sImageRoots.size();
    	if ( rootCount > 1 ) {
    		for(size_t i=1; i < rootCount; ++i) {
    			sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
    		}
    	}
    	
    	// run initializers for main executable and everything it brings up 
    	sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
    	
    	// register cxa_atexit() handler to run static terminators in all loaded images when this process exits
    	if ( gLibSystemHelpers != NULL ) 
    		(*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);
    
    	// dump info if requested
    	if ( sEnv.DYLD_PRINT_STATISTICS )
    		ImageLoader::printStatistics((unsigned int)allImagesCount(), initializerTimes[0]);
    	if ( sEnv.DYLD_PRINT_STATISTICS_DETAILS )
    		ImageLoaderMachO::printStatisticsDetails((unsigned int)allImagesCount(), initializerTimes[0]);
    }
  • Searching globally for runInitializers(const finds the following source, whose core is the call to processInitializers

    void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
    {
    	uint64_t t1 = mach_absolute_time();
    	mach_port_t thisThread = mach_thread_self();
    	ImageLoader::UninitedUpwards up;
    	up.count = 1;
    	up.imagesAndPaths[0] = { this, this->getPath() };
    	processInitializers(context, thisThread, timingInfo, up);
    	context.notifyBatch(dyld_image_state_initialized, false);
    	mach_port_deallocate(mach_task_self(), thisThread);
    	uint64_t t2 = mach_absolute_time();
    	fgTotalInitTime += (t2 - t1);
    }
  • Enter the implementation of processInitializers, which calls recursiveInitialization on every image in the image list

    // <rdar://problem/14412057> upward dylib initializers can be run too soon
    // To handle dangling dylibs which are upward linked but not downward, all upward linked dylibs
    // have their initialization postponed until after the recursion through downward dylibs
    // has completed.
    void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
    									 InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
    {
    	uint32_t maxImageCount = context.imageCount() +2;
    	ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
    	ImageLoader::UninitedUpwards& ups = upsBuffer[0];
    	ups.count = 0;
    	// Calling recursive init on all images in images list, building a new list of
    	// uninitialized upward dependencies.
    	for (uintptr_t i=0; i < images.count; ++i) {
    		images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
    	}
    	// If any upward dependencies remain, init them.
    	if ( ups.count > 0 )
    		processInitializers(context, thisThread, timingInfo, ups);
    }
  • Searching globally for recursiveInitialization(const finds the function, which is implemented as follows

    void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
    										  InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
    {
    	recursive_lock lock_info(this_thread);
    	recursiveSpinLock(lock_info); // Lock recursively
    
    	if ( fState < dyld_image_state_dependents_initialized-1 ) {
    		uint8_t oldState = fState;
    		// Break cycles to end the recursion
    		fState = dyld_image_state_dependents_initialized-1;
    		try {
    			// initialize lower level libraries first
    			for(unsigned int i=0; i < libraryCount(); ++i) {
    				ImageLoader* dependentImage = libImage(i);
    				if ( dependentImage != NULL ) {
    					// don't try to initialize stuff "above" me yet
    					if ( libIsUpward(i) ) {
    						uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
    						uninitUps.count++;
    					}
    					else if ( dependentImage->fDepth >= fDepth ) {
    						dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
    					}
    				}
    			}
    			
    			// record termination order
    			if ( this->needsTermination() )
    				context.terminationRecorder(this);
    
    			// let objc know we are about to initialize this image
    			uint64_t t1 = mach_absolute_time();
    			fState = dyld_image_state_dependents_initialized;
    			oldState = fState;
    			context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
    			
    			// Initialize this image
    			bool hasInitializers = this->doInitialization(context);
    
    			// let anyone know we finished initializing this image
    			fState = dyld_image_state_initialized;
    			oldState = fState;
    			context.notifySingle(dyld_image_state_initialized, this, NULL);
    			
    			if ( hasInitializers ) {
    				uint64_t t2 = mach_absolute_time();
    				timingInfo.addTime(this->getShortName(), t2-t1);
    			}
    		}
    		catch (const char* msg) {
    			// this image is not initialized
    			fState = oldState;
    			recursiveSpinUnLock();
    			throw;
    		}
    	}
    	
    	recursiveSpinUnLock(); // Unlock recursively
    }
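
The shape of recursiveInitialization is a depth-first walk: mark the image as in progress to break cycles, initialize what it depends on, then initialize the image itself. The following is a minimal sketch of that pattern only. It leaves out locking, upward dependencies, and the fDepth check, and Image, State, recursiveInit, and sampleInitOrder are hypothetical names. The dependency graph in the sample, including its cycle, is artificial.

```cpp
#include <string>
#include <vector>

enum class State { Bound, Initializing, Initialized };

struct Image {
    std::string name;
    std::vector<Image*> dependents;  // libraries this image links against
    State state;
};

// Initializes dependencies first, then the image itself.
// `order` records the sequence in which images were initialized.
void recursiveInit(Image& image, std::vector<std::string>& order) {
    if (image.state != State::Bound) return;  // finished or already in progress
    image.state = State::Initializing;        // set before recursing: breaks cycles
    for (Image* dep : image.dependents) {
        if (dep != nullptr) recursiveInit(*dep, order);
    }
    order.push_back(image.name);  // stands in for notifySingle + doInitialization
    image.state = State::Initialized;
}

// Builds a small made-up graph and returns the initialization order.
std::vector<std::string> sampleInitOrder() {
    Image libSystem  = {"libSystem", {}, State::Bound};
    Image libobjc    = {"libobjc", {&libSystem}, State::Bound};
    Image foundation = {"Foundation", {&libobjc, &libSystem}, State::Bound};
    Image app        = {"App", {&foundation, &libobjc}, State::Bound};
    libSystem.dependents.push_back(&libobjc);  // artificial cycle

    std::vector<std::string> order;
    recursiveInit(app, order);
    return order;
}
```

For this sample the order is libSystem, libobjc, Foundation, App: the lowest-level library is initialized first and the main program last, and the cycle between libSystem and libobjc does not cause infinite recursion because the state is updated before descending.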

Two functions need to be explored here: notifySingle and doInitialization. We start with notifySingle

3.2.2.1 notifySingle function

  • Search globally for notifySingle(. The key line in this function is (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());

    static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
    {
    	//dyld::log("notifySingle(state=%d, image=%s)\n", state, image->getPath());
    	std::vector<dyld_image_state_change_handler>* handlers = stateToHandlers(state, sSingleHandlers);
    	if ( handlers != NULL ) {
    		dyld_image_info info;
    		info.imageLoadAddress	= image->machHeader();
    		info.imageFilePath		= image->getRealPath();
    		info.imageFileModDate	= image->lastModified();
    		for (std::vector<dyld_image_state_change_handler>::iterator it = handlers->begin(); it != handlers->end(); ++it) {
    			const char* result = (*it)(state, 1, &info);
    			if ( (result != NULL) && (state == dyld_image_state_mapped) ) {
    				//fprintf(stderr, " image rejected by handler=%p\n", *it);
    				// make copy of thrown string so that later catch clauses can free it
    				const char* str = strdup(result);
    				throw str;
    			}
    		}
    	}
    	if ( state == dyld_image_state_mapped ) {
    		// <rdar://problem/7008875> Save load addr + UUID for images from outside the shared cache
    		// <rdar://problem/50432671> Include UUIDs for shared cache dylibs in all image info when using private mapped shared caches
    		if ( !image->inSharedCache()
    			|| (gLinkContext.sharedRegionMode == ImageLoader::kUsePrivateSharedRegion)) {
    			dyld_uuid_info info;
    			if ( image->getUUID(info.imageUUID) ) {
    				info.imageLoadAddress = image->machHeader();
    				addNonSharedCacheImageUUID(info);
    			}
    		}
    	}
    	if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
    		uint64_t t0 = mach_absolute_time();
    		dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
    		(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
    		uint64_t t1 = mach_absolute_time();
    		uint64_t t2 = mach_absolute_time();
    		uint64_t timeInObjC = t1-t0;
    		uint64_t emptyTime = (t2-t1)*100;
    		if ( (timeInObjC > emptyTime) && (timingInfo != NULL) ) {
    			timingInfo->addTime(image->getShortName(), timeInObjC);
    		}
    	}
    	// mach message csdlc about dynamically unloaded images
    	if ( image->addFuncNotified() && (state == dyld_image_state_terminated) ) {
    		notifyKernel(*image, false);
    		const struct mach_header* loadAddress[] = { image->machHeader() };
    		const char* loadPath[] = { image->getPath() };
    		notifyMonitoringDyld(true, 1, loadAddress, loadPath);
    	}
    }
  • A global search for sNotifyObjCInit finds no implementation, only an assignment, which happens in registerObjCNotifiers

    void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
    {
    	// record functions to call
    	sNotifyObjCMapped	= mapped;
    	sNotifyObjCInit		= init; // key line
    	sNotifyObjCUnmapped = unmapped;
    
    	// call 'mapped' function with all images mapped so far
    	try {
    		notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
    	}
    	catch (const char* msg) {
    		// ignore request to abort during registration
    	}
    
    	// <rdar://problem/32209809> call 'init' function on all images already init'ed (below libSystem)
    	for (std::vector<ImageLoader*>::iterator it=sAllImages.begin(); it != sAllImages.end(); it++) {
    		ImageLoader* image = *it;
    		if ( (image->getState() == dyld_image_state_initialized) && image->notifyObjC() ) {
    			dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
    			(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
    		}
    	}
    }
  • registerObjCNotifiers is called from _dyld_objc_notify_register. Note that the callers of _dyld_objc_notify_register have to be searched for in the libobjc source code

    void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                    _dyld_objc_notify_init      init,
                                    _dyld_objc_notify_unmapped  unmapped)
    {
    	dyld::registerObjCNotifiers(mapped, init, unmapped);
    }
  • In objc4-818.2, searching for _dyld_objc_notify_register shows that it is called in _objc_init, so sNotifyObjCInit is assigned load_images from objc, and load_images calls all the +load methods. In other words, notifySingle ends up invoking a callback

    /***********************************************************************
    * _objc_init
    * Bootstrap initialization. Registers our image notifier with dyld.
    * Called by libSystem BEFORE library initialization time
    **********************************************************************/
    
    void _objc_init(void)
    {
        static bool initialized = false;
        if (initialized) return;
        initialized = true;
        
        // fixme defer initialization until an objc-using image is found?
        environ_init();
        tls_init();
        static_init();
        runtime_init();
        exception_init();
    #if __OBJC2__
        cache_t::init();
    #endif
        _imp_implementationWithBlock_init();
    
        _dyld_objc_notify_register(&map_images, load_images, unmap_image); // key line
    
    #if __OBJC2__
        didCallDyldNotifyRegister = true;
    #endif
    }
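
The relationship between notifySingle and _objc_init is a plain callback registration: one side stores a function pointer, the other side calls through it if it has been set. The sketch below models only that. All names (InitCallback, sNotifyInit, registerInitNotifier, notifyImageInitializing, loadImagesStub, replayRegistration) are hypothetical, the paths are made up, and the replay of already-initialized images that registerObjCNotifiers performs is left out.

```cpp
#include <string>
#include <vector>

using InitCallback = void (*)(const char* path);

static InitCallback sNotifyInit = nullptr;  // plays the role of sNotifyObjCInit
static std::vector<std::string> gLoadLog;   // records what the callback did

// The runtime side calls this once to hand its callback to the loader.
void registerInitNotifier(InitCallback init) {
    sNotifyInit = init;
}

// The loader side calls this for each image that is about to be initialized.
// Nothing happens until a callback has been registered.
void notifyImageInitializing(const char* path) {
    if (sNotifyInit != nullptr) {
        (*sNotifyInit)(path);
    }
}

// Stand-in for load_images: records that +load methods would run here.
void loadImagesStub(const char* path) {
    gLoadLog.push_back(std::string("+load in ") + path);
}

// Notifies once before registration and once after, and returns the log.
std::vector<std::string> replayRegistration() {
    gLoadLog.clear();
    sNotifyInit = nullptr;
    notifyImageInitializing("/path/to/EarlyLibrary");  // ignored: no callback yet
    registerInitNotifier(&loadImagesStub);
    notifyImageInitializing("/path/to/App");           // reaches the callback
    return gLoadLog;
}
```

Only the notification sent after registration reaches the callback. This is why the question of when _objc_init runs matters: until it has registered load_images, dyld has nothing to call.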

Calling the +load methods

Let's go into the source code of load_images and look at its implementation to confirm that all +load methods are called from load_images

  • From _objc_init in the objc source, jump to the implementation of load_images

    void
    load_images(const char *path __unused, const struct mach_header *mh)
    {
        if (!didInitialAttachCategories && didCallDyldNotifyRegister) {
            didInitialAttachCategories = true;
            loadAllCategories();
        }
    
        // Return without taking locks if there are no +load methods here.
        if (!hasLoadMethods((const headerType *)mh)) return;
    
        recursive_mutex_locker_t lock(loadMethodLock);
    
        // Discover load methods
        {
            mutex_locker_t lock2(runtimeLock);
            prepare_load_methods((const headerType *)mh);
        }
    
        // Call +load methods (without runtimeLock - re-entrant)
        call_load_methods();
    }
  • Entering the implementation of call_load_methods shows that its core is a do-while loop that calls the +load methods

    /***********************************************************************
    * call_load_methods
    * Call all pending class and category +load methods.
    * Class +load methods are called superclass-first.
    * Category +load methods are not called until after the parent class's +load.
    *
    * This method must be RE-ENTRANT, because a +load could trigger
    * more image mapping. In addition, the superclass-first ordering
    * must be preserved in the face of re-entrant calls. Therefore,
    * only the OUTERMOST call of this function will do anything, and
    * that call will handle all loadable classes, even those generated
    * while it was running.
    *
    * The sequence below preserves +load ordering in the face of
    * image loading during a +load, and make sure that no
    * +load method is forgotten because it was added during
    * a +load call.
    * Sequence:
    * 1. Repeatedly call class +loads until there aren't any more
    * 2. Call category +loads ONCE.
    * 3. Run more +loads if:
    *    (a) there are more classes to load, OR
    *    (b) there are some potential category +loads that have
    *        still never been attempted.
    * Category +loads are only run once to ensure "parent class first"
    * ordering, even if a category +load triggers a new loadable class
    * and a new loadable category attached to that class.
    *
    * Locking: loadMethodLock must be held by the caller
    *   All other locks must not be held.
    **********************************************************************/
    void call_load_methods(void)
    {
        static bool loading = NO;
        bool more_categories;
    
        loadMethodLock.assertLocked();
    
        // Re-entrant calls do nothing; the outermost call will finish the job.
        if (loading) return;
        loading = YES;
    
        void *pool = objc_autoreleasePoolPush();
    
        do {
            // 1. Repeatedly call class +loads until there aren't any more
            while (loadable_classes_used > 0) {
                call_class_loads();
            }
    
            // 2. Call category +loads ONCE
            more_categories = call_category_loads();
    
            // 3. Run more +loads if there are classes OR more untried categories
        } while (loadable_classes_used > 0  ||  more_categories);
    
        objc_autoreleasePoolPop(pool);
    
        loading = NO;
    }
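
The loop in call_load_methods combines two ideas: a re-entrancy guard, so that only the outermost call does any work, and a drain loop that keeps going while running one +load makes more work appear. The following is a simplified sketch of those two ideas, not the runtime's code. LoadQueue and sampleLoadOrder are hypothetical names, the "+load methods" are modeled as lambdas, and the category step is reduced to reporting whether new category work was queued.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct LoadQueue {
    using LoadFn = std::function<void(LoadQueue&)>;

    std::vector<LoadFn> pendingClassLoads;
    std::vector<LoadFn> pendingCategoryLoads;
    std::vector<std::string> log;  // order in which the loads actually ran
    bool loading = false;

    void callLoadMethods() {
        // Re-entrant calls do nothing; the outermost call finishes the job.
        if (loading) return;
        loading = true;

        bool moreCategories = false;
        do {
            // 1. Keep calling class loads until there are none left.
            while (!pendingClassLoads.empty()) {
                runDetached(pendingClassLoads);
            }
            // 2. Call the category loads once.
            runDetached(pendingCategoryLoads);
            moreCategories = !pendingCategoryLoads.empty();
            // 3. Repeat if either list received new entries.
        } while (!pendingClassLoads.empty() || moreCategories);

        loading = false;
    }

private:
    // Detaches the current list before calling into it, so that entries
    // added during a call land in a fresh list and are picked up later.
    void runDetached(std::vector<LoadFn>& pending) {
        std::vector<LoadFn> detached = std::move(pending);
        pending.clear();
        for (LoadFn& fn : detached) fn(*this);
    }
};

// A class load that queues another class and re-enters, plus one category load.
std::vector<std::string> sampleLoadOrder() {
    LoadQueue queue;
    queue.pendingClassLoads.push_back([](LoadQueue& self) {
        self.log.push_back("+[Base load]");
        self.pendingClassLoads.push_back([](LoadQueue& inner) {
            inner.log.push_back("+[Late load]");
        });
        self.callLoadMethods();  // re-entrant call: returns immediately
    });
    queue.pendingCategoryLoads.push_back([](LoadQueue& self) {
        self.log.push_back("+[Base(Cat) load]");
    });
    queue.callLoadMethods();
    return queue.log;
}
```

In the sample, the class queued during +[Base load] still runs before the category load, and the nested call does nothing, so the order is +[Base load], +[Late load], +[Base(Cat) load].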
  • Entering the implementation of call_class_loads shows that this is where each class's +load method, mentioned earlier, is actually called

    /***********************************************************************
    * call_class_loads
    * Call all pending class +load methods.
    * If new classes become loadable, +load is NOT called for them.
    *
    * Called only by call_load_methods().
    **********************************************************************/
    static void call_class_loads(void)
    {
        int i;
        
        // Detach current loadable list.
        struct loadable_class *classes = loadable_classes;
        int used = loadable_classes_used;
        loadable_classes = nil;
        loadable_classes_allocated = 0;
        loadable_classes_used = 0;
        
        // Call all +loads for the detached list.
        for (i = 0; i < used; i++) {
            Class cls = classes[i].cls;
            load_method_t load_method = (load_method_t)classes[i].method;
            if (!cls) continue;
    
            if (PrintLoading) {
                _objc_inform("LOAD: +[%s load]\n", cls->nameForLogging());
            }
            (*load_method)(cls, @selector(load));
        }
        
        // Destroy the detached list.
        if (classes) free(classes);
    }

So load_images calls all the +load methods, and the source code analysis above matches the printed stack

[Summary] The +load call chain is: _dyld_start --> dyldbootstrap::start --> dyld::_main --> dyld::initializeMainExecutable --> ImageLoader::runInitializers --> ImageLoader::processInitializers --> ImageLoader::recursiveInitialization --> dyld::notifySingle (a callback dispatch) --> sNotifyObjCInit --> load_images (libobjc.A.dylib)

So the question is: when is _objc_init called? Read on

3.2.2.2 doInitialization function

  • Tracing further from _objc_init inside the objc source reaches a dead end. Going back to the implementation of recursiveInitialization, we find that we skipped one function: doInitialization

    void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
    										  InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
    {
    	recursive_lock lock_info(this_thread);
    	recursiveSpinLock(lock_info); // Lock recursively
    
    	if ( fState < dyld_image_state_dependents_initialized-1 ) {
    		uint8_t oldState = fState;
    		// Break cycles to end the recursion
    		fState = dyld_image_state_dependents_initialized-1;
    		try {
    			// initialize lower level libraries first
    			for (unsigned int i=0; i < libraryCount(); ++i) {
    				ImageLoader* dependentImage = libImage(i);
    				if ( dependentImage != NULL ) {
    					// don't try to initialize stuff "above" me yet
    					if ( libIsUpward(i) ) {
    						uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
    						uninitUps.count++;
    					}
    					else if ( dependentImage->fDepth >= fDepth ) {
    						dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
    					}
    				}
    			}
    
    			// record termination order
    			if ( this->needsTermination() )
    				context.terminationRecorder(this);
    
    			// let objc know we are about to initialize this image
    			uint64_t t1 = mach_absolute_time();
    			fState = dyld_image_state_dependents_initialized;
    			oldState = fState;
    			context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
    			
    			// Initialize this image
    			bool hasInitializers = this->doInitialization(context);
    
    			// let anyone know we finished initializing this image
    			fState = dyld_image_state_initialized;
    			oldState = fState;
    			context.notifySingle(dyld_image_state_initialized, this, NULL);
    			
    			if ( hasInitializers ) {
    				uint64_t t2 = mach_absolute_time();
    				timingInfo.addTime(this->getShortName(), t2-t1);
    			}
    		}
    		catch (const char* msg) {
    			// this image is not initialized
    			fState = oldState;
    			recursiveSpinUnLock();
    			throw;
    		}
    	}
    
    	recursiveSpinUnLock(); // Unlock recursively
    }
  • Step into the implementation of doInitialization. It has two parts: the doImageInit function and the doModInitFunctions function

    bool ImageLoaderMachO::doInitialization(const LinkContext& context)
    {
    	CRSetCrashLogMessage2(this->getPath());
    
    	// mach-o has -init and static initializers
    	doImageInit(context);
    	doModInitFunctions(context);
    	
    	CRSetCrashLogMessage2(NULL);
    	
    	return (fHasDashInit || fHasInitializers);
    }
    • Step into the doImageInit implementation. Its core is a for loop over the load commands that calls the image's -init routine (LC_ROUTINES_COMMAND). Note the check that the libSystem initializer must already have run

      void ImageLoaderMachO::doImageInit(const LinkContext& context)
      {
      	if ( fHasDashInit ) {
      		const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
      		const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
      		const struct load_command* cmd = cmds;
      		for (uint32_t i = 0; i < cmd_count; ++i) {
      			switch (cmd->cmd) {
      				case LC_ROUTINES_COMMAND:
      					Initializer func = (Initializer)(((struct macho_routines_command*)cmd)->init_address + fSlide);
      #if __has_feature(ptrauth_calls)
      					func = (Initializer)__builtin_ptrauth_sign_unauthenticated((void*)func, ptrauth_key_asia, 0);
      #endif
      					// <rdar://problem/8543820&9228031> verify initializers are in image
      					if(!this->containsAddress(stripPointer((void*)func)) ) {
      						dyld::throwf("initializer function %p not in mapped image for %s\n", func, this->getPath());
      					}
      					if ( !dyld::gProcessInfo->libSystemInitialized ) {
      						// <rdar://problem/17973316> libSystem initializer must run first (it has the highest priority)
      						dyld::throwf("-init function in image (%s) that does not link with libSystem.dylib\n", this->getPath());
      					}
      					if ( context.verboseInit )
      						dyld::log("dyld: calling -init function %p in %s\n", func, this->getPath());
      					{
      						dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
      						func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
      					}
      					break;
      			}
      			cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
      		}
      	}
      }
    • Step into the doModInitFunctions implementation. This method runs the image's C++ static initializers, found in the S_MOD_INIT_FUNC_POINTERS and S_INIT_FUNC_OFFSETS sections. You can verify this by adding a breakpoint in a C++ function of the test program and inspecting the stack

      void ImageLoaderMachO::doModInitFunctions(const LinkContext& context)
      {
      	if ( fHasInitializers ) {
      		const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
      		const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
      		const struct load_command* cmd = cmds;
      		for (uint32_t i = 0; i < cmd_count; ++i) {
      			if ( cmd->cmd == LC_SEGMENT_COMMAND ) {
      				const struct macho_segment_command* seg = (struct macho_segment_command*)cmd;
      				const struct macho_section* const sectionsStart = (struct macho_section*)((char*)seg + sizeof(struct macho_segment_command));
      				const struct macho_section* const sectionsEnd = &sectionsStart[seg->nsects];
      				for (const struct macho_section* sect=sectionsStart; sect < sectionsEnd; ++sect) {
      					const uint8_t type = sect->flags & SECTION_TYPE;
      					if ( type == S_MOD_INIT_FUNC_POINTERS ) {
      						....
      					}
      					else if ( type == S_INIT_FUNC_OFFSETS ) {
      						....
      					}
      				}
      			}
      			cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
      		}
      	}
      }
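The ordering walked through in the steps above (dependencies first, then the notification to objc, then the image's own initializers) can be sketched with a toy model. This is a minimal illustration rather than dyld's code: ToyImage, toy_recursive_init and the event strings are hypothetical, and the sketch ignores locking, upward links and the depth check.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy image: a name plus the images it depends on.
struct ToyImage {
    std::string name;
    std::vector<ToyImage*> deps;
    bool visited;  // plays the role of the fState check that breaks cycles
};

// Events in the order the toy loader produces them.
static std::vector<std::string> g_events;

static void toy_recursive_init(ToyImage& img) {
    if (img.visited) return;
    img.visited = true;                        // mark first, so dependency cycles terminate
    for (ToyImage* dep : img.deps)             // initialize lower-level libraries first
        toy_recursive_init(*dep);
    g_events.push_back("notify:" + img.name);  // stand-in for notifySingle(dependents_initialized)
    g_events.push_back("init:" + img.name);    // stand-in for doInitialization
}

// Builds app -> libobjc -> libSystem and initializes starting from the app.
static std::vector<std::string> run_init_order_demo() {
    g_events.clear();
    ToyImage libSystem{"libSystem", {}, false};
    ToyImage libobjc{"libobjc", {&libSystem}, false};
    ToyImage app{"app", {&libobjc, &libSystem}, false};
    toy_recursive_init(app);
    return g_events;
}
```

In this model the app's "notify" event comes after every dependency has been initialized and before the app's own "init" event, which mirrors why +load methods run before the same image's C++ initializers.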

At this point we still have not found the call to _objc_init. Rather than give up, we can set a symbolic breakpoint on _objc_init and look at the stack that leads up to it.

  • Add a symbolic breakpoint on _objc_init, run the program, and inspect the stack once the breakpoint is hit

  • Look up libSystem_initializer in the libSystem source (libsystem-1292.60.1) to see its implementation

    libSystem_initializer source implementation

    // libsyscall_initializer() initializes all of libSystem.dylib
    // <rdar://problem/4892197>
    __attribute__((constructor))
    static void
    libSystem_initializer(int argc,
    		      const char* argv[],
    		      const char* envp[],
    		      const char* apple[],
    		      const struct ProgramVars* vars)
    {
    	...
    	_libSystem_ktrace0(ARIADNE_LIFECYCLE_libsystem_init | DBG_FUNC_START);
    
    	__libkernel_init(&libkernel_funcs, envp, apple, vars);
    	_libSystem_ktrace_init_func(KERNEL);
    
    	__libplatform_init(NULL, envp, apple, vars);
    	_libSystem_ktrace_init_func(PLATFORM);
    
    	__pthread_init(&libpthread_funcs, envp, apple, vars);
    	_libSystem_ktrace_init_func(PTHREAD);
    
    	_libc_initializer(&libc_funcs, envp, apple, vars);
    	_libSystem_ktrace_init_func(LIBC);
    
    	// TODO: Move __malloc_init before __libc_init after breaking malloc's upward link to Libc
    	// Note that __malloc_init() will also initialize ASAN when it is present
    	__malloc_init(apple);
    	_libSystem_ktrace_init_func(MALLOC);
    
    #if TARGET_OS_OSX
    	/* <rdar://problem/9664631> */
    	__keymgr_initializer();
    	_libSystem_ktrace_init_func(KEYMGR);
    #endif
    
    	_dyld_initializer(); // dyld initialization
    	_libSystem_ktrace_init_func(DYLD);
    
    	libdispatch_init(); // dispatch initialization
    	_libSystem_ktrace_init_func(LIBDISPATCH);
    
    #if !TARGET_OS_DRIVERKIT
    	_libxpc_initializer();
    	_libSystem_ktrace_init_func(LIBXPC);
    
    #if CURRENT_VARIANT_asan
    	setenv("DT_BYPASS_LEAKS_CHECK", "1", 1);
    #endif
    #endif // !TARGET_OS_DRIVERKIT
    
    	// must be initialized after dispatch
    	_libtrace_init();
    	_libSystem_ktrace_init_func(LIBTRACE);
    
    #if !TARGET_OS_DRIVERKIT
    #if defined(HAVE_SYSTEM_SECINIT)
    	_libsecinit_initializer();
    	_libSystem_ktrace_init_func(SECINIT);
    #endif
    
    #if defined(HAVE_SYSTEM_CONTAINERMANAGER)
    	_container_init(apple);
    	_libSystem_ktrace_init_func(CONTAINERMGR);
    #endif
    
    	__libdarwin_init();
    	_libSystem_ktrace_init_func(DARWIN);
    #endif // !TARGET_OS_DRIVERKIT
    
    	__stack_logging_early_finished(&malloc_funcs);
    	...
    }
  • From the stack trace above, libSystem_initializer calls libdispatch_init. That function lives in the open-source libdispatch library, so search for libdispatch_init in libdispatch-1271.120.2

    DISPATCH_EXPORT DISPATCH_NOTHROW
    void
    libdispatch_init(void)
    {
    	dispatch_assert(sizeof(struct dispatch_apply_s) <=
    			DISPATCH_CONTINUATION_SIZE);
    
    	if (_dispatch_getenv_bool("LIBDISPATCH_STRICT", false)) {
    		_dispatch_mode |= DISPATCH_MODE_STRICT;
    	}
    
    #if DISPATCH_DEBUG || DISPATCH_PROFILE
    #if DISPATCH_USE_KEVENT_WORKQUEUE
    	if (getenv("LIBDISPATCH_DISABLE_KEVENT_WQ")) {
    		_dispatch_kevent_workqueue_enabled = false;
    	}
    #endif
    #endif
    
    #if HAVE_PTHREAD_WORKQUEUE_QOS
    	dispatch_qos_t qos = _dispatch_qos_from_qos_class(qos_class_main());
    	_dispatch_main_q.dq_priority = _dispatch_priority_make(qos, 0);
    #if DISPATCH_DEBUG
    	if (!getenv("LIBDISPATCH_DISABLE_SET_QOS")) {
    		_dispatch_set_qos_class_enabled = 1;
    	}
    #endif
    #endif
    
    #if DISPATCH_USE_THREAD_LOCAL_STORAGE
    	_dispatch_thread_key_create(&__dispatch_tsd_key, _libdispatch_tsd_cleanup);
    #else
    	_dispatch_thread_key_create(&dispatch_priority_key, NULL);
    	_dispatch_thread_key_create(&dispatch_r2k_key, NULL);
    	_dispatch_thread_key_create(&dispatch_queue_key, _dispatch_queue_cleanup);
    	_dispatch_thread_key_create(&dispatch_frame_key, _dispatch_frame_cleanup);
    	_dispatch_thread_key_create(&dispatch_cache_key, _dispatch_cache_cleanup);
    	_dispatch_thread_key_create(&dispatch_context_key, _dispatch_context_cleanup);
    	_dispatch_thread_key_create(&dispatch_pthread_root_queue_observer_hooks_key,
    			NULL);
    	_dispatch_thread_key_create(&dispatch_basepri_key, NULL);
    #if DISPATCH_INTROSPECTION
    	_dispatch_thread_key_create(&dispatch_introspection_key , NULL);
    #elif DISPATCH_PERF_MON
    	_dispatch_thread_key_create(&dispatch_bcounter_key, NULL);
    #endif
    	_dispatch_thread_key_create(&dispatch_wlh_key, _dispatch_wlh_cleanup);
    	_dispatch_thread_key_create(&dispatch_voucher_key, _voucher_thread_cleanup);
    	_dispatch_thread_key_create(&dispatch_deferred_items_key,
    			_dispatch_deferred_items_cleanup);
    #endif
    	pthread_key_create(&_os_workgroup_key, _os_workgroup_tsd_cleanup);
    #if DISPATCH_USE_RESOLVERS // rdar://problem/8541707
    	_dispatch_main_q.do_targetq = _dispatch_get_default_queue(true);
    #endif
    
    	_dispatch_queue_set_current(&_dispatch_main_q);
    	_dispatch_queue_set_bound_thread(&_dispatch_main_q);
    
    #if DISPATCH_USE_PTHREAD_ATFORK
    	(void)dispatch_assume_zero(pthread_atfork(dispatch_atfork_prepare,
    			dispatch_atfork_parent, dispatch_atfork_child));
    #endif
    	_dispatch_hw_config_init();
    	_dispatch_time_init();
    	_dispatch_vtable_init();
    	_os_object_init(); // the key call
    	_voucher_init();
    	_dispatch_introspection_init();
    }
  • Step into the _os_object_init implementation: it calls _objc_init. Combined with the analysis above, this closes the loop: _objc_init calls _dyld_objc_notify_register, whose second argument is load_images; dyld stores that argument in sNotifyObjCInit; and notifySingle later calls sNotifyObjCInit()

    void
    _os_object_init(void)
    {
    	_objc_init(); // the key call
    	Block_callbacks_RR callbacks = {
    		sizeof(Block_callbacks_RR),
    		(void(*) (const void *))&objc_retain,
    		(void(*) (const void *))&objc_release,
    		(void(*) (const void *))&_os_objc_destructInstance
    	};
    	_Block_use_RR2(&callbacks);
    #if DISPATCH_COCOA_COMPAT
    	const char *v = getenv("OBJC_DEBUG_MISSING_POOLS");
    	if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
    	v = getenv("DISPATCH_DEBUG_MISSING_POOLS");
    	if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
    	v = getenv("LIBDISPATCH_DEBUG_MISSING_POOLS");
    	if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
    #endif
    }

A simple way to think about it is as a notification pattern: the call to _dyld_objc_notify_register inside _objc_init registers the observer (like addObserver), notifySingle posts the notification, and sNotifyObjCInit, which holds load_images, is the notification handler (the selector).
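The register-then-notify relationship can be sketched with a plain function pointer. This is a minimal model under stated assumptions, not dyld's implementation: InitCallback, toy_register, toy_notify_single and toy_load_images are hypothetical stand-ins for _dyld_objc_notify_register, notifySingle and load_images, and the real callbacks take different parameters.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical callback type standing in for the init callback objc hands to dyld.
using InitCallback = void (*)(const char* path);

static InitCallback s_notify_init = nullptr;  // plays the role of sNotifyObjCInit
static std::vector<std::string> g_loaded;     // images the handler was called for

// Stand-in for the objc-side handler (load_images).
static void toy_load_images(const char* path) {
    g_loaded.push_back(path);
}

// Stand-in for _dyld_objc_notify_register: the loader only stores the pointer.
static void toy_register(InitCallback init) {
    s_notify_init = init;
}

// Stand-in for notifySingle: calls the handler only if one has been registered.
static void toy_notify_single(const char* path) {
    if (s_notify_init != nullptr) s_notify_init(path);
}

static std::vector<std::string> run_notify_demo() {
    s_notify_init = nullptr;
    g_loaded.clear();
    toy_notify_single("libSystem.B.dylib");  // before registration: nothing is called
    toy_register(&toy_load_images);          // what _objc_init does in this model
    toy_notify_single("MyApp");              // after registration: the handler runs
    return g_loaded;
}
```

The first notification is silently dropped because nothing is registered yet, which is why the registration has to happen inside libSystem's initializer, before any image containing +load methods is notified.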

[Summary] The _objc_init call chain is: _dyld_start –> dyldbootstrap::start –> dyld::_main –> dyld::initializeMainExecutable –> ImageLoader::runInitializers –> ImageLoader::processInitializers –> ImageLoader::recursiveInitialization –> doInitialization –> libSystem_initializer (libSystem.B.dylib) –> libdispatch_init –> _os_object_init (libdispatch.dylib) –> _objc_init (libobjc.A.dylib)

3.2.3 Step 9: Find the main entry function

  • With assembly-level debugging, you can see execution arrive at the +[ViewController load] method

  • Continue, and execution reaches the C++ function ypyFunc

  • Keep stepping over until the whole flow has run. Execution returns to _dyld_start, which then calls main(); the assembly sets up main's arguments. The corresponding dyld source is shown below

    _dyld_start: LC_MAIN case, set up stack for call to main()

    #if __arm64__ && !TARGET_OS_SIMULATOR
    	.text
    	.align 2
    	.globl __dyld_start
    __dyld_start:
    	mov 	x28, sp
    	and     sp, x28, #~15		// force 16-byte alignment of stack
    	mov	x0, #0
    	mov	x1, #0
    	stp	x1, x0, [sp, #-16]!	// make aligned terminating frame
    	mov	fp, sp			// set up fp to point to terminating frame
    	sub	sp, sp, #16             // make room for local variables
    #if __LP64__
    	ldr     x0, [x28]               // get app's mh into x0
    	ldr     x1, [x28, #8]           // get argc into x1 (kernel passes 32-bit int argc as 64-bits on stack to keep alignment)
    	add     x2, x28, #16            // get argv into x2
    #else
    	ldr     w0, [x28]               // get app's mh into x0
    	ldr     w1, [x28, #4]           // get argc into x1 (kernel passes 32-bit int argc as 64-bits on stack to keep alignment)
    	add     w2, w28, #8             // get argv into x2
    #endif
    	adrp	x3,___dso_handle@page
    	add 	x3,x3,___dso_handle@pageoff // get dyld's mh in to x4
    	mov	x4,sp                   // x5 has &startGlue
    
    	// call dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue)
    	bl	__ZN13dyldbootstrap5startEPKN5dyld311MachOLoadedEiPPKcS3_Pm
    	mov	x16,x0                  // save entry point address in x16
    #if __LP64__
    	ldr     x1, [sp]
    #else
    	ldr     w1, [sp]
    #endif
    	cmp	x1, #0
    	b.ne	Lnew
    
    	// LC_UNIXTHREAD way, clean up stack and jump to result
    #if __LP64__
    	add	sp, x28, #8             // restore unaligned stack pointer without app mh
    #else
    	add	sp, x28, #4             // restore unaligned stack pointer without app mh
    #endif
    #if __arm64e__
    	braaz   x16                     // jump to the program's entry point
    #else
    	br      x16                     // jump to the program's entry point
    #endif
    
    	// LC_MAIN case, set up stack for call to main()
    Lnew:	mov	lr, x1		    // simulate return address into _start in libdyld.dylib
    #if __LP64__
    	ldr	x0, [x28, #8]       // main param1 = argc
    	add	x1, x28, #16        // main param2 = argv
    	add	x2, x1, x0, lsl #3
    	add	x2, x2, #8          // main param3 = &env[0]
    	mov	x3, x2
    Lapple:	ldr	x4, [x3]
    	add	x3, x3, #8
    #else
    	ldr	w0, [x28, #4]       // main param1 = argc
    	add	x1, x28, #8         // main param2 = argv
    	add	x2, x1, x0, lsl #2
    	add	x2, x2, #4          // main param3 = &env[0]
    	mov	x3, x2
    Lapple:	ldr	w4, [x3]
    	add	x3, x3, #4
    #endif
    	cmp	x4, #0
    	b.ne	Lapple		    // main param4 = apple
    #if __arm64e__
    	braaz   x16
    #else
    	br      x16
    #endif
    
    #endif // __arm64__ && !TARGET_OS_SIMULATOR

    The part of the dyld assembly source that sets up the call to main
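The pointer arithmetic in the LC_MAIN branch of the listing (the 64-bit case) can be restated in C++. This is a sketch under the assumption that the words at the initial stack pointer are laid out as mach_header, argc, argv entries, a zero, envp entries, a zero, then the apple entries; MainArgs, decode_launch_stack and run_stack_demo are hypothetical names, and the demo uses small tag values instead of real pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Result of decoding the toy launch stack.
struct MainArgs {
    std::uintptr_t argc;
    const std::uintptr_t* argv;
    const std::uintptr_t* envp;
    const std::uintptr_t* apple;
};

// `stack` models the 64-bit words at the initial stack pointer (x28 in the listing):
// [0] = app mach_header, [1] = argc, [2..] = argv, 0, envp..., 0, apple..., 0.
static MainArgs decode_launch_stack(const std::uintptr_t* stack) {
    MainArgs a;
    a.argc = stack[1];                 // ldr x0, [x28, #8]
    a.argv = stack + 2;                // add x1, x28, #16
    a.envp = a.argv + a.argc + 1;      // add x2, x1, x0, lsl #3 ; add x2, x2, #8
    const std::uintptr_t* p = a.envp;  // mov x3, x2
    while (*p++ != 0) {}               // Lapple loop: ends one word past envp's terminator
    a.apple = p;
    return a;
}

static MainArgs run_stack_demo() {
    // Tag values make the layout easy to follow; they are not real addresses.
    static const std::uintptr_t stack[] = {
        0x1000,         // app mach_header
        2,              // argc
        0xA0, 0xA1, 0,  // argv[0], argv[1], terminator
        0xE0, 0,        // envp[0], terminator
        0xF0, 0         // apple[0], terminator
    };
    return decode_launch_stack(stack);
}
```

Skipping argc + 1 words from argv steps over the argv terminator to reach envp, and the loop then walks past the envp terminator to reach the apple array, which is the fourth argument passed to main.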

Note that the entry point must be named main: if the main function is renamed, an error is reported.

To sum up, the final dyld loading flow is shown in the figure below, which also answers the earlier question of why the call order is load –> Cxx –> main.
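The "Cxx before main" half of that ordering can be observed in plain C++. The sketch below is an assumption-laden illustration: it relies on the GCC/Clang constructor attribute (the same attribute libSystem_initializer carries), the names startup_log, Announcer and toy_constructor are hypothetical, and +load cannot be shown because it needs the Objective-C runtime.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A function-local static avoids depending on the construction order of globals.
static std::vector<std::string>& startup_log() {
    static std::vector<std::string> log;
    return log;
}

// A C++ static initializer: the compiler emits an init function for this object,
// and the loader runs it before main.
struct Announcer {
    explicit Announcer(const char* what) { startup_log().push_back(what); }
};
static Announcer g_announcer("static initializer");

// A constructor-attribute function (GCC/Clang extension) also runs before main.
__attribute__((constructor))
static void toy_constructor() {
    startup_log().push_back("constructor function");
}
```

By the time main starts, startup_log() already holds both entries. The relative order of the two entries is left unchecked here, since it depends on how the toolchain orders initializers.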
