First, a preliminary look at application startup
1. Print order
Take a look at this code and think about the order in which the statements are printed:
#import <Foundation/Foundation.h>

@interface Person : NSObject
@end

@implementation Person
+ (void)load {
    printf("----------load-----------: %s\n", __func__);
}
@end

__attribute__((constructor)) void cc_func(void) {
    printf("--------cc_func----------: %s\n", __func__);
}

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // insert code here...
        NSLog(@"Hello, World!");
    }
    return 0;
}
You guessed it, the output order is as follows:
----------load-----------: +[Person load]
--------cc_func----------: cc_func
Dyld[40374:1115383] Hello, World!
That is, the order is:
+load --> constructor function (cc_func) --> main()
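The "constructor function before main()" half of this ordering can be reproduced without Objective-C. The following is a minimal C++ sketch, assuming a GCC or Clang toolchain (the constructor attribute is a compiler extension). The names launchLog, recordConstructor, and recordMain are hypothetical helpers invented for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event log used only for this sketch. A function-local static
// avoids depending on the initialization order of global objects.
std::vector<std::string>& launchLog() {
    static std::vector<std::string> log;
    return log;
}

// Runs while the loader initializes the image, before main() is entered.
__attribute__((constructor)) static void recordConstructor() {
    launchLog().push_back("constructor");
}

// Call this as the first statement of main() to record when main() starts.
void recordMain() {
    launchLog().push_back("main");
}
```

If recordMain() is the first statement of main(), the log holds "constructor" followed by "main", which matches the order of the last two lines of output above.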
2. Before main
Does this surprise you?
Isn't main the entry function? Why isn't main executed first?
A whole series of steps normally runs before main.
The figure above shows the startup process and its stages.
3. Breakpoint in the load method
Break in the load method and print the stack
Output:
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 8.1
  * frame #0: 0x0000000100003e60 Dyld`+[Person load](self=Person, _cmd="load") at main.m:17:5
    frame #1: 0x00007fff203ab4d6 libobjc.A.dylib`load_images + 1556
    frame #2: 0x0000000100016527 dyld`dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*) + 425
    frame #3: 0x000000010002c794 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 474
    frame #4: 0x000000010002a55f dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 191
    frame #5: 0x000000010002a600 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 82
    frame #6: 0x00000001000168b7 dyld`dyld::initializeMainExecutable() + 199
    frame #7: 0x000000010001ceb8 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 8702
    frame #8: 0x0000000100015224 dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 450
    frame #9: 0x0000000100015025 dyld`_dyld_start + 37
4. Startup process (pre-main)
Let's look at what happens before main, just to get a general picture of the process.
5. Startup Process (Main)
The main function and its subsequent stages should be familiar to you
Second, dyld
1. What is dyld?
dyld (the dynamic link editor) is Apple's dynamic linker. It runs in user mode, is part of Darwin, and is maintained by Apple. It is located at /usr/lib/dyld and is responsible for loading dynamic libraries.
2. What dyld does
- Linking and loading the program. After an application is compiled and packaged into a Mach-O executable, dyld is responsible at launch for linking it and loading it into memory.
- Symbol binding. Because almost everything on OS X is dynamically linked, a Mach-O file contains many references to external libraries and symbols. These references have to be filled in at startup, and dyld does that work. This process is known as symbol binding (binding).
3. dyld loading process
- How is dyld loaded?
- How is the program initialized?
In the bt output from the breakpoint, we see that dyld has a _dyld_start function. Looking into it, we find that it is implemented in assembly. Let's take a look.
When any new process starts, the kernel sets the user-mode entry point to __dyld_start.
The specific invocation diagram is as follows:
4. _dyld_start
dyldStartup.s is assembly code; let's look at it briefly:
#if __arm64__ && !TARGET_OS_SIMULATOR
.text
.align 2
.globl __dyld_start
__dyld_start:
mov x28, sp
and sp, x28, #~15 // force 16-byte alignment of stack
mov x0, #0
mov x1, #0
stp x1, x0, [sp, #-16]! // make aligned terminating frame
mov fp, sp // set up fp to point to terminating frame
sub sp, sp, #16 // make room for local variables
#if __LP64__
ldr x0, [x28] // get app's mh into x0
ldr x1, [x28, #8] // get argc into x1 (kernel passes 32-bit int argc as 64-bits on stack to keep alignment)
add x2, x28, #16 // get argv into x2
#else
ldr w0, [x28] // get app's mh into x0
ldr w1, [x28, #4] // get argc into x1 (kernel passes 32-bit int argc as 64-bits on stack to keep alignment)
add w2, w28, #8 // get argv into x2
#endif
adrp x3,___dso_handle@page
add x3,x3,___dso_handle@pageoff // get dyld's mh in to x4
mov x4,sp // x5 has &startGlue
// call dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue)
bl __ZN13dyldbootstrap5startEPKN5dyld311MachOLoadedEiPPKcS3_Pm
mov x16,x0 // save entry point address in x16
From the comments, you can see that dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue) is called, which also appears in the backtrace in the previous section.
5. dyldbootstrap::start
This is the start function in the C++ namespace dyldbootstrap. The code, implemented in dyldInitialization.cpp, is as follows:
namespace dyldbootstrap {
...
//
// This is code to bootstrap dyld. This work in normally done for a program by dyld and crt.
// In dyld we have to do this manually.
//
uintptr_t start(const dyld3::MachOLoaded* appsMachHeader, int argc, const char* argv[],
const dyld3::MachOLoaded* dyldsMachHeader, uintptr_t* startGlue)
{
// Emit kdebug tracepoint to indicate dyld bootstrap has started <rdar://46878536>
dyld3::kdebug_trace_dyld_marker(DBG_DYLD_TIMING_BOOTSTRAP_START, 0, 0, 0, 0);
// if kernel had to slide dyld, we need to fix up load sensitive locations
// we have to do this before using any global variables
rebaseDyld(dyldsMachHeader);
// kernel sets up env pointer to be just past end of agv array
const char** envp = &argv[argc+1];
// kernel sets up apple pointer to be just past end of envp array
const char** apple = envp;
while(*apple != NULL) { ++apple; }
++apple;
// set up random value for stack canary
__guard_setup(apple);
#if DYLD_INITIALIZER_SUPPORT
// run all C++ initializers inside dyld
runDyldInitializers(argc, argv, envp, apple);
#endif
_subsystem_init(apple);
// now that we are done bootstrapping dyld, call dyld's main
uintptr_t appsSlide = appsMachHeader->getSlide();
return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}
}
The function ends by calling dyld::_main(), whose first argument is a macho_header.
This will look familiar if you know the Mach-O structure. dyld's job is to load Mach-O files, so this should give you an idea of what comes next.
What the start function does
- Computes the slide from dyldsMachHeader and determines whether dyld needs to be rebased (in the rebaseDyld function)
- mach_init() initialization (in the rebaseDyld function)
- Stack overflow protection
- Computes the slide of appsMachHeader and calls the dyld::_main function
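The way start() finds envp and apple relies on the kernel laying out argv, envp, and the apple strings back to back, each terminated by a NULL pointer. Below is a minimal sketch of just that pointer arithmetic. LaunchVectors and locateVectors are hypothetical names for this illustration, not dyld APIs.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical result type for this sketch.
struct LaunchVectors {
    const char** envp;
    const char** apple;
};

// Mirrors the arithmetic in dyldbootstrap::start: envp begins one slot past
// argv's NULL terminator, and apple begins one slot past envp's terminator.
LaunchVectors locateVectors(int argc, const char* argv[]) {
    const char** envp = &argv[argc + 1];
    const char** apple = envp;
    while (*apple != nullptr) { ++apple; }   // skip the environment strings
    ++apple;                                 // step over envp's NULL terminator
    return { envp, apple };
}
```

Given an array holding { "app", "-v", NULL, "HOME=/root", NULL, "executable_path=/bin/app", NULL } and argc = 2, envp[0] points at the HOME entry and apple[0] at the executable_path entry.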
Let’s focus on the dyld::_main operation
6. dyld::_main()
Implementation of the dyld::_main function:
//
// Entry point for dyld. The kernel loads dyld and jumps to __dyld_start which
// sets up some registers and call this function.
//
// Returns address of main() in target program which __dyld_start jumps to
//
uintptr_t
_main(const macho_header* mainExecutableMH, uintptr_t mainExecutableSlide,
int argc, const char* argv[], const char* envp[], const char* apple[],
uintptr_t* startGlue)
{
if (dyld3::kdebug_trace_dyld_enabled(DBG_DYLD_TIMING_LAUNCH_EXECUTABLE)) {
launchTraceID = dyld3::kdebug_trace_dyld_duration_start(DBG_DYLD_TIMING_LAUNCH_EXECUTABLE, (uint64_t)mainExecutableMH, 0, 0);
}
//Check and see if there are any kernel flags
dyld3::BootArgs::setFlags(hexToUInt64(_simple_getenv(apple, "dyld_flags"), nullptr));
#if __has_feature(ptrauth_calls)
// Check and see if kernel disabled JOP pointer signing (which lets us load plain arm64 binaries)
if ( const char* disableStr = _simple_getenv(apple, "ptrauth_disabled") ) {
if ( strcmp(disableStr, "1") == 0 )
sKeysDisabled = true;
}
else {
// needed until kernel passes ptrauth_disabled for arm64 main executables
if ( (mainExecutableMH->cpusubtype == CPU_SUBTYPE_ARM64_V8) || (mainExecutableMH->cpusubtype == CPU_SUBTYPE_ARM64_ALL) )
sKeysDisabled = true;
}
#endif
// Grab the cdHash of the main executable from the environment
uint8_t mainExecutableCDHashBuffer[20];
const uint8_t* mainExecutableCDHash = nullptr;
if ( const char* mainExeCdHashStr = _simple_getenv(apple, "executable_cdhash") ) {
unsigned bufferLenUsed;
if ( hexStringToBytes(mainExeCdHashStr, mainExecutableCDHashBuffer, sizeof(mainExecutableCDHashBuffer), bufferLenUsed) )
mainExecutableCDHash = mainExecutableCDHashBuffer;
}
getHostInfo(mainExecutableMH, mainExecutableSlide);
#if !TARGET_OS_SIMULATOR
// Trace dyld's load
notifyKernelAboutImage((macho_header*)&__dso_handle, _simple_getenv(apple, "dyld_file"));
// Trace the main executable's load
notifyKernelAboutImage(mainExecutableMH, _simple_getenv(apple, "executable_file"));
#endif
uintptr_t result = 0;
sMainExecutableMachHeader = mainExecutableMH;
sMainExecutableSlide = mainExecutableSlide;
...
return result;
}
The code is quite long, so let's set aside everything outside the main flow and analyze the main flow:
- Environment variable configuration
  - Set the corresponding values according to the environment variables and get the current architecture
- Shared cache
  - Check whether the shared cache is enabled and whether it is mapped into the shared region
- Main program initialization
  - Call the instantiateFromLoadedImage function to instantiate an ImageLoader object
- Insert dynamic libraries
  - Traverse the DYLD_INSERT_LIBRARIES environment variable and call loadInsertedDylib to load each entry
- Link the main program
- Link the dynamic libraries
- Weak symbol binding
- Run the initializers
- Find the main program entry point, i.e. the main function
The illustration is as follows:
1). dyld environment variables
- Get the cdHash of the main executable from the environment variables
- Get the platform, architecture, and other information from the Mach-O header
- Check and set the environment variables: checkEnvironmentVariables(envp)
- Set default values when DYLD_FALLBACK is empty: defaultUninitializedFallbackPaths(envp)
The relevant code
// Line: 6366
// Grab the cdHash of the main executable from the environment
// Get the cdHash of the main executable from the environment
uint8_t mainExecutableCDHashBuffer[20];
const uint8_t* mainExecutableCDHash = nullptr;
if ( const char* mainExeCdHashStr = _simple_getenv(apple, "executable_cdhash") ) {
unsigned bufferLenUsed;
if ( hexStringToBytes(mainExeCdHashStr, mainExecutableCDHashBuffer, sizeof(mainExecutableCDHashBuffer), bufferLenUsed) )
mainExecutableCDHash = mainExecutableCDHashBuffer;
}
// Get the current runtime architecture information from the Mach-o header
getHostInfo(mainExecutableMH, mainExecutableSlide);
// Line: 6453
CRSetCrashLogMessage("dyld: launch started");
// Set the context according to the executable header, parameters, etc
setContext(mainExecutableMH, argc, argv, envp, apple);
// Pickup the pointer to the exec path.
// Get the executable file path
sExecPath = _simple_getenv(apple, "executable_path");
// Line: 6535
{
checkEnvironmentVariables(envp); // Check setting environment variables
defaultUninitializedFallbackPaths(envp); // Set the default value when DYLD_FALLBACK is null
}
These environment variables can be set in the Xcode scheme. They are declared in the dyld2.cpp file:
dyld environment variables
struct EnvironmentVariables {
const char* const * DYLD_FRAMEWORK_PATH;
const char* const * DYLD_FALLBACK_FRAMEWORK_PATH;
const char* const * DYLD_LIBRARY_PATH;
const char* const * DYLD_FALLBACK_LIBRARY_PATH;
const char* const * DYLD_INSERT_LIBRARIES;
const char* const * LD_LIBRARY_PATH; // for unix conformance
const char* const * DYLD_VERSIONED_LIBRARY_PATH;
const char* const * DYLD_VERSIONED_FRAMEWORK_PATH;
bool DYLD_PRINT_LIBRARIES_POST_LAUNCH;
bool DYLD_BIND_AT_LAUNCH;
bool DYLD_PRINT_STATISTICS;
bool DYLD_PRINT_STATISTICS_DETAILS;
bool DYLD_PRINT_OPTS;
bool DYLD_PRINT_ENV;
bool DYLD_DISABLE_DOFS;
bool hasOverride;
...
};
Examples:
- DYLD_PRINT_OPTS = YES
- DYLD_PRINT_ENV = YES: prints all environment variables
- OBJC_PRINT_LOAD_METHODS: prints calls to the + (void)load methods of classes and categories
- OBJC_PRINT_INITIALIZE_METHODS: prints calls to the + (void)initialize methods of classes
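To make the envp handling concrete, here is a minimal sketch of scanning an envp-style array for DYLD_ variables. It only illustrates the KEY=VALUE scan; dyldVariableNames is a hypothetical helper, and dyld's own checkEnvironmentVariables does considerably more than this.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Collects the names of all DYLD_* variables found in an envp-style array
// (a NULL-terminated list of "KEY=VALUE" strings).
std::vector<std::string> dyldVariableNames(const char* const envp[]) {
    std::vector<std::string> names;
    for (const char* const* p = envp; *p != nullptr; ++p) {
        const char* entry = *p;
        const char* equals = std::strchr(entry, '=');
        if (equals == nullptr) continue;                     // not KEY=VALUE
        if (std::strncmp(entry, "DYLD_", 5) != 0) continue;  // not a dyld variable
        names.emplace_back(entry, equals - entry);           // keep the key only
    }
    return names;
}
```

For an environment containing HOME, DYLD_PRINT_ENV, and DYLD_PRINT_OPTS, the function returns the two DYLD_ names in the order they appear.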
2). Shared cache
An app may use many system dynamic libraries, such as UIKit and Foundation. Loading each of them on demand after the app starts would be time-consuming, so the system merges the dynamic libraries used on iOS into one large cache file ahead of time and places it in the directory /System/Library/Caches/com.apple.dyld/. This improves app launch performance, and it is the purpose of the dynamic library shared cache.
Dynamic libraries can be extracted from the shared cache with launch-cache/dsc_extractor.cpp from the dyld source code:
- Delete the #if 0 and #endif lines in the code
- Compile dsc_extractor.cpp
clang++ -o dsc_extractor dsc_extractor.cpp
- Run dsc_extractor
./dsc_extractor <path to the shared cache file> <folder to store the results>
The code involved in the shared cache:
- checkSharedRegionDisable: checks whether the shared cache is disabled (the shared cache is mandatory on iOS)
- mapSharedCache: loads the shared cache
  - Load into the current process only: mapCachePrivate (the simulator only supports loading into the current process)
  - The shared cache is being loaded for the first time: mapCacheSystemWide
  - The shared cache has already been loaded: nothing more is done
mapSharedCache --> loadDyldCache --> mapCachePrivate
                                 └-> mapCacheSystemWide
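The three cases above boil down to a small decision. The sketch below restates it as a pure function so the branches are easy to see. CacheMapping and chooseCacheMapping are hypothetical names, and the three booleans stand in for TARGET_OS_SIMULATOR, options.forcePrivate, and the result of reuseExistingCache.

```cpp
#include <cassert>

// Hypothetical labels for the three outcomes of loadDyldCache.
enum class CacheMapping { Private, ReuseExisting, SystemWide };

CacheMapping chooseCacheMapping(bool simulator, bool forcePrivate, bool alreadyMapped) {
    // The simulator, or an explicit request, maps the cache into this process only.
    if (simulator || forcePrivate) return CacheMapping::Private;
    // Fast path: some earlier process already mapped the cache into the shared region.
    if (alreadyMapped) return CacheMapping::ReuseExisting;
    // Slow path: this is the first process to load the cache.
    return CacheMapping::SystemWide;
}
```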
The relevant code
// Line: 6584
// load shared cache
// Check whether the shared cache is disabled. iOS requires the shared cache
checkSharedRegionDisable((dyld3::MachOLoaded*)mainExecutableMH, mainExecutableSlide);
if ( gLinkContext.sharedRegionMode != ImageLoader::kDontUseSharedRegion ) {
#if TARGET_OS_SIMULATOR
if ( sSharedCacheOverrideDir)
mapSharedCache(mainExecutableSlide);
#else
// Check whether the shared cache maps to the shared region
mapSharedCache(mainExecutableSlide);
#endif
}
// Line: 4078
static void mapSharedCache(uintptr_t mainExecutableSlide)
{
dyld3::SharedCacheOptions opts;
opts.cacheDirOverride = sSharedCacheOverrideDir;
opts.forcePrivate = (gLinkContext.sharedRegionMode == ImageLoader::kUsePrivateSharedRegion);
#if __x86_64__ && !TARGET_OS_SIMULATOR
opts.useHaswell = sHaswell;
#else
opts.useHaswell = false;
#endif
opts.verbose = gLinkContext.verboseMapping;
// <rdar://problem/32031197> respect -disable_aslr boot-arg
// <rdar://problem/56299169> kern.bootargs is now blocked
opts.disableASLR = (mainExecutableSlide == 0) && dyld3::internalInstall(); // infer ASLR is off if main executable is not slid
loadDyldCache(opts, &sSharedCacheLoadInfo);
// update global state
if ( sSharedCacheLoadInfo.loadAddress != nullptr ) {
gLinkContext.dyldCache = sSharedCacheLoadInfo.loadAddress;
dyld::gProcessInfo->processDetachedFromSharedRegion = opts.forcePrivate;
dyld::gProcessInfo->sharedCacheSlide = sSharedCacheLoadInfo.slide;
dyld::gProcessInfo->sharedCacheBaseAddress = (unsigned long)sSharedCacheLoadInfo.loadAddress;
sSharedCacheLoadInfo.loadAddress->getUUID(dyld::gProcessInfo->sharedCacheUUID);
dyld3::kdebug_trace_dyld_image(DBG_DYLD_UUID_SHARED_CACHE_A, sSharedCacheLoadInfo.path, (const uuid_t *)&dyld::gProcessInfo->sharedCacheUUID[0], {0,0}, {{ 0, 0 }}, (const mach_header *)sSharedCacheLoadInfo.loadAddress);
}
}
// Line: 858
bool loadDyldCache(const SharedCacheOptions& options, SharedCacheLoadInfo* results)
{
results->loadAddress = 0;
results->slide = 0;
results->errorMessage = nullptr;
#if TARGET_OS_SIMULATOR
// simulator only supports mmap()ing cache privately into process
return mapCachePrivate(options, results);
#else
if ( options.forcePrivate ) {
// mmap cache into this process only Loads the current process
return mapCachePrivate(options, results);
}
else {
// fast path: when cache is already mapped into shared region
bool hasError = false;
if ( reuseExistingCache(options, results) ) {
hasError = (results->errorMessage != nullptr); // Already loaded
} else {
// slow path: this is first process to load cache
hasError = mapCacheSystemWide(options, results); // First load
}
return hasError;
}
#endif
}
3). Main program initialization
- Obtain an ImageLoader through instantiateFromLoadedImage
  - ImageLoaderMachO::instantiateMainExecutable
  - createImageLoader (main program)
- The sniffLoadCommands function reads the load commands of the Mach-O file and performs various checks on them
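To give a feel for what "reading the load commands" means, here is a much simplified sketch of walking a load command list. It assumes the commands are already in a byte buffer, and it handles only three command types. CommandHeader, SniffResult, sniff, and appendCommand are hypothetical names for this illustration; the numeric constants are the values defined in <mach-o/loader.h>. The real sniffLoadCommands performs many more checks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Load command values as defined in <mach-o/loader.h>.
constexpr uint32_t kLC_REQ_DYLD       = 0x80000000u;
constexpr uint32_t kLC_LOAD_DYLIB     = 0xCu;
constexpr uint32_t kLC_SEGMENT_64     = 0x19u;
constexpr uint32_t kLC_DYLD_INFO      = 0x22u;
constexpr uint32_t kLC_DYLD_INFO_ONLY = 0x22u | kLC_REQ_DYLD;

// Every load command starts with these two fields.
struct CommandHeader { uint32_t cmd; uint32_t cmdsize; };

struct SniffResult {
    bool compressed = false;   // true when LC_DYLD_INFO(_ONLY) is present
    unsigned segCount = 0;
    unsigned libCount = 0;
};

SniffResult sniff(const uint8_t* commands, uint32_t ncmds, size_t totalSize) {
    SniffResult r;
    size_t offset = 0;
    for (uint32_t i = 0; i < ncmds; ++i) {
        if (offset + sizeof(CommandHeader) > totalSize) throw "load command past end of buffer";
        CommandHeader h;
        std::memcpy(&h, commands + offset, sizeof h);
        if (h.cmdsize < sizeof(CommandHeader) || offset + h.cmdsize > totalSize)
            throw "malformed load command size";
        switch (h.cmd) {
            case kLC_DYLD_INFO:
            case kLC_DYLD_INFO_ONLY: r.compressed = true; break;
            case kLC_SEGMENT_64:     ++r.segCount;        break;
            case kLC_LOAD_DYLIB:     ++r.libCount;        break;
            default: break;    // other commands are ignored in this sketch
        }
        offset += h.cmdsize;   // cmdsize covers the header plus its payload
    }
    return r;
}

// Test helper (hypothetical): appends a zero-filled command of the given size.
void appendCommand(std::vector<uint8_t>& buf, uint32_t cmd, uint32_t cmdsize) {
    CommandHeader h{cmd, cmdsize};
    size_t start = buf.size();
    buf.resize(start + cmdsize, 0);
    std::memcpy(buf.data() + start, &h, sizeof h);
}
```

The compressed flag corresponds to the branch in instantiateMainExecutable that picks ImageLoaderMachOCompressed over ImageLoaderMachOClassic.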
The relevant code
// Line: 6860
CRSetCrashLogMessage(sLoadingCrashMessage);
// instantiate ImageLoader for main executable
// Load the executable to generate an ImageLoader instance
sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
gLinkContext.mainExecutable = sMainExecutable;
gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);
// Line: 3092
// The kernel maps in main executable before dyld gets control. We need to
// make an ImageLoader* for the already mapped in main executable.
static ImageLoaderMachO* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
{
// try mach-o loader
// if ( isCompatibleMachO((const uint8_t*)mh, path) ) {
ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
addImage(image);
return (ImageLoaderMachO*)image;
// }
// throw "main executable not a known format";
}
// ImageLoaderMachO.cpp Line: 566
// create image for main executable
ImageLoader* ImageLoaderMachO::instantiateMainExecutable(const macho_header* mh, uintptr_t slide, const char* path, const LinkContext& context)
{
//dyld::log("ImageLoader=%ld, ImageLoaderMachO=%ld, ImageLoaderMachOClassic=%ld, ImageLoaderMachOCompressed=%ld\n",
// sizeof(ImageLoader), sizeof(ImageLoaderMachO), sizeof(ImageLoaderMachOClassic), sizeof(ImageLoaderMachOCompressed));
bool compressed;
unsigned int segCount;
unsigned int libCount;
const linkedit_data_command* codeSigCmd;
const encryption_info_command* encryptCmd;
sniffLoadCommands(mh, path, false, &compressed, &segCount, &libCount, context, &codeSigCmd, &encryptCmd);
// instantiate concrete class based on content of load commands
if ( compressed )
return ImageLoaderMachOCompressed::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
else
#if SUPPORT_CLASSIC_MACHO
return ImageLoaderMachOClassic::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
#else
throw "missing LC_DYLD_INFO load command";
#endif
}
4). Insert dynamic libraries
This step traverses DYLD_INSERT_LIBRARIES and calls loadInsertedDylib to load each library in it. Because this mechanism injects code into the process, it is relevant to security attack and defense work. Internally, loadInsertedDylib looks for the dylib along paths such as DYLD_ROOT_PATH, LD_LIBRARY_PATH, and DYLD_FRAMEWORK_PATH, and checks its signature.
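DYLD_INSERT_LIBRARIES is a colon-separated list of library paths, which is why the code below iterates over an array of entries. Here is a minimal sketch of that split, assuming the raw value is available as a string. splitInsertList is a hypothetical helper, not the function dyld itself uses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits a colon-separated DYLD_INSERT_LIBRARIES value into individual paths,
// skipping empty entries such as those produced by "a::b" or a trailing colon.
std::vector<std::string> splitInsertList(const std::string& value) {
    std::vector<std::string> paths;
    size_t start = 0;
    while (start <= value.size()) {
        size_t end = value.find(':', start);
        if (end == std::string::npos) end = value.size();
        if (end > start) paths.push_back(value.substr(start, end - start));
        start = end + 1;
    }
    return paths;
}
```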
The relevant code
// Line: 6974
// load any inserted libraries
// Load all the libraries specified by DYLD_INSERT_LIBRARIES
if ( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib)
loadInsertedDylib(*lib);
}
// record count of inserted libraries so that a flat search will look at
// inserted libraries, then main, then others.
sInsertedDylibCount = sAllImages.size()-1;
// Line: 5176
static void loadInsertedDylib(const char* path)
{
unsigned cacheIndex;
try {
LoadContext context;
context.useSearchPaths = false;
context.useFallbackPaths = false;
context.useLdLibraryPath = false;
context.implicitRPath = false;
context.matchByInstallName = false;
context.dontLoad = false;
context.mustBeBundle = false;
context.mustBeDylib = true;
context.canBePIE = false;
context.origin = NULL; // can't use @loader_path with DYLD_INSERT_LIBRARIES
context.rpath = NULL;
load(path, context, cacheIndex);
}
catch (const char* msg) {
if ( gLinkContext.allowInsertFailures )
dyld::log("dyld: warning: could not load inserted library '%s' into hardened process because %s\n", path, msg);
else
halt(dyld::mkstringf("could not load inserted library '%s' because %s\n", path, msg));
}
catch (...) {
halt(dyld::mkstringf("could not load inserted library '%s'\n", path));
}
}
5). Link the main program
The relevant code
// Line: 6982
// link main executable
// Link the main program
gLinkContext.linkingMainExecutable = true;
#if SUPPORT_ACCELERATE_TABLES
if ( mainExcutableAlreadyRebased ) {
// previous link() on main executable has already adjusted its internal pointers for ASLR
// work around that by rebasing by inverse amount
sMainExecutable->rebase(gLinkContext, -mainExecutableSlide);
}
#endif
link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
sMainExecutable->setNeverUnloadRecursive();
if ( sMainExecutable->forceFlat() ) {
gLinkContext.bindFlat = true;
gLinkContext.prebindUsage = ImageLoader::kUseNoPrebinding;
}
6). Link dynamic library
The relevant code
// Line: 6999
// link any inserted libraries
// do this after linking main executable so that any dylibs pulled in by inserted
// dylibs (e.g. libSystem) will not be in front of dylibs the program uses
// Link all inserted dynamic libraries
if ( sInsertedDylibCount > 0 ) {
for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
ImageLoader* image = sAllImages[i+1];
link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
image->setNeverUnloadRecursive();
}
if ( gLinkContext.allowInterposing ) {
// only INSERTED libraries can interpose
// register interposing info after all inserted libraries are bound so chaining works
for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
ImageLoader* image = sAllImages[i+1];
image->registerInterposing(gLinkContext); // Register symbol interposing
}
}
}
7). Weak symbol binding
The relevant code
// Line: 7060
// apply interposing to initial set of images
for(int i=0; i < sImageRoots.size(); ++i) {
// Apply symbol interposing
sImageRoots[i]->applyInterposing(gLinkContext);
}
ImageLoader::applyInterposingToDyldCache(gLinkContext);
// Bind and notify for the main executable now that interposing has been registered
uint64_t bindMainExecutableStartTime = mach_absolute_time();
sMainExecutable->recursiveBindWithAccounting(gLinkContext, sEnv.DYLD_BIND_AT_LAUNCH, true);
uint64_t bindMainExecutableEndTime = mach_absolute_time();
ImageLoaderMachO::fgTotalBindTime += bindMainExecutableEndTime - bindMainExecutableStartTime;
gLinkContext.notifyBatch(dyld_image_state_bound, false);
// Bind and notify for the inserted images now interposing has been registered
if ( sInsertedDylibCount > 0 ) {
for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
ImageLoader* image = sAllImages[i+1];
image->recursiveBind(gLinkContext, sEnv.DYLD_BIND_AT_LAUNCH, true, nullptr);
}
}
// <rdar://problem/12186933> do weak binding only after all inserted images linked
// Weak symbol binding
sMainExecutable->weakBind(gLinkContext);
gLinkContext.linkingMainExecutable = false;
sMainExecutable->recursiveMakeDataReadOnly(gLinkContext);
8). Execute the initialization method
The relevant code
// Line: 7087
CRSetCrashLogMessage("dyld: launch, running initializers");
#if SUPPORT_OLD_CRT_INITIALIZATION
// Old way is to run initializers via a callback from crt1.o
if ( !gRunInitializersOldWay )
initializeMainExecutable();
#else
// run all initializers
// Perform initialization
initializeMainExecutable();
#endif
// Line: 1636
void initializeMainExecutable()
{
// record that we've reached this step
gLinkContext.startedInitializingMainExecutable = true;
// run initialzers for any inserted dylibs
ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
initializerTimes[0].count = 0;
const size_t rootCount = sImageRoots.size();
if ( rootCount > 1 ) {
for(size_t i=1; i < rootCount; ++i) {
sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
}
}
// run initializers for main executable and everything it brings up
sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
// register cxa_atexit() handler to run static terminators in all loaded images when this process exits
if ( gLibSystemHelpers != NULL )
(*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);
// dump info if requested
if ( sEnv.DYLD_PRINT_STATISTICS )
ImageLoader::printStatistics((unsigned int)allImagesCount(), initializerTimes[0]);
if ( sEnv.DYLD_PRINT_STATISTICS_DETAILS )
ImageLoaderMachO::printStatisticsDetails((unsigned int)allImagesCount(), initializerTimes[0]);
}
// ImageLoader.cpp Line: 609
void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
uint64_t t1 = mach_absolute_time();
mach_port_t thisThread = mach_thread_self();
ImageLoader::UninitedUpwards up;
up.count = 1;
up.imagesAndPaths[0] = { this, this->getPath() };
processInitializers(context, thisThread, timingInfo, up);
context.notifyBatch(dyld_image_state_initialized, false);
mach_port_deallocate(mach_task_self(), thisThread);
uint64_t t2 = mach_absolute_time();
fgTotalInitTime += (t2 - t1);
}
// ImageLoader.cpp Line: 587
// <rdar://problem/14412057> upward dylib initializers can be run too soon
// To handle dangling dylibs which are upward linked but not downward, all upward linked dylibs
// have their initialization postponed until after the recursion through downward dylibs
// has completed.
void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
uint32_t maxImageCount = context.imageCount()+2;
ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
ImageLoader::UninitedUpwards& ups = upsBuffer[0];
ups.count = 0;
// Calling recursive init on all images in images list, building a new list of
// uninitialized upward dependencies.
for (uintptr_t i=0; i < images.count; ++i) {
images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
}
// If any upward dependencies remain, init them.
if ( ups.count > 0 )
processInitializers(context, thisThread, timingInfo, ups);
}
// ImageLoader.cpp Line: 1595
// Get the image initialization
void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
recursive_lock lock_info(this_thread);
recursiveSpinLock(lock_info);
if ( fState < dyld_image_state_dependents_initialized-1 ) {
uint8_t oldState = fState;
// break cycles
fState = dyld_image_state_dependents_initialized-1;
try {
// initialize lower level libraries first
for(unsigned int i=0; i < libraryCount(); ++i) {
ImageLoader* dependentImage = libImage(i);
if ( dependentImage != NULL ) {
// don't try to initialize stuff "above" me yet
if ( libIsUpward(i) ) {
uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
uninitUps.count++;
}
else if ( dependentImage->fDepth >= fDepth ) {
dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
}
}
}
// record termination order
if ( this->needsTermination() )
context.terminationRecorder(this);
// let objc know we are about to initialize this image
uint64_t t1 = mach_absolute_time();
fState = dyld_image_state_dependents_initialized;
oldState = fState;
context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
// initialize this image
bool hasInitializers = this->doInitialization(context);
// let anyone know we finished initializing this image
fState = dyld_image_state_initialized;
oldState = fState;
context.notifySingle(dyld_image_state_initialized, this, NULL);
if ( hasInitializers ) {
uint64_t t2 = mach_absolute_time();
timingInfo.addTime(this->getShortName(), t2-t1);
}
}
catch (const char* msg) {
// this image is not initialized
fState = oldState;
recursiveSpinUnLock();
throw;
}
}
recursiveSpinUnLock();
}
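The core idea of recursiveInitialization is "initialize what I depend on first, then myself, and mark myself early so that cycles terminate". Below is a minimal sketch of that ordering on a toy dependency graph. Image and recursiveInit are hypothetical stand-ins; locking, upward links, and state notifications are left out.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for ImageLoader.
struct Image {
    std::string name;
    std::vector<Image*> dependents;   // images this one links against
    bool started = false;             // set before recursing, to break cycles
    explicit Image(std::string n) : name(std::move(n)) {}
};

// Depth-first, dependencies-first initialization order.
void recursiveInit(Image& image, std::vector<std::string>& order) {
    if (image.started) return;        // already handled, or part of a cycle
    image.started = true;
    for (Image* dep : image.dependents)
        recursiveInit(*dep, order);   // initialize lower level libraries first
    order.push_back(image.name);      // stands in for running this image's initializers
}
```

With an app that depends on libobjc and libSystem, and libobjc depending on libSystem, the resulting order is libSystem, libobjc, app.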
The notifySingle function
The relevant code
// dyld2.cpp Line: 985
static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
{
...
if ( state == dyld_image_state_mapped ) {
// <rdar://problem/7008875> Save load addr + UUID for images from outside the shared cache
// <rdar://problem/50432671> Include UUIDs for shared cache dylibs in all image info when using private mapped shared caches
if ( !image->inSharedCache()
|| (gLinkContext.sharedRegionMode == ImageLoader::kUsePrivateSharedRegion)) {
dyld_uuid_info info;
if ( image->getUUID(info.imageUUID) ) {
info.imageLoadAddress = image->machHeader();
addNonSharedCacheImageUUID(info);
}
}
}
if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
uint64_t t0 = mach_absolute_time();
dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
// Pay attention to this line
(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
uint64_t t1 = mach_absolute_time();
uint64_t t2 = mach_absolute_time();
uint64_t timeInObjC = t1-t0;
uint64_t emptyTime = (t2-t1)*100;
if ( (timeInObjC > emptyTime) && (timingInfo != NULL) ) {
timingInfo->addTime(image->getShortName(), timeInObjC);
}
}
...
}
// Line: 4643
void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
// record functions to call
sNotifyObjCMapped = mapped;
sNotifyObjCInit = init; // Assign
sNotifyObjCUnmapped = unmapped;
// call 'mapped' function with all images mapped so far
try {
notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
}
catch (const char* msg) {
// ignore request to abort during registration
}
// <rdar://problem/32209809> call 'init' function on all images already init'ed (below libSystem)
for (std::vector<ImageLoader*>::iterator it=sAllImages.begin(); it != sAllImages.end(); it++) {
ImageLoader* image = *it;
if ( (image->getState() == dyld_image_state_initialized) && image->notifyObjC() ) {
dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
}
}
}
// dyldAPIs.cpp line: 2188
// This function is only for use by the objc runtime
// The call to _dyld_objc_notify_register has to be looked up in the libobjc source code
void _dyld_objc_notify_register(_dyld_objc_notify_mapped mapped,
_dyld_objc_notify_init init,
_dyld_objc_notify_unmapped unmapped)
{
dyld::registerObjCNotifiers(mapped, init, unmapped); // it is called here
}
A search for _dyld_objc_notify_register in the objc4 source code shows that it is called in _objc_init, with arguments.
So sNotifyObjCInit is assigned load_images from objc, and load_images calls all the +load methods. notifySingle is where that callback gets invoked.
The call chain of the initialization process is fairly long. We will focus on it in the next section.
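The register-then-notify relationship between dyld and objc can be reduced to a stored function pointer. Here is a minimal sketch of that pattern. All names are hypothetical stand-ins: sNotifyInit plays the role of sNotifyObjCInit, and loadImagesStub plays the role of load_images.

```cpp
#include <cassert>
#include <string>
#include <vector>

using InitCallback = void (*)(const char* path);

// Plays the role of sNotifyObjCInit: empty until someone registers.
static InitCallback sNotifyInit = nullptr;

// Plays the role of registerObjCNotifiers: just records the function to call.
void registerNotifier(InitCallback init) { sNotifyInit = init; }

// Plays the role of notifySingle: invokes the callback only if one is registered.
void notifySingle(const char* path) {
    if (sNotifyInit != nullptr)
        (*sNotifyInit)(path);
}

// Stand-in for load_images: records which images it was told about.
std::vector<std::string>& loadedImages() {
    static std::vector<std::string> images;
    return images;
}
void loadImagesStub(const char* path) { loadedImages().push_back(path); }
```

Calling notifySingle before registration does nothing. After registerNotifier(&loadImagesStub), each notifySingle call reaches the stub. This mirrors why +load only runs once _objc_init has registered load_images.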
9). Main program entry
The code for the program entry in dyld2.cpp is as follows:
// Line: 7104
#if TARGET_OS_OSX
if ( gLinkContext.driverKit ) {
result = (uintptr_t)sEntryOverride;
if ( result == 0 )
halt("no entry point registered");
*startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
}
else
#endif
{
// find entry point for main executable
result = (uintptr_t)sMainExecutable->getEntryFromLC_MAIN();
if ( result != 0 ) {
// main executable uses LC_MAIN, we need to use helper in libdyld to call into main()
if ( (gLibSystemHelpers != NULL) && (gLibSystemHelpers->version >= 9) )
*startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
else
halt("libdyld.dylib support not present for LC_MAIN");
}
else {
// main executable uses LC_UNIXTHREAD, dyld needs to let "start" in program set up for main()
result = (uintptr_t)sMainExecutable->getEntryFromLC_UNIXTHREAD(a); *startGlue =0; }}Copy the code
How can we prove that +load runs first, then the constructor function, and only then main()?
The easiest way to do this is with breakpoints.
The first breakpoint hit is the one in the +load method.
The current backtrace:
thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 8.1
  * frame #0: 0x0000000100003e60 Dyld`+[Person load](self=Person, _cmd="load") at main.m:17:5
    frame #1: 0x00007fff203ab4d6 libobjc.A.dylib`load_images + 1556
    frame #2: 0x0000000100016527 dyld`dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*) + 425
    frame #3: 0x000000010002c794 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 474
    frame #4: 0x000000010002a55f dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 191
    frame #5: 0x000000010002a600 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 82
    frame #6: 0x00000001000168b7 dyld`dyld::initializeMainExecutable() + 199
    frame #7: 0x000000010001ceb8 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 8702
    frame #8: 0x0000000100015224 dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 450
    frame #9: 0x0000000100015025 dyld`_dyld_start + 37
Next, the breakpoint in __attribute__((constructor)) void cc_func() is hit.
And finally, the one in main().
The current backtrace:
thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 11.1
  * frame #0: 0x0000000100003eb6 Dyld`main(argc=1, argv=0x00007ffeefbff3e8) at main.m:27:22
    frame #1: 0x00007fff20528f3d libdyld.dylib`start + 1
    frame #2: 0x00007fff20528f3d libdyld.dylib`start + 1
As you can see, once the first two steps have run, control returns to _dyld_start, which then calls main().
Third, the initialization process
We now have a clear idea of how the App is loaded, but is this the whole process?
Of course not. Now that we have started digging, let's take a look at how the App loads and initializes.
1. Review
Reuse the earlier breakpoint approach and explore step by step.
Set a breakpoint in the +load method and run bt:
(lldb) bt
thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 8.1
    frame #0: 0x0000000100003e60 Dyld`+[Person load](self=Person, _cmd="load") at main.m:17:5
    frame #1: 0x00007fff203ab4d6 libobjc.A.dylib`load_images + 1556
    frame #2: 0x0000000100016527 dyld`dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*) + 425
    frame #3: 0x000000010002c794 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 474
    frame #4: 0x000000010002a55f dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 191
    frame #5: 0x000000010002a600 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 82
  * frame #6: 0x00000001000168b7 dyld`dyld::initializeMainExecutable() + 199
    frame #7: 0x000000010001ceb8 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 8702
    frame #8: 0x0000000100015224 dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 450
    frame #9: 0x0000000100015025 dyld`_dyld_start + 37
2. dyld start
Following the stack information, we start tracing from _dyld_start.
As you can see from the assembly, dyldbootstrap::start comes next, and it calls dyld::_main.
After that, initializeMainExecutable() is called.
Next, ImageLoader::runInitializers is called.
ImageLoader::processInitializers is then called,
which internally calls ImageLoader::recursiveInitialization.
Then you see notifySingle.
The callback it invokes is registered in dyld::registerObjCNotifiers.
One level up, registerObjCNotifiers is called from _dyld_objc_notify_register.
At this point the trail goes cold. There is no more information.
Looking back, we skipped one method here: ImageLoader::doInitialization.
How does it work internally?
doInitialization implementation
3. libSystem init
Next, you just have to look at the calls inside libSystem.
As you can see, it calls the libdispatch initializer internally.
4. dispatch init
libdispatch_init()
Here you can see the call that leads to _objc_init.
Adding a symbolic breakpoint on _objc_init and continuing, we discover new ground.
The calling method invokes objc's _objc_init.
5. objc init
Then we explore objc and find the function _dyld_objc_notify_register that puzzled us earlier.
As you can see, this is where it gets called.
Aha! Now it is clear that notifySingle invokes a callback.
_objc_init passes load_images as the second argument, so when notifySingle fires, load_images runs.
If you look at the calls inside load_images, you can see why +[Person load] was called.
At this point, you should have a better understanding of dyld loading and application initialization.
The relationship between the libraries is shown below:
Let's take a look at what _objc_init does.