iOS – Class loading process (_objc_init implementation principle)

Preface:

The previous article analyzed the relationship between dyld and objc: the initialization calls made inside _objc_init, how _dyld_objc_notify_register ties objc to dyld, and the implementation of its &map_images parameter. Because of length, the key function _read_images, which map_images ends up calling, was not analyzed. This article focuses on _read_images and on how lazy and non-lazy classes are loaded.

Concept:

  • Lazy-loaded classes: classes that do not implement the +(void)load method

  • Non-lazy-loaded classes: classes that implement the +(void)load method

  • ro: ro stands for read-only. It holds the class name, methods, protocols, and instance variables that are fixed at compile time. Because it is read-only, it is clean memory, that is, memory that does not change after it is loaded

  • rw: rw stands for read-write. Because the runtime is dynamic, attributes, methods, and protocols can be added to a class at run time. According to the WWDC 2020 session "Advancements in the Objective-C Runtime", only about 10% of classes ever actually have their methods changed, so that extra information was split out of rw into rwe. A class that does need the extra information gets an rwe extension record allocated and slid into the class for its use. rw lives in dirty memory, which is memory that changes while the process is running. A class structure becomes dirty memory once the class is used, because the runtime writes new data into it

  • Roughly, the memory behind rw can be understood as: rw = ro + the additional rwe information
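The ro / rw / rwe split described in the bullets above can be sketched as a small model. This is a simplified illustration, not the runtime's real layout: `ClassRO`, `ClassRW`, `ClassRWExt`, and the string-based method lists are assumptions made for the example. Only the idea that the extension is allocated on demand comes from the text.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Compile-time data: never modified after load (clean memory).
struct ClassRO {
    std::string name;
    std::vector<std::string> baseMethods;
};

// Extra runtime data, needed only by classes that are modified at run time.
struct ClassRWExt {
    std::vector<std::string> methods;   // base methods plus runtime additions
};

// Per-class runtime data (dirty memory). Hypothetical simplified layout.
struct ClassRW {
    const ClassRO *ro;
    std::unique_ptr<ClassRWExt> ext;    // stays empty for most classes

    explicit ClassRW(const ClassRO *r) : ro(r) {}

    // Allocate the extension on first use and seed it from ro.
    ClassRWExt *extAllocIfNeeded() {
        if (!ext) {
            ext.reset(new ClassRWExt());
            ext->methods = ro->baseMethods;
        }
        return ext.get();
    }

    // Adding a method at run time is what forces the extension to exist.
    void addMethod(const std::string &selector) {
        extAllocIfNeeded()->methods.push_back(selector);
    }

    // Readers use the extension when present, otherwise the read-only data.
    const std::vector<std::string> &methods() const {
        if (ext) {
            return ext->methods;
        }
        return ro->baseMethods;
    }
};
```

In this model a class that is never modified costs only the small `ClassRW` record, while `ro` itself is left untouched even after `addMethod` is called.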

_read_images()

Open the objc source and search for _read_images to jump to its implementation

    /***********************************************************************
    * _read_images
    * Perform initial processing of the headers in the linked
    * list beginning with headerList.
    *
    * Called by: map_images_nolock
    *
    * Locking: runtimeLock acquired by map_images
    **********************************************************************/
    void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClasses)
    {
        header_info *hi;
        uint32_t hIndex;
        size_t count;
        size_t i;
        Class *resolvedFutureClasses = nil;
        size_t resolvedFutureClassCount = 0;
        static bool doneOnce;
        bool launchTime = NO;
        TimeLogger ts(PrintImageTimes);

        runtimeLock.assertLocked();

    #define EACH_HEADER \
        hIndex = 0; \
        hIndex < hCount && (hi = hList[hIndex]); \
        hIndex++

        if (!doneOnce) {... }

        // Fix up @selector references
        static size_t UnfixedSelectors;
        {... }

        ts.log("IMAGE TIMES: fix up selector references");

        // Discover classes. Fix up unresolved future classes. Mark bundle classes.
        bool hasDyldRoots = dyld_shared_cache_some_image_overridden();

        for (EACH_HEADER) {... }

        ts.log("IMAGE TIMES: discover classes");

        // Fix up remapped classes
        // Class list and nonlazy class list remain unremapped.
        // Class refs and super refs are remapped for message dispatching.
        if (!noClassesRemapped()) {... }

        ts.log("IMAGE TIMES: remap classes");

    #if SUPPORT_FIXUP
        // Fix up old objc_msgSend_fixup call sites
        for (EACH_HEADER) {... }

        ts.log("IMAGE TIMES: fix up objc_msgSend_fixup");
    #endif

        bool cacheSupportsProtocolRoots = sharedCacheSupportsProtocolRoots();

        // Discover protocols. Fix up protocol refs.
        for (EACH_HEADER) {... }

        ts.log("IMAGE TIMES: discover protocols");

        // Fix up @protocol references
        // Preoptimized images may have the right
        // answer already but we don't know for sure.
        for (EACH_HEADER) {... }

        ts.log("IMAGE TIMES: fix up @protocol references");

        // Discover categories. Only do this after the initial category
        // attachment has been done. For categories present at startup,
        // discovery is deferred until the first load_images call after
        // the call to _dyld_objc_notify_register completes. rdar://problem/53119145
        if (didInitialAttachCategories) {... }

        ts.log("IMAGE TIMES: discover categories");

        // Category discovery MUST BE Late to avoid potential races
        // when other threads call the new category code before
        // this thread finishes its fixups.

        // +load handled by prepare_load_methods()

        // Realize non-lazy classes (for +load methods and static instances)
        for (EACH_HEADER) {... }

        ts.log("IMAGE TIMES: realize non-lazy classes");

        // Realize newly-resolved future classes, in case CF manipulates them
        if (resolvedFutureClasses) {... }

        ts.log("IMAGE TIMES: realize future classes");

        if (DebugNonFragileIvars) {... }

        // Print preoptimization statistics
        if (PrintPreopt) {... }

    #undef EACH_HEADER
    }

The function is very long (over 400 lines) and hard to follow, so the bodies are folded in the listing above. The analysis below follows the stages marked by the ts.log() calls
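Most stages in the listing are written as `for (EACH_HEADER)`, which works because the macro expands to the three clauses of a for statement. The sketch below reproduces that pattern in isolation. The `header_info` struct here is a hypothetical stand-in, not the runtime's real type, and `countHeaders` is an illustrative name.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the runtime's header_info (one per loaded image).
struct header_info {
    const char *name;
};

// Walks a header array the same way _read_images does: the macro supplies
// the init, condition, and increment clauses of the for statement.
inline uint32_t countHeaders(header_info **hList, uint32_t hCount) {
    header_info *hi;
    uint32_t hIndex;
    uint32_t visited = 0;

#define EACH_HEADER \
    hIndex = 0; \
    hIndex < hCount && (hi = hList[hIndex]); \
    hIndex++

    // Expands to:
    // for (hIndex = 0; hIndex < hCount && (hi = hList[hIndex]); hIndex++)
    for (EACH_HEADER) {
        (void)hi;   // a real stage would read selectors, classes, ... from hi
        visited++;
    }

#undef EACH_HEADER
    return visited;
}
```

Because the condition also assigns `hi`, the loop stops early if it meets a null entry in the list.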

doneOnce

Fixing up selectors (sel)

Class loading

Class remapping

This branch is usually not taken

Fixing up protocols

Non-lazy-loaded classes

Print module

That covers the stages of _read_images. Next, let's focus on readClass, which is called during the class discovery stage

readClass

Enter the readClass source code

Every class passes through readClass, so by default there is no way to tell whether a given call is for a system class or a custom one. To isolate the class under study, add a check for the LGPerson class name inside readClass
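A minimal sketch of such a name check is shown here. `isClassUnderStudy` is a hypothetical helper written for this example; in the runtime source the comparison would be placed inside readClass and fed with the result of `cls->mangledName()`.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: returns true only for the class we want to trace.
// "LGPerson" is the custom class used in this article.
inline bool isClassUnderStudy(const char *mangledName) {
    static const char *const kTargetName = "LGPerson";
    return mangledName != nullptr && std::strcmp(mangledName, kTargetName) == 0;
}
```

Putting a breakpoint or a print statement inside an `if (isClassUnderStudy(...))` block means the debugger stops only for LGPerson instead of for every class in every image.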

mangledName()

Debugging through breakpoints

addNamedClass

addClassTableEntry

Continue stepping: readClass returns the class, and the flow moves on to realizeClassWithoutSwift
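The two registration steps inside readClass can be modeled roughly as follows. This is a sketch under assumptions: `SketchClass`, `ClassTables`, and the use of standard containers are invented for the example, and the real functions do more (duplicate handling, shared cache checks). It only shows the idea that a class becomes findable by name and is recorded together with its metaclass.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical, heavily simplified class record.
struct SketchClass {
    std::string name;
    SketchClass *isa;   // metaclass; nullptr when not modeled

    SketchClass() : isa(nullptr) {}
};

struct ClassTables {
    std::unordered_map<std::string, SketchClass *> namedClasses;  // name -> class
    std::unordered_set<SketchClass *> allocatedClasses;           // classes and metaclasses

    // Models addNamedClass: make the class discoverable by name.
    void addNamedClass(SketchClass *cls) {
        namedClasses[cls->name] = cls;
    }

    // Models addClassTableEntry: record the class and, by default, its metaclass.
    void addClassTableEntry(SketchClass *cls, bool addMeta = true) {
        allocatedClasses.insert(cls);
        if (addMeta && cls->isa != nullptr) {
            addClassTableEntry(cls->isa, false);
        }
    }

    // Models the registration part of readClass.
    SketchClass *readClass(SketchClass *cls) {
        addNamedClass(cls);
        addClassTableEntry(cls);
        return cls;
    }

    SketchClass *lookUp(const std::string &name) const {
        auto it = namedClasses.find(name);
        return it == namedClasses.end() ? nullptr : it->second;
    }
};
```

Note that nothing here touches ro or rw yet. In this model, as in the article's walkthrough, readClass only registers the class, and realization happens later.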

realizeClassWithoutSwift

Enter realizeClassWithoutSwift source code

Get the class to study

The class data read from the Mach-O file is assigned to ro, memory is allocated for rw, and ro is stored into rw

Viewing RO Data

Keep going down

rw keeps ro in its ro_or_rw_ext field

In the set_ro implementation the path is: set_ro --> set_ro_or_rwe (its counterpart get_ro_or_rwe reads a ro_or_rw_ext_t value from ro_or_rw_ext) --> the ro held in ro_or_rw_ext_t. From the source, reading ro falls into two cases:

  • If the class has been modified at run time, so an rwe exists, ro is read through the rwe

  • If the class has not been modified at run time, ro is read directly
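The two cases can be illustrated with a one-word holder that stores either a pointer to ro or a pointer to an extension record. This is a sketch, not the runtime's code: `RoOrRwExt`, `ClassRWExt`, and the choice of the low pointer bit as the tag are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

struct ClassRO { const char *name; };
struct ClassRWExt { const ClassRO *ro; };

// The low-bit tag only works if real pointers never have that bit set.
static_assert(alignof(ClassRO) >= 2, "need a free low bit");
static_assert(alignof(ClassRWExt) >= 2, "need a free low bit");

// Hypothetical holder for "either ro or rwe"; the low bit is the tag.
class RoOrRwExt {
    uintptr_t bits_;
public:
    RoOrRwExt() : bits_(0) {}
    void setRo(const ClassRO *ro) { bits_ = reinterpret_cast<uintptr_t>(ro); }
    void setRwe(ClassRWExt *rwe)  { bits_ = reinterpret_cast<uintptr_t>(rwe) | 1u; }
    bool isRwe() const            { return (bits_ & 1u) != 0; }
    ClassRWExt *rwe() const {
        return reinterpret_cast<ClassRWExt *>(bits_ & ~static_cast<uintptr_t>(1));
    }
    const ClassRO *roDirect() const {
        return reinterpret_cast<const ClassRO *>(bits_);
    }
};

struct ClassRW {
    RoOrRwExt ro_or_rw_ext;

    const ClassRO *ro() const {
        if (ro_or_rw_ext.isRwe()) {
            return ro_or_rw_ext.rwe()->ro;   // case 1: extension exists, go through it
        }
        return ro_or_rw_ext.roDirect();      // case 2: no extension, ro is stored directly
    }
};
```

Either way the caller gets the same ro back. The extension only adds one level of indirection for the minority of classes that need it.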

realizeClassWithoutSwift is called recursively on the superclass and the metaclass to establish the inheritance relationship

Set the isa chain

Link the inheritance chain in both directions, so that a superclass can find its subclasses and a subclass can find its superclass, all the way up to the root class
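The recursion and the two-way linking can be sketched as below. The struct and function names are assumptions for illustration. The field names `firstSubclass` and `nextSiblingClass` follow the usual sibling-list idea, where each superclass keeps the head of a linked list of its subclasses.

```cpp
#include <cassert>

// Hypothetical, simplified class record.
struct SketchClass {
    const char *name;
    SketchClass *superclass;        // known before realization
    SketchClass *isa;               // metaclass
    bool realized;
    SketchClass *firstSubclass;     // head of this class's subclass list
    SketchClass *nextSiblingClass;  // next class sharing the same superclass
};

inline SketchClass makeClass(const char *name, SketchClass *superclass, SketchClass *isa) {
    SketchClass c = {name, superclass, isa, false, nullptr, nullptr};
    return c;
}

// Downward link: push subcls onto the front of supercls's subclass list.
inline void addSubclass(SketchClass *supercls, SketchClass *subcls) {
    subcls->nextSiblingClass = supercls->firstSubclass;
    supercls->firstSubclass = subcls;
}

inline SketchClass *realizeClass(SketchClass *cls) {
    if (cls == nullptr) return nullptr;
    if (cls->realized) return cls;
    cls->realized = true;   // marked before recursing so isa cycles terminate

    // Realize the superclass and the metaclass first.
    SketchClass *supercls = realizeClass(cls->superclass);
    SketchClass *metacls = realizeClass(cls->isa);

    cls->superclass = supercls;   // upward link
    cls->isa = metacls;           // isa chain
    if (supercls != nullptr) {
        addSubclass(supercls, cls);
    }
    return cls;
}
```

Realizing one leaf class is enough to realize everything above it, which is why the recursion ends at the root class and the root metaclass.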

Attach categories

At this point the analysis of the class loading process is complete. It was carried out with +load implemented and by stepping through breakpoints, so it describes how a non-lazy class is loaded. What does the call flow look like for a lazy-loaded class?

Lazy-loaded class analysis

Open the source, remove the +load method, and inspect the call stack with bt

Execution again ends up in realizeClassWithoutSwift

Conclusion:

The implementation of _read_images

Class loading process

Class loading splits into lazy-loaded and non-lazy-loaded classes. Both eventually reach realizeClassWithoutSwift, which completes the path from source code, through the linked library and the Mach-O file, into memory.

  • Lazy-loaded classes

lookUpImpOrForward

realizeClassMaybeSwiftMaybeRelock

realizeClassWithoutSwift

methodizeClass

Call stack: [LGPerson alloc] --> objc_alloc --> callAlloc --> _objc_msgSend_uncached --> lookUpImpOrForward --> initializeAndLeaveLocked --> initializeAndMaybeRelock --> realizeClassMaybeSwiftAndUnlock --> realizeClassMaybeSwiftMaybeRelock --> realizeClassWithoutSwift

  • Non-lazy-loaded classes

_getObjc2NonlazyClassList

readClass

realizeClassWithoutSwift

methodizeClass

Call stack: _dyld_start --> _objc_init --> _dyld_objc_notify_register --> dyld::registerObjcNotifiers --> dyld::notifyBatchPartial --> map_images --> map_images_nolock --> _read_images --> realizeClassWithoutSwift
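The difference between the two paths comes down to when realization is triggered. The model below is a deliberately small sketch: `SketchClass`, `readImages`, and `sendFirstMessage` are hypothetical names standing in for the non-lazy stage of _read_images and for the slow path of the first message send.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified class record.
struct SketchClass {
    const char *name;
    bool hasLoadMethod;   // true for a non-lazy class (implements +load)
    bool realized;
};

// Stands in for realizeClassWithoutSwift: both paths end here.
inline void realizeClass(SketchClass &cls) {
    cls.realized = true;
}

// Models the "realize non-lazy classes" stage that runs while images are mapped.
inline void readImages(std::vector<SketchClass *> &classes) {
    for (SketchClass *cls : classes) {
        if (cls->hasLoadMethod) {
            realizeClass(*cls);
        }
    }
}

// Models the first message send to a class: realize it if nobody has yet.
inline void sendFirstMessage(SketchClass &cls) {
    if (!cls.realized) {
        realizeClass(cls);
    }
    // ... method lookup would continue here
}
```

A lazy class therefore costs nothing at launch and pays for realization only if it is actually used.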

One method remains unanalyzed. The next article continues with how categories are loaded into memory