Preface

Once dyld has loaded the application, the program still cannot run. The image file is in Mach-O format, so the image must first be mapped and its contents converted into the familiar class structure (objc_class). The overall flow is: dyld links the dynamic libraries -> map_images -> load_images -> main

How dyld and libobjc are related

When dyld maps a dynamic library into an image, the image initialization methods are called recursively. For libobjc, the entry point is _objc_init, which calls _dyld_objc_notify_register(&map_images, load_images, unmap_image), a function provided by dyld. After doInitialization runs in dyld's ImageLoader, dyld calls map_images and load_images through these registered callbacks. This hands class loading over to libobjc. The Xcode call stack in the last part of the objc_init article shows this.
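The registration described above is a plain callback hand-off, which can be sketched as follows. This is a minimal illustration, not dyld's code: `Loader`, `registerCallbacks`, `loadImages`, and `demoCallbackOrder` are hypothetical names, and the firing order shown (one batched "mapped" notification, then one "init" notification per image) is a simplifying assumption.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the loader side (dyld). All names are illustrative.
struct Loader {
    std::function<void(const std::vector<std::string>&)> onMapped;  // role of map_images
    std::function<void(const std::string&)> onInit;                 // role of load_images

    // Role of _dyld_objc_notify_register: the runtime hands over its callbacks.
    void registerCallbacks(std::function<void(const std::vector<std::string>&)> mapped,
                           std::function<void(const std::string&)> init) {
        onMapped = std::move(mapped);
        onInit = std::move(init);
    }

    // The loader later drives the runtime through the stored callbacks.
    void loadImages(const std::vector<std::string>& paths) {
        if (onMapped) onMapped(paths);       // batch notification: images are mapped
        for (const auto& p : paths)
            if (onInit) onInit(p);           // per-image notification: image is initializing
    }
};

// Records the order in which the callbacks fire for two hypothetical images.
inline std::vector<std::string> demoCallbackOrder() {
    std::vector<std::string> log;
    Loader loader;
    loader.registerCallbacks(
        [&](const std::vector<std::string>& paths) {
            log.push_back("map:" + std::to_string(paths.size()));
        },
        [&](const std::string& path) { log.push_back("load:" + path); });
    loader.loadImages({"A", "B"});
    assert(log.size() == 3);  // one batched map call plus one load call per image
    return log;
}
```

The point of the design is that the loader never needs to know about classes: it only calls back into whatever the runtime registered.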

map_images

Open the libobjc source and search for map_images. You will find the following code:

void
map_images(unsigned count, const char * const paths[],
           const struct mach_header * const mhdrs[])
{
    mutex_locker_t lock(runtimeLock);
    return map_images_nolock(count, paths, mhdrs);
}
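The map_images wrapper follows a common "lock, then delegate to a _nolock function" pattern. A minimal sketch of that pattern in standard C++ is shown here; `gRuntimeLock`, `gMapped`, `mapImagesNoLock`, and `mapImages` are hypothetical names, and `std::mutex` stands in for the runtime's own lock type.

```cpp
#include <cassert>
#include <mutex>

static std::mutex gRuntimeLock;   // hypothetical stand-in for runtimeLock
static unsigned gMapped = 0;      // state protected by gRuntimeLock

// Precondition: the caller already holds gRuntimeLock.
static void mapImagesNoLock(unsigned count) {
    gMapped += count;
}

// Public entry point: acquire the lock for the whole call, then delegate.
void mapImages(unsigned count) {
    std::lock_guard<std::mutex> lock(gRuntimeLock);  // released automatically on return
    mapImagesNoLock(count);
}
```

Splitting the function this way lets other code that already holds the lock call the _nolock variant directly without deadlocking.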

map_images_nolock

The general process: count the classes, resize the various tables accordingly, and initialize the SEL table. The key call is _read_images, so go straight to that method.

_read_images

What is actually being read here is the Mach-O file.

void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClasses)
{
    ...
#define EACH_HEADER \
    hIndex = 0; \
    hIndex < hCount && (hi = hList[hIndex]); \
    hIndex++

    // Conditional control: this setup runs only once
    if (!doneOnce) { ... }

    // Fix up @selector references
    static size_t UnfixedSelectors;
    { ... }
    ts.log("IMAGE TIMES: fix up selector references");

    // Discover classes. Fix up unresolved future classes. Mark bundle classes.
    bool hasDyldRoots = dyld_shared_cache_some_image_overridden();
    for (EACH_HEADER) {
        // omit part of the code...
        classref_t const *classlist = _getObjc2ClassList(hi, &count);
        bool headerIsBundle = hi->isBundle();
        bool headerIsPreoptimized = hi->hasPreoptimizedClasses();
        for (i = 0; i < count; i++) {
            Class cls = (Class)classlist[i];
            Class newCls = readClass(cls, headerIsBundle, headerIsPreoptimized);
            if (newCls != cls && newCls) { ... }
        }
    }
    ts.log("IMAGE TIMES: discover classes");

    // Fix up remapped classes
    // Class list and nonlazy class list remain unremapped.
    // Class refs and super refs are remapped for message dispatching.
    if (!noClassesRemapped()) { ... }
    ts.log("IMAGE TIMES: remap classes");

#if SUPPORT_FIXUP
    // Fix up old objc_msgSend_fixup call sites
    for (EACH_HEADER) { ... }
    ts.log("IMAGE TIMES: fix up objc_msgSend_fixup");
#endif

    // Discover protocols. Fix up protocol refs.
    for (EACH_HEADER) { ... }
    ts.log("IMAGE TIMES: discover protocols");

    // Fix up @protocol references
    // Preoptimized images may have the right
    // answer already but we don't know for sure.
    for (EACH_HEADER) { ... }
    ts.log("IMAGE TIMES: fix up @protocol references");

    // Discover categories. Only do this after the initial category
    // attachment has been done. For categories present at startup,
    // discovery is deferred until the first load_images call after
    // the call to _dyld_objc_notify_register completes.
    if (didInitialAttachCategories) { ... }
    ts.log("IMAGE TIMES: discover categories");

    // Category discovery MUST BE Late to avoid potential races
    // when other threads call the new category code before
    // this thread finishes its fixups.

    // +load handled by prepare_load_methods()

    // Realize non-lazy classes (for +load methods and static instances)
    for (EACH_HEADER) { ... }
    ts.log("IMAGE TIMES: realize non-lazy classes");

    // Classes that have not been processed yet:
    // Realize newly-resolved future classes, in case CF manipulates them
    if (resolvedFutureClasses) { ... }
    ts.log("IMAGE TIMES: realize future classes");
    ...
#undef EACH_HEADER
}

Main process:

  • Conditional control: the one-time setup runs only on the first call
  • Fix up @selector references: after compilation the same selector name can sit at different addresses, because each image carries its own copy and images are loaded at different addresses; this step makes every reference point to one unique selector
  • Discover classes and handle erroneous or messy ones (unresolved future classes)
  • Fix up remapped classes whose references were not updated when the image was loaded
  • Fix up old objc_msgSend_fixup call sites
  • readProtocol: read the protocols when classes declare them
  • Fix up @protocol references that were not loaded
  • Process categories
  • readClass: associate the class name with the class and store the class in the hash table
  • realizeClassWithoutSwift: realize the non-lazy classes
  • Realize newly resolved future classes that have not been processed yet
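The selector fix-up step in the list above amounts to interning: every reference to the same name is rewritten to one canonical pointer, so selectors can later be compared by address. The sketch below illustrates the idea only; `SelectorTable`, `intern`, and `demoFixup` are hypothetical names, and the real runtime's table is not an `std::unordered_set`.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical selector table: holds one canonical string per selector name.
class SelectorTable {
    std::unordered_set<std::string> names_;  // node-based, so element addresses stay stable
public:
    // Returns the same pointer for equal names (similar in spirit to sel_registerName).
    const char* intern(const char* name) {
        return names_.insert(name).first->c_str();
    }
};

// Two "images" each carry their own copy of the name "run" at different addresses.
// After fix-up, both references point at the single canonical copy.
inline bool demoFixup() {
    char imageA[] = "run";
    char imageB[] = "run";
    const char* refA = imageA;
    const char* refB = imageB;
    bool differedBefore = (refA != refB);   // distinct arrays, distinct addresses

    SelectorTable table;
    refA = table.intern(refA);
    refB = table.intern(refB);
    return differedBefore && refA == refB;
}
```

Once references are uniqued this way, "is this the same selector?" becomes a pointer comparison instead of a string comparison.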

readClass

Set a breakpoint in readClass to verify this and note the address of the class being loaded. Pay attention to GETSECT(_getObjc2ClassList, classref_t const, "__objc_classlist") in the source: it corresponds to Section64(__DATA, __objc_classlist) in the Mach-O file. The address seen in readClass is the same as the address of the class recorded in the Mach-O file.
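Conceptually, the class list section is an array of class pointers, and reading it means walking that array and registering each entry by name. A minimal sketch under that assumption follows; `FakeClass`, `NameTable`, and `readClassList` are hypothetical stand-ins, not runtime types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

struct FakeClass { const char* name; };   // hypothetical stand-in for objc_class

// Hypothetical stand-in for the runtime's name -> class hash table.
using NameTable = std::unordered_map<std::string, const FakeClass*>;

// Walks a "class list section" (an array of class pointers) and registers
// every entry under its name. The stored pointer is exactly the address
// found in the section, which is why the two addresses match in the debugger.
inline void readClassList(const FakeClass* const* classlist, std::size_t count,
                          NameTable& table) {
    for (std::size_t i = 0; i < count; i++) {
        const FakeClass* cls = classlist[i];
        table.emplace(cls->name, cls);
    }
}
```

For example, a section holding pointers to two classes named "LGPerson" and "LGTeacher" (the second name is made up for illustration) yields a table with two entries, each mapping the name back to the original address.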

Realize non-lazy classes

Classes are loaded in one of two ways: lazily, or non-lazily (classes that implement the + (void)load{} method).

  • Non-lazy classes go through realizeClassWithoutSwift during the read_images process
  • Lazy classes wait until the class receives its first message; the class is then realized inside lookUpImpOrForward
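The two paths above can be sketched as follows. This is a simplified model, not runtime code: `FakeClass`, `realize`, `readImages`, and `sendMessage` are hypothetical names, and "realized" is reduced to a boolean flag.

```cpp
#include <cassert>
#include <vector>

struct FakeClass {                 // hypothetical stand-in for a class
    bool hasLoadMethod;            // implements +load, so it is non-lazy
    bool realized;
    explicit FakeClass(bool load) : hasLoadMethod(load), realized(false) {}
};

inline void realize(FakeClass& cls) { cls.realized = true; }

// Image-reading phase: only non-lazy classes are realized up front.
inline void readImages(const std::vector<FakeClass*>& classes) {
    for (FakeClass* c : classes)
        if (c->hasLoadMethod) realize(*c);
}

// First message: a lazy class is realized on demand
// (the role the text assigns to lookUpImpOrForward).
inline void sendMessage(FakeClass& cls) {
    if (!cls.realized) realize(cls);
    // ... method lookup would continue here
}
```

Deferring realization keeps launch cheap for classes that may never be used; the cost moves to the first message instead.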

Class realization

The exploration from dyld to _objc_init to read_images is gradually linking up, and the overall picture is becoming clearer. What follows is the core of class loading. Once it is familiar, it helps answer these questions:
1. Why are there rw, ro, and rwe?
2. When is the +load method called?
3. In what order are +load methods called?
4. When a category and the main class implement the same method, why is the category's version called?

realizeClassWithoutSwift analysis

static Class realizeClassWithoutSwift(Class cls, Class previously)
{
    // Omit part of the code...
    auto ro = (const class_ro_t *)cls->data();
    auto isMeta = ro->flags & RO_META;
    if (ro->flags & RO_FUTURE) {
        // This was a future class. rw data is already allocated.
        rw = cls->data();
        ro = cls->data()->ro();
        ASSERT(!isMeta);
        cls->changeInfo(RW_REALIZED|RW_REALIZING, RW_FUTURE);
    } else {
        // Normal class. Allocate writeable class data.
        rw = objc::zalloc<class_rw_t>();
        rw->set_ro(ro);
        rw->flags = RW_REALIZED|RW_REALIZING|isMeta;
        cls->setData(rw);
    }

    cls->cache.initializeToEmptyOrPreoptimizedInDisguise();

#if FAST_CACHE_META
    if (isMeta) cls->cache.setBit(FAST_CACHE_META);
#endif

    // Realize the superclass and the metaclass recursively
    supercls = realizeClassWithoutSwift(remapClass(cls->getSuperclass()), nil);
    metacls = realizeClassWithoutSwift(remapClass(cls->ISA()), nil);

    // Debugging aid added for this exploration: only report LGPerson
    if (strcmp(mangledName, "LGPerson") == 0) {
        if (!isMeta) {
            printf("%s LGPerson....\n", __func__);
        }
    }

#if SUPPORT_NONPOINTER_ISA
    // Disable non-pointer isa where required (conditions omitted)
    cls->setInstancesRequireRawIsa();
// SUPPORT_NONPOINTER_ISA
#endif

    // Set the superclass and the isa
    cls->setSuperclass(supercls);
    cls->initClassIsa(metacls);

    // Omit part of the code...

    if (supercls) {
        addSubclass(supercls, cls);
    } else {
        addRootClass(cls);
    }

    return cls;
}

Leaving methodizeClass aside, this method realizes the class itself, its superclass, and its metaclass. All three take the clean ro data and build rw from it, but the methods in ro have not yet been sorted or otherwise processed. If you print the methods in ro at this point, unknown symbols appear instead of method names. The next step is to set a breakpoint in methodizeClass and see what it does: if the method names are assigned somewhere inside it, printing afterwards should show the method names we expect rather than raw symbols.
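The recursive shape of this realization can be sketched as follows. It is a simplified model under stated assumptions: `ReadOnlyData`, `ReadWriteData`, `FakeClass`, and `realizeClass` are hypothetical names, and everything except "build rw around ro, then recurse into superclass and metaclass" is left out.

```cpp
#include <cassert>
#include <memory>

struct ReadOnlyData { int methodCount; };           // stand-in for class_ro_t
struct ReadWriteData { const ReadOnlyData* ro; };   // stand-in for class_rw_t

struct FakeClass {                                  // hypothetical stand-in for objc_class
    const ReadOnlyData* ro = nullptr;               // compile-time data, never modified
    std::unique_ptr<ReadWriteData> rw;              // null until the class is realized
    FakeClass* superclass = nullptr;
    FakeClass* metaclass = nullptr;
    bool isRealized() const { return rw != nullptr; }
};

// Realizes a class, then its superclass chain and its metaclass.
inline FakeClass* realizeClass(FakeClass* cls) {
    if (!cls) return nullptr;
    // The guard also ends the recursion: the root metaclass is its own metaclass.
    if (cls->isRealized()) return cls;
    cls->rw.reset(new ReadWriteData{cls->ro});      // wrap the untouched ro in a fresh rw
    realizeClass(cls->superclass);
    realizeClass(cls->metaclass);
    return cls;
}
```

Realizing one class therefore pulls in everything it inherits from, which is why a single breakpoint on one class shows its superclass and metaclass being realized too.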

methodizeClass

static void methodizeClass(Class cls, Class previously)
{
    // Omit part of the code...

    // Install methods and properties that the class implements itself.
    // Get the method list from ro
    method_list_t *list = ro->baseMethods();
    if (list) {
        // Prepare the methods: assign the method names and sort by address
        prepareMethodLists(cls, &list, 1, YES, isBundleClass(cls), nullptr);
        // attachLists builds a two-dimensional array that merges the class's
        // method lists and the categories' method lists into one structure.
        // In practice the debugger does not get here, because rwe does not exist yet.
        if (rwe) rwe->methods.attachLists(&list, 1);
    }

    property_list_t *proplist = ro->baseProperties;
    if (rwe && proplist) {
        rwe->properties.attachLists(&proplist, 1);
    }

    protocol_list_t *protolist = ro->baseProtocols;
    if (rwe && protolist) {
        rwe->protocols.attachLists(&protolist, 1);
    }

    // Root classes get bonus method implementations if they don't have
    // them already. These apply before category replacements.
    if (cls->isRootMetaclass()) {
        // root metaclass
        addMethod(cls, @selector(initialize), (IMP)&objc_noop_imp, "", NO);
    }

    // Attach categories.
    if (previously) {
        if (isMeta) {
            objc::unattachedCategories.attachToClass(cls, previously,
                                                     ATTACH_METACLASS);
        } else {
            // When a class relocates, categories with class methods
            // may be registered on the class itself rather than on
            // the metaclass. Tell attachToClass to look for those.
            objc::unattachedCategories.attachToClass(cls, previously,
                                                     ATTACH_CLASS_AND_METACLASS);
        }
    }
    // On the surface this attaches the pending categories to the class,
    // but in practice execution does not reach attachCategories from here
    objc::unattachedCategories.attachToClass(cls, cls,
                                             isMeta ? ATTACH_METACLASS : ATTACH_CLASS);

#if DEBUG
    // Debug: sanity-check all SELs; log method list contents
    for (const auto& meth : rw->methods()) {
        if (PrintConnecting) {
            _objc_inform("METHOD %c[%s %s]", isMeta ? '+' : '-',
                         cls->nameForLogging(), sel_getName(meth.name()));
        }
        ASSERT(sel_registerName(sel_getName(meth.name())) == meth.name());
    }
#endif
}
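The prepareMethodLists step is described above as sorting methods by address. A plausible reason for sorting, assumed here, is that a list ordered by selector address can be searched with a binary search. The sketch below shows that combination; `Method`, `sortBySelectorAddress`, and `findMethod` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct Method {
    const char* sel;   // uniqued selector pointer
    int imp;           // stand-in for the implementation pointer
};

// Orders methods by the numeric address of their selector.
inline void sortBySelectorAddress(std::vector<Method>& list) {
    std::sort(list.begin(), list.end(), [](const Method& a, const Method& b) {
        return reinterpret_cast<std::uintptr_t>(a.sel) <
               reinterpret_cast<std::uintptr_t>(b.sel);
    });
}

// Binary search over a list that is already sorted by selector address.
// Returns nullptr when no method uses that selector pointer.
inline const Method* findMethod(const std::vector<Method>& list, const char* sel) {
    auto key = reinterpret_cast<std::uintptr_t>(sel);
    auto it = std::lower_bound(list.begin(), list.end(), key,
        [](const Method& m, std::uintptr_t k) {
            return reinterpret_cast<std::uintptr_t>(m.sel) < k;
        });
    return (it != list.end() && it->sel == sel) ? &*it : nullptr;
}
```

This only works because selectors were uniqued earlier: the lookup compares pointers, never string contents.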

When both the main class and its category implement the + (void)load method, ro and rw have values by the time the read_images process finishes, but the category has not been loaded yet, so rwe has not been initialized. The load_images process runs next.

attachCategories process

Set a breakpoint in attachCategories; the call stack looks like this:

attachCategories is called from within load_images. What happens inside it? It takes the categories registered for a class and builds the lists of methods, properties, and protocols to attach.

    if (slowpath(PrintReplacedMethods)) {
        // omit the code...
    }

    constexpr uint32_t ATTACH_BUFSIZ = 64;
    method_list_t *mlists[ATTACH_BUFSIZ];
    property_list_t *proplists[ATTACH_BUFSIZ];
    protocol_list_t *protolists[ATTACH_BUFSIZ];

    uint32_t mcount = 0;
    uint32_t propcount = 0;
    uint32_t protocount = 0;
    bool fromBundle = NO;
    bool isMeta = (flags & ATTACH_METACLASS);
    // rwe is allocated here if it does not exist yet
    auto rwe = cls->data()->extAllocIfNeeded();

    for (uint32_t i = 0; i < cats_count; i++) {
        auto& entry = cats_list[i];

        method_list_t *mlist = entry.cat->methodsForMeta(isMeta);
        if (mlist) {
            if (mcount == ATTACH_BUFSIZ) { // ATTACH_BUFSIZ = 64
                prepareMethodLists(cls, mlists, mcount, NO, fromBundle, __func__);
                rwe->methods.attachLists(mlists, mcount);
                mcount = 0;
            }
            mlists[ATTACH_BUFSIZ - ++mcount] = mlist;
            fromBundle |= entry.hi->isBundle();
        }

        property_list_t *proplist =
            entry.cat->propertiesForMeta(isMeta, entry.hi);
        if (proplist) {
            if (propcount == ATTACH_BUFSIZ) {
                rwe->properties.attachLists(proplists, propcount);
                propcount = 0;
            }
            proplists[ATTACH_BUFSIZ - ++propcount] = proplist;
        }

        protocol_list_t *protolist = entry.cat->protocolsForMeta(isMeta);
        if (protolist) {
            if (protocount == ATTACH_BUFSIZ) {
                rwe->protocols.attachLists(protolists, protocount);
                protocount = 0;
            }
            protolists[ATTACH_BUFSIZ - ++protocount] = protolist;
        }
    }

    if (mcount > 0) {
        prepareMethodLists(cls, mlists + ATTACH_BUFSIZ - mcount, mcount,
                           NO, fromBundle, __func__);
        rwe->methods.attachLists(mlists + ATTACH_BUFSIZ - mcount, mcount);
        if (flags & ATTACH_EXISTING) {
            flushCaches(cls, __func__, [](Class c){
                // constant caches have been dealt with in prepareMethodLists
                // if the class still is constant here, it's fine to keep
                return !c->cache.isConstantOptimizedCache();
            });
        }
    }

    // Add the properties
    rwe->properties.attachLists(proplists + ATTACH_BUFSIZ - propcount, propcount);

    // Add the protocols
    rwe->protocols.attachLists(protolists + ATTACH_BUFSIZ - protocount, protocount);

In essence, this function adds the category's methods, properties, protocols, and so on to the class so they are available when needed. Next, look at the internals of attachLists and at how the list_array_tt structure changes during the process.
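The expression `mlists[ATTACH_BUFSIZ - ++mcount]` in the code above fills the fixed buffer from the back, so the most recently seen category ends up first in each batch. The sketch below isolates that indexing trick with a small buffer; `kBufSize` and `batchNewestFirst` are hypothetical names, and plain integers stand in for the list pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kBufSize = 4;   // hypothetical small capacity; the source uses 64

// Back-fills a fixed buffer and emits a batch whenever the buffer is full,
// plus one final partial batch. Within a batch, later items come first.
inline std::vector<std::vector<int>> batchNewestFirst(const std::vector<int>& items) {
    int buf[kBufSize];
    std::size_t count = 0;
    std::vector<std::vector<int>> batches;
    for (int item : items) {
        if (count == kBufSize) {                       // buffer full: flush it whole
            batches.emplace_back(buf, buf + kBufSize);
            count = 0;
        }
        buf[kBufSize - ++count] = item;                // write from the back toward the front
    }
    if (count > 0)                                     // flush only the used tail of the buffer
        batches.emplace_back(buf + kBufSize - count, buf + kBufSize);
    return batches;
}
```

With the items 1, 2, 3 the single batch is 3, 2, 1: the used region is the tail of the buffer, which is why the real code passes `mlists + ATTACH_BUFSIZ - mcount` as the start pointer.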

void attachLists(List* const * addedLists, uint32_t addedCount) {
    if (addedCount == 0) return;

    if (hasArray()) {
        // many lists -> many lists
        uint32_t oldCount = array()->count;
        uint32_t newCount = oldCount + addedCount;
        array_t *newArray = (array_t *)malloc(array_t::byteSize(newCount));
        newArray->count = newCount;
        array()->count = newCount;

        for (int i = oldCount - 1; i >= 0; i--)
            newArray->lists[i + addedCount] = array()->lists[i];
        for (unsigned i = 0; i < addedCount; i++)
            newArray->lists[i] = addedLists[i];
        free(array());
        setArray(newArray);
        validate();
    }
    else if (!list && addedCount == 1) {
        // 0 lists -> 1 list
        list = addedLists[0];
        validate();
    }
    else {
        // 1 list -> many lists
        Ptr<List> oldList = list;
        uint32_t oldCount = oldList ? 1 : 0;
        uint32_t newCount = oldCount + addedCount;
        setArray((array_t *)malloc(array_t::byteSize(newCount)));
        array()->count = newCount;
        if (oldList) array()->lists[addedCount] = oldList;
        for (unsigned i = 0; i < addedCount; i++)
            array()->lists[i] = addedLists[i];
        validate();
    }
}

  • When the first category is loaded, attachLists takes the else branch at the bottom, and the corresponding data is as follows

oldList is the method list of the main class

  • After the for loop, array()->lists[0] is the category's method list

In this way the category's methods, properties, protocols, and so on are added
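The effect of attachLists can be modeled as "insert the new lists at the front". Assuming lookup scans the lists from front to back, this also shows why a category's method is found before a main-class method with the same name. `MethodList`, `ListArray`, and `firstListContaining` are hypothetical, simplified stand-ins for list_array_tt.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using MethodList = std::vector<std::string>;   // one list of method names

struct ListArray {                             // hypothetical, simplified list_array_tt
    std::vector<MethodList> lists;

    // New lists go to the front; existing lists shift toward the back.
    void attachLists(const std::vector<MethodList>& added) {
        lists.insert(lists.begin(), added.begin(), added.end());
    }

    // Scans front to back; returns the index of the first list that
    // contains the name, or -1 when no list does.
    int firstListContaining(const std::string& name) const {
        for (std::size_t i = 0; i < lists.size(); i++)
            for (const auto& m : lists[i])
                if (m == name) return static_cast<int>(i);
        return -1;
    }
};
```

The main class's method is not replaced or removed; it is still in a later list and is simply never reached by a front-to-back search.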

The category implements + (void)load{}; the main class does not implement + (void)load. In this case execution does not go through the attachCategories logic. So how is the category loaded? Judging from the call stack, the category is loaded during read_images.

The category does not implement + (void)load{}; the main class implements + (void)load. The category data is loaded in the read_images phase.

Neither the category nor the main class implements + (void)load{}. Loading of the class data is deferred until the first message is sent to the class.
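The four combinations discussed in this article can be summarized as a small decision function. It only restates the observations above and is not a runtime API; `AttachPhase` and `categoryAttachPhase` are hypothetical names.

```cpp
#include <cassert>

// When the category data ends up attached, according to the cases in the text.
enum class AttachPhase { LoadImages, ReadImages, FirstMessage };

inline AttachPhase categoryAttachPhase(bool mainHasLoad, bool categoryHasLoad) {
    if (mainHasLoad && categoryHasLoad)
        return AttachPhase::LoadImages;    // both implement +load: attachCategories via load_images
    if (mainHasLoad || categoryHasLoad)
        return AttachPhase::ReadImages;    // exactly one implements +load: handled in read_images
    return AttachPhase::FirstMessage;      // neither: deferred until the first message
}
```

For example, `categoryAttachPhase(false, false)` yields `AttachPhase::FirstMessage`, the cheapest case at launch.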

The more +load methods the categories and main classes implement, the longer the startup time becomes, because part of this loading and attaching work then has to be computed at launch.