
The first three articles covered the program loading process, the interplay between dyld and _objc_init, the execution of load_images, and how dyld has changed across versions.

dyld links the image files (images). As the application loading process showed, the images have been mapped into the application at this point, but only library by library; their contents have not yet been loaded into memory in a form the runtime can use. After compilation, our project becomes a Mach-O executable, but how is it loaded into memory? That is what this article explores.

Resources

  • objc source download: multiple versions of the objc source code

Main text

In the objc source, _objc_init is executed when the project runs, and it then performs the registration with dyld through the call _dyld_objc_notify_register(&map_images, load_images, unmap_image). This call acts as the link between the two sides, as the diagram below describes:

  • map_images: manages all symbols in the executable and the dynamic libraries, and loads classes, selectors, protocols, and categories.
  • load_images: executes the +load methods.

Analysis of _objc_init()


environ_init() reads the environment variables that affect the runtime. With a small modification to the source code, the information can be printed directly, as shown below:

After running, you can get the printed result:

The output lists all the objc environment variables. You can also change the runtime's behavior by setting these variables, for example:

  • OBJC_DISABLE_NONPOINTER_ISA: whether to disable the isa optimization;
  • OBJC_PRINT_LOAD_METHODS: whether to print the +load methods that are called;

To set them: Product –> Scheme –> Edit Scheme –> Run –> Arguments –> Environment Variables, then add the variable.
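How such a flag might be consumed can be sketched as follows. This is a minimal illustration, not the runtime's own parsing code: envFlagEnabled is a hypothetical helper, and it assumes a flag counts as enabled only when its value is exactly YES, matching the value entered in the scheme settings.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not part of libobjc): reports whether an
// environment variable is set to the exact string "YES".
bool envFlagEnabled(const char *name) {
    assert(name != nullptr);
    const char *value = std::getenv(name);
    // Unset variables and any value other than "YES" count as disabled.
    return value != nullptr && std::strcmp(value, "YES") == 0;
}
```

With OBJC_DISABLE_NONPOINTER_ISA set to YES in the scheme, envFlagEnabled("OBJC_DISABLE_NONPOINTER_ISA") would return true in this sketch; a variable that is missing or set to NO would return false.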

For example, print the isa pointer of the LGPerson class:

In the scheme settings, set OBJC_DISABLE_NONPOINTER_ISA to YES, as shown in the figure below:

Run the project again, debug with LLDB, and print the isa pointer in binary:

The lowest bit is 0, indicating that the isa pointer optimization is off.
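The check done by eye in LLDB can be written as a one-line test. This is a sketch under the assumption that bit 0 of the raw isa value is the flag marking an optimized (non-pointer) isa; isNonpointerIsa is a hypothetical name introduced here for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the lowest bit of the raw isa value is 1 when the isa is
// an optimized (non-pointer) isa and 0 when it is a plain class pointer.
bool isNonpointerIsa(std::uint64_t rawIsa) {
    return (rawIsa & 1u) != 0;
}
```

Any raw value whose binary form ends in 0, as in the LLDB output above, yields false; a value ending in 1 yields true.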

OBJC_PRINT_LOAD_METHODS can be set the same way to print each +load method as it is called, as shown in the figure below:


tls_init() handles the binding of thread keys, such as the per-thread data destructors, as shown below:


static_init() runs the C++ static constructors. libc calls _objc_init() before dyld would call our static constructors, so the runtime has to run them itself, as shown in the picture below:


runtime_init() initializes the runtime environment, which sets up two tables:

  • unattachedCategories.init(32): initializes the category table;
  • allocatedClasses.init(): creates the table of classes allocated in memory;


exception_init() initializes libobjc's exception handling system.

When an exception occurs, the runtime determines whether it is an objc exception, and if it is, the callback function uncaught_handler is executed. Search the source globally for uncaught_handler to find the function that sets this callback.

At the Objective-C layer, the callback can be set by calling NSSetUncaughtExceptionHandler; the function passed in is assigned to uncaught_handler.
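The pattern of a replaceable global callback can be sketched as below. This is a simplified illustration, not the libobjc implementation: the handler here takes a plain string instead of an exception object, and setUncaughtHandler and reportUncaught are hypothetical names.

```cpp
#include <cassert>

// Simplified handler type; the real callback receives an exception object.
using UncaughtHandler = void (*)(const char *reason);

// Default handler: does nothing.
static void defaultUncaughtHandler(const char *) {}

// Global callback slot, playing the role of uncaught_handler.
static UncaughtHandler uncaught_handler = defaultUncaughtHandler;

// Installs a new handler and returns the one it replaced,
// so a caller can chain to or restore the previous handler.
UncaughtHandler setUncaughtHandler(UncaughtHandler fn) {
    assert(fn != nullptr);
    UncaughtHandler previous = uncaught_handler;
    uncaught_handler = fn;
    return previous;
}

// Called when an exception goes uncaught: forwards to the installed handler.
void reportUncaught(const char *reason) {
    uncaught_handler(reason);
}
```

Returning the previous handler is a design choice worth copying in crash-reporting code: several components can then install handlers without silently discarding each other's.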


cache_init() initializes the cache.


_imp_implementationWithBlock_init() starts the trampoline machinery. Normally this does very little because everything is initialized lazily, but for certain processes the trampolines dylib is loaded eagerly.


As the article on the program loading process showed, near the end of loading the program executes _objc_init, which then calls _dyld_objc_notify_register(). This call acts as a bridge that registers three functions with dyld:

The three functions passed to it are:

  • map_images: manages all symbols in the executable and the dynamic libraries, and completes the loading of classes, selectors, protocols, and categories;
  • load_images: executes the +load methods;
  • unmap_image: fires when dyld removes an image.
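The bridge described above is an ordinary callback registration. The sketch below is a toy model with hypothetical types (ToyLoader, MappedCallback, InitCallback, UnmappedCallback); it assumes images are reported as a batch to the first callback and then initialized one at a time through the second, and it leaves out the Mach-O headers that the real callbacks also receive.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical callback types for the three hooks.
using MappedCallback   = void (*)(const std::vector<std::string> &paths);
using InitCallback     = void (*)(const std::string &path);
using UnmappedCallback = void (*)(const std::string &path);

// Toy loader standing in for dyld in this sketch.
class ToyLoader {
public:
    // Plays the role of _dyld_objc_notify_register: store the three hooks.
    void registerNotify(MappedCallback mapped, InitCallback init,
                        UnmappedCallback unmapped) {
        assert(mapped && init && unmapped);
        mapped_ = mapped;
        init_ = init;
        unmapped_ = unmapped;
    }

    // New images are reported as a batch first, then initialized one by one.
    void addImages(const std::vector<std::string> &paths) {
        assert(mapped_ && init_);
        mapped_(paths);
        for (const std::string &path : paths) init_(path);
    }

    // Removing an image fires the third hook.
    void removeImage(const std::string &path) {
        assert(unmapped_);
        unmapped_(path);
    }

private:
    MappedCallback mapped_ = nullptr;
    InitCallback init_ = nullptr;
    UnmappedCallback unmapped_ = nullptr;
};
```

In this model the runtime side never calls the loader again after registering; everything that follows is driven by the loader invoking the stored hooks.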

Analysis of _read_images()

Overall analysis


map_images processes the given images that dyld is mapping in.


  • Call flow: map_images() –> map_images_nolock() –> _read_images()

With _read_images(), we have reached the core of the process.

    void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClasses)
    {
        header_info *hi;
        uint32_t hIndex;
        size_t count;
        size_t i;
        Class *resolvedFutureClasses = nil;
        size_t resolvedFutureClassCount = 0;
        static bool doneOnce;
        bool launchTime = NO;
        TimeLogger ts(PrintImageTimes);

        runtimeLock.assertLocked();

    #define EACH_HEADER \
        hIndex = 0; \
        hIndex < hCount && (hi = hList[hIndex]); \
        hIndex++

        //✅1: conditional control for a one-time load
        if (!doneOnce) {... }

        //✅2: fix the @selector confusion from the precompile phase,
        // because a method with the same name sits at a different location in each image file
        // Fix up @selector references
        // sel = name + address
        static size_t UnfixedSelectors;
        {... }

        ts.log("IMAGE TIMES: fix up selector references");

        // Discover classes. Fix up unresolved future classes. Mark bundle classes.
        bool hasDyldRoots = dyld_shared_cache_some_image_overridden();

        //✅3: handle wrong or messy classes
        for (EACH_HEADER) {... }

        ts.log("IMAGE TIMES: discover classes");

        // Fix up remapped classes
        // Class list and nonlazy class list remain unremapped.
        // Class refs and super refs are remapped for message dispatching.
        //✅4: remap classes that were not loaded by the image file
        if (!noClassesRemapped()) {... }

        ts.log("IMAGE TIMES: remap classes");

    #if SUPPORT_FIXUP
        //✅5: fix up some messages
        // Fix up old objc_msgSend_fixup call sites
        for (EACH_HEADER) {... }

        ts.log("IMAGE TIMES: fix up objc_msgSend_fixup");
    #endif

        //✅6: when a class has protocols: readProtocol
        // Discover protocols. Fix up protocol refs.
        for (EACH_HEADER) {... }

        ts.log("IMAGE TIMES: discover protocols");

        //✅7: fix up @protocol references
        // Preoptimized images may have the right
        // answer already but we don't know for sure.
        for (EACH_HEADER) {... }

        ts.log("IMAGE TIMES: fix up @protocol references");

        // Discover categories. Only do this after the initial category
        // attachment has been done. For categories present at startup,
        // discovery is deferred until the first load_images call after
        // the call to _dyld_objc_notify_register completes. rdar://problem/53119145
        //✅8: category handling
        if (didInitialAttachCategories) {... }

        ts.log("IMAGE TIMES: discover categories");

        // Category discovery MUST BE Late to avoid potential races
        // when other threads call the new category code before
        // this thread finishes its fixups.

        // +load handled by prepare_load_methods()

        // Realize non-lazy classes (for +load methods and static instances)
        //✅9: class loading
        for (EACH_HEADER) {... }

        ts.log("IMAGE TIMES: realize non-lazy classes");

        //✅10: classes not yet processed
        // Realize newly-resolved future classes, in case CF manipulates them
        if (resolvedFutureClasses) {... }

        ts.log("IMAGE TIMES: realize future classes");

        if (DebugNonFragileIvars) {... }

        // Print preoptimization statistics
        if (PrintPreopt) {... }

    #undef EACH_HEADER
    }

  • 1: conditional control for a one-time load;
  • 2: fix the @selector confusion from the precompile phase;
  • 3: handle wrong or messy classes;
  • 4: remap classes that were not loaded by the image file;
  • 5: fix up some messages;
  • 6: when a class has protocols: readProtocol;
  • 7: fix up protocols that were not loaded;
  • 8: category handling;
  • 9: class loading;
  • 10: realize classes that have not been processed yet (newly resolved future classes).

Steps 8 and 9 are clearly the core; they lead to load_categories_nolock and realizeClassWithoutSwift respectively.

Detailed analysis

Analysis of doneOnce

  • Handles small object types.

  • Creates a master class table that contains all classes, so they can be looked up quickly.

  • The size of the table follows from the load factor: namedClassesSize = totalClasses * 4 / 3, where 4/3 is the inverse of the 3/4 load factor. namedClassesSize is the total capacity, and totalClasses is the number of slots expected to be occupied.

  • Together with the table created in runtime_init(), this gives two class tables:


  • gdb_objc_realized_classes is the master table; it holds classes whether or not they have been instantiated.
  • allocatedClasses contains all classes and metaclasses that have been allocated.

So gdb_objc_realized_classes should be a superset of allocatedClasses. In effect, the doneOnce block creates the master table of classes.
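The capacity formula from the list above can be checked with a few numbers. The function below is a sketch that reproduces only the arithmetic; it borrows the name namedClassesSize from the text but is not the runtime code, and it uses integer division, so results are truncated.

```cpp
#include <cassert>

// Capacity chosen so that totalClasses entries fill roughly 3/4 of the
// table. Integer arithmetic: the division truncates toward zero.
int namedClassesSize(int totalClasses) {
    assert(totalClasses >= 0);
    return totalClasses * 4 / 3;
}
```

For 300 classes the capacity is 400, so the table is 300/400 = 3/4 full. For 10 classes the capacity is 13 because 40/3 truncates, which gives a load slightly above 3/4.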

Analysis of UnfixedSelectors

This step fixes the @selector confusion from the precompile phase, as shown below:

Because sel = name + address, every image file can contain methods with the same name located at different addresses, so each one has to be fixed up locally. You can verify this by debugging with breakpoints:

The debugging results show the same retain method with different addresses. In the end, the real address loaded by dyld prevails.

As the figure above shows, a system contains multiple libraries. If each library has a retain method, the address used when the method executes has to be offset relative to the start of the program. The retain method in library A sits at an offset from the first address, while the retain method in library B is additionally offset by the size of library A. The method therefore has to be translated for each different address.
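The effect of the fix-up, one canonical address per method name, can be sketched with a small interning table. SelectorTable and fixUpSelectorRefs are hypothetical names; the sketch only illustrates uniquing by name and does not model Mach-O sections or address translation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Toy selector table: every distinct name gets exactly one stable address.
class SelectorTable {
public:
    const char *registerName(const std::string &name) {
        // insert() leaves the table unchanged and returns the existing
        // element when the name was registered before.
        auto it = names_.insert(name).first;
        return it->c_str();
    }

private:
    // Elements of std::unordered_set keep their address across rehashing.
    std::unordered_set<std::string> names_;
};

// Rewrites each image-local selector reference to the canonical address.
void fixUpSelectorRefs(SelectorTable &table, const char **refs, std::size_t count) {
    assert(refs != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        refs[i] = table.registerName(refs[i]);
    }
}
```

Two images that each carry their own copy of the string "retain" start with different pointers; after both are passed through fixUpSelectorRefs with the same table, their references compare equal as pointers.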

readClass: handling wrong or messy classes (core focus)

The class name is set up in this step. It also handles wrong or messy classes, that is, classes that were moved but not deleted. For example, when the space holding a class is moved, the original class should be destroyed; if it is not destroyed completely, a remnant is left behind, which becomes a messy class and a source of wild pointers until it is cleaned up later. The source code is as follows:

_getObjc2ClassList reads the list of classes from the Mach-O executable, after which the classes can be processed. Debugging shows what happens, as in the figure below:

Before readClass executes, cls is just an address, 0x00007FFF889de040. After readClass executes, cls prints as __NSStackBlock__.

This indicates that readClass does some processing that relates the class name to the address, so readClass is analyzed next.

Go to readClass to see its implementation:

Add printf("%s -KC - %s\n", __func__, mangledName); to the function and run it; the output is shown below:

The output includes LGPerson, the class we created ourselves.

Of course, we can filter out the other classes and print only the LGPerson class, as shown below:

Now only the LGPerson class is printed. What is the use of this? It lets us handle a single class separately here, which makes tracing easier.

Set breakpoint 1 and breakpoint 2, as shown below:

The reason for doing this is that some material claims execution goes into the if block at 3365 in the code, but in fact that block is not executed. Continuing to trace the code, the program runs to addNamedClass(), which adds the class name to the mapping of named non-metaclasses (associating it with the class information and adding it to the master table gdb_objc_realized_classes), as shown below:

  • Adds the cls class to the gdb_objc_realized_classes table. This master table was created in doneOnce.
  • After addNamedClass executes, the class name and the address are associated. The core logic is handled by NXMapInsert: the association happens when the entry is inserted into the master table, in the form of a MapPair (key-value).

Continuing from the breakpoint, execution reaches the addClassTableEntry function, as shown below:

  • Inserts the class and its metaclass into the allocatedClasses table, which was created in runtime_init.
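The two insertions can be modeled with standard containers. This is a sketch, not the runtime's data structures: ToyClass and ClassTables are hypothetical, a std::unordered_map stands in for the NXMapTable, and the member functions only borrow the names addNamedClass and addClassTableEntry from the text.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Toy class record; meta points to the metaclass (nullptr for a metaclass).
struct ToyClass {
    std::string name;
    ToyClass *meta;
};

struct ClassTables {
    // Plays the role of gdb_objc_realized_classes: name -> class address.
    std::unordered_map<std::string, ToyClass *> namedClasses;
    // Plays the role of allocatedClasses: classes and metaclasses.
    std::unordered_set<ToyClass *> allocatedClasses;

    // Associates the class name (key) with the class address (value).
    void addNamedClass(ToyClass *cls) {
        assert(cls != nullptr);
        namedClasses[cls->name] = cls;
    }

    // Records the class and, if it has one, its metaclass.
    void addClassTableEntry(ToyClass *cls) {
        assert(cls != nullptr);
        allocatedClasses.insert(cls);
        if (cls->meta != nullptr) allocatedClasses.insert(cls->meta);
    }

    // The two steps traced above, in the same order.
    void readClass(ToyClass *cls) {
        addNamedClass(cls);
        addClassTableEntry(cls);
    }
};
```

After readClass runs for a class named LGPerson, a lookup by name in namedClasses returns its address, and allocatedClasses holds two entries: the class and its metaclass.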


  • noClassesRemapped determines whether any class references (__objc_classrefs) need to be remapped.
  • The next step remaps the classes. The data read comes from the Mach-O sections __objc_classrefs and __objc_superrefs, and remapClassRef is finally called to perform the remapping.
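The remapping step amounts to a table lookup applied to every stored reference. In the sketch below, ToyClass, RemapTable, and remapAll are hypothetical; noClassesRemapped and remapClassRef reuse the names from the text but are simplified stand-ins that take the table as a parameter.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

struct ToyClass {
    const char *name;
};

// Maps an old class address to the address that replaced it.
using RemapTable = std::unordered_map<ToyClass *, ToyClass *>;

// Early-out check: nothing to do when no class was ever remapped.
bool noClassesRemapped(const RemapTable &remapped) {
    return remapped.empty();
}

// Rewrites one reference in place if its target has been remapped.
void remapClassRef(const RemapTable &remapped, ToyClass **ref) {
    assert(ref != nullptr);
    auto it = remapped.find(*ref);
    if (it != remapped.end()) *ref = it->second;
}

// Walks every class reference, as done for class refs and super refs.
void remapAll(const RemapTable &remapped, std::vector<ToyClass *> &classRefs) {
    if (noClassesRemapped(remapped)) return;
    for (ToyClass *&ref : classRefs) remapClassRef(remapped, &ref);
}
```

References whose target is not in the table are left untouched, which is why the walk is safe to run over all references of an image.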


  • Fixes up sel. For example, as my second blog post described, Apple hooks alloc so that the call goes directly to objc_alloc instead of going through the implementation of alloc. Normally this logic is not reached, because the work has already been done at the llvm stage.

Let’s look at the source code for fixupMessageRef, which is familiar, as shown below:

discover categories

According to the comments, this logic is not entered here (even if a category implements +load). Categories are loaded after load_images.

realize non-lazy classes

Non-lazily loaded class processing, as shown below:

  • Classes we implement ourselves normally do not enter this logic (unless they implement the +load method).
  • The comments show that only non-lazily loaded classes enter this logic. nlclslist fetches the list of non-lazily loaded classes, which is read from the Mach-O section __objc_nlclslist. A class that implements +load appears in __objc_nlclslist.
  • The core is the initialization logic in realizeClassWithoutSwift. This function was already encountered in the earlier slow message lookup process.

Non-lazy loading arises in three cases:

1. The class itself implements the +load method.

2. A subclass implements the +load method. (Because initializing a subclass involves initializing its superclass.)

3. A category implements the +load method. (This includes categories of the class itself and categories of its subclasses.)

__objc_nlclslist = __objc_nlclslist;

  • This means the +load method should be avoided where possible, because the whole process is a chain reaction. A class that adds a +load method appears in __objc_nlclslist.

Why is there a difference between lazy and non-lazy loaded classes?

Because Apple's systems allocate on demand: the fewer classes that are initialized during startup, the faster the launch.

Now that we know where the non-lazy-loaded classes are instantiated, where are lazy-loaded classes instantiated?

Instantiation has to go through realizeClassWithoutSwift, so put a breakpoint in it and remove the +load method:

The stack information shows that the class was instantiated during the slow message lookup triggered by the call to alloc. Calling a class method directly produces the same stack and the same steps. This shows that the class is instantiated the first time it is sent a message.

  • Non-lazily loaded classes: when +load is implemented (by a subclass, a category, or the class itself), the class is loaded ahead of time in preparation for the +load call.
  • Lazily loaded classes: loaded the first time a message is sent through objc_msgSend, during the slow lookup in lookUpImpOrForward.

Lazy loading vs. non-lazy loading:

  • Lazy loading class: Data loading is delayed until the first message is sent.

lookUpImpOrForward –> realizeClassMaybeSwiftMaybeRelock –> realizeClassWithoutSwift –> methodizeClass

  • Non-lazily loaded classes: all class data is loaded during map_images.

readClass –> _getObjc2NonlazyClassList –> realizeClassWithoutSwift –> methodizeClass
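The two paths can be contrasted in a small model. ToyClass, realizeClass, mapImages, and sendMessage are hypothetical names; the sketch assumes, as the text states, that only classes with a +load method are prepared when images are mapped and that every other class is prepared on its first message.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct ToyClass {
    std::string name;
    bool hasLoadMethod = false;  // stands for "implements +load"
    bool realized = false;       // set once the class data is prepared
};

// Stand-in for realizeClassWithoutSwift: prepares the class exactly once.
void realizeClass(ToyClass &cls) {
    cls.realized = true;
}

// Non-lazy path: when images are mapped, only +load classes are realized.
void mapImages(const std::vector<ToyClass *> &classes) {
    for (ToyClass *cls : classes) {
        if (cls->hasLoadMethod) realizeClass(*cls);
    }
}

// Lazy path: the first message finds the class unrealized and realizes it.
std::string sendMessage(ToyClass &cls, const std::string &selector) {
    assert(!selector.empty());
    if (!cls.realized) realizeClass(cls);
    return cls.name + " handled " + selector;
}
```

In this model a class without +load stays unrealized through mapImages and costs nothing at launch; the work is paid once, on the first sendMessage call.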