iOS Martial Arts Secrets: article summary

Preface

This article starts from _dyld_objc_notify_register, called in _objc_init, to analyze how classes and categories are loaded.

Demo that may be used in this section

_objc_init()

(1) environ_init method

environ_init() initializes a set of environment variables by reading the ones that affect the runtime.

  • The key code of this method is the body of its for loop.
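The idea of that loop can be sketched in a few lines. This is a minimal model and not the libobjc source: RuntimeOptions and parseRuntimeEnv are hypothetical names, only two of the variables named in this article are handled, and the rule that only the value YES switches an option on is an assumption of the sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical option flags; the real runtime keeps many more.
struct RuntimeOptions {
    bool disableNonpointerIsa = false;
    bool printLoadMethods = false;
};

// Scan "NAME=VALUE" entries and enable the options that are set to YES.
// This only mirrors the idea of the loop in environ_init(); it is not libobjc code.
RuntimeOptions parseRuntimeEnv(const std::vector<std::string>& env) {
    RuntimeOptions opts;
    for (const std::string& entry : env) {
        if (entry.compare(0, 5, "OBJC_") != 0) continue;  // skip unrelated variables
        std::size_t eq = entry.find('=');
        if (eq == std::string::npos) continue;            // malformed entry
        std::string name = entry.substr(0, eq);
        std::string value = entry.substr(eq + 1);
        if (value != "YES") continue;                     // assumption: only YES enables
        if (name == "OBJC_DISABLE_NONPOINTER_ISA") opts.disableNonpointerIsa = true;
        if (name == "OBJC_PRINT_LOAD_METHODS") opts.printLoadMethods = true;
    }
    return opts;
}
```

Passing a list such as PATH=/usr/bin and OBJC_PRINT_LOAD_METHODS=YES to parseRuntimeEnv turns on only the second flag.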

There are two ways to print all environment variables:

  • Copy the for loop out on its own, remove all the conditions, and print each environment variable.

  • Run export OBJC_HELP=1 in the terminal to print the environment variables.

These environment variables can be configured in Xcode under target > Edit Scheme > Run > Arguments > Environment Variables:

  • DYLD_PRINT_STATISTICS: when set to YES, the console prints the app's loading time, including the overall loading time and the loading time of the dynamic libraries, that is, the time spent before main (pre-main time). This shows which part is time-consuming and is a starting point for launch optimization.
  • OBJC_DISABLE_NONPOINTER_ISA: disables the generation of nonpointer isa (a nonpointer isa value ends in 1), so every isa that is generated is an ordinary isa.
  • OBJC_PRINT_LOAD_METHODS: prints information about calls to the +(void)load methods of classes and categories.
  • NSDoubleLocalizedStrings: internationalization and localization are time-consuming; to test what the UI looks like with translated text, set NSDoubleLocalizedStrings to YES as a launch setting.
  • NSShowNonLocalizedStrings: strings are occasionally left unlocalized during internationalization; set NSShowNonLocalizedStrings to YES and every string that is not localized is shown in all capitals.

① Environment variable: OBJC_DISABLE_NONPOINTER_ISA

Take OBJC_DISABLE_NONPOINTER_ISA as an example and set it to YES.

  • Before OBJC_DISABLE_NONPOINTER_ISA is set, the binary printout of the isa value ends in 1.

  • After the OBJC_DISABLE_NONPOINTER_ISA environment variable is set, the last bit becomes 0.

So OBJC_DISABLE_NONPOINTER_ISA is the switch for the isa optimization, which optimizes the overall memory structure.
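The check described above amounts to testing the lowest bit of the raw isa value. A minimal sketch follows; looksLikeNonpointerIsa is a hypothetical helper name, and any value fed to it would come from your own debugger session rather than from this article.

```cpp
#include <cstdint>

// Returns true when the lowest bit of the raw isa value is 1, which is how the
// text above distinguishes a nonpointer isa from an ordinary class pointer.
bool looksLikeNonpointerIsa(std::uint64_t isaBits) {
    return (isaBits & 1ULL) != 0;
}
```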

② Environment variable: OBJC_PRINT_LOAD_METHODS

  • Configure the environment variable that prints +load methods, OBJC_PRINT_LOAD_METHODS, and set it to YES.
  • Override the +load function in the TCJPerson class and run the program; the +load calls are printed.

So OBJC_PRINT_LOAD_METHODS can monitor all +load methods, which helps with launch optimization (covered in a future article).

(2) tls_init method

tls_init() deals with the binding of per-thread keys, mainly the initialization of thread-local storage and its destructor.

(3) static_init method

According to its comment, static_init() runs the C++ static constructors (only the system-level ones).

libc calls _objc_init before dyld calls the static constructors, so libobjc has to run them itself.

(4) runtime_init method

This is mainly runtime initialization, which has two parts: category initialization and class table initialization (the corresponding functions are covered later).

(5) exception_init method

exception_init() mainly initializes libobjc's exception handling system and registers the exception handling callback so that exceptions can be monitored.

  • When a crash occurs (a crash means the system hit a disallowed instruction and raised a signal for it), execution reaches _objc_terminate and then goes on to uncaught_handler, which throws the exception.

  • Searching for uncaught_handler shows that the app layer passes in a function for handling the exception; that function is called and control then returns to the app layer. Here fn is the function passed in, that is, uncaught_handler is equal to fn.
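The relationship between fn and uncaught_handler is a plain callback slot. The sketch below models it under simplifying assumptions: UncaughtHandler, setUncaughtHandler, simulateTerminate, appCrashHandler, and lastReason are hypothetical names, and a string stands in for the exception object.

```cpp
#include <string>

// Hypothetical handler type: receives a description of the uncaught exception.
using UncaughtHandler = void (*)(const std::string& reason);

static void defaultHandler(const std::string&) {}          // does nothing
static UncaughtHandler uncaught_handler = defaultHandler;  // slot owned by the "runtime"

// Store the function supplied by the app layer and return the previous one.
UncaughtHandler setUncaughtHandler(UncaughtHandler fn) {
    UncaughtHandler old = uncaught_handler;
    uncaught_handler = fn;   // uncaught_handler is now equal to fn
    return old;
}

// Stand-in for the terminate path: forward the exception to the stored handler.
void simulateTerminate(const std::string& reason) {
    uncaught_handler(reason);
}

// Example app-layer handler: records the reason so it could be logged or uploaded.
std::string lastReason;
void appCrashHandler(const std::string& reason) { lastReason = reason; }
```

After setUncaughtHandler(appCrashHandler), a call to simulateTerminate runs the app-layer function, which is the "call the function, then return to the app layer" flow described above.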

① Classification of crashes

The main cause of a crash is receiving a signal that is not handled. Such signals come from three places: the kernel, other processes, and the app itself.

Therefore, there are three types of crash:

  • Mach exceptions: the lowest-level, kernel-level exceptions. Developers in user space can catch Mach exceptions by setting exception ports for the thread, task, and host directly through the Mach API.

  • Unix signals: also known as BSD signals. If the developer does not catch the Mach exception, the host-layer method ux_exception() converts it into the corresponding Unix signal, and threadsignal() delivers the signal to the thread that caused the error. Signals can be caught with signal(x, SignalHandler).

  • NSException (application-level exception): an uncaught Objective-C exception that causes the program to send itself a SIGABRT signal and crash. An Objective-C exception can be caught with @try/@catch or through the NSSetUncaughtExceptionHandler() mechanism.

For application-level exceptions, registering an exception handling function through the NSSetUncaughtExceptionHandler mechanism makes it possible to keep the thread alive and to collect and upload crash logs.

② Application-level crash interception

So in development, crashes are intercepted by giving the system an exception handler from app code through NSSetUncaughtExceptionHandler: a function is passed to the system and is called when an exception occurs (the function can keep the thread alive and collect and upload crash logs), after which control returns to the app layer. In essence it is a callback function.

The above approach is only suitable for collecting application-level exceptions. What we need to do is replace the ExceptionHandler with a custom function.

(6) cache_t::init() method

This is mainly cache initialization.

(7) _imp_implementationWithBlock_init method

This method mainly starts the callback mechanism. Normally it does not do much, because all initialization is lazy, but for some processes libobjc-trampolines.dylib is loaded eagerly rather than waiting.

(8) _dyld_objc_notify_register: dyld registration

The concrete implementation of this method is described in detail in iOS Martial Arts Secrets ⑦: dyld loading process (application loading). Its source is in the dyld source code, where the _dyld_objc_notify_register method is declared.

From the comment on the _dyld_objc_notify_register method:

  • It is only for use by the objc runtime.
  • It registers handlers that are called when objc images are mapped, unmapped, and initialized.
  • dyld calls back the mapped function with an array of the images that contain an objc-image-info section.

The three parameters of _dyld_objc_notify_register have the following meanings:

  • map_images: triggered when dyld loads an image into memory.
  • load_images: triggered when dyld initializes an image.
  • unmap_image: triggered when dyld removes an image.

II. The association between dyld and objc

The association between dyld and objc can be seen from the source code, by comparing how the method is implemented and how it is called.

The implementation is in the dyld source and the call is in the libobjc source. Comparing the two:

  • mapped is equivalent to map_images
  • init is equivalent to load_images
  • unmapped is equivalent to unmap_image

In the dyld implementation, stepping into registerObjCNotifiers gives the following equivalences:

  • sNotifyObjCMapped == mapped == map_images
  • sNotifyObjCInit == init == load_images
  • sNotifyObjCUnmapped == unmapped == unmap_image

load_images call timing

In iOS Martial Arts Secrets ⑦: dyld loading process (application loading), we saw that load_images is called through sNotifyObjCInit in the notifySingle method.

map_images call timing

The call timing of load_images was explained in the dyld loading process. Now take map_images as an example and see when it is called.

  • A global search for sNotifyObjCMapped in dyld shows that it is called in the notifyBatchPartial method.

  • A global search for notifyBatchPartial shows that it is called in the registerObjCNotifiers method.

Now let's sort out the dyld process:

  • recursiveInitialization calls bool hasInitializers = this->doInitialization(context);, which determines whether the image has been loaded.
  • doInitialization calls doImageInit and doModInitFunctions(context). These two methods go into the libSystem framework and call libSystem_initializer, which eventually calls _objc_init.
  • _objc_init calls _dyld_objc_notify_register and passes map_images, load_images, and unmap_image to dyld's registerObjCNotifiers method.
  • In registerObjCNotifiers, the map_images passed by _dyld_objc_notify_register is assigned to sNotifyObjCMapped, load_images to sNotifyObjCInit, and unmap_image to sNotifyObjCUnmapped.
  • After the assignments, registerObjCNotifiers goes on to call notifyBatchPartial().
  • notifyBatchPartial calls (*sNotifyObjCMapped)(objcImageCount, paths, mhs);, which triggers map_images.
  • After dyld's recursiveInitialization has finished calling bool hasInitializers = this->doInitialization(context), it calls notifySingle().
  • notifySingle() calls (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());. Since load_images was assigned to sNotifyObjCInit above, this triggers load_images.
  • sNotifyObjCUnmapped is triggered in the removeImage method, which, as the name says, removes an image (a mapped image file).
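The registration flow in the list above can be modelled with three stored function pointers. The sketch reuses the names from the text so the steps are easy to match, but it is a simplified model, not dyld or libobjc code: the callback signatures are hypothetical and much smaller than the real ones, and callLog exists only to record the order of the calls.

```cpp
#include <string>
#include <vector>

// Hypothetical callback types; the real dyld signatures carry image headers and paths.
using MappedFn = void (*)(int imageCount);
using InitFn = void (*)(const char* path);
using UnmappedFn = void (*)(const char* path);

static MappedFn sNotifyObjCMapped = nullptr;
static InitFn sNotifyObjCInit = nullptr;
static UnmappedFn sNotifyObjCUnmapped = nullptr;

std::vector<std::string> callLog;  // records the order in which callbacks fire

void map_images(int) { callLog.push_back("map_images"); }
void load_images(const char*) { callLog.push_back("load_images"); }
void unmap_image(const char*) { callLog.push_back("unmap_image"); }

// Model of registerObjCNotifiers: store the callbacks, then report the images
// that are already mapped (the role notifyBatchPartial plays in the text).
void registerObjCNotifiers(MappedFn mapped, InitFn init, UnmappedFn unmapped) {
    sNotifyObjCMapped = mapped;
    sNotifyObjCInit = init;
    sNotifyObjCUnmapped = unmapped;
    (*sNotifyObjCMapped)(1);
}

// Model of notifySingle: runs later, once an image has been initialized.
void notifySingle(const char* path) {
    if (sNotifyObjCInit) (*sNotifyObjCInit)(path);
}

// Model of removeImage: the unmapped callback fires when an image goes away.
void removeImage(const char* path) {
    if (sNotifyObjCUnmapped) (*sNotifyObjCUnmapped)(path);
}
```

Calling registerObjCNotifiers(&map_images, load_images, unmap_image), then notifySingle and removeImage, records map_images, load_images, unmap_image in that order in this model.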

So map_images is called before load_images.

How dyld is associated with objc

Combined with the dyld loading process, the association between dyld and objc is as follows:

  • Registering the callback functions in dyld can be understood as adding an observer.
  • objc registering with dyld can be understood as sending a notification.
  • Triggering the callback can be understood as executing the notification's selector.

Let's look at what map_images, load_images, and unmap_image do.

  • map_images: mainly manages all the symbols in files and dynamic libraries, that is, class, protocol, selector, category, and so on.
  • load_images: loads and executes the +load methods.
  • unmap_image: unloads and removes data.

Code is compiled into a Mach-O executable file, and is then read from the Mach-O into memory.

III. map_images

Before looking at the source code, we need to explain why map_images is passed with & and load_images is not:

  • map_images is passed by reference, so it changes when the outside changes.
  • load_images is passed by value, not by reference.

map_images is triggered when an image file is loaded into memory; its main job is to load the class information in the Mach-O into memory.

map_images calls map_images_nolock, where hCount is the number of image files, and map_images_nolock calls _read_images to load the image files (this is the key method).

_read_images

_read_images mainly loads class information, that is, classes, categories, protocols, and so on. Its source code implementation is divided into the following parts:

  • ① Conditional control: perform the one-time load and create the tables
  • ② Fix the @selector mix-ups left by the precompile phase
  • ③ Handle erroneous, messy classes
  • ④ Fix up remapped classes that were not loaded by the image file
  • ⑤ Fix up some messages
  • ⑥ When a class has protocols: read them with readProtocol
  • ⑦ Fix up protocols that were not loaded
  • ⑧ Category processing
  • ⑨ Class loading processing
  • ⑩ Realize classes that have not been processed and optimize the classes that were disturbed

① Conditional control: perform the one-time load and create the tables

Inside doneOnce, NXCreateMapTable creates a table that stores class information, namely the class hash table gdb_objc_realized_classes, whose purpose is to make class lookup convenient and fast.

According to the comment on gdb_objc_realized_classes, this hash table stores the named classes that are not in the shared cache, whether or not they are realized, and its capacity is 4/3 of the number of classes.

② Fix the @selector mix-ups left by the precompile phase

This is done mainly by using _getObjc2SelectorRefs to get the __objc_selrefs static section of the Mach-O, traversing the list, and calling sel_registerNameNoLock to add each SEL to the namedSelectors hash table.

A selector (SEL) is not a plain string; it is a string with an address.

_getObjc2SelectorRefs reads the __objc_selrefs static section of the Mach-O. The other functions that start with _getObjc2 also fetch static sections of the Mach-O, each one corresponding to a different section name.

The source path of sel_registerNameNoLock is sel_registerNameNoLock -> __sel_registerName. Its key code is auto it = namedSelectors.get().insert(name);, that is, the sel is inserted into the namedSelectors hash table.
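The effect of inserting every name into one table is that equal names end up sharing one address, which is what "a string with an address" means in practice. The sketch below illustrates this with a standard container; Sel and registerSelectorName are hypothetical names, and std::unordered_set merely stands in for the runtime's own table.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical stand-in for SEL: a pointer to the uniqued name.
using Sel = const std::string*;

// Toy selector table playing the role of namedSelectors.
static std::unordered_set<std::string> namedSelectors;

// Insert the name if it is new and return the address of the stored copy.
// Equal names always come back as the same address.
Sel registerSelectorName(const std::string& name) {
    auto it = namedSelectors.insert(name).first;
    return &*it;
}
```

Because two registrations of the same name return the same pointer, selectors can then be compared by address instead of by string contents.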

③ Handle erroneous, messy classes

All classes are taken out of the Mach-O and processed in a traversal.

Debugging shows that before readClass runs, cls is just an address.

After readClass runs, cls has a class name. At this point the only information stored for the class is its address and name.

if (newCls != cls && newCls) {}

④ Fix up remapped classes that were not loaded by the image file

Classes and superclasses that were not mapped are remapped, where:

  • _getObjc2ClassRefs gets the __objc_classrefs static section of the Mach-O, that is, the references to classes.
  • _getObjc2SuperRefs gets the __objc_superrefs static section of the Mach-O, that is, the references to superclasses.
  • According to the comments, the classes handled by remapClassRef are lazily loaded classes, so this part of the code was not executed during the initial debugging.

⑤ Fix up some messages

_getObjc2MessageRefs gets the __objc_msgrefs static section of the Mach-O, and the traversal uses fixupMessageRef to register each function pointer and fix it up to the new function pointer.

⑥ When a class has protocols: read them with readProtocol

  • NXMapTable *protocol_map = protocols(); creates the protocol hash table, named protocol_map.

  • _getObjc2ProtocolList gets the __objc_protolist protocol list from the static section of the Mach-O, that is, it reads the protocols emitted by the compiler and initializes them.

  • The protocol list is traversed, and readProtocol adds each protocol to the protocol_map hash table.

⑦ Fix up protocols that were not loaded

_getObjc2ProtocolRefs gets the __objc_protorefs static section of the Mach-O (not the same thing as __objc_protolist in ⑥). The protocols that need fixing are then traversed, and remapProtocolRef compares the current protocol with the memory address of the same protocol in the protocol list; if they differ, it is replaced.

⑧ Category processing

This mainly handles categories, which need to be initialized and loaded into their classes. For categories present at startup, discovery is deferred until the first load_images call after the call to _dyld_objc_notify_register completes.

⑨ Class loading processing

This step realizes classes, specifically the non-lazy classes.

  • _getObjc2NonlazyClassList gets the __objc_nlclslist static section of the Mach-O, the table of non-lazy classes.
  • addClassTableEntry inserts the non-lazy classes into the class table and stores them in memory. A class that has already been added is not added again; this ensures the whole structure is added.
  • realizeClassWithoutSwift realizes the current class. In ③ above, readClass only read the address and name into memory; the class's data had not been loaded yet.

Apple's official definition of a non-lazy class is:

NonlazyClass is all about a class implementing or not a +load method.

So a class that implements the +load method is a non-lazy class; otherwise it is a lazy class.

  • Lazy loading: the class does not implement the +load method. When a message is first sent to the class, the message lookup checks whether the class has been loaded, and loads it if it has not.
  • Non-lazy loading: the class implements the +load method, so it is loaded ahead of time.

Why does implementing +load make a class non-lazy?

  • Mainly because +load is executed early (+load methods are called in load_images, and the precondition is that the class exists).

When do lazy classes load?

  • When a method is called on them for the first time.
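The two loading times can be put side by side in a toy model. Everything below is hypothetical scaffolding (ToyClass, realize, readImages, sendMessage); it only illustrates the rule stated above, that a class implementing +load is realized while images are read and any other class is realized when it receives its first message.

```cpp
#include <string>
#include <vector>

// Toy class record: only the fields needed for the illustration.
struct ToyClass {
    std::string name;
    bool implementsLoad = false;  // true means "non-lazy" in the text's terms
    bool realized = false;
};

void realize(ToyClass& cls) { cls.realized = true; }

// Image-load time: only classes that implement +load are realized up front.
void readImages(std::vector<ToyClass>& classes) {
    for (ToyClass& cls : classes) {
        if (cls.implementsLoad) realize(cls);
    }
}

// First message: a lazy class is realized here, on demand.
void sendMessage(ToyClass& cls) {
    if (!cls.realized) realize(cls);
    // ... method lookup would continue here ...
}
```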

⑩ Realize classes that have not been processed and optimize the classes that were disturbed

This step realizes the classes that have not been handled yet and optimizes the classes that were disturbed.

What we need to focus on are two methods: readClass in ③ and realizeClassWithoutSwift in ⑨.

③ readClass

readClass reads the class. Before this method is called, cls is just an address; after it runs, cls has the name of the class. The key code is addNamedClass and addClassTableEntry.

From the source code, it is mainly divided into the following steps:

  • ① Get the name of the class through mangledName.

  • ② If a superclass of the current class is missing (a weak-linked class), return nil. Debugging shows that this branch is not entered.

  • ③ The popFutureNamedClass branch is not normally entered. It is a special operation for future classes that are still to be resolved, so ro and rw are not touched here (breakpoint debugging shows that neither custom classes nor system classes enter it).

  • ④ addNamedClass adds the current class to the gdb_objc_realized_classes hash table created earlier, which holds all the classes.

  • ⑤ addClassTableEntry adds the initialized class to the allocatedClasses table, which is initialized by runtime_init in _objc_init.

  • ⑥ To stop in the readClass source on a custom class, add a custom if check.

In summary, the main job of readClass is to read the classes in the Mach-O into memory, that is, to insert them into the tables. At this point the class only has two pieces of information, its address and its name; the class data in the Mach-O has not been read yet.

⑨ realizeClassWithoutSwift: realizing the class

realizeClassWithoutSwift contains the operations on ro and rw. This method was mentioned in the slow lookup of the message flow, where the path is: slow lookup (lookUpImpOrForward) -> realizeAndInitializeIfNeeded_locked -> realizeClassMaybeSwiftAndLeaveLocked -> realizeClassMaybeSwiftMaybeRelock -> realizeClassWithoutSwift (realize the class).

realizeClassWithoutSwift realizes the class, loading the class's data into memory. It mainly does the following:

  • ① Read data and set ro and rw
  • ② Call realizeClassWithoutSwift recursively to complete the inheritance chain
  • ③ Methodize the class through methodizeClass

① Read data and set ro and rw

The class's data is read and cast to ro, rw is initialized, and ro is copied into the ro slot of rw.

  • ro means readOnly. It holds the class name, methods, protocols, and instance variables. Since it is read-only, it belongs to clean memory, which is memory that does not change after loading.
  • rw means readWrite, that is, readable and writable. Because of its dynamic nature, properties, methods, and protocols can be added to a class. The memory optimization notes in the WWDC 2020 session Advancements in the Objective-C runtime (WWDC 2020, Videos, Apple Developer) mention that only about 10% of classes actually change their methods, which is why rwe exists, that is, the extra information for the class. For classes that do need the extra information, one of the rwe extension records is allocated and slotted into the class for its use. rw belongs to dirty memory, which is memory that changes while the process is running. A class structure becomes dirty memory as soon as it is used, because the runtime writes new data into it, for example creating a new method cache and pointing to it from the class.

② Call realizeClassWithoutSwift recursively to complete the inheritance chain

Recursive calls to realizeClassWithoutSwift complete the inheritance chain and set up the rw of the current class, its superclass, and its metaclass.

  • Call realizeClassWithoutSwift recursively to set up the superclass and the metaclass.
  • Point the class's superclass and isa at that superclass and metaclass.
  • addSubclass and addRootClass set up the two-way linked list between parent and child, so that the subclasses can be found from the superclass and the superclass can be found from the subclass.

There is a catch in the recursive calls to realizeClassWithoutSwift: when isa reaches the root metaclass, the root metaclass's isa points to itself and never returns nil. So the following termination conditions exist, whose purpose is to make sure a class is only loaded once.

In realizeClassWithoutSwift:

  • If the class does not exist, return nil.
  • If the class has already been realized, return cls directly.

In the remapClass method, if cls does not exist, nil is returned directly.
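A small model shows why those two exits are enough to stop the recursion even though the root metaclass's isa points to itself. ClassNode, realizeClass, and realizeCalls are hypothetical names, and the sketch assumes the class is marked as realized before the recursive calls are made.

```cpp
// Minimal class record: superclass link, isa link, and a "realized" flag.
struct ClassNode {
    ClassNode* superclass = nullptr;
    ClassNode* isa = nullptr;      // metaclass; the root metaclass points to itself
    bool realized = false;
};

int realizeCalls = 0;  // counts how many classes were actually realized

// Recursive realization with the two exits described above.
ClassNode* realizeClass(ClassNode* cls) {
    if (!cls) return nullptr;        // exit 1: the class does not exist
    if (cls->realized) return cls;   // exit 2: already realized, do nothing
    cls->realized = true;            // mark first so the self-referencing isa stops
    ++realizeCalls;
    realizeClass(cls->superclass);   // complete the superclass chain
    realizeClass(cls->isa);          // complete the metaclass chain
    return cls;
}
```

With a root class, a root metaclass, one subclass, and its metaclass wired together, realizing the subclass realizes each of the four exactly once, and a second call changes nothing.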

③ Methodize the class through methodizeClass

methodizeClass reads the method list (including the methods in categories), the property list, and the protocol list from ro, assigns them to rw, and returns cls.

Breakpoint debugging realizeClassWithoutSwift (objc4-818.2)

To track a custom class, custom logic also needs to be added before the realizeClassWithoutSwift call in step ⑨ of _read_images, mainly to make the custom class easier to debug.

  • Add the custom logic before the realizeClassWithoutSwift call in step ⑨ of _read_images.

  • Override +load in TCJPerson, because only non-lazy classes call realizeClassWithoutSwift here for initialization.

  • Rerun the program; execution arrives at the custom logic in step ⑨ of _read_images.

  • Set a breakpoint at the realizeClassWithoutSwift call, run, and stop there.

  • Inside realizeClassWithoutSwift, set a breakpoint at auto ro = (const class_ro_t *)cls->data();, run, and stop there. This line reads data from the compiled Mach-O file and converts it according to a fixed data format (a cast to class_ro_t *). At this point ro is not yet associated with our cls. Step down and look at what ro contains.

  • auto isMeta = ro->flags & RO_META; determines whether the current cls is a metaclass. It is not a metaclass here, so execution continues into the else branch. Set a breakpoint at rw->set_ro(ro);, stop, and look at rw: at this point rw is 0x0, including ro and rwe.

The values are empty. ro_or_rw_ext holds either ro or rw_ext; ro is clean memory and rw_ext is dirty memory.

Printing cls at this point shows that the last address is empty.

  • Move the breakpoint to if (isMeta) cls->cache.setBit(FAST_CACHE_META); and print cls again: the last address is still empty. Yet cls->setData(rw); has already reassigned the data of cls, so why is it empty?

This is because ro is clean memory. So why are there both clean memory and dirty memory? While an iOS app is running, memory is constantly being added to and removed from, which would otherwise operate heavily on the original data. To prevent the original data from being modified, a copy of the clean memory is placed in rw. Then why is there an rwe (dirty memory) in addition to rw? Because not every class is dynamically inserted into or deleted from. Adding a property or a method touches a lot of memory and has a big impact on memory consumption, so an rwe is generated only when a class is processed dynamically.

Next look at the source code implementation of set_ro. The path is: set_ro -> set_ro_or_rwe (get_ro_or_rwe fetches the value from ro_or_rw_ext through the ro_or_rw_ext_t type) -> the ro in ro_or_rw_ext_t.

From the source code, ro is obtained in two cases, depending on whether the class has been modified at runtime:

  • If it has been modified at runtime, ro is read from rw.

  • Otherwise, ro is read from ro directly.

  • Let's move on to the important part:

The superclass and the metaclass are passed through the same call so that they go through the same process. This is done to establish the inheritance chain; the calls are recursive and return when cls does not exist.

Continue down to the if (isMeta) { code. isMeta is YES at this point because cls is indeed a metaclass. cls->setInstancesRequireRawIsa(); sets up the isa.

  • Set a breakpoint at if (supercls && !isMeta), continue, and stop there. At this point cls is an address rather than the earlier TCJPerson. Why? Because metacls = realizeClassWithoutSwift(remapClass(cls->ISA()), nil); above fetches the metaclass. Let's verify that.

We see that cls is indeed the metaclass at this point.

methodizeClass: methodizing the class

The methodizeClass source code implementation is mainly divided into the following parts:

  • Attach the property list, method list, protocol list, and so on to rwe
  • Attach the methods in categories (explained in the next article)

The logic of rwe

The logic for adding the method list to rwe is as follows:

  • Get the baseMethods of ro
  • Sort the methods through prepareMethodLists
  • Process rwe and insert the lists through attachLists

How methods are sorted

The runtime article (⑥) showed that method lookup uses binary search, which means the sel-imp pairs are sorted. So how are they sorted?

  • Step into the prepareMethodLists source code implementation: internally it sorts through fixupMethodList.

  • Step into the fixupMethodList source code implementation: it sorts by selector address.
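Sorting by selector address is what makes the later binary search possible. The sketch below shows both halves on a toy method list; ToyMethod, sortBySelAddress, and findImp are hypothetical names, selectors are reduced to plain addresses, and an integer stands in for the IMP.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Toy method entry: the selector is reduced to its address.
struct ToyMethod {
    std::uintptr_t selAddress;
    int impId;  // stand-in for the IMP
};

// Sort by selector address, keeping the relative order of equal keys.
void sortBySelAddress(std::vector<ToyMethod>& methods) {
    std::stable_sort(methods.begin(), methods.end(),
                     [](const ToyMethod& a, const ToyMethod& b) {
                         return a.selAddress < b.selAddress;
                     });
}

// Binary search over the sorted list; returns the impId or -1 when absent.
int findImp(const std::vector<ToyMethod>& sorted, std::uintptr_t sel) {
    auto it = std::lower_bound(sorted.begin(), sorted.end(), sel,
                               [](const ToyMethod& m, std::uintptr_t key) {
                                   return m.selAddress < key;
                               });
    return (it != sorted.end() && it->selAddress == sel) ? it->impId : -1;
}
```

findImp only gives correct answers on a list that sortBySelAddress has already ordered, which is why the sort has to happen when the class is prepared rather than at lookup time.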

Verifying the method sorting

Next we verify the sorting of the methods by debugging.

  • Add custom logic in methodizeClass and stop there.

  • Read the method list in cj_ro.

  • Step into prepareMethodLists, which sorts the baseMethods in ro. Add a custom breakpoint (mainly for research purposes), run to the custom logic, and stop (cj_isMeta is added here mainly to filter out the methods of the metaclass with the same name).

  • Step on to fixupMethodList, which sorts the sels. Step into the fixupMethodList source code implementation (sels are sorted by selector address) and stop again after the sort, that is, once the methods have gone through one round of sorting.

Comparing the method list before and after the sort leads to this summary: methodizeClass puts the methods (protocols, and so on) of the class in order.

  • Go back to the methodizeClass method.

At this point rwe is NULL, that is, it has not been assigned and the branch is not entered (data() -> ro -> rw -> rwe is empty). Why is that? We will analyze this question later.

At this point, another question may come to mind: for non-lazy classes we know when realizeClassWithoutSwift is called, so when is it called for lazy classes?

Comment out the +load method in the test code.

Call the cj_instanceMethod1 method in main, set a breakpoint in realizeClassWithoutSwift, and print the stack when the breakpoint is hit.

Why does execution reach realizeClassWithoutSwift? Because we called the alloc method, which sends a message. This process was discussed earlier in iOS Martial Arts Secrets ⑥: Runtime methods and messages. This is the benefit of lazy loading: the class is only really loaded when the first message is handled.

So lazy classes and non-lazy classes load their data at different times.

attachToClass method

attachToClass mainly attaches categories to the main class.

The outer loop in attachToClass finds a category and then goes into attachCategories once for it.

attachCategories method

attachCategories prepares the category data.

  • ① auto rwe = cls->data()->extAllocIfNeeded(); creates rwe. Why is rwe initialized here? Because we are about to add properties, methods, protocols, and so on to the class, that is, to modify what was clean memory.

    • Step into the extAllocIfNeeded source code implementation: it checks whether rwe exists, returns it directly if so, and allocates it if not.
    • Step into the extAlloc source code implementation, that is, the 0-to-1 creation of rwe. In this process the data of the class itself is loaded in.
  • ② The key code is rwe->methods.attachLists(mlists + ATTACH_BUFSIZ - mcount, mcount);, which passes the entries stored at the end of mlists. The data in mlists comes from the preceding for loop.

  • ③ When debugging, the name in category_t is TCJPerson at compile time (see the clang output) and TCJA, the name of the category, at run time.

  • ④ mlists[ATTACH_BUFSIZ - ++mcount] = mlist; debugging shows that mcount is 1 at this point. This can be understood as inserting in reverse order, from the back of the buffer. ATTACH_BUFSIZ is 64, so the buffer holds up to 64 category lists at a time.

Conclusion: rwe has to be initialized when properties, methods, protocols, and so on are added to the class. Its initialization is triggered by categories, addMethod, addProperty, and addProtocol, that is, rwe is only initialized when the original class is modified or processed.
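The "allocate rwe only when the class is modified" rule can be sketched as a lazily created extension record. ToyRO, ToyRW, ToyRWE, and addMethod are hypothetical types and functions; only the method name extAllocIfNeeded is borrowed from the text, and the body here is an illustration rather than the runtime's implementation.

```cpp
#include <memory>
#include <string>
#include <vector>

// Read-only data produced at compile time (clean memory in the text's terms).
struct ToyRO {
    std::vector<std::string> baseMethods;
};

// Extension allocated only when the class is modified at runtime.
struct ToyRWE {
    const ToyRO* ro = nullptr;
    std::vector<std::string> methods;
};

struct ToyRW {
    const ToyRO* ro = nullptr;
    std::unique_ptr<ToyRWE> rwe;   // empty until something needs it

    // Return the extension, creating it on first use (the "0 to 1" step).
    ToyRWE* extAllocIfNeeded() {
        if (!rwe) {
            rwe.reset(new ToyRWE());
            rwe->ro = ro;
            rwe->methods = ro->baseMethods;  // start from the class's own methods
        }
        return rwe.get();
    }
};

// Adding a method at runtime is one of the operations that forces rwe to exist.
void addMethod(ToyRW& rw, const std::string& name) {
    rw.extAllocIfNeeded()->methods.push_back(name);
}
```

A class that is never modified keeps rwe empty and pays only for ro and rw; the read-only data itself is never written to.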

attachLists method: insertion

How does attachLists insert data? Can methods, properties, and protocols all be inserted directly through attachLists?

The method and property lists inherit from entsize_list_tt, and the protocol list is implemented similarly to entsize_list_tt. The container that holds them is a two-dimensional array.

Step into the source code implementation of attachLists.

From the source code implementation of attachLists, insertion falls into three cases:

  • Case ① many to many: the two-dimensional list_array_tt that attachLists is called on already holds several one-dimensional arrays
    • malloc creates a container of type array_t sized for the new total; it is stored in array and read back through array()
    • A reverse traversal moves the original data to the end of the container
    • A traversal stores the new data from the start of the container
  • Case ② 0 to 1: the two-dimensional list_array_tt that attachLists is called on is empty and exactly 1 list is being added
    • The first list of addedLists is assigned directly
  • Case ③ 1 to many: the two-dimensional list_array_tt that attachLists is called on holds exactly one one-dimensional array
    • malloc creates a container of type array_t sized for the total; that is, an array is created, stored in array, and read back through array()
    • Since there was only one one-dimensional array, it is assigned directly to the last position of the new array
    • A loop stores the new lists from the start of the array, where array()->lists is the position of the first element

In case ③, 1 to many, the lists here are category lists.

  • This is why, in everyday development, a subclass's implementation of a superclass method overrides the superclass's.
  • Similarly, this is why, for methods with the same name, the category's method overrides the class's method.
  • The operation borrows from the LRU (least recently used) way of thinking: newlist is added in order to use the methods in it, so newlist is the most valuable to the user and is called with priority.
  • The main reason for ending up in the 1-to-many case is that categories have been added: old elements go to the back and new elements to the front, fundamentally so that the category is called first. That is what categories are for.
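The ordering rule shared by the three cases, new lists in front and existing lists behind, can be shown with standard containers. This is a simplified model: ToyListArray and firstListContaining are hypothetical names, and a vector of vectors replaces the manually allocated array_t, so the three cases collapse into one code path.

```cpp
#include <string>
#include <vector>

using MethodList = std::vector<std::string>;   // one one-dimensional list

// Toy version of the two-dimensional container: a list of method lists.
struct ToyListArray {
    std::vector<MethodList> lists;

    // New lists go to the front; lists that were already present shift to the back.
    void attachLists(const std::vector<MethodList>& addedLists) {
        if (addedLists.empty()) return;
        std::vector<MethodList> merged(addedLists);                 // new data first
        merged.insert(merged.end(), lists.begin(), lists.end());    // old data after it
        lists.swap(merged);
    }

    // Lookup walks from the front, so the most recently attached list wins.
    // Returns the index of the list that answers, or -1.
    int firstListContaining(const std::string& name) const {
        for (std::size_t i = 0; i < lists.size(); ++i)
            for (const std::string& m : lists[i])
                if (m == name) return static_cast<int>(i);
        return -1;
    }
};
```

Attaching the class's own list first and a category list second leaves the category list at index 0, so a method name present in both is answered by the category, which matches the override behaviour described above.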

Hum, there is only principle without operation, I believe you a ghost, that next, we will verify one side.

rwe data loading (verification)

Prepare the test code: the class TCJPerson and its categories TCJA and TCJB

rwe: loading the main class's data

The next step is to verify the rwe 0-to-1 process by debugging, that is, adding the main class's method list

In attachCategories, add custom filtering logic, set a breakpoint in extAlloc, run, and break. The stack information shows that we arrive from auto rwe = cls->data()->extAllocIfNeeded(); in attachCategories. This is where rwe is allocated

So why is rwe initialized here? Because we are about to add properties, methods, protocols and so on to the class, which means modifying memory that was clean until now, so rwe has to be created. rwe is initialized when categories are processed, and several paths lead to that initialization: categories, addMethod, addPro, and addProtocol

  • p rwe, p *$0: at this point the list_array_tt members of rwe are empty. rwe has only just been initialized and nothing has been assigned yet

  • Continue to the breakpoint at if (list) {, then p list and p *$2: list is the method list of the TCJPerson main class

  • In attachLists, set a breakpoint at if (hasArray()) {, run until it breaks, and continue. Execution goes into the else-if branch, i.e. 0 to 1: adding the method list of the TCJPerson main class takes the 0-to-1 path

  • p addedLists: this is the address of a list pointer, the first element of the mlists passed in, of type method_list_t *const *

  • Then p addedLists[0] -> p *$6 -> p $7.get(0).big() to look at the first method

  • Continue with p addedLists[1] -> p *$9: there is no valid value here, because this memory belongs to something else

Conclusion: case ②, 0 to 1, is a plain assignment of a single one-dimensional list.

rwe: loading the data of category TCJA

Continuing from the previous step, print list with p list: list is a method_list_t structure

Keep stepping until method_list_t *mlist = entry.cat->methodsForMeta(isMeta);. With p mlist -> p *$12 -> p $13.get(0).big() we can see that mlist is the method list of category TCJA

Add a breakpoint at if (mcount > 0) {, continue execution, and break

Step over once more and look at mlists, which is a collection of collections (an array of method lists)

mlists + ATTACH_BUFSIZ - mcount is a memory offset (pointer arithmetic)

  • p mlists + ATTACH_BUFSIZ - mcount: because mcount = 1 and ATTACH_BUFSIZ = 64, the pointer moves from the first position to index 63, i.e. the last element
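
The offset arithmetic is easier to see in a small sketch of a fixed buffer that is filled from the back. It assumes at most ATTACH_BUFSIZ entries and ignores overflow handling; collectFromBack and firstUsedIndex are hypothetical helpers, and strings stand in for the method list pointers.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

constexpr uint32_t ATTACH_BUFSIZ = 64;  // same value as quoted in the text

// Index of the first used slot once mcount entries have been stored.
uint32_t firstUsedIndex(uint32_t mcount) { return ATTACH_BUFSIZ - mcount; }

// Fills a fixed buffer from the back and returns the used tail, front to back.
// names stands in for the category method lists, in the order they arrive.
std::vector<std::string> collectFromBack(const std::vector<std::string>& names) {
    assert(names.size() <= ATTACH_BUFSIZ);  // this sketch does not handle overflow
    const std::string* buffer[ATTACH_BUFSIZ] = {};
    uint32_t mcount = 0;
    for (const std::string& n : names) {
        buffer[ATTACH_BUFSIZ - ++mcount] = &n;  // the first entry lands at index 63
    }
    // Same shape as mlists + ATTACH_BUFSIZ - mcount: skip the unused front slots.
    const std::string** start = buffer + ATTACH_BUFSIZ - mcount;
    std::vector<std::string> out;
    for (uint32_t i = 0; i < mcount; i++) out.push_back(*start[i]);
    return out;
}
```

With one entry the used region starts at index 63. With two entries the one that arrived later sits in front of the earlier one, which matches the later-attached-goes-first ordering described above.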

Enter attachLists, add a breakpoint at if (hasArray()) {, and continue. Since there is already one list, execution takes the 1-to-many path

When that has finished, print the current array with p array()

The array is a list_array_tt: it stores many method_list_t, and each method_list_t stores many method_t.

Conclusion: if the main class has only one category, case ③ (1 to many) is taken.

rwe: loading the data of category TCJB

If one more category, TCJB, is added, case ① is taken, which is many to many

Again go to if (mcount > 0) { in attachCategories and step into attachLists, which takes the many-to-many branch

Look at the current array with p array(), then p *$25: the first element stores the method list of TCJB

In other words, once the lists are attached, the methods of category TCJB come first. Don't believe it? Let's turn off all breakpoints and look at the output:

To sum up, attachLists is mainly responsible for loading the data of the class and its categories into rwe

  • First, when the main class's data is loaded, rwe has no data yet, so the 0-to-1 path is taken
  • When a category is added, rwe holds only one list, the main class's list, so the 1-to-many path is taken
  • When another category is added, rwe holds two lists, the main class's list plus the first category's list, so the many-to-many path is taken

The flow of loading a class from Mach-O into memory is shown below

Now that we are here, let's talk about categories.

The nature of categories

In main.m of the earlier test code, define a category TCJ of TCJPerson

① Convert the OC code to C++ code with clang

clang command: xcrun -sdk iphonesimulator clang -arch arm64 -rewrite-objc main.m -o main-arm64.cpp

② Analysis of the underlying code

At the bottom of the cpp file, the first thing we see is that categories are stored in the __objc_catlist section of the __DATA segment of the Mach-O file

Next we can see the structure of the TCJPerson category: _CATEGORY_TCJPerson_ is declared with the type _category_t. To see what _category_t is, search for _category_

_category_t turns out to be a struct containing name (here the name of the class, not of the category), cls, the instance method list, the class method list, protocols, and properties.

Why does a category keep instance methods and class methods in separate lists?

  • A category has two method lists because there is no such thing as a meta-category. Category methods are inserted into the class at runtime through attachToClass
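
A minimal sketch of that layout, assuming a simplified Category type (hypothetical, with the cls pointer omitted and strings in place of the real list structures). The methodsForMeta helper mirrors the call of the same name seen earlier in attachCategories: it picks the instance list for the class and the class-method list for the metaclass.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical simplified mirror of the fields listed above.
struct Category {
    std::string name;                          // per the text: the class name
    std::vector<std::string> instanceMethods;  // destined for the class
    std::vector<std::string> classMethods;     // destined for the metaclass
    std::vector<std::string> protocols;
    std::vector<std::string> properties;

    // There is no meta-category, so one category serves both targets.
    const std::vector<std::string>& methodsForMeta(bool isMeta) const {
        return isMeta ? classMethods : instanceMethods;
    }
};
```

Attaching a category therefore happens twice with the same Category value: once with isMeta false for the class and once with isMeta true for the metaclass.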

Now let's look at the methods

There are three instance methods and one class method, each stored in the form sel + signature + address, the same as the method_t structure.

Now let's look at the properties

The property's name is present, but there are no corresponding set and get methods. These can be supplied with associated objects.

Having looked at the cpp file, let's look at category_t in the objc4-818.2 source

Category loading

From the earlier sections we know that classes are divided into lazily loaded classes and non-lazily loaded classes, and that they are loaded at different times. What about categories? Let's explore them in turn

Preparation: create two categories of TCJPerson, TCJA and TCJB

In the earlier analysis of realizeClassWithoutSwift -> methodizeClass -> attachToClass -> load_categories_nolock -> extAlloc -> attachCategories we mentioned the loading of rwe. attachCategories is where we analyzed how category data is loaded into the class. The categories are loaded in the order TCJA -> TCJB, and the later a category is attached, the further forward its lists end up

Looking at the source of methodizeClass, we can see that class data and category data are handled separately. The main reason is that at compile time the home of each method has already been determined (instance methods are stored in the class, class methods in the metaclass), whereas categories are added afterwards

A category has to be attached to its class through attachToClass before it can be used from outside. This involves:

  • ① When category data is loaded: this depends on whether the class and the categories implement the load method
  • ② attachCategories prepares the category data
  • ③ attachLists adds the category data to the main class

When category data is loaded

Next we explore when category data is loaded, taking as the example the case where the main class TCJPerson and the categories TCJA and TCJB all implement +load

Working backward from ② attachCategories to find the loading time ①

From the earlier analysis, category data is loaded once execution reaches attachCategories, so we can work backward and check what calls attachCategories. A global search finds two callers:

  • The load_categories_nolock method

  • The attachToClass method. Debugging shows that the if branch here is never entered: a class is normally loaded only once, and the branch is taken only if it is loaded a second time

  • Run the objc4-818.2 test code without any breakpoints and look at the print log. The log shows that the step after attachToClass is load_categories_nolock, which is what loads the category data

  • A global search for calls to load_categories_nolock finds two

    • One in the loadAllCategories method

    • One in the _read_images method

  • Debugging shows that execution does not go through the if branch in _read_images; it goes through loadAllCategories

  • A global search shows that loadAllCategories is called from load_images

  • You can also add a breakpoint on the custom logic in attachCategories and use bt to view the stack

So in this case, the path traced backward from the point where category data is loaded is attachCategories -> load_categories_nolock -> loadAllCategories -> load_images

The normal forward path of category loading is realizeClassWithoutSwift -> methodizeClass -> attachToClass -> attachCategories

The forward and reverse processes are shown in the figure below:

Now another situation: the main class TCJPerson and category TCJA implement +load, and category TCJB does not. Set a breakpoint on the custom logic in attachCategories, step through, and run p entry.cat -> p *$0

Continue, and execution breaks in attachCategories again: p entry.cat -> p *$2

Summary: as long as one category is non-lazily loaded, all the categories are marked as non-lazily loaded. That is, once rwe has been loaded the class no longer takes the lazy path, and TCJPerson is processed again

Combinations of classes and categories

From the two examples above, the ways a class and its categories can implement +load fall roughly into four cases.

Class \ Category               | Category implements +load          | Category does not implement +load
Class implements +load         | non-lazy class + non-lazy category | non-lazy class + lazy category
Class does not implement +load | lazy class + non-lazy category     | lazy class + lazy category

Non-lazy class + non-lazy category

That is, the main class implements +load and the categories implement +load as well. This is the situation already analyzed above under the loading time of categories, so we can state the conclusion directly:

  • The class's data is loaded through _getObjc2NonlazyClassList, i.e. ro and rw are set up, and rwe is initialized in the extAlloc method
  • The category's data is loaded into the class through load_images

The call path is:

  • map_images -> map_images_nolock -> _read_images -> readClass -> _getObjc2NonlazyClassList -> realizeClassWithoutSwift -> methodizeClass -> attachToClass. At this point mlists is a one-dimensional array; execution then moves on to the load_images part
  • load_images -> loadAllCategories -> load_categories_nolock -> attachCategories -> attachLists. At this point mlists is a two-dimensional array

Non-lazy class + lazy category

That is, the main class implements +load, but the categories do not

  • Enable the custom breakpoint in realizeClassWithoutSwift and look at ro

The printout shows the methods in the order TCJB -> TCJA -> TCJPerson. The category methods are already there but not yet sorted, which means that when there is no non-lazy category, the data read through cls->data was already merged at compile time and does not need to be attached at runtime.

  • Go to the breakpoint in methodizeClass

  • Go to the for loop in prepareMethodLists

  • Go to the if (sort) { part of fixupMethodList

    • SortBySELAddress, as its source shows, sorts by the address of the name

    • Go to mlist->setFixedUp(); and read mlist

The printout shows that only methods with the same name are put in order; the rest of the category methods are not. The imp addresses of those same-name methods are in ascending order. The sort in fixupMethodList only sorts by the address of the name
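
Sorting "by the address of the name" can be sketched like this. It assumes a hypothetical Method struct with a const char* in place of SEL and an int in place of the IMP; a stable sort is used so that entries sharing a name pointer keep their relative order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a method entry.
struct Method {
    const char* name;  // stands in for SEL; only its address is compared
    int imp;           // stands in for the IMP
};

// Orders methods by the numeric address of the name pointer, not by its text.
void sortByNameAddress(std::vector<Method>& list) {
    std::stable_sort(list.begin(), list.end(), [](const Method& a, const Method& b) {
        return reinterpret_cast<uintptr_t>(a.name) < reinterpret_cast<uintptr_t>(b.name);
    });
}
```

Because the key is the pointer value, two methods with the same selector always end up adjacent, while the spelling of the names plays no part in the order.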

Conclusion for non-lazy class + lazy category:

  • The data of the class and of the categories is loaded in _read_images
  • The contents of data were already completed at compile time

Lazy class + lazy category

That is, neither the main class nor the categories implement +load

  • Run the program without any breakpoints and look at the print log

realizeClassMaybeSwiftMaybeRelock in the log is a function on the slow lookup path of message sending, i.e. it runs the first time a message is sent to the class

  • Break in readClass and read cj_ro, that is, read the whole of data

The count of baseMethodList is 16, and it too is read from data, so there is no need to go through the slower load_images path

Summary: a lazy class with lazy categories loads its data when the first message is sent, and the contents of data were completed at compile time

Lazy class + non-lazy category

That is, the main class does not implement +load, but the categories do

  • Run the program without any breakpoints and look at the print log

  • Break in readClass and look at cj_ro

The count of baseMethodList is 8.

  • Break on the custom debug code in load_categories_nolock and look at bt

Summary: for a lazy class with non-lazy categories, as long as a category implements load the main class is forced to load early, i.e. the main class is effectively turned into a non-lazy class

Summary of combinations of classes and categories

When a class and categories are combined, the time at which their data is loaded can be summarized as follows:

  • Non-lazy class + non-lazy category: the class is loaded in _read_images and the categories in load_images. The class is loaded first, and then the category information is attached to it
  • Non-lazy class + lazy category: the class is loaded in _read_images, and the category data was merged in at compile time
  • Lazy class + lazy category: the class is loaded when the first message is sent, and the category data was merged in at compile time
  • Lazy class + non-lazy category: as long as a category implements load, the main class is forced to load early. The class is not realized in _read_images; instead load_images triggers the loading of the class's data, i.e. the initialization of rwe, and loads the category data at the same time
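
The four bullets above amount to a small decision table, which can be written down as a sketch. Timing, Plan, and loadingPlan are hypothetical names used only to restate the summary; nothing here is runtime API.

```cpp
#include <cassert>

// Where the data becomes available, per the summary above.
enum class Timing { ReadImages, FirstMessage, LoadImages, CompileTime };

struct Plan {
    Timing classData;     // when the class's data is loaded
    Timing categoryData;  // when the category's data ends up in the class
};

// Maps "does the class / the category implement +load" to the loading times.
Plan loadingPlan(bool classHasLoad, bool categoryHasLoad) {
    if (classHasLoad && categoryHasLoad) {
        return {Timing::ReadImages, Timing::LoadImages};
    }
    if (classHasLoad) {
        return {Timing::ReadImages, Timing::CompileTime};
    }
    if (!categoryHasLoad) {
        return {Timing::FirstMessage, Timing::CompileTime};
    }
    // A category +load forces the lazy class to be loaded early, from load_images.
    return {Timing::LoadImages, Timing::LoadImages};
}
```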

4. load_images

The load_images method is mainly used when an image is loaded. Its two most important calls are prepare_load_methods (collect the load methods) and call_load_methods (call them).

① load_images source code implementation

② prepare_load_methods source code implementation

②.1 schedule_class_load method

This method recursively walks up the class's inheritance chain collecting load methods, and the recursion ends when cls no longer exists. This ensures that the parent class's load is scheduled first
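
The superclass-first recursion can be sketched as follows. ClassInfo, its scheduled flag, and TCJStudent are hypothetical; the flag stands for "this class has already been visited" so that a class is never queued twice.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a class record.
struct ClassInfo {
    std::string name;
    ClassInfo* superclass;
    bool hasLoad;    // does this class implement +load?
    bool scheduled;  // already visited by the scheduler?
};

// Queues +load entries so that every superclass comes before its subclasses.
void scheduleClassLoad(ClassInfo* cls, std::vector<std::string>& loadable) {
    if (!cls) return;            // end of the inheritance chain
    if (cls->scheduled) return;  // never queue a class twice
    scheduleClassLoad(cls->superclass, loadable);  // parent first
    if (cls->hasLoad) loadable.push_back(cls->name);
    cls->scheduled = true;
}
```

Scheduling a subclass before its parent still produces parent-first order, and scheduling the parent again afterwards adds nothing.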

②.1.1 add_class_to_loadable_list method

This method adds the load method together with the class cls to the loadable_classes table

②.1.2 getLoadMethod method

This method gets the method whose sel is load

②.2 add_category_to_loadable_list

This gets the load method of every non-lazy category and adds the category together with its load method to the loadable_categories table

③ call_load_methods

This method performs three main operations

  • Repeatedly call the class +load methods until there are no more
  • Call the category +load methods once
  • If there are still classes, or more categories that have not been tried, run more +load
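
The three operations form a loop, sketched below under simplifying assumptions: LoadQueues and its log are hypothetical, the queues hold names instead of class and method pointers, and no +load adds new entries while it runs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the loadable_classes / loadable_categories tables.
struct LoadQueues {
    std::vector<std::string> classes;     // pending class +load entries
    std::vector<std::string> categories;  // pending category +load entries
    std::vector<std::string> log;         // call order, recorded for inspection
};

void callClassLoads(LoadQueues& q) {
    std::vector<std::string> batch;
    batch.swap(q.classes);  // detach the current batch before calling
    for (const std::string& c : batch) q.log.push_back("+[" + c + " load]");
}

// Returns true if new category entries were queued while this batch ran.
bool callCategoryLoads(LoadQueues& q) {
    std::vector<std::string> batch;
    batch.swap(q.categories);
    for (const std::string& c : batch) q.log.push_back("+[" + c + " load]");
    return !q.categories.empty();
}

void callLoadMethods(LoadQueues& q) {
    bool moreCategories = false;
    do {
        while (!q.classes.empty()) callClassLoads(q);  // 1. classes until none remain
        moreCategories = callCategoryLoads(q);         // 2. categories once
    } while (!q.classes.empty() || moreCategories);    // 3. again if anything is left
}
```

The effect is that every queued class +load runs before any category +load, which is why a main class's +load is always seen before those of its categories.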

③.1 call_class_loads

This mainly calls the load methods of the classes

The load method has two hidden parameters: the first is id (self) and the second is sel (_cmd)
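
The two hidden parameters become visible once the method is treated as a plain function pointer. In this sketch Receiver, Selector, LoadableClass, and callLog are hypothetical stand-ins (strings instead of id and SEL), and the stored function is invoked directly with both arguments passed explicitly.

```cpp
#include <cassert>
#include <string>
#include <vector>

using Receiver = const char*;  // hypothetical stand-in for id (self)
using Selector = const char*;  // hypothetical stand-in for SEL (_cmd)
using LoadMethod = void (*)(Receiver, Selector);

// One table entry: the class plus the function that implements its +load.
struct LoadableClass {
    Receiver cls;
    LoadMethod method;
};

// Records calls so the arguments can be inspected afterwards.
std::vector<std::string>& callLog() {
    static std::vector<std::string> log;
    return log;
}

// Example +load implementation: it receives self and _cmd like any method.
void personLoad(Receiver self, Selector cmd) {
    callLog().push_back(std::string(self) + " " + cmd);
}

// Calls the stored function directly with the two hidden arguments.
void callClassLoad(const LoadableClass& entry) {
    assert(entry.method != nullptr);
    entry.method(entry.cls, "load");
}
```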

③.2 call_category_loads

This mainly calls the load methods of the categories, once

To sum up, the overall call flow and the principle of load_images are illustrated below

  • Call flow diagram

  • Principle diagram

5. unmap_image

6. initialize

The Apple documentation says this about initialize:

Initializes the class before it receives its first message.

Next, find lookUpImpOrForward in the objc4-818.2 source

lookUpImpOrForward -> realizeAndInitializeIfNeeded_locked -> initializeAndLeaveLocked -> initializeAndMaybeRelock -> initializeNonMetaClass

In initializeNonMetaClass, the parent class's initialize is called recursively first, and then callInitialize is called

callInitialize is an ordinary message send

Conclusions about initialize:

  • initialize is called before the first method of the class or its subclass is called (before the message is sent)
  • If initialize is only added to a class and the class is never used, initialize is not called
  • The parent class's initialize is executed before the subclass's
  • When a subclass does not implement initialize, the parent class's initialize is called for it; when the subclass does implement initialize, it overrides the parent class's
  • When several categories implement initialize, they override the method in the class and only one is executed (the one in the category that was loaded into memory last)
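
The first few conclusions can be reproduced with a small model. It assumes hypothetical names (ClassInfo, resolveInitialize, initializeIfNeeded, TCJStudent) and treats the initialize call as a lookup along the inheritance chain, which is why a subclass without its own implementation ends up running the parent's.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a class record.
struct ClassInfo {
    std::string name;
    ClassInfo* superclass;
    bool implementsInitialize;
    bool initialized;
};

// Ordinary lookup: the nearest class in the chain that implements initialize.
const ClassInfo* resolveInitialize(const ClassInfo* cls) {
    for (; cls; cls = cls->superclass) {
        if (cls->implementsInitialize) return cls;
    }
    return nullptr;
}

// Runs before the first message: the parent is initialized first, each class once.
void initializeIfNeeded(ClassInfo* cls, std::vector<std::string>& log) {
    if (!cls || cls->initialized) return;
    initializeIfNeeded(cls->superclass, log);
    cls->initialized = true;
    const ClassInfo* impl = resolveInitialize(cls);
    if (impl) log.push_back(impl->name + " initialize (receiver " + cls->name + ")");
}
```

With TCJStudent inheriting from TCJPerson and only TCJPerson implementing initialize, the first message to TCJStudent logs the TCJPerson implementation twice: once with TCJPerson as the receiver and once with TCJStudent. A second message logs nothing more.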

Write in the back

Study harmoniously without being impatient. I’m still me, a different color of fireworks. Finally, a summary table of environment variables is attached