Background

In iOS development, categories are a feature we use often: they let us add custom methods to system classes. This article walks through how categories are implemented.

Definition

Category is a language feature added after Objective-C 2.0. It is a concrete implementation of the decorator pattern. The main purpose is to dynamically add methods to a class without changing the existing class.

Usage scenarios

1. Add methods to existing classes and declare private methods.

2. Break up a large class file by dividing the implementation of a class across multiple independent source files.

Advantages

1. There is no need to create a subclass just to add methods to an existing class, and methods in a category behave essentially the same as methods of the original class.

2. A category can split up the methods of a huge class, which makes the code easier to maintain, update, and read.

Disadvantages

Methods in a category take priority over methods of the original class and of its superclass, and overriding a superclass method in a category can break messages sent to super. It is therefore best not to override methods of the original class in a category.

Difference from Swift

Swift has no categories, but the same effect can be achieved with Swift extensions.

Principle analysis

_category_t

Download the runtime source code from Apple's GitHub, open objc.xcodeproj, and search for category_t.

Next, we will create our own category and compile it into a C++ file to see what the compiler actually generates for it.

First, prepare the following three files:

Person_OC

// Person_OC.h
#import <Foundation/Foundation.h>

NS_ASSUME_NONNULL_BEGIN

@interface Person_OC : NSObject {
    int _age;
}

- (void)run;

@end

NS_ASSUME_NONNULL_END

// Person_OC.m
#import "Person_OC.h"

@implementation Person_OC

- (void)run {
    NSLog(@"--__--||  Person_OC run");
}

@end

Person_OC+Test1

// Person_OC+Test1.h
#import "Person_OC.h"

NS_ASSUME_NONNULL_BEGIN

@interface Person_OC (Test1)<NSCopying>

@property (nonatomic, assign) int age;

- (void)test;

+ (void)aMethod;

@end

NS_ASSUME_NONNULL_END

// Person_OC+Test1.m
#import "Person_OC+Test1.h"

@implementation Person_OC (Test1)

- (void)test {
    
}

+ (void)aMethod {
    
}

- (void)setAge:(int)age {
    
}

- (int)age {
    return 20;
}

@end

Person_OC+Test2

// Person_OC+Test2.h
#import "Person_OC.h"

NS_ASSUME_NONNULL_BEGIN

@interface Person_OC (Test2)

@end

NS_ASSUME_NONNULL_END

// Person_OC+Test2.m
#import "Person_OC+Test2.h"

@implementation Person_OC (Test2)

- (void)run {
    NSLog(@"--__--|| Person_OC (Test2) run");
}

@end

Then go to the project directory and convert Person_OC+Test1.m into a C++ file to see what the compiler generates.

xcrun -sdk iphoneos clang -arch arm64 -rewrite-objc Person_OC+Test1.m

After compilation, a Person_OC+Test1.cpp file is generated in the directory.

Open Person_OC+Test1.cpp with Xcode and search for category_t. You can see that the category is compiled into a _category_t structure, shown below with comments:

struct _category_t {
	// class name
	const char *name;
	struct _class_t *cls;
	// list of instance methods
	const struct _method_list_t *instance_methods;
	// list of class methods
	const struct _method_list_t *class_methods;
	// list of protocols
	const struct _protocol_list_t *protocols;
	// list of properties
	const struct _prop_list_t *properties;
};

As you can see, a category can declare properties, but it cannot add member variables directly: the structure has a property list but no member variable list. To attach extra state, you have to use associated objects, which we'll talk about later.

_method_list_t

Next, we analyze the _method_list_t structure. Searching the file for _method_list_t turns up three definitions, shown below.

static struct /*_method_list_t*/ {
	unsigned int entsize;  // sizeof(struct _objc_method)
	unsigned int method_count;
	struct _objc_method method_list[3];
} _OBJC_$_CATEGORY_INSTANCE_METHODS_Person_OC_$_Test1 __attribute__ ((used, section ("__DATA,__objc_const"))) = {
	sizeof(_objc_method),
	3,
	{{(struct objc_selector *)"test", "v16@0:8", (void *)_I_Person_OC_Test1_test},
	{(struct objc_selector *)"setAge:", "v20@0:8i16", (void *)_I_Person_OC_Test1_setAge_},
	{(struct objc_selector *)"age", "i16@0:8", (void *)_I_Person_OC_Test1_age}}
};

As the name _OBJC_$_CATEGORY_INSTANCE_METHODS_Person_OC_$_Test1 suggests, this structure stores the category's list of instance methods.

static struct /*_method_list_t*/ {
	unsigned int entsize;  // sizeof(struct _objc_method)
	unsigned int method_count;
	struct _objc_method method_list[1];
} _OBJC_$_CATEGORY_CLASS_METHODS_Person_OC_$_Test1 __attribute__ ((used, section ("__DATA,__objc_const"))) = {
	sizeof(_objc_method),
	1,
	{{(struct objc_selector *)"aMethod", "v16@0:8", (void *)_C_Person_OC_Test1_aMethod}}
};

As the name _OBJC_$_CATEGORY_CLASS_METHODS_Person_OC_$_Test1 suggests, this structure stores the category's list of class methods.

static struct /*_method_list_t*/ {
	unsigned int entsize;  // sizeof(struct _objc_method)
	unsigned int method_count;
	struct _objc_method method_list[1];
} _OBJC_PROTOCOL_INSTANCE_METHODS_NSCopying __attribute__ ((used, section ("__DATA,__objc_const"))) = {
	sizeof(_objc_method),
	1,
	{{(struct objc_selector *)"copyWithZone:", "@24@0:8^{_NSZone=}16", 0}}
};

As the name _OBJC_PROTOCOL_INSTANCE_METHODS_NSCopying suggests, this structure stores the list of instance methods of the NSCopying protocol.

The source shows that each of the three structures stores the size of one method entry, the number of methods, and the method list itself.
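To make the layout concrete, here is a minimal sketch in plain C of how such a list can be walked using its entry size and count. The type and function names (method_entry, method_list, method_at, find_method) are hypothetical stand-ins, not the generated or runtime names, and the lookup is a simple linear scan for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the generated structures. */
struct method_entry {
    const char *name;   /* selector name, e.g. "test" */
    const char *types;  /* type encoding, e.g. "v16@0:8" */
    void *imp;          /* implementation pointer (unused here) */
};

struct method_list {
    unsigned int entsize;       /* size of one entry in bytes */
    unsigned int method_count;  /* number of entries */
    struct method_entry entries[3];
};

/* Return the entry at `index`, stepping by entsize rather than by
 * sizeof(struct method_entry), or NULL if the index is out of range. */
const struct method_entry *method_at(const struct method_list *list,
                                     unsigned int index)
{
    assert(list != NULL);
    if (index >= list->method_count) return NULL;
    const char *base = (const char *)list->entries;
    return (const struct method_entry *)(base + (size_t)index * list->entsize);
}

/* Linear search by selector name; returns NULL when not found. */
const struct method_entry *find_method(const struct method_list *list,
                                       const char *name)
{
    for (unsigned int i = 0; i < list->method_count; i++) {
        const struct method_entry *m = method_at(list, i);
        if (strcmp(m->name, name) == 0) return m;
    }
    return NULL;
}

/* Mirrors the instance-method list generated for Person_OC (Test1). */
const struct method_list test1_instance_methods = {
    sizeof(struct method_entry),
    3,
    {{"test", "v16@0:8", NULL},
     {"setAge:", "v20@0:8i16", NULL},
     {"age", "i16@0:8", NULL}}
};
```

Calling find_method(&test1_instance_methods, "age") returns the third entry, while a name that the category does not define, such as "run", yields NULL.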

_protocol_list_t

Next, we analyze the _protocol_list_t struct and search the file for _protocol_list_t

static struct /*_method_list_t*/ {
	unsigned int entsize;  // sizeof(struct _objc_method)
	unsigned int method_count;
	struct _objc_method method_list[1];
} _OBJC_PROTOCOL_INSTANCE_METHODS_NSCopying __attribute__ ((used, section ("__DATA,__objc_const"))) = {
	sizeof(_objc_method),
	1,
	{{(struct objc_selector *)"copyWithZone:", "@24@0:8^{_NSZone=}16", 0}}
};

struct _protocol_t _OBJC_PROTOCOL_NSCopying __attribute__ ((used)) = {
	0,
	"NSCopying",
	0,
	(const struct method_list_t *)&_OBJC_PROTOCOL_INSTANCE_METHODS_NSCopying,
	0,
	0,
	0,
	0,
	sizeof(_protocol_t),
	0,
	(const char **)&_OBJC_PROTOCOL_METHOD_TYPES_NSCopying
};
struct _protocol_t *_OBJC_LABEL_PROTOCOL_$_NSCopying = &_OBJC_PROTOCOL_NSCopying;

static struct /*_protocol_list_t*/ {
	long protocol_count;  // Note, this is 32/64 bit
	struct _protocol_t *super_protocols[1];
} _OBJC_CATEGORY_PROTOCOLS_$_Person_OC_$_Test1 __attribute__ ((used, section ("__DATA,__objc_const"))) = {
	1,
	&_OBJC_PROTOCOL_NSCopying
};

The protocol's methods are first stored in a _method_list_t structure, which the _protocol_t structure _OBJC_PROTOCOL_NSCopying points to. That structure is in turn referenced from _OBJC_CATEGORY_PROTOCOLS_$_Person_OC_$_Test1, a _protocol_list_t structure that holds protocol_count, the number of protocols, and the array of _protocol_t pointers that describe them.

_prop_list_t

Finally, we analyze the struct of type _prop_list_t and search the file for _prop_list_t

static struct /*_prop_list_t*/ {
	unsigned int entsize;  // sizeof(struct _prop_t)
	unsigned int count_of_properties;
	struct _prop_t prop_list[1];
} _OBJC_$_PROP_LIST_Person_OC_$_Test1 __attribute__ ((used, section ("__DATA,__objc_const"))) = {
	sizeof(_prop_t),
	1,
	{{"age","Ti,N"}}
};

The property list structure _OBJC_$_PROP_LIST_Person_OC_$_Test1 corresponds to the _prop_list_t structure and stores the size of one property entry, the number of properties, and the property list itself.
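The attribute string "Ti,N" in the property list follows the runtime's property attribute encoding: T introduces the type encoding (i for int), N means nonatomic, and a V entry would name a backing instance variable. The sketch below is a simplified, hypothetical parser in plain C (prop_info and parse_property_attributes are illustrative names). It interprets only T, N, and V, and assumes the type encoding itself contains no commas.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of a property attribute string such as "Ti,N". */
struct prop_info {
    char type[32];    /* type encoding after 'T', e.g. "i" for int */
    bool nonatomic;   /* 'N' attribute present */
    bool has_ivar;    /* 'V<name>' attribute present (backing ivar) */
};

/* Parse a comma-separated attribute string. Only the T, N and V
 * attributes are interpreted here; the others are skipped. */
struct prop_info parse_property_attributes(const char *attrs)
{
    struct prop_info info = { "", false, false };
    assert(attrs != NULL);

    const char *p = attrs;
    while (*p != '\0') {
        /* Find the end of the current attribute. */
        const char *end = strchr(p, ',');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        if (len > 0 && p[0] == 'T') {
            size_t n = len - 1;
            if (n >= sizeof(info.type)) n = sizeof(info.type) - 1;
            memcpy(info.type, p + 1, n);
            info.type[n] = '\0';
        } else if (len == 1 && p[0] == 'N') {
            info.nonatomic = true;
        } else if (len > 0 && p[0] == 'V') {
            info.has_ivar = true;
        }

        p += len;
        if (*p == ',') p++;  /* skip the separator */
    }
    return info;
}
```

Parsing "Ti,N" reports an int type, nonatomic, and no backing instance variable, which matches the point made earlier that a category property gets no member variable.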

Assignment process

After analyzing the _category_t structure, let’s look at the specific assignment process in the structure. The source code is as follows

struct _class_t {
	struct _class_t *isa;
	struct _class_t *superclass;
	void *cache;
	void *vtable;
	struct _class_ro_t *ro;
};

struct _category_t {
	const char *name;
	struct _class_t *cls;
	const struct _method_list_t *instance_methods;
	const struct _method_list_t *class_methods;
	const struct _protocol_list_t *protocols;
	const struct _prop_list_t *properties;
};

extern "C" __declspec(dllimport) struct _class_t OBJC_CLASS_$_Person_OC;

static struct _category_t _OBJC_$_CATEGORY_Person_OC_$_Test1 __attribute__ ((used, section ("__DATA,__objc_const"))) = 
{
	"Person_OC",
	0, // &OBJC_CLASS_$_Person_OC,
	(const struct _method_list_t *)&_OBJC_$_CATEGORY_INSTANCE_METHODS_Person_OC_$_Test1,
	(const struct _method_list_t *)&_OBJC_$_CATEGORY_CLASS_METHODS_Person_OC_$_Test1,
	(const struct _protocol_list_t *)&_OBJC_CATEGORY_PROTOCOLS_$_Person_OC_$_Test1,
	(const struct _prop_list_t *)&_OBJC_$_PROP_LIST_Person_OC_$_Test1,
};
static void OBJC_CATEGORY_SETUP_$_Person_OC_$_Test1(void) {
	_OBJC_$_CATEGORY_Person_OC_$_Test1.cls = &OBJC_CLASS_$_Person_OC;
}

The code first defines the _category_t structure and then initializes _OBJC_$_CATEGORY_Person_OC_$_Test1 with the method, protocol, and property lists analyzed above, leaving cls as 0. Then, in the function static void OBJC_CATEGORY_SETUP_$_Person_OC_$_Test1(void), the cls pointer of _OBJC_$_CATEGORY_Person_OC_$_Test1 is set to the address of the OBJC_CLASS_$_Person_OC structure. In other words, the cls pointer points to the class object of the category's main class.
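The two-step pattern (static initialization with an empty cls, followed by a setup function that fills it in) can be sketched in plain C as follows. All names here (class_stub, category_stub, category_setup_person_test1, category_class) are hypothetical and only mirror the shape of the generated code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for _class_t and _category_t. */
struct class_stub {
    const char *name;
};

struct category_stub {
    const char *name;          /* name of the class being extended */
    struct class_stub *cls;    /* filled in later by the setup function */
};

struct class_stub person_class = { "Person_OC" };

/* Step 1: static initialization leaves cls empty. */
struct category_stub person_test1_category = { "Person_OC", NULL };

/* Step 2: a setup function points cls at the class structure. */
void category_setup_person_test1(void)
{
    person_test1_category.cls = &person_class;
}

/* Return the class a category is attached to; valid only after setup. */
struct class_stub *category_class(const struct category_stub *cat)
{
    assert(cat != NULL && cat->cls != NULL);
    return cat->cls;
}
```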

Storage process

The analysis above confirms that the instance methods, class methods, properties, and so on that we define in a category are stored in a _category_t structure. Let's go back to the runtime source to see how the methods, properties, and protocols held in category_t end up in the class object.

Search for _objc_init, the Runtime initialization function

/***********************************************************************
* _objc_init
* Bootstrap initialization. Registers our image notifier with dyld.
* Old ABI: called by dyld as a library initializer
* New ABI: called by libSystem BEFORE library initialization time
**********************************************************************/
#if !__OBJC2__
static __attribute__((constructor))
#endif
void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;

    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    lock_init();
    exception_init();

    // Register for unmap first, in case some +load unmaps something
    _dyld_register_func_for_remove_image(&unmap_image);
    dyld_register_image_state_change_handler(dyld_image_state_bound,
                                             1/*batch*/, &map_2_images);
    dyld_register_image_state_change_handler(dyld_image_state_dependents_initialized,
                                             0/*not batch*/, &load_images);
}

After a series of initialization calls, _objc_init registers map_2_images. Searching for map_2_images leads to its source in objc-runtime-new.mm. Internally it calls map_images_nolock, whose source is in objc-os.mm; that function does its own bookkeeping and finally calls _read_images. The source of these functions is not relevant to how categories work, so it is not shown here.

Search for _read_images and you will find its source in objc-runtime-new.mm. The part of this function that handles categories is shown below with comments:

/***********************************************************************
* _read_images
* Perform initial processing of the headers in the linked
* list beginning with headerList.
*
* Called by: map_images_nolock
*
* Locking: runtimeLock acquired by map_images
**********************************************************************/
void _read_images(header_info **hList, uint32_t hCount)
{
    // ... code unrelated to categories omitted ...

    // Discover categories.
    for (EACH_HEADER) {
        category_t **catlist =
            _getObjc2CategoryList(hi, &count);
        for (i = 0; i < count; i++) {
            category_t *cat = catlist[i];
            Class cls = remapClass(cat->cls);

            if (!cls) {
                // Category's target class is missing (probably weak-linked).
                // Disavow any knowledge of this category.
                catlist[i] = nil;
                if (PrintConnecting) {
                    _objc_inform("CLASS: IGNORING category ??? (%s) %p with "
                                 "missing weak-linked target class",
                                 cat->name, cat);
                }
                continue;
            }

            // Process this category.
            // First, register the category with its target class.
            // Then, rebuild the class's method lists (etc) if
            // the class is realized.
            bool classExists = NO;
            if (cat->instanceMethods || cat->protocols
                || cat->instanceProperties)
            {
                addUnattachedCategoryForClass(cat, cls, hi);
                if (cls->isRealized()) {
                    remethodizeClass(cls);
                    classExists = YES;
                }
                if (PrintConnecting) {
                    _objc_inform("CLASS: found category -%s(%s) %s",
                                 cls->nameForLogging(), cat->name,
                                 classExists ? "on existing class" : "");
                }
            }

            if (cat->classMethods || cat->protocols
                /* || cat->classProperties */)
            {
                addUnattachedCategoryForClass(cat, cls->ISA(), hi);
                if (cls->ISA()->isRealized()) {
                    remethodizeClass(cls->ISA());
                }
                if (PrintConnecting) {
                    _objc_inform("CLASS: found category +%s(%s)",
                                 cls->nameForLogging(), cat->name);
                }
            }
        }
    }

    ts.log("IMAGE TIMES: discover categories");

    // ... code unrelated to categories omitted ...
}

This code discovers categories. It obtains the category list with _getObjc2CategoryList and then iterates over it to process each category's methods, protocols, properties, and so on. During the traversal it calls addUnattachedCategoryForClass to record the category against its class. The source of that function is as follows:

/***********************************************************************
* addUnattachedCategoryForClass
* Records an unattached category.
* Locking: runtimeLock must be held by the caller.
**********************************************************************/
static void addUnattachedCategoryForClass(category_t *cat, Class cls,
                                          header_info *catHeader)
{
    runtimeLock.assertWriting();

    // DO NOT use cat->cls! cls may be cat->cls->isa instead
    NXMapTable *cats = unattachedCategories();
    category_list *list;

    list = (category_list *)NXMapGet(cats, cls);
    if (!list) {
        list = (category_list *)
            calloc(sizeof(*list) + sizeof(list->list[0]), 1);
    } else {
        list = (category_list *)
            realloc(list, sizeof(*list) + sizeof(list->list[0]) * (list->count + 1));
    }
    list->list[list->count++] = (locstamped_category_t){cat, catHeader};
    NXMapInsert(cats, cls, list);
}

The key call here is NXMapInsert(cats, cls, list). It establishes a binding between the current category and the original class (in practice, an entry in a map data structure) in preparation for the steps that follow.
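The growth pattern used for the per-class list (allocate one slot on first use, then grow by one slot per category) can be sketched in plain C. This is a simplified illustration with hypothetical names (cat_list, cat_list_append): it stores category names instead of category_t pointers, leaves out the map keyed by class, and adds allocation-failure handling that the runtime function does not have.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for category_list: a count plus a trailing
 * array that grows by one slot per recorded category. */
struct cat_list {
    unsigned int count;
    const char *names[];   /* flexible array member */
};

/* Append `name` to `list`, allocating the list on first use.
 * Returns the (possibly moved) list, or NULL if allocation fails. */
struct cat_list *cat_list_append(struct cat_list *list, const char *name)
{
    assert(name != NULL);
    if (list == NULL) {
        /* First category for this class: header plus one slot. */
        list = calloc(1, sizeof(*list) + sizeof(list->names[0]));
    } else {
        /* Later categories: grow the block by one slot. */
        struct cat_list *grown = realloc(
            list, sizeof(*list) + sizeof(list->names[0]) * (list->count + 1));
        if (grown == NULL) free(list);
        list = grown;
    }
    if (list == NULL) return NULL;
    list->names[list->count++] = name;
    return list;
}
```

Entries are appended in the order they are recorded, so the oldest category sits at index 0.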

Back in _read_images, both branches end up calling remethodizeClass (on cls or on cls->ISA()) when the class is already realized. Search for remethodizeClass and you will find the source below:

/***********************************************************************
* remethodizeClass
* Attach outstanding categories to an existing class.
* Fixes up cls's method list, protocol list, and property list.
* Updates method caches for cls and its subclasses.
* Locking: runtimeLock must be held by the caller
**********************************************************************/
static void remethodizeClass(Class cls)
{
    category_list *cats;
    bool isMeta;

    runtimeLock.assertWriting();

    isMeta = cls->isMetaClass();

    // Re-methodizing: check for more categories
    if ((cats = unattachedCategoriesForClass(cls, false/*not realizing*/))) {
        if (PrintConnecting) {
            _objc_inform("CLASS: attaching categories to class '%s' %s",
                         cls->nameForLogging(), isMeta ? "(meta)" : "");
        }

        attachCategories(cls, cats, true /*flush caches*/);
        free(cats);
    }
}

The function ends by calling attachCategories, passing the class cls and the category list cats. As noted earlier, a class can have multiple categories, and each category's information is stored in a category_t structure, so multiple categories are collected in a category_list.

Search attachCategories and find the source code below

// Attach method lists and properties and protocols from categories to a class.
// Assumes the categories in cats are all loaded and sorted by load order,
// oldest categories first.
static void
attachCategories(Class cls, category_list *cats, bool flush_caches)
{
    if (!cats) return;
    if (PrintReplacedMethods) printReplacements(cls, cats);

    bool isMeta = cls->isMetaClass();

    // fixme rearrange to remove these intermediate allocations
    // Allocate one slot per category for the method, property, and protocol lists
    method_list_t **mlists = (method_list_t **)
        malloc(cats->count * sizeof(*mlists));
    property_list_t **proplists = (property_list_t **)
        malloc(cats->count * sizeof(*proplists));
    protocol_list_t **protolists = (protocol_list_t **)
        malloc(cats->count * sizeof(*protolists));

    // Count backwards through cats to get newest categories first
    int mcount = 0;
    int propcount = 0;
    int protocount = 0;
    int i = cats->count;
    bool fromBundle = NO;
    while (i--) {
        auto& entry = cats->list[i];

        // Method list of this category
        method_list_t *mlist = entry.cat->methodsForMeta(isMeta);
        if (mlist) {
            mlists[mcount++] = mlist;
            fromBundle |= entry.hi->isBundle();
        }

        // Property list of this category
        property_list_t *proplist = entry.cat->propertiesForMeta(isMeta);
        if (proplist) {
            proplists[propcount++] = proplist;
        }

        // Protocol list of this category
        protocol_list_t *protolist = entry.cat->protocols;
        if (protolist) {
            protolists[protocount++] = protolist;
        }
    }

    // Get the class_rw_t of the class object
    auto rw = cls->data();

    prepareMethodLists(cls, mlists, mcount, NO, fromBundle);
    rw->methods.attachLists(mlists, mcount);
    free(mlists);
    if (flush_caches && mcount > 0) flushCaches(cls);

    rw->properties.attachLists(proplists, propcount);
    free(proplists);

    rw->protocols.attachLists(protolists, protocount);
    free(protolists);
}

The function first uses malloc to allocate three arrays, one each for the method lists, property lists, and protocol lists. Each array is sized as the number of categories times the size of one list pointer.

It then traverses the category list in reverse order and places each category's method list, property list, and protocol list into the mlists, proplists, and protolists arrays respectively.
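The reverse traversal can be illustrated with a small plain C sketch. The function name collect_newest_first is hypothetical, and strings stand in for list pointers; a NULL entry models a category that has no list of the kind being collected.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the non-NULL entries of `loaded` (oldest first) into `out`
 * in reverse order, so the most recently loaded entry comes first.
 * Returns the number of entries written. */
size_t collect_newest_first(const char *const *loaded, size_t count,
                            const char **out)
{
    assert(loaded != NULL && out != NULL);
    size_t written = 0;
    size_t i = count;
    while (i--) {
        if (loaded[i] != NULL) {   /* skip categories with no such list */
            out[written++] = loaded[i];
        }
    }
    return written;
}
```

With the input { "Test1", NULL, "Test2" }, the output holds "Test2" followed by "Test1", so the category loaded last ends up first.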

Next, it calls data() on the class object to get rw, the class object's class_rw_t structure. class_rw_t stores the class object's methods, properties, and protocols.

Finally, it calls attachLists on rw's method list, property list, and protocol list in turn, passing in the arrays collected from all the categories. Inside attachLists, the category lists are merged with the class's own methods, properties, and protocols. The source of attachLists is as follows:

void attachLists(List* const * addedLists, uint32_t addedCount) {
    if (addedCount == 0) return;

    if (hasArray()) {
        // many lists -> many lists
        uint32_t oldCount = array()->count;
        uint32_t newCount = oldCount + addedCount;
        setArray((array_t *)realloc(array(), array_t::byteSize(newCount)));
        array()->count = newCount;
        memmove(array()->lists + addedCount, array()->lists,
                oldCount * sizeof(array()->lists[0]));
        memcpy(array()->lists, addedLists,
               addedCount * sizeof(array()->lists[0]));
    }
    else if (!list && addedCount == 1) {
        // 0 lists -> 1 list
        list = addedLists[0];
    }
    else {
        // 1 list -> many lists
        List* oldList = list;
        uint32_t oldCount = oldList ? 1 : 0;
        uint32_t newCount = oldCount + addedCount;
        setArray((array_t *)malloc(array_t::byteSize(newCount)));
        array()->count = newCount;
        if (oldList) array()->lists[addedCount] = oldList;
        memcpy(array()->lists, addedLists,
               addedCount * sizeof(array()->lists[0]));
    }
}

The two most important calls in this function are memmove (memory move) and memcpy (memory copy). They are declared as follows:

// memmove: memory move.
/*  __dst: destination of the moved memory
 *  __src: start address of the memory to move
 *  __len: length in bytes of the memory to move
 */
void *memmove(void *__dst, const void *__src, size_t __len);

// memcpy: memory copy.
/*  __dst: destination of the copied memory
 *  __src: start address of the memory to copy
 *  __n: length in bytes of the memory to copy
 */
void *memcpy(void *__dst, const void *__src, size_t __n);

The memmove call in attachLists:

// array()->lists: the original array of method, property, or protocol lists
// addedCount: the number of category lists being added
// oldCount * sizeof(array()->lists[0]): the space occupied by the original array
memmove(array()->lists + addedCount, array()->lists,
        oldCount * sizeof(array()->lists[0]));

After memmove, the class's own method, property, and protocol lists have moved toward the end of the array, but array()->lists still points to the start of the array.

The memcpy call in attachLists:

// array()->lists: the original array of method, property, or protocol lists
// addedLists: the array of category method, property, or protocol lists
// addedCount * sizeof(array()->lists[0]): the space occupied by the added category lists
memcpy(array()->lists, addedLists,
       addedCount * sizeof(array()->lists[0]));

The pointer never changes; it points to the start of the array throughout. After memmove and memcpy, the category's method, property, and protocol lists sit in front of the lists originally stored in the class object.
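The shift-then-copy step can be reproduced in a few lines of plain C. This is a minimal sketch with hypothetical names (list_array, attach_lists) in which strings stand in for list pointers. It always takes the realloc path rather than distinguishing the three cases in attachLists, and it adds an allocation check that the runtime code does not perform.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the runtime's array of list pointers.
 * Each element names one method list instead of pointing to one. */
struct list_array {
    size_t count;
    const char **lists;
};

/* Prepend `added_count` entries to `arr`, keeping existing entries
 * in their original relative order. Returns 0 on success, -1 if the
 * allocation fails (in which case `arr` is left unchanged). */
int attach_lists(struct list_array *arr,
                 const char *const *added, size_t added_count)
{
    assert(arr != NULL);
    if (added_count == 0) return 0;

    size_t old_count = arr->count;
    size_t new_count = old_count + added_count;
    const char **grown = realloc(arr->lists, new_count * sizeof(*grown));
    if (grown == NULL) return -1;

    /* Shift the existing entries toward the end. The source and
     * destination ranges can overlap, so memmove is required. */
    memmove(grown + added_count, grown, old_count * sizeof(*grown));
    /* Copy the new entries into the gap at the front. */
    memcpy(grown, added, added_count * sizeof(*grown));

    arr->lists = grown;
    arr->count = new_count;
    return 0;
}
```

Attaching { "Test2", "Test1" } to an array that already holds "Person_OC" produces the order Test2, Test1, Person_OC: the added lists come first and the original list is still present at the end.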

Summary of the storage process

1. At runtime, the Mach-O executable is read and its images are loaded (that is, _read_images is reached through map_2_images). This reads the data structures that were stored in the __objc_catlist section at compile time into temporary variables of type category_t.

2. Iterate over the array of temporary variables, reading each category in turn.

3. Map each category list to its original class cls.

4. Call remethodizeClass to modify the method_list, property_list, and protocol_list structures, adding the category's contents to the original class.

Why categories are stored this way

From the source analysis above, we know that a category's method, property, and protocol lists are placed in front of the lists originally stored in the class object. The purpose is to make sure the category's methods are called first. It is often said that when a category overrides a method of the original class, the original method is overwritten. As the analysis shows, this is not really overwriting but priority in lookup: the original class's method is still in memory.
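A simplified model of "priority, not overwriting" in plain C: the entries are flattened into one array with the category's entry in front, and a front-to-back scan returns the first match. The names (method, lookup_first, count_named, person_methods) are hypothetical, and the real runtime searches a list of lists rather than a flat array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified method entry: a selector name plus a tag
 * saying where the implementation was defined. */
struct method {
    const char *name;
    const char *owner;
};

/* Return the first entry named `name`, scanning front to back. */
const struct method *lookup_first(const struct method *methods, size_t count,
                                  const char *name)
{
    assert(methods != NULL && name != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(methods[i].name, name) == 0) return &methods[i];
    }
    return NULL;
}

/* Count how many entries are named `name`. */
size_t count_named(const struct method *methods, size_t count,
                   const char *name)
{
    size_t n = 0;
    for (size_t i = 0; i < count; i++) {
        if (strcmp(methods[i].name, name) == 0) n++;
    }
    return n;
}

/* Category entries sit in front of the original class's entries. */
const struct method person_methods[] = {
    { "run", "Person_OC (Test2)" },   /* added by the category */
    { "run", "Person_OC" },           /* original implementation */
};
```

Looking up "run" returns the entry owned by Person_OC (Test2), while count_named still finds two entries with that name, so the original implementation has not been removed.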

How category methods are guaranteed to be called before the original class's methods

In the Objective-C method lookup flow, a binary search is used when searching a class's method list. Category methods are added in front of the original class's methods. When the binary search finds a match, it also scans backward to check whether an earlier entry has the same name (the same selector); if so, the earlier entry is returned. This guarantees the priority order: methods nearer the front of the list win. The search can be seen in the findMethodInSortedMethodList function:

static method_t *findMethodInSortedMethodList(SEL key, const method_list_t *list)
{
    assert(list);

    const method_t * const first = &list->first;
    const method_t *base = first;
    const method_t *probe;
    uintptr_t keyValue = (uintptr_t)key;
    uint32_t count;

    for (count = list->count; count != 0; count >>= 1) {
        probe = base + (count >> 1);

        uintptr_t probeValue = (uintptr_t)probe->name;

        if (keyValue == probeValue) {
            // `probe` is a match.
            // Rewind looking for the *first* occurrence of this value.
            // This is required for correct category overrides.
            while (probe > first && keyValue == (uintptr_t)probe[-1].name) {
                probe--;
            }
            return (method_t *)probe;
        }

        if (keyValue > probeValue) {
            base = probe + 1;
            count--;
        }
    }

    return nil;
}
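The same search can be tried outside the runtime with a plain C sketch. The struct entry and find_first names are hypothetical: key plays the role of the selector address and tag identifies which implementation an entry refers to. The loop structure follows the function above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry: `key` plays the role of the selector address
 * and `tag` identifies which implementation the entry refers to. */
struct entry {
    uintptr_t key;
    int tag;
};

/* Binary search over entries sorted by key. When a match is found,
 * rewind to the first entry with that key. Returns NULL if absent. */
const struct entry *find_first(const struct entry *list, size_t n,
                               uintptr_t key)
{
    assert(list != NULL);
    const struct entry *first = list;
    const struct entry *base = list;

    for (size_t count = n; count != 0; count >>= 1) {
        const struct entry *probe = base + (count >> 1);
        if (key == probe->key) {
            /* Step back while the previous entry has the same key. */
            while (probe > first && key == probe[-1].key) {
                probe--;
            }
            return probe;
        }
        if (key > probe->key) {
            base = probe + 1;
            count--;
        }
    }
    return NULL;
}
```

For the sorted keys 10, 20, 20, 20, 30, the first probe lands on the middle 20, and the rewind loop steps back to the earliest 20 before returning.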

Conclusion

1. A category is implemented by placing its method, property, and protocol data into a category_t structure and then copying the method list, property list, and protocol list in that structure into the corresponding lists of the class object.

2. A category can declare properties, but no member variables or setter/getter methods are generated for them automatically. The category_t structure has no member variable list. Member variables are stored in the instance object and their layout is determined at compile time, whereas categories are loaded at run time, so a category's member variables cannot be added to the instance layout at that point. Therefore a category cannot add member variables.