Reprinted from: http://tech.meituan.com/DiveIntoCategory.html

Abstract

No matter how well a class is designed, future requirements can evolve in ways nobody predicted. So how do you extend an existing class? In general, inheritance and composition are good choices. But Objective-C 2.0 also provides a language feature, the category, that can dynamically add new behavior to existing classes. Today, categories are everywhere in Objective-C code: from Apple's official frameworks to open source frameworks, from large, complex apps to simple ones. This article gives a comprehensive overview of categories, in the hope that it will be useful to readers.

Introduction

The author of this article is from the iOS R&D team of the Meituan Hotel & Tourism Business Group. We are committed to value creation, efficiency and excellence. You are welcome to join us (please send your resume to [email protected]). This article is a summary of what we learned from studying the Objective-C runtime source code. It mainly analyzes how categories are implemented at the runtime layer, along with the topics related to categories, and covers:

  • Getting started: an introduction to category
  • Comparing like with like: category and extension
  • A closer look: what a category really is
  • Tracing to the source: how a category is loaded
  • Side branches: category and +load methods
  • By analogy: category and method overriding
  • One level up: category and associated objects

1. Introduction to Category

Categories are a language feature added after Objective-C 2.0. Their main purpose is to add methods to existing classes. In addition, Apple recommends two other usage scenarios for categories [1]:

  • Splitting the implementation of a class across several files. This has several obvious benefits: a) it reduces the size of each file, b) it groups different functionality into different categories, c) it lets several developers work on the same class, d) it lets you load only the categories you need, and so on.
  • Declaring private methods.

Beyond the scenarios Apple recommends, imaginative developers have come up with several other uses for categories:

  • Simulating multiple inheritance
  • Exposing a framework's private methods

This language feature may seem trivial next to purely dynamic languages such as JavaScript, where you can add arbitrary methods and instance variables to a "class" or object at any time. But for a language that is not that "dynamic", it is a great feature.

2. Comparing like with like: category and extension

An extension looks a lot like an anonymous category, but an extension and a named category are almost entirely different things. An extension is resolved at compile time and is part of the class: together with the @interface in the header file and the @implementation in the implementation file it forms one complete class, and it lives and dies with that class. Extensions are generally used to hide a class's private information. You must have the source code of a class to add an extension to it, so you cannot add an extension to a system class such as NSString (see [2] for details).

Categories are completely different: they are resolved at run time. From this difference we can infer an obvious fact: an extension can add instance variables, while a category cannot. (At run time the memory layout of the object is already fixed, so adding an instance variable would break the class's internal layout, which is disastrous for a compiled language.)

3. A closer look: what a category really is

At the runtime layer, all Objective-C classes and objects are structs, and categories are no exception. A category is represented by the struct category_t (defined in objc-runtime-new.h), which contains: 1) the name of the class (name); 2) the class (cls); 3) the list of all instance methods the category adds to the class (instanceMethods); 4) the list of all class methods the category adds (classMethods); 5) the list of all protocols the category adopts (protocols); 6) all the properties the category adds (instanceProperties).

typedef struct category_t {
    const char *name;
    classref_t cls;
    struct method_list_t *instanceMethods;
    struct method_list_t *classMethods;
    struct protocol_list_t *protocols;
    struct property_list_t *instanceProperties;
} category_t;

From the definition of category_t we can also see what a category can do (add instance methods and class methods, and even adopt protocols and add properties) and what it cannot do (add instance variables). OK, let's write a category and see what it really looks like:

MyClass.h:

#import <Foundation/Foundation.h>

@interface MyClass : NSObject

- (void)printName;

@end

@interface MyClass(MyAddition)

@property(nonatomic, copy) NSString *name;

- (void)printName;

@end

MyClass.m:

#import "MyClass.h"

@implementation MyClass

- (void)printName
{
    NSLog(@"%@",@"MyClass");
}

@end

@implementation MyClass(MyAddition)

- (void)printName
{
    NSLog(@"%@",@"MyAddition");
}

@end

Let's use clang's -rewrite-objc option to see what the category actually becomes:

clang -rewrite-objc MyClass.m

We get a .cpp file about 3 MB in size and roughly 100,000 lines long (this is definitely something to tease Apple about). Ignoring everything that does not concern us, we find the following code snippet at the end of the file:

static struct /*_method_list_t*/ {
unsigned int entsize;  // sizeof(struct _objc_method)
unsigned int method_count;
struct _objc_method method_list[1];
} _OBJC_$_CATEGORY_INSTANCE_METHODS_MyClass_$_MyAddition __attribute__ ((used, section ("__DATA,__objc_const"))) = {
sizeof(_objc_method),
1,
{{(struct objc_selector *)"printName", "v16@0:8", (void *)_I_MyClass_MyAddition_printName}}
};

static struct /*_prop_list_t*/ {
unsigned int entsize;  // sizeof(struct _prop_t)
unsigned int count_of_properties;
struct _prop_t prop_list[1];
} _OBJC_$_PROP_LIST_MyClass_$_MyAddition __attribute__ ((used, section ("__DATA,__objc_const"))) = {
sizeof(_prop_t),
1,
{{"name","T@\"NSString\",C,N"}}
};

extern "C" __declspec(dllexport) struct _class_t OBJC_CLASS_$_MyClass;

static struct _category_t _OBJC_$_CATEGORY_MyClass_$_MyAddition __attribute__ ((used, section ("__DATA,__objc_const"))) =
{
"MyClass",
0, // &OBJC_CLASS_$_MyClass,
(const struct _method_list_t *)&_OBJC_$_CATEGORY_INSTANCE_METHODS_MyClass_$_MyAddition,
0,
0,
(const struct _prop_list_t *)&_OBJC_$_PROP_LIST_MyClass_$_MyAddition,
};
static void OBJC_CATEGORY_SETUP_$_MyClass_$_MyAddition(void ) {
_OBJC_$_CATEGORY_MyClass_$_MyAddition.cls = &OBJC_CLASS_$_MyClass;
}
#pragma section(".objc_inithooks$B", long, read, write)
__declspec(allocate(".objc_inithooks$B")) static void *OBJC_CATEGORY_SETUP[] = {
(void *)&OBJC_CATEGORY_SETUP_$_MyClass_$_MyAddition,
};
static struct _class_t *L_OBJC_LABEL_CLASS_$ [1] __attribute__((used, section ("__DATA, __objc_classlist,regular,no_dead_strip")))= {
&OBJC_CLASS_$_MyClass,
};
static struct _class_t *_OBJC_LABEL_NONLAZY_CLASS_$[] = {
&OBJC_CLASS_$_MyClass,
};
static struct _category_t *L_OBJC_LABEL_CATEGORY_$ [1] __attribute__((used, section ("__DATA, __objc_catlist,regular,no_dead_strip")))= {
&_OBJC_$_CATEGORY_MyClass_$_MyAddition,
};

We can see that:

1) First, the compiler generates the instance method list _OBJC_$_CATEGORY_INSTANCE_METHODS_MyClass_$_MyAddition and the property list _OBJC_$_PROP_LIST_MyClass_$_MyAddition. Both are named using the pattern common prefix + class name + category name. The instance method list is filled with the printName method that we wrote in the MyAddition category, and the property list is filled with the name property we added in MyAddition. Also note that the category name is used to name these lists and the category struct itself, and that they are all declared static, so a category name cannot be repeated within the same compilation unit or we get a compilation error.

2) Second, the compiler generates the category itself, _OBJC_$_CATEGORY_MyClass_$_MyAddition, and initializes it with the lists generated above.

3) Finally, the compiler stores a category_t array of size 1, L_OBJC_LABEL_CATEGORY_$, in the __objc_catlist section of the __DATA segment (if there were several categories, the array would be correspondingly longer). This array is used to load categories at run time.

At this point, the compiler's work is almost done. The next section shows how a category is loaded at run time.
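The naming pattern can be sketched in a few lines of plain C++. This is only an illustration of the string pattern visible in the generated output; the helper function names are hypothetical and are not part of clang or the runtime.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers that rebuild the symbol names seen in the rewritten
// .cpp file: common prefix + class name + "_$_" + category name.

std::string categorySymbol(const std::string &cls, const std::string &cat) {
    return "_OBJC_$_CATEGORY_" + cls + "_$_" + cat;
}

std::string instanceMethodListSymbol(const std::string &cls,
                                     const std::string &cat) {
    return "_OBJC_$_CATEGORY_INSTANCE_METHODS_" + cls + "_$_" + cat;
}

std::string propertyListSymbol(const std::string &cls,
                               const std::string &cat) {
    return "_OBJC_$_PROP_LIST_" + cls + "_$_" + cat;
}
```

Because the category name is the only part that distinguishes two categories on the same class, two categories with the same name in one compilation unit would map to the same static symbol, which is why the compiler rejects them.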

4. Tracing to the source: how a category is loaded

We know that Objective-C relies on the Objective-C runtime, which, like other system libraries, is dynamically loaded on OS X and iOS by dyld. To learn more about dyld, see [3].

For the Objective-C runtime, the entry function is as follows (in objc-os.mm):

void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;

    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    lock_init();
    exception_init();

    // Register for unmap first, in case some +load unmaps something
    _dyld_register_func_for_remove_image(&unmap_image);
    dyld_register_image_state_change_handler(dyld_image_state_bound,
                                             1/*batch*/, &map_images);
    dyld_register_image_state_change_handler(dyld_image_state_dependents_initialized, 0/*not batch*/, &load_images);
}

Categories are attached to their classes when map_images is called. Under the new ABI, the map_images registered in _objc_init eventually calls the _read_images function in objc-runtime-new.mm. Near the end of _read_images there is the following code snippet:

    // Discover categories.
    for (EACH_HEADER) {
        category_t **catlist =
            _getObjc2CategoryList(hi, &count);
        for (i = 0; i < count; i++) {
            category_t *cat = catlist[i];
            class_t *cls = remapClass(cat->cls);

            if (!cls) {
                // Category's target class is missing (probably weak-linked).
                // Disavow any knowledge of this category.
                catlist[i] = NULL;
                if (PrintConnecting) {
                    _objc_inform("CLASS: IGNORING category \?\?\?(%s) %p with "
                                 "missing weak-linked target class",
                                 cat->name, cat);
                }
                continue;
            }

            // Process this category.
            // First, register the category with its target class.
            // Then, rebuild the class's method lists (etc) if
            // the class is realized.
            BOOL classExists = NO;
            if (cat->instanceMethods || cat->protocols
                || cat->instanceProperties)
            {
                addUnattachedCategoryForClass(cat, cls, hi);
                if (isRealized(cls)) {
                    remethodizeClass(cls);
                    classExists = YES;
                }
                if (PrintConnecting) {
                    _objc_inform("CLASS: found category -%s(%s) %s",
                                 getName(cls), cat->name,
                                 classExists ? "on existing class" : "");
                }
            }

            if (cat->classMethods || cat->protocols
                /* || cat->classProperties */)
            {
                addUnattachedCategoryForClass(cat, cls->isa, hi);
                if (isRealized(cls->isa)) {
                    remethodizeClass(cls->isa);
                }
                if (PrintConnecting) {
                    _objc_inform("CLASS: found category +%s(%s)",
                                 getName(cls), cat->name);
                }
            }
        }
    }

First of all, the catlist here is the category_t array that the compiler prepared for us in the previous section. How the catlist itself is loaded we will set aside for now, since it has little to do with categories themselves; interested readers can study Apple's binary format and loading mechanism. Ignoring the PrintConnecting logging, this code is easy to understand: 1) it adds the category's instance methods, protocols and properties to the class; 2) it adds the category's class methods and protocols to the class's metaclass.

Note the short comment in the code, /* || cat->classProperties */: it looks as though Apple once planned to support adding properties to the class itself. OK, let's move on and see how a category's various lists are eventually added to the class, taking the instance method list as an example. In the snippet above, addUnattachedCategoryForClass only records the mapping between the class and the category; remethodizeClass is what does the actual work of adding them.

static void remethodizeClass(class_t *cls)
{
    category_list *cats;
    BOOL isMeta;

    rwlock_assert_writing(&runtimeLock);

    isMeta = isMetaClass(cls);

    // Re-methodizing: check for more categories
    if ((cats = unattachedCategoriesForClass(cls))) {
        chained_property_list *newproperties;
        const protocol_list_t **newprotos;

        if (PrintConnecting) {
            _objc_inform("CLASS: attaching categories to class '%s' %s",
                         getName(cls), isMeta ? "(meta)" : "");
        }

        // Update methods, properties, protocols

        BOOL vtableAffected = NO;
        attachCategoryMethods(cls, cats, &vtableAffected);

        newproperties = buildPropertyList(NULL, cats, isMeta);
        if (newproperties) {
            newproperties->next = cls->data()->properties;
            cls->data()->properties = newproperties;
        }

        newprotos = buildProtocolList(cats, NULL, cls->data()->protocols);
        if (cls->data()->protocols  &&  cls->data()->protocols != newprotos) {
            _free_internal(cls->data()->protocols);
        }
        cls->data()->protocols = newprotos;

        _free_internal(cats);

        // Update method caches and vtables
        flushCaches(cls);
        if (vtableAffected) flushVtables(cls);
    }
}

To add the category's instance methods to the class, remethodizeClass calls attachCategoryMethods:

static void
attachCategoryMethods(class_t *cls, category_list *cats,
                      BOOL *inoutVtablesAffected)
{
    if (!cats) return;
    if (PrintReplacedMethods) printReplacements(cls, cats);

    BOOL isMeta = isMetaClass(cls);
    method_list_t **mlists = (method_list_t **)
        _malloc_internal(cats->count * sizeof(*mlists));

    // Count backwards through cats to get newest categories first
    int mcount = 0;
    int i = cats->count;
    BOOL fromBundle = NO;
    while (i--) {
        method_list_t *mlist = cat_method_list(cats->list[i].cat, isMeta);
        if (mlist) {
            mlists[mcount++] = mlist;
            fromBundle |= cats->list[i].fromBundle;
        }
    }

    attachMethodLists(cls, mlists, mcount, NO, fromBundle, inoutVtablesAffected);

    _free_internal(mlists);
}

attachCategoryMethods does a relatively simple job: it collects the instance method lists of all the categories into one large array of method lists and passes it to attachMethodLists (I promise, this is the last piece of code in this section). That function is rather long, so let's look at just a small part of it:

for (uint32_t m = 0;
             (scanForCustomRR || scanForCustomAWZ)  &&  m < mlist->count;
             m++)
        {
            SEL sel = method_list_nth(mlist, m)->name;
            if (scanForCustomRR  &&  isRRSelector(sel)) {
                cls->setHasCustomRR();
                scanForCustomRR = false;
            } else if (scanForCustomAWZ  &&  isAWZSelector(sel)) {
                cls->setHasCustomAWZ();
                scanForCustomAWZ = false;
            }
        }

        // Fill method list array
        newLists[newCount++] = mlist;
    .
    .
    .

    // Copy old methods to the method list array
    for (i = 0; i < oldCount; i++) {
        newLists[newCount++] = oldLists[i];
    }

There are two points to note: 1) A category's methods do not "completely replace" the methods the original class already has. If both the category and the original class have methodA, then after the category is attached the class's method list contains two methodA entries. 2) A category's methods are placed at the front of the new method list and the original class's methods at the back. This is what we usually mean when we say that a category's method "overrides" the original class's method of the same name: the runtime searches the method list in order and stops as soon as it finds a method with the matching name, even though there may be other methods with the same name further on.
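Both points can be reproduced with a small model in plain C++. This is a minimal sketch, not the runtime's code: MethodEntry, attachCategory, firstMatch and lastMatch are hypothetical names, a std::vector stands in for the runtime's array of method lists, and a linear scan stands in for its method search.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One entry of a simplified method list. "owner" records which
// implementation the entry stands for; it exists only for illustration.
struct MethodEntry {
    std::string selector;
    std::string owner;   // e.g. "MyClass" or "MyAddition"
};

using MethodList = std::vector<MethodEntry>;

// Put the category's methods first and the class's original methods after
// them, in the same spirit as attachMethodLists filling the new array
// before copying the old lists.
MethodList attachCategory(const MethodList &categoryMethods,
                          const MethodList &classMethods) {
    MethodList merged(categoryMethods);
    merged.insert(merged.end(), classMethods.begin(), classMethods.end());
    return merged;
}

// Stops at the first entry with a matching selector, so an entry placed
// earlier in the list hides later entries with the same selector.
const MethodEntry *firstMatch(const MethodList &methods,
                              const std::string &selector) {
    for (const MethodEntry &m : methods) {
        if (m.selector == selector) return &m;
    }
    return nullptr;
}

// Scans the whole list and keeps the last matching entry, which in this
// model is the original class's implementation.
const MethodEntry *lastMatch(const MethodList &methods,
                             const std::string &selector) {
    const MethodEntry *found = nullptr;
    for (const MethodEntry &m : methods) {
        if (m.selector == selector) found = &m;
    }
    return found;
}
```

With a class list containing printName and a category list that also contains printName, the merged list holds both entries; firstMatch returns the category's entry and lastMatch returns the class's own.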

5. Side branches: category and +load methods

We know that the +load method can be implemented in both classes and categories, which raises two questions: 1) Can the methods declared in a category be called while the class's +load method is running? 2) In what order are these +load methods called? Given how much code we have already read in the preceding sections, let's start with a more intuitive look at these two questions:

Our code contains MyClass and two categories of MyClass, Category1 and Category2. MyClass and both categories implement +load, and both Category1 and Category2 implement MyClass's printName method. In Xcode, click Edit Scheme and add two environment variables that make the runtime log when +load methods are executed and when categories are loaded. The logs below correspond to the runtime's OBJC_PRINT_LOAD_METHODS and OBJC_PRINT_REPLACED_METHODS options; see objc-private.h for more environment variable options.

Running the project prints a lot to the console. Picking out just the information we care about, in order:

objc[1187]: REPLACED: -[MyClass printName] by category Category1

objc[1187]: REPLACED: -[MyClass printName] by category Category2

.

.

.

objc[1187]: LOAD: class ‘MyClass’ scheduled for +load

objc[1187]: LOAD: category ‘MyClass(Category1)’ scheduled for +load

objc[1187]: LOAD: category ‘MyClass(Category2)’ scheduled for +load

objc[1187]: LOAD: +[MyClass load]

.

.

.

objc[1187]: LOAD: +[MyClass(Category1) load]

.

.

.

objc[1187]: LOAD: +[MyClass(Category2) load]

The answers to the two questions are now obvious: 1) Yes, they can be called, because attaching categories to the class happens before any +load method is executed. 2) The class's +load runs first, followed by the categories' +load methods, and the order among the categories follows the compilation order. In the current build, Category1 is compiled before Category2.

Now we swap the compilation order of Category1 and Category2 and run again. The order of the console output changes accordingly:

objc[1187]: REPLACED: -[MyClass printName] by category Category2

objc[1187]: REPLACED: -[MyClass printName] by category Category1

.

.

.

objc[1187]: LOAD: class ‘MyClass’ scheduled for +load

objc[1187]: LOAD: category ‘MyClass(Category2)’ scheduled for +load

objc[1187]: LOAD: category ‘MyClass(Category1)’ scheduled for +load

objc[1187]: LOAD: +[MyClass load]

.

.

.

objc[1187]: LOAD: +[MyClass(Category2) load]

.

.

.

objc[1187]: LOAD: +[MyClass(Category1) load]

That is the execution order for +load. For "overridden" methods, however, it is the implementation in the last compiled category that is found first. In this section we only answered the questions in an intuitive way; if you are interested, you can dig further into the Objective-C runtime code.
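The contrast between the two orders can be written down as a small model in plain C++. It is a sketch under simplifying assumptions: loadOrder and methodListOrder are hypothetical helpers, a single class is considered, and the category names are given in compilation order.

```cpp
#include <cassert>
#include <string>
#include <vector>

// +load order in this model: the class first, then its categories in the
// order they were compiled.
std::vector<std::string> loadOrder(const std::string &cls,
                                   const std::vector<std::string> &compiled) {
    std::vector<std::string> order;
    order.push_back("+[" + cls + " load]");
    for (const std::string &cat : compiled) {
        order.push_back("+[" + cls + "(" + cat + ") load]");
    }
    return order;
}

// Method list order in this model: categories are walked backwards (as in
// attachCategoryMethods), so the last compiled category comes first and the
// class's own implementation comes last.
std::vector<std::string> methodListOrder(const std::string &cls,
                                         const std::vector<std::string> &compiled) {
    std::vector<std::string> order;
    for (auto it = compiled.rbegin(); it != compiled.rend(); ++it) {
        order.push_back(*it);
    }
    order.push_back(cls);
    return order;
}
```

For the compilation order Category1, Category2, the model yields the +load sequence seen in the first log, while the front of the method list belongs to Category2, so a call to printName would reach Category2's implementation.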

6. By analogy: category and method overriding

With the groundwork laid in the previous sections, only one question remains for this section: how do we call a method of the original class that has been overridden by a category? We already know that a category does not completely replace the original class's method of the same name; the category's method merely sits earlier in the method list. So we can reach the original class's method by walking the method list and taking the last method with the matching name:

Class currentClass = [MyClass class];
MyClass *my = [[MyClass alloc] init];

if (currentClass) {
    unsigned int methodCount;
    Method *methodList = class_copyMethodList(currentClass, &methodCount);
    IMP lastImp = NULL;
    SEL lastSel = NULL;
    for (NSInteger i = 0; i < methodCount; i++) {
        Method method = methodList[i];
        NSString *methodName = [NSString stringWithCString:sel_getName(method_getName(method))
                                                  encoding:NSUTF8StringEncoding];
        if ([@"printName" isEqualToString:methodName]) {
            lastImp = method_getImplementation(method);
            lastSel = method_getName(method);
        }
    }
    typedef void (*fn)(id,SEL);

    if (lastImp != NULL) {
        fn f = (fn)lastImp;
        f(my,lastSel);
    }
    free(methodList);
}

7. One level up: category and associated objects

As we saw above, a category cannot add instance variables to a class. But we often need to attach a value to an object from within a category, and for that we can use associated objects.

MyClass+Category1.h:

#import "MyClass.h"

@interface MyClass (Category1)

@property(nonatomic,copy) NSString *name;

@end

MyClass+Category1.m:

#import "MyClass+Category1.h"
#import <objc/runtime.h>

@implementation MyClass (Category1)

+ (void)load
{
    NSLog(@"%@",@"load in Category1");
}

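// Use the address of a static variable as the key, so that the setter and
// the getter are guaranteed to pass the same pointer.
static char kNameKey;

- (void)setName:(NSString *)name
{
    objc_setAssociatedObject(self,
                             &kNameKey,
                             name,
                             OBJC_ASSOCIATION_COPY);
}

- (NSString*)name
{
    NSString *nameObject = objc_getAssociatedObject(self, &kNameKey);
    return nameObject;
}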

@end

But where are associated objects stored? How are they stored? And how are they dealt with when the object is destroyed? Digging into the runtime source again, objc-references.mm contains a function named _object_set_associative_reference:

void _object_set_associative_reference(id object, void *key, id value, uintptr_t policy) {
    // retain the new value (if any) outside the lock.
    ObjcAssociation old_association(0, nil);
    id new_value = value ? acquireValue(value, policy) : nil;
    {
        AssociationsManager manager;
        AssociationsHashMap &associations(manager.associations());
        disguised_ptr_t disguised_object = DISGUISE(object);
        if (new_value) {
            // break any existing association.
            AssociationsHashMap::iterator i = associations.find(disguised_object);
            if (i != associations.end()) {
                // secondary table exists
                ObjectAssociationMap *refs = i->second;
                ObjectAssociationMap::iterator j = refs->find(key);
                if (j != refs->end()) {
                    old_association = j->second;
                    j->second = ObjcAssociation(policy, new_value);
                } else {
                    (*refs)[key] = ObjcAssociation(policy, new_value);
                }
            } else {
                // create the new association (first time).
                ObjectAssociationMap *refs = new ObjectAssociationMap;
                associations[disguised_object] = refs;
                (*refs)[key] = ObjcAssociation(policy, new_value);
                _class_setInstancesHaveAssociatedObjects(_object_getClass(object));
            }
        } else {
            // setting the association to nil breaks the association.
            AssociationsHashMap::iterator i = associations.find(disguised_object);
            if (i != associations.end()) {
                ObjectAssociationMap *refs = i->second;
                ObjectAssociationMap::iterator j = refs->find(key);
                if (j != refs->end()) {
                    old_association = j->second;
                    refs->erase(j);
                }
            }
        }
    }
    // release the old value (outside of the lock).
    if (old_association.hasValue()) ReleaseValue()(old_association);
}

We can see that all associated objects are managed by AssociationsManager, which is defined as follows:

class AssociationsManager {
    static OSSpinLock _lock;
    static AssociationsHashMap *_map;               // associative references:  object pointer -> PtrPtrHashMap.
public:
    AssociationsManager()   { OSSpinLockLock(&_lock); }
    ~AssociationsManager()  { OSSpinLockUnlock(&_lock); }

    AssociationsHashMap &associations() {
        if (_map == NULL)
            _map = new AssociationsHashMap();
        return *_map;
    }
};

AssociationsManager holds a static AssociationsHashMap that stores all associated objects. In effect, the associated objects of every object live in one global map. The key of this map is the object's pointer address (the addresses of any two distinct live objects are always different), and the value is another map, an ObjectAssociationMap, which stores that object's key-value pairs. The object destruction logic is in objc-runtime-new.mm:

void *objc_destructInstance(id obj)
{
    if (obj) {
        Class isa_gen = _object_getClass(obj);
        class_t *isa = newcls(isa_gen);

        // Read all of the flags at once for performance.
        bool cxx = hasCxxStructors(isa);
        bool assoc = !UseGC && _class_instancesHaveAssociatedObjects(isa_gen);

        // This order is important.
        if (cxx) object_cxxDestruct(obj);
        if (assoc) _object_remove_assocations(obj);

        if (!UseGC) objc_clear_deallocating(obj);
    }

    return obj;
}

So the runtime's objc_destructInstance function checks whether the object has any associated objects and, if so, calls _object_remove_assocations to clean them up.
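The two-level structure and the cleanup on destruction can be modeled in a few lines of standard C++. This is a simplified sketch, not the runtime's implementation: AssociationStore and its member functions are hypothetical, a std::string stands in for the stored value and its retain policy, and the locking that AssociationsManager performs is left out.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Inner table: association key -> value for one object.
using ObjectAssociationTable = std::unordered_map<const void *, std::string>;

// Outer table: object address -> that object's inner table.
class AssociationStore {
public:
    // Creates the inner table on first use, then stores or replaces the value.
    void set(const void *object, const void *key, const std::string &value) {
        table_[object][key] = value;
    }

    // Returns nullptr when the object or the key has no association.
    const std::string *get(const void *object, const void *key) const {
        auto obj = table_.find(object);
        if (obj == table_.end()) return nullptr;
        auto entry = obj->second.find(key);
        if (entry == obj->second.end()) return nullptr;
        return &entry->second;
    }

    // Models the cleanup done when an object is destroyed: everything
    // attached to that object is dropped at once.
    void removeAll(const void *object) { table_.erase(object); }

    // True if the object currently has an inner table.
    bool hasAssociations(const void *object) const {
        return table_.count(object) != 0;
    }

private:
    std::unordered_map<const void *, ObjectAssociationTable> table_;
};
```

Keying the outer table by object address is what lets two objects use the same key (for example the address of one static variable) without their values colliding, and it makes removing all of an object's associations a single erase.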

Afterword

As Mr. Hou said, "There are no secrets when it comes to source code." Apple's Cocoa Touch framework is not open source, but the Objective-C runtime and Core Foundation are completely open source (in www.opensource.apple.com/tarballs/ can… The runtime source code series will continue to be updated, and you can download the source code from the website above. My writing is rough; if you find mistakes, corrections are welcome.