Preface
Categories are one of the techniques we use most often in Objective-C: we use them to add methods to a class, to add properties, or to organize a large class into more manageable pieces.
Let's use the following questions as a starting point and explore how categories work under the hood, with the help of the runtime source code.
We can add methods directly in a category, and those methods are merged into the class. We can also declare properties in a category, but such properties cannot store values on their own, much like computed properties in Swift. If we want a category property to store a value, we have to implement the storage dynamically through the runtime.
Exploration
1. Why can I add methods to a Category but not member variables?
We’ll start by creating a Person class, then create a Category of Person+Run, and implement the -run method in Person+Run.
We can compile Person+Run.m from the command line:
xcrun -sdk iphonesimulator clang -rewrite-objc Person+Run.m
This produces a Person+Run.cpp file. Near the bottom of the file you can find the following structure:
struct _category_t {
    const char *name;
    struct _class_t *cls;
    const struct _method_list_t *instance_methods;
    const struct _method_list_t *class_methods;
    const struct _protocol_list_t *protocols;
    const struct _prop_list_t *properties;
};
The field names are largely self-explanatory.
Each category is compiled into a variable of type _category_t:
static struct _category_t _OBJC_$_CATEGORY_Person_$_Run __attribute__ ((used, section ("__DATA,__objc_const"))) =
{
"Person",
0, // &OBJC_CLASS_$_Person,
(const struct _method_list_t *)&_OBJC_$_CATEGORY_INSTANCE_METHODS_Person_$_Run,
0,
0,
0,
};
Our Person+Run has only one instance method, so only the instance_methods field is filled in; the remaining list fields are 0.
The _category_t structure also shows that properties are stored in a _prop_list_t. Unlike a class, it has no ivar list, so a category has nowhere to store ivars. That is why you cannot add member variables in a category.
If we manually write the setter and getter for a property in a category, we can call them, but there is no memory behind them to hold the value. This is similar to a computed property in Swift: it amounts to two methods (set and get) with no storage of their own.
For example
@property (copy, nonatomic) NSString * name;
If this declaration appears in a class, the compiler by default generates a member variable _name, declares the setName: and name methods, and provides default implementations of both.
If it appears in a category, it only declares setName: and name; there is no implementation and no _name.
2. When Category methods are merged into a class
We all know that categories are loaded by the runtime when the application launches. To see the details of that loading process, we need to read the runtime source code.
The runtime source is large, so we only skim it to locate the relevant code.
The following function is in objc-runtime-new.mm:
/***********************************************************************
* methodizeClass
* Fixes up cls's method list, protocol list, and property list.
* Attaches any outstanding categories.
* Locking: runtimeLock must be held by the caller
**********************************************************************/
static void methodizeClass(Class cls, Class previously)
Its comment is fairly clear: it fixes up the class's method list, protocol list, and property list, and attaches any categories that have not been attached yet.
Looking further, we find the method we need:
void attachLists(List* const * addedLists, uint32_t addedCount)
We can take a piece of code from it and analyze it to solve our problem.
// many lists -> many lists
uint32_t oldCount = array()->count;
uint32_t newCount = oldCount + addedCount;
setArray((array_t *)realloc(array(), array_t::byteSize(newCount)));
array()->count = newCount;
memmove(array()->lists + addedCount, array()->lists,
oldCount * sizeof(array()->lists[0]));
memcpy(array()->lists, addedLists,
addedCount * sizeof(array()->lists[0]));
Before this method is called, the method lists of all the categories are collected into one list (one element per category), which is then passed to attachLists. The realloc call is given newCount because the array has to grow to make room for the category method lists. After the array has grown, memmove is called first and then memcpy. Both copy a block of memory from one place to another; the difference is that memmove also handles the case where the source and destination regions overlap.
In other words, the existing elements of the array are first shifted toward the back by as many slots as we are about to add. The destination of that shift lies inside the same array, so the source and destination regions can overlap, and copying them naively could overwrite elements that have not been moved yet. That is why memmove, which handles overlap correctly, is used here.
Once that is done, the category method lists are copied into the now-free slots at the front of the array with memcpy.
That is the whole process, and along the way it also answers the third question.
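The grow, shift, and prepend sequence can be tried out in plain C. The following is only a minimal sketch, not the runtime's code: list_array and attach_lists are hypothetical names, and each "method list" is reduced to a string so the resulting order is easy to inspect.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the runtime's array of method-list pointers.
 * Each entry is just a label naming where a method list came from. */
typedef struct {
    const char **lists;
    size_t count;
} list_array;

/* Prepend `added_count` entries to `arr`, following the same
 * realloc + memmove + memcpy sequence as the excerpt above.
 * Returns 0 on success, -1 if the allocation fails. */
int attach_lists(list_array *arr, const char *const *added, size_t added_count)
{
    size_t old_count = arr->count;
    size_t new_count = old_count + added_count;

    const char **grown = realloc(arr->lists, new_count * sizeof *grown);
    if (grown == NULL) return -1;
    arr->lists = grown;
    arr->count = new_count;

    /* Shift the existing entries toward the back. Source and destination
     * are in the same buffer and may overlap, so memmove is required. */
    memmove(arr->lists + added_count, arr->lists,
            old_count * sizeof arr->lists[0]);

    /* Copy the new entries into the freed slots at the front. The two
     * buffers are distinct, so memcpy is sufficient. */
    memcpy(arr->lists, added, added_count * sizeof arr->lists[0]);
    return 0;
}
```

Starting from an empty array, attaching a single "Person" entry and then two category entries leaves the category entries at indexes 0 and 1 and "Person" at index 2.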
3. Order of execution of Category methods and methods in a class
As described above, the class's original method lists are moved toward the back and the category method lists are placed at the front. A method call searches the lists from the front, so the category's method is found and called first. The original method is not actually overwritten; it is still in the list. The mechanism simply guarantees that when a category defines a method with the same name as one in the class, the category's version is found first.
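A small C sketch makes the "first match wins" behaviour concrete. It is a simplification under stated assumptions: method_entry and find_method are hypothetical names, and the search is a plain linear scan over one flat array rather than the runtime's actual lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified method entry: a selector name plus a label
 * saying which class or category supplied the implementation. */
typedef struct {
    const char *selector;
    const char *owner;
} method_entry;

/* Search from the front and return the first entry whose selector
 * matches, or NULL if there is none. Later duplicates are never reached,
 * but they are still present in the array. */
const method_entry *find_method(const method_entry *methods, size_t count,
                                const char *selector)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(methods[i].selector, selector) == 0) {
            return &methods[i];
        }
    }
    return NULL;
}
```

With an array that holds a category's run before the class's own run, a lookup of run returns the category entry, while the class's entry remains in the array untouched.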
4. +load vs. +initialize
We look at this question from two angles: how the methods are called and when they are called.
+load
As a simple example, we create a Person class that implements +load, and two categories of Person that each implement +load as well.
@implementation Person
+ (void)load {
NSLog(@"Person - load");
}
@end
@implementation Person (Test1)
+ (void)load {
NSLog(@"Person Test1 - load");
}
@end
@implementation Person (Test2)
+ (void)load {
NSLog(@"Person Test2 - load");
}
@end
When we run the project, we get the following output:
2020-09-14 09:34:41.900161+0800 Category[4533:53426] Person - load
2020-09-14 09:34:41.900629+0800 Category[4533:53426] Person Test1 - load
2020-09-14 09:34:41.900700+0800 Category[4533:53426] Person Test2 - load
We never use the Person class or its categories in our code, so these calls come from the runtime itself: it calls +load while the classes and categories are being loaded at launch.
The entry point in the runtime source is:
void load_images(const char *path __unused, const struct mach_header *mh)
The last line of this function calls call_load_methods(), which performs the +load calls.
Here is the implementation of call_load_methods():
void call_load_methods(void)
{
    static bool loading = NO;
    bool more_categories;

    loadMethodLock.assertLocked();

    // Re-entrant calls do nothing; the outermost call will finish the job.
    if (loading) return;
    loading = YES;

    void *pool = objc_autoreleasePoolPush();

    do {
        // 1. Repeatedly call class +loads until there aren't any more
        while (loadable_classes_used > 0) {
            call_class_loads();
        }

        // 2. Call category +loads ONCE
        more_categories = call_category_loads();

        // 3. Run more +loads if there are classes OR more untried categories
    } while (loadable_classes_used > 0 || more_categories);

    objc_autoreleasePoolPop(pool);

    loading = NO;
}
As the loop shows, call_class_loads() runs first, and call_category_loads() runs only after the class +load methods are done.
Next, let's look at call_class_loads():
static void call_class_loads(void)
{
    int i;

    // Detach current loadable list.
    struct loadable_class *classes = loadable_classes;
    int used = loadable_classes_used;
    loadable_classes = nil;
    loadable_classes_allocated = 0;
    loadable_classes_used = 0;

    // Call all +loads for the detached list.
    for (i = 0; i < used; i++) {
        Class cls = classes[i].cls;
        load_method_t load_method = (load_method_t)classes[i].method;
        if (!cls) continue;

        if (PrintLoading) {
            _objc_inform("LOAD: +[%s load]\n", cls->nameForLogging());
        }
        (*load_method)(cls, @selector(load));
    }

    // Destroy the detached list.
    if (classes) free(classes);
}
As the loop in the middle shows, the runtime takes the pointer to each class's +load function and calls the function directly through that pointer. call_category_loads() calls the categories' +load methods the same way. Because no method lookup is involved, a category's +load does not hide the class's +load; every implementation runs.
Together with call_load_methods(), this also explains why +load is always called on the classes before it is called on the categories.
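To illustrate why all three +load implementations run, here is a minimal C sketch of calling implementations directly through function pointers, classes first and categories second. Every name in it (load_fn, run_all_loads, call_log, and the three *_load functions) is hypothetical and only stands in for the runtime's loadable lists.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LOG 8

/* Hypothetical +load implementation type: no receiver, no selector. */
typedef void (*load_fn)(void);

/* Records the order in which the implementations were called. */
const char *call_log[MAX_LOG];
size_t call_count = 0;

void record(const char *who)
{
    if (call_count < MAX_LOG) call_log[call_count++] = who;
}

void person_load(void) { record("Person"); }
void test1_load(void)  { record("Person(Test1)"); }
void test2_load(void)  { record("Person(Test2)"); }

/* Call every class implementation, then every category implementation.
 * Each one is invoked through its own pointer, so implementations that
 * share a name cannot shadow one another. */
void run_all_loads(const load_fn *class_loads, size_t n_classes,
                   const load_fn *category_loads, size_t n_categories)
{
    for (size_t i = 0; i < n_classes; i++) class_loads[i]();
    for (size_t i = 0; i < n_categories; i++) category_loads[i]();
}
```

Contrast this with a name-based lookup, where only the first matching implementation would ever be reached.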
Consider: if there is inheritance, what is the order in which +load would be called?
From call_class_loads() above we can see a list, loadable_classes, which we can assume holds the classes whose +load is to be called. The loop starts at index 0, so the call order depends on the order in which classes were added to this list.
That list is built in prepare_load_methods, whose implementation calls schedule_class_load:
static void schedule_class_load(Class cls)
{
    if (!cls) return;
    ASSERT(cls->isRealized());  // _read_images should realize

    if (cls->data()->flags & RW_LOADED) return;

    // Ensure superclass-first ordering
    schedule_class_load(cls->superclass);

    add_class_to_loadable_list(cls);
    cls->setInfo(RW_LOADED);
}
schedule_class_load(cls->superclass) is called recursively before add_class_to_loadable_list(cls), so a parent class is always added to the list before its subclass. Therefore a parent's +load is always called before the subclass's +load.
For classes with no inheritance relationship, the order depends on the compile order; you can verify this yourself.
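The superclass-first recursion is easy to reproduce in isolation. The sketch below is an assumption-laden model, not runtime code: sim_class, loadable_list, and schedule_load are hypothetical names, and the scheduled flag plays the role of RW_LOADED.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LOADABLE 16

/* Hypothetical, simplified class record. */
typedef struct sim_class {
    const char *name;
    struct sim_class *superclass;
    bool scheduled;              /* plays the role of RW_LOADED */
} sim_class;

/* Hypothetical stand-in for the loadable_classes list. */
typedef struct {
    sim_class *items[MAX_LOADABLE];
    size_t used;
} loadable_list;

/* Append cls to the list, but only after its whole superclass chain has
 * been appended. A class that is already scheduled is skipped. */
void schedule_load(loadable_list *list, sim_class *cls)
{
    if (cls == NULL) return;
    if (cls->scheduled) return;

    schedule_load(list, cls->superclass);   /* superclass-first ordering */

    assert(list->used < MAX_LOADABLE);
    list->items[list->used++] = cls;
    cls->scheduled = true;
}
```

Scheduling Student (a subclass of Person) before Person still yields Person first in the list, and scheduling Person again later adds nothing.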
Summary:
The runtime calls +load once for each class and each category when it loads them.
First, the classes' +load methods are called:
- in compile order
- a parent class's +load before its subclass's +load
Then the categories' +load methods are called:
- in compile order
If any point in this summary was not covered above, you can verify it yourself.
+initialize
+initialize is called differently. It is not called at load time: if a class is never used, its +initialize is never called. It is called the first time the class is used.
@implementation Person
+ (void)initialize {
NSLog(@"Person - initialize");
}
@end
@implementation Person (Test1)
+ (void)initialize {
NSLog(@"Person Test1 - initialize");
}
@end
@implementation Person (Test2)
+ (void)initialize {
NSLog(@"Person Test2 - initialize");
}
@end
When we execute [Person alloc];, +initialize is called, and the implementation that runs is one from a category:
2020-09-14 10:40:23.579623+0800 Category[9134:94173] Person Test2 - initialize
This matches what we saw earlier: category methods are placed at the front of the list, so they are found and executed first. We can therefore guess that +initialize is called through the normal message mechanism, objc_msgSend.
Since objc_msgSend is written in assembly and is hard to read directly, we need a different way into the source.
What does objc_msgSend essentially do? For an instance method call, it finds the class through the isa pointer, looks for the method there, and calls it; if the method is not found, it follows superclass to the parent and keeps looking. In the example above we only called alloc, yet +initialize was called too, so we guess that +initialize is triggered during method lookup. Following this idea we arrive at class_getInstanceMethod() (which is called to look up a SEL in a class), and from there at:
IMP lookUpImpOrForward(id inst, SEL sel, Class cls, int behavior)
In this function we find the following code:
if ((behavior & LOOKUP_INITIALIZE) && !cls->isInitialized()) {
    initializeNonMetaClass(_class_getNonMetaClass(cls, inst));
}
If the class has not been initialized yet, it is initialized first. Looking into initializeNonMetaClass, we see that it first gets the superclass and runs the superclass's +initialize if that has not happened yet:
supercls = cls->superclass;
if (supercls && !supercls->isInitialized()) {
    initializeNonMetaClass(supercls);
}
This is why, when inheritance is involved, the parent class's +initialize is executed first.
Overall summary
- How they are called
load is called directly through its function address; initialize is called through the message mechanism, objc_msgSend.
- When they are called
load is called when the runtime loads classes and categories (only once). initialize is called when the class first receives a message. Each class is initialized only once by default, but with inheritance the parent's initialize implementation may run more than once.
- Call order
- load
Class load is called first: classes compiled earlier are called earlier, and a parent class's load is called before its subclass's. Category load is called next: categories compiled earlier are called earlier.
- initialize
The parent class is initialized before the subclass (initializing the subclass may end up calling the parent's initialize implementation).
Supplement
The parent class's +initialize may be executed more than once. When does that happen, and why? Here is an example:
@implementation Person
+ (void)initialize {
NSLog(@"Person - initialize");
}
@end
@implementation Student
@end
The Student class inherits from Person, and only Person implements +initialize; Student does not.
Now we call [Student alloc]; and get the following output:
2020-09-14 11:31:55.377569+0800 Category[11483:125034] Person - initialize
2020-09-14 11:31:55.377659+0800 Category[11483:125034] Person - initialize
Person's +initialize is executed twice, but for different reasons. The first call happens because the parent's +initialize must run before the subclass's +initialize is called. The second call is the +initialize message sent to Student itself: Student has no implementation of +initialize, so the lookup follows the superclass pointer to Person, finds the implementation there, and runs it again.
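The two calls can be modelled in C. This is a sketch under stated assumptions rather than the runtime's logic: sim_class, send_initialize, ensure_initialized, and person_initialize are hypothetical names, and the implementation records the receiver's name instead of logging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_CALLS 8

typedef struct sim_class sim_class;
typedef void (*initialize_fn)(const sim_class *receiver);

/* Hypothetical, simplified class record. */
struct sim_class {
    const char *name;
    sim_class *superclass;
    initialize_fn initialize;    /* NULL when the class does not implement it */
    bool initialized;
};

/* Names of the receivers, in the order the implementation was run. */
const char *receivers[MAX_CALLS];
size_t receiver_count = 0;

/* Person's implementation: records which class received the message. */
void person_initialize(const sim_class *receiver)
{
    if (receiver_count < MAX_CALLS) receivers[receiver_count++] = receiver->name;
}

/* Send "initialize" like an ordinary message: walk up the superclass
 * chain until some class provides an implementation, then run it with
 * the original class as the receiver. */
void send_initialize(sim_class *cls)
{
    for (sim_class *c = cls; c != NULL; c = c->superclass) {
        if (c->initialize != NULL) {
            c->initialize(cls);
            return;
        }
    }
}

/* Initialize the superclass first, then the class itself, at most once
 * per class. */
void ensure_initialized(sim_class *cls)
{
    if (cls == NULL || cls->initialized) return;
    ensure_initialized(cls->superclass);
    cls->initialized = true;
    send_initialize(cls);
}
```

Running ensure_initialized on a Student whose superclass is Person invokes person_initialize twice, first with Person as the receiver and then with Student, which matches the two log lines above. In real code, a common guard is to check the receiver inside +initialize (for example, comparing self with the class) before doing one-time work.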
Closing
This article is on the long side. I wrote it following my own train of thought, so the presentation may not be perfectly organized; if you find any problems while reading, please leave a comment.
Thanks for reading.