The Cancer of the iOS World: MethodSwizzling
November 03, 2017 • iOS
- Hook targets
- The most common hook code
- Pitfalls of MethodSwizzling abuse
- Problems caused by copying parent class methods
- A different way to swizzle
- Other hook approaches
- When do you need Swizzling
- An example of using language features
- References
- Acknowledgments
I don't know when it started, but iOS interviews now routinely ask what the Runtime is, and so iOS developers start talking about MethodSwizzling the moment they hear the word Runtime. In fact, if you look at how hooks work in C, you will find that hooks are tools deliberately left for us by the designers of a framework or language, not some kind of black magic. MethodSwizzling is just a simple and interesting mechanism. Yet in day-to-day work it tends to be treated as a panacea and used without restraint.
Many iOS projects are not robust in their initial architecture and scale poorly later on. Developers then reach for MethodSwizzling, hooking perfectly normal methods all over the project until its quality becomes unmanageable. I used to love abusing MethodSwizzling in projects myself, and I did not realize how dangerous the habit was until I fell into the pit. Only then did I understand that learning a mechanism requires a deep understanding of its design, rather than following the trend, abusing it, and suffering the consequences. Hence this article.
Hook targets
There are two common hook targets on the iOS platform:
- C/C++ functions
- Objective-C methods
For C/C++ functions, Facebook's fishhook framework is the usual choice. For the underlying principles, see the book Mac OS X and iOS Internals. You are probably more familiar with Objective-C methods, and this article discusses only those.
The most common hook code
I believe many of you have used the JRSwizzle library, or have seen the post at nshipster.cn/method-swiz… The code from both, simplified, looks like this.
+ (BOOL)jr_swizzleMethod:(SEL)origSel_ withMethod:(SEL)altSel_ error:(NSError**)error_ {
    Method origMethod = class_getInstanceMethod(self, origSel_);
    if (!origMethod) {
        SetNSError(error_, @"original method %@ not found for class %@", NSStringFromSelector(origSel_), [self class]);
        return NO;
    }
    Method altMethod = class_getInstanceMethod(self, altSel_);
    if (!altMethod) {
        SetNSError(error_, @"alternate method %@ not found for class %@", NSStringFromSelector(altSel_), [self class]);
        return NO;
    }
    // If either method is inherited, copy it into this class first.
    class_addMethod(self,
                    origSel_,
                    class_getMethodImplementation(self, origSel_),
                    method_getTypeEncoding(origMethod));
    class_addMethod(self,
                    altSel_,
                    class_getMethodImplementation(self, altSel_),
                    method_getTypeEncoding(altMethod));
    // Swap the two implementations that now live on this class.
    method_exchangeImplementations(class_getInstanceMethod(self, origSel_),
                                   class_getInstanceMethod(self, altSel_));
    return YES;
}
This code works fine in the ordinary Swizzling case, but in more complex scenarios it carries a number of hidden risks.
Pitfalls of MethodSwizzling abuse
GitHub hosts a more robust library, RSSwizzle (this is also the Swizzling approach this article ultimately recommends), which points out the risks of the code above:
- Swizzling is only safe when done in +load.
- The method being hooked must be implemented by the current class itself. Copying an inherited IMP into the class causes problems: an inherited method should be looked up in the parent class at call time, not copied into the subclass at swizzle time.
- If the swizzled method relies on _cmd, it breaks, because _cmd changes after the hook.
- A naming conflict can cause an earlier hook to fail or lead to a circular call.
Swizzling is usually implemented in a category, and the Runtime appends category methods to the class's method list when it loads them. If the swizzle is not performed in +load and a method with the same name turns up, SEL and IMP no longer match, and the hook ends up as a circular call.
The third point is hard to spot. We all know that Objective-C methods take two implicit arguments, self and _cmd. Developers who use associated objects sometimes do not bother to declare a (void *) key and pass _cmd directly, as in objc_setAssociatedObject(self, _cmd, xx, 0); this makes the current IMP depend on _cmd.
Once such a method is swizzled, its _cmd is bound to change, and the resulting bug is one you are almost guaranteed not to find. When you finally do find it, you will be silently cursing whoever swizzled your method. Worse, what if the method you swizzle is a system method that happens to use _cmd internally… ~_~ (this is where the cold sweat starts).
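The _cmd change described above can be observed directly. The following is a minimal sketch, assuming a made-up class XXDemo and helper XXSelectorNameAfterSwizzle (neither comes from JRSwizzle or RSSwizzle); it uses only method_exchangeImplementations from the Objective-C runtime.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical class, used only for this illustration.
@interface XXDemo : NSObject
- (NSString *)currentSelectorName;
- (NSString *)xx_currentSelectorName;
@end

@implementation XXDemo
// Reports the selector this IMP was invoked with.
- (NSString *)currentSelectorName {
    return NSStringFromSelector(_cmd);
}
// Hook body: after the exchange it runs when currentSelectorName is sent.
- (NSString *)xx_currentSelectorName {
    // Reaches the original IMP, but with _cmd == @selector(xx_currentSelectorName).
    return [self xx_currentSelectorName];
}
@end

// Exchanges the two IMPs once, then asks the original IMP which selector it sees.
static NSString *XXSelectorNameAfterSwizzle(void) {
    static dispatch_once_t once;
    dispatch_once(&once, ^{
        Method orig = class_getInstanceMethod([XXDemo class], @selector(currentSelectorName));
        Method alt  = class_getInstanceMethod([XXDemo class], @selector(xx_currentSelectorName));
        method_exchangeImplementations(orig, alt);
    });
    return [[XXDemo new] currentSelectorName];
}
```

After the exchange, the original body no longer sees currentSelectorName as its _cmd, so anything keyed on _cmd (such as an associated-object key) silently switches to a different key.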
Problems caused by copying parent class methods
The second point above is the most common scenario, and one that 99% of developers never notice. Let's run an experiment.
// Requires JRSwizzle for jr_swizzleMethod:withMethod:error:.
@interface Person : NSObject
- (void)sayHello;
@end
@implementation Person
- (void)sayHello {
    NSLog(@"person say hello");
}
@end
@interface Student : Person
@end
@implementation Student
@end
@implementation Student (swizzle)
+ (void)load {
    [self jr_swizzleMethod:@selector(s_sayHello) withMethod:@selector(sayHello) error:nil];
}
- (void)s_sayHello {
    [self s_sayHello];
    NSLog(@"Student + swizzle say hello");
}
@end
@implementation Person (swizzle)
+ (void)load {
    [self jr_swizzleMethod:@selector(p_sayHello) withMethod:@selector(sayHello) error:nil];
}
- (void)p_sayHello {
    [self p_sayHello];
    NSLog(@"Person + swizzle say hello");
}
@end
The code above has a Person class that implements sayHello, a Student class that inherits from Person, a Student category that swizzles sayHello, and a Person category that also swizzles sayHello.
When we generate an instance of the Student class and call the sayHello method, we expect the following output:
"person say hello"
"Person + swizzle say hello"
"Student + swizzle say hello"
But the output might look something like this:
"person say hello"
"Student + swizzle say hello"
This happens when the Compile Sources order in Build Phases puts the subclass's category before the parent class's category.
We all know that in Objective-C the +load of a parent class runs before the +load of its subclass, but nothing guarantees that the parent class's category loads before the subclass's category. That depends on compile order, because categories are merged into fixed sections of the Mach-O in the order they were compiled.
Let's look at why the code behaves this way.
In the beginning, the parent class has its own sayHello method, and the subclass has the s_sayHello method added by its category, whose body sends the s_sayHello selector.
Swizzling in the subclass's category with the code above changes this picture.
Calling class_addMethod creates a new Method and adds it to the Student class: its SEL is still sayHello, and its IMP still points to the parent class's implementation. The IMPs of the subclass's two methods are then exchanged, so Student's sayHello now points to the s_sayHello IMP, and Student's s_sayHello points to the parent's original sayHello IMP.
Swizzling only once causes no trouble, but nothing guarantees that a colleague will not swizzle the parent class later for some purpose of their own, or that a third-party library we pull in will not do so.
So when the Person category swizzles as well, Person's sayHello points to the p_sayHello IMP, and Person's p_sayHello points to the original IMP.
The call path for a Student instance is then: sayHello runs the s_sayHello IMP, which sends s_sayHello and lands directly on the original IMP copied from the parent, never passing through Person's new sayHello. This is why a Swizzling applied to the parent class after the subclass's does not hook the subclass.
This is just one of the most common scenarios. Its only effect is that the parent class's hook does not reach the derived class, with no obvious symptom such as a crash, so most developers never notice it.
There is a blog post, Objective-C Method Swizzling, that analyzes the uncertainties of this Swizzling approach more comprehensively.
A different way to swizzle
RSSwizzle, mentioned earlier, is a more robust way to swizzle.
Here it is applied to the same two classes:
RSSwizzleInstanceMethod([Student class],
                        @selector(sayHello),
                        RSSWReturnType(void),
                        RSSWArguments(),
                        RSSWReplacement(
{
    // Calling original implementation.
    RSSWCallOriginal();
    // Extra behavior added by the hook.
    NSLog(@"Student + swizzle say hello second time");
}), 0, NULL);
RSSwizzleInstanceMethod([Person class],
                        @selector(sayHello),
                        RSSWReturnType(void),
                        RSSWArguments(),
                        RSSWReplacement(
{
    // Calling original implementation.
    RSSWCallOriginal();
    // Extra behavior added by the hook.
    NSLog(@"Person + swizzle say hello");
}), 0, NULL);
Because RSSwizzle has to support swizzling a selector with any signature, it uses macros as its entry point, and the developer is responsible for getting the number and types of the parameters right, so it is somewhat obscure to use. Maybe that is why it is so good and yet has so few stars :(.
Let's expand the macro:
RSSwizzleImpFactoryBlock newImp = ^id(RSSwizzleInfo *swizzleInfo) {
    void (*originalImplementation_)(__attribute__((objc_ownership(none))) id, SEL);
    SEL selector_ = @selector(sayHello);
    return ^void (__attribute__((objc_ownership(none))) id self) {
        // Debugging aids: the IMPs currently seen on Student and on its superclass.
        IMP xx = method_getImplementation(class_getInstanceMethod([Student class], selector_));
        IMP xx1 = method_getImplementation(class_getInstanceMethod(class_getSuperclass([Student class]), selector_));
        IMP oriiMP = (IMP)[swizzleInfo getOriginalImplementation];
        ((__typeof(originalImplementation_))[swizzleInfo getOriginalImplementation])(self, selector_);
        // Only this line is our core logic
        NSLog(@"Student + swizzle say hello");
    };
};
[RSSwizzle swizzleInstanceMethod:@selector(sayHello)
                         inClass:[[Student class] class]
                   newImpFactory:newImp
                            mode:0
                             key:((void*)0)];
The core of RSSwizzle is really just one function:
static void swizzle(Class classToSwizzle,
                    SEL selector,
                    RSSwizzleImpFactoryBlock factoryBlock)
{
    Method method = class_getInstanceMethod(classToSwizzle, selector);
    __block IMP originalIMP = NULL;
    RSSWizzleImpProvider originalImpProvider = ^IMP{
        IMP imp = originalIMP;
        if (NULL == imp){
            // The class did not implement the method itself: look it up in the superclass at call time.
            Class superclass = class_getSuperclass(classToSwizzle);
            imp = method_getImplementation(class_getInstanceMethod(superclass,selector));
        }
        return imp;
    };
    RSSwizzleInfo *swizzleInfo = [RSSwizzleInfo new];
    swizzleInfo.selector = selector;
    swizzleInfo.impProviderBlock = originalImpProvider;
    id newIMPBlock = factoryBlock(swizzleInfo);
    const char *methodType = method_getTypeEncoding(method);
    IMP newIMP = imp_implementationWithBlock(newIMPBlock);
    originalIMP = class_replaceMethod(classToSwizzle, selector, newIMP, methodType);
}
The code above omits the locking and defensive logic that are irrelevant here, to make it easier to follow.
We can see that RSSwizzle actually constructs a block that holds the code we want to execute.
It then passes swizzleInfo, which carries the originalImpProvider block, into our block. The provider is how our code calls the original IMP of the method being swizzled.
Note that when class_replaceMethod is used on a method that comes from the parent class, it adds a method to the subclass with newIMP as its implementation and returns NULL.
In originalImpProvider, if the stored IMP is NULL, the parent class's Method is looked up dynamically and its implementation is executed.
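The NULL return of class_replaceMethod for an inherited method can be checked in isolation. This is a small sketch with made-up classes XXBase and XXDerived and a made-up helper XXReplaceReturnedPreviousIMP; it is not RSSwizzle code, only the two runtime calls that RSSwizzle relies on.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical classes, used only for this illustration.
@interface XXBase : NSObject
- (int)value;
@end
@implementation XXBase
- (int)value { return 1; }
@end

@interface XXDerived : XXBase
@end
@implementation XXDerived
@end

// Replaces -value on XXDerived and reports whether class_replaceMethod
// handed back a previous implementation.
static BOOL XXReplaceReturnedPreviousIMP(void) {
    Method m = class_getInstanceMethod([XXDerived class], @selector(value));
    IMP newIMP = imp_implementationWithBlock(^int(id receiver) { return 2; });
    IMP previous = class_replaceMethod([XXDerived class],
                                       @selector(value),
                                       newIMP,
                                       method_getTypeEncoding(m));
    return previous != NULL;
}
```

On the first call XXDerived only inherits -value, so the method is added and no previous IMP comes back; on a second call XXDerived owns the method, so the IMP installed by the first call is returned. XXBase is untouched in both cases, which is exactly why the provider block has to fall back to a superclass lookup when the stored IMP is NULL.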
Let's walk through the method references step by step.
The first time Student is swizzled, the class has no sayHello method of its own, so class_replaceMethod adds one and the stored original IMP is NULL; the call to the parent class's implementation is therefore resolved dynamically.
If we hook Student again, Student already has a sayHello method, so this replace returns the pointer to the previous IMP, and the new IMP is installed into the Method.
Thus, our method references take the shape of a linked list.
Similarly, when we hook the parent class, the parent class's method references also form a linked list.
I believe that by now you understand how RSSwizzle swizzles:
If the method belongs to the parent class, it is looked up dynamically at call time. If the method belongs to the class itself, a chain of method references is built. This keeps repeated Swizzling stable and avoids conflicts with other people's Swizzling.
In addition, because the replacement in RSSwizzle is not a category method, the developer is not forced to call it from +load for safety, and _cmd does not change.
Other hook approaches
There is in fact another hooking approach, used by the Aspects library: it points the method's IMP at _objc_msgForward so that every call enters the message forwarding step, where the parameter list and return value are handled dynamically through NSInvocation.
JSPatch, a well-known hot-fix library in China, borrows this approach to implement hot fixes.
However, libraries of this kind must install their hook last to be sure it succeeds, and they are not compatible with other hook approaches, so the choice of technique should be weighed carefully.
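The forwarding-based idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the Aspects implementation or its API: the class XXGreeter, the alias selector xx_alias_greet, and the helper XXInstallForwardingHook are all invented here, and only a void method without arguments is handled.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>
#import <objc/message.h>

// Hypothetical class, used only for this illustration.
@interface XXGreeter : NSObject
@property (nonatomic, strong) NSMutableArray<NSString *> *log;
- (void)greet;
@end

@implementation XXGreeter
- (instancetype)init {
    if ((self = [super init])) { _log = [NSMutableArray array]; }
    return self;
}
- (void)greet { [self.log addObject:@"original"]; }

// Calls whose IMP is _objc_msgForward end up here as an NSInvocation.
- (void)forwardInvocation:(NSInvocation *)invocation {
    if (sel_isEqual(invocation.selector, @selector(greet))) {
        [self.log addObject:@"before"];
        // Re-target the invocation at the alias that still holds the original IMP.
        invocation.selector = NSSelectorFromString(@"xx_alias_greet");
        [invocation invoke];
        [self.log addObject:@"after"];
        return;
    }
    [super forwardInvocation:invocation];
}
@end

static void XXInstallForwardingHook(void) {
    Class cls = [XXGreeter class];
    Method m = class_getInstanceMethod(cls, @selector(greet));
    const char *types = method_getTypeEncoding(m);
    // Keep the original IMP reachable under an alias selector (no-op if already added).
    class_addMethod(cls, NSSelectorFromString(@"xx_alias_greet"), method_getImplementation(m), types);
    // Point the real selector at the forwarding trampoline.
    class_replaceMethod(cls, @selector(greet), (IMP)_objc_msgForward, types);
}
```

Because the selector itself now means "forward me", a second hook that exchanges or replaces the IMP of greet afterwards would bypass forwardInvocation: entirely, which is one way to see why this style of hook wants to be installed last.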
When do you need Swizzling
I remember first meeting the concept of AOP while learning Java Web: with the FilterChain in Servlets, developers can implement various filters and use them to insert logging, statistics, caching, and other code unrelated to the main business logic. The well-known Struts2 framework is implemented this way.
On iOS, the Swizzling API is so easy to use that developers abuse it indiscriminately and hurt the stability of their projects. Whenever we want to swizzle, we should first ask whether the same goal can be reached with good code and architecture design, or with a deeper use of language features.
An example of using language features
We all know that on iOS 8 the notification center holds an __unsafe_unretained pointer to the observer. If the observer forgets to remove itself from the notification center in dealloc, the app crashes when the relevant notification is posted later.
When I designed the anti-crash tool XXShield, I initially hooked NSObject's dealloc to remove the observer there. Later an expert pointed out that this was very unwise: hooking dealloc affects the release of every instance in the process, and if the quality of the hook code cannot be guaranteed, it causes large numbers of crashes or abnormal behavior throughout the app's run.
Let's look at the ObjC Runtime source to see what happens when an object is released, around line 6240 of objc-runtime-new.mm.
/***********************************************************************
* objc_destructInstance
* Destroys an instance without freeing memory.
* Calls C++ destructors.
* Calls ARC ivar cleanup.
* Removes associative references.
* Returns `obj`. Does nothing if `obj` is nil.
**********************************************************************/
void *objc_destructInstance(id obj)
{
    if (obj) {
        // Read all of the flags at once for performance.
        bool cxx = obj->hasCxxDtor();
        bool assoc = obj->hasAssociatedObjects();
        // This order is important.
        if (cxx) object_cxxDestruct(obj);
        if (assoc) _object_remove_assocations(obj);
        obj->clearDeallocating();
    }
    return obj;
}
/***********************************************************************
* object_dispose
* fixme
* Locking: none
**********************************************************************/
id object_dispose(id obj)
{
    if (!obj) return nil;
    objc_destructInstance(obj);
    free(obj);
    return nil;
}
The dealloc method is called when an object is released, and the associated objects bound to the instance are removed as part of that process. So when an observer is added, we can dynamically attach an associated object to the observer. The associated object holds a back-reference to the observer and removes it from the notification center when the associated object itself is released. To avoid a retain cycle, the back-reference can only be __weak or __unsafe_unretained; and because a __weak pointer has already been cleared by the time dealloc runs, we use __unsafe_unretained.
@interface XXObserverRemover : NSObject {
    __strong NSMutableArray *_centers;
    __unsafe_unretained id _obs;
}
@end
@implementation XXObserverRemover
- (instancetype)initWithObserver:(id)obs {
    if (self = [super init]) {
        _obs = obs;
        _centers = @[].mutableCopy;
    }
    return self;
}
- (void)addCenter:(NSNotificationCenter*)center {
    if (center) {
        [_centers addObject:center];
    }
}
- (void)dealloc {
    @autoreleasepool {
        for (NSNotificationCenter *center in _centers) {
            [center removeObserver:_obs];
        }
    }
}
@end
void addCenterForObserver(NSNotificationCenter *center ,id obs) {
    XXObserverRemover *remover = nil;
    static char removerKey;
    @autoreleasepool {
        remover = objc_getAssociatedObject(obs, &removerKey);
        if (!remover) {
            remover = [[XXObserverRemover alloc] initWithObserver:obs];
            objc_setAssociatedObject(obs, &removerKey, remover, OBJC_ASSOCIATION_RETAIN_NONATOMIC);
        }
        [remover addCenter:center];
    }
}
void autoHook() {
    RSSwizzleInstanceMethod([NSNotificationCenter class],
                            @selector(addObserver:selector:name:object:),
                            RSSWReturnType(void),
                            RSSWArguments(id obs,SEL cmd,NSString *name,id obj),
                            RSSWReplacement({
        RSSWCallOriginal(obs,cmd,name,obj);
        addCenterForObserver(self, obs);
    }), 0, NULL);
}
Note that the code that adds the associated object must be wrapped in a custom AutoreleasePool.
In Objective-C, an autoreleased object has its release deferred. Under ARC, such an object is normally placed in the system-provided AutoreleasePool and released when that pool drains. In a command-line program this works as expected. On iOS, however, the AutoreleasePool is drained under RunLoop control at idle time, which improves the user experience and avoids stuttering. In our scenario that is a problem: we depend on the associated object being deallocated as part of the observer's dealloc. If the system AutoreleasePool delays the release, the associated object may not be released until after the observer has been reclaimed, and the __unsafe_unretained pointer then refers to an invalid address.
By adding a custom AutoreleasePool around the code that attaches the associated object, we keep the references to the associated object simple and make sure the objects we depend on are released in the right order, so the observer is removed correctly.
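The choice of __unsafe_unretained over __weak for the back-reference can be checked with a small probe. The sketch below assumes a made-up class XXProbe, two made-up globals and a made-up helper XXRunProbe; it only records what the associated object sees in its own dealloc while its owner is being destroyed.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical flags recorded by the probe's dealloc.
static BOOL gWeakWasNil = NO;
static BOOL gUnsafeWasNonNil = NO;

// Hypothetical associated object holding both kinds of back-reference.
@interface XXProbe : NSObject {
    __weak id _weakOwner;
    __unsafe_unretained id _unsafeOwner;
}
- (instancetype)initWithOwner:(id)owner;
@end

@implementation XXProbe
- (instancetype)initWithOwner:(id)owner {
    if ((self = [super init])) {
        _weakOwner = owner;
        _unsafeOwner = owner;
    }
    return self;
}
- (void)dealloc {
    // Runs while the owner is in the middle of its own deallocation.
    gWeakWasNil = (_weakOwner == nil);
    gUnsafeWasNonNil = (_unsafeOwner != nil);
}
@end

static void XXRunProbe(void) {
    static char probeKey;
    @autoreleasepool {
        NSObject *owner = [NSObject new];
        XXProbe *probe = [[XXProbe alloc] initWithOwner:owner];
        objc_setAssociatedObject(owner, &probeKey, probe, OBJC_ASSOCIATION_RETAIN_NONATOMIC);
        probe = nil;   // the association is now the only strong reference
        owner = nil;   // owner deallocates, which releases the probe
    }
}
```

The weak reference already reads as nil inside the probe's dealloc, so it could not be passed to removeObserver:, while the __unsafe_unretained pointer still holds the owner's address. That address is only usable because the owner's memory has not been freed yet, which is the ordering the custom AutoreleasePool is meant to protect.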
References
- JRSwizzle
- RSSwizzle
- Aspects
- Blog post: Objective-C Method Swizzling
- The sample code
Acknowledgments
Finally, thanks for the corrections to my clumsy wording.