TL;DR

  1. Method Swizzling in +load is not completely safe: the target method may not exist yet at that point.
  2. If the target is a Direct Method, it has to be hooked directly; message-based swizzling does not apply.
  3. If the target method is loaded dynamically, Method Swizzling has to be deferred.
  4. A more robust hook library (such as RSSwizzle) helps you discover such edge cases early.

Background

Method Swizzling is only reasonable and safe in +load.

While this conclusion is generally accepted in the industry, I ran into a different situation. Last week I received a request to hook some private system methods and found that, in +load, instances of the class did not respond to them. In business code, however, calling them directly works without problems. So the method does exist, just not yet at +load time.

The method in question is -[NSObject _accessibilityElements].

Did the way the method is called change?

Because they are system private methods, Apple makes no promises about their compatibility, and can theoretically change their names, implementation, and invocation methods at will. Direct Methods are the first suspect because of their nature.

In Objective-C, Message and IMP are bound at execution time, not compile time.

A call such as -[NSObject foo] is compiled to:

mov rdi, xxx                ; argument "instance" for method _objc_msgSend
mov rsi, qword [0x12340]    ; argument "selector" for method _objc_msgSend, @selector(foo)
call qword [_objc_msgSend]

From an assembly point of view, you send a message called foo to an instance of NSObject, and how that message is processed is unknown at compile time.

In the execution phase, NSObject receives the message and dynamically looks up the function address of the corresponding method before executing. It is this messaging mechanism that gives Objective-C a very convenient ability to modify IMP at run time.
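To make the late-binding idea concrete, here is a minimal sketch in plain C. It is a toy model, not the Objective-C runtime: `toy_method`, `toy_lookup`, `toy_send` and `toy_set_imp` are hypothetical names invented for illustration. A "class" is just a table from selector name to function pointer, a "message send" looks the name up at call time, and "swizzling" replaces the pointer behind a name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model: a "class" is a table of selector name -> function pointer. */
typedef int (*toy_imp)(void);

struct toy_method {
    const char *selector;  /* plays the role of SEL */
    toy_imp imp;           /* plays the role of IMP */
};

static int original_foo(void) { return 1; }
static int replacement_foo(void) { return 2; }

static struct toy_method toy_methods[] = {
    { "foo", original_foo },
};
static const size_t toy_method_count = sizeof(toy_methods) / sizeof(toy_methods[0]);

/* Look the selector up at call time; returns NULL when nothing responds to it. */
static struct toy_method *toy_lookup(const char *selector) {
    assert(selector != NULL);
    for (size_t i = 0; i < toy_method_count; i++) {
        if (strcmp(toy_methods[i].selector, selector) == 0) {
            return &toy_methods[i];
        }
    }
    return NULL;
}

/* "Message send": resolve the implementation by name, then call it.
   Returns -1 when the selector is unknown. */
static int toy_send(const char *selector) {
    struct toy_method *m = toy_lookup(selector);
    return m ? m->imp() : -1;
}

/* "Swizzle": replace the implementation behind a selector.
   Returns 0 when the selector does not exist (yet), 1 on success. */
static int toy_set_imp(const char *selector, toy_imp new_imp) {
    struct toy_method *m = toy_lookup(selector);
    if (m == NULL) {
        return 0;
    }
    m->imp = new_imp;
    return 1;
}
```

Calling `toy_send("foo")`, then `toy_set_imp("foo", replacement_foo)`, then `toy_send("foo")` again runs two different functions through the same call site. Note that `toy_set_imp` fails when the selector is not in the table, which is the situation this article runs into in +load.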

If you mark a method as a Direct Method, calls to it are compiled as direct calls to a C function instead of going through the messaging mechanism.

// NSObject+DirectMethods.m

@implementation NSObject (DirectMethods)

- (int)foo __attribute__((objc_direct)) {
    // do something ...
    return 0;
}    

@end

int main() {
    [[NSObject new] foo];
}

The compiler compiles it to:

mov     rdi, xxx
call    -[NSObject foo]

The address of the function to execute is fixed at compile time. The flip side is that the whole family of APIs built on message sending no longer works for such a method.

When you create a SEL with @selector(foo), the compiler reports an error, not just a warning.

// NSObject+DirectMethods.m

@implementation NSObject (DirectMethods)

+ (void)load {
    // Error ❌: @selector expression formed with direct selector 'foo'
    // SEL selector = @selector(foo);
    SEL selector = NSSelectorFromString(@"foo");

    // A direct method is not registered in the method list, so instances do not respond to it.
    if (![NSObject instancesRespondToSelector:selector]) {
        NSLog(@"Direct Methods!");
    }
}

- (int)foo __attribute__((objc_direct)) {
    // do something ...
    return 0;
}

@end

For more information on Direct Methods, see Objective-C Direct Methods

Direct Methods match the symptoms I ran into quite well. So is the target method a Direct Method? The answer lies in a caller's assembly, and a caller can be found in the system library.

It turns out to be a perfectly ordinary message send 😅.

Dynamic loading?

Since it is not a Direct Method, and the system library calls it through an ordinary message send, the only remaining explanation is that the method is loaded dynamically at some point after +load.

The image list command shows that 391 images are loaded at +load time. If we let the app finish launching, leave it idle for a while and then pause it, 519 images are loaded. So the target method most likely comes from one of the 128 images loaded in between.

To verify this conjecture, we add helper methods that list all instance methods of NSObject:

// NSObject+TGIF.m
#import <objc/runtime.h>

@implementation NSObject (TGIF)

+ (nonnull NSArray<NSString *> *)allMethodNames {
    unsigned int count = 0;
    Method *methods = class_copyMethodList([NSObject self], &count);
    NSMutableArray<NSString *> *result = [NSMutableArray arrayWithCapacity:count];
    for (int i = 0; i < count; i++) {
        const char *name = sel_getName(method_getName(methods[i]));
        [result addObject:[NSString stringWithUTF8String:name]];
    }

    free(methods);
    return [result copy];
}

+ (BOOL)respondToAccessibilityElements {
    for (NSString *name in [self allMethodNames]) {
        if ([name isEqualToString:@"_accessibilityElements"]) {
            return YES;
        }
    }
    return NO;
}

@end

Run the check once in +load and once after the app has finished launching and gone idle. Sure enough, the method is missing in +load and present after launch. At this point we can confirm that the target method is loaded dynamically.

Loading time

Once we know the method is loaded dynamically, we need to find the moment at which it is loaded. Image-load notifications can be subscribed to with _dyld_register_func_for_add_image. (Thanks to Apple open-sourcing dyld, we can see that the callback is invoked immediately for every image that was already loaded at registration time, so the timing of the registration does not need to be precise.)

Add image-related output:

#include <assert.h>
#include <dlfcn.h>
#include <stdbool.h>
#include <stdio.h>
#include <mach-o/dyld.h>
#include <mach-o/loader.h>
#include <uuid/uuid.h>

// Defined further down; declared here because _print_image uses it first.
static const uuid_t *_image_retrieve_uuid(const struct mach_header *mh);

// Prints the file path and UUID of the image that owns the given Mach-O header.
static void _print_image(const struct mach_header *mh) {
    Dl_info image_info;
    int result = dladdr(mh, &image_info);
    if (result == 0) {
        printf("Could not print info for mach_header: %p\n\n", mh);
        return;
    }

    const char *image_name = image_info.dli_fname;
    char image_uuid[37] = "no LC_UUID";
    const uuid_t *image_uuid_bytes = _image_retrieve_uuid(mh);
    if (image_uuid_bytes != NULL) {
        // Only unparse when the image actually carries an LC_UUID command.
        uuid_unparse(*image_uuid_bytes, image_uuid);
    }
    printf("Image added: %s <%s>\n\n", image_name, image_uuid);
}

static uint32_t _image_header_size(const struct mach_header *mh) {
    bool is_header_64_bit = (mh->magic == MH_MAGIC_64 || mh->magic == MH_CIGAM_64);
    return (is_header_64_bit ? sizeof(struct mach_header_64) : sizeof(struct mach_header));
}

// Walks the load commands that follow the header until the visitor sets *stop.
static void _image_visit_load_commands(const struct mach_header *mh, void (^visitor)(struct load_command *lc, bool *stop)) {
    assert(visitor != NULL);
    uintptr_t lc_cursor = (uintptr_t)mh + _image_header_size(mh);
    for (uint32_t idx = 0; idx < mh->ncmds; idx++) {
        struct load_command *lc = (struct load_command *)lc_cursor;

        bool stop = false;
        visitor(lc, &stop);
        if (stop) {
            return;
        }
        lc_cursor += lc->cmdsize;
    }
}

// Returns a pointer to the image's UUID, or NULL if it has no LC_UUID command.
static const uuid_t *_image_retrieve_uuid(const struct mach_header *mh) {
    __block const struct uuid_command *uuid_cmd = NULL;
    _image_visit_load_commands(mh, ^(struct load_command *lc, bool *stop) {
        if (lc->cmdsize == 0) {
            return;
        }
        if (lc->cmd == LC_UUID) {
            uuid_cmd = (const struct uuid_command *)lc;
            *stop = true;
        }
    });
    if (uuid_cmd == NULL) {
        return NULL;
    }
    return &uuid_cmd->uuid;
}

Calling _print_image outputs a log line in the following format, which gives us much more information to work with.

Image added: /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Library/Developer/CoreSimulator/Profiles/Runtimes/iOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib/libBacktraceRecording.dylib <CABB187A-C670-3932-BA9D-3C80B5B4E116>

With the preparation done, subscribe to the notification in +load and set a breakpoint that is hit once the class responds to the target method:

// Declared up front because +load references it before its definition.
static void image_added(const struct mach_header *mh, intptr_t slide);

@implementation NSObject (TGIF)

+ (void)load {
    _dyld_register_func_for_add_image(&image_added);
}

static void image_added(const struct mach_header *mh, intptr_t slide) {
    _print_image(mh);
    if ([NSObject respondToAccessibilityElements]) {
        NSLog(@""); // Add breakpoint here
    }
}

@end

The breakpoint is first hit for the image libcmark-gfm.dylib. Inspecting that image in Hopper shows no such method, and nothing in the call stack looks suspicious, so the method is not loaded here. With this image ruled out, the previous image, NotesSupport, does not contain the method either. However, if we break when that image is added, the call stack turns out to be full of useful information:

Translated into pseudocode, it is:

- (void)_accessibilityBundlePrincipalClass {
    NSString *path = AXSCopyPathForAccessibilityBundle(@"UIKit");
    NSBundle *bundle = [NSBundle bundleWithPath:path];
    NSError *err = nil;
    [bundle loadAndReturnError:&err];
    // ...
}

Set breakpoints before and after the loadAndReturnError: message and compare the image lists to find the dynamic libraries it loads. Trusting Apple's engineers to name things sensibly, images whose names contain the keyword Accessibility are the first to check.

The target method is found in UIAccessibility:

The truth came out.

A more optimized solution

Although the previous approach narrows the search considerably, dozens of loaded images remain, and searching them by hand is not an elegant solution.

class_addMethod is the API used to add methods dynamically in everyday development. Bulk operations such as attaching the methods of a newly loaded image would not normally go through such an inefficient API, but its implementation details may contain clues. When all else fails, a familiar public API is a good place to start.

After taking a lock and wrapping its arguments, the function eventually reaches:

// objc-runtime-new.mm
static void addMethods_finish(Class cls, method_list_t *newlist) {
    auto rwe = cls->data()->extAllocIfNeeded();
    if (newlist->count > 1) {
        method_t::SortBySELAddress sorter;
        std::stable_sort(&newlist->begin()->big(), &newlist->end()->big(), sorter);
    }
    prepareMethodLists(cls, &newlist, 1, NO, NO, __func__);
    rwe->methods.attachLists(&newlist, 1);
    flushCaches(cls, __func__, [](Class c){
        return !c->cache.isConstantOptimizedCache();
    });
}

attachLists adds a method_list_t to the method_array_t of class_rw_ext_t. By breaking on it, we can also pinpoint when methods are added while an image is loading.

The second frame, attachCategories, is exactly what we need! Line 1418 fetches a method_list_t *, and my guess is that the method we are looking for is in it.

// objc-runtime-new.mm
static void
attachCategories(Class cls, const locstamped_category_t *cats_list, uint32_t cats_count,
                 int flags)
{
    // ...
    for (uint32_t i = 0; i < cats_count; i++) {
        auto& entry = cats_list[i];
        method_list_t *mlist = entry.cat->methodsForMeta(isMeta); // objc4-818.2 line 1418
        // ...
    }
    // ...
}

There are two ways to print a method_list_t. The first is to build the Objective-C runtime in debug mode and print it directly from the source. The other is to find the pointer in the assembly and print it. This article focuses on the second.

Since neither method_list_t nor the method_t it contains is public, they cannot be used directly in code or in LLDB, so binary-compatible structures have to be constructed by hand in the demo.

Let's start with method_t. According to the source, method_t has two memory layouts, struct big and struct small. big is the familiar one: three pointers storing the SEL, the types and the IMP. Methods stored in an image use small instead: three 32-bit offsets, each of which has to be added to the address of the offset field itself to obtain the real pointer.
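As a rough illustration of how such a 32-bit relative offset is resolved, here is a minimal sketch in plain C. The names `rel_ptr`, `rel_ptr_get`, `rel_ptr_set` and `demo_image` are hypothetical and only mirror the idea; the sketch assumes the target lives within 2 GB of the field, which is why the demo keeps both in one struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record holding a 32-bit offset relative to the field's own address. */
struct rel_ptr {
    int32_t offset;
};

/* Resolve the pointer: the base is the address of the offset field itself,
   and an offset of 0 means "no target". */
static const void *rel_ptr_get(const struct rel_ptr *rp) {
    assert(rp != NULL);
    if (rp->offset == 0) {
        return NULL;
    }
    uintptr_t base = (uintptr_t)&rp->offset;
    uintptr_t sign_extended = (uintptr_t)(intptr_t)rp->offset;
    return (const void *)(base + sign_extended);
}

/* Store the distance from the field to the target.
   Assumes the distance fits in 32 bits. */
static void rel_ptr_set(struct rel_ptr *rp, const void *target) {
    assert(rp != NULL);
    intptr_t delta = (intptr_t)((uintptr_t)target - (uintptr_t)&rp->offset);
    assert(delta >= INT32_MIN && delta <= INT32_MAX);
    rp->offset = (int32_t)delta;
}

/* Keeping the reference and its target in one object guarantees a small distance. */
struct demo_image {
    char name[8];
    struct rel_ptr name_ref;
};
```

For a `struct demo_image img`, calling `rel_ptr_set(&img.name_ref, img.name)` stores a negative offset (the name sits before the reference), and `rel_ptr_get(&img.name_ref)` returns `img.name` again. The offset stays valid wherever the whole struct is placed in memory, which is the property that makes this layout attractive for data stored in an image.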

template <typename T>
struct RelativePointer: nocopy_t {
    int32_t offset;
    T get() const {
        if (offset == 0)
            return nullptr;
        uintptr_t base = (uintptr_t)&offset;
        uintptr_t signExtendedOffset = (uintptr_t)(intptr_t)offset;
        uintptr_t pointer = base + signExtendedOffset;
        return (T)pointer;
    }
};

struct p_method_t {
    static const uint32_t smallMethodListFlag = 0x80000000;
    struct small {
        RelativePointer<const void *> name;
        RelativePointer<const char *> types;
        RelativePointer<IMP> imp;
    };
    small &small() const {
        return *(struct small *)((uintptr_t)this & ~(uintptr_t)1);
    }
    SEL getSmallNameAsSELRef() const {
        return *(SEL *)small().name.get();
    }
    struct pointer_modifier {
        template <typename ListType>
        static p_method_t *modify(const ListType &list, p_method_t *ptr) {
            if (list.flags() & smallMethodListFlag)
                return (p_method_t *)((uintptr_t)ptr | 1);
            return ptr;
        }
    };
};

Next we construct method_list_t, which can be understood as a variable-length array of method_t: a fixed header made of entsizeAndFlags and count, followed by the method_t entries laid out back to back. The source for method_list_t is a template, but all we need is a binary-compatible structure that hard-codes the template parameters and omits the iterators.
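To show how the header is decoded, here is a small sketch in plain C. The two mask values are the ones used in this article's own structures; the struct and function names (`list_header`, `list_entsize` and so on) are hypothetical, and the sample header values in the usage note are illustrative assumptions: 12 bytes for three 32-bit offsets, 24 bytes for three 64-bit pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mask values taken from the article's p_method_list_t and p_method_t definitions. */
#define FLAG_MASK              0xffff0003u
#define SMALL_METHOD_LIST_FLAG 0x80000000u

/* Binary-compatible header: the entries follow immediately after these two fields. */
struct list_header {
    uint32_t entsize_and_flags;
    uint32_t count;
};

/* Size of one entry in bytes: everything outside the flag bits. */
static uint32_t list_entsize(const struct list_header *h) {
    assert(h != NULL);
    return h->entsize_and_flags & ~FLAG_MASK;
}

/* The flag bits only. */
static uint32_t list_flags(const struct list_header *h) {
    assert(h != NULL);
    return h->entsize_and_flags & FLAG_MASK;
}

/* Non-zero when the entries use the "small" (relative offset) layout. */
static int list_is_small(const struct list_header *h) {
    return (list_flags(h) & SMALL_METHOD_LIST_FLAG) != 0;
}

/* Byte offset of entry i, measured from the start of the header. */
static size_t list_entry_offset(const struct list_header *h, uint32_t i) {
    return sizeof(*h) + (size_t)i * list_entsize(h);
}
```

For a header value of 0x8000000c, `list_entsize` yields 12 and `list_is_small` is true, so entry 2 starts 8 + 2 * 12 = 32 bytes after the start of the header. For a header value of 24, the small flag is clear and each entry is 24 bytes.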

uint32_t FlagMask = 0xffff0003;
struct p_method_list_t {
    uint32_t entsizeAndFlags;
    uint32_t count;
    uint32_t entsize() const {
        return entsizeAndFlags & ~FlagMask;
    }
    uint32_t flags() const {
        return entsizeAndFlags & FlagMask;
    }
    p_method_t& get(uint32_t i) const {
        return *p_method_t::pointer_modifier::modify(*this, (p_method_t *)((uint8_t *)this + sizeof(*this) + i*entsize()));
    }
};

With the structures in place, printing is simple. Since this is a C++ function (defined in an Objective-C++ file), its signature takes a plain unsigned long argument, so that a raw address can be passed in, and casts it to a pointer.

void PrintMethodList(unsigned long ptr) {
    p_method_list_t *mlist = (p_method_list_t *)(ptr);
    for (uint32_t i = 0; i < mlist->count; i++) {
        p_method_t& method = mlist->get(i);
        printf("%s \n", (char *)method.getSmallNameAsSELRef());
    }
}

Read the assembly side by side with the attachCategories source, locate the instruction where the method_list_t is fetched, and set a conditional breakpoint there that prints the hit count and the method_list_t every time it is triggered.

Start the demo. (I recommend enabling this breakpoint only after the _accessibilityBundlePrincipalClass breakpoint has been hit; otherwise there is too much output and it runs too slowly.) You can see that the target method is in the method_list_t printed on the 245th hit.

Launch the app again in the same way, but this time ignore the first 244 hits.

Put Method Swizzling at this point and you can call it a day.
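The general idea of deferring the hook until the target exists can be modelled in plain C. This is a toy sketch, not the Objective-C runtime or dyld: `known_methods`, `try_install_hook` and `on_methods_attached` are hypothetical names, and the "hook" is only a flag marking the point where the real swap would go.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy runtime: method names become visible as "images" are added. */
#define MAX_METHODS 16
static const char *known_methods[MAX_METHODS];
static size_t known_method_count = 0;

static int hook_installed = 0;  /* set once the swap has happened */
static int hook_attempts = 0;   /* number of attempts made before success */

static int method_exists(const char *name) {
    for (size_t i = 0; i < known_method_count; i++) {
        if (strcmp(known_methods[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Try to install the hook. Succeeds at most once, and only when the target exists. */
static int try_install_hook(const char *target) {
    assert(target != NULL);
    if (hook_installed) {
        return 1;
    }
    hook_attempts++;
    if (!method_exists(target)) {
        return 0;               /* too early: keep waiting */
    }
    hook_installed = 1;         /* the real swap would go here */
    return 1;
}

/* Stand-in for the moment right after an image's methods have been attached. */
static void on_methods_attached(const char *const *names, size_t count, const char *target) {
    for (size_t i = 0; i < count && known_method_count < MAX_METHODS; i++) {
        known_methods[known_method_count++] = names[i];
    }
    try_install_hook(target);
}
```

An attempt made before any methods are attached fails, as do attempts after unrelated methods arrive; the hook is installed only once the list containing `_accessibilityElements` has been attached. The design choice is to make the attempt idempotent and cheap, so it can safely be retried at every such point instead of relying on one fixed moment.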

Conclusion

  1. The target method is an ordinary Objective-C method, not a Direct Method.
  2. The target method is not yet loaded when +[NSObject load] runs, so Method Swizzling there cannot find it.
  3. The target method is loaded dynamically when a bundle is loaded inside UIApplicationMain, which is clearly later than +load.
  4. The _dyld_register_func_for_add_image callback fires before the target method is loaded: when the notification arrives, the methods of the image being added have not been attached yet.