Preface

Exploration environment:

Hardware: MacBook Pro

System: macOS 11.2.2

The source address

Source debug configuration

In the previous article on the fast lookup path, we saw how objc_msgSend finds a method quickly by checking the method cache. But what if the method is not in the cache? At the end of that exploration, we found that the assembly finishes by calling __objc_msgSend_uncached, which is the path taken on a cache miss.

Searching further for MethodTableLookup shows the following code:

The x0 register is not assigned here. Instead, _lookUpImpOrForward is called through the bl instruction. The function is not defined in the arm64 .s file. Suspecting that it is a C/C++ function, I searched globally for lookUpImpOrForward and found its definition in objc-runtime-new.mm. In this article, we continue by exploring lookUpImpOrForward, and from there the slow lookup path for methods. This path handles cache misses and, being written in C/C++, is slower than the assembly fast path.

1. Context before the source code analysis

lookUpImpOrForward is called after the fast lookup in objc_msgSend fails to find the method in the cache. At this point, a context must already exist in memory that holds, among other things, the arguments passed to lookUpImpOrForward.

Let’s take a look at the parameters required by lookUpImpOrForward. Its function is defined as follows:

From the analysis of the lookUpImpOrForward call site in the preface, the parameters are as follows:

  • id inst: the first parameter, passed in x0, is the receiver of the message
  • SEL sel: the second parameter, passed in x1, is the method name (selector)
  • Class cls: the third parameter, passed in x2, is copied from x16, which holds the class found during the fast lookup
  • int behavior: the fourth parameter, passed in x3, is 3. The role of this parameter is described later
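The value 3 is easier to read as a bit mask than as a number. The following is a minimal sketch, assuming the flag values of the runtime's method lookup enum (LOOKUP_INITIALIZE = 1, LOOKUP_RESOLVER = 2, LOOKUP_NIL = 4, LOOKUP_NOCACHE = 8). The helpers hasFlag and initialBehavior are hypothetical names used only for illustration and are not part of the runtime.

```cpp
#include <cassert>

// Flag values as defined in the runtime's method lookup enum.
enum {
    LOOKUP_INITIALIZE = 1,
    LOOKUP_RESOLVER   = 2,
    LOOKUP_NIL        = 4,
    LOOKUP_NOCACHE    = 8,
};

// Hypothetical helper: true if every bit of `flag` is set in `behavior`.
inline bool hasFlag(int behavior, int flag) {
    assert(flag != 0);  // an empty flag would trivially match
    return (behavior & flag) == flag;
}

// Hypothetical helper: the value that arrives in x3.
inline int initialBehavior() {
    return LOOKUP_INITIALIZE | LOOKUP_RESOLVER;  // 1 | 2 == 3
}
```

Read this way, a behavior of 3 means "initialize the class if needed" plus "try method resolution if the lookup fails". OR-ing in LOOKUP_NOCACHE gives 11, and XOR-ing with LOOKUP_RESOLVER clears the resolver bit again.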

2. Preparation before the method lookup

Before the actual method lookup starts, the runtime performs a few preparatory steps, including some fault tolerance and validity checks, as shown in the flowchart below:

The purpose of each step, in order:

  • Check whether the incoming cls is initialized. If it is not, behavior |= LOOKUP_NOCACHE is applied, so behavior changes from 3 to 11. LOOKUP_NOCACHE is defined in the enumeration below:
/* method lookup */
enum {
    LOOKUP_INITIALIZE = 1,
    LOOKUP_RESOLVER = 2,   
    LOOKUP_NIL = 4,
    LOOKUP_NOCACHE = 8,
};
  • checkIsKnownClass(cls) ensures that the incoming class is legitimate, i.e. a class known to the runtime, which improves security against the kind of attack that CFI (control-flow integrity) is meant to stop

    • Such an attack hijacks the program's control flow. Without this check, an attacker could pass in an illegal class from outside and use it for malicious operations such as redirecting program control or overflowing the stack, which would threaten the security of Apple's platforms
    • Dynamically created classes are accepted only if they come from one of three functions: objc_duplicateClass, objc_initializeClassPair, or objc_allocateClassPair
    • If this step detects that the value passed in is indeed illegal, the program reports a fatal error and stops executing, with the following code:

  • realizeAndInitializeIfNeeded_locked is called only from lookUpImpOrForward. Its code is shown in the figure below:

Set a breakpoint as shown in the following figure and run the source project. Most of the classes we create ourselves never hit this breakpoint. The classes that do trigger it are mostly ones such as OS_dispatch_data and OS_xpc_string.

Following the functions in this branch, we eventually reach a function called realizeClassWithoutSwift, which performs several interesting operations after the initial ro and rw assignments. The code is as follows:

supercls = realizeClassWithoutSwift(remapClass(cls->getSuperclass()), nil); // realize the superclass
metacls = realizeClassWithoutSwift(remapClass(cls->ISA()), nil); // realize the metaclass
cls->setSuperclass(supercls); // set the superclass
cls->initClassIsa(metacls); // initialize isa

A class and its related superclass and metaclass are realized through recursive calls along the inheritance chain and the isa chain respectively, spreading from a single class to its whole related hierarchy. This ensures that realization can be completed in this step even if unrealized or uninitialized classes are encountered. For the process analysis, refer to the following figure:

At this point, the preparation for the method lookup is complete. The class being searched is guaranteed to be valid and initialized, so the conditions for the method lookup are met. Apple's rigor deserves credit here: before each method enters its main flow, thorough and effective fault tolerance has already been applied, which makes the code very robust.
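The recursive realization along the two chains can be modeled with a toy structure. This is only a sketch under stated assumptions: RealizableClass, realizeToyClass and realizeDemoHierarchy are hypothetical names, and the real realizeClassWithoutSwift does far more work (building rw data, fixing up layouts, and so on). The sketch only shows why following both the superclass pointer and the isa pointer reaches every related class, and why the recursion still terminates at the root metaclass, whose isa points to itself.

```cpp
#include <cassert>

// Hypothetical toy stand-in for a runtime class.
struct RealizableClass {
    RealizableClass *superclass = nullptr;  // inheritance chain
    RealizableClass *isa = nullptr;         // isa chain (points to the metaclass)
    bool realized = false;
};

// Realizes `cls`, then recursively its superclass and its metaclass.
// Returns how many classes were newly realized by this call.
int realizeToyClass(RealizableClass *cls) {
    if (cls == nullptr || cls->realized) return 0;
    cls->realized = true;  // mark before recursing so self-referencing isa terminates
    int count = 1;
    count += realizeToyClass(cls->superclass);
    count += realizeToyClass(cls->isa);
    return count;
}

// Builds root, root metaclass, subclass and subclass metaclass,
// then realizes only the subclass.
int realizeDemoHierarchy() {
    RealizableClass root, rootMeta, sub, subMeta;
    root.isa = &rootMeta;
    rootMeta.superclass = &root;   // the root metaclass inherits from the root class
    rootMeta.isa = &rootMeta;      // and is its own metaclass
    sub.superclass = &root;
    sub.isa = &subMeta;
    subMeta.superclass = &rootMeta;
    subMeta.isa = &rootMeta;

    int newlyRealized = realizeToyClass(&sub);
    assert(root.realized && rootMeta.realized && subMeta.realized);
    return newlyRealized;
}
```

Realizing the single subclass is enough to realize all four classes in this toy hierarchy, which is the "from one class to the whole hierarchy" effect described above.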

3. Method lookup process

3.1 Loop to find IMP overview

Once the preparation is done, the method lookup is carried out in a for loop, which looks like the following:

As the code shows, the for loop has no termination condition in its header. Instead, the loop body decides when to stop. The loop calls the lookup functions iteratively until either the method's IMP is found or the root class has been searched without finding it. This process can be represented by the following flowchart:

As you can see from this flowchart, there are two processes for finding an IMP:

  • The process when isConstantOptimizedCache is enabled: cache_getImp is called directly. We will explore that function later, so it is set aside for now. This process is entered only when the preoptimized shared cache is enabled, and because this exploration uses a macOS project, the branch is never taken. CONFIG_USE_PREOPT_CACHES and isConstantOptimizedCache are defined as follows:
    #if defined(__arm64__) && TARGET_OS_IOS && !TARGET_OS_SIMULATOR && !TARGET_OS_MACCATALYST
    #define CONFIG_USE_PREOPT_CACHES 1
    #else
    #define CONFIG_USE_PREOPT_CACHES 0
    #endif

    // isConstantOptimizedCache has a full implementation in objc-cache.mm, but the
    // definition alone shows that it always returns false when CONFIG_USE_PREOPT_CACHES is 0
    #if CONFIG_USE_PREOPT_CACHES
    bool isConstantOptimizedCache(bool strict = false, uintptr_t empty_addr = (uintptr_t)&_objc_empty_cache) const;
    #else
    inline bool isConstantOptimizedCache(bool strict = false, uintptr_t empty_addr = 0) const { return false; }
    #endif
  • The process when isConstantOptimizedCache is not enabled:
    • The process starts with a call to getMethodNoSuper_nolock to check whether the current curClass contains the IMP being looked for
    • If it does, goto done is executed and the loop terminates
    • If not, curClass is set to its superclass and checked for nil, i.e. whether the class just searched was the root class NSObject. If it was the root class, the IMP was not found. Otherwise, cache_getImp is called on curClass, which is now the superclass
    • If that IMP does not exist either, the loop repeats with the superclass as curClass until the IMP is found or it is certain that it cannot be found
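The non-preoptimized branch described above can be sketched with a toy class model. All names below (LookupClass, ToyIMP, kForwardImp, findInMap, slowLookup, demoLookup) are hypothetical, the IMP is modeled as a plain integer, and the real loop additionally handles locking, forwarding and preoptimized caches. The sketch only mirrors the order of the steps: search the class's own method list, move to the superclass, check the superclass cache, repeat.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

using ToyIMP = int;             // hypothetical stand-in for IMP; 0 means "not found"
const ToyIMP kForwardImp = -1;  // hypothetical stand-in for the forwarding IMP

struct LookupClass {
    LookupClass *superclass = nullptr;
    std::unordered_map<std::string, ToyIMP> methods;  // the class's own method list
    std::unordered_map<std::string, ToyIMP> cache;    // the class's method cache
};

ToyIMP findInMap(const std::unordered_map<std::string, ToyIMP> &map,
                 const std::string &sel) {
    auto it = map.find(sel);
    return it == map.end() ? 0 : it->second;
}

ToyIMP slowLookup(LookupClass *cls, const std::string &sel) {
    assert(cls != nullptr);
    // No termination condition in the loop header; the body decides when to stop.
    for (LookupClass *curClass = cls;;) {
        // 1. Search curClass's own method list.
        if (ToyIMP imp = findInMap(curClass->methods, sel)) return imp;
        // 2. Move to the superclass; running out of superclasses means failure.
        curClass = curClass->superclass;
        if (curClass == nullptr) return kForwardImp;
        // 3. Check the superclass cache before scanning its method list.
        if (ToyIMP imp = findInMap(curClass->cache, sel)) return imp;
    }
}

// Builds a two-level hierarchy and looks up `sel` starting from the subclass.
ToyIMP demoLookup(const std::string &sel) {
    LookupClass base, derived;
    base.methods["description"] = 10;
    base.cache["hash"] = 20;  // pretend an earlier message send cached this
    derived.superclass = &base;
    derived.methods["sayHello"] = 30;
    return slowLookup(&derived, sel);
}
```

In this toy hierarchy, "sayHello" is found in the subclass's own list, "hash" in the superclass cache, "description" in the superclass method list on the next iteration, and an unknown selector falls through to the forwarding stand-in.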

Analyzing the overall flow of this loop body reveals one key function: getMethodNoSuper_nolock, which looks for the corresponding IMP within a single class. Let's analyze this function next.

3.2 getMethodNoSuper_nolock function analysis

Jumping into the getMethodNoSuper_nolock function, you can see that it is implemented as follows:

This function looks for the corresponding IMP by traversing the class's method lists and returns nil if it does not find it. The search_method_list_inline function that performs the search looks like this:

There are two ways of searching in this function: one for an unsorted methodList and one for a sorted methodList. Let's look at these two functions in turn.

  • findMethodInUnsortedMethodList, the unsorted lookup function, simply traverses the methodList from start to end. The code is as follows:

  • findMethodInSortedMethodList, the sorted lookup function, has the following code:

This function finds the IMP by binary search. Let's walk through the process with an example. Given a methodList containing [2, 3, 4, 5, 6, 7, 8, 9], where each entry's position and value are treated as the same number for simplicity, take finding 8 as an example.

Before we start, two things should be clear

1. count >> 1 has the same effect as dividing by two. For example, binary 1000 >> 1 = 100, i.e. 8 >> 1 = 4

2. The condition probe > first && keyValue == (uintptr_t)getName((probe - 1)) checks whether a category defines a method with the same name. Here are the steps of the walkthrough:

  • The target is in the upper half. Take 8 as the target and search the list
    • First loop: first = 2, base = 2, count = 8, probe = base + (count >> 1) = 6
      • Get the name at probe and compare it with keyValue. If they are equal, the method has been found, and the search then steps backwards looking for a category method with the same name. If one exists, the category method is used
      • If keyValue is greater than probeValue, the target lies in the upper half, so base becomes probe + 1 = 7 and count is decremented to 7. The first loop then ends with count = count >> 1 = 7 >> 1 = 3
    • Second loop: base = 7, count = 3, probe = base + (count >> 1) = 7 + 1 = 8, which is exactly equal to keyValue, so the target is found
  • The target is in the lower half. Take 3 as the target and search the list
    • First loop: first = 2, base = 2, count = 8, probe = base + (count >> 1) = 6
      • The name at probe is obtained and does not match the target
      • keyValue < probeValue, so the target is in the lower half and base does not change. count = count >> 1 = 8 >> 1 = 4
    • Second loop: base = 2, count = 4, probe = base + (count >> 1) = 4. keyValue < probeValue, so base does not change, and count = count >> 1 = 4 >> 1 = 2
    • Third loop: base = 2, count = 2, probe = base + (count >> 1) = 3. Now probeValue equals keyValue, and the target is found

A real methodList certainly doesn't store such simple numbers, but the binary search idea is the same, and it is much more efficient than a direct traversal.
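The walkthrough above can be reproduced on a plain sorted array. This is a minimal sketch, not the runtime's code: it assumes the list holds unsigned integers instead of method entries, findInSortedList and kDemoList are hypothetical names, and the function returns an index rather than a method pointer. The loop shape (count halved each round, base moved past probe when the key is larger, and the backward step over equal neighbors) follows the steps described in the text.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// The example list from the walkthrough.
const std::vector<unsigned long> kDemoList = {2, 3, 4, 5, 6, 7, 8, 9};

// Returns the index of the first entry equal to `key`, or -1 if there is none.
long findInSortedList(const std::vector<unsigned long> &list, unsigned long key) {
    assert(std::is_sorted(list.begin(), list.end()));  // binary search precondition

    const unsigned long *first = list.data();
    const unsigned long *base = first;

    for (std::size_t count = list.size(); count != 0; count >>= 1) {
        const unsigned long *probe = base + (count >> 1);
        if (key == *probe) {
            // Step back over equal neighbors so the earliest entry wins;
            // in the runtime this is how a category's method takes precedence.
            while (probe > first && key == *(probe - 1)) probe--;
            return static_cast<long>(probe - first);
        }
        if (key > *probe) {
            // The key is in the upper half: skip past probe.
            base = probe + 1;
            count--;
        }
        // Otherwise the key is in the lower half and base stays where it is.
    }
    return -1;
}
```

With this list, searching for 8 takes two rounds and searching for 3 takes three, matching the walkthrough. With duplicates such as {2, 3, 3, 3, 4}, the backward step returns the first 3 rather than the one the probe happened to land on.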

If the IMP is not found through the above steps and the superclass is not nil, cache_getImp is called. Jumping to its definition does not lead anywhere, but a global search reveals that it is an assembly function. The code is as follows:

The implementation of CacheLookup in GETIMP mode is as follows:

Here, cbz p0, 9f returns directly if the IMP found is nil. Otherwise the IMP is signed and then returned.

If the IMP is not found when CacheLookup is executed, LGetImpMissDynamic is executed, which assigns #0 directly to p0, i.e. it returns nil.

3.3 Processing after the loop

After the loop, if the IMP is found, goto done is executed. The code for done is as follows:

    done:
    if (fastpath((behavior & LOOKUP_NOCACHE) == 0)) {
    #if CONFIG_USE_PREOPT_CACHES
        while (cls->cache.isConstantOptimizedCache(/* strict */true)) {
            cls = cls->cache.preoptFallbackClass();
        }
    #endif
        log_and_fill_cache(cls, imp, sel, inst, curClass);
    }

    static void
    log_and_fill_cache(Class cls, IMP imp, SEL sel, id receiver, Class implementer)
    {
    #if SUPPORT_MESSAGE_LOGGING
        if (slowpath(objcMsgLogEnabled && implementer)) {
            bool cacheIt = logMessageSend(implementer->isMetaClass(),
                                          cls->nameForLogging(),
                                          implementer->nameForLogging(),
                                          sel);
            if (!cacheIt) return;
        }
    #endif
        cls->cache.insert(sel, imp, receiver); // call cache_t's insert to cache the method found by the slow lookup
    }

As this code shows, when the IMP is found through the slow path, the method is inserted into the cache (unless LOOKUP_NOCACHE is set), so the next send of the same message can take the fast path.
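The effect of filling the cache from the slow path can be illustrated with a small model. This is a sketch under stated assumptions: CachingClass, lookUp and demoSlowScans are hypothetical names, the IMP is modeled as an integer, and only the LOOKUP_NOCACHE value is taken from the runtime's enum. It counts how often the method list has to be scanned when the same selector is looked up twice.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

const int LOOKUP_NOCACHE = 8;  // flag value from the runtime's lookup enum

// Hypothetical toy class with a method list, a cache and a scan counter.
struct CachingClass {
    std::unordered_map<std::string, int> methods;  // selector -> IMP stand-in
    std::unordered_map<std::string, int> cache;
    int slowScans = 0;  // how many times the method list was searched
};

// Returns the IMP stand-in for `sel`, or 0 if the class does not implement it.
int lookUp(CachingClass &cls, const std::string &sel, int behavior) {
    assert(!sel.empty());
    auto hit = cls.cache.find(sel);
    if (hit != cls.cache.end()) return hit->second;  // fast path: cache hit

    cls.slowScans++;                                 // slow path: scan the method list
    auto found = cls.methods.find(sel);
    if (found == cls.methods.end()) return 0;

    // Fill the cache only when LOOKUP_NOCACHE is not set.
    if ((behavior & LOOKUP_NOCACHE) == 0) cls.cache[sel] = found->second;
    return found->second;
}

// Looks up the same selector twice and reports how many slow scans were needed.
int demoSlowScans(int behavior) {
    CachingClass cls;
    cls.methods["sayHello"] = 30;
    lookUp(cls, "sayHello", behavior);
    lookUp(cls, "sayHello", behavior);
    return cls.slowScans;
}
```

With the default behavior of 3, the second lookup is served from the cache, so only one scan happens. With LOOKUP_NOCACHE set, nothing is cached and both lookups scan the method list.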

If the imp is not found after the loop, perform the following judgment:

    if (slowpath(behavior & LOOKUP_RESOLVER)) {
        behavior ^= LOOKUP_RESOLVER;
        return resolveMethod_locked(inst, sel, cls, behavior);
    }

In the normal case, where the method is found, this branch does not execute. When it does execute, it runs only once, because behavior ^= LOOKUP_RESOLVER clears the flag. This takes us to a new stage, which we will explore in the next article.

Conclusion

This article explored the slow lookup path that follows a failed fast lookup. It loops through the method lists of a class and of its superclasses to find the IMP, which completes the method lookup process. The next article will explore how Apple handles the case where the method cannot be found at all, namely the method resolution process. You are welcome to keep following along, and corrections to any shortcomings are appreciated.

The section on CFI refers to the following articles:

Circumventing execution flow protection with the latest code reuse attack (Part 1)

CFI/CFG security protection principle in detail

GDB | simple control flow was hijacked