Start with wild pointer detection

Some time ago, the local version of the 58 app suffered many wild pointer crashes. The crash stacks carried little useful information; they only showed that the crash occurred while the autorelease pool was releasing an object.

These problems were extremely difficult to reproduce. During development we enabled Zombie, Scribble, and AddressSanitizer detection in turn, but the crashes never reproduced despite extensive testing, which made the cause hard to locate. We therefore deployed an online wild pointer detection tool that captures the stack at the time of the crash, the type of the wild pointer object and, if necessary, a lightweight stack from the moment the object was released.

Lightweight stack: to save memory, we record only the frames that belong to the app's own call path, and shrink each address from 8 bytes to 4 bytes by storing it as an offset, saving as much space as possible.
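The offset trick can be sketched in C as follows. This is only an illustration under stated assumptions, not the tool's actual code: the function names, the `image_base` and `image_size` parameters, and the rule "keep a frame only if it falls inside the app image" are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: keep only frames inside the app image
 * [image_base, image_base + image_size) and store each one as a
 * 32-bit offset from image_base instead of a full 64-bit address. */
size_t compact_stack(const uint64_t *frames, size_t count,
                     uint64_t image_base, uint64_t image_size,
                     uint32_t *out)
{
    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        if (frames[i] < image_base)
            continue;                       /* below the app image */
        uint64_t offset = frames[i] - image_base;
        if (offset >= image_size || offset > UINT32_MAX)
            continue;                       /* system frame or too far away */
        out[kept++] = (uint32_t)offset;     /* 4 bytes instead of 8 */
    }
    return kept;
}

/* Rebuild the full address when the stack needs to be symbolicated. */
uint64_t expand_frame(uint32_t offset, uint64_t image_base)
{
    return image_base + offset;
}
```

Storing offsets halves the memory per frame, at the cost of having to remember the image base (which changes per launch because of ASLR) alongside the recorded stack.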

One of the harder parts of online wild pointer detection is determining which classes cause the problem. Only when you know which type is involved can you set up effective monitoring. If the over-released class cannot be identified, the only option is to enable full monitoring to capture the class name, and then enable stack monitoring for that class. However, full monitoring hurts both app performance and the effectiveness of the monitoring, because its scope is too large. To determine the type involved in a wild pointer, we sacrificed part of memory (30MB) as a cache and cast a wide net with lightweight detection over the full range, planning to turn on capture of the release stack and of the wild pointer occurrence once the type is known. Therefore, both our custom classes and the system classes we use are included in the monitoring scope.

Type of symbol

So how do we determine which system classes we use in the project?

There are many ways to obtain the system classes used by the current app. For example, symbols can be listed with commands such as nm, and internal and external symbols can be told apart by symbol type.

Observant readers will notice that the symbols in nm output are prefixed with letters such as U, S, and D, which represent the type of the symbol.

For symbol types, see the following symbol type description:

Symbol  Description
A  The value of this symbol is absolute and will not be changed by subsequent linking. Such symbol values often appear in an interrupt vector table, for example to represent the position of each interrupt vector function in the table.
B  The symbol is in the uninitialized data section (BSS). For example, if a file defines the global static int test, the symbol test is of type B and is located in the BSS section. Its value is the offset of the symbol within the BSS section. In general, the BSS section is allocated in RAM.
C  The symbol is common. Common symbols are uninitialized data. The symbol is not contained in a normal section; it is allocated only during linking. The value of the symbol is the number of bytes it requires. For example, if int test is defined in a C file and referenced elsewhere, the symbol type is C; otherwise, the type is B.
D  The symbol is in the initialized data section. In general, it is assigned to the data section. For example, the global int baud_table[5] = {9600, 19200, 38400, 57600, 115200} is allocated in the initialized data section.
G  The symbol is also in an initialized data section, one used for small objects. Some object file formats provide more efficient access to small data objects.
I  This symbol is an indirect reference to another symbol.
N  This symbol is a debugging symbol.
R  The symbol is in a read-only data section. For example, for the global const int test[] = {123, 123};, test is a symbol in the read-only data area. Note that under Cygwin, if GCC compiles directly to MZ format, test in the source corresponds to _test and its symbol type is D, that is, it is in the initialized data section. With a cross-compiler such as m6812-elf-gcc, however, test in the source corresponds to test in the object file (no leading underscore) and its symbol type is R; it is normally placed in the rodata section. Note also that if const char *test = "ABC" or const char test_int = 3 is defined inside a function, nm shows no symbol for it, although the string "ABC" is still allocated in ROM, in the rodata section, with size 4.
S  The symbol is in an uninitialized data section for small objects.
T  The symbol is in the text (code) section.
U  The symbol is not defined in the current file; its definition is in another file. For example, if the current file calls a function defined in another file, the function is undefined (U) in this file, while its type in the file that defines it is T. For a global variable, the symbol type is C in the file where it is defined and U in the file where it is used.
V  This symbol is a weak object.
W  This symbol is a weak symbol that has not been specifically tagged as a weak object symbol.
-  The symbol is a stabs symbol in an a.out object file.
?  The symbol type is not defined.

Symbol Type Description From: Timago’s blog

We can see that a symbol of type U is external. Take UIViewController: it has no implementation or definition in our binary, because it is implemented and defined in UIKit, so UIViewController is an external symbol for us and is marked with U. Therefore, all system classes used in the current project can be retrieved by combining the nm command with the symbol type descriptions. Of course, this does not cover classes that are only used through dynamic invocation.

Consider: can the linkmap file also tell us which classes the app uses? How?

Tip: you can use the file index in the linkmap to determine where a symbol comes from. If the file index corresponds to a system library, the symbol is external, which means it is a system class (or other system symbol) that we use.

Fetching system classes with commands plus text analysis adds extra steps to the release flow. We might have to analyze and export a list of system classes into the app every time we package, which obviously makes an already complicated packaging process even worse. So we need another way to get all the system classes used inside the app.

Think of bind

Following the example above, since UIViewController is not defined in our app, we cannot know its address until the app runs. Only after bind can we get the corresponding address. In the binary, wherever we use UIViewController, the stored value is actually 0x0000000000000000. After bind, it is replaced with the real address of UIViewController. So how does the system implement bind? In practice we use UIViewController in many places (for example, MyViewController inherits from UIViewController), and the system fixes these addresses one by one during bind. This means our binary must contain information about exactly where each class is used; otherwise the system could not fix them.

Arcane LC_DYLD_INFO_ONLY

Compressed byte stream

Bind information is stored in LC_DYLD_INFO_ONLY and can be inspected visually with MachOView.

LC_DYLD_INFO_ONLY (see the code comment below) records the offsets and lengths of the compressed byte streams that dyld needs when it loads the image.

/*
 * The dyld_info_command contains the file offsets and sizes of
 * the new compressed form of the information dyld needs to
 * load the image.  This information is used by dyld on Mac OS X
 * 10.6 and later.  All information pointed to by this command
 * is encoded using byte streams, so no endian swapping is needed
 * to interpret it.
 */
public struct dyld_info_command

The compressed byte stream is the format in which LC_DYLD_INFO_ONLY data is stored. The data can be recovered by parsing the stream according to specific rules. Apple stores rebase & bind information as compressed byte streams to save as much storage as possible while keeping parsing efficient.

Before iOS 15, bind on iOS was divided into three kinds: bind, lazy_bind, and weak_bind. From iOS 15 on, lazy_bind was removed.

"iOS 15" here means that the iOS Deployment Target is set to iOS 15, not that the app is running on iOS 15. Packages built with an iOS Deployment Target below 15 still have to run on older systems such as iOS 14, so their binaries cannot actually use the new iOS 15 format. Only when the iOS Deployment Target is set to iOS 15 are older systems no longer supported, which is why the binary changes.

The compressed byte stream for bind is described as follows:

/*
 * Dyld binds an image during the loading process, if the image
 * requires any pointers to be initialized to symbols in other images.
 * The bind information is a stream of byte sized
 * opcodes whose symbolic names start with BIND_OPCODE_.
 * Conceptually the bind information is a table of tuples:
 *    <seg-index, seg-offset, type, symbol-library-ordinal, symbol-name, addend>
 * The opcodes are a compressed way to encode the table by only
 * encoding when a column changes.  In addition simple patterns
 * like for runs of pointers initialzed to the same value can be
 * encoded in a few bytes.
 */
public var bind_off: UInt32 /* file offset to binding info */

In short, from bind's compressed byte stream we can extract a list of tuples. Each field has the following meaning:

<seg-index, seg-offset, type, symbol-library-ordinal, symbol-name, addend>
<segment index, offset within the segment, type (e.g. pointer), index of the library the symbol belongs to, symbol name, addend>

We can inspect the bind information with MachOView, which parses the data so that the content is easy to read.

For example, the pointer at VM address 0x100015D50 (up to 0x100015D58) refers to NSObject. So during bind, the system knows exactly which class address must be written into 0x100015D50 to 0x100015D58.

lazy_bind has the same data format as bind and also parses into a list of tuples. Symbols stored in lazy_bind are not bound immediately; the binding is triggered when the symbol is first used. The NSLog function is typically lazily bound, so its binding is triggered only when the app calls NSLog for the first time.

So how exactly do you parse a compressed byte stream into a list of tuples?

Parsing the compressed byte stream

Reading bind information follows a special "syntax". In this syntax, each 8-bit byte is split into two parts: the high four bits are the opcode and the low four bits are immediate data.

#define BIND_OPCODE_MASK					0xF0
#define BIND_IMMEDIATE_MASK					0x0F

Thus, one byte can encode up to 16 opcodes and up to 16 immediate values. Take the first byte of the data stream in the figure as an example: the byte at offset 0x181A0 is 0x12.

let opcode = 0x12 & BIND_OPCODE_MASK  
let immediate = 0x12 & BIND_IMMEDIATE_MASK

The result is opcode == BIND_OPCODE_SET_DYLIB_ORDINAL_IMM == 0x10 and immediate == 0x02. BIND_OPCODE_SET_DYLIB_ORDINAL_IMM means that the immediate is the index (ordinal) of the library, so the library index is 0x02. With this one byte, we have obtained one field of the tuple.

Similarly, we parse the next byte, 0x40, whose opcode is BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM, which tells us that the data that follows is a symbol string. So the string in the tuple is also available. Continuing in this way until the whole byte stream has been traversed yields the complete tuple list; for code, refer to dyld, MachOView, or WBBlades. Note that when the data needs more than 4 bits, the compressed byte stream uses special opcodes to tell us what to expect from the following bytes, including the data type and encoding. For example, BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB tells us that the lower 4 bits of the current byte are the segment index, that the N bytes following this byte hold the offset of the pointer within the segment, and that this offset is encoded as ULEB.
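To make the traversal concrete, here is a minimal C sketch of an interpreter for a small subset of the bind opcodes. The opcode values match those in mach-o/loader.h, but the `bind_entry` struct and `parse_bind_stream` function are hypothetical names introduced for illustration. The sketch deliberately ignores the segment and offset columns and every ULEB-encoded opcode, so it is not a replacement for the parsers in dyld, MachOView, or WBBlades.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BIND_OPCODE_MASK                          0xF0
#define BIND_IMMEDIATE_MASK                       0x0F
#define BIND_OPCODE_DONE                          0x00
#define BIND_OPCODE_SET_DYLIB_ORDINAL_IMM         0x10
#define BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM 0x40
#define BIND_OPCODE_SET_TYPE_IMM                  0x50
#define BIND_OPCODE_DO_BIND                       0x90

/* Hypothetical, reduced tuple: only the columns this sketch decodes. */
struct bind_entry {
    int         lib_ordinal;
    int         type;
    const char *symbol;   /* points into the stream, not copied */
};

/* Returns the number of tuples emitted, or -1 on an opcode or
 * encoding that this sketch does not handle. */
int parse_bind_stream(const uint8_t *p, size_t len,
                      struct bind_entry *out, int max_out)
{
    struct bind_entry cur = {0, 0, NULL};
    int n = 0;
    size_t i = 0;
    while (i < len) {
        uint8_t opcode    = p[i] & BIND_OPCODE_MASK;     /* high 4 bits */
        uint8_t immediate = p[i] & BIND_IMMEDIATE_MASK;  /* low 4 bits  */
        i++;
        switch (opcode) {
        case BIND_OPCODE_DONE:
            return n;
        case BIND_OPCODE_SET_DYLIB_ORDINAL_IMM:
            cur.lib_ordinal = immediate;
            break;
        case BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM: {
            /* A NUL-terminated symbol name follows the opcode byte. */
            const void *nul = memchr(p + i, 0, len - i);
            if (nul == NULL)
                return -1;
            cur.symbol = (const char *)(p + i);
            i = (size_t)((const uint8_t *)nul - p) + 1;
            break;
        }
        case BIND_OPCODE_SET_TYPE_IMM:
            cur.type = immediate;
            break;
        case BIND_OPCODE_DO_BIND:
            /* Only columns that changed were re-encoded; emit a tuple. */
            if (n < max_out)
                out[n++] = cur;
            break;
        default:
            return -1;   /* ULEB/SLEB opcodes are out of scope here */
        }
    }
    return n;
}
```

Feeding it the bytes 0x12, 0x40, "_NSLog\0", 0x51, 0x90, 0x00 yields a single tuple with library ordinal 2, type 1, and symbol "_NSLog".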

ULEB128: Unsigned Little Endian Base 128

SLEB128: Signed Little Endian Base 128

ULEB128 and SLEB128 are variable-length encodings designed to avoid wasting storage; they suit data that is usually small but occasionally large. In iOS, the common data types have a fixed length. For example, UInt8 is 1 byte and can represent 0 to 255, while UInt16 is 2 bytes and can represent 0 to 65535. Suppose we have an array such as [1, 2, 3, 4, 160, 0, 1, 2, 3, 4, 5, 7] in which most values are small but an occasional value may exceed 255. What type should the array be? UInt8 (1 byte) cannot hold every possible value, while UInt16 (2 bytes) wastes a byte on most of them. If the width of each value could scale flexibly, the wasted storage would disappear, and this is exactly what ULEB128 and SLEB128 provide. Take ULEB128 as an example. A ULEB128 value is at least 1 byte long and usually less than 5 bytes. Each byte is split into two parts. The highest bit is a marker bit that indicates whether more bytes follow: 1 means more bytes follow, and 0 means the current byte is the last byte of the value. The lower 7 bits of each byte carry the data.

Here’s an example:

0xA001, which takes up two bytes, is binary: ‘10100000 00000001’.

Step1: after removing the marker bit -> ‘0100000 0000001’

Step2: Reorder as little-endian (the first group holds the lowest bits) -> ‘0000001 0100000’

So the result is ‘10100000’, which is 160 in decimal
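The same steps can be written as a short C function. This is a minimal sketch of ULEB128 decoding; the name `read_uleb128` and its error convention (returning 0) are choices made for this example, not taken from dyld.

```c
#include <stddef.h>
#include <stdint.h>

/* Decodes one ULEB128 value from buf.
 * Returns the number of bytes consumed, or 0 if the value is
 * truncated or does not fit in 64 bits. */
size_t read_uleb128(const uint8_t *buf, size_t len, uint64_t *value)
{
    uint64_t result = 0;
    unsigned shift = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t byte = buf[i];
        if (shift >= 64)
            return 0;                               /* too many bytes */
        result |= (uint64_t)(byte & 0x7F) << shift; /* low 7 bits are data */
        shift += 7;                                 /* later bytes are more significant */
        if ((byte & 0x80) == 0) {                   /* marker bit 0: last byte */
            *value = result;
            return i + 1;
        }
    }
    return 0;                                       /* ran out of input */
}
```

For the two bytes 0xA0 0x01 from the example above, the function consumes 2 bytes and produces 160.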

A look at fishhook

A brief introduction to Fishhook

By parsing bind, lazy_bind, and weak_bind we obtain a tuple list. Each tuple gives us a symbol and its pointer information, including which segment the pointer is in and its offset within the segment. Wait, isn't a tuple list a bit like a symbol table? Can we do something with it? Some readers may already be thinking of fishhook. Fishhook is Facebook's open source framework for hooking C functions. It can hook external C functions at runtime (functions written inside the app itself cannot be hooked, because no bind is involved).

Many articles already explain how fishhook works, so its implementation details are not discussed here. If you are interested, search for those articles. Readers who just want a quick overview of fishhook's general idea can read the summary below.

In iOS, when we use external variables or functions such as NSLog(), the code does not call the address directly; the real address is obtained through bind or lazy_bind. After bind or lazy_bind, the real function address is recorded in nl_symbol_ptr or la_symbol_ptr, and the symbol table can be used to find the nl_symbol_ptr or la_symbol_ptr slot for each function. Fishhook hooks C functions by searching the symbol table, finding the address of the function pointer, and overwriting that pointer.

Bind is already bound when the image is loaded, whereas lazy_bind is triggered when it is first used.

Consider: How does lazy_bind implement bind the first time a function is called?

The teacher has a duty roster listing the students who are to clean up. Li Si was supposed to be on duty today, but because the teacher in the academic affairs office did not know Li Si's name when printing the roster, the name of Zhang San, the class monitor, was printed instead. The teacher only goes by the roster, so the teacher asks Zhang San to clean. Zhang San does just one thing: he changes the name on the roster to Li Si and asks Li Si to do the cleaning. From then on, whenever the teacher assigns cleaning, the teacher goes straight to Li Si. That is lazy_bind. The teacher in the story is the code we wrote, which only recognizes addresses. The roster is la_symbol_ptr, which holds the names of the students on duty. Zhang San is the stub mechanism, which only plays a supporting role. Li Si is the real external function, the one that needs to be executed.

One notable recent change in fishhook: a while ago there were frequent reports of fishhook crashes on iOS 14.5 and on iOS 15. The main problem is that fishhook runs into write protection when writing pointers. Fishhook PR #87 explains and fixes this.
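The story can be simulated in a few lines of C. This is only an analogy under stated assumptions: the slot, the resolver, and the "real" function below are ordinary C objects with hypothetical names, whereas the real mechanism lives in the linker-generated stubs and dyld's binder.

```c
typedef int (*duty_fn)(int);

int real_calls = 0;       /* how often Li Si actually cleaned      */
int resolver_calls = 0;   /* how often Zhang San had to step in    */

/* "Li Si": the real external function. */
static int real_function(int x)
{
    real_calls++;
    return x * 2;
}

static int stub_resolver(int x);

/* "The roster": plays the role of one la_symbol_ptr slot.
 * It initially names the resolver instead of the real function. */
static duty_fn la_symbol_ptr_slot = stub_resolver;

/* "Zhang San": on the first call, rewrite the roster so that it
 * names the real function, then forward the call. */
static int stub_resolver(int x)
{
    resolver_calls++;
    la_symbol_ptr_slot = real_function;
    return real_function(x);
}

/* "The teacher": our code only ever goes through the slot. */
int call_external(int x)
{
    return la_symbol_ptr_slot(x);
}
```

The first `call_external` goes through the resolver; every later call reaches `real_function` directly. Fishhook exploits the same indirection by writing its replacement function into the slot.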

Fishhook is not the only solution for C functions

Besides fishhook, the author also has a static method for hooking C functions. Compared with fishhook, this scheme needs no time-consuming lookup and comparison. Next, I will introduce this somewhat unusual scheme: hooking C functions through a dynamic library.

The scheme needs no hooking code, but it is a little harder to understand.

Step1: First, define a function in the main project with the same name, parameters, and return type as the target (NSLog here). The ld64 linker will then treat the NSLog used in func1 and func2 as our own function, so it will not resolve to the function in the system library, and NSLog will not be marked as a function requiring bind.

Step2: Inside our custom NSLog, call an intermediate function MyNSLog in a custom dynamic library. The purpose of this step is to reach the real NSLog.

Step3: Since the dynamic library contains no custom NSLog to "cheat" ld64, the NSLog called inside the dynamic library is the real system function.

Now, some of you might say, "Isn't there a risk that the NSLog in the dynamic library will call the NSLog function of the main program again? Wouldn't that be circular?"

It won't. A dynamic library is itself the product of compilation and linking. The binding of its NSLog to the NSLog in the system library was written into the binary at link time, so at startup dyld will not "defy" the binary and run the NSLog in the main program.

The scheme has been running online in the local app for nearly a year. However, because it is very intrusive, it is only used for a few functions that must be handled synchronously at startup. Fishhook remains the main approach in the project.

The third C function hook scheme

We mentioned above that the tuple list extracted from the compressed byte stream contains a mapping between addresses and symbols. Is it possible to hook C functions with the tuple list? A demo shows that it is feasible. The code has been uploaded to WBBlades->ChubbyCat: search GitHub for WBBlades, and the demo is in the ChubbyCat directory.

typealias MYNSLogType = @convention(thin) (_ format: String, _ args: CVarArg...) -> Void

func MYNSLog(_ format: String, _ args: CVarArg...) {
    print("test success")
}

class Test: NSObject {
    func test() {
        let replacement = unsafeBitCast(MYNSLog as MYNSLogType, to: UInt64.self)
        let ret = ChubbyCatHook.replaceC(name: "_NSLog", replacement: replacement)
        print("hook result = \(ret)")
    }
}

However, this solution fails when the iOS Deployment Target is 15. Solving that would require reading the binary file from disk inside the sandbox at runtime to assist parsing. The demo is therefore intended only for learning and discussion.

LC_DYLD_CHAINED_FIXUPS for iOS 15

On iOS 15, the way the app performs rebase & bind has changed.

If we set the iOS Deployment Target to 15 and open the packaged Mach-O file in MachOView, we see load commands (LC) in the new binary that MachOView does not support.

This is because LC_DYLD_INFO_ONLY has been replaced by the new LC_DYLD_EXPORTS_TRIE and LC_DYLD_CHAINED_FIXUPS.

Another unsupported LC is LC_BUILD_VERSION: /* Build for platform min OS version */

The file changes mean that the rebase and bind mechanisms have changed in iOS 15. Looking back at iOS 14 and earlier, dyld implemented rebase and bind by parsing compressed byte streams. Parsing them tells dyld which addresses in the binary need to be fixed up, and which external symbol each address is reserved for during bind. So how does dyld perform fixups on iOS 15? Let's explore dyld.

dyld3? dyld4?

Some time ago, I heard colleagues discussing that dyld3 was updated to dyld4 in iOS 15. I'm not sure whether Apple quietly updated dyld, but there are signs that dyld has changed. For example, in Instruments we can see that the namespace of some functions has changed to dyld4. The behavior of some API calls has changed as well, for example:

let header:UnsafePointer<mach_header> = _dyld_get_image_header(0)

Before iOS 15, retrieving the header with index == 0 returned the executable's header. In iOS 15, however, index == 0 returns a system library. These changes are probably not enough to affect our business code, but they do show that dyld has changed. So is LC_DYLD_CHAINED_FIXUPS a new dyld feature? My answer is no. Support for LC_DYLD_CHAINED_FIXUPS was already present in dyld-852.2 (dyld3), but the corresponding code path is only taken once the Mach-O file changes, that is, when the iOS Deployment Target is 15.

Return to the exploration of BIND

Rebase & bind have changed in iOS 15; see the article on how iOS 15 can make your app launch faster. Interested readers can read that article carefully; those who just want the conclusion can read on, where I summarize its content.

In iOS 15, the compressed byte streams used for rebase & bind have been replaced with fixup-chains. At launch, dyld first checks whether the image has fixup-chains. If it does, dyld processes the fixup-chains; otherwise it parses the compressed byte streams. Either way, the goal is to fix up the addresses in the application. The hierarchy is segments -> pages -> fixup-chains: LC_DYLD_CHAINED_FIXUPS tells us how many segments there are, how many pages each segment has, and where the fixup-chain starts in each page. The chain in a page links every address in that page that needs rebase or bind. The data stored at these addresses is no longer 0x00 as it was before iOS 15, but 8 bytes of meaningful data in a defined format. These 8 bytes are divided into bit fields according to the pointer format, and each field carries part of the information needed for rebase or bind. iOS 15 does away with lazy_bind (weak_bind remains), and since rebase and bind are merged into one linked list, a single traversal completes all the rebase and bind work for a page.

Why do fixup-chains make startup faster?

Prior to iOS 15, rebase and bind information was stored in separate compressed byte streams. During startup, dyld first walks the addresses recorded in the rebase stream and fixes them. Suppose this causes N page faults. The rebased pages cannot simply be dropped afterwards, because they are dirty pages (data has been written to them); iOS compresses dirty pages that have not been used recently. Then, during bind, dyld walks the addresses recorded in the bind stream; suppose M pages need binding. The N and M page sets may overlap heavily, so those pages are visited a second time, and iOS may already have compressed some of the dirty pages, in which case bind must first decompress them. With fixup-chains, rebase and bind for the same page are merged into one linked list, so each page is traversed only once and never has to be decompressed for a second pass.

Questions

Can fixup-chains reduce the number of page faults? A: No, the set of pages touched is still M ∪ N.

Some people ask whether this means iOS is doing binary reordering for us. A: It's completely different. Both involve page faults, but at different stages.

Return to wild pointer detection

Going back to the beginning of the article: for wild pointer detection, we obtained all the system classes used in the project by parsing the compressed byte stream. With an iOS 15 deployment target there are no compressed byte streams, so how do we get the classes used in the project from fixup-chains?

Looking at the header file, we can locate the fixups header of the current file through the linkedit_data_command structure of the LC_DYLD_CHAINED_FIXUPS load command.

The fixups header is the entry point to the fixups information:

// header of the LC_DYLD_CHAINED_FIXUPS payload
struct dyld_chained_fixups_header
{
    uint32_t    fixups_version;    // 0
    uint32_t    starts_offset;     // offset of dyld_chained_starts_in_image in chain_data
    uint32_t    imports_offset;    // offset of imports table in chain_data
    uint32_t    symbols_offset;    // offset of symbol strings in chain_data
    uint32_t    imports_count;     // number of imported symbol names
    uint32_t    imports_format;    // DYLD_CHAINED_IMPORT*
    uint32_t    symbols_format;    // 0 => uncompressed, 1 => zlib compressed
};

We won't cover the other fields of the structure; here we focus only on symbols_offset. symbols_offset records the offset of the symbol strings. What it points to is not a symbol table; it stores the strings that bind needs. These symbols do not reuse the entries in strtab, presumably to trade space for time: the extra storage is small compared with the time saved during bind at startup. The approximate location that symbols_offset points to is shown below:

The data that symbols_offset points to sits where the compressed byte stream used to be stored; this is not visible in the figure because MachOView does not support displaying fixups.

Going back to symbols_offset, we can easily get all the external symbols by walking through the string pool.

// symbols_format: 0 => uncompressed
if fixup.pointee.symbols_format == 0 {
    print("import count = \(fixup.pointee.imports_count)")
    let symbolStarts = UInt(fixup.pointee.symbols_offset) + ptr
    var length: UInt = 0
    for _ in 0 ..< fixup.pointee.imports_count {
        let location = symbolStarts + length
        let symbol = UnsafeMutablePointer<CChar>(bitPattern: UInt(location))
        if let name = symbol {
            length += UInt(strlen(name)) + 1
            print(String(cString: name))
        }
    }
} else if fixup.pointee.symbols_format == 1 {
    // TODO: zlib compressed
}

In the demo above, the code prints output like the following snippet:

...
_OBJC_CLASS_$_UISceneConfiguration
_OBJC_CLASS_$_NSException
_OBJC_CLASS_$_UIViewController
_OBJC_METACLASS_$_UIViewController
_OBJC_METACLASS_$_NSObject
_OBJC_METACLASS_$_UIResponder
_OBJC_CLASS_$_UIResponder

Keep in mind the information printed here, which we’ll call “imports Symbols” for now, and we’ll use it later

At this point, the data our wild pointer detection needs is available even for packages built with a minimum deployment target of iOS 15. But bind itself needs more: it has to know which address must be bound to which symbol from which library. For example, if we call the NSLog() function, before iOS 15 the compressed byte stream tells us which 8-byte address holds the pointer to NSLog(). Before bind, that location in the file is filled with 8 bytes of 0x00: assuming the location is 0x140000, the 8 bytes starting at 0x140000 are all 0x00. In iOS 15, the bytes at 0x140000 are no longer 0x00 but a specific 8-byte value, for example 0x801000000000000A. So which function does 0x801000000000000A represent?

How to implement Bind in Fixup-Chains

Starting from dyld_chained_fixups_header, starts_offset gives the location of dyld_chained_starts_in_image. In the dyld_chained_starts_in_image structure, the first four bytes give the number of segments to rebase & bind in the current image, followed by one offset per segment.

Each of those offsets leads to a dyld_chained_starts_in_segment structure:

struct dyld_chained_starts_in_segment
{
    uint32_t    size;               // size of this (amount kernel needs to copy)
    uint16_t    page_size;          // 0x1000 or 0x4000
    uint16_t    pointer_format;     // DYLD_CHAINED_PTR_*
    uint64_t    segment_offset;     // offset in memory to start of segment
    uint32_t    max_valid_pointer;  // for 32-bit OS, any value beyond this is not a pointer
    uint16_t    page_count;         // how many pages are in array
    uint16_t    page_start[1];      // each entry is offset in each page of first element in chain
};

dyld_chained_starts_in_segment describes one segment: page_size is the size of each page, page_count is the number of pages, and page_start tells us, for each page, the offset within that page at which its fixup-chain starts.

A segment is a run of consecutive pages, so once we know the fixed page size and the number of pages, we can find the start address of every fixup-chain in the segment.

for pageIndex in 0 ..< Int(segment.pointee.page_count) {
    // page_start is a C flexible array of UInt16; read entry pageIndex through a raw pointer
    let pageStarts = UnsafeRawPointer(segment)
        .advanced(by: MemoryLayout<dyld_chained_starts_in_segment>.offset(of: \.page_start)!)
        .assumingMemoryBound(to: UInt16.self)
    let offsetInPage: UInt16 = pageStarts[pageIndex]
    if offsetInPage == UInt16(DYLD_CHAINED_PTR_START_NONE) {
        continue  // this page has no fixups
    }
    // 32-bit chains which may need multiple starts per page
    if (offsetInPage & UInt16(DYLD_CHAINED_PTR_START_MULTI)) != 0 {
        print("multiple starts per page")
    } else {
        // one chain start per page
        let chainStart = UInt64(headPtr) + segment.pointee.segment_offset
            + UInt64(offsetInPage) + UInt64(pageIndex) * UInt64(segment.pointee.page_size)
        let chainContentPtr = UnsafePointer<UInt>(bitPattern: UInt(chainStart))
        print("page:\(pageIndex) first pointer = \(String(format: "0x%llx", chainStart)) chainContent = \(String(format: "0x%llx", chainContentPtr?.pointee ?? 0))")
    }
}

The above code is adapted from dyld-852.2. With it you can find the fixup-chain of every page of every segment.

In this step we actually obtain only the first element of each chain, the head of the linked list. So how do we get to the next element? Each element has several bits (next) that encode the distance to the next pointer. How many bits? On arm64, 12 bits are needed if the stride is 4, and 11 bits if the stride is 8.

For example: if stride = 4 and next = 1, the next pointer is stride * next = 4 bytes away. If stride = 8 and next = 10, the next pointer is stride * next = 80 bytes away.

In short, whether next takes 11 bits or 12 bits, it can always reach anywhere within a 16KB page.
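The arithmetic behind that claim is easy to check. The helper below is a hypothetical illustration (the function name is not from dyld): it computes the largest distance a `next` field of a given width can express for a given stride.

```c
#include <stdint.h>

/* Largest byte distance that a `next` field with `next_bits` bits
 * can express when each unit of `next` is worth `stride` bytes. */
uint32_t max_next_distance(unsigned next_bits, unsigned stride)
{
    uint32_t max_next = (1u << next_bits) - 1u;  /* all bits set */
    return max_next * stride;
}
```

With 12 bits and a stride of 4 the reach is 4095 * 4 = 16380 bytes; with 11 bits and a stride of 8 it is 2047 * 8 = 16376 bytes. The last 8-byte slot of a 16KB (16384-byte) page starts at offset 16376, so in both cases a chain that starts at the beginning of a page can reach the final pointer-sized slot.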

After retrieving the data in the linked list, we actually know the address and data, but the corresponding symbol is still unknown.

address data Unknown symbol
0x10453c000 0x801000000000000A “_NSLog”
0x10453c008 0x801000000000000B “_printf”
. . .

The data in this table is the pre-bind data, shown only to make things easier to follow. At runtime, the pointer at 0x10453C000 holds a function address such as 0x185FD0150 rather than 0x801000000000000A. This makes sense: our code runs after bind, so we cannot read a raw value like 0x801000000000000A unless we read the Mach-O file on disk.

So we want to know how to get the function name _NSLog from 0x801000000000000A.

The dyld_chained_starts_in_segment structure has a member pointer_format, which describes how the fixup data in the current segment is to be parsed. There are 12 formats:

// values for dyld_chained_starts_in_segment.pointer_format
enum {
    DYLD_CHAINED_PTR_ARM64E                 =  1,    // stride 8, unauth target is vmaddr
    DYLD_CHAINED_PTR_64                     =  2,    // target is vmaddr
    DYLD_CHAINED_PTR_32                     =  3,
    DYLD_CHAINED_PTR_32_CACHE               =  4,
    DYLD_CHAINED_PTR_32_FIRMWARE            =  5,
    DYLD_CHAINED_PTR_64_OFFSET              =  6,    // target is vm offset
    DYLD_CHAINED_PTR_ARM64E_OFFSET          =  7,    // old name
    DYLD_CHAINED_PTR_ARM64E_KERNEL          =  7,    // stride 4, unauth target is vm offset
    DYLD_CHAINED_PTR_64_KERNEL_CACHE        =  8,
    DYLD_CHAINED_PTR_ARM64E_USERLAND        =  9,    // stride 8, unauth target is vm offset
    DYLD_CHAINED_PTR_ARM64E_FIRMWARE        = 10,    // stride 4, unauth target is vmaddr
    DYLD_CHAINED_PTR_X86_64_KERNEL_CACHE    = 11,    // stride 1, x86_64 kernel caches
    DYLD_CHAINED_PTR_ARM64E_USERLAND24      = 12,    // stride 8, unauth target is vm offset, 24-bit bind
};
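As a rough illustration of how pointer_format determines the stride, the sketch below maps a format value to its stride. It uses only the strides stated in the comments of the listing above, plus the 4-byte stride given in the comment on the next field of dyld_chained_ptr_64_bind for DYLD_CHAINED_PTR_64. The FMT_ constants and stride_for_format are hypothetical names local to this sketch; formats whose stride is not stated in this article return 0.

```c
#include <stdint.h>

/* Local copies of the values from the listing, so the sketch
   compiles without Apple headers (names are hypothetical). */
enum {
    FMT_ARM64E              = 1,
    FMT_64                  = 2,
    FMT_ARM64E_KERNEL       = 7,
    FMT_ARM64E_USERLAND     = 9,
    FMT_ARM64E_FIRMWARE     = 10,
    FMT_X86_64_KERNEL_CACHE = 11,
    FMT_ARM64E_USERLAND24   = 12
};

/* Stride in bytes for a pointer_format, or 0 if this sketch does not cover it. */
static unsigned stride_for_format(uint16_t pointer_format)
{
    switch (pointer_format) {
    case FMT_ARM64E:
    case FMT_ARM64E_USERLAND:
    case FMT_ARM64E_USERLAND24:
        return 8;
    case FMT_64:
    case FMT_ARM64E_KERNEL:
    case FMT_ARM64E_FIRMWARE:
        return 4;
    case FMT_X86_64_KERNEL_CACHE:
        return 1;
    default:
        return 0; /* stride not stated in the listing above */
    }
}
```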

To clarify how bind looks up symbols, we pick one of the formats, DYLD_CHAINED_PTR_64. Each DYLD_CHAINED_PTR_64 entry is 8 bytes.

// DYLD_CHAINED_PTR_64
struct dyld_chained_ptr_64_bind
{
    uint64_t    ordinal   : 24,
                addend    :  8,    // 0 thru 255
                reserved  : 19,    // all zeros
                next      : 12,    // 4-byte stride
                bind      :  1;    // == 1
};

In dyld, if the format is DYLD_CHAINED_PTR_64, dyld does the following to get the value to bind.

if ( fixupLoc->generic64.bind.bind ) {
    newValue = (void*)((long)bindTargets[fixupLoc->generic64.bind.ordinal] + fixupLoc->generic64.signExtendedAddend());
}

We don’t need to understand every detail of this code; we only need to note two things:

  • bind is related to the bindTargets array, which means the symbol is related to bindTargets.
  • bind is related to the ordinal in the dyld_chained_ptr_64_bind structure above, which means the symbol is also related to ordinal.
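These two points can be sketched as a small chain walker in C. This is a simplified illustration, not dyld's implementation: apply_binds is a hypothetical name, the addend is treated as a plain unsigned value in the range 0 to 255 (following the struct comment) instead of going through signExtendedAddend(), rebase entries (bind == 0) are left untouched, and the bit positions assume the bit-fields are allocated from the least significant bit, as is usual on little-endian ARM64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Walk one DYLD_CHAINED_PTR_64 chain inside an in-memory copy of a segment,
   starting at first_offset, and overwrite every bind entry with
   bind_targets[ordinal] + addend. Returns the number of binds applied. */
static size_t apply_binds(uint8_t *segment, size_t segment_size, size_t first_offset,
                          const uint64_t *bind_targets, size_t target_count)
{
    assert(segment != NULL && bind_targets != NULL);
    size_t applied = 0;
    size_t offset = first_offset;

    while (offset + sizeof(uint64_t) <= segment_size) {
        uint64_t raw;
        memcpy(&raw, segment + offset, sizeof raw);

        if ((raw >> 63) != 0) {                      /* bind bit set */
            uint64_t ordinal = raw & 0xFFFFFF;       /* low 24 bits */
            uint64_t addend  = (raw >> 24) & 0xFF;   /* next 8 bits */
            if (ordinal < target_count) {
                uint64_t value = bind_targets[ordinal] + addend;
                memcpy(segment + offset, &value, sizeof value);
                ++applied;
            }
        }

        uint64_t next = (raw >> 51) & 0xFFF;         /* 12-bit next field */
        if (next == 0) {
            break;                                   /* end of chain */
        }
        offset += next * 4;                          /* 4-byte stride */
    }
    return applied;
}
```

The bounds check on offset is a defensive choice for the sketch, so that a malformed next value cannot walk past the end of the buffer.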

Then we need to look at how bindTargets is generated and in what order. In dyld3 there is the following code:

for (uint32_t i=0; i < header->imports_count && !stop; ++i) {
    const char* symbolName = &symbolsPool[imports[i].name_offset];
    ...
    // Always invoke the callback; inside it the data is appended to bindTargets.
    callback(libOrdinal, symbolName, 0, imports[i].weak_import, stop);
}

From the above code, we know that the callback keeps appending data to bindTargets. Recall the imports information mentioned earlier, from which we got all the imported symbols (the imports symbols above).

This means that ordinal == 0 refers to the first element in bindTargets, whose symbol is the first string of the imports symbols.
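This indexing relationship can be illustrated with a minimal lookup in C. struct import_entry is a simplified stand-in that keeps only the fields used in the loop above (it is not dyld's actual record layout), and symbol_for_ordinal is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for one imports record (assumption for this sketch). */
struct import_entry {
    uint32_t name_offset;  /* offset of the symbol name inside the symbols pool */
    uint32_t weak_import;  /* nonzero if the import is weak */
};

/* Return the symbol name that a bind ordinal refers to,
   or NULL if the ordinal is out of range. */
static const char *symbol_for_ordinal(const struct import_entry *imports,
                                      uint32_t imports_count,
                                      const char *symbols_pool,
                                      uint32_t ordinal)
{
    assert(imports != NULL && symbols_pool != NULL);
    if (ordinal >= imports_count) {
        return NULL;
    }
    return &symbols_pool[imports[ordinal].name_offset];
}
```

Because bindTargets is filled in the same order as the imports array, the same ordinal indexes both: one yields the bound address, the other yields the symbol name.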

So dyld reads 0x801000000000000A from the binary file and, by interpreting it as a dyld_chained_ptr_64_bind, knows that its ordinal is 0x0A = 10. The corresponding symbol is entry 0x0A of the imports symbols, which in my test demo is “_NSLog”.
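The decoding of this raw value can be reproduced with plain shifts and masks. The sketch below is an illustration only: struct bind_fields and decode_ptr64_bind are hypothetical names, and the bit positions assume the fields of dyld_chained_ptr_64_bind are laid out from the least significant bit upward.

```c
#include <stdint.h>

/* Unpacked view of a DYLD_CHAINED_PTR_64 bind entry (hypothetical helper type). */
struct bind_fields {
    uint32_t ordinal;  /* index into bindTargets / imports symbols */
    uint32_t addend;   /* 0 thru 255 */
    uint32_t next;     /* distance to the next fixup, in 4-byte strides */
    uint32_t bind;     /* 1 for a bind entry */
};

static struct bind_fields decode_ptr64_bind(uint64_t raw)
{
    struct bind_fields f;
    f.ordinal = (uint32_t)(raw & 0xFFFFFF);       /* bits 0..23  */
    f.addend  = (uint32_t)((raw >> 24) & 0xFF);   /* bits 24..31 */
    f.next    = (uint32_t)((raw >> 51) & 0xFFF);  /* bits 51..62, after 19 reserved bits */
    f.bind    = (uint32_t)(raw >> 63);            /* bit 63 */
    return f;
}
```

For 0x801000000000000A this yields ordinal = 10, addend = 0, next = 2 and bind = 1. With a 4-byte stride, next = 2 means the following fixup is 8 bytes away, which agrees with the next entry in the table sitting at 0x10453c008.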

Conclusion

So far, we have covered all the explorations triggered by wild pointer detection. Here is a brief summary:

  • For wild pointer detection, we want to obtain all the system classes that are actually used, such as UIView; unused system classes should not be collected.
  • We introduced the concept of symbol types and how to achieve this through commands and text processing.
  • We introduced the concept of the compressed byte stream and how to parse it.
  • We briefly introduced how fishhook uses the symbol-to-address mapping from the compressed byte stream (i.e., the tuples above) to replace C functions without a symbol table.
  • Files packaged for iOS 15 no longer have a compressed byte stream; they use fixup-chains instead to achieve efficient rebase & bind.
  • By parsing imports, we can obtain all the system symbols used, which meets the needs of our wild pointer detection.
  • We explored how bind associates symbols with addresses, and concluded that the ordinal in bind is the index into the imports symbols array.

About the author

Hello everyone, my name is Deng Zhuli. I currently work in the Platform Technology Department of 58.com – User and Value Growth Center. Thank you very much for taking the time to read this long article. I wrote it because I have done a lot of technical research recently but had not yet organized the results. I therefore wanted to record my exploration and findings in an article, which may be useful to others.

Given the limits of my ability and time, the conclusions and opinions here will inevitably contain mistakes. If you have any questions while reading, feel free to leave a message or add me on WeChat to discuss.

I like to share and summarize, because doing so pushes me to think more deeply and to reflect on whether my strategies and plans can stand up to scrutiny. If you are interested in topics such as lazy loading of dynamic libraries, log symbolication, app stalls, wild pointer detection, or Mach-O exploration, feel free to leave a message ~

Reference:

1. Fishhook #87: discussion of the fishhook crash and its fix.

2. A brief analysis of the implementation principle of fishhook: a good article explaining how fishhook works.

3. Why fishhook crash on iOS 14.5: an explanation of the fishhook crash.

5. Explain Lazy/Non-lazy Binding to interns.

6. iOS 15 How to make your apps launch faster: a rare article about iOS 15 fixups.

7. dyld: dyld source download address.

8. Radare: code for fixup-chains parsing.

9. Timago’s blog: reference article for the introduction to symbol types.

10. How Apple has supercharged app launching in iOS 15 and macOS Monterey: another post on iOS 15 fixups, essentially an introduction to Noah Martin’s article.