Series
- iOS Jailbreak Principles – Sock Port (UI
- iOS Jailbreak Principles – Sock Port (2) Leak the Port Address through Mach OOL Message
- iOS Jailbreak Principles – Sock Port (CST
- iOS Jailbreak Principles – Sock Port
- iOS Jailbreak Principles – Undecimus Analysis
Preface
The kernel contains many key variables and checks. To read those variables and bypass those checks, we first have to locate their addresses in memory. This article introduces the method Undecimus uses to locate key memory addresses based on string cross-references (String XREF). The method not only pinpoints specific elements in the kernel, it is also a good source of inspiration for designing your own binary analysis tools.
Locate the Kernel Task
To obtain kernel information, we need to locate the address of the kernel task and read its contents through the kread primitive provided by tfp0. The key is to find code that references the kernel task, locate that code in memory, and then analyze its instructions to recover the offset of the variable.
Find functions that use Kernel Task
The following code in xnu-4903.221.2 leads to an access to the kernel task (through task_suspend):
int
proc_apply_resource_actions(void * bsdinfo, __unused int type, int action)
{
    proc_t p = (proc_t)bsdinfo;

    switch(action) {
    case PROC_POLICY_RSRCACT_THROTTLE:
        /* no need to do anything */
        break;
    case PROC_POLICY_RSRCACT_SUSPEND:
        task_suspend(p->task);
        break;
    case PROC_POLICY_RSRCACT_TERMINATE:
        psignal(p, SIGKILL);
        break;
    case PROC_POLICY_RSRCACT_NOTIFY_KQ:
        /* not implemented */
        break;
    case PROC_POLICY_RSRCACT_NOTIFY_EXC:
        panic("shouldn't be applying exception notification to process!");
        break;
    }
    return(0);
}
The string "shouldn't be applying exception notification to process!" can serve as an anchor. After compilation it is stored in the __TEXT,__cstring section; searching that section in memory gives the location of the string, which we call location_str.
Locate the String XREF in the function
Since ARM64 usually needs two instructions to form an address, we have to statically analyze the code in order to find the instructions that use location_str. A cross-reference (XREF) is found when the value in a register becomes equal to location_str. In this way we can locate the address of the instructions that correspond to panic("shouldn't be applying exception notification to process!").
Backtrack to find Kernel Task XREF
The fastest route to the kernel task is to walk back from the panic to the call task_suspend(p->task). The first thing task_suspend does is compare its argument with kernel_task, so kernel_task must be addressed near the start of the function. Solving that addressing instruction gives the offset of kernel_task, and adding the kernel's base address in memory gives its address.
kern_return_t
task_suspend(task_t task)
{
    kern_return_t kr;
    mach_port_t port, send, old_notify;
    mach_port_name_t name;

    if (task == TASK_NULL || task == kernel_task)
        return (KERN_INVALID_ARGUMENT);

    task_lock(task);
    // ...
As the analysis above shows, the crux of the problem is locating XREFs. Next we analyze a string-based XREF location algorithm that solves it.
Load Kernelcache in memory
According to the definition of kernelcache given by The iPhone Wiki [1]:
The kernelcache is basically the kernel itself as well as all of its extensions (AppleImage3NORAccess, IOAESAccelerator, IOPKEAccelerator, etc.) into one file, then packed/encrypted in an IMG3 (iPhone OS 2.0 and above) or 8900 (iPhone OS 1.0 through 1.1.4) container.
In other words, kernelcache packs the kernel and its extensions into a single file, stored in IMG3 format (iPhone OS 2.0 and above).
In the previous article we introduced a tfp0-based sandbox escape. Once out of the sandbox, we can read the kernelcache from /System/Library/Caches/com.apple.kernelcaches/kernelcache, which is the image of the kernel the system is currently running.
Readers can open Undecimus' jailbreak.m and search for "Initializing Patchfinder" to find the code that loads kernelcache. The loading is similar to that of an ordinary Mach-O file: the Mach header and load commands are read, and the offsets are recorded in the init_kernel function.
I won't go over the loading process here and will only point out a few key global variables:
- cstring_base and cstring_size are the virtual address and length of the __TEXT,__cstring section;
- xnucore_base and xnucore_size are the virtual address and length of the __TEXT_EXEC segment, i.e. the code segment;
- kerndumpbase is the smallest virtual address of all segments, i.e. the virtual base address at which kernelcache is loaded. In an ordinary 64-bit Mach-O executable the image usually begins at 0x100000000, just above __PAGEZERO; in the kernel it appears to be the virtual address of the __TEXT segment, 0xFFFFFFF007004000;
- kernel is a complete mapping of kernelcache in user space, that is, a fully loaded kernel image.
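To make the role of kerndumpbase concrete, here is a minimal sketch of how the lowest segment address could be computed. The seg_info struct and the lowest_vmaddr function are hypothetical stand-ins invented for this illustration; the real init_kernel walks the Mach-O load commands instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a Mach-O segment record. */
struct seg_info {
    const char *name;   /* segment name, e.g. "__TEXT" */
    uint64_t vmaddr;    /* virtual address of the segment */
    uint64_t vmsize;    /* size of the segment in memory */
};

/* Return the smallest vmaddr among the segments, or UINT64_MAX if there are none. */
uint64_t lowest_vmaddr(const struct seg_info *segs, size_t count)
{
    uint64_t min = UINT64_MAX;
    for (size_t i = 0; i < count; i++) {
        if (segs[i].vmaddr < min)
            min = segs[i].vmaddr;
    }
    return min;
}
```

Every virtual address found later is converted to an offset into the kernel mapping by subtracting this value, and back again by adding it.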
Find String Based XREF
Undecimus includes a find_strref function for locating the XREF of a string:
addr_t
find_strref(const char *string, int n, enum string_bases string_base, bool full_match, bool ppl_base)
{
    uint8_t *str;
    addr_t base;
    addr_t size;
    enum text_bases text_base = ppl_base ? text_ppl_base : text_xnucore_base;

    switch (string_base) {
    case string_base_const:
        base = const_base;
        size = const_size;
        break;
    case string_base_data:
        base = data_base;
        size = data_size;
        break;
    case string_base_oslstring:
        base = oslstring_base;
        size = oslstring_size;
        break;
    case string_base_pstring:
        base = pstring_base;
        size = pstring_size;
        text_base = text_prelink_base;
        break;
    case string_base_cstring:
    default:
        base = cstring_base;
        size = cstring_size;
        break;
    }

    addr_t off = 0;
    while ((str = boyermoore_horspool_memmem(kernel + base + off, size - off, (uint8_t *)string, strlen(string)))) {
        // Only match the beginning of strings
        // first_string || \0this_string
        if ((str == kernel + base || *(str - 1) == '\0') && (!full_match || strcmp((char *)str, string) == 0))
            break;
        // find after str
        off = str - (kernel + base) + 1;
    }
    if (!str) {
        return 0;
    }
    // find xref
    return find_reference(str - kernel + kerndumpbase, n, text_base);
}
It takes the string to search for (string), the index of the reference to return (n), the base section (string_base), whether to match the whole string (full_match), and whether to search the __PPLTEXT segment (ppl_base). For the kernel task scenario, the arguments are as follows:
addr_t str = find_strref("\"shouldn't be applying exception notification", 2, string_base_cstring, false, false);
That is, using __TEXT,__cstring as the base and without requiring an exact match, find the address of the second cross-reference.
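The "only match the beginning of strings" rule and the full_match flag can be illustrated in isolation. The sketch below is not the Undecimus code: find_cstring is a hypothetical helper that uses a naive byte-by-byte search instead of Boyer-Moore-Horspool, and it assumes the searched region ends with a NUL byte, as a C-string section does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return the offset of the first occurrence of `needle` that begins a
 * NUL-terminated string inside base[0..size), or -1 if there is none.
 * With full_match set, the whole string must equal `needle`.
 */
long find_cstring(const uint8_t *base, size_t size, const char *needle, bool full_match)
{
    size_t nlen = strlen(needle);
    if (nlen == 0 || nlen > size)
        return -1;

    for (size_t off = 0; off + nlen <= size; off++) {
        if (memcmp(base + off, needle, nlen) != 0)
            continue;
        /* A string starts at the region start or right after a NUL byte. */
        if (off != 0 && base[off - 1] != '\0')
            continue;
        /* For an exact match the byte after the needle must be the terminator. */
        if (full_match && (off + nlen >= size || base[off + nlen] != '\0'))
            continue;
        return (long)off;
    }
    return -1;
}
```

With full_match set to false, a prefix such as the truncated panic message above is enough, which is why the call can omit the end of the string.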
Locate the string address
The string address is located by the boyermoore_horspool_memmem function:
static unsigned char *
boyermoore_horspool_memmem(const unsigned char* haystack, size_t hlen,
                           const unsigned char* needle, size_t nlen)
{
    size_t last, scan = 0;
    size_t bad_char_skip[UCHAR_MAX + 1]; /* Officially called: bad character shift */

    /* Sanity checks on the parameters */
    if (nlen <= 0 || !haystack || !needle)
        return NULL;

    /* ---- Preprocess ---- */
    /* Initialize the table to default value */
    /* When a character is encountered that does not occur
     * in the needle, we can safely skip ahead for the whole
     * length of the needle. */
    for (scan = 0; scan <= UCHAR_MAX; scan = scan + 1)
        bad_char_skip[scan] = nlen;

    /* C arrays have the first byte at [0], therefore:
     * [nlen - 1] is the last byte of the array. */
    last = nlen - 1;

    /* Then populate it with the analysis of the needle */
    for (scan = 0; scan < last; scan = scan + 1)
        bad_char_skip[needle[scan]] = last - scan;

    /* ---- Do the matching ---- */
    /* Search the haystack, while the needle can still be within it. */
    while (hlen >= nlen)
    {
        /* scan from the end of the needle */
        for (scan = last; haystack[scan] == needle[scan]; scan = scan - 1)
            if (scan == 0) /* If the first byte matches, we've found it. */
                return (void *)haystack;

        /* otherwise, we need to skip some bytes and start again.
         * Note that here we are getting the skip value based on the last byte
         * of needle, no matter where we didn't match. So if needle is: "abcd"
         * then we are skipping based on 'd' and that value will be 4, and
         * for "abcdd" we again skip on 'd' but the value will be only 1.
         * The alternative of pretending that the mismatched character was
         * the last character is slower in the normal case (E.g. finding
         * "abcd" in "...azcd..." gives 4 by using 'd' but only
         * 4-2==2 using 'z'). */
        hlen -= bad_char_skip[haystack[last]];
        haystack += bad_char_skip[haystack[last]];
    }
    return NULL;
}
We first work out the input parameters from the call site:
addr_t base = cstring_base;
addr_t off = 0;
while ((str = boyermoore_horspool_memmem(kernel + base + off, size - off, (uint8_t *)string, strlen(string)))) {
    // Only match the beginning of strings
    // first_string || \0this_string
    if ((str == kernel + base || *(str - 1) == '\0') && (!full_match || strcmp((char *)str, string) == 0))
        break;
    // find after str
    off = str - (kernel + base) + 1;
}
- haystack = kernel + base + off, i.e. the starting address of the __TEXT,__cstring section (plus the current search offset);
- hlen = size - off, i.e. the remaining length of the __TEXT,__cstring section;
- needle = string, a pointer to the string to be searched for;
- nlen = strlen(string), the length of the string to be searched for.
At the start of the function, a bad_char_skip array is built to record how many bytes can be skipped after a failed match, which avoids pointless comparisons. The algorithm compares in reverse order: starting from haystack[nlen - 1], it moves toward the front while checking haystack[i] == needle[i]. If the condition still holds once haystack[0] has been compared, the string has been found. Otherwise it looks up bad_char_skip using the haystack byte aligned with the last byte of the needle and advances the haystack pointer by that amount before trying again.
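The preprocessing step is easier to follow with concrete numbers. The sketch below isolates only the table construction in a hypothetical helper named build_skip_table (the name is not from Undecimus); the logic mirrors the two loops at the top of boyermoore_horspool_memmem.

```c
#include <limits.h>
#include <stddef.h>

/* Fill skip[] with the Horspool bad-character shifts for needle[0..nlen). */
void build_skip_table(const unsigned char *needle, size_t nlen,
                      size_t skip[UCHAR_MAX + 1])
{
    /* Bytes that never occur in the needle allow a shift by its full length. */
    for (size_t c = 0; c <= UCHAR_MAX; c++)
        skip[c] = nlen;

    /* Every byte except the last gets its distance to the end of the needle. */
    for (size_t i = 0; i + 1 < nlen; i++)
        skip[needle[i]] = nlen - 1 - i;
}
```

For the needle "abcd" this gives a shift of 3 for 'a', 2 for 'b', 1 for 'c' and 4 for 'd' or any other byte; for "abcdd" the shift for 'd' drops to 1, which matches the comment in the original function.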
Note that the address obtained after a successful match is relative to kernel, the user-space mapping of kernelcache; it is not the actual address of the string in the kernel.
Search for addressing operations that target the string
After obtaining the string's user-space address str, we first need to calculate its virtual address in kernelcache:
addr_t str_vmaddr = str - kernel + kerndumpbase;
Any reference to str in kernel code must involve addressing str_vmaddr. The main addressing forms are as follows:
; 1
adrp xn, str@PAGE
add xn, xn, str@PAGEOFF
; 2
ldr xn, [xm, #imm]
; 3
ldr xn, =#imm
; 4
adr xn, #imm
; 5
bl #addr
find_strref ends by returning find_reference(str_vmaddr, n, text_base). find_reference statically analyzes the code in __TEXT_EXEC,__text and simulates the register operations of address-related instructions. The main logic lives in the xref64 function: a cross-reference to str is found when the value in a register becomes equal to str_vmaddr.
That code mostly decodes and evaluates machine instructions. It is long, so it is not reproduced here; interested readers can read it for themselves.
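For readers who want a feel for the register simulation without reading the full decoder, here is a heavily simplified sketch. It is not xref64: find_xref_sketch and read_insn are hypothetical names, only ADRP and the 64-bit ADD (immediate) form are modeled, and a little-endian host is assumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read one 32-bit instruction from the buffer (assumes a little-endian host). */
static uint32_t read_insn(const uint8_t *code, size_t off)
{
    uint32_t insn;
    memcpy(&insn, code + off, sizeof(insn));
    return insn;
}

/*
 * Scan code[0..size), mapped at virtual address `base`, and return the
 * address of the n-th (1-based) instruction that leaves `target` in a
 * register, or 0 if there is no such instruction.
 */
uint64_t find_xref_sketch(const uint8_t *code, size_t size, uint64_t base,
                          uint64_t target, int n)
{
    uint64_t reg[32] = {0};   /* simulated x0..x30 (index 31 is unused here) */

    for (size_t off = 0; off + 4 <= size; off += 4) {
        uint32_t insn = read_insn(code, off);
        uint64_t pc = base + off;
        unsigned rd = insn & 0x1F;

        if ((insn & 0x9F000000u) == 0x90000000u) {
            /* ADRP Xd, label: immhi is bits 23:5, immlo is bits 30:29. */
            int64_t imm = (int64_t)((((insn >> 5) & 0x7FFFF) << 2) | ((insn >> 29) & 3));
            if (imm & 0x100000)
                imm -= 0x200000;                 /* sign-extend the 21-bit value */
            reg[rd] = (pc & ~0xFFFULL) + (uint64_t)(imm * 4096);
        } else if ((insn & 0xFF800000u) == 0x91000000u) {
            /* ADD Xd, Xn, #imm12 with an optional LSL #12 (bit 22). */
            unsigned rn = (insn >> 5) & 0x1F;
            uint64_t imm = (insn >> 10) & 0xFFF;
            if (insn & (1u << 22))
                imm <<= 12;
            reg[rd] = reg[rn] + imm;
        } else {
            continue;                            /* instruction not modeled */
        }

        if (reg[rd] == target && --n == 0)
            return pc;
    }
    return 0;
}
```

In this sketch the reported address is that of the add instruction, because that is where the register first holds the complete string address; this agrees with the XREF landing on an add in the walkthrough that follows.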
Locate the variable address via String XREF
Now that we have the reference to str inside the target function proc_apply_resource_actions, we need to walk back to the instruction that calls task_suspend:
addr_t find_kernel_task(void) {
    /**
     adrp x8, str@PAGE
     str --> add x8, x8, str@PAGEOFF
     bl _panic
     */
    addr_t str = find_strref("\"shouldn't be applying exception notification", 2, string_base_cstring, false, false);
    if (!str) return 0;
    str -= kerndumpbase;

    // find bl _task_suspend
    addr_t call = step64_back(kernel, str, 0x10, INSN_CALL);
    if (!call) return 0;

    addr_t task_suspend = follow_call64(kernel, call);
    if (!task_suspend) return 0;

    addr_t adrp = step64(kernel, task_suspend, 20*4, INSN_ADRP);
    if (!adrp) return 0;

    addr_t kern_task = calc64(kernel, adrp, adrp + 0x8, 8);
    if (!kern_task) return 0;

    return kern_task + kerndumpbase;
}
The whole process has three steps:
- Walk back to find bl _task_suspend and resolve the address of the task_suspend function;
- Search forward from the start of task_suspend for the first ADRP instruction, which addresses the kernel task;
- Solve the kernel task address from the addressing instructions.
Let’s go back to the proc_apply_resource_actions function fragment:
switch(action) {
case PROC_POLICY_RSRCACT_THROTTLE:
    /* no need to do anything */
    break;
case PROC_POLICY_RSRCACT_SUSPEND:
    task_suspend(p->task);
    break;
case PROC_POLICY_RSRCACT_TERMINATE:
    psignal(p, SIGKILL);
    break;
case PROC_POLICY_RSRCACT_NOTIFY_KQ:
    /* not implemented */
    break;
case PROC_POLICY_RSRCACT_NOTIFY_EXC:
    panic("shouldn't be applying exception notification to process!");
    break;
}
The machine code is not necessarily emitted in case order at compile time, so we need to use the string XREF to find what the code actually looks like in kernelcache. A simple way is to set a breakpoint right after the call find_strref("\"shouldn't be applying exception notification", 2, string_base_cstring, false, false), read off the file offset of the string XREF, and then disassemble kernelcache with a binary analysis tool.
Breakpoint debugging shows that the string XREF is at 0x0000000000F9F084, which should be the add instruction:
/**
 adrp x8, str@PAGE
 str --> add x8, x8, str@PAGEOFF
 bl _panic
 */
Opening kernelcache in a Mach-O viewer confirms that 0x0000000000F9F084 is indeed an add instruction.
There are two clues for locating task_suspend(p->task). One is that p->task is a structure member access, which has the recognizable shape of base-plus-offset addressing; the other is the argument setup that precedes a function call. There is a +16 offset access at 0xF9F074, which is evidently computing p->task, so the instruction at 0xF9F078 is the call task_suspend(p->task).
Going back three instructions from the add instruction reaches this call, from which the address of task_suspend can be resolved:
// find bl _task_suspend
addr_t call = step64_back(kernel, str, 0x10, INSN_CALL);
if (!call) return 0;

addr_t task_suspend = follow_call64(kernel, call);
if (!task_suspend) return 0;
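The two helpers used here can be approximated in a few lines. The sketch below is an illustration under assumptions, not the Undecimus implementation: scan_back, follow_bl, insn_at and NOT_FOUND are hypothetical names, and it assumes that INSN_CALL stands for the value/mask pair that matches the A64 BL encoding (0x94000000 under mask 0xFC000000).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NOT_FOUND UINT64_MAX

/* Read one 32-bit instruction at byte offset `off` (little-endian host assumed). */
static uint32_t insn_at(const uint8_t *buf, uint64_t off)
{
    uint32_t insn;
    memcpy(&insn, buf + off, sizeof(insn));
    return insn;
}

/*
 * Walk backwards from `start`, covering at most `length` bytes, and return
 * the offset of the first instruction with (insn & mask) == value.
 */
uint64_t scan_back(const uint8_t *buf, uint64_t start, uint64_t length,
                   uint32_t value, uint32_t mask)
{
    uint64_t end = (start >= length) ? start - length : 0;
    for (uint64_t off = start; ; off -= 4) {
        if ((insn_at(buf, off) & mask) == value)
            return off;
        if (off < end + 4)          /* the next step would pass `end` */
            break;
    }
    return NOT_FOUND;
}

/* Decode the BL at `off` and return the offset it branches to. */
uint64_t follow_bl(const uint8_t *buf, uint64_t off)
{
    uint32_t insn = insn_at(buf, off);
    if ((insn & 0xFC000000u) != 0x94000000u)
        return NOT_FOUND;           /* not a BL instruction */

    int64_t imm = insn & 0x03FFFFFF;
    if (imm & 0x02000000)
        imm -= 0x04000000;          /* sign-extend the 26-bit word offset */
    return off + (uint64_t)(imm * 4);
}
```

Unlike the original helpers, which return 0 on failure, this sketch uses a separate NOT_FOUND value so that offset 0 stays a valid result.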
Searching forward from the start of task_suspend, the first ADRP instruction is the one that addresses the kernel task. The kernel task address can then be calculated by statically analyzing the adrp and add pair:
addr_t adrp = step64(kernel, task_suspend, 20*4, INSN_ADRP);
if (!adrp) return 0;

addr_t kern_task = calc64(kernel, adrp, adrp + 0x8, 8);
if (!kern_task) return 0;
Note that what we have at this point is still a file offset; kerndumpbase must be added to get the virtual address:
return kern_task + kerndumpbase;
Also note that to read the kernel task from the running kernel, kernel_slide must be added to this address. kernel_slide is computed right after tfp0 is obtained; interested readers can read that code for themselves.
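The three address spaces that appear in this article are easy to mix up, so the conversions between them are summarized below as a sketch. The function names are hypothetical, and the slide value in the usage note is made up for illustration.

```c
#include <stdint.h>

/* Pointer inside the user-space mapping `kernel` -> offset within the image. */
uint64_t ptr_to_off(const uint8_t *p, const uint8_t *kernel)
{
    return (uint64_t)(p - kernel);
}

/* Offset within the image -> unslid virtual address in kernelcache. */
uint64_t off_to_vmaddr(uint64_t off, uint64_t kerndumpbase)
{
    return off + kerndumpbase;
}

/* Unslid virtual address -> address in the running kernel. */
uint64_t vmaddr_to_runtime(uint64_t vmaddr, uint64_t kernel_slide)
{
    return vmaddr + kernel_slide;
}
```

For example, the XREF offset 0xF9F084 from above, with kerndumpbase equal to 0xFFFFFFF007004000, corresponds to the unslid address 0xFFFFFFF007FA3084; with an assumed slide of 0x8000000 the runtime address would be 0xFFFFFFF00FFA3084.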
Conclusion
In this article we analyzed in detail the string-based cross-reference technique that Undecimus uses to locate code and variables in memory. With this technique we can locate variable addresses in the kernel and then, by reading and writing kernel memory, bypass checks and perform injection. The technique is not only key to the jailbreak, it also gives readers some insight into static binary analysis.
References
- [1] The iPhone Wiki: Kernelcache
- Apple: Darwin-XNU
- GitHub/pwn20wndstuff: Undecimus