Introduction: Does hot replacement of Linux kernel functions “collide” with function call conventions?

Linux kernel hot patching repairs a running kernel and is an indispensable means of keeping online services stable. The common hot patching implementations today are kpatch and livepatch. A kernel hot patch fixes functions in the running kernel by replacing the faulty function with a fixed one.

The idea behind function replacement is simple: when the old function is executed, bypass its logic and jump to the new function. A crude way to do this is to overwrite the first instruction of the original function with a jump to the target function, so that every call lands directly in the new function.

So is this a good idea? Overwriting the first instruction of the original function with a jump can break the register context that the original function and its caller agreed on, which is a safety risk. This article explores and verifies that concern.

Safety impact: presenting the problem

Suppose there are two functions, funA and funB, where funA calls funB. funA is the caller and funB is the callee, and both use the same register R, as shown below:

Figure 1. Both funA and funB use register R; funB has modified it by the time funA uses R again

When funA uses R again, it therefore reads the wrong data. The problem goes away if funA saves R before calling funB and restores it after funB returns, or if funB saves the original value of R and restores it before returning.

A single calling convention

Should a register be saved by the caller or by the callee? Linux follows the calling convention of the System V ABI, and on the x86_64 platform there is only one function calling convention. Under it, the caller and the callee are each responsible for saving and restoring a particular set of registers:

  • Caller-save registers : RDI, RSI, RDX, RCX, R8, R9, RAX, R10, R11
  • Callee-save registers : RBX, RBP, R12, R13, R14, R15

Does GCC comply with the calling convention?

Question: when a function is simple and uses only a few registers, does the compiler still need to save the ones that are not used?

Answer: it depends on the compilation options.

GCC has optimization levels such as -O0, -O1, -O2 and so on, whose scope and depth increase with the number. -O0 disables optimization, which implies that the generated code strictly follows the calling convention in the ABI and saves and restores every register involved.

The Linux kernel is built with -O2. At this level GCC may selectively deviate from the calling convention: as in the question above, it skips saving registers that are not used.

When runtime replacement meets the calling convention

GCC can make this optimization because it sees the execution flow of the program from above. When it knows the register allocation of both the callee and the caller, it can optimize aggressively and still safely.

Runtime replacement breaks this assumption: the information GCC had about the callee is most likely wrong after the replacement, and these optimizations can then cause serious problems. Here is a user-space example (x86_64 platform):

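// Compile: gcc test.c -o test -O2 (the kernel is built with -O2)
// Run: ./test
// Input: 4
#include <sys/mman.h>
#include <stdint.h>
#include <string.h>
#include <stdio.h>
#include <math.h>
#include <unistd.h>

#define noinline __attribute__((noinline))

static noinline int c(int x) { return x * x * x; }
static noinline int b(int x) { return x; }
static noinline int newb(int x) { return c(x * 2) * x; }

static noinline int a(int x)
{
    int volatile tmp = b(x); // tmp = 8 ** 3 * 4
    return x + tmp;          // return 4(not 8) + tmp
}

int main(void)
{
    int x;
    if (scanf("%d", &x) != 1)
        return 1;

    /* Make the page holding the first 5 bytes of b writable.
       mprotect needs a page-aligned start address. */
    unsigned long page = (unsigned long)sysconf(_SC_PAGESIZE);
    unsigned long start = (unsigned long)&b & ~(page - 1);
    if (mprotect((void *)start, (unsigned long)&b + 5 - start,
                 PROT_WRITE | PROT_EXEC | PROT_READ)) {
        perror("mprotect");
        return 1;
    }

    /* replace function b with newb: 0xe9 is "jmp rel32", and the 32-bit
       displacement is relative to the end of the 5-byte instruction */
    int32_t rel = (int32_t)((unsigned long)&newb - (unsigned long)&b - 5);
    ((char *)b)[0] = (char)0xe9;
    memcpy((char *)b + 1, &rel, sizeof rel);

    printf("%d", a(x));
    return 0;
}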
  • The program computes a value from the input number. At run time it uses a jump instruction to replace function b with function newb, so that the computation y = x + x becomes y = x + (2x)^3 * x;
  • Compile: gcc test.c -o test -O2 (the kernel also uses -O2);
  • Run: ./test with input 4; the output is 2056;
  • Error: 2056 is wrong. The result should be 2052, and calling newb directly does give 2052.

This example shows that replacing a function with a direct jump goes wrong under -O2 compiler optimization, which calls the safety of the approach into question.

Safety impact: analyzing the problem

In the example above we replaced function b with a jump to function newb, and the calculation went wrong under -O2. To find out why, we need to look closely at how the calls execute. First, the disassembly of the program (command: objdump -d test), focusing on functions a, b and newb:

Figure 2. Disassembly with -O2 optimization

Explanation of the assembly:

main:
-> store the input value 4 in the edi register
-> call function a:
    -> call function b, whose first instruction now jumps to newb:
        -> copy the value of edi into edx
        -> add edi to itself and store the result in edi
        -> call function c:
            -> copy the value of edi into eax
            -> multiply eax by edi and store the result in eax
            -> multiply eax by edi and store the result in eax
            -> return to newb
        -> multiply eax by edx and store the result in eax
        -> return to function a
    -> add edi to eax
    -> return to main

(Note: b itself does not write to the edi register, but its code has been modified to jump to newb)

The data error arises because newb uses the edi register that function a also uses, and newb changes its value to 8. When newb returns, edi still holds 8, and function a carries on using that value. The calculation therefore becomes 8^3 * 4 + 8 = 2056, whereas the correct calculation is 8^3 * 4 + 4 = 2052.

Without optimization (-O0), the output is the correct 2052. The disassembly is as follows:

Figure 3. Disassembly without optimization

The disassembly shows that before function a calls function b, it stores the value of edi on the stack; after the call it reads the value back from the stack and performs the final addition. So the -O2 option optimizes away the save and restore of edi, even though under the calling convention edi is a register the caller should save and restore. As for why the compiler does this, our guess at this point is:

Function a originally calls function b, and the compiler knows that b does not write to the edi register, so the caller a does not save or restore it. What the compiler cannot know is that the code of function b is modified while the program runs, with a jump instruction redirecting it to newb. newb reads and writes edi, and the error follows.

This is a typical case of a caller-save register not being saved, leaving the data corrupted. The kernel is also compiled with -O2, so does the same problem arise with hot replacement of kernel functions? We kept exploring with that question in mind.

Safety impact: exploring the problem

The bug no longer reproduces

We built an example of hot replacement of a kernel function by porting the user-space example above into a kernel module, which modifies the code of the original function and replaces b directly with a jump instruction. After loading the module, however, the result is the correct 2052. The disassembly shows that function a in the kernel saves the edi register:

Figure 4. Disassembly of function a in the kernel

The kernel and the module are both compiled with -O2, yet here function a has not had the save optimized away and still saves the edi register.

At this point we conjectured that using a jump for hot replacement of kernel functions is safe.

The magical -pg option

We wondered whether the kernel was compiled with some other option that kept the problem from reproducing. Sure enough, we found that the -pg option used in the kernel build is what prevents it.

According to the GCC manual, -pg was introduced to support the GNU gprof performance analysis tool. It adds a call to mcount to each function so that analysis work can be done there.

In the kernel, enabling CONFIG_FUNCTION_TRACER turns on the -pg option.

Figure 5. CONFIG_FUNCTION_TRACER enables the -pg option

FUNCTION_TRACER, better known as ftrace, greatly improves runtime debugging of the kernel. Besides -pg, ftrace also requires the -mfentry option, which places the call to mcount at the first instruction of the function; the scripts/recordmcount.pl script is then used to change the call instruction to a NOP instruction. -mfentry is not relevant to the topic of this article, so we do not cover it in detail.

To verify this conclusion, we went back to the user-space example from the previous section and added the -pg option: "gcc test.c -o test -O2 -pg". The result was correct. Its disassembly:

Figure 6. Assembly with the -pg option added

Every function now contains a call to mcount, function a saves edi into ebx, and newb saves the ebx register. Why are registers saved once the call to mcount is added? We suspected that the call to mcount amounts to calling an unknown function (mcount is not defined in the same file), so GCC assumes the unknown code may clobber register data and therefore saves the register context.

To test this, we defined an empty mcount function in a separate file, mcount.c: void mcount() { asm("nop\n"); }. We then added a declaration for mcount in test.c and a call to that function in a:

extern void mcount();

static noinline int a(int x)
{
    int volatile tmp = b(x); // tmp = 8 ** 3 * 4
    mcount();
    return x + tmp;          // return 4(not 8) + tmp
}

Compiled with gcc test.c mcount.c -O2, the program gives the correct result. Its disassembly:

Figure 7. Assembly after adding the call to mcount

When we instead put the mcount function in test.c, the result is incorrect and the disassembly shows no register being saved. This confirms the conjecture:

  • GCC saves registers when a function in a source file (function a in our scenario) calls an unknown function defined in another file (mcount in our scenario).
  • Enabling -pg adds a call to mcount, which in turn adds the register save and restore to the function and so masks the optimization done by -O2.

The mysterious -fipa-ra option: the real force behind the scenes

Further research led us to the -fipa-ra option, which is what actually drives this optimization. The GCC manual describes -fipa-ra as follows:

  • Use caller save registers for allocation if those registers are not used by any called function. In that case it is not necessary to save and restore them around calls. This is only possible if called functions are part of same compilation unit as current function and they are compiled before it. Enabled at levels -O2, -O3, -Os, however the option is disabled if generated code will be instrumented for profiling (-p, or -pg) or if callee’s register usage cannot be known exactly (this happens on targets that do not expose prologues and epilogues in RTL).

In other words, with this option enabled the caller does not need to save caller-save registers that the callee does not use, provided that the callee and the caller are in the same compilation unit and the callee is compiled before the caller. This is what makes the earlier optimization possible. -fipa-ra is enabled at -O2 and above, but it is disabled if -p or -pg is given, or if the compiler cannot tell exactly which registers the callee uses.

This paragraph accounts for most of our earlier conjectures and experiments:

  • -O2 automatically enables -fipa-ra: in our scenario function b does not use the edi register that function a uses, so a is optimized and does not save edi. newb does use edi and modifies it, so once b is replaced by newb the result is wrong.
  • Using -pg with -O2 disables -fipa-ra: when compiled with -pg the result was correct and function a saved edi, which shows that a was not optimized.
  • Functions in different compilation units are not optimized: with -pg removed and mcount called manually in function a, placing mcount in test.c (the same compilation unit as a) gives a different result from placing it in another file, mcount.c (a different compilation unit). In the same compilation unit the result is incorrect and function a does not save the register context. In a different compilation unit the result is correct and function a (the caller) saves the register context, because the compiler cannot determine which registers the callee uses.

notrace: a second blow?

Developers who have used ftrace or worked on the kernel will be familiar with the notrace attribute. Several kernel functions are marked notrace, which attaches the no_instrument_function attribute to the function. For example, the x86 definition is:

#define notrace __attribute__((no_instrument_function))

-pg makes the jump safe. Could the jump then trip over notrace? Fortunately, as we will see next, notrace merely disables instrumentation of a function and does not undermine safety.

The -pg option in the GCC manual gives this explanation:

  • Generate extra code to write profile information suitable for the analysis program prof (for -p) or gprof (for -pg). You must use this option when compiling the source files you want data about, and you must also use it when linking. You can use the function attribute no_instrument_function to suppress profiling of individual functions when compiling with these options.

Does this mean that the register context is no longer protected? In other words, can notrace get around the way -pg masks the -fipa-ra optimization? We added the notrace attribute to function a (since a is the caller), compiled with -pg, and checked the result and the disassembly: the result is correct, and the assembly still saves the register context.

Figure 8. With the notrace attribute added, function a no longer calls mcount

We then added the notrace attribute to all functions; the result is still correct and the register context is still protected. Such a simple check is not enough, though, so we read the GCC source code and found:

Figure 9. -pg disables the -fipa-ra option

Figure 10. GCC checks the -fipa-ra option as it processes each function; if it is false, the function is not optimized

The -fipa-ra optimization is disabled when -pg is used. GCC checks this option for every function, and if it is false the function is not optimized.

Since flag_ipa_ra is a global option rather than a per-function one, notrace has no effect on it. The concern about notrace can therefore be put to rest.

Safety: drawing conclusions

From the exploration and analysis above, together with the official documentation, we conclude that:

  • For hot replacement of kernel functions, jumping directly to the new function with a jump instruction is safe;
  • Argument:
  1. Linux follows the System V ABI, which has one and only one calling convention on x86-64;
  2. GCC's -fipa-ra option optimizes across the calling convention. -O2 enables it automatically, but -pg disables it;
  3. The notrace attribute cannot get around "-pg disables -fipa-ra".

Exploration and validation on ARM64

According to the manual, the ARMv8 ABI uses the general-purpose registers for procedure calls as follows (source: developer.arm.com/documentati…):

Argument registers (X0-X7)

These are used to pass parameters to a function and to return a result. They can be used as scratch registers or as caller-saved register variables that can hold intermediate values within a function, between calls to other functions. The fact that 8 registers are available for passing parameters reduces the need to spill parameters to the stack when compared with AArch32.

Caller-saved temporary registers (X9-X15)

If the caller requires the values in any of these registers to be preserved across a call to another function, the caller must save the affected registers in its own stack frame. They can be modified by the called subroutine without the need to save and restore them before returning to the caller.

Callee-saved registers (X19-X29)

These registers are saved in the callee frame. They can be modified by the called subroutine as long as they are saved and restored before returning.

Registers with a special purpose (X8, X16-X18, X29, X30)

  • X8 is the indirect result register. This is used to pass the address location of an indirect result, for example, where a function returns a large structure.
  • X16 and X17 are IP0 and IP1, intra-procedure-call temporary registers. These can be used by call veneers and similar code, or as temporary registers for intermediate values between subroutine calls. They are corruptible by a function. Veneers are small pieces of code which are automatically inserted by the linker, for example when the branch target is out of range of the branch instruction.
  • X18 is the platform register and is reserved for the use of platform ABIs. This is an additional temporary register on platforms that don’t assign a special meaning to it.
  • X29 is the frame pointer register (FP).
  • X30 is the link register (LR).

Figure 9.1 shows the 64-bit X registers. For more information on registers, see . For information on floating-point parameters, see Floating-point parameters.

Figure 9.1. General-purpose register use in the ABI

As you can see, the ARMv8 ABI clearly specifies the use of registers during function calls.

We repeated the x86-64 exploration and validation on the arm64 platform with the same code and the same test procedure, and reached the same conclusion: on ARM64, replacing a function directly with a jump instruction is also safe.

Discussion of other scenarios

Other languages do not guarantee safety

For C, each architecture and system has a fixed ABI and calling convention, but other languages offer no such guarantee. Rust, for example, does not have a fixed ABI of its own, and defining one is still a topic of community discussion. The rustc compiler may also optimize differently from GCC, so the caller/callee-save register problem may occur there as well.

The true face of Kpatch

kpatch uses ftrace for function replacement. The principle is shown below:

Figure 11. Kpatch uses ftrace to replace the function

The main job of ftrace is to trace functions: it hooks a function at its entry or exit to do extra processing, and those hook functions may clobber the register context of the function being traced. ftrace therefore defines a trampoline that saves and restores the registers (the red box in Figure 11), so that the register context is unchanged on return from the hook function.

For kpatch, the hook is a function inside kpatch whose job is to modify the ip field in regs, that is, to store the address of the new function in the ip field. After the trampoline restores the register context, execution jumps directly to the new function. So for kpatch, the save and restore done by ftrace protects the kpatch function that modifies the ip field, not the new function that replaces the old one.

If the function being fixed is a hot function, the ftrace trampoline can hurt performance. For performance-sensitive scenarios, replacing the function directly with a jump instruction significantly reduces this extra overhead.

About the author

Deng Erwei (Fufeng) joined the kernel R&D team of the Alibaba Cloud operating system group in 2020 and currently works on Linux kernel development.

Wu Yihao (Ding Huan) joined the Alibaba Cloud operating system team in 2017, with experience mainly in resource isolation, hot upgrade, scheduler SLI, and related areas.

Shan-pei Chen is a senior technical expert with interests in architecture, schedulers, virtualization and memory management.

With discussion this lively, it needs an organized home: the Cloud Kernel SIG invites you to join

Cloud Kernel is a customized and optimized kernel product. It implements a number of features and enhancements targeted at cloud infrastructure and products, with the aim of improving the experience of customers both on and off the cloud. Like other Linux kernel products, Cloud Kernel can in theory run on almost any common Linux distribution.

In 2020 the Cloud Kernel project joined the OpenAnolis community. OpenAnolis is an open source operating system community and system software innovation platform, committed to growing the software, hardware and application ecosystem through open community cooperation and to jointly building the technical foundation for cloud computing systems.

This article is original Alibaba Cloud content and may not be reproduced without permission.