Huawei | Rust hardware acceleration instruction practice

Author: Hu Kai

I recently tried implementing some common encryption algorithms using the hardware acceleration instructions in core_arch, the part of the Rust standard library exposed as core::arch. I ran into several implementation problems along the way. I summarize the solutions below in the hope that they help future users of core_arch, starting with a simple problem.

Initial need

Suppose we need to implement a function func for external use, and its implementation differs between architectures. How do we do that? The problem is relatively simple: Rust provides the #[cfg] attribute for conditional compilation, which can distinguish between architectures, operating systems, and so on.

To differentiate between different architectures, we can use target_arch:

// lib.rs file
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
mod x86;
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub use x86::func;

#[cfg(target_arch = "aarch64")]
mod aarch64;
#[cfg(target_arch = "aarch64")]
pub use aarch64::func;

// x86.rs file
pub fn func() {
    // ... the specific implementation
}

// aarch64.rs file
pub fn func() {
    // ... the specific implementation
}

With this approach the module does what we need: users only have to compile it for the corresponding target. The #[cfg] attribute has many other options, such as target_os and target_endian, which users can choose according to their needs.
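As a minimal single-file sketch of the same idea, the hypothetical function arch_name below (it is not part of the original code) has three bodies, and target_arch decides at compile time which one is built, with a catch-all for every other architecture. The cfg! macro evaluates the same kind of predicate as a boolean expression.

```rust
// Hypothetical helper for illustration: exactly one of these three
// definitions is compiled, depending on the target architecture.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn arch_name() -> &'static str {
    "x86"
}

#[cfg(target_arch = "aarch64")]
pub fn arch_name() -> &'static str {
    "aarch64"
}

#[cfg(not(any(
    target_arch = "x86",
    target_arch = "x86_64",
    target_arch = "aarch64"
)))]
pub fn arch_name() -> &'static str {
    "other"
}

fn main() {
    // cfg! expands to `true` or `false` at compile time.
    let little_endian = cfg!(target_endian = "little");
    println!(
        "arch family: {}, little endian: {}",
        arch_name(),
        little_endian
    );
}
```

Because the three predicates are mutually exclusive and together cover every target, the function always exists exactly once, whatever the build target is.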

Going further

Suppose we now want func to use the hardware acceleration instructions of each architecture. What do we do? We use the #[cfg] attribute as before, this time together with target_feature. Take the AES hardware acceleration instructions as an example:

// lib.rs file
#[cfg(all(
    any(target_arch = "x86", target_arch = "x86_64"),
    target_feature = "aes",))]
mod x86;
#[cfg(all(
    any(target_arch = "x86", target_arch = "x86_64"),
    target_feature = "aes",))]
pub use x86::func;

#[cfg(all(
    target_arch = "aarch64",
    target_feature = "aes",))]
mod aarch64;
#[cfg(all(
    target_arch = "aarch64",
    target_feature = "aes",))]
pub use aarch64::func;

// x86.rs file
#[cfg(target_arch = "x86")]
use core::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::*;

pub fn func() {
    // ... concrete implementation, calling the intrinsics in core::arch::x86 / core::arch::x86_64
}

// aarch64.rs file
#[cfg(target_arch = "aarch64")]
use core::arch::aarch64::*;

pub fn func() {
    // ... concrete implementation, calling the intrinsics in core::arch::aarch64
}

This allows you to use the specified hardware acceleration instructions for the corresponding architecture. Here are a few things to note:

At the time of writing, apart from the hardware acceleration instructions for x86/x86_64, all hardware acceleration instructions in core_arch compile and run only on nightly Rust.
target_feature reflects the features enabled for the compilation target, so on some architectures it does not detect whether the machine actually provides the specified feature. In that case, compile in one of the following ways:
```
 $ RUSTFLAGS='-C target-cpu=native' cargo build
```
```
 $ RUSTFLAGS='-C target-feature=+aes' cargo build
```
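To check what the compiler actually sees, a small program like the following sketch can help. It uses only the cfg! macro; the function name aes_enabled_at_compile_time is made up for illustration. On many default targets a plain cargo build will likely report false, while a build with one of the RUSTFLAGS settings above should report true when the target supports AES.

```rust
/// Hypothetical helper: reports whether the `aes` target feature was
/// enabled for this compilation. It says nothing about the CPU that
/// later runs the binary.
pub const fn aes_enabled_at_compile_time() -> bool {
    cfg!(target_feature = "aes")
}

fn main() {
    println!(
        "aes enabled at compile time: {}",
        aes_enabled_at_compile_time()
    );
}
```

The result is fixed when the binary is built, which is why the same source can behave differently depending on the flags passed to the compiler.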

The first problem I encountered

I compiled and ran the code above on different machines, and most of them worked. On a small number of machines, however, compilation failed because func could not be found. The main reason is that those machines do not support the corresponding hardware acceleration instructions: the code above assumes hardware acceleration whenever it sees one of the listed architectures, so when the feature is missing, no module is compiled and nothing exports func. We need to make some changes:

// lib.rs file
#[cfg(all(
    any(target_arch = "x86", target_arch = "x86_64"),
    target_feature = "aes",))]
mod x86;
#[cfg(all(
    any(target_arch = "x86", target_arch = "x86_64"),
    target_feature = "aes",))]
pub use x86::func;

#[cfg(all(
    target_arch = "aarch64",
    target_feature = "aes",))]
mod aarch64;
#[cfg(all(
    target_arch = "aarch64",
    target_feature = "aes",))]
pub use aarch64::func;

// Fall back to the software implementation when neither hardware path applies.
#[cfg(not(any(
    all(
        any(target_arch = "x86", target_arch = "x86_64"),
        target_feature = "aes",
    ),
    all(
        target_arch = "aarch64",
        target_feature = "aes",
    ),
)))]
mod soft;
#[cfg(not(any(
    all(
        any(target_arch = "x86", target_arch = "x86_64"),
        target_feature = "aes",
    ),
    all(
        target_arch = "aarch64",
        target_feature = "aes",
    ),
)))]
pub use soft::func;

// x86.rs file
#[cfg(target_arch = "x86")]
use core::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::*;

pub fn func() {
    // ... concrete implementation, calling the intrinsics in core::arch::x86 / core::arch::x86_64
}

// aarch64.rs file
#[cfg(target_arch = "aarch64")]
use core::arch::aarch64::*;

pub fn func() {
    // ... concrete implementation, calling the intrinsics in core::arch::aarch64
}

// soft.rs file
pub fn func() {
    // ... the concrete implementation: a general one that does not use hardware acceleration
}

A default software implementation of func is now used when the target is not one of the listed architectures, or when it is one of them but hardware acceleration is not enabled.
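The selection logic can be tried out in a single file. The sketch below is only an illustration: the inline modules accel and soft stand in for x86.rs / aarch64.rs and soft.rs, and the function backend is a hypothetical placeholder that just names the chosen implementation. For brevity the two hardware predicates are merged into one, which is equivalent to the any(all(...), all(...)) form used above.

```rust
// Compiled only when a supported architecture has the `aes` feature enabled.
#[cfg(all(
    any(target_arch = "x86", target_arch = "x86_64", target_arch = "aarch64"),
    target_feature = "aes"
))]
mod accel {
    /// Placeholder for a hardware-accelerated implementation.
    pub fn backend() -> &'static str {
        "hardware"
    }
}
#[cfg(all(
    any(target_arch = "x86", target_arch = "x86_64", target_arch = "aarch64"),
    target_feature = "aes"
))]
pub use accel::backend;

// Compiled in every other case: the predicate is the exact negation.
#[cfg(not(all(
    any(target_arch = "x86", target_arch = "x86_64", target_arch = "aarch64"),
    target_feature = "aes"
)))]
mod soft {
    /// Placeholder for the general software implementation.
    pub fn backend() -> &'static str {
        "software"
    }
}
#[cfg(not(all(
    any(target_arch = "x86", target_arch = "x86_64", target_arch = "aarch64"),
    target_feature = "aes"
)))]
pub use soft::backend;

fn main() {
    println!("selected backend: {}", backend());
}
```

Since the fallback predicate is written as the negation of the hardware predicate, exactly one backend is compiled for any target and any set of flags.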

The second problem I encountered

The approach above fixes the compile failures, but a binary compiled on one machine will not run on a machine with a different architecture, or on a machine of the same architecture that lacks the hardware acceleration instructions. Because #[cfg] is resolved statically at compile time, the result depends only on the build configuration. The core_arch library provides a dynamic detection method. Dynamic detection settles whether hardware acceleration is supported within the same architecture, so no separate build is needed there; running on a different architecture still requires cross-compilation.

For example, on x86/x86_64 the standard library provides the is_x86_feature_detected! macro, which detects a specified CPU feature at run time. We can use it to modify the code:

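// lib.rs file
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
mod x86;
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub use x86::func;

// At the time of writing, no dynamic detection mechanism was available for
// aarch64, so every other architecture calls the software implementation.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn func() {
    soft::func()
}

pub(crate) mod soft;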

// x86.rs file
#[cfg(target_arch = "x86")]
use core::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::*;

use crate::soft;

pub fn func() {
    if is_x86_feature_detected!("aes") {
        // ... concrete implementation, calling the intrinsics in core::arch::x86 / core::arch::x86_64
    } else {
        // Fall back to the software implementation when AES is not available.
        soft::func()
    }
}

// soft.rs file
pub(crate) fn func() {
    // ... the concrete implementation: a general one that does not use hardware acceleration
}

In this way, a binary compiled on one x86/x86_64 machine can run on another x86/x86_64 machine without cross-compilation, whether or not that machine supports the AES instructions.
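Here is a runnable sketch of the same dispatch pattern. To keep the software fallback short it deliberately swaps AES for the SSE4.2 CRC32 instruction, so it illustrates the technique rather than the AES code itself. The names crc32c, crc32c_hw and crc32c_soft are hypothetical. The hardware path is marked with #[target_feature(enable = "sse4.2")], so only that function is compiled with the extra instructions, and it is called only after the run-time check succeeds.

```rust
#[cfg(target_arch = "x86")]
use core::arch::x86::_mm_crc32_u8;
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::_mm_crc32_u8;

/// Portable bitwise CRC32C (Castagnoli), used as the software fallback.
fn crc32c_soft(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            crc = if crc & 1 != 0 {
                (crc >> 1) ^ 0x82F6_3B78
            } else {
                crc >> 1
            };
        }
    }
    !crc
}

/// Hardware path: only this function is compiled with SSE4.2 enabled.
/// Calling it is sound only after the feature has been detected.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse4.2")]
unsafe fn crc32c_hw(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc = _mm_crc32_u8(crc, byte);
    }
    !crc
}

/// Public entry point: chooses the implementation at run time.
pub fn crc32c(data: &[u8]) -> u32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("sse4.2") {
            // SAFETY: SSE4.2 support was just confirmed at run time.
            return unsafe { crc32c_hw(data) };
        }
    }
    // Other architectures, or x86 CPUs without SSE4.2.
    crc32c_soft(data)
}

fn main() {
    println!("{:08x}", crc32c(b"123456789"));
}
```

Both paths compute the same checksum, so callers of crc32c never need to know which one ran. The same structure should carry over to AES by replacing the feature name and the two implementations.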

Conclusion

These are some of the ways I have used the Rust core_arch library. For more details, see core_arch_docs.md under core_arch, which describes the whole library in more detail.
