You’ve probably heard that safety is one of Rust’s defining features. Unfortunately, the hardware underneath is not safe, so every “safe” abstraction ultimately has to be built by encapsulating unsafe code. It also means that code which is “safe” in an absolute sense is hard to write and would be extremely limited in what it can do.
So let’s see where the boundary of safe Rust actually lies.
What does Rust not consider “unsafe”?
I’m sure you already know what safe Rust is, so I won’t repeat it here. What matters is that Rust does not rule out some behaviors that we might consider unexpected or even unsafe:
- Deadlocks
- Leaks of memory and other resources
- Exiting without calling destructors
- Exposing randomized base addresses through pointer leaks
- Integer overflow
- Logic errors
The first four are easy to understand, especially memory leaks, which are covered in The Book (just look at std::mem::forget in the standard library, which is not even unsafe). Integer overflow and logic errors deserve a closer look, so they are discussed below.
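As a minimal sketch of the point about leaks, the snippet below skips a destructor and leaks a heap allocation without a single unsafe block. The Noisy type and both function names are hypothetical, made up for this illustration; only std::mem::forget and Box::leak come from the standard library.

```rust
use std::cell::Cell;
use std::mem;
use std::rc::Rc;

/// Hypothetical helper type: bumps a shared counter when its destructor runs.
struct Noisy(Rc<Cell<u32>>);

impl Drop for Noisy {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns how many destructors ran after dropping one value and forgetting another.
fn destructors_run() -> u32 {
    let counter = Rc::new(Cell::new(0));

    let dropped = Noisy(Rc::clone(&counter));
    drop(dropped); // the destructor runs here

    let forgotten = Noisy(Rc::clone(&counter));
    mem::forget(forgotten); // safe: the destructor never runs, the value is leaked

    counter.get()
}

/// Leaks a heap allocation on purpose and hands back a 'static reference to it.
fn leak_a_number() -> &'static mut u32 {
    Box::leak(Box::new(7))
}
```

Calling destructors_run() should report that only one of the two destructors ran, and the reference returned by leak_a_number() stays valid for the rest of the program because the allocation is never freed.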
Integer overflow
If a piece of code contains arithmetic overflow, it’s the programmer’s fault. In the following discussion, we need to distinguish between arithmetic overflow and wrapping arithmetic. The former is an error; the latter is intentional.
When the programmer has enabled debug_assert! assertions (for example, by compiling in debug mode), the implementation must insert dynamic checks that panic if an overflow occurs. Other kinds of builds (such as release mode) may either panic or silently wrap on overflow.
In the case of implicitly wrapped overflow, the implementation must provide well-defined (even if still considered erroneous) results by following the two’s complement overflow convention.
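Because the outcome of a plain + depends on the kind of build, one option is to make overflow handling explicit. The sketch below uses the standard checked_add and overflowing_add methods; the two wrapper function names are only illustrative.

```rust
/// Adds two i32 values, reporting overflow as None instead of panicking or wrapping.
fn add_checked(a: i32, b: i32) -> Option<i32> {
    a.checked_add(b)
}

/// Adds two i32 values and also returns a flag telling whether the result wrapped.
fn add_with_flag(a: i32, b: i32) -> (i32, bool) {
    a.overflowing_add(b)
}
```

Both functions behave the same in debug and release builds, which makes the overflow case something the caller has to handle rather than a surprise.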
The Rust standard library provides methods on integers that let programmers perform wrapping arithmetic explicitly. For example, i32::wrapping_add provides two’s complement, wrapping addition.
The library also provides a Wrapping<T> newtype, which ensures that all standard arithmetic operations on T have wrapping semantics.
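Here is a small sketch of intentional wrapping arithmetic, once with i32::wrapping_add and once with the Wrapping newtype from std::num. The function names are hypothetical.

```rust
use std::num::Wrapping;

/// Wrapping addition requested explicitly through a method call.
fn wrap_explicitly(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

/// The same idea using the newtype: the ordinary + operator now wraps.
fn wrap_with_newtype(a: u8, b: u8) -> u8 {
    let sum = Wrapping(a) + Wrapping(b);
    sum.0
}
```

The newtype is convenient when a whole computation (a hash function, for instance) is meant to wrap, since the intent is stated once in the type rather than at every operation.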
See RFC 560 for the error conditions, the rationale, and more details on integer overflow.
Logic errors
Safe code can impose additional logical constraints that can be checked neither at compile time nor at run time. If a program breaks such a constraint, the behavior may be unspecified, but it will not result in undefined behavior. The outcome could be a panic, an incorrect result, an abort, or an infinite loop, and it may differ between runs, builds, or kinds of build.
For example, implementing both Hash and Eq requires that values considered equal have equal hash values. Another example is data structures such as BinaryHeap, BTreeMap, BTreeSet, HashMap, and HashSet, which place constraints on modifying their keys while those keys are stored in the data structure. Violating such a constraint is not considered unsafe, but the program is erroneous and its behavior is unpredictable.
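To make the key-modification constraint concrete, here is a sketch in entirely safe code. The Key type and the function name are hypothetical; the key is changed through a Cell while it sits inside a HashMap, which is exactly the kind of logic error described above.

```rust
use std::cell::Cell;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical key type whose value can be changed through a shared reference.
#[derive(PartialEq, Eq)]
struct Key(Cell<u32>);

impl Hash for Key {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.0.get().hash(state);
    }
}

/// Returns the map length and whether the entry is still found under its original key.
fn lose_an_entry() -> (usize, bool) {
    let mut map = HashMap::new();
    map.insert(Key(Cell::new(1)), "one");

    // Logic error: the key is modified while it is stored in the map.
    for key in map.keys() {
        key.0.set(2);
    }

    let found_under_old_key = map.contains_key(&Key(Cell::new(1)));
    (map.len(), found_under_old_key)
}
```

The map still holds one entry, yet a lookup with the original key no longer finds it, because no stored key compares equal to 1 any more. Whether a lookup with the new key succeeds is unspecified. Nothing here is undefined behavior; the program is simply wrong.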
What does Rust consider “undefined”?
Undefined behavior is an interesting concept and an old acquaintance of C and C++ programmers; a lot of existing code even relies on it.
Rust code is incorrect if it exhibits any of the behaviors in the following list, and that includes code inside unsafe blocks and unsafe functions. **unsafe only means that avoiding undefined behavior is the programmer’s responsibility; it does not change the requirement that Rust programs must never cause undefined behavior.** In other words, undefined behavior must never occur, whether or not unsafe is used.
When writing unsafe code, it is the programmer’s responsibility to ensure that any safe code interacting with the unsafe code cannot trigger these behaviors. Unsafe code that satisfies this property for every safe caller is said to be sound; unsafe code is unsound if safe code can misuse it to exhibit undefined behavior.
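As a rough illustration of soundness, the hypothetical function below wraps an unchecked slice access in a safe API. It is sound because no safe caller can reach the unsafe block with an out-of-bounds index.

```rust
/// Sound: the emptiness check guarantees index 0 is valid before the unchecked access.
fn first_or_default(values: &[u32]) -> u32 {
    if values.is_empty() {
        0
    } else {
        // SAFETY: the slice is non-empty, so index 0 is in bounds.
        unsafe { *values.get_unchecked(0) }
    }
}

// An unsound variant would drop the is_empty() check: a safe caller could then
// pass an empty slice and trigger an out-of-bounds read, which is undefined behavior.
```

The unsafe block is the same in both variants; what makes the difference is whether the surrounding safe interface upholds the precondition on every possible call.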
Note that the following list is not exhaustive. There is no formal model of Rust’s semantics that defines what is and is not allowed in unsafe code, so more behaviors may turn out to be considered unsafe. The list below contains only the behaviors that are known for certain to be undefined. Read the Rustonomicon before writing unsafe code.
- Data races.
- Evaluating a dereference expression (*expr) on a raw pointer that is dangling or misaligned, even in place expression context (e.g. addr_of!(&*expr)).
- Breaking the pointer aliasing rules. &mut T and &T follow LLVM’s scoped noalias model, unless the &T contains an UnsafeCell<U>.
- Mutating immutable data. All data inside a const item is immutable. In addition, all data reached through a shared reference or owned by an immutable binding is immutable, unless that data is contained within an UnsafeCell<U>.
- Invoking undefined behavior through compiler intrinsics.
- Executing code compiled with platform features that the current platform does not support (see target_feature); this usually results in SIGILL.
- Calling a function with the wrong call ABI, or unwinding from a function with the wrong unwind ABI.
- Producing an invalid value, even in private fields and local variables. A value is “produced” whenever it is assigned to or read from a place, passed to a function/primitive operation, or returned from a function/primitive operation. The following values are invalid:
  - A value in a bool other than false (0) or true (1).
  - A discriminant in an enum that is not included in the type definition.
  - A null fn pointer.
  - A value in a char that is a surrogate or above char::MAX.
  - A value of type ! (all values are invalid for this type).
  - An integer, floating point value, or raw pointer obtained from uninitialized memory, or uninitialized memory in a str.
  - A reference or Box<T> that is dangling, misaligned, or points to an invalid value.
  - Invalid metadata in a wide reference, Box<T>, or raw pointer:
    - dyn Trait metadata is invalid if it does not point to a vtable for Trait that matches the actual dynamic trait the pointer or reference points to.
    - Slice metadata is invalid if the length is not a valid usize (for example, a usize read from uninitialized memory).
  - An invalid value for a type with a custom definition of invalid values, such as NonNull<T> and NonZero* in the standard library.
Note: rustc implements this with the unstable rustc_layout_scalar_valid_range_* attributes.
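In safe code, the usual way to stay clear of invalid values is to go through checked conversions rather than reinterpreting raw bits. The sketch below uses the standard char::from_u32 and NonZeroU32::new; the three wrapper function names are hypothetical.

```rust
use std::num::NonZeroU32;

/// Only 0 and 1 are valid bit patterns for bool, so anything else is rejected.
fn byte_to_bool(byte: u8) -> Option<bool> {
    match byte {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}

/// Surrogates and values above char::MAX are not valid chars.
fn code_point_to_char(code: u32) -> Option<char> {
    char::from_u32(code)
}

/// Zero is the custom invalid value of NonZeroU32.
fn to_non_zero(n: u32) -> Option<NonZeroU32> {
    NonZeroU32::new(n)
}
```

Each function returns None for exactly the inputs that would be invalid values of the target type, so the invalid value is never produced in the first place.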
Note: For any type with a restricted set of valid values, uninitialized memory is also implicitly invalid. In other words, the only places where uninitialized memory may be read are inside unions and in padding (the gaps between the fields/elements of a type).
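When memory really does have to start out uninitialized, std::mem::MaybeUninit is the standard tool for it. The following is a minimal sketch with a hypothetical function name: the value is written first and only then treated as initialized.

```rust
use std::mem::MaybeUninit;

/// Reserves space for a u32, initializes it, and only then reads it.
fn init_then_read() -> u32 {
    let mut slot = MaybeUninit::<u32>::uninit();
    slot.write(42);
    // SAFETY: `slot` was fully initialized by the write above.
    unsafe { slot.assume_init() }
}
```

Calling assume_init before the write would produce a u32 from uninitialized memory, which is one of the invalid values listed above.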
Note: Undefined behavior affects the entire program. For example, calling a C function that exhibits undefined behavior in C means that your entire program contains undefined behavior, which can also affect the Rust code. Conversely, undefined behavior in Rust can adversely affect code in other languages that is executed through FFI calls.
Dangling pointers
A reference/pointer is dangling if it is null, or if not all of the bytes it points to are part of the same allocation (for example, a block of memory returned by malloc). The span of bytes it points to is determined by the pointer value and the size of the pointee type (using size_of_val). As a consequence, if the span is empty, a pointer is dangling exactly when it is null.
Note that slices and strings point to their entire range, so it is important that the length metadata is never too large. In particular, allocations, and therefore slices and strings, cannot be larger than isize::MAX bytes.
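Both points can be observed from safe code. In this sketch (the function names are hypothetical), size_of_val reports the span of bytes a slice reference covers, and Vec::try_reserve refuses a request that would exceed the isize::MAX limit.

```rust
use std::mem::size_of_val;

/// The span a slice reference points to: element size times length, in bytes.
fn span_in_bytes(values: &[u32]) -> usize {
    size_of_val(values)
}

/// Asking for more than isize::MAX bytes is reported as an error, not attempted.
fn oversized_request_is_rejected() -> bool {
    let mut buffer: Vec<u8> = Vec::new();
    buffer.try_reserve(usize::MAX).is_err()
}
```

A slice of four u32 values spans 16 bytes, and an empty slice spans zero bytes, which is the empty-span case mentioned above.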