This is a detailed summary of Rust language learning in five sections: development environment, syntax, properties, memory management, and Unicode.

The Rust Development Environment Guide

1.1 Rust code execution

According to compiler theory, a compiler does not translate source code directly into the target language. It first translates the source into an intermediate representation, which compiler practitioners call "IR", and the back end then translates the IR into the assembly language of the target platform.

Rust code is compiled in the following stages:

  1. The Rust source is tokenized and parsed into an AST (abstract syntax tree).

  2. The AST is lowered to HIR (high-level IR), which makes type checking more convenient for the compiler.

  3. HIR is lowered to MIR (mid-level IR), an intermediate representation whose main goals are:

A) shorter compilation time;

B) shorter execution time;

C) more accurate type checking.

  4. Finally, MIR is translated into LLVM IR, which LLVM compiles into machine code for each target platform.

ø IR: intermediate representation

ø HIR: high-level intermediate representation

ø MIR: mid-level intermediate representation

ø LLVM: originally an acronym for Low Level Virtual Machine.

LLVM is a compiler infrastructure framework written in C++. It is designed to optimize programs written in any programming language at compile time, link time, run time, and idle time.

The IR of each compiler is different, and it reflects the characteristics of that compiler: its algorithms, optimization methods, code generation process, and so on. Studying a compiler's IR is therefore an important way to understand how that compiler works.

Because the IR acts as a bridge between the compiler's front end and back end, porting an LLVM-based toolchain to a new target platform means developing a back end for that platform, and doing so requires a thorough understanding of LLVM IR.

Compared with GCC, LLVM greatly improves the generation efficiency and readability of its intermediate language. LLVM IR sits between C and assembly language: it keeps the readability of a high-level language while still reflecting how the underlying data is operated on and moved, which makes it compact and efficient.

1.1.1 MIR

MIR is an abstract data structure based on the control flow graph (CFG). It represents every possible flow of program execution as a directed graph. Borrow checking performed on MIR is therefore not tied to lexical scopes, which is why it is called non-lexical lifetimes (NLL).

MIR consists of the following key components:

  • Basic block (BB): the basic unit of the control flow graph. Each basic block consists of:

ø Statements

ø A terminator

  • Locals: values that occupy a memory location, such as function parameters, local variables, and so on.
  • Place: an expression that identifies a location in memory.
  • Rvalue: an expression that produces a value.

See pages 158 and 159 of the Rust Tao of Programming for details of how this works.

MIR code can be generated at play.rust-lang.org.

1.2 Rust installation

ø Method 1: See the official Installation section of the Rust documentation.

The command it runs is essentially: curl https://sh.rustup.rs -sSf | sh

ø Method 2: Download the offline installation package and install it. For details, see the Other Rust Installation Methods section.

1.3 Rust compile & run

1.3.1 Cargo package management

Cargo is Rust's package management tool. A third-party package is called a crate.

Cargo does four things:

  • Uses two metadata files to record project information
  • Fetches and builds the project's dependencies
  • Invokes rustc or other build tools with the correct arguments to build the project
  • Recommends a standardized workflow for Rust ecosystem development

Cargo files:

  • Cargo.lock: records only the exact details of the dependency packages; it is maintained automatically by Cargo, not by the developer
  • Cargo.toml: describes the information required for the project, including third-party package dependencies

Cargo builds in debug mode by default, in which the compiler performs no optimizations. You can pass the --release flag to build in release mode. In release mode the compiler optimizes the code, which makes compilation slower but the resulting program faster.

The official compiler, rustc, compiles Rust source code into executables or other artifacts (.a, .so, .lib, etc.). For example: rustc box.rs

Rust also provides Cargo, a package manager, to manage the entire workflow. For example:

  • cargo new first_pro_create: creates a project named first_pro_create
  • cargo new --lib first_lib_create: creates a library project named first_lib_create
  • cargo doc
  • cargo doc --open
  • cargo test
  • cargo test -- --test-threads=1
  • cargo build
  • cargo build --release
  • cargo run
  • cargo install --path
  • cargo uninstall first_pro_create
  • cargo new --bin use_regex

1.3.2 Using third-party packages

To use a third-party package, add it under [dependencies] in Cargo.toml.

Then, in src/main.rs and/or src/lib.rs, import the package with an extern crate declaration.

For example, a dependency listed as linked-list in Cargo.toml is imported with the declaration extern crate linked_list;

Note that extern crate declares the package name as linked_list, using the underscore "_" instead of the hyphen "-" used in Cargo.toml. Cargo converts hyphens to underscores by default.

Rust also discourages package names with a "-rs" or "_rs" suffix; according to the book, such suffixes are forcibly removed.

See page 323 of Rust The Tao of Programming.

1.4 Rust Common Commands

1.5 Rust naming conventions

ø Functions: snake_case, e.g. func_name()

ø File names: snake_case, such as file_name.rs, main.rs

ø Temporary variable names: snake_case

ø Global variable names: all uppercase (SCREAMING_SNAKE_CASE)

ø Struct types: upper camel case, e.g. struct FirstName { name: String }

ø Enum types: upper camel case.

ø Associated constants: constant names must be all uppercase. What is an associated constant? See page 221 of the Rust Tao of Programming.

ø Cargo converts a hyphen "-" to an underscore "_" by default.

ø Rust discourages package names with a "-rs" or "_rs" suffix; according to the book, such suffixes are forcibly removed.

Rust syntax

2.1 Questions & Conclusions

2.1.1 Copy semantics && Move semantics (Move semantics must transfer ownership)

As types grow richer, the simple split into value types and reference types can no longer describe the whole picture, so two further notions are introduced:

ø Value semantics

After a copy, the storage of the two data objects is independent of each other.

The basic primitive types all have value semantics; these types are also known as POD (Plain Old Data). All POD types have value semantics, but not every value-semantic type is a POD type.

A primitive type with value semantics is copied bitwise by the compiler when it is assigned as an rvalue.

ø Reference semantics

After a copy, the two data objects alias each other. Manipulating one of them affects the other.

The smart pointer Box<T> encapsulates a raw pointer and is a typical reference type. Box<T> does not implement Copy, which means Rust marks it as a reference-semantic type for which bitwise copying is prohibited.

Reference-semantic types cannot implement Copy, but they can implement the clone method of the Clone trait to provide a deep copy.

In Rust, you can tell whether a data type has value semantics or reference semantics by whether it implements the Copy trait. To be more precise, Rust introduces two new terms: copy semantics and move semantics.

ø Copy semantics: corresponds to value semantics. A type that implements Copy is safe to copy bitwise.

ø Move semantics: corresponds to reference semantics. Bitwise copying is not allowed; ownership can only be moved.

2.1.2 Which types implement Copy

ø Structs: Copy is not implemented automatically, even when all members are copy-semantic types.

ø Enums: Copy is not implemented automatically, even when all members are copy-semantic types.

For both structs and enums:

  1. When all members are copy-semantic types, add the attribute #[derive(Debug, Copy, Clone)] to implement Copy.

  2. If any member is a move-semantic type, Copy cannot be implemented.

ø Tuple types: implement Copy themselves. If all elements are copy-semantic types, the tuple is copied bitwise by default; otherwise it has move semantics.

ø String literals (&str): bitwise copy is supported. For example, in let c = "hello"; the binding c is a string literal.
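The rules above can be sketched as follows. This is a minimal illustration, not taken from the book; the types Point and Person and the helper functions are hypothetical names introduced only for this example.

```rust
// Hypothetical types used only for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

#[derive(Debug, Clone, PartialEq)]
struct Person {
    name: String, // String has move semantics, so Person cannot derive Copy
}

/// Returns the original and its bitwise copy to show that both stay usable.
fn copy_point(p: Point) -> (Point, Point) {
    let q = p; // bitwise copy; `p` is still valid afterwards
    (p, q)
}

/// Clones explicitly, because a plain assignment would move the value.
fn clone_person(person: &Person) -> Person {
    person.clone() // deep copy: allocates a new heap buffer for `name`
}

fn main() {
    let (p, q) = copy_point(Point { x: 1, y: 2 });
    println!("{:?} {:?}", p, q);

    let t1 = (1, 2.5); // all elements are Copy, so the tuple is Copy
    let t2 = t1;
    println!("{:?} {:?}", t1, t2);

    let a = Person { name: "hello".to_string() };
    let b = clone_person(&a);
    let c = a; // move: `a` can no longer be used after this line
    println!("{:?} {:?}", b, c);
}
```

Uncommenting a use of `a` after the move would make the compiler reject the program, which is the practical difference between the two semantics.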

2.1.3 Which types do not implement Copy

ø String objects (String): to_string() converts a string literal into a String object.

2.1.4 Which types implement the Copy trait

ø Native integer types

For Copy types, the clone method simply performs a bitwise copy.

2.1.5 Which types do not implement the Copy trait

ø Box<T>

What does implementing the Copy trait do?

A type that implements the Copy trait has copy semantics: assigning it or passing it to a function performs a bitwise copy by default.

ø For types that can be copied safely on the stack, a bitwise copy is all that is needed, which also keeps memory management simple.

ø For data that can only be stored on the heap, a deep copy is required. A deep copy has to allocate new space on the heap, which incurs additional performance overhead.

2.1.6 Which ones are on the stack? Which ones are on the heap?

2.1.7 The let binding

ø Bindings declared with let are immutable by default.

ø To make a binding mutable, declare it with mut.

2.2 Data Types

Data types in many programming languages fall into two categories:

ø Value types

Generally, a value type is data that is stored directly in its own location. Numbers, booleans, structs, and so on are all value types.

The value types are:

  • Native types
  • Structs
  • Enums

ø Reference types

A reference type holds a pointer to the actual storage area. For example, some reference types store their data on the heap, and the stack only holds the address (pointer) of that data.

The reference types are:

  • Ordinary reference types
  • Native pointer types
2.2.1 Basic Data types

Boolean type

bool has only two values: true and false.

Basic numeric types

Pay attention to the value range of each type, as described on page 26 of Rust The Tao of Programming.

Character type

Use single quotes to define a character (char). A char represents a Unicode scalar value and occupies 4 bytes.

Array type

The type signature of an array is [T; N]. T is a generic placeholder for the concrete element type. N is the length of the array, and its value must be known at compile time.

Array features:

  • Fixed size
  • All elements have the same type
  • Immutable by default

Slice type

A slice is a reference to a contiguous part of an array. Under the hood, a slice is a pointer to the first element of that part plus its length. With [T] denoting a contiguous sequence, the slice types are &[T] and &mut [T].

See page 30 of the Rust Tao of Programming.
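As a small sketch of the array and slice types described above (the function names sum and double_all are hypothetical, chosen for this example):

```rust
/// Sums any contiguous sequence of i32 values passed as a shared slice.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Doubles every element in place through a mutable slice.
fn double_all(values: &mut [i32]) {
    for v in values.iter_mut() {
        *v *= 2;
    }
}

fn main() {
    // The length 5 is part of the array type [i32; 5].
    let mut arr: [i32; 5] = [1, 2, 3, 4, 5];

    // &[i32]: borrows the elements at indexes 1, 2 and 3.
    let middle: &[i32] = &arr[1..4];
    println!("len = {}, sum = {}", middle.len(), sum(middle));

    // &mut [i32]: a mutable slice over the first two elements.
    double_all(&mut arr[..2]);
    println!("{:?}", arr);
}
```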

str string type

The string type str usually appears in its immutably borrowed form, &str (string slice).

Rust divides strings into two types:

1) &str: a fixed-length string

2) String: a string whose length can change freely.

A &str consists of two parts:

  1. A pointer to the string's byte sequence;

  2. The length of that sequence.

The &str itself is stored on the stack, while the str byte sequence is stored in the program's static read-only data segment or in heap memory.

&str is a fat pointer.

Never type

The never type is written !. It represents a computation that never returns a value.

Others (not basic data types)

This part does not belong to the basic data types, but is placed here for now because of formatting problems.

Fat pointer

Fat pointer: a pointer that carries both the address of a dynamically sized type (DST) value and its length.

See page 54 of the Rust Tao of Programming.
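One way to observe the extra length word of a fat pointer is to compare pointer sizes with std::mem::size_of. This is only a sketch; the helper names thin_size and fat_size are hypothetical.

```rust
use std::mem::size_of;

/// Size in bytes of a thin pointer to a sized type.
fn thin_size() -> usize {
    size_of::<&u8>()
}

/// Size in bytes of a fat pointer to the dynamically sized type `str`.
fn fat_size() -> usize {
    size_of::<&str>()
}

fn main() {
    // A fat pointer stores an address plus a length, so it is twice as wide.
    println!("&u8: {} bytes, &str: {} bytes", thin_size(), fat_size());
}
```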

Zero-sized types

Zero-sized types (ZST) have the property that a value is the type itself, and they take up no memory at run time.

The unit type and unit structs have size zero, and so does an array made up of unit values.

A ZST stands for "empty".
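The zero sizes mentioned above can be checked directly. In this sketch, Marker is a hypothetical unit struct and zst_sizes is a hypothetical helper.

```rust
use std::mem::size_of;

// Hypothetical unit struct used only for this check.
struct Marker;

/// Sizes of the unit type, a unit struct, and an array of unit values.
fn zst_sizes() -> (usize, usize, usize) {
    (size_of::<()>(), size_of::<Marker>(), size_of::<[(); 10]>())
}

fn main() {
    let _m = Marker; // an instance exists, yet occupies no memory
    println!("{:?}", zst_sizes());
}
```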

Bottom type

The bottom type is the never type introduced earlier, written with an exclamation mark (!). Its features are:

  • It has no values
  • It is a subtype of any other type

If a ZST means "empty", the bottom type means "none".

The bottom type has no values, and it can stand in for any type.

See page 57 of the Rust Tao of Programming.

2.2.2 Compound data types

Rust provides four compound data types:

  • Tuple
  • Struct
  • Enum
  • Union

Tuples

A tuple is a heterogeneous, finite sequence of the form (T, U, M, N). Heterogeneous means that the elements of a tuple can have different types. Finite means that a tuple has a fixed length.

  • Empty tuple: ()
  • A tuple with a single value needs a trailing comma: (0,)

Structs

Rust provides three kinds of structs:

  • Named-field struct
  • Tuple struct
  • Unit struct

For example:

ø Named-field struct:

struct People { name: &'static str, }

ø Tuple struct: the fields have no names, only types:

struct Color(i32, i32, i32);

When a tuple struct has only one field, it is called the newtype pattern. For example:

struct Integer(u32);

ø Unit struct: a struct without any fields. An instance of a unit struct is the struct itself.

struct Empty;

Struct update syntax

The struct update syntax (..) creates a new instance from another instance. It is useful when the new instance reuses most of the values of the old one. For example:

#[derive(Debug, Copy, Clone)]
struct Book<'a> {
    name: &'a str,
    isbn: i32,
    version: i32,
}
let book = Book { name: "The Rust Way of Programming", isbn: 20181212, version: 1 };
let book2 = Book { version: 2, ..book };

Note:

  • A struct cannot implement Copy if any of its fields has move semantics.
  • For example, Rust does not allow Copy on a struct that contains a String field.
  • For fields with move semantics, the update syntax transfers ownership of those fields.
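A runnable version of the Book example might look like the sketch below. The helper next_edition is a hypothetical name added for illustration; because Book derives Copy, the original instance stays usable after the update.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Book<'a> {
    name: &'a str,
    isbn: i32,
    version: i32,
}

/// Builds a new edition that reuses every field except `version`.
fn next_edition<'a>(book: Book<'a>) -> Book<'a> {
    Book { version: book.version + 1, ..book }
}

fn main() {
    let book = Book { name: "The Rust Way of Programming", isbn: 20181212, version: 1 };
    let book2 = next_edition(book);
    // `book` is still usable here because every field is Copy.
    println!("{:?}", book);
    println!("{:?}", book2);
}
```

If name were a String instead of a &str, the struct could not derive Copy, and ..book would move that field out of the old instance.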

Enums

An enum type lists all possible cases and so effectively prevents users from supplying invalid values. For example:

enum Number {
    Zero,
    One,
}

Rust also supports enum variants that carry data. The constructors of such variants are essentially function types, which can be converted to function pointer types by explicitly specifying the type. For example:

enum IpAddr {
    V4(u8, u8, u8, u8),
    V6(String),
}

Enums are one of the most important types in Rust. For example, Option<T> is an enum type.
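The claim that data-carrying variants behave like functions can be sketched as follows, reusing the IpAddr enum from above. The helper wrap_all is a hypothetical name.

```rust
#[derive(Debug, PartialEq)]
enum IpAddr {
    V4(u8, u8, u8, u8),
    V6(String),
}

/// Passes the variant constructor `IpAddr::V6` to `map` like an ordinary function.
fn wrap_all(inputs: Vec<String>) -> Vec<IpAddr> {
    inputs.into_iter().map(IpAddr::V6).collect()
}

fn main() {
    // With an explicit type, the constructor coerces to a function pointer.
    let make_v4: fn(u8, u8, u8, u8) -> IpAddr = IpAddr::V4;
    println!("{:?}", make_v4(127, 0, 0, 1));
    println!("{:?}", wrap_all(vec!["::1".to_string()]));
}
```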

Unions

2.2.3 Common Collection types

The common collection types of the Rust standard library (see the std::collections module) fall into four categories:

  • Linear sequences: vector (Vec), double-ended queue (VecDeque), linked list (LinkedList)
  • Key-value maps: unordered hash map (HashMap), ordered map (BTreeMap)
  • Sets: unordered set (HashSet), ordered set (BTreeSet)
  • Priority queue: binary heap (BinaryHeap)

See pages 38 and 271 of the Rust Tao of Programming.

Linear sequence: vector

A vector is also an array; it differs from the array basic data type in that it can grow dynamically.

Example:

let mut v1: Vec<i32> = vec![];
let mut v2 = vec![0, 10];
let mut v3: Vec<i32> = Vec::new();

vec! is a macro that creates vector literals.

Linear sequence: double-ended queue

A double-ended queue (deque) is a data structure that combines the properties of a queue (first in, first out) and a stack (last in, first out).

Elements of a deque can be pushed and popped at both ends; insertion and removal are restricted to the two ends.

Example:

use std::collections::VecDeque;
let mut buf = VecDeque::new();
buf.push_front(1);
buf.get(0);
buf.push_back(2);

Linear sequence: linked list

The linked list provided by Rust is a doubly linked list, which allows elements to be inserted or popped at either end. In most cases it is better to use Vec or VecDeque, which are faster and access memory more efficiently than a linked list.

Example:

use std::collections::LinkedList;
let mut list = LinkedList::new();
let mut list2 = LinkedList::new(); // second list, defined so that append has an argument
list.push_front('a');
list2.push_back('c');
list.append(&mut list2);
list.push_back('b');

Key-value maps: HashMap and BTreeMap

  • HashMap<K, V>: unordered
  • BTreeMap<K, V>: ordered

HashMap requires the key to be a hashable type, and BTreeMap keys must be orderable.

The value must be a type whose size is known at compile time.

Example:

use std::collections::BTreeMap;
use std::collections::HashMap;
let mut hmap = HashMap::new();
let mut bmap = BTreeMap::new();
hmap.insert(1, "a");
bmap.insert(1, "a");

Sets: HashSet and BTreeSet

HashSet<K> and BTreeSet<K> are just HashMap<K, V> and BTreeMap<K, V> with the value type V set to the empty tuple ().

  • The elements of a set are unique.
  • The elements of a HashSet<K> must be hashable, and the elements of a BTreeSet<K> must be orderable.
  • A HashSet<K> is unordered; a BTreeSet<K> is ordered.

Example:

use std::collections::BTreeSet;
use std::collections::HashSet;
let mut hset = HashSet::new();
let mut bset = BTreeSet::new();
hset.insert("This is a hset.");
bset.insert("This is a bset");

Priority queue: BinaryHeap

The priority queue provided by Rust is implemented as a binary heap.

Example:

use std::collections::BinaryHeap;
let mut heap = BinaryHeap::new();
heap.push(98);
heap.peek(); // returns a reference to the largest element without removing it

Capacity and size (len)

For all of these collection types, whether Vec or HashMap, the most important thing is to understand the difference between capacity and size.

Capacity is the amount of memory allocated for the collection, counted in elements.

Size (len) is the number of elements the collection currently contains.
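The difference between the two can be seen with a Vec. This is a minimal sketch; len_and_capacity is a hypothetical helper, and the exact capacity after growth is an implementation detail, so the example only relies on capacity being at least as large as what was reserved or stored.

```rust
/// Reserves room for `reserved` elements, pushes `pushed` values,
/// and reports (len, capacity).
fn len_and_capacity(reserved: usize, pushed: usize) -> (usize, usize) {
    let mut v: Vec<usize> = Vec::with_capacity(reserved);
    for i in 0..pushed {
        v.push(i); // reallocates only when len would exceed capacity
    }
    (v.len(), v.capacity())
}

fn main() {
    let (len, cap) = len_and_capacity(10, 3);
    println!("len = {}, capacity = {}", len, cap);
}
```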

2.2.4 Rust strings

Rust strings fall into the following types:

  • str: a fixed-length string
  • String: a growable string
  • CStr: a string allocated by C and borrowed by Rust. It is used to interact with the C language.
  • CString: a C string allocated by Rust that can be passed to C functions; also used to interact with the C language.
  • OsStr: an operating-system string. It exists for compatibility with platforms such as Windows.
  • OsString: the mutable version of OsStr. It can be converted to and from Rust strings.
  • Path: a file-system path, defined in the std::path module. Path wraps OsStr.
  • PathBuf: the counterpart of Path and its mutable version. PathBuf wraps OsString.

str is a dynamically sized type (DST) whose size cannot be determined at compile time, so the type that appears most often in programs is the string slice &str.

&str represents an immutable UTF-8 byte sequence that cannot be appended to or changed once created. The data behind a &str can live anywhere:

ø Static storage

ø Heap allocation

ø Stack allocation

See page 249 in Rust The Tao of Programming.

The String type is essentially a struct whose member is a Vec<u8>, so it stores its character content on the heap.

A String consists of three parts:

ø A pointer to the byte sequence on the heap (as_ptr method)

ø The length of the byte sequence on the heap (len method)

ø The capacity allocated on the heap (capacity method)

2.2.4.1 Accessing strings

Strings in Rust cannot be indexed by position. The bytes and chars methods return iterators over the bytes and the characters, respectively.

Rust also provides two methods, get and get_mut, that return a slice of the string for a given index range.

See page 251 of the Rust Tao of Programming.
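The following sketch shows why positional indexing is not offered: in UTF-8 a character may span several bytes. The helper byte_and_char_len is a hypothetical name.

```rust
/// Returns the number of bytes and the number of characters in a string.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.bytes().count(), s.chars().count())
}

fn main() {
    let s = "héllo"; // 'é' occupies two bytes in UTF-8
    println!("{:?}", byte_and_char_len(s));

    // `get` takes a byte range and returns None, instead of panicking,
    // when the range does not fall on character boundaries.
    println!("{:?}", s.get(0..1));
    println!("{:?}", s.get(0..2)); // byte index 2 is inside 'é'
    println!("{:?}", s.get(0..3));
}
```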

2.2.4.2 Modifying strings

ø Append: push and push_str, plus the extend method for iterators

ø Insert: insert and insert_str

ø Concatenate: String implements the Add<&str> and AddAssign<&str> traits, so strings can be concatenated with "+" and "+="

ø Update: through iterators or some unsafe methods

ø Delete: remove, pop, truncate, clear, and drain

See page 255 in Rust The Tao of Programming.
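A short walk through several of these methods, as a sketch (the function name build is hypothetical; the comments trace the string after each call):

```rust
/// Applies a sequence of modifications and returns the final string.
fn build() -> String {
    let mut s = String::from("Rust");
    s.push('!');          // append a char          -> "Rust!"
    s.push_str(" rocks"); // append a &str          -> "Rust! rocks"
    s.insert(0, '>');     // insert a char          -> ">Rust! rocks"
    s.insert_str(1, " "); // insert a &str          -> "> Rust! rocks"
    s += "!!";            // AddAssign<&str>        -> "> Rust! rocks!!"
    s.pop();              // remove the last char   -> "> Rust! rocks!"
    s.remove(0);          // remove char at byte 0  -> " Rust! rocks!"
    s.truncate(6);        // keep the first 6 bytes -> " Rust!"
    s
}

fn main() {
    println!("[{}]", build());

    // Add<&str>: the left operand is moved, the right one is borrowed.
    let joined = String::from("a") + "b";
    println!("{}", joined);
}
```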

2.2.4.3 Searching for strings

Rust provides a total of 20 methods covering the following string matching operations:

ø Existence check

ø Position matching

ø Splitting strings

ø Capturing matches

ø Removing matches

ø Replacing matches

See page 256 of the Rust Tao of Programming.

2.2.4.4 Type Conversion

ø parse: converts a string to a specified type

ø format! macro: converts other types into strings
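Both directions of conversion can be sketched like this. The functions parse_port and describe are hypothetical helpers; parse returns a Result, which is turned into an Option here for brevity.

```rust
/// Converts text to a u16, returning None when it is not a valid number
/// or does not fit into the type.
fn parse_port(s: &str) -> Option<u16> {
    s.trim().parse::<u16>().ok()
}

/// Converts a number back into a string with format!.
fn describe(port: u16) -> String {
    format!("port {}", port)
}

fn main() {
    match parse_port(" 8080 ") {
        Some(p) => println!("{}", describe(p)),
        None => println!("not a valid port"),
    }
}
```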

2.2.5 Formatting rules

  • Width: {:5} pads the string to a width of 5

  • Truncation: {:.5} keeps the first 5 characters of a string

  • Alignment: {:<}, {:^}, {:>} align left, center, and right, respectively

  • Fill: {:*^5} fills with * instead of the default space

  • Sign +: forces the sign of a number to be printed

  • Flag #: prints the base prefix, for example 0x for hexadecimal

  • Flag 0: pads with the digit 0 instead of the default space

  • {:x}: hexadecimal output

  • {:b}: binary output

  • {:.5}: for floating-point numbers, prints 5 digits after the decimal point

  • {:e}: scientific notation

See page 265 of the Rust Tao of Programming.
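The rules above can be tried out with a small table of format specifications. This is a sketch; samples is a hypothetical helper that pairs each specification with the string it produces.

```rust
/// Returns (format spec, formatted result) pairs for the formatting rules.
fn samples() -> Vec<(&'static str, String)> {
    vec![
        ("{:5}", format!("{:5}", "ab")),         // width, strings align left
        ("{:.3}", format!("{:.3}", "abcdef")),   // truncation
        ("{:>5}", format!("{:>5}", "ab")),       // right alignment
        ("{:*^5}", format!("{:*^5}", "a")),      // centered, filled with *
        ("{:+}", format!("{:+}", 5)),            // forced sign
        ("{:#x}", format!("{:#x}", 255)),        // hexadecimal with prefix
        ("{:05}", format!("{:05}", 42)),         // zero padding
        ("{:b}", format!("{:b}", 5)),            // binary
        ("{:.2}", format!("{:.2}", 1.23456)),    // digits after the decimal point
        ("{:e}", format!("{:e}", 1234.5)),       // scientific notation
    ]
}

fn main() {
    for (spec, out) in samples() {
        println!("{:<8} -> [{}]", spec, out);
    }
}
```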

2.2.6 Raw string literal syntax: r"..."

The raw string literal syntax (r"...") preserves special characters in the string exactly as written.

See page 270 of the Rust Tao of Programming.

2.2.7 Global types

Rust supports two global types:

  • Constants
  • Static variables

The differences between them:

  • Both are evaluated at compile time, so neither can store a type that requires dynamic memory allocation
  • Ordinary constants can be inlined; they have no fixed memory address and are immutable
  • Static variables cannot be inlined; they have a precise memory address and the static lifetime
  • Static variables can achieve interior mutability by containing a container such as UnsafeCell
  • Static variables have other limitations, as described on page 326 of Rust The Tao of Programming
  • Ordinary constants cannot refer to static variables

Use a static variable when the stored data is large, when its address needs to be referenced, or when it needs to be mutable. Otherwise, prefer an ordinary constant.
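A minimal sketch of the two global types follows. All names (MAX_USERS, GREETING, COUNTER, next_id) are hypothetical. An atomic type is used for the mutable static because it provides interior mutability safely; this is one possible choice, not necessarily the one the book uses.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Ordinary constant: may be inlined wherever it is used.
const MAX_USERS: usize = 100;

// Static variable: a single fixed address with the 'static lifetime.
static GREETING: &str = "hello";

// Static with interior mutability, so no `unsafe` is needed to update it.
static COUNTER: AtomicUsize = AtomicUsize::new(0);

/// Returns the current counter value and increments it.
fn next_id() -> usize {
    COUNTER.fetch_add(1, Ordering::SeqCst)
}

fn main() {
    let first = next_id();
    let second = next_id();
    println!("{} (max {} users), ids {} and {}", GREETING, MAX_USERS, first, second);
}
```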

Some cases are not covered by either global type, such as a global HashMap. For these, the lazy_static package is recommended: it defers the initialization of a global static variable from compile time to run time.

2.3 Traits

A trait is an abstraction of a type's behavior. Traits are the cornerstone of Rust's zero-cost abstractions and have the following properties:

  • Traits are Rust's only interface abstraction;
  • They can be dispatched statically or dynamically;
  • They can be used as markers to tag types that have some specific behavior.

Example:

struct Duck;
struct Pig;
trait Fly { fn fly(&self) -> bool; }
impl Fly for Duck { fn fly(&self) -> bool { return true; } }
impl Fly for Pig { fn fly(&self) -> bool { return false; } }

A detailed introduction to static and dynamic dispatch can be found on page 46 of the Rust Tao of Programming.
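Building on the Duck and Pig example, the two kinds of dispatch might be sketched as follows. The function names fly_static and fly_dyn are hypothetical.

```rust
trait Fly {
    fn fly(&self) -> bool;
}

struct Duck;
struct Pig;

impl Fly for Duck {
    fn fly(&self) -> bool {
        true
    }
}

impl Fly for Pig {
    fn fly(&self) -> bool {
        false
    }
}

/// Static dispatch: the compiler generates one copy per concrete type T.
fn fly_static<T: Fly>(animal: &T) -> bool {
    animal.fly()
}

/// Dynamic dispatch: the method is looked up through a trait object at run time.
fn fly_dyn(animal: &dyn Fly) -> bool {
    animal.fly()
}

fn main() {
    println!("{} {}", fly_static(&Duck), fly_static(&Pig));

    // Trait objects allow different concrete types in one collection.
    let animals: Vec<Box<dyn Fly>> = vec![Box::new(Duck), Box::new(Pig)];
    for a in &animals {
        println!("{}", fly_dyn(a.as_ref()));
    }
}
```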

Trait bounds

The following topics need further study and summary based on Chapter 3. They will be added later.

Trait objects

Marker traits

Copy trait

Deref (dereferencing)

The as operator

From and Into

2.4 Pointers

2.3.1 References

Created with the & and &mut operators. Subject to Rust's safety checking rules.

A reference is a kind of pointer semantics provided by Rust and is implemented on top of pointers. The difference between a reference and a pointer is that a pointer stores the memory address it points to, while a reference can be regarded as an alias for a block of memory.

In the ownership system, a reference &x can also be called a borrow of x. Ownership is borrowed with the & operator.

2.3.2 Raw pointers

*const T and *mut T. They can be used freely inside an unsafe block and are not restricted by Rust's safety checking rules.
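A small sketch of raw pointer use (the function bump is a hypothetical name). Creating a raw pointer is safe; only dereferencing it requires an unsafe block, and the programmer is then responsible for its validity.

```rust
/// Increments the value through a raw pointer and returns the new value.
fn bump(value: &mut i32) -> i32 {
    let p_mut: *mut i32 = value;           // creating a raw pointer is safe
    let p_const: *const i32 = p_mut as *const i32;

    // SAFETY: both pointers come from a valid, exclusive reference that
    // outlives this block, and nothing else accesses the value meanwhile.
    unsafe {
        *p_mut += 1;
        *p_const
    }
}

fn main() {
    let mut x = 41;
    println!("{}", bump(&mut x));
}
```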

2.3.3 Smart pointers

A smart pointer is actually a struct that behaves like a pointer. It wraps a pointer and provides additional functionality, such as automatically freeing heap memory.

Smart pointers differ from ordinary structs in that they implement the Deref and Drop traits.

ø Deref: provides dereferencing

ø Drop: provides automatic destruction

2.3.3.1 Which types are smart pointers

A smart pointer owns its resource, whereas an ordinary reference only borrows it.

Values in Rust are allocated on the stack by default. A value can be boxed (allocated on the heap) with Box<T>.

ø String

ø Vec

Both String and Vec allocate their values on the heap and hold a pointer to them. They implement Deref and Drop by wrapping that pointer.

ø Box<T>

Box<T> is a smart pointer to a heap-allocated value of type T. When a Box<T> goes out of scope, its destructor is called to destroy the inner object and automatically free the heap memory.

ø Arc<T>

An atomically reference-counted pointer; the thread-safe counterpart of Rc<T>.

ø Rc<T>

A single-threaded reference-counted pointer; it is not thread-safe.

It lets multiple variables share ownership, and the count is incremented each time ownership is shared. See page 149 of the Rust Tao of Programming.

ø Weak<T>

Another version of Rc<T>.

A reference whose ownership is shared through the clone method is called a strong reference; Rc<T> is a strong reference.

Pointers shared through Weak<T> have no ownership and are weak references. See page 150 of the Rust Tao of Programming.
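The strong and weak counts can be observed directly. In this sketch, counts and upgrade_after_drop are hypothetical helper names.

```rust
use std::rc::{Rc, Weak};

/// Returns (strong count, weak count) after sharing one value twice.
fn counts() -> (usize, usize) {
    let a = Rc::new(String::from("shared"));
    let _b = Rc::clone(&a);                   // strong reference: shares ownership
    let _w: Weak<String> = Rc::downgrade(&a); // weak reference: no ownership
    (Rc::strong_count(&a), Rc::weak_count(&a))
}

/// Shows that a weak reference does not keep the value alive.
fn upgrade_after_drop() -> bool {
    let a = Rc::new(5);
    let w = Rc::downgrade(&a);
    drop(a); // the last strong reference is gone, so the value is freed
    w.upgrade().is_some()
}

fn main() {
    println!("{:?}", counts());
    println!("{}", upgrade_after_drop());
}
```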

ø Cell<T>

Implements field-level interior mutability.

Suitable for copy-semantic types.

ø RefCell<T>

Suitable for move-semantic types.

Cell<T> and RefCell<T> are not smart pointers as such, but containers that provide interior mutability.

Cell<T> and RefCell<T> are most commonly used together with immutable references.

See page 151 of the Rust Tao of Programming.
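A sketch of interior mutability through an immutable reference follows. The struct Stats and the function demo are hypothetical names: the Cell field holds a Copy type, and the RefCell field holds a move-semantic type.

```rust
use std::cell::{Cell, RefCell};

// Hypothetical struct: both fields can change behind a shared reference.
struct Stats {
    hits: Cell<u32>,           // copy-semantic value: get/set whole values
    log: RefCell<Vec<String>>, // move-semantic value: borrow checked at run time
}

impl Stats {
    /// Takes `&self`, not `&mut self`, yet updates both fields.
    fn record(&self, msg: &str) {
        self.hits.set(self.hits.get() + 1);
        self.log.borrow_mut().push(msg.to_string());
    }
}

/// Records two messages and returns (hit count, log length).
fn demo() -> (u32, usize) {
    let stats = Stats { hits: Cell::new(0), log: RefCell::new(Vec::new()) };
    stats.record("first");
    stats.record("second");
    let entries = stats.log.borrow().len();
    (stats.hits.get(), entries)
}

fn main() {
    println!("{:?}", demo());
}
```

RefCell enforces the borrowing rules at run time, so calling borrow_mut while another borrow is active panics instead of failing to compile.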

ø Cow<T>

Copy on write: a smart pointer implemented as an enum. Cow expresses the choice between borrowing and owning. It provides immutable access to borrowed content and clones the data when a mutable borrow or ownership is needed.

Cow is designed to reduce copying and improve performance. It is generally used where reads are frequent and writes are rare.

Another use of Cow is to unify implementation conventions.
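The read-mostly scenario can be sketched with a function that allocates only when it has to change its input. The name remove_spaces is hypothetical.

```rust
use std::borrow::Cow;

/// Returns the input unchanged (borrowed) when it has no spaces,
/// and allocates a new String (owned) only when something must be removed.
fn remove_spaces(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        Cow::Owned(input.replace(' ', ""))
    } else {
        Cow::Borrowed(input)
    }
}

fn main() {
    let untouched = remove_spaces("abc");   // no allocation
    let cleaned = remove_spaces("a b c");   // allocates once
    // Cow implements Deref, so &str methods can be called directly.
    println!("{} {}", untouched.len(), cleaned.len());
}
```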

2.3.4 Dereferencing (Deref)

Dereferencing with * obtains the value itself, so for move-semantic types it takes ownership.

Dereference operator: *

Which types implement the deref method

ø Box<T>: see page 147 of the Rust Tao of Programming.

ø Cow<T>: immutable methods of the contained data can be called directly. The details can be found on page 155 of Rust The Tao of Programming.

Box<T> supports moving the value out through a dereference; the Rc<T> and Arc<T> smart pointers do not.

2.4 Ownership mechanism

Every piece of memory allocated in Rust has an owner, which is responsible for freeing, reading, and writing that memory. Each value can have only one owner at a time.

On assignment, ownership does not change for copy-semantic types, that is, types that implement Copy. For a compound type, whether it is copied or moved depends on the types of its members.

For example, if the elements of an array are all basic numeric types, the array has copy semantics and is copied bitwise.

2.4.1 Lexical scope (lifetime)

match, for, loop, while, if let, while let, curly braces, functions, and closures all create new scopes, and a binding moved into such a scope transfers its ownership, as you can see on page 129 of the Rust Tao of Programming.

The function body itself is a separate lexical scope:

ø When a copy-semantic type is passed as a function argument, it is copied bitwise.

ø When a move-semantic type is passed as a function argument, ownership is transferred.

2.4.2 Non-lexical lifetimes (NLL)

Borrowing rule: the lifetime of the borrower cannot be longer than the lifetime of the lender. Use cases can be found on page 157 in Rust The Tao of Programming.

Because checking this rule against lexical scopes is often inconvenient in practice, non-lexical lifetimes (NLL) were introduced to improve it.

MIR is an abstract data structure based on the control flow graph (CFG). It represents every possible flow of program execution as a directed graph. Borrow checking performed on MIR is therefore not tied to lexical scopes, which is why it is called non-lexical lifetimes.
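A typical case that NLL accepts is sketched below (push_after_read is a hypothetical name). The immutable borrow ends at its last use, not at the end of the enclosing block, so the later mutation is allowed.

```rust
/// Reads the first element, then appends a copy of it to the same vector.
fn push_after_read(mut v: Vec<i32>) -> Vec<i32> {
    let first = &v[0]; // immutable borrow of `v` starts here
    let copy = *first; // last use of `first`: under NLL the borrow ends here

    // Under purely lexical lifetimes `first` would still count as alive
    // until the end of the function, and this mutable use would be rejected.
    v.push(copy);
    v
}

fn main() {
    println!("{:?}", push_after_read(vec![1, 2]));
}
```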

2.4.2 Borrowing ownership

To create a mutable borrow, the binding that lends ownership must itself be a mutable binding.

In the ownership system, a reference &x can also be called a borrow of x. Ownership is borrowed with the & operator, so a reference does not transfer ownership of the bound variable.

When a reference goes out of scope, the borrowed ownership is returned.

ø An immutable borrow (reference) cannot be lent out again as a mutable borrow.

ø An immutable borrow can be lent out more than once.

ø A mutable borrow can be lent out only once.

ø For the same binding, immutable and mutable borrows cannot exist at the same time.

ø The lifetime of the borrower cannot be longer than the lifetime of the lender. For an example, see page 136 of the Rust Tao of Programming.

Core principle: shared data is immutable, and mutable data is not shared.

Because dereferencing takes ownership, take special care when a reference to a move-semantic type, such as &String, needs to be dereferenced.

2.4.3 Lifetime parameters

The compiler's borrow checker cannot check borrows across functions, because the validity of a borrow currently depends on lexical scope. The developer therefore has to annotate the lifetime parameters of borrows explicitly.

2.4.3.1 Explicit lifetime parameters

ø A lifetime parameter must begin with a single quotation mark;

ø Parameter names are usually lowercase letters, such as 'a;

ø A lifetime parameter is placed after the reference symbol &, and is separated from the type by a space.

Lifetime annotations exist because of borrowed pointers. Whenever a function returns a borrowed pointer, you need to pay attention to the lifetime of the borrowed memory to ensure memory safety.

Annotating a lifetime parameter does not change the lifetime of any reference; it only helps the compiler's borrow check prevent dangling pointers. That is, the purpose of a lifetime parameter is to help the borrow checker validate references and rule out dangling pointers.

For example:

&i32 ==> a reference
&'a i32 ==> a reference with an annotated lifetime parameter
&'a mut i32 ==> a mutable reference with an annotated lifetime parameter

Wherever &'a str is allowed, &'static str is also legal.

For 'static: a reference to a static object must be declared with the 'static lifetime. For example: static STRING: &'static str = "bitString";

2.4.3.2 Life cycle parameters in function signatures

fn foo<'a>(s: &'a str, t: &'a str) -> &'a str;

The <'a> after the function name declares the lifetime parameter. The lifetimes of a function's or method's parameters are called input lifetimes, and the lifetime of the return value is called the output lifetime.

Rules:

Do not return a reference from a function that has no input parameters to borrow from, because the result would be a dangling pointer.

If a function returns (outputs) a reference, its lifetime parameter must match that of one of the function's (input) parameters. Otherwise, annotating the lifetime parameter is meaningless.

When there are multiple input parameters, they can also be annotated with different lifetime parameters. For an example, see page 139 of the Rust Tao of Programming.
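The rules above can be sketched as follows. The function names longest and first_of are hypothetical helpers chosen for illustration; they are not taken from the book.

```rust
// Both inputs and the output share the lifetime parameter 'a, so the
// returned reference is only valid while both inputs are valid.
fn longest<'a>(s: &'a str, t: &'a str) -> &'a str {
    if s.len() >= t.len() {
        s
    } else {
        t
    }
}

// Two different lifetime parameters: the output is tied only to `s`,
// so `_t` may be dropped before the result is used.
fn first_of<'a, 'b>(s: &'a str, _t: &'b str) -> &'a str {
    s
}
```

With these definitions, longest("borrow", "check") returns "borrow", and first_of("x", "y") returns "x".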

2.4.3.3 Lifetime parameters in struct definitions

Structs that contain reference-type members must also annotate lifetime parameters; otherwise compilation fails.

For example:

struct Foo<'a> { part: &'a str, }

Here, the lifetime annotation in effect agrees on a rule with the compiler:

The lifetime of a struct instance must be shorter than or equal to the lifetime of any of its reference members.

2.4.3.4 Lifetime parameters in method definitions

When a struct contains reference-type members, the lifetime parameter must also be declared after the impl keyword and used after the struct name.

For example:

impl<'a> Foo<'a> { fn split_first(s: &'a str) -> &'a str { … } }

With the lifetime parameter 'a added, the input reference must live at least as long as the Foo instance.

Note: Enums treat lifetime parameters the same way as structs.
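A minimal runnable sketch of a struct with a reference member and its impl block. The body of split_first (taking the text before the first comma) and the constructor new are assumptions made for illustration only.

```rust
struct Foo<'a> {
    part: &'a str,
}

impl<'a> Foo<'a> {
    // Assumed behavior: return the text before the first comma,
    // or the whole input when there is no comma.
    fn split_first(s: &'a str) -> &'a str {
        s.split(',').next().unwrap_or(s)
    }

    // Hypothetical constructor: the instance borrows from `s`,
    // so it cannot outlive the string it was built from.
    fn new(s: &'a str) -> Self {
        Foo {
            part: Self::split_first(s),
        }
    }
}
```

For instance, Foo::new("first,second").part is "first"; the compiler rejects any use of the Foo value after the borrowed string has been dropped.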

2.4.3.5 The static lifetime

'static is a special lifetime built into Rust. A 'static lifetime lasts for the entire duration of the program. All string literals have the 'static lifetime and are of type &'static str.

String literals are global static data: they are stored together with the program code in the data segment of the executable, at an address known at compile time, and they are read-only.

2.4.3.6 Lifetime elision

Lifetime parameters can be omitted when the following three rules apply. The rules are hardcoded into the Rust compiler, which completes the lifetime parameters in the function signature automatically at compile time.

Lifetime elision rules:

  • Each lifetime elided in an input position becomes a distinct lifetime parameter; that is, each one corresponds to its own lifetime parameter.
  • If there is exactly one input lifetime position (elided or not), that lifetime is assigned to the elided output lifetimes.
  • If there are multiple input lifetime positions and one of them is &self or &mut self, the lifetime of self is assigned to the elided output lifetimes.
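The elision rules can be illustrated with a short sketch. All names here (first_word, first_word_explicit, Holder, name_or) are hypothetical and only serve the example.

```rust
// Rule 2: there is one input lifetime, so it is assigned to the output.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// The same signature with the elided lifetime written out by hand.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

struct Holder {
    name: String,
}

impl Holder {
    // Rule 3: because &self is present, the output borrows from self,
    // not from `_other`, even though both are references.
    fn name_or(&self, _other: &str) -> &str {
        &self.name
    }
}
```

A signature such as fn pick(a: &str, b: &str) -> &str matches none of the rules for the output, so the compiler asks for explicit lifetime parameters there.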

If you don’t fully understand these rules, continue to read page 143 of Rust The Tao of Programming.

2.4.3.7 Lifetime bounds

Lifetime parameters can be used as bounds on generics, just like traits, in two forms:

  • T: 'a, meaning that any reference in type T must live at least as long as 'a.
  • T: Trait + 'a, meaning that type T must implement Trait and that any reference in T must live at least as long as 'a.
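A small sketch of both forms, assuming a hypothetical wrapper type Ref and a helper describe. Recent compilers infer the T: 'a bound on the struct automatically, but writing it out is still legal.

```rust
use std::fmt::Debug;

// T: 'a: any reference inside T must live at least as long as 'a.
struct Ref<'a, T: 'a> {
    value: &'a T,
}

// T: Debug + 'a: T must implement Debug, and any reference inside T
// must live at least as long as 'a.
fn describe<'a, T: Debug + 'a>(r: &Ref<'a, T>) -> String {
    format!("{:?}", r.value)
}
```

For example, with let x = 5; the call describe(&Ref { value: &x }) yields the string "5".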

For an example, see page 145 of Rust The Tao of Programming.

2.4.3.8 Lifetimes of trait objects

For an example, see page 146 of Rust The Tao of Programming.

2.4.3.9 Higher-ranked lifetimes

Rust also provides higher-ranked lifetimes, also known as higher-ranked trait bounds (HRTB), through the for<> syntax.

The for<> syntax as a whole means that the lifetime parameter applies only to the item that follows it.
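As a rough illustration, the hypothetical function apply_to_local below needs a callback that works for every lifetime, because it passes in a reference to a local value whose lifetime the caller cannot name. The helper trim_it is also an assumption for the example.

```rust
// F must accept a &str of any lifetime 'a and return a &str of that
// same lifetime; the for<'a> binds 'a to this bound only.
fn apply_to_local<F>(f: F) -> usize
where
    F: for<'a> Fn(&'a str) -> &'a str,
{
    let local = String::from("  higher-ranked  ");
    // `local` lives only inside this function, so no lifetime chosen
    // by the caller could describe this borrow.
    f(&local).len()
}

fn trim_it(s: &str) -> &str {
    s.trim()
}
```

Here apply_to_local(trim_it) returns 13, the length of "higher-ranked" after trimming.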

See this on page 192 of Rust The Tao of Programming.

2.5 Concurrency Safety and Ownership

2.5.1 Marker traits: Send and Sync

ø If type T implements Send, it tells the compiler that ownership of instances of this type can be safely transferred between threads.

ø If type T implements Sync, it tells the compiler that instances of this type cannot cause memory unsafety under multithreaded concurrency and can be shared safely across threads.
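A minimal sketch of what these marker traits allow in practice. The function sum_in_threads is a hypothetical example: Arc<Vec<i32>> is both Send and Sync, so clones of it can be moved into spawned threads, whereas the same code with Rc would be rejected at compile time because Rc is not Send.

```rust
use std::sync::Arc;
use std::thread;

fn sum_in_threads(data: Vec<i32>) -> i32 {
    // Arc gives shared ownership that is safe to send across threads.
    let shared = Arc::new(data);
    let handles: Vec<thread::JoinHandle<i32>> = (0..2usize)
        .map(|offset| {
            let shared = Arc::clone(&shared);
            // Each thread sums every second element, starting at `offset`.
            thread::spawn(move || shared.iter().skip(offset).step_by(2).sum::<i32>())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

For the input vec![1, 2, 3, 4], the two threads compute 4 and 6, and the function returns 10.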

2.5.2 Which types implement Send

2.5.3 Which types implement Sync

2.6 Primitive Types

Rust has several primitive types built in:

  • Boolean type: has two values, true and false.
  • Character type: represents a single Unicode character, stored in 4 bytes.
  • Numeric types: signed integers (i8, i16, i32, i64, isize), unsigned integers (u8, u16, u32, u64, usize), and floating-point numbers (f32, f64).
  • String types: the underlying type is the unsized type str; more commonly used are the string slice &str and the heap-allocated String. String slices have a fixed size and are immutable (string literals are statically allocated), while heap-allocated strings are mutable.
  • Arrays: have a fixed size, and all elements are of the same type; written as [T; N].
  • Slices: refer to part of an array without copying; written as &[T].
  • Tuples: ordered lists of fixed size in which each element has its own type; element values can be obtained by destructuring or indexing.
  • Pointers: the lowest-level raw pointers, *const T and *mut T, are unsafe to dereference, so dereferencing must be placed in an unsafe block.
  • Functions: a variable with a function type is essentially a function pointer.
  • Unit type: the type (), whose only value is also ().

2.7 Functions

2.7.1 Function Parameters

  • When function arguments are passed by value, ownership is transferred or Copy semantics apply.
  • When function arguments are passed by reference, ownership does not change, but lifetime parameters are required (they need not be written explicitly when the elision rules apply).

2.7.2 Pattern Matching in Function Parameters

  • ref: uses pattern matching to get an immutable reference to the parameter.
  • ref mut: uses pattern matching to get a mutable reference to the parameter.
  • Besides ref and ref mut, function parameters can also be ignored with the wildcard _.

See page 165 in Rust The Tao of Programming.

2.7.3 Generic functions

The parameters of a generic function do not name concrete types; they use a generic type T instead. For example, if T is bounded only by the Mul trait, then only types that implement Mul can be used as arguments, which ensures type safety.

A call to a generic function does not have to specify concrete types; the compiler infers them automatically. With primitive types, the compiler can infer them easily. If the compiler cannot infer the types, they must be specified explicitly at the call site.
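A minimal sketch of such a generic function, assuming a hypothetical name multiply. The only bound on T is Mul with an output of the same type.

```rust
use std::ops::Mul;

// Only types that implement Mul<T, Output = T> are accepted.
fn multiply<T: Mul<T, Output = T>>(x: T, y: T) -> T {
    x * y
}
```

multiply(3, 4) lets the compiler infer T as an integer type and returns 12, while multiply::<f64>(1.5, 2.0) names the type explicitly and returns 3.0. Calling it with two String values does not compile, because String does not implement Mul.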

2.7.4 Methods and Functions

Methods represent the behavior of an instance, while functions are simply pieces of code that can be called by name. A method is also called by name, but it must be associated with a method receiver.

2.7.5 Higher-order functions

A higher-order function is a function that takes a function as an argument or return value. It is the most basic feature of functional programming languages.
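Both directions can be sketched with plain function pointers. The names add_one, times_two, apply_twice and pick are hypothetical.

```rust
fn add_one(x: i32) -> i32 {
    x + 1
}

fn times_two(x: i32) -> i32 {
    x * 2
}

// Takes a function as an argument.
fn apply_twice(f: fn(i32) -> i32, x: i32) -> i32 {
    f(f(x))
}

// Returns a function as its result.
fn pick(double: bool) -> fn(i32) -> i32 {
    if double {
        times_two
    } else {
        add_one
    }
}
```

apply_twice(add_one, 5) returns 7, pick(true)(5) returns 10, and the two compose: apply_twice(pick(true), 3) returns 12.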

See page 168 of Rust The Tao of Programming.

2.8 Closures

A closure, usually meaning a lexical closure, is a function that holds variables from its external environment.

The external environment is the lexical scope in which the closure is defined.

External environment variables, also called free variables in the functional programming paradigm, are variables used by the closure but not defined inside it.

A function bundled together with its free variables is a closure.

A closure has an anonymous type, so when a closure is used as a trait object its size is unknown at compile time.

2.8.1 Basic syntax for closures

A closure consists of pipe characters (two symmetrical vertical bars) and curly braces (or parentheses).

ø The parameters of the closure go between the pipes. As with normal function parameters, a type annotation can be added after a colon, or it can be omitted. For example: let add = |a, b| -> i32 { a + b };

ø The curly braces contain the body of the closure. The curly braces and the return type can also be omitted.

For example: let add = |a, b| a + b;

ø If the closure takes no parameters and only uses captured free variables, the parameter list between the pipes can be left empty.

For example: let add = || a + b;
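The three forms side by side, wrapped in a hypothetical function closure_forms so that the free variables a and b have somewhere to live:

```rust
fn closure_forms() -> (i32, i32, i32) {
    // Full form: annotated parameters, return type, and braces.
    let add_typed = |a: i32, b: i32| -> i32 { a + b };

    // Short form: types are inferred and the braces are omitted.
    let add_inferred = |a, b| a + b;

    // No parameters: the closure only uses the captured variables a and b.
    let (a, b) = (10, 20);
    let add_captured = || a + b;

    (add_typed(1, 2), add_inferred(3, 4), add_captured())
}
```

Calling closure_forms() returns (3, 7, 30).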

2.8.2 Implementation of closures

Closures are syntactic sugar. They are not one of the basic syntax elements of Rust; rather, they are a layer of convenient syntax provided on top of the basic language features.

The difference between closures and normal functions is that closures can capture free variables from the environment.

Because closures can be passed as function arguments, they directly improve the abstraction and expressiveness of Rust. A closure passed as an argument can be accepted through a generic parameter with a trait bound or directly as a trait object.

A closure's type cannot be named, so a function that returns a closure must return a trait object (for example Box<dyn Fn(i32) -> i32>) or, since Rust 1.26, use impl Trait.
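A short sketch of closures as parameters and as return values. All four function names are hypothetical. The last function uses impl Trait, which is available since Rust 1.26 as an alternative to a boxed trait object.

```rust
// Closure as a parameter through a generic type with a trait bound.
fn apply<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(x)
}

// Closure as a parameter through a trait object.
fn apply_dyn(f: &dyn Fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

// Closure as a return value through a boxed trait object.
fn make_adder_boxed(n: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |x| x + n)
}

// Closure as a return value through impl Trait (Rust 1.26 and later).
fn make_adder(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n
}
```

apply(|x| x * 2, 21) returns 42, and make_adder(10)(5) returns 15. The move keyword is needed in both adders because n would otherwise be borrowed from a stack frame that no longer exists once the function returns.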

2.8.3 Closures and ownership

The compiler automatically translates a closure expression into an instance of a struct and implements one of Fn, FnMut, or FnOnce for it.

  • FnOnce: takes ownership of the method receiver. It cannot change the environment and can be called only once.
  • FnMut: borrows the method receiver mutably. It can change the environment and can be called multiple times.
  • Fn: borrows the method receiver immutably. It cannot change the environment and can be called multiple times.

ø To implement Fn, FnMut and FnOnce must also be implemented;

ø To implement FnMut, FnOnce must also be implemented;

ø To implement FnOnce, neither FnMut nor Fn needs to be implemented.
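The three traits can be seen in a small sketch. The helper functions call_fn, call_fn_mut, call_fn_once and the wrapper fn_traits_demo are hypothetical names for illustration.

```rust
fn call_fn<F: Fn() -> usize>(f: F) -> usize {
    f() + f() // an Fn closure can be called repeatedly through &self
}

fn call_fn_mut<F: FnMut() -> usize>(mut f: F) -> usize {
    f(); // an FnMut closure needs a mutable binding to be called
    f()
}

fn call_fn_once<F: FnOnce() -> String>(f: F) -> String {
    f() // an FnOnce closure is consumed by the call
}

fn fn_traits_demo() -> (usize, usize, String) {
    let n = 4usize;
    let read_only = call_fn(|| n * 2); // only reads n

    let mut count = 0usize;
    let counted = call_fn_mut(|| {
        count += 1; // modifies the environment
        count
    });

    let s = String::from("rust");
    let consumed = call_fn_once(|| s); // gives s away, so it can run only once

    (read_only, counted, consumed)
}
```

fn_traits_demo() returns (16, 2, "rust"). Because Fn requires FnMut and FnOnce, a closure that only reads its environment could also be passed to call_fn_mut or call_fn_once.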

2.8.3.1 How environment variables are captured

  • Variables of Copy-semantics types are captured by immutable reference (&T).
  • Variables of move-semantics types are captured by move, that is, by transfer of ownership.
  • Mutable bindings that the closure modifies are captured by mutable reference (&mut T).

See page 178 of the Rust Tao of Programming.

Rust uses the move keyword to force the free variables of the closure's defining environment to be moved into the closure.
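A typical place where move is required is a spawned thread, sketched here with a hypothetical function spawn_len. Without move, the closure would only borrow text, and the compiler would reject it because the thread may outlive the caller's stack frame.

```rust
use std::thread;

fn spawn_len(text: String) -> usize {
    // `move` transfers ownership of `text` into the closure, so the
    // new thread owns the string for as long as it runs.
    let handle = thread::spawn(move || text.len());
    handle.join().unwrap()
}
```

spawn_len(String::from("move")) returns 4. After the thread::spawn line, text can no longer be used in spawn_len because it has been moved.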

2.8.3.2 Rule Summary

  • If a closure captures no environment variables, it implements Fn automatically by default.
  • If a closure captures environment variables of Copy-semantics types:

ø If the environment variables are not modified, Fn is implemented automatically, whether or not the move keyword is used.

ø If the environment variables are modified, FnMut is implemented automatically.

  • If a closure captures environment variables of move-semantics types:

ø If the environment variables are not modified and the move keyword is not used, FnOnce is implemented automatically.

ø If the environment variables are not modified and the move keyword is used, Fn is implemented automatically.

ø If the environment variables are modified, FnMut is implemented automatically.

  • When an FnMut closure uses the move keyword, the closure automatically implements Copy/Clone if the captured variables are of Copy-semantics types; it does not implement Copy/Clone if they are of move-semantics types.

2.9 Iterators

Rust uses external iterators, which is what the for loop is built on. With an external iterator, the caller controls the entire traversal.

Rust uses traits to abstract the iterator pattern. The Iterator trait is the abstract interface of the iterator pattern in Rust.

The Iterator trait mainly contains:

  • The next method: advances the iterator and returns its next element.
  • The associated type Item.
  • The size_hint method: returns a tuple that gives bounds on the remaining length of the iterator.

Example:

let iterator = iter.into_iter();

let size_lin = iterator.size_hint();

let mut counter = Counter { count: 0}; counter.next();
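The Counter used above is not defined in the text. One possible definition, assuming it counts from 1 to 5, looks like this:

```rust
struct Counter {
    count: u32,
}

impl Iterator for Counter {
    // The associated type names what each call to next yields.
    type Item = u32;

    // next is the only required method; size_hint has a default
    // implementation that returns (0, None).
    fn next(&mut self) -> Option<u32> {
        if self.count < 5 {
            self.count += 1;
            Some(self.count)
        } else {
            None
        }
    }
}
```

With this definition, the first counter.next() returns Some(1), a for loop over Counter { count: 0 } visits 1 through 5, and the default size_hint reports (0, None) because Counter does not override it.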

The next method of an Iter iterator returns Option<&T>, and that of an IterMut iterator returns Option<&mut T>. The for loop calls the iterator's next method automatically, and the loop variable gets a value of type &T or &mut T from the returned Option by pattern matching.

Iterators of the Iter type therefore produce references as loop variables in a for loop.

The next method of an IntoIter iterator returns Option<T>, and the loop variables produced in the for loop are values, not references.

Example:

let v = vec![1, 2, 3]; for i in v { … }


To make sure that size_hint reports accurate information about an iterator's length, Rust introduces two traits that are subtraits of Iterator, both defined in the std::iter module.

  • ExactSizeIterator: provides two additional methods, len and is_empty.
  • TrustedLen: works like a marker trait; the size_hint of any iterator that implements TrustedLen can be trusted, which avoids container capacity checks entirely and improves performance.

2.9.1 IntoIterator trait

To iterate over the elements of a collection, the collection must first be converted into an iterator.

Rust provides two traits, FromIterator and IntoIterator, which are inverses of each other.

  • FromIterator: converts from an iterator into a specified type.
  • IntoIterator: converts from a specified type into an iterator.

A type that implements IntoIterator provides the into_iter method to obtain an iterator. The parameter of into_iter is self, which means the method takes ownership of the method receiver. Two other kinds of iterator do not transfer ownership:

  • Iter: obtained through an immutable borrow, corresponding to &self
  • IterMut: obtained through a mutable borrow, corresponding to &mut self
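The three ways of obtaining an iterator from a Vec can be compared in a small sketch (the function name borrow_kinds_demo is hypothetical):

```rust
fn borrow_kinds_demo() -> (i32, Vec<i32>, Vec<String>) {
    let mut v = vec![1, 2, 3];

    // iter(): immutable borrow, the items are &i32.
    let total: i32 = v.iter().sum();

    // iter_mut(): mutable borrow, the items are &mut i32.
    for x in v.iter_mut() {
        *x *= 10;
    }

    // into_iter(): takes ownership, the items are String values.
    let names = vec![String::from("a"), String::from("b")];
    let owned: Vec<String> = names.into_iter().collect();
    // `names` has been moved and cannot be used from here on.

    (total, v, owned)
}
```

The call returns (6, [10, 20, 30], ["a", "b"]). The last step also shows FromIterator at work: collect builds a new Vec<String> from the iterator.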

2.9.2 Which types implement Iterator?

Only types that implement Iterator can be used as iterators.

A collection that implements IntoIterator can be converted into an iterator with the into_iter method.

Collections that implement IntoIterator include:

  • Vec<T>
  • &'a [T]
  • &'a mut [T] => IntoIterator is not implemented for the [T] type itself

2.9.3 Iterator Adapters

The adapter pattern converts one interface into another as needed, so that types with incompatible interfaces can work together.

Adapters are also called wrappers.

Iterator adapters, all defined in the std::iter module:

  • Map: creates a new iterator that calls the given closure on each element of the original iterator.
  • Chain: creates a new iterator by concatenating two iterators.
  • Cloned: creates a new iterator that clones every element of the original iterator.
  • Cycle: creates an iterator that repeats forever, returning to the first element after the last one.
  • Enumerate: creates a counting iterator that returns tuples (i, val), where i is of type usize and is the current index of the iteration, and val is the value returned by the original iterator.
  • Filter: creates an iterator that keeps only the elements for which a predicate returns true.
  • FlatMap: creates an iterator that works like Map but flattens the nested results.
  • FilterMap: equivalent to applying Filter and Map in a single pass.
  • Fuse: creates an iterator that ends for good: once None has been returned, every later call also returns None. This adapter can be used for optimization.
  • Rev: creates an iterator that iterates in reverse.
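A few of these adapters chained together, as a sketch. The function names adapters_demo and more_adapters are hypothetical; the adapter methods themselves are the standard Iterator methods.

```rust
fn adapters_demo() -> Vec<(usize, i32)> {
    let a = [1, 2, 3];
    let b = [4, 5, 6];
    a.iter()
        .chain(b.iter()) // 1, 2, 3, 4, 5, 6 (as references)
        .cloned() // turn &i32 items into i32 values
        .filter(|x| x % 2 == 0) // 2, 4, 6
        .map(|x| x * 10) // 20, 40, 60
        .rev() // 60, 40, 20
        .enumerate() // (0, 60), (1, 40), (2, 20)
        .collect()
}

fn more_adapters() -> (Vec<i32>, Vec<char>, Vec<i32>) {
    // filter_map: keep only the strings that parse as numbers.
    let parsed: Vec<i32> = ["1", "x", "3"].iter().filter_map(|s| s.parse().ok()).collect();
    // flat_map: map each string to its characters and flatten the result.
    let chars: Vec<char> = ["ab", "c"].iter().flat_map(|s| s.chars()).collect();
    // cycle repeats forever, so it has to be bounded with take.
    let cycled: Vec<i32> = [1, 2].iter().cloned().cycle().take(5).collect();
    (parsed, chars, cycled)
}
```

Adapters are lazy: nothing in these chains runs until collect consumes the final iterator.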

See page 202 of Rust The Tao of Programming.

Custom iterator adapters can also be defined, as described on page 211 of Rust The Tao of Programming.

2.10 Consumers

Iterators do not traverse automatically; the next method has to be called to consume their data. The most direct way to consume the data of an iterator is a for loop.

Besides the for loop, Rust provides methods that consume the data in an iterator; they are called consumers.

Common consumers in std::iter::Iterator:

  • any: checks whether any element satisfies the given condition.
  • fold: takes two arguments, an initial value and a closure with two parameters. The first parameter of the closure is the accumulator, which carries the result of each iteration forward and finally becomes the return value of fold.
  • collect: converts an iterator into the specified collection type.
  • all
  • for_each
  • position
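Each of the consumers listed above, applied to the same vector in a hypothetical function consumers_demo:

```rust
fn consumers_demo() -> (bool, bool, i32, Option<usize>, Vec<i32>, Vec<i32>) {
    let v = vec![1, 2, 3, 4];

    // any: true as soon as one element satisfies the condition.
    let any_even = v.iter().any(|x| x % 2 == 0);
    // all: true only if every element satisfies the condition.
    let all_positive = v.iter().all(|x| *x > 0);
    // fold: start from 0 and add each element to the accumulator.
    let sum = v.iter().fold(0, |acc, x| acc + x);
    // position: index of the first element that satisfies the condition.
    let index_of_three = v.iter().position(|x| *x == 3);
    // collect: build a new collection from the iterator.
    let doubled: Vec<i32> = v.iter().map(|x| x * 2).collect();
    // for_each: run a closure on every element for its side effect.
    let mut seen = Vec::new();
    v.iter().for_each(|x| seen.push(*x));

    (any_even, all_positive, sum, index_of_three, doubled, seen)
}
```

The call returns (true, true, 10, Some(2), [2, 4, 6, 8], [1, 2, 3, 4]).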

2.11 Locks

  • RwLock (read-write lock): a multiple-reader, single-writer lock, also called a shared-exclusive lock. Any number of threads may hold the read lock at the same time, but only one thread may hold the write lock, and read locks and the write lock cannot be held at the same time.
  • Mutex (mutual exclusion lock): allows only one thread at a time to read or write.
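A minimal sketch of both locks from std::sync. The function names count_with_mutex and rwlock_demo are hypothetical.

```rust
use std::sync::{Arc, Mutex, RwLock};
use std::thread;

fn count_with_mutex(threads: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // Only one thread at a time gets past lock().
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn rwlock_demo() -> Vec<i32> {
    let lock = RwLock::new(vec![1, 2]);
    {
        // Several read guards may exist at the same time.
        let r1 = lock.read().unwrap();
        let r2 = lock.read().unwrap();
        assert_eq!(r1.len(), r2.len());
    } // both read guards are released here
    // The write guard is exclusive; it is released at the end of the statement.
    lock.write().unwrap().push(3);
    let snapshot = lock.read().unwrap().clone();
    snapshot
}
```

count_with_mutex(8) returns 8, and rwlock_demo() returns [1, 2, 3]. The guards returned by lock, read and write release the lock automatically when they go out of scope.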

3. Rust Attributes

ø #[lang = "drop"]: marks drop as a language item

ø #[derive(Debug)]:

ø #[derive(Copy, Clone)]:

ø #[derive(Debug, Copy, Clone)]:

ø #[lang = "owned_box"]: unlike the primitive types, Box has no built-in type name. It is a smart pointer with unique ownership, and this special status has to be identified through a lang item.

ø #[lang = "fn"/"fn_mut"/"fn_once"]: marks a language item; the three traits are looked up by the names fn, fn_mut, and fn_once respectively.

  • fn_once: takes ownership of the method receiver

  • fn_mut: borrows the method receiver mutably

  • fn: borrows the method receiver immutably

ø #[rustc_paren_sugar]: indicates special handling of the parenthesized call syntax.

ø #[must_use = "iterator adaptors are lazy…"]: warns developers that iterator adapters are lazy.

4. Memory management

4.1 Reclaiming Memory

Drop flag: the compiler automatically inserts a Boolean flag into the function's call stack that records whether the destructor must be called for a variable leaving scope, so that at run time the destructor is called according to the flag inserted at compile time.

A type that implements Copy has no destructor. Because values of a Copy type are duplicated by copying, their lifetime is unaffected by destructors.
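The effect of destructors and of a conditional move can be observed with a small sketch. The type Noisy and the function drop_order are hypothetical; Noisy records its name in a shared log when it is dropped.

```rust
use std::cell::RefCell;

struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

// A type with a Drop implementation cannot also implement Copy.
impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order(early: bool) -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Noisy { name: "a", log: &log };
        let b = Noisy { name: "b", log: &log };
        if early {
            // `b` is moved only on this path, so whether it still needs
            // dropping at the end of the scope is known only at run time.
            drop(b);
        }
        let _c = Noisy { name: "c", log: &log };
    } // remaining values are dropped in reverse declaration order
    log.into_inner()
}
```

drop_order(true) returns ["b", "c", "a"], and drop_order(false) returns ["c", "b", "a"]. The variable b is the kind of case a drop flag exists for: the same scope exit must either run or skip its destructor depending on the path taken.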

Chapter 4 needs further understanding and summary, to be added later.

5. Unicode

The Unicode character set is essentially a table in which each character corresponds to a non-negative integer called a code point.

Code points are divided into several types:

  • Scalar values
  • Surrogate code points
  • Noncharacter code points
  • Reserved code points
  • Private-use code points

Scalar values are the code points that actually correspond to characters; they range from 0x0000 to 0xD7FF and from 0xE000 to 0x10FFFF.

Each Unicode scalar value fits in 4 bytes (the size of a Rust char), and encoded text is stored as a sequence of code units.

A code unit is the smallest combination of bits used to process and exchange encoded text.

Unicode encodings and their code units:

  • UTF-8 => 1-byte code units
  • UTF-16 => 2-byte code units
  • UTF-32 => 4-byte code units
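The difference between code points and code units can be checked with a short sketch. The helper code_units is a hypothetical name; it returns the number of UTF-8 code units (bytes), UTF-16 code units, and scalar values in a string.

```rust
fn code_units(s: &str) -> (usize, usize, usize) {
    (
        s.len(),                  // UTF-8: number of 1-byte code units
        s.encode_utf16().count(), // UTF-16: number of 2-byte code units
        s.chars().count(),        // number of Unicode scalar values
    )
}
```

code_units("a") returns (1, 1, 1), code_units("道") returns (3, 1, 1), and code_units("\u{1F600}") returns (4, 2, 1), because a scalar value above 0xFFFF needs two UTF-16 code units. A Rust char always occupies 4 bytes, and char::from_u32(0xD800) returns None because surrogate code points are not scalar values.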

The default text encoding of Rust is UTF-8.

6. Rust Appendix

Common methods of string objects
