Introduction to Rust (Rust Rocks)
lambeta
- origin
- Practice is the mother of wisdom
- Quick access to
- Clarify the concepts
- Ownership
- Move
- Reference
- Mutable reference
- Explain the error
- Data race conditions
- Build a tree structure
- Render the tree structure
- conclusion
- Source making
TLDR;
The following is the introduction to a talk I gave at an internal sharing session.
Rust is a systems programming language and the implementation language of many blockchains, from the veteran Parity to the upstart Libra. Both Microsoft and core Linux developers are fond of it.
Rust has some attractive features, such as ahead-of-time (AOT) compilation, memory safety, null-pointer safety, a rich type system, and a large community ecosystem. Other languages offer some of these too, but none as thoroughly.
Rust's distinctive concepts are just as prominent: ownership and borrowing, references and shadowing can be dazzling at first. The most interesting part is that when you write a program, the compiler complains a lot, and you feel as though you are fighting it.
Learning Rust isn't just about following a fad (I first heard of it back in 2015). It also reflects an attitude of embracing change and, importantly, it makes reading large amounts of blockchain code less scary.
Language trends reflect where the mainstream developer community expects the future to go, and Rust, a latecomer, is clearly on the rise.
In this talk, I will explain how I learned Rust and how to be more efficient when learning a new programming language. I will also use a small program called tree to show what sets Rust apart.
origin
Rust is a programming language that almost everyone in blockchain has heard of, and it is very popular with developers working on the lower layers of blockchain systems. Oddly enough, Rust originated at Mozilla, and its only large-scale adoption had been Firefox; it was a niche language that caught fire in blockchain circles. That has something to do with Ethereum co-founder Gavin Wood's Parity project, an Ethereum client written in Rust.
I first came into contact with Rust in 2015, when a colleague sent an email asking whether anyone was interested in the Rust programming language. I was young and naive then, and thought a language that only a few people used would be especially good for practice, so I replied enthusiastically. Nothing came of it.
The second time Rust caught my attention was when Chen Tian mentioned the language on his official account. I admire Chen Tian, and he had also influenced me when I learned Elixir, so I followed his lead, listened to Zhang Handong's Zhihu Live, and then joined his readers' group (charm Rust). I have been lurking in that group for more than half a year and have been amazed at how active it is.
In 2019, one of the biggest events in blockchain circles was Facebook's launch of Libra, a non-sovereign currency, followed by the Move programming language, which is based on Rust. Move is essentially a DSL of Rust or, in more academic terms, a language defined by denotational semantics: a simple compiler translates Move syntax into Rust syntax and then relies on Rust's compiler to generate binary code. There are no surprises in this process, but the Move language clearly borrows Rust's concepts of move and ownership: a digital asset can have only one owner, and once it moves, ownership is transferred and the previous owner loses it. This idea fits Rust's ownership rules well, so it is easy to see why the Libra team borrowed the name. Of course, Libra's underlying blockchain is also written in Rust. This event, coupled with Parity's success, is sure to raise programmers' enthusiasm for learning Rust; they are born to love new things.
It was against this background that I began to learn Rust. As usual, I started with tree, a command-line program, and learned Rust gradually by trial and error, covering its basic data types, compound data types, control flow, modules (functions), file and collection operations, and, most critically, the application of ownership.
Practice is the mother of wisdom
One of the most profound experiences of learning Rust, and one of the most common complaints I hear, concerns the compiler. I think many novices get upset and are left speechless when they see so many warnings and errors. But this is also a design philosophy Rust prides itself on.
Every new language has a rationale or design philosophy. Clojure, from the Lisp family, holds that elegance and familiarity are orthogonal. Going further back, Java made the bold promise of Write Once, Run Anywhere. Rust's basic design philosophy is If it compiles, then it works. Just think about how demanding that condition is. The gradual convergence of dynamically, weakly typed languages toward statically, strongly typed ones has all but declared victory for the type system.
Even so, modern software engineering still relies on writing tests to ensure that code runs correctly: from unit tests to integration tests, from smoke tests to regression tests, from profiling to performance tests. These methods and tools have penetrated every aspect of software engineering, yet all kinds of software are still full of bugs. Against that background, Rust's lofty pronouncement smacks of hubris. But the programming world is a magical place where concepts can be invented and then implemented, and where boasting loudly is part of the game. Moreover, Rust illustrates a core idea of modern programming languages: they constrain programmers not by persuading them but by leaving them no other choice.
In How I Learned a New Programming Language, I argued that the best way to learn is through purposeful trial and error. The program I used for practice is called tree: list contents of directories in a tree-like format. The basic Rust components needed for this program are:
Basic concepts
1. Variable - let
2. Ownership and borrowing - &
Compound data types
1. String - String::from("") // non-basic type
2. Slice - "" or vec[..]
Collections and their operations
1. Vec<_> - Vec::new() // the collection needs to grow automatically
2. .iter()
3. .map()
4. .enumerate()
5. .flatten()
6. .collect()
7. .extend()
Control flow
1. if expressions - if {} else {}
2. Recursion
Modules
1. fn - a function of shape (String) -> Vec<String>
Functional components
1. path
2. fs
3. env
While trying out these elements, I found that Rust, like other compiled languages, has one uncomfortable aspect: checking an idea takes too long. Because there is no REPL, to learn how a concept works you have to create another project (I don't want to contaminate the current project's code) and write experiments in its main function, which gives much slower feedback than a REPL. But Rust also has one significant advantage: when a compilation error occurs, the compiler not only explains the cause but also suggests potential fixes, which is much clearer and more sophisticated than what you get from a dynamic language like JavaScript. In addition, using the built-in assert_eq! and similar assertion macros to check results saves the trouble of writing separate tests. So, on the whole, the learning process is very enjoyable.
Quick access to
Here's an example. To understand how to concatenate two collections, you need to answer a few questions:
- How is a collection constructed?
- How are collections concatenated?
- How is the result asserted?
In the absence of a REPL, the only quick way to get started is the documentation. A detailed explanation of the struct std::vec::Vec can be found in the standard library documentation at doc.rust-lang.org/std/.
From the example program, you can quickly see that the collection is constructed as follows:
let mut v = vec![1, 2, 3];
v.reverse();
assert_eq!(v, [3, 2, 1]);
The vec! macro quickly constructs a collection, and the example tests its reverse method. So how do we concatenate collections? To answer this question, I usually use a search engine or dig into the documentation, looking for keywords like concat, append, and so on, and I always find something.
Setting non-functional requirements aside, we will use the most straightforward approach, the extend method shown in the documentation:
let v = vec![1, 2, 3];
v.extend([1, 2, 3].iter().cloned()); // compilation error
Notice that the compilation failed. The Rust compiler will give you a straightforward error message.
error[E0596]: cannot borrow `v` as mutable, as it is not declared as mutable
  --> src/main.rs:13:5
   |
12 |     let v = vec![1, 2, 3];
   |         - help: consider changing this to be mutable: `mut v`
13 |     v.extend([1, 2, 3].iter().cloned());
   |     ^ cannot borrow as mutable
The error message reveals that our program is trying to mutably borrow an immutable variable. Borrow and mutable are new concepts. When we meet new concepts, we habitually fall back on familiar analogies. By analogy with immutability in functional programming, you can roughly guess that variables in Rust are immutable by default. However, cannot borrow as mutable goes a bit beyond what analogy can explain, so it is necessary to learn the definitions.
Clarify the concepts
One of the most important things in learning a language is to clarify concepts. When we encounter a new concept, we have to stop, fill in the missing knowledge, and then come back to understand and solve the actual problem. Each programming language has its own philosophy, distilled from a great deal of theory and practice, so these concepts are worth learning. Learning is really a process of gradually clarifying concepts.
While learning (and trying to define) borrow, I came across the concepts of ownership, move, reference, and mutable reference in turn. So I defined these concepts:
Ownership
A variable has ownership of the value it refers to. In Rust, a variable is the owner of its value. A value can have only one owner at a time, and the value is destroyed once its owner goes out of scope.
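To make the scope rule concrete, here is a minimal sketch. The Noisy type and the drop_order function are hypothetical names introduced only for illustration; they record the order in which owners leave scope and their values are destroyed.

```rust
use std::cell::RefCell;

// Hypothetical helper type: appends a line to a shared log when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Returns the order in which the values were destroyed.
fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Noisy { name: "outer", log: &log };
        {
            let _inner = Noisy { name: "inner", log: &log };
        } // `_inner` goes out of scope here, so its value is destroyed
        log.borrow_mut().push(String::from("inner scope ended"));
    } // `_outer` goes out of scope here
    log.into_inner()
}

fn main() {
    for line in drop_order() {
        println!("{}", line);
    }
}
```

The inner value is destroyed before the line logged after its block, and the outer value only when its own block ends.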
Move
The act of assigning the value of one variable to another. By the definition of ownership, a value can have only one owner at a time, so ownership of the value is transferred to the other variable and the original variable loses it. The direct effect is that the original variable can no longer be used.
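A small sketch of a move, under the assumption that a Vec<i32> stands in for any non-Copy value. The names consume and move_demo are hypothetical; the commented-out line shows the use that the compiler would reject.

```rust
// Hypothetical helper: takes ownership of the vector and returns its length.
fn consume(v: Vec<i32>) -> usize {
    v.len()
}

fn move_demo() -> (usize, usize) {
    let a = vec![1, 2, 3];
    let b = a; // ownership of the vector moves from `a` to `b`
    // println!("{:?}", a); // would not compile: error[E0382], borrow of moved value: `a`
    let len_via_b = b.len();
    let len_via_fn = consume(b); // passing by value moves the vector into `consume`
    // `b` can no longer be used from here on
    (len_via_b, len_via_fn)
}

fn main() {
    println!("{:?}", move_demo());
}
```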
Reference
A variable refers to a value without owning it. In many assignment scenarios, including variable assignment and passing function arguments, we do not want the original variable to become unusable afterwards. We can create a reference to the value with an ampersand (&). Assigning a reference does not move the value, so the original variable remains available. This behavior is called borrowing. In everyday life, things we own can be lent to others, who then have possession but not ownership.
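The borrowing described above can be sketched as follows. The helper names total and borrow_demo are hypothetical and exist only for this illustration.

```rust
// Hypothetical helper: borrows the slice, so the caller keeps ownership.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn borrow_demo() -> (i32, usize) {
    let v = vec![1, 2, 3];
    let r = &v; // `r` borrows `v`; no move happens
    let sum = total(r); // `&Vec<i32>` coerces to `&[i32]`
    (sum, v.len()) // `v` is still usable because it was only lent out
}

fn main() {
    println!("{:?}", borrow_demo());
}
```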
Mutable reference
Identifies that the referenced value may be modified.
In many cases, we want values passed by reference to be changeable. We must then mark the reference with &mut; otherwise the modification is not allowed. Note that &mut requires the original variable to be declared mut as well, which makes sense: a mutable reference can only be taken to a mutable variable. To prevent data races, only one &mut reference to a value can be live in the same scope, because with multiple mutable references you run the risk of non-repeatable reads (note that Rust guarantees there is no risk of parallel modification). Also, &mut and & references to the same value cannot coexist, because we do not want a value that is being read through & to be written through &mut at the same time, which would lead to ambiguity.
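As a rough illustration of these rules, the sketch below uses one mutable reference at a time and then several shared references once the mutable one is no longer in use. The names push_twice and mutable_borrow_demo are hypothetical.

```rust
// Hypothetical helper: needs `&mut` because it modifies the vector in place.
fn push_twice(v: &mut Vec<i32>, x: i32) {
    v.push(x);
    v.push(x);
}

fn mutable_borrow_demo() -> Vec<i32> {
    let mut v = vec![1]; // must be declared `mut` to allow `&mut v`
    push_twice(&mut v, 7);
    {
        let r = &mut v; // the only mutable reference while it is in use
        r.push(9);
    }
    // Any number of shared references may coexist once no `&mut` is in use.
    let a = &v;
    let b = &v;
    assert_eq!(a.len(), b.len());
    v
}

fn main() {
    println!("{:?}", mutable_borrow_demo());
}
```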
Explain the error
With the necessary concepts clarified, let’s review the code above. Let’s look at the extend function definition:
fn extend<I>(&mut self, iter: I)
where
    I: IntoIterator<Item = T>,

Extends a collection with the contents of an iterator...
v.extend(...) is just syntactic sugar. The actual method call passes self as the first argument to extend(&mut self, iter: I). A mutable reference is passed as the function argument, so naturally the original variable must also be declared mutable.
So let’s follow its instructions and fix it as follows:
let mut v = vec![1, 2, 3]; // add the mut modifier
v.extend([1, 2, 3].iter().cloned());
This time the compiler stops complaining. Using assert_eq!, we verify that the extend operation is correct.
assert_eq!(v, [1, 2, 3, 1, 2, 3]);
It is also worth noting that Rust differs a little from familiar functional programming here: concatenating collections with extend does not create a new collection but modifies the existing one. In general, we are wary of possible data races: what if multiple threads write to the collection? With that in mind, let's look at what a data race is.
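For contrast, here is a small sketch that puts the in-place extend next to a non-mutating alternative built on the standard slice method concat. The function names extend_in_place and concat_new are hypothetical.

```rust
// In-place: `extend` needs mutable access and grows the existing vector.
fn extend_in_place(mut v: Vec<i32>, more: &[i32]) -> Vec<i32> {
    v.extend(more.iter().cloned());
    v
}

// Non-mutating: builds a new vector and leaves both inputs untouched.
fn concat_new(a: &[i32], b: &[i32]) -> Vec<i32> {
    [a, b].concat()
}

fn main() {
    let a = vec![1, 2, 3];
    let b = concat_new(&a, &[1, 2, 3]);
    println!("{:?} {:?}", a, b); // `a` is unchanged
    println!("{:?}", extend_in_place(a, &[1, 2, 3]));
}
```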
Data race conditions
The necessary conditions for data race conditions to occur are:
- Multiple references point to the same data simultaneously;
- At least one reference is writing data;
- There is no synchronization mechanism for data access.
Examining conditions 1 and 2: suppose there are two mutable references to the same collection, as follows:
let mut v = vec![1, 2, 3];
let r1 = &mut v;
let r2 = &mut v;
assert_eq!(r1, r2);
The compiler immediately gives you a compilation error
error[E0499]: cannot borrow `v` as mutable more than once at a time
  --> src/main.rs:13:10
   |
12 | let r1 = &mut v;
   |          ------ first mutable borrow occurs here
13 | let r2 = &mut v;
   |          ^^^^^^ second mutable borrow occurs here
14 | assert_eq!(r1, r2);
   | ------------------- first borrow later used here
That is, only one mutable reference can be live in a given scope. Why is it designed this way? In a single thread, this does not appear to be a data race problem [1]. But consider the semantics of the following scenario.
let mut v = vec![1, 2, 3];
let r1 = &mut v;
let r2 = &mut v;
assert_eq!(r2[1], 2);
*r1 = vec![0];
assert_eq!(r2[1], 2); // failure
Once r1 is allowed to change the data, the data that r2 previously saw has changed or even become invalid, and using r2 again is problematic. In the example above, *r1 is reassigned after dereferencing, which changes the value of v, but r2 still reads r2[1] without knowing it, which leads to an out-of-bounds access. The problem is similar to non-repeatable reads under the read-committed isolation level of database transactions. In a single thread, however, this is not a sufficient reason; it is merely somewhat unnatural at the semantic level, so I leave it for further study.
The odd thing is that if I put two mutable references in different functions, the same logic can bypass the compiler error.
fn main() {
    let mut v = vec![1, 2, 3];
    mut1(&mut v);
    mut2(&mut v);
}

fn mut1(v: &mut Vec<i32>) {
    *v = vec![0];
}

fn mut2(v: &mut Vec<i32>) {
    println!("{}", v[1]); // panicked at 'index out of bounds' (runtime error)
}
As you can see, the argument above does not explain the root cause of limiting multiple mutable references in the same scope in a single thread.
The same can be said for &mut and &, which likewise cannot coexist in the same scope in Rust.
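One commonly cited single-threaded motivation for these rules, offered here as a supplement rather than as this article's conclusion, is reference invalidation: Vec::push may reallocate the buffer, so a reference to an element taken before the push could dangle afterwards. It may also help to note that in the mut1/mut2 example the two &mut borrows never overlap, since each one ends when its call returns. The function name first_then_push below is hypothetical.

```rust
fn first_then_push() -> (i32, usize) {
    let mut v = vec![1, 2, 3];

    // The three lines below are rejected together (error[E0502]):
    // `v.push` needs `&mut v` while the shared borrow `&v[0]` is still in use.
    // let first_ref = &v[0];
    // v.push(4);
    // println!("{}", first_ref);

    let first = v[0]; // copy the value out, so no reference outlives this line
    v.push(4); // fine: nothing points into the (possibly reallocated) buffer
    (first, v.len())
}

fn main() {
    println!("{:?}", first_then_push());
}
```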
Examining condition 3: as to whether data races can occur in a multithreaded environment, we have to look at the restrictions Rust places on threads. In Rust, a closure passed to thread::spawn must take ownership of the data it uses via move [2], because, as far as Rust can tell, the lifetime of the thread may be longer than that of the function that spawns it. Otherwise the variable's memory would be freed when the function returns, invalidating the data used in the thread. So this restriction is necessary, and on the other hand, once ownership of the data has been transferred, the possibility of multiple threads modifying the same data in parallel is eliminated.
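A minimal sketch of both situations, using only the standard library: moving data into a single thread, and, as an assumption about what one would do when several threads really must write to the same collection, sharing it behind Arc<Mutex<..>> as the synchronization mechanism. The function names sum_in_thread and push_from_threads are hypothetical.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Ownership of `v` moves into the spawned thread; the caller can no longer touch it.
fn sum_in_thread(v: Vec<i32>) -> i32 {
    let handle = thread::spawn(move || v.iter().sum::<i32>());
    handle.join().expect("worker thread panicked")
}

// Several threads write to one vector; the Mutex serializes the writes.
fn push_from_threads(n: i32) -> Vec<i32> {
    let shared = Arc::new(Mutex::new(Vec::<i32>::new()));
    let handles: Vec<_> = (0..n)
        .map(|i| {
            let shared = Arc::clone(&shared); // each thread owns its own handle
            thread::spawn(move || {
                shared.lock().unwrap().push(i);
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    let mut result = shared.lock().unwrap().clone();
    result.sort(); // thread scheduling order is not deterministic
    result
}

fn main() {
    println!("{}", sum_in_thread(vec![1, 2, 3]));
    println!("{:?}", push_from_threads(4));
}
```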
Build a tree structure
struct Entry {
    name: String,
    children: Vec<Entry>
}

fn tree(path: &Path) -> Entry {
    Entry {
        name: path.file_name()
            .and_then(|name| name.to_str())
            .map_or(String::from("."), |str| String::from(str)),
        children: if path.is_dir() {
            children(path)
        } else {
            Vec::new()
        }
    }
}
Since this is a tree, the structure defined is recursive: struct Entry {} contains a list of child entries. The tree structure I want to implement is roughly as follows:
entry :: {name, [child]}
child :: entry
An explicit return is not required in Rust: the result of the last expression is treated as the return value, so the whole Entry structure is returned here.
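The recursive shape and the last-expression rule can be tried without touching the file system. In this sketch the helpers leaf and count are hypothetical; the Entry struct repeats the definition above so the example stands alone.

```rust
struct Entry {
    name: String,
    children: Vec<Entry>, // Vec adds the indirection that lets the type contain itself
}

// Hypothetical helper: the struct literal is the last expression, so it is the return value.
fn leaf(name: &str) -> Entry {
    Entry {
        name: String::from(name),
        children: Vec::new(),
    }
}

// Hypothetical helper: counts entries, recursing the same way the structure does.
fn count(entry: &Entry) -> usize {
    1 + entry.children.iter().map(count).sum::<usize>()
}

fn main() {
    let root = Entry {
        name: String::from("."),
        children: vec![
            leaf("Cargo.toml"),
            Entry {
                name: String::from("src"),
                children: vec![leaf("main.rs")],
            },
        ],
    };
    println!("{} contains {} entries", root.name, count(&root));
}
```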
path.file_name()
    .and_then(|name| name.to_str())
    .map_or(String::from("."), |str| String::from(str)),
This code looks complicated, but what it does is simple: it gets the name of the current file. So why is the logic so convoluted? It is a consequence of the multiple string representations in Rust. Let's look at the definition of each function.
The definition of Path::file_name
pub fn file_name(&self) -> Option<&OsStr>
and_then is the name Rust gives to the common flat_map operation; here it chains two Option-returning steps.
The definition of OsStr::to_str
pub fn to_str(&self) -> Option<&str>
The expression path.file_name().and_then(|name| name.to_str()) above ends up as an Option<&str>, on which we call Option::map_or with a default value, the string ".". Why provide a default? This is closely related to the OsStr to str conversion. When we pass "." as the argument, Path::file_name actually returns None.
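The behavior described above is easy to check in isolation. In the sketch below, display_name is a hypothetical wrapper around the same expression used in tree.

```rust
use std::path::Path;

// Hypothetical wrapper around the expression used in `tree`.
fn display_name(path: &Path) -> String {
    path.file_name() // Option<&OsStr>; None when the path is just "."
        .and_then(|name| name.to_str()) // Option<&str>; None if the name is not valid UTF-8
        .map_or(String::from("."), String::from) // fall back to "." when there is no name
}

fn main() {
    println!("{}", display_name(Path::new(".")));
    println!("{}", display_name(Path::new("src/main.rs")));
}
```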
After building the parent tree structure, we need to complete the child tree structure as well, and finally build an in-memory directory tree through recursion.
fn children(dir: &Path) -> Vec<Entry> {
    fs::read_dir(dir)
        .expect("unable to read dir")
        .into_iter()
        .map(|e| e.expect("unable to get entry"))
        .filter(|e| is_not_hidden(&e))
        .map(|e| e.path())
        .map(|e| tree(&e))
        .collect()
}

fn is_not_hidden(entry: &DirEntry) -> bool {
    entry
        .file_name()
        .to_str()
        .map(|s| !s.starts_with("."))
        .unwrap_or(false)
}
There are also quite a few conversion operations, which we will explain.
fs::read_dir(dir).expect("unable to read dir")
expect is used because fs::read_dir returns an io::Result<ReadDir>; calling expect on it tries to unwrap the value and panics with the given message if there is an error. The unwrapped value has type ReadDir, which is an iterator over io::Result<DirEntry>, that is, over all the entries in the directory. Calling into_iter() on it yields an iterator that can be consumed.
.map(|e| e.expect("unable to get entry"))
.filter(|e| is_not_hidden(e))
.map(|e| e.path())
.map(|e| tree(&e))
Then, after unwrapping each io::Result<DirEntry>, we filter out hidden files. filter takes a closure whose type is declared as P: FnMut(&Self::Item) -> bool, so every element the closure receives is already a reference, and there is no need to write is_not_hidden(&e).
We then use e.path() to get the full path of each entry and hand it to tree in turn to build recursively. The in-memory directory tree is constructed by mutual recursion between the tree and children functions.
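The version above panics on any I/O error. As an alternative sketch, not the approach this article takes, the same steps can propagate errors with the ? operator. The function visible_paths is a hypothetical name, and it returns paths rather than Entry values to keep the example self-contained.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Hypothetical variant of `children`: returns errors instead of panicking.
fn visible_paths(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut paths = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?; // each item is an io::Result<DirEntry>
        // Same rule as is_not_hidden: names that are not valid UTF-8 count as hidden.
        let hidden = entry
            .file_name()
            .to_str()
            .map_or(true, |s| s.starts_with('.'));
        if !hidden {
            paths.push(entry.path());
        }
    }
    paths.sort(); // read_dir makes no promise about ordering
    Ok(paths)
}

fn main() {
    match visible_paths(Path::new(".")) {
        Ok(paths) => {
            for p in paths {
                println!("{}", p.display());
            }
        }
        Err(e) => eprintln!("unable to read dir: {}", e),
    }
}
```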
With the tree structure in memory, we can now render the structure. The specific approach is as follows:
- For a first-level entry, if it is the last entry in its directory, the prefix is L_branch = "└── "; otherwise it is T_branch = "├── ".
- For entries in subdirectories, if the parent is the last entry of its own parent directory, the prefix is SPACER = "    "; otherwise it is I_branch = "│   ".
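The rendering code calls a decorate function whose body is not shown in this article. The following is one possible sketch of it under the two rules above: the signature is inferred from the call decorate(is_last, render_tree(child)), and the exact prefix widths are an assumption based on the usual output of the tree command.

```rust
// Prefix strings; the widths are an assumption, not taken from the original source.
const L_BRANCH: &str = "└── ";
const T_BRANCH: &str = "├── ";
const SPACER: &str = "    ";
const I_BRANCH: &str = "│   ";

// Hypothetical implementation: `lines` is the rendered subtree of one child,
// whose first line is the child's own name and whose remaining lines are its descendants.
fn decorate(is_last: bool, lines: Vec<String>) -> Vec<String> {
    let (first_prefix, rest_prefix) = if is_last {
        (L_BRANCH, SPACER)
    } else {
        (T_BRANCH, I_BRANCH)
    };
    lines
        .into_iter()
        .enumerate()
        .map(|(i, line)| {
            let prefix = if i == 0 { first_prefix } else { rest_prefix };
            format!("{}{}", prefix, line)
        })
        .collect()
}

fn main() {
    let subtree = vec![String::from("src"), String::from("└── main.rs")];
    for line in decorate(true, subtree) {
        println!("{}", line);
    }
}
```

Because each level only adds one prefix to lines that deeper levels have already decorated, the indentation accumulates naturally as the recursion unwinds.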
Render the tree structure
fn render_tree(tree: &Entry) -> Vec<String> {
    let mut names = vec![tree.name]; // error
    let children = &tree.children;
    let children: Vec<_> = children
        .iter()
        .enumerate()
        .map(|(i, child)| decorate(children.len() - 1 == i, render_tree(child)))
        .flatten()
        .collect();
    names.extend(children);
    names
}
There will be a compilation error with the following error message:
error[E0507]: cannot move out of `tree.name` which is behind a shared reference
  --> src/main.rs:48:26
   |
48 |     let mut names = vec![tree.name];
   |                          ^^^^^^^^^ move occurs because `tree.name` has type `std::string::String`, which does not implement the `Copy` trait
Since tree.name is a String rather than a scalar type, it does not implement the Copy trait (see the note at the end). And since tree itself is a compound type reached through a shared reference, moving tree.name out would leave the tree that contains it in an invalid state. To avoid this, we have to take a reference, &tree.name. Once the reference is added, however, a type-mismatch compilation error occurs.
59 |     names
   |     ^^^^^ expected struct `std::string::String`, found reference
   |
   = note: expected type `std::vec::Vec<std::string::String>`
              found type `std::vec::Vec<&std::string::String>`
We expect Vec<String>, not Vec<&String>, so we need to construct a new String. You can use String::from(&String):
let mut names = vec![String::from(&tree.name)];
With this change, the code compiles. But Rust actually provides a more convenient way to write it:
let mut names = vec![tree.name.to_owned()];
Using to_owned() has the same effect as constructing a new String.
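A quick way to convince yourself that these spellings are interchangeable is to compare their results. In this sketch the Entry struct is trimmed to the one field that matters, copies_of_name is a hypothetical helper, and clone() is added as a third common spelling.

```rust
// Trimmed-down stand-in for the Entry struct used in this article.
struct Entry {
    name: String,
}

// Hypothetical helper: three ways to get an owned String from a borrowed field.
fn copies_of_name(entry: &Entry) -> [String; 3] {
    [
        String::from(&entry.name), // From<&String> for String
        entry.name.to_owned(),     // ToOwned, which for String is a clone
        entry.name.clone(),        // Clone, the most common spelling
    ]
}

fn main() {
    let entry = Entry { name: String::from("src") };
    println!("{:?}", copies_of_name(&entry));
    println!("{}", entry.name); // the original is untouched
}
```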
Putting it together
use std::env;
use std::path::Path;
use std::fs::{self, DirEntry};

fn main() {
    let args: Vec<String> = env::args().collect();
    println!("{}", render_tree(&tree(Path::new(&args[1]))).join("\n"));
}
render_tree returns Vec<String>, so to print it we join all the elements with "\n".
.
├── Cargo.toml
├── Cargo.lock
└── src
    └── main.rs
conclusion
My subjective impression from learning Rust is that its concepts are complex and some of its design is genuinely confusing. Coupled with the large number of types (e.g., OsStr, String), the code is hard to write intuitively, and it takes a lot of documentation reading to silence the compiler. So the learning curve is relatively steep.
However, more language constraints are, in some ways, a boon for programmers. If it compiles, then it works. With that philosophy ahead, the road of learning is long and hard, so keep at it.
Note: common scalar types implement the Copy trait.
- All integer types, such as u32
- The Boolean type, bool, with values true and false
- The character type, char
- All floating-point types, such as f64
- Tuples, if and only if all their elements are Copy; for example, (i32, i32) is Copy, but (i32, String) is not.
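The list above can be checked with a few assignments. In this sketch the function names copy_demo and tuple_with_string are hypothetical; the commented-out line marks the use that the compiler would reject.

```rust
// Every assignment here copies, so the originals stay usable afterwards.
fn copy_demo() -> (u32, f64, (i32, i32), bool) {
    let n: u32 = 5;
    let n2 = n;
    let x: f64 = 1.5;
    let x2 = x;
    let c: char = 'r';
    let c2 = c;
    let flag = true;
    let flag2 = flag;
    let pair = (1, 2);
    let other = pair; // (i32, i32) is Copy because both elements are
    (
        n + n2,
        x + x2,
        (pair.0 + other.0, pair.1 + other.1),
        c == c2 && flag == flag2,
    )
}

// (i32, String) is not Copy, so assigning it moves the whole tuple.
fn tuple_with_string() -> usize {
    let pair = (1, String::from("x"));
    let moved = pair;
    // println!("{}", pair.0); // would not compile: `pair` has been moved
    moved.1.len()
}

fn main() {
    println!("{:?}", copy_demo());
    println!("{}", tuple_with_string());
}
```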
On 22 September 2019
[1] www.reddit.com/r/rust/comm…
[2] Squidarth.com/rc/rust/201…