If the pits we dug in the earlier installments were dug with a small shovel, today’s pit is dug with an excavator.
Today I will introduce a core concept of Rust: Ownership. This article is divided into two parts: what Ownership is, and how Ownership is transferred.
What is Ownership
Each programming language has its own approach to memory management. Some require explicit allocation and reclamation of memory (such as C), and some languages rely on garbage collectors to reclaim unused memory (such as Java). Rust does not belong to any of the above categories. It has its own memory management rules, called Ownership.
Before introducing Ownership in detail, I would like to make one thing clear. The data types described in the earlier article, The Rust Pit Guide: General Routine, all store their data on the stack. For more complex data structures such as String or some custom data structures (which we’ll cover in more detail later), the data is stored in heap memory. Having made this clear, let’s look at the rules of Ownership.
The rules of Ownership
- In Rust, each value has a corresponding variable called the owner of the value
- A value can have only one owner at a time
- When the owner goes out of scope, the value is destroyed
These three rules are very important and remembering them will help you understand this article better.
Variable scope
One of the Ownership rules says that a value is destroyed once its owner goes out of scope. So how is the scope of an owner defined? In Rust, curly braces usually delimit the scope of a variable. The most common case is a function body: the variable s is valid from the point where it is defined until the function ends, after which it is no longer valid.
fn main() {          // s is not valid here, it's not yet declared
    let s = "hello"; // s is valid from this point forward
    // do stuff with s
}                    // this scope is now over, and s is no longer valid
This is a lot like most other programming languages: memory is allocated for a variable when it is defined. When it comes to reclaiming that memory, however, every language has its own approach. Languages that rely on a GC do not need to worry about reclamation. Other languages require memory to be reclaimed explicitly, which invites problems such as forgetting to free memory or freeing it twice. To be more developer-friendly, Rust reclaims memory automatically: the memory allocated for a variable is reclaimed when the variable goes out of scope.
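To make the "reclaimed when it goes out of scope" rule visible, here is a minimal sketch. The Noisy type and the drop_order function are hypothetical names invented for this illustration; the only real mechanism used is the standard Drop trait, which Rust calls when an owner leaves its scope.

```rust
use std::cell::RefCell;

// Hypothetical type that records the moment it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Returns the order in which events happened.
fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Noisy { name: "outer", log: &log };
        {
            let _inner = Noisy { name: "inner", log: &log };
            log.borrow_mut().push("end of inner scope".to_string());
        } // _inner goes out of scope and is dropped here
        log.borrow_mut().push("end of outer scope".to_string());
    } // _outer goes out of scope and is dropped here
    log.into_inner()
}

fn main() {
    for line in drop_order() {
        println!("{}", line);
    }
}
```

Each value is dropped exactly at the closing brace of the scope that owns it, with no explicit free call anywhere.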
Moving Ownership
As mentioned earlier, curly braces usually mark the boundary of a variable’s scope (that is, where its Ownership ends). Besides curly braces, other situations can also change Ownership. Let’s look at two pieces of code first.
let x = 5;
let y = x;
println!("x: {}", x);
let s1 = String::from("hello");
let s2 = s1;
println!("s1: {}", s1);
Author’s note: the double colon is Rust’s path separator. Here it refers to the from function associated with the String type, which is typically used to build a String value.
The only difference between these two pieces of code seems to be the type of the variable: the first uses an integer and the second uses a String. Yet the first prints the value of x normally, while the second fails to compile. What is the reason for this?
Let’s walk through the code. In the first piece, we take the integer value 5 and assign it to x, then copy the value of x and assign the copy to y. Finally, we print x successfully. That seems logical, and it is in fact exactly what Rust does.
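The copy behaviour of integers can be checked directly. The sketch below uses a hypothetical function name, copy_demo; it changes the copy and shows that the original is untouched and still usable.

```rust
// Integers live entirely on the stack, so assignment copies the value.
fn copy_demo() -> (i32, i32) {
    let x = 5;
    let mut y = x; // y receives its own copy of the value 5
    y += 1;        // changing y does not affect x
    (x, y)         // x is still valid after the assignment
}

fn main() {
    let (x, y) = copy_demo();
    println!("x: {}, y: {}", x, y);
}
```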
We might expect the same of the second piece of code, but Rust doesn’t do it that way. Here’s why: for larger objects, such copying wastes both space and time. So what does Rust actually do?
First, we need to understand the structure of the String type in Rust:
On the left is the structure of a String: a pointer to the content, a length, and a capacity. In this example the length and the capacity happen to be equal, so let’s not worry about the difference for now; it will come up later when we look at strings in more detail. This part is stored in stack memory. The right-hand part is the content of the string, which is stored in heap memory.
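The three parts of the structure can be inspected with standard String methods. This is only a sketch: string_parts is a hypothetical helper, and the capacity of 16 is an arbitrary value chosen so that length and capacity differ. The printed stack size is whatever the current platform uses; it is shown for illustration, not as a guaranteed layout.

```rust
use std::mem::size_of;

// Builds a String whose length and capacity differ and returns both.
fn string_parts() -> (usize, usize) {
    let mut s = String::with_capacity(16); // reserve room for at least 16 bytes on the heap
    s.push_str("hello");                   // only 5 bytes are in use
    (s.len(), s.capacity())
}

fn main() {
    let (len, cap) = string_parts();
    println!("len = {}, capacity = {}", len, cap);
    // The stack part has a fixed size no matter how long the content is.
    println!("stack size of String = {} bytes", size_of::<String>());
    println!("size of one usize    = {} bytes", size_of::<usize>());
}
```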
Some of you might think: since copying the content wastes resources, why not copy only the structure part? No matter how large the content is, the amount copied is fixed and small, and the copy happens on the stack, just like with an integer. That sounds like a good idea. Let’s analyze it. The memory layout would look something like this.
What’s wrong with that? Remember the Ownership rules: when an owner goes out of scope, the memory occupied by its data is reclaimed. In this example, when s1 and s2 both go out of scope at the end of the function, the heap block on the right of the figure would be freed twice. A double free like this can lead to unpredictable bugs.
Rust solves this problem as follows: once let s2 = s1; has executed, s1 is considered invalid. The owner of the content on the right is now s2; in other words, ownership has moved from s1 to s2. That is the picture below.
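One way to convince yourself that a move does not copy the heap content is to compare the buffer address before and after. In this sketch, move_keeps_buffer is a hypothetical name; as_ptr is the standard method that returns the address of the string’s heap buffer.

```rust
// Returns true if s2 ends up owning the very same heap buffer s1 had.
fn move_keeps_buffer() -> bool {
    let s1 = String::from("hello");
    let before = s1.as_ptr(); // address of the heap content
    let s2 = s1;              // ownership moves; s1 can no longer be used
    let after = s2.as_ptr();
    // println!("{}", s1);    // would not compile: s1 was moved
    before == after
}

fn main() {
    println!("same heap buffer after move: {}", move_keeps_buffer());
}
```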
Another implementation: Clone
If you really do need a deep copy, that is, a copy of the data in heap memory, Rust can do that too. It provides a common method called clone.
let s1 = String::from("hello");
let s2 = s1.clone();
println!("s1 = {}, s2 = {}", s1, s2);
After the clone method is executed, the memory structure is shown as follows:
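As a rough check of that layout, the sketch below (clone_is_independent is a hypothetical name) confirms that the clone has its own heap buffer and that modifying it leaves the original untouched.

```rust
// Returns whether the buffers differ, plus both strings after modifying the clone.
fn clone_is_independent() -> (bool, String, String) {
    let s1 = String::from("hello");
    let mut s2 = s1.clone();                            // deep copy of the heap data
    let separate_buffers = s1.as_ptr() != s2.as_ptr();  // two distinct heap buffers
    s2.push_str(", world");                             // only the clone changes
    (separate_buffers, s1, s2)
}

fn main() {
    let (separate, s1, s2) = clone_is_independent();
    println!("separate buffers: {}, s1 = {}, s2 = {}", separate, s1, s2);
}
```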
Transfer between functions
So far we have looked at Ownership moving between variables. The same thing happens when a value is passed to a function.
fn main() {
    let s = String::from("hello"); // s comes into scope
    takes_ownership(s);            // the value of s moves into the function
                                   // ... s is no longer valid here
} // s was already moved, so nothing is freed here

fn takes_ownership(some_string: String) { // some_string comes into scope
    println!("{}", some_string);
} // some_string goes out of scope and drop is called;
  // the memory is freed
Is there any way to keep s usable after calling takes_ownership? One option is to have the function hand the ownership back. The natural way to do that is through the function’s return value, which we covered earlier: since passing a parameter can transfer ownership, returning a value should be able to as well. So we can do this:
fn main() {
    let s1 = String::from("hello");    // s1 comes into scope
    let s2 = takes_and_gives_back(s1); // s1 is moved into takes_and_gives_back,
                                       // which returns the ownership to s2
} // s2 goes out of scope and its memory is reclaimed; s1 was moved earlier

// takes_and_gives_back takes a String and returns one
fn takes_and_gives_back(a_string: String) -> String { // a_string comes into scope
    a_string // a_string is returned, moving ownership out of the function
}
This meets our needs, but it is a bit of a hassle. Fortunately, Rust agrees, and provides us with another way: references.
References and borrowing
Using a reference is simple: just add an ampersand.
fn main() {
    let s1 = String::from("hello");
    let len = calculate_length(&s1); // pass a reference; s1 keeps ownership
    println!("The length of '{}' is {}.", s1, len);
}

fn calculate_length(s: &String) -> usize {
    s.len()
}
This form allows access to a value without ownership. Its principle is shown as follows:
This example is very similar to the one we wrote earlier, but a closer look reveals two main differences:
- s1 is preceded by an ampersand when it is passed as an argument. This creates a reference to s1. The reference is not the owner of the data, so the data is not destroyed when the reference goes out of scope.
- The parameter type in the function signature is also prefixed with an ampersand. This means the parameter receives a reference to a String.
Having a reference as a function parameter is called borrowing. It is just like real life: when I finish my homework, I can lend it to you to copy, but it does not belong to you, and you have to give it back to me. (Friendly reminder: do not copy homework unless it is an emergency.)
In addition, you may copy my homework, but you may not correct it. If I had it right and you changed it to something wrong, how could I ever lend it to you again? So, in calculate_length, s cannot be modified.
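Here is a small sketch of those two points: the owner stays valid after lending, and the borrower cannot modify what it borrowed. The function count_vowels and the variable homework are hypothetical names chosen for this illustration.

```rust
// Hypothetical helper: reads the string through a shared (immutable) reference.
fn count_vowels(s: &String) -> usize {
    // s.push('!'); // would not compile: s is an immutable reference
    s.chars().filter(|c| "aeiou".contains(*c)).count()
}

fn main() {
    let homework = String::from("hello world");
    let first = count_vowels(&homework);  // lend it once
    let second = count_vowels(&homework); // lend it again
    // homework is still owned here and still usable
    println!("'{}' has {} vowels (second count: {})", homework, first, second);
}
```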
Mutable references
What if I find I made a mistake and ask you to correct it for me? I have to authorize you to modify it, and you have to state that you are going to modify it. Rust has a solution for this too. Remember the mutable and immutable variables we talked about earlier? References work similarly: we can use the mut keyword to make a reference mutable.
fn main() {
    let mut s = String::from("hello");
    change(&mut s);
}

fn change(some_string: &mut String) {
    some_string.push_str(", world");
}
This way, we can modify the referenced value inside the function. Note, however, that there can be only one mutable reference to a given value at any one time. This restriction lets Rust prevent data races at compile time.
If we need several mutable references, we can create new scopes ourselves so that they are never alive at the same time:
let mut s = String::from("hello");
{
    let r1 = &mut s;
} // r1 goes out of scope here
let r2 = &mut s;
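The same pattern can be written as a complete program whose result can be observed. This is only a sketch; append_twice is a hypothetical name. The two mutable references are used one after the other, never at the same time.

```rust
// Modifies the same String through two mutable references, one after the other.
fn append_twice() -> String {
    let mut s = String::from("hello");
    {
        let r1 = &mut s;
        r1.push_str(", world");
    } // r1 goes out of scope here, so s can be mutably borrowed again
    let r2 = &mut s;
    r2.push('!');
    s
}

fn main() {
    println!("{}", append_twice());
}
```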
Another conflict is the “read/write conflict,” the limitation between immutable and mutable references.
let mut s = String::from("hello");
let r1 = &s; // no problem
let r2 = &s; // no problem
let r3 = &mut s; // BIG PROBLEM
println!("{}, {}, and {}", r1, r2, r3);
Such code is also rejected at compile time. The holder of an immutable reference expects the value not to change while the reference is still in use. A small change fixes it:
let mut s = String::from("hello");
let r1 = &s; // no problem
let r2 = &s; // no problem
println!("{} and {}", r1, r2);
// r1 and r2 are no longer used after this point
let r3 = &mut s; // no problem
println!("{}", r3);
After the first print statement, the Rust compiler determines that r1 and r2 are no longer used. Since r3 has not been created yet at that point, the immutable and mutable borrows do not overlap, so this code is legal.
Dangling pointers
Perhaps the biggest headache in any language that manipulates pointers is the dangling pointer: a pointer that is still used after the memory it points to has been reclaimed. Rust’s compiler helps us avoid this problem (thanks again to the Rust compiler).
fn main() {
    let reference_to_nothing = dangle();
}

fn dangle() -> &String {
    let s = String::from("hello");
    &s // s is dropped at the end of the function, so this reference would dangle
}
Take a look at the example above. In dangle, the return value is a reference to the string s. But when the function ends, s goes out of scope and its memory is reclaimed, so the reference would point at freed memory: a dangling pointer. Rust rejects this code at compile time with a missing lifetime specifier error (the compiler reports that it expected a lifetime parameter).
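The article does not show a fix for this case. One common way out, sketched here under the hypothetical name no_dangle, is to return the String itself so that ownership moves to the caller and nothing is freed early.

```rust
// Returns the String by value: ownership moves out to the caller.
fn no_dangle() -> String {
    let s = String::from("hello");
    s // moved out of the function, so it is not dropped here
}

fn main() {
    let owned = no_dangle();
    println!("{}", owned);
}
```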
Another reference: Slice
In addition to references, there is another data type that does not take ownership, called a slice. A slice is a reference to a contiguous sequence of elements in a collection.
Here is a simple example to illustrate the use of slices. Suppose we need to get the first word of a given string. What would you do? It’s actually quite simple: iterate over the characters, and when you encounter a space, return everything iterated so far.
The as_bytes method converts a string into a byte slice, iter returns an iterator over the elements of a collection, and enumerate wraps that iterator so that it yields tuples of (element position, element value). So we can write the following.
fn first_word(s: &String) -> usize {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return i; // position of the first space
        }
    }
    s.len() // no space found: the whole string is one word
}
Take a look at this example. It only returns the position of the first space, but that is enough, as long as we can cut the string at that position. Let’s not spoil string slicing just yet, though, or the problem would never show up.
What’s the problem with writing it this way? Take a look at the main function.
fn main() {
    let mut s = String::from("hello world");
    let word = first_word(&s);
    s.clear();
}
After the space position is retrieved, the string s is cleared. But word is still 5, and there will be a problem if we try to take the first 5 characters of s. Some of you may think you would never make that mistake, but are you sure a careless teammate wouldn’t? I’m not. So what can we do? That’s where slices come in.
A slice lets us reference a sequence of characters in a string. For example, &s[0..5] gets the first five characters of the string s: 0 is the index of the starting position, and 5 is the index of the last position plus 1. In other words, a slice range is closed on the left and open on the right.
Slice has a few more rules:
- If the starting position is 0, it can be omitted: &s[0..2] and &s[..2] are equivalent.
- If the ending position is the end of the sequence, it can also be omitted: &s[3..len] and &s[3..] are equivalent.
- Combining the two rules above, &s[0..len] and &s[..] are equivalent.
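The half-open range and the shorthand forms can be checked with a short sketch. The function name slice_forms and the sample string are assumptions made for this illustration.

```rust
// Returns the two words of "hello world" and whether the shorthand forms agree.
fn slice_forms() -> (String, String, bool) {
    let s = String::from("hello world");
    let len = s.len();
    let hello = &s[0..5];   // indices 0, 1, 2, 3, 4 (5 is excluded)
    let world = &s[6..len]; // from index 6 to the end
    let same = hello == &s[..5]      // start omitted
        && world == &s[6..]          // end omitted
        && &s[0..len] == &s[..];     // both omitted
    (hello.to_string(), world.to_string(), same)
}

fn main() {
    let (hello, world, same) = slice_forms();
    println!("{} / {} / shorthand forms agree: {}", hello, world, same);
}
```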
The important thing to note here is that string slice indices are byte positions, and both bounds must fall on valid UTF-8 character boundaries.
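To see what a character boundary means in practice, the sketch below uses a string containing a two-byte character. The function name boundary_checks is hypothetical; is_char_boundary and get are standard methods, and get returns None instead of panicking when a bound falls inside a character.

```rust
// Checks byte positions in "héllo", where 'é' occupies bytes 1 and 2.
fn boundary_checks() -> (bool, bool, Option<String>, Option<String>) {
    let s = String::from("h\u{e9}llo"); // "héllo": 'é' (U+00E9) takes two bytes in UTF-8
    (
        s.is_char_boundary(1),              // start of 'é': a valid boundary
        s.is_char_boundary(2),              // middle of 'é': not a boundary
        s.get(0..2).map(|t| t.to_string()), // would cut 'é' in half
        s.get(0..3).map(|t| t.to_string()), // ends right after 'é'
    )
}

fn main() {
    let (b1, b2, bad, good) = boundary_checks();
    println!("boundary at 1: {}, boundary at 2: {}", b1, b2);
    println!("get(0..2) = {:?}, get(0..3) = {:?}", bad, good);
}
```

Indexing with &s[0..2] on this string would panic at run time, which is why get can be the safer choice when the bounds are not known to be valid.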
Slice solves our problem
fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i]; // slice up to, but not including, the space
        }
    }
    &s[..] // no space found: the whole string is one word
}
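Now if we clear s in main and then try to use word, the compiler disapproves: s.clear() needs a mutable borrow of s, while word still holds an immutable one. That’s right, the all-purpose compiler again.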
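A sketch of how the slice version behaves in practice follows. The first_word function is the one from the article; the surrounding function, demo, is a hypothetical name. The commented-out line marks the use that the compiler would reject, and the working path copies the word out before clearing.

```rust
fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }
    &s[..]
}

// Copies the first word into its own String, then clears the original.
fn demo() -> (String, usize) {
    let mut s = String::from("hello world");
    let word = first_word(&s).to_string(); // owned copy; the borrow of s ends here
    s.clear();                             // allowed: nothing borrows s any more
    // If word were the &str slice itself, using it after s.clear()
    // would not compile, because the slice still borrows s.
    (word, s.len())
}

fn main() {
    let (word, remaining) = demo();
    println!("first word: {}, bytes left in s: {}", word, remaining);
}
```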
Slice can operate on collections other than strings, such as:
let a = [1, 2, 3, 4, 5];
let slice = &a[1..3];
We’ll talk more about collections in the future.
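For completeness, here is a runnable sketch of slicing an array of integers; middle_of_array is a hypothetical name. The range rules are the same as for strings: the start index is included and the end index is excluded.

```rust
// Takes the elements at indices 1 and 2 of a five-element array.
fn middle_of_array() -> Vec<i32> {
    let a = [1, 2, 3, 4, 5];
    let slice: &[i32] = &a[1..3]; // borrows part of the array, no copy yet
    slice.to_vec()                // copy into a Vec so it can be returned
}

fn main() {
    println!("{:?}", middle_of_array());
}
```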
Conclusion
The Ownership features introduced in this article are very important for understanding Rust. We covered the definition of Ownership and the transfer of Ownership, as well as references and slices, two data types that do not take Ownership.
How’s that? Does today’s pit feel deep? If the earlier ones were one floor deep, this one is three floors deep. So please watch your step and enter the pit in an orderly fashion.