Smart Pointers in Rust: What, why, and how?

TL;DR: I will cover some of Rust’s smart pointers: Box, Cell, RefCell, Rc, Arc, RwLock, and Mutex.

Obviously, smart pointers are pointers that are… smart. But what exactly does “smart” mean? When should we use them? How do they work?

These are the questions I will begin to answer here. And that is it: the beginning of an answer, nothing more. I hope this article gives you a background of understanding (something akin to familiarity with the concept) that will help you build a real understanding of the topic, which will come from reading the official documentation and, of course, from practice.

If you are already familiar with the topic, you may use this article as a list of related readings. Look for the “Useful links” at the beginning of each section.

Index:

  1. Box
  2. Cell
  3. RefCell
  4. Rc
  5. Arc
  6. RwLock
  7. Mutex

Smart pointers in general

As explained in The Book, a pointer is a variable that contains an address that “points to” some other data. The usual pointer in Rust is the reference (&). Smart pointers are pointers that “have additional metadata and capabilities,” such as counting how many times a value is borrowed, providing a way to manage read and write locks, and so on.

String and Vec are technically smart pointers too, but I will not cover them here because they are so common and are usually thought of as types rather than pointers.

Also note that from this list, only Arc, RwLock, and Mutex are thread-safe.
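One way to see this claim at compile time is sketched below. It assumes that “thread-safe” here means a type implements both the Send and Sync marker traits; the helper assert_thread_safe is a hypothetical name introduced only for this illustration.

```rust
use std::sync::{Arc, Mutex, RwLock};

// Hypothetical helper: it has no body, but it only compiles when called
// with a type that is both Send and Sync.
fn assert_thread_safe<T: Send + Sync>() {}

fn main() {
    assert_thread_safe::<Arc<i32>>();
    assert_thread_safe::<Mutex<i32>>();
    assert_thread_safe::<RwLock<i32>>();
    assert_thread_safe::<Arc<Mutex<Vec<i32>>>>();

    // Uncommenting any of the lines below is a compile error:
    // assert_thread_safe::<std::rc::Rc<i32>>();        // Rc is neither Send nor Sync
    // assert_thread_safe::<std::cell::Cell<i32>>();    // Cell is not Sync
    // assert_thread_safe::<std::cell::RefCell<i32>>(); // RefCell is not Sync
}
```

The check costs nothing at run time: the compiler rejects the program before it ever runs.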

Box

Useful links: The Book; documentation; Box, stack and heap.

What?

Box<T> allows you to store T on the heap. So if you have, say, a u64 that would be stored on the stack, Box<u64> stores it on the heap.

If you are not comfortable with the concepts of stack and heap, read up on them first (see the useful links above).

Why?

Values stored on the stack cannot grow, because Rust needs to know their size at compile time. The best example I know of how this may affect your programming is in The Book: recursive types. Consider the following code (and its comments).

    // This does not compile. The List contains itself,
    // so it is recursive and has an infinite size.
    enum List {
        Cons(i32, List),
        Nil,
    }

    // This does compile, because the size of a pointer
    // does not change with the size of the pointed-to value.
    enum List {
        Cons(i32, Box<List>),
        Nil,
    }

Be sure to read this section in The Book for details.

More generally, Box is useful when your value is too big to be kept on the stack or when you need to own it.

How?

To get the value T inside a Box<T>, you just dereference it.

    let boxed = Box::new(11);
    assert_eq!(*boxed, 11);
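To connect the dereferencing above with the recursive list from the previous section, here is a minimal sketch that builds a small list and walks it. The function sum is a hypothetical helper written for this illustration, not something from The Book.

```rust
// Cons list as in The Book; each node owns the rest of the list through a Box.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

use List::{Cons, Nil};

// Hypothetical helper: follows each Box and adds up the values.
fn sum(list: &List) -> i32 {
    match list {
        // `rest` is a `&Box<List>`; deref coercion turns it into `&List`.
        Cons(value, rest) => *value + sum(rest),
        Nil => 0,
    }
}

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
    assert_eq!(sum(&list), 6);

    // Dereferencing with `*` copies the u64 out of the heap allocation.
    let boxed: Box<u64> = Box::new(11);
    let unboxed: u64 = *boxed;
    assert_eq!(unboxed, 11);
}
```

Note that sum never dereferences a Box explicitly: passing a &Box<List> where a &List is expected is enough.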

Cell

Useful links: module documentation; pointer documentation.

What?

Cell<T> gives you a shared reference to T while still allowing you to change T. It is one of the “shareable mutable containers” provided by the module std::cell.

Why?

In Rust, shared references are immutable. This guarantees that when you access the inner value you do not get something different from what you expected, and that you do not try to access the value after it was freed (which is a big chunk of the 70% of security vulnerabilities that are memory safety issues).

How?

What Cell<T> does is provide functions that control our access to T. You can find them all in the documentation, but for our explanation we only need two: get() and set().

Basically, Cell<T> allows you to freely change T with set() because when you use get(), you retrieve a copy of T, not a reference. That way, even if you change T, the copied value you got with get() stays the same, and if you destroy T, no pointer will dangle.

One final note: get() requires T to implement Copy.

    use std::cell::Cell;

    let container = Cell::new(11);
    let eleven = container.get();
    {
        container.set(12);
    }
    let twelve = container.get();

    assert_eq!(eleven, 11);
    assert_eq!(twelve, 12);
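A common place where this matters is a field that must change behind a shared reference. The sketch below assumes a hypothetical Sensor type (not part of any library) that counts how many times it was read, even though read() only takes &self.

```rust
use std::cell::Cell;

// Hypothetical type for illustration: remembers how often it was read.
struct Sensor {
    value: i32,
    reads: Cell<u32>,
}

impl Sensor {
    fn new(value: i32) -> Self {
        Sensor { value, reads: Cell::new(0) }
    }

    // Takes `&self`, yet still updates `reads` through the Cell.
    fn read(&self) -> i32 {
        self.reads.set(self.reads.get() + 1);
        self.value
    }
}

fn main() {
    let sensor = Sensor::new(42);
    // Two shared references to the same sensor.
    let a = &sensor;
    let b = &sensor;
    assert_eq!(a.read(), 42);
    assert_eq!(b.read(), 42);
    assert_eq!(sensor.reads.get(), 2);
}
```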

RefCell

Useful links: The Book; documentation.

What?

RefCell<T> also provides shared references to T, but while Cell is statically checked (Rust checks it at compile time), RefCell<T> is dynamically checked (Rust checks it at run time).

Why?

Because Cell works with copies, you should restrict yourself to small values with it, which means you need references once again, which leads us back to the problem that Cell solved.

The way RefCell deals with it is by keeping track of who is reading and who is changing T. That is why RefCell<T> is dynamically checked: the check happens in the code you write, while the program runs. But don’t worry, Rust still makes sure you cannot mess up: a borrow that breaks the rules panics at run time instead of corrupting memory.

How?

RefCell<T> has methods that borrow either a mutable or an immutable reference to T, and they will not let you do so if the operation would be unsafe. As with Cell, there are several methods in RefCell, but these two are enough to illustrate the concept: borrow(), which gets an immutable reference; and borrow_mut(), which gets a mutable reference. The logic used by RefCell goes something like this:

  • If there is no reference (either mutable or immutable) to T, you may get either a mutable or an immutable reference to it;
  • If there is already a mutable reference to T, you may get nothing until that reference is dropped;
  • If there are one or more immutable references to T, you may get another immutable reference to it.

As you can see, you cannot get both mutable and immutable references to T at the same time.

Keep in mind that RefCell is not thread-safe. When I say you “cannot” do something, I am talking about a single thread.

Another way to think about it is:

  • Immutable references are shared references;
  • Mutable references are exclusive references.

It is worth mentioning that the functions above have variants that do not panic, but return a Result instead: try_borrow() and try_borrow_mut().

    use std::cell::RefCell;

    let container = RefCell::new(11);
    {
        let _c = container.borrow();
        // You may borrow as immutable as many times as you want...
        assert!(container.try_borrow().is_ok());
        // ...but you cannot borrow as mutable, because
        // it is already borrowed as immutable.
        assert!(container.try_borrow_mut().is_err());
    }

    // After the first borrow as mutable...
    let _c = container.borrow_mut();
    // ...you cannot borrow in any way.
    assert!(container.try_borrow().is_err());
    assert!(container.try_borrow_mut().is_err());
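The borrowing rules above become more useful with a value that is not Copy, where Cell would not help. Here is a minimal sketch with a Vec<String> behind a RefCell; the helpers record and try_record are hypothetical names made up for this example.

```rust
use std::cell::RefCell;

// Hypothetical helper: appends to the log through a shared reference.
// It panics if the log is already borrowed elsewhere.
fn record(log: &RefCell<Vec<String>>, entry: &str) {
    log.borrow_mut().push(entry.to_string());
}

// Hypothetical helper: returns false instead of panicking
// when the log is already borrowed.
fn try_record(log: &RefCell<Vec<String>>, entry: &str) -> bool {
    match log.try_borrow_mut() {
        Ok(mut entries) => {
            entries.push(entry.to_string());
            true
        }
        Err(_) => false,
    }
}

fn main() {
    let log = RefCell::new(Vec::new());
    record(&log, "first");
    assert!(try_record(&log, "second"));
    {
        // While an immutable borrow is alive, writing is refused.
        let reader = log.borrow();
        assert_eq!(reader.len(), 2);
        assert!(!try_record(&log, "third"));
    }
    // The reader is gone, so writing works again.
    assert!(try_record(&log, "third"));
    assert_eq!(log.borrow().len(), 3);
}
```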

Rc

Useful links: The Book; module documentation; pointer documentation; Rust by Example.

What?

I will quote the documentation on this one:

The type Rc<T> provides shared ownership of a value of type T, allocated in the heap. Invoking clone on Rc produces a new pointer to the same allocation in the heap. When the last Rc pointer to a given allocation is destroyed, the value stored in that allocation (often referred to as “inner value”) is also dropped.

So, like a Box<T>, Rc<T> allocates T on the heap. The difference is that cloning a Box<T> gives you another T inside another Box, while cloning an Rc<T> gives you another Rc pointing to the same T.

Another important remark is that Rc does not have interior mutability, as Cell and RefCell do.

Why?

You want shared access to a value (without making copies of it), but you also want it released once it is no longer in use, that is, when there is no reference to it.

Since Rc has no interior mutability, it is common to use it together with Cell or RefCell, for example, Rc<Cell<T>>.

How?

With Rc<T>, you use the clone() method. Behind the scenes, it counts the number of references you have and, when the count goes to zero, it drops T.

    use std::rc::Rc;

    let mut c = Rc::new(11);
    {
        // After borrowing as immutable...
        let _first = c.clone();
        // ...you can no longer borrow as mutable...
        assert_eq!(Rc::get_mut(&mut c), None);
        // ...but you can still borrow as immutable.
        let _second = c.clone();
        // Here we have three pointers ("c", "_first", and "_second").
        assert_eq!(Rc::strong_count(&c), 3);
    }

    // After we drop the last two, we are left with "c" itself.
    assert_eq!(Rc::strong_count(&c), 1);
    // Now we can borrow it as mutable.
    let z = Rc::get_mut(&mut c).unwrap();
    *z += 1;
    assert_eq!(*c, 12);
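The combination of Rc with RefCell mentioned earlier can be sketched as follows. The Owner type and its deposit method are hypothetical, made up for this illustration: two owners hold pointers to the same balance, and both can change it.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type for illustration: every owner points to the same balance.
struct Owner {
    account: Rc<RefCell<i64>>,
}

impl Owner {
    // Rc gives shared ownership; RefCell gives the mutability Rc lacks.
    fn deposit(&self, amount: i64) {
        *self.account.borrow_mut() += amount;
    }
}

fn main() {
    let account = Rc::new(RefCell::new(100));
    let alice = Owner { account: Rc::clone(&account) };
    let bob = Owner { account: Rc::clone(&account) };
    assert_eq!(Rc::strong_count(&account), 3);

    alice.deposit(50);
    bob.deposit(-30);
    assert_eq!(*account.borrow(), 120);

    // Dropping the owners brings the count back to one.
    drop(alice);
    drop(bob);
    assert_eq!(Rc::strong_count(&account), 1);
}
```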

Arc

Useful links: documentation; Rust by Example.

What?

Arc is a thread-safe version of Rc because its counters are managed through atomic operations.

Why?

I think the reason you would use Arc instead of Rc is clear (thread safety), so the relevant question becomes: why not just use Arc every time? The answer is that the extra control Arc provides comes with an overhead cost.

How?

Just like with Rc, with Arc<T> you use clone() to get another pointer to the same value T, which is destroyed once the last pointer is dropped.

    use std::sync::Arc;
    use std::thread;

    let val = Arc::new(0);
    let mut handles = Vec::new();
    for i in 0..10 {
        let val = Arc::clone(&val);
        // You could not do this with "Rc"
        handles.push(thread::spawn(move || {
            println!(
                "Value: {:?} / Active pointers: {}",
                *val + i,
                Arc::strong_count(&val)
            );
        }));
    }
    // Wait for the threads so the program does not exit before they print.
    for handle in handles {
        handle.join().unwrap();
    }
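As a slightly larger sketch of the same idea, the example below shares one read-only Vec between several threads through Arc and collects each thread’s result. The function parallel_sum and its way of splitting the work are assumptions made for this illustration.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical helper: every worker reads the same heap allocation
// through its own Arc pointer and sums a slice of the elements.
fn parallel_sum(data: Vec<u64>, workers: usize) -> u64 {
    assert!(workers > 0, "need at least one worker");
    let data = Arc::new(data);
    let handles: Vec<_> = (0..workers)
        .map(|w| {
            let data = Arc::clone(&data);
            thread::spawn(move || {
                // Worker `w` sums every `workers`-th element, starting at index `w`.
                data.iter().skip(w).step_by(workers).sum::<u64>()
            })
        })
        .collect();
    // Joining returns each thread's partial sum.
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    let total = parallel_sum((1..=100).collect(), 4);
    assert_eq!(total, 5050);
}
```

No lock is needed here because the threads only read; as soon as one of them has to write, you need something like the RwLock or Mutex described next.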

RwLock

Useful links: documentation.

RwLock is also provided by the parking_lot crate.

What?

As a reader-writer lock, RwLock<T> only gives you access to T once you hold one of the locks, read or write, which are handed out according to these rules:

  • To read: if you want a read lock, you may get it as long as no writer is holding the lock; otherwise, you have to wait until it is dropped;
  • To write: if you want a write lock, you may get it as long as no one, reader or writer, is holding the lock; otherwise, you have to wait until they are dropped.

Why?

RwLock allows you to read and write the same data from multiple threads. Unlike Mutex (see below), it distinguishes between the kinds of lock, so you may have several read locks at once as long as there is no write lock.

How?

When you want to read an RwLock, you use the function read(), which returns a LockResult containing an RwLockReadGuard. If it succeeds, you can access the value inside the RwLockReadGuard through deref. If a writer is holding the lock, the thread is blocked until it can take the lock. The variant try_read() never blocks; it returns a TryLockResult right away.

Something similar happens when you use write() or try_write(). The difference is that write() waits not only for a writer holding the lock, but also for any reader holding it.

    use std::sync::RwLock;

    let lock = RwLock::new(11);
    {
        let _r1 = lock.read().unwrap();
        // You may pile up as many read locks as you want.
        assert!(lock.try_read().is_ok());
        // But you cannot write.
        assert!(lock.try_write().is_err());
        // Note that if you use "write()" instead of "try_write()"
        // it will wait until all the other locks are released
        // (in this case, never).
    }

    // If you grab the write lock, you may easily change the value.
    let mut l = lock.write().unwrap();
    *l += 1;
    assert_eq!(*l, 12);

If a thread panics while holding the write lock, the lock becomes poisoned: every further attempt to acquire it, for reading or writing, returns a PoisonError. You may recover from a poisoned RwLock by calling into_inner() on that error.

    use std::sync::{Arc, RwLock};
    use std::thread;

    let lock = Arc::new(RwLock::new(11));
    let c_lock = Arc::clone(&lock);

    let _ = thread::spawn(move || {
        let _lock = c_lock.write().unwrap();
        panic!(); // the lock gets poisoned
    })
    .join();

    let read = match lock.read() {
        Ok(l) => *l,
        Err(poisoned) => {
            let r = poisoned.into_inner();
            *r + 1
        }
    };

    // It will be 12 because it was recovered from the poisoned lock.
    assert_eq!(read, 12);
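Putting the pieces together, here is a minimal sketch of an RwLock shared between threads through Arc: one writer changes the data, then several readers look at it at the same time. The function write_then_read is a hypothetical helper for this illustration.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Hypothetical helper: one writer appends an element, then `readers`
// threads each take a read lock and report the length they see.
fn write_then_read(readers: usize) -> Vec<usize> {
    let shared = Arc::new(RwLock::new(vec![1, 2, 3]));

    let writer = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            // The write guard is dropped at the end of this statement.
            shared.write().unwrap().push(4);
        })
    };
    writer.join().unwrap();

    let handles: Vec<_> = (0..readers)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // Several read guards may be alive at the same time.
                let guard = shared.read().unwrap();
                guard.len()
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let lengths = write_then_read(3);
    assert_eq!(lengths, vec![4, 4, 4]);
}
```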

Mutex

Useful links: The Book; documentation.

Mutex is also provided by the parking_lot crate.

What?

Mutex is similar to RwLock, but it allows only one lock holder at a time, no matter whether it is a reader or a writer.

Why?

One reason to prefer Mutex over RwLock is that RwLock may lead to writer starvation (when readers pile up and the writer never gets a chance to take the lock, waiting forever), something that does not happen with Mutex.

Of course, we’re diving into deeper waters here, so real life choices depend on higher-level considerations like how many readers you expect to have at the same time, how the operating system will implement locking, etc…

How?

Mutex and RwLock work in a similar way; the difference is that, because Mutex does not distinguish between readers and writers, you just use lock() or try_lock() to get a MutexGuard. The poisoning logic applies here as well.

    use std::sync::Mutex;

    let guard = Mutex::new(11);
    let mut lock = guard.lock().unwrap();

    // It does not matter whether you lock the Mutex to read or to write,
    // you can only lock it once.
    assert!(guard.try_lock().is_err());

    // You may change it just like you did with RwLock.
    *lock += 1;
    assert_eq!(*lock, 12);

You can handle a poisoned Mutex just like you would a poisoned RwLock.
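As a closing sketch, the example below shows the two situations discussed in this section: several threads updating one value through Arc<Mutex<T>>, and recovering the value from a poisoned Mutex. The functions count and recover_poisoned are hypothetical helpers written for this illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: `threads` threads each add 1 to a shared
// counter `per_thread` times.
fn count(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // The guard is dropped at the end of each statement.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

// Hypothetical helper: poisons a Mutex on purpose, then reads the value anyway.
fn recover_poisoned() -> i32 {
    let mutex = Arc::new(Mutex::new(11));
    let c_mutex = Arc::clone(&mutex);
    let _ = thread::spawn(move || {
        let _guard = c_mutex.lock().unwrap();
        panic!("poisoning the mutex on purpose");
    })
    .join();

    let value = match mutex.lock() {
        Ok(guard) => *guard,
        // into_inner() hands back the guard despite the poisoning.
        Err(poisoned) => *poisoned.into_inner(),
    };
    value
}

fn main() {
    assert_eq!(count(4, 1000), 4000);
    assert_eq!(recover_poisoned(), 11);
}
```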

Thank you for reading!