Rust Data Structure - Hands-on implementation of Vec

The body of the

Bullshit stage

The Vec of Rust is a dynamic array. Dynamic arrays are built into many languages, such as JavaScript and Python, and have been chosen by the standard library for memory-controlling languages like Rust.

Basic usage

let mut v1 : Vec<i32> = vec![]; dbg! (v1.len(), v1.capacity());for i in 1.10{ v1.push(i); } dbg! (&v1);Copy the code

layout

Now write an intuitive implementation of Rust dynamic arrays

pub struct MyVec<T> {
  ptr: *mut T,
  cap: usize,
  len: usize,}Copy the code

Naked pointer *mut T *mut *mut *mut *mut *mut *mut *mut Drop Check is helped by its internal PhantomData.

Unique

could encapsulate the naked pointer

TIs variable

Can be donedropcheck

TTo achieve theSend/Sync, the pointer is also availableSend/SyncFeature, equal to talking about thread safety

Pointers are nonnull

Send/Sync: Send/Sync: Send/Sync

use std::marker::PhantomData;

struct MyUnique<T: ?Sized> {
  ptr: *const T,
  _marker: PhantomData<T>,
}

unsafe impl<T: Send+?Sized> Send for MyUnique<T> {}

unsafe impl<T: Sync+?Sized> Sync for MyUnique<T> {}

impl<T: ?Sized> MyUnique<T> {
  #[inline]
  pub fn new(ptr: *mut T) -> Option<Self> {
    if! ptr.is_null() {Some(unsafe {
        MyUnique {
          ptr: ptr as _,
          _marker: PhantomData,
        }
      })
    } else {
      None}}#[inline]
  pub const fn as_ptr(&self) - > *mut T {
    self.ptr as *mut T
  }
}
Copy the code

We see a? The trait bound that specifies the size of generics is compile-time Sized. By default T ever Sized. Adding a question mark relaxed the constraint, and compile-time sizes were accepted. The attribute #[inline] is used to indicate inline functions. Since these functions may be used a lot, inlining can speed things up a bit.

Memory allocation

Then we need to initialize the container. If the container has something in it, it will open up memory, but the container should be empty. Since it is empty, it will not allocate memory, so we need to create an empty object with MyUnique

imp<T: ?Sized> MyUnique<T> {
  #[inline]
  pub const unsafe fn new_unchecked(ptr: *mut T) -> Self {
    MyUnique {
      ptr: ptr as _,
      _marker: PhantomData,
    }
  }
}

impl<T> MyUnique<T> {
  pub const fn empty() - >Self {
    unsafe { MyUnique::new_unchecked(mem::align_of::<T>() as *mut T) }
  }
}

pub struct MyVec<T> {
  ptr: MyUnique<T>,
  cap: usize,
  len: usize,}impl<T> MyVec<T> {
  fn new() - >Self {
    assert_ne!(mem::size_of::<T>(), 0."We're not ready to handle ZSTs");
    MyVec {
      ptr: MyUnique::empty(),
      len: 0,
      cap: 0,}}}Copy the code

I’m going to write some code for memory allocation, and since we need to allocate memory, there’s really nothing here, just read the document for memory allocation and use it, and of course think about alignment. This is actually a capacity expansion process

use std::alloc::{handle_alloc_error, realloc, Layout};
use std::mem;

impl<T> MyVec<T> {
  fn grow(&self) - >Self {
    unsafe {
      let (new_cap, ptr) = if self.cap == 0 {
        let ptr = alloc(Layout::array::<T>(1).unwrap());
        (1, ptr)
      } else {
        let new_cap = self.cap * 2;
        let layout = Layout::array::<T>(self.cap).unwrap();
        let ptr = realloc(self.ptr.as_ptr() as *mut _, layout, layout.size());
        if ptr.is_null() {
          handle_alloc_error(Layout::from_size_align_unchecked(
            new_cap * elem_size,
            mem::align_of::<T>(),
          ));
        }
        (new_cap, ptr)
      };

      Self {
        ptr: MyUnique::new_unchecked(ptr as *mut _),
        cap: new_cap,
        len: self.len,
      }
    }
  }
}
Copy the code

push & pop

The above has been able to allocate memory, the next natural is to achieve the basic function.

We’ll start with a push that adds elements to a dynamic array. If the array is full, we’ll need to reallocate memory. This is just a call to grow. The write behavior is handled by STD :: PTR’s write function, and the corresponding pop is well understood as long as the last one is read out and the length is -1

use std::{mem, ptr};

impl<T> MyVec<T> {
  pub fn ptr(&self) - > *mut T {
    self.ptr.as_ptr()
  }

  pub fn push(&mut self, element: T) {
    if self.len == self.cap {
      self.grow();
    }
    unsafe {
      ptr::write(self.ptr().add(self.len), element);
    }
    self.len += 1;
  }

  pub fn pop(&mut self) - >Option<T> {
    if self.len == 0 {
      None
    } else {
      self.len -= 1;
      unsafe { Some(ptr::read(self.ptr().add(self.len))) }
    }
  }
}
Copy the code

recycling

Rust’s mechanism makes dealing with this very simple, just implementing trait Drops

impl<T> Drop for MyVec<T> {
  fn drop(&mut self) {
    let elem_size = mem::size_of::<T>();
    ifelem_size ! =0 {
      while let Some(_) = self.pop() {}
      unsafe {
        dealloc(self.ptr() as *mut _, Layout::array::<T>(self.cap).unwrap()); }}}}Copy the code

Reference solution

So far, we have implemented a simple data structure, but we can’t communicate with Slice yet. Real Vec doesn’t work that way, so we should implement automatic dereference now. Once we have implemented these things, we can use the interface provided by Slice

use std::ops::{Deref, DerefMut};

impl<T> Deref for MyVec<T> {
  type Target = [T];

  fn deref(&self) -> &[T] {
    unsafe { std::slice::from_raw_parts(self.ptr(), self.len) }
  }
}

impl<T> DerefMut for MyVec<T> {
  fn deref_mut(&mut self) - > &mut [T] {
    unsafe { std::slice::from_raw_parts_mut(self.ptr(), self.len) }
  }
}
Copy the code

Insert and delete

insert

Insert moves all elements to the right after the current position. For example, in an array [1, 2, 3], I want to insert 10 at index 1 (element 2), so the index of 10 is 1, and the indexes of elements 2 and 3 are 2 and 3. So it’s going to be 1, 10, 2, 3.

impl<T> MyVec<T> {
  // ...
  pub fn insert(&mut self, index: usize, element: T) {
    if self.cap == self.len {
      self.grow();
    }
    unsafe {
      if index < self.len {
        ptr::copy(self.ptr().add(index), self.ptr().add(index + 1), self.len - index);
      }
      ptr::write(self.ptr().add(index), element);
      self.len += 1; }}}Copy the code

delete

Remove the array [1, 10, 2, 3] by moving all subscripts to the left. For example, delete the array [1, 10, 2, 3] by moving all subscripts to the left. The index of 10 is 1. And then minus one is done.

impl<T> MyVec<T> {
  // ...
  pub fn remove(&mut self, index: usize) -> T {
    assert!(index < self.len, "index out of bounds");
    unsafe {
      self.len -= 1;
      let result = ptr::read(self.ptr().add(index));
      ptr::copy(self.ptr().add(index + 1), self.ptr().add(index), self.len - index);
      result
    }
  }
}
Copy the code

IntoIter

Let’s start with Vec iterators. You can use slice iter and iter_mut as long as you implement auto-dereference traits, but slice doesn’t have into_iter, so we’ll have to implement that. Now the question is, why implement the IntoIter when we already have the iterator functionality for Slice? We can see that a Vec can be iterated directly with for, because as long as a custom type implements IntoIter, it has the ability to be iterated by for

let v = vec![1.2.3];
for i inv { dbg! (i); }Copy the code

We can now handle iterator operations with two Pointers, one at the beginning and one after the end, indicating the end of the iteration as long as the beginning pointer has the same address as the pointer after the end.

[1, 2, 3, 4, 5, sth]
 ^              ^
start           end
Copy the code

Now let’s create an iterator structure that looks something like this

struct IntoIter<T> {
  start: *const T,
  end: *const T,
}
Copy the code

Of course, we will have to deal with memory later, so we should save the space information allocated by Vec and of course convert MyVec into IntoIter

struct IntoIter<T> {
  start: *const T,
  end: *const T,
  buf: MyUniuqe<T>,
  cap: usize,}impl<T> MyVec<T> {
  fn into_iter(self) -> IntoIter<T> {
    let MyVec { ptr, cap, len } = self;
    mem::forget(self);
    unsafe {
      IntoIter {
        buf: ptr,
        cap,
        start: ptr.as_ptr(),
        end: if cap == 0 { ptr.as_ptr() } else { ptr.as_ptr().add(len) },
      }
    }
  }
}
Copy the code

Another iterator, size_hint, is a copy of the standard library and is used to indicate a lower or upper bound on the number of iterable elements left


impl<T> Iterator for IntoIter<T> {
  type Item = T;
  fn next(&mut self) - >Option<T> {
    if self.start == self.end {
      None
    } else {
      unsafe {
        let result = ptr::read(self.start);
        self.start = self.start.offset(1);
        Some(result)
      }
    }
  }

  fn size_hint(&self) - > (usize.Option<usize>) {
    let len = (self.end as usize - self.start as usize) / mem::size_of::<T>();
    (len, Some(len))
  }
}
Copy the code

There is also the next_back operation


impl<T> DoubleEndedIterator for IntoIter<T> {
  fn next_back(&mut self) - >Option<T> {
    if self.start == self.end {
      None
    } else {
      unsafe {
        self.end = self.end.offset(-1);
        Some(ptr::read(self.end))
      }
    }
  }
}
Copy the code

To handle the memory-related, we’ll implement the Drop trait for IntoIter

impl<T> Drop for IntoIter<T> {
  fn drop(&mut self) {
    if self.cap ! =0 {
      for _ in &mut *self {}
      unsafe {
        dealloc(self.buf.as_ptr() as *mut _, Layout::array::<T>(self.cap).unwrap()); }}}}Copy the code

RawVec

Now to continue refactoring the code, since we have implemented the Drop for IntoIter and MyVec, it is necessary to refactor the code

pub struct MyVec<T> {
  buf: RawVec<T>,
  len: usize,}impl<T> MyVec<T> {
  pub fn push(&mut self, element: T) {
    if self.len == self.cap() {
      self.buf.grow();
    }
    unsafe {
      ptr::write(self.ptr().add(self.len), element);
    }
    self.len += 1;
  }

  pub fn pop(&mut self) - >Option<T> {
    if self.len == 0 {
      None
    } else {
      self.len -= 1;
      unsafe { Some(ptr::read(self.ptr().add(self.len))) }
    }
  }

  pub fn insert(&mut self, index: usize, element: T) {
    assert!(index <= self.len, "index out of bounds");
    if self.cap() == self.len {
      self.buf.grow();
    }
    unsafe {
      if index < self.len {
        ptr::copy(self.ptr().add(index), self.ptr().add(index + 1), self.len - index);
      }
      ptr::write(self.ptr().add(index), element);
      self.len += 1; }}pub fn remove(&mut self, index: usize) -> T {
    assert!(index < self.len, "index out of bounds");
    unsafe {
      self.len -= 1;
      let result = ptr::read(self.ptr().add(index));
      ptr::copy(self.ptr().add(index + 1), self.ptr().add(index), self.len - index);
      result
    }
  }
}

impl<T> Drop for MyVec<T> {
  fn drop(&mut self) {
    while let Some(_) = self.pop() {}
  }
}

struct IntoIter<T> {
  start: *const T,
  end: *const T,
  _buf: RawVec<T>,
}

impl<T> Iterator for IntoIter<T> {
  type Item = T;
  fn next(&mut self) - >Option<T> {
    if self.start == self.end {
      None
    } else {
      unsafe {
        let result = ptr::read(self.start);
        self.start = self.start.offset(1);
        Some(result)
      }
    }
  }

  fn size_hint(&self) - > (usize.Option<usize>) {
    let len = (self.end as usize - self.start as usize) / mem::size_of::<T>();
    (len, Some(len))
  }
}

impl<T> DoubleEndedIterator for IntoIter<T> {
  fn next_back(&mut self) - >Option<T> {
    if self.start == self.end {
      None
    } else {
      unsafe {
        self.end = self.end.offset(-1);
        Some(ptr::read(self.end))
      }
    }
  }
}

impl<T> Drop for IntoIter<T> {
  fn drop(&mut self) {
    for _ in &mut *self{}}}struct RawVec<T> {
  ptr: MyUnique<T>,
  cap: usize,}impl<T> RawVec<T> {
  fn new() - >Self {
    RawVec {
      ptr: MyUnique::empty(),
      cap: 0,}}fn grow(&self) - >Self {
    unsafe {
      let (new_cap, ptr) = if self.cap == 0 {
        let ptr = alloc(Layout::array::<T>(1).unwrap());
        (1, ptr)
      } else {
        let new_cap = self.cap * 2;
        let layout = Layout::array::<T>(self.cap).unwrap();
        let ptr = realloc(self.ptr.as_ptr() as *mut _, layout, layout.size());
        if ptr.is_null() {
          handle_alloc_error(Layout::from_size_align_unchecked(
            new_cap * mem::size_of::<T>(),
            mem::align_of::<T>(),
          ));
        }
        (new_cap, ptr)
      };

      Self {
        ptr: MyUnique::new_unchecked(ptr as *mut _),
        cap: new_cap,
      }
    }
  }
}

impl<T> Drop for RawVec<T> {
  fn drop(&mut self) {
    if self.cap ! =0 {
      unsafe {
        dealloc(self.ptr.as_ptr() as *mut _, Layout::array::<T>(self.cap).unwrap()); }}}}Copy the code

Extraction iteration operation

Now that we have the basic Vec structure, we will do a package like RawVec before.

struct RawValIter<T> {
  start: *const T,
  end: *const T,
}

impl<T> RawValIter<T> {
  unsafe fn new(slice: &[T]) -> Self {
    RawValIter {
      start: slice.as_ptr(),
      end: if slice.len() == 0 {
        slice.as_ptr()
      } else {
        slice.as_ptr().offset(slice.len() as isize)},}}}impl<T> Iterator for RawValIter<T> {
  type Item = T;

  fn next(&mut self) - >Option<T> {
    if self.start == self.end {
      None
    } else {
      unsafe {
        let result = ptr::read(self.start);
        self.start = self.start.offset(1);
        Some(result)
      }
    }
  }

  fn size_hint(&self) - > (usize.Option<usize>) {
    let len = (self.end as usize - self.start as usize) / mem::size_of::<T>();
    (len, Some(len))
  }
}

impl<T> DoubleEndedIterator for RawValIter<T> {
  fn next_back(&mut self) - >Option<T> {
    if self.start == self.end {
      None
    } else {
      unsafe {
        self.end = self.end.offset(-1);
        Some(ptr::read(self.end))
      }
    }
  }
}
Copy the code

Then modify the iterator so that it now simply calls the RawValIter implementation inside each function

struct IntoIter<T> {
  _buf: RawVec<T>,
  iter: RawValIter<T>,
}

impl<T> Drop for MyVec<T> {
  fn drop(&mut self) {
    while let Some(_) = self.pop() {}
  }
}

impl<T> Iterator for IntoIter<T> {
  type Item = T;
  fn next(&mut self) - >Option<T> {
    self.iter.next()
  }

  fn size_hint(&self) - > (usize.Option<usize>) {
    self.iter.size_hint()
  }
}

impl<T> DoubleEndedIterator for IntoIter<T> {
  fn next_back(&mut self) - >Option<T> {
    self.iter.next_back()
  }
}

impl<T> Drop for IntoIter<T> {
  fn drop(&mut self) {
    for _ in &mut self.iter {}
  }
}
Copy the code

Deal with a Zero – Sized Types

Normally Rust does not need to deal with zero-sized Types, but now we have a lot of naked pointer operations in our code. Passing ZST to the allocator would result in undefined behavior. Offsetting a ZST naked pointer is a no-op behavior.

If T is a size_of, usize::MAX will negate the size_of bit. If T is a size_of bit, usize::MAX will negate the size_of bit. Logically no memory footprint.

impl<T> RawVec<T> {
  fn new() - >Self {
    let cap = if mem::size_of::<T>() == 0{!0 } else { 0 };
    RawVec {
      ptr: MyUnique::empty(),
      cap,
    }
  }
}
Copy the code

And then we do the grow function as well

impl<T> RawVec<T> {
  // ...
  fn grow(&self) - >Self {
    unsafe {
      let elem_size = mem::size_of::<T>();
      assert_ne!(elem_size, 0."capacity overflow");

      let (new_cap, ptr) = if self.cap == 0 {
        let ptr = alloc(Layout::array::<T>(1).unwrap());
        (1, ptr)
      } else {
        let new_cap = self.cap * 2;
        let layout = Layout::array::<T>(self.cap).unwrap();
        let ptr = realloc(self.ptr.as_ptr() as *mut _, layout, layout.size());
        (new_cap, ptr)
      };

      if ptr.is_null() {
        handle_alloc_error(Layout::from_size_align_unchecked(
          new_cap * elem_size,
          mem::align_of::<T>(),
        ));
      }

      Self {
        ptr: MyUnique::new_unchecked(ptr as *mut _),
        cap: new_cap,
      }
    }
  }
}
Copy the code

RawVec drops also need to be handled, as does the previous implementation, but pretend to align them.

impl<T> Drop for RawVec<T> {
  fn drop(&mut self) {
    let elem_size = mem::size_of::<T>();
    if self.cap ! =0&& elem_size ! =0 {
      unsafe {
        dealloc(
          self.ptr.as_ptr() as *mut _,
          Layout::from_size_align_unchecked(self.cap * elem_size, mem::align_of::<T>()), ); }}}}Copy the code

Then there is RawValIter’s ZST treatment

impl<T> RawValIter<T> {
  unsafe fn new(slice: &[T]) -> Self {
    RawValIter {
      start: slice.as_ptr(),
      end: if mem::size_of::<T>() == 0 {
        ((slice.as_ptr() as usize) + slice.len()) as *const_}else if slice.len() == 0 {
        slice.as_ptr()
      } else {
        slice.as_ptr().offset(slice.len() as isize)},}}}Copy the code

Iterators also handle size_hint divisor zero

impl<T> Iterator for RawValIter<T> {
  type Item = T;

  fn next(&mut self) - >Option<T> {
    if self.start == self.end {
      None
    } else {
      unsafe {
        let result = ptr::read(self.start);
        self.start = if mem::size_of::<T>() == 0{(self.start as usize + 1) as *const_}else {
          self.start.offset(1)};Some(result)
      }
    }
  }

  fn size_hint(&self) - > (usize.Option<usize>) {
    let elem_size = mem::size_of::<T>();
    let len = (self.end as usize - self.start as usize)/(if elem_size == 0 { 1 } else { elem_size });
    (len, Some(len))
  }
}

impl<T> DoubleEndedIterator for RawValIter<T> {
  fn next_back(&mut self) - >Option<T> {
    if self.start == self.end {
      None
    } else {
      unsafe {
        self.end = if mem::size_of::<T>() == 0{(self.end as usize - 1) as *const_}else {
          self.end.offset(-1)};Some(ptr::read(self.end))
      }
    }
  }
}
Copy the code

with_capacity

So we’re almost done implementing this, so let’s change MyVec

impl<T> MyVec<T> {
  // ...
  pub fn with_capacity(capacity: usize) -> MyVec<T> {
    MyVec {
      buf: RawVec::with_capacity(capacity),
      len: 0}}}Copy the code

Then add an interface to RawVec

impl<T> RawVec<T> {
  fn new() - >Self {
    let cap = if mem::size_of::<T>() == 0{!0 } else { 0 };
    RawVec {
      ptr: MyUnique::empty(),
      cap,
    }
  }

  pub fn with_capacity(cap: usize) - >Self {
    RawVec::allocate_in(cap, None)}fn allocate_in(cap: usize, p: Option<MyUnique<T>>) -> Self {
    unsafe {
      let elem_size = mem::size_of::<T>();
      assert_ne!(elem_size, 0."capacity overflow");

      let (new_cap, ptr) = if cap == 0 {
        let ptr = alloc(Layout::array::<T>(1).unwrap());
        (1, ptr)
      } else {
        if let Some(some_p) = p {
          let new_cap = cap * 2;
          let layout = Layout::array::<T>(cap).unwrap();
          let ptr = realloc(some_p.as_ptr() as *mut _, layout, layout.size());
          (new_cap, ptr)
        } else {
          let ptr = alloc(Layout::array::<T>(cap).unwrap());
          (cap, ptr)
        }
      };

      if ptr.is_null() {
        handle_alloc_error(Layout::from_size_align_unchecked(
          new_cap * elem_size,
          mem::align_of::<T>(),
        ));
      }

      Self {
        ptr: MyUnique::new_unchecked(ptr as *mut _),
        cap: new_cap,
      }
    }
  }

  fn grow(&self) - >Self {
    RawVec::allocate_in(self.cap, Some(self.ptr))
  }
}
Copy the code

We’ve now extracted grow and called with_capacity

Drain

It is to be decided at present. This is a bit more troublesome. I haven’t thought about how to say it.

Git hosting

Finally, I’ll put the code up

E.coding.net/limitLiu/al…

mo4tech.com (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.

Rust Data Structure – Hands-on implementation of Vec

The body of the

Bullshit stage

Basic usage

layout

Memory allocation

push & pop

recycling

Reference solution

Insert and delete

insert

delete

IntoIter

RawVec

Extraction iteration operation

Deal with a Zero – Sized Types

with_capacity

Drain

Git hosting

Rust Data Structure – Hands-on implementation of Vec

The body of the

Bullshit stage

Basic usage

layout

Memory allocation

push & pop

recycling

Reference solution

Insert and delete

insert

delete

IntoIter

RawVec

Extraction iteration operation

Deal with a Zero – Sized Types

with_capacity

Drain

Git hosting

Related Posts

A serious front-end learning open source repository (Day 3, April)

React Native App Stomp pits

JSI small test – Native synchronous call JS code