Datenlord | Rust unlocked the Crossbeam of programming Epoch parsing algorithm
Author: Shi Jicheng
The last article introduced EBR, a memory management mechanism for lock-free data structures, which performs more efficiently than other memory management mechanisms. However, due to the complexity of the concept, the implementation of EBR is not easy, and it is necessary to implement EBR from scratch for every unlocked data structure. Therefore, it is natural for everyone to consider extracting the core concept of EBR epoch into a library for everyone to reuse. Crossbeam-epoch is a mature and widely used EBR library. In this paper, the implementation principles of crossbeam-Epoch will be analyzed in detail and carried out in the process.
Guard is just the outermost shell
As mentioned earlier, people generally only use guard when interacting with crossbeam-epoch, as shown below:
/// delete data
{
let guard = epoch::pin();
guard.defer(move || mem.release());
}
/// get data for reading
{
let guard = epoch::pin();
let value = map.get(&key, &guard);
/ / /... Use the value
}
Copy the code
When reading data, Guard acts as a life-cycle guardian, ensuring that the retrieved data reference (value in the code above) has a lifetime no longer than Guard, and that when Guard is destroyed, the value is also destroyed. Guard plays a more complex role in deleting the data, registering the destruction function on the queue that Defer. When the destruction function is called requires a deeper understanding of its internal implementation details.
What exactly did Pin do
What the epoch:: Pin () did, as explained in the official code comments, was to pin the current thread in order to place a pointer to the data on the heap on the stack. In fact, this explanation only explains the above content of reading data. The underlying operations are as follows:
- The current thread is registered with the Global collector only once per thread.
- Gets the current global Epoch and sets it to the current thread, indicating which Epoch the current thread belongs to while the PIN is in effect.
- Record the number of times the current thread has been pined. When the number reaches a certain number, try to make the Global collector push the Epoch growth and recycle garbage.
- Increment the guard_count count to record how many guards have been created and not yet destroyed.
Since pin() can be called repeatedly, two consecutive calls to epoch::pin() are also allowed. Except for the first call, none of the other calls do anything except increase the guard_count count. See pin method of Local struct in internal. Rs file for details.
The Global collector mentioned here is responsible for all resource collection, collecting garbage that needs to be collected from various threads and collecting it when appropriate.
Epoch propulsion
The Epoch Number needs to be continuously iterated forward, and in the process of iteration, the garbage collector will recycle the recyclable garbage belonging to the old Epoch Number. Each time the Global collector wants to reclaim garbage, it will try to advance the Epoch Number. If the following conditions are met, the Epoch Number will advance successfully:
- All registered threads are in the current Epoch, that is, there are no threads in the previous Epoch.
If the conditions are not met, the Epoch advance fails. See the try_advance method of the Global struct in the internal.rs file for an implementation.
Garbage collection mechanism
If all threads register garbage to be collected with the Global collector, then there is a very large race, and the more threads you have, the more frequently you operate, the greater the performance impact. To resolve the contention caused by shared data structures, each thread maintains its own garbage collection queue, which is 62 in length. When the queue is full, the thread uniformly moves the data from the local queue to the Global collector and places it in the garbage linked list of the collector. It is worth noting here that in addition to the garbage collection function, the linked list also contains the Epoch Number generated by the garbage, which is used to determine whether the corresponding garbage can be collected.
There are two triggers for garbage collection, one active and one passive. The active trigger is the Guard flush method, which causes the Global collector to attempt to collect garbage from the garbage linked list. The passive trigger is the PIN method of Guard, which triggers a garbage collection every 128 calls to pin.
Garbage that meets the following conditions can be recycled:
- (Global Epoch Number) > ((Garbage Epoch Number) + 1) means that the Garbage Epoch is at least two generations older than the current Epoch.
For specific implementation, see collect method of Global struct in internal. Rs file.
Internal data structure
It’s worth mentioning that there are two internal data structures, a List and a Queue. Lists are used to manage registered threads and queues are used to manage garbage waiting to be collected. What these two data structures have in common is that they are manipulated by multiple threads at the same time. For efficient implementation, CrossBeam does not use locks to protect the data structure, but instead implements an internal lockless data structure.
List
The List has a new element insert method, which is to insert the new element into the head of the List. The implementation uses the CAS atomic operation. When multiple threads are inserting simultaneously, only one operation will succeed at a time, and the failed operation will get the new header value and try again.
The List deletion operation does not actually remove the element, but marks the element, and is finally deleted in an Iteration process that also uses the CAS atomic operation. If multiple threads are trying to remove the same element, only one will succeed. If an element is deleted and its predecessor is marked as removable, the Iteration caller is told to iterate over it again. Of course, the caller can iterate over it as the case may be and leave it to someone else.
See the list.rs file for the implementation.
Queue
The insertion of Queue is similar to that of List except that the insertion point is tail. The deletion of a Queue is called a POP, and data is ejected from the head of the Queue. If the ejected data fails, another thread is ejected, and you need to try again using the location of the head.
What about the elements that are deleted from lists and queues? Crossbeam takes a bootstrap approach, which also places it in a garbage collection queue and waits for a later operation to trigger a garbage collection.
conclusion
Crossbeam-epoch provides an extremely convenient tool to hide the implementation details of the Epoch in the library and expose the user with a very simple interface, so that people can pay more attention to the details of the data structure when implementing lockless data structure and leave the memory recycling work to crossbeam-Epoch.