Abstract
For locked caches, StripedMap can help speed up access.
Typical application scenario: SideTable.
What is a StripedMap?
As you can see from the comments in the objc4 source, it is used to optimize locked caches:
// StripedMap<T> is a map of void* -> T, sized appropriately
// for cache-friendly lock striping.
// For example, this may be used as StripedMap<spinlock_t>
// or as StripedMap<SomeStruct> where SomeStruct stores a spin lock.
template<typename T>
class StripedMap {
#if TARGET_OS_IPHONE && !TARGET_OS_SIMULATOR
    enum { StripeCount = 8 };   // on iPhone there are 8 stripes, hence 8 SideTables
#else
    enum { StripeCount = 64 };
#endif

    // Pad each T out to its own cache line.
    struct PaddedT {
        T value alignas(CacheLineSize);
    };

    PaddedT array[StripeCount];

    // Hash the address to an index into the array.
    static unsigned int indexForPointer(const void *p) {
        uintptr_t addr = reinterpret_cast<uintptr_t>(p);
        // The core line of code
        return ((addr >> 4) ^ (addr >> 9)) % StripeCount;
    }

    // Several lock-forwarding methods follow, elided here.
};
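For reference, the elided part includes the subscript operator that routes an address to its stripe, plus helpers that lock or unlock every stripe at once (the runtime uses the latter around fork(), for example). Roughly, paraphrased from the objc4 source, so exact signatures may differ between versions:

public:
    T& operator[] (const void *p) {
        return array[indexForPointer(p)].value;
    }

    // For StripedMaps of locks: take/release every stripe.
    void lockAll() {
        for (unsigned int i = 0; i < StripeCount; i++) {
            array[i].value.lock();
        }
    }

    void unlockAll() {
        for (unsigned int i = 0; i < StripeCount; i++) {
            array[i].value.unlock();
        }
    }

Note also the alignas(CacheLineSize) padding above: each stripe occupies its own cache line, so threads hammering different stripes do not false-share.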
I think of it as the “manager” of the cache.
It mainly exists to solve this problem:
With a single locked cache, access is serialized and correspondingly slow, especially under high concurrency.
StripedMap provides the following solution:
Simply keep several caches instead of one, and spread accesses across them when they are used.
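To make that concrete, here is a minimal, self-contained sketch of lock striping in plain C++. Everything in it (StripedCounter, retain) is a made-up demo of mine, not objc4 code; it just shows several lock/table pairs being selected by address instead of one global lock:

#include <cstdint>
#include <mutex>
#include <unordered_map>

// Hypothetical demo type: a reference-count table split into
// 8 independently locked stripes.
class StripedCounter {
    enum { StripeCount = 8 };

    struct Stripe {
        std::mutex lock;
        std::unordered_map<const void *, long> counts;
    };

    Stripe stripes[StripeCount];

    // The same hash StripedMap uses to pick a stripe.
    static unsigned indexForPointer(const void *p) {
        std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
        return static_cast<unsigned>(((addr >> 4) ^ (addr >> 9)) % StripeCount);
    }

public:
    void retain(const void *obj) {
        Stripe &s = stripes[indexForPointer(obj)];
        std::lock_guard<std::mutex> guard(s.lock);  // lock only this stripe
        ++s.counts[obj];
    }
};

int main() {
    StripedCounter rc;
    int a, b;        // any two distinct addresses work for the demo
    rc.retain(&a);
    rc.retain(&b);   // very likely lands in a different stripe
}

Two threads retaining different objects usually hit different stripes, so they do not contend on the same mutex; only collisions on the same stripe serialize.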
Application scenarios
The typical scenario is SideTable.
For example, it appears in storeWeak, the function that stores a weak reference relationship:
static StripedMap<SideTable>& SideTables() {
    return *reinterpret_cast<StripedMap<SideTable>*>(SideTableBuf);
}

storeWeak(id *location, objc_object *newObj)
{
    ...
retry:
    if (haveOld) {
        oldObj = *location;
        oldTable = &SideTables()[oldObj];   // use StripedMap
    } else {
        oldTable = nil;
    }
    if (haveNew) {
        newTable = &SideTables()[newObj];
    } else {
        newTable = nil;
    }
    ...
}
How does it do that?
((addr >> 4) ^ (addr >> 9)) % StripeCount;
When I searched for relevant information, I came across this answer:
The system can maintain at most 8 weak_table_t concurrently, for 8 threads.
That reads as if StripedMap guaranteed one cache per thread, which is a misconception.
To check this, I wrote the following test:
((addr >> 4) ^ (addr >> 9)) % StripeCount
dispatch_async(dispatch_get_global_queue(0, 0), ^{
    {
        NSObject *obj = [NSObject new];
        uintptr_t addr = (uintptr_t)&obj;
        int r = ((addr >> 4) ^ (addr >> 9)) % 8;   // StripedMap's hash, StripeCount = 8
        NSLog(@"%@, %p -> %d", [NSThread currentThread], obj, r);
    }
    {
        NSObject *obj = [NSObject new];
        uintptr_t addr = (uintptr_t)&obj;
        int r = ((addr >> 4) ^ (addr >> 9)) % 8;
        NSLog(@"%@, %p -> %d", [NSThread currentThread], obj, r);
    }
});
It printed results like the following; as you can see, the computed index tracks the address rather than the thread:
<NSThread: 0x600000024c40>{number = 6, name = (null)}, 0x600001768180 -> 7
<NSThread: 0x60000002b800>{number = 5, name = (null)}, 0x600001748000 -> 0
So I think it is simply a way of spreading traffic, so that concurrent threads "try" not to pile into the same cache.
Why does one simple line, ((addr >> 4) ^ (addr >> 9)) % StripeCount, manage this?
Does it have something to do with how memory addresses are assigned? For example, is there a pattern to the addresses of objects created close together?
I honestly haven't figured it out; if any reader knows, please let me know. Many thanks!
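For what it's worth, my guess (an assumption of mine, not something the objc4 comments spell out) is this: heap objects on Apple platforms are at least 16-byte aligned, so the lowest 4 bits of an object address are always zero and carry no information, which would explain the addr >> 4. XORing in addr >> 9 folds higher-order bits back in, so two addresses that agree in their low bits (say, the same offset within different pages) can still land in different stripes. A quick simulation over 16 consecutive 16-byte-spaced addresses shows them spreading across all 8 stripes:

#include <cstdint>
#include <cstdio>

// StripedMap's hash with the iPhone StripeCount of 8.
static unsigned indexFor(std::uintptr_t addr) {
    return static_cast<unsigned>(((addr >> 4) ^ (addr >> 9)) % 8);
}

int main() {
    // Simulate 16 back-to-back 16-byte-aligned allocations, starting
    // from an address shaped like the ones logged above.
    std::uintptr_t base = 0x600001768180;
    int histogram[8] = {0};
    for (int i = 0; i < 16; i++) {
        std::uintptr_t addr = base + static_cast<std::uintptr_t>(i) * 16;
        unsigned idx = indexFor(addr);
        std::printf("%#llx -> %u\n", static_cast<unsigned long long>(addr), idx);
        histogram[idx]++;
    }
    for (int i = 0; i < 8; i++) {
        std::printf("stripe %d: %d addresses\n", i, histogram[i]);
    }
}

Each stripe ends up with exactly two of the sixteen addresses, which at least matches the "spread the traffic" reading above.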
Summary
StripedMap “speeds up access” to a locked cache.
Internally, it uses the object's address so that concurrent threads hit different caches "as much as possible".
The typical application scenario is SideTable, which stores reference counts ("refcnts") and weak reference relationships ("weak_table").
It also reminds me of server development: when the data is too large for a single machine, it gets sharded across multiple machines. I have to say the approach here follows the same idea.