In iOS development, when we need to access an object safely and avoid the errors caused by concurrent access from multiple threads, the first tool that comes to mind is NSLock. Consider a class like this:
ThreadSafeQueue.m
@interface ThreadSafeQueue ()
@property (nonatomic, assign) NSInteger sum;
@property (nonatomic, strong) NSLock *lock;
@end

@implementation ThreadSafeQueue

- (instancetype)init
{
    self = [super init];
    if (self) {
        _lock = [[NSLock alloc] init];
    }
    return self;
}

- (void)add:(NSInteger)num {
    [_lock lock];
    _sum += num;
    [_lock unlock];
}

@end
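To see the lock earning its keep, we can hit `add:` from many threads at once. A hypothetical driver (the names are illustrative; `sum` stays private in the class extension above):

ThreadSafeQueue *queue = [[ThreadSafeQueue alloc] init];
// 1000 concurrent increments; without the NSLock, some updates to _sum could be lost.
dispatch_apply(1000, dispatch_get_global_queue(QOS_CLASS_DEFAULT, 0), ^(size_t i) {
    [queue add:1];
});
// With the lock, the accumulated total is reliably 1000.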
This avoids the multithreaded access issue. In practice we often reach for the @synchronized directive to simplify such code, for example:
- (void)add:(NSInteger)num {
    @synchronized (self) {
        _sum += num;
    }
}
This way we don't need to create and manage our own lock object; we rely on a facility provided by the system, which is more convenient and reliable. So how exactly does @synchronized work?
We can use Clang to rewrite the Objective-C code into C++ and see what the compiler actually generates.
$ clang -rewrite-objc ThreadSafeQueue.m
After it runs, a ThreadSafeQueue.cpp file is generated in the current directory. Open it and find the generated code for the method containing the @synchronized block:
ThreadSafeQueue.cpp
static void _I_ThreadSafeQueue_push_(ThreadSafeQueue * self, SEL _cmd, id element)
{
    {
        id _rethrow = 0;
        id _sync_obj = (id)element;
        objc_sync_enter(_sync_obj);
        try {
            struct _SYNC_EXIT {
                _SYNC_EXIT(id arg) : sync_exit(arg) {}
                ~_SYNC_EXIT() {
                    objc_sync_exit(sync_exit);
                }
                id sync_exit;
            } _sync_exit(_sync_obj);

            ((void (*)(id, SEL, ObjectType))(void *)objc_msgSend)((id)(*(NSMutableArray **)((char *)self + OBJC_IVAR_$_ThreadSafeQueue$_elements)), sel_registerName("addObject:"), (id)element);
        }
        catch (id e) {
            _rethrow = e;
        }
        {
            struct _FIN {
                _FIN(id reth) : rethrow(reth) {}
                ~_FIN() {
                    if (rethrow) objc_exception_throw(rethrow);
                }
                id rethrow;
            } _fin_force_rethow(_rethrow);
        }
    }
}
From the converted C++ code we can see that the compiler turns `@synchronized` into a pair of runtime calls, `objc_sync_enter` and `objc_sync_exit`; the exit call sits in the destructor of a local `_SYNC_EXIT` struct, so it runs even when an exception is thrown.
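In other words, a `@synchronized` block behaves roughly like this hand-written equivalent (a sketch of the lowering, not the literal generated code):

objc_sync_enter(obj);      // take the recursive lock associated with obj
@try {
    // ... body of the @synchronized block ...
}
@finally {
    objc_sync_exit(obj);   // release it, even if an exception was thrown
}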
So what do these two runtime functions do internally? Let's dive into the runtime source:
objc-sync.mm
// Begin synchronizing on 'obj'.
// Allocates recursive mutex associated with 'obj' if needed.
// Returns OBJC_SYNC_SUCCESS once lock is acquired.
int objc_sync_enter(id obj)
{
    int result = OBJC_SYNC_SUCCESS;

    if (obj) {
        SyncData* data = id2data(obj, ACQUIRE);
        assert(data);
        data->mutex.lock();
    } else {
        // @synchronized(nil) does nothing
        if (DebugNilSync) {
            _objc_inform("NIL SYNC DEBUG: @synchronized(nil); set a breakpoint on objc_sync_nil to debug");
        }
        objc_sync_nil();
    }

    return result;
}
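Note the `nil` branch above: `@synchronized(nil)` does nothing at all, which makes a nil lock token a silent bug. A small illustration:

NSString *token = nil;  // e.g. an ivar that was never initialized
@synchronized (token) {
    // No lock is taken here: objc_sync_enter(nil) is a no-op,
    // so this "critical section" is not actually protected.
}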
The essential job of objc_sync_enter is to look up a lock for the given object and take it. How is that lookup done? It calls id2data(id object, enum usage why) to obtain a SyncData node for obj: the function first tries the caches keyed on the object and, on a miss, creates a new node and inserts it into them. Let's walk through the lookup process in detail.
objc-sync.mm
static SyncData* id2data(id object, enum usage why)
{
    spinlock_t *lockp = &LOCK_FOR_OBJ(object);
    SyncData **listp = &LIST_FOR_OBJ(object);
    SyncData* result = NULL;

#if SUPPORT_DIRECT_THREAD_KEYS
    // Check per-thread single-entry fast cache for matching object
    bool fastCacheOccupied = NO;
    SyncData *data = (SyncData *)tls_get_direct(SYNC_DATA_DIRECT_KEY);
    if (data) {
        fastCacheOccupied = YES;
        if (data->object == object) {
            // Found a match in fast cache.
            uintptr_t lockCount;

            result = data;
            lockCount = (uintptr_t)tls_get_direct(SYNC_COUNT_DIRECT_KEY);
            if (result->threadCount <= 0 || lockCount <= 0) {
                _objc_fatal("id2data fastcache is buggy");
            }

            switch(why) {
            case ACQUIRE: {
                lockCount++;
                tls_set_direct(SYNC_COUNT_DIRECT_KEY, (void*)lockCount);
                break;
            }
            case RELEASE:
                lockCount--;
                tls_set_direct(SYNC_COUNT_DIRECT_KEY, (void*)lockCount);
                if (lockCount == 0) {
                    // remove from fast cache
                    tls_set_direct(SYNC_DATA_DIRECT_KEY, NULL);
                    // atomic because may collide with concurrent ACQUIRE
                    OSAtomicDecrement32Barrier(&result->threadCount);
                }
                break;
            case CHECK:
                // do nothing
                break;
            }

            return result;
        }
    }
#endif

    // Check per-thread cache of already-owned locks for matching object
    SyncCache *cache = fetch_cache(NO);
    if (cache) {
        unsigned int i;
        for (i = 0; i < cache->used; i++) {
            SyncCacheItem *item = &cache->list[i];
            if (item->data->object != object) continue;

            // Found a match.
            result = item->data;
            if (result->threadCount <= 0 || item->lockCount <= 0) {
                _objc_fatal("id2data cache is buggy");
            }

            switch(why) {
            case ACQUIRE:
                item->lockCount++;
                break;
            case RELEASE:
                item->lockCount--;
                if (item->lockCount == 0) {
                    // remove from per-thread cache
                    cache->list[i] = cache->list[--cache->used];
                    // atomic because may collide with concurrent ACQUIRE
                    OSAtomicDecrement32Barrier(&result->threadCount);
                }
                break;
            case CHECK:
                // do nothing
                break;
            }

            return result;
        }
    }

    // Thread cache didn't find anything.
    // Walk in-use list looking for matching object
    // Spinlock prevents multiple threads from creating multiple
    // locks for the same new object.
    // We could keep the nodes in some hash table if we find that there are
    // more than 20 or so distinct locks active, but we don't do that now.

    lockp->lock();

    {
        SyncData* p;
        SyncData* firstUnused = NULL;
        for (p = *listp; p != NULL; p = p->nextData) {
            if ( p->object == object ) {
                result = p;
                // atomic because may collide with concurrent RELEASE
                OSAtomicIncrement32Barrier(&result->threadCount);
                goto done;
            }
            if ( (firstUnused == NULL) && (p->threadCount == 0) )
                firstUnused = p;
        }

        // no SyncData currently associated with object
        if ( (why == RELEASE) || (why == CHECK) )
            goto done;

        // an unused one was found, use it
        if ( firstUnused != NULL ) {
            result = firstUnused;
            result->object = (objc_object *)object;
            result->threadCount = 1;
            goto done;
        }
    }

    // malloc a new SyncData and add to list.
    // XXX calling malloc with a global lock held is bad practice,
    // might be worth releasing the lock, mallocing, and searching again.
    // But since we never free these guys we won't be stuck in malloc very often.
    result = (SyncData*)calloc(sizeof(SyncData), 1);
    result->object = (objc_object *)object;
    result->threadCount = 1;
    new (&result->mutex) recursive_mutex_t();
    result->nextData = *listp;
    *listp = result;

done:
    lockp->unlock();
    if (result) {
        // Only new ACQUIRE should get here.
        // All RELEASE and CHECK and recursive ACQUIRE are
        // handled by the per-thread caches above.
        if (why == RELEASE) {
            // Probably some thread is incorrectly exiting
            // while the object is held by another thread.
            return nil;
        }
        if (why != ACQUIRE) _objc_fatal("id2data is buggy");
        if (result->object != object) _objc_fatal("id2data is buggy");

#if SUPPORT_DIRECT_THREAD_KEYS
        if (!fastCacheOccupied) {
            // Save in fast thread cache
            tls_set_direct(SYNC_DATA_DIRECT_KEY, result);
            tls_set_direct(SYNC_COUNT_DIRECT_KEY, (void*)1);
        } else
#endif
        {
            // Save in thread cache
            if (!cache) cache = fetch_cache(YES);
            cache->list[cache->used].data = result;
            cache->list[cache->used].lockCount = 1;
            cache->used++;
        }
    }

    return result;
}
Next, let's look at the data structures involved:
objc-sync.mm
typedef struct SyncData {
    struct SyncData* nextData;
    DisguisedPtr<objc_object> object;
    int32_t threadCount;  // number of THREADS using this block
    recursive_mutex_t mutex;
} SyncData;

typedef struct {
    SyncData *data;
    unsigned int lockCount;  // number of times THIS THREAD locked this block
} SyncCacheItem;

typedef struct SyncCache {
    unsigned int allocated;
    unsigned int used;
    SyncCacheItem list[0];
} SyncCache;

/*
  Fast cache: two fixed pthread keys store a single SyncCacheItem.
  This avoids malloc of the SyncCache for threads that only synchronize
  a single object at a time.
      SYNC_DATA_DIRECT_KEY == SyncCacheItem.data
      SYNC_COUNT_DIRECT_KEY == SyncCacheItem.lockCount
 */

struct SyncList {
    SyncData *data;
    spinlock_t lock;

    SyncList() : data(nil) { }
};

// Use multiple parallel lists to decrease contention among unrelated objects.
#define LOCK_FOR_OBJ(obj) sDataLists[obj].lock
#define LIST_FOR_OBJ(obj) sDataLists[obj].data
static StripedMap<SyncList> sDataLists;
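How does `sDataLists[obj]` pick a list? `StripedMap` hashes the object's address into one of a fixed number of stripes. The real template lives in the runtime's private headers; the indexing step looks roughly like this (paraphrased; the stripe count differs between desktop and embedded builds):

enum { StripeCount = 64 };  // objc4 uses a smaller count on embedded targets

static unsigned int indexForPointer(const void *p) {
    uintptr_t addr = (uintptr_t)p;
    // Shift-and-xor the address so that nearby objects spread across stripes.
    return ((addr >> 4) ^ (addr >> 9)) % StripeCount;
}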
With the source and the data structures in hand, the process can be read out of the code:
- Pay attention to `static StripedMap<SyncList> sDataLists`. This global stores the lock data structure (a `SyncList`) for every object that needs locking. `StripedMap` maps the incoming object to a slot by a shift-and-hash of its memory address, and because it is a global static, every thread sees the same table.
- If `SUPPORT_DIRECT_THREAD_KEYS` is enabled, a per-thread fast cache holding a single `SyncData` is checked first. On a hit, the `why` parameter decides how to update the cached `lockCount`, which records how many times this thread currently holds this lock.
- Otherwise, the per-thread cache (`SyncCache`) is searched for a matching `SyncData`, and its `lockCount` is updated in the same way according to `why`.
- Finally, the `recursive_mutex_t` inside the returned `SyncData` (a `pthread_mutex_t` initialized as recursive) performs the actual locking. This is a nested lock; see the demo after this list.

The fast cache deserves a note: because it caches exactly one `SyncData` under two fixed thread-local keys (`SYNC_DATA_DIRECT_KEY` and `SYNC_COUNT_DIRECT_KEY`), a hit is a direct fetch with no table walk, which makes it cheaper than the linked-list thread cache used below. Conceptually it is like looking up a value by a fixed key, the way NSUserDefaults does.
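The recursive behavior is worth seeing in action. Because the mutex is recursive and the per-thread `lockCount` tracks re-entry, nesting `@synchronized` on the same object within one thread is safe (a small demo; `outer`/`inner` are hypothetical methods):

- (void)outer {
    @synchronized (self) {      // this thread's lockCount for self: 1
        [self inner];
    }                           // drops back to 0 only at the outermost exit
}

- (void)inner {
    @synchronized (self) {      // same thread re-enters: lockCount becomes 2
        // Safe: a non-recursive lock would deadlock right here.
    }
}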
Continuing the lookup when the fast cache does not hit:

- The search moves to the thread's own `SyncCache`, which lives in the per-thread data structure `_objc_pthread_data`. Each entry of its `list` holds the `SyncData` for one protected object, such as the `self` in the `@synchronized (self)` code above. Every thread has its own `SyncCache`, and a given object is stored at most once per thread; it is the global `sDataLists` that guarantees all threads share a single `SyncData` per object. If this cache misses too, the search continues.
- `id2data` then walks the global list and, if necessary, creates a new `SyncData`. This step runs under `spinlock_t` (implemented with `os_unfair_lock`), which prevents multiple threads from creating multiple `SyncData` nodes for the same object.
- The `SyncData` that was found or created is saved into the fast thread cache or the thread cache, so the next lookup from this thread can find and update it cheaply.
- Back in `objc_sync_enter`, once the `SyncData` is obtained, its recursive `pthread_mutex_t` is locked directly, protecting the critical section from concurrent access.
- When the outer operation completes, `objc_sync_exit` is called to unlock the object. The implementation mirrors the locking path: it finds the `SyncData` for the object via `id2data(obj, RELEASE)`, which updates the bookkeeping (the data's `threadCount` and this thread's `lockCount`), then unlocks the recursive mutex. A rough sketch follows.
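This is roughly what `objc_sync_exit` looks like in objc-sync.mm (paraphrased from the objc4 source; details vary across versions):

// End synchronizing on 'obj'.
int objc_sync_exit(id obj)
{
    int result = OBJC_SYNC_SUCCESS;

    if (obj) {
        SyncData* data = id2data(obj, RELEASE);  // decrements this thread's lockCount
        if (!data) {
            result = OBJC_SYNC_NOT_OWNING_THREAD_ERROR;
        } else {
            bool okay = data->mutex.tryUnlock();
            if (!okay) {
                result = OBJC_SYNC_NOT_OWNING_THREAD_ERROR;
            }
        }
    } else {
        // @synchronized(nil) does nothing
    }

    return result;
}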
Note again the global `static StripedMap<SyncList> sDataLists`: it holds, for each object being locked, the head of the `SyncData` list that all threads share.
By reading the source we can broadly understand the locking process and how the cache mechanism improves performance. Here is a brief overview of the overall flow:
- A `@synchronized` block is added for object A
- The compiler emits a call to `objc_sync_enter` (paired with `objc_sync_exit`)
- If the fast thread cache is in use, all lookups for A happen there
- Otherwise the per-thread `SyncCache` is used; each thread has its own `SyncCache` storing the objects it protects, but a protected object's `SyncData` exists only once globally: the `sDataLists` array maps every protected object, by memory address, to its `SyncData`, and updates to that list are guarded by an `os_unfair_lock` spinlock
- The matching `SyncData` is returned from the cache (or created)
- The `pthread_mutex_t` nested lock inside that `SyncData` is locked
- After the operation on object A completes, `objc_sync_exit` performs the unlock
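One practical consequence of this design: the lock is keyed by the object you pass in, so different objects get completely independent locks. If several instances must guard one shared resource, synchronize on a shared token instead (a hypothetical sketch; `gSharedToken` is an illustrative name):

static NSObject *gSharedToken;  // one token shared by every instance

- (void)touchSharedResource {
    static dispatch_once_t once;
    dispatch_once(&once, ^{ gSharedToken = [NSObject new]; });

    @synchronized (gSharedToken) {  // every instance maps to the same SyncData
        // mutate the shared resource
    }
    // @synchronized (self) would NOT protect the shared resource here,
    // because each instance would get its own lock.
}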