This is the 8th day of my participation in the August Text Challenge.

Performance of the various locks

We tend to assume that @synchronized has the highest performance cost of all the locks. Actual measurements show that this is not the case.

/** OSSpinLock performance */
        OSSpinLock kc_spinlock = OS_SPINLOCK_INIT;
        double_t kc_beginTime = CFAbsoluteTimeGetCurrent();
        for (int i = 0; i < kc_runTimes; i++) { // kc_runTimes = 100000
            OSSpinLockLock(&kc_spinlock);   // lock
            OSSpinLockUnlock(&kc_spinlock); // unlock
        }
        double_t kc_endTime = CFAbsoluteTimeGetCurrent();
        KCLog(@"OSSpinLock: %f ms", (kc_endTime - kc_beginTime) * 1000);
/** ... */
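The same measurement pattern can be sketched in portable C++. This is only an illustration of the timing loop, assuming std::mutex and std::recursive_mutex as stand-ins; time_lock_ms is a hypothetical helper, and its numbers say nothing about OSSpinLock or @synchronized.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>

// Hypothetical helper: time `runs` uncontended lock/unlock pairs on any
// type that exposes lock() and unlock(), and return the elapsed milliseconds.
template <typename Lock>
double time_lock_ms(Lock& lock, int runs) {
    auto begin = std::chrono::steady_clock::now();
    for (int i = 0; i < runs; ++i) {
        lock.lock();
        lock.unlock();
    }
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - begin).count();
}
```

Calling time_lock_ms(someMutex, 100000) for each lock type gives figures that can be compared on one machine, which is what the measurement above does for each Objective-C lock.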

Results on a real device: [figure]

Results on the simulator: [figure]

So on a real device, @synchronized does not have the worst performance. It is also the lock we use most often.

Analysis of the @synchronized principle

  @synchronized (self) {}

The argument we normally pass in is self, so what does self do here? To find out, rewrite the file to C++ with clang, either through xcrun or by typing the following in the terminal: clang -x objective-c -rewrite-objc -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator.sdk main.m. This produces a .cpp file; open it and search for the main function.

Ignore the catch for now, focus on the block inside the try, and format it a little:

id _rethrow = 0;
id _sync_obj = (id)appDelegateClassName;
objc_sync_enter(_sync_obj);
try {
    struct _SYNC_EXIT {
        _SYNC_EXIT(id arg) : sync_exit(arg) {}
        ~_SYNC_EXIT() { objc_sync_exit(sync_exit); }
        id sync_exit;
    } _sync_exit(_sync_obj);
}

Here _SYNC_EXIT is a struct with a constructor and a destructor. The constructor stores the object, and the destructor calls objc_sync_exit when _sync_exit goes out of scope. Pulling the struct definition out of the function leaves:

objc_sync_enter(_sync_obj);
_sync_exit(_sync_obj);

Replacing the struct with the call its destructor makes gives:

objc_sync_enter(_sync_obj);
objc_sync_exit(_sync_obj);
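The substitution works because a C++ destructor runs when the object leaves scope, however the block is exited. A minimal sketch of that pattern follows, assuming hypothetical fake_sync_enter and fake_sync_exit functions that only record calls in place of the real objc_sync_enter and objc_sync_exit.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for objc_sync_enter / objc_sync_exit.
// They only record that they were called.
static std::vector<std::string> g_calls;
static void fake_sync_enter(const void*) { g_calls.push_back("enter"); }
static void fake_sync_exit(const void*)  { g_calls.push_back("exit"); }

// Same shape as the generated _SYNC_EXIT struct: the constructor stores
// the object, the destructor performs the exit call.
struct SyncExitGuard {
    explicit SyncExitGuard(const void* obj) : sync_obj(obj) {}
    ~SyncExitGuard() { fake_sync_exit(sync_obj); }
    const void* sync_obj;
};

// Mirrors the rewritten @synchronized block and returns the call order.
static std::vector<std::string> run_synchronized_block(const void* obj) {
    g_calls.clear();
    fake_sync_enter(obj);
    {
        SyncExitGuard guard(obj);
        g_calls.push_back("body");   // the user's code would run here
    }                                // guard is destroyed here, so exit runs
    return g_calls;
}
```

The recorded order is enter, body, exit, which is why the block can be read as a plain enter/exit pair.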

These are the two key functions behind @synchronized. Let's look them up in the libobjc source code.

objc_sync_enter

int objc_sync_enter(id obj)
{
    int result = OBJC_SYNC_SUCCESS;

    if (obj) {
        SyncData* data = id2data(obj, ACQUIRE);
        ASSERT(data);
        data->mutex.lock();
    } else {
        // @synchronized(nil) does nothing
        if (DebugNilSync) {
            _objc_inform("NIL SYNC DEBUG: @synchronized(nil); set a breakpoint on objc_sync_nil to debug");
        }
        objc_sync_nil();
    }

    return result;
}

As you can see from the comments above, if we pass nil as an argument, nothing is done.

objc_sync_exit

int objc_sync_exit(id obj)
{
    int result = OBJC_SYNC_SUCCESS;   
    if (obj) {
        SyncData* data = id2data(obj, RELEASE); 
        if (!data) {
            result = OBJC_SYNC_NOT_OWNING_THREAD_ERROR;
        } else {
            bool okay = data->mutex.tryUnlock();
            if (!okay) {
                result = OBJC_SYNC_NOT_OWNING_THREAD_ERROR;
            }
        }
    } else {
        // @synchronized(nil) does nothing
    }
    return result;
}

The source also shows that these two functions are a pair, one for entering and one for exiting. Both obtain their data through id2data, for example SyncData* data = id2data(obj, RELEASE). Jumping to the definition of SyncData shows that it is a linked-list node.

typedef struct alignas(CacheLineSize) SyncData {
    struct SyncData* nextData;
    DisguisedPtr<objc_object> object; // the associated object
    int32_t threadCount;  // number of THREADS using this block
    recursive_mutex_t mutex; // recursive lock
} SyncData;

SyncList analysis

Where is SyncData fetched from? From a global static table called sDataLists.

spinlock_t *lockp = &LOCK_FOR_OBJ(object);
SyncData **listp = &LIST_FOR_OBJ(object);
#define LOCK_FOR_OBJ(obj) sDataLists[obj].lock
#define LIST_FOR_OBJ(obj) sDataLists[obj].data
static StripedMap<SyncList> sDataLists;

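StripedMap is actually a hash table. We can set a breakpoint here and print it with p in the debugger.

When objects map to the same slot, their SyncData nodes are stored with the zipper method (separate chaining): each slot holds a linked list, and a node is added when a lock is taken. Because nodes are mostly inserted and removed rather than searched, this storage structure is a good fit.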
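How an object pointer is mapped to a slot can be sketched as follows. The shift-and-xor formula is modeled on my recollection of StripedMap::indexForPointer in the objc4 sources, so treat the exact shift amounts as an assumption; stripe_index is a hypothetical free function, not the runtime's API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical free-function version of the index computation.
// Assumption: the runtime mixes two shifted copies of the address
// and reduces the result modulo the stripe count.
template <unsigned StripeCount>
unsigned stripe_index(const void* p) {
    auto addr = reinterpret_cast<std::uintptr_t>(p);
    return static_cast<unsigned>(((addr >> 4) ^ (addr >> 9)) % StripeCount);
}
```

The same pointer always maps to the same slot, while different objects may share one, which is why each slot needs a list of SyncData nodes.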

Analysis of @synchronized principle

id2data analysis: first, fold up the conditional-compilation branches

#if SUPPORT_DIRECT_THREAD_KEYS
#endif

The logic in both branches is the same: if TLS is available, look in TLS; if not, look in the per-thread cache. The first time through there is no data, so the first code that actually does any work is this:

posix_memalign((void **)&result, alignof(SyncData), sizeof(SyncData));
result->object = (objc_object *)object;
result->threadCount = 1;
new (&result->mutex) recursive_mutex_t(fork_unsafe_lock);
result->nextData = *listp;
*listp = result; 

The linked list uses head insertion: the node is created first and then stored at the front of the list in the table. The next time we come in, data has a value:
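Head insertion is the two-pointer update result->nextData = *listp; *listp = result;. A minimal sketch, assuming a simplified Node in place of SyncData (no mutex, a plain pointer instead of DisguisedPtr) and hypothetical helpers push_front, find_node and free_list:

```cpp
#include <cassert>

// Simplified stand-in for SyncData.
struct Node {
    Node* next;
    const void* object;
    int threadCount;
};

// Create a node and link it in front of the current head.
Node* push_front(Node** listp, const void* object) {
    Node* result = new Node{*listp, object, 1};  // new node points at old head
    *listp = result;                             // list now starts at new node
    return result;
}

// Walk the list looking for the node that belongs to `object`.
Node* find_node(Node* head, const void* object) {
    for (Node* p = head; p != nullptr; p = p->next) {
        if (p->object == object) return p;
    }
    return nullptr;
}

// Release every node in the list.
void free_list(Node* head) {
    while (head != nullptr) {
        Node* next = head->next;
        delete head;
        head = next;
    }
}
```

Insertion never walks the list, so it takes constant time; only the lookup has to traverse it.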

 SyncData *data = (SyncData *)tls_get_direct(SYNC_DATA_DIRECT_KEY);
if (data->object == object)

If it is the same object, id2data tracks how many times the current thread has locked it and updates lockCount accordingly. When lockCount reaches 0, the current thread has fully unlocked the object, and the entry is removed from the thread's cache. If it is not the same object, the lookup continues and, if nothing matches, a new SyncData is created.

Conclusion: the most important things in @synchronized are two counters. threadCount records how many threads have locked the current object, and lockCount records how many times the object has been locked within the same thread. Together they show that @synchronized is a recursive lock that works across multiple threads.
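The interplay of the two counters can be sketched with standard C++ primitives. This is a simplified model, not the runtime's code: a std::map stands in for sDataLists, a thread_local map stands in for the TLS/cache lookup, and sketch_enter, sketch_exit, lock_count and thread_count are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <mutex>

struct Entry {
    std::recursive_mutex mutex;
    int threadCount = 0;   // threads currently holding this object
};

static std::mutex g_tableLock;
static std::map<const void*, Entry> g_table;                 // stand-in for sDataLists
static thread_local std::map<const void*, int> t_lockCount;  // stand-in for TLS/cache

void sketch_enter(const void* obj) {
    if (!obj) return;                      // nil: nothing is locked
    Entry* e;
    {
        std::lock_guard<std::mutex> g(g_tableLock);
        e = &g_table[obj];                 // created on first use
        if (t_lockCount[obj]++ == 0) ++e->threadCount;  // first lock on this thread
    }
    e->mutex.lock();                       // recursive: same thread may re-enter
}

// Returns false when the calling thread does not hold `obj`.
bool sketch_exit(const void* obj) {
    if (!obj) return true;                 // nil: nothing to unlock
    auto it = t_lockCount.find(obj);
    if (it == t_lockCount.end()) return false;
    Entry* e;
    {
        std::lock_guard<std::mutex> g(g_tableLock);
        e = &g_table[obj];
        if (--it->second == 0) {           // last unlock on this thread
            t_lockCount.erase(it);
            --e->threadCount;
        }
    }
    e->mutex.unlock();
    return true;
}

int lock_count(const void* obj) {
    auto it = t_lockCount.find(obj);
    return it == t_lockCount.end() ? 0 : it->second;
}

int thread_count(const void* obj) {
    std::lock_guard<std::mutex> g(g_tableLock);
    auto it = g_table.find(obj);
    return it == g_table.end() ? 0 : it->second.threadCount;
}
```

Entering twice on one thread raises lock_count to 2 while thread_count stays at 1; only the matching second exit brings both back to 0.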

@synchronized summary

  1. SyncList entries live in a hash table, and each one stores SyncData with the zipper method (separate chaining)
  2. The sDataLists array stores SyncList entries, keyed by the object (objc)
  3. The underlying functions are objc_sync_enter/objc_sync_exit, a symmetric pair built on a recursive lock
  4. Two kinds of per-thread storage are used: TLS and the cache
  5. The first time, a SyncData is created and inserted at the head of the linked list, with threadCount = 1
  6. On the second entry, it checks whether it is the same object; if so, nothing is re-created
  7. If it is found in TLS -> lockCount++
  8. If it is not found in TLS, a SyncData is set up for this thread and threadCount++
  9. In the exit function: lockCount--, threadCount--

@synchronized: a reentrant, recursive lock that works across multiple threads; the locked object must not be nil.

One reason we usually pass self is lifetime management: it ensures that the locked object is not accidentally released. The other reason is that, as we now know, the SyncList entries are keyed by the object; using the same self throughout one class means only the current self is chained, which makes storage and release simple. In addition, in the experiment at the beginning we found that the performance of @synchronized differs considerably between the simulator and a real device. The definition of StripedMap shows one difference between the two:

template<typename T>
class StripedMap {
#if TARGET_OS_IPHONE && !TARGET_OS_SIMULATOR
    enum { StripeCount = 8 };
#else
    enum { StripeCount = 64 };
#endif


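On a real device StripeCount is 8, so the table has only 8 stripes (each a SyncList with its own lock) shared by all locked objects, compared with 64 on the simulator.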