Thread safety comes up in many places in iOS development. Why code becomes unsafe under multithreading, and how to define that unsafety, is a topic worth studying in depth.

Shared state, where multiple threads access the same property of an object, is a very common scenario in iOS programming, so we will start with the thread safety of properties.

Property

When the thread safety of properties comes up, many people know that adding the atomic attribute to a property guarantees thread safety to a certain extent, for example:

@property (atomic, strong) NSString *userName;

Things are not as simple as they seem. To analyze how properties behave in multithreaded scenarios, we need to distinguish between kinds of properties.

We can simply classify properties as value types and object types. Value types are the primitive types such as int, long, and bool, along with other non-object types. An object-type property is declared as a pointer to a memory region that matches the type definition.

In the code above, userName is an object type. Accessing userName may mean accessing the pointer itself, or it may mean accessing the memory region that the pointer points to.

For example:

self.userName = @"peak";

assigns to the pointer itself, while

[self.userName rangeOfString:@"peak"];

accesses the memory region of the string that the pointer points to. The two are different.

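So we can roughly divide properties into three categories:

  • value-type properties
  • pointer properties (the pointer itself)
  • the memory region that a pointer property points to

With this classification in place, we need to understand the memory model behind each of the three.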

Memory Layout

When we talk about thread safety, we are talking about the safety of multiple threads accessing the same memory region at the same time. There are two operations on a given region: load (read) and store (write). When a read and a write hit the same region at the same time, the code may become thread-unsafe. So before going further, we should understand the memory model of the three kinds of properties above.

On a 64-bit system, for example, the NSString* pointer occupies an 8-byte memory region, int count occupies a 4-byte region, and the region holding @"peak" has a size that depends on the length of the string.

When we access a property, we are actually accessing one of these three memory regions.

self.userName = @"peak";

modifies the first region.

self.count = 10;

modifies the second region.

[self.userName rangeOfString:@"peak"];

reads the third region.
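The sizes mentioned above can be checked directly. The following is only a minimal sketch, assuming a 64-bit Apple platform; the helper name PrintPropertySizes is made up for illustration and is not part of any API.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: reports the size of the regions discussed above.
static void PrintPropertySizes(void) {
    NSString *userName = @"peak";

    // Region 1: the pointer itself (8 bytes on a 64-bit system).
    NSLog(@"pointer: %zu bytes", sizeof(userName));

    // Region 2: a value type such as int (4 bytes).
    NSLog(@"int: %zu bytes", sizeof(int));

    // Region 3: the object the pointer refers to. Its size depends on the
    // string contents, so only the character count is reported here.
    NSLog(@"string length: %lu characters", (unsigned long)userName.length);
}
```

Calling PrintPropertySizes() once, for instance at app launch, is enough to see the three values in the console.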

Definition of unsafe

Now that we know the kinds of properties and their memory model, let's look at how "unsafe" is defined. Wikipedia puts it this way:

A piece of code is thread-safe if it manipulates shared data structures only in a manner that guarantees safe execution by multiple threads at the same time

This definition may seem a bit abstract. We can read thread-unsafety as: unexpected results when multiple threads access the same data. Unexpected results cover several scenarios, not necessarily a crash, and they are analyzed one by one below.

Let's look at how multiple threads access memory at the same time. Ignoring the CPU caches and how they cache variables, memory access can be modeled simply: several threads reach a single memory through a single bus.

In this model there is only one bus and one memory. Even in a multithreaded environment, two threads can never access the same memory region at exactly the same time; memory accesses are queued and performed one after another over the bus. So before we proceed, we need to establish two conclusions:

Conclusion 1: Memory access is serial. Concurrent access by itself does not scramble the data in memory or crash the application.

Conclusion 2: If the data being loaded or stored is no wider than the bus (and is naturally aligned, as compiler-laid-out variables normally are), the load or store completes in one step and is atomic. For example, on a 64-bit system a single read or write of a bool, int, or long is atomic.

Next, let's look at the thread-unsafe scenarios one by one, following the three categories of properties above.

Value-type properties

Take BOOL as an example, with two threads accessing the following property:

@property (nonatomic, assign) BOOL isDeleted;

// thread 1
BOOL isDeleted = self.isDeleted;

// thread 2
self.isDeleted = NO;

Thread 1 performs a load and thread 2 a store. Either may reach isDeleted first, but the two accesses are always queued one after the other. A BOOL is only 1 byte, and a 64-bit system can read or write up to 8 bytes in a single instruction, so we can treat reads and writes of a BOOL as atomic. From the point of view of atomicity, then, there is no real difference between declaring a BOOL property atomic or nonatomic (unless, of course, you override the getter or setter).

What if it is an int?

@property (nonatomic, assign) int count;

// thread 1
int curCount = self.count;

// thread 2
self.count = 1;

The same reasoning applies. An int is 4 bytes long, so a read or a write completes in a single instruction, and in theory both are atomic. As far as memory access is concerned, nonatomic and atomic make no difference.

So what exactly is atomic good for? As far as I know, it has two uses:

Use one: generating atomic getters and setters.

With atomic set, the getter and setter generated by default are atomic. That is, while thread 1 is executing the getter (setting up the stack frame, returning the value, tearing the frame down), thread 2 must wait for the getter to finish before it can execute the setter. For example, on a 32-bit system a getter that returns a 64-bit double cannot read the value from memory in one atomic operation, because only 32 bits move at a time. Without the lock that atomic adds, a setter on another thread may run in the middle of the read, and the getter returns a torn value made of half the old data and half the new. When such a value appears, the code is thread-unsafe.

Use two: setting memory barriers.

In the Objective-C implementation, almost every locking operation ends up setting a memory barrier. Since atomic essentially locks the getter and setter, it sets a memory barrier as well. The official documentation states:

Note: Most types of locks also incorporate a memory barrier to ensure that any preceding load and store instructions are completed before entering the critical section.

What are memory barriers for?

Memory barriers ensure that memory operations are performed in the order in which the code is written. Odd as it may sound, the compiler optimizes our code and, when it sees fit, changes the order of the machine instructions the code is finally translated into. Take the following code:

self.intA = 0; // line 1
self.intB = 1; // line 2

In some scenarios the compiler may execute line 2 before line 1, because it sees no dependency between intA and intB. Yet code on another thread may depend on the two being written in order, in which case line 1 must execute before line 2.

If the property is atomic, the barrier that comes with the accessor lock guarantees that line 1 executes before line 2. A cross-thread dependency combined with a compiler reordering is a very rare scenario. In this extreme case atomic makes our code slightly safer, but I have not run into one so far when writing iOS code; quite possibly the compiler is smart enough to place memory barriers where they are needed.
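When such a cross-thread ordering dependency does exist, the ordering can also be requested explicitly rather than left to a side effect of a lock. The sketch below swaps in a different technique from the one described above: C11 atomics from <stdatomic.h>, which can be used in Objective-C files. The names payload, ready, Publish and TryConsume are hypothetical, chosen only for this illustration.

```objc
#import <Foundation/Foundation.h>
#include <stdatomic.h>
#include <stdbool.h>

// Hypothetical shared state: `payload` must be written before `ready`
// is observed as true by another thread.
static int payload = 0;
static atomic_bool ready = false;

// Writer: store the data, then publish the flag with release ordering,
// so the write to `payload` cannot be moved after the write to `ready`.
static void Publish(int value) {
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

// Reader: the acquire load pairs with the release store above.
// Returns YES and fills *out only once the flag has been published.
static BOOL TryConsume(int *out) {
    if (atomic_load_explicit(&ready, memory_order_acquire)) {
        *out = payload;
        return YES;
    }
    return NO;
}
```

One thread would call Publish(42) while another polls TryConsume(&value); if the reader sees the flag, it is also guaranteed to see the payload written before it.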

Does using atomic guarantee thread safety? Look at the following code:

@property (atomic, assign) int intA;

// thread A
for (int i = 0; i < 10000; i++) {
    self.intA = self.intA + 1;
    NSLog(@"Thread A: %d\n", self.intA);
}

// thread B
for (int i = 0; i < 10000; i++) {
    self.intA = self.intA + 1;
    NSLog(@"Thread B: %d\n", self.intA);
}

Even with intA declared atomic, the final result is not necessarily 20000. The reason is that self.intA = self.intA + 1; is not atomic. The getter and setter of intA are each atomic, but the statement as a whole is not: the assignment takes at least three steps (load, +1, store). By the time the current thread stores its result, other threads may already have stored several times, so the final value ends up smaller than expected. This, too, is a thread-unsafe scenario.
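To make the three steps behave as one, the whole read-modify-write has to sit inside a single critical section. The following is a minimal sketch using NSLock and GCD; the class name Counter, the method increment and the function RunCounterDemo are hypothetical and exist only for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class for illustration only.
@interface Counter : NSObject
@property (nonatomic, assign) int intA;
- (void)increment;
@end

@implementation Counter {
    NSLock *_lock;
}

- (instancetype)init {
    if ((self = [super init])) {
        _lock = [[NSLock alloc] init];
    }
    return self;
}

// The load, the +1 and the store all happen while the lock is held.
- (void)increment {
    [_lock lock];
    _intA = _intA + 1;
    [_lock unlock];
}

@end

// Runs two concurrent loops of 10000 increments each and returns the total.
static int RunCounterDemo(void) {
    Counter *counter = [[Counter alloc] init];
    dispatch_queue_t queue = dispatch_get_global_queue(QOS_CLASS_DEFAULT, 0);
    dispatch_group_t group = dispatch_group_create();
    for (int t = 0; t < 2; t++) {
        dispatch_group_async(group, queue, ^{
            for (int i = 0; i < 10000; i++) {
                [counter increment];
            }
        });
    }
    // Wait for both loops before reading the result.
    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
    return counter.intA;
}
```

Because every increment runs under the same lock, no update is lost and RunCounterDemo() returns 20000, even though the property itself is nonatomic.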

Pointer properties

A pointer property usually points to an object, for example:

@property (atomic, strong) NSString *userName;

On both 32-bit and 64-bit iOS, the value of a pointer can be loaded or stored with a single instruction. Unlike primitive types, however, object types also involve memory management. In the MRC era, the setter generated by default looked roughly like this:

- (void)setUserName:(NSString *)userName {
    if (_userName != userName) {
        [userName retain];
        [_userName release];
        _userName = userName;
    }
}

The setter does more than assign; it also calls retain and release. If the property is nonatomic, the setter above is not atomic. Imagine this scenario: thread 1 obtains the current _userName through the getter, and then thread 2 calls [_userName release] through the setter. The _userName held by thread 1 now points to freed memory. Sending a message to that address crashes the app, and we have a thread-unsafe scenario.

Under ARC, the compiler inserts retain and release for us. Most of the time we need not care about memory management, but the retain and release calls are still there in the code that finally runs. So in theory atomic and nonatomic still differ for object-type properties, yet in practice I have never hit this kind of problem when declaring an NSString* property nonatomic. Most likely ARC's memory management optimizations already handle the scenario above. My personal feeling is that if you only read and write the pointer of an object-type property, atomic and nonatomic make no real difference to thread safety.

The memory region a pointer property points to

This is the kind of multithreaded access where we most easily go wrong, even when the property is declared atomic. Here we are not accessing the pointer stored in the property but the memory region the pointer points to. Consider the following code:

@property (atomic, strong) NSString *stringA;

// thread A
for (int i = 0; i < 100000; i++) {
    if (i % 2 == 0) {
        self.stringA = @"a very long string";
    }
    else {
        self.stringA = @"string";
    }
    NSLog(@"Thread A: %@\n", self.stringA);
}

// thread B
for (int i = 0; i < 100000; i++) {
    if (self.stringA.length >= 10) {
        NSString *subStr = [self.stringA substringWithRange:NSMakeRange(0, 10)];
    }
    NSLog(@"Thread B: %@\n", self.stringA);
}

Even though stringA is declared atomic, this code can still crash. Thread B may pass the length check while self.stringA is @"a very long string"; by the time it takes the substring, thread A may already have set self.stringA = @"string". The range is then out of bounds, the app crashes, and the code is thread-unsafe.

The same scenario applies to collection classes, for example:

@property (atomic, strong) NSArray *arr;

// thread A
for (int i = 0; i < 100000; i++) {
    if (i % 2 == 0) {
        self.arr = @[@"1", @"2", @"3"];
    }
    else {
        self.arr = @[@"1"];
    }
    NSLog(@"Thread A: %@\n", self.arr);
}

// thread B
for (int i = 0; i < 100000; i++) {
    if (self.arr.count >= 2) {
        NSString *str = [self.arr objectAtIndex:1];
    }
    NSLog(@"Thread B: %@\n", self.arr);
}

Similarly, even though thread B checks count before calling objectAtIndex:, it can still crash easily, because between those two lines another thread may have changed which array arr points to.
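The crash in thread B comes from reading self.arr twice and getting two different arrays. One way to narrow the problem, shown here only as a sketch and not as the approach this article goes on to recommend, is to read the property once into a local variable and work on that snapshot. It relies on two assumptions: the property is atomic and strong, so the single read returns a valid object, and the array is an immutable NSArray. The names Holder and SecondElementOrNil are hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical holder class mirroring the atomic property from the example.
@interface Holder : NSObject
@property (atomic, strong) NSArray *arr;
@end

@implementation Holder
@end

// Reads the property once; the check and the access then use the same array.
static NSString *SecondElementOrNil(Holder *holder) {
    NSArray *snapshot = holder.arr;  // single atomic read, keeps the array alive
    if (snapshot.count >= 2) {
        return [snapshot objectAtIndex:1];
    }
    return nil;
}
```

The snapshot may be stale by the time it is used, so this avoids the out-of-bounds crash without making the surrounding logic atomic. If the property held an NSMutableArray that other threads mutate in place, the snapshot would not help and a lock would still be needed.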

So what really deserves care is access to this third kind of memory region. Declaring the property atomic does not help. Most of the baffling, hard-to-reproduce multithreaded crashes in our apps fall into this category, so whenever this kind of memory is accessed from multiple threads we should be extremely careful. How to avoid this type of crash is discussed below.

Summary of property thread safety:

In short, atomic locks the getter and setter. It guarantees that the code is safe once it has entered the getter or setter; outside of them, thread safety is up to the programmer. So the atomic attribute has no direct bearing on whether code that uses the property is thread-safe. Atomic locking also costs some performance, which is why in iOS code we generally declare properties nonatomic and, where thread safety is needed, add our own locks for synchronization.

How to achieve thread safety?

The key word is atomicity. Whether the operation is as small as accessing a primitive variable or as large as executing a long piece of logic, atomicity ensures that the code runs serially and that no other thread intervenes halfway through.

Atomicity is a relative concept; its granularity can be coarse or fine.

For example, the following code:

if (self.stringA.length >= 10) {
    NSString *subStr = [self.stringA substringWithRange:NSMakeRange(0, 10)];
}

is not atomic.

But with locks:

// thread A
[_lock lock];
for (int i = 0; i < 100000; i++) {
    if (i % 2 == 0) {
        self.stringA = @"a very long string";
    }
    else {
        self.stringA = @"string";
    }
    NSLog(@"Thread A: %@\n", self.stringA);
}
[_lock unlock];

// thread B
[_lock lock];
if (self.stringA.length >= 10) {
    NSString *subStr = [self.stringA substringWithRange:NSMakeRange(0, 10)];
}
[_lock unlock];

the whole block is atomic and can be considered thread-safe.

Likewise:

if (self.arr.count >= 2) {
    NSString *str = [self.arr objectAtIndex:1];
}

is not atomic,

while

// thread A
[_lock lock];
for (int i = 0; i < 100000; i++) {
    if (i % 2 == 0) {
        self.arr = @[@"1", @"2", @"3"];
    }
    else {
        self.arr = @[@"1"];
    }
    NSLog(@"Thread A: %@\n", self.arr);
}
[_lock unlock];

// thread B
[_lock lock];
if (self.arr.count >= 2) {
    NSString *str = [self.arr objectAtIndex:1];
}
[_lock unlock];

is atomic. Note that both the reading side and the writing side must take the lock.

That is why, when we need thread safety, we do not add the atomic keyword to the property. Instead we declare the property nonatomic (which avoids the lock overhead in the getter and setter) and do the locking ourselves.
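One way to keep readers and writers on the same lock is to hide both the data and the lock inside one class, so callers cannot forget to lock. This is a minimal sketch under that assumption; the class SafeText and the method prefixOfLength: are hypothetical names, not an existing API.

```objc
#import <Foundation/Foundation.h>

// Hypothetical wrapper that owns both the string and the lock guarding it.
@interface SafeText : NSObject
- (void)setText:(NSString *)text;
- (NSString *)prefixOfLength:(NSUInteger)length;
@end

@implementation SafeText {
    NSLock *_lock;
    NSString *_text;
}

- (instancetype)init {
    if ((self = [super init])) {
        _lock = [[NSLock alloc] init];
        _text = @"";
    }
    return self;
}

// Writer side: the assignment happens under the lock.
- (void)setText:(NSString *)text {
    [_lock lock];
    _text = [text copy];
    [_lock unlock];
}

// Reader side: the length check and the substring share one critical section.
// Returns nil when the stored text is shorter than `length`.
- (NSString *)prefixOfLength:(NSUInteger)length {
    [_lock lock];
    NSString *result = nil;
    if (_text.length >= length) {
        result = [_text substringToIndex:length];
    }
    [_lock unlock];
    return result;
}

@end
```

With this shape, thread A calls setText: and thread B calls prefixOfLength:10, and neither needs to know that a lock exists.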

Which lock to use?

iOS offers a number of ways to lock code. Commonly used ones are:

  • @synchronized(token)
  • NSLock
  • dispatch_semaphore_t
  • OSSpinLock

Any of these provides atomicity, and the performance cost decreases from top to bottom.

My personal advice: when writing application-layer code, use whichever of these you are comfortable with, except OSSpinLock. Correct logic matters more than the performance differences between these locks, and most of those differences are invisible to users.
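For comparison, here is the same one-line critical section guarded by the first three mechanisms from the list. It is a sketch for illustration only: sharedCount and the three function names are made up, and OSSpinLock is left out because Apple deprecated it in iOS 10 in favor of os_unfair_lock.

```objc
#import <Foundation/Foundation.h>

// Hypothetical shared state used by all three variants.
static int sharedCount = 0;

// 1. @synchronized: any object can serve as the token.
static void IncrementWithSynchronized(id token) {
    @synchronized (token) {
        sharedCount++;
    }
}

// 2. NSLock: explicit lock and unlock calls.
static void IncrementWithLock(NSLock *lock) {
    [lock lock];
    sharedCount++;
    [lock unlock];
}

// 3. dispatch_semaphore_t: a semaphore created with value 1 acts as a mutex.
static void IncrementWithSemaphore(dispatch_semaphore_t semaphore) {
    dispatch_semaphore_wait(semaphore, DISPATCH_TIME_FOREVER);
    sharedCount++;
    dispatch_semaphore_signal(semaphore);
}
```

In real code, pick one mechanism per piece of shared data and use it everywhere that data is touched. Two threads that guard the same variable with different locks do not exclude each other.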

Of course, a few scenarios do call for performance, such as writing a framework or sharing data that many threads read and write frequently. There we need a rough idea of how much a lock costs.

According to the official documentation, on an Intel-based iMac with a 2 GHz Core Duo processor and 1 GB of RAM running OS X v10.5, acquiring an uncontended mutex costs about 0.2 microseconds. We can think of lock costs as being in the sub-microsecond range.

Atomic Operations

Besides locks, iOS offers another way to achieve atomicity: atomic operations. They cost roughly an order of magnitude less than locks and show up in some high-performance third-party frameworks. The atomic operations are declared in /usr/include/libkern/OSAtomic.h.



For example,

_intA++;

is not atomic, while

OSAtomicIncrement32(&(_intA));

is atomic and thread-safe.
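The OSAtomic functions have been deprecated since iOS 10 and macOS 10.12 in favor of the C11 atomics in <stdatomic.h>, so the sketch below uses atomic_fetch_add as a stand-in for OSAtomicIncrement32. The names atomicCounter and RunAtomicCounterDemo are hypothetical.

```objc
#import <Foundation/Foundation.h>
#include <stdatomic.h>

// Hypothetical shared counter, incremented without any lock.
static atomic_int atomicCounter = 0;

// Runs two concurrent loops of 10000 atomic increments and returns the total.
static int RunAtomicCounterDemo(void) {
    atomic_store(&atomicCounter, 0);
    dispatch_queue_t queue = dispatch_get_global_queue(QOS_CLASS_DEFAULT, 0);
    dispatch_group_t group = dispatch_group_create();
    for (int t = 0; t < 2; t++) {
        dispatch_group_async(group, queue, ^{
            for (int i = 0; i < 10000; i++) {
                // Load, +1 and store happen as one indivisible operation.
                atomic_fetch_add(&atomicCounter, 1);
            }
        });
    }
    // Wait for both loops before reading the result.
    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
    return atomic_load(&atomicCounter);
}
```

Unlike the plain _intA++ above, no increment is lost here, so the function returns 20000.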

Atomic operations work only on 32-bit or 64-bit data types. When multiple threads use objects such as NSString or NSArray, locks are still required.

Most atomic operations come in two versions: OSAtomicXXX and OSAtomicXXXBarrier. The Barrier version adds a memory barrier, which is useful when multiple variables depend on each other across threads, because it preserves the order those dependencies require.

For everyday application-layer code, I still recommend @synchronized, NSLock, or dispatch_semaphore_t. Thread safety matters more than multithreaded performance: secure the former fully first, and pursue the latter only when there is room to spare.

Try to avoid multithreaded designs

No matter how much code we have written, we have to admit that thread safety is a complex problem. As programmers we should avoid multithreaded designs whenever possible rather than chase clever locking tricks.

I will write a later article on functional programming and its core ideas. Even in a non-functional language such as Objective-C, those ideas go a long way toward avoiding thread-safety problems.

Conclusion

That concludes this analysis of thread-unsafety on iOS. Writing thread-safe code depends on understanding memory layout and atomicity. I hope this article has made the real difference between atomic and nonatomic clear.