What is the CAS

CAS is an atomic operation that allows the CPU to Compare And Swap whether a value in memory is the same as the expected value. If the value is the same as the expected value, the CPU will update the value to the new value. If the value is not the same, the CPU will not update the value. Its implementation is accomplished by calling CPU instructions with the help of C/C++, so it is very efficient. The principle of CAS is simple and is described here using a piece of Java code

Public Boolean compareAndSwap(int value, int expect, int update) {// If the value in memory is the same as expected expect, update the value to the new value updateif (value == expect) {
        value = update;
        return true;
    } else {
        return false; }}Copy the code

The general process is to give the CPU the value in memory, our expected value, and the new value. If the value in memory is the same as our expected value, the value is updated to the new value. Otherwise, nothing is done. This process is done in the CPU. It is not easy to describe the CPU’s working process here, so let’s use Java code to describe it.

Broadening source code

Java implements CAS in the Unsafe(Sun.misc. Unsafe) class, whereas Java does not have direct access to the underlying APIS of the operating system (due to its cross-platform nature). Instead of implementing CAS directly in the Unsafe class, Java implements CAS via **JDI(Java Native Interface)** Native calls to C/C++. Unsafe has many ways to manipulate CAS, but here are a few examples

public final native boolean compareAndSwapObject(Object var1, long var2, Object var4, Object var5);

public final native boolean compareAndSwapInt(Object var1, long var2, int var4, int var5);

public final native boolean compareAndSwapLong(Object var1, long var2, long var4, long var6);
Copy the code

Public final Native Boolean compareAndSwapInt(Object var1, Long var2, int var4, int var5); This method compares whether a value in memory (integer) is the same as our expected value (var4). If so, we update the value in memory to var5. The parameter var1 is the object in which the value is located, var2 is the memory offset of the value in the object (var1). The parameters var1 and var2 are used to locate the memory where the value resides.

Unsafe. Java plays an important role here:

  1. Passes the object reference, the offset of the value in the object, the expected value, and the new value to be updated toUnsafe.cpp
  2. Returns if the value was updated successfullytrueTo the developer, return if there is no updatefalse

Unsafe. CPP plays an important role here:

  1. Accept fromUnsafeObject reference, offset, expected value, and new value to be updated according to object reference and offsetCalculates the address of the valueAnd passes the address of the value, the expected value, and the new value to update to the CPU
  2. Returns if the value was updated successfullytruetoUnsafe.javaIf no updates are availablefalse

The CPU plays a role here:

  1. Accept fromUnsafe.cppThe address passed, the expected value, and the new value to update, execute the instructioncmpxchgCompares whether the value in the address is the same as the expected value. If the value is the same, the value is updated to the new value. If the value is different, no operation is performed
  2. Returns the result of the operation toUnsafe.cpp

The disadvantage of the CAS

Although CAS achieves atomic operation efficiently, it also has some disadvantages, mainly in the following three aspects.

ABA problem

In multi-threaded scenarios, ABA problems may occur in CAS. For example, two threads perform CAS operations on the same value (the initial value is A) at the same time. These three threads are as follows

  1. Thread 1, expected value A, to update value B
  2. Thread 2, expected value A, to update value B

Thread 1 scoop CPU time slice, and thread 2 blocked because of other reasons, thread 1 values and expectations of A value comparison, found that equal and then update the value of B and this time the thread 3, expectations for B, to update the value of A, thread 3 values compared with the desired value B, found that equal to update the value of A thread 2 at this time to recover from blocking, At this time, thread 2’s value is compared with the expected value A, and the value is found to be equal to B. Although thread 2 has also completed the operation, thread 2 does not know that the value has gone through the process of A->B->A.

1) Thread 1 (ATM) : get the current value of 100, expect to update to 50, thread 2 (ATM) : Get current value 100, expect update to 50, thread 1 executes successfully, thread 2 blocks for some reason, then someone sends Xiaoming 50 thread 3 (default) : Thread 3 returns from the Block with a balance of 100. Thread 2 returns from the Block with a balance of 100. Thread 2 returns from the Block with a balance of 100. At this point, you can see that the actual balance should be 100 (100-50+50), but it is actually 50 (100-50+50-50), which is the ABA problem resulting in a successful submission.

Solution: Add the version number before the variable. Each time the variable is updated, the version number of the variable is +1, i.e. A->B->A becomes 1A->2B->3A.

Long cycle time and high overhead

If the CAS operation fails, it needs to loop through the CAS operation (loop through and update the expected value to the latest), which can cause a lot of CPU overhead if it fails for a long time.

This cycle is also called spin

Solution: Limit the number of spins to prevent an infinite loop.

Atomic operations of only one shared variable are guaranteed

Atomic operations on CAS can only be performed on one shared variable.

Solution: If you need to operate on multiple shared variables, you can use pessimistic locking to ensure atomicity, or you can combine multiple shared variables into a single shared variable for CAS operation.

The application of CAS

We know that CAS operations do not lock shared variables, which is a non-blocking synchronization mechanism, and CAS is an implementation of optimistic locking.

  1. Optimistic locking Optimistic locking always assumes the best case scenario. Every time a thread manipulates data, it assumes that the data will not be modified by another thread. Therefore, the data will not be locked every time it manipulates data. Only when the data needs to be updated does it determine whether the data has been modified by another thread. If the data has been modified, the operation is rejected and an error message is returned to the user.
  2. Pessimistic lock Pessimistic lock always assumes the worst case, every time to manipulate data, we assume that the data will be modified by the thread, so every time we manipulate data, the data will be locked, so that other threads can not manipulate the data, other threads will continue to block until the data lock. This can affect efficiency, for example, when one thread does a time-consuming operation, other threads just want to get the value of the data and wait a long time.

Java uses CAS optimistic locking and atomicity to efficiently solve the security problems of multithreading, such as ConcurrentHashMap, volatile keyword and ReentrantLock in JDK1.8.

reference

What is the ABA problem?

Original address: ddnd.cn/2019/03/13/…