The root of concurrency problems
The gap between the speeds of the CPU, memory, and I/O devices has always been a subject of computer optimization. The difference between the three can be pictured vividly:
- One day for the CPU is like a year for memory
- One day for memory is like ten years for an I/O device
Because of this, when the CPU works together with memory and I/O devices, it spends a great deal of time waiting. The CPU is the brain of the computer, and having it waste most of its time waiting inevitably reduces the machine's overall performance and efficiency. To address this, many predecessors put a lot of effort into balancing the speed differences among the three, mainly as follows:
- CPUs added caches to balance the speed difference with memory, as shown in the figure below
- Operating systems added processes, threads, and time-sharing of the CPU to balance the speed difference between the CPU and I/O devices
- Compilers optimize the order in which instructions are executed so that the cache can be used more efficiently
These measures improve the overall performance and efficiency of the computer, but they also bring the following problems.
Memory visibility issues from caching
Start with a code example: the program below never ends up printing that flag is true, and so never exits the loop.
```java
public class VolatileTest {
    public static void main(String[] args) {
        VolatileDemo volatileDemo = new VolatileDemo();
        Thread thread = new Thread(volatileDemo, "Thread A");
        thread.start();
        while (true) {
            // If flag is true
            if (volatileDemo.isFlag()) {
                // Print this line and exit the loop
                System.out.println("Flag is true");
                break;
            }
        }
    }
}

class VolatileDemo implements Runnable {
    private boolean flag = false;

    @Override
    public void run() {
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        // Set flag to true
        flag = true;
    }

    public boolean isFlag() {
        return flag;
    }
}
```
Each CPU has its own cache. If a CPU's cached write has not yet been flushed back to main memory, that data is invisible to the other CPUs, as shown in the figure below.
Atomicity issues with thread switching
Let's start with a code example: in theory each of the ten threads should print a distinct serial number, but in practice duplicate values can appear.
```java
public class AtomicTest {
    public static void main(String[] args) {
        AtomicDemo atomicDemo = new AtomicDemo();
        for (int i = 0; i < 10; i++) {
            new Thread(atomicDemo).start();
        }
    }
}

class AtomicDemo implements Runnable {
    private int serialNumber = 0;

    @Override
    public void run() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println(Thread.currentThread().getName() + ":" + getSerialNumber());
    }

    public int getSerialNumber() {
        return serialNumber += 1;
    }
}
```
The root of the problem is serialNumber += 1, which actually takes three steps:
- Load serialNumber from memory into a CPU register
- Perform the +1 operation on the register
- Write the result back to memory
So the following can happen: thread 1 reads serialNumber (0) from main memory and increments it in its register; before it writes the result back, thread 2 gets the CPU, reads serialNumber (still 0) from main memory, increments it, writes 1 back, and releases the CPU; then thread 1 writes its own result back. serialNumber has gone through two increments, but its value has only increased by one, as shown below. Note that this is a separate issue from memory visibility: it arises from the interleaving of the read-modify-write steps and can occur even when visibility is not a problem.
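One way to make the increment atomic is `java.util.concurrent.atomic.AtomicInteger`, whose `incrementAndGet()` performs the whole read-modify-write as a single atomic operation. A minimal sketch (the class name `AtomicFixSketch` is made up for this example):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicFixSketch {
    private static final AtomicInteger serialNumber = new AtomicInteger(0);

    // incrementAndGet() is one atomic hardware operation, so two threads can
    // never observe the same value, even if a thread switch happens mid-call
    public static int nextSerialNumber() {
        return serialNumber.incrementAndGet();
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[10];
        for (int i = 0; i < 10; i++) {
            threads[i] = new Thread(() ->
                    System.out.println(Thread.currentThread().getName()
                            + ":" + nextSerialNumber()));
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join(); // wait for all 10 distinct values to be printed
        }
    }
}
```

With this change, the ten threads always print ten distinct serial numbers.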
Ordering problems from compiler optimization
Ordering refers to the sequence in which code executes, but compilers sometimes change that sequence to optimize performance: for example, a = 8; b = 7; may become b = 7; a = 8;
Consider the following code. Two threads assign to a, b, x, and y. Depending on how the threads interleave, various combinations of results are possible, but in theory x == 0 && y == 0 should never occur, because that outcome would require the two lines x = b and y = a to both run before a = 1 and b = 1.
```java
public class ReOrderTest {
    private static int x = 0;
    private static int y = 0;
    private static int a = 0;
    private static int b = 0;

    public static void main(String[] args) throws InterruptedException {
        int i = 0;
        for (;;) {
            i++;
            x = 0; y = 0;
            a = 0; b = 0;
            Thread one = new Thread(() -> {
                a = 1;
                x = b;
            });
            Thread two = new Thread(() -> {
                b = 1;
                y = a;
            });
            one.start();
            two.start();
            one.join();
            two.join();
            String result = "Iteration " + i + ": x = (" + x + "), y = (" + y + ")";
            if (x == 0 && y == 0) {
                System.out.println(result);
                break;
            }
        }
    }
}
```
In actual execution, instruction reordering does occur
Synchronized
The root cause of the atomicity problem is thread switching, so we can solve it by making sure that only one thread runs inside the critical section at a time. This condition is called mutual exclusion, and Java provides the synchronized keyword to implement it.
Usage
Synchronized code block: the lock is the object specified in the parentheses after synchronized
```java
public class SynchronizedCodeBlock {
    public static void main(String[] args) {
        SynchronizedCodeBlockThread s1 = new SynchronizedCodeBlockThread();
        Thread a = new Thread(s1, "Thread A");
        Thread b = new Thread(s1, "Thread B");
        a.start();
        b.start();
    }
}

class SynchronizedCodeBlockThread implements Runnable {
    public static int count = 0;

    public void method() throws InterruptedException {
        synchronized (this) {
            for (int i = 0; i < 5; i++) {
                System.out.println(Thread.currentThread().getName() + ":" + (count++));
                Thread.sleep(1000);
            }
        }
    }

    @Override
    public void run() {
        try {
            method();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
```
Synchronized static method: the lock is the Class object of the current class
```java
public class SynchronizedStaticTest {
    public static void main(String[] args) {
        SyncThread s1 = new SyncThread();
        SyncThread s2 = new SyncThread();
        Thread a = new Thread(s1, "Thread A");
        Thread b = new Thread(s2, "Thread B");
        a.start();
        b.start();
    }
}

class SyncThread implements Runnable {
    public static int count = 0;

    public synchronized static void method() throws InterruptedException {
        for (int i = 0; i < 5; i++) {
            System.out.println(Thread.currentThread().getName() + ":" + (count++));
            Thread.sleep(1000);
        }
    }

    @Override
    public void run() {
        try {
            method();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
```
Synchronized instance method: the lock is the current instance object (this)
```java
public class SynchronizedTest {
    public static void main(String[] args) {
        SynchronizedMethodThread s1 = new SynchronizedMethodThread();
        Thread a = new Thread(s1, "Thread A");
        Thread b = new Thread(s1, "Thread B");
        a.start();
        b.start();
    }
}

class SynchronizedMethodThread implements Runnable {
    public static int count = 0;

    public synchronized void method() throws InterruptedException {
        for (int i = 0; i < 5; i++) {
            System.out.println(Thread.currentThread().getName() + ":" + (count++));
            Thread.sleep(1000);
        }
    }

    @Override
    public void run() {
        try {
            method();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
```
How the lock works
When a thread enters a synchronized block it must first acquire the lock, and it must release the lock when it exits the block or throws an exception. So where does the lock live? The layout of an object in memory is divided into three areas: the object header, the instance data, and the alignment padding. The object header in turn consists of the mark word and the klass pointer
The secret of the lock lies in the mark word, which stores the object's hash code, GC information, and the lock flag bits
The stereotype is that synchronized is a very heavy lock, but since Java 1.6 synchronization has been greatly optimized and lock performance has improved considerably; it is now a commonly used keyword in many frameworks, such as Spring. A synchronized lock has four states, from lowest to highest: no lock, biased lock, lightweight lock, and heavyweight lock. A lock can be upgraded but not downgraded, meaning that once a biased lock has been upgraded to a lightweight lock, it cannot go back
Biased locking
When a thread enters a synchronized block and acquires the lock, the lock record in the object header stores that thread's ID. Afterwards, each time the thread enters or exits the block, it only needs to check whether the mark word still points to it. A simple analogy: a biased lock is like renting a shared bicycle that, for the time being, only you use; scanning the QR code once corresponds to storing the thread ID
Why set bias lock?
The HotSpot authors' research found that in most cases a lock is not contended by multiple threads but is repeatedly acquired by the same thread. In that case we stick a label on the lock, marking the shared bike as ours, so we don't need to scan the code every time we use it
What happens if another thread accesses the synchronized method?
The biased lock is revoked
Lightweight lock
When locks are contested, they are upgraded to lightweight locks
What happens if the lock is already a lightweight lock and another thread accesses the synchronized method?
The thread spins to acquire the lock resource
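The spinning can be pictured with a toy spin lock built on an atomic compare-and-set. This is only an illustration of the idea, not how HotSpot implements lightweight locks; the class `SpinLockSketch` and its methods are invented for the example:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SpinLockSketch {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    public void lock() {
        // Busy-wait ("spin") until the CAS from false -> true succeeds;
        // the thread never blocks, it just burns CPU while waiting
        while (!locked.compareAndSet(false, true)) {
            Thread.onSpinWait(); // hint that we are spinning (JDK 9+)
        }
    }

    public void unlock() {
        locked.set(false);
    }

    // Tiny demo: two threads increment a counter under the spin lock
    static int counter = 0;

    public static void main(String[] args) throws InterruptedException {
        SpinLockSketch lock = new SpinLockSketch();
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                lock.lock();
                try {
                    counter++;
                } finally {
                    lock.unlock();
                }
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(counter); // 200000: no updates lost under the lock
    }
}
```

Note how a failed acquisition keeps the thread on the CPU, which is exactly the trade-off discussed next.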
Heavyweight lock
When contention for the lock is very heavy, it inflates into a heavyweight lock. The thread switches from user mode to kernel mode, and threads that fail to acquire the lock are placed in a blocking queue, as shown below
Why do we need heavyweight locks when we have spin locks?
Spinning consumes CPU. When the lock is held for a long time and contention is fierce, spinning wastes a great deal of CPU; a heavyweight lock instead puts waiting threads in a wait queue
Is a biased lock necessarily more efficient than a spin lock?
Not necessarily. When multi-threaded contention is clearly expected, a biased lock involves revocation, and frequent revocation makes locking and unlocking less efficient. During JVM startup there is known multi-threaded contention on locks, so biased locking is not enabled then; it is turned on only after a period of time
What does the lock look like?
Prepare the environment
```xml
<dependency>
    <groupId>org.openjdk.jol</groupId>
    <artifactId>jol-core</artifactId>
    <version>0.9</version>
</dependency>
```
No lock
```java
import org.openjdk.jol.info.ClassLayout;

public class ObjectHeaderTest {
    public static void main(String[] args) throws InterruptedException {
        Object o = new Object();
        System.out.println(ClassLayout.parseInstance(o).toPrintable());
    }
}
```
Note: JOL prints the object header bytes in reverse order. We can clearly see that the lock flag bits are 001, the no-lock state.
Biased locking
```java
import org.openjdk.jol.info.ClassLayout;

public class ObjectHeaderTest {
    public static void main(String[] args) throws InterruptedException {
        // Thread sleeps for 5 seconds
        Thread.sleep(5000);
        Object o = new Object();
        System.out.println(ClassLayout.parseInstance(o).toPrintable());
    }
}
```
A very interesting phenomenon: after the thread sleeps for 5 seconds, the object goes from the no-lock state to the biased state. Why?
As noted above, if heavy contention on a lock is expected from the start, the biased locking mechanism becomes inefficient because of bias revocation. When the JVM has just started, a large number of system classes are being initialized and synchronized locks are used on them, so contention is high and biased locking is not appropriate. The JVM therefore enables biased locking lazily, with a delay of about 4 seconds. The object above is biasable, but not yet biased toward any particular thread; it is in a special no-lock state
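If you want to experiment with this behavior, HotSpot exposes flags for it (available in JDK versions that still ship biased locking; the feature was deprecated in JDK 15):

```shell
# Remove the ~4 second lazy-activation delay so new objects are biasable immediately
java -XX:BiasedLockingStartupDelay=0 ObjectHeaderTest

# Disable biased locking entirely (useful when contention is known up front)
java -XX:-UseBiasedLocking ObjectHeaderTest
```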
```java
import org.openjdk.jol.info.ClassLayout;

public class ObjectHeaderTest {
    public static void main(String[] args) throws InterruptedException {
        Thread.sleep(5000);
        Object o = new Object();
        System.out.println(ClassLayout.parseInstance(o).toPrintable());
        synchronized (o) {
            System.out.println(ClassLayout.parseInstance(o).toPrintable());
        }
    }
}
```
Here you can see the lock biased toward a specific thread
Lightweight lock
```java
import org.openjdk.jol.info.ClassLayout;

public class ObjectHeaderTest {
    public static void main(String[] args) throws InterruptedException {
        Thread.sleep(5000);
        Object o = new Object();
        // The lock is biased toward the main thread
        synchronized (o) {
            System.out.println(ClassLayout.parseInstance(o).toPrintable());
        }
        // Another thread contends for the lock
        for (int i = 0; i < 1; i++) {
            Thread t = new Thread(() -> print(o));
            t.start();
        }
    }

    public static void print(Object o) {
        synchronized (o) {
            System.out.println(ClassLayout.parseInstance(o).toPrintable());
        }
    }
}
```
As you can see, the lock starts out as a biased lock and becomes a lightweight lock after another thread contends for it
Heavyweight lock
```java
import org.openjdk.jol.info.ClassLayout;

public class ObjectHeaderTest {
    public static void main(String[] args) throws InterruptedException {
        Thread.sleep(5000);
        Object o = new Object();
        synchronized (o) {
            System.out.println(ClassLayout.parseInstance(o).toPrintable());
        }
        // Use 10 threads to compete for the lock
        for (int i = 0; i < 10; i++) {
            Thread t = new Thread(() -> print(o));
            t.start();
        }
    }

    public static void print(Object o) {
        synchronized (o) {
            System.out.println(ClassLayout.parseInstance(o).toPrintable());
        }
    }
}
```
When contention for the lock is fierce, it is upgraded to a heavyweight lock
The lock upgrade process can be summarized in the following figure
Go a little deeper
Synchronized is implemented on top of entering and exiting a monitor object. Every object instance has a monitor, created and destroyed together with the Java object; the monitor itself is implemented in C++. Inspect the bytecode of the programs above and you will see the monitorenter and monitorexit instructions
volatile
Atomicity can be solved with synchronized, but what about visibility and ordering? This is where volatile comes in
Volatile can be seen as a lightweight synchronized (which implies that synchronized can also solve visibility and ordering, just not in the same way). It plays two main roles in multithreaded development
- Implement visibility of shared variables
- Prevents reordering of instructions
Usage
Marking the shared flag variable volatile makes the program print that flag is true and exit
```java
public class VolatileTest {
    public static void main(String[] args) {
        VolatileDemo volatileDemo = new VolatileDemo();
        Thread thread = new Thread(volatileDemo, "Thread A");
        thread.start();
        while (true) {
            if (volatileDemo.isFlag()) {
                System.out.println("Flag is true");
                break;
            }
        }
    }
}

class VolatileDemo implements Runnable {
    // flag uses the volatile modifier
    private volatile boolean flag = false;

    @Override
    public void run() {
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        flag = true;
    }

    public boolean isFlag() {
        return flag;
    }
}
```
Some related concepts
happen-before
If one action happens-before another, the result of the first action is visible to, and ordered before, the second
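Two happens-before rules that are easy to demonstrate are the start rule (writes made before t.start() are visible inside t) and the join rule (writes made inside t are visible after t.join() returns). A small sketch, with a made-up class name:

```java
public class HappensBeforeSketch {
    static int before = 0;  // written before start()
    static int inside = 0;  // written inside the thread

    public static void main(String[] args) throws InterruptedException {
        before = 42;                    // happens-before t.start()
        Thread t = new Thread(() -> {
            // Guaranteed to see before == 42: the start rule
            inside = before + 1;
        });
        t.start();
        t.join();                       // t's writes happen-before join() returns
        System.out.println(inside);     // guaranteed 43: the join rule
    }
}
```

No volatile or synchronized is needed here because start() and join() themselves establish the happens-before edges.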
as-if-serial
No matter how instructions are reordered, the result of single-threaded execution must not change
How to implement
Those are the concepts and specifications behind volatile. How are its semantics actually implemented?
Cache coherence protocol (solves the visibility problem)
When a computer has multiple CPUs and copies of a shared variable exist in other CPUs' caches, writing the variable sends a signal that tells those CPUs to set the corresponding cache line to the invalid state. Once the cache line is invalid, those CPUs re-read the variable from main memory, which guarantees the visibility of the shared variable
Memory barriers (to solve instruction reordering problems)
To implement the memory semantics of volatile, the compiler inserts memory barriers into the instruction sequence when generating bytecode, preventing certain types of processor reordering, as shown in the figure below. In simple terms, once a volatile variable is written, other threads can immediately read the new value
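The effect of the write barrier can be seen in the classic safe-publication pattern: every write made before a volatile write becomes visible to a thread that subsequently reads the volatile variable's new value. A sketch (the class name is invented for the example):

```java
public class VolatilePublishSketch {
    static int data = 0;                   // plain, non-volatile field
    static volatile boolean ready = false; // volatile flag

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!ready) { /* spin until the volatile write is seen */ }
            // The volatile read of ready synchronizes with the volatile write,
            // so data == 42 is guaranteed here, not merely likely
            System.out.println("data = " + data);
        });
        reader.start();
        data = 42;     // plain write, ordered before the volatile write...
        ready = true;  // ...which acts as a "release" barrier
        reader.join();
    }
}
```

Without volatile on ready, the reader could spin forever or observe a stale data value.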