Preface

Writing concurrent programs is difficult because they are prone to bugs; some are strange, and many are hard to trace and hard to reproduce.

To find and fix these problems quickly and accurately, you first need to understand the nature of concurrent programming and the problems it has to solve.

This article gives an in-depth look at the three major problems that concurrent programming must solve: atomicity, visibility, and ordering.

Background knowledge

Throughout the history of hardware there has been one persistent tension: the speed gap between the CPU, memory, and I/O devices.

Ordered by speed: CPU >> memory >> I/O devices

To balance the speed differences among the three, the following optimizations were made:

  • The CPU gained caches, to balance the speed difference between the CPU and memory.
  • The operating system gained processes and threads, which time-share the CPU, to balance the speed difference between the CPU and I/O devices.
  • The compiler reorders instructions so that the cache is used more effectively.

Visibility

What is visibility?

Changes made by one thread to a shared variable can be immediately seen by another thread. This is called visibility.

Why is there a visibility problem?

On today's multicore processors, each core has its own cache, and that cache is visible only to the core it belongs to. As a result, the data in a CPU cache and the data in main memory can easily become inconsistent.

To avoid stalling while it waits for a write to memory to complete, the processor uses a write buffer to temporarily hold data that is on its way to memory. The write buffer merges multiple writes to the same address and flushes them in batches, which means that data in the write buffer does not reach main memory immediately.

Because cached data is not flushed to main memory promptly, visibility problems arise.
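
As a minimal sketch of this effect, consider a worker that spins on a shared flag. The class name StopFlagDemo and its members are hypothetical, and the volatile keyword is an assumption that goes beyond what this article discusses: in Java it requires that a write by one thread be visible to later reads by other threads. Without volatile, the worker would be allowed to keep reading a stale value of the flag and might never stop.

```java
class StopFlagDemo {
    // volatile: a write to this field by one thread is visible to later
    // reads by other threads. Without it, the worker below may spin
    // forever on a stale value.
    private static volatile boolean stop = false;

    // Starts a worker that spins until it sees stop == true, then sets the
    // flag from the calling thread. Returns true if the worker noticed the
    // change and exited within timeoutMillis.
    static boolean workerSeesStop(long timeoutMillis) {
        stop = false;
        Thread worker = new Thread(() -> {
            while (!stop) {
                // busy-wait until the write from the other thread becomes visible
            }
        });
        worker.setDaemon(true); // lets the JVM exit even if the worker never stops
        worker.start();
        try {
            Thread.sleep(50);           // give the worker time to start spinning
            stop = true;                // write performed by the calling thread
            worker.join(timeoutMillis); // wait for the worker to notice
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + workerSeesStop(2000));
    }
}
```

Removing volatile from the field does not guarantee a hang; it only removes the guarantee that the worker will ever see the update, which is why such bugs are hard to reproduce.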

Examples of visibility problems

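public class Test {
    public int a = 0;

    public void increase() {
        a++;
    }

    public static void main(String[] args) throws InterruptedException {
        final Test test = new Test();
        Thread[] threads = new Thread[10];
        for (int i = 0; i < threads.length; i++) {
            // Each thread increments the shared counter 1000 times.
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    test.increase();
                }
            });
            threads[i].start();
        }
        // Wait for every worker to finish before reading the result.
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(test.a);
    }
}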

Goal: 10 threads each increment a 1000 times, so a should end up at 10000.

Result: on almost every run, the result is less than 10000.

Cause analysis:

Suppose thread 1 and thread 2 start executing at the same time. Each of them first reads a=0 into its own CPU cache. Each then executes a++, so each cache now holds a=1.

Thread 1 and thread 2 each have the value 1 in their CPU cache. Both then write a=1 from their cache back to memory, leaving a=1 in memory instead of the expected 2. This is why the final value of a is less than 10000. This is the cache visibility problem.
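
The interleaving described above can be replayed deterministically in a single thread. The sketch below is a simulation, not real cache behaviour: memory stands in for main memory, and cache1 and cache2 are hypothetical stand-ins for each thread's cached copy of a.

```java
class LostUpdateSimulation {
    // Replays the interleaving step by step and returns the final value in "memory".
    static int replay() {
        int memory = 0;       // the shared variable a in main memory

        int cache1 = memory;  // thread 1 reads a=0 into its cache
        int cache2 = memory;  // thread 2 reads a=0 into its cache

        cache1 = cache1 + 1;  // thread 1 executes a++ on its own copy: 1
        cache2 = cache2 + 1;  // thread 2 executes a++ on its own copy: 1

        memory = cache1;      // thread 1 writes a=1 back to memory
        memory = cache2;      // thread 2 also writes a=1 back to memory

        return memory;        // 1, not the expected 2: one increment was lost
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints 1
    }
}
```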

Atomicity

What is atomicity?

The property of one or more operations being executed by the CPU without interruption is called atomicity.

In concurrent programming, atomicity does not mean the same thing as atomicity in transactions (where the work can be rolled back if the code fails). Here it means that once one thread has started executing a piece of code, or an operation on a variable, no other thread can execute it until the first thread has finished.

Why is there an atomicity problem?

A thread is the basic unit of CPU scheduling. The CPU schedules threads according to a scheduling algorithm and assigns each one a time slice. When a running thread uses up its time slice, it loses the CPU, and another thread can obtain a time slice and run. If the first thread was in the middle of a multi-step operation, the other thread can execute the same code before that operation has finished. This is the atomicity problem.

Thread switching creates atomicity problems.

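In Java, reads and writes of variables of primitive types are atomic: they cannot be interrupted and are either performed completely or not at all. The one exception is non-volatile long and double, which the language specification allows to be split into two 32-bit operations.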

i = 0;      // atomic: a single write
j = i;      // not atomic: two operations (read i, assign to j)
i++;        // not atomic: three operations (read i, add 1, assign to i)
i = j + 1;  // not atomic: three operations (read j, add 1, assign to i)

Examples of atomicity problems

Once again, 10 threads increment a up to 10000. This time, assume that visibility is guaranteed; the result is still not as expected, because of the atomicity problem.

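public class Test {
    public int a = 0;

    public void increase() {
        a++; // read a, add 1, write a: three separate steps
    }

    public static void main(String[] args) throws InterruptedException {
        final Test test = new Test();
        Thread[] threads = new Thread[10];
        for (int i = 0; i < threads.length; i++) {
            // Each thread increments the shared counter 1000 times.
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    test.increase();
                }
            });
            threads[i].start();
        }
        // Wait for every worker to finish before reading the result.
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(test.a);
    }
}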

Goal: 10 threads each increment a 1000 times, so a should end up at 10000.

Result: on almost every run, the result is less than 10000.

Cause analysis:

Let's start with a++, which actually consists of three operations:

(1) Read a=0.
(2) Calculate 0+1=1.
(3) Assign 1 to a.

Guaranteeing the atomicity of a++ means guaranteeing that, once one thread has started these three operations, no other thread can perform them until that thread has finished.

The actual execution sequence diagram is as follows:

Key step: when thread 2 reads the value of a, thread 1 has not yet completed the assignment a=1, so thread 2 also computes a=1.

The problem is that the atomicity of the a++ operation is not guaranteed. If it were, thread 2 could not execute a++ until thread 1 had completed all three operations, so thread 2 would be guaranteed to read a=1 when it executes a++ and the result would be correct.
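
One way to make the three steps of a++ indivisible is to guard them with a lock. The sketch below uses Java's synchronized keyword for that purpose. The class name SafeCounter and the helper run are hypothetical, and this is only one possible fix (java.util.concurrent.atomic.AtomicInteger is another).

```java
class SafeCounter {
    private int a = 0;

    // synchronized: only one thread at a time can run this method on a given
    // instance, so the read, add and write happen as one indivisible unit.
    synchronized void increase() {
        a++;
    }

    synchronized int get() {
        return a;
    }

    // Starts `threads` workers that each call increase() `perThread` times,
    // waits for all of them, and returns the final count.
    static int run(int threads, int perThread) {
        SafeCounter counter = new SafeCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increase();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(10, 1000)); // prints 10000
    }
}
```

Because synchronized in Java also guarantees that changes made inside the lock are visible to the next thread that acquires it, this version addresses the visibility concern from the previous section as well.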

Ordering

Ordering: a program executes in the order in which its code is written.

The compiler sometimes changes the order of statements in a program to optimize performance. For example, given "a = 6; b = 7;" the compiler may emit "b = 7; a = 6;" instead. Here the compiler has changed the order of the statements without affecting the final result of the program. However, compiler and interpreter optimizations can sometimes lead to unexpected bugs.

Examples of ordering problems

A classic example in Java is creating a singleton with double-checked locking.

public class Singleton {
    static Singleton instance;

    static Singleton getInstance() {
        if (instance == null) {                 // first check, without the lock
            synchronized (Singleton.class) {
                if (instance == null) {         // second check, holding the lock
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }
}

In getInstance(), we first check whether instance is null. If it is, we lock Singleton.class and check again; if instance is still null, we create the Singleton instance.

This looks perfect: the lock is taken only when instance appears to be null, and the second check ensures that the singleton is created only once. But there is a problem!

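The statement instance = new Singleton(); looks like a single operation, but creating the object takes three steps:

(1) Allocate memory space.
(2) Initialize the Singleton object.
(3) Assign the address of the memory space to instance.

After compiler optimization, the steps may actually run in this order:

(1) Allocate memory space.
(2) Assign the address of the memory space to instance.
(3) Initialize the Singleton object.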

What can go wrong? Suppose thread A calls getInstance() first, and a thread switch to thread B happens just after thread A has executed step (2) of the reordered sequence. If thread B now calls getInstance(), its first check finds instance != null, so it returns instance immediately, even though the object has not been initialized yet. Accessing a member variable of that instance may then throw a NullPointerException.
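
A common remedy is to declare instance as volatile, a keyword this article does not otherwise discuss. Under the Java memory model (Java 5 and later), the write to a volatile field cannot be reordered with the initialization that precedes it, so a thread that sees a non-null instance also sees a fully constructed object. The sketch below is a minimal illustration; the class name SafeSingleton and the field value are hypothetical.

```java
class SafeSingleton {
    // volatile: the reference cannot be published before the constructor has finished.
    private static volatile SafeSingleton instance;

    private int value;

    private SafeSingleton() {
        value = 42; // illustrative state set during initialization
    }

    static SafeSingleton getInstance() {
        SafeSingleton local = instance;          // single volatile read on the fast path
        if (local == null) {                     // first check, without the lock
            synchronized (SafeSingleton.class) {
                local = instance;
                if (local == null) {             // second check, holding the lock
                    local = new SafeSingleton();
                    instance = local;            // volatile write publishes the finished object
                }
            }
        }
        return local;
    }

    int getValue() {
        return value;
    }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance()); // prints true
        System.out.println(getInstance().getValue());       // prints 42
    }
}
```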

Execution sequence diagram:

Conclusion

The essence of concurrent programming is solving three problems: atomicity, visibility, and ordering.

Atomicity: one or more operations are executed by the CPU without interruption. Atomicity problems arise when a thread switch lets multiple threads interleave on the same code.

Visibility: changes made by one thread to a shared variable can be seen immediately by another thread. Visibility problems arise because cached data is not flushed to main memory promptly.

Ordering: a program executes in the order in which its code is written. Ordering problems arise because the compiler changes the order of statements to optimize performance.

Takeaway: thread switching, caching, and compiler optimization all exist to improve performance, yet they are also the source of the problems in concurrent programming. The lesson is that when a technology solves one problem, it tends to introduce another. We need to think ahead about the problems a new technology brings in order to avoid the risks.


Reprinted from the Java Advanced Architect public account