Problems with multithreading

Why multithreading is needed

To put it plainly, times have changed. Today’s machines are multi-core, and to squeeze the last bit of performance out of a machine we introduce multiple threads.

Multithreading lets us make full use of CPU resources and improve CPU utilization by doing several things at the same time without them interfering with each other. It is also how we handle large numbers of IO operations or tasks that take a long time, such as reading and writing files, or capturing, processing, displaying, and saving video frames.

Performance issues

Context switch

A Java thread maps one-to-one onto an operating-system thread, and a single processor core can run only one thread at a time. The CPU schedules tasks using a time-slice algorithm: since different threads are active at different times, the CPU switches between them, saving the state of the current task on each switch so that the task’s state can be reloaded the next time it is scheduled. This save-and-load cycle is a context switch. The more threads there are, the more context switches occur, and context switches consume CPU time, which is why a system slows down when a large number of threads are running.

Looked at this way, a context switch means one thread stops running so another can run. Suppose a job consists of 10 identical steps, each step takes the same amount of time on every thread, and only one thread can run at a time. The coordination between multiple threads, that is, the scheduling, then eats up a lot of time, so under otherwise equal conditions a single thread is certainly faster than multiple threads. Modern servers, however, are multi-core, so multithreading can speed up processing, but only on the premise that the number of threads matches the number of CPU cores.

We can reduce context switching in several ways:

  • Reduce lock waiting: when threads wait on locks, they switch frequently between the runnable and waiting states, adding context switches. Lock waiting is caused by intense contention for the same resource; in some scenarios we can reduce lock contention with techniques such as data sharding or data snapshots.
  • CAS algorithm: use Compare-and-Swap to avoid locking. The CAS algorithm will be described in subsequent chapters.
  • Use the right number of threads, or coroutines: use the right number of threads rather than simply more of them. In CPU-intensive systems, for example, we tend to start at most twice as many threads as there are processor cores (see the sketch after this list). Coroutines schedule multiple tasks within a single thread, so they naturally avoid context switches.
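As a rough illustration of "the right number of threads", here is a minimal sketch. The sizing factors below are common rules of thumb, not values prescribed by this article; the right numbers depend on your workload:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadPoolSizing {
    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();

        // CPU-bound work: roughly the core count (up to 2x) keeps every
        // core busy without piling up runnable threads that only add
        // context switches.
        ExecutorService cpuPool = Executors.newFixedThreadPool(cores);

        // IO-bound work: threads spend most of their time blocked, so a
        // larger pool can be justified; the exact factor depends on how
        // long the IO waits are (assumed factor here, for illustration).
        ExecutorService ioPool = Executors.newFixedThreadPool(cores * 2);

        cpuPool.shutdown();
        ioPool.shutdown();
    }
}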

Cache invalidation

Context switching is not the only source of performance problems; cache invalidation causes them too. A program is very likely to revisit data it has just accessed, so caching is used to speed up the whole program: if we use the same data again, we can fetch it quickly. But once the scheduler switches to another thread, the CPU executes different code, the previously cached data is likely to become useless, and new data has to be cached, which has its own cost. To avoid overly frequent context switches, the scheduler therefore gives each scheduled thread a minimum execution time: the next switch can happen only after that time has elapsed, which reduces the number of context switches.

The cache here is the CPU cache. Roughly speaking, there are multiple levels of cache plus main memory, where main memory is simply our RAM:

  • The L1 cache is small but fast and sits right next to the CPU core that uses it.
  • L2 is larger and slower, and still serves only a single CPU core.
  • L3 is common in modern multi-core machines; it is larger and slower still, and shared by all CPU cores on a single socket.
  • Main memory, which holds all the data a program runs on, is larger and slower again, and shared by all CPU cores on all sockets.
  • When the CPU performs an operation, it looks for the data in L1 first, then L2, then L3, and finally goes to main memory if none of the caches holds it.

If a cache line is 64 bytes long, the CPU reads the entire 64-byte line, not just the single value that was requested. In other words, the cache line is the basic unit of transfer.

public class CacheLineEffect {
    // Assume a typical cache line size of 64 bytes; a long is 8 bytes,
    // so one row of 8 longs fills exactly one cache line
    static long[][] arr;

    public static void main(String[] args) {
        // Create and fill the array
        arr = new long[1024 * 1024][8];
        for (int i = 0; i < 1024 * 1024; i++) {
            for (int j = 0; j < 8; j++) {
                arr[i][j] = 1L;
            }
        }

        // First pass: read the array row by row
        long sum = 0L;
        long marked = System.currentTimeMillis();
        for (int i = 0; i < 1024 * 1024; i++) {
            for (int j = 0; j < 8; j++) {
                sum += arr[i][j];
            }
        }
        System.out.println("Loop times:" + (System.currentTimeMillis() - marked) + "ms sum result: " + sum);

        // Second pass: read the array column by column
        sum = 0L;
        marked = System.currentTimeMillis();
        for (int i = 0; i < 8; i++) {
            for (int j = 0; j < 1024 * 1024; j++) {
                sum += arr[j][i];
            }
        }
        System.out.println("Loop times:" + (System.currentTimeMillis() - marked) + "ms sum result: " + sum);
    }
}

What’s special about this code is that the two passes do not iterate the array the same way: the first pass reads row by row, the second column by column. The first pass benefits from the CPU cache because each row is exactly 64 bytes, one cache line: one read from main memory serves the next 7 accesses, so there are roughly 1024 * 1024 memory fetches in total. The second pass jumps across rows, so the cache does not help and there are roughly 1024 * 1024 * 8 fetches.

Loop times:12ms sum result: 8388608
Loop times:40ms sum result: 8388608

The difference is large, and it shows how important the CPU cache is. In the same way, switching between multiple threads can cause the CPU cache to become useless.

Collaboration overhead

Collaboration between threads can also cause performance problems. If threads share data, then to prevent data corruption and guarantee thread safety we may have to forbid compiler and CPU optimizations such as instruction reordering, or repeatedly flush data from a thread’s working memory to main memory and then refresh it from main memory into the working memory of other threads, and so on (see the sketch below). None of this is needed in a single-threaded program, but with multiple threads we must do it to keep the data correct, because thread safety has a higher priority than performance, and that indirectly degrades performance.
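A minimal sketch of this flushing cost, using a volatile stop flag of my own invention rather than code from this article: marking the field volatile forbids caching it in a thread’s working memory, forcing reads and writes to go through main memory, which is exactly the kind of synchronization traffic described above.

public class StopFlag {
    // volatile forces the writer's update to be flushed to main memory
    // and the reader to re-read it from main memory, keeping the flag
    // visible across threads at the cost of some optimization.
    private static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (running) {
                // busy work
            }
            System.out.println("worker stopped");
        });
        worker.start();

        Thread.sleep(100);
        running = false; // without volatile, the worker might never see this
        worker.join();
    }
}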

There is also the overhead of the extra thread-safety logic you have to add to your own code.

When should thread safety be considered

Access shared variables or resources

The first scenario is accessing shared variables or shared resources. Typical cases include accessing the properties of a shared object, accessing static variables, accessing a shared cache, and so on. Because this data can be accessed not just by one thread but by multiple threads at the same time, concurrent reads and writes can cause thread-safety problems, as the sketch below illustrates.
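A classic illustration (my sketch, not code from this article): two threads incrementing a shared counter. counter++ is a read-modify-write sequence, not a single atomic step, so increments interleave and updates are lost.

public class LostUpdate {
    private static int counter = 0; // shared variable, no synchronization

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                counter++; // read-modify-write: not atomic
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // Almost always prints less than 200000 because updates are lost.
        System.out.println("counter = " + counter);
    }
}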

Timing dependent operations

The second scenario to watch out for is timing-dependent operations. If the correctness of an operation depends on timing, and with multiple threads the execution order is not guaranteed to match our expectations, thread-safety problems arise, as in the following code:

if (map.containsKey(key)) {
    map.remove(key);
}

The code first checks whether the map contains an element for the key, and removes it if so. This combined operation is dangerous because it is check-then-act, and execution can be interrupted between the two steps. If two threads reach the if statement at the same time, both can see that the key is present, so both will want to perform the remove. One thread then removes the entry first; the other has just seen the key present, so its condition is also true and it proceeds to remove the entry as well, but the entry has already been removed by the first thread. Situations like this cause thread-safety problems.

There are many similar cases. For example, we first check whether x equals 1 and, if it does, modify the value of x:

if (x == 1) {
    x = 7 * x;
}

The same applies to similar scenarios: the check-then-act pair is not atomic, it can be interrupted in the middle, and the result of the check may already be stale and invalid by the time we act. In other words, getting the correct result depends on lucky timing. In such cases we need locking or other protective measures to make the operation atomic, as sketched below.
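One way to fix both examples, assuming a ConcurrentHashMap and an AtomicInteger fit the use case (a sketch, not the only option; a synchronized block works too), is to collapse the two-step check-then-act into a single atomic operation:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCheckThenAct {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Object> map = new ConcurrentHashMap<>();
        map.put("key", new Object());

        // remove(key) is itself atomic and returns null if the key is
        // absent; checking its return value replaces the racy
        // containsKey-then-remove pair, so only one thread can succeed.
        if (map.remove("key") != null) {
            System.out.println("removed exactly once");
        }

        // compareAndSet performs "check x == 1, then write 7" as one
        // atomic step, so the check can never go stale before the write.
        AtomicInteger x = new AtomicInteger(1);
        if (x.compareAndSet(1, 7)) {
            System.out.println("x = " + x.get());
        }
    }
}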

The other party does not declare itself to be thread-safe

It is also worth noting that when we use another class, if that class has not declared itself thread-safe, then concurrent operations on it from multiple threads may cause thread-safety problems. For example, ArrayList is not thread-safe: if multiple threads read and write an ArrayList concurrently, data errors can result, but the responsibility does not lie with ArrayList, because it never claimed to be safe for concurrency, as its source notes:

Note that this implementation is not synchronized. If multiple threads
access an ArrayList instance concurrently, and at least one of the threads
modifies the list structurally, it must be synchronized externally.

If you use an ArrayList in a multithreaded scenario, you need to synchronize it externally yourself to ensure concurrency safety, as shown below.
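For instance, the JDK itself provides a synchronized wrapper, a minimal sketch of which looks like this:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SafeListUsage {
    public static void main(String[] args) {
        // Wrap the non-thread-safe ArrayList; each method call is then
        // synchronized on the wrapper.
        List<String> list = Collections.synchronizedList(new ArrayList<>());
        list.add("a");

        // Iteration is still a compound operation, so it must be guarded
        // manually, as the javadoc of synchronizedList requires.
        synchronized (list) {
            for (String s : list) {
                System.out.println(s);
            }
        }
    }
}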

So ArrayList is not suitable for concurrent reads and writes by default; using it that way is our mistake, and it causes thread-safety problems. When we use another class in a complex scenario, we should therefore first make sure whether it supports concurrent operation. These, then, are the four scenarios that need extra attention to thread safety: accessing shared variables or resources, timing-dependent operations, binding relationships between different pieces of data, and using classes that do not declare themselves thread-safe.

Conclusion

Whenever you think about multithreading you have to think about thread safety. So how do you find the places where thread-safety issues can arise? Wherever there are shared variables, there can be thread-safety issues.

In short, introducing multithreading brings two classes of problems:

  1. Thread safety

  2. Performance issues