1. Learning Objectives

1. Three sources of concurrency problems:

  • Visibility problem: In multithreading, variables are often not shared between threads because the CPU obtains data from the nearest and fastest CPU cache before retrieving data from memory.
  • Atomicity problem: Even if two threads are running on the same CPU core, avoiding visibility problems, another atomicity problem can still throw your concurrent code out of control.
  • Orderliness problem: Multi-threaded concurrent code execution produces unpredictable results. Refer to the atomicity problem in the previous section.

2. What is thread-safe about ConcurrentHashMap?

  • ConcurrentHashMap is thread-safe. ConcurrentHashMap can only provide atomic read and write operations that are thread-safe. That is, put() and get() operations are thread-safe. The two operations for multithreaded operations at the same time, between threads are visible, because ConcurrentHashMap. Node. Val and because ConcurrentHashMap baseCount is volatile.

3. How to use ConcurrentHashMap correctly

  • ConcurrentHashMap#putIfAbsent(), which implements atomic get() and put() operations because ConcurrentHashMap#putIfAbsent() contains synchronized locks

User registration simulates concurrency problems

In this example, we simulate the user registration behavior and define the rule that the same user name cannot be registered twice. We use ConcurrentHashMap to store the user information and simulate the concurrent registration action to reflect the concurrency problem.

The test class:

public class ConcurrentHashMapTest {

	@Test
	public void test(a) throws InterruptedException {
		UserService userService = new UserService();

		int threadCount = 8;

		ForkJoinPool forkJoinPool = new ForkJoinPool(threadCount);
		forkJoinPool.execute(() -> IntStream.range(0, threadCount)
				.mapToObj(i -> new User("Zhang", i))
				.parallel().forEach(userService::register));

		// Wait 1s, otherwise the program ends without seeing the log output
		TimeUnit.SECONDS.sleep(1); }}@Data
@AllArgsConstructor
@NoArgsConstructor
class User {
	/** * User name, which is also the Map key */
	private String username;

	private int age;
}

class UserService {

	private Map<String, User> userMap = new ConcurrentHashMap<>();

	/** * User registration **@param user
	 * @return* /
	boolean register(User user) {
		if (userMap.containsKey(user.getUsername())) {
			System.out.println("User already exists");

			return false;
		} else {
			userMap.put(user.getUsername(), user);
			System.out.println("User" + user.getUsername() + "_" + user.getAge() + "Registration successful");

			return true; }}}Copy the code

Running results:

Existing User Existing User Zhang_3 _4 Successfully registered user Zhang_7 Successfully registered user Zhang_5 Successfully registered user Exists The user existsCopy the code

It can be seen that there is a logic to judge whether a user has registered in the registration process, but in the actual test, three users named Zhang SAN registered successfully at the same time, which obviously does not conform to the rules that users with the same name cannot register.

Three sources of concurrency problems

Concurrency problems come from three sources: atomicity, visibility, and orderliness

ConcurrentHashMap only ensures that the atomic read and write operations provided are thread-safe.

Why do we have concurrency problems with thread-safe ConcurrentHashMap?

1. Visibility issues

The user registration code uses the containsKey() method to determine whether the user exists. Intuitively, we assume that we are operating on the same Map. If another thread writes the key to the userMap, the current thread will see it when accessing the userMap.

In the study of computer principles when I talked about THE CPU cache, memory, hard disk three speed day difference, so the CPU in the calculation of the nearest from their own, the fastest CPU cache to obtain data to calculate, and then from the memory to obtain data.

In addition, after years of CPU development, it is more and more difficult to improve the performance of single-core, in order to improve the performance of single, today’s computers are using multiple CPU cores.

The following figure shows the relationship between the CPU and its cache and memory. Each CPU core has a Cache of its own Cache.

Our thread may be running on a different CPU core, and Thread1 writes user registrations to memory, but Thread2 still fetches data from its OWN CPU cache, so Thread2 sees no duplicates in the registrations, which is a visibility problem.

2. Atomicity

Even if two threads are running on the same CPU core, avoiding visibility issues, another atomicity issue can still throw your concurrent code out of control.

The following figure shows the process of registering users on a timeline. Boolean Register (User User) does not perform an operation on a CPU time scale, but contains:

  • Access userMap to determine whether the current user is registered
  • Registered users

If Thread1 accesses userMap and returns that the current user is unregistered, but does not put the user information into userMap, Thread2 will also get the result that the current user is unregistered, and therefore will perform the following registration operations.

The CPU is executing a task

In fact, we want to determine whether the User is registered or not, and both steps are performed simultaneously. As shown in the figure below, Thread1 executes the register(User User) method together, similar to the atomicity of database transactions.

3. Order problem

The compiler sometimes changes the order of code execution to improve performance. Reordering of single-threaded code instructions has no effect on execution, but can produce unpredictable results for multi-threaded concurrent code execution. Refer to the atomicity problem in the previous section.

How to use ConcurrentHashMap

ConcurrentHashMap only ensures that the atomic read and write operations provided are thread-safe. That is, put() and get() operations are thread-safe. The two operations for multithreaded operations at the same time, between threads are visible, because ConcurrentHashMap. Node. Val and because ConcurrentHashMap baseCount is volatile.

Once you understand the root of the concurrency problem, you can use the concurrency utility class to its true power. Let’s change the code:

boolean registerAtomic(User user) {
    User hasMapped = userMap.putIfAbsent(user.getUsername(), user);
    if(hasMapped ! =null) {
        System.out.println("User already exists");
        return false;
    } else {
        System.out.println("User" + user.getUsername() + "_" + user.getAge() + "Registration successful");
        return true; }}Copy the code

Here we use ConcurrentHashMap#putIfAbsent(), which means that the stored object is returned if the key already exists and null otherwise.

ConcurrentHashMap#putIfAbsent() how does ConcurrentHashMap#putIfAbsent() implement get(), put() atomic operations?

The lock is actually added, as shown in the ConcurrentHashMap#putVal() method. If putIfAbsent is not used in this scenario, the register(User User) method is locked, which has a greater impact on performance.

ConcurrentHashMap#putIfAbsent():

public V putIfAbsent(K key, V value) {
    return putVal(key, value, true);
}
Copy the code

ConcurrentHashMap#putVal():

final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();
    int hash = spread(key.hashCode());
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0)
            tab = initTable();
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            if (casTabAt(tab, i, null.new Node<K,V>(hash, key, value, null)))
                break;                   // no lock when adding to empty bin
        }
        else if ((fh = f.hash) == MOVED)
            tab = helpTransfer(tab, f);
        else {
            V oldVal = null;
            synchronized (f) {
                if (tabAt(tab, i) == f) {
                    if (fh >= 0) {
                        binCount = 1;
                        for (Node<K,V> e = f;; ++binCount) {
                            K ek;
                            if(e.hash == hash && ((ek = e.key) == key || (ek ! =null && key.equals(ek)))) {
                                oldVal = e.val;
                                if(! onlyIfAbsent) e.val = value;break;
                            }
                            Node<K,V> pred = e;
                            if ((e = e.next) == null) {
                                pred.next = new Node<K,V>(hash, key,
                                                          value, null);
                                break; }}}else if (f instanceof TreeBin) {
                        Node<K,V> p;
                        binCount = 2;
                        if((p = ((TreeBin<K,V>)f).putTreeVal(hash, key, value)) ! =null) {
                            oldVal = p.val;
                            if(! onlyIfAbsent) p.val = value; }}}}if(binCount ! =0) {
                if (binCount >= TREEIFY_THRESHOLD)
                    treeifyBin(tab, i);
                if(oldVal ! =null)
                    return oldVal;
                break;
            }
        }
    }
    addCount(1L, binCount);
    return null;
}
Copy the code

Five, the summary

1. Three sources of concurrency problems:

  • Visibility problem: In multithreading, variables are often not shared between threads because the CPU obtains data from the nearest and fastest CPU cache before retrieving data from memory.
  • Atomicity problem: Even if two threads are running on the same CPU core, avoiding visibility problems, another atomicity problem can still throw your concurrent code out of control.
  • Orderliness problem: Multi-threaded concurrent code execution produces unpredictable results. Refer to the atomicity problem in the previous section.

2. What is thread-safe about ConcurrentHashMap?

  • ConcurrentHashMap is thread-safe. ConcurrentHashMap can only provide atomic read and write operations that are thread-safe. That is, put() and get() operations are thread-safe. The two operations for multithreaded operations at the same time, between threads are visible, because ConcurrentHashMap. Node. Val and because ConcurrentHashMap baseCount is volatile.

3. How to use ConcurrentHashMap correctly

  • ConcurrentHashMap#putIfAbsent(), which implements atomic get() and put() operations because ConcurrentHashMap#putIfAbsent() contains synchronized locks

Six, reference

Xie. Infoq. Cn/article/f92…