Discovering the problem

The following error appeared on the online log platform:

At the same time, the system monitoring and alerting platform sent a critical CPU alert. Investigation traced it to the following thread dump:

"DubboServerHandler - 10.30.66.58:13300 - thread - 65" #1081 daemon prio=5 os_prio=0 tid=0x00007f50a01ec800 nid=0x7ca4 runnable [0x00007f502122b000]
   java.lang.Thread.State: RUNNABLE
	at java.util.HashMap$TreeNode.putTreeVal(HashMap.java:2017)
	at java.util.HashMap.putVal(HashMap.java:638)
	at java.util.HashMap.put(HashMap.java:612)
	...

Locating the problem

Finding the faulty code block

Our first reaction was that the HashMap had hit a concurrency problem while converting a linked list into a red-black tree, so we located the code fragment that uses the map:

There are two thread-safety issues with this code:

  1. HashMap itself is not thread-safe, as marked at a.
  2. As marked at b, the method re-initializes the global variable every time it is entered. Imagine two threads calling the method at the same time: one has reached the last step and is about to return the map, while the other has just entered and re-initialized it. The first thread then returns an empty map to its caller.

Both violate atomicity in concurrent programming.

The error reported on the online log platform clearly comes from the first issue.
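The second issue is easier to see when the interleaving is spelled out. The sketch below is not the original service code: the class `SharedFieldRace`, its field, and the three step methods are hypothetical stand-ins, and the two "callers" are simulated on a single thread so that the unlucky ordering is reproduced deterministically.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedFieldRace {
    // Hypothetical stand-in for the shared ("global") field described above.
    private Map<Integer, String> subscriptMap;

    // Step 1: every call starts by re-initializing the shared field.
    void reset() {
        subscriptMap = new HashMap<>();
    }

    // Step 2: the call fills the map (one entry is enough for the demo).
    void fill() {
        subscriptMap.put(1, "hot");
    }

    // Step 3: the call returns whatever the shared field points to now.
    Map<Integer, String> result() {
        return subscriptMap;
    }

    public static void main(String[] args) {
        SharedFieldRace service = new SharedFieldRace();

        // Caller A runs steps 1 and 2.
        service.reset();
        service.fill();

        // Caller B enters the method and runs step 1.
        service.reset();

        // Caller A runs step 3 and gets B's freshly created, empty map.
        System.out.println(service.result().isEmpty()); // prints "true"
    }
}
```

With real threads the same ordering happens only occasionally, which is why this kind of bug tends to surface under production load rather than in a quick local test.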

HashMap is not thread-safe

Locate the corresponding code block in the HashMap source code:

static <K,V> void moveRootToFront(Node<K,V>[] tab, TreeNode<K,V> root) {
    int n;
    if (root != null && tab != null && (n = tab.length) > 0) {
        int index = (n - 1) & root.hash;
        TreeNode<K,V> first = (TreeNode<K,V>)tab[index];
        if (root != first) {
            Node<K,V> rn;
            tab[index] = root;
            TreeNode<K,V> rp = root.prev;
            if ((rn = root.next) != null)
                ((TreeNode<K,V>)rn).prev = rp;
            if (rp != null)
                rp.next = rn;
            if (first != null)
                first.prev = root;
            root.next = first;
            root.prev = null;
        }
        assert checkInvariants(root);
    }
}

This code is reached as follows:

Insert a new Node into the HashMap -> the number of entries exceeds the threshold, so resize() is called to expand the table -> the bin that holds the new Node is a tree, so split() and treeify() are called to rearrange the nodes of the red-black tree

Line 4 of the code above ANDs the root node's hash with a mask (the table length minus 1) to get the index in the bin array. Line 5 then casts whatever node sits at that index to TreeNode. This sequence is clearly not atomic: other threads are initializing and inserting into the shared HashMap at the same time, and if one of them has left a plain Node at that index after re-initialization, the cast fails with a ClassCastException.
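The index calculation can be tried in isolation. The sketch below only reproduces the expression `(n - 1) & root.hash` from the method above; the class name `BinIndex`, the helper `indexFor`, and the sample hash are made up for illustration. It relies on the table length being a power of two, which HashMap guarantees for its own table.

```java
public class BinIndex {
    // Same expression as in moveRootToFront; valid when tableLength is a power of two.
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int hash = 122;        // arbitrary example hash, binary 1111010
        int tableLength = 16;  // mask is 15, binary 0001111

        System.out.println(indexFor(hash, tableLength));       // prints 10
        System.out.println(Math.floorMod(hash, tableLength));  // prints 10
    }
}
```

For a power-of-two length the mask keeps only the low bits of the hash, so the result matches a non-negative modulo while costing a single AND.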

ConcurrentHashMap?

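Switching to ConcurrentHashMap would make the map itself thread-safe, but the method as a whole would still not be atomic.

So let's change the code so that the map is built locally and never stored in a shared field: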

@Service
public class SubscriptServiceImpl implements SubscriptService {
    @Autowired
    private JedisClusterTemplate musicFmJedisClusterTemplate;

    @Override
    public Map<Integer, String> getSubscriptMap() {
        // Read the cached JSON configuration.
        String json = musicFmJedisClusterTemplate.get("subscript:config");
        if (StringUtils.isBlank(json)) {
            // Nothing cached: return a fresh empty map.
            return Maps.newHashMap();
        }

        List<SubscriptCache> subscriptCaches = JSON.parseArray(json, SubscriptCache.class);

        // Build the result in a local map, so no state is shared between calls.
        return subscriptCaches.stream()
                .collect(Collectors.toMap(SubscriptCache::getChannelId, SubscriptCache::getIconText));
    }
}
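The reason this version is safe can be checked without Spring or Redis. The following is a minimal sketch, not the service itself: `LocalMapBuild` and its nested `Entry` class are hypothetical stand-ins for the service and for `SubscriptCache`, keeping only the two getters used above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class LocalMapBuild {
    // Hypothetical stand-in for SubscriptCache.
    static class Entry {
        private final Integer channelId;
        private final String iconText;

        Entry(Integer channelId, String iconText) {
            this.channelId = channelId;
            this.iconText = iconText;
        }

        Integer getChannelId() { return channelId; }
        String getIconText() { return iconText; }
    }

    // Every call builds and returns its own map; nothing is stored in a field.
    static Map<Integer, String> toSubscriptMap(List<Entry> entries) {
        return entries.stream()
                .collect(Collectors.toMap(Entry::getChannelId, Entry::getIconText));
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(new Entry(1, "hot"), new Entry(2, "new"));

        Map<Integer, String> first = toSubscriptMap(entries);
        Map<Integer, String> second = toSubscriptMap(entries);

        System.out.println(first.get(2));     // prints "new"
        System.out.println(first == second);  // prints "false": separate instances
    }
}
```

One thing to keep in mind with this collector: the two-argument `Collectors.toMap` throws an `IllegalStateException` if two elements map to the same key, so it assumes channel IDs are unique in the cached list.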

This version is thread-safe, but it does not solve the high CPU usage.

Analyzing the cause of the high CPU usage

In this application, the Map holds more than two thousand key-value pairs on every call, the interface is called from many places, and the system's current QPS is very high. Every call therefore rebuilds the whole map, which involves repeated resizing and red-black tree operations. Each insertion into a red-black tree can require left and right rotations, which costs CPU.

Let's verify this with JMeter.
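To get a feel for the resizing part, the arithmetic can be sketched directly. This is a simplified model, not a call into HashMap internals: the class `ResizeCount` and the method `doublings` are made up for illustration, 2000 is taken as a round figure for "more than two thousand", and the defaults of 16 and 0.75 are the ones documented for HashMap.

```java
public class ResizeCount {
    // Counts how often a table that starts at initialCapacity (a power of two)
    // must double before it can hold `entries` entries at the given load factor.
    static int doublings(int entries, int initialCapacity, float loadFactor) {
        int capacity = initialCapacity;
        int count = 0;
        while (entries > capacity * loadFactor) {
            capacity *= 2;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // HashMap's documented defaults: capacity 16, load factor 0.75.
        System.out.println(doublings(2000, 16, 0.75f));    // prints 8

        // Pre-sized table: no doubling, at the cost of a larger array up front.
        System.out.println(doublings(2000, 4096, 0.75f));  // prints 0
    }
}
```

Under these assumptions, a map that starts at the default size is reallocated and rehashed eight times on every single call before it holds all entries, and that work is multiplied by the request rate.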

This is the configuration of my machine:

JMeter interface:

First, let’s take a look at normal CPU usage:

When we set QPS to 500:

When set to 800:

When set to 1000, the computer freezes.

In the end, HashMap and ConcurrentHashMap are best suited to read-heavy scenarios; their performance for inserts, deletes, and updates is not ideal under high concurrency. Even tuning the initial capacity and load factor only trades space for time; you cannot have both.

Solving the problem

Looking at it from a different angle: simply store the cache as a Redis hash, so that each call looks up only the value it needs and no map has to be built at all.

    @Override
    public String getSubscript(Integer channelId) {
        return musicFmJedisClusterTemplate.hget("subscript:config", channelId.toString());
    }

Side notes

There are two other common approaches to thread safety:

  1. Wrapping the global variable subscriptMap in a ThreadLocal does solve the concurrency problem, but every thread then holds its own map with more than 2,000 entries, which is too expensive in both CPU and memory.
  2. Changing the scope of this class to prototype also avoids the shared state, but it does not solve the CPU and memory problems.
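The first alternative can be sketched as follows. This is an illustration only: the class `PerThreadMap`, the constant `SUBSCRIPT_MAP`, and the helper method are hypothetical names, and the single entry stands in for the real configuration.

```java
import java.util.HashMap;
import java.util.Map;

public class PerThreadMap {
    // Hypothetical replacement for the shared field: one map per thread.
    static final ThreadLocal<Map<Integer, String>> SUBSCRIPT_MAP =
            ThreadLocal.withInitial(HashMap::new);

    // Returns how many entries a freshly started thread sees in its own copy.
    static int sizeSeenByOtherThread() throws InterruptedException {
        int[] seen = new int[1];
        Thread other = new Thread(() -> seen[0] = SUBSCRIPT_MAP.get().size());
        other.start();
        other.join(); // join() makes the write to seen[0] visible here
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        SUBSCRIPT_MAP.get().put(1, "hot");               // fills only this thread's map
        System.out.println(SUBSCRIPT_MAP.get().size());  // prints 1
        System.out.println(sizeSeenByOtherThread());     // prints 0
    }
}
```

The isolation that makes this thread-safe is also its cost: a pool of N worker threads ends up holding N separate copies of the full configuration, and each copy still has to be filled.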

Conclusion

  1. Atomicity of operations is the key to thread safety.

  2. Avoid highly concurrent inserts, deletes, and updates on HashMap and ConcurrentHashMap.

  3. Global variables easily lead to problems with shared class state.