Preface
As mentioned above, I was optimizing the logic of an RPC call. I had just settled on a clear approach and moved into the stress-test stage when I found that CPU utilization was unusually high. This article summarizes the investigation and what I learned from it.
CPU utilization
**CPU usage:** Put simply, this is the share of CPU time taken up by running programs; it indicates how heavily the machine is loaded at a given point in time. For example: process A occupies the CPU for 10 ms, process B for 30 ms, and then the CPU is idle for 60 ms; then again process A for 10 ms, process B for 30 ms, and 60 ms idle. If this pattern holds over a period of time, the usage for that period is 40%.
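The arithmetic behind the 40% figure can be sketched in a few lines of Java. This is only an illustration: the class and method names are made up for this example, and it assumes the 60 ms slice in each cycle is idle time, which is what makes the 40% in the text work out.

```java
public class CpuUsageExample {
    /** Returns busy time as a percentage of the whole sampling window. */
    public static double usagePercent(long[] busyMillis, long idleMillis) {
        long busy = 0;
        for (long ms : busyMillis) {
            busy += ms;
        }
        long total = busy + idleMillis;
        return total == 0 ? 0.0 : 100.0 * busy / total;
    }

    public static void main(String[] args) {
        // One cycle from the text: A runs 10 ms, B runs 30 ms, the CPU idles 60 ms.
        double usage = usagePercent(new long[] {10, 30}, 60);
        System.out.println(usage + "%"); // prints 40.0%
    }
}
```

Note that `top` reports this value per core, so a multi-threaded process on a multi-core machine can show well over 100%.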
Real situation
- The actual situation (`top` reported 381.8%) is shown in the figure below:
- The graph is… (CPU usage approaching 100%):
Troubleshooting process
top + jstack
- Run the `top` command to get the CPU usage of each process:
    $ top
    PID    USER PR NI VIRT  RES  SHR  S %CPU %MEM TIME+     COMMAND
    883    work 20 0  22.1g 1.1g 10m  S 2.7  0.9  162:10.94 java
    233886 work 20 0  14960 2024 1804 R 0.3  0.0  0:00.01   top
    1      work 20 0  17040 3628 2668 S 0.0  0.0  0:13.89   sh
- Note the PID of the process with the highest CPU usage, then use the `-H` option to observe the CPU usage of each thread in that process:
    $ top -p 883 -H
    PID  USER PR NI VIRT  RES  SHR S %CPU %MEM TIME+   COMMAND
    1552 work 20 0  22.1g 1.1g 10m S 0.3  0.9  1:49.06 java
    1553 work 20 0  22.1g 1.1g 10m S 0.3  0.9  1:49.77 java
- Use `jstack` to print the thread stack information. The command format is `jstack {processId} | grep {threadId}`, where `threadId` is given in hexadecimal:
    # The hexadecimal value of 1552 is 610
    $ jstack 883 | grep 610 -C 10
    ...
    "Tiresias.util.Worker[0]" prio=10 tid=0x00007ff64cd06000 nid=0x610 waiting on condition [0x00007ff4bcdcc000]
    ...
- From here you can close in on the cause step by step, following the calls in the thread stack until the problem shows itself.
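The one manual step in this procedure is converting the decimal thread ID shown by `top -H` into the hexadecimal `nid` that `jstack` prints. A minimal Java sketch of that conversion follows; the class and method names are hypothetical and exist only for illustration.

```java
public class ThreadIdHex {
    /** Converts a decimal thread ID from `top -H` into the nid form used by jstack. */
    public static String toNid(int decimalThreadId) {
        return "0x" + Integer.toHexString(decimalThreadId);
    }

    public static void main(String[] args) {
        System.out.println(toNid(1552)); // prints 0x610
    }
}
```

Searching the `jstack` output for the full `nid=0x610` string rather than the bare digits `610` reduces the chance of matching unrelated addresses.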
show-busy-java-threads
- The method above works, but it is somewhat cumbersome, so a script can take over the manual steps.
- The `show-busy-java-threads` script directly prints the call stacks of the threads with high CPU usage.
    show-busy-java-threads -p <Java process ID>
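For comparison, a rough in-process analogue of the same idea can be written with the standard `java.lang.management.ThreadMXBean` API. This is a sketch, not how the script itself works: it only sees threads of the JVM it runs in, it ranks by accumulated CPU time rather than the current percentage that `top` shows, and the `BusyThreads` and `Usage` names are invented for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class BusyThreads {
    /** Thread name plus accumulated CPU time in nanoseconds. */
    public static final class Usage {
        public final String name;
        public final long cpuNanos;

        Usage(String name, long cpuNanos) {
            this.name = name;
            this.cpuNanos = cpuNanos;
        }
    }

    /** Returns up to `limit` live threads of this JVM, highest CPU time first. */
    public static List<Usage> top(int limit) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<Usage> result = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return result; // this JVM cannot report per-thread CPU time
        }
        for (long id : mx.getAllThreadIds()) {
            long cpu = mx.getThreadCpuTime(id);     // -1 if the thread is no longer alive
            ThreadInfo info = mx.getThreadInfo(id); // null if the thread is no longer alive
            if (cpu >= 0 && info != null) {
                result.add(new Usage(info.getThreadName(), cpu));
            }
        }
        result.sort((a, b) -> Long.compare(b.cpuNanos, a.cpuNanos));
        int end = Math.min(Math.max(limit, 0), result.size());
        return new ArrayList<>(result.subList(0, end));
    }

    public static void main(String[] args) {
        for (Usage u : top(5)) {
            System.out.println(u.cpuNanos / 1_000_000 + " ms  " + u.name);
        }
    }
}
```

For a process that is already misbehaving in production, the external script remains the more practical tool, since it needs no code changes.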
Problem analysis
Finally, I used `show-busy-java-threads` to get the thread stack information, as shown below.
    [1] Busy(98.9%) thread(1730/0x6c2) stack of java process(968) under user(work):
    "pool-4-thread-1" prio=10 tid=0x00007fd24c005800 nid=0x6c2 runnable [0x00007fd3207c0000]
       java.lang.Thread.State: RUNNABLE
        at java.util.HashMap.getEntry(HashMap.java:347)
        at java.util.HashMap.containsKey(HashMap.java:335)
        at java.util.HashSet.contains(HashSet.java:184)
        ...
- As you can see, the thread is stuck in the `contains` method of `HashSet`, which finally led me to the root of the problem.
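The cause is a concurrency problem with `HashSet` in a multi-threaded environment. `HashSet` is implemented on top of `HashMap`, and after reading a lot of material I traced the behavior to the well-known `HashMap` infinite-loop problem. In short, `HashMap` is not thread-safe: in older JDKs, concurrent resizes can leave a cycle in a bucket's linked list, after which a lookup such as `getEntry` never terminates and keeps a CPU core busy. I am not the right person to explain the principle in depth, so please see the reference at https://coolshell.cn/articles/9606.html.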
The solution
- Use `ConcurrentHashMap` instead, and choose a suitable `initialCapacity`.
- In my case the problem came from the business logic, and it could also have been avoided in other ways.
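As a minimal sketch of the first option, a thread-safe set can be built on top of `ConcurrentHashMap` with `Collections.newSetFromMap`. The original code is not shown in this article, so everything here is hypothetical: the class name, the `fill` method, and the thread and element counts are chosen purely for illustration.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ConcurrentSetExample {
    /** Adds threads * perThread distinct values from several threads and returns the final size. */
    public static int fill(int threads, int perThread) {
        int expected = threads * perThread;
        // initialCapacity is set to the expected element count (an illustrative choice)
        // so the backing map does not have to grow repeatedly under load.
        Set<Integer> seen =
                Collections.newSetFromMap(new ConcurrentHashMap<Integer, Boolean>(expected));

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            int base = t * perThread;
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    seen.add(base + i);      // safe to call from many threads
                    seen.contains(base + i); // the call that was spinning with HashSet
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(fill(4, 10_000)); // prints 40000
    }
}
```

With a plain `HashSet` in place of `seen`, the same workload has no such guarantee: depending on the JDK version and timing it may lose elements, throw, or spin as described above.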
Conclusion
I learned a lot from this round of code optimization. I was not familiar with locating this kind of anomaly before, and this time I finally came away with something of a methodology.
References
- baike.baidu.com/item/CPU usage…
- www.jianshu.com/p/6d573e423…
- github.com/oldratlee/u…
- m.jb51.net/article/150…
- coolshell.cn/articles/96…