My name is Aaron Lang. This article has been included at Github.com/niumoo/Java… and on the Unread Code blog.
Preface
Before Arthas, when a Java application ran into online problems such as CPU spikes, load spikes, or memory overruns, you had to look up commands and search the web, then reach for jps, jstack, jmap, jhat, jstat, hprof, and so on. You ended up in a mess and did not always find the problem. Today, Arthas can locate most common problems so you can fix them quickly, stop the losses, and leave work on time.
1. Introduction to Arthas
Arthas is a Java diagnostic tool that Alibaba open-sourced in September 2018. It supports JDK 6+ and uses an interactive command line with Tab auto-completion, which makes it easy to locate and diagnose problems in running online programs. As of this writing, it has 26,700+ stars.
The official Arthas documentation is very detailed. This article draws on that documentation and on the Issues section of the GitHub project, which contains feedback and a large number of use cases for reference.
Open source: github.com/alibaba/art…
Official documentation: alibaba.github.io/arthas
2. Arthas Usage scenarios
Thanks to Arthas’s powerful and rich functionality, Arthas can do things beyond imagination. The following are just a few common usage scenarios, but there are many more that you can explore once you become familiar with Arthas.
- Is there a global view of the health of the system?
- Why did the CPU go up again, and where did it occupy the CPU?
- Do threads running have deadlocks? Is there a blockage?
- Programs take a long time to run. What takes a long time? How do you monitor it?
- From which JAR is this class loaded? Why are all kinds of class-related exceptions reported?
- Why didn’t the code I changed execute? Did I not commit? Got the branch wrong?
- When a problem occurs and you cannot debug online, is adding logs and redeploying the only option?
- Is there any way to monitor the real-time health of the JVM?
3. How to use Arthas
As mentioned earlier, Arthas is a command-line interactive Java diagnostic tool. It is written in Java, so you can run it directly after downloading the corresponding JAR package.
3.1 Installation
You can download it on Github, or if it is slow, you can try Gitee.
# Download from GitHub
wget https://alibaba.github.io/arthas/arthas-boot.jar
# Or download from Gitee
wget https://arthas.gitee.io/arthas-boot.jar
# Print help information
java -jar arthas-boot.jar -h
3.2 Run
Arthas is just a Java program, so you can run it directly with java -jar. You can choose the Java process to monitor either when starting Arthas or after it has started.
# Running mode 1: start first, then choose the Java process
java -jar arthas-boot.jar
# Choose a process: type the number in [] (not the PID) and press Enter
[INFO] arthas-boot version: 3.1.4
[INFO] Found existing java process, please choose one and hit RETURN.
* [1]: 11616 com.Arthas
[2]: 8676
[3]: 16200 org.jetbrains.jps.cmdline.Launcher
[4]: 21032 org.jetbrains.idea.maven.server.RemoteMavenServer
# Running mode 2: specify the Java process PID at startup
java -jar arthas-boot.jar [PID]
You can use the ps command, or the jps command provided by the JDK, to find the PID.
# View information about running Java processes
$ jps -mlvV
# Filter Java process information
$ jps -mlvV | grep [xxx]
Using jps to filter for the desired process.
After the Arthas Logo appears, you can use commands to diagnose problems. More on this below.
For more information about how to start, see the help command.
# Other usages
EXAMPLES:
  java -jar arthas-boot.jar <pid>
  java -jar arthas-boot.jar --target-ip 0.0.0.0
  java -jar arthas-boot.jar --telnet-port 9999 --http-port -1
  java -jar arthas-boot.jar --tunnel-server 'ws://192.168.10.11:7777/ws'
  java -jar arthas-boot.jar --tunnel-server 'ws://192.168.10.11:7777/ws' --agent-id bvDOe8XbTM2pQWjF4cfw
  java -jar arthas-boot.jar --stat-url 'http://192.168.10.11:8080/api/stat'
  java -jar arthas-boot.jar -c 'sysprop; thread' <pid>
  java -jar arthas-boot.jar -f batch.as <pid>
  java -jar arthas-boot.jar --use-version 3.1.4
  java -jar arthas-boot.jar --versions
  java -jar arthas-boot.jar --session-timeout 3600
  java -jar arthas-boot.jar --attach-only
  java -jar arthas-boot.jar --repo-mirror aliyun --use-http
3.3 Web Console
Arthas currently supports a Web Console. It starts automatically once Arthas has attached to the process and can be accessed at http://127.0.0.1:8563/. The page works exactly like the command-line console.
3.4 Common Commands
Here are some common Arthas commands. You may not know how to use them yet; they are covered below.
Command | Description |
---|---|
dashboard | Real-time data panel for the current system |
thread | View thread stack information for the current JVM |
watch | Observe method execution data |
trace | Trace the internal call path of a method and print the time spent at each node on the path |
stack | Print the call path by which the current method is invoked |
tt | Time tunnel for method execution data: records the arguments and return information of every call to the specified method so that calls made at different times can be inspected |
monitor | Monitor method execution |
jvm | View current JVM information |
vmoption | View and update JVM diagnostic options |
sc | View information about classes loaded by the JVM |
sm | View method information for loaded classes |
jad | Decompile the source code of a specified loaded class |
classloader | View the classloader hierarchy, URLs, and class loading information |
heapdump | Dump the heap, similar to the jmap command |
3.5 Exit
Arthas automatically resets all enhanced classes at the same time when you exit with shutdown.
4. Arthas common operations
Now that you know what Arthas is and how to start it, the following sections show how to use it in different situations. If you run into a problem with a command, check that command's help information.
Write a test class that simulates various scenarios and run it, then use Arthas to locate the problems:
import java.util.HashSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import lombok.extern.slf4j.Slf4j;

/**
 * <p>
 * Arthas Demo
 *
 * @Author niujinpeng
 */
@Slf4j
public class Arthas {

    private static HashSet hashSet = new HashSet();
    /** Thread pool, size 1 */
    private static ExecutorService executorService = Executors.newFixedThreadPool(1);

    public static void main(String[] args) {
        // Simulate high CPU usage; commented out here, uncomment it when testing
        // cpu();
        // Simulate thread blocking
        thread();
        // Simulate thread deadlock
        deadThread();
        // Keep adding data to the hashSet
        addHashSetThread();
    }

    /**
     * Continuously adds data to the hashSet
     */
    public static void addHashSetThread() {
        // Initialize constants
        new Thread(() -> {
            int count = 0;
            while (true) {
                try {
                    hashSet.add("count" + count);
                    Thread.sleep(10000);
                    count++;
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }).start();
    }

    public static void cpu() {
        cpuHigh();
        cpuNormal();
    }

    /**
     * A thread that consumes an extreme amount of CPU
     */
    private static void cpuHigh() {
        Thread thread = new Thread(() -> {
            while (true) {
                log.info("cpu start 100");
            }
        });
        // Submit to the thread pool
        executorService.submit(thread);
    }

    /**
     * Threads with ordinary CPU consumption
     */
    private static void cpuNormal() {
        for (int i = 0; i < 10; i++) {
            new Thread(() -> {
                while (true) {
                    log.info("cpu start");
                    try {
                        Thread.sleep(3000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            }).start();
        }
    }

    /**
     * Simulates thread blocking by submitting a thread to an already full thread pool
     */
    private static void thread() {
        Thread thread = new Thread(() -> {
            while (true) {
                log.debug("thread start");
                try {
                    Thread.sleep(3000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        });
        // Submit to the thread pool
        executorService.submit(thread);
    }

    /**
     * Deadlock
     */
    private static void deadThread() {
        /** Create resources */
        Object resourceA = new Object();
        Object resourceB = new Object();
        // Create the threads
        Thread threadA = new Thread(() -> {
            synchronized (resourceA) {
                log.info(Thread.currentThread() + " get ResourceA");
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                log.info(Thread.currentThread() + "waiting get resourceB");
                synchronized (resourceB) {
                    log.info(Thread.currentThread() + " get resourceB");
                }
            }
        });
        Thread threadB = new Thread(() -> {
            synchronized (resourceB) {
                log.info(Thread.currentThread() + " get ResourceB");
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                log.info(Thread.currentThread() + "waiting get resourceA");
                synchronized (resourceA) {
                    log.info(Thread.currentThread() + " get resourceA");
                }
            }
        });
        threadA.start();
        threadB.start();
    }
}
4.1 Global Monitoring
The dashboard command gives you an overview of your application's threads, memory, GC, and runtime environment.
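To give a feel for where this kind of data can come from, here is a minimal sketch that reads similar figures from the JDK's standard java.lang.management MXBeans. It is only an illustration and is not how Arthas itself is implemented; the class name JvmSnapshot is hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical helper: builds a one-off text snapshot of thread, heap, and GC figures. */
class JvmSnapshot {

    /** Returns a short multi-line summary of the current JVM. */
    static String snapshot() {
        StringBuilder sb = new StringBuilder();

        // Number of live threads, daemon and non-daemon
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        sb.append("threads: ").append(threads).append('\n');

        // Heap currently in use
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        sb.append("heap used: ").append(heap.getUsed() / 1024 / 1024).append(" MB\n");

        // One line per garbage collector known to this JVM
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append("gc ").append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(", time=").append(gc.getCollectionTime()).append(" ms\n");
        }

        sb.append("java.version: ").append(System.getProperty("java.version")).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(snapshot());
    }
}
```

Unlike this sketch, dashboard attaches to an already running process and refreshes continuously, so no code change or restart is needed.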
4.2 Why Did the CPU Spike
The code example above contains an infinite loop that spins the CPU, which is very costly. How do you find it?
Use thread to view all thread information; the CPU usage of each thread is listed as well. You can see that the thread with ID 12 is using 100% CPU.
Using the command thread 12, you can see the method and line number responsible for the high CPU usage (the line number may differ from the code above, which was reformatted when I wrote this article).
The approach above looks at all threads first and then drills into a specific one. If you only want the threads with high CPU usage, you can use thread -n [number of threads to display] to list the threads with the highest CPU usage.
Locating the method with the highest CPU usage.
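The idea of ranking threads by CPU can be sketched with the JDK's ThreadMXBean. This is a simplified illustration: it ranks by accumulated CPU time since each thread started, not by usage over a sampling interval, and it makes no claim about how Arthas computes its figures. The class name BusyThreads and the thread name "spinner" are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper: lists the threads that have accumulated the most CPU time. */
class BusyThreads {

    /** Returns up to n lines of the form "threadName <millis> ms", busiest first. */
    static List<String> top(int n) {
        List<String> out = new ArrayList<>();
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadCpuTimeSupported()) {
            return out; // this JVM cannot measure per-thread CPU time
        }
        List<long[]> rows = new ArrayList<>(); // each row is {threadId, cpuNanos}
        for (long id : mx.getAllThreadIds()) {
            long cpu = mx.getThreadCpuTime(id); // -1 if the thread died or measurement is disabled
            if (cpu >= 0) {
                rows.add(new long[] {id, cpu});
            }
        }
        rows.sort(Comparator.comparingLong((long[] r) -> r[1]).reversed());
        for (long[] r : rows.subList(0, Math.min(n, rows.size()))) {
            ThreadInfo info = mx.getThreadInfo(r[0]);
            if (info != null) {
                out.add(info.getThreadName() + " " + r[1] / 1_000_000 + " ms");
            }
        }
        return out;
    }

    public static void main(String[] args) throws InterruptedException {
        // A daemon thread that spins forever, similar in spirit to cpuHigh() in the demo
        Thread spinner = new Thread(() -> { while (true) { } }, "spinner");
        spinner.setDaemon(true);
        spinner.start();
        Thread.sleep(500);
        top(3).forEach(System.out::println);
    }
}
```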
4.3 Thread Pool Thread status
Before addressing threading issues, let’s review several common states of threads:
- RUNNABLE: the thread is running
- TIMED_WAITING: a thread enters TIMED_WAITING after calling one of the following methods:
  - Thread#sleep()
  - Object#wait() with a timeout argument
  - Thread#join() with a timeout argument
  - LockSupport#parkNanos()
  - LockSupport#parkUntil()
- WAITING: a thread enters WAITING after calling one of the following methods:
  - Object#wait() without a timeout argument
  - Thread#join() without a timeout argument
  - LockSupport#park()
- BLOCKED: the thread is blocked waiting for a lock
In the simulation above, the thread pool has a size of 1. cpuHigh submits one task to it and thread submits another, so the later task is held up because the pool is full.
Use thread | grep pool to view the thread pool's threads.
You can see that the thread pool has WAITING threads.
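The thread states reviewed above can be reproduced in a few lines with Thread#getState(). The sketch below is only an illustration; the class and helper names are hypothetical, and the 200 ms settling delay is an assumption that should be comfortable on an ordinary machine.

```java
/** Hypothetical demo: puts threads into TIMED_WAITING, WAITING, and BLOCKED and reads their state. */
class ThreadStates {

    /** Starts a daemon thread for the task, waits briefly, and returns the state it settled in. */
    static Thread.State stateOf(Runnable task) {
        Thread t = new Thread(task);
        t.setDaemon(true); // do not keep the JVM alive for the demo
        t.start();
        sleepQuietly(200); // give the thread time to reach its steady state
        return t.getState();
    }

    static void sleepQuietly(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Waits on the monitor with no timeout, looping to guard against spurious wakeups. */
    static void waitForever(Object monitor) {
        synchronized (monitor) {
            try {
                while (true) {
                    monitor.wait();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        Object lock = new Object();
        // Thread#sleep() -> TIMED_WAITING
        System.out.println(stateOf(() -> sleepQuietly(5_000)));
        // Object#wait() without a timeout -> WAITING
        System.out.println(stateOf(() -> waitForever(new Object())));
        // Entering a monitor held by another thread -> BLOCKED
        synchronized (lock) {
            System.out.println(stateOf(() -> { synchronized (lock) { } }));
        }
    }
}
```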
4.4 Thread Deadlocks
The deadThread method implements a deadlock in the simulation code above. Use thread -b to view the information directly to locate the deadlock.
/**
 * Deadlock
 */
private static void deadThread() {
    /** Create resources */
    Object resourceA = new Object();
    Object resourceB = new Object();
    // Create the threads
    Thread threadA = new Thread(() -> {
        synchronized (resourceA) {
            log.info(Thread.currentThread() + " get ResourceA");
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            log.info(Thread.currentThread() + "waiting get resourceB");
            synchronized (resourceB) {
                log.info(Thread.currentThread() + " get resourceB");
            }
        }
    });
    Thread threadB = new Thread(() -> {
        synchronized (resourceB) {
            log.info(Thread.currentThread() + " get ResourceB");
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            log.info(Thread.currentThread() + "waiting get resourceA");
            synchronized (resourceA) {
                log.info(Thread.currentThread() + " get resourceA");
            }
        }
    });
    threadA.start();
    threadB.start();
}
The deadlock information that was found.
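For comparison, the JDK itself can report monitor deadlocks through ThreadMXBean#findDeadlockedThreads(). The sketch below recreates the same lock-ordering deadlock with daemon threads and then asks the JVM which threads are stuck. It is an illustration only and does not claim to be what thread -b does internally; the names DeadlockProbe, locker-A, and locker-B are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical demo: creates a two-thread deadlock and detects it with ThreadMXBean. */
class DeadlockProbe {

    /** Starts two daemon threads that take the same two locks in opposite order. */
    static void createDeadlock() {
        Object a = new Object();
        Object b = new Object();
        startLocker("locker-A", a, b);
        startLocker("locker-B", b, a);
    }

    private static void startLocker(String name, Object first, Object second) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                pause(200); // let the other thread grab its first lock
                synchronized (second) {
                    System.out.println(name + " got both locks");
                }
            }
        }, name);
        t.setDaemon(true); // the deadlocked threads must not keep the JVM alive
        t.start();
    }

    static void pause(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Prints each deadlocked thread and returns how many were found. */
    static int countDeadlocked() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock exists
        if (ids == null) {
            return 0;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info != null) {
                System.out.println(info.getThreadName() + " is waiting for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return ids.length;
    }

    public static void main(String[] args) {
        createDeadlock();
        pause(800); // wait until both threads are stuck
        System.out.println("deadlocked threads: " + countDeadlocked());
    }
}
```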
4.5 Decompiling
The code above is in the package com. Suppose this is an online environment: when you suspect that the running code is not the code you expect, you can decompile it directly, or selectively view the class's field or method information.
If you suspect the running code is not your own, decompile the class directly with jad.
The jad command also provides some additional parameters:
# Decompile and show only the source code
jad --source-only com.Arthas
# Decompile a single method of a class
jad --source-only com.Arthas mysql
4.6 Viewing Field Information
Run the sc -d -f command to view the field information of a class.
[arthas@20252]$ sc -d -f com.Arthas
sc -d -f com.Arthas
class-info com.Arthas
code-source /C:/Users/Niu/Desktop/arthas/target/classes/
name com.Arthas
isInterface false
isAnnotation false
isEnum false
isAnonymousClass false
isArray false
isLocalClass false
isMemberClass false
isPrimitive false
isSynthetic false
simple-name Arthas
modifier public
annotation
interfaces
super-class +-java.lang.Object
class-loader +-sun.misc.Launcher$AppClassLoader@18b4aac2
+-sun.misc.Launcher$ExtClassLoader@2ef1e4fa
classLoaderHash 18b4aac2
fields modifierfinal,private,static
type org.slf4j.Logger
name log
value Logger[com.Arthas]
modifierprivate,static
type java.util.HashSet
name hashSet
value [count1, count2]
modifierprivate,static
type java.util.concurrent.ExecutorService
name executorService
value java.util.concurrent.ThreadPoolExecutor@71c03156[Ru
nning, pool size = 1, active threads = 1, queued ta
sks = 0, completed tasks = 0]
Affect(row-cnt:1) cost in 9 ms.
4.7 Viewing Method Information
Use the sm command to view the method information of the class.
[arthas@22180]$ sm com.Arthas
com.Arthas <init>()V
com.Arthas start()V
com.Arthas thread()V
com.Arthas deadThread()V
com.Arthas lambda$cpuHigh$1()V
com.Arthas cpuHigh()V
com.Arthas lambda$thread$3()V
com.Arthas addHashSetThread()V
com.Arthas cpuNormal()V
com.Arthas cpu()V
com.Arthas lambda$addHashSetThread$0()V
com.Arthas lambda$deadThread$4(Ljava/lang/Object;Ljava/lang/Object;)V
com.Arthas lambda$deadThread$5(Ljava/lang/Object;Ljava/lang/Object;)V
com.Arthas lambda$cpuNormal$2()V
Affect(row-cnt:16) cost in 6 ms.
4.8 Curious About the Value of a Variable
With the ognl command, OGNL expressions make it easy to get at the information you want.
Using the same code as in the example above, we look at the data in the variable hashSet:
View static variable hashSet information.
[arthas@19856]$ ognl '@com.Arthas@hashSet'
@HashSet[
@String[count1],
@String[count2],
@String[count29],
@String[count28],
@String[count0],
@String[count27],
@String[count5],
@String[count26],
@String[count6],
@String[count25],
@String[count3],
@String[count24],
View the size of the static variable hashSet.
[arthas@19856]$ ognl '@com.Arthas@hashSet.size()'
@Integer[57]
You can even modify it.
[arthas@19856]$ ognl '@com.Arthas@hashSet.add("test")'
@Boolean[true]
[arthas@19856]$
# View the added element
[arthas@19856]$ ognl '@com.Arthas@hashSet' | grep test
@String[test],
[arthas@19856]$
OGNL can do much more; see the special usages of OGNL expressions (https://github.com/alibaba/arthas/issues/71).
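Conceptually, reading a private static field such as hashSet from outside the class is similar to what plain Java reflection allows. The following sketch shows that idea with a stand-in class; it is not the OGNL engine and not Arthas code, and the names StaticFieldPeek, Demo, and readStatic are hypothetical.

```java
import java.lang.reflect.Field;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical demo: reads a private static field by name using reflection. */
class StaticFieldPeek {

    /** Stand-in for the article's com.Arthas class, used only for this sketch. */
    static class Demo {
        private static final Set<String> hashSet = new HashSet<>();

        static {
            hashSet.add("count0");
            hashSet.add("count1");
        }
    }

    /** Returns the value of a static field, ignoring its access modifier. */
    static Object readStatic(Class<?> type, String fieldName) {
        try {
            Field field = type.getDeclaredField(fieldName);
            field.setAccessible(true);
            return field.get(null); // null receiver because the field is static
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot read " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        Set<?> set = (Set<?>) readStatic(Demo.class, "hashSet");
        System.out.println("size = " + set.size());
        System.out.println("contains count0 = " + set.contains("count0"));
    }
}
```

The practical difference is that the ognl command evaluates the expression inside the running target process, so no such helper has to be compiled into the application.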
4.9 Is there any problem with the program
4.9.1 Slow and Time-consuming Execution
Use the trace command to trace a method and measure the time it takes.
This time let's switch to different simulation code. A simple Spring Boot controller has a getUser method that calls userService.get(uid), and that method in turn performs the check, service, redis, and mysql operations.
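import java.util.HashMap;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

import lombok.extern.slf4j.Slf4j;

@RestController
@Slf4j
public class UserController {

    @Autowired
    private UserServiceImpl userService;

    @GetMapping(value = "/user")
    public HashMap<String, Object> getUser(Integer uid) throws Exception {
        // Simulate a user query
        userService.get(uid);
        HashMap<String, Object> hashMap = new HashMap<>();
        hashMap.put("uid", uid);
        hashMap.put("name", "name" + uid);
        return hashMap;
    }
}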
Mock code Service:
import org.springframework.stereotype.Service;

import lombok.extern.slf4j.Slf4j;

@Service
@Slf4j
public class UserServiceImpl {

    public void get(Integer uid) throws Exception {
        check(uid);
        service(uid);
        redis(uid);
        mysql(uid);
    }

    public void service(Integer uid) throws Exception {
        int count = 0;
        for (int i = 0; i < 10; i++) {
            count += i;
        }
        log.info("service end {}", count);
    }

    public void redis(Integer uid) throws Exception {
        int count = 0;
        for (int i = 0; i < 10000; i++) {
            count += i;
        }
        log.info("redis end {}", count);
    }

    public void mysql(Integer uid) throws Exception {
        long count = 0;
        for (int i = 0; i < 10000000; i++) {
            count += i;
        }
        log.info("mysql end {}", count);
    }

    public boolean check(Integer uid) throws Exception {
        if (uid == null || uid < 0) {
            log.error("Uid not correct, uid:{}", uid);
            throw new Exception("Uid incorrect");
        }
        return true;
    }
}
After starting the Spring Boot application, use the trace command to start measuring elapsed time.
[arthas@6592]$ trace com.UserController getUser
Access the /user endpoint.
You can see the timing information, which shows that the com.UserServiceImpl:get() method is the time-consuming one.
Continue tracing the time-consuming methods, and then visit again.
[arthas@6592]$ trace com.UserServiceImpl get
It is clear that the mysql method of com.UserServiceImpl is the most time-consuming.
Affect(class-cnt:1 , method-cnt:1) cost in 31 ms.
`---ts=2019-10-16 14:40:10;thread_name=http-nio-8080-exec-8;id=1f;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@23a918c7
    `---[6.792201ms] com.UserServiceImpl:get()
        +---[0.008ms] com.UserServiceImpl:check() #17
        +---[0.076ms] com.UserServiceImpl:service() #18
        +---[0.1089ms] com.UserServiceImpl:redis() #19
        `---[6.528899ms] com.UserServiceImpl:mysql() #20
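Without a tool like trace, the usual fallback is to time each step by hand. The sketch below shows that manual approach for comparison: it wraps each step with System.nanoTime() and prints one line per step. The class ManualTrace, the helper loop, and the step names are hypothetical stand-ins for the simulated service, redis, and mysql work above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical demo: hand-written per-step timing, the thing trace saves you from writing. */
class ManualTrace {

    /** Written by loop() so the JIT cannot discard the loop as dead code. */
    static volatile long sink;

    /** Busy loop standing in for the simulated work in the article's service class. */
    static void loop(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            count += i;
        }
        sink = count;
    }

    /** Runs each named step in insertion order and records its elapsed nanoseconds. */
    static Map<String, Long> time(Map<String, Runnable> steps) {
        Map<String, Long> costs = new LinkedHashMap<>();
        for (Map.Entry<String, Runnable> step : steps.entrySet()) {
            long start = System.nanoTime();
            step.getValue().run();
            costs.put(step.getKey(), System.nanoTime() - start);
        }
        return costs;
    }

    public static void main(String[] args) {
        Map<String, Runnable> steps = new LinkedHashMap<>();
        steps.put("service", () -> loop(10));
        steps.put("redis", () -> loop(10_000));
        steps.put("mysql", () -> loop(10_000_000));
        time(steps).forEach((name, nanos) ->
                System.out.printf("[%.4fms] %s%n", nanos / 1_000_000.0, name));
    }
}
```

The drawback of this approach is exactly what the article is about: it requires changing and redeploying the code, whereas trace instruments the running method on demand.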
4.9.2 Method Execution Statistics
Use the monitor command to collect execution statistics for a method.
The get method execution of the com.UserServiceImpl class is counted every 5 seconds.
monitor -c 5 com.UserServiceImpl get
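To make the idea of periodic method statistics concrete, here is a minimal sketch that counts calls, failures, and average duration and then resets, which is roughly the kind of per-interval summary such monitoring produces. The class CallStats and its output format are hypothetical and are not Arthas's implementation or output.

```java
import java.util.function.Supplier;

/** Hypothetical demo: accumulates call statistics that can be reported and reset per interval. */
class CallStats {

    private long total;
    private long failed;
    private long nanos;

    /** Invokes the call and records its duration and whether it threw. */
    <T> T record(Supplier<T> call) {
        long start = System.nanoTime();
        boolean ok = false;
        try {
            T result = call.get();
            ok = true;
            return result;
        } finally {
            add(System.nanoTime() - start, ok);
        }
    }

    private synchronized void add(long elapsed, boolean ok) {
        total++;
        if (!ok) {
            failed++;
        }
        nanos += elapsed;
    }

    synchronized long total() {
        return total;
    }

    synchronized long failed() {
        return failed;
    }

    /** Formats the statistics gathered so far and starts a new interval. */
    synchronized String summaryAndReset() {
        double avgMs = total == 0 ? 0 : nanos / 1_000_000.0 / total;
        double failRate = total == 0 ? 0 : 100.0 * failed / total;
        String line = String.format("total=%d success=%d fail=%d avg-rt=%.3fms fail-rate=%.2f%%",
                total, total - failed, failed, avgMs, failRate);
        total = 0;
        failed = 0;
        nanos = 0;
        return line;
    }

    public static void main(String[] args) {
        CallStats stats = new CallStats();
        stats.record(() -> "ok");
        try {
            stats.record(() -> { throw new IllegalStateException("Uid incorrect"); });
        } catch (IllegalStateException expected) {
            // the failure is already counted
        }
        System.out.println(stats.summaryAndReset());
    }
}
```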
4.10 Want to observe method information
The following examples use the two pieces of simulation code from earlier in this article.
4.10.1 Observing method input and output parameter information
Use the watch command to view input and output parameters and exceptions.
USAGE:
watch [-b] [-e] [-x <value>] [-f] [-h] [-n <value>] [-E] [-M <value>] [-s] class-pattern method-pattern express [condition-express]
SUMMARY:
Display the input/output parameter, return object, and thrown exception of specified method invocation
The express may be one of the following expression (evaluated dynamically):
target : the object
clazz : the object's class
method : the constructor or method
params : the parameters array of method
params[0..n] : the element of parameters array
returnObj : the returned object of method
throwExp : the throw exception of method
isReturn : the method ended by return
isThrow : the method ended by throwing exception
#cost : the execution time in ms of method invocation
Examples:
watch -b org.apache.commons.lang.StringUtils isBlank params
watch -f org.apache.commons.lang.StringUtils isBlank returnObj
watch org.apache.commons.lang.StringUtils isBlank '{params, target, returnObj}' -x 2
watch -bf *StringUtils isBlank params
watch *StringUtils isBlank params[0]
watch *StringUtils isBlank params[0] params[0].length==1
watch *StringUtils isBlank params '#cost>100'
watch -E -b org\.apache\.commons\.lang\.StringUtils isBlank params[0]
WIKI:
https://alibaba.github.io/arthas/watch
Common operations:
# View the input argument and the return value
$ watch com.Arthas addHashSet '{params[0],returnObj}'
# View the input argument and the size of the return value
$ watch com.Arthas addHashSet '{params[0],returnObj.size}'
# View the input argument and whether the return value contains 'count10'
$ watch com.Arthas addHashSet '{params[0],returnObj.contains("count10")}'
# View the input argument and the return value converted with toString
$ watch com.Arthas addHashSet '{params[0],returnObj.toString()}'
Check the input and output parameters.
View the returned exception information.
4.10.2 Observing the invocation path of a method
Use the stack command to view the method call information.
#Look at the mysql method call path of the com.UserServiceImpl class
stack com.UserServiceImpl mysql
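The call path that stack prints is the same kind of information a Java stack trace carries. As a small illustration, the sketch below captures the current call path from inside a method; the class CallPath and its methods get and mysql are hypothetical stand-ins for the service methods above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo: captures the chain of callers that led to the current method. */
class CallPath {

    /** Returns "Class.method" for every frame above this method, innermost caller first. */
    static List<String> callers() {
        List<String> path = new ArrayList<>();
        StackTraceElement[] frames = new Throwable().getStackTrace();
        // frames[0] is callers() itself, so start at 1
        for (int i = 1; i < frames.length; i++) {
            path.add(frames[i].getClassName() + "." + frames[i].getMethodName());
        }
        return path;
    }

    /** Outer method, playing the role of UserServiceImpl.get in the article. */
    static List<String> get() {
        return mysql();
    }

    /** Inner method whose call path we want to know. */
    static List<String> mysql() {
        return callers();
    }

    public static void main(String[] args) {
        for (String frame : get()) {
            System.out.println("@" + frame);
        }
    }
}
```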
4.10.3 Method Call space-time tunnel
Use the tt command to log the details of the method execution.
The tt command is a time tunnel for method execution data. It records the arguments and return information of every call to the specified method, so that calls made at different times can be inspected afterwards.
Common operations:
Start recording method call information: tt -t com.UserServiceImpl check
You can see that the IS-EXP value of the record with INDEX=1001 is true, indicating that the call threw an exception.
View the recorded method call information: tt -l
View the details of a call record (-i specifies the INDEX): tt -i 1001
You can see the exception information recorded with INDEX=1001.
To re-invoke the method using a recorded call, add -p:
tt -i 1001 -p
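The record-then-replay idea can be sketched in plain Java: wrap a function, store the argument and outcome of every call under an index, and re-invoke with a stored argument on request. This is only a conceptual illustration with hypothetical names (CallRecorder, Entry); the starting index of 1000 is chosen merely to resemble the INDEX values shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical demo: records every call to a function and can replay a recorded call. */
class CallRecorder<A, R> {

    /** One recorded invocation: its argument plus either a result or the exception thrown. */
    static final class Entry<A, R> {
        final int index;
        final A arg;
        final R result;
        final RuntimeException error;

        Entry(int index, A arg, R result, RuntimeException error) {
            this.index = index;
            this.arg = arg;
            this.result = result;
            this.error = error;
        }

        boolean isException() {
            return error != null;
        }
    }

    private final Function<A, R> target;
    private final List<Entry<A, R>> entries = new ArrayList<>();
    private int nextIndex = 1000;

    CallRecorder(Function<A, R> target) {
        this.target = target;
    }

    /** Calls the target, stores what happened, and returns the stored entry. */
    Entry<A, R> call(A arg) {
        R result = null;
        RuntimeException error = null;
        try {
            result = target.apply(arg);
        } catch (RuntimeException e) {
            error = e;
        }
        Entry<A, R> entry = new Entry<>(nextIndex++, arg, result, error);
        entries.add(entry);
        return entry;
    }

    /** Re-invokes the target with the argument of a previously recorded entry. */
    Entry<A, R> replay(int index) {
        for (Entry<A, R> e : new ArrayList<>(entries)) {
            if (e.index == index) {
                return call(e.arg);
            }
        }
        throw new IllegalArgumentException("no record with index " + index);
    }

    public static void main(String[] args) {
        CallRecorder<Integer, Boolean> recorder = new CallRecorder<>(uid -> {
            if (uid == null || uid < 0) {
                throw new IllegalArgumentException("Uid incorrect");
            }
            return true;
        });
        recorder.call(1);
        Entry<Integer, Boolean> bad = recorder.call(-1);
        System.out.println("index=" + bad.index + " is-exp=" + bad.isException());
        System.out.println("replay is-exp=" + recorder.replay(bad.index).isException());
    }
}
```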
4.5. Flame graph analysis
Arthas has recently added flame graph analysis. It uses async-profiler to generate CPU and memory flame graphs for performance analysis, which complements memory analysis, and it is relatively easy to use from Arthas.
The profiler command supports generating a flame graph of application hotspots. Essentially, it keeps taking samples and then builds the flame graph from the collected samples.
profiler action [actionArg]
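The sampling idea can be illustrated with a very rough sketch: periodically capture every thread's stack and count how often each top frame appears. This is not async-profiler; sampling through Thread.getAllStackTraces() is far cruder (it also counts waiting threads and has noticeable overhead) and is shown here only to make the principle concrete. The class name StackSampler is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical demo: a naive stack sampler that counts the top frame of every thread. */
class StackSampler {

    /** Takes `rounds` samples, `intervalMs` apart, and counts how often each top frame is seen. */
    static Map<String, Integer> sample(int rounds, long intervalMs) {
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < rounds; i++) {
            for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
                StackTraceElement[] stack = e.getValue();
                // Skip the sampling thread itself and threads without Java frames
                if (e.getKey() == Thread.currentThread() || stack.length == 0) {
                    continue;
                }
                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(top, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                return hits;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // A daemon thread that spins, so that there is something hot to find
        Thread spinner = new Thread(() -> { while (true) { } }, "spinner");
        spinner.setDaemon(true);
        spinner.start();
        sample(20, 10).forEach((frame, count) -> System.out.println(count + "  " + frame));
    }
}
```

A flame graph goes one step further than this sketch by aggregating whole stacks rather than only the top frame, which is what gives it its layered shape.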
4.5.1. Use cases
Start the profiler
By default, a CPU flame graph is generated, meaning that the event is cpu. This can be changed with the --event argument. Use the start command to begin sampling.
$ profiler start
Started [cpu] profiling
Obtain the number of collected samples
$ profiler getSamples
23
You can view the profiler status to see which event is being sampled and at what time.
$ profiler status
[cpu] profiling is running for 4 seconds
Stop the profiler
Generate an SVG flame graph
$ profiler stop
profiler output file: /tmp/demo/arthas-output/20191125-135546.svg
OK
By default, the generated results are saved to the arthas-output directory under the application's working directory. You can specify the output path with the --file argument.
For example:
$ profiler stop --file /tmp/output.svg
HTML output
By default, the resulting file is in SVG format. If you want to generate HTML, specify it with the --format argument: $ profiler stop --format html
View the profiler results
By default Arthas uses port 3658, so you can open http://localhost:3658/arthas-output/ to view the profiler results under the arthas-output directory:
Click a result to view it. In the flame graph, the longer a bar is, the more of the samples it accounts for, and the call stack reads from bottom to top.
The profiler supports multiple kinds of analysis. Common events include cpu, alloc, lock, and cache-misses. For example, to analyze memory allocation:
$ profiler start --event alloc
4.5.2. Complex Commands
For example, start sampling:
profiler execute 'start'
Stop sampling and save to the specified file:
profiler execute 'stop,file=/tmp/result.svg'
The code has been uploaded to Github.
The last word
Hello world 🙂
I am Lang, a technical tool person who moves bricks every day.
If you want to subscribe, you can follow the public account Unread Code or the Unread Code blog.