Preface

Before Arthas, diagnosing online Java problems such as CPU spikes, load spikes, or memory overruns meant looking up commands, searching the web, and then juggling jps, jstack, jmap, jhat, jstat, hprof, and so on. You ended up in a mess and still did not always find the problem. Today, Arthas can locate most common problems, so you can fix them quickly, stop the losses, and leave work on time.

  1. Arthas provides a variety of monitoring commands, such as trace for performance tracing and response-time monitoring.
  2. The Web Console lets you enter and run commands directly in a browser.
  3. Remote monitoring is convenient: Arthas is a standalone tool, so there is no need to add JVM configuration or restart the JVM (of course, if the production environment is closed off, the benefit is less obvious).

1. Introduction to Arthas

Arthas is a Java diagnostic tool that Alibaba open-sourced in September 2018. It supports JDK 6+ and uses an interactive command-line mode with Tab auto-completion, which makes it easy to locate and diagnose problems in running online programs. As of this writing, it has more than 17,000 stars on GitHub.

The official Arthas documentation is very detailed. This article draws on it, as well as on the Issues section of the GitHub project, which contains feedback and a large number of use cases.

Open source: github.com/alibaba/art…

Official document: arthas.aliyun.com/doc/

2. Arthas Usage scenarios

Thanks to Arthas’s powerful and rich functionality, Arthas can do things beyond imagination. The following are just a few common usage scenarios, but there are many more that you can explore once you become familiar with Arthas.

  1. Is there a global view of the health of the system?
  2. Why did the CPU usage go up again, and what is consuming the CPU?
  3. Are any running threads deadlocked? Are any blocked?
  4. The program takes a long time to run. Which part is slow, and how do you monitor it?
  5. Which JAR was this class loaded from? Why are all kinds of class-related exceptions reported?
  6. Why didn't the code I changed execute? Did I forget to commit? Did I use the wrong branch?
  7. When a problem cannot be debugged online, is adding logs and redeploying the only option?
  8. Is there any way to monitor the real-time health of the JVM?

3. How to use Arthas

As mentioned earlier, Arthas is an interactive command-line Java diagnostic tool. It is written in Java, so you only need to download the JAR package and run it.

3.1 Installation

You can download it on Github, or if it is slow, you can try Gitee.

# Download the jar from Alibaba Cloud
curl -O https://arthas.aliyun.com/arthas-boot.jar
# Or download it from GitHub with wget
wget https://alibaba.github.io/arthas/arthas-boot.jar
# Or download it from Gitee
wget https://arthas.gitee.io/arthas-boot.jar
# Print help information
java -jar arthas-boot.jar -h

3.2 Run

Arthas is just a Java program, so you can run it directly with java -jar. Select the Java process to monitor either at startup or after startup.

# Mode 1: start first, then select the process
java -jar arthas-boot.jar
# Select the process (enter the number in [], not the PID, and press Enter)
eternity@01ak31301447966 project % java -jar arthas-boot.jar
[INFO] arthas-boot version: 3.4.0
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 769
  [2]: 58579 org.jeecg.JeecgApplication
  [3]: 58580 org.jetbrains.jps.cmdline.Launcher
  [4]: 27300 org.jetbrains.jps.cmdline.Launcher
  [5]: 32197 myssh.App

# Mode 2: specify the PID of the Java process at startup
java -jar arthas-boot.jar [PID]

You can run the ps command, or the jps command provided by the JDK, to view the PID.

Note: run these commands in a terminal on the computer or server itself, not inside the Arthas console.

# View information about running Java processes
$ jps -mlvV
# Filter the Java process information
$ jps -mlvV | grep [xxx]

Filtering for the desired process with jps:

eternity@01ak31301447966 ~ % jps -mlvV|grep arthas
67250 arthas-boot.jar
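
If the target application is your own, it can also print its PID at startup so that you know which entry to pick. The following is a minimal sketch, not part of Arthas; the class name PidLogger is made up for illustration, and ProcessHandle requires Java 9 or later.

```java
import java.lang.management.ManagementFactory;

class PidLogger {
    /** Returns the PID of the current JVM (ProcessHandle needs Java 9+). */
    static long currentPid() {
        return ProcessHandle.current().pid();
    }

    public static void main(String[] args) {
        // On common JVMs the runtime name looks like "pid@hostname",
        // but that format is not guaranteed by the specification.
        System.out.println("JVM name: " + ManagementFactory.getRuntimeMXBean().getName());
        System.out.println("PID: " + currentPid());
    }
}
```

The printed PID is the value to pass as java -jar arthas-boot.jar [PID].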

After the Arthas Logo appears, you can use commands to diagnose problems. More on this below.

For more information about how to start, see the help command.

# Other examples:
java -jar arthas-boot.jar <pid>
java -jar arthas-boot.jar --target-ip 0.0.0.0
java -jar arthas-boot.jar --telnet-port 9999 --http-port -1
java -jar arthas-boot.jar --tunnel-server 'ws://192.168.10.11:7777/ws'
java -jar arthas-boot.jar --tunnel-server 'ws://192.168.10.11:7777/ws' --agent-id bvDOe8XbTM2pQWjF4cfw
java -jar arthas-boot.jar --stat-url 'http://192.168.10.11:8080/api/stat'
java -jar arthas-boot.jar -c 'sysprop; thread' <pid>
java -jar arthas-boot.jar -f batch.as <pid>
java -jar arthas-boot.jar --use-version 3.1.4
java -jar arthas-boot.jar --versions
java -jar arthas-boot.jar --session-timeout 3600
java -jar arthas-boot.jar --attach-only
java -jar arthas-boot.jar --repo-mirror aliyun --use-http

3.3 Web Console

Arthas also provides a Web Console. It starts automatically once Arthas has attached to the target process and can be accessed at http://127.0.0.1:8563/. The page works in exactly the same way as the command-line console.

3.4 Common Commands

Here are some common Arthas commands that you may not know how to use.

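  • dashboard: real-time data panel for the current system (threads, memory, GC, runtime environment)
  • thread: view thread stacks and per-thread CPU usage
  • watch: observe a method's input parameters, return value, and thrown exceptions
  • trace: trace the internal call path of a method and output the time spent at each node
  • stack: output the call path that leads to the current method
  • tt: record the input parameters and return information of each invocation of a method, for later inspection
  • monitor: monitor method execution statistics
  • jvm: view information about the current JVM
  • vmoption: view and update JVM diagnostic options
  • sc: view information about classes loaded by the JVM
  • sm: view method information of loaded classes
  • jad: decompile a loaded class back to source code
  • classloader: view the classloader hierarchy, URLs, and class-loading information
  • heapdump: dump the Java heap, similar to the heap dump function of jmap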

3.5 Quit

When you exit with shutdown (named stop in newer Arthas releases), Arthas automatically resets all enhanced classes. The quit and exit commands only leave the current client session.

4. Arthas common operations

Now that you've seen what Arthas is and how to start it, the following sections show how to use it in different situations. If you run into a problem with a command, run it with -h to view its help information.

First, write a test class that simulates various scenarios and run it, then use Arthas to locate the problems.

import java.util.HashSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import lombok.extern.slf4j.Slf4j;

/**
 * Arthas demo.
 *
 * @author niujinpeng
 */
@Slf4j
public class Arthas {

    private static HashSet<String> hashSet = new HashSet<>();
    /** Thread pool of size 1 */
    private static ExecutorService executorService = Executors.newFixedThreadPool(1);

    public static void main(String[] args) {
        // Simulate high CPU usage
        cpu();
        // Simulate thread blocking
        thread();
        // Simulate a thread deadlock
        deadThread();
        // Continuously add data to the hashSet
        addHashSetThread();
    }

    /** Continuously adds data to the hashSet. */
    public static void addHashSetThread() {
        new Thread(() -> {
            int count = 0;
            while (true) {
                try {
                    hashSet.add("count" + count);
                    Thread.sleep(10000);
                    count++;
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }).start();
    }

    public static void cpu() {
        cpuHigh();
        cpuNormal();
    }

    /** A thread that consumes an extreme amount of CPU. */
    private static void cpuHigh() {
        Thread thread = new Thread(() -> {
            while (true) {
                log.info("cpu start 100");
            }
        });
        // Submit to the thread pool
        executorService.submit(thread);
    }

    /** Threads that consume a normal amount of CPU. */
    private static void cpuNormal() {
        for (int i = 0; i < 10; i++) {
            new Thread(() -> {
                while (true) {
                    log.info("cpu start");
                    try {
                        Thread.sleep(3000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            }).start();
        }
    }

    /** Simulates thread blocking by submitting a task to a thread pool that is already full. */
    private static void thread() {
        Thread thread = new Thread(() -> {
            while (true) {
                log.debug("thread start");
                try {
                    Thread.sleep(3000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        });
        // Submit to the thread pool
        executorService.submit(thread);
    }

    /** Simulates a deadlock. */
    private static void deadThread() {
        // Create the resources
        Object resourceA = new Object();
        Object resourceB = new Object();
        // Create the threads
        Thread threadA = new Thread(() -> {
            synchronized (resourceA) {
                log.info(Thread.currentThread() + " get ResourceA");
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                log.info(Thread.currentThread() + "waiting get resourceB");
                synchronized (resourceB) {
                    log.info(Thread.currentThread() + " get resourceB");
                }
            }
        });
        Thread threadB = new Thread(() -> {
            synchronized (resourceB) {
                log.info(Thread.currentThread() + " get ResourceB");
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                log.info(Thread.currentThread() + "waiting get resourceA");
                synchronized (resourceA) {
                    log.info(Thread.currentThread() + " get resourceA");
                }
            }
        });
        threadA.start();
        threadB.start();
    }
}

4.1 Global Monitoring

The dashboard command gives you an overview of your application's thread, memory, GC, and runtime environment information.
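
To get a feel for the kind of data behind such an overview, here is a rough sketch that reads similar figures from the JDK's own java.lang.management beans. It is only an illustration under that assumption, not how Arthas implements dashboard, and the class name MiniDashboard is made up.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class MiniDashboard {
    /** Builds a one-shot text snapshot of threads, heap, GC, and runtime info. */
    static String snapshot() {
        StringBuilder sb = new StringBuilder();
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        sb.append("threads: ").append(threads).append('\n');

        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        sb.append("heap used: ").append(heap.getUsed() / (1024 * 1024)).append(" MB\n");

        // One line per garbage collector: how often it ran and for how long in total
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append("gc ").append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(", time=").append(gc.getCollectionTime()).append(" ms\n");
        }
        sb.append("java.version: ").append(System.getProperty("java.version")).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

Unlike this snapshot, dashboard refreshes continuously and works on an already running process without any code changes.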

4.2 Why did the CPU take off

The code example above contains a busy infinite loop that burns CPU. How do you find it?

Run the thread command to list the CPU usage of every thread. Here you can see that thread 12 is using 100% of the CPU.

Running thread 12 then shows the method and line that are consuming the CPU (the line number may differ from the code above, which was reformatted when I wrote this article).

That is the general approach: look at the overall thread information first, then drill down into a specific thread. If you only want the threads with the highest CPU usage, you can run thread -n [number of threads to display] directly.

Locating the method with the highest CPU usage.
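
For comparison, the JDK exposes per-thread CPU time through ThreadMXBean. The sketch below ranks the threads of the current JVM by accumulated CPU time. It is an illustration only: the class name TopCpuThreads is made up, and it reports total CPU time since thread start rather than the usage percentage that the thread command displays.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class TopCpuThreads {
    /** Returns up to n lines "threadName: N ms", busiest thread first. */
    static List<String> top(int n) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> result = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return result; // this JVM cannot measure per-thread CPU time
        }
        List<long[]> rows = new ArrayList<>(); // each row: {threadId, cpuNanos}
        for (long id : mx.getAllThreadIds()) {
            long cpuNanos = mx.getThreadCpuTime(id); // -1 if the thread has died
            if (cpuNanos >= 0) {
                rows.add(new long[] {id, cpuNanos});
            }
        }
        rows.sort(Comparator.comparingLong((long[] r) -> r[1]).reversed());
        for (long[] row : rows.subList(0, Math.min(n, rows.size()))) {
            ThreadInfo info = mx.getThreadInfo(row[0]);
            if (info != null) {
                result.add(info.getThreadName() + ": " + row[1] / 1_000_000 + " ms");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        top(3).forEach(System.out::println);
    }
}
```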

4.3 Thread Pool Thread status

Before addressing threading issues, let’s review several common states of threads:

  • RUNNABLE: the thread is running.
  • TIMED_WAITING: a thread enters this state when it calls one of the following methods:
  1. Thread#sleep()
  2. Object#wait() with a timeout argument
  3. Thread#join() with a timeout argument
  4. LockSupport#parkNanos()
  5. LockSupport#parkUntil()
  • WAITING: a thread enters this state when it calls one of the following methods:
  1. Object#wait() without a timeout argument
  2. Thread#join() without a timeout argument
  3. LockSupport#park()
  • BLOCKED: the thread is blocked, waiting for a lock.
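
The states listed above can be reproduced in a few lines. The following is a small sketch (class and method names are made up) that starts one thread per state and polls Thread#getState() until the expected state shows up.

```java
class ThreadStatesDemo {
    /** Polls until the thread reaches the wanted state or the timeout expires. */
    static boolean awaitState(Thread t, Thread.State wanted, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (t.getState() != wanted && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return t.getState() == wanted;
    }

    /** Starts a daemon thread so the demo never keeps the JVM alive. */
    static Thread startDaemon(Runnable body) {
        Thread t = new Thread(body);
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Returns whether TIMED_WAITING, WAITING, and BLOCKED were observed, in that order. */
    static boolean[] demo() {
        Object monitor = new Object();
        Object lock = new Object();

        // Thread#sleep() puts the thread into TIMED_WAITING
        Thread sleeper = startDaemon(() -> {
            try { Thread.sleep(60_000); } catch (InterruptedException ignored) { }
        });
        // Object#wait() without a timeout puts the thread into WAITING
        Thread waiter = startDaemon(() -> {
            synchronized (monitor) {
                try {
                    while (true) { monitor.wait(); } // loop guards against spurious wakeups
                } catch (InterruptedException ignored) { }
            }
        });
        synchronized (lock) { // hold the lock so the next thread cannot enter it
            Thread blocked = startDaemon(() -> { synchronized (lock) { } });
            return new boolean[] {
                awaitState(sleeper, Thread.State.TIMED_WAITING, 2000),
                awaitState(waiter, Thread.State.WAITING, 2000),
                awaitState(blocked, Thread.State.BLOCKED, 2000)
            };
        }
    }

    public static void main(String[] args) {
        boolean[] seen = demo();
        System.out.println("TIMED_WAITING=" + seen[0] + " WAITING=" + seen[1] + " BLOCKED=" + seen[2]);
    }
}
```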

In the simulation above, we define a thread pool of size 1, submit one task in cpuHigh(), and then submit another in thread(). Because the pool is already full, the second task has to wait.

Use thread | grep pool to view the thread pool's thread information.

You can see that the thread pool has a WAITING thread.
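
What happens to the second task can be checked directly on the executor. The sketch below is an illustration with made-up names: it builds a pool with the same settings that Executors.newFixedThreadPool(1) uses, submits two long-running tasks, and reports how many are running and how many are queued.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class SingleThreadPoolDemo {
    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns {activeCount, queuedTasks} after submitting two long tasks to a pool of size 1. */
    static int[] run() {
        // Same settings as Executors.newFixedThreadPool(1)
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch firstStarted = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            pool.submit(() -> {
                firstStarted.countDown();
                awaitQuietly(release); // keeps the only worker busy
            });
            pool.submit(() -> awaitQuietly(release)); // cannot start, so it is queued
            awaitQuietly(firstStarted);
            return new int[] {pool.getActiveCount(), pool.getQueue().size()};
        } finally {
            release.countDown(); // let both tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("active=" + r[0] + ", queued=" + r[1]);
    }
}
```

In this sketch the second task sits in the executor's queue while the single worker is occupied, which matches the "pool is full" situation described above.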

4.4 Thread Deadlocks

The deadThread method implements a deadlock in the simulation code above. Use thread -b to view the information directly to locate the deadlock.

/** Simulates a deadlock. */
private static void deadThread() {
    // Create the resources
    Object resourceA = new Object();
    Object resourceB = new Object();
    // Create the threads
    Thread threadA = new Thread(() -> {
        synchronized (resourceA) {
            log.info(Thread.currentThread() + " get ResourceA");
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            log.info(Thread.currentThread() + "waiting get resourceB");
            synchronized (resourceB) {
                log.info(Thread.currentThread() + " get resourceB");
            }
        }
    });
    Thread threadB = new Thread(() -> {
        synchronized (resourceB) {
            log.info(Thread.currentThread() + " get ResourceB");
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            log.info(Thread.currentThread() + "waiting get resourceA");
            synchronized (resourceA) {
                log.info(Thread.currentThread() + " get resourceA");
            }
        }
    });
    threadA.start();
    threadB.start();
}

The detected deadlock information.

Note that thread -b currently only finds threads blocked by the synchronized keyword. Locks from java.util.concurrent (java.util.concurrent.locks.Lock) are not supported.
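
Inside your own process, the JDK can report deadlocks as well. The sketch below uses ThreadMXBean for this; it is a plain JDK illustration with made-up class and method names, not the mechanism behind thread -b. findDeadlockedThreads() also covers ownable synchronizers such as ReentrantLock, while findMonitorDeadlockedThreads() only covers synchronized monitors.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockDetector {
    /** Starts two daemon threads that take two monitors in opposite order. */
    static void createDeadlock() {
        Object resourceA = new Object();
        Object resourceB = new Object();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        startDaemon(resourceA, resourceB, bothHoldFirstLock);
        startDaemon(resourceB, resourceA, bothHoldFirstLock);
    }

    private static void startDaemon(Object first, Object second, CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await(); // wait until the other thread holds its first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // never reached: the other thread holds this monitor
                }
            }
        });
        t.setDaemon(true);
        t.start();
    }

    /** Polls until a deadlock is reported or the timeout expires; returns the number of deadlocked threads. */
    static int detect(long timeoutMs) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMs;
        do {
            long[] ids = mx.isSynchronizerUsageSupported()
                    ? mx.findDeadlockedThreads()          // monitors and ownable synchronizers
                    : mx.findMonitorDeadlockedThreads();  // monitors only
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        } while (System.currentTimeMillis() < deadline);
        return 0;
    }

    public static void main(String[] args) {
        createDeadlock();
        System.out.println("deadlocked threads: " + detect(3000));
    }
}
```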

4.5 Decompiling

The code above is placed under the package com. Assume this is a production environment: when you suspect that the code currently running is not the code you expect, you can decompile it directly, or selectively view the field or method information of the class.

If you suspect that the running code is not your own, decompile the class directly with jad.

# Decompile a class of your own
[arthas@20451]$ jad com.fridge.service.impl.CardActivateAppServiceImpl

ClassLoader:
+-sun.misc.Launcher$AppClassLoader@18b4aac2
  +-sun.misc.Launcher$ExtClassLoader@4278a03f

Location:
/Users/admin/project/uhome/Doc_01_Uhome/mythicalanimals/target/classes/

/*
 * Decompiled with CFR.
 *
 * Could not load the following classes:
 *  com.fridge.common.base.enumeration.HttpCodeEnum
 *  com.fridge.common.base.exception.CommonException
 *  com.fridge.common.bus.inputVO.CardActivateAppInputVO
 *  com.fridge.common.bus.inputVO.CardActivateEditStatusInputVO
 *  com.fridge.common.bus.inputVO.CardActivateQueryInputVO
 *  com.fridge.common.bus.outVO.CardActivateAppOutputVO
 *  com.fridge.common.bus.outVO.CardActivateQueryOutPutVO
 *  com.fridge.dao.sl.CardAfterRecordMapper
 *  com.fridge.dao.sl.CardConsumeDetailMapper
 *  com.fridge.example.CardAfterRecordExample
 *  com.fridge.example.CardAfterRecordExample$Criteria
 *  com.fridge.model.CardActivate
 *  com.fridge.model.CardDevice
 *  com.fridge.model.CardRules
 *  com.fridge.model.CardStore
 *  com.fridge.model.CardUser
 *  com.fridge.repository.CardActivateAppRepository
 *  com.fridge.repository.CardUserRepository
 *  com.fridge.service.CardActivateAppService
 *  com.fridge.util.BeanUtils
 */
package com.fridge.service.impl;

import cn.hutool.core.collection.CollUtil;
import cn.hutool.core.date.DateTime;
import cn.hutool.core.date.DateUtil;
import cn.hutool.core.map.MapUtil;
import cn.hutool.core.util.NumberUtil;
import cn.hutool.core.util.ReUtil;
import cn.hutool.core.util.StrUtil;
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.Calendar;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class CardActivateAppServiceImpl
implements CardActivateAppService {
    private static final Logger log = LoggerFactory.getLogger(CardActivateAppServiceImpl.class);
    @Autowired
    private CardActivateAppRepository cardActivateAppRepository;
    @Autowired
    private CardUserRepository cardUserRepository;
    @Autowired
    private RedisTemplate redisTemplate;
    @Autowired
    private CardAfterRecordMapper cardAfterRecordMapper;
    @Autowired
    private CardConsumeDetailMapper cardConsumeDetailMapper;

    public CardActivateAppOutputVO queryCardActivateStatus(CardActivateAppInputVO inputVO) {
        this.checkParams(inputVO);
        CardActivateAppOutputVO outputVO = new CardActivateAppOutputVO();
        String deviceMac = inputVO.getDeviceMac();
        String userId = inputVO.getUserId();
        List devices = this.cardActivateAppRepository.queryCardDeviceByMac(deviceMac);
        List cardActivates = this.cardActivateAppRepository.queryCardActivateByUserId(userId);
        String[] arr = new String[]{"0", "1", "2"};
        List cardActivatesMac = this.cardActivateAppRepository.queryCardActivateByMac(deviceMac, Arrays.asList(arr));
        if (CollUtil.isEmpty(cardActivatesMac)) {
            if (CollUtil.isEmpty(devices)) {
                if (CollUtil.isEmpty(cardActivates)) {
                    this.getCardActivateOutputVOInvalid(outputVO);
                    return outputVO;
                }
                outputVO.setStatus(Integer.valueOf(2));
                return outputVO;
            }
            outputVO = this.getCardActivateOutputVOvalid(devices, userId, deviceMac);
            return outputVO;
        }
        ...

The jad command also provides some additional parameters:

# Decompile a single method of a class
jad --source-only com.Arthas mysql

4.6 Viewing Field Information

Run the sc -d -f command to view the field information of a class.

[arthas@20252]$ sc -d -f com.Arthas
 class-info        com.Arthas
 code-source       /C:/Users/Niu/Desktop/arthas/target/classes/
 name              com.Arthas
 isInterface       false
 isAnnotation      false
 isEnum            false
 isAnonymousClass  false
 isArray           false
 isLocalClass      false
 isMemberClass     false
 isPrimitive       false
 isSynthetic       false
 simple-name       Arthas
 modifier          public
 annotation
 interfaces
 super-class       +-java.lang.Object
 class-loader      +-sun.misc.Launcher$AppClassLoader@18b4aac2
                     +-sun.misc.Launcher$ExtClassLoader@2ef1e4fa
 classLoaderHash   18b4aac2
 fields            modifierfinal,private,static
                   type    org.slf4j.Logger
                   name    log
                   value   Logger[com.Arthas]

                   modifierprivate,static
                   type    java.util.HashSet
                   name    hashSet
                   value   [count1, count2]

                   modifierprivate,static
                   type    java.util.concurrent.ExecutorService
                   name    executorService
                   value   java.util.concurrent.ThreadPoolExecutor@71c03156[Running, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 0]

Affect(row-cnt:1) cost in 9 ms.

# Explanation of the fields
 class-info        com.fridge.service.impl.CardActivateAppServiceImpl
 code-source       /Users/admin/project/uhome/Doc_01_Uhome/mythicalanimals/target/classes/
 name              com.fridge.service.impl.CardActivateAppServiceImpl
 # Whether it is an interface
 isInterface       false
 # Whether it is an annotation
 isAnnotation      false
 # Whether it is an enum
 isEnum            false
 # Whether it is an anonymous class
 isAnonymousClass  false
 # Whether it is an array
 isArray           false
 isLocalClass      false
 isMemberClass     false
 isPrimitive       false
 isSynthetic       false
 # Simple class name
 simple-name       CardActivateAppServiceImpl
 # Modifier
 modifier          public
 # Annotations used
 annotation        org.springframework.stereotype.Service
 # Interfaces implemented
 interfaces        com.fridge.service.CardActivateAppService
 # Parent class
 super-class       +-java.lang.Object
 # Class loader
 class-loader      +-sun.misc.Launcher$AppClassLoader@18b4aac2
                     +-sun.misc.Launcher$ExtClassLoader@4278a03f

4.7 Viewing Method Information

Use the sm command to view the method information of the class.

[arthas@22180]$ sm com.Arthas
com.Arthas <init>()V
com.Arthas start()V
com.Arthas thread()V
com.Arthas deadThread()V
com.Arthas lambda$cpuHigh$1()V
com.Arthas cpuHigh()V
com.Arthas lambda$thread$3()V
com.Arthas addHashSetThread()V
com.Arthas cpuNormal()V
com.Arthas cpu()V
com.Arthas lambda$addHashSetThread$0()V
com.Arthas lambda$deadThread$4(Ljava/lang/Object;Ljava/lang/Object;)V
com.Arthas lambda$deadThread$5(Ljava/lang/Object;Ljava/lang/Object;)V
com.Arthas lambda$cpuNormal$2()V
Affect(row-cnt:16) cost in 6 ms.
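
A similar listing can be produced from inside a program with reflection. The sketch below is only an illustration (MethodLister and Sample are made-up names); it also flags synthetic methods, which is where the lambda$... entries in the output above come from when the compiler turns lambda bodies into methods.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class MethodLister {
    /** Hypothetical sample class with one lambda and one plain method. */
    static class Sample {
        void greet(String name) {
            Runnable r = () -> System.out.println("hi " + name);
            r.run();
        }

        int add(int a, int b) {
            return a + b;
        }
    }

    /** Lists the methods declared directly on the class, marking compiler-generated ones. */
    static List<String> declaredMethods(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            names.add(m.getName() + (m.isSynthetic() ? " (synthetic)" : ""));
        }
        return names;
    }

    public static void main(String[] args) {
        declaredMethods(Sample.class).forEach(System.out::println);
    }
}
```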

4.8 Very curious about the value of variables

With the ognl command, you can use OGNL expressions to read and manipulate the information you want.

Using the same code as in the example above, we look at the data in the variable hashSet:

View static variable hashSet information.

[arthas@19856]$ ognl '@com.Arthas@hashSet'
@HashSet[
    @String[count1],
    @String[count2],
    @String[count29],
    @String[count28],
    @String[count0],
    @String[count27],
    @String[count5],
    @String[count26],
    @String[count6],
    @String[count25],
    @String[count3],
    @String[count24],

Check the static variable hashSet size.

[arthas@19856]$ ognl '@com.Arthas@hashSet.size()'
	@Integer[57]

It can even be manipulated.

[arthas@19856]$ ognl '@com.Arthas@hashSet.add("test")'
	@Boolean[true]
[arthas@19856]$
# Check the added string
[arthas@19856]$ ognl '@com.Arthas@hashSet' | grep test
    @String[test],
[arthas@19856]$

OGNL can do much more. See the special usages of OGNL expressions (https://github.com/alibaba/arthas/issues/71).
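
Reading and changing a private static field, as the ognl commands above do, corresponds roughly to the following reflection code. This is a sketch for comparison only: Holder is a hypothetical stand-in for com.Arthas, and the code runs inside the same JVM, whereas ognl does this from the outside on a live process.

```java
import java.lang.reflect.Field;
import java.util.HashSet;
import java.util.Set;

class StaticFieldPeek {
    /** Hypothetical stand-in for com.Arthas with a private static set. */
    static class Holder {
        private static final Set<String> hashSet = new HashSet<>();
    }

    /** Reads a static field by name, even if it is private. */
    static Object readStatic(Class<?> type, String fieldName) {
        try {
            Field field = type.getDeclaredField(fieldName);
            field.setAccessible(true);
            return field.get(null); // null receiver because the field is static
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot read " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        @SuppressWarnings("unchecked")
        Set<String> set = (Set<String>) readStatic(Holder.class, "hashSet");
        set.add("test"); // same effect as calling add("test") on the field via ognl
        System.out.println(readStatic(Holder.class, "hashSet"));
    }
}
```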

4.9 Is there any problem with the program

4.9.1 Running Is Slow and Time-consuming

You can use the trace command to trace a method's internal calls and measure the time each one takes.

This time I switch to different simulation code: a Spring Boot controller whose getUser method calls userService.get(uid). That method in turn calls check, service, redis, and mysql.

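import java.util.HashMap;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@Slf4j
public class UserController {

    @Autowired
    private UserServiceImpl userService;

    @GetMapping(value = "/user")
    public HashMap<String, Object> getUser(Integer uid) throws Exception {
        // Simulate a user query
        userService.get(uid);
        HashMap<String, Object> hashMap = new HashMap<>();
        hashMap.put("uid", uid);
        hashMap.put("name", "name" + uid);
        return hashMap;
    }
}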

Mock code Service:

import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;

@Service
@Slf4j
public class UserServiceImpl {

    public void get(Integer uid) throws Exception {
        check(uid);
        service(uid);
        redis(uid);
        mysql(uid);
    }

    public void service(Integer uid) throws Exception {
        int count = 0;
        for (int i = 0; i < 10; i++) {
            count += i;
        }
        log.info("service end {}", count);
    }

    public void redis(Integer uid) throws Exception {
        int count = 0;
        for (int i = 0; i < 10000; i++) {
            count += i;
        }
        log.info("redis end {}", count);
    }

    public void mysql(Integer uid) throws Exception {
        long count = 0;
        for (int i = 0; i < 10000000; i++) {
            count += i;
        }
        log.info("mysql end {}", count);
    }

    public boolean check(Integer uid) throws Exception {
        // Reject a missing or negative uid
        if (uid == null || uid < 0) {
            log.error("uid is not correct, uid: {}", uid);
            throw new Exception("uid is not correct");
        }
        return true;
    }
}

After starting the Spring Boot application, use the trace command to start measuring the elapsed time.

[arthas@6592]$ trace com.UserController getUser

When you access the /user endpoint (handled by getUser), the timing information is printed. You can see that com.UserServiceImpl:get() accounts for most of the time.

Continue by tracing the time-consuming method, then access the endpoint again.

[arthas@6592]$ trace com.UserServiceImpl get

It is clear that the mysql method of com.UserServiceImpl is the most time-consuming.

Affect(class-cnt:1 , method-cnt:1) cost in 31 ms.
`---ts=2019-10-16 14:40:10;thread_name=http-nio-8080-exec-8;id=1f;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@23a918c7
    `---[6.792201ms] com.UserServiceImpl:get()
        +---[0.008ms] com.UserServiceImpl:check() #17
        +---[0.076ms] com.UserServiceImpl:service() #18
        +---[0.1089ms] com.UserServiceImpl:redis() #19
        `---[6.528899ms] com.UserServiceImpl:mysql() #20
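
Without trace, you would have to add timing code by hand and redeploy to get the same breakdown. The sketch below shows that manual approach for comparison; ManualTrace and its helpers are made-up names, and the busy loops only mimic the loop sizes of the mock service.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ManualTrace {
    /** Written by the loops so the JIT cannot optimize them away entirely. */
    static volatile long sink;

    /** Runs one step and returns its elapsed time in nanoseconds. */
    static long timeNanos(Runnable step) {
        long start = System.nanoTime();
        step.run();
        return System.nanoTime() - start;
    }

    /** Stand-in for the mock methods: a loop with the given number of iterations. */
    static void busyLoop(int iterations) {
        long count = 0;
        for (int i = 0; i < iterations; i++) {
            count += i;
        }
        sink = count;
    }

    /** Times each step of a get() call, in call order. */
    static Map<String, Long> traceGet() {
        Map<String, Long> costs = new LinkedHashMap<>();
        costs.put("check", timeNanos(() -> busyLoop(1)));
        costs.put("service", timeNanos(() -> busyLoop(10)));
        costs.put("redis", timeNanos(() -> busyLoop(10_000)));
        costs.put("mysql", timeNanos(() -> busyLoop(10_000_000)));
        return costs;
    }

    public static void main(String[] args) {
        traceGet().forEach((step, nanos) ->
                System.out.printf("[%.4fms] %s()%n", nanos / 1_000_000.0, step));
    }
}
```

The point of trace is that it produces this kind of breakdown on a running process, with no code change and no restart.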

4.9.2 Statistical Method Time-consuming

Use the monitor command to collect execution statistics for a method.

The following collects statistics for the get method of the com.UserServiceImpl class every 5 seconds.

monitor -c 5 com.UserServiceImpl get
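
For each period, monitor reports the total number of calls, successes, failures, average response time, and failure rate. The sketch below shows how such figures can be derived from individual call records; it is a simplified illustration with a made-up MethodStats class, not Arthas code.

```java
class MethodStats {
    private long total;
    private long failed;
    private long totalNanos;

    /** Records one invocation: how long it took and whether it threw. */
    synchronized void record(long elapsedNanos, boolean threw) {
        total++;
        totalNanos += elapsedNanos;
        if (threw) {
            failed++;
        }
    }

    synchronized long total() { return total; }

    synchronized long success() { return total - failed; }

    synchronized long fail() { return failed; }

    /** Average response time in milliseconds, 0 when nothing was recorded. */
    synchronized double avgRtMillis() {
        return total == 0 ? 0.0 : totalNanos / 1_000_000.0 / total;
    }

    /** Fraction of invocations that threw, between 0 and 1. */
    synchronized double failRate() {
        return total == 0 ? 0.0 : (double) failed / total;
    }

    public static void main(String[] args) {
        MethodStats stats = new MethodStats();
        stats.record(2_000_000, false); // a 2 ms successful call
        stats.record(4_000_000, true);  // a 4 ms call that threw
        System.out.printf("total=%d success=%d fail=%d avg-rt=%.2fms fail-rate=%.0f%%%n",
                stats.total(), stats.success(), stats.fail(),
                stats.avgRtMillis(), stats.failRate() * 100);
    }
}
```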

4.10 Want to observe method information

The following example uses the first two mock codes from this article.

4.10.1 Observing method input and output parameter information

Use the watch command to view input and output parameters and exceptions.

USAGE:
   watch [-b] [-e] [-x <value>] [-f] [-h] [-n <value>] [-E] [-M <value>] [-s] class-pattern method-pattern express [condition-express]

 SUMMARY:
   Display the input/output parameter, return object, and thrown exception of specified method invocation
   The express may be one of the following expression (evaluated dynamically):
           target : the object
            clazz : the object's class
           method : the constructor or method
           params : the parameters array of method
     params[0..n] : the element of parameters array
        returnObj : the returned object of method
         throwExp : the throw exception of method
         isReturn : the method ended by return
          isThrow : the method ended by throwing exception
            #cost : the execution time in ms of method invocation
 Examples:
   watch -b org.apache.commons.lang.StringUtils isBlank params
   watch -f org.apache.commons.lang.StringUtils isBlank returnObj
   watch org.apache.commons.lang.StringUtils isBlank '{params, target, returnObj}' -x 2
   watch -bf *StringUtils isBlank params
   watch *StringUtils isBlank params[0]
   watch *StringUtils isBlank params[0] params[0].length==1
   watch *StringUtils isBlank params '#cost>100'
   watch -E -b org\.apache\.commons\.lang\.StringUtils isBlank params[0]

 WIKI:
   https://alibaba.github.io/arthas/watch

Common operations:

# View the input parameter and the return value
$ watch com.Arthas addHashSet '{params[0],returnObj}'
# View the input parameter and the size of the return value
$ watch com.Arthas addHashSet '{params[0],returnObj.size}'
# Check whether the return value contains 'count10'
$ watch com.Arthas addHashSet '{params[0],returnObj.contains("count10")}'
# View the input parameter and the return value as a string
$ watch com.Arthas addHashSet '{params[0],returnObj.toString()}'

Check the input and output parameters.

View the returned exception information.

4.10.2 Observing the invocation path of a method

Use the stack command to view the method call information.

stack com.UserServiceImpl mysql
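
Inside your own code, the usual way to see how a method was reached is to capture a stack trace at that point. The sketch below shows this with made-up names (CallPath, and get/mysql methods that mirror the mock service); stack gives you the same kind of call path without touching the code.

```java
import java.util.ArrayList;
import java.util.List;

class CallPath {
    /** Captures the current call path, innermost frame first. */
    static List<String> capture() {
        List<String> frames = new ArrayList<>();
        for (StackTraceElement e : new Throwable().getStackTrace()) {
            frames.add(e.getClassName() + "." + e.getMethodName() + ":" + e.getLineNumber());
        }
        return frames;
    }

    // Two nested calls that mirror get() -> mysql() in the mock service
    static List<String> get() {
        return mysql();
    }

    static List<String> mysql() {
        return capture();
    }

    public static void main(String[] args) {
        get().forEach(System.out::println);
    }
}
```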

4.10.3 Method Call space-time tunnel

Use the tt command to log the details of the method execution.

The tt command is a "time tunnel" for method execution data. It records the input parameters and return information of every invocation of the specified method, so you can inspect calls made at different times.

Common operations:

Start recording method call information: tt -t com.UserServiceImpl check

You can see that the IS-EXP value of the record with INDEX=1001 is true, indicating that the call threw an exception.

View the recorded method call information: tt -l

View details of a call record (-i specifies the INDEX): tt -i 1001

You can see the exception information recorded with INDEX=1001.

Replay a call: specify the record with -i and invoke it again with -p.

tt -i 1001 -p
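
The record-and-replay idea behind tt can be sketched in plain Java as follows. This is only a conceptual illustration with made-up names (InvocationRecorder, Entry); the starting index of 1000 is an arbitrary choice here, and the real command works on live methods through bytecode enhancement rather than a wrapper.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

class InvocationRecorder<A, R> {
    /** One recorded call: its argument and either a result or the thrown exception. */
    static final class Entry<A, R> {
        final A arg;
        final R result;
        final RuntimeException error;

        Entry(A arg, R result, RuntimeException error) {
            this.arg = arg;
            this.result = result;
            this.error = error;
        }

        boolean isException() {
            return error != null;
        }
    }

    private final Function<A, R> target;
    private final Map<Integer, Entry<A, R>> entries = new LinkedHashMap<>();
    private int nextIndex = 1000; // arbitrary starting index for this sketch

    InvocationRecorder(Function<A, R> target) {
        this.target = target;
    }

    /** Calls the target and records the argument together with the outcome. */
    R invoke(A arg) {
        int index = nextIndex++;
        try {
            R result = target.apply(arg);
            entries.put(index, new Entry<>(arg, result, null));
            return result;
        } catch (RuntimeException e) {
            entries.put(index, new Entry<>(arg, null, e));
            throw e;
        }
    }

    Entry<A, R> get(int index) {
        return entries.get(index);
    }

    /** Invokes the target again with the argument stored under the given index. */
    R replay(int index) {
        return target.apply(entries.get(index).arg);
    }
}
```

A recorder wrapped around a uid check would store a failing call under its own index, which can then be inspected or replayed later in the same way as tt -i 1001 and tt -i 1001 -p.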

Reprinted Arthas online monitoring and problem location – Mar 23 – Blog Garden


www.cnblogs.com