Online troubleshooting mainly covers CPU, disk, memory, and network. When something goes wrong, the basic routine is the df / free / top trio, followed by jstack and jmap.
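For reference, a minimal sketch of that basic routine (plain Linux commands, nothing project-specific):

```
df -h      # disk usage per filesystem, human-readable
free -m    # memory and swap usage in MB
top        # per-process CPU/memory usage; press 1 for a per-core CPU view
```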
1. jstack
jstack is typically used to diagnose infinite loops in business logic, frequent GC, and excessive context switching.
(1) CPU usage is too high
- 1. Run `top` to view the resource usage of each process. By default, processes are sorted by CPU usage.
- 2. Run `top -H -p 2634` to find the thread with the highest CPU usage inside that process.
- 3. Run `printf '%x\n' tid` to convert the ID of the thread with the highest CPU usage to hexadecimal.
- 4. Use jstack to capture the stacks. Either locate the corresponding stack trace directly with `jstack pid | grep 'nid' -C5 --color`, or generate a thread dump file for the process.
- 5. Analyze the dump file
The thread dump generated by the jstack command contains all live threads in the JVM. To analyze a given thread, you must find its call stack.
The thread ID with high CPU usage has already been obtained from top and converted to hexadecimal. In the thread dump, every thread has an nid; find the thread whose nid matches that hexadecimal value (nid=0x246c in this example), look at its call stack (the JstackCase class in this example), and check the corresponding code for problems. A consolidated sketch of steps 1-5 follows below.
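Putting steps 1-5 together, a minimal sketch of the whole sequence (PID 2634 and nid=0x246c are the example values used above; TID 9324 is a made-up illustration that happens to be 0x246c in hex):

```
# 1. Find the Java process with the highest CPU usage (note its PID, e.g. 2634)
top

# 2. Find the busiest thread inside that process (note its TID)
top -H -p 2634

# 3. Convert the TID to hex, because jstack prints it as nid=0x...
printf '%x\n' 9324        # -> 246c

# 4/5. Dump the threads and locate that thread's call stack
jstack 2634 > jstack.log
grep 'nid=0x246c' -A 30 jstack.log    # or: jstack 2634 | grep 'nid=0x246c' -C5 --color
```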
Thread dump analysis: thread states
In a dump, threads are generally in one of the following states:
1. RUNNABLE, the thread is executing
2. BLOCKED, the thread is blocked
3. WAITING, the thread is waiting
4. TIMED_WAITING, the thread is waiting with a timeout
In general we pay the most attention to the WAITING and TIMED_WAITING parts. A BLOCKED state almost certainly indicates a problem, and too many threads in WAITING or TIMED_WAITING is also abnormal.
- Check the distribution of thread states
grep "java.lang.Thread.State" jstack.log | sort | uniq -c | sort -nr
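To go one step further than counting states, a small sketch (assuming the dump was saved as jstack.log, as in step 4 above) that prints the full stacks of the BLOCKED threads:

```
# Treat the dump as blank-line-separated records (one per thread) and
# print every record whose state line says BLOCKED
awk -v RS= '/java.lang.Thread.State: BLOCKED/' jstack.log
```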
(2) Frequent GC
To be continued
2. Troubleshooting network request delay
Troubleshooting network request problems can be divided into three parts:
1. System delay in the service interface itself. Check whether the request timeout is caused by slow response times inside the service.
2. Network delay and packet loss during request transmission.
3. Execution delay in the caller's multi-threaded client code.
2.1 System delay of the service interface
If the CPU load is too high, processing is slow. Check whether CPU usage is too high, as in the jstack section above.
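Besides watching the CPU, it helps to measure the interface's response time from the server itself (or a host next to it) so that client-side network cost is excluded. A sketch using curl's timing variables; the URL is a placeholder:

```
# Where the time goes for a single request:
#   time_connect       - TCP connection established
#   time_starttransfer - first response byte received (roughly server processing)
#   time_total         - whole request finished
curl -o /dev/null -s -w 'connect: %{time_connect}s  ttfb: %{time_starttransfer}s  total: %{time_total}s\n' \
  http://127.0.0.1:8080/your/endpoint
```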
2.2 Network delay and packet loss during request transmission
- Run the ping command to check whether the network delay is caused by the carrier's network.
- If Nginx is used for load balancing, configure the Nginx log format to check whether the delay is introduced by Nginx itself:
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for" '
                'upstream_response_time $upstream_response_time request_time $request_time';
(upstream_response_time: from the time Nginx establishes a connection to the upstream server until it receives the response data and closes that connection. request_time: from the time Nginx receives the first byte of the user's request until the response data has been sent and the connection is closed.)
Adding up the whole process, the stages are:
[1 user request] [2 establish Nginx connection] [3 send response] [4 receive response] [5 close Nginx connection]
So upstream_response_time is 2 + 3 + 4 + 5.
However, the time of [5 close Nginx connection] can be considered close to 0,
so upstream_response_time is effectively 2 + 3 + 4, while request_time is 1 + 2 + 3 + 4.
The difference between the two is the [1 user request] time. If the client's network is poor, or the transferred data itself is large,
and considering that Nginx buffers the request body first for POST requests,
all of that time is added to [1 user request].
This explains why request_time can be larger than upstream_response_time.
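Since the gap between request_time and upstream_response_time is essentially the [1 user request] portion, the access log itself can show which requests spend most of their time on the client/network side. A rough sketch, assuming the log_format above (last field = request_time value, third-from-last = upstream_response_time value) and a log at /var/log/nginx/access.log; real logs need extra handling when $upstream_response_time is "-" or contains several comma-separated values:

```
# Print upstream_response_time, request_time and the request URI ($7 with this
# format) for lines where more than 1 extra second was spent outside the upstream
awk '($NF - $(NF-2)) > 1 { print $(NF-2), $NF, $7 }' /var/log/nginx/access.log
```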