Q: Do you know the Java memory model?
Which version of the JDK are you referring to?
This article is from mashibing.com, which enriches the original text with technical details and illustrations.
When learning something, learn its overall structure first! – Ma Shibing
GC basics
1. What is garbage
Memory allocation in C: malloc / free
C++: new / delete
In C/C++, memory is reclaimed manually
Java: new, and then?
Java reclaims memory automatically, which makes programming simpler and the system less error-prone. Releasing memory manually easily leads to two types of problems:
Forgetting to reclaim (memory leak)
Reclaiming more than once (double free)
Garbage: an object, or a group of objects, that no reference points to (a group whose members only reference each other in a cycle is still garbage)
2. How to locate garbage
-
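Reference counting (ReferenceCount): not used in Java
The count is incremented for each new reference to an object and decremented each time a reference is removed; the object is judged to be garbage when the count reaches 0. Objects that reference each other in a cycle never reach 0, so this approach cannot reclaim them.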
-
Root searching (RootSearching) algorithm: the one currently used
Starting from the roots, references link objects into chains. Whatever is on a chain is not garbage; whatever is not on any chain is garbage.
What can be used as root:
-
Classes loaded and objects created by the bootstrap loader;
-
Objects referenced in JavaStack (objects referenced in stack memory);
-
The object to which the static reference points in the method area;
-
The object to which the constant reference in the method area points;
-
Objects referenced by JNI in Native methods.
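The root-searching idea can be sketched in a few lines of Java. This is only an illustration, not how HotSpot is implemented: the object graph is modeled as a hypothetical map from object names to the names they reference, and `reachable` is a made-up helper. It shows why a cycle that no root reaches is still garbage, which is exactly the case reference counting cannot handle.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ReachabilityDemo {

    /** Returns every object reachable from the roots by following references. */
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> live = new HashSet<>(roots);
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            for (String target : refs.getOrDefault(current, Collections.emptyList())) {
                if (live.add(target)) {   // first visit: mark it and trace it later
                    pending.push(target);
                }
            }
        }
        return live;
    }

    public static void main(String[] args) {
        // Hypothetical graph: "stackVar" is a root; c and d only reference each other.
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("stackVar", Arrays.asList("a"));
        refs.put("a", Arrays.asList("b"));
        refs.put("c", Arrays.asList("d"));
        refs.put("d", Arrays.asList("c"));

        Set<String> live = reachable(refs, Arrays.asList("stackVar"));
        System.out.println("live = " + live);
        System.out.println("c is garbage: " + !live.contains("c"));
    }
}
```

With reference counting, c and d would each keep a count of 1 forever; here they are simply never visited.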
-
3. Common garbage collection algorithms
-
Mark-Sweep: free space ends up non-contiguous (fragmentation) and efficiency is relatively low (two scans); used for old-generation collection
The cleaning effect is shown in the figure:
Disadvantages: the entire heap has to be scanned, so the time overhead is large. If you're interested, check out the PPT I've uploaded, which covers an improved version of mark-sweep.
Reference: blog.csdn.net/m0_37860933…
-
Copying: no fragmentation, but wastes space; used for young-generation collection
In the heap, objects are divided into the young generation (new generation), the old generation and the permanent generation, and the copying algorithm is applied in the young generation. Newly created objects are generally allocated in the Eden area of the young generation. When Eden is nearly full, a minor garbage collection is carried out and the surviving objects are moved to Survivor1 (S1).
When GC occurs again, live objects in S1 are copied to the previously idle S2 and their age is incremented by 1. For each subsequent GC, S1 and S2 alternate as the storage area and the idle area for live objects. Once the age of a live object reaches a certain threshold, it is promoted to the old generation.
Note that JDK8 no longer has a permanent generation; metaspace is used instead. If you are interested, please refer to JDK8 Metaspace tuning
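The alternation of the two survivor spaces and the age counter can be simulated as below. This is a deliberately simplified model, not HotSpot's real policy: the class and its fields are hypothetical, liveness is passed in by the caller instead of being traced, and the threshold of 15 is the age this article cites later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SurvivorAgingDemo {
    static final int MAX_TENURING_THRESHOLD = 15; // age limit assumed for this sketch

    static final class Obj {
        final String name;
        int age;
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();  // survivor space currently holding objects
    List<Obj> to = new ArrayList<>();    // idle survivor space
    final List<Obj> old = new ArrayList<>();

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);                     // new objects start in Eden
        return o;
    }

    /** One young-generation collection; only objects in stillReferenced survive. */
    void minorGc(List<Obj> stillReferenced) {
        copySurvivors(eden, stillReferenced);
        copySurvivors(from, stillReferenced);
        eden.clear();
        from.clear();                    // everything not copied is reclaimed at once
        List<Obj> tmp = from;            // the two survivor spaces swap roles
        from = to;
        to = tmp;
    }

    private void copySurvivors(List<Obj> space, List<Obj> stillReferenced) {
        for (Obj o : space) {
            if (!stillReferenced.contains(o)) continue;  // dead: simply not copied
            o.age++;
            if (o.age >= MAX_TENURING_THRESHOLD) old.add(o); else to.add(o);
        }
    }

    public static void main(String[] args) {
        SurvivorAgingDemo heap = new SurvivorAgingDemo();
        Obj longLived = heap.allocate("cache");
        heap.allocate("temp");           // never referenced again, dies in the first GC
        for (int gc = 1; gc <= MAX_TENURING_THRESHOLD; gc++) {
            heap.minorGc(Collections.singletonList(longLived));
        }
        System.out.println("age=" + longLived.age + ", promoted=" + heap.old.contains(longLived));
    }
}
```

The cost of a collection here depends only on how many objects survive, which is why copying suits a generation where most objects die young.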
-
Mark-Compact: no fragmentation, but lower efficiency (two scans, and pointers need to be adjusted); used for old-generation collection
The general idea is similar to mark-sweep
(1) In the marking phase, all reachable objects (directly or indirectly accessible from the root nodes) are marked, just as in mark-sweep;
(2) In the compaction phase, the objects that survived marking are moved to one end of memory, and everything beyond the boundary is then cleared. (This removes memory fragmentation and avoids failing to find contiguous space when allocating large objects)
The cleaning effect is shown in the figure:
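As a rough illustration of the compaction phase, the heap can be modeled as an array of slots with a parallel array of mark bits. Both arrays and the `compact` helper are assumptions made for this sketch; a real collector also has to update every pointer to a moved object, which is omitted here.

```java
import java.util.Arrays;

final class MarkCompactDemo {

    /**
     * Slides every marked slot toward index 0 and clears everything past the boundary.
     * Returns the boundary, i.e. the first free index after compaction.
     */
    static int compact(String[] heap, boolean[] marked) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[free++] = heap[i];  // a real collector would also fix up pointers here
            }
        }
        Arrays.fill(heap, free, heap.length, null);
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"A", null, "B", "C", null, "D"};
        boolean[] marked = {true, false, false, true, false, true}; // B is unreachable
        int boundary = compact(heap, marked);
        System.out.println(Arrays.toString(heap) + ", boundary = " + boundary);
    }
}
```

After the call the live objects A, C and D sit next to each other at the start of the array and all free space is one contiguous block.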
Comparison of several algorithms
- Comparison between the copying algorithm and mark-compact:
When the object survival rate is high, the copying algorithm has to perform many copy operations and becomes inefficient. More importantly, unless you are willing to waste 50% of the space, extra space is needed as an allocation guarantee to cope with the extreme case in which 100% of the objects in the used memory survive. The copying algorithm is therefore used only in the young generation; the old generation generally cannot use it directly and uses mark-compact instead.
- Comparison between mark-compact and mark-sweep:
Based on the characteristics of the old generation, the "mark-compact" algorithm was proposed. Its marking process is the same as in "mark-sweep", but instead of cleaning up reclaimable objects in place, the next step moves all surviving objects to one end and then clears the memory beyond the end boundary. This also eliminates memory fragmentation.
4. Common garbage collectors
-
Garbage collector metric parameters
(A) Throughput
The ratio of CPU time spent running user code to total CPU consumption;
Throughput = run user code time/(run user code time + garbage collection time);
High throughput means less garbage collection time and longer runtime for user code;
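The formula can be approximated from inside a running JVM with the standard `java.lang.management` beans. The sketch below is an assumption-laden estimate rather than a precise measurement: it uses wall-clock uptime instead of CPU time, and for concurrent collectors the reported collection time may include work that did not pause the application. The class and method names are made up for the example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcThroughput {

    /** throughput = user code time / (user code time + GC time) */
    static double throughput(long totalMillis, long gcMillis) {
        if (totalMillis <= 0) return 1.0;
        return (double) (totalMillis - gcMillis) / totalMillis;
    }

    /** Sums the accumulated collection time reported by every collector in this JVM. */
    static long totalGcMillis() {
        long sum = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();  // -1 if the collector does not report it
            if (t > 0) sum += t;
        }
        return sum;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("uptime=%d ms, gc=%d ms, throughput=%.4f%n",
                uptime, gc, throughput(uptime, gc));
    }
}
```

For example, 1 ms of GC in 100 ms of total run time gives a throughput of 0.99.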
(B) Goals of a garbage collector (its concerns)
Pause time
Shorter pauses suit applications that need to interact with the user;
Good response speed improves the user experience;
Throughput
High throughput uses CPU time efficiently and completes computation tasks as soon as possible;
Mainly suitable for background tasks that need little interaction;
Footprint
Minimize the heap memory used while still achieving the first two goals;
Better spatial locality can be obtained;
-
STW
Stop-the-world: the pause during which the business threads are stopped so that the garbage collector can carry out garbage collection
-
For the difference between PS and PN, read on: docs.oracle.com/en/java/jav…
-
Garbage collector versus memory size
Serial: tens of megabytes
PS: hundreds of megabytes to several gigabytes (JDK8)
CMS: about 20G
G1: hundreds of gigabytes
ZGC: 4T to 16T (JDK13)
-
The difference between concurrent and parallel garbage collection
(A) Parallel
Refers to multiple garbage collection threads working in parallel while the user thread is still in a waiting state.
Such as ParNew, Parallel Scavenge, Parallel Old;
(B) Concurrent
Refers to the simultaneous execution of the user thread and the garbage collection thread (but not necessarily in parallel and may be executed alternately);
The user program continues to run, while the garbage collector thread runs on another CPU, making no distinction between the new generation and the old generation.
Such as CMS, G1 (also parallel);
-
The difference between Minor and Full GC
(A) Minor GC
Also known as young-generation GC; refers to garbage collection in the young generation.
Because Java objects are mostly short-lived, Minor GC is frequent and generally fast;
(B) Full GC
Also known as Major GC or old-generation GC; refers to GC occurring in the old generation;
A Full GC is often accompanied by at least one Minor GC (not always: the Parallel Scavenge collector can be configured with a policy that runs a Major GC directly);
Major GC is typically 10 times slower than Minor GC.
The evolution of garbage collector
Serial (tens of megabytes) -> Parallel (a few gigabytes) -> CMS (tens of gigabytes): the start of concurrent collection, based on tri-color marking
To improve efficiency, PS was born. To cooperate with CMS, PN was born. CMS was introduced late in version 1.4. CMS is a landmark GC that opened the era of concurrent collection, but it has many problems, so no JDK version uses CMS by default. Concurrent garbage collection exists because STW pauses had become intolerable.
Serial: serial collection of the young generation
For the young generation;
Uses the copying algorithm;
Single-threaded collection;
While garbage collection runs, all worker threads must be paused until it completes;
That is, "Stop The World";
SerialOld
The old-generation counterpart of Serial
Uses the "mark-sweep-compact" algorithm (mark-sweep plus compaction);
Single-threaded collection;
ParNew: parallel collection of the young generation, usually paired with CMS
The ParNew garbage collector is a multithreaded version of the Serial collector.
The ParNew/Serial Old collector runs as follows:
-
Characteristics
Apart from using multiple threads, its behavior and characteristics are the same as those of the Serial collector: the available control parameters, collection algorithm, Stop The World behavior, memory allocation rules and reclamation strategy are all shared, and the two collectors share a lot of code. In Server mode, ParNew is a very important collector because it is currently the only one besides Serial that works with the CMS collector. In a single-CPU environment, however, it is no better than the Serial collector because of the thread interaction overhead.
-
Set parameters: "-XX:+UseConcMarkSweepGC": once CMS is specified, ParNew is used as the young-generation collector by default. "-XX:+UseParNewGC": explicitly specifies ParNew. "-XX:ParallelGCThreads": specifies the number of garbage collection threads; by default ParNew starts as many collection threads as there are CPUs.
-
Why only ParNew works with the CMS collector: CMS, released in JDK1.5, is HotSpot's first truly concurrent collector, the first to let garbage collection threads work (basically) at the same time as user threads. As an old-generation collector, CMS cannot work with Parallel Scavenge, the young-generation collector that already existed in JDK1.4. Parallel Scavenge (and G1) do not use the traditional GC collector code framework and are implemented independently, while the other collectors share part of the framework code;
Parallel Scavenge: parallel collection of the young generation
The Parallel Scavenge collector is also known as the Throughput Collector because of its focus on throughput.
(A) It shares some characteristics with the ParNew collector
Young-generation collector;
Uses the copying algorithm;
Multithreaded collection;
(B) Its main feature is that its focus differs from that of other collectors
Collectors such as CMS focus on minimizing the pause time of user threads during garbage collection;
The goal of Parallel Scavenge is to achieve a controllable throughput.
ParallelOld
Parallel Scavenge + Parallel Old is the default collector combination for Java 8
For the old generation;
Uses the "mark-compact" algorithm;
Multithreaded collection;
The Parallel Scavenge/Parallel Old collector is illustrated as follows:
Set the parameters
"-XX:+UseParallelOldGC": specifies the ParallelOld collector;
ConcurrentMarkSweep: concurrent collection of the old generation
Garbage collection runs concurrently with the application, which reduces STW time (200ms). CMS has many problems, so no JDK version uses it by default today; it has to be specified manually. Because it is based on mark-sweep, it suffers from fragmentation. Once fragmentation reaches a certain point and the CMS old generation can no longer find room to allocate an object, SerialOld is used to collect the old generation.
Imagine: PS + PO -> more memory, so change the garbage collector -> PN + CMS + SerialOld (STW of a few hours to a few days)
Tens of gigabytes of memory collected by a single thread -> G1 + FGC
Tens of gigabytes -> servers with terabytes of memory: ZGC
Algorithm: tri-color marking + Incremental Update
CMS
The Concurrent Mark Sweep (CMS) collector is also called the Concurrent Low Pause Collector, or low-latency garbage collector;
-
Characteristics
For the old generation;
Based on the "mark-sweep" algorithm (no compaction, so memory fragmentation builds up);
Aims for the shortest possible collection pause;
Concurrent collection, low pause;
Requires more memory (see disadvantages below);
HotSpot's first truly concurrent collector, released in JDK1.5;
For the first time, garbage collection threads work (basically) at the same time as user threads;
Floating garbage cannot be processed in the current cycle, and a "Concurrent Mode Failure" may occur
-
Steps
(A) Initial mark (CMS Initial Mark)
Marks only the objects that GC Roots reference directly;
Very fast;
But requires "Stop The World";
(B) Concurrent mark (CMS Concurrent Mark)
The GC Roots tracing process;
Marks live objects, starting from the set produced by the initial mark;
The application keeps running at the same time;
So there is no guarantee that every live object gets marked;
(C) Remark (CMS Remark)
Corrects the marks of objects whose state changed because the user program kept running during concurrent marking;
Requires "Stop The World"; the pause is slightly longer than the initial mark, but much shorter than the concurrent mark;
Uses multiple threads in parallel to improve efficiency;
(D) Concurrent sweep (CMS Concurrent Sweep)
Reclaims all garbage objects;
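The marking phases above rest on tri-color marking, which this article names as the CMS algorithm. A single-threaded sketch of the coloring rules follows; the object graph is again a hypothetical name-to-references map, and the sketch leaves out the hard part of a real concurrent collector (the write barrier and the Incremental Update that repair the marks when the application changes references mid-mark).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TriColorDemo {
    /** WHITE: not visited yet. GREY: visited, fields not scanned. BLACK: fully scanned. */
    enum Color { WHITE, GREY, BLACK }

    /** Marks the graph from the roots; objects still WHITE afterwards are garbage. */
    static Map<String, Color> mark(Map<String, List<String>> refs, List<String> roots) {
        Map<String, Color> color = new HashMap<>();
        for (String o : refs.keySet()) color.put(o, Color.WHITE);

        Deque<String> grey = new ArrayDeque<>();
        for (String r : roots) {             // corresponds to the initial mark
            color.put(r, Color.GREY);
            grey.push(r);
        }
        while (!grey.isEmpty()) {            // corresponds to the tracing done by concurrent mark
            String o = grey.pop();
            for (String child : refs.getOrDefault(o, Collections.emptyList())) {
                if (color.getOrDefault(child, Color.WHITE) == Color.WHITE) {
                    color.put(child, Color.GREY);
                    grey.push(child);
                }
            }
            color.put(o, Color.BLACK);       // all of its references have been scanned
        }
        return color;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("a"));
        refs.put("a", Arrays.asList("b"));
        refs.put("b", Collections.emptyList());
        refs.put("c", Arrays.asList("b"));   // c points at a live object but nothing points at c
        System.out.println(mark(refs, Arrays.asList("root")));
    }
}
```

The remark phase exists because, while this loop runs concurrently, the application may attach a white object to an already black one; without a correction step that object would be swept while still in use.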
G1(200ms – 10ms)
Algorithm: tri-color marking + SATB
G1 (Garbage-First) is the collector that became commercially available in JDK7-U4.
-
Characteristics
(A) Parallelism and concurrency
Makes full use of the hardware advantages of multi-CPU, multi-core environments;
Can work in parallel to shorten the "Stop The World" pause time;
Can also run garbage collection concurrently with the user program;
(B) Generational collection, covering both the young and the old generation
Manages the entire GC heap (young and old generation) on its own, without being paired with another collector;
Handles objects of different generations in different ways;
Although the generational concept remains, the memory layout of the Java heap is very different;
The whole heap is divided into independent Regions of equal size;
The young and old generations are no longer physically separate; each is a collection of Regions (which need not be contiguous);
(C) Combines several garbage collection algorithms and compacts space, so no fragmentation is produced
Viewed as a whole, it is based on the mark-compact algorithm;
Viewed locally (between two Regions), it is based on the copying algorithm;
This is an implementation similar to the train algorithm;
It does not produce memory fragmentation, which is good for long-running applications;
(D) Predictable pauses: high throughput together with low pauses
Besides pursuing low pauses, G1 can build a predictable pause time model;
You can explicitly specify that within a time slice of M milliseconds, garbage collection takes no more than N milliseconds.
-
Application scenarios
Server-side applications on machines with large memory and multiple processors;
Its main use is to provide a solution for applications that need low GC latency and have a large heap;
For example, with a heap of about 6GB or larger, predictable pause times can be below 0.5 seconds;
Intended to replace the CMS collector introduced in JDK1.5;
G1 may be better than CMS when:
(1) More than 50% of the Java heap is occupied by live data;
(2) The object allocation rate or promotion rate varies greatly;
(3) GC pause time is too long (longer than 0.5 to 1 second).
If there are no problems with the current collector, don’t rush to G1;
If your application is looking for low pauses, try G1;
Whether or not to replace the CMS requires actual scenario testing.
-
Set the parameters
"-XX:+UseG1GC": specifies the G1 collector;
"-XX:InitiatingHeapOccupancyPercent": the concurrent marking phase starts when Java heap occupancy reaches this value; the default is 45.
"-XX:MaxGCPauseMillis": sets the pause time target for G1; the default is 200 ms.
"-XX:G1HeapRegionSize": sets the Region size, ranging from 1MB to 32MB; the goal is to have about 2048 Regions at the minimum Java heap size;
-
Running steps
(A) Initial Marking
Marks only the objects that GC Roots reference directly;
Updates TAMS (Next Top at Mark Start) so that, while the next phase runs concurrently, the user program can create new objects in the correct available Region;
Requires "Stop The World", but is very fast;
(B) Concurrent Marking
The GC Roots tracing process;
Marks live objects, starting from the set produced by the initial marking;
Takes longer, but the application keeps running;
So there is no guarantee that every live object gets marked;
(C) Final Marking
Corrects the marks of objects whose state changed because the user program kept running during concurrent marking;
Changes made to objects during the previous phase are recorded in each thread's Remembered Set Log;
The Remembered Set Logs are merged into the Remembered Set;
Requires "Stop The World"; the pause is slightly longer than the initial marking, but much shorter than the concurrent marking;
Uses multiple threads in parallel to improve efficiency;
(D) Live Data Counting and Evacuation
First, the Regions are sorted by reclamation value and cost;
Then a collection plan is made based on the GC pause time the user expects;
Finally, garbage in some high-value Regions is reclaimed according to the plan;
The "copying" algorithm is used: live objects are copied from one or more Regions to an empty Region on the heap, and memory is compacted and released in the process;
This can be done concurrently, which reduces pause times and increases throughput;
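The "sort by value and cost, then plan within the expected pause" step can be illustrated with a greedy selection. Everything here is hypothetical: the `Region` fields, the cost figures and the greedy rule are assumptions for the sketch, and G1's real prediction model is considerably more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class CollectionSetDemo {

    static final class Region {
        final String name;
        final long garbageBytes;   // space that collecting this Region would free
        final double costMillis;   // predicted time needed to evacuate it
        Region(String name, long garbageBytes, double costMillis) {
            this.name = name;
            this.garbageBytes = garbageBytes;
            this.costMillis = costMillis;
        }
        double efficiency() { return garbageBytes / costMillis; }
    }

    /** Picks Regions in order of freed bytes per millisecond until the pause budget is used up. */
    static List<Region> choose(List<Region> candidates, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(candidates);
        Comparator<Region> byEfficiency = Comparator.comparingDouble(Region::efficiency);
        sorted.sort(byEfficiency.reversed());

        List<Region> chosen = new ArrayList<>();
        double spent = 0;
        for (Region r : sorted) {
            if (spent + r.costMillis <= pauseBudgetMillis) {
                chosen.add(r);
                spent += r.costMillis;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(
                new Region("r1", 900, 3.0),
                new Region("r2", 100, 5.0),
                new Region("r3", 400, 4.0));
        for (Region r : choose(regions, 8.0)) {
            System.out.println(r.name);
        }
    }
}
```

With a budget of 8 ms the sketch selects r1 and r3 and leaves r2, the Region that frees the least memory per millisecond, for a later collection. This is the sense in which the collector is "garbage first".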
ZGC (10ms – 1ms), a rival to C++
Algorithm: ColoredPointers + LoadBarrier
Shenandoah
Algorithm: ColoredPointers + WriteBarrier
Epsilon
5. JVM memory generation model (used by generational garbage collection algorithms)
If someone asks you about the JVM memory management model, just answer: which garbage collector are you referring to?
-
The model used by some of the garbage collectors
All GCs except Epsilon, ZGC and Shenandoah use a logical generational model
G1 is generational logically, but not physically
The others are generational both logically and physically
-
Young generation + old generation + permanent generation (1.7, Perm Generation) / metaspace (1.8, Metaspace)
-
Permanent generation / metaspace: stores Class metadata
-
The permanent generation must be given a size limit; for metaspace a limit can be set or left unset, in which case there is no upper limit (other than physical memory)
-
String constants: 1.7 – permanent generation, 1.8 – heap
-
MethodArea is a logical concept: the permanent generation in 1.7, metaspace in 1.8
-
-
Young generation = Eden + 2 Survivor regions
-
After a YGC most objects are reclaimed; the live ones enter s0
-
On the next YGC, live objects in Eden + s0 -> s1
-
On the YGC after that, Eden + s1 -> s0
-
Old enough -> old generation (age 15; 6 for CMS)
-
Does not fit in the Survivor space -> old generation
-
-
Old generation
-
Die-hard (long-lived) objects
-
When the old generation is full: FGC (Full GC)
-
-
GC Tuning (Generation)
-
Minimize FGC
-
MinorGC = YGC
-
MajorGC = FGC
-
-
Object allocation process diagram
When an object is created with new, escape analysis first checks whether it escapes. If it does not escape and can be replaced by scalars, it is allocated on the stack.
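What "does not escape" means is easiest to see in code. In the sketch below (class and method names are made up), the `Point` created inside the loop is never stored in a field, returned, or passed to another method, so a JIT compiler that performs escape analysis is allowed to replace it with two plain local integers. Whether that actually happens depends on the JVM, the compiler tier and flags such as `-XX:+DoEscapeAnalysis` and `-XX:+EliminateAllocations`, so treat it as a possibility, not a guarantee.

```java
final class EscapeAnalysisDemo {

    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    /** The Point never leaves this method, so it is a candidate for scalar replacement. */
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i);   // not stored, returned, or passed on
            sum += (long) p.x * p.y;
        }
        return sum;
    }

    static Point leaked;

    /** Counter-example: the object is published through a static field, so it escapes. */
    static void escape(int v) {
        leaked = new Point(v, v);        // must be a real heap object
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1_000_000));
    }
}
```

One hedged way to observe the effect is to run the program with `-XX:+PrintGC` once with and once without `-XX:-DoEscapeAnalysis` and compare how many young collections occur.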
-
Dynamic age: (not important) www.jianshu.com/p/989d3b06a…
-
Allocation guarantee: (not important) when the survivor space is not large enough during YGC, the space guarantee sends objects directly into the old generation. Reference: cloud.tencent.com/developer/a…
Common garbage collector combination parameter settings (1.8):
-
-XX:+UseSerialGC = Serial New (DefNew) + Serial Old
- For small programs. This is not the default option; HotSpot automatically selects the collector based on calculation, configuration and JDK version
-
-XX:+UseParNewGC = ParNew + SerialOld
-
This combination is rarely used (deprecated in some versions)
-
Stackoverflow.com/questions/3…
-
-
-XX:+UseConc(urrent)MarkSweepGC = ParNew + CMS + Serial Old
-
-XX:+UseParallelGC = Parallel Scavenge + Parallel Old (1.8 default)
-
-XX:+UseParallelOldGC = Parallel Scavenge + Parallel Old
-
-XX:+UseG1GC = G1
-
On Linux no way was found to view the default GC directly, whereas on Windows the output includes UseParallelGC
-
java -XX:+PrintCommandLineFlags -version
-
Or tell from the GC logs
-
-
What is the default garbage collector for 1.8 on Linux?
-
1.8.0_181: default (cannot be told from the output), Copy + MarkCompact
-
1.8.0_222: default PS + PO
-
The first step in JVM tuning is to learn about common JVM command-line arguments
-
The JVM command-line parameters reference: docs.oracle.com/javase/8/do…
-
HotSpot Parameter classification
Standard: start with -, supported by all versions of HotSpot
Non-standard: start with -X, supported by specific versions of HotSpot
Unstable: start with -XX, may be removed in a later version
java -version
java -X
java -XX:+PrintFlagsWithComments // only works in debug builds
Test procedure:
import java.util.List;
import java.util.LinkedList;

public class HelloGC {
    public static void main(String[] args) {
        System.out.println("HelloGC!");
        // Every 1 MB array stays reachable through the list, so the heap fills up
        List<byte[]> list = new LinkedList<>();
        for (;;) {
            byte[] b = new byte[1024 * 1024];
            list.add(b);
        }
    }
}
-
Distinguish the concepts: memory leak vs. out of memory
-
java -XX:+PrintCommandLineFlags HelloGC
-
java -Xmn10M -Xms40M -Xmx60M -XX:+PrintCommandLineFlags -XX:+PrintGC HelloGC (related flags: PrintGCDetails, PrintGCTimeStamps, PrintGCCause)
-
java -XX:+UseConcMarkSweepGC -XX:+PrintCommandLineFlags HelloGC
-
java -XX:+PrintFlagsInitial: default parameter values
-
java -XX:+PrintFlagsFinal: final parameter values
-
java -XX:+PrintFlagsFinal | grep xxx: find the corresponding parameter
-
java -XX:+PrintFlagsFinal -version | grep GC
-
java -XX:+PrintFlagsFinal -version | wc -l: 728 parameters in total
-
PS GC logs
The log format is different for each garbage collector!
PS Log Format
Heap dump section:
eden space 5632K, 94% used [0x00000000ff980000, 0x00000000ffeb3e28, 0x00000000fff00000)
The three memory addresses at the end are: the start address, the end address of the used space, and the end address of the whole space
Total = Eden + 1 survivor
Basic concepts before tuning:
-
Throughput: user code time / (user code time + garbage collection time)
-
Response time: the shorter the STW, the better the response time
First of all, what are you tuning for? Throughput first or response time first? Or, given a required response time, how much throughput must be reached...
Question:
Scientific computing: throughput. Data mining: throughput. Throughput-first systems generally use (PS + PO)
Response time first: websites, GUIs, APIs (1.8: G1)
What is tuning?
-
Plan and pre-tune the JVM as required
-
Optimize the runtime environment of a running JVM (slow, stuttering)
-
Resolve the various problems that occur while the JVM runs (OOM)
Tuning starts with planning
-
Tuning starts from the business scenario; tuning without a business scenario is pointless
-
No monitoring (load testing, where results can be observed), no tuning
-
Steps:
-
Familiarize yourself with the business scenario (there is no best garbage collector, only the most suitable one)
-
Response time, pause time [CMS G1 ZGC] (need to respond to user)
-
Throughput = User time /(User time + GC time) [PS]
-
-
Select the collector combination
-
Calculate memory requirements (empirical values: 1.5GB, 16GB)
-
Select the CPU (the more powerful the better)
-
Set the generation sizes and the promotion age
-
Setting Log Parameters
-
-Xloggc:/opt/xxx/logs/xxx-xxx-gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=20M -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCCause
-
Or one log file per day
-
-
Observing logs
-
-
Case 1: a vertical e-commerce site with at most one million orders per day. What server configuration does the order processing system need?
The question is somewhat amateurish, because many different server configurations can support it (1.5GB or 16GB)
If 360,000 orders are concentrated in 1 hour, that is 100 orders/sec (then find the peak within that hour: 1,000 orders/sec)
Empirical values help,
but if you must calculate: how much memory does one order take? 512K x 1000 = 500M of memory
The professional way to ask: the required response time is 100ms
Load test!
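The arithmetic in this case is simple enough to write down. The sketch below only restates the figures given above (360,000 orders in the busy hour, a peak of 1,000 orders per second, 512K per order); the class and method names are made up, and the per-order size is the article's assumption, not a measured value.

```java
final class OrderSizing {

    /** Average orders per second over a time window. */
    static long averagePerSecond(long ordersInWindow, long windowSeconds) {
        return ordersInWindow / windowSeconds;
    }

    /** Memory allocated per second at a given order rate, in megabytes. */
    static long allocationMbPerSecond(long ordersPerSecond, long kbPerOrder) {
        return ordersPerSecond * kbPerOrder / 1024;
    }

    public static void main(String[] args) {
        long average = averagePerSecond(360_000, 3_600);   // busy hour
        long peakMb = allocationMbPerSecond(1_000, 512);   // peak second
        System.out.println("average orders/sec in the busy hour: " + average);
        System.out.println("memory allocated per peak second: " + peakMb + " MB");
    }
}
```

If each order's objects become garbage once the request completes, this figure is roughly how fast Eden fills at peak, which is one input when sizing the young generation. The load test then confirms or corrects the estimate.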
-
Case 2: how does 12306 support large-scale ticket grabbing during the Spring Festival travel rush?
12306 is probably the flash-sale website with the highest concurrency in China:
It is said to peak at 1,000,000 concurrent requests
CDN -> LVS -> NGINX -> business systems -> 10,000 concurrent requests per machine (the C10K problem), 100 machines
Ordinary e-commerce order flow: place an order -> order system (IO) reduces inventory -> wait for the user to pay
One possible model for 12306: place an order -> reduce inventory and create the order (Redis, Kafka) at the same time, asynchronously -> wait for payment
Reducing inventory still ends up putting pressure on a single server
Distributed local inventory can be used, plus a separate server for inventory balancing
The way to handle heavy traffic: divide and conquer
-
How much memory does one transaction need?
-
Take one machine and see how many TPS it can handle. Does it meet the goal? Scale out or tune until it does
-
Determine this by load testing
-
Optimizing the environment
-
There is a document website with 500,000 PV (documents are loaded from disk into memory). The original server was 32-bit with a 1.5G heap, and users reported that the site was slow. The company therefore upgraded to a new 64-bit server with a 16G heap, but users then reported that the stuttering had become very severe, and efficiency was lower than before
-
Why was the original site slow? Many users browsing data, a lot of data loaded into memory, insufficient memory, frequent GC, long STW, slow response
-
Why did it stutter even more? The larger the memory, the longer each FGC takes
-
What to do? PS -> PN + CMS, or G1
-
-
System CPU is often at 100%. How do you tune it? If the CPU is at 100%, some thread must be occupying system resources.
-
Find the process with the highest CPU usage (top)
-
Find the thread in that process with the highest CPU usage (top -Hp)
-
Export the stack of that thread (jstack)
-
Find which method (stack frame) consumes the time (jstack)
-
Check whether worker threads or GC threads take the larger share
-
-
System memory usage is high. How do you find the problem? (Asked frequently in interviews)
-
Export the heap (jmap)
-
Analyze it (jhat, jvisualvm, MAT, jprofiler, …)
-
-
How to monitor the JVM
- jstat jvisualvm jprofiler arthas top…
Fix problems in a running JVM
A case for understanding the common tools
-
Test code:
package com.mashibing.jvm.gc;

import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/**
 * Gets credit data from the database and applies a model to it.
 */
public class T15_FullGC_Problem01 {

    private static class CardInfo {
        BigDecimal price = new BigDecimal(0.0);
        String name = "Zhang San";
        int age = 5;
        Date birthdate = new Date();

        public void m() {}
    }

    private static ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(50,
            new ThreadPoolExecutor.DiscardOldestPolicy());

    public static void main(String[] args) throws Exception {
        executor.setMaximumPoolSize(50);

        for (;;) {
            modelFit();
            Thread.sleep(100);
        }
    }

    private static void modelFit() {
        List<CardInfo> taskList = getAllCardInfo();
        taskList.forEach(info -> {
            // do something
            // Each periodic task keeps its CardInfo reachable, so memory keeps growing
            executor.scheduleWithFixedDelay(() -> {
                // do sth with info
                info.m();
            }, 2, 3, TimeUnit.SECONDS);
        });
    }

    private static List<CardInfo> getAllCardInfo() {
        List<CardInfo> taskList = new ArrayList<>();

        for (int i = 0; i < 100; i++) {
            CardInfo ci = new CardInfo();
            taskList.add(ci);
        }

        return taskList;
    }
}
-
java -Xms200M -Xmx200M -XX:+PrintGC com.mashibing.jvm.gc.T15_FullGC_Problem01
-
Usually the operations team receives the alarm first (CPU, memory)
-
The top command shows the problem: memory keeps growing and CPU usage stays high
-
top -Hp shows the threads in the process, to see which thread has the highest CPU and memory share
-
jps locates the specific Java process. jstack shows thread states; focus on WAITING and BLOCKED, e.g. waiting on <0x0000000088CA3310> (a java.lang.Object). We have to find out which thread is holding the lock. How do we find that? Exercise 1: write a deadlock program and observe it with jstack. Exercise 2: write a program in which one thread holds a lock without releasing it while the other threads wait
-
Why does the Alibaba specification require threads (especially thread pools) to be given meaningful names? (Use a custom ThreadFactory)
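A custom ThreadFactory takes only a few lines. The following is one possible sketch; the class name and the `card-model` prefix are made up for illustration. With it, every thread of the pool shows up in jstack output under a recognizable name such as `card-model-1` instead of the default `pool-1-thread-1`, which makes it much easier to map a busy or blocked thread back to the code that owns it.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class NamedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);

    NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable task) {
        // The name carries the business meaning plus a running number
        return new Thread(task, prefix + "-" + counter.getAndIncrement());
    }

    public static void main(String[] args) {
        ExecutorService pool =
                Executors.newFixedThreadPool(2, new NamedThreadFactory("card-model"));
        pool.execute(() -> System.out.println(Thread.currentThread().getName()));
        pool.shutdown();
    }
}
```

The same factory can be passed to the `ScheduledThreadPoolExecutor` constructor that accepts a ThreadFactory.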
-
jinfo pid displays process information; the pid is found through jps
-
jstat -gc 4655 500 prints the GC statistics of process 4655 every 500 milliseconds; arthas / jconsole / jvisualvm / jprofiler can also be used for observation. If the interviewer asks how you locate an OOM problem and you answer "with a graphical interface", that is wrong. 1: What does a system that is already live use instead of a graphical interface? The command line, arthas. 2: What are graphical interfaces actually for? Testing! Monitor while testing! (Observe during load tests)
-
jmap -histo 4655 | head -20 finds how many objects of each class exist
-
jmap -dump:format=b,file=xxx pid :
On an online system with a very large heap, running jmap has a great impact on the process and can even make it stall (not suitable for e-commerce). 1: If the HeapDump parameter is set, a heap dump file is generated automatically on OOM. 2: Many servers have backups (high availability), so taking this server offline does not affect the others. 3: Online locating (generally not used by small companies). 4: Load test in the test environment (for memory-growth problems, dump while the heap is still not very large)
- java -Xms20M -Xmx20M -XX:+UseParallelGC -XX:+HeapDumpOnOutOfMemoryError com.mashibing.jvm.gc.T15_FullGC_Problem01
* -XX:+HeapDumpOnOutOfMemoryError: exports the heap when OOM occurs so that it can be analyzed
-
Use MAT, jhat or jvisualvm to analyze the dump file: www.cnblogs.com/baihuitests… Run jhat -J-mx512M xxx.dump, open http://192.168.17.11:7000 and scroll to the bottom of the page: the link found there lets you use OQL to find the specific problem objects
-
Find the problem with the code
jconsole remote connection
-
Add these parameters when starting the program:
java -Djava.rmi.server.hostname=192.168.17.11 -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=11111 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false XXX
-
If you encounter the error Local host name unknown: XXX, modify the /etc/hosts file and add XXX to it
192.168.17.11 basic localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
-
Disable the Linux firewall.
service iptables stop
chkconfig iptables off
-
On Windows, open jconsole and connect remotely to 192.168.17.11:11111
jvisualvm remote connection
www.cnblogs.com/liugh/p/762… (Simple method)
jprofiler (paid)
arthas: online troubleshooting tool
-
Why troubleshoot online? In production we often run into problems that are hard to diagnose, such as thread safety issues, where even a simple threaddump or heapdump does not reveal the cause. To track such problems down, we sometimes add temporary logging, for example printing the input and output parameters of some key functions, and then repackage and release. If the logs still do not reveal the problem, we add more logs and repackage and release again. In companies with a complex release process and strict reviews, every change has to pass through many stages between code modification and release, which greatly slows down the investigation.
-
jvm: observe JVM information
-
thread: locate thread problems
-
dashboard: observe the system status
-
heapdump + jhat for analysis
-
jad: decompile. Useful for problems with classes generated by dynamic proxies, for locating third-party class versions (observe the code), and for version problems (confirming that your latest commit is actually the one in use)
-
redefine: hot replacement
-
sc: search class
-
watch: watch method
-
What’s not included: JMap, which comes with Linux and is similar to arthas for threaddump analysis
Basic concepts of GC algorithms
- Card Table: if a YGC had to scan the whole OLD region, it would be very inefficient, so the JVM designs the Card Table. If an object in a card of the OLD region points into the Y region, that card is marked Dirty. The next scan then only needs to look at the Dirty cards. In terms of structure, the Card Table is implemented with a BitMap
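A toy card table makes the idea concrete. The sketch below follows the BitMap description above by using `java.util.BitSet`; the 512-byte card size, the class name and the offset-based write barrier are assumptions for the illustration, not HotSpot's actual data structures.

```java
import java.util.BitSet;

final class CardTableDemo {
    static final int CARD_SHIFT = 9;   // 2^9 = 512 bytes per card (assumed for this sketch)

    private final BitSet dirty = new BitSet();
    private final int totalCards;

    CardTableDemo(int oldGenBytes) {
        this.totalCards = (oldGenBytes + (1 << CARD_SHIFT) - 1) >>> CARD_SHIFT;
    }

    /** Write barrier: called when a field at this old-generation offset starts pointing into the young generation. */
    void recordWrite(int oldGenOffset) {
        dirty.set(oldGenOffset >>> CARD_SHIFT);
    }

    /** Cards a young collection has to scan instead of the whole old generation. */
    int dirtyCards() {
        return dirty.cardinality();
    }

    int totalCards() {
        return totalCards;
    }

    /** After the young collection has processed the dirty cards, they are clean again. */
    void clear() {
        dirty.clear();
    }

    public static void main(String[] args) {
        CardTableDemo table = new CardTableDemo(4096);
        table.recordWrite(0);
        table.recordWrite(100);   // same card as offset 0
        table.recordWrite(600);   // second card
        System.out.println("scan " + table.dirtyCards() + " of " + table.totalCards() + " cards");
    }
}
```

In this example the YGC scans 2 cards out of 8 rather than the whole old generation, at the price of a small amount of extra work on every reference write.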
CMS
Problems with CMS
-
Memory Fragmentation
-XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction (defaults to 0): the number of FGCs after which a compaction is performed
-
Floating Garbage
Concurrent Mode Failure causes: if the concurrent collector is unable to finish reclaiming the unreachable objects before the tenured generation fills up, or if an allocation cannot be satisfied with the available free space blocks in the tenured generation, then the application is paused and the collection is completed with all the application threads stopped
Solution: lower the threshold that triggers CMS
PromotionFailed
The solution is similar: keep enough free space in the old generation
-XX:CMSInitiatingOccupancyFraction (92%): this value can be lowered so that CMS keeps enough free space in the old generation
CMS Log Analysis
Run the command: java -Xms20M -Xmx20M -XX:+PrintGCDetails -XX:+UseConcMarkSweepGC com.mashibing.jvm.gc.T15_FullGC_Problem01
[ParNew: 6144K->640K(6144K), 0.0265885 secs] 6585K->2770K(19840K), 0.0268035 secs] [Times: User sys = = 0.02 0.00, real = 0.02 secs]
ParNew: young generation collector
6144->640: Comparison before and after collection
(6144) : Capacity of the entire young generation
6585 -> 2770: Whole heap condition
(19840) : The entire heap size
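To make the field breakdown above concrete, here is a small sketch that parses the ParNew line and does the arithmetic. The class `ParNewLineParser` and its regular expression are illustrative assumptions that fit this particular log line, not a general GC log parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParNewLineParser {
    // Young generation "before->after(capacity)" followed by the whole-heap figures.
    private static final Pattern LINE = Pattern.compile(
        "ParNew: (\\d+)K->(\\d+)K\\((\\d+)K\\).*?\\] (\\d+)K->(\\d+)K\\((\\d+)K\\)");

    /** Returns {youngBefore, youngAfter, youngCapacity, heapBefore, heapAfter, heapCapacity} in KB. */
    static long[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("not a ParNew line: " + line);
        }
        long[] kb = new long[6];
        for (int i = 0; i < 6; i++) {
            kb[i] = Long.parseLong(m.group(i + 1));
        }
        return kb;
    }

    public static void main(String[] args) {
        String line = "[ParNew: 6144K->640K(6144K), 0.0265885 secs] "
                + "6585K->2770K(19840K), 0.0268035 secs]";
        long[] kb = parse(line);
        long youngFreed = kb[0] - kb[1];        // 6144 - 640  = 5504 KB left the young generation
        long heapFreed = kb[3] - kb[4];         // 6585 - 2770 = 3815 KB left the heap entirely
        long promoted = youngFreed - heapFreed; // 1689 KB moved to the old generation instead
        System.out.println("young freed=" + youngFreed + "K, heap freed=" + heapFreed
                + "K, promoted=" + promoted + "K");
    }
}
```

The difference between what left the young generation and what left the heap is a rough estimate of how much was promoted to the old generation in this collection.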
[GC (CMS Initial Mark) [1 CMS-initial-mark: 8511K(13696K)] 9866K(19840K), 0.004032 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
// 8511K(13696K): old generation usage (capacity)
// 9866K(19840K): whole heap usage (capacity)
[CMS-concurrent-mark-start]
[CMS-concurrent-mark: 0.018/0.018 secs] [Times: user=0.01 sys=0.00, real=0.02 secs]
[CMS-concurrent-preclean: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
// Marks cards as Dirty
[GC (CMS Final Remark) [YG occupancy: …][Rescan (parallel), 0.0002236 secs][weak refs processing, … secs][class unloading, 0.0005404 secs][scrub symbol table, 0.0006169 secs][scrub string table, 0.0004903 secs][1 CMS-remark: 8511K(13696K)] …, … secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
// STW phase. YG occupancy: young generation occupancy and capacity
// Rescan (parallel): marking of live objects under STW
// weak refs processing: handling of weak references
// class unloading: unloading of classes that are no longer used
// scrub symbol(string) table: cleaning up symbol and string tables which hold class-level metadata and internalized string respectively
// CMS-remark: 8511K(13696K): old generation occupancy and capacity after this phase
[CMS-concurrent-sweep-start]
[CMS-concurrent-sweep: 0.005/0.005 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
// Marking is complete; concurrent cleanup
[CMS-concurrent-reset-start]
[CMS-concurrent-reset: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
// Reset internal structures in preparation for the next GC
G1
- www.oracle.com/technical-r…
G1 Log details
[GC pause (G1 Evacuation Pause) (young) (initial-mark), 0.0015790 secs]
// young: young generation collection; Evacuation: copying of surviving objects
// initial-mark: the initial-mark phase of a mixed collection; here a YGC is combined with old generation collection
   [Parallel Time: …] // one GC thread
      [GC Worker Start (ms): 92635.7]
      [Ext Root Scanning (ms): 1.1]
      [Update RS (ms): 0.0]
         [Processed Buffers: 1]
      [Scan RS (ms): 0.0]
      [Code Root Scanning (ms): 0.0]
      [Object Copy (ms): 0.1]
      [Termination (ms): 0.0]
         [Termination Attempts: 1]
      [GC Worker Other (ms): 0.0]
      [GC Worker Total (ms): 1.2]
      [GC Worker End (ms): 92636.9]
   [Code Root Fixup: 0ms]
   [Code Root Purge: 0ms]
   [Clear CT: 0ms]
   [Other: …]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.0 ms]
      [Ref Enq: 0.0 ms]
      [Redirty Cards: 0.0 ms]
      [Humongous Register: 0.0 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.0 ms]
   [Eden: 0.0B(1024.0K)->0.0B(1024.0K) Survivors: 0.0B->0.0B Heap: 18.8M(20.0M)->18.8M(20.0M)]
 [Times: user=0.00 sys=0.00, real=… secs]
[GC concurrent-root-region-scan-start]
[GC concurrent-root-region-scan-end, … secs]
[GC concurrent-mark-start]
// Unable to evacuate, so an FGC follows
[Full GC (Allocation Failure) 18M->18M(20M), 0.0719656 secs]
   [Eden: 0.0B(1024.0K)->0.0B(1024.0K) Survivors: 0.0B->0.0B Heap: 18.8M(20.0M)->18.8M(20.0M)], [Metaspace: 3876K->3876K(1056768K)]
 [Times: user=0.07 sys=0.00, real=0.07 secs]
Case summary
OOM has many causes. Some programs never actually throw an OOM; they just run FGC continuously (CPU usage soars, but very little memory is reclaimed), as in the case above
-
The system became laggy after a hardware upgrade (see above)
-
Misuse of a thread pool causes OOM (see above): objects are added to a List over and over (a low-level mistake)
-
The smile JIRA problem: in practice the system was restarted again and again. Fix: add memory and switch the garbage collector to G1. What was the real problem? Unknown
-
Tomcat http-header-size problem (Hector)
-
Lambda expressions cause method area overflow (MethodArea / Perm / Metaspace): LambdaGC.java, run with -XX:MaxMetaspaceSize=9M -XX:+PrintGCDetails
"C:\Program Files\Java\jdk1.8.0_181\bin\java.exe" -XX:MaxMetaspaceSize=9M -XX:+PrintGCDetails "-javaagent:C:\Program Files\JetBrains\IntelliJ IDEA Community Edition 2019.1\lib\idea_rt.jar=49316:C:\Program Files\JetBrains\IntelliJ IDEA Community Edition 2019.1\bin" -Dfile.encoding=UTF-8 -classpath "C:\Program Files\Java\jdk1.8.0_181\jre\lib\charsets.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\deploy.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\access-bridge-64.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\cldrdata.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\dnsns.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\jaccess.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\jfxrt.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\localedata.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\nashorn.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\sunec.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\sunjce_provider.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\sunmscapi.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\sunpkcs11.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\zipfs.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\javaws.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\jce.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\jfr.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\jfxswt.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\jsse.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\management-agent.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\plugin.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\resources.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\rt.jar;C:\work\ijprojects\JVM\out\production\JVM;C:\work\ijprojects\ObjectSize\out\artifacts\ObjectSize_jar\ObjectSize.jar" com.mashibing.jvm.gc.LambdaGC
[GC (Metadata GC Threshold) [PSYoungGen: 11341K->1880K(38400K)] 11341K->1888K(125952K), 0.0022190 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[Full GC (Metadata GC Threshold) [PSYoungGen: 1880K->0K(38400K)] [ParOldGen: 8K->1777K(35328K)] 1888K->1777K(73728K), [Metaspace: 8164K->8164K(1056768K)], 0.0100681 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
[GC (Last ditch collection) [PSYoungGen: 0K->0K(38400K)] 1777K->1777K(73728K), 0.0005698 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[Full GC (Last ditch collection) [PSYoungGen: 0K->0K(38400K)] [ParOldGen: 1777K->1629K(67584K)] 1777K->1629K(105984K), [Metaspace: 8164K->8156K(1056768K)], 0.0124299 secs] [Times: user=0.06 sys=0.00, real=0.01 secs]
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:388)
    at sun.instrument.InstrumentationImpl.loadClassAndCallAgentmain(InstrumentationImpl.java:411)
Caused by: java.lang.OutOfMemoryError: Compressed class space
    at sun.misc.Unsafe.defineClass(Native Method)
    at sun.reflect.ClassDefiner.defineClass(ClassDefiner.java:63)
    at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:399)
    at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:394)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.reflect.MethodAccessorGenerator.generate(MethodAccessorGenerator.java:393)
    at sun.reflect.MethodAccessorGenerator.generateSerializationConstructor(MethodAccessorGenerator.java:112)
    at sun.reflect.ReflectionFactory.generateConstructor(ReflectionFactory.java:398)
    at sun.reflect.ReflectionFactory.newConstructorForSerialization(ReflectionFactory.java:360)
    at java.io.ObjectStreamClass.getSerializableConstructor(ObjectStreamClass.java:1574)
    at java.io.ObjectStreamClass.access$1500(ObjectStreamClass.java:79)
    at java.io.ObjectStreamClass$3.run(ObjectStreamClass.java:519)
    at java.io.ObjectStreamClass$3.run(ObjectStreamClass.java:494)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:494)
    at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:391)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1134)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
    at javax.management.remote.rmi.RMIConnectorServer.encodeJRMPStub(RMIConnectorServer.java:727)
    at javax.management.remote.rmi.RMIConnectorServer.encodeStub(RMIConnectorServer.java:719)
    at javax.management.remote.rmi.RMIConnectorServer.encodeStubInAddress(RMIConnectorServer.java:690)
    at javax.management.remote.rmi.RMIConnectorServer.start(RMIConnectorServer.java:439)
    at sun.management.jmxremote.ConnectorBootstrap.startLocalConnectorServer(ConnectorBootstrap.java:550)
    at sun.management.Agent.startLocalManagementAgent(Agent.java:137)
-
Direct memory overflow (rare): see Deep Understanding of the Java Virtual Machine (P59); caused by using Unsafe to allocate direct memory, or by using NIO
-
Stack overflow: the -Xss setting is too small
-
Compare the similarities and differences between these two programs and see which is better:
// Program 1
Object o = null;
for (int i = 0; i < 100; i++) {
    o = new Object();
}
// Program 2
for (int i = 0; i < 100; i++) {
    Object o = new Object();
}
-
Xiaomi Cloud, HBase synchronization system: requests going through nginx raised timeout alarms. The cause turned out to be a C++ programmer who had overridden finalize(), which led to frequent GC. Why would a C++ programmer override finalize()? (A habit from new/delete.) The finalize() method took a long time (200 ms)
-
If you have a system that consistently uses less than 10% of its memory, but the GC log shows that FGC occurs frequently, what causes it? System.gc() (a low-level mistake)
-
Disruptor lets you set the length of its chain. If the length is too large and objects are not released after being consumed, memory overflows.
-
A temporary 1.6.5 build had a flawed SQL subquery parsing algorithm: a SQL statement combining 9 EXISTS clauses generated millions of objects (from dead objects)
-
Too many native threads (a low-level mistake): a thread pool should be used. Workaround: reduce the heap space (a crude fix) so that more memory is left for creating native threads
-
A recent student case: with the SQLite class library, batch processing loads all results into memory. Someone suddenly updated hundreds of thousands of rows and memory overflowed. The cause was located by elimination: without the module there was no problem, with the module the problem appeared
-
Memory overflow caused by decompressing and compressing files online in Java
-
Using OpenCV from Java causes lag and slowness
-
Reporting systems are the most likely to crash
-
System crash caused by database and table sharding
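The stack overflow case in this list is easy to reproduce. The sketch below recurses until the thread stack is exhausted and reports the depth reached; the class name `StackDepthProbe` is hypothetical. Running it with a smaller and then a larger -Xss value should show the depth changing accordingly, although the exact numbers depend on the JVM and platform.

```java
/** Measures how deep a trivial recursion can go before the thread stack overflows. */
class StackDepthProbe {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse(); // no termination condition on purpose
    }

    /** Recurses until a StackOverflowError is thrown and returns the depth reached. */
    static int maxDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // The stack is unwound by the time we get here, so it is safe to continue.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("depth reached before StackOverflowError: " + maxDepth());
    }
}
```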
GC common parameters
-
-Xmn -Xms -Xmx -Xss: young generation size, minimum heap, maximum heap, stack space
The minimum heap and maximum heap are generally configured the same in a production environment
-
-XX:+UseTLAB: use TLAB; enabled by default
-
-XX:+PrintTLAB: print TLAB usage
-
-XX:TLABSize: set the TLAB size
-
-XX:+DisableExplicitGC: System.gc() no longer has any effect (it would otherwise trigger an FGC)
-
-XX:+PrintGC
-
-XX:+PrintGCDetails
-
-XX:+PrintHeapAtGC
-
-XX:+PrintGCTimeStamps
-
-XX:+PrintGCApplicationConcurrentTime (low importance): print the application's running time
-
-XX:+PrintGCApplicationStoppedTime (low importance): print the pause time
-
-XX:+PrintReferenceGC (low importance): record how many references of each reference type are reclaimed
-
-verbose:class: print the detailed class loading process
-
-XX:+PrintVMOptions
-
-XX:+PrintFlagsFinal -XX:+PrintFlagsInitial: you must know how to use these
-
-Xloggc:opt/log/gc.log
-
-XX:MaxTenuringThreshold: the promotion age. The maximum value is 15
-
-XX:PreBlockSpin (lock spin count), -XX:CompileThreshold (hot code detection parameter), escape analysis, scalar replacement… Setting these is not recommended
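Besides -XX:+PrintFlagsFinal, the effective value of a flag can also be read from inside a running program. The sketch below uses the HotSpot-specific `com.sun.management.HotSpotDiagnosticMXBean`, so it assumes a HotSpot-based JDK; the class name `FlagInspector` is hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

/** Reads effective -XX flag values from the running JVM (HotSpot only). */
class FlagInspector {
    /** Returns the current value of a flag such as "MaxTenuringThreshold" as a string. */
    static String flagValue(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        for (String flag : new String[] {"UseTLAB", "MaxTenuringThreshold", "MaxHeapSize"}) {
            System.out.println(flag + " = " + flagValue(flag));
        }
        // The raw options this JVM was started with, e.g. -Xms/-Xmx/-XX:... entries.
        System.out.println("JVM arguments: "
                + ManagementFactory.getRuntimeMXBean().getInputArguments());
    }
}
```

This is handy for confirming that the heap sizes and thresholds you believe you configured are the ones the process is really running with.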
Parallel common parameters
-
-XX:SurvivorRatio
-
-XX:PretenureSizeThreshold: how large an object must be to be allocated directly in the old generation
-
-XX:MaxTenuringThreshold
-
-XX:ParallelGCThreads: the number of parallel threads for the collector; also applies to CMS
-
-XX:+UseAdaptiveSizePolicy: automatically select the size ratio of each area
CMS Common Parameters
-
-XX:+UseConcMarkSweepGC
-
-XX:ParallelCMSThreads: number of CMS threads
-
-XX:CMSInitiatingOccupancyFraction: the proportion of the old generation that must be used before a CMS collection starts. The default is 68% (approximate). If Serial Old pauses occur frequently, lower this value (CMS then collects more often)
-
-XX:+UseCMSCompactAtFullCollection: compact during FGC
-
-XX:CMSFullGCsBeforeCompaction: how many FGCs before a compaction is performed
-
-XX:+CMSClassUnloadingEnabled
-
-XX:CMSInitiatingPermOccupancyFraction: the Perm occupancy at which Perm collection starts
-
GCTimeRatio: sets the ratio of GC time to program run time
-
-XX:MaxGCPauseMillis: the pause time. This is a suggested target that the GC will try to meet by various means, such as shrinking the young generation
G1 Common Parameters
-
-XX:+UseG1GC
-
-XX:MaxGCPauseMillis: a suggested value. G1 will try to reach it by adjusting the number of Young regions
-
-XX:GCPauseIntervalMillis: (?) GC interval target
-
-XX:G1HeapRegionSize: the region size. You are advised to increase the value gradually: 1, 2, 4, 8, 16, 32 (MB). As the size increases, garbage lives longer and GC intervals are longer, but each GC takes longer. ZGC improves on this (dynamic region size)
-
G1NewSizePercent: the minimum percentage of the young generation. The default value is 5%
-
G1MaxNewSizePercent: the maximum percentage of the young generation. The default value is 60%
-
GCTimeRatio: the suggested GC time ratio, which G1 uses to adjust the heap space
-
ConcGCThreads: number of concurrent GC threads
-
InitiatingHeapOccupancyPercent: the heap occupancy ratio at which G1 starts a collection cycle
Practical use of common tools
jmap and heap dumps should not be used casually in a production environment, because a production heap may be several gigabytes in size. Use them only in test environments or high-availability environments. -XX:+HeapDumpOnOutOfMemoryError exports the heap when an OOM occurs so that it can be analyzed afterwards. Even then you may not be able to pin down the problem; the best way is to write logs.
Linux Built-in Commands
View process information jinfo
Used to view process information, often used to query JVM parameters
[root@instance-m33tfvmh ~]# jps
10632 nacos-server.jar
27101 Jps
[root@instance-m33tfvmh ~]# jinfo 10632
Attaching to process ID 10632, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.162-b12
Java System Properties:

java.runtime.name = Java(TM) SE Runtime Environment
java.vm.version = 25.162-b12
sun.boot.library.path = /home/jdk1.8.0_162/jre/lib/amd64
java.protocol.handler.pkgs = org.springframework.boot.loader
java.vendor.url = http://java.oracle.com/
...

VM Flags:
Non-default VM flags: -XX:CICompilerCount=2 -XX:GCLogFileSize=104857600 -XX:InitialHeapSize=536870912 -XX:MaxHeapSize=536870912 -XX:MaxNewSize=268435456 -XX:MinHeapDeltaBytes=196608 -XX:NewSize=268435456 -XX:NumberOfGCLogFiles=10 -XX:OldSize=268435456 -XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseGCLogFileRotation
Command line: -Xms512m -Xmx512m -Xmn256m -Dnacos.standalone=true -Djava.ext.dirs=/home/jdk1.8.0_162/jre/lib/ext:/home/jdk1.8.0_162/lib/ext -Xloggc:/soft/nacos/logs/nacos_gc.log -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M -Dloader.path=/soft/nacos/plugins/health,/soft/nacos/plugins/cmdb,/soft/nacos/plugins/mysql -Dnacos.home=/soft/nacos
Listing stack information jstack:
Jstack can list threads to check for deadlocks and which thread has a high CPU usage.
This command is used to check the high CPU usage.
Thread information in the current process:
- thread
Thread name: Attach Listener, HTTP-NIO-8848-exec-28
Number: #252, #220
Priority: prio=9, prio=5
Status: RUNNABLE, TIMED_WAITING (parking)
[root@instance-m33tfvmh ~]# jps
10632 nacos-server.jar
f569 Jps
[root@instance-m33tfvmh ~]# jstack 10632 | more
2020-08-17 20:43:37
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.162-b12 mixed mode):

"Attach Listener" #252 prio=9 os_prio=0 tid=0x00007f2fc808a800 nid=0x599 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"http-nio-8848-exec-28" #220 daemon prio=5 os_prio=0 tid=0x00007f2fc4997800 nid=0x3b06 waiting on condition [0x00007f2faa63d000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000f27a7d38> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:85)
    at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:31)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:748)
java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) – parking to wait for <0x00000000f27a7d38>
The output above means the thread is parked, waiting on the lock object 0x00000000f27a7d38.
The thread names above are not descriptive. According to Alibaba's development guidelines, a thread pool or thread must be given a meaningful name to make problems easier to trace.
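One way to follow the naming advice is to pass a custom `ThreadFactory` to the pool. The following is a minimal sketch; the class name `NamedThreadFactory` and the prefix `order-sync` are made up for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Gives every pool thread a business-related name so jstack output is easy to read. */
class NamedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);

    NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable task) {
        // Produces names such as "order-sync-1", "order-sync-2", ...
        return new Thread(task, prefix + "-" + counter.getAndIncrement());
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool =
                Executors.newFixedThreadPool(2, new NamedThreadFactory("order-sync"));
        Callable<String> whoAmI = () -> Thread.currentThread().getName();
        System.out.println("task ran on: " + pool.submit(whoAmI).get());
        pool.shutdown();
    }
}
```

With names like these, a thread dump shows at a glance which pool a stuck or busy thread belongs to.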
jmap
The effect is similar to the threaddump command under arthas, which lists the most memory-hogging objects. The following command lists the objects in the top 20 rows that occupy the most memory.
[root@instance-m33tfvmh tmp]# jmap -histo 10632 | head -20

 num     #instances         #bytes  class name
----------------------------------------------
   1:        168374      106102736  [B
   2:        592794      100008712  [C
   3:        319497       12779880  org.apache.derby.impl.sql.conn.GenericStatementContext$CancelQueryTask
   4:        456634       10959216  java.lang.String
   5:         75096        7996136  [I
   6:        480161        7682576  java.lang.Object
   7:        125709        5383840  [Ljava.util.HashMap$Node;
   8:        105478        4613816  [Ljava.lang.Object;
   9:        135823        4346336  java.util.HashMap$Node
  10:        106824        4272960  java.util.LinkedHashMap$Entry
  11:         56638        3171728  java.util.LinkedHashMap
  12:         82954        2654528  java.util.concurrent.ConcurrentHashMap$Node
  13:         29979        2638152  java.lang.reflect.Method
  14:         16799        2419056  org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper
  15:         17954        2010848  sun.nio.ch.SocketChannelImpl
  16:         41720        2002560  java.util.HashMap
  17:        103232        1835216  [Ljava.lang.Class;
arthas
Overview
There are two types of diagnostic analysis tools:
-
Graphical interface tools:
jvisualvm (in the JDK bin directory)
JProfiler (paid)
-
Command line interface tools (the focus here):
arthas
Scenarios where graphical tools are suitable:
-
In a cluster environment, one machine can be isolated (circuit-broken) for analysis
-
Use the tcpdump command to copy the traffic to a standby machine for analysis.
-
During load testing.
Otherwise, use the command-line interface tool.
Document Address:
Github.com/alibaba/art…
Download and run:
Download arthas-boot.jar and start it with java -jar:
curl -O https://arthas.aliyun.com/arthas-boot.jar
java -jar arthas-boot.jar
Print help information:
java -jar arthas-boot.jar -h
-
If the download is slow, you can use the Aliyun mirror: java -jar arthas-boot.jar --repo-mirror aliyun --use-http
[root@ ~]# java -jar arthas-boot.jar
[INFO] arthas-boot version: 3.3.9
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 10632 /soft/nacos/target/nacos-server.jar
When arthas starts, it lists the Java programs running on the server; here it finds the nacos process 10632. Next, enter the serial number to attach arthas to the nacos program for inspection.
The complete command and log are as follows:
[root@instance-m33tfvmh ~]# java -jar arthas-boot.jar
[INFO] arthas-boot version: 3.3.9
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 10632 /soft/nacos/target/nacos-server.jar
1
[INFO] Start download arthas from remote server: https://arthas.aliyun.com/download/3.3.9?mirror=aliyun
[INFO] File size: 11.44 MB, downloaded size: 2.05 MB, downloading ...
[INFO] File size: 11.44 MB, downloaded size: 4.26 MB, downloading ...
[INFO] File size: 11.44 MB, downloaded size: 6.05 MB, downloading ...
[INFO] File size: 11.44 MB, downloaded size: 7.44 MB, downloading ...
[INFO] File size: 11.44 MB, downloaded size: 9.32 MB, downloading ...
[INFO] File size: 11.44 MB, downloaded size: 11.20 MB, downloading ...
[INFO] Download arthas success.
[INFO] arthas home: /root/.arthas/lib/3.3.9/arthas
[INFO] Try to attach process 10632
[INFO] Attach process 10632 success.
[INFO] arthas-client connect 127.0.0.1 3658
(ARTHAS ASCII-art banner)
wiki       https://arthas.aliyun.com/doc
tutorials  https://arthas.aliyun.com/doc/arthas-tutorials.html
version    …
pid        10632
time       2020-08-17 21:34:19

[arthas@10632]$
You can see that the prompt has changed to [arthas@10632].
arthas commands:
help
View the commands arthas supports through help
[arthas@10632]$ help
 NAME         DESCRIPTION
 help         Display Arthas Help
 keymap       Display all the available keymap for the specified connection.
 sc           Search all the classes loaded by JVM
 sm           Search the method of classes loaded by JVM
 ...
dashboard
Using the dashboard tool, you can view the thread id, name, group, priority, CPU, and running time
The group shows whether a thread is a system thread or a business thread. If memory or CPU usage keeps increasing, the program probably has a problem.
jvm
Under GARBAGE-COLLECTORS you can see that the young generation collector is Copy and the old generation collector is MarkSweepCompact
...
GARBAGE-COLLECTORS
----------------------------------------------------------------
 Copy                 name : Copy
 [count/time (ms)]    collectionCount : 35340
                      collectionTime : 452285
 MarkSweepCompact     name : MarkSweepCompact
 [count/time (ms)]    collectionCount : 3
                      collectionTime : 374
...
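The collector names and the count/time figures reported by the jvm command are also exposed through the standard `GarbageCollectorMXBean` API, so a program can log them itself. A minimal sketch follows; the class name `CollectorStats` is hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Lists each garbage collector of the running JVM with its collection count and total time. */
class CollectorStats {
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + " [count/time (ms)] "
                    + gc.getCollectionCount() + "/" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```

The names printed depend on the collector in use; with the serial collector they are Copy and MarkSweepCompact, matching the output above.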
thread
View the list of threads
Add the thread number to view the stack information of a specific thread
Use thread -b to find deadlocked threads
heapdump
Exporting the heap to a file is used to analyze the cause of full GC
[arthas@10632]$ heapdump
Dumping heap to /tmp/heapdump2020-08-17-22-06145210892456962923.hprof ...
Heap dump file created
[arthas@10632]$
Download the file to your local machine, open it in an analysis tool, and find the objects that occupy the most memory in order to analyze the code.
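A heap dump can also be triggered from inside the application, for example from an admin endpoint. The sketch below relies on the HotSpot-specific `HotSpotDiagnosticMXBean.dumpHeap`, so it assumes a HotSpot-based JDK; the class name `HeapDumper` and the temporary file location are illustrative choices.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

/** Writes an .hprof heap snapshot of the current JVM (HotSpot only). */
class HeapDumper {
    /** Dumps the heap to the given file; live=true keeps only reachable objects. */
    static void dump(Path file, boolean live) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(file.toString(), live); // fails if the file already exists
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Dumps into a fresh temporary directory and returns the size of the dump in bytes. */
    static long dumpToTempAndGetSize() {
        try {
            Path file = Files.createTempDirectory("heapdump-demo").resolve("demo.hprof");
            dump(file, true);
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("heap dump size in bytes: " + dumpToTempAndGetSize());
    }
}
```

The same caution applies as for jmap: on a large production heap the dump can pause the service noticeably.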
jad
Decompile class online to analyze the contents of a running program
redefine
After jad shows that the deployed version is not the expected one, redefine can be used to replace the class and fix the problem without stopping the service.
homework
-
What does -XX:MaxTenuringThreshold control? A: the age at which an object is promoted to the old generation B: the percentage of garbage in the old generation at which FGC is triggered
-
In production environments, you tend to set maximum heap memory and minimum heap memory to :(why?) A: Same B: Different
-
The default garbage collector for JDK1.8 is: A: ParNew + CMS B: G1 C: PS + ParallelOld D: none of the above
-
What is response time first?
-
What is throughput First?
-
What’s the difference between ParNew and PS?
-
What’s the difference between ParNew and ParallelOld? (Different era, different algorithm)
-
For long-running computation scenarios, you should choose: A: pause time B: throughput
-
Large-scale e-commerce sites should choose: A: pause time B: throughput
-
What are the most commonly used garbage collectors for HotSpot?
-
What are the common combinations of HotSpot garbage collectors?
-
What is the default garbage collector for JDK1.7 1.8 1.9? How to check?
-
What exactly is tuning?
-
If you use PS + ParallelOld, what can you do to make the system almost FGC-free?
-
If you use the ParNew + CMS combination, how can you make your system almost FGC-free?
1. Increase the JVM memory
2. Increase the proportion of the young generation
3. …
5. Avoid memory leaks in code
-
Is G1 generational? Does the G1 garbage collector produce FGC?
-
If G1 produces FGC, what should you do?
1. Expand the memory
2. Improve CPU performance (garbage is collected faster while the business logic produces objects at a fixed rate, so the faster the collection, the more free memory there is)
3. Lower the threshold at which MixedGC is triggered so that it happens earlier (default: 45%)
-
Q: Can you dump freely in a production environment? With a small heap the impact is minor; with a large heap the service may pause or stall (the live option alleviates this), and an FGC runs before the dump
-
Q: What are the common OOM problems? Stack, heap, MethodArea, direct memory
-
What if the JVM process exits quietly?
1. JVM crash: -XX:ErrorFile=/var/log/hs_err_pid<pid>.log. This is a very complex file that includes: crashing thread information, safepoint information, lock information, native code cache, compilation events, GC-related records, JVM memory mappings, and so on
2. Killed by the operating system: the log is located in /var/log/messages; search it with egrep -i 'killed process' /var/log/messages
3. Hardware or kernel problem: dmesg | grep java
4. Ask teacher Ma Shibing ^^!
- How to check direct memory?
1. Enable NMT: -XX:NativeMemoryTracking=detail
2. …
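For direct memory allocated through NIO, the standard `BufferPoolMXBean` gives a quick in-process reading without any extra flags. This sketch only covers direct `ByteBuffer` allocations, not memory obtained through Unsafe; the class name `DirectMemoryProbe` is hypothetical.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

/** Reports how much memory is currently held by direct ByteBuffers. */
class DirectMemoryProbe {
    /** Bytes used by the "direct" buffer pool, or -1 if this JVM does not expose it. */
    static long directBytesUsed() {
        for (BufferPoolMXBean pool
                : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = directBytesUsed();
        ByteBuffer buffer = ByteBuffer.allocateDirect(8 * 1024 * 1024); // 8 MB off-heap
        long after = directBytesUsed();
        System.out.println("direct memory grew by " + (after - before)
                + " bytes for a buffer of capacity " + buffer.capacity());
    }
}
```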
- What are the common log analysis tools?
1. gceasy
- How to troubleshoot CPU surge?
1. top -Hp, then jstack
2. arthas: dashboard, thread, thread XXXX
3. There are two scenarios: 1: a business thread 2: a GC thread, in which case check the GC log
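When combining top -Hp with jstack, a common stumbling block is that top shows thread ids in decimal while jstack prints them as hexadecimal nid values (for example nid=0x3b06 in the dump earlier). The conversion is trivial; the helper class `NidConverter` below is a hypothetical illustration, not part of either tool.

```java
/** Converts between decimal thread ids (top -Hp) and the hex nid shown by jstack. */
class NidConverter {
    /** 15110 -> "0x3b06" */
    static String toNid(long tid) {
        return "0x" + Long.toHexString(tid);
    }

    /** "0x599" -> 1433 */
    static long fromNid(String nid) {
        return Long.parseLong(nid.substring(2), 16);
    }

    public static void main(String[] args) {
        System.out.println(toNid(15110));    // the id to search for in the jstack output
        System.out.println(fromNid("0x599")); // the id to look for in top -Hp
    }
}
```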
- How to check deadlocks?
arthas: thread -b
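The JVM's own deadlock detector is also reachable from code through `ThreadMXBean.findDeadlockedThreads()`. The sketch below deliberately creates a two-thread deadlock on daemon threads and then asks the JVM to find it; the class name `DeadlockProbe` and the thread names are made up for the demonstration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

/** Creates a deliberate deadlock and detects it with the JVM's built-in detector. */
class DeadlockProbe {
    /** Starts two daemon threads that take the same two monitors in opposite order. */
    static void createDeadlock() {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        startLocker("deadlock-demo-1", lockA, lockB, bothHoldFirstLock);
        startLocker("deadlock-demo-2", lockB, lockA, bothHoldFirstLock);
    }

    private static void startLocker(String name, Object first, Object second,
                                    CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await(); // wait until the other thread holds its first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // never reached: the other thread owns this monitor
                }
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit even though the threads stay stuck
        t.start();
    }

    /** Polls for up to timeoutMillis and returns how many threads are deadlocked. */
    static int countDeadlockedThreads(long timeoutMillis) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = threads.findDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        createDeadlock();
        System.out.println("deadlocked threads: " + countDeadlockedThreads(5000));
    }
}
```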
References
-
blogs.oracle.com/jonthecollector/our-collectors
-
Docs.oracle.com/javase/8/do…
-
Java.sun.com/javase/tech…
-
The JVM tuning reference documentation: docs.oracle.com/en/java/jav…
-
www.cnblogs.com/nxlhero/p/1… Online troubleshooting tools
-
www.jianshu.com/p/507f7e0cc… Common arthas commands
-
Arthas handbook:
-
Start arthas: java -jar arthas-boot.jar
-
Attach to a Java process
-
The dashboard command displays the overall system status
-
Help View help
-
Help xx View the help information about a command
-
-
Jmap command reference: www.jianshu.com/p/507f7e0cc…
-
jmap -heap pid
-
jmap -histo pid
-
jmap -clstats pid
-
-
Blog.csdn.net/chenssy/art… Analyze hotspot Error file
-
www.cnblogs.com/cxxjohnson/… Garbage collector