Note: This is an original article. If you reprint it, please reprint it in full and include the original link.
JVM memory analysis is usually handled by the senior engineers on a team. This three-part series explains MAT in depth so that readers can use it thoroughly. Even without prior hands-on experience in JVM memory analysis, you can quickly become a capable memory analyst!
This is the first of three articles in this series:
- “JVM Memory Analysis Tool MAT” (this article) introduces MAT's product features, basic concepts, a comparison with other tools, and a Quick Start guide.
- “JVM Memory Analysis Tool MAT In-depth Explanation and Practice: Advanced Chapter” covers MAT's core functions in detail and explains them through specific practical scenarios to deepen your understanding.
- “JVM Memory Analysis Tool MAT In-depth Explanation and Practice: Advanced Chapter” summarizes a systematic approach to analyzing complex memory problems and uses a comprehensive case study to improve your practical skills.
1. Introduction to MAT tool
MAT (Memory Analyzer Tool) is a fast, convenient, and powerful tool for offline analysis of JVM heap memory. It presents the runtime state recorded in a heap dump, typically captured when the JVM hits an error (although a heap dump can also be taken during normal operation), and helps you locate memory leaks or optimize logic that consumes a lot of memory.
1.1 Application scenarios of MAT and main solutions
Scenario 1: Out of memory. The JVM heap or method area cannot hold both the live objects and the objects waiting to be allocated. For example, if an OutOfMemoryError (OOM) occurs during peak hours, you need to locate the memory bottleneck to guide optimization.
Scenario 2: Memory leak. Objects that are no longer used cannot be reclaimed by the garbage collector. For example, Full GCs start to appear after the system has been running for a while, or the system hits OOM periodically and has to be restarted manually.
Scenario 3: High memory usage. For example, if the system performs GC frequently, you need to find what is hurting the latency, stability, and throughput of the service.
1.2 Basic Concepts
1.2.1 Heap Dump
A heap dump is a snapshot of the heap of a Java process at a point in time. MAT supports the HPROF and DTFJ formats; the former is produced by Oracle-family JVMs and the latter by IBM-family JVMs. A heap dump mainly contains the following:
- Instance information for all objects: the class each object belongs to, and the values of its primitive-type and reference-type fields.
- Information for all classes: class loader, class name, inheritance relationships, static fields, and so on.
- GC roots: the starting set from which reachability analysis decides whether an object is alive. The JVM uses tracing GC: every object that can be reached by following references from any GC root is alive (and cannot be reclaimed). All remaining objects are unreachable: if no reference path leads from a GC root to an object, that object is unreachable and can be reclaimed.
- Thread stacks and local variables: the stack frames of every thread at the time the snapshot was taken, and the local variables in each frame.
1.2.2 Shallow Heap
Shallow Heap is the amount of memory occupied by an object's own structure, excluding the memory of the objects its fields reference. For example, the Shallow Heap of a java.util.ArrayList object consists of an 8-byte object header, an 8-byte reference for the object-array field elementData, a 4-byte size field, and a 4-byte modCount field (inherited from the java.util.AbstractList class), 24 bytes in total. The exact header and reference sizes depend on the JVM (32-bit or 64-bit, and whether compressed references are enabled).
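As a rough illustration of this arithmetic, the sketch below adds up the field sizes quoted in the text and rounds the sum up to the object alignment. The class and method names are made up for this example, and the 8-byte alignment is an assumption about a typical HotSpot configuration, not something MAT guarantees.

```java
final class ShallowHeapSketch {
    // Rounds a byte count up to the next multiple of the alignment.
    static long align(long bytes, long alignment) {
        return (bytes + alignment - 1) / alignment * alignment;
    }

    // Shallow heap of an ArrayList, using the sizes quoted in the text.
    static long arrayListShallowHeap() {
        long header = 8;       // object header
        long elementData = 8;  // reference to the backing Object[] (not the array itself)
        long size = 4;         // int size
        long modCount = 4;     // int modCount, inherited from AbstractList
        return align(header + elementData + size + modCount, 8); // assumed 8-byte alignment
    }

    public static void main(String[] args) {
        System.out.println(arrayListShallowHeap() + " bytes"); // prints "24 bytes"
    }
}
```

The backing array that elementData points to is deliberately left out: it belongs to the Retained Heap of the list, not to its Shallow Heap.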
1.2.3 Retained Set
The Retained Set of an object is the set of objects that can be reclaimed once that object is reclaimed by GC. (In the figure below, the Retained Set of G excludes H: H is referenced by G but also by F, so reclaiming G does not free H.) Conversely, if the object cannot be reclaimed by GC, neither can its Retained Set.
1.2.4 Retained Heap
Retained Heap is the amount of memory that can be freed once an object is reclaimed by GC. It equals the sum of the Shallow Heap of every object in that object's Retained Set. (In the figure, the Retained Heap of E is the sum of the Shallow Heap of E and G, excluding H.)
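The definitions of Retained Set and Retained Heap can be sketched in a few lines of Java. The object graph below is hypothetical: it was chosen only so that it matches the relationships stated in the text (H is referenced by both F and G, and E retains G). The GC roots A and B and all shallow sizes are invented for illustration. This is not how MAT computes the values internally; it only restates the definition.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class RetainedSetSketch {
    // Outgoing references of a small, hypothetical object graph.
    static final Map<String, List<String>> REFS = Map.of(
        "A", List.of("C"), "B", List.of("C"), "C", List.of("D", "E"),
        "D", List.of("F"), "E", List.of("G"), "F", List.of("D", "H"),
        "G", List.of("H"), "H", List.<String>of());
    static final List<String> GC_ROOTS = List.of("A", "B");
    // Invented shallow sizes in bytes.
    static final Map<String, Long> SHALLOW = Map.of(
        "A", 16L, "B", 16L, "C", 24L, "D", 16L,
        "E", 16L, "F", 32L, "G", 24L, "H", 40L);

    // Objects reachable from the GC roots when `removed` is treated as gone
    // (pass null to get the full live set).
    static Set<String> reachable(String removed) {
        Set<String> seen = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        for (String root : GC_ROOTS) {
            if (!root.equals(removed)) stack.push(root);
        }
        while (!stack.isEmpty()) {
            String current = stack.pop();
            if (!seen.add(current)) continue;
            for (String next : REFS.get(current)) {
                if (!next.equals(removed)) stack.push(next);
            }
        }
        return seen;
    }

    // Retained set of x: x itself plus everything that becomes unreachable without x.
    static Set<String> retainedSet(String x) {
        Set<String> retained = new TreeSet<>(reachable(null));
        retained.removeAll(reachable(x));
        return retained;
    }

    // Retained heap of x: sum of the shallow heap over its retained set.
    static long retainedHeap(String x) {
        long total = 0;
        for (String obj : retainedSet(x)) total += SHALLOW.get(obj);
        return total;
    }

    public static void main(String[] args) {
        System.out.println(retainedSet("G") + " " + retainedHeap("G")); // [G] 24
        System.out.println(retainedSet("E") + " " + retainedHeap("E")); // [E, G] 40
    }
}
```

H never shows up in the retained set of G or E because it stays reachable through F.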
1.2.5 Dominator tree
If every path to object Y passes through object X, then X dominates Y (in the figure, C and D dominate F, but G does not dominate H). The dominator tree is a tree built from the object reference graph and these dominance relationships, and it clearly shows the dependencies between objects. The object graph on the left of the figure corresponds to the dominator tree on the lower right. The following relationships also hold:
- The subtree of any node in the dominator tree is the set of nodes dominated by that node, that is, its Retained Set.
- If X directly dominates Y, then every dominator of X also dominates Y.
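To make the dominance relation concrete, here is a minimal sketch that computes the immediate dominator of every object with the textbook iterative set-intersection method. It is not MAT's actual algorithm, and the graph is hypothetical: it only reproduces the relationships named in the text (C and D dominate F, G does not dominate H). The virtual `<root>` node standing in for the set of GC roots, and the roots A and B, are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class DominatorTreeSketch {
    static final String ROOT = "<root>"; // virtual node that references all GC roots
    static final Map<String, List<String>> REFS = Map.of(
        ROOT, List.of("A", "B"),
        "A", List.of("C"), "B", List.of("C"), "C", List.of("D", "E"),
        "D", List.of("F"), "E", List.of("G"), "F", List.of("D", "H"),
        "G", List.of("H"), "H", List.<String>of());

    // dom(n) = {n} united with the intersection of dom(p) over all referrers p of n.
    static Map<String, Set<String>> dominators() {
        Map<String, List<String>> preds = new HashMap<>();
        for (String n : REFS.keySet()) preds.put(n, new ArrayList<>());
        REFS.forEach((from, targets) -> targets.forEach(to -> preds.get(to).add(from)));

        Map<String, Set<String>> dom = new HashMap<>();
        for (String n : REFS.keySet()) dom.put(n, new HashSet<String>(REFS.keySet()));
        dom.put(ROOT, new HashSet<String>(Set.of(ROOT)));

        boolean changed = true;
        while (changed) { // repeat until a fixed point is reached
            changed = false;
            for (String n : REFS.keySet()) {
                if (n.equals(ROOT)) continue;
                Set<String> next = new HashSet<String>(REFS.keySet());
                for (String p : preds.get(n)) next.retainAll(dom.get(p));
                next.add(n);
                if (!next.equals(dom.get(n))) {
                    dom.put(n, next);
                    changed = true;
                }
            }
        }
        return dom;
    }

    // The immediate dominator is the strict dominator closest to the node,
    // i.e. the one that itself has the most dominators.
    static Map<String, String> immediateDominators() {
        Map<String, Set<String>> dom = dominators();
        Map<String, String> idom = new TreeMap<>();
        for (String n : dom.keySet()) {
            if (n.equals(ROOT)) continue;
            String best = null;
            for (String d : dom.get(n)) {
                if (d.equals(n)) continue;
                if (best == null || dom.get(d).size() > dom.get(best).size()) best = d;
            }
            idom.put(n, best);
        }
        return idom;
    }

    public static void main(String[] args) {
        // {A=<root>, B=<root>, C=<root>, D=C, E=C, F=D, G=E, H=C}
        System.out.println(immediateDominators());
    }
}
```

Each `child=parent` pair is one edge of the dominator tree. H hangs directly under C because both paths to it (through F and through G) only share C.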
1.2.6 OQL
OQL is a SQL-like query language specific to MAT. It can query and filter the classes and objects in a dump file using complex conditions.
1.2.7 References
Outgoing References and Incoming References give direct access to the dependencies between objects, and MAT lets you chain these operations quickly.
- Outgoing References: the objects that the current object references. (Primitive-type fields are not included; they can be viewed in the Inspector.)
- Incoming References: the objects that directly reference the current object. An object may have zero or more incoming references.
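The two views are inverses of each other. A minimal sketch, assuming the heap is modeled as a map from each object to its outgoing references (the object names F, G, and H are placeholders):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class IncomingReferencesSketch {
    // Builds the incoming-reference view by inverting the outgoing-reference view.
    static Map<String, List<String>> incoming(Map<String, List<String>> outgoing) {
        Map<String, List<String>> sorted = new TreeMap<>(outgoing); // stable iteration order
        Map<String, List<String>> in = new TreeMap<>();
        for (String obj : sorted.keySet()) in.put(obj, new ArrayList<>());
        for (Map.Entry<String, List<String>> e : sorted.entrySet()) {
            for (String target : e.getValue()) {
                in.computeIfAbsent(target, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        return in;
    }

    public static void main(String[] args) {
        Map<String, List<String>> outgoing = Map.of(
            "F", List.of("H"),
            "G", List.of("H"),
            "H", List.<String>of());
        System.out.println(incoming(outgoing)); // {F=[], G=[], H=[F, G]}
    }
}
```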
2. Function overview and comparison of MAT
2.1 Function Overview of MAT
Note: MAT has a very rich feature set. This article only summarizes the features to give you the overall picture. The next article, “JVM Memory Analysis Practical Progress: Core Functions and Application Scenarios”, covers the scenarios, cases, and best practices for each core function in detail.
MAT works by indexing dump files. Based on these indexes, it implements four core functions: [1] memory distribution, [2] dependencies between objects (such as object reference relationships, thread reference relationships, and ClassLoader reference relationships), [3] object state (memory usage, field values, and so on), and [4] conditional retrieval (OQL, regular-expression queries, and so on). It also presents the results visually to help developers understand the full picture of the JVM heap.
2.1.1 Memory Distribution
- Global overview: heap size, number of objects, number of classes, number of class loaders, number of GC roots, thread overview, and other global statistics.
- Dominator Tree: lists objects sorted by Retained Heap and supports grouping along several dimensions. It is one of the most frequently used features.
- Histogram: lists the memory share of the instances of each class, including their own Shallow Heap and the Retained Heap of the objects they dominate. It supports grouping by package, class loader, superclass, and class.
- Leak Suspects: pinpoints the suspected reference chains that hold the most memory. It solves some basic problems, but is often of limited help for complex ones.
- Top Consumers: shows which classes, class loaders, and packages account for the highest proportion of memory.
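Conceptually, the Histogram is a group-by over all objects in the dump. The sketch below shows that idea on a handful of made-up objects; the input format (one class-name/shallow-size pair per object) and the sizes are assumptions for illustration and have nothing to do with MAT's index format.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class HistogramSketch {
    // Each entry is (class name, shallow heap in bytes) for one object.
    // Returns class name -> { instance count, summed shallow heap }.
    static Map<String, long[]> histogram(List<Map.Entry<String, Long>> objects) {
        Map<String, long[]> rows = new TreeMap<>();
        for (Map.Entry<String, Long> obj : objects) {
            long[] row = rows.computeIfAbsent(obj.getKey(), k -> new long[2]);
            row[0] += 1;              // instance count
            row[1] += obj.getValue(); // summed shallow heap
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> objects = List.of(
            Map.entry("java.lang.String", 24L),
            Map.entry("java.lang.String", 24L),
            Map.entry("java.util.ArrayList", 24L),
            Map.entry("byte[]", 40L));
        histogram(objects).forEach((cls, row) ->
            System.out.println(cls + " objects=" + row[0] + " shallow=" + row[1]));
    }
}
```

Grouping by package or class loader works the same way with a different key.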
2.1.2 Dependency between Objects
- References: shows an object's outgoing and incoming references. By following the direct and indirect references of any object (mainly field values and memory usage), you get the complete dependency chain.
- Dominator Tree: supports sorting by Retained Heap and shows detailed dominance relationships. Combined with References, it allows fast correlation analysis of large objects.
- Thread Overview: shows the thread stack frames at the time of the dump and related memory information, such as the Retained Heap of each thread.
- Path To GC Roots: shows the reference chain between any object and its GC roots, which helps explain why the object cannot be reclaimed by GC.
2.1.3 Object Status
- Inspector: the core panel for object state. It provides object field values, class inheritance information, and other data that help relate high memory usage to business logic.
- Collection state checks: for example, finding memory wasted by empty collections and empty arrays through the fill ratio of an ArrayList or array, or judging whether a hashing strategy is reasonable through the collision ratio of a HashMap.
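The fill ratio mentioned above is simply the number of stored elements divided by the capacity of the backing array. A small sketch under stated assumptions: the method names are invented, and the 4-byte slot size assumes compressed references.

```java
final class CollectionFillSketch {
    // size: elements actually stored; capacity: length of the backing array.
    static double fillRatio(int size, int capacity) {
        return capacity == 0 ? 0.0 : (double) size / capacity;
    }

    // Bytes spent on unused slots of the backing array.
    static long wastedBytes(int size, int capacity, int bytesPerSlot) {
        return (long) (capacity - size) * bytesPerSlot;
    }

    public static void main(String[] args) {
        // An ArrayList holding 3 elements in a backing array of length 10.
        System.out.println("fill ratio = " + fillRatio(3, 10));
        System.out.println("wasted = " + wastedBytes(3, 10, 4) + " bytes"); // wasted = 28 bytes
    }
}
```

A large number of collections with a fill ratio of 0 usually points to collections that are allocated eagerly but never used.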
2.1.4 Retrieving objects by condition
- OQL: a SQL-like structured query language at the object (class) level. For example: SELECT * FROM java.util.ArrayList WHERE size = 0 AND modCount = 0, or SELECT s.length FROM INSTANCEOF java.lang.String s.
- Many of the memory-distribution and object-dependency functions support searching by string or by regular expression.
- Lookup by virtual memory address: finds an object from its hexadecimal address.
In addition, the following mind map summarizes these features for easier memorization and review:
2.2 Comparison of Common Memory analysis Tools
In the following table, Y indicates that a feature is supported and N that it is not.
Product features | MAT | JProfiler | VisualVM | jhat | jmap | hprof |
---|---|---|---|---|---|---|
Object association analysis, shallow heap, GC roots, memory leak detection, thread analysis, custom extension support | Y | N | N | N | N | N |
Offline global analysis | Y | N | Y | Y | N | N |
Real-time memory allocation | N | Y | Y | Y | Y | Y |
OQL | Y | N | Y | N | N | N |
Memory allocation stack, hotspot ratio | N | Y | N | N | N | N |
Off-heap memory analysis | N | N | N | N | N | N |
Note 1: A dump file contains the distribution of Java objects in the heap at the moment of the snapshot. Because it is only an instantaneous record, it does not say when, or in which method, an object was allocated.
Note 2: Off-heap memory overflow is generally diagnosed with a combination of gperftools and BTrace, a topic that not many articles cover.
3. Quick Start and usage tips
3.1 Quick Start
Note: There are many articles about Quick Start. This article focuses on the installation process and usage tips.
1. Install MAT: see the [download link]. You can also integrate it directly into the Eclipse IDE (Eclipse → Help → Eclipse Marketplace → search for “MAT”).
2. Adjust MAT's heap size: MAT itself runs as a Java process during analysis. If enough memory is available, allocate at least 1.2 times the dump file size to MAT so that the analysis runs faster. To do this, edit the MemoryAnalyzer.ini file and adjust the -Xmx parameter. (On Windows you can locate the file with the search tool Everything. On macOS it is usually at /Applications/mat.app/Contents/Eclipse/MemoryAnalyzer.ini; if it is not there, you can search for it with Alfred.)
3. Obtain a heap dump file. (A heap dump usually involves a Full GC, so watch the impact when you take one from an online service.) Three methods are commonly used:
- Use the jmap tool shipped with the JDK: jmap -dump:format=b,file=<file name> <pid>. If the process is nearly unresponsive, add the -F parameter to force the dump: jmap -F -dump:format=b,file=<file name> <pid>.
- For Java processes running locally, acquire the dump directly from MAT with File → Acquire Heap Dump.
- Start the Java process with the JVM parameter -XX:+HeapDumpOnOutOfMemoryError, so that a dump file is generated automatically when an OOM occurs, without manual intervention. To specify the directory, use -XX:HeapDumpPath=<file path>.
4. Analyze the dump file: choose File → Open Heap Dump. MAT then builds its indexes and analyzes the dump, which takes a long time for large files. Afterwards, the directory containing the dump file also contains index files with the .index suffix and HTML reports packaged as .zip files.
5. After the indexes are built, MAT shows an Overview view with three parts:
- Global overview: heap size, number of classes, number of instances, and number of class loaders.
- Unreachable Objects Histogram: shows the objects that were reclaimable at the time of the dump (usually not worth attention, unless you are analyzing a scenario where frequent GC hurts latency).
- Biggest Objects by Retained Size: shows which instances retain the most memory, together with the amount they retain. When the related code is simple, reference problems such as memory leaks can be analyzed directly from here. It also provides links from the biggest objects to further analysis.
6. If the code is complex, continue with MAT's other tools and analyze the cause of the memory problem together with the business code. The most commonly used ones are listed below (specific cases, scenarios, and usage are described in detail in the Advanced chapter of JVM Memory analysis Tool MAT):
- Views of the overall heap: Histogram, Dominator Tree, Thread Details, and so on.
- The Top Consumers and Leak Suspects reports generated by MAT.
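Besides the three ways of obtaining a dump listed in step 3, a HotSpot JVM can also write a dump of itself from code through the `com.sun.management.HotSpotDiagnosticMXBean` management interface. This is not one of the methods described in this article; it is shown as a minimal sketch for experimenting with small dumps in MAT. It assumes a HotSpot-based JDK, and the class name, temporary directory, and file name are placeholders.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

final class HeapDumpSketch {
    // Writes an HPROF dump of the current JVM to `file`.
    // live = true dumps only reachable objects, similar to jmap's "live" option.
    static void dump(Path file, boolean live) throws IOException {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(file.toString(), live); // fails if the file already exists
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("mat-demo");
        Path file = dir.resolve("heap.hprof"); // recent JDKs require the .hprof suffix
        dump(file, true);
        System.out.println("wrote " + file + " (" + Files.size(file) + " bytes)");
    }
}
```

The resulting file can be opened with File → Open Heap Dump like any other HPROF dump. The same caution applies as for jmap: the dump pauses the application while it is written.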
3.2 Tips and precautions
1. Mind the performance impact on the running process: a Full GC is performed before the heap dump, and to keep the view of object data consistent, the JVM stops the world at a safepoint and pauses request handling. The following techniques reduce the impact:
- Cut off inbound traffic before performing the dump.
- Choose a dump method with a small impact.
- Use a script to capture specified events and dump memory when they occur.
2. The dump file and the index files built from it can be very large. If the development machine is underpowered, run the analysis on a server first and then view the results from the index files.
- Analyzing large files: generally, a dump file no larger than 1.2 times the main memory of the analysis machine can be analyzed directly on the development machine. If the dump file is too large, use the script shipped with MAT to build the indexes on a high-spec machine (usually a Linux machine) and then view the analysis results from the index files: ./ParseHeapDump.sh $HEAPDUMP. Unreachable garbage objects in the heap are also saved in the dump. They are not analyzed by default; the -keep_unreachable_objects parameter includes them.
- If you do not care about unreachable objects in the heap, use the “live” parameter to reduce the file size. The command is jmap -dump:live,format=b,file=<dumpfile> <pid>
- Manually trigger a Full GC before the dump to remove dead objects and further reduce the dump and index-building time.
- If the dump file is huge but the objects shown in the main view after indexing occupy much less memory, this is because most of the objects in the dump are not reachable from GC roots and could be freed.
- During the dump, write to a disk location with plenty of space, so that a full partition does not affect the service.
- The machine that indexes the dump needs enough disk space, generally at least twice the size of the dump file, because the intermediate index files are also large, as the following figure shows.
3. Other
- JDK version problem: if you hit a “VMVersionMismatchException”, use the tools from the same JDK version as the target process.
- Some core functions are not shown in the main interface by default; for example, MAT does not open the Inspector by default. Open them yourself when the problem is complex enough to need them.
- If -XX:+HeapDumpOnOutOfMemoryError is configured but no dump file is generated when an OutOfMemoryError occurs, there are three likely reasons:
- The application creates its own OutOfMemoryError and throws it.
- Other resources of the process, such as threads, are exhausted.
- The native (C) heap is exhausted, for example inside the JVM's own code. This can happen for many reasons, such as running out of swap space, hitting process limits, or simply exhausting the address space. In these cases dump file analysis does not really help.
Summary and outlook: this article has covered the basics you need to get started with MAT. The next article, the Advanced chapter of this series, goes through MAT's rich core functions in detail and explains each one through specific scenarios, to help you build the experience needed for real-world analysis.
References
- MAT’s official website: help.eclipse.org/2020-09/ind…
You are welcome to share this article and follow the author's WeChat official account, Q's blog, which irregularly publishes practical experience, systematic summaries, source-code walkthroughs, and explanations of technical principles.