Contents
⊙ Heap memory area
⊙ GC execution mechanism
⊙ GC principles: garbage collection algorithms
Today I'll focus on the JVM's heap memory model. All of this comes up in interviews at big tech companies, so please pay attention.
First, the heap memory area
1.1 Introduction to the heap memory area
The JVM heap memory is divided into three regions:
1. Young generation: stores newly created objects.
2. Old generation: stores objects that have stayed referenced for a long time.
3. Permanent generation: stores class and method metadata (replaced by Metaspace as of JDK 1.8).
The young generation
The young generation stores newly created objects. It consists of one Eden space and two Survivor spaces.
The old generation
Objects in the young generation that survive repeated garbage collections are moved into the old generation. Large objects (such as caches, where the cached entries are weakly referenced) can also be placed directly in the old generation without going through the young generation.
The permanent generation
The permanent generation stores class and method metadata. How large to configure it depends on the size of the project and on the number of classes and methods.
Metaspace
Since JDK 1.8, the permanent generation (PermGen) has been removed and replaced by Metaspace.
Like the permanent generation, Metaspace is an implementation of the method area defined in the JVM specification. The biggest difference is that Metaspace does not live inside the memory managed by the virtual machine; it uses native memory and can grow dynamically. So what problems can come with using Metaspace? Think about that.
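To see these regions on a running JVM, the standard java.lang.management API can list the memory pools. The sketch below is only illustrative: the class name HeapRegions is made up for this example, and the pool names it prints (for example an Eden, Survivor, or Metaspace pool) depend on the JDK version and the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for this article; not part of any library.
class HeapRegions {

    // Returns one descriptive line per memory pool exposed by the running JVM.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            long usedKb = usage.getUsed() / 1024;
            lines.add(pool.getName() + " [" + kind + "] used=" + usedKb + " KB");
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

On JDK 8 and later, a Metaspace pool should show up as non-heap, which matches the point that it lives in native memory.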
1.2 Why generations?
Because different objects have different life cycles. Between 80% and 98% of objects die young: their life cycle is short. Most new objects live in the young generation, so they can be reclaimed efficiently without scanning every object in the heap. Objects in the old generation, by contrast, generally live a long time, and only a small amount of memory may be reclaimed on each collection, so collecting them is much less efficient.
The young generation and the old generation therefore use completely different collection algorithms. Few objects survive in the young generation, so marking, sweeping, and then compacting would be inefficient there. A copying algorithm is used instead: the survivors are copied to a Survivor space, which is more efficient. The old generation is the opposite: the set of surviving objects changes very little, so a mark-sweep-compact algorithm is the better fit.
1.3 Memory allocation policy
1.3.1 Objects are allocated in Eden first
In most cases, objects are allocated in the Eden space of the young generation. When Eden does not have enough room for an allocation, the VM triggers a Minor GC, which copies the surviving objects from Eden and from one Survivor space into the other Survivor space. If a surviving young-generation object does not fit into the free Survivor space during the Minor GC, it is promoted to the old generation early through the space allocation guarantee mechanism (covered in 1.3.5).
1.3.2 Large objects go directly into the old generation
The Serial and ParNew collectors provide the -XX:PretenureSizeThreshold parameter: objects larger than this value are allocated directly in the old generation. The goal is to avoid a lot of memory copying between Eden and the Survivor spaces. Large objects are Java objects that need a lot of contiguous memory, such as very long strings and arrays. They tend to trigger a GC early, while plenty of memory is still free, just to obtain enough contiguous space for them.
1.3.3 Long-lived objects enter the old generation
If an object born in Eden survives its first Minor GC and fits into a Survivor space, it is moved there and its age is set to 1. Each further Minor GC it survives in the Survivor spaces increases its age by 1. When its age reaches a threshold (default: 15), it is promoted to the old generation.
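The aging rule can be traced with a toy simulation. This is a minimal sketch of the rule as worded above, not HotSpot's code: the class TenuringSketch and its trace method are invented for illustration, and the exact Minor GC at which a real VM promotes an object may differ by one from this simplified reading.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of object aging; names are illustrative only.
class TenuringSketch {

    // Follows one always-live object through up to 'minorGcs' Minor GCs and
    // records what happens to it, given an age threshold for promotion.
    static List<String> trace(int threshold, int minorGcs) {
        List<String> events = new ArrayList<>();
        int age = 0; // freshly allocated in Eden
        for (int gc = 1; gc <= minorGcs; gc++) {
            age++; // the first survival sets the age to 1, each later one adds 1
            if (age >= threshold) {
                events.add("GC " + gc + ": age " + age + " -> promoted to old generation");
                break;
            }
            events.add("GC " + gc + ": age " + age + " -> stays in Survivor");
        }
        return events;
    }

    public static void main(String[] args) {
        // A small threshold keeps the printed trace short; the text's default is 15.
        trace(3, 10).forEach(System.out::println);
    }
}
```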
1.3.4 Dynamic age determination of objects
The VM does not always wait for an object to reach the age threshold. If the combined size of all objects of the same age in a Survivor space is greater than half of that Survivor space, objects of that age or older go straight to the old generation.
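Here is one way to sketch the dynamic age rule in code. Two things are assumptions on my part: the names (DynamicAgeSketch, tenuringThreshold, bytesByAge) are invented, and the sketch sums object sizes from age 1 upward until the running total exceeds half of the Survivor space, which is a cumulative reading of the rule rather than a literal "one age only" reading.

```java
// Hypothetical sketch of dynamic age determination; not HotSpot's implementation.
class DynamicAgeSketch {

    // bytesByAge[a] holds the total size of surviving objects whose age is a
    // (index 0 is unused). Returns the age at or above which objects are
    // promoted, capped at maxThreshold.
    static int tenuringThreshold(long[] bytesByAge, long survivorCapacity, int maxThreshold) {
        long half = survivorCapacity / 2;
        long running = 0;
        for (int age = 1; age < bytesByAge.length && age <= maxThreshold; age++) {
            running += bytesByAge[age];
            if (running > half) {
                return age; // objects of this age or older go to the old generation
            }
        }
        return maxThreshold; // Survivor space is not under pressure
    }

    public static void main(String[] args) {
        long[] bytesByAge = {0, 10, 20, 30, 5};
        // Running totals are 10, 30, 60; 60 exceeds half of 100 at age 3.
        System.out.println(tenuringThreshold(bytesByAge, 100, 15));
    }
}
```

With a Survivor capacity of 100 the threshold drops to 3, well below the configured maximum of 15; with a much larger Survivor space the maximum applies unchanged.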
1.3.5 Space allocation guarantee
Before a Minor GC, the virtual machine checks whether the largest contiguous free space in the old generation is greater than the total size of all objects in the young generation. If it is, the Minor GC is guaranteed to be safe. If it is not, the virtual machine checks the HandlePromotionFailure setting to see whether a guarantee failure is allowed. If it is allowed, the VM goes on to check whether the largest contiguous free space in the old generation is greater than the average size of the objects promoted to the old generation in previous collections. If it is, a Minor GC is attempted even though it is risky, and a Full GC follows if the guarantee fails. If the space is smaller than that average, or if HandlePromotionFailure does not allow the risk, a Full GC is run instead.
HotSpot turns on the space allocation guarantee by default.
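The decision flow above is easier to follow as a small function. This is a sketch of the logic as described in the text, with invented names (PromotionGuaranteeSketch, Decision, decide) and plain numbers standing in for the sizes the VM would measure.

```java
// Hypothetical model of the space allocation guarantee check; illustrative only.
class PromotionGuaranteeSketch {

    enum Decision { MINOR_GC, RISKY_MINOR_GC, FULL_GC }

    static Decision decide(long oldMaxContiguousFree,
                           long youngObjectsTotal,
                           long avgPromotedSize,
                           boolean handlePromotionFailure) {
        // Step 1: the old generation can absorb the whole young generation.
        if (oldMaxContiguousFree > youngObjectsTotal) {
            return Decision.MINOR_GC;
        }
        // Step 2: risk is allowed and past promotions would have fit.
        if (handlePromotionFailure && oldMaxContiguousFree > avgPromotedSize) {
            return Decision.RISKY_MINOR_GC; // a Full GC follows if the guarantee fails
        }
        // Step 3: not enough room, or risk is not allowed.
        return Decision.FULL_GC;
    }

    public static void main(String[] args) {
        System.out.println(decide(500, 300, 100, true));  // MINOR_GC
        System.out.println(decide(200, 300, 100, true));  // RISKY_MINOR_GC
        System.out.println(decide(200, 300, 100, false)); // FULL_GC
        System.out.println(decide(50, 300, 100, true));   // FULL_GC
    }
}
```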
Second, the GC execution mechanism
Because objects are handled by generation, garbage collection differs in the region it covers and in when it runs. There are two types of GC: Minor GC and Full GC.
2.1 Minor GC (young GC)
In general, a Minor GC is triggered when a new object is created and Eden cannot supply the requested space. It cleans up the Eden space: dead objects are removed and surviving objects are moved to a Survivor space, after which the two Survivor spaces are tidied up. This GC works on the young generation's Eden space and does not affect the old generation. Since most objects start out in Eden, and Eden is not sized very large, GC in Eden happens frequently. A fast, efficient algorithm is therefore needed to free up Eden as quickly as possible.
2.2 Full GC
A Full GC cleans up the entire heap, including the young, tenured (old), and permanent generations. Because it has to collect the whole heap, a Full GC is slower than a Minor GC, so you should keep the number of Full GCs to a minimum. A large part of JVM tuning is tuning Full GC. A Full GC can be caused by any of the following:
1. The tenured (old) generation is full
2. The permanent generation (Perm) is full
3. System.gc() is called explicitly
4. The heap's allocation policy for each generation has changed dynamically since the last GC
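To watch collections happen, you can read the per-collector counters through the standard GarbageCollectorMXBean API before and after an explicit System.gc() call. Treat this as a rough sketch: the class name GcCounters is made up, the collector names printed depend on which GC the JVM runs with, and System.gc() is only a request that the VM may ignore (for example when started with -XX:+DisableExplicitGC).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class for this article; not part of any library.
class GcCounters {

    // Maps each collector's name to how many collections it has run so far.
    static Map<String, Long> collectionCounts() {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            counts.put(gc.getName(), gc.getCollectionCount()); // -1 means "undefined"
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("before: " + collectionCounts());
        System.gc(); // explicit request, one of the Full GC causes listed above
        System.out.println("after:  " + collectionCounts());
    }
}
```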
2.3 Determining whether an object is alive or dead
Now that we know about the JVM's GC mechanism, what criteria decide whether an object can be discarded by GC?
1. Reference counting: each object has a reference count attribute. The count goes up by 1 when a reference is added and down by 1 when a reference is released. This method is simple, but it cannot handle objects that reference each other in a cycle.
2. Reachability analysis
Mainstream implementations of the major commercial languages, such as Java and C#, use reachability analysis to decide whether an object is alive. A set of objects called GC Roots serves as the starting point, and the search walks downward from them; the path the search travels is called a reference chain. An object that is not connected to GC Roots by any reference chain is unreachable, which means it is no longer usable. In the original figure, Object5, Object6, and Object7 reference each other but cannot be reached from GC Roots, so they are also judged to be reclaimable.
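The difference between the two methods shows up clearly on a tiny object graph. The sketch below is a toy model, not how any JVM stores objects: Node, pointTo, and reachable are invented names, and the way objects 5, 6, and 7 are wired into a cycle is my own arrangement for the demo.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model comparing reference counting with reachability analysis.
class ReachabilitySketch {

    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        int refCount = 0; // what a reference-counting collector would track

        Node(String name) { this.name = name; }

        void pointTo(Node target) {
            refs.add(target);
            target.refCount++;
        }
    }

    // Breadth-first walk from the roots; every node visited is considered alive.
    static Set<Node> reachable(List<Node> roots) {
        Set<Node> seen = new HashSet<>(roots);
        Deque<Node> queue = new ArrayDeque<>(roots);
        while (!queue.isEmpty()) {
            for (Node next : queue.poll().refs) {
                if (seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Node root = new Node("GC Root");
        Node o1 = new Node("object1");
        Node o2 = new Node("object2");
        Node o5 = new Node("object5");
        Node o6 = new Node("object6");
        Node o7 = new Node("object7");
        root.pointTo(o1);
        o1.pointTo(o2);
        // A cycle that nothing outside it references.
        o5.pointTo(o6);
        o6.pointTo(o7);
        o7.pointTo(o5);

        Set<Node> alive = reachable(Arrays.asList(root));
        for (Node n : Arrays.asList(o1, o2, o5, o6, o7)) {
            System.out.println(n.name + ": refCount=" + n.refCount
                    + ", reachable=" + alive.contains(n));
        }
    }
}
```

The three objects in the cycle each keep a reference count of 1, so reference counting would never free them, while the walk from the root correctly reports them as unreachable.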
In Java, the objects that can serve as GC Roots include:
- Method area: objects referenced by static fields of classes;
- Method area: objects referenced by constants;
- Objects referenced from the virtual machine stack (the local variable table);
- Objects referenced from the native method stack (JNI).
Note: even if reachability analysis finds an object unreachable, the VM does not reclaim it immediately. To truly declare an object dead, it must be marked at least twice. The first mark happens when reachability analysis finds no reference chain connecting it to GC Roots. The second is a small-scale mark that the GC performs on objects in the F-Queue, which holds objects that override finalize() and whose finalize() method has not yet been called.
Third, GC principles: garbage collection algorithms
The biggest technical difference between Java and languages like C++ is automatic garbage collection (GC). So why bother understanding GC and memory allocation strategies?
1. Interviews require it;
2. GC affects application performance;
3. It helps you write better code.
Stack: the life cycle of data on a stack follows its thread, so it is generally not a concern.
Heap: objects in the heap are the focus of garbage collection.
Method area/Metaspace: garbage collection also happens here, but it reclaims little for the effort, so it is generally not our focus.
So far, four fairly mature garbage collection algorithms have been developed for the JVM:
1. Mark-sweep algorithm;
2. Copying algorithm;
3. Mark-compact algorithm;
4. Generational collection algorithm.
3.1 Mark-sweep algorithm
This kind of garbage collection has two phases: mark and sweep. First every object that needs to be reclaimed is marked, and once marking is complete all the marked objects are reclaimed. The freed memory is left scattered in fragments, so when large objects are allocated frequently, the JVM may fail to find a large enough contiguous chunk of memory in the young generation and has to collect more often (this is why there is a mechanism for allocating large objects directly in the old generation).
Advantages
- 100% memory utilization
Disadvantages
- Marking and sweeping are both inefficient (compared with the copying algorithm)
- Leaves a large number of discontiguous memory fragments
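The fragmentation problem is easy to reproduce in a toy heap. The sketch below models the heap as an array of equal-sized slots, which is a deliberate simplification, and all names (MarkSweepSketch, sweep, freeSlots, largestFreeRun) are invented for the example.

```java
// Hypothetical slot-based model of mark-sweep; real heaps are not fixed-size slots.
class MarkSweepSketch {

    // occupied[i]: slot i holds an object; marked[i]: that object is still reachable.
    // Sweeping frees every occupied slot that was not marked, in place.
    static boolean[] sweep(boolean[] occupied, boolean[] marked) {
        boolean[] after = new boolean[occupied.length];
        for (int i = 0; i < occupied.length; i++) {
            after[i] = occupied[i] && marked[i];
        }
        return after;
    }

    static int freeSlots(boolean[] occupied) {
        int free = 0;
        for (boolean slot : occupied) {
            if (!slot) free++;
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(boolean[] occupied) {
        int best = 0;
        int run = 0;
        for (boolean slot : occupied) {
            run = slot ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        boolean[] occupied = {true, true, true, true, true, true, true, true};
        boolean[] marked   = {true, false, true, false, true, false, true, false};
        boolean[] after = sweep(occupied, marked);
        System.out.println("free slots: " + freeSlots(after));            // 4
        System.out.println("largest free run: " + largestFreeRun(after)); // 1
    }
}
```

Half the heap is free after the sweep, yet an object that needs two adjacent slots still cannot be placed.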
3.2 Copying algorithm
This algorithm divides memory into two equal blocks and uses only one of them at a time. When that block runs out, the surviving objects are copied to the other block and the used block is then cleaned up in a single pass. This is efficient and avoids memory fragmentation. The price is that only half of the memory is usable, which is no small loss.
Advantages
- Simple and efficient, with no memory fragmentation
Disadvantages
- Low memory utilization: only half is usable
- Efficiency drops noticeably when many objects survive
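A minimal sketch of the copying step, using string labels as stand-ins for objects. The names CopyingSketch and copyLive are hypothetical, and a real collector would also update every reference to the moved objects, which this toy version leaves out.

```java
import java.util.Arrays;

// Hypothetical model of copying collection between two equal-sized spaces.
class CopyingSketch {

    // Copies the live objects from the from-space into a fresh to-space,
    // packing them at the start. The from-space can then be wiped in one go.
    static String[] copyLive(String[] fromSpace, boolean[] live) {
        String[] toSpace = new String[fromSpace.length];
        int next = 0; // bump pointer: the next free slot in the to-space
        for (int i = 0; i < fromSpace.length; i++) {
            if (fromSpace[i] != null && live[i]) {
                toSpace[next++] = fromSpace[i];
            }
        }
        return toSpace;
    }

    public static void main(String[] args) {
        String[] fromSpace = {"a", "b", "c", "d"};
        boolean[] live = {true, false, false, true};
        System.out.println(Arrays.toString(copyLive(fromSpace, live))); // [a, d, null, null]
    }
}
```

The work done is proportional to the number of live objects only, which is why the approach pays off when few objects survive and gets slower as more of them do.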
3.3 Mark-compact algorithm
This is an upgraded version of the mark-sweep algorithm. After the marking phase is complete, instead of cleaning up the reclaimable objects directly, the live objects are moved toward one end, and then all memory beyond the boundary is cleaned up.
Advantages
- 100% memory utilization
- No memory fragmentation
Disadvantages
- Marking and sweeping are inefficient
- Less efficient than mark-sweep, because live objects also have to be moved
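The compaction step can be sketched on the same kind of toy heap. Again the names (MarkCompactSketch, compact) are invented, string labels stand in for objects, and fixing up references to the moved objects is omitted.

```java
import java.util.Arrays;

// Hypothetical in-place sliding compaction on a slot-based toy heap.
class MarkCompactSketch {

    // Slides every marked (live) object toward index 0 and clears everything
    // past the boundary. Returns the boundary: the first free slot afterwards.
    // The marked array describes the layout before compaction only.
    static int compact(String[] heap, boolean[] marked) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[boundary++] = heap[i]; // boundary never overtakes i
            }
        }
        for (int i = boundary; i < heap.length; i++) {
            heap[i] = null; // memory beyond the boundary is cleaned up
        }
        return boundary;
    }

    public static void main(String[] args) {
        String[] heap = {"a", "b", "c", "d", "e"};
        boolean[] marked = {true, false, true, false, true};
        int boundary = compact(heap, marked);
        System.out.println(Arrays.toString(heap) + " boundary=" + boundary);
        // [a, c, e, null, null] boundary=3
    }
}
```

All free space ends up in one contiguous run after the boundary, at the cost of moving the live objects.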
3.4 Generational collection algorithm
Current commercial virtual machines use this algorithm. Memory is first divided into several blocks according to object life cycles, namely the young generation and the old generation, and then each generation is collected with the algorithm that suits its characteristics:
Young generation: every collection finds that a large number of objects have died and only a few survive. The copying algorithm is chosen, because the collection can be completed for the cost of copying a small number of live objects.
Old generation: objects have a high survival rate and there is no extra space to act as an allocation guarantee, so it must be collected with the mark-sweep or mark-compact algorithm, which need no copying into a second space and free memory directly.