JVM and GC analysis

1. The JVM structure

2. Shared data areas at runtime

Method area: The JVM has a method area that is shared by all JVM threads. It stores information about the classes loaded by the VM, constants, static variables, and code compiled by the JIT compiler. In the JVM specification the method area is logically part of the heap. Before Java 7, HotSpot implemented it as the permanent generation, which kept it distinct from the ordinary heap. In JDK 1.7, HotSpot moved static variables and the string constant pool out of the permanent generation and into heap memory. In JDK 1.8 the permanent generation was removed and replaced by a space called Metaspace. Both the permanent generation and Metaspace are implementations of the method area defined by the JVM specification, but Metaspace does not live in memory managed by the virtual machine; it uses native memory instead. By default the size of Metaspace is limited only by native memory, but it can be tuned with -XX:MetaspaceSize and -XX:MaxMetaspaceSize.

Heap: This is where object instances are created and arrays are allocated. It is the largest area of memory managed by the JVM. The heap and the method area are shared by all threads.
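The two shared areas can be inspected from inside a running program. The following is a minimal sketch using the standard java.lang.management API; the class name MemoryAreas and the method describePools are hypothetical, and the pool names printed depend on the JVM version and the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryAreas {
    // Returns one line per memory pool: the pool type (HEAP or NON_HEAP)
    // followed by the pool name reported by the JVM.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            lines.add(pool.getType().name() + " " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
        // Upper bound of the heap that the JVM will attempt to use, in bytes.
        System.out.println("max heap bytes: " + Runtime.getRuntime().maxMemory());
    }
}
```

On a JDK 8 or later HotSpot JVM, one of the NON_HEAP pools is normally the Metaspace described above.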

3. Private data area at runtime

PC register (program counter): The JVM supports multithreading, and each thread has its own PC register. At any moment, each thread is executing the code of a single method. If that method is not native, the PC register contains the address of the JVM instruction currently being executed; if it is native, the value of the PC register is undefined.

JVM stack: Each JVM thread has a private JVM stack, created at the same time as the thread. The stack holds stack frames, which store local variables and partial results. The only operations on it are pushing and popping frames: each Java method call, from invocation to completion, corresponds to one frame being pushed onto the stack and popped off it again.

Native method stack: Native methods are not written in the Java language. A JVM that cannot load native methods does not need to provide the stack that native methods require.

4. Memory overflow

Heap memory overflow (OutOfMemoryError): The heap mainly stores objects and arrays. When new objects exceed the capacity of the young generation, a minor GC runs on the young generation. If space is still insufficient after the minor GC, the new objects and the young generation objects that meet the promotion criteria are moved to the old generation (in general, an object is promoted to the old generation when its age reaches 15). If the old generation also runs out of space, a Full GC is triggered, and if space is still insufficient after that, an OutOfMemoryError is thrown.

Virtual machine stack overflow (StackOverflowError): the virtual machine stack holds too many stack frames, typically because of very deep or unbounded recursion.
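A stack overflow is easy to provoke on purpose. The sketch below, with the hypothetical class name StackDepthDemo, recurses without a termination condition and counts how many frames were pushed before the error was thrown. The number it reports varies with the JVM, the platform, and the thread stack size.

```java
class StackDepthDemo {
    private static int depth;

    // Each call pushes one more frame onto the current thread's JVM stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Recurses until the stack is exhausted and returns how many frames were pushed.
    static int framesUntilOverflow() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // By the time control reaches this handler the recursive frames have been popped.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before StackOverflowError: " + framesUntilOverflow());
    }
}
```

Catching StackOverflowError is done here only to observe it; production code would normally fix the recursion instead.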

5. Garbage collection

Reference counting: Each time an object is referenced, its counter is incremented by 1, and each time a reference to it is destroyed, the counter is decremented by 1. When the count reaches zero, the object can be reclaimed. The weakness is circular references: if object A references object B and object B references object A, the two form a cycle, and reference counting no longer works. Even when the variables a and b are both set to null, the two objects cannot be reclaimed, because they still reference each other and each keeps a reference count of 1.
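The circular reference problem can be reproduced with a hand-maintained counter. This is a toy model only: the classes RefCounted and RefCountCycleDemo are hypothetical, and the counter is updated manually to imitate what a reference-counting collector would do.

```java
class RefCounted {
    final String name;
    int refCount;      // number of references currently pointing at this object
    RefCounted field;  // a single outgoing reference, enough to build a cycle

    RefCounted(String name) {
        this.name = name;
    }

    // Simulates "this.field = target": the target gains one reference.
    void pointTo(RefCounted target) {
        field = target;
        target.refCount++;
    }
}

class RefCountCycleDemo {
    // Builds A <-> B, drops both local variables, and returns the remaining counts.
    static int[] countsAfterDroppingLocals() {
        RefCounted a = new RefCounted("A");
        RefCounted b = new RefCounted("B");
        a.refCount++;   // referenced by local variable a
        b.refCount++;   // referenced by local variable b
        a.pointTo(b);   // A.field = B
        b.pointTo(a);   // B.field = A
        a.refCount--;   // simulates a = null
        b.refCount--;   // simulates b = null
        return new int[] { a.refCount, b.refCount };
    }
}
```

Both counts stay at 1 after the local variables are dropped, so a collector relying on counts alone would never reclaim the pair.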

Reachability analysis: The first thing to know is which objects can serve as GC Roots:

The object referenced in the virtual machine stack (the local variable table in the stack frame)
The object referenced by the class static property in the method area
The object referenced by the constant in the method area
Objects referenced by JNI (commonly referred to as Native methods) in the Native method stack

Starting from the GC Roots, the collector follows references to build reference chains until all nodes have been traversed. Objects that are not on any reference chain starting from a GC Root are judged to be "garbage" and will be collected by the GC. Note, however, that an object that is still reachable will not be reclaimed by the GC, even if the program no longer uses it.
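The traversal can be sketched as a plain graph search. In this illustration objects are represented by names and references by a map from an object to the objects it points at; ReachabilityDemo, mark, and demoGarbage are hypothetical names, and a real collector works on the heap itself rather than on such a map.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilityDemo {
    // Returns every object reachable from the roots by following references.
    static Set<String> mark(List<String> roots, Map<String, List<String>> refs) {
        Set<String> reachable = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (reachable.add(obj)) { // true only the first time obj is visited
                pending.addAll(refs.getOrDefault(obj, Collections.<String>emptyList()));
            }
        }
        return reachable;
    }

    // root -> A -> B is reachable; C and D only reference each other.
    static Set<String> demoGarbage() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("A"));
        refs.put("A", Arrays.asList("B"));
        refs.put("C", Arrays.asList("D"));
        refs.put("D", Arrays.asList("C"));
        Set<String> garbage = new HashSet<>(Arrays.asList("root", "A", "B", "C", "D"));
        garbage.removeAll(mark(Arrays.asList("root"), refs));
        return garbage;
    }
}
```

Unlike reference counting, this approach classifies the mutually referencing pair C and D as garbage, because neither is on a chain that starts at a root.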

Garbage collection algorithm

Mark-sweep: The objects that can be reclaimed are identified by the reachability analysis and marked, and the marked objects are then cleared. The disadvantage of this algorithm is that it generates memory fragmentation.
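The fragmentation left behind by this algorithm shows up even on a toy heap. The sketch below models the heap as an array of slots and assumes the set of surviving objects is already known; MarkSweepDemo and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkSweepDemo {
    // Sweep phase: clears every slot whose object is not in the live set. Nothing is moved.
    static void sweep(String[] heap, Set<String> live) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !live.contains(heap[i])) {
                heap[i] = null;
            }
        }
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    // Returns { total free slots, largest contiguous free run } after one sweep.
    static int[] demo() {
        String[] heap = { "A", "B", "C", "D", "E", "F" };
        sweep(heap, new HashSet<>(Arrays.asList("A", "C", "E")));
        int free = 0;
        for (String slot : heap) {
            if (slot == null) free++;
        }
        return new int[] { free, largestFreeRun(heap) };
    }
}
```

In the demo three slots are free afterwards, but no two of them are adjacent, so an object needing two contiguous slots could not be allocated.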

Copying algorithm: Divide the heap into two regions, A and B. Region A is used for allocating objects, and region B is left unallocated. Mark the surviving objects in region A in the same way as above, then copy all surviving objects from region A to region B, placing them right next to each other, and finally clear all of region A to free its space. This eliminates the memory fragmentation problem.

Disadvantages: This algorithm wastes memory, because half of the space is always left unused, and every collection has to copy the surviving objects to the other half, which is relatively inefficient.
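A minimal sketch of the two-region scheme, again with arrays standing in for memory and a precomputed live set (both assumptions of this illustration, as is the class name CopyingDemo):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CopyingDemo {
    // Copies live objects from region A into region B, packed together,
    // and clears region A. Returns the number of objects copied.
    static int collect(String[] regionA, String[] regionB, Set<String> live) {
        int next = 0;
        for (int i = 0; i < regionA.length; i++) {
            if (regionA[i] != null && live.contains(regionA[i])) {
                regionB[next++] = regionA[i];
            }
            regionA[i] = null; // the whole of region A is released
        }
        return next;
    }

    // Returns the contents of region B after one collection.
    static String demo() {
        String[] regionA = { "A", "B", "C", "D", "E", "F" };
        String[] regionB = new String[regionA.length];
        collect(regionA, regionB, new HashSet<>(Arrays.asList("A", "C", "E")));
        return Arrays.toString(regionB);
    }
}
```

The survivors end up contiguous in region B, at the price of keeping a second region of the same size idle between collections.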

Mark-compact: The first two steps are the same as in mark-sweep, but a compaction step is added on top of it: all surviving objects are moved to one end of the space, next to each other, and then everything beyond that end is cleared. In this way, the memory fragmentation problem is solved.

Disadvantages: This algorithm has to move the surviving objects every time it collects garbage, which is inefficient.
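The compaction step can be sketched in place on a single array, in contrast to the two regions needed for copying. MarkCompactDemo is a hypothetical name and the live set is assumed to come from a previous marking pass.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkCompactDemo {
    // Slides live objects towards the start of the heap, then clears the rest.
    // Returns the index of the first free slot.
    static int compact(String[] heap, Set<String> live) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && live.contains(heap[i])) {
                heap[next++] = heap[i]; // next never exceeds i, so nothing live is overwritten
            }
        }
        Arrays.fill(heap, next, heap.length, null);
        return next;
    }

    // Returns the heap contents after one compaction.
    static String demo() {
        String[] heap = { "A", "B", "C", "D", "E", "F" };
        compact(heap, new HashSet<>(Arrays.asList("A", "C", "E")));
        return Arrays.toString(heap);
    }
}
```

All free space is contiguous afterwards, but every surviving object that was not already in place had to be moved.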

Generational collection algorithm: First, there are four regions to understand: Eden, S0 (survivor 0), S1 (survivor 1), and Old (the old generation). Eden, S0, and S1 are sized in the ratio 8:1:1. When a minor GC occurs in Eden, the objects that survive are moved to S0. At the next minor GC, the surviving objects in Eden together with all surviving objects in S0 are moved to S1. This repeats, with the two survivor spaces swapping roles, until an object in S0 or S1 reaches age 15, at which point it is moved to Old. A Full GC occurs when the old generation is full.
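The path of one long-lived object through these regions can be traced with a simplified model. The sketch assumes an object's age increases by one for each minor GC it survives and that it is promoted as soon as its age reaches the threshold; it ignores details of real collectors, such as early promotion when a survivor space fills up. GenerationalDemo and lifeOfSurvivor are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class GenerationalDemo {
    static final int TENURING_THRESHOLD = 15; // promotion age used in the text above

    // Records where a surviving object lives after each minor GC, until it reaches Old.
    static List<String> lifeOfSurvivor(int threshold) {
        List<String> history = new ArrayList<>();
        int age = 0;
        boolean nextIsS0 = true; // the first survivor space used in this model is S0
        while (true) {
            age++; // the object survived one more minor GC
            if (age >= threshold) {
                history.add("Old");
                return history;
            }
            history.add(nextIsS0 ? "S0" : "S1");
            nextIsS0 = !nextIsS0; // the survivor spaces swap roles
        }
    }

    public static void main(String[] args) {
        System.out.println(lifeOfSurvivor(TENURING_THRESHOLD));
    }
}
```

With a threshold of 3 the model yields S0, S1, Old; with 15 the object alternates between the survivor spaces fourteen times before being promoted.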

Garbage collector

Garbage collectors that work in the young generation: Serial, ParNew, Parallel Scavenge
Garbage collectors that work in the old generation: CMS, Serial Old, Parallel Old
Garbage collector that works in both the young and the old generation: G1
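To see which of these collectors a particular JVM is actually running, the standard management API can be queried. This is a small sketch; ActiveCollectors is a hypothetical class name, and the collector names reported differ between JVM versions and vendors.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    // Returns one entry per collector registered in this JVM.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName() + " (collections so far: " + gc.getCollectionCount() + ")");
        }
        return names;
    }

    public static void main(String[] args) {
        names().forEach(System.out::println);
    }
}
```

Running it with different selection flags, for example -XX:+UseSerialGC, -XX:+UseParallelGC, or -XX:+UseG1GC, typically shows one bean for the young generation and one for the old generation.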

Collectors that are connected by a line in the following diagram can work together.

Young generation collectors

Serial collector: This collector is single-threaded, meaning that only one CPU or one collection thread is used to perform garbage collection, and the application threads are stopped (stop-the-world, STW) while it runs. A single-threaded garbage collector may seem impractical, but no technique should be judged apart from its usage scenario. In Client mode, it is simple and efficient compared with the single-threaded operation of other collectors: in a single-CPU environment, Serial has no overhead from interacting with other threads. STW pauses can be kept to roughly 100 ms or less, which is acceptable as long as they do not occur frequently. The Serial collector is the default young generation collector in Client mode.

ParNew collector: The ParNew collector is the multithreaded version of the Serial collector. Apart from using multiple threads, it differs from Serial mainly in that it is used in Server mode. Multithreading makes garbage collection faster, which effectively reduces the STW time and improves application response speed.

Parallel Scavenge collector: Like ParNew, the Parallel Scavenge collector uses the copying algorithm and multiple threads. What sets it apart is that it is more concerned with throughput, where throughput = time running user code / (time running user code + garbage collection time). CMS aims for better user interaction by reducing GC pause time, while Parallel Scavenge is more suitable for tasks that do not require much user interaction, such as background computation. The Parallel Scavenge collector provides two parameters to control throughput precisely: -XX:MaxGCPauseMillis sets the maximum garbage collection pause time, and -XX:GCTimeRatio sets the throughput target directly (the default value of 99 allows GC at most 1% of the total time, that is, a throughput of 99%).
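The throughput formula and the meaning of the GCTimeRatio value can be checked with a little arithmetic. In this sketch ThroughputDemo and its method names are hypothetical; the relation used for the GC share is 1 / (1 + GCTimeRatio).

```java
class ThroughputDemo {
    // throughput = user code time / (user code time + GC time)
    static double throughput(double userCodeMs, double gcMs) {
        return userCodeMs / (userCodeMs + gcMs);
    }

    // Largest share of total time that GC may take for a given -XX:GCTimeRatio value.
    static double maxGcShare(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        System.out.println(throughput(99, 1)); // 99 ms of user code, 1 ms of GC
        System.out.println(maxGcShare(99));    // GC share allowed by the default ratio
        System.out.println(maxGcShare(19));    // a ratio of 19 allows 5% for GC
    }
}
```

A ratio of 99 therefore corresponds to the 99% throughput target, and lowering the ratio gives the collector a larger share of the run time.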

Old generation collectors

Serial Old: The Serial collector is a single-threaded collector that works in the young generation; Serial Old is its counterpart that works in the old generation.

Parallel Old collector: The Parallel Old collector is the old generation version of the Parallel Scavenge collector, using multiple threads and the mark-compact algorithm. The combination of the two is illustrated below. Used together, the two collectors are truly "throughput first", because both are multithreaded collectors.

CMS collector: CMS is the collector that aims for the shortest STW pauses. If an application attaches great importance to service response speed, CMS is worth considering. CMS works in the old generation and uses the mark-sweep algorithm. Overall, its collection process can be regarded as running concurrently with the user threads. Of course, nothing is perfect, and CMS has the following shortcomings:

Throughput: The CMS collector is very sensitive to CPU resources. For example, if 10 threads were originally processing user requests and 3 of them have to be given to garbage collection when CMS is used, throughput drops by 30%. By default CMS starts (number of CPUs + 3) / 4 collection threads, so if there are only one or two CPUs, throughput drops straight to 50%, which is clearly not acceptable.
Unable to handle floating garbage: User threads keep running while garbage is being collected, so new garbage is produced at the same time as old garbage is cleaned up. This newly produced garbage is called floating garbage, and it has to wait for the next collection. Because the user threads must keep running during the collection, CMS cannot wait until the old generation is full before collecting, as other collectors do; it has to leave room for the user threads. The -XX:CMSInitiatingOccupancyFraction parameter sets the old generation occupancy at which a collection starts, for example 68%. Bear in mind that it must not be set too high: if the reserved space runs out, a "Concurrent Mode Failure" occurs, Serial Old is started as a fallback to collect the old generation, and the STW time becomes longer.
A large amount of memory fragmentation: Because CMS uses mark-sweep, it inherits that algorithm's biggest drawback, discontiguous memory fragments. If no contiguous space large enough for an allocation can be found, a Full GC is triggered, which hurts performance badly. The -XX:+UseCMSCompactAtFullCollection option, which is on by default, compacts the memory fragments at Full GC, but compaction lengthens the STW time. -XX:CMSFullGCsBeforeCompaction sets how many Full GCs without compaction are performed before one with compaction.
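The thread-count formula from the throughput drawback above can be worked through for a few machine sizes. The sketch assumes the division is integer division, as in Java; CmsThreadsDemo and its methods are hypothetical names.

```java
class CmsThreadsDemo {
    // Default number of CMS collection threads according to (number of CPUs + 3) / 4.
    static int collectorThreads(int cpus) {
        return (cpus + 3) / 4; // integer division: an assumption of this sketch
    }

    // Share of the CPUs taken by collection threads while a concurrent phase is running.
    static double cpuShare(int cpus) {
        return (double) collectorThreads(cpus) / cpus;
    }

    public static void main(String[] args) {
        for (int cpus : new int[] { 2, 4, 8, 16 }) {
            System.out.println(cpus + " CPUs: " + collectorThreads(cpus)
                    + " thread(s), share " + cpuShare(cpus));
        }
    }
}
```

With 2 CPUs a single collection thread already occupies half of the machine, while with 4 or more CPUs the share is at most a quarter under this formula.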

G1 collector: G1 is a garbage collector aimed at server applications and is currently one of the newest and best collectors. It addresses the shortcomings of CMS: it is based on the mark-compact algorithm, so it does not produce memory fragmentation, and it does not sacrifice throughput the way CMS does. It also establishes a predictable pause model: the user can specify an expected pause time, and G1 tries to keep its pauses within that limit. A traditional collector collects the whole heap when a Full GC occurs. G1 instead divides the heap into Regions and tracks the value of the garbage accumulated in each Region (the space that collecting it would free and the estimated time the collection would take), maintaining a priority list ordered by that value. Within the allowed collection time, the Regions with the highest collection value are collected first, which avoids collecting the whole old generation and reduces STW pauses. Because only some of the Regions are collected each time, the STW time can be controlled.
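The idea of collecting the most valuable Regions first within a pause budget can be illustrated with a greedy selection. This is a simplified model and not G1's actual policy: the class RegionSelectionDemo, the value measure (reclaimable bytes per millisecond of cost), and the sample numbers are all assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class RegionSelectionDemo {
    static class Region {
        final String name;
        final long reclaimableBytes; // space that collecting this region would free
        final double costMs;         // estimated time needed to collect it

        Region(String name, long reclaimableBytes, double costMs) {
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.costMs = costMs;
        }

        double value() {
            return reclaimableBytes / costMs;
        }
    }

    // Takes regions in descending order of value while they still fit in the pause budget.
    static List<String> choose(List<Region> regions, double pauseBudgetMs) {
        List<Region> sorted = new ArrayList<>(regions);
        Comparator<Region> byValue = Comparator.comparingDouble(Region::value);
        sorted.sort(byValue.reversed());
        List<String> chosen = new ArrayList<>();
        double spentMs = 0;
        for (Region r : sorted) {
            if (spentMs + r.costMs <= pauseBudgetMs) {
                chosen.add(r.name);
                spentMs += r.costMs;
            }
        }
        return chosen;
    }

    // Four sample regions and a pause budget of 8 ms.
    static List<String> demo() {
        List<Region> regions = Arrays.asList(
                new Region("R1", 900, 3.0),  // value 300
                new Region("R2", 100, 2.0),  // value 50
                new Region("R3", 800, 4.0),  // value 200
                new Region("R4", 500, 5.0)); // value 100
        return choose(regions, 8.0);
    }
}
```

In the demo R1 and R3 are selected, using 7 ms of the 8 ms budget, and the less valuable regions are left for a later collection.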
