What is the JVM

The JVM is an abstract computer that runs Java bytecode. It comprises a bytecode instruction set, a set of registers, a stack, a garbage-collected heap, and a method area. The JVM runs on top of the operating system and does not interact with the hardware directly.

As is well known, the compiler turns Java source files into corresponding .class files, also called bytecode files. The interpreter in the Java virtual machine then translates the bytecode into machine code for the particular machine.

Each platform has its own interpreter, but the virtual machine implemented on every platform is the same, which is why Java is cross-platform. Starting a program instantiates a virtual machine, and starting several programs creates several virtual machine instances. When a program exits or is killed, its VM instance disappears. Data cannot be shared between VM instances.

Threads

A thread here means a thread entity during the execution of a program. The JVM allows an application to execute multiple threads concurrently. Java threads in the HotSpot JVM map directly to native operating system threads. Once thread-local storage, allocation buffers, synchronization objects, stacks, the program counter, and so on are ready, a native operating system thread is created; when the Java thread terminates, the native thread is reclaimed. The operating system schedules all threads and assigns them to any available CPU. When the native thread finishes initializing, it calls the Java thread’s run() method. When the thread terminates, all resources of both the native thread and the Java thread are released.

The HotSpot JVM runs the following system threads in the background:

  • VM thread: This thread waits for operations that require the JVM to reach a safepoint. These operations must run in a separate thread because they all require the JVM to be at a safepoint, where the heap cannot be modified. They include stop-the-world garbage collection, thread stack dumps, thread suspension, and biased-lock revocation.

  • Periodic task thread: This thread handles timer events (interrupts) and schedules the execution of periodic operations.

  • GC threads: These threads support the different garbage collection activities within the JVM.

  • Compiler threads: These threads dynamically compile bytecode into platform-specific native machine code at run time.

  • Signal dispatcher thread: This thread receives signals sent to the JVM and calls the appropriate JVM methods to handle them.
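
The background threads can be partly observed from Java code. The following is a minimal sketch, assuming a HotSpot-style JVM; the class name `ThreadLister` is made up for illustration. It only lists threads that are visible as `java.lang.Thread` objects, so internal threads such as the VM thread or the GC threads will usually not appear, and the exact names vary by JVM and version.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ThreadLister {
    /** Returns the names of all live threads visible to Java code, sorted alphabetically. */
    static List<String> liveThreadNames() {
        List<String> names = new ArrayList<>();
        // getAllStackTraces() returns one map entry per live thread, including the caller.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        for (String name : liveThreadNames()) {
            System.out.println(name);
        }
    }
}
```

Running `main` prints the application thread together with whatever service threads the JVM exposes at the Java level.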

JVM memory region

  • The JVM memory area is mainly divided into thread-private areas (program counter, virtual machine stack, native method stack), thread-shared areas (Java heap, method area), and direct memory.

  • Thread-private data areas have the same life cycle as their thread: they are created and destroyed as the user thread starts and ends. In HotSpot VM each thread maps directly to a native operating system thread, so this part of memory lives and dies with the native thread.

  • Thread-shared areas are created and destroyed with the startup and shutdown of the virtual machine.

Program counter (thread private)

  • A small area of memory that acts as the line-number indicator for the bytecode being executed by the current thread. Each thread has its own independent program counter, so this kind of memory is called “thread-private” memory.

  • If a Java method is executing, the counter records the address of the virtual machine bytecode instruction currently being executed. If a native method is executing, the counter is empty.

  • This is the only memory region in the virtual machine for which no OutOfMemoryError condition is specified.

Java virtual machine stack (thread private)

  • This is the memory model that describes the execution of Java methods. Each method, when executed, creates a stack frame that stores the local variable table, operand stack, dynamic link, method exit, and other information. The process of each method from invocation to completion corresponds to one stack frame being pushed onto and popped off the virtual machine stack.

  • A stack frame is a data structure that stores data and partial results. It is also used for dynamic linking, method return values, and exception dispatch. A frame is created when its method is invoked and destroyed when the method completes, whether the method completes normally or abruptly (by throwing an exception that is not caught within the method).
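
The push and pop behaviour of stack frames can be observed with a small experiment. This is an illustrative sketch, not how the JVM itself tracks frames: the class `StackFrameDemo` and its methods are hypothetical names, and the stack trace length is used only as a stand-in for the number of live frames.

```java
final class StackFrameDemo {
    /** Calls itself `depth` more times, then reports how many frames are on the stack. */
    static int framesAtDepth(int depth) {
        if (depth == 0) {
            // Each element of the trace corresponds to one frame currently on the stack.
            return Thread.currentThread().getStackTrace().length;
        }
        return framesAtDepth(depth - 1); // every nested call pushes one more frame
    }

    /** Completes abruptly: its frame is popped while the exception propagates. */
    static void failing() {
        throw new IllegalStateException("not caught inside failing()");
    }

    public static void main(String[] args) {
        int base = framesAtDepth(0);
        System.out.println("extra frames after 5 nested calls: " + (framesAtDepth(5) - base));
        try {
            failing();
        } catch (IllegalStateException e) {
            // Back in main: the frame of failing() is gone, so the count matches the baseline.
            System.out.println("frame count restored: " + (framesAtDepth(0) == base));
        }
    }
}
```

The difference between two measurements taken from the same caller equals the number of nested calls in between, which matches the one-frame-per-invocation description above.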

Native method stack (thread private)

  • The native method stack plays a role similar to the Java stack. The difference is that the virtual machine stack serves the execution of Java methods, while the native method stack serves native methods. If a VM implementation uses the C-linkage model to support native calls, the native method stack will be a C stack. HotSpot VM simply merges the native method stack with the virtual machine stack.

Heap (thread shared): run-time data area

  • The heap is a memory area shared by all threads. Created objects and arrays are stored in the Java heap, which makes it the most important memory area for the garbage collector. Because modern VMs use generational collection algorithms, the Java heap can be further divided, from a GC perspective, into the new generation (Eden, From Survivor, and To Survivor) and the old generation (JDK 1.7).

Method area/permanent generation (thread shared)

  • The method area, often called the permanent generation, stores classes loaded by the JVM, constants, static variables, code compiled by the just-in-time compiler, and other data. HotSpot VM extends generational GC to the method area: it implements the method area with the permanent generation of the Java heap, so that HotSpot’s garbage collector can manage this memory the same way it manages the Java heap, without a dedicated memory manager for the method area. The main targets of permanent-generation reclamation are constant pool reclamation and class unloading, so the gains are generally small.

Run-time constant pool

  • The run-time constant pool is part of the method area. The constant pool table of a Class file stores the various literals and symbolic references generated at compile time; after the class is loaded, this content is placed in the run-time constant pool of the method area. The Java virtual machine has strict rules on the format of each part of a Class file (including, of course, the constant pool): what each byte stores must conform to the specification before the file is accepted, loaded, and executed by the virtual machine.

Direct memory

  • Direct memory is not part of the JVM run-time data area, but it is used frequently. NIO, introduced in JDK 1.4, provides I/O based on channels and buffers. It can use native function libraries to allocate off-heap memory directly and then operate on that memory through a DirectByteBuffer object that serves as a reference to it. This avoids copying data back and forth between the Java heap and the native heap, which can significantly improve performance in some scenarios.
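
As an illustration of off-heap allocation, the sketch below uses the standard `ByteBuffer.allocateDirect` call from `java.nio`. The class name `DirectMemoryDemo` and the helper `roundTrip` are assumptions made for this example.

```java
import java.nio.ByteBuffer;

final class DirectMemoryDemo {
    /** Writes a value into off-heap memory and reads it back. */
    static int roundTrip(int value) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(Integer.BYTES); // allocated outside the Java heap
        buffer.putInt(value);
        buffer.flip(); // switch the buffer from writing to reading
        return buffer.getInt();
    }

    public static void main(String[] args) {
        System.out.println("allocateDirect is direct: " + ByteBuffer.allocateDirect(16).isDirect());
        System.out.println("allocate is direct:       " + ByteBuffer.allocate(16).isDirect());
        System.out.println("round trip: " + roundTrip(42));
    }
}
```

The Java object returned by `allocateDirect` is small and lives on the heap; the bytes it refers to do not, which is the arrangement the paragraph above describes.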

JVM run-time memory (JDK 1.7)

  • From a GC perspective, the Java heap can be further divided into the new generation (Eden, From Survivor, and To Survivor) and the old generation.

The new generation

It stores newly created objects and takes up about one third of the heap. Because objects are created frequently, the new generation frequently triggers MinorGC. The new generation is divided into three regions: Eden, SurvivorFrom, and SurvivorTo.

  • Eden: the birthplace of new Java objects (an object that needs a lot of memory is allocated directly in the old generation). When Eden runs out of memory, a MinorGC is triggered to collect the new generation.

  • SurvivorFrom: holds the survivors of the previous GC, which are scanned again in this GC.

  • SurvivorTo: holds the survivors of one MinorGC.

  • MinorGC process (copy -> empty -> swap): MinorGC uses the copying algorithm.

  • Copy: first, the surviving objects in Eden and SurvivorFrom are copied to SurvivorTo, and their age is incremented by 1. An object whose age has reached the promotion threshold (15 by default, configurable with -XX:MaxTenuringThreshold) goes to the old generation instead, as does any object that does not fit when SurvivorTo runs out of space.

  • Empty: the objects in Eden and SurvivorFrom are then cleared.

  • Swap: finally, SurvivorTo and SurvivorFrom swap roles, so the original SurvivorTo becomes the SurvivorFrom region for the next GC.
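
The copy, empty, and swap steps can be sketched as a toy simulation. This is not how HotSpot is implemented: the class `MinorGcSketch`, the `live` flag that stands in for reachability, and the survivor capacity counted in objects are all assumptions made for illustration, as is the exact order of the age increment and the threshold check.

```java
import java.util.ArrayList;
import java.util.List;

final class MinorGcSketch {
    static final int MAX_TENURING_THRESHOLD = 15; // the default named in the text

    static final class Obj {
        final String name;
        int age;
        boolean live = true; // set to false to simulate an object that became garbage
        Obj(String name) { this.name = name; }
    }

    final int survivorCapacity;              // hypothetical limit, counted in objects
    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();      // SurvivorFrom
    List<Obj> to = new ArrayList<>();        // SurvivorTo
    final List<Obj> old = new ArrayList<>(); // old generation

    MinorGcSketch(int survivorCapacity) { this.survivorCapacity = survivorCapacity; }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    void minorGc() {
        // Copy: survivors of Eden and SurvivorFrom move on, one GC older.
        copySurvivors(eden);
        copySurvivors(from);
        // Empty: everything left in Eden and SurvivorFrom is discarded.
        eden.clear();
        from.clear();
        // Swap: SurvivorTo becomes SurvivorFrom for the next collection.
        List<Obj> previousFrom = from;
        from = to;
        to = previousFrom;
    }

    private void copySurvivors(List<Obj> space) {
        for (Obj o : space) {
            if (!o.live) continue;
            o.age++;
            boolean tenured = o.age >= MAX_TENURING_THRESHOLD;
            boolean overflow = to.size() >= survivorCapacity;
            if (tenured || overflow) old.add(o); else to.add(o);
        }
    }
}
```

In this model an object that stays live is found in `from` after each collection until its age reaches the threshold, at which point it moves to `old`; an object that does not fit into the survivor space is promoted early.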

The old generation

  • It mainly stores long-lived objects of the application.

  • Objects in the old generation are relatively stable, so MajorGC does not run often. A MajorGC is usually preceded by a MinorGC that promotes new-generation objects into the old generation; MajorGC is triggered when the old generation then runs short of space. MajorGC is also triggered early when no contiguous space large enough can be found for a newly created large object.

  • MajorGC uses the mark-sweep algorithm: it first scans the entire old generation, marks the surviving objects, and then reclaims the unmarked ones. MajorGC takes a long time because it has to scan and then reclaim. It also produces memory fragmentation; to reduce the waste, the free space usually needs to be merged or recorded so that it can be allocated directly next time. When the old generation is too full to hold more objects, an OOM (Out of Memory) error is thrown.

The permanent generation

It refers to the permanent memory area that mainly stores class and meta information. Classes are placed in the permanent area when they are loaded. Unlike the area that stores instances, the permanent area is not cleaned by GC while the main program is running. As a result, the permanent generation fills up as more classes are loaded and eventually throws an OOM error.

Java 8 and Metaspace

In Java 8 the permanent generation was removed and replaced by an area called Metaspace. Metaspace is similar in nature to the permanent generation. The biggest difference is that Metaspace is not inside the virtual machine but uses native memory, so by default its size is limited only by native memory. Class metadata goes into native memory, while the string pool and class static variables go into the Java heap. The amount of class metadata that can be loaded is therefore controlled not by MaxPermSize but by the memory actually available on the system.
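
Which memory areas a running JVM actually has can be checked through the standard `java.lang.management` API. The sketch below is a hedged example; the class name `MemoryPoolLister` is hypothetical, and the pool names it prints depend on the JVM version and the garbage collector in use (a Java 8 or later HotSpot JVM is expected to report a non-heap pool named Metaspace, while older versions report a permanent-generation pool instead).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

final class MemoryPoolLister {
    /** Describes every memory pool of the running JVM as "type | name | max". */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
            long max = (usage == null) ? -1 : usage.getMax();
            // A max of -1 means that no upper limit is defined for the pool.
            String limit = (max < 0) ? "undefined" : max + " bytes";
            lines.add(pool.getType() + " | " + pool.getName() + " | max=" + limit);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

Heap pools correspond to the generations described earlier; non-heap pools cover class metadata and compiled code.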

Garbage collection and algorithms

How to identify garbage

Reference counting method

  • In Java, references and objects are associated: to operate on an object you must go through a reference. An obvious, simple approach is therefore to use reference counting to decide whether an object can be reclaimed. Simply put, an object with no references associated with it, that is, an object whose reference count is 0, is unlikely to be used again and is therefore recyclable.

Reachability analysis

  • To solve the circular-reference problem of reference counting, Java uses reachability analysis. It searches from a set of “GC Roots” objects as starting points; if no reachable path exists between the GC Roots and an object, the object is said to be unreachable. Note that an unreachable object is not the same as a recyclable object: an unreachable object must go through at least two marking passes before it becomes recyclable. If it is still recyclable after both passes, it will be collected.
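
The difference between the two approaches shows up on a small object graph with a cycle. The sketch below is a toy model under stated assumptions: `GarbageIdentificationSketch` and `Node` are hypothetical names, reference counts are maintained by hand, and the two marking passes mentioned above are not modelled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class GarbageIdentificationSketch {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>(); // outgoing references
        int refCount;                              // incoming references
        Node(String name) { this.name = name; }

        void pointTo(Node target) {
            refs.add(target);
            target.refCount++;
        }
    }

    /** Reachability analysis: traverse the object graph starting at the GC roots. */
    static Set<Node> reachableFrom(Node... roots) {
        Set<Node> reachable = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(Arrays.asList(roots));
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (reachable.add(current)) {      // true only the first time a node is seen
                pending.addAll(current.refs);
            }
        }
        return reachable;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node kept = new Node("kept");
        root.pointTo(kept);

        // a and b reference only each other; nothing else refers to them.
        Node a = new Node("a");
        Node b = new Node("b");
        a.pointTo(b);
        b.pointTo(a);

        Set<Node> reachable = reachableFrom(root);
        for (Node n : Arrays.asList(kept, a, b)) {
            System.out.println(n.name + ": refCount=" + n.refCount
                    + ", reachable=" + reachable.contains(n));
        }
    }
}
```

Here `a` and `b` keep a non-zero reference count because of each other, so pure reference counting would never reclaim them, while the traversal from the root never visits them.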

Mark-sweep algorithm

  • This is the most basic garbage collection algorithm. It has two phases, marking and sweeping: the mark phase marks all objects that need to be reclaimed, and the sweep phase reclaims the space occupied by the marked objects.

    The biggest problem with this algorithm is severe memory fragmentation, which can later leave large objects unable to find enough contiguous free space.
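
A toy model makes the fragmentation problem concrete. In this sketch the heap is an `int` array in which each cell holds the id of the object that owns it (0 means free), and the set of live objects is simply passed in rather than computed; `MarkSweepSketch` and its methods are hypothetical names.

```java
import java.util.Arrays;

final class MarkSweepSketch {
    static final int FREE = 0; // cell value meaning "unused"

    /** Mark phase: flags every occupied cell whose owner is not in liveIds. */
    static boolean[] markGarbage(int[] heap, int... liveIds) {
        boolean[] garbage = new boolean[heap.length];
        for (int i = 0; i < heap.length; i++) {
            garbage[i] = heap[i] != FREE && !contains(liveIds, heap[i]);
        }
        return garbage;
    }

    /** Sweep phase: frees the marked cells in place; nothing is moved. */
    static void sweep(int[] heap, boolean[] garbage) {
        for (int i = 0; i < heap.length; i++) {
            if (garbage[i]) heap[i] = FREE;
        }
    }

    /** Length of the longest contiguous run of free cells. */
    static int largestFreeRun(int[] heap) {
        int best = 0;
        int run = 0;
        for (int cell : heap) {
            run = (cell == FREE) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    private static boolean contains(int[] ids, int id) {
        for (int candidate : ids) {
            if (candidate == id) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        int[] heap = {1, 1, 2, 2, 3, 3, 4, 4}; // four objects, two cells each
        sweep(heap, markGarbage(heap, 1, 3));  // objects 1 and 3 are still live
        System.out.println(Arrays.toString(heap));
        System.out.println("largest free run: " + largestFreeRun(heap));
    }
}
```

After the sweep, four cells are free in total, but they are split into two separate gaps, so an object needing three contiguous cells could not be placed.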

Copying algorithm

  • This algorithm was proposed to fix the memory fragmentation of the mark-sweep algorithm. Memory is divided into two blocks of equal size, and only one block is used at a time. When that block is full, the surviving objects are copied to the other block and the used block is cleared.

    Although this algorithm is simple to implement, efficient, and unlikely to produce fragmentation, its biggest problem is that usable memory is cut to half of the original. In addition, the more objects survive, the more the efficiency of the copying algorithm drops.
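
The same toy representation (an `int` array of owner ids, 0 meaning free) can illustrate copying between two halves. This is a simplified sketch with hypothetical names; the live set is again given rather than computed.

```java
import java.util.Arrays;

final class CopyingSketch {
    static final int FREE = 0; // cell value meaning "unused"

    /** Copies live cells from the active half into the idle half, packed from index 0. */
    static int[] copyLive(int[] fromSpace, int... liveIds) {
        int[] toSpace = new int[fromSpace.length]; // the idle half, initially all FREE
        int next = 0;
        for (int cell : fromSpace) {
            if (cell != FREE && contains(liveIds, cell)) {
                toSpace[next++] = cell;
            }
        }
        Arrays.fill(fromSpace, FREE); // the old half is cleared wholesale
        return toSpace;               // the caller now treats this as the active half
    }

    private static boolean contains(int[] ids, int id) {
        for (int candidate : ids) {
            if (candidate == id) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        int[] active = {1, 2, 2, 3, 3, 4, 0, 0};
        int[] next = copyLive(active, 1, 3); // objects 1 and 3 survive
        System.out.println("old half: " + Arrays.toString(active));
        System.out.println("new half: " + Arrays.toString(next));
    }
}
```

The survivors end up contiguous, so no fragmentation remains, but two arrays of equal size are needed to offer the capacity of one, and the work grows with the number of survivors.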

Mark-compact algorithm

This algorithm combines the two above in order to avoid their defects. Its marking phase is the same as in the mark-sweep algorithm. After marking, instead of sweeping objects in place, it moves the surviving objects to one end of memory and then clears everything beyond the end boundary.
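
Using the same toy heap of owner ids (0 meaning free), compaction can be sketched as an in-place slide. The names are hypothetical and the live set is given rather than computed by a real mark phase.

```java
import java.util.Arrays;

final class MarkCompactSketch {
    static final int FREE = 0; // cell value meaning "unused"

    /**
     * Slides live cells to the start of the heap and clears everything past them.
     * Returns the boundary index: the first free cell after compaction.
     */
    static int compact(int[] heap, int... liveIds) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != FREE && contains(liveIds, heap[i])) {
                heap[boundary++] = heap[i]; // boundary never overtakes i, so nothing live is overwritten
            }
        }
        Arrays.fill(heap, boundary, heap.length, FREE);
        return boundary;
    }

    private static boolean contains(int[] ids, int id) {
        for (int candidate : ids) {
            if (candidate == id) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        int[] heap = {1, 1, 2, 2, 3, 3, 4, 4};
        int boundary = compact(heap, 1, 3); // objects 1 and 3 survive
        System.out.println(Arrays.toString(heap) + ", boundary=" + boundary);
    }
}
```

Unlike the mark-sweep result, all free cells form one contiguous block, and unlike copying, no second half of memory is reserved.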

Generational collection algorithm

  • Generational collection is the approach most JVMs use today. Its core idea is to divide memory into regions according to object lifetime, typically splitting the GC heap into the old generation (Tenured/Old Generation) and the new generation (Young Generation). In the old generation only a small number of objects need to be reclaimed in each collection, while in the new generation a large amount of garbage needs to be reclaimed each time, so a different algorithm can be chosen for each region.

New generation and the copying algorithm

  • Most JVM GCs currently use the copying algorithm for the new generation, because each collection there reclaims most objects and therefore few objects need to be copied. The new generation is usually not split 1:1, however. It is generally divided into one large Eden space and two small Survivor spaces (From space and To space). Eden and one Survivor space are in use at any time; during a collection, the surviving objects in those two spaces are copied to the other Survivor space.

Old generation and the mark-compact algorithm

The old generation uses the mark-compact algorithm because only a few objects are collected each time.

  • The Permanent Generation, which the Java virtual machine uses for the method area, stores classes, constants, method descriptions, and so on. Collection of the permanent generation mainly reclaims discarded constants and useless classes.

  • Objects are allocated mainly in the Eden space of the new generation and the From space of the Survivor spaces; in a few cases they are allocated directly in the old generation.

  • When the Eden space and From space of the new generation run short, a GC occurs. After the GC, the surviving objects in Eden space and From space are moved to To space, and Eden space and From space are then cleared.

  • If To space cannot hold an object, that object is stored in the old generation.

  • After the GC, Eden space and To space are the ones in use, and the cycle repeats.

  • Each time an object survives a GC in a Survivor space, its age increases by 1. By default, objects that reach age 15 are moved to the old generation.

Generational collection algorithm vs. partition collection algorithm

Generational collection algorithm

Mainstream JVM garbage collection currently uses generational collection, which divides memory into several blocks according to object lifetime, such as the new generation, the old generation, and the permanent generation in the JVM. The most suitable GC algorithm can then be used for the characteristics of each generation.

  • New generation, copying algorithm: every garbage collection finds that a large number of objects have died and only a few survive. The copying algorithm is therefore chosen, because collection completes at the cost of copying only the few surviving objects.

  • Old generation, mark-compact algorithm: because objects have a high survival rate and there is no extra space to act as an allocation guarantee, the old generation must be collected with the mark-sweep or mark-compact algorithm. These need no memory copying and free the memory in place.

Partition collection algorithm

The partition algorithm divides the whole heap into contiguous small regions, each used and collected independently. The benefit is control over how many regions are collected at once: depending on the target pause time, a reasonable number of regions (rather than the whole heap) is collected each time, which reduces the pause caused by a single GC.

GC garbage collectors

Java heap memory is divided into the new generation and the old generation. The new generation mainly uses the copying and mark-sweep garbage collection algorithms, while the old generation mainly uses the mark-compact garbage collection algorithm. The Java virtual machine therefore provides several different garbage collectors for the new generation and for the old generation. The garbage collectors of the Sun HotSpot virtual machine in JDK 1.6 are described in the following sections.

Serial garbage collector (single-threaded, copying algorithm)

  • Serial is the most basic garbage collector. It uses the copying algorithm and was the only new-generation collector before JDK 1.3.1. Serial is a single-threaded collector: it not only uses just one CPU or one thread to do the collection, it must also suspend all other worker threads until the collection finishes. Although it pauses all other worker threads, Serial is simple and efficient; in a single-CPU environment it has no thread-interaction overhead and achieves the highest single-threaded collection efficiency. Serial therefore remains the default new-generation collector for Java virtual machines running in Client mode.

ParNew garbage collector (Serial + multithreading)

  • The ParNew garbage collector is the multithreaded version of the Serial collector. It also uses the copying algorithm and behaves exactly like Serial except that it collects with multiple threads. ParNew also suspends all other worker threads during garbage collection. By default, ParNew starts as many threads as there are CPUs; the number of collector threads can be limited with the -XX:ParallelGCThreads parameter. Although ParNew is almost identical to Serial apart from multithreading, it is the default new-generation garbage collector for many Java virtual machines running in Server mode.

Parallel Scavenge collector (multithreaded copying algorithm, efficient)

  • The Parallel Scavenge collector is also a new-generation garbage collector that uses the copying algorithm and is multithreaded. Its focus is reaching a controllable throughput, where throughput is the share of CPU time spent running user code: throughput = time running user code / (time running user code + garbage collection time). High throughput makes the most efficient use of CPU time and finishes the program’s computation as soon as possible, which mainly suits background computation that needs little interaction. Its adaptive tuning strategy is another important difference between the Parallel Scavenge collector and the ParNew collector.
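
The throughput formula can be applied to a running JVM with the standard `java.lang.management` API. This is a rough sketch, not a precise measurement: the class `ThroughputSketch` is a hypothetical name, JVM uptime is used as a stand-in for total running time, and for collectors that work concurrently the reported collection time is not pure pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class ThroughputSketch {
    /** throughput = user-code time / (user-code time + garbage collection time). */
    static double throughput(long userCodeMillis, long gcMillis) {
        long total = userCodeMillis + gcMillis;
        return (total == 0) ? 1.0 : (double) userCodeMillis / total;
    }

    public static void main(String[] args) {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
            gcMillis += Math.max(0, gc.getCollectionTime()); // -1 means the value is undefined
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        long userCodeMillis = Math.max(0, uptimeMillis - gcMillis);
        System.out.println("approximate throughput: " + throughput(userCodeMillis, gcMillis));
    }
}
```

For example, 99 units of user-code time and 1 unit of collection time give `throughput(99, 1)`, which is 0.99. The collector names printed by `main` also show which new-generation and old-generation collectors the JVM is currently using.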

Serial Old collector (single-threaded mark-compact algorithm)

Serial Old is the old-generation version of the Serial garbage collector. It is also single-threaded and uses the mark-compact algorithm. It is mainly the default old-generation collector for Java virtual machines running in Client mode. In Server mode it has two main uses:

  1. Paired with the new-generation Parallel Scavenge collector.

  2. As the backup garbage collection scheme for the CMS collector in the old generation.

Figure: garbage collection process with new-generation Serial paired with old-generation Serial Old.

The new-generation Parallel Scavenge collector works much like the ParNew collector: both are multithreaded, both use the copying algorithm, and both must suspend all worker threads during garbage collection.

Figure: garbage collection process of the new-generation Parallel Scavenge/ParNew collectors.

Parallel Old collector (multithreaded mark-compact algorithm)

The Parallel Old collector is the old-generation version of Parallel Scavenge. It uses a multithreaded mark-compact algorithm and first became available in JDK 1.6. Parallel Old was designed to provide a throughput-first garbage collector for the old generation as well. If the system has high throughput requirements, pairing new-generation Parallel Scavenge with old-generation Parallel Old is the strategy to consider first.

Figure: new-generation Parallel Scavenge paired with the old-generation Parallel Old collector.

CMS collector (multithreaded mark-sweep algorithm)

The Concurrent Mark Sweep (CMS) collector is an old-generation garbage collector whose primary goal is the shortest garbage collection pause time. Unlike the other old-generation collectors, which use the mark-compact algorithm, it uses a multithreaded mark-sweep algorithm. Short pauses improve the user experience of highly interactive programs. CMS works in a more complex way than the other collectors; the whole process has four phases:

  1. Initial mark: marks only the objects that GC Roots reference directly. It is fast but still suspends all worker threads.

  2. Concurrent mark: traces from GC Roots while running alongside user threads, so worker threads need not be suspended.

  3. Remark: corrects the marks of objects whose state changed because the user program kept running during concurrent marking. All worker threads must still be suspended.

  4. Concurrent sweep: clears the objects unreachable from GC Roots while running alongside user threads, so worker threads need not be suspended.

Because the collector threads work alongside user threads during the two longest phases, concurrent mark and concurrent sweep, CMS memory collection overall runs concurrently with user threads.

G1 collector

The Garbage First (G1) garbage collector is the most advanced result of garbage collector theory among the collectors covered here. Compared with the CMS collector, G1 has two notable improvements:

  1. It is based on the mark-compact algorithm, so it produces no memory fragmentation.

  2. Pause times can be controlled very precisely, which enables low-pause garbage collection without sacrificing throughput. G1 avoids collecting the whole heap: it divides heap memory into independent regions of fixed size and tracks the garbage collection progress of each region, while maintaining a priority list in the background. Within the allowed collection time, it collects the regions containing the most garbage first. Region partitioning and priority-based region collection ensure that G1 achieves the highest possible collection efficiency in limited time.
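
The idea of collecting the most garbage-rich regions first within a time budget can be sketched as a simple greedy selection. This is an illustrative simplification and not G1’s actual policy: the class `RegionSelectionSketch`, the per-region cost estimate `collectMillis`, and the skip-and-continue rule are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class RegionSelectionSketch {
    static final class Region {
        final int id;
        final int garbageBytes;  // reclaimable space in this region
        final int collectMillis; // hypothetical estimate of the cost of collecting it
        Region(int id, int garbageBytes, int collectMillis) {
            this.id = id;
            this.garbageBytes = garbageBytes;
            this.collectMillis = collectMillis;
        }
    }

    /** Picks regions with the most garbage first, as long as they fit into the pause budget. */
    static List<Integer> choose(int pauseBudgetMillis, Region... regions) {
        List<Region> byGarbage = new ArrayList<>(Arrays.asList(regions));
        // Priority list: most garbage first.
        byGarbage.sort(Comparator.comparingInt((Region r) -> r.garbageBytes).reversed());

        List<Integer> chosen = new ArrayList<>();
        int spentMillis = 0;
        for (Region r : byGarbage) {
            if (spentMillis + r.collectMillis <= pauseBudgetMillis) {
                chosen.add(r.id);
                spentMillis += r.collectMillis;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Integer> picked = choose(10,
                new Region(0, 100, 6),
                new Region(1, 900, 5),
                new Region(2, 500, 4));
        System.out.println("regions to collect: " + picked);
    }
}
```

With a budget of 10 ms, the example selects regions 1 and 2 (9 ms in total) and leaves region 0 for a later collection, because adding its 6 ms would exceed the budget.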