Introduction to the JVM specification

There are three ways to look at the JVM

Overview of the JVM

JVM: the Java Virtual Machine
A virtual machine is a complete computer system that is simulated in software, has full hardware functionality, and runs in a completely isolated environment
The JVM is a software-simulated computer that executes the Java bytecode instruction set; it is the runtime environment of Java programs

Key features of the JVM

Finds and loads class files through a ClassLoader
Interprets bytecode into instructions and executes them, providing the runtime environment for class files
Performs runtime memory allocation and garbage collection
Provides a platform for interacting with the hardware

The virtual machine is what guarantees Java’s platform independence

The role of the JVM specification and its core content

The role of the JVM specification

The Java Virtual Machine specification defines the target that Java code is compiled to, regardless of the underlying hardware
The specification makes Java software platform-independent because code is compiled for a “generic machine”, that is, the virtual machine
This “generic machine” can be simulated in software on various existing computer systems, or implemented in hardware

What the JVM specification mainly defines

Bytecode instruction set
The format of the Class file
Data types and values
Runtime data area
The stack frame
Special methods
The class library
Exceptions
Virtual machine startup, and the loading, linking, and initialization of classes
…

Class bytecode parsing

Class file format overview

The Class file is the input to the JVM, and its structure is defined in the Java Virtual Machine specification. The Class file is the foundation of the JVM’s platform independence and language independence

Unsigned numbers: the basic data type; u1, u2, u4, and u8 denote unsigned numbers of 1, 2, 4, and 8 bytes
Tables: a composite data type made up of multiple unsigned numbers and other tables; table names usually end with “_info”
A Class file is a stream of 8-bit bytes in which data items are laid out compactly, in order, with no padding
Data items larger than one byte are split into several 8-bit bytes and stored in big-endian order
There are only two kinds of data in the Class file format: unsigned numbers and tables
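
As a rough illustration of the unsigned-number items described above, the following sketch reads the first four items of a class file: the u4 magic number, the u2 minor and major versions, and the u2 constant_pool_count. The class name ClassHeader and its parse method are made up for this example, and main assumes Java 9 or later for readAllBytes().

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

class ClassHeader {
    final int magic;             // u4, 0xCAFEBABE in every valid class file
    final int minorVersion;      // u2
    final int majorVersion;      // u2
    final int constantPoolCount; // u2, number of constant pool entries plus one

    private ClassHeader(int magic, int minorVersion, int majorVersion, int constantPoolCount) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    // Hypothetical helper: decodes only the fixed-size start of the ClassFile structure.
    static ClassHeader parse(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // ByteBuffer is big-endian by default
        int magic = buf.getInt();                     // u4
        int minor = buf.getShort() & 0xFFFF;          // u2, masked to read it as unsigned
        int major = buf.getShort() & 0xFFFF;          // u2
        int cpCount = buf.getShort() & 0xFFFF;        // u2
        return new ClassHeader(magic, minor, major, cpCount);
    }

    public static void main(String[] args) throws IOException {
        // java.lang.Object is used only because its class file is always available.
        try (InputStream in = Object.class.getResourceAsStream("Object.class")) {
            ClassHeader h = parse(in.readAllBytes());
            System.out.printf("magic=%08X major=%d minor=%d constant_pool_count=%d%n",
                    h.magic, h.majorVersion, h.minorVersion, h.constantPoolCount);
        }
    }
}
```

The version numbers printed depend on the JDK that runs the example; only the magic number is the same everywhere.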

The format of the Class file

The javap tool prints an informal “virtual machine assembly language” in the following format:
<index> <opcode> [<operand1> [<operand2>...]] [<comment>]
<index> is the index of the instruction’s opcode in the array that holds the bytes of Java virtual machine code for the current method; it can also be read as the byte offset from the start of the method
<opcode> is the mnemonic of the instruction, <operandN> are its operands, and <comment> is an optional end-of-line comment

Class file format description

constant_pool_count: constant pool indices start at 1, not 0
Constants of different types are distinguished by a tag, and each tag is followed by its own info structure
In descriptors, L stands for an object type, [ for an array, and V for void
stack: the maximum depth of the operand stack while the method executes
locals: the storage required for local variables, measured in slots
A slot is the smallest unit the virtual machine uses when allocating memory for local variables
args_size: the number of arguments. An instance method with no declared parameters still shows 1, because this is passed implicitly and locals reserves a slot for it
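
To make the descriptor letters above concrete, here is a small sketch that asks the JDK for method descriptors through java.lang.invoke.MethodType. The class DescriptorDemo and its helper are invented for illustration; the descriptor strings are what the class file stores for methods with those signatures.

```java
import java.lang.invoke.MethodType;

class DescriptorDemo {
    // Hypothetical helper: builds the JVM method descriptor for a return type and parameter types.
    static String descriptorOf(Class<?> returnType, Class<?>... paramTypes) {
        return MethodType.methodType(returnType, paramTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // void main(String[]): "[" marks the array, "L...;" the object type, "V" void
        System.out.println(descriptorOf(void.class, String[].class));           // ([Ljava/lang/String;)V
        // int f(long, Object): "J" is long, "I" is int
        System.out.println(descriptorOf(int.class, long.class, Object.class));  // (JLjava/lang/Object;)I
    }
}
```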

ASM

Overview of ASM

ASM is a Java bytecode manipulation framework that can be used to dynamically generate classes or enhance the functionality of existing classes
ASM can directly generate binary class files or dynamically change class behavior before classes are loaded into virtual machines. After reading information from class files, ASM can change class behavior, analyze class information, and even generate new classes as required
Many frameworks, such as CGLIB, Hibernate, and Spring, use ASM directly or indirectly to manipulate bytecode

ASM programming model

Core API: an event-based programming model. It does not need to read the whole class structure into memory at once, so it is faster and uses less memory, but it is harder to program against
Tree API: a tree-based programming model. It reads the whole class structure into memory at once, so it uses more memory, but it is easier to program against

The ASM Core API

Bytecode manipulation in the ASM Core API is built around ClassVisitor. Each of its methods corresponds to an item of the same name in the class file
ASM provides three ClassVisitor-based classes for generating and transforming class files
ClassReader: parses the bytecode of a class
ClassAdapter: a ClassVisitor implementation that carries the changes to be applied (this is the ASM 3.x name; from ASM 4 onward ClassVisitor itself plays this role)
ClassWriter: also a ClassVisitor implementation; it outputs the transformed bytecode
ASM also provides the ASMifier tool, which generates the ASM code that would produce a given class, useful for comparison during development

Class loading, linking, and initialization

Class loading and class loaders

The entire life cycle of a class starts when it is loaded into the JVM and ends when it is unloaded from memory.

Loading: finds and loads the binary data of the class file
Linking: merges the binary data of a class that has been read into memory into the JVM runtime environment, in the following steps:
Verification: ensures that the loaded class is correct
Preparation: allocates memory for the static variables of the class and sets them to their default values
Resolution: converts symbolic references in the constant pool into direct references
Initialization: assigns the declared initial values to the static variables of the class

What class loading has to accomplish

Obtains the binary byte stream of a class from its fully qualified name
Converts the binary byte stream into a runtime data structure in the method area
Creates a java.lang.Class object on the heap that encapsulates the class data in the method area and serves as the interface for accessing it

Ways of loading classes

Most common: loading from the local file system or from an archive such as a JAR
Dynamic: compiling Java source files into classes at run time
Others: downloading over the network, loading from a proprietary database, and so on

Class loader

The Java virtual machine has the following built-in loaders:
Bootstrap class loader (BootstrapClassLoader)
Platform class loader (PlatformClassLoader); in JDK 8: extension class loader (ExtensionClassLoader)
Application class loader (AppClassLoader)
User-defined class loader: a subclass of java.lang.ClassLoader that lets you customize how classes are loaded. In the delegation order, a custom class loader comes after all of the system class loaders

Class loader relationship

Using class loaders

Class loader description

Bootstrap class loader: loads the foundational modules, such as java.base, java.management, and java.xml
Platform class loader: loads platform-related modules, such as java.scripting, java.compiler*, and java.corba*
Application class loader: loads application-level modules, such as jdk.compiler, jdk.jartool, and jdk.jshell, as well as all class libraries on the classpath
JDK 8: Bootstrap class loader: loads into memory the libraries in <JAVA_HOME>/lib, or in the path given by the -Xbootclasspath option, that the virtual machine recognizes (by name, such as rt.jar)
JDK 8: Extension class loader: loads all class libraries in <JRE_HOME>/lib/ext, or in the path given by the java.ext.dirs system property
JDK 8: Application class loader: loads all class libraries on the classpath
Java programs cannot reference the bootstrap class loader directly. Where a class loader is required, passing null means the bootstrap class loader is used
The JVM specification allows a class loader to preload a class when it anticipates that the class will be used, instead of waiting for the “first active use”
If the .class file is missing at load time, a LinkageError is reported on the first active use of the class; if the class is never used, no error is reported
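
The loader hierarchy described above can be inspected at run time. The sketch below walks getParent() from the system (application) class loader upward; the class name LoaderChain and the label used for the bootstrap loader are made up for this example, and the concrete loader class names printed are JDK implementation details that differ between JDK 8 and later versions.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {
    // Follows getParent() until it returns null, which stands for the bootstrap class loader.
    static List<String> chainOf(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        names.add("bootstrap (null)"); // label chosen for this sketch
        return names;
    }

    public static void main(String[] args) {
        // Core classes are defined by the bootstrap loader, which Java code sees as null.
        System.out.println(String.class.getClassLoader());
        chainOf(ClassLoader.getSystemClassLoader()).forEach(System.out::println);
    }
}
```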

Parent delegation model

Class loaders in the JVM normally follow the parent delegation model, which requires every class loader except the bootstrap class loader to have a parent loader. The parent-child relationship here is composition, not inheritance. It works as follows:

When a class loader receives a class loading request, it first searches all the “named modules” defined to the built-in loaders
If a suitable module definition is found, the class is loaded by the loader that defines that module
If the class is not found in any named module defined to these loaders, the request is delegated to the parent loader, all the way up to the bootstrap class loader
If the parent loader reports that it cannot complete the request, for example because the class is not in its search path, the child loader loads the class itself
Classes found on the classpath become members of the unnamed module of these loaders

The parent delegation model is important for the stable operation of Java programs, because it prevents a class from being loaded more than once
The code that implements parent delegation is in the loadClass() method of java.lang.ClassLoader; when writing a custom class loader, overriding findClass() is recommended
The loader that actually loads a class is called its defining class loader; every loader that can successfully return that Class object is called an initiating class loader
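
A minimal sketch of the recommendation to override findClass(): the loader below inherits loadClass() unchanged, so delegation to the parent still happens first, and findClass() is reached only when no parent can supply the class. LoggingLoader, tryLoad, and the class name demo.DoesNotExist are hypothetical; this loader defines no classes of its own and only records which requests fall through to it.

```java
import java.util.ArrayList;
import java.util.List;

class LoggingLoader extends ClassLoader {
    final List<String> findClassCalls = new ArrayList<>();

    LoggingLoader(ClassLoader parent) {
        super(parent);
    }

    // Called by the inherited loadClass() only after the parent chain has failed.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        findClassCalls.add(name);
        // A real loader would fetch the class bytes here and call defineClass().
        throw new ClassNotFoundException(name);
    }

    // Convenience wrapper for this sketch: returns null instead of throwing.
    Class<?> tryLoad(String name) {
        try {
            return loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        LoggingLoader loader = new LoggingLoader(ClassLoader.getSystemClassLoader());
        Class<?> s = loader.tryLoad("java.lang.String");
        System.out.println(s == String.class);                 // true: the parent chain supplied it
        System.out.println(loader.tryLoad("demo.DoesNotExist")); // null: nobody could load it
        System.out.println(loader.findClassCalls);             // [demo.DoesNotExist]
    }
}
```

For java.lang.String, this loader is an initiating loader and the bootstrap loader is the defining loader.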

Notes on the parent delegation model

If a custom class loader does not specify a parent, the system (application) class loader becomes its parent by default; passing null as the parent means the bootstrap class loader is used
Each class loader has its own namespace, made up of the classes loaded by that loader and by all of its parent loaders. Different namespaces can contain classes with the same fully qualified name
A runtime package consists of classes in the same package that were loaded by the same class loader. Whether two classes belong to the same runtime package depends not only on their package names being identical but also on their defining class loaders being the same. Only classes in the same runtime package can see each other’s package-private members

Breaking the parent delegation model

One problem with the parent delegation model is that a parent loader cannot see, looking downward, the resources loaded by its child loaders
To solve this, the thread context class loader was introduced; it can be set with Thread’s setContextClassLoader()
Hot deployment, such as OSGi’s modular hot deployment, no longer follows the parent delegation model strictly; much of the lookup happens among peer class loaders at the same level

Class linking and initialization

What verification checks during linking

Class file structure check: the file follows the class file structure defined by the JVM specification
Metadata verification: semantic analysis of the information described by the bytecode, to ensure it complies with the Java language specification
Bytecode verification: data-flow and control-flow analysis to ensure that the program semantics are legal and logical; this mainly verifies method bodies
Symbolic reference verification: checks that the symbolic references in the constant pool match information outside the class itself

Preparation during linking

Allocates memory for the static variables of the class and sets them to their default values

Resolution during linking

Resolution is the process of converting symbolic references in the constant pool into direct references
Symbolic reference: a set of unambiguous symbols that describes the referenced target, independent of the virtual machine implementation
Direct reference: a pointer, a relative offset, or a handle that locates the target indirectly; it depends on the virtual machine implementation
Resolution mainly applies to: classes, interfaces, fields, class methods, interface methods, method types, method handles, and call site specifiers

Class initialization

Class initialization is the process of assigning initial values to the static variables of a class, in other words executing the class initializer method <clinit>

If the class has not been loaded and linked yet, it is loaded and linked first
If the class has a superclass that has not been initialized, the superclass is initialized first
If the class contains initialization statements, they are executed in order
For interfaces:
When a class is initialized, the interfaces it implements are not initialized first
When an interface is initialized, its superinterfaces are not initialized first
An interface is initialized only when the program first uses a variable of the interface or calls one of its methods
Calling the loadClass method of ClassLoader to load a class does not initialize it and is not an active use of the class

Active use of a class

When classes are initialized

Java programs use classes either actively or passively. The JVM must initialize each class or interface on its “first active use”; passive use does not trigger initialization. Active uses are:

Creating an instance of the class
Accessing a static variable of the class or interface
Calling a static method of the class
Using reflection on the class
Initializing a subclass of a class that has not been initialized yet
Being the main class that is run when the JVM starts
An interface that declares a default method is initialized when a class implementing it is initialized
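
The difference between active and passive use can be observed with static initializers that record when they run. The following is a sketch under the assumption that everything is compiled by javac as one unit; InitDemo and its nested classes are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class InitDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Constants {
        static final int MAX = 100;          // compile-time constant
        static { LOG.add("Constants"); }
    }

    static class Parent {
        static int value = 1;                // ordinary static variable
        static { LOG.add("Parent"); }
    }

    static class Child extends Parent {
        static { LOG.add("Child"); }
    }

    static class Lazy {
        static { LOG.add("Lazy"); }
    }

    static List<String> run() {
        int max = Constants.MAX;             // passive: the constant is inlined at compile time
        Parent[] array = new Parent[2];      // passive: creating an array does not initialize Parent
        int v = Child.value;                 // active use of Parent only, the class that declares the field
        String name = Lazy.class.getName();  // passive: a class literal does not initialize the class
        try {
            InitDemo.class.getClassLoader().loadClass(name); // passive: loadClass does not initialize
            Class.forName(name);                             // active: reflection initializes Lazy
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run()); // [Parent, Lazy]
    }
}
```

Neither Constants nor Child appears in the log: reading an inlined constant and reading an inherited static field through the subclass name are both passive uses of those classes.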

Class unloading

When the Class object that represents a class is no longer referenced, its life ends and the corresponding data in the method area is unloaded
Classes loaded by the JVM’s built-in class loaders are never unloaded; classes loaded by user-defined class loaders can be unloaded

Memory allocation

Simplified architecture and runtime data areas for the JVM

Runtime data area

PC register, Java virtual machine stack, Java heap, method area, runtime constant pool, native method stack, etc

PC register

Each thread has its own PC register, which is thread-private and holds the address of the next instruction
The PC register is created when its thread is created
The value of the PC register is undefined while a native method is executing
It is a fairly small memory area, and the only one for which the JVM specification does not define an OutOfMemoryError

Java stack

A stack consists of a series of frames (hence the Java stack is also called a frame stack) and is thread-private
A stack frame holds a method’s local variables, its operand stack (the JVM has no general-purpose registers and passes all operands through the operand stack), a constant pool pointer, dynamic linking information, the method return address, and so on
Each method call creates a frame and pushes it onto the stack; when the method exits, moving the stack-top pointer discards the frame’s contents
The local variable table holds the primitive types and reference types known at compile time. Each slot holds 32 bits of data; long and double take two slots
Advantage of the stack: access is faster than the heap, second only to registers
Disadvantage of the stack: the size and lifetime of data on the stack are fixed at compile time, which limits flexibility

The Java heap

Stores the objects and arrays created by the application; the Java heap is shared by all threads
The GC mainly manages heap space; with a generational GC, the heap is divided into generations as well
Advantages of the heap: memory is sized and allocated dynamically at run time and reclaimed automatically by garbage collection
Disadvantage of the heap: relatively slow

Method area

The method area is shared by threads and typically holds the structural information of loaded classes
It is usually associated with Metaspace, but this depends on the JVM implementation and version
The JVM specification describes the method area as a logical part of the heap, but it also goes by the alias non-heap, presumably to distinguish it from the Java heap
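
The heap versus non-heap distinction is also visible through the standard java.lang.management API, which labels every memory pool as one or the other. The sketch below only lists pool names; MemoryPools is a made-up class name, and the pool names themselves depend on the JVM and the collector in use, so none are assumed here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {
    // Returns the names of the memory pools the running JVM reports for the given type.
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("heap:     " + poolNames(MemoryType.HEAP));
        System.out.println("non-heap: " + poolNames(MemoryType.NON_HEAP));
    }
}
```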

Runtime constant pool

The runtime representation of the constant_pool table of each class or interface in a Class file; it usually includes the class version, fields, methods, interfaces, and so on
It is allocated in the method area
Normally, the runtime constant pool is created once the class or interface has been loaded into the JVM

Native method stack

The stack the JVM uses to support the execution of native methods

Stack, heap, method area interaction

Java heap memory model and allocation

Overview of Java heap memory

The Java heap is used to hold objects and arrays created by the application system and is shared by all threads
The Java heap allocates memory dynamically at run time, automatically collecting garbage
Java garbage collection (GC) is mainly about reclaiming heap memory, and for generational GC, the heap is also generational

Structure of the Java heap

The young generation stores newly allocated objects; objects that survive collection in the young generation are copied to the old generation
Objects in the old generation are much older than those in the young generation
The old generation also stores some large objects
Total heap size = young generation + old generation
Young generation = Eden + Survivor spaces
The former permanent generation, which stored metadata such as Class and Method, was removed in JDK 8 and replaced by Metaspace. Metaspace does not live inside the virtual machine heap; it uses native memory

Object memory layout

The layout of an object in memory (taking the HotSpot virtual machine as an example) is divided into the object header, instance data, and alignment padding
Object header, which consists of two parts:
Mark Word: stores the object’s own runtime data, such as its hash code, GC generational age, and lock status flags
Type pointer: a pointer to the object’s class metadata
Instance data: where the object’s actual field values are stored
Alignment padding: not always present and with no special meaning; it is only a placeholder. HotSpot requires an object’s start address to be a multiple of 8 bytes, so padding is added when it is not

Locating objects for access

Using handles: a block of the Java heap is set aside as a handle pool. The reference stores the address of the handle, and the handle holds the addresses of the object’s instance data and of its class metadata

Using direct pointers: the reference stores the address of the object directly, and the object in the Java heap holds the address of its class metadata

Tracing and Java heap parameter configuration

Trace parameters

Print brief GC information: -Xlog:gc
Print detailed GC information: -Xlog:gc*
Write the GC log to a file: -Xlog:gc:garbage-collection.log
Print heap information after each GC: -Xlog:gc+heap=debug

GC log format

The time at which the GC occurred, in seconds since the JVM started
The log level and the log tags
The GC identification number
The type of GC and the reason for it
Capacity: occupancy before GC -> occupancy after GC (total capacity of the region)
GC duration, in seconds. Some collectors give more detail, such as user for CPU time spent in user mode, sys for CPU time spent in the kernel, and real for the wall-clock time from start to finish of the operation

Java heap parameters

-Xms: the initial heap size; by default 1/64 of physical memory
-Xmx: the maximum heap size; by default 1/4 of physical memory
-Xmn: the young generation size; a commonly cited guideline is 3/8 of the entire heap
-XX:+HeapDumpOnOutOfMemoryError: dumps the heap to a file when an OOM occurs
-XX:HeapDumpPath: the path of the OOM heap dump
-XX:NewRatio: the ratio of the old generation to the young generation. If -Xms equals -Xmx and -Xmn is set, this parameter is not needed
-XX:SurvivorRatio: the ratio of the Eden space to one Survivor space. With a value of 8, the two Survivor spaces and Eden are in the ratio 2:8, so one Survivor space is 1/10 of the young generation
-XX:OnOutOfMemoryError: runs a script when an OOM occurs
-Xss: the thread stack size, usually only a few hundred KB; it determines how deep method calls can nest
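
The ratio parameters above translate into region sizes with simple arithmetic. The sketch below shows the idealized calculation only; HeapSizing and its methods are hypothetical, and a real JVM additionally aligns and adjusts the sizes, so actual values can differ slightly.

```java
class HeapSizing {
    // NewRatio = old / young, so young = heap / (NewRatio + 1)
    static long youngSize(long heap, int newRatio) {
        return heap / (newRatio + 1);
    }

    // SurvivorRatio = eden / one survivor, so one survivor = young / (SurvivorRatio + 2)
    static long survivorSize(long young, int survivorRatio) {
        return young / (survivorRatio + 2);
    }

    // Eden is what remains of the young generation after the two Survivor spaces.
    static long edenSize(long young, int survivorRatio) {
        return young - 2 * survivorSize(young, survivorRatio);
    }

    public static void main(String[] args) {
        long heapMb = 3000;                         // e.g. -Xms3000m -Xmx3000m
        long youngMb = youngSize(heapMb, 2);        // -XX:NewRatio=2     -> 1000
        long survivorMb = survivorSize(youngMb, 8); // -XX:SurvivorRatio=8 -> 100
        long edenMb = edenSize(youngMb, 8);         //                     -> 800
        System.out.println(youngMb + " " + survivorMb + " " + edenMb);
        // The limit the running JVM actually applies can be read back at run time:
        System.out.println("max heap bytes: " + Runtime.getRuntime().maxMemory());
    }
}
```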

Metaspace parameters

-XX:MetaspaceSize: the initial Metaspace size
-XX:MaxMetaspaceSize: the maximum Metaspace size; unlimited by default
-XX:MinMetaspaceFreeRatio: the minimum percentage of Metaspace capacity that should be free after a GC
-XX:MaxMetaspaceFreeRatio: the maximum percentage of Metaspace capacity that should be free after a GC

Bytecode execution engine

Conceptually, the JVM’s bytecode execution engine takes a bytecode file as input, parses and processes the bytecode, and outputs the execution result
It can do this by interpreting the bytecode directly (interpreted execution), by having a just-in-time compiler produce native code (compiled execution), or by combining both

The stack frame

Stack frames are data structures used to support method calls and method execution by the JVM
Stack frames are created with method calls and destroyed with method terminations
The stack frame stores the local variable table, operand stack, dynamic link, method return address and other information of the method

Local variable table

Local variable table: the storage for method parameters and for local variables defined inside the method

The local variable table is measured in slots; currently one slot holds a value of up to 32 bits
64-bit data takes two slots
For instance methods, slot 0 holds this, and the parameters are then assigned slots 1 to n
After that, slots are assigned according to the order and scope of the variables defined in the method body
Slots are reused to save stack frame space, a design that can affect the garbage collection behavior of the system
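
The slot rules above can be traced on a small method. The comments in this sketch give the slot numbers that follow from those rules; SlotDemo is a hypothetical class, and the reuse of a slot after a block ends is what javac typically does rather than something the specification requires. Compiling with -g and running javap -v on the class shows the actual LocalVariableTable.

```java
class SlotDemo {
    // Instance method: slot 0 = this, slot 1 = a (int), slots 2-3 = b (long takes two slots)
    long sum(int a, long b) {
        long total = a + b;      // slots 4-5
        {
            int temp = 7;        // slot 6, in scope only inside this block
            total += temp;
        }
        int after = 1;           // declared after temp went out of scope, so slot 6 can be reused
        return total + after;
    }

    // Static method: there is no this, so the first parameter starts at slot 0
    static int twice(int x) {
        return x * 2;
    }

    public static void main(String[] args) {
        System.out.println(new SlotDemo().sum(1, 2L)); // 11
        System.out.println(twice(21));                 // 42
    }
}
```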

Operand stack

Operand stack: stores the data that each instruction operates on while a method executes

The data types of the elements on the operand stack must match the sequence of bytecode instructions exactly
The virtual machine may optimize the stack frame implementation so that two stack frames overlap and share common data

Dynamic linking

Dynamic linking: each stack frame holds a reference, into the runtime constant pool, to the method the frame belongs to, in order to support dynamic linking during method calls

Static resolution: symbolic references are converted to direct references when the class is loaded
Dynamic linking: symbolic references are converted to direct references at run time

Method return address

Method return address: the address to which control returns after the method finishes

Method invocation

Method invocation: determines which method is called; it does not involve the execution inside the method

For some methods, the direct reference is determined during the resolution phase of class loading
Instance methods, also called virtual methods, require dynamic dispatch at run time because of overriding and polymorphism

Dispatch

Static dispatch: any dispatch that relies on the static type to select the method version, such as method overloading
Dynamic dispatch: dispatch that selects the method version based on the actual runtime type, such as method overriding
Single dispatch and multiple dispatch: classified by the number of criteria the dispatch is based on; more than one criterion is multiple dispatch, and exactly one is single dispatch
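
Static and dynamic dispatch can be seen side by side in a few lines. In this sketch (all names are invented for illustration) the overload is chosen from the static type of the argument, while the overridden method is chosen from the runtime type of the receiver.

```java
class DispatchDemo {
    static class Animal {
        String sound() { return "..."; }
    }

    static class Dog extends Animal {
        @Override
        String sound() { return "woof"; }
    }

    // Two overloads: the compiler picks one from the static (declared) type of the argument.
    static String describe(Animal a) { return "describe(Animal)"; }
    static String describe(Dog d)    { return "describe(Dog)"; }

    public static void main(String[] args) {
        Animal pet = new Dog();            // static type Animal, runtime type Dog
        System.out.println(describe(pet)); // describe(Animal): static dispatch on the declared type
        System.out.println(pet.sound());   // woof: dynamic dispatch on the runtime type
    }
}
```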

Garbage collection

Overview of garbage collection

What is garbage: simply put, memory that is no longer in use is garbage
Algorithms for identifying garbage:
Reference counting: attaches a reference counter to each object, incremented when a reference to the object is created and decremented when a reference becomes invalid
Advantages: simple to implement, efficient
Disadvantages: cannot handle circular references between objects
Root search algorithm (reachability analysis): searches downward from the GC Roots nodes; the path traversed is called a reference chain. If no reference chain connects an object to a root, the object is unusable
Objects that can serve as GC Roots include: objects referenced from the virtual machine stack (the local variables in stack frames), objects referenced by static fields of classes in the method area, objects referenced by constants in the method area, and objects referenced through JNI in the native method stack
HotSpot uses a set of data structures called OopMap to implement exact GC
With the help of OopMap the JVM can enumerate GC Roots very quickly. However, the JVM does not generate an OopMap for every instruction, only at specific locations
The “specific locations” where an OopMap is recorded are called safepoints; a thread can be paused for GC only after it has run to a safepoint
If object reference relationships do not change within a stretch of code, so that starting a GC anywhere in it is safe, that stretch is called a safe region
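
The root search (reachability analysis) idea can be simulated on a toy object graph. This is only a model of the marking step, not how a real collector is implemented; Reachability and Node are hypothetical names. Note how the two objects that reference each other in a cycle are still found unreachable, which reference counting alone would miss.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Reachability {
    static class Node {
        final String name;
        final List<Node> refs = new ArrayList<>(); // outgoing references
        Node(String name) { this.name = name; }
    }

    // Marks every node that can be reached by following references from the roots.
    static Set<Node> reachableFrom(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n)) {          // first visit: follow its references
                pending.addAll(n.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);                    // root a -> b
        c.refs.add(d);
        d.refs.add(c);                    // c <-> d form a cycle that no root references
        Set<Node> live = reachableFrom(Arrays.asList(a));
        for (Node n : Arrays.asList(a, b, c, d)) {
            System.out.println(n.name + (live.contains(n) ? " reachable" : " garbage"));
        }
    }
}
```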

Fundamentals of garbage collection

Cross-generation references

Cross-generation reference: an object in one generation references an object in another generation
Cross-generation reference hypothesis: cross-generation references are a small minority compared with same-generation references
Implicit corollary: two objects that reference each other tend to live and die together

Remembered set

Remembered set: an abstract data structure that records the set of pointers from the non-collected region into the collected region
Word precision: each record is precise to one machine word, and that word contains a cross-generation pointer
Object precision: each record is precise to one object, and that object has a field containing a cross-generation pointer
Card precision: each record is precise to one region of memory, and that region contains an object with a cross-generation pointer
Card table: a concrete implementation of the remembered set, which defines the remembered set’s recording precision and its mapping to heap memory
Each element of the card table corresponds to a memory block of a fixed size in the memory area it describes; this block is called a card page

Write barriers

A write barrier can be thought of as an AOP-style aspect that the JVM weaves around the action of assigning a reference-type field
The write barrier is used to keep the card table up to date when references between objects change
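
How a write barrier keeps a card table current can be sketched with a byte array. Everything here is illustrative: CardTableSketch is a made-up class, addresses are plain numbers, the 512-byte card page size is an assumption chosen for the example, and the CLEAN and DIRTY values are arbitrary rather than those of any particular JVM.

```java
class CardTableSketch {
    static final int CARD_SHIFT = 9;         // assumption: 2^9 = 512-byte card pages
    static final byte CLEAN = 0, DIRTY = 1;  // illustrative values only

    final byte[] cards;                      // one byte per card page

    CardTableSketch(long heapBytes) {
        // Round up so that a partially filled last page still gets a card.
        cards = new byte[(int) ((heapBytes + (1L << CARD_SHIFT) - 1) >>> CARD_SHIFT)];
    }

    // The "write barrier": runs after a reference field at fieldAddress has been assigned.
    void postWriteBarrier(long fieldAddress) {
        cards[(int) (fieldAddress >>> CARD_SHIFT)] = DIRTY;
    }

    boolean isDirty(int cardIndex) {
        return cards[cardIndex] == DIRTY;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(4096); // 8 card pages
        table.postWriteBarrier(1030);                      // address 1030 lies in card page 2
        System.out.println(table.isDirty(2));              // true
        System.out.println(table.isDirty(1));              // false
    }
}
```

During a young generation collection, only the dirty card pages need to be scanned for pointers into the collected region, instead of the whole old generation.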

Steps for deciding whether an object is garbage

The root search algorithm finds the object unreachable
Check whether its finalize method needs to be executed
If the object is still unreachable after these two steps, it is garbage

GC types

Minor GC / Young GC: a collection of the young generation
Major GC / Old GC: a collection of the old generation; currently only the CMS collector collects the old generation on its own
Mixed GC: collects the entire young generation and part of the old generation; currently only the G1 collector does this
Full GC: collects the entire Java heap and the method area

Stop-The-World

STW is a global pause in Java, mostly caused by GC. During a global pause all Java code stops running; native code can still execute but cannot interact with the JVM
The harm is that the service stops and does not respond for a long time; in HA systems this can trigger an active/standby switchover, which seriously harms the production environment

Garbage collection modes

Serial collection: a single GC thread reclaims memory while all user threads are suspended, as in Serial
Parallel collection: multiple GC threads work in parallel while user threads are suspended, as in Parallel
Concurrent collection: user threads and GC threads run at the same time (not necessarily in parallel; they may alternate), so user threads do not need to be suspended, as in CMS

Conditions for deciding that a class is no longer needed

All instances of the class in the JVM have been reclaimed
The ClassLoader that loaded the class has been reclaimed
The Class object of the class is not referenced anywhere
The class cannot be accessed anywhere through reflection

Garbage collection algorithms

Mark-sweep algorithm

Mark-sweep: works in two phases, first marking the objects to be reclaimed and then reclaiming them all together
Advantage: simple
Disadvantages:
Low efficiency: both marking and sweeping are inefficient
It produces many non-contiguous memory fragments, so allocating a large object can trigger another GC

Copying algorithm

Copying: divides memory into two equal halves and uses only one at a time. When that half is used up, the surviving objects are copied to the other half and the used half is cleared
Advantages: simple to implement, efficient to run, and no memory fragmentation to worry about
Disadvantage: wastes some memory
In actual JVM implementations, memory is divided into one larger Eden space and two smaller Survivor spaces. Eden and one Survivor space are in use at any time, and on collection the surviving objects are copied to the other Survivor space
HotSpot’s default Eden-to-Survivor ratio is 8:1, which means 90% of the young generation is usable at any time
If the Survivor space is too small, the old generation acts as an allocation guarantee: objects that do not fit go directly into the old generation

Allocation guarantee: when a young generation collection leaves more survivors than the Survivor space can hold, those objects are placed in the old generation. This is the space allocation guarantee the old generation provides for young generation GC. The steps are as follows:

Before a Minor GC, the JVM checks whether the largest contiguous free space in the old generation is greater than the total size of all objects in the young generation; if it is, the Minor GC is guaranteed to be safe
If it is smaller, the JVM checks whether guarantee failure is allowed; if it is, the JVM goes on to check whether the largest contiguous free space in the old generation is greater than the average size of the objects promoted to the old generation in previous collections
If it is, a Minor GC is attempted
If not, a Full GC is performed instead
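
The decision steps above can be written down as a small function. This is a sketch of the logic as described in the text, not JVM source code; AllocationGuarantee, Decision, and all parameter names are hypothetical.

```java
class AllocationGuarantee {
    enum Decision { MINOR_GC, FULL_GC }

    static Decision decide(long oldMaxContiguousFree,  // largest contiguous free space in the old generation
                           long youngUsed,             // total size of all objects in the young generation
                           boolean allowGuaranteeFailure,
                           long avgPromoted) {         // average size promoted in previous collections
        if (oldMaxContiguousFree > youngUsed) {
            return Decision.MINOR_GC;                  // safe even if every young object survives
        }
        if (allowGuaranteeFailure && oldMaxContiguousFree > avgPromoted) {
            return Decision.MINOR_GC;                  // attempted, but it may still fail
        }
        return Decision.FULL_GC;
    }

    public static void main(String[] args) {
        System.out.println(decide(100, 50, false, 0)); // MINOR_GC
        System.out.println(decide(40, 50, true, 30));  // MINOR_GC
        System.out.println(decide(40, 50, true, 45));  // FULL_GC
        System.out.println(decide(40, 50, false, 10)); // FULL_GC
    }
}
```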

Mark-compact algorithm

Mark-compact: the copying algorithm is inefficient and wastes space when many objects survive, so it is generally not used in the old generation; the old generation mostly uses the mark-compact algorithm instead
Marking is the same as in mark-sweep, but afterwards, instead of clearing the reclaimable objects directly, all live objects are moved to one end and the memory beyond the boundary is cleared

Garbage collectors

Serial collector, parallel collector, young-generation Parallel Scavenge collector, CMS, G1

Serial collector

Serial collector / Serial Old collector: a single-threaded collector that stops the world while collecting garbage
Advantages: simple; on a single CPU it can be more efficient because there is no multi-thread interaction overhead; it is the default young generation collector in Client mode
Enable it with -XX:+UseSerialGC, which uses the collector combination Serial + Serial Old
The young generation uses the copying algorithm and the old generation uses the mark-compact algorithm

Parallel collector

ParNew collector

ParNew (parallel) collector: uses multiple threads for garbage collection and stops the world while it collects
On multi-CPU machines its pauses are shorter than those of the serial collector; on a single CPU, or on machines with weak parallelism, it can be slower than the serial collector because of thread-interaction overhead
It is the preferred new-generation collector in Server mode and can be paired with the CMS collector
-XX:+UseParNewGC is no longer used to enable it on its own
-XX:ParallelGCThreads: sets the number of GC threads, preferably equal to the number of CPUs

New-generation Parallel Scavenge collector

Parallel Scavenge / Parallel Old collector: Parallel Scavenge is a new-generation collector that uses the copying algorithm
It is similar to ParNew but focuses on throughput; it makes the most efficient use of the CPU and suits applications that run in the background
-XX:+UseParallelGC enables it
-XX:+UseParallelOldGC selects the Parallel Old collector for the old generation, giving the combination Parallel Scavenge + Parallel Old
-XX:MaxGCPauseMillis: sets the maximum GC pause time
The new generation uses the copying algorithm and the old generation uses the mark-compact algorithm

CMS collector

The CMS (Concurrent Mark Sweep) collector works in the following phases:
Initial mark: marks only the objects directly reachable from the GC Roots
Concurrent mark: traces the object graph from the GC Roots (GC Roots tracing)
Remark: corrects the marks of objects whose state changed because the program kept running during concurrent marking
Concurrent sweep: reclaims the garbage objects concurrently with the application
Concurrent reset: finally, clears the data associated with this collection in preparation for the next one
Stop-the-world pauses still occur during the initial mark and remark phases
CMS uses the mark-sweep algorithm and collects concurrently with multiple threads
Advantages: short pauses, concurrent execution
Disadvantages:
Concurrent execution puts heavy pressure on CPU resources
Garbage produced while the collector is running (floating garbage) cannot be handled in the same cycle, which can lead to a Full GC
The mark-sweep algorithm leaves a large amount of fragmentation, which can trigger a Full GC when a large object is allocated
-XX:+UseConcMarkSweepGC: enables the collector combination ParNew + CMS + Serial Old, where Serial Old serves as the fallback collector when CMS fails
-XX:CMSInitiatingOccupancyFraction: sets the old-generation occupancy percentage at which a CMS collection is triggered; given here as 80% by default, though the effective default varies with the JDK version

G1 collector

The G1 (Garbage-First) collector is aimed at server-side applications. Compared with other collectors, it has the following characteristics:

G1 divides memory into multiple independent regions (Region)
G1 still keeps the generational concept, with a new generation and an old generation, but they are no longer physically separate; each is a set of regions that need not be contiguous
G1 takes full advantage of multi-CPU, multi-core hardware to shorten stop-the-world (STW) pauses as much as possible
Viewed as a whole, G1 uses a mark-compact algorithm; viewed locally, it uses a copying algorithm. Neither produces memory fragmentation
G1 pauses are predictable: you can specify how much time may be spent on garbage collection within a given time span
G1 tracks the value of the garbage in each region and maintains a priority list in the background; on each collection it reclaims the most valuable regions first, within the allowed time, so that collection is as efficient as possible in the limited time
Garbage collection phases:
Initial mark: marks only the objects directly reachable from the GC Roots
Concurrent mark: traces the object graph from the GC Roots (GC Roots tracing)
Final mark: corrects the marks of objects whose state changed because the program kept running during concurrent marking
Screening and reclamation: reclaims the regions that yield the most value within the allowed time
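
The idea of "collect the most valuable regions within the allowed time" can be sketched as a greedy selection. This is a deliberate simplification under stated assumptions: the `Region` class, its fields, and the bytes-per-millisecond ranking are hypothetical, and G1's real heuristics are considerably more involved.

```java
import java.util.ArrayList;
import java.util.List;

/** Greedy sketch of choosing regions under a pause budget; not G1's actual algorithm. */
final class CollectionSetSketch {
    /** Hypothetical per-region statistics. */
    static final class Region {
        final String name;
        final long reclaimableBytes;   // how much garbage the region holds
        final double estimatedCostMs;  // predicted time to collect it

        Region(String name, long reclaimableBytes, double estimatedCostMs) {
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.estimatedCostMs = estimatedCostMs;
        }

        double efficiency() {
            return reclaimableBytes / estimatedCostMs;
        }
    }

    static List<Region> choose(List<Region> candidates, double pauseBudgetMs) {
        List<Region> sorted = new ArrayList<>(candidates);
        // Highest reclaimed-bytes-per-millisecond first.
        sorted.sort((a, b) -> Double.compare(b.efficiency(), a.efficiency()));
        List<Region> chosen = new ArrayList<>();
        double spent = 0;
        for (Region r : sorted) {
            if (spent + r.estimatedCostMs <= pauseBudgetMs) {
                chosen.add(r);
                spent += r.estimatedCostMs;
            }
        }
        return chosen;
    }
}
```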
Using and configuring G1:
-XX:+UseG1GC: enables G1 (G1 is the default collector from JDK 9 onward)
-XX:MaxGCPauseMillis=n: maximum GC pause time; this is a soft goal that the JVM tries, but does not guarantee, to stay under
-XX:InitiatingHeapOccupancyPercent=n: heap occupancy percentage at which a GC cycle is triggered; the default is 45
-XX:NewRatio=n: ratio of old generation to new generation; the default is 2
-XX:SurvivorRatio=n: ratio of Eden to one survivor space; the default is 8
-XX:MaxTenuringThreshold=n: maximum age at which objects are promoted from the new generation to the old generation; the default is 15
-XX:ParallelGCThreads=n: number of threads used for parallel GC; the default varies by platform
-XX:ConcGCThreads=n: number of threads used for concurrent GC
-XX:G1ReservePercent=n: percentage of the heap kept free as a reserve to reduce the risk of to-space overflow; the default is 10%
-XX:G1HeapRegionSize=n: size of a G1 region; the value is a power of 2 between 1 MB and 32 MB, and the goal is to have about 2048 regions based on the minimum Java heap size
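
To make the region-size rule concrete, here is a sketch of one plausible heuristic: divide the heap by 2048, round down to a power of two, and clamp to the 1 MB to 32 MB range. The exact formula is an assumption for illustration and may differ from what a given JDK version does.

```java
/** Assumed heuristic for picking a region size; for illustration only. */
final class G1RegionSizeSketch {
    static final long MB = 1024L * 1024L;

    static long regionSize(long heapBytes) {
        long target = heapBytes / 2048;                          // aim for about 2048 regions
        long clamped = Math.max(MB, Math.min(32 * MB, target));  // keep within [1 MB, 32 MB]
        return Long.highestOneBit(clamped);                      // round down to a power of two
    }
}
```

Under this assumption a 4 GB heap yields 2 MB regions, while heaps of 64 GB or more hit the 32 MB ceiling.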

ZGC collector (for awareness)

ZGC collector: a low-latency collector added to JDK 11 as an experimental feature
ZGC is designed to support heaps of terabyte scale, keep pause times low (<10 ms), and reduce overall application throughput by less than 15%
New techniques in ZGC: colored pointers and read barriers
GC performance indicators:
Throughput = application code execution time / total run time
GC overhead: the complement of throughput, that is, GC time / total run time
Pause time: the length of time the application is stopped during a stop-the-world pause
GC frequency: the number of GCs that occur in a given period of time
Promptness: the time from when an object becomes garbage until it is reclaimed
Interactive applications usually want pause times to be as short as possible
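
The throughput and GC overhead formulas can be checked with a few lines of arithmetic. The class and method names below are illustrative only; total run time is taken to be application time plus GC time.

```java
/** Worked arithmetic for the throughput and GC overhead indicators. */
final class GcMetricsSketch {
    // throughput = application time / total run time
    static double throughput(double appTimeMs, double gcTimeMs) {
        return appTimeMs / (appTimeMs + gcTimeMs);
    }

    // GC overhead = GC time / total run time, so throughput + overhead = 1
    static double gcOverhead(double appTimeMs, double gcTimeMs) {
        return gcTimeMs / (appTimeMs + gcTimeMs);
    }
}
```

For instance, 990 ms of application work and 10 ms of GC give a throughput of about 0.99 and a GC overhead of about 0.01.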
JVM memory configuration principles:
Make the new generation as large as possible. If it is too small:
YGC (young GC) becomes more frequent
Objects that survive a YGC may be pushed into the old generation, and if the old generation fills up, an FGC (Full GC) is triggered
Old generation, for response-time-first applications: because the old generation usually uses a concurrent collector, its size must be chosen with parameters such as the concurrency level and the duration of the concurrent phases in mind
If it is set too small, memory fragmentation and a high collection frequency may cause application pauses
If it is set too large, each collection takes a long time
Old generation, for throughput-first applications: usually a large new generation and a smaller old generation are configured, so that as many short-lived objects as possible are reclaimed early, fewer medium-lived objects are promoted, and the old generation holds mostly long-lived objects
Objects are classified by lifetime: objects are allocated in the new generation first, and objects that survive for a long time move to the old generation
The collection algorithm is chosen according to the characteristics of each generation: where few objects survive, the copying algorithm fits; where many objects survive, mark-sweep or mark-compact fits

Efficient concurrency

Java memory model and memory interaction

Java memory model

The JCP defines the Java memory model. It used to be part of the JVM specification and was later split out as JSR-133 (Java Memory Model and Thread Specification Revision)
Memory model: an abstraction of how read and write accesses to a particular memory or cache proceed under a particular operating protocol
The Java memory model is concerned with the low-level details of how the JVM stores variable values to memory and retrieves them from memory
All (shared) variables are stored in main memory, and each thread has its own working memory, which holds copies of the main-memory variables that the thread uses
All of a thread's operations on variables (reads and writes) must be performed in working memory
A thread cannot access another thread's working memory; values are exchanged between threads through main memory

Interaction between main memory and working memory

The Java memory model defines the following operations to implement memory interaction, and the JVM guarantees that each of them is atomic
lock: acts on a main-memory variable and marks it as exclusively owned by one thread
unlock: acts on a main-memory variable and releases it from the locked state so that other threads can lock it
read: transfers the value of a variable from main memory to working memory
load: puts the value obtained by read into the working-memory copy of the variable
use: passes the value of a variable in working memory to the execution engine
assign: assigns a value received from the execution engine to a variable in working memory
store: transfers the value of a variable in working memory to main memory
write: puts the value obtained by store into the main-memory variable

Rules for memory interaction

read and load, and store and write, must occur in pairs; neither operation of a pair may occur alone. The two operations must execute in order but need not execute consecutively, so other instructions may be inserted between read and load and between store and write
A thread may not discard its most recent assign operation: after a variable changes in working memory, the change must be synchronized back to main memory
A thread may not synchronize a variable from its working memory back to main memory when no assign operation has occurred
A new variable can only be created in main memory; working memory may not use an uninitialized variable directly. In other words, assign and load must have been performed on a variable before use and store
A variable can be locked by only one thread at a time, but the same thread may repeat the lock operation several times; the variable is unlocked only after the same number of unlock operations
Performing lock on a variable clears its value from working memory; before the execution engine uses the variable, load or assign must be executed again to initialize its value
A variable that has not been locked may not be unlocked, and a thread may not unlock a variable locked by another thread
Before unlock is performed on a variable, the variable must be synchronized back to main memory (store and write)

Volatile features

Visibility in multithreading

Visibility: when one thread modifies a variable, other threads can see the change
Common ways to guarantee visibility: volatile, synchronized, and final (once initialization completes, the value is visible to other threads)

volatile

volatile is essentially the lightest-weight synchronization mechanism the JVM provides. A volatile variable is visible to all threads: a write to it is immediately reflected in other threads
Operations on volatile variables are still not thread-safe in general, because volatile does not make compound operations such as i++ atomic
volatile variables forbid instruction-reordering optimization
volatile is suitable when:
The result of the operation does not depend on the variable's current value
Or only one thread ever modifies the variable's value
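
A stop flag is a typical case that meets both conditions: the new value does not depend on the old one, and only one thread writes it. The sketch below uses hypothetical names; `demo()` exists only to show the flag being observed by a worker thread.

```java
/** Single-writer stop flag: a use of volatile that fits the conditions above. */
final class VolatileFlagSketch {
    private volatile boolean running = true; // written by one thread, read by the worker
    private long iterations;                 // touched only by the worker thread

    void stop() {
        running = false; // the new value does not depend on the old one
    }

    void work() {
        while (running) { // the volatile read observes the write made by stop()
            iterations++;
        }
    }

    /** Returns true if the worker noticed the flag and terminated. */
    static boolean demo() {
        VolatileFlagSketch task = new VolatileFlagSketch();
        Thread worker = new Thread(task::work);
        worker.setDaemon(true);
        worker.start();
        try {
            Thread.sleep(20);  // let the worker spin for a moment
            task.stop();       // the single writer flips the flag
            worker.join(5000); // the worker should notice promptly
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }
}
```

Without volatile on `running`, the worker would not be guaranteed to see the update and could loop forever.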

Instruction reordering

Instruction reordering: for optimization, the JVM may reorder instructions so that instructions that can execute immediately run first, avoiding a stall while waiting for the data the next instruction needs
Serial semantics are preserved within a thread; semantics between threads are not taken into account
Not all instructions can be reordered. For example:
a = 1; b = a; writes a variable and then reads it
a = 1; a = 2; writes a variable and then writes it again
a = b; b = 1; reads a variable and then writes it
The statements above cannot be reordered, but a = 1; b = 2; can be
Reordering must respect the following happens-before rules:
Program order rule: semantics within a single thread are serial
volatile rule: a write to a volatile variable happens before a subsequent read of it
Lock rule: an unlock happens before a subsequent lock of the same lock
Transitivity: if A happens before B and B happens before C, then A happens before C
A thread's start() happens before every action of that thread
All actions of a thread happen before its termination
A thread's interrupt() happens before the code of the interrupted thread that detects the interrupt
An object's constructor finishes before its finalize() method begins
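
The thread start and termination rules can be seen in a minimal sketch that shares plain (non-volatile) fields safely. The class and field names are hypothetical; the guarantees come from start() and join().

```java
/** Visibility through Thread.start() and Thread.join(), with plain fields. */
final class StartJoinVisibilitySketch {
    private int input;  // plain (non-volatile) on purpose
    private int output;

    static int demo() {
        StartJoinVisibilitySketch s = new StartJoinVisibilitySketch();
        s.input = 21; // written before start()
        // start() happens before the thread's actions, so the thread sees 21.
        Thread t = new Thread(() -> s.output = s.input * 2);
        t.start();
        try {
            t.join(); // the thread's actions happen before join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return s.output; // 42 is guaranteed to be visible here
    }
}
```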

Thread-safety approaches in Java

Immutable objects are thread-safe
Mutual-exclusion synchronization (blocking synchronization): synchronized and java.util.concurrent.locks.ReentrantLock. The two currently perform about the same, so synchronized is recommended. ReentrantLock adds the following features:
Interruptible waiting: when the thread holding the lock does not release it for a long time, a waiting thread can choose to give up waiting
Fair lock: when several threads wait for the same lock, they must acquire it in the order in which they requested it
Binding multiple conditions: one ReentrantLock can be bound to several Condition objects, whereas synchronized has a single implicit condition, so several conditions require several locks
Non-blocking synchronization: an optimistic strategy based on conflict detection. The operation is performed first; if there is no conflict, it succeeds, and if there is a conflict, a compensating action such as retrying is taken
No synchronization: when the methods involved share no data between threads, no synchronization is needed
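
Non-blocking synchronization is usually built on compare-and-set (CAS). The sketch below, with hypothetical class and method names, spells out the "try first, retry on conflict" loop using java.util.concurrent.atomic.AtomicInteger.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Optimistic counter: operate first, retry if another thread interfered. */
final class CasCounterSketch {
    private final AtomicInteger value = new AtomicInteger();

    int increment() {
        while (true) {
            int current = value.get();
            int next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next; // no conflict: the update took effect
            }
            // conflict: another thread changed the value first, so retry
        }
    }

    int get() {
        return value.get();
    }

    /** Runs several threads against one counter and returns the final value. */
    static int demo(int threads, int perThread) {
        CasCounterSketch counter = new CasCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }
}
```

In practice `AtomicInteger.incrementAndGet()` does the same job; the explicit loop is shown only to expose the retry.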

Lock optimization

Spin locks and adaptive spinning

Spin: if a thread is likely to acquire the lock soon, it can run a few busy loops instead of being suspended at the OS level; this is called spinning
Adaptive spinning: the spin time is no longer fixed but is decided by the previous spin time on the same lock and the state of the lock's owner
If the lock is held only briefly and the spin succeeds, the cost of suspending and switching threads is saved, which improves system performance
If the lock is held for a long time and the spin fails, processor resources are wasted and system performance degrades
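
To show what spinning means at the code level, here is a user-level spin lock built on AtomicBoolean. It is an illustration of the busy-wait idea only, with hypothetical names, and not how HotSpot implements spinning inside synchronized; Thread.onSpinWait() requires Java 9 or later.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** User-level spin lock: busy-waits instead of suspending the thread. */
final class SpinLockSketch {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    void lock() {
        while (!locked.compareAndSet(false, true)) {
            Thread.onSpinWait(); // hint to the CPU that this is a spin loop
        }
    }

    void unlock() {
        locked.set(false);
    }

    /** Increments a plain counter under the spin lock and returns the total. */
    static int demo(int threads, int perThread) {
        SpinLockSketch lock = new SpinLockSketch();
        int[] counter = {0}; // plain int, protected only by the spin lock
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }
}
```

The critical section here is a single increment, which is the short-hold case where spinning pays off.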

Lock elimination

When compiling code, the JIT compiler removes locks on data for which it detects that no shared-data race is possible, because no synchronization is needed. Use -XX:+EliminateLocks to enable lock elimination
Escape analysis must also be enabled with -XX:+DoEscapeAnalysis

Escape analysis:

If an object defined in a method may be referenced by an external method, this is called method escape
If an object may be accessed by another thread, for example by being assigned to a class variable or to an instance variable reachable from other threads, this is called thread escape
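
The two methods below contrast a non-escaping object with an escaping one. StringBuffer.append() is synchronized, so the first method is a typical candidate for lock elimination; whether the JIT actually removes the locks depends on the JVM and its flags, and the class and method names are illustrative.

```java
/** A lock-elimination candidate and a method-escape counterexample. */
final class LockEliminationSketch {
    // sb never leaves this method, so the locks taken by StringBuffer.append()
    // can never be contended; the JIT may remove them.
    static String concat(String a, String b, String c) {
        StringBuffer sb = new StringBuffer();
        sb.append(a);
        sb.append(b);
        sb.append(c);
        return sb.toString();
    }

    // Method escape: the buffer is handed to the caller, so other code
    // (possibly other threads) can reach it and its locks must stay.
    static StringBuffer escaping(String a) {
        StringBuffer sb = new StringBuffer();
        sb.append(a);
        return sb;
    }
}
```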

Lock coarsening

In general, synchronized blocks should be kept as small as possible. But when a series of consecutive operations locks and unlocks the same object repeatedly, the repeated locking causes unnecessary performance loss. In that case it is better to extend (coarsen) the lock so that it covers the whole sequence of operations
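
Written by hand, the difference between repeated fine-grained locking and a coarsened lock looks like the sketch below. The names are hypothetical; the JIT may perform a similar transformation automatically.

```java
import java.util.ArrayList;
import java.util.List;

/** Fine-grained locking in a loop versus one coarsened lock. */
final class LockCoarseningSketch {
    private final Object lock = new Object();
    private final List<Integer> items = new ArrayList<>();

    // Fine-grained: the monitor is acquired and released on every iteration.
    void addFineGrained(int n) {
        for (int i = 0; i < n; i++) {
            synchronized (lock) {
                items.add(i);
            }
        }
    }

    // Coarsened: one acquire/release covers the whole sequence of operations.
    void addCoarsened(int n) {
        synchronized (lock) {
            for (int i = 0; i < n; i++) {
                items.add(i);
            }
        }
    }

    int size() {
        synchronized (lock) {
            return items.size();
        }
    }
}
```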

Lightweight lock

Lightweight is relative to the traditional lock mechanism: the intent is to reduce the performance cost of the traditional OS-mutex-based lock when there is no contention between threads
Its implementation principle is simple and resembles optimistic locking
If the lightweight lock fails, contention exists and the lock is upgraded to a heavyweight lock, which lowers performance

Biased locking

Biased locking removes the entire synchronization in the uncontended case, without even the optimistic-lock operation, to improve performance. “Biased” means the lock favors the thread that currently owns it
As long as there is no contention, the thread that obtained the biased lock needs no synchronization when it enters the synchronized block again
Bias mode ends when another thread requests the same lock
If most locks in the application are always accessed by several threads, that is, under contention, biased locking can actually reduce performance
Use -XX:-UseBiasedLocking to disable biased locking; it is enabled by default (since JDK 15 it is disabled by default and deprecated)

Steps in the JVM to acquire locks

Biased locking is tried first
Then lightweight locking is tried
Then spinning is tried
Finally an ordinary (heavyweight) lock is used, which suspends the thread at the operating-system level using an OS mutex

Basic rules for synchronizing code

Minimize the amount of time you hold locks
Minimize the granularity of locks

Performance monitoring and troubleshooting tools

Command line tool

Command-line tools: jps, jinfo, jstack, jmap, jstat, jstatd, jcmd
Graphical tools: JConsole, JMC, VisualVM
Two remote connection modes: JMX and jstatd

The role of JVM monitoring tools

Monitor the internal state of a running JVM, such as JVM parameters, CPU, memory, and heap
Assist with performance tuning
Help resolve runtime problems such as OutOfMemoryError, memory leaks, thread deadlocks, lock contention, and Java processes that consume too much CPU

jps

jps (JVM Process Status Tool) prints status information about the running Java processes. Syntax: jps [options] [hostid]
The hostid string has the same general form as a URI: [protocol:][[//]hostname][:port][/servername]. If hostid is omitted, the current host is used

jinfo

jinfo prints configuration information for a given process, core file, or remote debug server. Syntax:
jinfo [option] pid # a process ID (PID)
jinfo [option] executable core # a core file
jinfo [option] [server-id@]remote-hostname-or-ip # a remote debug server
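
Part of what jinfo reports can also be read from inside the process through the standard java.lang.management API. The sketch below is an in-process analogue under that assumption, not a replacement for jinfo; the class name is hypothetical and ProcessHandle requires Java 9 or later.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.util.List;

/** Reads the current JVM's PID, VM name, and startup arguments. */
final class RuntimeInfoSketch {
    static String describe() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        List<String> jvmArgs = runtime.getInputArguments(); // -X and -XX flags passed at startup
        return "pid=" + ProcessHandle.current().pid()
                + " vm=" + runtime.getVmName()
                + " args=" + jvmArgs;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```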

jstack

jstack prints the stack traces of the threads in a Java process. Syntax:
jstack [option] pid
jstack [option] executable core
jstack [option] [server-id@]remote-hostname-or-ip
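
A rough in-process counterpart of a jstack dump can be produced with Thread.getAllStackTraces(), and deadlocks can be checked with ThreadMXBean. This sketch uses hypothetical class and method names, and its output format is not jstack's.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;

/** Prints thread names, states, and stack frames for the current JVM. */
final class ThreadDumpSketch {
    static String dump() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            out.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    /** findDeadlockedThreads() returns null when no deadlock is found. */
    static boolean hasDeadlock() {
        return ManagementFactory.getThreadMXBean().findDeadlockedThreads() != null;
    }
}
```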

jmap

jmap shows heap memory usage. Syntax:
jmap [option] pid
jmap [option] executable core
jmap [option] [server-id@]remote-hostname-or-ip
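
For a quick look at heap usage from inside the application, the standard MemoryMXBean gives summary figures. It is far less detailed than a jmap histogram or dump; the class name below is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Summary heap figures for the current JVM. */
final class HeapUsageSketch {
    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    static String describe() {
        MemoryUsage u = heap();
        // getMax() may be -1 when the maximum is undefined.
        return "used=" + u.getUsed() + " committed=" + u.getCommitted() + " max=" + u.getMax();
    }
}
```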

jstat

jstat is the JVM statistics monitoring tool; it reports memory usage and GC activity for each memory area
Syntax: jstat [generalOption | outputOptions vmid [interval[s|ms] [count]]]
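
The GC counters that jstat displays are also exposed in-process through GarbageCollectorMXBean. The sketch below, with a hypothetical class name, lists each collector with its cumulative collection count and time; the bean names it prints depend on which collector the JVM is running.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Cumulative GC counts and times per collector for the current JVM. */
final class GcStatsSketch {
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + " count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }
}
```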

jstatd

jstatd is the jstat daemon. It monitors the creation and termination of JVMs and provides an interface that lets remote monitoring tools attach to JVMs running on the local system
Syntax: jstatd [options]

jcmd

jcmd is a JVM diagnostic tool that sends diagnostic command requests to a running Java virtual machine, for example to dump the heap, list Java processes, dump thread information, or run a GC

Graphical tools

jconsole

A JMX-compliant graphical tool for monitoring Java virtual machines. It can monitor both local and remote JVMs, and it can also monitor and manage applications

jmc

JMC (JDK Mission Control) is a set of client tools for monitoring and managing Java applications without introducing the performance overhead usually associated with such tools

VisualVM

A graphical tool that provides detailed information about Java technology-based applications running in Java virtual machines
Java VisualVM provides memory and CPU analysis, heap dump analysis, memory leak detection, access to MBeans, and garbage collection

Remote connection

A JMX connection lets you view system information, CPU usage, and thread counts, trigger garbage collection manually, and see other system-level information
A jstatd connection provides details of JVM memory distribution, garbage collection activity, and threads; you can even see how much memory an object uses
