This article is based on an analysis of JDK 8.

  • Overview
  • JVM architecture
  • Class loading mechanism
  • Runtime data area
  • Garbage collection mechanism

Overview

JVM is short for Java Virtual Machine. It is a specification for a computing device: an abstract computer implemented by emulating the functions of a computer in software on a real machine.

The Java Virtual Machine is essentially a program that, once started from the command line, executes the instructions stored in bytecode files. The portability of the Java language rests on the JVM: a bytecode file (.class) can run on any platform for which a JVM is installed, because the JVM hides the differences between the underlying hardware and operating system instructions at the software level. This is the meaning of “compile once, run anywhere.”

JVM architecture

The Java Virtual Machine consists of a class loader subsystem, an execution engine, a runtime data area, a native method interface, and a garbage collection module. The Java Virtual Machine Specification does not strictly require garbage collection, but until memory becomes unlimited, most JVM implementations include it.

  • Class loader subsystem: loads class files into the method area of the runtime data area, given a fully qualified class name (e.g., java.lang.Object).
  • Execution engine: executes bytecode or invokes native methods.
  • Runtime data area: what we usually call the JVM’s memory; it comprises the heap, the method area, the virtual machine stack, the native method stack, and the program counter.
  • Native method interface: interacts with native method libraries so that code written in other programming languages can be used from Java. It was originally intended for integrating C/C++ programs.

First, the compiler converts Java source code into bytecode, and the class loader loads that bytecode into memory, in the method area of the runtime data area. A bytecode file is only a set of instructions for the JVM and cannot be executed directly by the underlying system, so the execution engine acts as a command interpreter that translates bytecode into instructions for the underlying system, which are then handed to the CPU for execution. Along the way, the native method interface may be needed to call libraries written in other languages in order to implement the full functionality of the program.

Class loading mechanism

The Java class loading mechanism is the process by which the virtual machine loads the data describing a class from a class file into memory, verifies, resolves, and initializes that data, and finally produces a Java type that the virtual machine can use directly.

Class loader

Class loaders are divided into the bootstrap class loader, the extension class loader, the application class loader, and custom class loaders. These loaders have a logical parent-child relationship, but it is not a true parent-child (inheritance) relationship: each loader simply holds a reference to its parent. All of them are written in Java except the bootstrap class loader, which is written in C++. Class loaders written in Java inherit from the java.lang.ClassLoader class.

  1. Bootstrap class loader (BootstrapClassLoader): responsible for loading the core class libraries in the $JAVA_HOME/jre/lib directory, such as rt.jar and charsets.jar.
  2. Extension class loader (ExtClassLoader): responsible for loading the JAR packages in the $JAVA_HOME/jre/lib/ext directory. Its parent loader is the bootstrap class loader.
  3. Application class loader (AppClassLoader): responsible for loading the classes on the classpath, mainly the ones we write ourselves. Its parent loader is the extension class loader.
  4. Custom class loader (CustomClassLoader): responsible for loading classes from user-defined locations. Its parent loader is the application class loader.
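
The hierarchy above can be observed from ordinary Java code by following getParent(). The following is a minimal sketch; the class name LoaderChainDemo and the "<bootstrap>" label are made up for illustration. On JDK 8 the chain is expected to show the application and extension loaders; on later JDKs the loader names differ.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {
    /** Walks getParent() from the given loader up to the bootstrap loader. */
    static List<String> chainOf(ClassLoader loader) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            chain.add(cl.getClass().getName());
        }
        // The bootstrap loader is native code, so Java code sees it as null.
        chain.add("<bootstrap>");
        return chain;
    }

    public static void main(String[] args) {
        // Prints one loader class name per line, ending with <bootstrap>.
        chainOf(LoaderChainDemo.class.getClassLoader()).forEach(System.out::println);
        // Core classes such as String are defined by the bootstrap loader.
        System.out.println(String.class.getClassLoader()); // prints "null"
    }
}
```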

Class loading process

Class loading is divided into five phases: loading -> verification -> preparation -> resolution -> initialization. These phases generally begin in this order, but to support dynamic (runtime) binding, resolution may in some cases take place after initialization.

  • Loading: obtains the binary byte stream of the class from some source (a class file, a JAR package, or even the network) and reads it into memory; converts the static storage structure represented by the byte stream into the runtime data structure of the method area; and generates a java.lang.Class object in the heap that represents the class and serves as the entry point to the data of the class in the method area.
  • Verification: ensures that the class file being loaded conforms to the JVM specification. It generally consists of file format verification, metadata verification, bytecode verification, and symbolic reference verification.
  1. File format verification: checks that the byte stream complies with the class file format specification and can be processed by the current version of the virtual machine. Its main purpose is to ensure that the input byte stream can be correctly parsed and stored in the method area. Once this stage passes, the byte stream is stored in the method area, and the following three verification stages all work on the storage structure in the method area.
  2. Metadata verification: performs semantic checks on the metadata of the class (in effect, checking each data type in the class) to ensure that no metadata violates the Java language specification.
  3. Bytecode verification: analyzes data flow and control flow and verifies the method bodies of the class, to ensure that the methods of the verified class will not endanger the security of the virtual machine at run time.
  4. Symbolic reference verification: the final verification stage, which takes place when the virtual machine converts symbolic references into direct references (that is, during the resolution phase). It mainly checks that information outside the class itself (the various symbolic references in the constant pool) matches.
  • Preparation: allocates memory for the static variables of the class and sets them to their default values.
  • Resolution: replaces symbolic references with direct references. In this phase some symbolic references (for example, to static methods such as main()) are replaced with pointers or handles to the memory where the data is stored (direct references). This is static linking, which is completed during class loading; dynamic linking replaces symbolic references with direct references while the program is running.
  • Initialization: assigns the specified values to the static variables of the class and executes its static code blocks.
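
The difference between preparation (default values) and initialization (specified values and static blocks) can be made visible with a small experiment. This is only a sketch: InitPhaseDemo, Holder, and record() are hypothetical names, and the trick relies on a method reading the static field before its own initializer has finished.

```java
import java.util.ArrayList;
import java.util.List;

class InitPhaseDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static class Holder {
        // Preparation has already set counter to 0; initialization then runs
        // the field initializer and the static block in textual order.
        static int counter = record("field initializer", 1);

        static {
            EVENTS.add("static block sees counter=" + counter);
            counter = 2;
        }

        private static int record(String what, int newValue) {
            EVENTS.add(what + " sees counter=" + counter);
            return newValue;
        }
    }

    /** Triggers initialization of Holder by reading one of its static fields. */
    static List<String> run() {
        EVENTS.add("before first use");
        int value = Holder.counter; // first active use initializes Holder
        EVENTS.add("after first use, counter=" + value);
        return EVENTS;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Run once, the log shows the field initializer observing 0 (the default from preparation), the static block observing 1, and the caller observing 2.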

Parent delegation mechanism

The parent delegation mechanism works as follows: when a class loader receives a request to load a class that it has not already loaded, it does not load the class itself right away. It first delegates the request to its parent loader, and if the parent has not loaded the class either, the request is passed further up until it reaches the top-level bootstrap class loader. If a parent loader can complete the loading task, it returns the loaded class; if it cannot, the child loader attempts the loading itself. In one sentence: the check for whether a class has already been loaded goes from the bottom up, and the attempt to load it goes from the top down.

Advantages of the parent delegation mechanism:

  1. Sandbox security: prevents tampering with the core API. A java.lang.String class that you write yourself will not be loaded.
  2. Avoid reloading: If the parent loader has already loaded the class, there is no need for the child loader to load it again.
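
Both advantages can be checked with a few lines of code. The sketch below (DelegationDemo is a made-up name) asks the application class loader for java.lang.String and inspects what comes back.

```java
class DelegationDemo {
    /** Asks the application class loader for a class by name. */
    static Class<?> loadViaAppLoader(String name) {
        try {
            return ClassLoader.getSystemClassLoader().loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> c = loadViaAppLoader("java.lang.String");
        // Same Class object as the one already in use: nothing was loaded twice.
        System.out.println(c == String.class);           // prints "true"
        // The request was delegated upward and answered by the bootstrap loader.
        System.out.println(c.getClassLoader() == null);  // prints "true"
    }
}
```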

The core class-loading code of the parent delegation mechanism (ClassLoader.loadClass):

protected Class<?> loadClass(String name, boolean resolve)
    throws ClassNotFoundException
{
    synchronized (getClassLoadingLock(name)) {
        // First, check whether this class loader has already loaded the class.
        // If it has, the loaded class is returned.
        Class<?> c = findLoadedClass(name);
        if (c == null) {
            // Not loaded yet: delegate to the parent loader.
            long t0 = System.nanoTime();
            try {
                if (parent != null) {
                    // Let the parent loader try via its own loadClass method.
                    c = parent.loadClass(name, false);
                } else {
                    // parent == null means the parent is the bootstrap class loader,
                    // which is written in C++; a native method is called to try
                    // to load the class.
                    c = findBootstrapClassOrNull(name);
                }
            } catch (ClassNotFoundException e) {
                // The parent loader could not find the class.
            }
            if (c == null) {
                // If still not found, then invoke findClass in order
                // to find the class.
                long t1 = System.nanoTime();
                // The parent could not load the class, so this loader loads it itself;
                // for a URLClassLoader this calls URLClassLoader.findClass().
                c = findClass(name);
                sun.misc.PerfCounter.getParentDelegationTime().addTime(t1 - t0);
                sun.misc.PerfCounter.getFindClassTime().addElapsedTimeFrom(t1);
                sun.misc.PerfCounter.getFindClasses().increment();
            }
        }
        if (resolve) {
            resolveClass(c);
        }
        return c;
    }
}

Full-responsibility delegation mechanism

Full-responsibility delegation means that when a class loader loads a class, the other classes that this class depends on and references are normally loaded by the same class loader as well.

Breaking parent delegation

Breaking the parent delegation mechanism means that the custom class loader loads the specified class itself instead of delegating to the parent loader first; the parent loader is either bypassed or consulted only when the custom class loader fails to load the class.

Custom class loader implementation

Having understood the parent delegation mechanism and how to break it, we can write our own custom class loader:

To keep the parent delegation mechanism, override only the findClass() method (the method in which the class loader actually loads the class).
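
A minimal sketch of such a loader follows. DirClassLoader, tryLoad(), and the /tmp/classes directory are hypothetical and chosen only for illustration; the loader reads .class files from a directory and leaves loadClass() untouched, so delegation to the parent still happens first.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical loader that reads .class files from a directory. */
class DirClassLoader extends ClassLoader {
    private final Path root;

    DirClassLoader(Path root, ClassLoader parent) {
        super(parent);
        this.root = root;
    }

    // The inherited loadClass() calls this only after the parent chain has failed.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        Path file = root.resolve(name.replace('.', '/') + ".class");
        try {
            byte[] bytes = Files.readAllBytes(file);
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    /** Demo helper: returns null instead of throwing when the class is missing. */
    Class<?> tryLoad(String name) {
        try {
            return loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        DirClassLoader loader =
            new DirClassLoader(Paths.get("/tmp/classes"), ClassLoader.getSystemClassLoader());
        // Core classes are still answered by the parent chain.
        System.out.println(loader.tryLoad("java.lang.String") == String.class);
        // Null unless /tmp/classes/com/example/Missing.class exists.
        System.out.println(loader.tryLoad("com.example.Missing"));
    }
}
```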

If we want to break the parent delegation mechanism, we need to override both the findClass() method and the loadClass() method. In loadClass() we can rewrite the logic so that the custom class loader tries to load the class itself first and only then falls back to the parent loader.
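
One possible shape of such a child-first loader is sketched below. ChildFirstClassLoader, its prefix parameter, and tryLoad() are assumptions made for this example; restricting child-first loading to a package prefix keeps java.* classes on the normal delegation path, since defineClass() refuses to define classes in those packages.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical child-first loader for classes whose names start with a prefix. */
class ChildFirstClassLoader extends ClassLoader {
    private final Path root;
    private final String prefix;

    ChildFirstClassLoader(Path root, String prefix, ClassLoader parent) {
        super(parent);
        this.root = root;
        this.prefix = prefix;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null && name.startsWith(prefix)) {
                try {
                    c = findClass(name); // try this loader before the parent
                } catch (ClassNotFoundException ignored) {
                    // not in our directory: fall back to normal delegation below
                }
            }
            if (c == null) {
                return super.loadClass(name, resolve);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        Path file = root.resolve(name.replace('.', '/') + ".class");
        try {
            byte[] bytes = Files.readAllBytes(file);
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    /** Demo helper: returns null instead of throwing when the class is missing. */
    Class<?> tryLoad(String name) {
        try {
            return loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        ChildFirstClassLoader loader = new ChildFirstClassLoader(
            Paths.get("/tmp/classes"), "com.example.", ClassLoader.getSystemClassLoader());
        System.out.println(loader.tryLoad("java.lang.String") == String.class);
        System.out.println(loader.tryLoad("com.example.Missing"));
    }
}
```

A class defined this way is a different runtime type from a class of the same name loaded by the parent, so instances of the two cannot be cast to each other.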

JVM runtime data area

  • Program counter

The program counter is thread-private and has the same lifetime as its thread. It is a small piece of memory that can be seen as an indicator of the position (line number) in the bytecode being executed by the current thread. If the thread is executing a Java method, the counter records the address of the bytecode instruction being executed. If the thread is executing a native method, the counter value is undefined. The bytecode interpreter works by changing the value of this counter to select the next bytecode instruction to execute; basic functions such as branching, looping, jumping, exception handling, and thread resumption all depend on it. This is the only area for which the Java Virtual Machine Specification does not specify any OutOfMemoryError condition.

  • The virtual machine stack

The virtual machine stack is thread-private and has the same lifetime as its thread. Each time a method is executed, the Java virtual machine creates a stack frame to store the local variable table, the operand stack, dynamic linking information, the method exit, and so on. Each method call, from invocation to completion, corresponds to one stack frame being pushed onto and then popped off the virtual machine stack.

The local variable table stores the primitive data types known at compile time (boolean, byte, char, short, int, float, long, double), object references, and the returnAddress type.

The following exception conditions are related to the Java virtual machine stack:

If the stack depth requested by a thread is greater than the depth allowed by the virtual machine, the Java virtual machine throws a StackOverflowError. If the Java virtual machine stack can be dynamically expanded, the Java virtual machine throws an OutOfMemoryError when it cannot obtain enough memory to expand the stack.
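
The first condition is easy to provoke with unbounded recursion, since every call pushes another stack frame. The sketch below (StackDepthDemo is a made-up name) counts frames until the error occurs; the number reached depends on the JVM, the frame size, and the -Xss setting, so no particular value should be expected.

```java
class StackDepthDemo {
    private static int depth;

    private static void recurse() {
        depth++;     // one more frame on the virtual machine stack
        recurse();   // never returns normally
    }

    /** Recurses until the stack is exhausted and reports how deep it got. */
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // Expected: the requested depth exceeded what the stack allows.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before StackOverflowError: " + measure());
    }
}
```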

  • Native method stack

The native method stack is thread-private and has the same lifetime as its thread. It plays the same role as the virtual machine stack, except that the virtual machine stack serves the Java methods (that is, bytecode) executed by the virtual machine, while the native method stack serves the native methods used by the virtual machine.

The following exception conditions are associated with the native method stack (the same as for the virtual machine stack):

If the stack depth requested by a thread is greater than the depth allowed by the virtual machine, the Java virtual machine throws a StackOverflowError.

If the native method stack can be dynamically expanded, the Java virtual machine throws an OutOfMemoryError when it cannot obtain enough memory to expand the stack.

  • The heap

The heap is shared by all threads and created when the virtual machine starts. Memory for class instances and arrays is allocated from it (almost all objects live in the heap, but not all). For most applications the heap is the largest area of memory; it is also the most important area of the memory model and the area that JVM tuning focuses on. Heap storage for objects is reclaimed by an automatic storage management system called the garbage collector; objects are never explicitly released.

The following exception is associated with the heap:

The Java virtual machine throws an OutOfMemoryError if there is no memory left in the Java heap for instance allocation and the heap can no longer be expanded.

The heap memory is divided into Young Generation and Old Generation.

  1. Young generation (YoungGen): the young generation is divided into the Eden area and the Survivor area, and the Survivor area consists of FromSpace and ToSpace. Eden takes up most of the capacity and the two Survivor spaces are small; the default ratio is 8:1:1.
  2. Old generation (OldGen).
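
The generations of a running JVM can be inspected through the standard java.lang.management API. The following is a small sketch (HeapPoolsDemo is a made-up name); the pool names it prints depend on the garbage collector in use, so names such as Eden, Survivor, or Old Gen are typical but not guaranteed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class HeapPoolsDemo {
    /** Returns the names of the memory pools that belong to the Java heap. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        heapPoolNames().forEach(System.out::println);
        // Upper bound the heap may grow to, in bytes.
        System.out.println("max heap bytes: " + Runtime.getRuntime().maxMemory());
    }
}
```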
  • Method area (Metaspace)

The method area is shared by all threads and created when the virtual machine starts. It stores data such as the type information of classes loaded by the virtual machine, constants, static variables, and code compiled by the just-in-time (JIT) compiler.

The following exception condition is associated with the method area:

The Java virtual machine throws an OutOfMemoryError if the method area cannot satisfy a new memory allocation request.

  • Runtime constant pool

The runtime constant pool is part of the method area. It contains several kinds of constants, ranging from numeric literals known at compile time to method and field references that must be resolved at run time. The runtime constant pool serves a function similar to a symbol table in a conventional programming language, although it contains a wider range of data than a typical symbol table. Each runtime constant pool is allocated from the method area of the Java virtual machine. When the Java virtual machine creates a class or interface, the runtime constant pool for that class or interface is constructed.

The following exception condition is associated with the construction of the runtime constant pool for a class or interface:

The Java virtual machine throws an OutOfMemoryError if the runtime constant pool cannot obtain the memory it needs.
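
String literals are one kind of constant held in the constant pool, and their behavior offers a visible side effect of it: each literal resolves to a single shared String instance. The sketch below (ConstantPoolDemo is a made-up name) illustrates this with reference comparisons, which are used here on purpose and are not how strings should normally be compared.

```java
class ConstantPoolDemo {
    /** Two occurrences of the same literal resolve to the same object. */
    static boolean literalsShared() {
        String a = "jvm";
        String b = "jvm";
        return a == b;
    }

    /** new String(...) always creates a separate object on the heap. */
    static boolean newStringIsDistinct() {
        return new String("jvm") != "jvm";
    }

    /** intern() returns the shared instance for that character sequence. */
    static boolean internReturnsShared() {
        return new String("jvm").intern() == "jvm";
    }

    public static void main(String[] args) {
        System.out.println(literalsShared());       // prints "true"
        System.out.println(newStringIsDistinct());  // prints "true"
        System.out.println(internReturnsShared());  // prints "true"
    }
}
```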

  • Direct memory

Direct memory is not part of the runtime data area of the virtual machine, nor is it a memory area defined in the Java Virtual Machine Specification. However, this memory is used frequently and can also cause an OutOfMemoryError, so it is briefly mentioned here. The allocation of direct memory is not limited by the size of the Java heap, but since it is still memory, it is limited by the total native memory and by the addressing space of the processor. When configuring VM parameters, server administrators usually set parameters such as -Xmx based on the actual memory but often overlook direct memory. As a result, the sum of all memory regions can exceed the physical memory limit, which leads to an OutOfMemoryError during dynamic expansion.
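
In application code, direct memory is most commonly obtained through NIO buffers. A minimal sketch follows (DirectMemoryDemo is a made-up name); the total amount available can be capped with the HotSpot option -XX:MaxDirectMemorySize.

```java
import java.nio.ByteBuffer;

class DirectMemoryDemo {
    /** Allocates a buffer outside the Java heap. */
    static ByteBuffer allocate(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    public static void main(String[] args) {
        ByteBuffer direct = allocate(1024 * 1024);          // 1 MiB of native memory
        ByteBuffer onHeap = ByteBuffer.allocate(1024);      // backed by a byte[] on the heap
        System.out.println(direct.isDirect());  // prints "true"
        System.out.println(onHeap.isDirect());  // prints "false"
        System.out.println(direct.capacity());  // prints "1048576"
    }
}
```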

Garbage collection mechanism

In Java, programmers do not need to free the memory of an object explicitly; the virtual machine does it for them. In the JVM, garbage collection is carried out by dedicated threads that do not normally run; they are triggered mainly when the current heap memory is running low. The collector scans for objects that are no longer referenced and adds them to the set of objects to be reclaimed.

GC object determination method

  1. Reference counting: each object has a reference counter that is incremented when a reference to the object is created and decremented when a reference is released; when the counter reaches 0 the object can be reclaimed. Its drawback is that it cannot handle circular references.

  2. Reachability analysis (reference chain method): starting from the GC Roots, search downward along references; the paths traversed are called reference chains. An object is considered reclaimable when no reference chain connects it to any GC Root.
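
The difference between the two methods shows up on a small object graph. The following is a toy model, not how a JVM represents objects: objects are plain strings, references are map entries, and all names (ReachabilityDemo, sampleGraph, "root", "x", "y") are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilityDemo {
    /** Reachability analysis: follow references outward from the roots. */
    static Set<String> reachable(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String obj = work.pop();
            if (marked.add(obj)) {
                work.addAll(refs.getOrDefault(obj, Collections.<String>emptyList()));
            }
        }
        return marked;
    }

    /** Reference counting: number of incoming references per object. */
    static Map<String, Integer> refCounts(Map<String, List<String>> refs) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> targets : refs.values()) {
            for (String target : targets) {
                counts.merge(target, 1, Integer::sum);
            }
        }
        return counts;
    }

    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("a"));
        refs.put("a", Arrays.asList("b"));
        refs.put("b", Collections.<String>emptyList());
        refs.put("x", Arrays.asList("y")); // x and y reference only each other
        refs.put("y", Arrays.asList("x"));
        return refs;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = sampleGraph();
        Set<String> live = reachable(graph, Arrays.asList("root"));
        System.out.println(live.contains("x"));        // prints "false": garbage by reachability
        System.out.println(refCounts(graph).get("x")); // prints "1": never reclaimed by counting
    }
}
```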

Garbage collection algorithm

  • Generational collection theory

  • Mark-sweep algorithm

Useless objects are marked and then cleared. Disadvantages: it is inefficient and leaves memory fragmentation behind.

  • Mark-copy algorithm

Memory is divided into two areas of equal size. When one of them is used up, the surviving objects are copied to the other, and the used area is then cleaned up in one pass. Disadvantage: low memory utilization, since only half of the memory is usable at a time.

  • Mark-compact algorithm

Useless objects are marked, all surviving objects are moved toward one end, and the memory beyond the end boundary is then cleared directly.
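
The fragmentation difference between mark-sweep and mark-compact can be illustrated with a toy heap. In this sketch the heap is just an array of slots and the marking result is passed in as a set of live objects; GcAlgorithmsDemo and its methods are hypothetical and bear no relation to a real collector implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class GcAlgorithmsDemo {
    /** Mark-sweep: dead slots are freed in place, leaving holes between survivors. */
    static String[] markSweep(String[] heap, Set<String> live) {
        String[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (result[i] != null && !live.contains(result[i])) {
                result[i] = null;
            }
        }
        return result;
    }

    /** Mark-compact: survivors slide to one end; everything past them is free. */
    static String[] markCompact(String[] heap, Set<String> live) {
        String[] result = new String[heap.length];
        int next = 0;
        for (String obj : heap) {
            if (obj != null && live.contains(obj)) {
                result[next++] = obj;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] heap = {"a", "b", "c", "d"};
        Set<String> live = new HashSet<>(Arrays.asList("a", "c"));
        System.out.println(Arrays.toString(markSweep(heap, live)));   // prints "[a, null, c, null]"
        System.out.println(Arrays.toString(markCompact(heap, live))); // prints "[a, c, null, null]"
    }
}
```

After the sweep, the two free slots are not adjacent, so an object needing two contiguous slots could not be placed; after compaction it could.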

Garbage collector

  • Serial collector (mark-copy algorithm): a young-generation single-threaded collector; both marking and cleaning are single-threaded. Its advantage is that it is simple and efficient.

  • ParNew collector (mark-copy algorithm): a young-generation parallel collector. It is essentially the multi-threaded version of the Serial collector and performs better than Serial on multi-core CPUs.

  • Parallel Scavenge collector: a young-generation parallel collector that pursues high throughput and efficient use of the CPU. Throughput = user thread time / (user thread time + GC thread time). High throughput makes efficient use of CPU time and completes the computation of the program as soon as possible. It suits scenarios such as background applications with low interactivity requirements.

  • Serial Old collector (mark-compact algorithm): an old-generation single-threaded collector; the old-generation version of the Serial collector.

  • Parallel Old collector (mark-compact algorithm): an old-generation parallel collector that puts throughput first; the old-generation version of the Parallel Scavenge collector.

  • Concurrent Mark Sweep (CMS) collector: an old-generation concurrent collector whose goal is the shortest possible collection pause time. It is characterized by high concurrency and low pauses.

  • Garbage First (G1) collector: a parallel collector for the whole Java heap, provided as a new collector in JDK 1.7. G1 is based on the mark-compact algorithm, so it does not produce memory fragmentation. It also differs from the earlier collectors in one important respect: G1 collects the entire Java heap (both the young generation and the old generation), whereas each of the previous six collectors covers only the young generation or only the old generation.
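
Which collectors a given JVM is actually running can be checked through the standard management API, and the throughput formula above is simple enough to compute directly. The sketch below uses made-up names (CollectorsDemo, throughput); the collector names printed depend on the JVM version and flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class CollectorsDemo {
    /** Names of the collectors active in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Throughput = user thread time / (user thread time + GC thread time). */
    static double throughput(long userMillis, long gcMillis) {
        return (double) userMillis / (userMillis + gcMillis);
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                + " collections, " + gc.getCollectionTime() + " ms");
        }
        // 99 minutes of user code and 1 minute of GC give 99% throughput.
        System.out.println(throughput(99, 1));
    }
}
```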

JVM tuning parameters

  • -Xms4g: sets the initial heap size to 4 GB
  • -Xmx4g: sets the maximum heap size to 4 GB
  • -XX:NewRatio=4: sets the ratio of the young generation to the old generation to 1:4
  • -XX:SurvivorRatio=8: sets the ratio of Eden to the two Survivor spaces to 8:2 (Eden:From:To = 8:1:1)
  • -XX:+UseParNewGC: selects the ParNew + Serial Old collector combination
  • -XX:+UseParallelOldGC: selects the Parallel Scavenge + Parallel Old collector combination
  • -XX:+UseConcMarkSweepGC: selects the ParNew + CMS collector combination, with Serial Old as the fallback
  • -XX:+PrintGC: prints GC information
  • -XX:+PrintGCDetails: prints detailed GC information
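
To confirm that such parameters took effect, a program can print the arguments the JVM was started with and the resulting heap limit. This is a small sketch with a made-up class name (JvmFlagsDemo); the command line in the comment is only an example for a JDK 8 installation.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class JvmFlagsDemo {
    /** Options passed to the JVM itself, such as -Xmx4g or -XX:+PrintGC. */
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** Maximum heap size the JVM will attempt to use, in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        // Example launch on JDK 8 (assumed):
        //   java -Xms4g -Xmx4g -XX:+PrintGCDetails JvmFlagsDemo
        System.out.println("JVM arguments: " + jvmArguments());
        System.out.println("max heap (MB): " + maxHeapBytes() / (1024 * 1024));
    }
}
```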
