Introduction

The Java Virtual Machine (JVM) is a familiar concept to every Java developer. The fundamental reason Java can deliver its cross-platform promise of “compile once, run anywhere” is the JVM. The JVM sits on top of the operating system (OS) and shields developers from dealing with the OS directly. When we write a program in Java, we only write Java code; at run time the JVM loads the compiled bytecode and translates it into machine instructions for the underlying platform to execute.

At the same time, as the Java ecosystem keeps growing and competition among developers intensifies, JVM knowledge has become essential for every Java developer, and the JVM is a near-certain topic in job interviews.

A first look at the Java Virtual Machine (JVM)

What is the JVM? There are plenty of official, rigid, hard-to-follow introductions on the Internet. In my opinion, the simplest way to understand it is as a platform built on top of another platform. What does that mean? Before Java, programs written in lower-level languages such as C or assembly had to be compiled separately for each environment or operating system before they could run. Java takes the opposite approach: instead of adapting the code to each environment, it provides a platform of its own. Developers only need to write code that runs on that platform, contact with the operating system is hidden from them entirely, and that platform is the JVM. In addition, code in any language that can be compiled to a class file can be loaded and executed by the JVM.

In other words, code written in lower-level languages such as C and assembly has to adapt whenever the environment changes, whereas Java code, thanks to the JVM, stays the same no matter how the environment changes.

As shown below:



It is also worth noting that the JVM is cross-language as well as cross-platform: as long as code written in another language is compiled into a .class file that the JVM can read, the JVM can execute it.
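To make the role of the class file concrete, here is a minimal sketch that reads the first four bytes of a compiled class. Every valid class file starts with the magic number 0xCAFEBABE regardless of the source language that produced it. The class name ClassFileMagic and the helper readMagic are made up for this illustration, and the sketch assumes the class file is reachable as a resource, which is normally the case for JDK classes such as java.lang.String.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileMagic {
    // Returns the first four bytes of the class file that defines `type`.
    static int readMagic(Class<?> type) {
        // Resource path of the class file, e.g. /java/lang/String.class
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            // Class files store multi-byte values big-endian, which readInt expects.
            return new DataInputStream(in).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints "magic = 0xCAFEBABE"
        System.out.printf("magic = 0x%08X%n", readMagic(String.class));
    }
}
```

A class compiled from another JVM language would show the same magic number, which is the first thing the JVM checks when it loads a class file.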

1.1 The relationship between JDK, JRE and JVM

The JVM is part of the JRE, and anyone who has started learning Java has met the terms JDK, JRE and JVM. So how are they related?

  • JDK (Java Development Kit): the Java software development kit, which includes the development tools, the JRE, and the JVM
  • JRE (Java Runtime Environment): the Java runtime environment, which includes the Java API class libraries and the JVM and provides the minimum environment needed to run a Java program
  • JVM (Java Virtual Machine): a virtual computing platform defined by the Java architecture, responsible for executing Java bytecode

In terms of coverage, the JDK contains the JRE, and the JRE contains the JVM, which is the platform on which Java programs are executed.
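As a rough way to see which layer a program is running on, the sketch below prints a few standard system properties and checks whether a Java compiler is available. The class name RuntimeInfo is hypothetical. ToolProvider.getSystemJavaCompiler() returns null when no compiler is present, which is typically the case on a runtime-only installation, so treat the check as a hint rather than a definitive JDK test.

```java
import javax.tools.ToolProvider;

class RuntimeInfo {
    // True when a Java compiler is available, which usually means a full JDK.
    static boolean hasCompiler() {
        return ToolProvider.getSystemJavaCompiler() != null;
    }

    static String describe() {
        return "VM: " + System.getProperty("java.vm.name")
                + " | Java version: " + System.getProperty("java.version")
                + " | home: " + System.getProperty("java.home")
                + " | compiler available: " + hasCompiler();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```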

1.2. Two architecture models of virtual machine and JVM architecture model

There are currently two main virtual machine architecture models: stack-based and register-based. In terms of performance, a register-based virtual machine generally does better than a stack-based one; in terms of portability, the stack-based virtual machine is much better.

What is a stack-based virtual machine?

A stack-based VM executes a stack-oriented instruction set: operands are pushed onto a stack one by one, and instructions work by pushing values onto the stack and popping them off. Such a virtual machine has the concept of an operand stack; it operates through the operand stack and interacts with memory to carry out the actual work. Simply put, every operation goes through the operand stack. The biggest advantage of this model is that it is independent of the hardware and its physical architecture. The disadvantage is just as obvious: because every operation goes through the operand stack, performance is lower.

Virtual machines such as Java’s JVM, Python’s CPython, and .NET’s CLR are all based on this stack model.

To get a quick feel for the stack-based computing of virtual machines:

/* ------Java code ------ */
int a = 3;
int b = 2;
int c = a + b;

/* ------ javap -c -v -p bytecode ------ */
0: iconst_3 // Push the int constant 3 onto the operand stack
1: istore_1 // Pop the top of the operand stack and store it in local variable slot 1
2: iconst_2 // Push the int constant 2 onto the operand stack
3: istore_2 // Pop the top of the operand stack and store it in local variable slot 2
4: iload_1  // Push the value in local variable slot 1 onto the operand stack
5: iload_2  // Push the value in local variable slot 2 onto the operand stack
6: iadd     // Pop the two top values, add them (3 + 2), and push the result
7: istore_3 // Pop the result and store it in local variable slot 3
8: return   // Return from the method

The first four instructions, which only assign the initial values, are not analyzed here. The analysis focuses on the remaining operation, c = a + b, which executes as follows:

  • ① Data a is transferred over the bus from the local variable table to the operand stack
  • ② Data b is transferred over the bus from the local variable table to the operand stack
  • ③ Data a is transferred over the bus from the operand stack to the CPU
  • ④ Data b is transferred over the bus from the operand stack to the CPU
  • ⑤ When the CPU finishes the calculation, the result is transferred over the data bus to the operand stack
  • ⑥ The result is transferred over the bus from the operand stack to the CPU
  • ⑦ The CPU transfers the data over the bus to the local variable table, assigning the value to c

As these steps show, a simple addition requires seven bus transfers. Efficiency is noticeably lower because every operation goes through the operand stack.
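The bytecode listing above can be replayed with a toy interpreter. This is only a sketch of the stack model, not how a real JVM is implemented: the class StackVmSketch is made up for illustration, it supports just the handful of instructions used in the listing, and instructions are passed as plain strings instead of encoded bytecode.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class StackVmSketch {
    // Interprets a tiny subset of JVM-style instructions and returns the local variable table.
    static int[] run(String... program) {
        Deque<Integer> operands = new ArrayDeque<>(); // operand stack
        int[] locals = new int[4];                    // local variable table, slots 0..3
        for (String insn : program) {
            switch (insn) {
                case "iconst_2": operands.push(2); break;
                case "iconst_3": operands.push(3); break;
                case "istore_1": locals[1] = operands.pop(); break;
                case "istore_2": locals[2] = operands.pop(); break;
                case "istore_3": locals[3] = operands.pop(); break;
                case "iload_1":  operands.push(locals[1]); break;
                case "iload_2":  operands.push(locals[2]); break;
                case "iadd":     operands.push(operands.pop() + operands.pop()); break;
                case "return":   return locals;
                default: throw new IllegalArgumentException("unsupported instruction: " + insn);
            }
        }
        return locals;
    }

    public static void main(String[] args) {
        int[] locals = run("iconst_3", "istore_1", "iconst_2", "istore_2",
                           "iload_1", "iload_2", "iadd", "istore_3", "return");
        System.out.println("c = " + locals[3]); // prints "c = 5"
    }
}
```

Note how every value passes through the operands stack at least once, which is exactly the overhead described above.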

What is a register-based virtual machine?

A register-based virtual machine has a set of virtual registers that simulate the real registers in the CPU. They are usually referred to by aliases such as R1, R2, R3, and so on; the execution engine resolves these aliases, locates the operands, and fetches them for the operation. These virtual registers do not live in the CPU itself. Like the operand stack, they live in the runtime stack, where all the virtual registers of a frame are stored in an array (a contiguous region of memory in the runtime stack frame).

Virtual machines such as Android’s Dalvik and the Lua 5.0 VM are implemented on the register model.

As before, a simple c = a + b example shows how a register-based virtual machine operates:

-------- Lua code ----------
a = 2;
b = -1;
c = a + b;

// -------- Register instruction --------
LW R1,a
LW R2,b
ADD R3,R1,R2

LW stands for load: data a and b are loaded into virtual registers R1 and R2. Again, let us focus on the process:

  • ① Data a is transferred over the bus from virtual register R1 to the CPU
  • ② Data b is transferred over the bus from virtual register R2 to the CPU
  • ③ After the CPU performs the + calculation, the result is written back over the bus to virtual register R3

This process shows that a virtual machine based on the register model does not need to move the data from the virtual registers to an operand stack before sending it over the bus; it can send the data to the physical CPU directly. Compared with the stack-based virtual machine above, the register model clearly has a performance advantage. Because there is no operand stack in this model, many push and pop instructions are saved, which means fewer bus transfers when moving data.
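For comparison, the three register instructions above can be replayed with an equally small toy interpreter. As before this is a sketch under simplifying assumptions: RegisterVmSketch is a hypothetical name, only LW and ADD are supported, and named variables live in a map that stands in for memory.

```java
import java.util.HashMap;
import java.util.Map;

class RegisterVmSketch {
    // Executes "LW Rx,name" and "ADD Rd,Rs,Rt" and returns the virtual register array.
    static int[] run(Map<String, Integer> memory, String... program) {
        int[] registers = new int[4]; // virtual registers R0..R3, stored in a plain array
        for (String insn : program) {
            String[] parts = insn.split("[ ,]+");
            switch (parts[0]) {
                case "LW":  // load a named variable from memory into a register
                    registers[index(parts[1])] = memory.get(parts[2]);
                    break;
                case "ADD": // Rd = Rs + Rt, with no operand stack involved
                    registers[index(parts[1])] =
                            registers[index(parts[2])] + registers[index(parts[3])];
                    break;
                default:
                    throw new IllegalArgumentException("unsupported instruction: " + insn);
            }
        }
        return registers;
    }

    // Maps a register alias such as "R3" to its array index 3.
    private static int index(String register) {
        return Integer.parseInt(register.substring(1));
    }

    public static void main(String[] args) {
        Map<String, Integer> memory = new HashMap<>();
        memory.put("a", 2);
        memory.put("b", -1);
        int[] registers = run(memory, "LW R1,a", "LW R2,b", "ADD R3,R1,R2");
        System.out.println("c = " + registers[3]); // prints "c = 1"
    }
}
```

The same addition takes three instructions here, and each one names its operands directly instead of pushing and popping them.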

In general, a stack-based virtual machine needs many push and pop instructions because of its operand stack, so it executes far more instructions than a register-based virtual machine. Register-based instructions are delivered to the CPU more directly and benefit from the CPU caches (L1/L2/L3), which is better for performance. However, because a register-based instruction set is tied more closely to the hardware architecture, the instructions differ slightly between CPU architectures, so it is less portable.

The JVM is implemented on the stack model. Because of the operand stack it does not depend on hardware support, which gives it excellent cross-platform portability. Because every operation has to go through the operand stack and then interact with memory, execution is less efficient and performance suffers. Mainstream JVMs today, however, use just-in-time (JIT) compilation together with hot spot detection: frequently executed Java code is compiled into machine code and cached, and from then on that hot code no longer runs as bytecode but directly as machine instructions. This lets Java stay cross-platform while keeping good performance.
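The idea of hot spot detection can be sketched with a simple invocation counter. This is a deliberately simplified model, not HotSpot’s actual algorithm: the class HotCodeSketch and the threshold of 1000 are made up for illustration, and the “compiled” path only increments a counter where a real VM would run cached machine code.

```java
class HotCodeSketch {
    static final int COMPILE_THRESHOLD = 1000; // hypothetical value, for illustration only

    private int invocations;   // execution counter for one method
    private boolean compiled;  // true once the method counts as "hot"
    int interpretedRuns;
    int compiledRuns;

    // Simulates one call of a method that doubles its argument.
    int invoke(int x) {
        if (!compiled && ++invocations >= COMPILE_THRESHOLD) {
            compiled = true; // a real VM would hand the method to the JIT compiler here
        }
        if (compiled) {
            compiledRuns++;    // stands in for running cached machine code
        } else {
            interpretedRuns++; // stands in for interpreting bytecode
        }
        return x * 2;
    }

    public static void main(String[] args) {
        HotCodeSketch method = new HotCodeSketch();
        for (int i = 0; i < 1500; i++) {
            method.invoke(i);
        }
        // prints "interpreted=999 compiled=501"
        System.out.println("interpreted=" + method.interpretedRuns
                + " compiled=" + method.compiledRuns);
    }
}
```

The result of each call is the same on both paths; only the way it is produced changes once the counter crosses the threshold.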

1.3 Java Virtual Machine lifecycle

The JVM lifecycle is typically divided into three phases: startup, running, and exit:

  • Startup: the Bootstrap class loader loads the initial class, the JVM is initialized, and a process is created
  • Running: the normal execution phase after the process is created. A running Java program is in fact a JVM process
  • Exit: the process terminates in the following situations:
    • ① The program finishes executing normally and exits
    • ② An error occurs during execution, causing abnormal termination
    • ③ An operating system problem terminates the program, such as insufficient memory or a system crash
    • ④ The program exits explicitly in code, for example by calling System.exit or Runtime.exit/halt
    • ⑤ The VM is loaded or unloaded through JNI

The JVM life cycle is essentially the same as that of any ordinary process: to the operating system the JVM is just another process, and it is treated the same as the processes of other programs.
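The life cycle described above can be observed from inside a program. The following sketch prints the operating system process id of the running JVM, registers a shutdown hook, and then exits explicitly. The class name LifecycleDemo is hypothetical, and ProcessHandle requires Java 9 or later. Shutdown hooks run on a normal exit and on System.exit, but not when Runtime.halt is used.

```java
class LifecycleDemo {
    // The JVM is an ordinary OS process, so it has a process id like any other.
    static long currentPid() {
        return ProcessHandle.current().pid();
    }

    public static void main(String[] args) {
        System.out.println("running inside JVM process " + currentPid());

        // Runs during the exit phase, just before the process terminates.
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> System.out.println("shutdown hook: JVM is exiting")));

        System.exit(0); // explicit exit from code (case ④ above)
    }
}
```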

Java Virtual Machine: History of the JVM

Discussions of the JVM usually cover topics such as JVM memory layout, generational GC and collection strategies, and JVM tuning. By default all of them assume the HotSpot virtual machine, because it is the default virtual machine bundled with the JDK downloaded from Oracle’s official website.

In fact, anyone can develop a JVM as long as it conforms to the official Java Virtual Machine Specification. Large vendors at home and abroad such as Alibaba, IBM, and Google run JVMs of their own development in their production environments.

From the birth of the Java language until today, many classic virtual machines have appeared and disappeared. Let us look at the development of the Java virtual machine family.

2.1. The pioneer – Sun Classic VM

  • Release date: January 1996
  • Developed by: Sun
  • Trait: the pioneer (released as the world’s first commercial Java virtual machine)

This virtual machine cannot run an interpreter and a compiler together in mixed mode: code runs either through the slow interpreter or through a compiler with long compile times. Only the interpreter was provided officially; to use a compiler, you had to plug one in yourself.

2.2 The short-lived hero – Sun Exact VM

  • Release date: December 1998
  • Developed by: Sun
  • Trait: the short-lived hero (a transitional VM that set the stage for its successors)

Sun released this virtual machine with JDK 1.2, mainly to solve various latent problems in the Sun Classic VM. It was also an early prototype of the modern high-performance virtual machine: it was the first to implement two ideas, a two-tier just-in-time compiler and a mixed compiler + interpreter execution model. Its exact memory management lets the virtual machine know precisely what type of data is stored at a given memory location, which greatly improves GC efficiency. Interestingly, although this virtual machine was far more advanced than the Classic VM, it was soon replaced by the even better HotSpot VM, a short-lived hero indeed.

2.3 The world’s fastest – BEA JRockit VM

  • Release date: February 2001
  • Developed by: BEA
  • Trait: the world’s fastest (a speed-focused virtual machine with response times at the millisecond/microsecond level)

BEA officially released this virtual machine in February 2001. It focuses on server-side applications and pays little attention to application startup speed: all code is compiled before it is executed. It also offers a comprehensive set of Java runtime solutions: the JRockit Real Time component, which can control response times down to the millisecond level and suits financial, military command, telecommunications network and similar projects, and the Mission Control service component, which manages, monitors, and analyzes Java programs in production with minimal resource overhead. In 2008 BEA was acquired by Oracle, and HotSpot in JDK 1.8 absorbed JRockit’s best features. JRockit itself has since faded into history.

2.4 The reigning champion – Sun HotSpot VM

  • Release date: May 2000
  • Developed by: Sun
  • Trait: number one among virtual machines (JDK default, best overall performance, currently the most widely used server-side virtual machine)

This virtual machine was not originally developed by Sun. It was designed by Longview Technologies as a virtual machine for a language other than Java. Sun acquired the company in 1997 and built on the design’s strengths (JIT compilation, hot code detection, and so on); the result was the HotSpot VM. Its defining feature is that execution counters find the most valuable code while the Java program runs (code that has executed often enough) and hand it to the JIT compiler. With the compiler and interpreter working together, it strikes a good balance between application response time and peak performance, avoiding both JRockit’s long compile times and the Classic VM’s slow execution.

This is the most widely used virtual machine on servers, and also a rather special one. Oracle acquired BEA in 2008 and Sun in 2009, gaining two excellent Java virtual machines, JRockit and HotSpot. Oracle then ported the best of JRockit into HotSpot to create a better HotSpot VM, the one bundled with JDK 1.8 and sometimes nicknamed “HotRockit”.

2.5. One of a kind – IBM J9 VM

  • Release date: September 2017 (open source release as OpenJ9)
  • Developed by: IBM
  • Trait: one of a kind (a virtual machine released bundled with IBM software, hardware, and platforms)

The IBM J9 VM has a market positioning similar to HotSpot’s, but J9 is a general-purpose virtual machine designed to cover everything from servers to desktop applications to embedded devices. Because of its agreement with Oracle, however, IBM was not allowed to distribute the J9 virtual machine on its own, so J9 was generally used inside IBM’s own products and platforms, such as IBM WebSphere and Java programs deployed on AIX and z/OS. J9 therefore served mostly as the virtual machine for IBM’s own applications until September 2017, when IBM contributed it to the Eclipse Foundation as open source under the name OpenJ9.

2.6 The performance beast – Azul Zing VM

  • Release date: 2010
  • Developed by: Azul Systems
  • Trait: the performance beast (manages terabyte-scale heaps with low latency and high throughput; the VM with the smoothest GC)

The Azul Zing VM was forked from the HotSpot VM before Sun was acquired. After years of in-house work, many implementation details were changed, and a GC was written for it that can manage heaps of several terabytes. Performance monitoring tools such as ZVision/ZVRobot let users observe the running JVM, including lock contention, object allocation, and hot code detection. Overall, Zing offers high throughput, low latency, fast warm-up, and easy monitoring; unfortunately it is a paid product, and not a cheap one. Azul does offer a free alternative, Zulu, which is its build of OpenJDK.

2.7 Stillborn – Apache Harmony VM

  • Developed by: Apache Foundation
  • Trait: stillborn (the most ill-fated virtual machine in Java history)

This virtual machine has an interesting story. It came from a strong background, a typical “rich kid” of the JVM family: an open source JVM backed by IBM and Intel, two of the world’s top vendors. It was expected to become a mainstream F/OSS Java implementation, but it was thwarted by Sun and ended up stillborn. Harmony was an open source implementation of Java, but because of licensing disputes the project could never call itself a Java virtual machine or an implementation of the Java core class libraries. The Harmony project worked hard to obtain a JCK license through the JCP (Java Community Process). Because of Sun’s stance, however, the JCP would only grant Harmony a TCK license with a condition attached: it could be used only for J2SE, not for all Java implementations (including J2ME and J2EE). The two sides were in conflict from the start, and the conflict continued after Oracle took over Java. Harmony, once seen as having a “bright future”, was finally dragged down by this dispute. In December 2010, the Apache Foundation, dissatisfied with how the JCP had handled the matter, announced its withdrawal from the JCP.

2.8. Something different – Kilobyte VM and CDC/CLDC HotSpot VM

Kilobyte VM is abbreviated KVM, and CDC/CLDC HotSpot is its HotSpot-based counterpart; both belong to the Java ME product line. These two VMs differ from the others: the virtual machines above all run on top of an operating system (OS), whereas KVM is a kernel-level VM that can connect to or control input hardware such as a keypad directly to execute programs. They are mainly used in products such as smart controllers, sensors, and keypad feature phones for the elderly.

2.9. A hundred schools of thought – Other virtual machines

During Java’s period of rapid growth (1990-2008 or so), many companies implemented their own virtual machines, and many excellent ones appeared. When Oracle acquired Sun and BEA, this period of contention came to an end, and Oracle’s “HotRockit” version of HotSpot, which combines the strengths of HotSpot and JRockit, became the dominant virtual machine. Today there are not many virtual machines to choose from, but each one that has survived has its own strengths. Here are a few more virtual machines that appeared along the way (not all of them are obsolete; some are still in use):

  • TaoBao VM: a customized high-performance virtual machine developed by Taobao on the basis of HotSpot
  • Dalvik VM: a virtual machine developed by Google for Android that does not follow the Java Virtual Machine Specification
    • Before Android 5.0: based on the register model; it runs the dex files generated after compilation
    • From Android 5.0: Google replaced it with a new virtual machine, the ART VM
  • Lemur VM: a virtual machine developed by Alibaba Cloud that is compatible with both Android and Java applications
  • Squawk VM: a virtual machine developed by Sun for small embedded environments
  • Liquid VM: a VM developed by BEA that controls the hardware directly, avoiding switches between user mode and kernel mode
  • Microsoft VM: a virtual machine developed by Microsoft to run Java applications in IE3
  • Jikes RVM: a meta-circular virtual machine developed by IBM for JVM research (academic use)
  • Maxine VM: a meta-circular virtual machine developed by Oracle and former Sun employees, serving the same research purpose
  • GraalVM: Oracle’s latest open source, full-stack, cross-platform virtual machine, designed as an execution platform for any language
  • …

We mentioned the concept of a meta-circular virtual machine above. What does that mean? A meta-circular virtual machine is one written in the very language it executes, in this case a JVM developed in Java.

That is all for the history of the JVM family for now. At present HotSpot holds the dominant position in the J2EE field, and J9 is the leader in the J2ME field.

Judging from GraalVM, the latest virtual machine released by Oracle, virtual machines are heading towards finer-grained modularization, stronger GC mechanisms, mixed multi-language execution, and multi-core parallel execution.

Java Virtual Machine JVM architecture

A Java virtual machine is generally composed of several modules: a class loading subsystem, an execution engine subsystem, the runtime data area, a garbage collection subsystem, a native interface, and native method libraries. While a Java program starts and runs, these modules each play their part and cooperate to form the complete Java execution platform, the JVM.

The categories listed here cover the entire JVM as a whole, not a single dimension, and are covered briefly in this article and in more detail in subsequent articles.

3.1 Class loading subsystem

The class loading subsystem loads compiled class files, but it is only responsible for loading correctly formatted class bytecode into memory. As long as a class file is well-formed it can be loaded; whether the loaded class can actually be executed is not the class loading subsystem’s concern but falls within the scope of the execution engine subsystem.

In the class loading subsystem, the core topics fall into three parts: class loaders, the parent delegation model, and the class loading process.
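As a small preview of class loaders and parent delegation, the sketch below walks from a class’s loader up through its parents. The class name LoaderChain is hypothetical. The names printed for an application class depend on the JDK version, but core classes such as java.lang.String are loaded by the bootstrap loader, which Java code sees as null.

```java
class LoaderChain {
    // Describes the loader of `type` followed by each parent, ending at the bootstrap loader.
    static String chain(Class<?> type) {
        StringBuilder text = new StringBuilder();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            text.append(loader.getClass().getName()).append(" -> ");
            loader = loader.getParent();
        }
        // A null loader stands for the bootstrap class loader.
        return text.append("bootstrap").toString();
    }

    public static void main(String[] args) {
        System.out.println("LoaderChain: " + chain(LoaderChain.class));
        System.out.println("String:      " + chain(String.class)); // prints "String:      bootstrap"
    }
}
```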

3.2 Execution engine subsystem

The execution engine subsystem acts as the JVM’s “translator”, translating in-memory class bytecode instructions into machine language for the hardware to execute. Bytecode can be turned into machine instructions by an interpreter or by a just-in-time (JIT) compiler. The HotSpot virtual machine uses both in a mixed mode.
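The standard management API offers a small window into the execution engine. The sketch below asks the running JVM for the name of its JIT compiler and, where supported, the time spent compiling so far. JitInfo is a made-up class name; the exact compiler name and the numbers depend on the JVM in use.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

class JitInfo {
    static String describe() {
        // Returns null when the JVM has no compilation system at all.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            return "no JIT compiler available (interpreter only)";
        }
        String text = "JIT compiler: " + jit.getName();
        if (jit.isCompilationTimeMonitoringSupported()) {
            text += ", total compile time so far: " + jit.getTotalCompilationTime() + " ms";
        }
        return text;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

On HotSpot, running the same program with -Xint (interpreter only) or -Xcomp (compile on first use) is one way to compare the two execution modes with the default mixed mode.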

3.3. Runtime data area

The runtime data area is the heart of the JVM: all code written by the developer is ultimately loaded here before it executes. The way this area is managed is also one of the reasons Java does not need the manual memory management that C does.

It is also one of the few areas in the JVM “black box” that developers have access to.

The Java runtime data area is divided into the program counter (PC register), the native method stack, the virtual machine stack, the metaspace (method area), and the heap, which are the core of the analysis in subsequent articles. The Java Virtual Machine Specification, however, defines these areas only abstractly; the details of the runtime data structures are not specified, leaving most of the decisions to the JVM implementer.
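Some of these areas can be inspected at run time through the standard management API. The following sketch lists the memory pools the running JVM exposes, together with their type and current usage. MemoryAreas is a hypothetical class name, and the pool names vary with the JVM version and the garbage collector, which reflects how much the specification leaves to the implementation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class MemoryAreas {
    // Bytes currently used on the Java heap.
    static long heapUsedBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        System.out.println("heap used: " + heapUsedBytes() + " bytes");
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
            long used = (usage == null) ? -1 : usage.getUsed();
            System.out.println(pool.getType() + " | " + pool.getName() + " | used=" + used);
        }
    }
}
```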

3.4 Garbage collection subsystem

The Java Virtual Machine Specification does not explicitly require a garbage collection subsystem, but until hardware with unlimited memory is invented, most JVM implementations include one. The GC module’s main job is to manage memory on behalf of the Java program: it clears garbage objects, manages live objects, and reclaims fragmented memory. The GC mechanism is also the ultimate reason Java can avoid C-style manual memory management: it manages memory automatically, so the developer does not have to deal with memory directly.

The JVM’s GC mechanism focuses on three areas: garbage collector, garbage collection algorithm, and GC tuning.
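To see which garbage collectors a running JVM actually uses, the management API can be queried as in this sketch. GcInfo is a made-up class name; the collector names, counts, and times depend on the JVM and its configuration, and System.gc() is only a request that the JVM is free to ignore.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcInfo {
    // Names of the garbage collectors active in this JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Create some short-lived garbage, then suggest a collection.
        for (int i = 0; i < 100_000; i++) {
            byte[] garbage = new byte[1024];
        }
        System.gc();

        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + " | collections=" + gc.getCollectionCount()
                    + " | time=" + gc.getCollectionTime() + " ms");
        }
    }
}
```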

3.5. Native interface and native method libraries

The native method interface exists to let code written in other programming languages be used from Java; its original purpose was to integrate C/C++ programs. Java was born when C was dominant, so to gain a foothold it needed a module that could call C code, and a dedicated memory area (the native method stack) was set aside to handle methods marked native. At run time, the native method interface registers the native method being called in the native method stack, and the execution engine loads the native implementation, typically written in C, from the native method library. In short, the native method interface is Java’s interface for calling non-Java code, usually functions in native method libraries written in C.
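Native methods are easy to spot with reflection, because the native keyword is recorded as a method modifier. The sketch below checks two JDK methods: Object.hashCode, which is declared native, and String.length, which is ordinary Java code. NativeMethods and isNative are names invented for this illustration.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class NativeMethods {
    // True when the named method is declared with the `native` modifier.
    static boolean isNative(Class<?> type, String name, Class<?>... parameterTypes) {
        try {
            Method method = type.getDeclaredMethod(name, parameterTypes);
            return Modifier.isNative(method.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Object.hashCode native? " + isNative(Object.class, "hashCode")); // true
        System.out.println("String.length native?   " + isNative(String.class, "length"));   // false
    }
}
```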

That concludes this brief look at the JVM’s architecture, which should give you an overall picture of the JVM.

The JVM is a platform built on top of a platform, divided into modules that work together to execute Java code while the virtual machine runs. Note, however, that although there is a lot to know about the JVM, very few parts of it can be touched directly by developers in day-to-day work, and so-called JVM tuning usually only targets the GC and the runtime data areas. It also includes JVM configuration: depending on the machine and the project, specifying appropriate parameters at startup to get the best configuration.
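Startup parameters can be checked from inside the program, which is handy when verifying that a configuration actually took effect. This sketch prints the arguments passed to the JVM and the resulting maximum heap size. JvmSettings is a hypothetical class name; -Xms and -Xmx in the usage line are the standard options for initial and maximum heap size.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class JvmSettings {
    // Maximum amount of memory the JVM will attempt to use for the heap, in bytes.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        // Options given to the JVM itself, not the program arguments in `args`.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("max heap: " + maxHeapBytes() / (1024 * 1024) + " MiB");
        System.out.println("available processors: " + Runtime.getRuntime().availableProcessors());
    }
}
```

For example, running java -Xms256m -Xmx512m JvmSettings should list both options and report a maximum heap close to 512 MiB.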

HotSpot virtual machine source code

Since most deployed Java applications currently run on the HotSpot virtual machine by default, most of the upcoming articles will be based on HotSpot as well. If you are interested in the OpenJDK 1.8 source code, you can download it from the Oracle official website; because the official site is often unreachable, a network disk resource link is also attached (extraction code: M3GM; note that the download is the open source JDK 1.8). The source directory structure is as follows:

├─src                             Source code
│  ├─cpu                          CPU-specific code
│  ├─os                           OS-specific code
│  ├─os_cpu                       Code specific to an OS + CPU combination
│  └─share                        Platform-independent shared code
│     ├─tools                     Tools
│     │  ├─IdealGraphVisualizer   Tool for visualizing the intermediate code of the server compiler
│     │  ├─LogCompilation         Tool that converts the log output by -XX:+LogCompilation (hotspot.log) into an easier-to-read format
│     │  └─ProjectCreator         Tool that generates Visual Studio project files
│     └─vm                        HotSpot VM core code
│        ├─c1                     Client compiler (C1)
│        ├─ci                     Common services of the dynamic compilers / interface from the compilers to the VM
│        ├─code                   Management of dynamically generated code
│        ├─compiler               Interface for calling the dynamic compilers from the VM
│        ├─gc_implementation      GC implementations
│        │  ├─g1                  Garbage-First GC implementation
│        │  ├─parallelScavenge    ParallelScavenge GC implementation (server VM default; does not use the old generational GC framework)
│        │  ├─parNew              ParNew GC implementation
│        │  └─shared              Code shared by the GC implementations
│        ├─gc_interface           GC interface
│        ├─interpreter            Interpreter
│        ├─memory                 Memory management
│        ├─oops                   Object system implementation of the HotSpot VM
│        ├─prims                  External interfaces of the HotSpot VM
│        ├─services               Management services
│        └─shark                  An LLVM-based JIT compiler (not used in the official version)
└─test                            Tests

That is the overall layout of HotSpot’s source code.

Summary

This chapter gives a brief overview of the JVM and what it is made of, without going into every detail. Its purpose is to build an overall picture of the JVM; subsequent articles continue from here in more depth.