Note source: Atguigu (Shang Silicon Valley) complete JVM tutorial, with millions of views (Song Hongkang explains the Java virtual machine in detail)

Update: gitee.com/vectorx/NOT…

Codechina.csdn.net/qq_35925558…

Github.com/uxiahnan/NO…


4. Virtual machine stack

4.1. Overview of the virtual machine stack

4.1.1. Background of the emergence of the virtual machine stack

Because Java is designed to be cross-platform, its instruction set is stack-based. CPU architectures differ from platform to platform, so the instructions cannot be designed around registers.

The advantages are portability, a small instruction set, and a compiler that is easy to implement. The disadvantage is lower performance: the same operation needs more instructions.

4.1.2. Preliminary impressions

Many Java developers have only a coarse-grained picture of the JVM's memory structure: to them it is just the Java heap and the Java stack. Why is that?

4.1.3. Stacks and heaps in memory

The stack is the unit of execution, and the heap is the unit of storage.

  • The stack handles the running of the program: how the program executes and how it processes data.
  • The heap handles data storage: how and where the data is kept.

4.1.4. Basic contents of virtual machine stack

What is a Java virtual machine stack?

The Java Virtual Machine Stack is also known as the Java stack. Each thread creates its own virtual machine stack when the thread is created. The stack holds stack frames, one for each Java method call. The stack is thread-private.

Life cycle

Its life cycle is the same as that of its thread.

Role

It manages the execution of a Java program: it stores a method's local variables and partial results, and it takes part in method calls and returns.

The characteristics of the stack

The stack is a fast and efficient way to allocate storage; its access speed is second only to the program counter.

The JVM performs only two direct operations on the Java stack:

  • Push a stack frame when a method is invoked
  • Pop the stack frame when the method finishes

The stack has no garbage collection problem, but it can overflow.

Interview question: What exceptions have you encountered during development?

Possible exceptions in the stack

The Java Virtual Machine specification allows the size of the Java stack to be either dynamic or fixed.

  • With fixed-size Java virtual machine stacks, the stack size of each thread can be chosen independently when the thread is created. If a thread requests more stack capacity than the maximum the Java virtual machine stack allows, the Java virtual machine throws a StackOverflowError.

  • If Java virtual machine stacks can be dynamically extended, an OutOfMemoryError is thrown when an extension is attempted but not enough memory is available, or when there is not enough memory to create the virtual machine stack for a new thread.

public class StackErrorTest {
    public static void main(String[] args) {
        test();
    }

    public static void test() {
        test();
    }
}
// Throws: Exception in thread "main" java.lang.StackOverflowError
// The program recurses endlessly with no exit condition, so frames keep being pushed onto the stack.

Set stack memory size

We can use the -Xss option to set the maximum stack space of a thread. The stack size directly determines the maximum reachable depth of method calls.

public class StackDeepTest {
    private static int count = 0;

    public static void recursion() {
        count++;
        recursion();
    }

    public static void main(String[] args) {
        try {
            recursion();
        } catch (Throwable e) {
            // Report how deep the recursion got before the stack overflowed
            System.out.println("deep of calling=" + count);
            e.printStackTrace();
        }
    }
}
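Besides the JVM-wide -Xss option, a single thread can request its own stack size through the four-argument Thread constructor. The sketch below measures the recursion depth reached for a requested size. The class and method names (StackSizeDemo, maxDepth) are illustrative, not from the tutorial, and the requested size is only a hint that the JVM may round or ignore, so the printed depths vary by platform.

```java
// Sketch: recursion depth reached with different requested thread stack sizes.
// StackSizeDemo and its methods are illustrative names, not part of the tutorial.
class StackSizeDemo {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse();
    }

    // Runs unbounded recursion on a thread that requests stackBytes of stack
    // and returns the depth reached when StackOverflowError was thrown.
    static int maxDepth(long stackBytes) {
        int[] result = new int[1];
        Thread t = new Thread(null, () -> {
            depth = 0;
            try {
                recurse();
            } catch (StackOverflowError e) {
                result[0] = depth;
            }
        }, "stack-size-demo", stackBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println("depth with 256 KB requested: " + maxDepth(256 * 1024));
        System.out.println("depth with 2 MB requested:   " + maxDepth(2 * 1024 * 1024));
    }
}
```

On a typical HotSpot setup the second number tends to be larger, but neither value is guaranteed.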

4.2. Storage unit of stack

4.2.1. What is stored in the stack?

Each thread has its own stack, and the data in the stack exists in the form of stack frames.

Each method being executed on the thread corresponds to one stack frame.

A stack frame is a block of memory: a data set that holds the various pieces of data needed while a method executes.

4.2.2. Stack operation principle

The JVM performs only two direct operations on the Java stack: pushing and popping stack frames, following the "first in, last out" / "last in, first out" principle.

In an active thread, only one stack frame is active at any moment: the frame of the currently executing method, at the top of the stack. This frame is called the current frame, its method is the current method, and the class that defines the method is the current class.

All bytecode instructions run by the execution engine operate only on the current stack frame.

If another method is called in this method, a new stack frame is created and placed at the top of the stack as the new current frame.

The stack frames contained in different threads are not allowed to reference each other, that is, it is impossible to reference another thread’s stack frame in one stack frame.

If the current method calls another method, then when that method returns, its stack frame passes the execution result back to the previous stack frame. The virtual machine then discards that frame, and the previous stack frame becomes the current frame again.

A Java method can return in two ways: normally, through a return instruction, or by throwing an exception. Either way, its stack frame is popped.

public class CurrentFrameTest {
    public void methodA() {
        System.out.println("Current stack frame corresponding method ->methodA");
        methodB();
        System.out.println("Current stack frame corresponding method ->methodA");
    }

    public void methodB() {
        System.out.println("Current stack frame corresponding method ->methodB");
    }
}
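The push and pop behavior described in this section can be observed, roughly, through the length of the current stack trace. The sketch below is only an illustration: FrameDepthDemo and its helpers are made-up names, and it assumes the JVM reports every frame in the trace, which HotSpot normally does for an explicitly created Throwable.

```java
import java.util.Arrays;

// Sketch: observe frames being pushed and popped via the stack trace length.
// FrameDepthDemo is an illustrative name, not part of the tutorial.
class FrameDepthDemo {
    // Number of frames on this thread's stack, counted from the caller of depth().
    static int depth() {
        return new Throwable().getStackTrace().length - 1; // exclude depth() itself
    }

    static int calleeDepth() {
        return depth();
    }

    static void failing() {
        throw new IllegalStateException("abnormal completion");
    }

    // Returns depth changes relative to this method's own frame:
    // {inside a callee, after a normal return, after an exception}.
    static int[] observe() {
        int before = depth();
        int inside = calleeDepth();
        int afterReturn = depth();
        try {
            failing();
        } catch (IllegalStateException e) {
            // failing()'s frame has already been popped when control arrives here
        }
        int afterException = depth();
        return new int[] { inside - before, afterReturn - before, afterException - before };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(observe()));
    }
}
```

The callee's frame sits one level above the caller's, and the depth falls back to the original value after both the normal return and the exception.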

4.2.3. Internal structure of stack frames

Each stack frame stores:

  • Local Variables
  • Operand Stack (or expression Stack)
  • Dynamic Linking (a reference to the method in the runtime constant pool)
  • Method Return Address (the definition of normal or abnormal method exit)
  • Some additional information

The stacks of concurrently running threads are private, so each thread has its own stack, and each stack contains many stack frames. The size of a stack frame is determined mainly by the local variable table and the operand stack.

4.3. Local Variables

The local variable table is also called the local variable array.

  • It is defined as a numeric array, used mainly to store method parameters and the local variables defined in the method body. The data types include the primitive types, object references, and the returnAddress type.

  • Because the local variable table is built on the thread's stack, it is thread-private data, so there is no data safety problem.

  • The size the local variable table needs is determined at compile time and stored in the maximum local variables (max_locals) item of the method's Code attribute. The size of the table does not change while the method runs.

  • The number of nested calls a method can make is determined by the stack size. In general, the larger the stack, the more nested calls are possible. The more parameters and local variables a method has, the larger its local variable table and therefore its stack frame. Such calls take up more stack space, so fewer nested calls fit.

  • Variables in the local variable table are valid only during the current method call. While the method executes, the virtual machine uses the local variable table to pass parameter values to the parameter variable list. When the method call ends, the local variable table is destroyed along with the method's stack frame.

4.3.1. Understanding of Slot

  • The most basic storage unit of the local variable table is the slot.

  • Parameter values are always stored starting at index 0 of the local variable array, and the array ends at index length - 1.

  • The local variable table stores the primitive types (8 kinds), reference types, and returnAddress-type variables that are known at compile time.

  • In the local variable table, types of up to 32 bits occupy one slot (including the returnAddress type), and 64-bit types (long and double) occupy two slots.

  • byte, short, and char are converted to int before being stored. boolean is also converted to int, where 0 means false and non-zero means true.

  • The JVM assigns an access index to each slot in the local variable table; the value of a local variable is accessed through this index.

  • When an instance method is called, its parameters and the local variables defined in its body are copied, in order, into the slots of the local variable table.

  • To access a 64-bit local variable (a long or double), use only the first of its two indexes.

  • If the current frame is created by a constructor or an instance method, the reference to the object itself (this) is stored in the slot at index 0, and the remaining parameters follow in parameter-list order.
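The slot rules above can be turned into a small calculation. The sketch below uses reflection to count how many slots a method's receiver and parameters occupy; SlotLayoutDemo and parameterSlots are hypothetical helpers written for this note, not a JVM API, and local variables declared in the method body are not counted.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Sketch: count the local-variable slots taken by `this` and the parameters.
// SlotLayoutDemo and parameterSlots are illustrative helpers, not a JVM API.
class SlotLayoutDemo {
    void instanceMethod(int a, long b, double c, Object d) { }

    static void staticMethod(long x, int y) { }

    static int parameterSlots(Method m) {
        // Slot 0 holds `this` for instance methods; static methods have no receiver.
        int slots = Modifier.isStatic(m.getModifiers()) ? 0 : 1;
        for (Class<?> type : m.getParameterTypes()) {
            // long and double take two consecutive slots, everything else takes one
            slots += (type == long.class || type == double.class) ? 2 : 1;
        }
        return slots;
    }

    static int slotsOf(String name) {
        for (Method m : SlotLayoutDemo.class.getDeclaredMethods()) {
            if (m.getName().equals(name)) {
                return parameterSlots(m);
            }
        }
        throw new IllegalArgumentException("no such method: " + name);
    }

    public static void main(String[] args) {
        // this=0, a=1, b=2..3, c=4..5, d=6
        System.out.println(slotsOf("instanceMethod")); // 7
        // x=0..1, y=2
        System.out.println(slotsOf("staticMethod"));   // 3
    }
}
```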

4.3.2. Slot reuse

The slots in the local variable table in the stack frame can be reused. If a local variable goes out of its scope, the new local variable declared after its scope is likely to reuse the slots of the expired local variable, so as to achieve the purpose of saving resources.

public class SlotTest {
    public void localVar1() {
        int a = 0;
        System.out.println(a);
        int b = 0;
    }

    public void localVar2() {
        {
            int a = 0;
            System.out.println(a);
        }
        // a is out of scope here, so b reuses the slot that a occupied
        int b = 0;
    }
}

4.3.3. Static versus local variables

After the slots for the parameter list are allocated, the remaining slots are allocated according to the order and scope of the variables defined in the method body.

We know that a class variable has two chances to be initialized. The first is during the "preparation" phase, when the system initializes class variables to their zero values. The second is during the "initialization" phase, when they are assigned the initial values that the programmer defined in the code.

Unlike class variables, the local variable table has no system initialization. Once a local variable is defined, it must be initialized explicitly, or it cannot be used.

public void test() {
    int i;
    System.out.println(i); // compile error: variable i might not have been initialized
}

Such code does not compile: a local variable cannot be used before it has been assigned.
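A minimal sketch of the contrast, with made-up names (InitDemo and its fields): class variables can be read without ever being assigned because they received zero values in the preparation phase, while a local variable has to be assigned before it is read.

```java
// Sketch: class variables get zero values; local variables get no default.
// InitDemo and its members are illustrative names.
class InitDemo {
    static int counter;          // zero value 0, never assigned explicitly
    static String label;         // zero value null
    static int configured = 42;  // 0 after preparation, 42 after initialization

    static int localMustBeAssigned() {
        int i;
        i = 7;      // removing this assignment turns the next line into a compile error
        return i;
    }

    public static void main(String[] args) {
        System.out.println(counter);               // 0
        System.out.println(label);                 // null
        System.out.println(configured);            // 42
        System.out.println(localMustBeAssigned()); // 7
    }
}
```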

4.3.4. Supplementary notes

The part of the stack frame most relevant to performance tuning is the local variable table described above. When a method executes, the virtual machine uses the local variable table to pass the method's arguments.

Variables in the local variable table are also important garbage collection roots: any object referenced directly or indirectly from the local variable table will not be collected.

4.4. Operand Stack

In addition to the local variable table, each stack frame contains a last-in-first-out operand stack, also known as the expression stack.

During the execution of a method, data is written to or read from the operand stack according to bytecode instructions, that is, pushed and popped.

  • Some bytecode instructions push values onto the operand stack; others pop operands off the stack, use them, and push the result back.
  • Examples include copy, swap, and sum operations.

Example code

public void testAddOperation() {
    byte i = 15;
    int j = 8;
    int k = i + j;
}

Bytecode instruction information

public void testAddOperation();
    Code:
     0: bipush 15
     2: istore_1
     3: bipush 8
     5: istore_2
     6: iload_1
     7: iload_2
     8: iadd
     9: istore_3
    10: return

The operand stack is used mainly to hold intermediate results of a computation and to serve as temporary storage for variables during the computation.

The operand stack is a work area of the JVM execution engine. When a method starts executing, a new stack frame is created, and the operand stack of that frame is empty.

Each operand stack has a definite depth for storing values. The maximum depth needed is determined at compile time and stored in the method's Code attribute as the value of max_stack.

An element of the stack can be of any Java data type:

  • 32-bit types occupy one unit of stack depth
  • 64-bit types occupy two units of stack depth

The operand stack is not accessed by index; data is accessed only through the standard push and pop operations.

If the called method has a return value, the return value is pushed onto the operand stack of the current stack frame, and the PC register is updated to the next bytecode instruction to execute.

The data types of the elements in the operand stack must exactly match the sequence of bytecode instructions. This is checked by the compiler at compile time and checked again during the data-flow analysis in the class verification stage of class loading.

In addition, when we say that the interpreter engine of the Java virtual machine is a stack-based execution engine, the stack in question is the operand stack.

4.5. Code tracing

public void testAddOperation() {
    byte i = 15;
    int j = 8;
    int k = i + j;
}

Decompile the class file using the javap command: javap -v ClassName.class

public void testAddOperation();
    Code:
     0: bipush 15
     2: istore_1
     3: bipush 8
     5: istore_2
     6: iload_1
     7: iload_2
     8: iadd
     9: istore_3
    10: return
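To follow the bytecode step by step, it can help to replay it on a toy model of the local variable table and the operand stack. The sketch below hard-codes the instructions of testAddOperation; OperandStackDemo is an illustrative teaching model and does not reflect how HotSpot is implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: replay the bytecode of testAddOperation on a toy operand stack.
// This is a teaching model, not how a real JVM is implemented.
class OperandStackDemo {
    static int maxDepthSeen;

    static int run() {
        int[] locals = new int[4];                 // slot 0 = this (unused here), 1 = i, 2 = j, 3 = k
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack, empty on method entry
        maxDepthSeen = 0;

        push(stack, 15);                           // 0: bipush 15
        locals[1] = stack.pop();                   // 2: istore_1
        push(stack, 8);                            // 3: bipush 8
        locals[2] = stack.pop();                   // 5: istore_2
        push(stack, locals[1]);                    // 6: iload_1
        push(stack, locals[2]);                    // 7: iload_2
        push(stack, stack.pop() + stack.pop());    // 8: iadd pops two ints and pushes their sum
        locals[3] = stack.pop();                   // 9: istore_3
        return locals[3];                          // the real method is void and ends with `return`
    }

    private static void push(Deque<Integer> stack, int value) {
        stack.push(value);
        maxDepthSeen = Math.max(maxDepthSeen, stack.size());
    }

    public static void main(String[] args) {
        System.out.println("k = " + run());                     // k = 23
        System.out.println("max stack depth = " + maxDepthSeen); // max stack depth = 2
    }
}
```

The deepest the toy stack gets is two entries, just before iadd; that is the kind of figure the compiler records as max_stack.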

The difference between i++ and ++i, a common programmer interview question, will be covered in the bytecode chapter.

4.6. Top-of-Stack Caching technology

As mentioned earlier, a stack-based virtual machine uses more compact zero-address instructions, but completing an operation takes more push and pop instructions, which means more instruction dispatches and more memory reads and writes.

Because operands are stored in memory, frequent memory reads and writes inevitably slow execution. To address this, the designers of the HotSpot JVM proposed Top-of-Stack Caching (ToS), which caches the top-of-stack elements in the registers of the physical CPU to reduce memory reads and writes and improve the efficiency of the execution engine.

4.7. Dynamic Linking

Some references group dynamic linking, the method return address, and the additional information together as the frame data area.

Each stack frame contains an internal reference to the runtime constant pool entry for the method the frame belongs to. This reference exists so that the code of the current method can support dynamic linking, for example the invokedynamic instruction.

When a Java source file is compiled into a bytecode file, all variable and method references are stored as symbolic references in the class file's constant pool. For example, a method that calls another method describes the call through a symbolic reference to that method in the constant pool. The role of dynamic linking is to convert these symbolic references into direct references to the called method.

Why do we need a runtime constant pool?

The constant pool provides the symbols and constants that instructions use to identify what they refer to.

4.8. Method invocation: resolution and dispatch

In the JVM, converting a symbolic reference into a direct reference to the called method is tied to the method binding mechanism.

4.8.1. Static linking

When a bytecode file is loaded into the JVM, if the target method being called is known at compile time and stays the same at run time, the process of converting the symbolic reference of the called method into a direct reference is called static linking.

4.8.2. Dynamic linking

If the method to be called cannot be determined at compile time, its symbolic reference can only be converted into a direct reference while the program runs. Because this conversion happens dynamically, it is called dynamic linking.

Static linking and dynamic linking here are not nouns but verbs; this is the key to understanding them.

The corresponding method binding mechanisms are early binding and late binding. Binding is the process by which a symbolic reference to a field, method, or class is replaced with a direct reference, and it happens only once.

4.8.3. Early binding

Early binding means that the target method is known at compile time and stays the same at run time, so the method can be bound to the type it belongs to. Because it is clear exactly which method will be called, the symbolic reference can be converted into a direct reference by static linking.

4.8.4. Late binding

If the method to be called cannot be determined at compile time, the method can only be bound according to the actual type at run time. This is called late binding.

As high-level languages have multiplied, more and more of them are object-oriented languages similar to Java. Although their syntax styles differ, they share one thing: they all support object-oriented features such as encapsulation, inheritance, and polymorphism. Because these languages are polymorphic, they naturally have both kinds of binding: early binding and late binding.

Any ordinary method in Java has the characteristics of a virtual function, the equivalent of a virtual function in C++ (which must be declared explicitly with the keyword virtual). If you do not want a method in a Java program to behave as a virtual function, you can mark it with the keyword final.
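The two kinds of binding can be seen side by side in a short sketch. The class names below (BindingBase, BindingChild, BindingDemo) are invented for illustration. A static method is selected by the declared type of the reference, while an ordinary instance method is selected by the actual type of the object.

```java
// Sketch: static methods are bound early by declared type,
// instance methods are bound late by actual type. Names are illustrative.
class BindingBase {
    static String kind() { return "base"; }
    String name() { return "base"; }
    final String fixed() { return "fixed"; } // final: cannot be overridden
}

class BindingChild extends BindingBase {
    static String kind() { return "child"; } // hides BindingBase.kind(), does not override it
    @Override
    String name() { return "child"; }
}

class BindingDemo {
    static String earlyBound() {
        BindingBase ref = new BindingChild();
        // Calling a static method through a reference is legal but discouraged;
        // it is done here only to show that the declared type decides.
        return ref.kind();
    }

    static String lateBound() {
        BindingBase ref = new BindingChild();
        return ref.name(); // the actual type (BindingChild) decides at run time
    }

    public static void main(String[] args) {
        System.out.println(earlyBound());                // base
        System.out.println(lateBound());                 // child
        System.out.println(new BindingChild().fixed());  // fixed
    }
}
```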


4.8.5. Virtual and non-virtual methods

If the specific version of a method to call is determined at compile time and cannot change at run time, the method is called a non-virtual method.

Static methods, private methods, final methods, instance constructors, and superclass methods are all non-virtual. All other methods are virtual methods.

Calls to non-virtual methods can be resolved during the resolution phase of class loading. The following is an example of non-virtual methods:

class Father {
    public static void print(String str) {
        System.out.println("father " + str);
    }

    private void show(String str) {
        System.out.println("father" + str);
    }
}

class Son extends Father {
}

public class VirtualMethodTest {
    public static void main(String[] args) {
        Son.print("coder");
        // Father fa = new Father();
        // fa.show("atguigu.com");
    }
}

The virtual machine provides the following method call instructions:

Ordinary call instructions:

  • invokestatic: calls static methods; the unique method version is determined in the resolution phase
  • invokespecial: calls <init> methods, private methods, and superclass methods; the unique method version is determined in the resolution phase
  • invokevirtual: calls all virtual methods
  • invokeinterface: calls interface methods

Dynamic call instruction:

  • invokedynamic: dynamically resolves the method to be called, then executes it

The first four instructions are fixed inside the virtual machine, and method invocation proceeds without human intervention, whereas the invokedynamic instruction lets the user determine the method version. Methods called by invokestatic and invokespecial are non-virtual; the rest (except those marked final) are virtual.

About the invokedynamic instruction
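As a rough guide, the sketch below annotates ordinary Java calls with the instruction javac usually emits for each. The class names are made up for this note, and the comments describe typical compiler output rather than a guarantee; running javap -c on the compiled class is the way to check a specific compiler version. Private methods are left out on purpose, since the instruction chosen for them differs between older and newer compilers.

```java
import java.util.function.Supplier;

// Sketch: Java calls annotated with the invoke instruction javac typically emits.
// All names are illustrative; verify with `javap -c` for a given compiler.
interface InvokeGreeter {
    String greet();
}

class InvokeParent {
    String who() { return "parent"; }
}

class InvokeDemo extends InvokeParent implements InvokeGreeter {
    static String util() { return "static"; }

    @Override
    String who() { return "child"; }

    @Override
    public String greet() { return "hello"; }

    final String locked() { return "final"; }

    String callAll() {
        String a = util();                    // invokestatic
        String b = super.who();               // invokespecial (superclass method)
        String c = this.who();                // invokevirtual
        String d = locked();                  // invokevirtual in bytecode, yet non-virtual because it is final
        InvokeGreeter g = this;
        String e = g.greet();                 // invokeinterface
        Supplier<String> s = () -> "lambda";  // invokedynamic creates the Supplier
        String f = s.get();                   // invokeinterface
        return String.join(",", a, b, c, d, e, f);
    }

    public static void main(String[] args) {
        InvokeDemo demo = new InvokeDemo();   // invokespecial on <init>
        System.out.println(demo.callAll());   // static,parent,child,final,hello,lambda
    }
}
```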

  • The JVM bytecode instruction set stayed relatively stable until Java 7 added the invokedynamic instruction, an improvement made to support "dynamically typed languages" on the JVM.

  • In Java 7, however, the language offered no direct way to generate invokedynamic instructions; you had to use a low-level bytecode tool such as ASM to produce them. Only with Java 8 lambda expressions did Java gain a direct way to generate invokedynamic instructions.

  • The dynamic language support added in Java 7 is essentially a change to the Java virtual machine specification, not to the rules of the Java language. This area is relatively complex; the added method invocation mechanism most directly benefits compilers of dynamic languages that run on the Java platform.
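Two ordinary Java features touch this machinery. The sketch below shows a lambda, whose creation javac (Java 8 and later) compiles to invokedynamic, and a method handle from java.lang.invoke, the API added alongside invokedynamic in Java 7, which looks up its target by name and type at run time. DynamicInvokeDemo is an illustrative name, and the example does not emit invokedynamic by hand.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.IntUnaryOperator;

// Sketch: a lambda and a method handle. DynamicInvokeDemo is an illustrative name.
class DynamicInvokeDemo {
    // The lambda creation below is compiled to an invokedynamic instruction.
    static int viaLambda(int x) {
        IntUnaryOperator inc = v -> v + 1;
        return inc.applyAsInt(x);
    }

    // The target method String.length() is looked up by name and type at run time.
    static int viaMethodHandle(String s) {
        try {
            MethodHandle length = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
            return (int) length.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaLambda(41));            // 42
        System.out.println(viaMethodHandle("hello")); // 5
    }
}
```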

Dynamically typed and statically typed languages

The difference between a dynamically typed language and a statically typed language is whether type checking happens at compile time or at run time. If it happens at compile time, the language is statically typed; otherwise it is dynamically typed.

Put more plainly, a statically typed language checks the type information of the variable itself, while a dynamically typed language checks the type information of the variable's value. In a dynamic language the variable has no type information; only its value does. This is an important feature of dynamic languages.

4.8.6. The nature of method overriding

The nature of method overriding in the Java language:

  1. Find the actual type of the object referenced by the first element at the top of the operand stack, and call it C.
  2. If type C contains a method whose descriptor and simple name both match those in the constant pool entry, check the access permission. If the check passes, return a direct reference to this method and end the search. If it fails, throw java.lang.IllegalAccessError.
  3. Otherwise, repeat the search and check of step 2 for each parent class of C, from bottom to top along the inheritance chain.
  4. If no suitable method is found, throw java.lang.AbstractMethodError.

About IllegalAccessError

The program attempted to access or modify a field, or to call a method, that it has no permission to access. Normally this causes a compile error. If the error occurs at run time, it indicates that a class has undergone an incompatible change.
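The lookup steps can be approximated with reflection. The sketch below walks from the receiver's actual class up through its superclasses and returns the first class that declares a matching method. It is a simplification written for this note: the names (OverrideLookupDemo, select, and the Lookup* classes) are hypothetical, and the access check of step 2 and interface default methods are left out.

```java
import java.lang.reflect.Method;

// Sketch: approximate the override lookup by walking the superclass chain.
// Names are illustrative; access checks and interface methods are omitted.
class LookupAnimal {
    public String sound() { return "..."; }
    public String legs() { return "4"; }
}

class LookupDog extends LookupAnimal {
    @Override
    public String sound() { return "woof"; }
}

class LookupPuppy extends LookupDog { }

class OverrideLookupDemo {
    // Returns the class whose declaration of `name` would be selected for receiver type c.
    static Class<?> select(Class<?> c, String name, Class<?>... paramTypes) {
        for (Class<?> k = c; k != null; k = k.getSuperclass()) {  // steps 1 and 3
            try {
                Method m = k.getDeclaredMethod(name, paramTypes); // step 2: name and parameter types match
                return m.getDeclaringClass();
            } catch (NoSuchMethodException e) {
                // not declared in k: continue with the parent class
            }
        }
        throw new AbstractMethodError(name);                      // step 4
    }

    public static void main(String[] args) {
        System.out.println(select(LookupPuppy.class, "sound").getSimpleName());    // LookupDog
        System.out.println(select(LookupPuppy.class, "legs").getSimpleName());     // LookupAnimal
        System.out.println(select(LookupPuppy.class, "hashCode").getSimpleName()); // Object
    }
}
```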

4.8.7. Method invocation: Virtual method table

In object-oriented programming, dynamic dispatch is used very frequently. Searching the class's method metadata for the right target on every dynamic dispatch could hurt execution efficiency. To improve performance, the JVM builds a virtual method table in the method area of the class (non-virtual methods do not appear in the table) and uses an index into this table instead of a search.

Each class has a virtual method table that holds the actual entry point of each method.

When is the virtual method table created?

The virtual method table is created and initialized during the linking phase of class loading. Once the initial values of the class's variables have been prepared, the JVM also initializes the method table of that class.

Example:

interface Friendly {
    void sayHello();
    void sayGoodbye();
}

class Dog {
    public void sayHello() {}

    public String toString() {
        return "Dog";
    }
}

class Cat implements Friendly {
    public void eat() {}
    public void sayHello() {}
    public void sayGoodbye() {}
    protected void finalize() {}
}

class CockerSpaniel extends Dog implements Friendly {
    public void sayHello() {
        super.sayHello();
    }

    public void sayGoodbye() {}
}
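One way to picture a virtual method table is as an ordered map from method name to the class that supplies the implementation. In the toy model below, a subclass copies its parent's table, overwrites the entries it overrides, and appends the methods it adds. VtableDemo is a made-up illustration for the Dog and CockerSpaniel classes, Object's table is shortened to three methods, and a real JVM lays its tables out differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: a toy model of virtual method tables. Not how a real JVM stores them.
class VtableDemo {
    // Builds a table by copying the parent's entries, then overriding or appending.
    static Map<String, String> vtable(Map<String, String> parent, String owner, String... declared) {
        Map<String, String> table = new LinkedHashMap<>(parent); // inherited entries keep their position
        for (String method : declared) {
            table.put(method, owner + "." + method);              // override in place, or append if new
        }
        return table;
    }

    static Map<String, String> objectTable() {
        // Simplified: java.lang.Object has more virtual methods than these three
        return vtable(new LinkedHashMap<>(), "Object", "hashCode", "equals", "toString");
    }

    static Map<String, String> dogTable() {
        return vtable(objectTable(), "Dog", "sayHello", "toString");
    }

    static Map<String, String> cockerSpanielTable() {
        return vtable(dogTable(), "CockerSpaniel", "sayHello", "sayGoodbye");
    }

    public static void main(String[] args) {
        cockerSpanielTable().forEach((method, target) ->
                System.out.println(method + " -> " + target));
    }
}
```

In this model, CockerSpaniel's toString entry still points to Dog.toString, while its sayHello entry points to its own override.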

4.9. Method Return address

It stores the value of the PC register of the method's caller. A method can end in two ways:

  • Normal completion
  • An unhandled exception, that is, abnormal exit

Either way, after the method exits, control returns to the place where it was called. When a method exits normally, the caller's PC register value, the address of the instruction after the one that called the method, serves as the return address. When a method exits because of an exception, the return address is determined through the exception table, and this information is generally not stored in the stack frame.

Once a method starts executing, there are only two ways to exit it:

  1. The execution engine encounters a return bytecode instruction, and the return value is passed to the calling method. This is called normal completion.
    • Which return instruction is used after a method completes normally depends on the actual data type of the method's return value.
    • The return bytecode instructions include ireturn (used when the return value is boolean, byte, char, short, or int), lreturn (long), freturn (float), dreturn (double), and areturn (reference types). There is also a plain return instruction, used by methods declared void, instance initializers, and class and interface initializers.
  2. An exception occurs during the execution of the method and is not handled within the method: if no matching exception handler is found in the method's exception table, the method exits. This is called abnormal completion.
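The mapping from return type to return instruction can be written down as a small reference. In the sketch below, each comment names the instruction the method ends with; ReturnKindsDemo is an illustrative class, and javap -c can be used to confirm the instructions.

```java
// Sketch: one method per return instruction. ReturnKindsDemo is an illustrative name.
class ReturnKindsDemo {
    static boolean flag()   { return true; }    // ireturn
    static char letter()    { return 'a'; }     // ireturn
    static int number()     { return 1; }       // ireturn
    static long big()       { return 2L; }      // lreturn
    static float ratio()    { return 3.0f; }    // freturn
    static double precise() { return 4.0; }     // dreturn
    static String text()    { return "five"; }  // areturn
    static void nothing()   { }                 // return

    public static void main(String[] args) {
        nothing();
        System.out.println(flag() + " " + letter() + " " + number() + " "
                + big() + " " + ratio() + " " + precise() + " " + text());
    }
}
```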

The handling of exceptions thrown while a method executes is recorded in an exception table, which makes it easy to find the code that handles an exception when one occurs.

Exception table:
   from    to  target  type
      4    16      19   any
     19    21      19   any

In essence, a method exit is the popping of the current stack frame. At that point the caller's local variable table and operand stack must be restored, the return value (if any) pushed onto the operand stack of the caller's stack frame, and the PC register set, so that the calling method can continue executing.

The difference between normal completion and abnormal completion is that a method that exits through an exception does not return any value to its caller.
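The two exit paths can be compared in a few lines. In the sketch below (CompletionDemo and its methods are illustrative names), parse handles the exception itself and still completes normally, while parseStrict has no matching handler, so the exception propagates and no value reaches the caller. Its finally block is the kind of construct that javac typically records in the exception table with type any.

```java
// Sketch: normal versus abnormal completion. CompletionDemo is an illustrative name.
class CompletionDemo {
    static int cleanups;

    // The exception is handled inside the method, so the method completes normally.
    static int parse(String s) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) { // a matching handler in this method's exception table
            return -1;
        }
    }

    // No matching handler here: on bad input the method completes abnormally
    // and the caller receives no return value. The finally block still runs.
    static int parseStrict(String s) {
        try {
            return Integer.parseInt(s);
        } finally {
            cleanups++;
        }
    }

    static String call(String s) {
        try {
            return "value " + parseStrict(s);
        } catch (NumberFormatException e) {
            return "no value, handled by caller";
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("12"));  // 12
        System.out.println(parse("x"));   // -1
        System.out.println(call("7"));    // value 7
        System.out.println(call("x"));    // no value, handled by caller
    }
}
```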

4.10. Some additional information

A stack frame may also carry additional information specific to the Java virtual machine implementation, for example information that supports program debugging.

4.11. Stack related interview questions

  • Give an example of a stack overflow. (StackOverflowError)
    • Set the stack size with -Xss
  • Can adjusting the stack size guarantee that no overflow occurs?
    • No, it cannot guarantee that
  • Is it better to allocate more stack memory?
    • No. It lowers the chance of a stack error for a while, but it crowds out the space of other threads, because total space is limited.
  • Does garbage collection involve the virtual machine stack?
    • No
  • Are local variables defined in a method thread-safe?
    • It depends. If an object is created inside the method and dies inside it, without being returned to the outside, it is thread-safe; otherwise it is not.

Runtime data area        Error possible?   GC?
Program counter          No                No
Virtual machine stack    Yes (SOE)         No
Native method stack      Yes               No
Method area              Yes (OOM)         Yes
Heap                     Yes               Yes
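The thread-safety answer above comes down to whether an object escapes the method. The sketch below, with made-up names (LocalSafetyDemo and its methods), contrasts a builder confined to one method call with a builder passed in from outside. Only the confined version is run on several threads; the shared version is shown just to mark where the risk lies.

```java
// Sketch: a confined local object versus one that comes from outside the method.
// LocalSafetyDemo and its methods are illustrative names.
class LocalSafetyDemo {
    // Thread-safe: the StringBuilder is created and dies inside the method.
    static String confined(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append('x');
        }
        return sb.toString(); // the String is immutable; the builder itself never escapes
    }

    // Not thread-safe by itself: the builder comes from outside and may be shared by threads.
    static void appendTo(StringBuilder shared, int n) {
        for (int i = 0; i < n; i++) {
            shared.append('x');
        }
    }

    // Runs confined() on several threads and reports whether every result was intact.
    static boolean allConfinedResultsIntact(int threads, int n) {
        boolean[] ok = new boolean[threads];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                ok[id] = confined(n).length() == n;
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        for (boolean b : ok) {
            if (!b) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allConfinedResultsIntact(4, 1000));
    }
}
```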