In-depth analysis of JVM memory management

The JVM – Java virtual machine

JVM stands for Java Virtual Machine. The JVM is essentially translation software: it translates files such as .class and .jar files into the machine language that each operating system understands. The JVM is cross-platform and even cross-language, and that is the charm of Java.

JVM, JRE, and JDK

JDK (Java Development Kit): the Java toolkit for developers.

JRE (Java Runtime Environment): the JVM plus a set of base class libraries; the JRE provides many of the class libraries.

JVM: the translator, which turns classes into machine language.

So JDK > JRE > JVM.

The structure of the JVM

JVM running process

The runtime data area is the focus!

The runtime data area is divided into two parts: areas shared between threads (method area, heap) and thread-private areas (virtual machine stack, program counter, native method stack).

Program counter

It records the address of the bytecode instruction that the current thread is executing; it acts as a bytecode line number indicator. Why it is needed: with CPU time slicing, a thread can be suspended in the middle of execution while another thread is switched in, so there must be a record of which instruction the thread had reached. The program counter is small, so it never causes OOM.

The virtual machine stack

When people say "the stack", they mean the JVM's virtual machine stack. A stack is a last in, first out (LIFO) structure, similar to a gun magazine.

Each stack frame is a bullet (1 method call = 1 stack frame). The stack frames are pushed one by one (bullets loaded into the magazine) and then popped one by one (fired).

The structure of stack frames

Size limit (-Xss): the default stack size is 1 MB. You can adjust it with the -Xss parameter, for example -Xss256k. If a thread's stack exceeds this limit, a StackOverflowError is thrown. The stack has a fixed size, so it can hold only a limited number of stack frames (the magazine has a size, so the number of bullets is limited).

The local variable table stores local variables: values of the eight primitive types and object references. The objects themselves live in the heap; only the references are stored here. A 32-bit type occupies one slot and a 64-bit type (long, double) occupies two.

The operand stack holds the operands used while the method executes. It is empty when the method starts; as instructions run, values are frequently pushed onto and popped off it to perform calculations.

Dynamic linking: given People wang = new Man(); wang.wc();, the call wang.wc() is linked at run time to the wc() implementation of Man (the men's room).

Return address: the address at which the caller resumes after the method returns normally (the address recorded by the program counter). If the method exits with an exception, this is not used; the exception handler table is consulted instead.
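The People/Man example can be sketched as follows. Only the names People, Man, and wc() come from the text; the method bodies and the return values are assumptions made for illustration.

```java
// Hypothetical reconstruction: only People, Man and wc() appear in the text.
// The String return values are assumptions chosen to make the dispatch visible.
class People {
    String wc() {
        return "unspecified";
    }
}

class Man extends People {
    @Override
    String wc() {
        return "men's room";
    }
}

class DynamicLinkingDemo {
    public static void main(String[] args) {
        // The static type is People, but the runtime type is Man,
        // so the call is bound to Man.wc() when it executes.
        People wang = new Man();
        System.out.println(wang.wc());
    }
}
```

The compiler only knows that wang is a People; which wc() body runs is decided from the actual object at run time.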

The process for performing the following operations:

  1. The local variable table (slot 0 always holds this, except in static methods) declares two variables, x and y, stored in slots 1 and 2
  2. x = 1: the value 1 is pushed onto the operand stack and then stored in slot 1
  3. y = 2: the value 2 is pushed onto the operand stack and then stored in slot 2
  4. x and y are loaded onto the operand stack and popped, the CPU performs 1+2, and the result 3 is pushed onto the operand stack
  5. The result 3 is popped from the operand stack and stored in the local variable table (here, in the slot of y, slot 2)
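The code that these steps describe is not shown in the text, so the method below is a hypothetical reconstruction; in particular, assigning the sum back to y is an assumption. The comments list the bytecode that javac typically emits for each statement, which you can check with javap -c.

```java
class StackFrameDemo {
    // Instance method: slot 0 holds `this`, slot 1 holds x, slot 2 holds y.
    int work() {
        int x = 1;   // iconst_1, istore_1  (push 1, store into slot 1)
        int y = 2;   // iconst_2, istore_2  (push 2, store into slot 2)
        y = x + y;   // iload_1, iload_2, iadd, istore_2
        return y;    // iload_2, ireturn
    }

    public static void main(String[] args) {
        System.out.println(new StackFrameDemo().work());
    }
}
```

Every arithmetic step goes through the operand stack: values are loaded from the local variable table, combined, and the result is stored back.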

Native method stack

It works like the virtual machine stack, but it serves native methods, which are handed off to C code to execute.

Method area

The permanent generation is how the HotSpot virtual machine implemented the method area; other virtual machines do not have it. So method area ≠ permanent generation.

The method area stores information about the classes loaded by the VM, including class information, static variables, constants, the runtime constant pool, and the string constant pool. (String objects are stored in the heap, and references to them are stored in the constant pool.)

When the JVM loads a class, it first loads the .class file. Besides the description of the class's version, fields, methods, and interfaces, the .class file contains a constant pool, which holds the literals and symbolic references generated during compilation. Literals = strings + final constants. Symbolic references = fully qualified names and descriptors of classes, methods, and fields.

Once the class is loaded into memory, the JVM stores the contents of the class file constant pool in the runtime constant pool. During the resolution phase, the JVM replaces symbolic references with direct references (for an object, its index value). For example, a string constant in a class starts out in the class file constant pool. After the JVM loads the class, it puts the string constant into the runtime constant pool and, during resolution, points it at the string object. String constants are shared globally: the same string literal appearing in several class files resolves to a single string object.
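The sharing of string constants can be observed from Java code. The following is a minimal sketch; the class and method names are made up for illustration.

```java
class StringPoolDemo {
    // Two identical literals resolve to the same pooled String object.
    static boolean sameLiteral() {
        String a = "jvm";
        String b = "jvm";
        return a == b;
    }

    // new String(...) always creates a separate object on the heap.
    static boolean newStringIsDistinct() {
        String a = "jvm";
        String c = new String("jvm");
        return a != c;
    }

    // intern() returns the pooled instance for an equal string.
    static boolean internReturnsPooled() {
        String c = new String("jvm");
        return c.intern() == "jvm";
    }

    public static void main(String[] args) {
        System.out.println(sameLiteral());
        System.out.println(newStringIsDistinct());
        System.out.println(internReturnsPooled());
    }
}
```

All three methods return true, which is why strings should be compared with equals() rather than ==.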

The method area is a shared memory area like the heap, so it is shared by all threads. If two threads try to access information about the same class in the method area and the class has not yet been loaded, only one thread is allowed to load it and the other must wait. In the HotSpot virtual machine, Java 7 moved static variables and the string constant pool from the permanent generation to the heap, while the rest stayed in the JVM's non-heap memory. Java 8 removed the permanent generation altogether and replaced it with Metaspace, which stores class metadata in native memory.

Configuration parameters: JDK 1.7 and earlier (initial and maximum): -XX:PermSize; -XX:MaxPermSize. JDK 1.8 and later (initial and maximum): -XX:MetaspaceSize; -XX:MaxMetaspaceSize. From JDK 1.8 on, if no maximum is set, the size is limited only by the available native memory.

Why did Java 8 replace the permanent generation with Metaspace, and what are the benefits? One reason is that permanent generation memory often ran out or overflowed.
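To see which memory areas a running JVM actually has, you can list its memory pools through the standard java.lang.management API. This is a small sketch; the class name is made up, and the pool names printed depend on the JVM version and collector (on HotSpot 8 and later a pool named Metaspace is normally among them).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryPoolsDemo {
    // Collects "name (type)" for every memory pool the JVM reports.
    static List<String> poolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            names.add(pool.getName() + " (" + pool.getType() + ")");
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : poolNames()) {
            System.out.println(name);
        }
    }
}
```

Running it with and without -XX:MaxMetaspaceSize is a quick way to check that a setting was picked up.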

The heap

The heap is the largest memory area in the JVM, where almost all objects and arrays are stored. It is the main area that garbage collection works on.

When the program starts, it requests heap space, and as heap usage grows, GC (garbage collection) kicks in. Objects created with new live on the heap. Primitive values declared inside a method body (local variables) are stored on the stack; otherwise they are stored on the heap.

Configure the heap size: -Xmx: maximum heap size; -Xms: minimum heap size (initial value).

The heap is divided into the young generation (frequently collected by GC) and the old generation (rarely collected by GC), and their memory addresses are contiguous. An object is created in the young generation, and if it is still alive after several GCs, it is moved to the old generation.
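The effect of these two flags can be checked from inside the program with the standard Runtime API. A minimal sketch follows (the class name is made up): maxMemory() roughly corresponds to -Xmx, and totalMemory() to the heap currently reserved, which starts near -Xms.

```java
class HeapSizeDemo {
    // Maximum heap the JVM will try to use.
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap currently reserved by the JVM.
    static long totalBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    // Part of the reserved heap that is currently unused.
    static long freeBytes() {
        return Runtime.getRuntime().freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("max   = " + maxBytes() / mb + " MB");
        System.out.println("total = " + totalBytes() / mb + " MB");
        System.out.println("free  = " + freeBytes() / mb + " MB");
    }
}
```

For example, starting it with java -Xms64m -Xmx256m HeapSizeDemo should report a maximum close to 256 MB.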

Direct memory

Direct memory does not belong to the JVM runtime data area. With NIO, however, a program can request a region of native memory for the JVM to use. It can also run out and cause OOM. Size configuration: -XX:MaxDirectMemorySize
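A minimal sketch of requesting direct memory through NIO, using the standard ByteBuffer API (the class name and the 1 KB size are arbitrary choices for illustration):

```java
import java.nio.ByteBuffer;

class DirectMemoryDemo {
    // Allocates a buffer outside the Java heap, in native (direct) memory.
    static ByteBuffer allocateDirect(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    // Allocates an ordinary buffer backed by a byte[] on the Java heap.
    static ByteBuffer allocateOnHeap(int bytes) {
        return ByteBuffer.allocate(bytes);
    }

    public static void main(String[] args) {
        ByteBuffer direct = allocateDirect(1024);
        ByteBuffer heap = allocateOnHeap(1024);
        System.out.println("direct buffer is direct: " + direct.isDirect());
        System.out.println("heap buffer is direct:   " + heap.isDirect());
    }
}
```

Direct buffers count against the direct memory limit rather than against -Xmx.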

HSDB tools

You can use the HSDB tool to inspect what the JVM is doing at run time; search within the tool by thread ID.

Out of memory

Memory overflow can occur in every part of the JVM except the program counter.

  • Stack overflow

1. Each thread's stack has a fixed size (1 MB by default). If threads are created without limit, stacks are allocated without limit, and the JVM throws OutOfMemoryError once the machine runs out of memory.

2. A stack can hold only so many stack frames (a magazine holds only so many bullets). Stack frame = method call, so unbounded method calls (for example, infinite recursion) throw java.lang.StackOverflowError.
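The second case is easy to reproduce. The sketch below (class and method names are made up) recurses without a stopping condition, catches the resulting StackOverflowError, and reports how many frames fit; the exact number depends on the -Xss setting and the JVM.

```java
class StackOverflowDemo {
    private static int depth;

    // Each call pushes one more stack frame; there is no exit condition.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns how many frames were pushed before the stack was exhausted.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // The stack unwinds to this frame, so execution can continue.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before overflow: " + measureDepth());
    }
}
```

Running it with a smaller stack, for example -Xss256k, should give a noticeably smaller depth.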

  • Heap overflow

There are too many objects on the heap, or they are too large, so memory overflows. On Android, most overflows are heap overflows. 1. If the code is fine, adjust the heap size with the -Xms and -Xmx parameters. 2. Most cases are caused by memory leaks, so check the code (references held abnormally, and so on). 3. Check the objects themselves: whether the design is unreasonable, whether objects are too large or held for too long, and whether some can be released manually.

  • Method area overflow

(1) Runtime constant pool overflow. (2) Class objects stored in the method area are not reclaimed in time, or class information takes up more memory than configured.

  • Native direct memory overflow

Ask for more space, or review your code.

VM optimization techniques

1. Method inlining: a simple method is not actually called; instead, a copy of its body is placed at the call site. (Calling a method adds a stack frame to the stack, so frequent calls mean frequent pushing and popping.)

2. Data sharing between stack frames (a VM optimization): part of the data is shared between adjacent stack frames, so no extra copy is created. For example, in a call such as a(10), the value 10 is passed between two methods (stack frames), yet only one copy exists.

Objects and the garbage collection mechanism

Object creation

Object creation process diagram is very important

For the other steps, see the diagram. When allocating memory, the JVM uses one of two methods depending on whether the free memory is regular (contiguous) or not: pointer collision and free lists.

But because the heap is shared by multiple threads, allocation raises thread-safety issues, for which there are two solutions: the CAS mechanism and the thread-local allocation buffer (TLAB).

Pointer collision

The free list

CAS stands for compare and swap. The thread first finds a free region and then uses the CPU's CAS instruction to claim it: if the memory is still free (it matches the value the thread expects), the claim succeeds. If, by the time the CAS instruction executes, the region has already been taken by another thread (the comparison fails), the thread loops and tries the next region. See the thread juejin.cn/post/695650…
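The retry loop can be sketched with AtomicLong.compareAndSet, combined with pointer-collision allocation. This is only an illustration of the idea, not how HotSpot is implemented; the class name, the limit, and the use of -1 for "out of space" are assumptions.

```java
import java.util.concurrent.atomic.AtomicLong;

class CasBumpAllocator {
    private final AtomicLong top = new AtomicLong(0); // next free offset
    private final long limit;                         // end of the region

    CasBumpAllocator(long limit) {
        this.limit = limit;
    }

    // Returns the start offset of the allocated block, or -1 if it does not fit.
    long allocate(long size) {
        while (true) {
            long current = top.get();
            long next = current + size;
            if (next > limit) {
                return -1;
            }
            // Succeeds only if no other thread moved the pointer in the meantime;
            // otherwise loop and retry with the new value.
            if (top.compareAndSet(current, next)) {
                return current;
            }
        }
    }

    public static void main(String[] args) {
        CasBumpAllocator allocator = new CasBumpAllocator(100);
        System.out.println(allocator.allocate(40));
        System.out.println(allocator.allocate(40));
        System.out.println(allocator.allocate(40));
    }
}
```

With a limit of 100, the three calls return 0, 40, and -1: the third request no longer fits.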

Thread-local allocation buffer (TLAB)

Since there are usually not many threads and each needs relatively little memory, the JVM carves out a private piece of memory for each thread in the Eden area of the young generation.

This area is very small, generally 1% to 2% of the Eden area.

Composition of objects

An object consists of an object header, instance data, and alignment padding.

Object access

There are two main ways to access objects: handles (managed in a handle pool carved out of the heap) and direct pointers.

Determine the survival of the object

There are generally two ways to determine whether an object is still alive or should be collected by GC: the reference counting algorithm and reachability analysis (root reachability).

In reachability analysis, the GC Roots (system-defined pointers from outside the heap) include:

  • Objects referenced from the virtual machine stack (local variables in stack frames).
  • Objects referenced by static class fields in the method area.
  • Objects referenced by constants in the method area.
  • Objects referenced by JNI (that is, native methods) in the native method stack.
  • References internal to the VM (Class objects, exception objects such as NullPointerException and OutOfMemoryError, the system class loader).
  • All objects held by a synchronized lock.
  • "Temporary" objects in the JVM implementation, such as objects referenced across generations (when only part of the heap is collected under a generational model).
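Reachability analysis can be illustrated with a toy object graph. The sketch below is a simulation, not JVM code: objects are plain strings, the reference map and the names root, a, b, c, d are invented, and marking is a simple graph traversal from the roots.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilityDemo {
    // Marks every object that can be reached from the given roots.
    static Set<String> reachable(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (marked.add(obj)) {
                pending.addAll(refs.getOrDefault(obj, Collections.<String>emptyList()));
            }
        }
        return marked;
    }

    // root -> a -> b, plus an isolated cycle c <-> d.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("a"));
        refs.put("a", Arrays.asList("b"));
        refs.put("c", Arrays.asList("d"));
        refs.put("d", Arrays.asList("c"));
        return refs;
    }

    public static void main(String[] args) {
        System.out.println(reachable(sampleGraph(), Arrays.asList("root")));
    }
}
```

The objects c and d reference each other, so a reference count would never drop to zero, yet neither is reachable from a root and both are garbage. This is the main reason reachability analysis is preferred over reference counting.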

Reference types

Strong > Soft > Weak > Phantom
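The four levels map onto ordinary variables (strong references) and the classes SoftReference, WeakReference, and PhantomReference in java.lang.ref. The sketch below only shows how each kind is created and what get() returns while the object is still strongly reachable; it does not try to trigger GC, because collection timing is not deterministic. The class and method names are made up.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    // Returns {soft sees the object, weak sees the object, phantom get() is null}.
    static boolean[] check() {
        Object strong = new Object(); // strong reference: never collected while held
        ReferenceQueue<Object> queue = new ReferenceQueue<>();

        SoftReference<Object> soft = new SoftReference<>(strong);   // cleared only when memory is tight
        WeakReference<Object> weak = new WeakReference<>(strong);   // cleared at the next GC once no strong reference remains
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue); // only used for notification

        boolean softSeesObject = soft.get() == strong;
        boolean weakSeesObject = weak.get() == strong;
        boolean phantomGetIsNull = phantom.get() == null; // get() always returns null
        return new boolean[] {softSeesObject, weakSeesObject, phantomGetIsNull};
    }

    public static void main(String[] args) {
        boolean[] result = check();
        System.out.println("soft.get() returns the object:    " + result[0]);
        System.out.println("weak.get() returns the object:    " + result[1]);
        System.out.println("phantom.get() always returns null: " + result[2]);
    }
}
```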

Object Allocation Policy – The complete process of object creation

  1. Allocation on the stack. Any small object that, according to escape analysis, does not escape (i.e. a local object in a method that is not referenced by another method or thread) can be allocated on the stack.

Benefits: fast, with no need to allocate on the heap. Stack memory is freed when the method finishes, without GC.

  2. Thread-local allocation buffer (TLAB)

  3. Large objects (long strings, large arrays) are allocated directly in the old generation. The reason is that the old generation is collected less often, so the object is not moved around frequently, and the old generation has more space. Young generation : old generation = 1/3 : 2/3

  4. Generally, a new object is created in Eden in the young generation. It is then promoted to the old generation after surviving repeated GCs.

1. The object is born in Eden, and the GC age stored in the object header is 0.

2. After the first GC, roughly 90% of objects are collected; the surviving 10% or so move to the From survivor area with age=1.

3. An object that survives the second GC moves to the To area with age=2. It keeps hopping between From and To with age++ until its age reaches the threshold of 15 (which you can set yourself, and which may depend on the collector).

4. When its age reaches the threshold of 15, or when the space allocation guarantee applies, the object is promoted to the old generation (Tenured).
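The aging process described above can be sketched as a tiny model. It is a simplification under stated assumptions: it follows the From/To naming used in the text, uses the threshold of 15, and ignores the space allocation guarantee and any collector-specific rules. The class and method names are made up.

```java
class TenuringDemo {
    // Assumed promotion threshold, matching the value 15 used in the text.
    static final int MAX_AGE = 15;

    // Returns the region an object is in after surviving the given number of minor GCs.
    static String regionAfter(int survivedGcs) {
        if (survivedGcs == 0) {
            return "Eden";   // newly created, age 0
        }
        if (survivedGcs >= MAX_AGE) {
            return "Old";    // age reached the threshold: promoted
        }
        // Survivors alternate between the two survivor areas, aging by 1 each time.
        return (survivedGcs % 2 == 1) ? "From" : "To";
    }

    public static void main(String[] args) {
        for (int age = 0; age <= MAX_AGE; age++) {
            System.out.println("age " + age + " -> " + regionAfter(age));
        }
    }
}
```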

Object reclamation

That was the life of an object; now for its death, which is GC.

Generational collection theory

The garbage collector uses different collection algorithms in the young and old generations. The young generation uses the copying algorithm. The old generation uses the mark-sweep and mark-compact algorithms.

Space ratio: young generation : old generation = 1:2. Eden : From : To = 8:1:1

Copying algorithms

Mark-sweep algorithm

Advantages: no need to halve the memory, fast, objects don't need to be moved

Cons: Memory fragmentation

Mark-compact algorithm
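To contrast mark-sweep with mark-compact, the sketch below models the heap as an array of slots. It is a toy simulation, not JVM code: object names are strings, null means a free slot, and the set of live objects is assumed to have been determined already by the mark phase.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SweepVsCompactDemo {
    // Mark-sweep: dead objects are cleared in place, live objects stay where they are.
    static String[] sweep(String[] heap, Set<String> live) {
        String[] result = new String[heap.length];
        for (int i = 0; i < heap.length; i++) {
            result[i] = live.contains(heap[i]) ? heap[i] : null;
        }
        return result;
    }

    // Mark-compact: live objects are slid to one end, leaving one contiguous free area.
    static String[] compact(String[] heap, Set<String> live) {
        String[] result = new String[heap.length];
        int next = 0;
        for (String obj : heap) {
            if (live.contains(obj)) {
                result[next++] = obj;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] heap = {"a", "b", "c", "d"};
        Set<String> live = new HashSet<>(Arrays.asList("a", "c"));
        System.out.println(Arrays.toString(sweep(heap, live)));   // [a, null, c, null]
        System.out.println(Arrays.toString(compact(heap, live))); // [a, c, null, null]
    }
}
```

After the sweep, the two free slots are not adjacent (fragmentation); after compaction they form one block, at the cost of moving c.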

A common garbage collector in the JVM

First generation: single-threaded. Serial / Serial Old: copying algorithm, mark-compact algorithm.

Second generation: multi-threaded, parallel. Parallel Scavenge / Parallel Old: copying algorithm and mark-compact algorithm. Apart from going from a single thread to multiple threads, nothing else changes.

ParNew / CMS: copying algorithm, mark-sweep algorithm.

Single-threaded versus multi-threaded parallel collection

First generation, second generation

CMS garbage collector

This is the garbage collector used by Android.

CMS divides GC into several phases:

1. Initial mark: marks the first layer of reachability analysis (the objects directly reachable from the GC Roots). It runs alone, with user threads paused, and it is fast.

2. Concurrent mark: reachability analysis continues from the roots down to the leaf nodes. This takes a long time, so it runs concurrently with the user threads.

3. Remark: re-marks the objects created or changed while concurrent marking was running.

4. Concurrent sweep, followed by a reset of the collector threads.

When the garbage collector suspends the user threads in order to perform GC, the pause is called Stop The World.

G1 Garbage collector

Summary and Interview

Constant pools and strings

Describe the JVM memory structure!

When does the stack overflow?

A java.lang.StackOverflowError usually points to infinite recursion. An OutOfMemoryError occurs when the machine does not have enough memory left as the JVM requests stack memory for newly created threads.

Describe the flow of a new object!

Will Java objects be allocated on the stack?

Yes. If escape analysis shows that the object does not escape, the virtual machine may allocate it on the stack under certain circumstances.

How is it decided whether an object can be reclaimed? What algorithms are available, and which one do real virtual machines use most?

There are two: reference counting and root reachability analysis. Root reachability analysis is the one most used.

What are the GC collection algorithms? What are their characteristics?

Copying, mark-sweep, and mark-compact. Copying is fast and leaves no memory fragmentation, but wastes space. Mark-sweep has high space utilization, but leaves memory fragmentation. Mark-compact leaves no memory fragmentation, but moving objects costs performance. Each of the three algorithms has its own strengths and weaknesses.

What does a complete GC process look like in the JVM? How does the object advance to the old age?

Objects are allocated in the young generation first; if there is not enough space, a Minor GC runs. Large objects (those requiring a large amount of contiguous memory) go directly to the old generation. Long-lived objects enter the old generation: an object born in the young generation has its age incremented by 1 each time it survives a Minor GC, and once the age reaches a certain threshold (15) it is promoted to the old generation.

There are several reference relationships in Java, what are the differences between them?

The difference between final, finally, and finalize

In Java, final can modify classes, methods, and variables (member variables or local variables). Use the final modifier on a class when it must never be inherited; note that all member methods in a final class are implicitly final. There are two main reasons to use final methods: (1) To lock a method so that subclasses cannot change it. (2) Efficiency. In early Java versions, final methods were converted into inline calls, although a method that is too large may not gain much from inlining. In recent releases, final is no longer needed for these optimizations. Final member variables represent constants: they can be assigned only once and never change their value.

As part of exception handling, finally can only be used with a try (try/catch) statement, and its block is always executed in the end, whether or not an exception is thrown. finally is often used where resources need to be released.
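A minimal sketch of this guarantee (the class name and the log strings are made up): the finally block runs both when the try block completes normally and when an exception is thrown and caught.

```java
import java.util.ArrayList;
import java.util.List;

class FinallyDemo {
    // Records the order in which the blocks run.
    static List<String> run(boolean fail) {
        List<String> log = new ArrayList<>();
        try {
            log.add("open");
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
            log.add("work");
        } catch (IllegalStateException e) {
            log.add("catch");
        } finally {
            log.add("close"); // runs in both cases: the place to release resources
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // [open, work, close]
        System.out.println(run(true));  // [open, catch, close]
    }
}
```

For resources that implement AutoCloseable, try-with-resources achieves the same effect with less code.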

finalize is a method of Object. Even when reachability analysis finds an object unreachable, the object is not "certain to die"; it is still on "probation". To truly declare an object dead, it must be marked twice. The first mark happens when no reference chain from the GC Roots to the object is found. Then comes a filter: if the object overrides finalize, it gets one chance to save itself inside finalize. Even so, we suggest avoiding finalize as far as possible, because the method is too unreliable: in production it is difficult to control when it runs or in what order objects are finalized. We suggest you forget the finalize method, because Java has better ways to do what finalize can do, such as try-finally.