Java runtime data area

1. Java runtime data area

While executing a Java program, the Java virtual machine divides the memory it manages into several data areas. Each area has its own purpose and its own creation and destruction times. Some areas exist for as long as the virtual machine process runs, while others are created and destroyed as user threads start and end. The memory managed by the Java virtual machine consists of the following runtime data areas:

1.1 Program counter

The program counter is a small memory area that can be thought of as a line number indicator for the bytecode being executed by the current thread. In the conceptual model of the Java virtual machine, the bytecode interpreter works by changing the value of this counter to select the next bytecode instruction to execute. It is the indicator of program control flow: basic functions such as branching, looping, jumping, exception handling, and thread resumption all depend on it. Multithreading in the Java virtual machine is implemented by threads taking turns and being allocated processor execution time. At any given moment, a processor (or one core of a multi-core processor) executes the instructions of only one thread. To resume at the correct position after a thread switch, each thread therefore needs its own program counter. The counters of different threads are stored independently and do not affect one another, so the program counter is thread-private. If the thread is executing a Java method, the counter records the address of the virtual machine bytecode instruction being executed. If the thread is executing a native method, the counter value is undefined. This memory area is the only one for which the Java Virtual Machine Specification does not specify any OutOfMemoryError condition.
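To make the role of the counter concrete, here is a minimal sketch of an interpreter loop. The four opcodes (PUSH, ADD, JMP, HALT) are a hypothetical toy instruction set invented for this illustration; they are not real JVM bytecode, and a real interpreter is far more involved.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy stack-machine interpreter. The opcodes are hypothetical, not real JVM bytecode. */
class PcDemo {
    static final int PUSH = 0; // PUSH n : push the constant n
    static final int ADD = 1;  // ADD    : pop two values, push their sum
    static final int JMP = 2;  // JMP t  : continue execution at index t
    static final int HALT = 3; // HALT   : stop and return the top of the stack

    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // program counter: index of the next instruction to execute
        while (true) {
            int opcode = code[pc];
            switch (opcode) {
                case PUSH:
                    stack.push(code[pc + 1]);
                    pc += 2; // advance past the opcode and its operand
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    pc += 1;
                    break;
                case JMP:
                    pc = code[pc + 1]; // a jump is just an assignment to pc
                    break;
                case HALT:
                    return stack.pop();
                default:
                    throw new IllegalStateException("unknown opcode " + opcode + " at pc=" + pc);
            }
        }
    }

    public static void main(String[] args) {
        // The JMP at index 2 skips over "PUSH 100", so only 1 and 2 are added.
        int[] program = {PUSH, 1, JMP, 6, PUSH, 100, PUSH, 2, ADD, HALT};
        System.out.println(run(program)); // prints 3
    }
}
```

Because all control flow goes through pc, saving and restoring this single value is enough to suspend and resume the toy program, which is why each thread needs its own counter.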

1.2 Java Virtual Machine Stack

The virtual machine stack describes the thread memory model of Java method execution. Like the program counter, it is thread-private and has the same life cycle as its thread. Each time a method is executed, the Java virtual machine creates a stack frame that stores the local variable table, operand stack, dynamic linking information, method exit, and other information. The process from a method being invoked to its completion corresponds to a stack frame being pushed onto and popped off the virtual machine stack. The local variable table stores the Java virtual machine primitive data types known at compile time (boolean, byte, char, short, int, float, long, double), object references, and the returnAddress type. Storage in the local variable table is measured in local variable slots: the 64-bit long and double types occupy two slots, and all other types occupy one. The size of the local variable table is determined at compile time. When a method is entered, the amount of local variable space it needs in its stack frame is fully determined, and the size of the local variable table does not change while the method runs. The Java Virtual Machine Specification defines two error conditions for this area: StackOverflowError is thrown if the stack depth requested by a thread exceeds the depth allowed by the virtual machine; OutOfMemoryError is thrown if the virtual machine stack can expand dynamically but not enough memory can be obtained during expansion.
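The StackOverflowError case can be reproduced with unbounded recursion. The following is a minimal sketch; the class and method names are made up for this example, and the depth reached depends on the VM, the frame size, and the thread stack size (which HotSpot lets you set with -Xss).

```java
/** Recurses until the thread's stack is exhausted and reports how deep it got. */
class StackDepthDemo {
    private static int depth;

    private static void recurse() {
        depth++;    // one more stack frame has been pushed
        recurse();  // never returns normally, so frames keep accumulating
    }

    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The frames have been unwound by the time control reaches here.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Stack overflowed after " + measureDepth() + " frames");
    }
}
```

Running it with a smaller -Xss value should lower the reported depth, since each frame needs space for its local variable table and operand stack.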

1.3 Native method stack

The native method stack plays a role very similar to that of the virtual machine stack. The difference is that the virtual machine stack serves the execution of Java methods (bytecode), while the native method stack serves the native methods used by the virtual machine. Like the virtual machine stack, the native method stack throws StackOverflowError when the stack depth overflows and OutOfMemoryError when stack expansion fails.

1.4 Java heap

For most applications, the Java heap is the largest block of memory managed by the Java virtual machine. The Java heap is a memory area shared by all threads and created at virtual machine startup. Its sole purpose is to hold object instances, and almost all object instances are allocated here. The Java Virtual Machine Specification describes it this way: all object instances and arrays are allocated on the heap. However, with the development of JIT compilers and the maturing of escape analysis, optimizations such as stack allocation and scalar replacement are bringing subtle changes, and the rule that all objects are allocated on the heap is becoming less absolute.

The Java heap is the primary area managed by the garbage collector and is therefore often called the GC heap. From the point of view of memory reclamation, the Java heap can be subdivided into the young generation and the old generation; in more detail, it can be divided into Eden space, From Survivor space, To Survivor space, and so on.
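The subdivision of the heap can be inspected at run time through the standard java.lang.management API. This is a minimal sketch; the class name is made up, and the pool names printed depend on the VM and the garbage collector in use, so no particular output is assumed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

/** Lists the heap memory pools reported by the running VM. */
class HeapPoolsDemo {
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) { // skip non-heap pools
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Upper bound the VM will attempt to use for the heap, in bytes.
        System.out.println("Max heap: " + Runtime.getRuntime().maxMemory());
        for (String name : heapPoolNames()) {
            System.out.println("Heap pool: " + name);
        }
    }
}
```

With a generational collector, the pool names typically include Eden, Survivor, and an old generation space, though the exact strings vary between collectors.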


1.5 Method area

The method area, like the Java heap, is a memory area shared by all threads. It stores data such as type information for classes loaded by the virtual machine, constants, static variables, and code compiled by the just-in-time compiler. Many developers who are used to developing and deploying on the HotSpot virtual machine prefer to call the method area the “permanent generation”. Strictly speaking, the two are not equivalent: the HotSpot design team simply chose to extend generational GC to the method area, in other words to implement the method area using the permanent generation. This lets HotSpot’s garbage collector manage this memory the same way it manages the Java heap, which saves writing memory management code specifically for the method area. The Java Virtual Machine Specification places very loose constraints on the method area. Like the Java heap, it does not require contiguous memory and may be either fixed-size or expandable; in addition, an implementation may even choose not to garbage collect it. Garbage collection is relatively rare in this area, but data that enters the method area is not as “permanent” as the name permanent generation suggests. Memory reclamation here mainly targets constant pool reclamation and type unloading. Generally speaking, the results of reclamation in this area are unsatisfactory, especially for type unloading, but reclaiming this area is sometimes necessary. Several of the most serious bugs on Sun’s former bug list were memory leaks caused by earlier versions of the HotSpot virtual machine not fully reclaiming this area. According to the Java Virtual Machine Specification, OutOfMemoryError is thrown if the method area cannot satisfy a new memory allocation request.

1.6 Runtime constant pool

The runtime constant pool is part of the method area. The constant pool table of a Class file stores the various literals and symbolic references generated at compile time, and its content is placed in the runtime constant pool of the method area after the class is loaded. Another important feature of the runtime constant pool, compared with the Class file constant pool, is that it is dynamic. The Java language does not require constants to be produced only at compile time; that is, content other than what was preset in the Class file constant pool can also enter the runtime constant pool, and new constants can be added to the pool at run time. The feature developers use most often for this is the intern() method of the String class. Because the runtime constant pool is part of the method area, it is naturally limited by the method area’s memory, and OutOfMemoryError is thrown when the constant pool can no longer obtain memory.
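The behavior of String.intern() can be observed with reference comparisons. In this minimal sketch (class and variable names are made up), a string assembled at run time is a distinct object from the literal, while intern() returns the canonical pooled instance.

```java
/** Compares a run-time-built string with a literal, before and after intern(). */
class InternDemo {
    /** Returns { built == literal, built.intern() == literal, built.equals(literal) }. */
    static boolean[] compare() {
        String literal = "runtime-pool"; // literal from the class file's constant pool
        String built = new StringBuilder("runtime-").append("pool").toString(); // created at run time
        return new boolean[] {
            built == literal,          // false: two different objects
            built.intern() == literal, // true: intern() returns the pooled instance
            built.equals(literal)      // true: same characters
        };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints "false true true"
    }
}
```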

1.7 Direct Memory

Direct memory is not part of the virtual machine’s runtime data area, nor is it a memory area defined in the Java Virtual Machine Specification. However, this memory is used frequently and can also cause OutOfMemoryError. JDK 1.4 introduced the NIO (New Input/Output) classes, which provide an I/O approach based on channels and buffers. NIO can use native libraries to allocate off-heap memory directly and then operate on it through a DirectByteBuffer object stored in the Java heap, which serves as a reference to that memory. This can significantly improve performance in some scenarios because it avoids copying data back and forth between the Java heap and the native heap. The allocation of direct memory is not limited by the size of the Java heap. However, since it is still memory, it is limited by the total native memory (physical memory plus swap partition or paging file) and by the processor’s addressable space. When configuring a virtual machine, administrators typically set parameters such as -Xmx according to the actual memory available but often overlook direct memory. As a result, the sum of all memory areas exceeds the physical memory limit (including physical and operating-system-level limits), which leads to OutOfMemoryError during dynamic expansion.
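As a minimal sketch of the difference between heap and direct buffers, the example below allocates one of each through the standard java.nio.ByteBuffer API. The class name and buffer sizes are arbitrary choices for illustration.

```java
import java.nio.ByteBuffer;

/** Allocates a heap buffer and a direct (off-heap) buffer and reports which is which. */
class DirectMemoryDemo {
    static ByteBuffer allocateOffHeap(int bytes) {
        // The storage lives outside the Java heap; only the small buffer object is on the heap.
        return ByteBuffer.allocateDirect(bytes);
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(1024);   // backed by a byte[] on the Java heap
        ByteBuffer direct = allocateOffHeap(1024);     // backed by native memory

        direct.putInt(0, 42);                          // absolute write into native memory
        System.out.println(heap.isDirect());           // prints false
        System.out.println(direct.isDirect());         // prints true
        System.out.println(direct.getInt(0));          // prints 42
    }
}
```

On HotSpot, the amount of direct memory such buffers may use can be capped with the -XX:MaxDirectMemorySize option, which is worth setting alongside -Xmx when sizing a process.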

2. Exploring objects in the HotSpot virtual machine

2.1 Object Creation

Java is an object-oriented programming language, and objects are created constantly while a Java program runs. At the language level, creating an object is usually just a matter of the new keyword (exceptions: cloning and deserialization). What does creating an object involve inside the virtual machine? When the Java virtual machine encounters a new bytecode instruction, it first checks whether the instruction’s operand locates a symbolic reference to a class in the constant pool, and whether the class that the symbolic reference represents has been loaded, resolved, and initialized. If not, the corresponding class loading process must be performed first. After the class loading check passes, the virtual machine allocates memory for the new object. The amount of memory an object needs is fully determined once its class is loaded, so allocating space for an object amounts to carving a block of known size out of the Java heap. If memory in the Java heap is perfectly regular, with all used memory on one side, all free memory on the other, and a pointer in the middle marking the boundary, then allocating memory simply means moving that pointer toward the free space by a distance equal to the size of the object. This allocation method is called “bump the pointer”. If the Java heap is not regular, and used and free memory are interleaved, pointer bumping is not possible. The virtual machine must then maintain a list recording which memory blocks are available, find a sufficiently large block in the list at allocation time, assign it to the object instance, and update the list. This allocation method is called a “free list”. Which method is used depends on whether the Java heap is regular, which in turn depends on whether the garbage collector in use is able to compact memory.
Therefore, with collectors that include a compaction phase, such as Serial and ParNew, the system uses pointer bumping, which is simple and efficient. With a collector based on a sweep algorithm, such as CMS, memory in theory has to be allocated with the more complex free list. Besides dividing up the available space, there is another issue to consider: object creation is a very frequent activity in the virtual machine, and even modifying a single pointer is not thread-safe under concurrency. While memory is being allocated for object A and the pointer has not yet been updated, object B may be allocated at the same time using the original pointer value. There are two solutions to this problem. One is to synchronize the memory allocation action; in practice, the virtual machine uses CAS with retry on failure to make the update atomic. The other is to give each thread its own space to allocate in: each thread pre-allocates a small block of memory in the Java heap, called a thread-local allocation buffer (Thread Local Allocation Buffer, TLAB). A thread that needs memory allocates from its own buffer, and synchronization is needed only when the buffer is used up and a new one must be allocated. Whether the VM uses TLAB is controlled by the -XX:+/-UseTLAB parameter. After memory allocation is complete, the virtual machine must initialize the allocated memory (but not the object header) to zero; if TLAB is used, this can be done ahead of time when the TLAB is allocated. This step ensures that the instance fields of the object can be used in Java code without being assigned initial values, so the program reads the zero value corresponding to each field’s data type.
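The combination of pointer bumping with CAS and retry can be sketched in plain Java. This is only a simulation under stated assumptions: addresses are modeled as long values, the class BumpPointerAllocator is hypothetical, and HotSpot’s real allocator is native code that this sketch does not reproduce.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Simulated bump-the-pointer allocator over the address range [start, end). */
class BumpPointerAllocator {
    private final long start;
    private final long end;
    private final AtomicLong top; // boundary between used and free memory

    BumpPointerAllocator(long start, long end) {
        this.start = start;
        this.end = end;
        this.top = new AtomicLong(start);
    }

    /** Returns the start address of the new block, or -1 if the region is full. */
    long allocate(long size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        while (true) {
            long oldTop = top.get();
            long newTop = oldTop + size;
            if (newTop > end) {
                return -1; // not enough free space left
            }
            if (top.compareAndSet(oldTop, newTop)) {
                return oldTop; // this thread won the race and owns [oldTop, newTop)
            }
            // Another thread moved the pointer first; reread it and retry.
        }
    }

    long used() {
        return top.get() - start;
    }

    public static void main(String[] args) throws InterruptedException {
        BumpPointerAllocator region = new BumpPointerAllocator(0, 1 << 20);
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    region.allocate(16);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println(region.used()); // prints 64000 (4 threads * 1000 blocks * 16 bytes)
    }
}
```

A TLAB corresponds to handing each thread its own small region of this kind, so that the common-case allocation needs no CAS at all and contention is limited to obtaining a new region.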
Next, the Java virtual machine makes the necessary settings on the object, such as which class the object is an instance of, how to find the class’s metadata, the object’s hash code (which is actually computed lazily, when Object::hashCode() is first called), and the object’s GC generational age. This information is stored in the object header. The object header is set up differently depending on the current running state of the VM, for example whether biased locking is enabled. Object headers are covered in more detail below.
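The zero-initialization step of object creation is visible from ordinary Java code: instance fields that are never assigned read as the zero value of their type. A minimal sketch, with a made-up class and field names:

```java
/** Instance fields are left unassigned on purpose to show their zero values. */
class ZeroValueDemo {
    int count;
    long total;
    double ratio;
    boolean flag;
    char letter;
    Object ref;

    public static void main(String[] args) {
        ZeroValueDemo d = new ZeroValueDemo();
        System.out.println(d.count);       // prints 0
        System.out.println(d.total);       // prints 0
        System.out.println(d.ratio);       // prints 0.0
        System.out.println(d.flag);        // prints false
        System.out.println((int) d.letter); // prints 0 (the char '\u0000')
        System.out.println(d.ref);         // prints null
    }
}
```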

2.2 Memory layout of objects

In the HotSpot virtual machine, the storage layout of an object in heap memory can be divided into three parts: the object header (Header), the instance data (Instance Data), and alignment padding (Padding).

The object header of a HotSpot virtual machine object contains two types of information. The first type is the runtime data of the object itself, such as the hash code, GC generational age, lock state flags, locks held by threads, biased thread ID, and bias timestamp. This data is 32 bits long on a 32-bit virtual machine and 64 bits long on a 64-bit virtual machine (with compressed pointers disabled), and is officially called the “Mark Word”. An object needs to store a lot of runtime data, in fact more than a 32-bit or 64-bit structure can record, but the information in the object header is an extra storage cost unrelated to the data defined by the object itself. For the sake of space efficiency, the Mark Word is designed as a data structure with a dynamic definition, so that it can store as much data as possible in a very small space by reusing its storage according to the state of the object. The second type is the type pointer, which points to the object’s type metadata and lets the virtual machine determine which class the object is an instance of.
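To illustrate how several fields can share one word, the sketch below packs and unpacks a 32-bit value. The field widths (25-bit hash, 4-bit age, 1-bit biased flag, 2-bit lock flag) are an assumption based on the commonly documented layout of an unlocked object on 32-bit HotSpot; the text above does not state them, and the class MarkWordSketch is purely illustrative. In other lock states the same bits are reinterpreted, which this sketch does not model.

```java
/** Illustrative bit-packing for an assumed 32-bit Mark Word layout (unlocked state only). */
class MarkWordSketch {
    // Assumed layout, high bits to low bits: hash:25 | age:4 | biased:1 | lock:2
    static int pack(int hash25, int age, boolean biased, int lockBits) {
        return ((hash25 & 0x1FFFFFF) << 7)
                | ((age & 0xF) << 3)
                | ((biased ? 1 : 0) << 2)
                | (lockBits & 0x3);
    }

    static int hash(int word) { return word >>> 7; }          // top 25 bits
    static int age(int word) { return (word >>> 3) & 0xF; }   // 4-bit generational age
    static boolean biased(int word) { return ((word >>> 2) & 1) == 1; }
    static int lock(int word) { return word & 0x3; }          // 2-bit lock state flag

    public static void main(String[] args) {
        int word = pack(0x123456, 5, false, 1);
        System.out.println(Integer.toHexString(hash(word))); // prints 123456
        System.out.println(age(word));                       // prints 5
        System.out.println(biased(word));                    // prints false
        System.out.println(lock(word));                      // prints 1
    }
}
```

A 4-bit age field can only count up to 15, which shows how tightly the header is budgeted.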





2.3 Object access and location

Objects are created in order to be used, and a Java program manipulates specific objects on the heap through reference data on the stack. The Java Virtual Machine Specification only states that a reference type is a reference to an object; it does not define how the reference should locate and access the object’s position in the heap. The access method is therefore left to the virtual machine implementation, and the two mainstream approaches are handles and direct pointers:

  • If handle access is used, a block of memory in the Java heap may be set aside as a handle pool. The reference stores the address of the object’s handle, and the handle contains the addresses of the object’s instance data and type data.

  • If direct pointer access is used, the memory layout of objects in the Java heap must account for where to place the information needed to reach the type data. The reference stores the object’s address directly, so accessing the object itself requires no additional indirection.
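The two access paths can be modeled roughly in Java. This is only an analogy: Java references stand in for raw addresses, and all class names below (TypeData, InstanceData, Handle, DirectObject) are hypothetical, not VM structures.

```java
/** Rough model of handle access versus direct pointer access. */
class ObjectAccessSketch {
    static final class TypeData {
        final String className;
        TypeData(String className) { this.className = className; }
    }

    static final class InstanceData {
        final int value;
        InstanceData(int value) { this.value = value; }
    }

    /** Handle access: the reference points at a handle that holds two addresses. */
    static final class Handle {
        final InstanceData instance; // address of the instance data
        final TypeData type;         // address of the type data
        Handle(InstanceData instance, TypeData type) {
            this.instance = instance;
            this.type = type;
        }
    }

    /** Direct pointer access: the reference points at the object, which carries its type pointer. */
    static final class DirectObject {
        final TypeData type; // type pointer stored with the object
        final int value;     // instance data stored inline
        DirectObject(TypeData type, int value) {
            this.type = type;
            this.value = value;
        }
    }

    static int readViaHandle(Handle ref) {
        return ref.instance.value; // two hops: reference -> handle -> instance data
    }

    static int readDirect(DirectObject ref) {
        return ref.value; // one hop: reference -> object
    }

    public static void main(String[] args) {
        TypeData type = new TypeData("Point");
        Handle viaHandle = new Handle(new InstanceData(7), type);
        DirectObject direct = new DirectObject(type, 7);
        System.out.println(readViaHandle(viaHandle)); // prints 7
        System.out.println(readDirect(direct));       // prints 7
    }
}
```

Both paths return the same data; the difference is the extra dereference that handle access pays on every read.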
