preface
Garbage collection is a feature of many high-level languages.
For JS, primitive types are stored in stack memory, which is managed by the operating system. Object types are stored in heap memory, which is managed by the engine. Which brings us to V8 garbage collection.
The V8 garbage collector project was code-named Orinoco.
The Weak Generational Hypothesis says that most subjects only survive for a short period of time. Based on this theory, V8 significantly improves GC efficiency by dividing the memory space required to perform GC into the new generation and the old generation, placing objects with different lifetime lengths and using different GC strategies. This sort of sorting is called Generational Layout.
V8’s memory structure:
- New Generation memory area, Young Generation or New Space. Most of the objects are here
- Old Generation memory area, Old Generation or Old Space. The objects that are resident in memory are here
- Large Object Space. As the name suggests. The GC does not reclaim this memory
- Code Space. The only memory that has execution permission
- Map area, Map Space. TODO: Cell and Map
Each area is made up of pages of memory, which is the smallest unit of MEMORY required by V8 and the unit of garbage collection. Except for the large object area, the memory page size is 1 MB.
There is a limit to the size of memory. In a 64-bit environment, the default maximum memory size is 32 MB for the new generation and 1.4 GB for the old generation. Halved for 32-bit environments.
The reason for limiting the maximum value is that the browser side generally doesn’t use much memory, and it affects the efficiency of the GC and the page response. Because GC blocks JS execution, a full 1.4GB GC takes more than 1s at a time. This phenomenon is called total stop-the-world.
Zha! Tile lu! Some people made the Japanese Kyrgkyosportswoman into the cosmos!!
It blocks because both the GC and the program modify the object, and if there is no guarantee that the program will not modify the object being GC, you need to pause the code so that the object can be recycled.
Rather than apply for the maximum amount of memory at once, V8 will apply for more space after the current memory is full. The article “V8 Memory Management and Garbage Collection, Part 1” was tested on Node.
Node is also powered by a V8 engine, so be careful about size when handling large files.
The V8 garbage collector consists of two parts:
- Minor Garbage Collector for the new generation
- The main Garbage Collector, the Major Garbage Collector, is used for the entire heap, both new and old generation
The Major GC applies to the whole heap, which is, of course, given in the official blog post Trash Talk: The Orinoco Garbage Collector. There are no articles about the use of the Major GC in the New generation.
V8 garbage collection is usually triggered when there is not enough memory left to allocate to objects, and sometimes when a threshold for memory usage has been reached.
The new generation
The new generation of memory is divided into two semi-spaces of equal size, called from-space and to-space respectively. From-space is the actual memory used, while to-space is free and only used during GC. In other words, only half of the new generation of memory is actually used.
From-space is divided into Nursery and Intermediate regions. The object first allocates memory at the Nursery, goes through a GC and then moves to Intermediate.
The Minor GC uses the Scavenge algorithm, which is implemented using the Cheney algorithm, as follows:
- Breadth-first traverses the objects in from-space, copying surviving objects To the to-space
- When traversal is complete, empty from-space
- From-space and to-space roles are reversed
The memory Space occupied by the copied objects in to-space is continuous, and fragmentation is not a problem.
The detailed process is described in Understanding V8 Memory Management and V8 Memory Management and Garbage Collection (PART 1).
Cenozoic GC is more frequent.
The transfer of objects from the new generation to the old generation is called promotion. There are two kinds of promotions:
- An object that survived one GC, that is, in Intermediate
- When objects are copied To a to-space, more than 25% of the to-space is already used
V8 Source Code – Memory Management mentions the source code for the 25% statement. But looking at the latest source code, heap.h only mentions the first case
The old generation
Old generation memory is also divided into two parts:
- Old Pointer Space If the object may have Pointers to other objects, save them here. Most of the candidates for promotion are here
- Old Data Space. Only the original object is saved, with no Pointers to other objects
The Major GC uses two algorithms:
- Marks mark-sweep
- Mark-Compact
The two algorithms are commonly referred to together and have names such as Mark-sweep-compact Algorithm or Full Mark-compact.
The Major GC has three steps:
- Marking tags
- Sweeping clean
- A Compaction/Compacting consolidation
Marking tags
Tagging is the process of finding all accessible objects. The marking process of the two algorithms is the same.
Three-color notation is used:
- The value is 00, white, and unreferenced
- The value is 10, gray, and is referenced, but the referenced object has not been traversed
- The value is 11, black, referenced, and the referenced object has been traversed
All objects are marked white, then gray/black marks are added to the accessed objects in depth-first traversal, starting with the Root Set (execution stack and global objects).
Sweeping clean
Clears objects marked white. This creates a discontinuous memory space.
The essence of the purge is to mark the memory address as free, and the code level is to store the memory address in a data structure called free-list.
A Compaction/Compacting consolidation
Modify the memory addresses of surviving objects to consolidate objects on different memory pages so that the memory space is compact and orderly. This is a performance-intensive operation.
There are several ways to say when to execute/not execute a Compaction:
- This is only done when there is not enough room for a new promotion. See V8 Memory Management and garbage Collection (Part 2)
- Only for the highly scattered memory pages, the rest of the memory pages perform the jie. See “Translate” Orinoco: THE V8 Garbage Collector
- When objects on a memory page are referenced many times, they are skipped because of the performance impact. See V8 — What You Need to know about Garbage Collection
Write barriers Write – Barrier
References to objects can exist between the new generation and the old generation, and V8 maintains a list of references through its Write Barrier, eliminating the need to search through the huge old generation memory.
Other memory areas
The remaining three areas, big object area, code area and Map area, belong to the old generation and use the old generation GC algorithm. See interpreting V8 GC Log part II: Partition and GC Algorithms for In-and-out heap Memory.
In many V8 memory maps, these areas are drawn separately from the old generation, Visualizing Memory Management in V8 Engine (JavaScript, NodeJS, Deno, WebAssembly)
The V8 optimization
In addition to the basic GC above, V8 has made some additional optimizations.
As mentioned above, GC has Marking and Sweeping stages. About mentioned below the Parallel Incremental/Concurrent three optimization scheme, whether in Marking phase or Sweeping stage, or is two phase, not yet clear.
According to “Interpreting V8 GC Log (2) : Internal and External Memory Partitioning and GC Algorithm”, Incremental GC takes place during Marking, also known as Incremental Marking; Parallel GC/Concurrent GC occurs in the jie stages, also called Parallel and Concurrent jie. According to Concurrent Marking in V8, Parallel and Concurrent also exist in the marking phase.
The former will prevail for now.
Incremental Marking
Each time a certain amount of memory is allocated or a certain number of write barriers are triggered, the program is paused, Marking for milliseconds to tens of milliseconds, and then resumed.
After such intermittent Marking, by the time Sweep is required, most of the memory has already been scanned, so there is no need to scan again from the beginning.
Parallel/Concurrent cleaning Parallel/Concurrent Sweeping
For objects that have been determined to be reclaimed, a new thread can be used to execute a Sweeping without worrying about conflicting with the main thread, which is the Concurrent Sweeping.
Open multiple threads simultaneous jie, namely Parallel jie.
other
V8 4.X introduced Pretenuring mechanics. When objects created by certain functions are frequently promoted to the old generation, or have a high Survival Rate, objects created after these functions are allocated directly to the old generation.
Other GC algorithms
Reference counting
Principle: Object +1 when referenced, dereference -1. Reclaim objects that are not referenced.
There are problems with reference counting algorithms when circular references occur:
// Scenarios with no circular references
// after f1 is executed, b is referenced by A, but A is not referenced, so a is reclaimed along with B
function f1() {
var a = {}
var b = {}
a.b = b
}
// There is a loop reference scenario
// f2 after execution, both a and B still have references, so neither can be collected
function f2() {
var a = {}
var b = {}
a.b = b
b.a = a
}
Copy the code
With the token-clearing algorithm, a and B are reclaimed because they are not reachable through global objects.
Refer to the link
- Garbage collection and memory leaks in browsers
- V8 garbage collection GC
- V8 memory analysis – graphite document
- Memory management – MDN
- V8 – What you need to know about garbage collection
- Read the V8 Insane, Mark-sweep, and Mark-Compact
- V8 Memory Management and Garbage Collection (part 1)
- V8 Memory Management and Garbage Collection (Part 2)
- V8 Memory Structure
- Understand V8 memory management
- Orinoco: young generation garbage collection – V8 官网
- Orinoco: V8 garbage collector
- Trash Talk – V8 官网
- The Weak Generational Hypothesis
- V8 source code – memory management
- V8 GC Log (1) : Node.js application background and GC basics
- V8 GC Log (2) : Partition of internal and external memory and GC algorithm