The V8 garbage Collector (GC) has changed a lot over the years. Orinoco (V8 garbage collection code) switched from a stop-the-world 1 collection to a mostly parallel and concurrent collection with incremental collection.

The following GC represents garbage collection. garbage collector

Any GC must periodically perform the following basic tasks:

  1. Identify living/dead objects
  2. Recycles memory occupied by dead objects
  3. Compact Indicates the memory fragment that has been reclaimed

A straightforward way to perform these tasks is stop-the-world, which pauses JavaScript execution and executes the tasks sequentially on the main thread. However, this can cause the main thread to stall, and JavaScript and THE UI share the main thread in the browser, with disastrous consequences.

V8 garbage collection strategy

There are many algorithms for automatic garbage collection, and because the life cycle of different objects is different, it is not efficient to solve the problem with only one collection strategy.

The V8 garbage collection strategy is based on generational garbage collection.

It divides memory into two generations: young generation and old generation.

  • Cenozoic: Objects live for a short time. Newborn objects or objects that have been garbage collected only once.
  • Old generation: The object lives for a long time. Objects that have undergone one or more garbage collections.

Different garbage collection algorithms are used for the new generation to improve efficiency.

V8 memory allocation

With default Settings, you can only use about 1.4GB on 64-bit systems and about 0.7GB on 32-bit systems if you always allocate memory.

Type \ System bits A 64 – bit 32 –
The new generation 32M 16M
The old generation 1.4 G 0.7 G

The new generation splits memory into the nursery and intermediate generations. Objects are initially assigned to the nursery, and if they survive the next GC, they are assigned to intermediate children. Then, if they survive another GC, they are moved to the old generation, a process also known as promotion, which is described below.

Attention! If the memory space of the new generation is insufficient, it is directly allocated to the old generation.

There is an important term in garbage collection: the “generation hypothesis.” This basically suggests that most of the subjects died young. In other words, from a GC point of view, most objects that have been allocated are not reachable (possibly scoped) and are garbage. This applies not only to V8 and JavaScript, but also to most dynamic languages.

V8’s generational memory structure takes advantage of this fact. V8’s GC is a collation/move GC, which means it copies surviving objects from the garbage collection. This may seem counterintuitive: copying objects at GC is expensive 2. But we know that according to the “generation hypothesis,” only a small percentage of objects survive the GC, and the rest are garbage objects. This means that we only pay replication costs proportional to the number of live objects, not the total number of objects allocated.

Next, new and old garbage collection algorithms are introduced respectively.

Old-generation GC (Mark-Compact)

Also known as the Major GC, it collects garbage from the entire heap. The following generation GC(also known as Minor GC) occurs only in the generation memory.

1. Making

GC determines whether the object is alive based on whether it is accessible. That is, objects that are reachable at the current runtime are retained, and any unreachable objects can be reclaimed.

Making is the process of finding accessible objects. * * is as follows:

GC starts with a set of known Pointers to objects, called the root set. This includes execution stacks and global objects. It then tracks each pointer to a javascript object and marks that object as accessible. GC tracks each pointer in that object and recursively continues the process until it finds and marks every object accessible at runtime.

Mark A, C, and E as living objects:

2. Sweeping away

Sweeping is the process of clearing the memory Spaces of dead objects (unmarked objects) and adding the Spaces to data structures called free lists.

After the clearing is complete, the memory space will be discontinuous. This fragmentation can cause problems with subsequent memory allocation.

If a large memory needs to be allocated, a garbage collection will be triggered prematurely because there is not enough debris space left to complete the allocation, which is not necessary.

Clear unmarked objects B, D, F:

3) Compaction

To solve Sweep’s memory fragmentation problem, a Compaction ** was proposed.

We move living objects (labeled objects) to one end of the memory space, and when we’re done, we clean up all memory outside the boundary.

Attention! A potential weakness of GC for copying live objects is that when we allocate a lot of long-lived objects, the cost of copying them is high. So we only choose to clean up some highly scattered memory, for other memory only perform purges, not copy and move living objects.

Cenozoic GC (Scavenger)

Old-generation GCS can effectively collect garbage from the entire heap, but the “generation hypothesis” tells us that newly allocated objects need to be collected.

Scavenge recycling algorithm is adopted in the new generation, and Cheney algorithm is mainly used in algorithm implementation. The Cheney algorithm splits memory in two, called Semispace, with one in use and one idle. A semispace that is in use is called From space, and a Semispace that is idle is called To space.

First, during GC, all surviving objects in the From space are moved To contiguous memory in the To space and tags are added. The advantage of this is that memory fragmentation can be removed.

Step 2, swap From space and To space. On the next GC, if the marked object is still alive, it is moved to the old generation.

Finally, update the reference pointer to the living object that has been moved to the old generation.

In the New generation GC, we actually interlace the three steps of mark, move, and pointer update.

Orinoco

An important measure of garbage collection performance is the amount of time the main thread pauses during GC.

For traditional stop-the-world garbage collection, the time spent on GC directly affects the user experience (poor picture quality, poor rendering, and poor latency).

Orinoco, code name for V8 GC, uses the latest and greatest parallel, incremental, and concurrent techniques for garbage collection to free the main thread. There are a few terms here that have specific meanings in the context of GC and are worth defining in detail.

Parallel = Parallel

Parallelism is when the main thread and the worker thread perform roughly the same amount of work simultaneously.

This is still a “total pause” approach, but the total pause time is now divided by the number of threads participating (plus some synchronization overhead). This is the simplest of the three techniques. Since the worker thread is not running any JavaScript, each worker thread just needs to make sure that it synchronizes any objects that other threads may also be accessing.

Incremental

Delta is when the main thread intermittently performs a small amount of work. Instead of doing the entire GC in incremental pauses, we only need to do a small part of the overall GC work. This is more difficult because JavaScript is executed between each incremental work segment, which means that the state of the heap has changed, which can invalidate work done by previous increments. As you can see from the figure, this does not reduce the time spent on the main thread (in fact, it usually increases slightly), but rather spreads it out over time. This is still a good technique for solving one of our original problems: mainthread latency. By allowing JavaScript to run intermittently while continuing GC tasks, the application can still respond to user input and make progress on animation.

Concurrent

Concurrency is when the main thread continuously executes JavaScript while the helper thread does GC work in the background. This is the most difficult of the three techniques: anything in the JavaScript heap can change at any time, invalidating what we’ve done before. In addition, there are now co-reads/writes to worry about, since the worker thread and the main thread may read or modify the same object at the same time. The advantage here is that the main thread is completely free to execute JavaScript — although the overhead is minimal due to some synchronization with the worker thread.

If you want to learn more about the V8 engine’s garbage collection mechanism, check out the official website.

reference

  1. Trash talk: the Orinoco garbage collector by V8官网
  2. NodeJs by Park Ling
  3. Talk about V8 engine garbage collection
  4. Have an in-depth understanding of V8 garbage collection

  1. The purpose of the total pause is to resolve inconsistencies between the application logic and what the garbage collector sees. ↩
  2. Copying or moving objects takes a lot of memory. ↩