Introduction

As one of the most popular JavaScript engines, V8 has attracted a great deal of attention since its inception. JavaScript runs efficiently in both the browser and Node.js because of the V8 engine behind it, and V8 even helped make Chrome the dominant browser. In its pursuit of extreme performance and a better user experience, V8 has done a great deal for us: it upgraded from the original Full-Codegen and Crankshaft compilers to the powerful combination of the Ignition interpreter and the TurboFan compiler, and it applies optimization strategies such as hidden classes, inline caching, and hot-code collection, all to reduce overall memory footprint and improve performance.

This article focuses on the V8 engine’s garbage collection mechanism, explaining the strategies V8 uses to reduce memory usage over the lifetime of executing JavaScript code. Admittedly, this knowledge rarely affects day-to-day coding: we seldom see the browser run out of memory and crash the program. Still, a basic understanding of it makes us more conscious of memory footprint while writing code, helps us avoid unintentional memory leaks, and may help you write more robust code that is friendlier to the V8 engine. This article is also the author’s own summary, compiled while reviewing and consolidating reference material; if there are mistakes, please point them out.

1. Why do we need garbage collection

We know that when the V8 engine executes JavaScript code line by line and encounters a function call, it creates a function execution context and pushes it onto the top of the call stack. The function’s scope contains all the variables declared inside it. When the function finishes executing, the corresponding execution context is popped off the top of the stack, the function’s scope is destroyed, and all the variables it contains are released and automatically reclaimed. Imagine instead that when the scope is destroyed the variables are not reclaimed, that is, they stay in memory permanently: memory usage would inevitably keep growing, causing memory leaks, plummeting performance, and even crashes. So memory should be returned to the operating system after use, to ensure it can be used again and again.

This process is a bit like borrowing money from friends and relatives: if you borrow a lot and never pay it back on time, the next time you ask to borrow money it certainly won’t go as smoothly, or your friends and relatives simply won’t lend to you at all, and you will be the one left short (memory leaks, degraded performance). Borrow and return, and borrowing again is no hardship; after all, what you borrow must eventually be paid back.

As a high-level programming language, JavaScript does not require us to manually request and release memory the way C or C++ does; the V8 engine automatically allocates and manages memory for us, leaving us more energy to focus on complex business logic. This is a real benefit for front-end developers, but the downside is obvious: because we never have to manage memory by hand, sloppy code can easily cause memory leaks (after all, it is someone else’s kindness to you; if you have paid nothing for it, how much will you appreciate it?).

2. Memory limitations of the V8 engine

While the V8 engine gives us automatic garbage collection and frees our hands, memory usage in V8 is not unlimited. By default, V8 can use only about 1.4 GB of memory on a 64-bit system and about 0.7 GB on a 32-bit system, and these limits make it impossible to manipulate large blocks of memory directly in Node. For example, if you read a 2 GB file into memory for string analysis, you cannot do it even if the machine has 32 GB of physical RAM, so the computer’s memory resources cannot be fully used. Why such a limitation? It goes back to V8’s original design: it was built as a JavaScript execution environment for the browser, where we rarely encounter scenarios that need huge amounts of memory, so there was no need to set the maximum too high. But that is only one aspect; there are two other main reasons:

  • JS single-thread mechanism: as a scripting language for the browser, JS is mainly used to interact with users and manipulate the DOM, which is also what makes it single-threaded. Single-threaded means that code must be executed sequentially and only one task can be handled at a time. If JS were multi-threaded, one thread might be deleting a DOM element while another thread is modifying the same element, which would inevitably lead to complex synchronization problems. Because JS is single-threaded, while V8 is doing garbage collection all other logic in the program is suspended and waits until the collection is complete. In other words, due to the single-threaded nature of JS, the garbage collection process blocks the execution of the main-thread logic.

Although JS is single-threaded, HTML5 introduced the Web Worker standard to take advantage of multi-core CPUs. Its purpose is to give JS a multi-threaded environment, allowing the main thread to create Worker threads and hand some tasks off to them. While the main thread runs, the Worker runs in the background and the two do not interfere with each other; when the Worker thread finishes its computation, it returns the result to the main thread. The advantage is that computationally intensive or high-latency tasks are carried by Worker threads, so the main thread (which is usually responsible for UI interaction) stays smooth and is not blocked or slowed down. Web Worker is not part of the JS language itself, but a browser capability accessed through JS. Although it creates a multi-threaded execution environment, sub-threads are completely controlled by the main thread and cannot access browser-specific APIs such as DOM manipulation, so this standard does not change the single-threaded nature of JS.

  • Garbage collection mechanism: garbage collection itself is a time-consuming operation. Assuming V8’s heap memory is 1.5 GB, V8 needs more than 50 ms for a small collection, and more than 1 s for a full, non-incremental collection. During that second the browser simply waits and stops responding to the user; if an animation is running, it will stutter and drop frames, severely affecting application performance. So if memory usage is too high, garbage collection will inevitably be slow, and the longer the main thread has to wait, the longer the browser remains unresponsive.

Based on these two points, V8 takes the blunt approach of directly limiting the heap size to reduce the impact on application performance; after all, you rarely need to operate on gigabytes of memory in the browser. On the Node side, however, the I/O involved can be far more complex and varied than in the browser, so memory overflow is more likely. Fortunately, V8 exposes configuration options that let us adjust the memory limits manually, but they need to be set when the Node process is started.

We try the following command on the Node command line:

The local Node version here is v10.14.2; you can run the node -v command to check your own version. The options below may vary between versions.

// You can use this command to view the available V8 engine options in Node and their meanings
node --v8-options

We’ll then see a long list of V8 options in the command-line window, but we only need to focus on the few below:

// Set the minimum memory size for a single half-space in the new generation memory, in MB
node --min-semi-space-size=1024 xxx.js

// Set the maximum size of a single half-space in the new generation memory, in MB
node --max-semi-space-size=1024 xxx.js

// Set the maximum size of the old generation memory, in MB
node --max-old-space-size=2048 xxx.js

The above options let you manually relax the memory limits used by the V8 engine. Node also provides the process.memoryUsage() method to view the actual memory usage of the current Node process.

  • heapTotal: the total heap size currently allocated by V8.
  • heapUsed: the portion of the heap currently in use.
  • external: the memory occupied by C++ objects managed by V8.
  • rss (resident set size): how much physical memory is allocated to the Node process, including the heap, the stack, and the code segment. Objects, closures, and so on live in heap memory, variables live in stack memory, and the actual JavaScript source code lives in the code segment. When Worker threads are used, rss is a value valid for the entire process, while the other fields are valid only for the current thread.

When an object is declared in JS, its memory is allocated on the heap. If the currently allocated heap memory is not enough to hold the new object, V8 keeps allocating more heap memory until the size of the heap reaches V8’s limit.
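As a quick, hedged illustration of that growth (a minimal sketch; the exact numbers will differ by machine and Node version), allocating a large array visibly increases heapUsed:

// Sketch: compare heapUsed before and after allocating a large array
const toMB = (bytes) => (bytes / 1024 / 1024).toFixed(2) + ' MB';

const before = process.memoryUsage();
const big = new Array(1000000).fill('*'); // keep a reference so the array stays on the heap
const after = process.memoryUsage();

console.log('heapUsed before:', toMB(before.heapUsed));
console.log('heapUsed after :', toMB(after.heapUsed));
console.log('still holding', big.length, 'elements');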

3. V8 garbage collection strategy

V8’s garbage collection strategy is based mainly on a generational garbage collection mechanism: heap memory is divided into generations according to object lifetime, and different garbage collection algorithms are applied to different generations.

3.1 V8 memory structure

The V8 heap actually contains other regions besides the new generation and the old generation, but garbage collection mainly happens in those two, so we will not spend much time on the rest; interested readers can consult the relevant material. The V8 memory structure consists mainly of the following parts:

  • New generation (new_space): the area where most objects are initially allocated. It is relatively small, but garbage collection here is particularly frequent. The area is split into two halves: one is used to allocate memory, and the other is used to hold the objects that need to be kept during garbage collection.
  • Old generation (old_space): objects in the new generation that survive for a while are moved into the old generation, which is collected much less frequently than the new generation. The old generation is further divided into an old pointer space and an old data space: the former holds most objects that may contain pointers to other objects, while the latter holds only raw data objects with no pointers to other objects.
  • Large object space (large_object_space): stores objects that are too large for the other spaces. Each object gets its own memory, and garbage collection never moves large objects.
  • Code space (code_space): code objects are allocated here; it is the only memory area with execution permission.
  • Map space (map_space): stores cells and maps; each area stores elements of the same size, and the structure is simple (the author has not studied this part in detail; if any reader knows it well, please explain).

The memory structure diagram is as follows:

3.2 The new generation

In the V8 memory structure, the new generation mainly stores objects with short lifetimes. It consists of two semispace areas, with a maximum of 32 MB on 64-bit systems and 16 MB on 32-bit systems. Garbage collection in the new generation uses the Scavenge algorithm.

The Scavenge algorithm is a typical trade of space for time. In the old generation, which may hold a large number of objects, this algorithm would inevitably waste memory; but in the new generation, where most objects have short lifetimes, it is very efficient in time, so it is a good fit there.

The Scavenge algorithm uses Cheney’s algorithm, which splits the new-generation memory into two halves called semispaces, the two areas inside new_space that we saw in the figure above. The active half is called the From space, and the inactive half is called the To space. Only one of the two spaces is in use at any time; the other sits idle. Objects declared in our program are first allocated in the From space. When garbage collection runs, any live objects in the From space are copied into the To space, and non-live objects are automatically reclaimed. When the copy is complete, the From space and the To space swap roles: the To space becomes the new From space, and the original From space becomes the To space.
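To make the copy-and-swap idea concrete, here is a toy model in plain JavaScript. It only illustrates the bookkeeping described above and is in no way how V8 implements Scavenge internally; the names (allocate, scavenge, isAlive) are made up for illustration.

// Toy model of a semispace collector: copy live objects From -> To, then swap roles
let fromSpace = [];   // active semispace: new allocations go here
let toSpace = [];     // idle semispace: receives survivors during a scavenge

function allocate(obj) {
    fromSpace.push(obj);
    return obj;
}

function scavenge(isAlive) {
    // copy every object that is still reachable into the To space
    for (const obj of fromSpace) {
        if (isAlive(obj)) toSpace.push(obj);
    }
    // everything left in the From space is garbage; drop it and swap the roles
    fromSpace = toSpace;
    toSpace = [];
}

// usage: A becomes unreachable, B and C survive and are copied over
const live = new Set();
const A = allocate({ name: 'A' });
const B = allocate({ name: 'B' });
const C = allocate({ name: 'C' });
live.add(B);
live.add(C);
scavenge((obj) => live.has(obj));
console.log(fromSpace.map((o) => o.name)); // ['B', 'C']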

Based on the above algorithm, we can draw the following flow chart:

  • Suppose we allocate three objects A, B, and C in the From space

  • After the main thread finishes its first task, garbage collection runs; object A has no remaining references, so it can be reclaimed

  • Objects B and C are still alive, so they are copied into the To space

  • All non-live objects in the From space are then cleared

  • The From space is now empty, and it swaps roles with the To space

  • While the main thread runs its second task, a new object D is allocated in the (new) From space

  • When the next garbage collection runs, object D has no remaining references, so it can be reclaimed

  • Objects B and C are still alive and are copied into the To space again

  • Once again, all non-live objects in the From space are cleared

  • The From space and the To space swap roles once more


3.3 Object Promotion

When an object survives multiple copies, it is considered a long-lived object and will be moved directly into the old generation at the next garbage collection. Moving an object from the new generation into the old generation is called promotion. There are two main conditions that trigger promotion:

  • Whether the object has already survived a previous Scavenge round
  • Whether the To space’s memory usage has exceeded 25%

By default, the objects we create are allocated in the From space. During garbage collection, before an object is copied from the From space to the To space, V8 first checks the object’s memory address to determine whether it has already been through a Scavenge round. If the address shows that it has, the object is moved straight into the old generation instead of being copied into the To space. This can be represented by the following flow diagram:
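Expressed as code, the promotion check looks roughly like the sketch below. The 25% threshold and the "already survived one Scavenge" condition come from the two bullet points above; the function and field names are invented purely for illustration.

// Hypothetical sketch of the decision made for each live object during a scavenge
function shouldPromoteToOldGeneration(obj, toSpace) {
    // Condition 1: the object has already survived a previous Scavenge round
    if (obj.survivedScavenge) {
        return true;
    }
    // Condition 2: the To space is already more than 25% full
    if (toSpace.used / toSpace.capacity > 0.25) {
        return true;
    }
    return false;
}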


3.4 The old generation

In the old generation, which holds a large number of live objects, the Scavenge algorithm would obviously waste half of the memory, so the Mark-Sweep and Mark-Compact algorithms are used there instead.

In the early days we may have heard of an algorithm called reference counting. Its principle is simple: check whether any other references still point to the object; if there are none, the object is treated as garbage and collected by the garbage collector, as shown in the following example:

// Create obj1 and obj2. obj2 is referenced by obj1 as one of its properties, so it will not be garbage collected
let obj1 = {
    obj2: {
        a: 1
    }
};

// Create obj3 and assign obj1 to it, so both variables point to the same memory address
let obj3 = obj1;

// Reassign obj1, so the object is now referenced only through obj3
obj1 = null;

// Create obj4 and assign obj3.obj2 to it
// The object obj2 points to now has two references: one as a property of obj3, the other as the variable obj4
let obj4 = obj3.obj2;

// The inner object is still referenced by obj4, so it cannot be reclaimed yet
obj3 = null;

// Now the inner object has no references left, so it can be reclaimed
obj4 = null;

In the above example, the object will eventually be garbage collected after a series of operations, but once we get to the circular reference scenario, there will be a problem. Let’s look at the following example:

function foo() {
    let a = {};
    let b = {};
    a.a1 = b;
    b.b1 = a;
}
foo();

In this example, property a1 of object a points to object b, and property b1 of object b points to object a, so the two objects reference each other. After foo finishes executing, its scope is destroyed, and the variables a and b it contained should be collectable; but under the reference counting algorithm, both objects still have a reference pointing at them, so they can never be collected, resulting in a memory leak.

So, to avoid memory leaks caused by circular references, all modern browsers had abandoned this algorithm by 2012 in favor of the Mark-Sweep and Mark-Compact algorithms. In the circular-reference example above, the variables a and b are not reachable from the window global object, so they cannot be marked and are eventually collected.

Mark-Sweep is divided into two phases: marking, in which all objects in the heap are traversed and the live ones are marked, and sweeping, in which the dead, unmarked objects are reclaimed. In essence, the Mark-Sweep algorithm determines whether an object can be reached, and from that knows whether the object should be reclaimed. The specific steps are:

  • The garbage collector builds an internal root list, from which it finds the variables that can be reached. In browser JavaScript, for example, the window global object can be treated as a root node.
  • The garbage collector then walks outward from every root node, visits the child nodes it can reach, and marks them as live. Anything the roots cannot reach is considered dead and treated as garbage.
  • Finally, the garbage collector frees the memory of all non-live objects and returns it to the operating system.

Any of the following can serve as a root node:

  1. Global object
  2. Local variables and arguments of a local function
  3. Currently nested variables and arguments of other functions on the call chain
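As a rough sketch, the marking phase can be pictured as a graph traversal starting from the root set. This is only an illustration of the idea in ordinary JavaScript, not V8’s actual implementation:

// Mark every object reachable from the roots; anything left unmarked is garbage
function mark(roots) {
    const marked = new Set();
    const stack = [...roots];
    while (stack.length > 0) {
        const obj = stack.pop();
        if (obj === null || typeof obj !== 'object' || marked.has(obj)) {
            continue;
        }
        marked.add(obj); // reachable from a root, so it survives this cycle
        for (const key of Object.keys(obj)) {
            stack.push(obj[key]); // follow outgoing references to child objects
        }
    }
    // The sweep phase would then reclaim every heap object not present in `marked`
    return marked;
}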

After a Mark-Sweep cycle, however, the reclaimed memory is scattered across the heap in discontinuous pieces, and this memory fragmentation makes it difficult to allocate larger objects later.

To solve this memory fragmentation problem, the Mark-Compact (mark and compact) algorithm was proposed. After the dead objects have been cleared during collection, the compaction step moves the live objects toward one end of the heap memory; once the move is complete, all the memory outside the boundary is freed. This can be represented by the following flow chart:

  • Suppose there are four objects A, B, C, and D in the old generation

  • During the marking phase of garbage collection, objects A and C are marked as live

  • During the compacting phase, the live objects are moved to one end of the heap memory

  • During the sweeping phase, all the memory beyond the boundary of the live objects is reclaimed
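Modelled on a simple array of slots, the compaction step can be sketched like this (a toy illustration only, not V8 internals):

// Toy heap: each slot holds either a live object or null (a dead object already swept)
function compact(heap) {
    // Move all live objects toward one end of the heap...
    const live = heap.filter((slot) => slot !== null);
    // ...then everything beyond the boundary is one contiguous block of free memory
    return live.concat(new Array(heap.length - live.length).fill(null));
}

console.log(compact([{ id: 'A' }, null, { id: 'C' }, null]));
// -> [ { id: 'A' }, { id: 'C' }, null, null ]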

As mentioned earlier, because of JS’s single-threaded nature, the garbage collection process blocks the synchronous tasks on the main thread, and the main task’s logic only resumes after collection completes. This behavior is called "stop-the-world". The marking phase also blocks the main thread, and the old generation usually holds a large number of live objects; if the entire heap were traversed in a single marking pass, it would cause serious stalls.

Therefore, to reduce the pauses caused by garbage collection, V8 introduced Incremental Marking: it marks a portion of the objects in the heap, then pauses and hands execution back to the JS main thread; once the main thread’s task is finished, marking resumes from where it paused, until the whole heap has been marked. The idea is similar to the Fiber architecture in React, which only traverses the Fiber tree and performs the corresponding work when the browser is idle, otherwise deferring it, so as to minimize the impact on the main thread, avoid jank, and improve application performance.
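The slicing idea can be sketched as follows: a work list is processed in small budgets, and whatever is left over is finished in a later slice. The budget value and the scheduling via setTimeout are arbitrary choices made here purely for illustration, not how V8 schedules its marking steps.

// Process the marking work list in small slices so each pause stays short
function incrementalMark(workList, marked = new Set(), budget = 100) {
    let steps = 0;
    while (workList.length > 0 && steps < budget) {
        const obj = workList.pop();
        if (obj && typeof obj === 'object' && !marked.has(obj)) {
            marked.add(obj);
            // Push child references so they are marked in this or a later slice
            Object.values(obj).forEach((child) => workList.push(child));
        }
        steps++;
    }
    if (workList.length > 0) {
        // Not finished: yield back to the main thread and continue later
        setTimeout(() => incrementalMark(workList, marked, budget), 0);
    }
    return marked;
}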

Building on incremental marking, V8 went on to introduce lazy sweeping and incremental compaction, making sweeping and compaction incremental as well. To make full use of multi-core CPUs, it also introduced parallel marking and parallel sweeping, further reducing the impact of garbage collection on the main thread and improving application performance.

4. How to avoid memory leaks

While writing code we usually don’t pay much attention to how to avoid memory leaks, since browsers and most front-end frameworks already handle the common cases for us under the hood. Still, we should know the common ways to avoid them; after all, it comes up frequently in interviews.

4.1 Create as few global variables as possible

In ES5, when a variable is created with var in the global scope, or assigned inside a function scope without any declaration, it is invisibly mounted on the window global object, as shown below:

var a = 1; // equivalent to window.a = 1;
function foo() {
    a = 1;
}

Is equivalent to

function foo() {
    window.a = 1;
}

We created a variable a in foo but forgot to declare it using var, so we accidentally created a global variable and mounted it to the window object. There is also a more subtle way to create a global variable:

function foo() {
    this.a = 1;
}
foo(); // equivalent to window.foo()

When foo is called this way, the call is equivalent to window.foo(), so this refers to window and a global variable is created unintentionally. During garbage collection, since the window object serves as a root node in the marking phase, properties mounted on window are reachable and are marked as live, so they stay in memory and are never collected; the global scope is destroyed only when the whole process exits. If you do have to use a global variable, be sure to set it to null after use so that it becomes eligible for collection.
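Two practical habits follow from this: enable strict mode so that accidental globals throw instead of silently attaching to window, and explicitly null out any global you genuinely need once you are done with it. A small sketch (the bigCache name is made up for illustration):

'use strict';

function foo() {
    // a = 1; // with strict mode this would throw a ReferenceError instead of creating window.a
    let a = 1; // keep variables local so they are reclaimed when foo() returns
    return a;
}
foo();

// If a global really is required, release it explicitly when it is no longer needed
window.bigCache = new Array(1000000).fill(0);
// ... use window.bigCache ...
window.bigCache = null; // the array becomes unreachable and eligible for collection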

4.2 Manually Clearing timers

In our applications, we often use setTimeout or setInterval timers. The timer itself is a very useful function, but if we are not careful and forget to manually clear the timer at the appropriate time, it may lead to memory leaks, as shown in the following example:

const numbers = [];
const foo = function() {
    for (let i = 0; i < 100000; i++) {
        numbers.push(i);
    }
};
window.setInterval(foo, 1000);

In this example, because we never clear the timer, the callback keeps running forever, and the numbers array referenced by the callback can never be garbage collected, so it grows without bound and causes a memory leak.
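A hedged fix is simply to keep the timer id returned by setInterval and clear it once the work is done (the stop condition below is arbitrary, purely for illustration):

const numbers = [];
const foo = function() {
    for (let i = 0; i < 100000; i++) {
        numbers.push(i);
    }
    // Stop the timer once it has done enough work, so foo and numbers can be reclaimed
    if (numbers.length >= 1000000) {
        window.clearInterval(timerId);
        numbers.length = 0; // also drop the references held inside the array
    }
};
const timerId = window.setInterval(foo, 1000);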

4.3 Use closures less

Closures are an advanced feature in JS, and the clever use of closures can help us achieve many advanced functions. Normally, if we cannot find a variable in the local scope, we will do a one-way search from inside out along the scope chain, but the closure feature allows us to access variables in the inner scope from outside, as shown in the following example:

function foo() {
    let local = 123;
    return function() {
        return local;
    };
}
const bar = foo();
console.log(bar()); // -> 123

In this example, foo returns an anonymous function that references the local variable inside foo, and bar holds a reference to that anonymous function. This closure lets us access the local variable from outside foo’s scope. Normally, when foo finishes executing, its scope is destroyed; but because something still references the anonymous function it returned, the local variable cannot be reclaimed. Only when the anonymous function itself is no longer referenced can local be garbage collected.
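If the closure captures something large, release the outer reference to the closure once it is no longer needed, so the captured variable can be collected. A small sketch:

function foo() {
    const local = new Array(1000000).fill('*'); // large data captured by the closure
    return function() {
        return local.length;
    };
}

let bar = foo();
console.log(bar()); // 1000000 -- local stays alive because bar still references the closure
bar = null;         // once the closure is unreachable, local becomes eligible for collection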

4.4 Clearing A DOM Reference

To avoid retrieving DOM elements more than once, we used to store DOM elements in a data dictionary, as shown in the following example:

const elements = {
    button: document.getElementById('button')
};

function removeButton() {
    document.body.removeChild(document.getElementById('button'));
}

In this example, we want to call removeButton to delete the button element, but because the elements dictionary still holds a reference to it, the element stays in memory even after removeChild removes it from the DOM, and it cannot be freed until we manually clear the elements.button reference as well.
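A hedged version that actually allows the element to be collected simply clears the stored reference at the same time:

const elements = {
    button: document.getElementById('button')
};

function removeButton() {
    document.body.removeChild(document.getElementById('button'));
    // Also drop our own reference, otherwise the detached node stays in memory
    elements.button = null;
}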

4.5 Weak references

From the previous examples we can see that a moment of carelessness easily leads to memory leaks. For this reason, ES6 adds two useful data structures, WeakMap and WeakSet, which are designed with this problem in mind: the objects referenced by their keys are held by weak references. A weak reference means that the key’s reference to the object is not counted during garbage collection; as long as the referenced object has no other references, the garbage collector will free its memory. This means we do not need to worry about a WeakMap key keeping other objects alive, nor do we need to clear the reference manually. Let’s demonstrate the process in Node (following the example in Ruan Yifeng’s ES6 tutorial and reproducing it by hand).

First open the Node command line and type the following command:

// The --expose-gc flag allows garbage collection to be triggered manually from JavaScript
node --expose-gc

Then we execute the following code.

// Perform a manual garbage collection to ensure the memory numbers are accurate
> global.gc();
undefined

// Check memory usage: the heapUsed field is about 4.4 MB
> process.memoryUsage();
{ rss: 21626880, heapTotal: 7585792, heapUsed: 4708440, external: 8710 }

// Create a WeakMap
> let wm = new WeakMap();
undefined

// Create an array and assign it to the variable key
> let key = new Array(1000000);
undefined

// Point the WeakMap key at this array
// The array now has two references: the variable key and the WeakMap key
// Note that the WeakMap reference is a weak reference
> wm.set(key, 1);
WeakMap { [items unknown] }

// Perform a manual garbage collection again
> global.gc();
undefined

// Look at the memory usage again: heapUsed has grown to about 12 MB
> process.memoryUsage();
{ rss: 30232576, heapTotal: 17694720, heapUsed: 13068464, external: 8688 }

// Manually clear the variable key's reference to the array
// Note that the WeakMap's key reference to the array is NOT cleared
> key = null;
null

// Perform garbage collection one more time
> global.gc();
undefined

// Check the memory usage: heapUsed is back to roughly its previous size (about 4.8 MB)
> process.memoryUsage();
{ rss: 22110208, heapTotal: 9158656, heapUsed: 5089752, external: 8698 }

In the example above, even though we never manually cleared the WeakMap key’s reference to the array, memory still returned to its original size after clearing key, which shows that the array was reclaimed. That is exactly what a weak reference means.
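One practical use of this property is attaching metadata to DOM elements without keeping them alive. A small sketch, with the metadata shape chosen arbitrarily:

// Store per-element metadata in a WeakMap so that once the element is gone,
// its metadata entry can be garbage collected along with it
const metadata = new WeakMap();

let button = document.getElementById('button');
metadata.set(button, { clicks: 0, createdAt: Date.now() });

// Remove the element and drop our reference to it
document.body.removeChild(button);
button = null;
// No manual cleanup of `metadata` is needed: the entry is now unreachable and collectable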

5. Summary

This article has focused on the V8 engine’s garbage collection mechanism, describing the collection strategies and the corresponding algorithms for the new generation and the old generation, and then listing some common ways to avoid memory leaks to help us write more elegant code. If you already know this material, hopefully the article served as an easy review that deepens the impression; if not, the author hopes it helped you understand a little of what happens beneath your code. Because the V8 engine’s source is implemented in C++, the author has not gone deeper in that direction; interested readers can explore it themselves. If there are mistakes in the article, please point them out in the comments section.

6. Communication

If you found this article helpful, please consider following the author’s public account [front-end], where I try to publish some original front-end technical content every week. After following, you can be invited to join the front-end technology exchange group, where we can communicate with each other and make progress together.

The article has also been updated on my GitHub blog; if you find it useful, a STAR is welcome!

A like from you is worth all the extra effort!

Grow through adversity and keep learning to become a better version of yourself — let’s encourage each other!