Recently my project entered its maintenance phase. There are few new requirements and plenty of idle time, which gives me a sense of crisis. Every day feels like muddling through, like a frog being slowly boiled in warm water. It has been 3 years since I graduated, and I am afraid that with 5 years of experience my abilities will be the same as they were at 3. So I began sorting out what I actually know. In my bookmarks I found an article, a self-check list for a [qualified] front-end engineer, and two of its questions caught my eye:

  • How exactly are variables stored in memory in JavaScript?
  • How does the browser’s garbage collection mechanism work, and how can memory leaks be avoided?

So I read through all kinds of material and put together this article.

After reading this article, you can learn:

  • How is JavaScript memory managed?
  • How does Chrome do garbage collection?
  • What improvements have Chrome made to garbage collection?


Memory management for JavaScript

Regardless of the programming language, the memory life cycle is basically the same:

  1. Allocate as much memory as you need
  2. Use allocated memory (read, write)
  3. Release/return it when it is no longer needed

Unlike other languages that require manual memory management, in JavaScript, when we create variables (objects, strings, etc.), the system automatically allocates the corresponding memory to the objects.

var n = 123; // Allocate memory for a number
var s = "azerty"; // Allocate memory for a string

var o = {
  a: 1,
  b: null
}; // Allocate memory for the object and the values it contains

// Allocate memory for the array and its values (just like an object)
var a = [1, null, "abra"];

function f(a){
  return a + 2;
} // Allocate memory for a function (a callable object)

// Function expressions also allocate an object
someElement.addEventListener('click', function(){
  someElement.style.backgroundColor = 'blue';
}, false);

When the system finds that these variables are no longer used, it automatically frees their memory (garbage collection), so developers don’t have to worry too much about memory problems.

However, we still need to understand JavaScript’s memory management mechanism to avoid unnecessary problems, such as being puzzled by results like these:

{} == {} // false
[] == [] // false
'' == '' // true

In JavaScript, data types fall into two groups: primitive types and reference types. Values of primitive types are stored in stack space, and values of reference types are stored in heap space.

  • Primitive types: each of these types occupies a fixed amount of memory, and the values are stored in stack space. We access them by value.
  • Reference types: the size of the value is not fixed. Stack memory stores an address that points to the object in heap memory. We access them by reference.
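
The comparison results shown earlier follow from this split. Here is a minimal sketch of the difference between access by value and access by reference (the variable names are illustrative only):

```javascript
// Primitives are copied and compared by value.
let a = 1;
let b = a;      // b gets its own copy of the value
b = 2;
console.log(a); // 1, changing b does not affect a
console.log('' === ''); // true, the two values are equal

// Objects are copied and compared by reference.
const o1 = { count: 1 };
const o2 = o1;          // o2 holds the same reference as o1
o2.count = 2;
console.log(o1.count);  // 2, both names point at one object
console.log({} === {}); // false, two distinct objects
console.log(o1 === o2); // true, same reference
```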

Stack memory, which only holds primitive values, is allocated and released automatically by the operating system. Heap memory has no fixed size, so the system cannot release it automatically. Instead, the JS engine has to release it itself.

Why is garbage collection needed

In Chrome, V8 limits how much memory it will use (about 1.4 GB / 1464 MB on 64-bit systems and about 0.7 GB / 732 MB on 32-bit systems). Why have such a limit?

  1. The surface reason is that V8 was originally designed for browsers, which are unlikely to encounter scenarios that need large amounts of memory
  2. The deeper reason is a limitation of V8’s garbage collection mechanism: cleaning up a large amount of garbage takes a long time, which pauses the JavaScript thread and makes application performance plummet

As mentioned above, memory on the stack is allocated and released automatically by the operating system, while memory on the heap is released by the JS engine (for example, Chrome’s V8). If our code is not written carefully, the engine’s garbage collector cannot release that memory (a memory leak). The browser’s memory footprint then keeps growing, which degrades the performance of the JavaScript application and eventually of the operating system.
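
As a rough illustration of how code can stop the collector from releasing memory, the sketch below keeps every object reachable from a module-level array. The names `registry`, `track`, and `untrack` are hypothetical and only serve the example:

```javascript
// Hypothetical registry: every tracked object stays reachable from this
// module-level array, so the garbage collector can never reclaim it.
const registry = [];

function track(obj) {
  registry.push(obj);
  return obj;
}

// Removing the reference makes the object collectable again
// (assuming nothing else still points to it).
function untrack(obj) {
  const index = registry.indexOf(obj);
  if (index !== -1) registry.splice(index, 1);
}

for (let i = 0; i < 1000; i++) {
  track({ id: i, payload: new Array(100).fill(i) });
}
console.log(registry.length); // 1000 objects are still reachable

registry.slice().forEach(untrack);
console.log(registry.length); // 0, the objects can now be collected
```

The fix is not a special API. It is simply dropping the last reference once the object is no longer needed.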

Chrome garbage collection algorithm

In JavaScript, the vast majority of objects live for a very short time and are freed by the first garbage collection they meet, while a small number of objects live for a long time, stay active, and rarely need to be reclaimed. To make collection more efficient, V8 divides the heap into two areas, the new generation and the old generation. The new generation holds short-lived objects and the old generation holds long-lived ones.

The new generation usually has a capacity of only 1 to 8 MB, while the old generation is much larger. V8 uses a different garbage collector for each area so that collection is more efficient.

  • Secondary garbage collector – Scavenge: responsible for new generation garbage collection.
  • Main garbage collector – Mark-Sweep & Mark-Compact: responsible for old generation garbage collection.

New generation garbage collector – Scavenge

In JavaScript, memory for any newly declared object is first allocated in the new generation. Because most objects live only briefly, a very efficient algorithm is needed there. The new generation mainly uses the Scavenge algorithm, a typical copying algorithm that trades space for time and works well when the amount of memory in use is small.

The Scavenge algorithm divides the new generation heap into two halves, called from-space and to-space. It works very simply: it copies the live objects in from-space to to-space and lays them out contiguously, then frees the memory of the inactive objects left in from-space. Finally, from-space and to-space swap roles, so that both halves can be reused.

The simple description is:

  • Mark active and inactive objects
  • Copy the live objects from from-space to to-space and lay them out contiguously
  • Free the memory of the inactive objects in from-space
  • Swap the roles of from-space and to-space
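
The steps above can be sketched with a toy model. This is not V8 code: objects are plain records with a `refs` array, the two spaces are ordinary arrays, and the `forwarding` map is an assumption of the sketch that remembers where each object was copied to.

```javascript
// Toy model of one Scavenge cycle. Objects are plain records; `refs`
// holds references to other records. This is an illustration, not V8 code.
function scavenge(fromSpace, roots) {
  const toSpace = [];
  const forwarding = new Map(); // old object -> its copy in to-space

  // Copy an object once and remember where it went.
  function copy(obj) {
    if (!forwarding.has(obj)) {
      const clone = { name: obj.name, refs: obj.refs.slice() };
      forwarding.set(obj, clone);
      toSpace.push(clone); // copies end up packed next to each other
    }
    return forwarding.get(obj);
  }

  // Start from the roots, then scan to-space to copy everything reachable.
  const newRoots = roots.map(copy);
  for (let scan = 0; scan < toSpace.length; scan++) {
    toSpace[scan].refs = toSpace[scan].refs.map(copy);
  }

  fromSpace.length = 0; // everything left in from-space is garbage
  // Swap roles: the filled to-space becomes the new from-space.
  return { fromSpace: toSpace, toSpace: fromSpace, roots: newRoots };
}

const a = { name: 'a', refs: [] };
const b = { name: 'b', refs: [a] };
const garbage = { name: 'garbage', refs: [] };
const result = scavenge([a, b, garbage], [b]);
console.log(result.fromSpace.map((o) => o.name)); // [ 'b', 'a' ]
```

Only live objects are touched, which is why the approach is cheap when few objects survive. The cost is that half of the space is always kept empty.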

So how does the garbage collector know which objects are active and inactive?

This relies on a concept called object reachability. Starting from the initial pointers to the root objects (window, global), known as the root set, the collector searches downward through the child nodes. Every child node it finds is referenced and therefore reachable, so it is marked. The search then recurses until all child nodes have been traversed. An object that is left unmarked is not referenced from anywhere reachable, so it needs to be freed and can be reclaimed by the garbage collector.
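
Reachability can be sketched as a plain graph traversal. The function name `markReachable` is hypothetical, and the sketch uses an explicit stack instead of recursion, which visits the same objects:

```javascript
// Illustrative only: marks every object reachable from the root set.
function markReachable(roots) {
  const marked = new Set();
  const stack = [...roots];
  while (stack.length > 0) {
    const obj = stack.pop();
    if (obj === null || typeof obj !== 'object' || marked.has(obj)) continue;
    marked.add(obj);
    // Every property value is a potential reference to another object.
    for (const value of Object.values(obj)) stack.push(value);
  }
  return marked;
}

const shared = { label: 'shared' };
const root = { child: { grandchild: shared }, other: shared };
const orphan = { label: 'orphan', link: shared };

const marked = markReachable([root]);
console.log(marked.has(shared)); // true, reachable through root
console.log(marked.has(orphan)); // false, nothing reachable points to it
```

Note that `orphan` pointing at `shared` does not keep `orphan` alive. What matters is whether something reachable points to an object, not what the object itself points to.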

When does a new generation object become an old generation object?

The new generation is further subdivided into a nursery generation and an intermediate generation. When an object is first allocated, it goes into the nursery. If it survives the next garbage collection, it is moved to the intermediate generation. If it is still alive after the following collection, the secondary garbage collector moves it to the old generation, a process called promotion.
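
A small model of this promotion rule, under the assumption that each object carries a `generation` label and that an `isLive` predicate stands in for the reachability check (both are inventions of the sketch):

```javascript
// Hypothetical model of one young-generation collection with promotion.
// `isLive` stands in for the reachability check.
function collectYoung(youngObjects, isLive) {
  const stillYoung = [];
  const promoted = [];
  for (const obj of youngObjects) {
    if (!isLive(obj)) continue; // dead objects are simply dropped
    if (obj.generation === 'nursery') {
      obj.generation = 'intermediate'; // survived its first collection
      stillYoung.push(obj);
    } else {
      obj.generation = 'old'; // survived a second collection: promotion
      promoted.push(obj);
    }
  }
  return { stillYoung, promoted };
}

const longLived = { name: 'longLived', generation: 'nursery' };
const shortLived = { name: 'shortLived', generation: 'nursery' };
const isLive = (obj) => obj === longLived;

const first = collectYoung([longLived, shortLived], isLive);
console.log(first.stillYoung.map((o) => o.generation)); // [ 'intermediate' ]
const second = collectYoung(first.stillYoung, isLive);
console.log(second.promoted.map((o) => o.generation)); // [ 'old' ]
```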

Old generation garbage collection – Mark-Sweep & Mark-Compact

Objects in the new generation are promoted to the old generation once they meet certain conditions. Objects in the old generation have already survived at least one collection, so they are more likely to keep surviving. Using the Scavenge algorithm there would cause two problems:

  • Scavenge is a copying algorithm, and repeatedly copying a large number of live objects is inefficient
  • Scavenge trades space for time, and the old generation is much larger, so setting half of it aside would waste a lot of space

Therefore, the Mark-Sweep and Mark-Compact algorithms are used in the old generation.

Mark-Sweep

Mark-Sweep works in two phases, marking and sweeping. It looks similar to Scavenge. The difference is that Scavenge copies the live objects, whereas in the old generation live objects are the majority, so Mark-Sweep marks the live and inactive objects and then clears the inactive ones directly.

  • Marking stage: The first scan of the old generation is carried out to mark active objects
  • Cleaning stage: The old generation is scanned for the second time to clear the unmarked objects, that is, to clean the inactive objects
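
The two phases can be sketched on a toy heap. The assumptions here are that the heap is an array of slots, each object is a record with a `refs` array and a `marked` flag, and a freed slot is represented by `null`:

```javascript
// Toy heap: an array of slots; each slot is an object record or null (free).
function markSweep(heap, roots) {
  // Marking phase: flag everything reachable from the roots.
  const stack = [...roots];
  while (stack.length > 0) {
    const obj = stack.pop();
    if (obj.marked) continue;
    obj.marked = true;
    stack.push(...obj.refs);
  }
  // Sweeping phase: free unmarked slots, reset marks for the next cycle.
  for (let i = 0; i < heap.length; i++) {
    const obj = heap[i];
    if (obj === null) continue;
    if (obj.marked) obj.marked = false;
    else heap[i] = null;
  }
  return heap;
}

const make = (name) => ({ name, refs: [], marked: false });
const [a, b, c, d] = ['a', 'b', 'c', 'd'].map(make);
a.refs.push(c);
const heap = markSweep([a, b, c, d], [a]);
console.log(heap.map((o) => (o ? o.name : '-')).join(' ')); // a - c -
```

The live objects stay where they were, so the freed slots end up scattered between them.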

Everything seems perfect, but one problem remains: the cleared objects are scattered across the address space, which leaves a lot of memory fragmentation.

Mark-Compact

After Mark-Sweep finishes, the old generation’s memory contains many fragments. If they are not dealt with, a later allocation of a large object may fail because no single fragment is big enough, and a garbage collection is triggered earlier than would otherwise be necessary.

To solve the fragmentation problem, Mark-Compact was proposed. It is based on Mark-Sweep and adds a compaction phase: all live objects are moved toward one end, and the memory beyond the resulting boundary is then cleared.
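
A minimal sketch of the compaction phase, assuming marking and sweeping have already run and the heap is modelled as an array in which `null` is a free slot. A real collector also has to update every pointer to a moved object, which this toy version leaves out:

```javascript
// Toy compaction over an array of slots (null = free). Illustration only.
function compact(heap) {
  let free = 0; // next slot at the "live" end of the heap
  for (let i = 0; i < heap.length; i++) {
    if (heap[i] !== null) {
      heap[free] = heap[i]; // slide the live object toward the start
      if (free !== i) heap[i] = null;
      free++;
    }
  }
  return free; // boundary: everything from this index on is free
}

const heap = ['a', null, 'c', null, 'e'];
const boundary = compact(heap);
console.log(heap);     // [ 'a', 'c', 'e', null, null ]
console.log(boundary); // 3
```

After compaction the free memory is one contiguous block, so a large object can be allocated without triggering another collection.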

Full pause – Stop-The-World

Garbage collection runs inside the JS engine, and the Mark-Compact algorithm has to move objects while it runs, so it cannot be very fast when there are many live objects. To avoid inconsistencies between the JavaScript application logic and the garbage collector competing for the same memory, the garbage collector pauses the JavaScript application. This is known as a full pause, or stop-the-world.

In the new generation this is not much of a problem: the space is small and few objects survive, so Scavenge runs fast and efficiently. The old generation is different. If it contains many live objects, the garbage collector pauses the main thread for a long time and the page stutters.

Orinoco optimizations

To improve the user experience and address the full pause problem, Orinoco uses incremental marking, lazy sweeping, concurrency, and parallelism to reduce the time the main thread is paused.

Incremental marking

To reduce the pause time of a full-heap garbage collection, incremental marking splits what used to be one marking pass over the whole heap into many small tasks that are interleaved with the JavaScript application logic, so that each marking pause only takes about 5 to 10 ms. Incremental marking is enabled once the heap size reaches a certain threshold. After that, each time a certain amount of memory has been allocated, script execution pauses and one increment of marking is performed.
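
The idea can be sketched as a marker that does a bounded amount of work per step. In this sketch the budget is counted in objects rather than milliseconds, and `createIncrementalMarker` is a hypothetical name, so treat it as an illustration of the scheduling idea only:

```javascript
// Marks at most `budget` objects per step, then yields to the application.
function createIncrementalMarker(roots, budget) {
  const worklist = [...roots];
  const marked = new Set();
  return {
    marked,
    // Returns true once marking has finished.
    step() {
      let processed = 0;
      while (worklist.length > 0 && processed < budget) {
        const obj = worklist.pop();
        if (marked.has(obj)) continue;
        marked.add(obj);
        worklist.push(...obj.refs);
        processed++;
      }
      return worklist.length === 0;
    },
  };
}

// A chain of five objects: n0 -> n1 -> n2 -> n3 -> n4
const nodes = Array.from({ length: 5 }, (_, i) => ({ id: i, refs: [] }));
for (let i = 0; i < 4; i++) nodes[i].refs.push(nodes[i + 1]);

const marker = createIncrementalMarker([nodes[0]], 2);
let steps = 0;
let done = false;
while (!done) {
  done = marker.step();
  steps++;
  // ...application code would run here between steps...
}
console.log(steps, marker.marked.size); // 3 5
```

The total amount of marking work is unchanged. It is only cut into slices short enough that no single pause is noticeable.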

Lazy sweeping

Incremental marking only marks active and inactive objects; actually reclaiming the memory is done by lazy sweeping. When incremental marking finishes, if the memory currently available is enough for the code to keep running quickly, there is no need to sweep immediately. Sweeping can be delayed so that the JavaScript code runs first, and the memory of inactive objects does not have to be cleaned up all at once either. The garbage collector sweeps pages one by one as needed until all pages have been swept.

Incremental marking and lazy sweeping have reduced the maximum pause time of the main thread by 80%, making the user’s interaction with the browser much smoother. In terms of implementation, JavaScript code runs between the small marking increments, so object pointers in the heap may change in the meantime. Write barriers are needed to record these changes in reference relationships, which also exposes the disadvantages of incremental marking:

  • It does not reduce the total pause time of the main thread, and even increases it slightly
  • Because of the cost of the write barrier mechanism, incremental marking can reduce the throughput of an application
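
To see why a write barrier is needed, consider a reference that is stored into an object the marker has already finished with. The sketch below assumes every reference store goes through a hypothetical `writeRef` helper. Real engines emit the barrier in compiled code instead, so this only illustrates the bookkeeping:

```javascript
// Hypothetical marker state shared by the collector and the write barrier.
const marked = new Set();
const worklist = [];
let markingInProgress = false;

// Every reference store goes through this barrier (an assumption of the
// sketch; real engines emit the barrier in compiled code).
function writeRef(holder, key, target) {
  holder[key] = target;
  if (markingInProgress && marked.has(holder) && !marked.has(target)) {
    worklist.push(target); // re-queue so the new reference is not missed
  }
}

// One small marking increment: process a single object from the worklist.
function markStep() {
  const obj = worklist.pop();
  if (obj === undefined || marked.has(obj)) return;
  marked.add(obj);
  for (const value of Object.values(obj)) {
    if (value !== null && typeof value === 'object') worklist.push(value);
  }
}

const root = { name: 'root' };
const late = { name: 'late' };

markingInProgress = true;
worklist.push(root);
markStep();                    // root has now been scanned
writeRef(root, 'child', late); // application runs between marking steps
while (worklist.length > 0) markStep();
console.log(marked.has(late)); // true, thanks to the barrier
```

Without the barrier, `late` would never be visited, because `root` had already been scanned, and a live object would be freed. The extra check on every store is the throughput cost mentioned above.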

Concurrency – Concurrent

Concurrent GC lets garbage collection run without pausing the main thread. The two proceed at the same time, and the main thread only needs to pause briefly at a few points so the garbage collector can perform special operations. However, this approach faces the same problem as incremental marking: because the running JavaScript code can change the reference relationships between heap objects at any moment, write barriers are still required.

Parallelism – Parallel

Parallel GC lets the main thread and helper threads do the same GC work at the same time. The helper threads share the work with the main thread, so the time spent on garbage collection is roughly the total time divided by the number of participating threads (plus some synchronization overhead).
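
As a back-of-the-envelope illustration of that estimate, here is a sketch with made-up numbers. The function name and the figures are hypothetical and are not measurements from V8:

```javascript
// Rough estimate: total GC work split evenly across threads, plus a
// fixed synchronization cost. All numbers below are invented.
function estimatePauseMs(totalWorkMs, threads, syncOverheadMs) {
  return totalWorkMs / threads + syncOverheadMs;
}

console.log(estimatePauseMs(40, 1, 0)); // 40, a single thread does all the work
console.log(estimatePauseMs(40, 4, 2)); // 12, four threads plus 2 ms of overhead
```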

V8 current garbage collection mechanism

In 2011, V8 introduced incremental marking. In 2018, Chrome 64 and Node.js v10 enabled concurrent marking and added parallelism on top of concurrency, which shortened garbage collection times significantly.

Secondary garbage collector

V8 uses parallelism for new generation garbage collection. In the copying phase, when live objects are copied from from-space to to-space, several helper threads work in parallel. Because these threads compete for the new generation heap, the same live object might be copied by more than one thread. To solve this, the first thread that finishes copying a live object must record a forwarding address pointing at the copy, so that other helper threads that reach the object can tell that it has already been copied.

Main garbage collector

For old generation garbage collection, V8 starts concurrent marking tasks once the heap size exceeds a certain threshold. Each helper thread follows the pointers of every object it marks and the objects they reference, and this concurrent marking happens on background helper threads while JavaScript code keeps executing. When JavaScript code modifies an object pointer in the heap, write barriers keep track of the change while the helper threads are marking concurrently.

When concurrent marking completes, or dynamically allocated memory reaches its limit, the main thread performs a quick final marking step, during which it is paused. It scans the root set once more to make sure every live object has been marked. Because the helper threads have already marked the live objects, this scan is only a check. Once it is confirmed, some helper threads sweep memory while others compact it. Since both run concurrently, the JavaScript code on the main thread is not affected.

Conclusion

Most JavaScript developers don’t need to think about garbage collection, but knowing how it works helps you understand how memory is being used and spot memory leaks from memory usage patterns. Preventing memory leaks is an important part of improving application performance.
