Until recently, my understanding of JS garbage collection was stuck at ‘allocated memory is freed once it is no longer needed’. The question is, how does the browser determine that allocated memory is no longer needed?

  • Introduction to memory
  • Introduction to garbage collection

Introduction to memory

MDN: Low-level languages like C have manual memory management primitives such as malloc() and free(). JavaScript, on the other hand, automatically allocates memory when values (objects, strings, etc.) are created and frees it when they are no longer in use. The latter process is called garbage collection. This automaticity is a potential source of confusion: it can give JavaScript (and other high-level language) developers the false impression that they do not need to care about memory management.

Memory life cycle

  1. Allocate the memory you need
  2. Use the allocated memory (read, write)
  3. Release it when it is no longer needed

JavaScript memory allocation

To spare programmers the trouble of allocating memory, JavaScript does so when variables are defined.

Value initialization

var n = 123; // allocates memory for a number
var s = "azerty"; // allocates memory for a string

var o = {
  a: 1,
  b: null
}; // allocates memory for an object and the values it contains

// allocates memory for an array and the values it contains (just like an object)
var a = [1, null, "abra"];

function f(a){
  return a + 2;
} // allocates a function (which is a callable object)

// function expressions also allocate an object
someElement.addEventListener('click', function(){
  someElement.style.backgroundColor = 'blue';
}, false);

Allocate memory through function calls

Some function calls result in allocating object memory:

var d = new Date(); // allocates a Date object

var e = document.createElement('div'); // allocates a DOM element

Some methods allocate new values or new objects:

var s = "azerty";
var s2 = s.substr(0, 3); // s2 is a new string
// Since strings are immutable values,
// JavaScript may decide not to allocate memory
// and only store the range [0, 3].

var a = ["ouais ouais", "nan nan"];
var a2 = ["generation", "nan nan"];
var a3 = a.concat(a2);
// The new array has four elements: the concatenation of a and a2

Using values

Using a value basically means reading from or writing to allocated memory. This can be reading or writing the value of a variable or an object property, or even passing an argument to a function.

Release memory when it is no longer needed

MDN: Most memory management problems are at this stage. The hardest task here is to find out that “the allocated memory is really no longer needed.” It often requires the developer to determine which chunk of memory in the program is no longer needed and to release it.

High-level language runtimes embed a “garbage collector”, whose main job is to track memory allocation and use so that memory is freed automatically once it is no longer used. This can only be an approximation, because the general problem of deciding whether a block of memory is still needed is undecidable.


Introduction to garbage collection mechanism policies

Concept of reference

Garbage collection algorithms rely heavily on the concept of references.

In a memory-managed environment, an object that has access to another object (implicitly or explicitly) is said to reference that object. For example, a JavaScript object has a reference to its prototype (an implicit reference) and to the values of its properties (explicit references).

Here the notion of “object” is broader than regular JavaScript objects: it also includes function scopes (or the global lexical scope).

Reference counting garbage collection

This is the most rudimentary garbage collection algorithm. This algorithm simplifies the definition of “whether an object is no longer needed” to “whether the object has other objects referring to it”. If there are no references to the object (zero references), the object will be collected by the garbage collection mechanism.

var o = {
  a: {
    b: 2
  }
};
// Two objects are created: one is referenced as a property of the other,
// and the other is assigned to the variable o.
// Obviously, neither of them can be garbage collected.

var o2 = o; // the o2 variable is the second reference to "this object"

o = 1;      // "this object" now has a single reference, held by o2

var oa = o2.a; // references the a property of "this object"
// The inner object now has two references: one as the a property, one from oa

o2 = "yo"; // "this object" now has zero references
           // and can be garbage collected.
           // However, the object in its a property is still referenced by oa, so it cannot be reclaimed yet

oa = null; // the object that was in the a property now has zero references
           // and can be garbage collected
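
The bookkeeping behind the example above can be sketched as a toy model. This is only an illustration of the counting idea, not how any real engine is implemented; the RefCountedHeap class and the names "outer" and "inner" are hypothetical stand-ins for the two objects created above.

```javascript
// Toy model of reference counting (hypothetical class, for illustration only).
class RefCountedHeap {
  constructor() {
    this.counts = new Map();   // object name -> number of incoming references
    this.children = new Map(); // object name -> names of objects it references
    this.freed = [];           // names in the order they were reclaimed
  }

  alloc(name, childNames = []) {
    this.counts.set(name, 0);
    this.children.set(name, childNames);
    for (const child of childNames) this.retain(child);
  }

  retain(name) {
    this.counts.set(name, this.counts.get(name) + 1);
  }

  release(name) {
    const count = this.counts.get(name) - 1;
    this.counts.set(name, count);
    if (count === 0) {
      // Zero references: reclaim the object and drop the references it held.
      this.freed.push(name);
      for (const child of this.children.get(name)) this.release(child);
    }
  }
}

const heap = new RefCountedHeap();
heap.alloc("inner");            // { b: 2 }
heap.alloc("outer", ["inner"]); // { a: inner }, so inner has 1 reference
heap.retain("outer");           // var o = outer
heap.retain("outer");           // var o2 = o
heap.release("outer");          // o = 1
heap.retain("inner");           // var oa = o2.a
heap.release("outer");          // o2 = "yo": outer is freed, inner survives
const freedAfterO2 = [...heap.freed];
heap.release("inner");          // oa = null: inner is freed

console.log(freedAfterO2); // [ 'outer' ]
console.log(heap.freed);   // [ 'outer', 'inner' ]
```

Note that freeing "outer" also releases the reference it held to "inner", which is why "inner" only needs one more release to be reclaimed.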

Reference counting defect

This algorithm has a limitation: it cannot handle circular references. In the following example, two objects are created that reference each other, forming a cycle. Once the function call returns they go out of scope, so they are no longer useful and could be reclaimed. However, the reference-counting algorithm sees that each object still has at least one reference (from the other), so neither is ever reclaimed.

function f(){
  var o = {};
  var o2 = {};
  o.a = o2; // o references o2
  o2.a = o; // o2 references o

  return "azerty";
}

f();
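
To make the leak visible, here is a small sketch that keeps explicit counts for the two objects in f(). The helper names (makeObj, addRef, dropRef) and the refCount field are assumptions made for this illustration; real engines do not expose counts like this.

```javascript
// Toy records with an explicit reference count (illustrative only).
function makeObj(name) {
  return { name, refCount: 0, fields: {} };
}

function addRef(target) {
  target.refCount += 1;
  return target;
}

function dropRef(target, freed) {
  target.refCount -= 1;
  if (target.refCount === 0) {
    freed.push(target.name);
    for (const child of Object.values(target.fields)) dropRef(child, freed);
  }
}

const freed = [];
const o = addRef(makeObj("o"));   // local variable o
const o2 = addRef(makeObj("o2")); // local variable o2
o.fields.a = addRef(o2);          // o.a = o2
o2.fields.a = addRef(o);          // o2.a = o

// f() returns, so both local variables disappear.
dropRef(o, freed);
dropRef(o2, freed);

console.log(freed);                   // []
console.log(o.refCount, o2.refCount); // 1 1
```

Neither count reaches zero, so under pure reference counting both objects stay allocated even though nothing in the program can reach them.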

Mark-sweep algorithm

This algorithm reduces the definition of “an object is no longer needed” to “an object is unreachable”.

It runs in two phases: a mark phase and a sweep phase.

  1. In the mark phase, the garbage collector traverses the object graph starting from the root objects. Every object that can be reached from a root is given a mark identifying it as reachable.
  2. In the sweep phase, heap memory is traversed linearly from beginning to end. If an object is not marked as reachable, the memory it occupies is reclaimed; for objects that are marked, the mark is cleared in preparation for the next garbage collection.

Take a quick look at the following two images

  • In the mark phase, B is reachable from root object 1 and E is reachable from B, so B and E are marked as reachable; in the same way, F, G, J, and K are reachable.
  • During the collection phase, all objects not marked as reachable are collected by the garbage collector.

This algorithm is better than the previous one because an object with zero references is always unreachable, but the converse does not hold, as the circular reference case shows.
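
The two phases can be sketched on a toy heap. This is a simplified model under stated assumptions: the heap is a plain array, each object carries a hypothetical marked flag and refs list, and the node names X and Y are invented here to show an unreachable cycle.

```javascript
// Minimal mark-sweep over a toy heap (illustrative; not an engine implementation).
function markAndSweep(heap, roots) {
  // Mark phase: traverse everything reachable from the roots.
  const stack = [...roots];
  while (stack.length > 0) {
    const obj = stack.pop();
    if (obj.marked) continue;
    obj.marked = true;
    stack.push(...obj.refs);
  }

  // Sweep phase: one linear pass over the whole heap.
  const survivors = [];
  const collected = [];
  for (const obj of heap) {
    if (obj.marked) {
      obj.marked = false; // clear the mark for the next collection
      survivors.push(obj);
    } else {
      collected.push(obj.name);
    }
  }
  return { survivors, collected };
}

const node = (name) => ({ name, refs: [], marked: false });
const root = node("root");
const b = node("B");
const e = node("E");
const x = node("X");
const y = node("Y");
root.refs.push(b);
b.refs.push(e);
x.refs.push(y);
y.refs.push(x); // X and Y reference each other but are unreachable

const result = markAndSweep([root, b, e, x, y], [root]);
const survivorNames = result.survivors.map((obj) => obj.name);

console.log(survivorNames);    // [ 'root', 'B', 'E' ]
console.log(result.collected); // [ 'X', 'Y' ]
```

The X/Y cycle is collected because reachability, not the number of references, decides what survives.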

Since 2012, all modern browsers have used a mark-sweep garbage collector. All the improvements made to JavaScript garbage collection since then (generational, incremental, concurrent, and parallel collection) are implementation improvements of the mark-sweep algorithm, not improvements to the algorithm itself or to its simplified definition of whether an object is no longer needed.

When to start garbage collection

In general, when the mark-sweep algorithm is used, unreferenced objects are not reclaimed immediately. Instead, garbage objects accumulate until memory is exhausted; at that point the program is suspended and garbage collection begins.

Mark-sweep algorithm defects

  • Objects that cannot be reached from the root object are cleared.
  • Garbage collection can leave memory heavily fragmented, as shown in the image above: after garbage collection, three free fragments remain in memory. Suppose one square represents one unit of memory. If a new object needs three contiguous units and no fragment is that large, the allocation cannot be satisfied: the mutator stays suspended while the collector keeps attempting garbage collection, until it ends in an out-of-memory error.
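
The fragmentation problem can be illustrated with a toy heap of memory units. The layout below is an assumption chosen to produce three free fragments; it is not taken from the original figure.

```javascript
// Toy heap: true = unit in use, false = free unit (layout is hypothetical).
const heapUnits = [true, false, true, true, false, false, true, false];

// Length of the longest run of contiguous free units.
function largestFreeRun(units) {
  let best = 0;
  let current = 0;
  for (const used of units) {
    current = used ? 0 : current + 1;
    best = Math.max(best, current);
  }
  return best;
}

const totalFree = heapUnits.filter((used) => !used).length;
const canFitThree = largestFreeRun(heapUnits) >= 3;

console.log(totalFree);   // 4
console.log(canFitThree); // false
```

Four units are free in total, yet a three-unit object still cannot be placed because the largest contiguous fragment is only two units long.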

Chrome V8 garbage collection algorithm: generational GC

This follows the same idea as Java’s collection strategy. The goal is to distinguish “temporary” objects from “permanent” ones: collect the “young generation” more often and the “tenured generation” less often, which reduces the number of objects traversed per collection and therefore the time each GC takes. The V8 engine used in Chrome applies this generational collection strategy.

“Temporary” and “permanent” objects are also called “new generation” and “old generation” objects.

V8 generational collection

V8 Memory limits

There is a limit to how much memory JavaScript can use in Node. The figures below are the defaults for the V8 versions this article describes; exact defaults depend on the V8 version.

  1. About 1.4GB on 64-bit systems.
  2. About 0.7GB on 32-bit systems.

Broken down by generation, the defaults are:

  1. On a 32-bit system, the new generation is 16MB and the old generation is 700MB.
  2. On a 64-bit system, the new generation is 32MB and the old generation is 1.4GB.

The new generation is split evenly into two memory spaces, called semispaces, each 8MB (32-bit) or 16MB (64-bit).

These limits can be adjusted when starting node by passing --max-old-space-size and --max-new-space-size, for example:

node --max-old-space-size=1700 app.js # the unit is MB
node --max-new-space-size=1024 app.js # the unit is KB

The above parameters take effect when V8 is initialized and cannot be changed dynamically once they take effect.
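
To see which limits actually apply to a running process, Node's built-in v8 module exposes heap statistics. A minimal sketch, assuming a Node runtime; the printed numbers depend on the Node version, platform, and startup flags, so no particular output is implied.

```javascript
const v8 = require('v8');

const MB = 1024 * 1024;
const stats = v8.getHeapStatistics();

// heap_size_limit is the ceiling V8 will let the heap grow to.
const heapLimitMB = stats.heap_size_limit / MB;
console.log('heap size limit (MB):', heapLimitMB.toFixed(0));

// Per-space figures; entries such as new_space and old_space
// correspond to the generations described above.
const spaces = v8.getHeapSpaceStatistics();
for (const space of spaces) {
  console.log(space.space_name, (space.space_size / MB).toFixed(1), 'MB');
}
```

Running the script with and without a flag such as --max-old-space-size is a simple way to confirm that the setting took effect.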

Why does V8 have memory limits

  • The ostensible reason is that V8 was originally designed as a JavaScript engine for the browser, where large-memory scenarios are unlikely.
  • The deeper reason is a limitation of V8’s garbage collection mechanism. Because V8 must ensure that what the JavaScript application logic sees is consistent with what the garbage collector sees, it pauses the application logic during garbage collection and resumes it only once the collection is complete. This behavior is known as stop-the-world.
  • With a 1.5GB heap, V8 can take more than 50ms for a small garbage collection and more than a second for a non-incremental one.
  • In that case the browser does not respond to the user for about 1s, so the page appears frozen. Animations, if any, are also visibly affected.

V8 New Generation algorithm (Scavenge)

New-generation objects are collected mainly with the Scavenge algorithm, whose implementation is based on Cheney's algorithm.

  • Cheney's algorithm is a copying garbage collection algorithm. It splits the heap memory in two; only one half is in use at a time and the other is idle.
  • The space in use is called the From space, and the idle space is called the To space.
  • Objects are first allocated in the From space. When garbage collection begins, the live objects in the From space are identified and copied to the To space, and the space occupied by non-live objects is released.
  • After the copy is complete, the roles of the From and To spaces are swapped.
  • In a nutshell, garbage collection is done by copying live objects back and forth between the two spaces.

    The disadvantage of the Scavenge algorithm is that only half of the heap memory is usable. However, because it copies only live objects, and live objects are a small minority when lifetimes are short, it performs exceptionally well in terms of time efficiency.
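
The copy step described above can be sketched as follows. This is a simplified illustration of a Cheney-style copying pass, not V8's implementation: objects are plain records with a hypothetical refs list, and a Map plays the role of forwarding pointers.

```javascript
// Copy everything reachable from the roots out of From space into a fresh To space.
function scavenge(roots) {
  const toSpace = [];
  const forwarding = new Map(); // From-space object -> its To-space copy

  const copy = (obj) => {
    if (!forwarding.has(obj)) {
      // The copy's refs still point into From space until it is scanned.
      const clone = { name: obj.name, refs: obj.refs };
      forwarding.set(obj, clone);
      toSpace.push(clone);
    }
    return forwarding.get(obj);
  };

  const newRoots = roots.map(copy);
  // Cheney-style scan: To space doubles as the work queue.
  for (let scan = 0; scan < toSpace.length; scan++) {
    toSpace[scan].refs = toSpace[scan].refs.map(copy);
  }
  return { toSpace, roots: newRoots };
}

const a = { name: 'a', refs: [] };
const b = { name: 'b', refs: [] };
const dead = { name: 'dead', refs: [] };
a.refs.push(b);
b.refs.push(a);    // live cycle, reachable from the root
dead.refs.push(a); // references a live object but is itself unreachable

const fromSpace = [a, b, dead];
const result = scavenge([a]);
const survivorNames = result.toSpace.map((obj) => obj.name);
const reclaimedCount = fromSpace.length - result.toSpace.length;

console.log(survivorNames);  // [ 'a', 'b' ]
console.log(reclaimedCount); // 1
```

Dead objects are never visited at all; the old From space is simply abandoned as a whole, which is why the cost depends on the number of live objects only.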


Promotion

The above describes the pure Scavenge algorithm. In generational garbage collection, however, objects that survive in the From space are checked before being copied to the To space, and under certain conditions longer-lived objects are moved to the old generation instead. This process is known as object promotion.

There are two criteria for promoting an object. One is whether the object has already survived a Scavenge collection.

The other is whether more than 25% of the To space is already in use; if so, the object is promoted directly to the old generation space.
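
The two criteria can be written down as a small predicate. A sketch under stated assumptions: the survivedScavenge field, the shouldPromote function, and the byte figures are hypothetical names and values used only to illustrate the rule.

```javascript
// Hypothetical shape: obj.survivedScavenge is true once the object has
// already been copied by an earlier Scavenge pass.
const TO_SPACE_USAGE_LIMIT = 0.25;

function shouldPromote(obj, toSpaceUsed, toSpaceSize) {
  // Criterion 1: the object has already survived a Scavenge collection.
  if (obj.survivedScavenge) return true;
  // Criterion 2: more than 25% of the To space is already in use.
  return toSpaceUsed / toSpaceSize > TO_SPACE_USAGE_LIMIT;
}

const MB = 1024 * 1024;
const fresh = { survivedScavenge: false };
const veteran = { survivedScavenge: true };

console.log(shouldPromote(veteran, 1 * MB, 16 * MB)); // true
console.log(shouldPromote(fresh, 1 * MB, 16 * MB));   // false
console.log(shouldPromote(fresh, 5 * MB, 16 * MB));   // true
```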

V8 old generation algorithm (Mark-Sweep & Mark-Compact)

In the old generation, live objects make up a large share of the heap, so using Scavenge there would cause two problems.

  • First, there are many live objects, so copying them is inefficient.
  • Second, half of the space is still wasted. For these reasons, V8 mainly uses a combination of Mark-Sweep and Mark-Compact for garbage collection in the old generation.

Mark-Sweep

This algorithm has been mentioned before, but here it is again.

  • Unlike Scavenge, Mark-Sweep does not split memory in two, so no half of the space is wasted. In the marking phase, Mark-Sweep traverses all objects in the heap and marks the live ones; in the subsequent sweep phase, only unmarked objects are cleared.
  • That is, Scavenge only copies live objects, while Mark-Sweep only clears dead objects. Live objects make up only a small part of the new generation, and dead objects make up only a small part of the old generation, which is why both collection methods are efficient.
  • However, a big problem with this algorithm is memory fragmentation. When a large object needs to be allocated, no single fragment may be big enough to hold it, so a garbage collection is triggered early even though it would otherwise be unnecessary.
  • This is the motivation for the Mark-Compact algorithm.

Mark-Compact

After marking, Mark-Compact moves the live objects to one end of the memory space and then clears all memory beyond the resulting boundary.
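
The compaction step can be sketched on a toy heap. A simplified illustration, assuming the mark phase has already run: each slot is either null (free) or a record with a hypothetical marked flag, and the markCompact function name is invented for this example.

```javascript
// Slide marked (live) objects toward the start of the heap, then clear the rest.
function markCompact(heap) {
  let boundary = 0; // next slot a live object will move into
  for (let scan = 0; scan < heap.length; scan++) {
    const obj = heap[scan];
    if (obj !== null && obj.marked) {
      obj.marked = false; // reset the mark for the next collection
      heap[boundary] = obj;
      boundary++;
    }
  }
  heap.fill(null, boundary); // everything past the boundary is free
  return boundary;
}

const slot = (name, marked) => ({ name, marked });
const heap = [slot('A', true), slot('B', false), slot('C', true), null, slot('D', true)];
const boundary = markCompact(heap);
const layout = heap.map((obj) => (obj ? obj.name : '-')).join('');

console.log(layout, boundary); // ACD-- 3
```

After compaction the free memory is one contiguous block at the end, so a later large allocation no longer fails because of fragmentation.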


Memory problems

  1. Does the current Chrome browser still have memory leaks?
  2. What is the difference between a memory leak and an overflow?
  3. When does Chrome start memory reclamation?
  4. Surely reclaiming allocated memory is better than not reclaiming at all?

P.S. Please do not repost; this article is for study and discussion only.