JavaScript garbage collection mechanism

When learning about closures and using them, I often see one of the disadvantages of closures in various blogs:

  • Improperly used closures can cause memory leaks in IE(pre-IE9)

The garbage collection algorithm used by the JavaScript engine in IE9 is reference counting, which for circular references will cause the GC to fail to reclaim memory that “should be reclaimed”. This causes a senseless memory footprint, known as a memory leak.

A Memory Leak refers to a waste of system Memory caused by a dynamically allocated heap Memory that is not released or cannot be released by the program for some reason, slowing down the program and even crashing the system.

JS garbage collection mechanism

An entry flag is added when a variable enters the execution environment, and an exit flag is added when a variable leaves. Tag cleanup is when GC marks all variables at run time, then removes those variables that are still in the environment or are still referenced by variables in the environment, and clears any variables that are still marked.

So, I have the following questions:

  • What is a circular reference?
  • What is GC? How does it work?
  • Why will reference counting algorithms cause memory to fail to be freed?
  • How many garbage collection algorithms does JavaScript (or JavaScript engine) have?

What is the GC

GC stands for Garbage Collection. Garbage is unused memory space (which may have been used before and will never be used again). The GC is the garbage collector, and because it works inside the JavaScript engine, it works “somewhat” silently for us front-end developers (note the quotes here).

Essential basic concepts

  • A HEAP is a memory space that holds objects dynamically, and objects are reference types in JavaScript.
  • Mutator is an obscure term that stands for the application itself in GC, so let’s just say that mutator requires a lot of memory.
  • Allocator, where the Mutator submits requests that require memory. Allocator is responsible for allocating enough memory from the heap for the Mutator to use.

  • Active/inactive objects: Represents objects referenced by mutator, for example
var a = { name: "bar" }; // 'this object' is referenced by A and is an active object.

a = null; // 'this object' is not referenced by A. This object is an inactive object.
Copy the code

Several commonly used GC algorithms

Reference counting method

As the name suggests, let all object implementations keep track of how many “programs” refer to them, and let each object know its “popularity index.” Here’s a simple example:

var a = new Object(a);// the reference count for 'this object' is 1.
var b = a; // The reference count for 'this object' is 2 (a,b)
a = null; // reference_count = 1
b = null; // reference_count = 0
// The next GC is to reclaim 'this object'
Copy the code

This approach has advantages and disadvantages:

advantage

  • Garbage can be collected immediately, and when the referenced value is 0, the object immediately connects itself to the free list as free space, i.e. It is recycled as soon as it becomes garbage.
  • Since it is an instant collection, the ‘program’ does not pause to use GC alone for a long period of time, so the maximum pause time is short.
  • You don’t have to go through all the live and inactive objects in the heap.

disadvantage

  • Counters need to be large because you can’t predict the upper limit of references. For example, if you have 32 bits (2 to the 32nd) of objects referencing an object at the same time, then the counter needs 32 bits.
  • The biggest disadvantage is that you can’t solve the problem that circular references can’t be recycled, which is what happened in IE9 before
function f() {
  var o = {};
  var o2 = {};
  o.a = o2; // o references o2, whose number of references is 1
  o2.a = o; // o2 refers to o, and the reference to o is 1

  return "azerty";
}

f();
Copy the code

Fn is supposed to reclaim the memory in fn’s scope after execution, but because o has a property that references O2, the number of references to O2 is always 1, as is O2, and it is not intended to be used as a closure, it should be destroyed. Because the algorithm destroys objects with zero references, none of which are zero here, the GC will not reclaim them, and this is a memory leak.

This algorithm has gradually been replaced by the ‘mark-clear’ algorithm, which is the most used in V8 engines

Mark clearing algorithm

The GC garbage collection process is divided into two phases

  • Mark phase: Mark all active objects.
  • Cleanup phase: Destroy unmarked (and therefore inactive) objects.

Mark phase

Root can understand our global scope, the GC from the global scope of variables, the scope step by step to traverse (yes, depth traversal), when the traverse to the heap object that the object is referenced, and play a tag, continue to the recursive traversal (because certainly exist in the heap object references another object in the heap). Walk until you reach the last (deepest level of scope) node.

Sweep phase

Iterate again, this time through the heap, retrieving unmarked objects.

Advantage:

  • The realization of simple, marked that is to play or not play two possibilities, so a binary bit can be expressed
  • Fixed the circular reference problem

disadvantages

  • Cause fragmentation (somewhat similar to disk fragmentation)
  • If a block size is never found, the free linked list (the list of addresses that hold all the free address Spaces in the heap) is traversed all the way to the end

Replication algorithm

The copy algorithm is very simple to understand with this diagram. It simply copies live objects in one space to other Spaces.

Divide a memory space into two parts, one is From space, the other is To space, copy the live objects in the From space To the To space, then free the whole From space, and then swap the identity of the From space and To space at the moment, and then complete a GC.