Introduction

This article introduces the garbage collection strategies used in JavaScript, the optimizations that Google's V8 engine applies to garbage collection, and some common operations that cause memory leaks. I hope it helps.

What is garbage collection

GC stands for Garbage Collection. A running program produces a lot of garbage: memory that is not in use, or memory that was used earlier and will never be used again.

Not all languages have GC. Most high-level languages do, such as Java, Python, and JavaScript. Others, such as C and C++, do not, so memory has to be managed manually, which is more troublesome.

How garbage is produced and how it is identified

Garbage is simply memory that is not used by the program or that has been used before and will not be used again.

How do we know whether an object is still needed? The technical term is reachability: we judge by whether the object can be reached. Reachable objects do not need to be collected; unreachable objects do.

Let me give you a quick example

let test = {name: 'randy'}

test = [1, 2, 3]

JS data types are divided into primitive types and reference types. For a reference type, the variable on the stack holds only a reference; the value itself is stored in the heap.

In the example above, the variable test first points to the object {name: 'randy'} and is then reassigned to the new array [1, 2, 3]. The object {name: 'randy'} can no longer be accessed (it is unreachable), so it becomes garbage.

Why collect garbage

As the example shows, garbage wastes memory. One or two such objects do no harm, but if too much accumulates the program can become slower and slower and may eventually crash.

Garbage collection therefore cleans up this garbage automatically, freeing memory for the running program so that it keeps running smoothly.

Garbage collection strategies

The two most common garbage collection strategies are mark-and-sweep and reference counting.

Mark-and-sweep

Mark-and-sweep is currently the most commonly used algorithm in JavaScript engines. Most browser JavaScript engines use it, although the major browser vendors have each optimized the algorithm, and engines in different browsers differ in how often they run garbage collection.

As its name suggests, the algorithm has two phases: a mark phase, which marks all live objects, and a sweep phase, which destroys unmarked (and therefore inactive) objects.

To mark, the engine has to traverse the objects in memory from a starting point. There are many such starting points, known collectively as the set of root objects. In a browser environment, the root objects include, but are not limited to, the global window object and the document DOM tree.

Mark-and-sweep procedure

The general process of the mark-and-sweep algorithm is as follows:

  1. When it runs, the garbage collector puts a mark on every variable in memory.
  2. It then traverses from each root object and removes the mark from every variable that is reachable.
  3. Variables that are still marked afterwards are treated as ready for deletion: they are cleaned up, and the memory they occupy is reclaimed.
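
The procedure can be sketched on a toy heap. This is only an illustrative model, not how a real engine works: the object shape (name, refs, marked) and the helper names are assumptions made for the example, and it uses the common formulation in which reachable objects are flagged and unflagged ones are swept, which has the same effect as the steps above.

```javascript
// Toy heap: each "object" records which other objects it references.
// This is an illustrative model, not how a real engine lays out memory.
function createObject(name) {
  return { name, refs: [], marked: false };
}

// Mark phase: starting from the roots, flag everything that can be reached.
function mark(roots) {
  const stack = [...roots];
  while (stack.length > 0) {
    const obj = stack.pop();
    if (obj.marked) continue; // already visited (this also handles cycles)
    obj.marked = true;
    stack.push(...obj.refs);
  }
}

// Sweep phase: keep flagged objects, drop the rest, and reset the flags.
function sweep(heap) {
  const survivors = heap.filter((obj) => obj.marked);
  survivors.forEach((obj) => { obj.marked = false; });
  return survivors;
}

const a = createObject('a');
const b = createObject('b');
const c = createObject('c');
const d = createObject('d');
a.refs.push(b); // a -> b
c.refs.push(d); // c -> d and d -> c: a cycle that no root can reach
d.refs.push(c);

const heap = [a, b, c, d];
const roots = [a];

mark(roots);
const liveNames = sweep(heap).map((obj) => obj.name);
console.log(liveNames); // [ 'a', 'b' ]
```

Note that c and d reference each other but are still swept, because neither can be reached from a root.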

Advantages of mark-and-sweep

The mark-and-sweep algorithm is relatively simple to implement: an object is either marked or not, so a single bit is enough to record its state. It also executes efficiently.

Disadvantages of mark-and-sweep

After a sweep, the surviving objects stay where they were in memory, so the free space ends up scattered and the memory becomes fragmented (pictured). Because the remaining free memory is not one contiguous block but a list of blocks of different sizes, memory allocation becomes a problem.

Assume that a new object needs size units of memory. Since the free memory is not contiguous, the free list has to be traversed in one direction to find a block greater than or equal to size before the memory can be allocated (as shown in the figure below).

So how do we find a suitable block? There are three allocation strategies:

  • First-fit: return the first block found that is greater than or equal to size
  • Best-fit: traverse the entire free list and return the smallest block that is greater than or equal to size
  • Worst-fit: traverse the entire free list, find the largest block, split it into two parts, one of which is exactly size, and return that part

Of the three strategies, worst-fit appears to use space most reasonably, but in practice it produces more small blocks and therefore more fragmentation, so it is not recommended. Between first-fit and best-fit, first-fit is the smarter choice when allocation speed and efficiency are considered.
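
To make the difference between the three strategies concrete, here is a minimal sketch that only picks a block from a free list. The free list is modelled as an array of block sizes in hypothetical units, and the function names are assumptions for illustration; splitting the chosen block is left out.

```javascript
// A free list modelled as an array of block sizes (hypothetical units).
const freeList = [8, 32, 12, 64, 16];

// First-fit: return the index of the first block that is large enough.
function firstFit(blocks, size) {
  return blocks.findIndex((block) => block >= size);
}

// Best-fit: scan everything and return the smallest block that is large enough.
function bestFit(blocks, size) {
  let best = -1;
  blocks.forEach((block, i) => {
    if (block >= size && (best === -1 || block < blocks[best])) best = i;
  });
  return best;
}

// Worst-fit: scan everything and return the largest block, if it is large enough.
function worstFit(blocks, size) {
  let worst = -1;
  blocks.forEach((block, i) => {
    if (block >= size && (worst === -1 || block > blocks[worst])) worst = i;
  });
  return worst;
}

const size = 10;
console.log(firstFit(freeList, size)); // 1 (the 32-unit block)
console.log(bestFit(freeList, size));  // 2 (the 12-unit block)
console.log(worstFit(freeList, size)); // 3 (the 64-unit block)
```

All three return -1 when no block is large enough. First-fit can stop early, while the other two always walk the whole list.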

In summary, the mark-and-sweep algorithm has two obvious drawbacks:

  • Memory fragmentation: free blocks are not contiguous, so there can be many free blocks and yet no single block large enough when a large object has to be allocated.
  • Slow allocation: even with the first-fit strategy, allocation is an O(n) operation, and the worst case traverses the whole list every time. Because of fragmentation, large objects are allocated even more slowly.

Reference counting

Reference counting is an older garbage collection algorithm. It reduces the question of whether an object is still needed to the question of whether any other object references it. If nothing references the object (zero references), the garbage collector reclaims it. This algorithm is rarely used today because it has a number of problems.

Reference counting procedure

Reference counting keeps track of how many times each value is referenced. When a variable is declared and a reference type value is assigned to the variable, the number of references to the value is 1.

If the same value is assigned to another variable, the number of references to the value is increased by one. Conversely, if the variable containing a reference to the value changes the reference object, the number of references to the value is reduced by one.

When the number of references to this value goes to zero, there is no way to access the value, and the memory space it occupies can be reclaimed.

This way, the next time the garbage collector runs, it frees the memory occupied by values that are referenced zero times.

Let’s look at an example

let name1 = { name: 'randy' }; // count == 1
let name2 = name1;             // count == 2
name2 = null;                  // count == 1
name1 = null;                  // count == 0, the object is reclaimed

Advantages of reference counting

With reference counting, an object is reclaimed the moment its count reaches zero, that is, the moment it becomes garbage, so garbage is collected immediately. The mark-and-sweep algorithm, by contrast, has to run periodically, and the application thread must be paused while a GC runs.

In addition, mark-and-sweep has to traverse both active and inactive objects in the heap, whereas reference counting only has to act when a reference count reaches zero.

Disadvantages of reference counting

First, it needs a counter for every referenced value, and these counters take up a lot of space.

Second, circular references cannot be collected.

function cycle(){
  const obj1 = {};
  const obj2 = {};
  obj1.a = obj2;
  obj2.a = obj1;
}
cycle();

Once cycle() has returned, obj1 and obj2 are no longer needed and their memory should be freed. But because they reference each other, neither reference count ever drops to zero, so under reference counting the memory is never reclaimed. This is a circular reference.
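
The effect can be reproduced with hand-maintained counters. The sketch below is a toy model under stated assumptions: createCounted, addRef and release are hypothetical helpers, and the counts are updated manually, whereas a real reference-counting collector would update them on every assignment.

```javascript
// Toy reference-counted objects. The counts are maintained by hand here;
// a real collector would update them automatically on every assignment.
function createCounted(name) {
  return { name, count: 0, fields: {} };
}

function addRef(obj) {
  obj.count += 1;
}

function release(obj, freed) {
  obj.count -= 1;
  if (obj.count === 0) {
    freed.push(obj.name);
    // Freeing an object also releases everything it referenced.
    Object.values(obj.fields).forEach((child) => release(child, freed));
  }
}

const freed = [];

// Plain case: one variable holds the object, then lets go of it.
const solo = createCounted('solo');
addRef(solo);         // a variable points at solo  -> count 1
release(solo, freed); // the variable is set to null -> count 0, freed

// Cyclic case, mirroring cycle(): obj1.a = obj2 and obj2.a = obj1.
const obj1 = createCounted('obj1');
const obj2 = createCounted('obj2');
addRef(obj1); // local variable obj1
addRef(obj2); // local variable obj2
obj1.fields.a = obj2; addRef(obj2);
obj2.fields.a = obj1; addRef(obj1);
release(obj1, freed); // the function returns: local variable obj1 goes away
release(obj2, freed); // the function returns: local variable obj2 goes away

console.log(freed);                  // [ 'solo' ]
console.log(obj1.count, obj2.count); // 1 1
```

Both counts are stuck at 1 even though nothing outside the pair can reach them. A tracing collector such as mark-and-sweep does not have this problem, because it asks whether an object is reachable from a root rather than whether its count is zero.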

V8 engine garbage collection

The V8 engine takes garbage collection a step further. It uses a generational garbage collection mechanism that divides objects into a new generation and an old generation and applies a different collection strategy to each.

Objects in the new generation are short-lived; objects in the old generation are long-lived or stay resident in memory.

Generational memory

By default, the memory size of the new generation is 16MB and the memory size of the old generation is 700MB in the 32-bit system. The memory size of the new generation is 32MB and the memory size of the old generation is 1.4GB in the 64-bit system.

The new generation is split evenly into two memory spaces, called semispaces, each of 8MB (32-bit) or 16MB (64-bit).

New generation garbage collection

The new generation is collected with an algorithm called Scavenge, which uses a copying approach based on Cheney's algorithm.

Cheney's algorithm splits the heap into two parts: the space currently in use, which we will call the used area, and the idle space, which we will call the free area.

Newly created objects are stored in the used area. When the used area is nearly full, a garbage collection is performed.

When collection starts, the new generation collector marks the active objects in the used area. The marked objects are then copied into the free area and laid out in order. Next comes the cleanup phase, in which the space occupied by inactive objects is cleared. Finally the two areas swap roles: the original used area becomes the free area, and the original free area becomes the used area.

Why swap the used area and the free area? So that newly created objects are always stored in the used area while the free area always stays empty.

Scavenge is very time-efficient, because objects in the new generation are short-lived and Scavenge only copies the objects that survive. Its drawback is that only half of the new generation's heap memory can be in use at any time.
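
The copy-and-swap cycle can be sketched as follows. This is a toy model, not V8's implementation: the two areas are plain arrays, the scavenge function name is an assumption, and a live flag on each record stands in for the result of the marking step.

```javascript
// Toy new generation with two semispaces. Each record has a "live" flag
// standing in for the result of a reachability check.
function scavenge(newGen) {
  // Copy only the surviving objects into the free area, packed together.
  for (const obj of newGen.used) {
    if (obj.live) newGen.free.push(obj);
  }
  // Everything left behind in the used area is garbage: drop it in one go.
  newGen.used = [];
  // Swap roles: the area holding the survivors becomes the used area.
  [newGen.used, newGen.free] = [newGen.free, newGen.used];
}

const newGen = {
  used: [
    { id: 1, live: true },
    { id: 2, live: false },
    { id: 3, live: true },
    { id: 4, live: false },
  ],
  free: [],
};

scavenge(newGen);
console.log(newGen.used.map((o) => o.id)); // [ 1, 3 ]
console.log(newGen.free.length);           // 0
```

The cost of a collection depends on how many objects survive, not on how much garbage there is, which is why the approach suits a generation where most objects die young.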

Promotion

The process of moving objects from the new generation to the old generation is called promotion.

There are two main conditions for the promotion of objects:

  1. When an object is about to be copied from the used area to the free area, its memory address is checked to determine whether it has already survived a Scavenge. If it has, the object is moved from the used area directly to the old generation; if not, it is copied to the free area. In short, an object that would be copied from the used area to the free area for a second time is moved to the old generation instead.

  2. When an object is copied from the used area to the free area and the free area is already more than 25% full, the object is promoted directly to the old generation. The reason for the 25% threshold is that once the Scavenge completes, the free area becomes the used area and subsequent allocations take place there. If it is already too full, those allocations will be affected.
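
The two conditions can be written as a single decision function. This is a hedged sketch of the rule as described above, not V8 code: destinationFor, survivedBefore, freeAreaUsed and freeAreaSize are hypothetical names chosen for the example.

```javascript
// Hypothetical helper: decide where a surviving object goes during a Scavenge.
// survivedBefore, freeAreaUsed and freeAreaSize are illustrative names.
const PROMOTION_THRESHOLD = 0.25; // the 25% limit described above

function destinationFor(obj, freeAreaUsed, freeAreaSize) {
  // Condition 1: the object has already survived one Scavenge.
  if (obj.survivedBefore) return 'old generation';
  // Condition 2: the free area is already more than 25% full.
  if (freeAreaUsed / freeAreaSize > PROMOTION_THRESHOLD) return 'old generation';
  return 'free area';
}

const MB = 1024 * 1024;
console.log(destinationFor({ survivedBefore: true }, 1 * MB, 16 * MB));  // old generation
console.log(destinationFor({ survivedBefore: false }, 6 * MB, 16 * MB)); // old generation
console.log(destinationFor({ survivedBefore: false }, 2 * MB, 16 * MB)); // free area
```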

Old generation garbage collection

In the old generation, surviving objects make up a large proportion of the heap, so using the Scavenge algorithm there would cause two problems:

  1. Because there are so many live objects, copying live objects is inefficient.
  2. The Scavenge algorithm wastes half of the memory, and since the old generation takes up much more of the heap than the new generation, the waste would be significant.

As a result, V8 uses a combination of Mark-Sweep and Mark-Compact for garbage collection in the old generation.

Mark-Sweep is the mark-and-sweep collection discussed earlier, together with its memory fragmentation and slow allocation problems. Mark-Compact exists to address them.

Mark-Compact means mark and compact. It moves the surviving objects (those that do not need to be cleaned up) toward one end of memory, so that they are contiguous and fragmentation disappears.
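
The compaction step can be illustrated on an array of slots. This is a simplified sketch under stated assumptions: memory is modelled as an array in which null is a free slot, each object carries a marked flag set by an earlier mark phase, and the compact function name is hypothetical. A real collector also has to update every pointer to a moved object, which is omitted here.

```javascript
// Toy old-generation memory: an array of slots, where null is a free slot and
// each object carries a "marked" flag set by an earlier mark phase.
function compact(memory) {
  let next = 0; // next slot at the start of memory to fill
  for (let i = 0; i < memory.length; i++) {
    const obj = memory[i];
    memory[i] = null; // vacate the slot; unmarked objects are reclaimed here
    if (obj !== null && obj.marked) {
      memory[next] = obj; // slide the live object toward the start
      next += 1;
    }
  }
  return next; // everything from this index on is one contiguous free block
}

const memory = [
  { id: 'A', marked: true },
  { id: 'B', marked: false },
  null,
  { id: 'C', marked: true },
  { id: 'D', marked: false },
  { id: 'E', marked: true },
];

const firstFree = compact(memory);
const layout = memory.map((slot) => (slot ? slot.id : '-')).join(' ');
console.log(layout);    // A C E - - -
console.log(firstFree); // 3
```

After compaction the free space is a single block, so a new allocation no longer needs to search a free list.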

Summary

V8's generational mechanism treats new, small, short-lived objects as the new generation and cleans them up quickly with the Scavenge algorithm, while large, old, long-lived objects belong to the old generation and are collected with a combination of Mark-Sweep and Mark-Compact. This greatly improves the efficiency of garbage collection.

Memory leaks

The engine has been optimized, but that does not mean we can ignore garbage collection entirely. Our code should still avoid operations that make collection harder for the engine, because not every piece of unneeded memory can be reclaimed. When memory that is no longer used is not released in time, we call it a memory leak.

Let’s talk about common memory leaks.

Improper closures

Closures are defined differently in different sources. As the author understands it, a closure arises when a function returns a new function that uses local variables of the outer function.

For example

function say(){
  const name = 'randy'
  return function(){
    return name
  }
}
let newSay = say()
newSay()

The say function above returns a new function that uses name, a local variable of the outer function, so a closure is created. As long as newSay holds that function, name cannot be released, which results in a memory leak.

So what's the solution?

We just need to set the variable to null once we have finished calling the function.

newSay = null;

Unexpected global variables

Local variables in a function are no longer needed once the function has finished executing, so the garbage collector recognizes and frees them. With global variables, however, the garbage collector has a hard time determining when they are no longer needed, so global variables are usually not collected. Using global variables is fine, but we should avoid creating unintended ones.

Let’s look at the following example

function fn(){
  // The implicit global variable test1 is created because it is assigned without a declaration
  test1 = {name: 'randy'}

  // In non-strict mode, this inside the function points to window, creating the implicit global variable test2
  this.test2 = {name: 'randy2'}
}

fn()

Detached DOM references

In everyday development we often cache references to DOM nodes in variables. When a node is removed, the cached reference should be released at the same time; otherwise the detached subtree cannot be freed.

let root = document.querySelector('#root')
let div1 = document.querySelector('#div1');

root.removeChild(div1)

The variable holding the DOM node must also be set to null before the node can be collected.

div1 = null

Uncleared timers

We often use setTimeout and setInterval, but do you clear the timer every time you are done with it? A timer that is never cleared can also cause a memory leak.

let timer = setTimeout(() => {
  console.log('randy')
}, 1000)

let inter = setInterval(() => {
  console.log('randy')
}, 1000)

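We must clear them after use. Setting the variable to null alone does not stop the timer:

clearTimeout(timer);
clearInterval(inter);
timer = null;
inter = null;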

The same applies to requestAnimationFrame in the browser: call cancelAnimationFrame when the animation is no longer needed.

Uncleaned event listeners

We often use addEventListener to listen for events, but do we call removeEventListener once we no longer need to listen?

const say = () => {console.log('randy')}
window.addEventListener("resize", say)

Be sure to use removeEventListener when you’re done with it.

window.removeEventListener("resize", say)

In Vue we also use an eventBus for event propagation.

eventBus.on("say", say)

We must remember to call the off method to remove the listener.

eventBus.off("say", say)
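
To see why off matters, consider what a bus has to store internally. The class below is a minimal, hypothetical event bus written for this illustration (TinyEventBus and its count method are assumptions, not Vue's or any library's actual API). It shows that a registered handler stays referenced by the bus until off is called.

```javascript
// A minimal, hypothetical event bus (not Vue's or any library's actual API).
class TinyEventBus {
  constructor() {
    this.listeners = new Map(); // event name -> Set of handlers
  }

  on(event, handler) {
    if (!this.listeners.has(event)) this.listeners.set(event, new Set());
    this.listeners.get(event).add(handler);
  }

  off(event, handler) {
    const handlers = this.listeners.get(event);
    if (!handlers) return;
    handlers.delete(handler);
    if (handlers.size === 0) this.listeners.delete(event);
  }

  // Number of handlers the bus is still holding for an event.
  count(event) {
    const handlers = this.listeners.get(event);
    return handlers ? handlers.size : 0;
  }
}

const bus = new TinyEventBus();
const onSay = () => 'randy';

bus.on('say', onSay);
const countWhileSubscribed = bus.count('say');

bus.off('say', onSay); // the bus no longer references onSay
const countAfterOff = bus.count('say');

console.log(countWhileSubscribed, countAfterOff); // 1 0
```

Anything the handler closes over stays reachable for as long as the bus holds the handler, so a long-lived bus with forgotten handlers keeps that memory alive.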

Uncleared console output

Can the console cause memory leaks too?

We can inspect data in the console because the browser keeps a reference to every object we log. That is why console output that is never cleaned up also causes a memory leak when it logs objects.

So we usually use a plugin to strip console calls when building for production.

Uncleaned Map and Set

With ES6 now in wide use, we may well use its Map and Set. Like Object, Map and Set hold strong references, so objects stored in a Map or Set are not garbage collected while the collection itself is alive.

This is why WeakSet and WeakMap exist. A WeakSet holds its members weakly, and a WeakMap holds its keys weakly. If an object is referenced only by weak references, it is considered unreachable (or weakly reachable) and may be reclaimed at any time; the weak reference does not interfere with garbage collection.

In simple terms, once an object is no longer referenced by anything else, the garbage collector automatically reclaims its memory, regardless of whether the object is still in a WeakSet or WeakMap.
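
A short sketch of the difference follows. The variable names (metadata, user, strong, other) are made up for the example. Garbage collection itself cannot be observed deterministically from script, so the code only shows what each collection retains, not the moment of collection.

```javascript
// Attach metadata to objects without keeping those objects alive.
const metadata = new WeakMap();

let user = { name: 'randy' };
metadata.set(user, { visits: 1 });

const hasWhileReferenced = metadata.has(user);
const visits = metadata.get(user).visits;
console.log(hasWhileReferenced, visits); // true 1

// Dropping the last strong reference makes the key eligible for collection.
// When (or whether) the engine actually reclaims it is not observable here.
user = null;

// A Map, by contrast, holds its keys strongly until they are deleted.
const strong = new Map();
let other = { name: 'randy2' };
strong.set(other, { visits: 1 });
other = null;
console.log(strong.size); // 1: the key object is still retained by the Map
```

With the Map, the entry has to be removed explicitly with delete or clear before the key object can be collected.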


Afterword

This article is the author's personal study notes. If you find any mistakes, please let me know. Thank you! If this article has been helpful to you, please click the "like" button. Your support is my motivation to keep updating.