This is the 11th day of my participation in the August Wenwen Challenge.More challenges in August

The garbage collection

Trash talk: the Orinoco garbage collector · V8

Raw data is stored in stack space, and reference data is stored in heap space, which solves the memory allocation problem of data.

However, some data is no longer needed after being used. This kind of data is called garbage data. If these data are kept in memory all the time, the memory will occupy more and more space.

C and C++ can actively allocate and destroy memory through malloc and free. Forgetting to free a piece of data that is no longer needed can cause a memory leak.

JavaScript, Java, Python, and others use automatic garbage collection strategies.

How is the data in the call stack recycled?

Analyze the collection mechanism through code

function foo(){
    var a = 1
    var b = {name:Geek Bang}
    function showName(){
      var c = 2
      var d = {name:"Geek Time"}
    }
    showName()
}
foo()
Copy the code

When showName is executed, the stack and heap are called as follows

You can see that the primitive type data is stored in the execution context, while the reference type is actually stored in the heap, just pointing from the stack to the corresponding object in the heap.

There is also the concept of having a pointer to the current execution state, called ESP, that points to the execution context of the currently executing function. For example, when you finish executing showName, ESP will move down to foo

The execution context for showName is still stored in stack memory, but it is invalid. For example, if foo calls another function inside, the void memory block will be overwritten and used to store the execution context of the other function.

Conclusion: When a function finishes executing, the JavaScript engine destroys its execution context on the stack by moving ESP down.

How is data reclaimed from the heap?

When foo completes execution, ESP should point to the global execution context. Although the showName and Foo execution context are invalid, the two objects stored in the heap are still occupying memory.

At this point, it becomes obvious that we need to reclaim these two occupied Spaces in the heap, which is where the JavaScript garbage collector comes in.

Intergenerational hypothesis

The first thing you need to understand is what The Generational Hypothesis is, which is a big term in garbage collection, and that’s The basis for all of our garbage collection strategies.

The intergenerational hypothesis has the following two characteristics:

  1. Most objects have very short events in memory. In short, many objects become inaccessible as soon as they are allocated memory
  2. Undead objects live longer

These two features apply not only to JavaScript, but also to most dynamic languages, such as Java, Python, etc

How does V8 implement garbage collection

In V8, the heap is divided into new generation and old generation, with the new generation storing short-lived objects and the old generation storing long-lived objects.

The Cenozoic zone usually supports only 1 to 8M capacity, while the old generation zone supports much more capacity.

For the new generation and the old generation, the V8 engine adopts two different garbage collection periods in order to achieve more efficient garbage collection.

  • Secondary garbage collector, mainly responsible for the new generation of garbage recycling
  • Main garbage collector, mainly responsible for old generation garbage collection

The garbage collector workflow

Regardless of the type of garbage collector, it’s the same set of execution processes.

  1. Marks active and inactive objects in the space. (An active object is an object that is still in use, and an inactive object is an object that no one references.)
  2. Reclaim the memory occupied by the inactive objects, that is, all the memory marked as inactive objects are cleaned up uniformly after the marking is complete.
  3. Do memory defragment. Why do you want to do memory defragment? After objects are reclaimed frequently, a large amount of discontinuous space will be generated in memory, and these discontinuous space is called memory fragmentation. When there is a lot of fragmentation in memory, it is possible to run out of memory if you want to allocate a large amount of memory. To avoid running out of memory, you need to defragment. This step is optional, however, because some garbage collectors do not generate memory fragmentation, such as the paralgarbage collector.

Secondary garbage collector

Mainly responsible for garbage recycling in the new generation area. This is where most of the small objects are allocated, and although the area is small, garbage collection is frequent.

The Scavenge algorithm is used.

Principle:

1. Divide the Cenozoic Space into two halves, one half is from-space and the other half is to-space, which means that half of the Space will always be empty.

New objects are stored in from-space. When from-space is almost full, a garbage cleanup operation is performed.

For the first garbage collection, from-space is garbage-marked, and active objects are sorted into to-space.

3, empty from-space, then flip the two Spaces, convert the empty from-space To to-space, and convert the to-space with live objects To from-space. When second garbage collection (that is, From – Space fast full) To find just that four objects still alive, the first four (4) and a new active objects into the To area, then he lived for two of the four piece because object promotion strategy is handed over To the old living area, at this time – Space is only new To join the activities of the object, Next time, there is only one active object in from-space. Then, when from-space is almost full, the garbage collection cycle begins.

Object promotion strategy: Objects that survive two garbage collections are moved to the old area.

Main garbage collector

The garbage collector is mainly responsible for the recycling of the old area, in addition to from the new promotion to the object of the old area, some big objects will be assigned to the old living area directly, so the old characteristics of the object is the object takes up the space is large | | object long survival time = = = true

The main garbage collector uses a Mark-sweep algorithm for garbage collection.

  1. Mark phase

    Starting with a set of root elements (call stack and global objects), the set of root elements is recursively traversed. In this traversal, elements with variable Pointers are called active objects, and objects without variable Pointers are considered junk data

For example, after executing showName in foo, ESP moves down, pointing to foo’s execution context, and when traversing the call stack, ESP finds no variable that refers to 1003, so 1003 is marked as garbage, while 1050 is pointed to by B, and therefore is an active object.

  1. Garbage removal phase

    Refer to the below

Remove data section and vice garbage collector is different, because it is operated in a space, therefore is to remove the garbage data, but if in a block of memory to be executed multiple times tags – clear algorithm, can produce a large number of discontinuous fragments of the memory, this may lead to object cannot allocate enough memory in a row, Hence another algorithm, the Mark-compact algorithm, which essentially copies the living blocks to the unoccuated ones, thus making full use of the gaps between different memory blocks

The potential problem is that when we allocate a lot of long-lived objects, we pay a high price for copying them. That’s why we chose to mark-de-clutter only some highly fragmented pages and mark-de-clutter only others, rather than all mark-de-clutter.

The pause

Because JavaScript runs on the main thread, once the garbage collection algorithm is executed, it is necessary to pause the JavaScript script that is being executed and resume the script execution after garbage collection is completed. This behavior is called total pause.

Several ways to handle total pauses

  • The Parallel Parallel

  • Incremental Incremental

  • Concurrent secondary threads run concurrently with the main thread

V8 currently uses cross-helper thread allocation in the new generation of secondary garbage collection

Concurrency tokens are used in the main garbage collector, the old generation

Free time to handle garbage collection tasks

GC can publish “idle tasks”, like Chrome has the concept of idle time, which means that Chrome has 60 frames per second and the browser has 16.6 milliseconds to render each frame of animation. If each frame is rendered early, Chrome can choose to run the idle tasks created by GC between that frame and the next frame.