As Park said, Node is very sensitive to memory leaks. Once an online application serves tens of thousands of requests, even a one-byte leak adds up. Garbage collection takes longer to scan objects, and the application responds more and more slowly until the process runs out of memory and crashes.

I have known for a long time that memory problems should not be ignored, but I never hit a performance bottleneck in day-to-day development. That changed recently when I worked on a marketing project with millions of page views (PV), where the number of visits pushed concurrency up by an order of magnitude. Small problems that normally go unnoticed were magnified, and memory problems became visible. Because Node is so sensitive to memory leaks, I went back to brush up on how V8 manages memory. So, how does memory management work in V8?

V8’s memory mechanism

Memory limits

Unlike other backend languages, Node limits how much memory a process can use. Only part of the system memory is available to a Node process: about 1.4 GB on a 64-bit system and 0.7 GB on a 32-bit system. This is because Node runs on the V8 engine, which was originally built for the browser.

V8 was originally designed to run in a browser, where this limit is more than enough for everything a front-end page needs.

Servers do not commonly need to work with large amounts of memory, but the limit can be raised when they do. When you start a Node program, you can pass two flags to adjust the memory limit.

node --max-new-space-size=1024 app.js  # unit: KB
node --max-old-space-size=2000 app.js  # unit: MB

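These two flags correspond to the “new generation” and the “old generation” of the Node heap, respectively. Note that --max-new-space-size dates from older Node releases; recent releases expose --max-semi-space-size for the new generation instead.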
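To check which limit is actually in effect, one option is to read V8’s heap statistics from inside the process. The sketch below uses Node’s built-in v8 module; the file name and the toMB helper are only illustrative.

```javascript
// check-heap-limit.js (hypothetical file name)
const v8 = require('v8');

// Convert bytes to megabytes for readability.
function toMB(bytes) {
  return Math.round(bytes / 1024 / 1024);
}

const stats = v8.getHeapStatistics();
const heapLimitMB = toMB(stats.heap_size_limit);
const heapUsedMB = toMB(stats.used_heap_size);

console.log('heap size limit: ' + heapLimitMB + ' MB');
console.log('heap used:       ' + heapUsedMB + ' MB');
```

Running it with and without --max-old-space-size=2000 should report different limits, which is a quick way to confirm that the flag was picked up.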

A special case not bound by the memory limit

In Node, large files that exceed the V8 memory limit can still be read using Buffer. The reason is that Buffer objects, unlike other objects, are not allocated through V8’s memory allocation mechanism. Node differs from the browser here: in a browser, plain JavaScript string handling covers most business needs, while Node has to handle network and file I/O streams, and string manipulation alone falls far short of the performance that data transfer requires.
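A small experiment can make this visible. The sketch below allocates a Buffer and compares process.memoryUsage() before and after; the 64 MB size is arbitrary, and the exact numbers will vary between Node versions.

```javascript
const SIZE = 64 * 1024 * 1024; // 64 MB, an arbitrary size for illustration

const before = process.memoryUsage();
const buf = Buffer.alloc(SIZE); // backing store lives outside the V8 heap
const after = process.memoryUsage();

const heapGrowth = after.heapUsed - before.heapUsed;
const externalGrowth = after.external - before.external;

console.log('buffer length:   ' + buf.length);
console.log('heapUsed growth: ' + heapGrowth + ' bytes');
console.log('external growth: ' + externalGrowth + ' bytes');
```

The expectation is that the growth shows up under external rather than heapUsed, which is why Buffers are not counted against the V8 heap limit.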

Allocation of memory

All JavaScript objects are stored in the heap

When we declare variables and assign values in our code, the memory for the objects involved is allocated on the heap. If the free heap memory already reserved is not enough for a new object, V8 keeps requesting more heap memory until the heap size reaches V8’s limit.
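As a rough illustration of heap allocation, the following sketch creates a batch of ordinary objects and watches heapUsed grow. The object count and field names are made up for the example.

```javascript
const before = process.memoryUsage().heapUsed;

// Keep references in an array so the objects stay reachable.
const retained = [];
for (let i = 0; i < 200000; i++) {
  retained.push({ id: i, label: 'item-' + i });
}

const after = process.memoryUsage().heapUsed;
const growthMB = (after - before) / 1024 / 1024;

console.log('objects: ' + retained.length);
console.log('heapUsed grew by about ' + growthMB.toFixed(1) + ' MB');
```

Unlike the Buffer case, these objects are allocated on the V8 heap, so they count toward the limit described earlier.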

V8’s garbage collection mechanism

Generational garbage collection

V8’s garbage collection strategy is based on generational garbage collection, which divides the heap into a new generation (“New Space”) and an old generation (“Old Space”). Objects in the new generation are short-lived; objects in the old generation are long-lived or resident in memory. The --max-old-space-size flag sets the maximum size of old-generation memory, and the --max-new-space-size flag sets the size of new-generation memory.

Why the old and the new?

There are many garbage collection algorithms, but no single one suits every scenario. In practice, different algorithms are chosen according to object lifetime to get the best results. V8 therefore splits memory into generations based on how long objects live, and applies the algorithm best suited to each generation.

Recycling in the new generation

The Scavenge algorithm is used to recycle garbage in the new generation.

Scavenge

The Scavenge algorithm splits the new generation’s heap memory into two halves, each called a semispace. Only one of the two semispaces is in use at any time; the other is idle. The semispace in use is called the From space, and the idle one is called the To space. New objects are allocated in the From space. When garbage collection starts, the live objects in the From space are identified and copied to the To space, and the space occupied by dead objects is freed. Once copying is complete, the From space and the To space swap roles. In short, garbage collection is done by copying live objects between the two semispaces.
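The copying idea can be sketched with a toy model. Here each “object” is a plain { name, refs } record and the semispaces are arrays; none of this reflects V8’s real memory layout, and the function and variable names are invented for the example.

```javascript
// Toy model of a copying collection. Not V8's real implementation.
function scavenge(roots) {
  const toSpace = [];          // the idle semispace we copy into
  const forwarded = new Map(); // original object -> its copy in To space

  // Copy one object into To space unless it was already copied.
  function copy(obj) {
    if (!forwarded.has(obj)) {
      const clone = { name: obj.name, refs: obj.refs.slice() };
      forwarded.set(obj, clone);
      toSpace.push(clone);
    }
    return forwarded.get(obj);
  }

  const newRoots = roots.map(copy);

  // Cheney-style scan: toSpace doubles as the work queue.
  for (let scan = 0; scan < toSpace.length; scan++) {
    const obj = toSpace[scan];
    obj.refs = obj.refs.map(copy);
  }

  // Whatever was not copied is garbage; the caller drops From space
  // and treats toSpace as the new From space.
  return { roots: newRoots, space: toSpace };
}

const c = { name: 'c', refs: [] };
const b = { name: 'b', refs: [c] };
const a = { name: 'a', refs: [b, c] };
const garbage = { name: 'garbage', refs: [a] }; // nothing points to it
const fromSpace = [a, b, c, garbage];

const result = scavenge([a]);
const survivors = result.space.map((o) => o.name);
console.log(fromSpace.length + ' objects before, ' + survivors.length + ' after');
console.log(survivors); // [ 'a', 'b', 'c' ]
```

Note that the cost depends only on the number of live objects, which is why this approach suits a generation where most objects die young.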

How does an object move from the new generation to the old generation?

In the new generation, objects that survive long enough are moved to the old generation. This promotion happens when either of two conditions is met:

1. The object has already survived a Scavenge. When an object is about to be copied from the From space to the To space, its memory address is checked to determine whether it has already been through a Scavenge collection. If so, it is copied from the From space to the old generation instead.

2. The To space is more than 25% full. When an object is copied from the From space to the To space and more than 25% of the To space is already in use, the object is copied directly into the old generation. The reason is that after the Scavenge the To space becomes the From space, where subsequent allocations take place; if it is already too full, those allocations suffer.
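The two promotion conditions can be written down as a small decision function. This is only a sketch of the rule as described here: the survivedScavenge flag and the function name are hypothetical stand-ins for V8’s internal checks.

```javascript
const PROMOTION_THRESHOLD = 0.25; // the 25% limit described in the text

// obj.survivedScavenge is a hypothetical flag, not a real V8 field.
function destinationFor(obj, toSpaceUsed, toSpaceSize) {
  if (obj.survivedScavenge) {
    return 'old'; // condition 1: already survived one Scavenge
  }
  if (toSpaceUsed / toSpaceSize > PROMOTION_THRESHOLD) {
    return 'old'; // condition 2: To space is more than 25% full
  }
  return 'to'; // otherwise copy into To space as usual
}

const fresh = { survivedScavenge: false };
const veteran = { survivedScavenge: true };

console.log(destinationFor(fresh, 1, 8));   // 'to'  (12.5% used)
console.log(destinationFor(fresh, 3, 8));   // 'old' (37.5% used)
console.log(destinationFor(veteran, 1, 8)); // 'old'
```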

Garbage collection in old generation

Because live objects make up a large proportion of the old generation, the Scavenge algorithm is a poor fit there: copying that many objects is inefficient, and half of the space always sits idle. V8 therefore collects the old generation mainly with a combination of the Mark-Sweep and Mark-Compact algorithms.

Mark-Sweep

Mark-Sweep works in two phases: marking and sweeping. In the marking phase, all objects in the heap are traversed and the live ones are marked; in the sweeping phase that follows, only the unmarked objects are cleared.

The biggest problem with Mark-Sweep, however, is that memory becomes fragmented after a collection. When a large object then needs to be allocated, there may be no contiguous block big enough for it. This is where Mark-Compact comes in.
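A toy version shows both the two phases and the fragmentation that results. The heap here is just an array of slots where null means free; the object shapes and names are invented for the example and do not mirror V8’s real layout.

```javascript
// Toy mark-sweep over an array of slots. Not V8's real implementation.
function markSweep(heap, roots) {
  const marked = new Set();
  const stack = roots.slice();

  // Mark phase: follow references from the roots.
  while (stack.length > 0) {
    const obj = stack.pop();
    if (marked.has(obj)) continue;
    marked.add(obj);
    for (const ref of obj.refs) stack.push(ref);
  }

  // Sweep phase: free every slot whose object was not marked.
  for (let i = 0; i < heap.length; i++) {
    if (heap[i] !== null && !marked.has(heap[i])) heap[i] = null;
  }
  return heap;
}

const x = { name: 'x', refs: [] }; // unreachable
const y = { name: 'y', refs: [] }; // unreachable
const d = { name: 'd', refs: [] };
const b = { name: 'b', refs: [d] };
const a = { name: 'a', refs: [] };

const heap = markSweep([a, x, b, y, d, null], [a, b]);
const layout = heap.map((slot) => (slot ? slot.name : '-')).join(' ');
console.log(layout); // a - b - d -
```

Three slots are free afterwards, but no two of them are adjacent, so an object needing two contiguous slots could not be placed.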

Mark-Compact

Mark-Compact means “mark and compact” and is a variation of Mark-Sweep. After marking live objects, Mark-Compact moves them toward one end of the memory during compaction; when the move is complete, it frees all memory beyond the boundary in one step.
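The compaction step can be sketched on a toy heap of slots (an array where null means free). This is an illustration under simplified assumptions: because the toy uses ordinary JavaScript references, it skips the pointer-updating work a real collector has to do when objects move.

```javascript
// Toy mark-compact over an array of slots. Not V8's real implementation.
function markCompact(heap, roots) {
  const marked = new Set();
  const stack = roots.slice();

  // Mark phase: same as in mark-sweep.
  while (stack.length > 0) {
    const obj = stack.pop();
    if (marked.has(obj)) continue;
    marked.add(obj);
    for (const ref of obj.refs) stack.push(ref);
  }

  // Compact phase: slide live objects toward the start of the heap.
  let free = 0;
  for (let i = 0; i < heap.length; i++) {
    const obj = heap[i];
    if (obj !== null && marked.has(obj)) {
      heap[free] = obj;
      free++;
    }
  }

  // Everything past the boundary is reclaimed in one go.
  for (let i = free; i < heap.length; i++) heap[i] = null;
  return { heap: heap, boundary: free };
}

const x = { name: 'x', refs: [] }; // unreachable
const y = { name: 'y', refs: [] }; // unreachable
const d = { name: 'd', refs: [] };
const b = { name: 'b', refs: [d] };
const a = { name: 'a', refs: [] };

const result = markCompact([a, x, b, y, d, null], [a, b]);
const layout = result.heap.map((slot) => (slot ? slot.name : '-')).join(' ');
console.log(layout);          // a b d - - -
console.log(result.boundary); // 3
```

The free slots now form one contiguous block, at the price of moving objects around.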

Incremental Marking

Because JavaScript runs on a single thread, V8 has to suspend application logic for each garbage collection and resume it once the collection finishes. This behavior is called “stop-the-world”. In generational garbage collection, a minor collection only covers the new generation, where few objects survive, so even a full pause has little impact. In the old generation, however, there are many live objects, and marking, sweeping, and compacting take a long time, which seriously hurts performance. This is why incremental marking was introduced. It targets the marking phase: work that used to be done in one go is split into many small “steps”, and after each step the JavaScript application logic runs for a short while. Garbage collection and application logic alternate in this way until the marking phase is complete.

Memory leak detection tools
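The alternation between marking steps and application logic can be sketched as a marker with a per-step budget. This toy assumes the object graph does not change between steps; a real collector needs write barriers to stay correct while the application mutates objects. All names here are invented for the example.

```javascript
// Returns a marker whose step(budget) marks at most `budget` objects per call.
function createIncrementalMarker(roots) {
  const marked = new Set();
  const worklist = roots.slice();

  return {
    marked: marked,
    step: function (budget) {
      let done = 0;
      while (worklist.length > 0 && done < budget) {
        const obj = worklist.pop();
        if (marked.has(obj)) continue;
        marked.add(obj);
        done++;
        for (const ref of obj.refs) worklist.push(ref);
      }
      return worklist.length === 0; // true once marking has finished
    },
  };
}

// Build a chain of 10 objects: n0 -> n1 -> ... -> n9.
let head = null;
for (let i = 9; i >= 0; i--) {
  head = { name: 'n' + i, refs: head ? [head] : [] };
}

const marker = createIncrementalMarker([head]);
let steps = 0;
let appTicks = 0;
let finished = false;
while (!finished) {
  finished = marker.step(3); // a short marking step...
  steps++;
  if (!finished) appTicks++; // ...then the application gets to run
}

console.log(steps, appTicks, marker.marked.size); // 4 3 10
```

Each pause is bounded by the budget, so the total marking work is the same but spread over several short pauses.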

node-heapdump

It takes snapshots of the V8 heap for post-mortem analysis. Load it in your program:

var heapdump = require("heapdump");

You can then take a heap snapshot by sending the SIGUSR2 signal to the server process:

$ kill -USR2 <pid>

By default the snapshot is written to the process’s working directory. It is a large JSON file that can be opened and inspected in Chrome’s developer tools.
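node-heapdump is a third-party module. As a rough alternative, newer Node releases (v11.13 and later, as far as I know) ship v8.writeHeapSnapshot() built in. The sketch below uses that built-in call, not node-heapdump’s API; the target path is a made-up example.

```javascript
const v8 = require('v8');
const os = require('os');
const path = require('path');
const fs = require('fs');

// Hypothetical target path; any writable location works.
const target = path.join(os.tmpdir(), 'example-' + process.pid + '.heapsnapshot');

// Synchronously writes the snapshot and returns the file name used.
const written = v8.writeHeapSnapshot(target);
const sizeBytes = fs.statSync(written).size;

console.log('snapshot written to ' + written + ' (' + sizeBytes + ' bytes)');
```

Writing a snapshot blocks the process while it runs, so on a production server it is best triggered deliberately rather than on every request.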

node-memwatch

Note that node-memwatch only supports Node up to v0.12.x and cannot be installed on later versions. In that case, you can use memwatch-next, which has the same API.

Unlike node-heapdump, it exposes two events that report on garbage collection and memory leaks:

  1. stats event: triggered after each full heap collection, passing memory statistics.

  2. leak event: triggered when memory has still not been freed after five consecutive garbage collections, passing related information.
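One way to read the “five garbage collections” rule is as a streak counter over heap sizes measured after each collection. The sketch below is my own illustration of that idea, not memwatch’s actual implementation; the function name and the sample values are hypothetical.

```javascript
// Reports a suspected leak when heap usage rises `limit` times in a row.
function createGrowthDetector(limit) {
  let last = null;
  let streak = 0;

  return function record(heapUsed) {
    if (last !== null && heapUsed > last) {
      streak++;
    } else {
      streak = 0; // any drop or plateau resets the streak
    }
    last = heapUsed;
    return streak >= limit;
  };
}

const record = createGrowthDetector(5);

// Hypothetical post-GC heap sizes in MB.
const samples = [50, 52, 51, 53, 55, 58, 60, 65];
const flags = samples.map(function (size) {
  return record(size);
});

console.log(flags.indexOf(true)); // 7
```

The dip from 52 to 51 resets the counter, so the warning only fires at the eighth sample, after five rises in a row.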

node-profiler

node-profiler is a tool developed by the alinode team for capturing heap snapshots, similar to node-heapdump. It is implemented differently, however, and is easier to use. Their tutorial is attached: How to use Node Profiler

alinode

The official alinode description reads:

Alinode is the Node.js application service solution from Alibaba Cloud. It is a runtime environment and service platform built as an improvement on the community version of Node. We have built in powerful support features that help developers quickly see performance details, locate problems, and get to their root cause.

The above content is from

A tour of V8: Garbage Collection V8