Recently, I wrote a small tool with Node that needs to run as a resident process. After a few days of observation, I found that its memory usage kept increasing (not by much, but I still noticed it, I am really cool). I then realized I had no idea how to troubleshoot Node.js memory leaks.

I took the time to look through the relevant material (Google really is great), and it seems there is a fairly complete set of methods and tools for this. I hope this article not only gives you specific tools to use, but also enough theory to help you reason about the problem. Of course, maybe I am overthinking it ~~

Finding the problem

I don't have much O&M (operations) experience and don't know which awesome tools would let me monitor these metrics with one click. If you're like me, we'll have to build a rudimentary but adequate monitoring script by hand.

And if you're as uncomfortable with shell scripting as I am, just stick with Node. Enough chatter ~

Install pm2 and write a script that periodically prints the memory usage of the target application, assuming the target application is also managed by pm2.

    const exec = require('child_process').exec;
    const later = require('later');

    // Run every 5 minutes
    const schedule = later.parse.text('every 5 mins');

    later.setInterval(function () {
      // `pm2 jlist` prints the basic status of the pm2 applications as a JSON string
      exec('pm2 jlist', { timeout: 2000 }, (err, stdout, stderr) => {
        if (err) {
          console.error(err, err.stack); // an error occurred
          return;
        }
        // Write the result (memory in MB) to the log
        const data = JSON.parse(stdout);
        console.log(data[0].monit.memory / (1024 * 1024)); // output goes straight to the pm2 log
      });
    }, schedule);

After waiting for a while, you'll find the memory data in the corresponding log file. Then you just need to plot a chart from it with a spreadsheet tool. I recommend a Google Drive spreadsheet:

The chart was drawn from the memory data I collected over about 2 days. It shows that memory usage keeps rising. That's right, a leak!!
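
If you would rather have a number than eyeball a chart, the logged samples can be run through a least-squares fit. This is only a minimal sketch: it assumes the log has been reduced to one memory value (in MB) per line, taken at a fixed interval, and the function name `memorySlope` and the sample values are made up for illustration.

```javascript
// Least-squares slope of evenly spaced samples (MB per sample interval).
// A clearly positive slope over a long window hints at a leak; it is not proof.
function memorySlope(samples) {
  const n = samples.length;
  if (n < 2) return 0;
  const meanX = (n - 1) / 2;
  const meanY = samples.reduce((sum, y) => sum + y, 0) / n;
  let num = 0;
  let den = 0;
  samples.forEach((y, x) => {
    num += (x - meanX) * (y - meanY);
    den += (x - meanX) * (x - meanX);
  });
  return num / den;
}

// Hypothetical log content: one value per line, one sample every 5 minutes.
const logText = '50.0\n50.5\n51.0\n51.5\n52.0';
const samples = logText
  .split('\n')
  .filter((line) => line.trim() !== '') // skip blank lines
  .map(Number);

console.log(memorySlope(samples).toFixed(2), 'MB per sample');
```

A slope that stays positive across several days of samples is a stronger signal than a short upward stretch, which may just be the heap growing before the next garbage collection.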

A word of caution: fixing a memory leak can take a long time, so you're better off finding a temporary workaround to tide you over, such as restarting the program periodically.
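
Since the application is assumed to be managed by pm2, one possible stopgap is to let pm2 do the restarting. The sketch below uses pm2's `max_memory_restart` and `cron_restart` options in an ecosystem file; the app name, script path, memory limit and schedule are placeholders, not values from my project.

```javascript
// ecosystem.config.js (app name, script path and limits are placeholders)
const config = {
  apps: [
    {
      name: 'my-tool',
      script: './index.js',
      // Restart the app when its memory exceeds this threshold
      max_memory_restart: '300M',
      // Additionally restart every day at 04:00 (cron syntax)
      cron_restart: '0 4 * * *',
    },
  ],
};

module.exports = config;
```

You would then start the application with `pm2 start ecosystem.config.js`. This only hides the symptom, so the leak still needs to be found.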

Set up the environment

Taking a hands-on approach, let's start by setting up an environment for monitoring memory leaks. At first I looked at node-memory-leak-tutorial, thinking it would be easy to set up, but I ran into this error. Judging from the issues, it is a very common error. Following other people's solution, I switched to Node.js 6.3.1 for testing, which does indeed get around the error:

    // In the project directory
    node-debug leak.js

The terminal will then launch Chrome and pause at a breakpoint in the code. Take a deep breath, and you can resume execution.

Note: if a snapshot cannot be created, try refreshing several times.

I also tried other tools:

  1. node-memwatch
  2. node-webkit-agent
  3. node-heapdump

They all need to be compiled for the operating system, and my local environment is 64-bit Windows 7, which is not an ideal Node.js environment, at least in my opinion; otherwise I wouldn't have hit the nasty ".NET Framework" issues. The build depends on .NET Framework 3.5 on Windows 7. If you have the newer 4.0+ version installed instead, it will not run, and installing .NET Framework 3.5 on Windows 7 is not easy! I suggest using Docker to build a container dedicated to analysis. I won't go into it here. It's your turn ~~

The theory behind Node.js memory analysis

Before you start listening to my serious nonsense, I recommend you read a few documents:

Reading all of these at once may take a long time, so I have read them for you. As I understand them, the summary is as follows:

  • V8's memory management for JavaScript is similar to that of the Java JVM, with a new generation (to-space and from-space), an old generation, and so on.
  • Troubleshooting a memory leak means analyzing heap snapshots. You can create snapshots with existing tools, either in the DevTools Profiles panel or from code.
  • You can import a created snapshot file into the DevTools Profiles panel for analysis.
  • The best practice for taking snapshots is to make sure the program has warmed up, then take snapshot 1 (a GC is triggered first), then interact with the program (for a web service, send it HTTP requests), then take snapshot 2, and so on to produce several snapshots.
  • Make proper use of the features DevTools offers in the Profiles panel and select the right view.
  • Understand the meaning of the fields in a profile:
    • A yellow mark on an object indicates a direct JavaScript reference, red indicates an indirect, dependent reference, and less important are the unshaded objects, which are referenced by other resources (e.g., native code).
    • The profile groups objects by constructor. The "Shallow Size" of each group is the memory held directly by the objects in that group (e.g., the primitive-type data of the objects themselves). "Retained Size" is the total memory kept alive by the objects: their own Shallow Size plus the Shallow Size of the objects they retain, plus the Shallow Size of the objects those retain, and so on recursively.
    • For performance reasons, integer-typed properties of objects are not shown in the profile. They are not lost, just not displayed by the tool.
  • Be alert to objects whose distance is unusually large or small. In short, an object whose distance differs from that of other objects of the same type may indicate a problem.
  • Avoid anonymous functions. Named functions make analysis easier. In fact, using OOP is recommended because it makes the variables you need to trace easier to locate.
  • Context references created by closures (anonymous functions, timers, and so on) can easily cause memory leaks that go unnoticed.
  • Data passed to the console functions (log, error, etc.) cannot be released during actual analysis, see #1741, so you can replace the console functions in your test code (that way you don't need to change the logic of the code under test).
  • Closures in event listeners on objects leak the most easily. Even with once, the event may never fire, in which case the callback keeps referencing its data indefinitely.
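
The last point can be reproduced with plain Node. This is a minimal sketch in which the `ready` event, the `waitForReady` helper and the 1 MB buffer are all made up for illustration:

```javascript
const EventEmitter = require('events');

const emitter = new EventEmitter();

// Registers a one-shot listener whose closure captures a large buffer.
// Returns a cancel function so the caller can release it if the event never comes.
function waitForReady(target) {
  const bigBuffer = Buffer.alloc(1024 * 1024); // captured by the closure below
  const onReady = () => {
    console.log('ready, buffer size:', bigBuffer.length);
  };
  target.once('ready', onReady);
  return () => target.removeListener('ready', onReady);
}

const cancel = waitForReady(emitter);
// 'ready' has not fired, so the listener (and its buffer) is still reachable.
console.log('listeners before cancel:', emitter.listenerCount('ready'));

// Explicitly dropping the listener makes the closure collectable again.
cancel();
console.log('listeners after cancel:', emitter.listenerCount('ready'));
```

In real code the cancel function would typically be called from a timeout or from the cleanup path of whatever owns the emitter.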

OK, that was a lot of theory to digest. But knowing it doesn't mean you will win the fight. There's one thing we haven't discussed: if your project is big enough, how do you set up a test environment for it?

In my opinion, the following steps should be followed:

  1. Break the entire project into separate pieces and write test code for each of the smaller pieces.
  2. For timer-related logic, it is best to switch to manual triggering, or to use a test library (SinonJS) to simulate the passage of time.
  3. Eliminate third-party dependencies as far as possible at first, then test them as appropriate (if you suspect they are the problem).
  4. Faking low-level exceptions (socket, file, etc.) requires fake versions of the corresponding methods. SinonJS stubs are not recommended because they store the arguments of every call and distort your observations; a hand-rolled fake is recommended instead.
  5. Eventually you will need to run the fix in the online environment for a while to see whether the problem is really solved.
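
Step 4 could look roughly like the following. It is only a sketch of the hand-rolled approach: `FakeSocket`, its `fail` helper and the `attach` function standing in for the code under test are hypothetical names, and a real fake would expose whichever methods your code actually calls.

```javascript
const EventEmitter = require('events');

// Hand-rolled stand-in for a socket: just enough surface for the code under test.
// Unlike a recording stub, it keeps no history of calls or arguments.
class FakeSocket extends EventEmitter {
  constructor() {
    super();
    this.destroyed = false;
  }

  write(chunk) {
    // Intentionally discard the data so nothing accumulates between calls
    return !this.destroyed;
  }

  destroy() {
    this.destroyed = true;
    this.emit('close');
  }

  // Test helper: simulate a low-level failure such as ECONNRESET
  fail(code) {
    const err = new Error('simulated ' + code);
    err.code = code;
    this.emit('error', err);
    this.destroy();
  }
}

// Hypothetical code under test: attaches handlers and cleans up on close
function attach(socket, state) {
  socket.on('error', (err) => { state.lastError = err.code; });
  socket.on('close', () => {
    socket.removeAllListeners();
    state.closed = true;
  });
}

const state = {};
const socket = new FakeSocket();
attach(socket, state);
socket.fail('ECONNRESET');
console.log(state, 'error listeners left:', socket.listenerCount('error'));
```

Because the fake holds no references of its own, anything that shows up as retained in a snapshot comes from the code under test.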

Let's focus on step 5, which means you need a way to export snapshots from the online environment for local analysis. Here's how to do it:

First, install the v8-profiler library in your online environment. It provides the snapshot-creating functionality.

Then, take a look at the following boilerplate code, which loads the v8-profiler library in your project and exposes an external command that tells it to create a snapshot file.

    var fs = require('fs');
    var profiler = require('v8-profiler');

    // --------------- test target ---------------
    function LeakingClass() {}
    var leaks = [];
    setInterval(function () {
      for (var i = 0; i < 100; i++) {
        leaks.push(new LeakingClass());
      }
      console.error('Leaks: %d', leaks.length);
    }, 1000);

    // --------------- instruction service ---------------
    var koa = require('koa');
    var route = require('koa-route');
    var service = koa();
    var snapshotNum = 1;

    service.use(route.get('/snapshot', function* () {
      var snapshot = profiler.takeSnapshot();
      // Wait for the export to finish before responding (koa 1.x lets a generator yield a promise)
      yield new Promise(function (resolve, reject) {
        snapshot.export(function (error, result) {
          if (error) return reject(error);
          fs.writeFileSync((snapshotNum++) + '.heapsnapshot', result);
          snapshot.delete();
          resolve();
        });
      });
      this.body = 'done';
    }));

    // It is recommended to bind an intranet IP address so the service cannot be reached from outside
    service.listen(2333, '127.0.0.1');

Each request to http://127.0.0.1:2333/snapshot generates a snapshot file in the project root directory. Download it to your local disk and you can analyze it in Chrome at any time.
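
On newer Node.js versions (11.13 and later) the built-in `v8.writeHeapSnapshot()` could play the same role without a native add-on. This is an alternative I am swapping in here, not what the setup above uses, and the sketch below relies on the plain `http` module instead of koa; the `dumpSnapshot` helper is a made-up name.

```javascript
const v8 = require('v8');
const path = require('path');
const http = require('http');

let snapshotNum = 1;

// Writes a heap snapshot into `dir` and returns the file path.
// Note: the call is synchronous and blocks the event loop while it runs.
function dumpSnapshot(dir) {
  const file = path.join(dir, (snapshotNum++) + '.heapsnapshot');
  return v8.writeHeapSnapshot(file);
}

// Instruction service: GET /snapshot triggers a dump into the working directory.
const server = http.createServer((req, res) => {
  if (req.url === '/snapshot') {
    res.end('written: ' + dumpSnapshot(process.cwd()) + '\n');
  } else {
    res.statusCode = 404;
    res.end();
  }
});

// Left commented out so that loading this sketch has no side effects.
// Bind to the loopback or an intranet address only:
// server.listen(2333, '127.0.0.1');
```

Because the snapshot is written synchronously, the process will not serve other requests while the dump is in progress, so it is best triggered during quiet periods.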

Conclusion

In actual troubleshooting, I found that the hardest thing to test is a leak in a third-party dependency. After all, you don't know how it is implemented. Still, you can't write all the logic yourself, so when choosing third-party libraries, prefer the authoritative, mainstream ones as far as possible. For the remaining small utility modules, you will have to spend time reading their implementation.

If your business uses a producer-consumer model, your test scripts must make sure that tasks are consumed as fast as they are produced (or, more simply, that the time a consumer needs to process a batch of tasks is shorter than the interval between batches). Otherwise, unprocessed tasks will inevitably pile up, which looks like a memory leak. In fact this behavior is expected; it just shows that you have too few consumers.
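
The difference between the two situations can be seen in a small deterministic simulation. This is a sketch under simplifying assumptions (a single consumer, batches produced at a fixed interval, a fixed processing time per batch), and the function name `backlogAfter` is made up:

```javascript
// One batch of tasks is produced every `produceEveryMs` (starting at time 0),
// and a single consumer needs `consumeMs` to finish one batch, in FIFO order.
// Returns the number of batches not yet finished at time `durationMs`.
function backlogAfter(produceEveryMs, consumeMs, durationMs) {
  let lastFinish = 0; // time at which the consumer finishes the previous batch
  let pending = 0;
  for (let arrival = 0; arrival <= durationMs; arrival += produceEveryMs) {
    // A batch starts when it has arrived and the consumer is free
    lastFinish = Math.max(arrival, lastFinish) + consumeMs;
    if (lastFinish > durationMs) pending++;
  }
  return pending;
}

// Consumer faster than producer: the backlog stays bounded
console.log('keeps up:', backlogAfter(100, 80, 60000), backlogAfter(100, 80, 120000));
// Consumer slower than producer: the backlog grows with the running time
console.log('falls behind:', backlogAfter(100, 150, 60000), backlogAfter(100, 150, 120000));
```

In the second case the queue keeps growing for as long as the test runs, which in a heap chart is indistinguishable from a leak unless you also track the queue length.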

In addition, it is important to run the test code as frequently as possible for as long as possible to clearly expose the problem, for example:

    setInterval(function memoryleakBlock() {
      // Code block to be tested
    }, 100);

Note that memoryleakBlock above should avoid referencing global variables. That way you can let it run overnight and check the result when you get to work the next day (if it is still running).
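
To have something to look at the next morning, the loop can record heap usage as it goes. This is a minimal sketch: `createSoakRunner` is a made-up helper, the sampling period is arbitrary, and forcing a GC only works if node is started with `--expose-gc`.

```javascript
// Wraps a block so that every `sampleEvery` runs it records heapUsed (in MB).
// If node was started with --expose-gc, a GC is forced first to reduce noise.
function createSoakRunner(block, sampleEvery, samples) {
  let runs = 0;
  return function tick() {
    block();
    runs += 1;
    if (runs % sampleEvery === 0) {
      if (typeof global.gc === 'function') global.gc();
      samples.push(process.memoryUsage().heapUsed / (1024 * 1024));
    }
  };
}

const samples = [];
const tick = createSoakRunner(function memoryleakBlock() {
  // Code block to be tested (placeholder)
}, 10, samples);

// Driven synchronously here for brevity. For a real overnight run, schedule it
// with setInterval(tick, 100) and print or persist `samples` periodically.
for (let i = 0; i < 100; i++) tick();
console.log(samples.length, 'samples collected');
```

If the recorded values keep climbing even though a GC runs before each sample, the block under test is a good candidate for a closer look with heap snapshots.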