NodeJS – Memory Leak Hunt With Memwatch and Heapdump

sky-header

In this tutorial/article we will try to find and correct memory leak in application using memwatch module.

As a code example I will use code from my previous tutorial which you can check out How to combine Express, Angularjs and Socket.io. It is not required to use the same code as me, you can shortcut it and try it in yours but it would make it easier for you to follow the article. The code is really simple so it will be easy to find and isolate the leak culprit. Trying it out on your application without basic understanding (in case if it’s large piece of codework) can lead you to some sort of “rage quit” and search for some other solution (been there).

I’ll write about memwatch, heapdump. Well at least my experience with those modules.
So let’s get started. After you grabbed the code from previous article(or not), you’ll need to install memwatch
npm install memwatch

require it in your code

var memwatch = require('memwatch');

in our example(index.js) it will be

[js]
var express = require('express'),
    bodyParser = require('body-parser'),
    http = require('http'),
    app = express(),
    session = require('express-session')({
        secret: "aliensAreAmongUs",
        resave: true,
        saveUninitialized: true
    }),
    sharedsession = require('express-socket.io-session'),
    port = 3016,
    memwatch = require('memwatch');//this one
[/js]

(yeah I know it’s a mess up there but it’s a mess for educational purposes 🙂 )
Ok, so with memwatch we have few options. First is to use leak event which get’s fired when your heap usage has increased for five consecutive garbage collections.
To quickly clear things up:

  • Garbage Collector (GC) collects all your unreferenced objects, all the stuff you don’t need anymore. Let’s say visitor comes to your site, you create everything you need for that specific user, his object, some listeners for his actions or other objects with other stuff. Once he is gone, someone needs to clean that up and that someone is GC. To clean up that junk GC needs to know you’re not using it anymore, and the way it figures that out is by counting references to some object.
  • Heap is the place everything gets thrown at and it’s stays there until GC removes what’s supposed to be junk.

We can also get event when GC get’s called. Add this code to the end of your index.js (or in case you decided to use your code, add it in same file where you required memwatch)

[js]
memwatch.on('leak', function(info) {
    console.log("leak:",info);
});

memwatch.on('stats', function(stats) {
    console.log("stats:",stats);
});
[/js]

For the following example I was manually clicking send button and refreshing the page for quite a long time. If you don’t want to do that (trust me, it’s boring) you can skip down below memwatch data to adjust your code, automate it and speed things up.
So the leak info i received from constant refreshing the page and clicking send button is this:

1.
stats: { num_full_gc: 1,
num_inc_gc: 3,
heap_compactions: 1,
usage_trend: 0,
estimated_base: 11974312,
current_base: 11974312,
min: 0,
max: 0 }

2.
stats: { num_full_gc: 2,
num_inc_gc: 9,
heap_compactions: 2,
usage_trend: 0,
estimated_base: 13176208,
current_base: 13176208,
min: 0,
max: 0 }

3.
stats: { num_full_gc: 3,
num_inc_gc: 12,
heap_compactions: 3,
usage_trend: 0,
estimated_base: 13603352,
current_base: 13603352,
min: 13603352,
max: 13603352 }

4.
stats: { num_full_gc: 4,
num_inc_gc: 15,
heap_compactions: 4,
usage_trend: 0,
estimated_base: 14104408,
current_base: 14104408,
min: 13603352,
max: 14104408 }

and finally we got leak event!

leak: { start: Thu Jun 11 2015 12:11:46 GMT+0200 (CEST),
end: Thu Jun 11 2015 12:34:44 GMT+0200 (CEST),
growth: 2286688,
reason: 'heap growth over 5 consecutive GCs (22m 58s) - 5.7 mb/hr' }

stats: { num_full_gc: 5,
num_inc_gc: 19,
heap_compactions: 5,
usage_trend: 0,
estimated_base: 14261000,
current_base: 14261000,
min: 13603352,
max: 14261000 }
***to speed things up, make this change in your views/public/anglar.js code:

[js]
fenixApp.controller('mainController', function($scope, socket) {
    $scope.recievedTroughSocket = "still waiting for data...";
    $scope.sendWithSocket = function(msg){
        socket.emit("something", msg);
        setTimeout(function(){location.reload();}, 50);//this one
    }
    socket.on("greetings", function(data){
        console.log("user data: " + JSON.stringify(data));
        $scope.recievedTroughSocket = data.msg;
        $scope.sendWithSocket("some random msg: " + Math.random()*1000);//and this one
    });
});
[/js]

Now we know we have a leak in our app and a way to detect it. So how do we figure out just what is not cleaning up?
Well for that we can use HeapDiff, it takes a heap snapshot before and after some part of code you don’t trust. Kinda like those commercials for losing weight just in reverse (in this case you get the skinny picture before and the fat one later).

Let’s try it out and see what happens. We can use “stats” event to get the snapshot. Just to be sure we don’t pickup more than we need from our HeapDiff we will take first snapshot on first stats event and second snapshot on second stats event then repeat using simple Boolean switch.

[js]
var snapshotTaken = false,
    hd;
memwatch.on('stats', function(stats) {
    console.log("stats:",stats);
    if(snapshotTaken==false){
        hd = new memwatch.HeapDiff();
        snapshotTaken = true;
    } else {
        var diff = hd.end();
        snapshotTaken = false;
        console.log(util.inspect(diff, {showHidden:false, depth:4}));
    }
});
[/js]

and the result is something like this:

[js]
{ before:
   { nodes: 88212,
     time: Thu Jun 11 2015 13:35:22 GMT+0200 (CEST),
     size_bytes: 12320416,
     size: '11.75 mb' },
  after:
   { nodes: 91151,
     time: Thu Jun 11 2015 13:36:12 GMT+0200 (CEST),
     size_bytes: 13153432,
     size: '12.54 mb' },
  change:
   { size_bytes: 833016,
     size: '813.49 kb',
     freed_nodes: 2664,
     allocated_nodes: 5603,
     details:
      [ { what: 'Arguments',
          size_bytes: 0,
          size: '0 bytes',
          '+': 1,
          '-': 1 },
        { what: 'Array',
          size_bytes: 433584,
          size: '423.42 kb',
          '+': 2671,
          '-': 1026 },
        { what: 'Buffer',
          size_bytes: -96,
          size: '-96 bytes',
          '+': 0,
          '-': 2 },
        { what: 'BufferPool',
          size_bytes: -136,
          size: '-136 bytes',
          '+': 0,
          '-': 1 },
        { what: 'Client',
          size_bytes: -160,
          size: '-160 bytes',
          '+': 0,
          '-': 1 },
        { what: 'Closure',
          size_bytes: 20376,
          size: '19.9 kb',
          '+': 341,
          '-': 58 },
        { what: 'Code',
          size_bytes: 345680,
          size: '337.58 kb',
          '+': 1242,
          '-': 687 },
        { what: 'Cookie',
          size_bytes: -120,
          size: '-120 bytes',
          '+': 0,
          '-': 1 },
        { what: 'Decoder',
          size_bytes: -96,
          size: '-96 bytes',
          '+': 0,
          '-': 1 },
        { what: 'Encoder',
          size_bytes: -104,
          size: '-104 bytes',
          '+': 0,
          '-': 1 },
        { what: 'Error',
          size_bytes: -24,
          size: '-24 bytes',
          '+': 1,
          '-': 2 },
        { what: 'HTTPParser',
          size_bytes: 32,
          size: '32 bytes',
          '+': 1,
          '-': 0 },
        { what: 'IncomingMessage',
          size_bytes: 3952,
          size: '3.86 kb',
          '+': 20,
          '-': 1 },
        { what: 'Native', size_bytes: 0, size: '0 bytes', '+': 1, '-': 1 },
        { what: 'Number',
          size_bytes: 800,
          size: '800 bytes',
          '+': 59,
          '-': 9 },
        { what: 'Object',
          size_bytes: 4072,
          size: '3.98 kb',
          '+': 197,
          '-': 28 },
        { what: 'ReadableState',
          size_bytes: 7872,
          size: '7.69 kb',
          '+': 42,
          '-': 1 },
        { what: 'RegExp',
          size_bytes: 72,
          size: '72 bytes',
          '+': 1,
          '-': 0 },
        { what: 'ServerResponse',
          size_bytes: 1080,
          size: '1.05 kb',
          '+': 10,
          '-': 1 },
        { what: 'Session',
          size_bytes: -96,
          size: '-96 bytes',
          '+': 0,
          '-': 1 },
        { what: 'SlowBuffer',
          size_bytes: 0,
          size: '0 bytes',
          '+': 1,
          '-': 1 },
        { what: 'Socket',
          size_bytes: 5224,
          size: '5.1 kb',
          '+': 32,
          '-': 1 },
        { what: 'String',
          size_bytes: -2208,
          size: '-2.16 kb',
          '+': 522,
          '-': 225 },
        { what: 'TCP', size_bytes: 0, size: '0 bytes', '+': 1, '-': 1 },
        { what: 'Timeout',
          size_bytes: 2400,
          size: '2.34 kb',
          '+': 30,
          '-': 0 },
        { what: 'Timer', size_bytes: 32, size: '32 bytes', '+': 1, '-': 0 },
        { what: 'Url',
          size_bytes: -128,
          size: '-128 bytes',
          '+': 0,
          '-': 1 },
        { what: 'WebSocket',
          size_bytes: 1920,
          size: '1.88 kb',
          '+': 20,
          '-': 0 },
        { what: 'WritableState',
          size_bytes: 3344,
          size: '3.27 kb',
          '+': 22,
          '-': 0 },
        { what: 'XHR', size_bytes: 1040, size: '1.02 kb', '+': 10, '-': 0 },
        { what: 'exports.StringDecoder',
          size_bytes: -168,
          size: '-168 bytes',
          '+': 0,
          '-': 1 } ] } }
[/js]

Now this is a fairly simple application and all you do here is use various modules. As you can see in our heapdiff output above, whats clogging our memory is:

{ what: 'Array',
size_bytes: 433584,
size: '423.42 kb',
'+': 2671,
'-': 1026 },
{ what: 'Code',
size_bytes: 345680,
size: '337.58 kb',
'+': 1242,
'-': 687 },
{ what: 'Closure',
size_bytes: 20376,
size: '19.9 kb',
'+': 341,
'-': 58 }

now before you start blaming yourself or your code, do some research on modules you use, for example look at warning on express-session readme:

Warning The default server-side session storage, MemoryStore, is purposely not designed for a production environment. It will leak memory under most conditions, does not scale past a single process, and is meant for debugging and developing.

from: https://www.npmjs.com/package/express-session

so let’s try to solve that issue first and do as we’re told. I think redis/mongo solution for sessions is better but that would require a new article/tutorial so let’s just use something simple for our purposes and save sessions to some tmp file with “session-file-store” module.

npm install session-file-store

and make a few changes in your code:

[js]
var express = require('express'),
    bodyParser = require('body-parser'),
    http = require('http'),
    app = express(),
    //changed from here to...
    Session = require('express-session'),//capital S
    SessionStore = require('session-file-store')(Session),
    session = Session({
        store: new SessionStore({ path: './tmp/sessions' }),
        secret: "aliensAreAmongUs",
        resave: true,
        saveUninitialized: true
    }),
    //...over here
    sharedsession = require('express-socket.io-session'),
    port = 3014,
    memwatch = require('memwatch'),
    util = require("util");//this is good module, use it
[/js]

Now comment out some “console.log”s so it doesn’t spam your console or log with unnecessary info, for a test run you can also comment out stats listener so we can pay attention to leaks, your choice. Start it up and open your app in browser to spam the server and keep refreshing, I’ve used 3 tabs and made a coffee in meantime. Let your app run for a while and keep an eye on memory usage. I’m using htop on my vbox.

After some time, I’ve got few leak events so this didn’t help much or solve the problem. Reading heapdiff output between every second stats event also didn’t provide much useful information. Let’s try to tweek that a bit so it provides better information about what’s leaking exactly.

Changing code to take an initial snapshot and than make a heapdiff on leak event:

[js]
var snapshotTaken = false,
    hd;
memwatch.on('leak', function(info) {
    console.log("leak:",info);
    var diff = hd.end();
    snapshotTaken = false;
    console.log(util.inspect(diff, {showHidden:false, depth:4}));
});
memwatch.on('stats', function(stats) {
    console.log("stats:",stats);
    if(snapshotTaken==false){
        hd = new memwatch.HeapDiff();
        snapshotTaken = true;
    }/* else {
        var diff = hd.end();
        snapshotTaken = false;
        console.log(util.inspect(diff, {showHidden:false, depth:4}));
    }*/
});
[/js]

Now you have some pointers but that is still not enough to pinpoint the exact leak. Time to use “big guns”.

npm install heapdump

[js]
    ...,
    memwatch = require('memwatch'),
    heapdump = require('heapdump'),
    util = require("util");
[/js]

change your code in “leak” listener to this:

[js]
memwatch.on('leak', function(info) {
    console.log("leak:",info);
    var diff = hd.end();
    snapshotTaken = false;
    console.log(util.inspect(diff, {showHidden:false, depth:4}));
    heapdump.writeSnapshot(function(err, filename) {
        console.log('dump written to', filename);
    });
});
[/js]

When you get leak event, a heapdump file will be stored in your application’s root folder which you can inspect with google chrome. Open up Google Chrome, press F12, click on profiles, right click on “profile” in left column and load your heapdump file.

Now you’ll have something like this to play around with:

 

Here we can see that most of what’s in memory is not related to our code at all.
This will require further investigation. Let’s try to leave our app to run for a longer time, it could always be a false positive and maybe we don’t have a leak at all! Now delete all your previous heapdump files and before we start our application again, let’s rewrite some of our functions to help us detect leaky stuff.
Currently we use anonymous functions in our io part of the code so let’s give them a name to easily track them in heapdump.

[js]
io.on('connection', function someNameForNewConnectionHandler(socket){
    //console.log("connected");
    socket.emit("greetings", {msg:"hello"});
    socket.on("something", function gotSomething(data){
        //console.log("client["+socket.handshake.session.myCustomData.userID+"] sent data: " + data);
    })
});
[/js]

Now try again, but this time leave your app running for a longer time and open a couple of clients in tabs, I’ve opened few in firefox and chrome. I’ll check what happens in about half an hour. Currently my memwatch status is:

stats: { num_full_gc: 10,
num_inc_gc: 25,
heap_compactions: 10,
usage_trend: 0,
estimated_base: 20498903,
current_base: 22456648,
min: 15749976,
max: 22456648 }

after 15 minutes

stats: { num_full_gc: 29,
num_inc_gc: 110,
heap_compactions: 29,
usage_trend: 7.7,
estimated_base: 25122633,
current_base: 28484464,
min: 15749976,
max: 28484464 }

at 20 minute mark I have closed 2 out of 4 tabs in chrome.
at 25 other 2 chrome tabs, so that leaves us with firefox only.

now i’ll force heapdump with $ kill -USR2
(from heapdump documentation)
and shutdown app

after disconnecting 4 tabs from chrome, memory has dropped after ~5 minutes

stats: { num_full_gc: 41,
num_inc_gc: 171,
heap_compactions: 41,
usage_trend: 9.4,
estimated_base: 26965772,
current_base: 24183304,
min: 15749976,
max: 30590680 }
stats: { num_full_gc: 42,
num_inc_gc: 171,
heap_compactions: 42,
usage_trend: 8.3,
estimated_base: 26671752,
current_base: 24025576,
min: 15749976,
max: 30590680 }

now let’s take a look at our heapdump file

Looks pretty much the same as the first heapdump. And memory was stable for half an hour. It seems we don’t have memory leak in our code.
Now that was a long lesson for a false positive.

OR

maybe I’m wrong, leave me a comment if you have any advice on finding memory leaks. I find this whole process of leak hunting in nodejs extremely difficult.

Memwatch isn’t useful enough, at least to me it isn’t. I’ve also noticed around stackoverflow and github that this project is pretty much dead and it hasn’t had an update in over 2 years.
Heapdump is helpful but it does require a bit of tinkering (like naming anonymous functions for easier heap browsing).