Node.js Profiling in WebStorm. Part 2: Memory Profiling.

We continue talking about our new Node.js profiling features in WebStorm 10. In our previous post we explored CPU profiling. In this one we’ll dive into strategies for investigating memory problems and see how WebStorm can help us apply them.

heap_profiling-cover

Background information

JavaScript is a language with automatic memory management. That means that the JavaScript engine itself takes care of allocating space for your objects and collecting objects when they are not needed.

Node.js, io.js, and Google Chrome use the same open-source JavaScript engine, called V8.

In this article we will talk about the V8 JavaScript heap, i.e. only about the memory area that is managed by the JavaScript engine and contains JavaScript objects.

So why should we then care about memory profiling? Although memory is allocated, freed, and defragmented automatically, it is still our application code that requests object creation, and the application should indicate when objects are not needed. The application is responsible for the object lifecycle. That is why it’s important to use an optimal memory consumption strategy and, of course, watch out for errors.

Memory problems are especially dangerous for Node.js applications. Node.js servers tend to run for a long time, so any memory leaks would accumulate over time.

The three main concepts around application memory management are memory budget, memory leaks, and the speed of memory consumption. Let’s talk about them.

Memory budget

Our application operates with a limited amount of memory, and big data structures such as caches or queues may eventually consume a significant part of it. This may result in running out of memory or in frequent garbage collection invocations. Taking a heap snapshot in this case will help us find out which of the queues or caches run “out of the budget.”
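Before reaching for a heap snapshot, it can help to see how close the process is to its budget. Here is a minimal sketch that uses Node's process.memoryUsage(); the budget value and the checkHeapBudget helper are our own assumptions for illustration, not something WebStorm provides:

```javascript
// Hypothetical budget; choose a value that fits your deployment.
const HEAP_BUDGET_BYTES = 512 * 1024 * 1024;

// Returns the current V8 heap usage and whether it exceeds the given budget.
function checkHeapBudget(budgetBytes) {
  const usage = process.memoryUsage();
  return {
    heapUsed: usage.heapUsed,
    heapTotal: usage.heapTotal,
    overBudget: usage.heapUsed > budgetBytes
  };
}

const report = checkHeapBudget(HEAP_BUDGET_BYTES);
console.log(
  'heap used: ' + (report.heapUsed / 1048576).toFixed(1) +
  ' MB, over budget: ' + report.overBudget
);
```

Calling such a check periodically (for example from a timer) tells you when it is worth taking a snapshot, but not which structure is responsible. That part is what the snapshot views are for.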

Memory leaks

How does the garbage collector (GC) determine that an object is not needed? There is a special group of objects, called GC roots, that are always considered alive. They are used as the starting point to determine which objects can be collected.

GC roots include objects in the current call stack (i.e. referenced from the code that is executed right now), objects that belong to the JavaScript engine, such as the compilation cache, and objects backed by native (non-JavaScript) objects. The garbage collector traverses the graph of objects (the heap) starting from the GC roots and following the references between objects. Any object that cannot be reached from any GC root is garbage and subject to collection.
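The reachability rule is easy to try on a toy graph. The sketch below is only an illustration of the idea, not how V8 is implemented; the heap object and the findReachable helper are made up for this example:

```javascript
// Toy heap: each key is an object id, each value lists the ids it references.
const heap = {
  root: ['a', 'b'],
  a: ['c'],
  b: [],
  c: ['a'], // cycle a -> c -> a, but still reachable from the root
  d: ['e'], // d and e reference each other, yet no root reaches them
  e: ['d']
};

// Returns the set of ids reachable from the given roots.
function findReachable(heap, roots) {
  const reachable = new Set();
  const stack = roots.slice();
  while (stack.length > 0) {
    const id = stack.pop();
    if (reachable.has(id)) continue;
    reachable.add(id);
    for (const ref of heap[id]) stack.push(ref);
  }
  return reachable;
}

const live = findReachable(heap, ['root']);
const garbage = Object.keys(heap).filter((id) => !live.has(id));
console.log(garbage); // [ 'd', 'e' ]
```

Note that d and e are collected even though they reference each other: what matters is reachability from a root, not whether an object has incoming references.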

Memory leaks occur when objects that the application no longer needs are still referenced from one of the GC roots and therefore accumulate over time. Leaks may be small, but they are still dangerous due to their tendency to grow. The causes of memory leaks may include errors in application cache management, careless use of closures, or keeping detached DOM nodes in JavaScript variables.
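As an illustration of the cache management case, here is a minimal sketch of a cache that cannot grow without bound. The createBoundedCache helper and the limit of 100 entries are assumptions for this example; a real application might prefer an LRU policy or a time-based one:

```javascript
// A cache that evicts its oldest entry once maxEntries is reached.
// Map iterates keys in insertion order, so the first key is the oldest one.
function createBoundedCache(maxEntries) {
  const entries = new Map();
  return {
    get: function (key) {
      return entries.get(key);
    },
    set: function (key, value) {
      if (!entries.has(key) && entries.size >= maxEntries) {
        const oldestKey = entries.keys().next().value;
        entries.delete(oldestKey);
      }
      entries.set(key, value);
    },
    size: function () {
      return entries.size;
    }
  };
}

const cache = createBoundedCache(100);
for (let i = 0; i < 1000; i++) {
  cache.set('request-' + i, { payload: i });
}
console.log(cache.size()); // 100
```

An unbounded version of the same cache (a plain object or Map that only ever gets new keys) would keep every payload reachable for as long as the cache itself is reachable, which is exactly the leak pattern described above.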

The speed of memory consumption matters as well: if you allocate objects too frequently, you force too-frequent garbage collection cycles, which in turn leads to performance degradation.

Relevant articles

We highly recommend that you read the JavaScript Memory Profiling guide on the Chrome website. Another great article is Addy Osmani's Taming The Unicorn: Easing JavaScript Memory Profiling In Chrome DevTools.

We believe that we can also contribute to the task of memory analysis and make it easier for the developers.

WebStorm offers you the following features that can help hunt down memory leaks:

  • Taking V8 heap snapshots at runtime,
  • Heap snapshots view,
  • Heap snapshots difference view (in WebStorm 10.0.3+).

Taking heap snapshots in WebStorm

To take heap snapshots while running a Node.js app in WebStorm, select Allow taking heap snapshots on the V8 Profiling tab of the Node.js Run Configuration:

node-run-config

You will need to install the v8-profiler node module, which contains the JavaScript binding for the V8 takeSnapshot() method. WebStorm generates special proxy code, which runs an in-process server. When it’s time to take a heap snapshot, WebStorm sends this server a notification through the Communication port specified in the Run Configuration.

So when the application is up and running, you can use a new Take Heap Snapshot action in the Run/Debug tool windows. You’ll be asked to select a file where the snapshot will be saved. You can choose to open it immediately:

save_snapshot

The Show hidden data check-box allows you to see internal V8 objects and hidden links.

Note: You can also take heap snapshots in Google Chrome and then open them in WebStorm.

Finding memory leaks in heap snapshots

In this article we’ll be using the example from the “An interesting kind of JavaScript memory leak” post on Meteor’s blog. It’s a great case with a perfect explanation that illustrates the potential danger of closures and why we might need memory profiling.

Here is the code:

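var theThing = null;

var replaceThing = function () {
  // Each call captures the previous theThing in this call's scope.
  var originalThing = theThing;
  // Never called, but it references originalThing, so the variable
  // stays in the scope shared by the closures created in this call.
  var unused = function () {
    if (originalThing)
      console.log("hi");
  };
  theThing = {
    longStr: new Array(1000000).join('*'),
    // Never invoked in this example; its closure shares the scope
    // that holds originalThing.
    someMethod: function () {
      console.log(someMessage);
    }
  };
};
setInterval(replaceThing, 1000);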

In this example, the value of originalThing is kept in the lexical scope of the theThing.someMethod closure. Since theThing is replaced again and again, and each new originalThing points to the previous theThing, we are going to have a chain that looks like this:

theThing -> someMethod -> (lexical scope) -> originalThing -> someMethod -> (lexical scope) -> etc.

Please refer to the original article for the detailed explanation.
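For reference, the fix suggested in the original article is to clear originalThing at the end of replaceThing, so that the shared scope no longer points at the previous theThing. The sketch below is our simplified variant of the example: it calls replaceThing in a loop instead of using setInterval so that it terminates, and the body of someMethod is a placeholder:

```javascript
var theThing = null;

var replaceThing = function () {
  var originalThing = theThing;
  var unused = function () {
    if (originalThing)
      console.log("hi");
  };
  theThing = {
    longStr: new Array(1000000).join('*'),
    someMethod: function () {
      console.log("someMethod called"); // placeholder body
    }
  };
  // Break the chain: the scope shared with someMethod no longer
  // references the previous theThing, so it can be collected.
  originalThing = null;
};

// Stand-in for setInterval(replaceThing, 1000), so the script finishes.
for (var i = 0; i < 50; i++) {
  replaceThing();
}
console.log(theThing.longStr.length); // 999999
```

If you take two snapshots of this variant, you should expect a single large string to be retained rather than a growing chain of them.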

We are going to check if we can observe the predicted situation in the heap snapshot. So let’s run the example:

  1. Create a Node.js Run Configuration and select Allow taking heap snapshots.
  2. Start the Run Configuration and after a few seconds record the first snapshot.
  3. Wait for about 30 seconds and record the second snapshot.

Now let’s open the first snapshot. We’ll see the Containment view:

containment_view

Containment view

The Containment view shows the relationships between objects. In the main tree GC roots occupy the first level and you can unfold objects to see their children.

If you select any object in the main tree, its details are shown below: the path to the object from one of the GC roots and all other nodes that reference this object. Usually a heap dump is a dense graph, where each object has many links to and from it. There are many cyclic references and references from system objects.

In general, the Containment view is essential for understanding why an object is not collected, i.e. who holds a reference to it.

However, we see a really huge tree with many links and internal objects so… what will our first steps be?

First of all, we can use the Search feature. WebStorm searches for objects by:

  • variable, parameter, or function name (“Link names”),
  • class names,
  • text string contents,
  • snapshot object IDs (every object inside a heap snapshot has its unique ID),
  • marks given by the user (we’ll talk about those later).

Since we expect a memory leak to occur around the inner closure and originalThing variable, let’s search for it:

search_action

In this screenshot you can also see the “artificial” groups of GC roots that are created by V8.

The search results are shown in the bottom view:

search_results_and_mark_action

We can see that there are multiple instances of the originalThing variable, each kept in a corresponding context object. So our prediction about the memory leak has come true.

Mark action

To easily differentiate same-named objects and to be able to return to them again, we can use the Mark action. The object is then associated with a text mark and can later be found via search.

You can easily navigate from the search results (and from other views) to the same object in the main tree (i.e. the main tree in the Containment view), or jump to the source code, by using context actions:

jump_to_source

Just to complete the picture, let’s see the same leaking chain starting from theThing object:

origin_of_chains

There are other ways to explore the snapshot.

Very often you have no assumptions about what could be causing the leak, or whether there’s a leak at all. One way to hunt for leaks is to work with the Biggest objects view.

Biggest objects view

The Biggest objects view sorts objects by their retained size and shows the top of the list. Looking at the biggest objects might help in situations when there is a single root of memory accumulation, such as an incorrectly written cache or an overflowed queue.

biggest_obj

In our case we observe many longStr string objects.

In the Biggest objects view we do not see any specific links that identify the object, but there is a Details view at the bottom that shows the path to the selected object from one of the GC roots, as well as its retainers (objects that also hold references to the current object). You can navigate to the same object in the Containment view using the Navigate in Main Tree action.
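Retained size, the metric this view sorts by, is the amount of memory that would be freed if the object itself were removed. The following sketch computes it on a toy heap by comparing what is reachable with and without the object. The data and helper names are made up for illustration, and real profilers compute this more efficiently:

```javascript
// Toy heap: id -> { size: self size in bytes, refs: ids of referenced objects }.
const heap = {
  root:   { size: 0,   refs: ['cache', 'config'] },
  cache:  { size: 16,  refs: ['entry1', 'entry2'] },
  entry1: { size: 100, refs: ['shared'] },
  entry2: { size: 200, refs: [] },
  config: { size: 32,  refs: ['shared'] },
  shared: { size: 50,  refs: [] }
};

// Collects ids reachable from 'root', optionally pretending one object is gone.
function reachableFromRoot(heap, excludedId) {
  const seen = new Set();
  const stack = ['root'];
  while (stack.length > 0) {
    const id = stack.pop();
    if (id === excludedId || seen.has(id)) continue;
    seen.add(id);
    heap[id].refs.forEach((ref) => stack.push(ref));
  }
  return seen;
}

// Retained size: the self sizes of everything that becomes unreachable
// once the object is removed (the object itself included).
function retainedSize(heap, id) {
  const before = reachableFromRoot(heap);
  const after = reachableFromRoot(heap, id);
  let total = 0;
  before.forEach((other) => {
    if (!after.has(other)) total += heap[other].size;
  });
  return total;
}

console.log(retainedSize(heap, 'cache'));  // 316
console.log(retainedSize(heap, 'entry1')); // 100
```

The shared object is not counted in either result because config still references it. This is why a small object such as cache (16 bytes of self size) can appear near the top of the list.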

Another way of investigating heap snapshots is to use the Summary view.

Summary view

summary_view

In the Summary view all the objects are grouped by their type. If one type dominates the view, it might point to a memory leak. You can navigate to the parent object where the objects of this type are held (using the Navigate in Main Tree action or just by looking into the Details view) and check if it is the leak root.

Another good hint is to look at the Distance parameter for the objects of the same type (distance is the number of steps between the current object and the GC root on the main path). If objects of the same type have different distances, it means they were created by different parts of the code or with a different recursion level. This may be a clue, if you don’t expect different creation paths for objects of this type.
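To make the Distance hint concrete, here is a small sketch that computes distances on a toy heap with a breadth-first search. We assume here that the distance is the length of the shortest path from the root; the heap and the distancesFromRoot helper are hypothetical:

```javascript
// Toy heap: id -> ids it references. thing1..thing3 form a chain like
// the one in our example, where each object is reachable via the previous one.
const heap = {
  root: ['app', 'timer'],
  app: ['thing1'],
  timer: [],
  thing1: ['thing2'],
  thing2: ['thing3'],
  thing3: []
};

// Breadth-first search: the first time an object is seen gives its shortest distance.
function distancesFromRoot(heap, rootId) {
  const distance = { [rootId]: 0 };
  const queue = [rootId];
  for (let i = 0; i < queue.length; i++) {
    const id = queue[i];
    for (const ref of heap[id]) {
      if (!(ref in distance)) {
        distance[ref] = distance[id] + 1;
        queue.push(ref);
      }
    }
  }
  return distance;
}

const distance = distancesFromRoot(heap, 'root');
console.log(distance.thing1, distance.thing2, distance.thing3); // 2 3 4
```

Objects of the same kind that sit at steadily increasing distances, as thing1 to thing3 do here, are a typical sign of a chain that keeps growing.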

In our case, we see that string objects take the most space. The Different distances rule also works well in our case—we see our heavy strings are all created at different depths.

summary_view_dist

Comparing Heap snapshots

Starting with WebStorm 10.0.3, it is possible to see the difference between heap snapshots in a special view.

This feature allows you to get a clearer picture: you can concentrate more on the objects that keep reappearing or got heavier, and less on the “background” stuff.

In an ideal situation and after a long running time, Heap Diff would directly point to the memory leaks. However, you should always expect some dynamic changes between snapshots: event objects will be created, compilation and recompilation will occur at runtime, and so on.

Another idea behind Heap Diff is that it assists in analyzing slow-running memory leaks.

Suppose you have a leak that will only show up after hours of work. If you analyze just a single snapshot, other “healthy” objects might be bigger than the leak, which may not have accumulated enough yet. This makes it difficult to detect the leak with the usual strategies (such as biggest objects or class groups).

With Heap Diff, you are likely to notice a memory growth sooner, since other heavy objects are likely to be filtered out.

So let’s examine the difference between the two snapshots that we took in our experiment.

Opening the Diff view

Use the Compare with action, select the second snapshot in the dialog, and specify whether it is a subsequent or previous snapshot. A Heap snapshot view for the second snapshot will also be opened in the other tab.

open_dff

Summary Diff view

In the Summary Diff view we can see aggregated information about each object type:

  • The difference in the number of objects of each type (Count Diff),
  • The number of created and destroyed objects (Count),
  • The difference in the sum of self-sizes of objects (Size Diff),
  • The sum of self-sizes in the “before” and “after” snapshots (Size).

If you expand a type node, you will see the objects of that type that were created/destroyed with their retained sizes. (Only the first 50 objects are shown.)

This view helps locate the types which led to the biggest memory growth, or the types with overly volatile behavior, i.e. with objects that were created/destroyed too often.
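The per-type numbers listed above can be sketched on simplified data. The snapshot format below (arrays of objects with id, type, and selfSize) is a hypothetical simplification, not the actual V8 snapshot format, and we assume that object IDs stay stable between the two snapshots:

```javascript
// Hypothetical, simplified snapshots: arrays of { id, type, selfSize }.
const before = [
  { id: 1, type: 'String', selfSize: 1000 },
  { id: 2, type: 'String', selfSize: 1000 },
  { id: 3, type: 'Array', selfSize: 64 }
];
const after = [
  { id: 2, type: 'String', selfSize: 1000 },
  { id: 4, type: 'String', selfSize: 1000 },
  { id: 5, type: 'String', selfSize: 1000 },
  { id: 3, type: 'Array', selfSize: 64 }
];

// Aggregates created/destroyed counts, count diff, and self-size diff per type.
function summaryDiff(before, after) {
  const beforeIds = new Set(before.map((o) => o.id));
  const afterIds = new Set(after.map((o) => o.id));
  const diff = {};
  const entry = (type) =>
    diff[type] || (diff[type] = { created: 0, destroyed: 0, countDiff: 0, sizeDiff: 0 });

  for (const obj of after) {
    const e = entry(obj.type);
    e.countDiff += 1;
    e.sizeDiff += obj.selfSize;
    if (!beforeIds.has(obj.id)) e.created += 1;
  }
  for (const obj of before) {
    const e = entry(obj.type);
    e.countDiff -= 1;
    e.sizeDiff -= obj.selfSize;
    if (!afterIds.has(obj.id)) e.destroyed += 1;
  }
  return diff;
}

const diff = summaryDiff(before, after);
console.log(diff.String); // { created: 2, destroyed: 1, countDiff: 1, sizeDiff: 1000 }
```

A type with a large positive countDiff is a candidate for memory growth, while a type with high created and destroyed counts but a countDiff near zero is the “volatile” case.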

summary_diff_view

Biggest Objects Diff

In the Biggest Objects Diff view, we show how the size of the objects changed and which objects were added/removed. Growing leaks that are big enough are likely to be highlighted in this view.

biggest_obj_diff

Both in the Summary Diff and the Biggest Objects Diff views, you can invoke context actions to mark objects (the mark will be propagated to the originating snapshots), or to navigate to the same object in the originating snapshot (provided that this snapshot is open).

In our case we observe that the number of strings has increased by 75% (in the Summary Diff view), and we see big string objects of the same size that were either added or left unchanged (in the Biggest Objects Diff view).

Heap snapshots of web applications

Web apps also leak. Since Google Chrome and Node.js have the same JavaScript engine, you can take a snapshot in Chrome for your web app and open it using WebStorm: Main menu | Tools | V8 Profiling | Analyze V8 heap Snapshot.

Of course, Chrome has its own support for working with heap snapshots. Give WebStorm a try and tell us if our new features for searching by object names, navigation inside a snapshot, and navigation to code were useful and where we are missing something!

Web app snapshots have a more complicated structure. There is one more player on the scene: the DOM tree. So there are typical leaks related to patterns of working with DOM objects in JavaScript.

The most well-known leak is keeping a reference to a DOM object detached from the document in the JavaScript code. You can find such objects under the Detached DOM trees synthetic GC root, marked with yellow or red.

Here’s an example of a heap snapshot taken for a website (here it’s bbc.com) and opened in WebStorm:

bbc

Taking heap snapshots in remote environments

In case you want to hunt for memory leaks somewhere other than the local environment, you can take snapshots programmatically. Follow the instructions in the v8-profiler module description and then open the resulting snapshots using Main menu | Tools | V8 Profiling | Analyze V8 heap Snapshot.
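As a rough illustration of the programmatic approach, the sketch below uses v8.writeHeapSnapshot() from Node's built-in v8 module. Please note the assumption: this function exists only in much newer Node.js releases than the ones this post was written for, and it is not the v8-profiler module mentioned above. The takeSnapshot helper and the file naming are ours:

```javascript
const v8 = require('v8');
const os = require('os');
const path = require('path');

// Writes a heap snapshot into the given directory and returns the file name.
// The call is synchronous and blocks the process while the snapshot is written.
function takeSnapshot(dir) {
  const file = path.join(
    dir,
    'heap-' + process.pid + '-' + Date.now() + '.heapsnapshot'
  );
  return v8.writeHeapSnapshot(file);
}

// Assumption for this sketch: the temp directory is a suitable place.
const snapshotFile = takeSnapshot(os.tmpdir());
console.log('Heap snapshot written to ' + snapshotFile);
```

On a remote server you would typically trigger such a helper from a signal handler or an admin-only endpoint and then copy the resulting file to your machine for analysis.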

Caching snapshot data

Analyzing a heap snapshot takes WebStorm a significant amount of time. However, once a snapshot is analyzed and opened, you can return to it quickly: all indexes are kept in the IDE system directory.

Conclusions

The example that we’ve investigated throughout this article perfectly illustrates that even if a piece of code hardly raises any suspicions at first sight, it can cause big problems with memory due to internal references. The bottom line is that heap profiling is essential for JavaScript applications.

Memory monitoring and profiling is not a straightforward task. We on the WebStorm team believe that some of the features we provide in WebStorm will help you avoid memory leaks in your project and will save you time. We will continue to improve the heap introspection features.

Thank you for reading. Your feedback is welcome!
