Unusual Ways of Boosting Up App Performance. Boxing and Collections

This is the first post in the series.

Many developers today are familiar with the performance profiling workflow: you run an application under the profiler, measure the execution times of methods, identify methods with high ‘own time,’ and work on optimizing them. This scenario, however, does not cover one important performance aspect: the time distributed among the numerous garbage collections in your app. Of course, you can evaluate the total time required for GC, but where does it come from, and how can you reduce it? ‘Plain vanilla’ performance profiling won’t give you any clue about that.

Garbage collections always result from high memory traffic: the more memory is allocated, the more must be collected. As we all know, memory traffic optimization should be done with the help of a memory profiler. It allows you to determine how objects were allocated and collected and which methods stand behind these allocations. Looks simple in theory, right? In practice, however, many developers end up with the words, “Okay, so some traffic in my app is generated by some system classes whose names I’m seeing for the first time in my life. I guess this could be because of some poor code design. What do I do now?”

This is what this post is about. Actually, this will be a series of posts where we share our experience of memory traffic profiling: what we consider ‘poor code design,’ how to find its traces in memory, and, of course, what we consider best practices.* Here’s a simple example: if you see objects of a value type in the heap, boxing is surely to blame. Boxing always implies an additional memory allocation, so removing it is very likely to improve your app’s performance.

The first post in the series will focus on boxing. Where to look and how to act if a ‘bad memory pattern’ is detected?

*Best practices described in this series allowed us to increase the performance of certain algorithms in our .NET products by 20%-50%.

What Tools You Will Need

Before we go any further, let’s look at the tools we’ll need. The list of tools we use here at JetBrains is pretty short:

  • dotMemory memory profiler.
    The profiling algorithm is always the same regardless of the issue you’re trying to find:

    1. Start profiling your application with memory traffic collection enabled.

    2. Collect a memory snapshot after the method or functionality you’re interested in finishes working.

    3. Open the snapshot and select the Memory Traffic view.

  • ReSharper plugin called Heap Allocations Viewer. The plugin highlights all places in your code where memory is allocated. This is not a must, but it makes coding much more convenient and in some sense ‘forces’ you to avoid excessive allocations.

Boxing

Boxing is converting a value type to the object type. For example:

Boxing example
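Since the code screenshot is not reproduced here, a minimal sketch of what boxing looks like:

```csharp
int i = 42;          // value type: the value lives on the stack
object boxed = i;    // boxing: a new object is allocated on the managed heap
int j = (int)boxed;  // unboxing: the value is copied back to the stack
```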

Why is this a problem? Value types are stored on the stack, while reference types (object) are stored in the managed heap. Therefore, to assign an integer value to an object, the CLR has to take the value from the stack and copy it to the heap. Of course, this movement impacts app performance.

How to Find

With dotMemory, finding boxing is an elementary task:

  1. Open a memory snapshot and select the Memory Traffic view.
  2. Find objects of a value type. All these objects are the result of boxing.
  3. Identify methods that allocate these objects and generate a major portion of the traffic.

Boxing shown in dotMemory

The Heap Allocations Viewer plugin also highlights allocations made because of boxing.

Boxing shown by the HAV plug-in

The main concern here is that the plugin shows you only the fact of a boxing allocation. But from the performance perspective, you’re more interested in how frequently this boxing takes place. E.g., if the code with a boxing allocation is called once, then optimizing it won’t help much. Taking this into account, dotMemory is much more reliable in detecting whether boxing causes real problems.

How to Fix

First of all: before fixing the boxing issue, make sure it really is an issue, i.e., that it generates significant traffic. If it does, your task is clear-cut: rewrite your code to eliminate boxing. When you introduce some struct type, make sure that the methods that work with this struct don’t convert it to a reference type anywhere in the code. For example, one common mistake is passing variables of value types to methods that work with strings (e.g., String.Format):

Fixing boxing

A simple fix is to call the ToString() method of the appropriate value type:

Fixing boxing 2
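As the screenshots are not reproduced here, a minimal sketch of the problem and the fix:

```csharp
int i = 5;

// Before: i is passed as object to the params object[] overload
// of string.Format, so the int is boxed on every call
string before = string.Format("i = {0}", i);

// After: ToString() is called on the value type directly, so no
// boxing occurs; only the resulting string is allocated
string after = string.Format("i = {0}", i.ToString());
```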

Resizing Collections

Dynamically-sized collections such as Dictionary, List, HashSet, and StringBuilder have the following specifics: when the collection size exceeds its current bounds, .NET resizes the collection and copies its entire contents to a newly allocated chunk of memory. Obviously, if this happens frequently, your app’s performance will suffer.
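A small sketch of this growth behavior using List&lt;int&gt; (the exact growth factor is an implementation detail, but current .NET doubles the backing array):

```csharp
using System.Collections.Generic;

var list = new List<int>(4);      // backing array of length 4
int initialCapacity = list.Capacity;

for (int n = 0; n < 5; n++)
    list.Add(n);                  // the 5th Add exceeds the bounds

// A new, larger backing array was allocated and the items copied
int grownCapacity = list.Capacity;
```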

How to Find

The insides of dynamic collections can be seen in the managed heap as arrays of a value type (e.g., Int32 in the case of Dictionary) or of the String type (in the case of List). The best way to find resized collections is to use dotMemory. For example, to find out whether Dictionary or HashSet objects in your app are resized too often:

  1. Open a memory snapshot on the Memory Traffic view.
  2. Find arrays of the System.Int32 type.
  3. Find the Dictionary<>.Resize and HashSet<>.SetCapacity methods and check the traffic they generate.

Finding resized Dictionary in dotMemory

The workflow for List collections is similar. The only difference is that you should check the System.String arrays and the List<>.SetCapacity method that creates them.

Finding resized List in dotMemory

In the case of StringBuilder, look for System.Char arrays created by the StringBuilder.ExpandByABlock method.

Finding resized StringBuilder in dotMemory

How to Fix

If the traffic caused by the ‘resize’ methods is significant, the only solution is to reduce the number of cases when a resize is needed. Try to predict the required size and initialize the collection with this size or larger.

Predicting collection size
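Since the screenshot is not reproduced here, a minimal sketch of such pre-sizing (the element count is illustrative):

```csharp
using System.Collections.Generic;

const int expected = 10_000;

// Without a capacity hint, the internal arrays are resized
// repeatedly as the dictionary grows
var resized = new Dictionary<int, string>();

// With a capacity hint, the internal arrays are allocated once,
// so no resizes occur while the dictionary is being filled
var preSized = new Dictionary<int, string>(expected);

for (int n = 0; n < expected; n++)
    preSized[n] = n.ToString();
```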

In addition, keep in mind that any allocation greater than or equal to 85,000 bytes goes on the Large Object Heap (LOH). Allocating memory in the LOH has some performance penalties: as the LOH is not compacted, some additional interaction between the CLR and the free list is required at the time of allocation. Nevertheless, in some cases allocating objects in the LOH makes sense, for example, for large collections that must live for the entire lifetime of an application (e.g., a cache).

Enumerating Collections

When working with dynamic collections, pay attention to the way you enumerate them. A typical headache here is enumerating a collection with foreach when all that is known about it is that it implements the IEnumerable interface. Consider the following example:

Enumerating collections example

The list in the Foo method is cast to the IEnumerable interface, which results in boxing of the enumerator.
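The original code image is not reproduced here; a minimal sketch of such a Foo method (the body is assumed for illustration) could be:

```csharp
using System.Collections.Generic;

// Foo sees only IEnumerable<string>, so the foreach calls
// IEnumerable<string>.GetEnumerator(), which returns the interface
// type IEnumerator<string>. List<string>.Enumerator is a struct,
// so it is boxed into a heap object on every call.
static int Foo(IEnumerable<string> items)
{
    int count = 0;
    foreach (string s in items)   // boxed enumerator allocated here
        count++;
    return count;
}

var list = new List<string> { "a", "b", "c" };
int n = Foo(list);   // the List<string> is passed as IEnumerable<string>
```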

How to Find

As with any other boxing, the described behavior can be easily seen in dotMemory.

  1. Open a memory snapshot and select the Memory Traffic view.
  2. Find the System.Collections.Generic.List+Enumerator value type and check the traffic it generates.
  3. Identify the methods that originate these objects.

Finding enumerators using dotMemory

As you can see, a new enumerator was created each time we called the Foo method.

The same behavior applies to arrays as well. The only difference is that you should check traffic for the SZArrayHelper+SZGenericArrayEnumerator<> class.

Finding array enumerators using dotMemory

The Heap Allocations Viewer plugin will also warn you about hidden allocations:

HAV plug-in warning about enumerator allocation

How to Fix

Avoid casting a collection to an interface. In our example above, the best solution would be to create a Foo method overload that accepts the List<string> collection.

Fixing excessive enumerator allocations
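A minimal sketch of such an overload (the body is assumed, not taken from the original screenshot):

```csharp
using System.Collections.Generic;

// The overload takes List<string> directly, so the foreach uses
// List<string>.Enumerator, a struct: nothing is allocated on the
// heap when the list is enumerated.
static int Foo(List<string> list)
{
    int count = 0;
    foreach (string s in list)   // struct enumerator, stack only
        count++;
    return count;
}

var data = new List<string> { "a", "b", "c" };
int n = Foo(data);
```

If an IEnumerable&lt;string&gt; version must be kept for other callers, the List&lt;string&gt; overload can coexist with it at class scope, and overload resolution will pick the more specific one for List&lt;string&gt; arguments.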

If we profile the code after the fix, we’ll see that the Foo method doesn’t create enumerators anymore.

Memory traffic after the fix

In the next installment of this series, we’re going to take a look at the best approaches for working with strings. Stay tuned!


Get dotMemory

This entry was posted in dotMemory Tips&Tricks, How-To's. Bookmark the permalink.

23 Responses to Unusual Ways of Boosting Up App Performance. Boxing and Collections

  1. Pingback: The Morning Brew - Chris Alcock » The Morning Brew #1649

  2. Awesome article covering essential topics that too few developers are aware of!

  3. Richard Moss says:

    Thanks for the helpful article. I’ve been using HeapView for a while now, and I do find it forces me to rethink some of the “gung-ho” closure LINQ statements I write and to cut down on boxing (of course, if you follow your advice above and use ToString() in string.Format, then ReSharper will tell you it’s not needed (if you don’t specify an IFormatProvider, anyway), so you can’t win :) ).

    I had noticed the constant complaints about enumerators without knowing why – interesting to see a concrete example.

    I mostly focus purely on performance profiling, I should spend more time getting to grips with memory profilers (which seem to go out of their way to be convoluted and unhelpful).

  4. Pingback: Dew Drop – July 11, 2014 (#1812) | Morning Dew

  5. Chris Staley says:

    Is there any plan to work the results of these findings into the refactoring recommendations made by ReSharper? Specifically, the fixes listed under the “Boxing” and “Enumerating Collections” sections contradict what ReSharper recommends. In the former case, it will tell you that calling ToString() is redundant, and in the latter case it would encourage you to use IEnumerable for the parameter type. Based on your results, it sounds like ReSharper should instead be encouraging developers to write their code as listed here instead.

    • ^ THIS

      After installing the Heap Allocations Viewer plugin, this is driving my OCD brain nuts!

    • Alexey Totin says:

      First of all, thanks for pointing this out.
      In most cases, the answer is – apply ReSharper corrections as this will make your code easier to read. The suggested fixes are needed ONLY in case a particular code generates significant traffic.
      Actually, I try to emphasize this point in this and upcoming posts – before fixing, use memory profiling to determine whether a certain issue is really an issue. No premature optimization!

      • Chris Staley says:

        I forgot to say that I really enjoyed this post and I’m looking forward to reading the rest of the series.

        I’m also a passenger on the “don’t prematurely optimize” train, and I frequently sell tickets to it as well. However, I think what we have here is a little different since we’re not talking about introducing potentially unneeded complexity and/or making the code harder to read/maintain. In fact, that’s the beauty in your fixes: they’re so simple to implement. All things being equal (and perhaps even if some are not), I would think that ReSharper should choose the most performant path by default when making refactoring recommendations. Imagine if your algorithms had been 20%-50% more performant from Day 0.

        • Sergey Shkredov says:

          I think that combining run-time information and static analysis is a way to go for performance profilers. As an alternative an attribute based approach was discussed here: https://roslyn.codeplex.com/discussions/544763 where code may be marked with [PerformanceCritical] attribute and thus affect code inspections.

        • controlflow says:

          R# suggestions are all about conciseness and simplicity of your code, all about composability (LINQ) and increasing level of abstraction (use IEnumerable when possible).

          HeapView plugin knowledge highly rely on how CLR and C# compiler implemented today, but not tomorrow. For example, next CLR version will solve enum boxing when calling .ToString() on the JIT side (saw this somewhere on BCL team blog). Roslyn will change something else for sure.

          R# transformations only rely on semantics of code, having no knowledge on how particular code snippet performs. And it feels perfectly right! In 99% of cases calling .ToString() over value type to reduce boxing is useless for overall performance, in other 1% cases profiler will show you the issue.

      • Andir says:

        I believe that such “performance-hurt” suggestions should be an option (“do not suggest if performance degradation is possible”), just because they’re so easy to implement and someone on the team can accidentally “optimize” some code for “readability”.

        • controlflow says:

          The only problem is you cannot really know how a particular suggestion affects the performance of code. Boxing may be eliminated by the compiler or the JIT; we have no knowledge about this.

  6. Zack Peine says:

    Thanks for the article, very useful information to know.

    In your example on boxing you say:
    don’t use int i = 5;
    string.Format(“i = {0}”, i);
    instead use string.Format(“i = {0}”, i.ToString());

    I have stopped using your suggested approach through the course of using Resharper though because it calls the .ToString() redundant. Any idea why .ToString() in that case isn’t the suggested approach via Resharper?

  7. Dhiraj Soude says:

    Thanks for such a nice explanation. looking ahead for more from you..

  8. RS says:

    > String.Format(“i = {0}”, i);
    vs
    > String.Format(“i = {0}”, i.ToString());

    So you save a heap allocation where “i” gets boxed… by allocating a string instead? How does this save anything?

    I bet you if you actually measure these, the latter will be slower while putting just as much pressure on the GC as the former.

    • Alexey Totin says:

      The string is allocated in both cases. But the first example additionally allocates one Int32 object in the heap

  9. Steve says:

    I find this article very contradictory. I love ReSharper and could not do without it; however, I have always thought it odd that it complains whenever you call .ToString() on a value in something like string.Format, stating that the call to .ToString() is unnecessary. Even though this is true, the reason to use it is, as you state above, that if you pass a value to string.Format or similar and don’t call .ToString(), and the value is a value type, then a boxing operation will occur. Knowing this, why does ReSharper advise that the call to ToString() is unnecessary? I have recently been convinced that this kind of thing is a micro-optimisation, and as such the warning is a valid call, so that’s why I find it odd that you talk about it now.

    • Alexey Totin says:

      The best explanation to your concern is given here.
      Everything we talk about in this article is a micro-optimization. The only way to understand whether you need a certain fix is to profile your app. That’s why memory profiling on a regular basis is a good idea.

  10. Pingback: Reading Notes 2014-07-21 | Matricis
