Unusual Ways of Boosting Up App Performance. Strings

This is the second post in the series.

This post focuses on best practices for working with strings.

Changing String Contents

String is an immutable type, meaning that the contents of a string object cannot be changed. When you change string contents, a new string object is created. This fact is the main source of performance issues caused by strings. The more you change string contents, the more memory is allocated. This, in turn, triggers garbage collections that impact app performance. A relatively simple solution is to optimize your code so as to minimize the creation of new string objects.

How to Find

Check all string instances that are created not by your code directly, but by methods of the String class. The most obvious example is the String.Concat method, which creates a new string each time you combine strings with the + operator.

To do this in dotMemory:

  1. In the Memory Traffic view, locate and select the System.String class.

  2. Find all methods of the String class that create the selected strings.

Consider an example of a function that reverses strings:

[Image: Reversing strings example]
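The original listing is not reproduced in this text. As a rough sketch of the kind of function being described (the class and variable names are assumptions, not the author's code), a naive reversal that appends one character at a time might look like this:

```csharp
using System;

public static class NaiveStringUtils
{
    // Hypothetical reconstruction: builds the result one character at a time.
    public static string Reverse(string line)
    {
        char[] chars = line.ToCharArray();
        string result = string.Empty;

        for (int i = chars.Length - 1; i >= 0; i--)
        {
            // += on a string compiles to a String.Concat call,
            // which allocates a brand-new string on every iteration.
            result += chars[i];
        }

        return result;
    }
}
```

Every pass through the loop produces a new, slightly longer string and discards the previous one, so the total amount of allocated memory grows roughly quadratically with the input length.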

An app that uses this function to reverse a 1000-character line generates enormous memory traffic (more than 5 MB of allocated and collected memory). A memory snapshot taken with dotMemory reveals that most of the traffic (4 MB of allocations) comes from the String.Concat method, which, in turn, is called by the Reverse method.

[Image: Traffic from string objects shown in dotMemory]

The Heap Allocations Viewer plug-in will also warn you about allocations by highlighting the corresponding line of code:

[Image: The HAV plug-in highlights string concatenation]

How to Fix

In most cases, the fix is to use the StringBuilder class or handle a string as an array of chars using specific array methods. Considering the ‘reverse string’ example, the code could be as follows:

[Image: String concatenation code fix]
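A minimal sketch of both options mentioned above, the char array and StringBuilder. The names StringUtils, Reverse and ReverseWithBuilder are illustrative, not taken from the original listing:

```csharp
using System;
using System.Text;

public static class StringUtils
{
    // Option 1: treat the string as a char array and reverse it in place.
    // Allocates one array and one result string, regardless of length.
    public static string Reverse(string line)
    {
        char[] chars = line.ToCharArray();
        Array.Reverse(chars);
        return new string(chars);
    }

    // Option 2: accumulate characters in a StringBuilder whose capacity
    // is set up front, so its internal buffer does not need to grow.
    public static string ReverseWithBuilder(string line)
    {
        var builder = new StringBuilder(line.Length);
        for (int i = line.Length - 1; i >= 0; i--)
        {
            builder.Append(line[i]);
        }
        return builder.ToString();
    }
}
```

For a plain reversal the array version is the shorter of the two. StringBuilder is the more general tool when the result is assembled from pieces of varying size.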

dotMemory shows that traffic dropped by over 99% after the fix:

[Image: Memory traffic in dotMemory after the fix]

Improving Logging

When seeking ways to optimize your project, take a look at the logging subsystem. In complex applications, for the sake of stability and support convenience, almost all actions are logged. This results in significant memory traffic from the logging subsystem. That’s why it is important to minimize allocations when writing messages to the log. There are multiple ways to improve logging.*

*Actually, the optimization approaches shown in this section are universal. The logging subsystem was taken as an example because it works with strings most intensively.

Empty Arrays Allocation

A typical LogMessage method looks as follows:

[Image: LogMessage method example]

What are the pitfalls of such an implementation? The main concern is how you call this method. For example, the call

[Image: Logging call 1]

will cause the allocation of an empty array. In other words, this line is equivalent to

[Image: Logging call 2]
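The listings are not reproduced here, so the following is only a sketch. It assumes that LogMessage collects its arguments in a params object[] array; ParamsLogger, ParamsLoggerDemo and the message text are hypothetical:

```csharp
using System;

public static class ParamsLogger
{
    // Assumed shape of a "typical" logging method: a format string plus
    // any number of arguments, gathered into an array by the compiler.
    public static void LogMessage(string message, params object[] args)
    {
        Console.WriteLine(string.Format(message, args));
    }
}

public static class ParamsLoggerDemo
{
    public static void Run()
    {
        // No extra arguments, yet the compiler still has to pass an array.
        ParamsLogger.LogMessage("Application started");

        // What the call above amounts to on compilers that allocate
        // a fresh empty array for an empty params list.
        ParamsLogger.LogMessage("Application started", new object[0]);
    }
}
```

Newer C# compilers may pass a cached Array.Empty<object>() instead of allocating, so it is worth confirming the allocation in a profiler rather than assuming it.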

How to Find

These allocations would be difficult to detect manually in a memory snapshot, but you can use the Heap Allocations Viewer plug-in to find them very quickly:

[Image: Empty array creation shown in the HAV plug-in]

How to Fix

The best solution is to create a number of method overloads with explicitly specified arguments. For instance:

[Image: Logging fix example]
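One way this could look, as a sketch with hypothetical names rather than the original listing: fixed-arity overloads cover the common cases, and the params version stays as a fallback for longer argument lists.

```csharp
using System;

public static class OverloadedLogger
{
    // No arguments: nothing to format, no array needed.
    public static void LogMessage(string message)
    {
        Console.WriteLine(message);
    }

    // One and two explicit arguments: the call site passes them directly.
    public static void LogMessage(string message, object arg0)
    {
        Console.WriteLine(string.Format(message, arg0));
    }

    public static void LogMessage(string message, object arg0, object arg1)
    {
        Console.WriteLine(string.Format(message, arg0, arg1));
    }

    // Fallback for three or more arguments; only these calls build an array.
    public static void LogMessage(string message, params object[] args)
    {
        Console.WriteLine(string.Format(message, args));
    }
}
```

With these in place, a call such as OverloadedLogger.LogMessage("Application started") binds to the single-parameter overload, so no array is created at the call site.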

Hidden Boxing

The implementation above has a small drawback. What if you pass a value type to, say, the following method?

[Image: Hidden boxing example 1]

For example:

[Image: Hidden boxing example 2]

Because the method accepts only an object argument, which is a reference type, the value is boxed.
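As a sketch of the situation (BoxingLogger, BoxingDemo and the sample values are assumptions), passing an int where object is expected makes the compiler box the value at the call site:

```csharp
using System;

public static class BoxingLogger
{
    public static void LogMessage(string message, object arg0)
    {
        Console.WriteLine(string.Format(message, arg0));
    }
}

public static class BoxingDemo
{
    public static void Run()
    {
        int itemId = 42;

        // itemId is a value type. Converting it to object copies it into
        // a new heap object (boxing) before the method is even entered.
        BoxingLogger.LogMessage("Processed item {0}", itemId);
    }
}
```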

How to Find

As with any other boxing, the main clue is a value type on the heap. So, all you need to do is look at the memory traffic and find a value type. In our case this will look as follows:

[Image: Hidden boxing shown in dotMemory]

Of course, the Heap Allocations Viewer will also warn you:

[Image: Hidden boxing warning in the HAV plug-in]

How to Fix

The easiest way is to use generics, a mechanism that defers the specification of a type until the method is used by client code. Thus, the revised version of the LogMessage method should look as follows:

[Image: Hidden boxing code fix]
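A hedged sketch of such a generic version (GenericLogger is a hypothetical name). The body matters as well: forwarding arg0 straight to String.Format(string, object) would simply box it there, so this sketch converts it to a string first.

```csharp
using System;

public static class GenericLogger
{
    public static void LogMessage<T>(string message, T arg0)
    {
        // Calling ToString() through T is a constrained call, so value types
        // that override ToString (int, double, ...) are not boxed here.
        // Assumption: arg0 is not null; a null reference would throw.
        string text = arg0.ToString();

        // text is already a reference type, so passing it as object
        // to String.Format does not box anything.
        Console.WriteLine(string.Format(message, text));
    }
}
```

A call such as GenericLogger.LogMessage("Processed item {0}", 42) infers T as int, so the value reaches the method without being converted to object.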

Early String Allocation

The advice to defer variable allocation as much as possible is quite obvious. Still, sometimes stating the obvious is useful.

Consider the code below. Here the logmsg string is created regardless of whether logging is turned on or off:

[Image: Deferring string allocation]

A better solution would be:

[Image: Deferring string allocation fix]
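Both variants can be sketched as follows. The LoggingEnabled flag, the method names and the message text are hypothetical; logmsg is the variable named in the text.

```csharp
using System;

public static class DeferredLogging
{
    // Hypothetical switch that turns logging on or off.
    public static bool LoggingEnabled;

    // Eager variant: the string is built even when nothing is logged.
    public static void ProcessEager(int itemId)
    {
        string logmsg = "Processing item " + itemId;
        if (LoggingEnabled)
        {
            Console.WriteLine(logmsg);
        }
    }

    // Deferred variant: the string is built only inside the guarded branch.
    public static void ProcessDeferred(int itemId)
    {
        if (LoggingEnabled)
        {
            string logmsg = "Processing item " + itemId;
            Console.WriteLine(logmsg);
        }
    }
}
```

With logging turned off, ProcessDeferred allocates no string at all, while ProcessEager still pays for the concatenation on every call.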

Excessive Logging

If you use logging for debugging purposes, make sure log calls never reach the release build. You can do this by using the [Conditional] attribute.

In the example below, the LogMessage method will be called only if the DEBUG compilation symbol is defined.

[Image: Conditional attribute example]
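A minimal sketch using System.Diagnostics.ConditionalAttribute (DebugLogger is a hypothetical name, not the original listing):

```csharp
using System;
using System.Diagnostics;

public static class DebugLogger
{
    // Calls to this method are compiled only when the DEBUG symbol is
    // defined. A conditional method must return void.
    [Conditional("DEBUG")]
    public static void LogMessage(string message)
    {
        Console.WriteLine(message);
    }
}
```

When DEBUG is not defined, the compiler omits the calls to LogMessage together with the evaluation of their arguments, so a message string built inline in the call is never allocated.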

That does it for this post. In the next one, we’ll talk about the nuances of using lambda expressions and LINQ queries. To stay tuned, follow @dotMemory on Twitter or the Google+ product page!


5 Responses to Unusual Ways of Boosting Up App Performance. Strings

  1. Pieter van Ginkel says:

    These all look like very good suggestions, except for the generic LogMessage. Chances are very high that the parameter is passed directly on to string.Format. My guess is that method overload resolution will pick the object typed overload of string.Format and always box the argument. In that case, you’re just moving the box operation and the only thing you’ll have accomplished is that the code is now more complex and you’re generating garbage for all instantiations of the generic method.

    The solution is to create overloads that match overloads of the method you’re passing the arguments to, if there even are any.

    • Alexey Totin says:

      Not sure whether I get you right but this doesn’t depend on method insides. For example:

      ...
      LogMessage("message", 123);
      ...

      One boxing when calling String.Format:

      void LogMessage<T>(string message, T arg0)
      {
          String.Format("{0} {1}", message, arg0);
      }

      No boxing at all:

      void LogMessage<T>(string message, T arg0)
      {
          String.Format("{0} {1}", message, arg0.ToString());
      }

      One boxing when calling LogMessage, no boxing when calling String.Format (variable is already on the heap):

      void LogMessage(string message, object arg0)
      {
          String.Format("{0} {1}", message, arg0);
      }

      One boxing when calling LogMessage, no boxing when calling String.Format:

      void LogMessage(string message, object arg0)
      {
          String.Format("{0} {1}", message, arg0.ToString());
      }

  4. For modifying string content, using pointers is also efficient. Of course, this breaks string immutability, and developers need to be seasoned enough to deal with that.

    http://philosopherdeveloper.com/posts/are-strings-really-immutable-in-net.html
