Dotnet logo

.NET Tools

Essential productivity kit for .NET and game developers

How-To's

ReSharper SDK Adventures Part 7 — Factoring expressions

Welcome to another part of the SDK adventures. Most of the sample plugins shown so far have been fairly simple, with implementation times ranging from a few minutes to an hour, at most. In this part of the series I want to show what developing a fully-fledged, complicated feature is like.

Since complex features take a long time to develop, this post will only show the initial phases of development of a factoring context action that we will later refine and improve. Let’s get started!

Factoring

When writing numeric expressions, you may come accross something like the following:

The above expression is worth analyzing because it’s not perfectly efficient: it’s doing one more multiplication than is needed to get the result. Here’s a better form:

Wouldn’t it be great if ReSharper could help factor out terms in similar expressions? Of course it would, and this is what we’re going to do here. However, as it turns out, this problem is a lot more complicated than it looks.

Checking Availability

If you open up an expression such as the one above in ReSharper’s PSI Viewer (you’ll need to be running in Internal mode for this), you’ll see that it’s basically a combination of IBinaryExpressions of type IAdditiveExpression and IMultiplicativeExpression. The operator precedence has already been applied, but there’s a caveat that additive expressions represent either addition or subtraction, whereas multiplicative ones represent either multiplication or division. We’ll ignore this distinction for now, as a proper solution for all of these is even more complicated than what you’re about to see.

Given that an expression such as a+b+c is really represented as (a+b)+c, we have good reason to implement a flattening algorithm that turns a binary tree into a flat list:

We can now think about checking the applicability of our context action. The analysis is deep, but it begins with finding the ‘root’ additive expression:

At this point, we need to do two things:

  • Flatten the top-level addition so that (a+b)+c becomes just a+b+c. In fact, we keep it as a simple list.

  • Do the same thing for the inner multiplications, so that (a*b)*c can be treated as just a*b*c.

The actual implementation uses our Flatten<T>() function as well as a special FlattenMultiplications() function.

Flattening multiplications is a similar operation that gives us even better data: it returns a list-of-list-of-expressions, i.e. data that we can work with to be able to actually analyze the commonality of terms:

What we do now is build a frequency table of the terms — what they are, where they appear and how often. For example, for the expression a*x*x + b*x + c we would get a table such as:

a x b c
1 1 2 0 0
2 0 1 1 0
3 0 0 0 1

The implementation is not particularly difficult:

Having the terms in a histogram of sorts, we find out the one that’s used most:

Once we have the term, we find out how many times we’re going to extract if (for example, in a*x*x*x + b*x*x it needs to be taken out twice), and we can finally return true. Note that we also keep a record of affected terms, because even a term with a zero in it may either be a multiplicand of the term we’ve extracted, or not.

Applying the Action

The first thing we output is the multiplier term and the opening brace to our subexpression:

Then, we set up a lambda that outputs the sum-of-products in a neat fashion:

And then, of course, we generate the new expression and replace the old one:

Seeing it in Action

Let’s try a simple example. Say we have this method that computes a polynomial:

Executing the factoring action once, we get

Firing it again within the braces we get

And this is what our action is ultimately about.

Conclusion

Our approach works, but suffers from a few methodological deficiencies, namely

  • We currently do not handle subtraction and division well

  • We can only factor out one variable

  • The algorithm is not recursive, so we have to invoke it several times to fully optimize the expression

In the next part of the series, we’ll take a look at fixing these issues. Meanwhile, check out the source code and stay tuned for more! ■

image description