Code Smells: If Statements

This is part of a series investigating code that looks suspicious (“code smells”), and exploring possible alternatives.

The article in this series that has so far provoked the most responses was on iteration.  Today, I’m talking about if statements (conditionals). I’m not intending to go after any sacred cows or anything – for loops and if statements are pretty much the first things we learn in many programming languages.  They’re a fundamental part of languages like Java.  But just because these basic building blocks exist doesn’t mean they should be our goto strategy (see what I did there?) every time.  We could program in assembler but we don’t. We could use a hammer for every job around the house, but we don’t (well…).

If statements are very useful, and often the right tool for the job, but they can also be a smell.  Remember that by “smell” I mean “code that you should look carefully at to see if there’s a simpler/easier to understand, approach”. I don’t mean “if statements are evil never use them”.

In the last two articles we’ve been looking at the smells from a single method, validateQuery.  You may have noticed we’ve been ignoring the series of ifs that make up the main body of the method.  Now seems like a good time to turn our attention to them.  We’re going to start with (more or less) the version of this method that has already benefited from the previous two refactoring blog posts. I’ve made some minor changes though to make the refactoring in this article a bit simpler:

If you remember, the thing that originally drew my attention to this method was the multiple isPresent checks that are the result of wrapping the MappedField mf in an Optional.  I thought Optional was going to solve all my problems, and it didn’t.  But of course the problem was not the introduction of Optional – this code originally had null checks in every place we now check isPresent(), so I should have been asking myself, why do we need to check if something is null so many times?  Mutability was causing some of the confusion, so now we’ve simplified the method to minimise mutability, let’s take a closer look at the rest of the code.

The Smell: If Statements

Looking at the code above you can see 11 different if statements, many of which check more than one condition. Two contain a break, one a return.  There’s only one else, which actually follows one of the if statements with a break, making it almost entirely pointless.  By the end of the method it’s possible that you’ve visited more than one of the conditional branches, and it’s hard to reason about which ones, why, and what that means.

The problem with this code is that it’s actually doing two things – it’s (possibly) validating the query (based on the validateNames parameter value), and it’s also traversing the String property path and converting this into a domain object, the MappedField that represents the element at the end of this path.  I can see why these two things are conflated, as it means iterating over the path only once, but it a) requires a flag to decide whether to do the validation bits and b) makes it really hard to tell which parts are required for validation, which parts are required for navigating the path and which parts are required for obtaining the relevant information to get the appropriate MappedField. On top of that, some of the conditionals are used to control the looping behaviour as well.

Step 1: Place Guard Conditions at the Start

Some of these if statements are controlling the flow – returning from the method if possible.  What I would like to do is move these as early in the method as possible, so my brain can spot these special conditions and then ignore them for the rest of the method.

The first of these is straightforward:

I want to move the guard condition on clazz right to the beginning, outside of the first if.  There are a couple of reasons: 1) all the stuff that happens before the check just isn’t needed for this null check, it will work without them (and we don’t incur the performance hit of String parsing when the clazz is null and doesn’t need to do it anyway) 2) it seems like the right thing to do to check the param values and exit as fast as possible before getting into any further logic.

You have to do a quick check yourself to make sure this doesn’t actually change the behaviour of the method, because potentially the method may have been returning two different values in the case of a null clazz: one value when the property path starts with $ and one where it doesn’t. If we look at the whole method, we return our brand new validatedField in both cases, so it’s fine to move this check right to the top:

While I’m here, I’m going to do another, unrelated refactoring.  What does is mean when the propertyPath starts with a dollar?  Why does that matter?  We could add a comment to this if statement to clarify its meaning, but my preference is usually to wrap these conditions in a method that returns a boolean, and give it a useful name:

MongoDB paths that start with a dollar sign are usually some sort of operation, not a path to a field, so this method is simply ignoring operators and only cares about fields.  Actually if I preferred, I could better reflect this in the code by pushing the “not” into the method and calling it isFieldPath, if I think that’s more understandable.  This is totally a personal choice.

Step 2: Remove Logic for Controlling the Iteration

The abuse of the for loop syntax in this code pains me:

I want to turn this into a proper for loop, or possibly a while. In a different version, I created my own enumeration to manage the looping.  It’s not at all obvious from the code (and there’s certainly no magic refactoring in IntelliJ IDEA) how to fix this problem.  Moving directly to:

is not going to work, because we manage the increment of i inside the method.  So we have to identify the code that manages the looping, and replace those pieces with something that will work in a traditional loop.

This section right in the middle of the code is the bit that manipulates the index and decides when the loop is finished.  Once we’ve identified that, we can work out which bits to remove or change.

The i++ can be replaced with the i++ in the traditional for loop.  But if we remove this line and rely on the loop increment, i is going to be incremented at the end of the loop, not halfway through, so we need to take that into account and make appropriate changes.

02-index-after-increment

If we want to remove the increment, anywhere that was using the value of i after line 76 needs to use (i+1), because that’s what the previous value effectively was.  This is only in one place (line 82, see above), so it’s fairly easy.

With these changes, the tests should all still pass (and they do, fortunately).

Everything after the break statement happens only if there are more elements to process after this current one.  That’s not really obvious from the code as it is, but if we flip the if statement

We get

Now we’re using a proper for loop with a guard condition, we don’t need the break any more (especially as it’s right at the end of the method).  The for loop itself will make sure don’t go past the end of the array.  So we can remove this else statement completely.

Step 3: Extract Method for Readability

As before, now I’m reasonably happy with the state of one of these conditional I’m going to extract the condition into a small method with a useful name so I can better understand what I’m checking.

Now our code is a bit easier to understand. We’re using a traditional for loop to iterate over the array, so we removed the logic around incrementing the index and guarding against going past the end of the array.  The only minor complication around this is the intentional additional increment to skip particular values, but this is a bit easier to see and understand now we’ve removed all the other index management.

Step 4: Replace Multiple Checks of Value

Several if statements check the same condition, so what I would like to do is see if I can evaluate this condition separately, just once, to simplify the reasoning.

06-isArrayOperator-usages

We see fieldIsArrayOperator is checked twice, and both times we care about if it is not an array operator.  My question, therefore, is what do we do, if anything, when it is an array operator?  Here we actually need to either go into depth digging through the code, or apply some domain knowledge.

Line 65:

Don’t run this code if it’s an array operator.

Line 82

Don’t run this if it’s an array operator.

The only code that gets executed if the field name is an array operator is:

With research, testing and/or domain knowledge we can figure out that mf is always going to be empty when the fieldName is “$”, and that if it’s an array operator it’s not going to be a Map.  Therefore I can surmise that zero code gets executed if fieldName is an array operator.

That means I can shortcut all the code under the circumstances that fieldIsArrayOperator is true:

Basically if the fieldName is a dollar sign, skip this and move on to the next. Here I have to run all the tests again to check my hypothesis is correct (they pass, which is good). Now I can do some cleanup.  IntelliJ IDEA confirms that this has achieved what I wanted – now I’m checking for the value right at the beginning, I can remove the check from later statements (lines 68 and 85).

Like before, I want this if statement to call a usefully-named method, so I’m going to replace the variable name with a method call instead.

Step 5: Extract getMappedField Method (again)

The sharp-eyed amongst you will notice that this code does not include the refactoring from the mutable values post, where I extracted a getMappedField method to constrain the mutation of mf to a single method.  I did this because I wanted to be able to simplify the conditions inside that method too.  Now we’ve done that (we removed fieldIsArrayOperator), we can re-extract this method. I’ve decided to give it a different name to last time to reflect what’s really going on, it’s now getAndValidateMappedField.

I still don’t like how many parameters this method has, but we’ll look into that later.

Step 6: Collapse two ifs into an if else

The main content of my getAndValidateMappedField method is:

IntelliJ IDEA wants me to collapse the last one into a “functional” expression, but I hope we can do better than that.  There’s three if statements here and all of them check isPresent.  This seems slightly ridiculous.  Granted, this is the method that contains the mutable mf value, so that’s a contributing factor – we’ll need two isPresent checks because we try getting the value twice.  If we look at the inner ifs though, surely it doesn’t make sense to have one if statement that does something if it’s not present and one that does something if it is – that sounds like an if/else, right?

Now here I might expect IntelliJ IDEA to offer me some magic fix, but I’m assuming the presence of the validateNames check is messing with the IDE’s ability to reason about the code (or: the order of these checks might actually matter under some circumstances).  Here, we can use our ability to reason to figure out that actually these two ifs can be swapped

This should easily translate to an if/else

And we end up with:

This does exactly the same thing, it’s just a bit easier to reason about.  Why? Because 1) we see at a glance at most one of these branches will be executed now it’s an if/else 2) we’ve cut out one of the three isPresent checks so we don’t have to keep reasoning about what it means if it’s not present 3) by working with the positive value (isPresent, rather than !isPresent) first it’s actually easier for humans to process.  It’s also just nice to deal with the exceptional case at the end.

Step 7: Reducing Errors

Going back to the original method, let’s look at the last part of the code again:

11-get-without-check

IntelliJ IDEA tells me, rather alarmingly, that despite all my isPresent checks, my mf.get() here could actually be on an empty field.  Ooops.  The fact that code passes all the tests at this point either means mf really is never empty at this point, or that we don’t have a test that exercises this code path.  I would highly recommend trying to write some tests to get into this code section if this were my production code.

I’m tempted to delete all the code that I can, but lets do the safest thing.  Firstly, we can move the check for interface right to the top

Technically this changes the logic compared to the original version, but it is much safer and if the tests still pass (which they do), it’s probably the right thing to do to check the edge conditions first.

Step 8: Simplify the Logic

I want to simplify all of this. I still hate the multiple checks for isPresent, so is there something we can do here?

Given both the if and the else at the start of this code section return from the method, the rest of the code is effectively inside another else:

We can turn these statements around a little to make it a bit more readable again

This is marginally better because a) we only check isPresent once, and we look at the case in which it IS there first, and deal with the different types of exceptions second and b) again using if/else instead of a series of ifs, we should be able to reason about which branches are executed when – we know only one of the three options is ever executed.

Now we’ve moved all this around a bit we can simplify the code in the isPresent case:

We don’t have so many unpleasant get() calls, and we can see our get() next to the isPresent() check, which is comforting.  I would have liked to use an ifPresent() here, but we can’t combine that with the else code paths.

Final Step: Extract (more) Methods for Readability

Now we’ve been able to simplify the if statements (removing, reordering, flipping etc), we should be reasonably happy with what each of the conditions is checking. At this point, I like to extract methods for all of the conditional logic to document what they do.

Firstly, I’m extracting a method to document what the map condition is checking.  I could leave it as it is:

…but it feels like there must be a more functional way of saying the same thing.  I ended up with:

I’ll be honest, I’m not sold on that being more readable. Of course, it does remove one of the many isPresent() checks, which I guess is a Good Thing, and it is using the Optional in a more functional style, which some people prefer. Which approach you take may be down to team preferences.

Next, I want to clarify what we’re doing here:

I extracted just the second part into a method:

You could put validateNames into the method as well, I chose to keep it out as I would like to keep it visible in the validateQuery method.

The final method looks like:

Summary

We are now down to 47 lines of code (from 62). Some of that is being able to remove code, some is extracting methods, and some is simply because we’ve been able to remove whitespace and comments now the code is a tiny bit more self-explanatory.  It is still a lot more procedural than I like, and new developers to the code will probably still wonder exactly what’s going on, but the method now fits onto a single laptop screen (just!) and is a bit easier for a human to parse.

Note that once again, I have not removed the use of the thing I accused of being a “smell” – we still have if statements (a lot of them actually! But we’ve reduced it from 11 to 8).  The differences are:

  1. We don’t have to clutter our navigation/validation checks with conditionals for managing the looping behaviour, we let a standard for loop take care of that
  2. We use more else statements, so it’s easier to reason around which code branches are executed and when
  3. We’ve moved early-exit code as close to the top of the method as possible, so that we can see those cases and ignore them when we’re looking further down the code
  4. We’ve extracted tiny methods to encapsulate our conditions (once we’d simplified them) so that we understand better what each if statement is really checking, and hopefully why.

Symptoms:

  • Lots of if statements
  • Very few else branches
  • The same / similar conditions appearing in multiple branches (isPresent, fieldIsArrayOperator)
  • Flow control being managed manually

Possible Solutions:

  • Flipping the logic.  Humans find it easier to reason about the positive (rather than negative) values, so try to use positive terms where possible. If a negation is necessary, you might wrap this in a method that’s easier to understand.
  • If a value is being checked regularly (like fieldIsArrayOperator was), it may be easier to reason about it if you pull out just this case and deal with that individually. Then you can remove the check from all the other places.
  • Consider whether an else is going to be more readable than a series of ifs.  Remember that a series of ifs (without else) can potentially lead to all branches being executed, whereas an explicit else shows that this is an either/or.  This is easier for a human to reason about, but it’s also potentially more efficient for the computer too. You may need to consider flipping the logic in order to make use of else statements.
  • Once you’ve done as much as you can for reducing duplicate use of values in the conditions, consider refactoring each condition, even possibly single checks, into its own method with a meaningful name.  This can enormously improve the readability of if statements (see: Extract Method, Decompose Conditional)
  • Cases which are guard cases, basic validation or simple checks for something that might be an exit case could be moved to the top of any method (if this doesn’t impact the method’s overall logic) so that the reader of the code doesn’t have to reason about these cases throughout the method (see: Replace Nested Conditional with Guard Clauses).
  • On the other hand, fall-through exceptional cases may belong at the end of the method or in a final else, so we understand this is a “when all else fails” situation.

This entry was posted in Tips & Tricks and tagged , . Bookmark the permalink.

9 Responses to Code Smells: If Statements

  1. NoMoreForLoops says:

    Hi, very interesting article. Somehow I find the final code to be complex still (though much simpler than the original!).
    Wonder whether it’d be made much simpler still if you used map/remove on the list and didn’t have a for loop at all, e.g., (made-up syntax – too lazy….):
    final String[] pathElements = propertyPath.split(“\\.”);
    final String[] pathNonArrayOperatorElements = remove(pathElements, lambda (x) {isArrayOperator(x);});

    • Trisha Gee says:

      You make a very good point. Sometimes stepping back from the specific solution you have and trying a different approach is better. For these blog posts I’ve generally looked at very small changes you can make to improve things, but I would definitely like to talk about how to step back and see these different approaches. I’ll add another post to the backlog!

  2. Friso says:

    Feel I’m missing something here … why not

    if (clazz == null || isOperator(propertyPath))

    as the first Guard? Likewise negating line 26 could result in a ‘continue’, which would allow for less nesting (but that may be another code smell?)

    • Trisha Gee says:

      Huh, you’re totally right!! I got so caught up in the other conditions I didn’t look to see how to simplify the isOperator guard. Thank you!

      As for the continue… there are definitely lots of other things that can be done here. Some of them are a matter of taste, some are deciding when you’ve had enough and need to move on, and sometimes you do enough to move on to a different smell that may change the code completely. I’ll talk about that in a later post.

  3. Jeremy Bensing says:

    I really enjoyed this series. Thanks so much. The GIFs are really helpful. I’ve been using Intellij for a while now, but have only recently taken the time learn how to get the most out of it. The GIFs coupled with the presentation plugin really helps to illustrate how Intellij can make refactoring a little less painful.

    • Trisha Gee says:

      Thank you so much for your positive feedback. I know the GIFs can be a pain, but I hoped they would help those who aren’t yet aware of some of IntelliJ’s features, so I’m glad you found them useful.

  4. Stefan Rother-Stübs says:

    Enjoyed this series very much so far.

    Generally I would say that is…() methods are a smell for themselves and may be candidates for type/class extraction (the field _is_ an Operator or ArrayOperator) and the use of (good old) polymorphy (I know, I know, everybody’s going functional today, but don’t deny Java’s OO roots).
    But it could hurt readability.

    • Trisha Gee says:

      I would generally agree. In this post, we mostly just simplified the if statements themselves rather than addressing the design smells. I’m sure that there are some missing domain objects in this code, and if we could identify and extract them it would probably remove most of the remaining is… methods/conditionals.

  5. “so my brain can spot these special conditions and then ignore them for the rest of the method”

    And that is precisely why the Java style guide at our company has us try to return early over nesting if statements.

Leave a Reply

Your email address will not be published. Required fields are marked *