Code Smells: Null

During my research on refactoring I’ve seen a number of patterns (smells) crop up again and again. None of these are particularly new, and there are plenty of books, blogs and videos that cover smells and how to deal with them, but I wanted to demonstrate some specific, non-trivial examples and, of course, how IntelliJ IDEA may (or may not) be able to help you.

The first problem I’ve being trying to counter is the use of nulls, particularly when it leads to null-checks scattered around the code.

I thought Java 8’s Optional should solve a lot of these problems. I assumed that a type that specifically states that it may be null, that is designed to let you say what to do in these occasions, is exactly the right solution.

However, things are never as simple as they seem, and I suspect that’s why there’s no magic “Optionalise my project” intention in IntelliJ IDEA.  Rather disappointingly, this is an area that needs me, the developer, to think hard about what should be done.

We have to come to terms with the fact that null means many things.  It can mean:

  1. Value was never initialised (whether accidentally or on purpose)
  2. Value is not valid
  3. Value is not needed
  4. No such value exists
  5. Something went terribly wrong and something that should be there is not
  6. …probably dozens of other things

You can assume some of these don’t apply to your code base if you’re a disciplined team with clear design goals – for example, having final fields that are always initialised in the constructor, using builders or factories to validate correct combinations before instantiation, not using nulls in the core of your application (i.e. pushing all checks to the edges) and so on.

Optional really solves only one of these cases, case 4. For example, you ask the database for a Customer with a given ID.  Previously, you might have used null to represent this, but that could be ambiguous – does it mean the customer wasn’t found? Or that there’s a customer with that ID, but has no values? Or was null returned because the connection to the database failed? By returning an empty Optional, you’ve removed the ambiguity – there’s no customer with that ID.

In a normal, mature, code base that’s been worked on by numerous people, is it possible to tell what all our nulls mean, and what to do about them?  We should be able to make some progress by taking very tiny bites.

The Case Study

I’m using the Morphia project as an example in this blog post, as I have in previous talks and articles about refactoring.  This is a great example project because it’s a) open source b) small enough to easily download and explore and c) mature enough to contain the sort of real life example code you probably have in your own applications.

The Smell: return null

I did a Find in Path for all the places where null is explicitly returned.  I have a theory that maybe we can change these methods into something that returns an Optional.

Find places that return null

Results of “Find in Path” for “return null”

Example 1: Converter.decode()

Given that lots of these *Converter classes seem to return a null value in the decode method, it seems reasonable that we might want to change the Converter superclass (an abstract class called TypeConverter) to return Optional here.  But a closer look at the code shows what’s actually happening: we see exactly the same pattern happening again and again – the method checks if a value passed in was null, and if so returns null:

The first question is can fromDBObject actually be null? The code is sufficiently complicated that it’s difficult to tell, but theoretically it seems plausible, since this is a value from the database.

A quick search shows that all instances of this method are actually only called from one central place, so instead of making this check in 21 different places, we can just do it in a single place and from there on assume fromDBObject can no longer be null.

Solution: @NotNull parameter

My original assumption that we can use Optional here turned out to be wrong.  Instead, I’m going to change the method signature of decode to declare that fromDBObject cannot be null – I’ve opted to use the JetBrains annotations to do this.

Then I move the null check (and subsequent null return) in the place where fromDBObject could actually be null, the single place that calls decode.

Add null check to call site

Moving the null check to the single call site where fromDBObject can be null

It makes the call site a bit more untidy, but constrains this logic to a single place instead of scattered through all implementations. Now, we can either document the fact that fromDBObject will not be null or rely on the annotation to do this for us, so we never have to perform this null check in the Converter implementations.

Next we can tidy up and remove all the duplicated code. Here IntelliJ IDEA can help us.  If we go back to the abstract method in TypeConverter that we annotated with the @NotNull parameter, we are given a warning:

Overridden parameters are not annotated

Warning that implementations don’t annotate this parameter

The quick fix suggested when we use Alt+Enter on this warning lets us apply this annotation to all the implementations of this method in one easy step, so let’s do that.

Quick fix to apply the annotation to all implementations

Use the quick fix to apply to all implementations

Our VCS log shows that all the Converter implementations have been updated

All Converters updated

This has added the @NotNull annotation to the decode method’s parameter, but we still haven’t tracked down and removed all the duplicate null-check code.  The good news is that now we’ve said fromDBObject cannot be null, we can use the inspections to find all of our redundant null checks and remove them.

One way to do this is to navigate into an implementation that we know has the check (in this case I chose StringConverter) to see what warning IntelliJ IDEA gives me.

Null check not needed as value is never null

Null check not needed as value is never null

This turns out to be triggered by the “Constant conditions & exceptions” inspection. We can run just this inspection over the code base to find other examples easily, using Ctrl+Shift+Alt+I and typing the inspection name.

Run inspection by name

Using Ctrl+Shift+Alt+I to run an inspection by name

This returns more results than just my Converters, but by grouping the results by directory I can see all the Converter implementations easily

Inspection results

I can use the preview window on the right to check which code has been flagged, and see what IntelliJ IDEA suggests doing with each problem.  After scanning through and making sure this is the null check I’m looking for, I select the converters directory and press the “Simplify boolean expression” button.

To sanity check the changes, I go into the VCS Local Changes window and use Show Diff (Ctrl+D) to check the changes that were applied to the 36 files there.

Using the diff view to sanity check changes

Using the diff view to sanity check changes

I can even use the diff view to make small edits if I want – in this case I use Ctrl+Y to remove the unnecessary empty lines 26 and 27 of my file (the file on the right). By using Alt+Right to check all the files that have been changed, I find an example where the null check wasn’t removed (this was a test converter and therefore was not in the converters directory), but I can easily use quick fixes even in the diff view to remove it.

This sanity check gave me peace of mind about the automatic changes IntelliJ IDEA made, and let me make some additional tweaks too. The last thing to do is run all the tests, and when they pass (hurrah!) now seems like a really good time to commit all these changes.

If you’re uncomfortable adding a new library to document that a parameter can’t be null and make the IDE warn you if you pass in null values, you can always remove the @NotNull just before committing – after all, it’s done its job now. In this case, you should at least update the Javadoc to state that the parameter is assumed to be not null.

Example 2: Converter.encode()

Earlier we saw that the decode method was only one of several that explicitly returned null values.  Now we’ve dealt with that, we can take a look at another example.

The encode method on the same TypeConverter class can also return null.  But unlike decode, there are valid times when the converter implementations explicitly return null:

Solution: Optional

This is a good place for Optional– we’re explicitly stating that there are some cases where the encode method does not return a value. Optional.empty() seems like a good representation of this fact.  So I change TypeConverter.encode() method to return an Optional instead.  It makes all the implementations a little more untidy, as things need to be wrapped in an Optional, but it is more explicit about the fact that this method sometimes does not return a value. I’d love to show you IntelliJ IDEA magic that does this for you (in some cases, Type Migration will work, but not in my example) but I made this change the hard way – I changed the return type of the super class to Optional and fixed all the places that broke. The good news is that once I’d changed the method signature to return an Optional, IntelliJ IDEA did offer a quick fix to wrap my return values in an Optional.

Wrap value in Optional

Wrap value in Optional

Similarly, null values can also be replaced

Return Optional.empty() instead of null

Return Optional.empty() instead of null

Now we’re returning an Optional, we have to work with this new type at the call site.  In my case, I actually got no compiler warnings as my return type was originally an Object, and obviously an Optional is also an Object (note: this is why we shouldn’t use raw Object types, because it basically removes the benefits of having a strongly typed language. This code would have been better with Generics). I just have to change the one place that calls the encode method to unwrap the Optional.  I’m going to do something that’s pretty nasty and use orElse(null), but since this is the same behaviour the original methods were doing anyway, and it’s restricted to one place in the code, I’m happy with this – I can flag it as tech debt and we can gradually deal with the issue rather than chasing it all around the code.

Dealing with the optional return

This preserves the original behaviour, but should be considered tech debt to be addressed later.

In tests that call the encode method I simply updated them to call encode(o).get() – usually this is unsafe, but a) the tests should not return an empty Optional and b) if they do, they’ll fail and that’s correct. In fact in my case this highlights that there are no tests for the case where no value is returned, so I should add new tests for empty Optional returns.

Changing your API to use Optional is often a non-trivial task, but in cases like this one it can really help clarify the intent – instead of returning nulls when there’s no valid value, it will return you an empty Optional and you can declare what to do – return an alternate value, throw an Exception, or perform some other operation. Beware though, this can get hairy.

Note: This example was OK, but there are other methods in this code that are perfect for returning Optional, but the upstream effect of applying this change was huge – the methods are called in multiple places and the return types used in all sorts of very complex methods, making it nearly impossible to track down the right place to test the Optional and unwrap it vs letting it propagate around the code. The lesson I have learnt is apply this if you can limit the impact to one or two call sites and deal with the Optional return type immediately.

Example 3: Mapper.getId()

Here’s another example of null returns:

This is a great example of where null means two different things. The first null means we were given a null input, therefore we give you a null output.  Seems fairly valid, if a little unhelpful. The second null means “an error happened and I couldn’t give you a value, so here, have null”.  Presumably a null ID is a valid response, given the first case, but it would probably be more helpful to the caller to know what the error was and deal with it accordingly.  Even if the Exception is such that the caller cannot deal with it, it’s pretty unfriendly to catch the Exception and return null, especially when the Exception isn’t even logged.  This is a really good way to hide genuine problems.  Exceptions should be handled where they’re found, or passed on in some useful way, they should not be swallowed unless you really really know it’s not an error.

So this null means “something unexpected happened and I don’t know what to do about it, so I’m going to give you null and hope that’s OK”, which is not at all clear to the caller.  These cases should definitely be avoided, and Optional is not going to be your solution. The solution is to implement much clearer error handing.

In summary

Null is a particularly difficult problem. The main problem with null is we don’t know what it means. It’s an absence of a value, but this could be for any number of reasons. Using null is not a useful way to flag what any of these reasons are, since by its very definition it has no value, no meaning.

Symptoms:

  • Null checks extensively used throughout the application code, without it being clear to you, the developer, what null really signifies or if the value can really be null at all.
  • Explicit return null

Possible Solutions:

  • Optional.  Not a solution to everywhere you find a null.  But if a method is deliberately returning a null value to mean “not found” or “I don’t have one of those”, this is a good candidate for returning an Optional.
  • @NotNull/@Nullable. One of the problems with complex code bases is understanding what are actually valid values. If you’re checking for null parameters, try putting @NotNull on the signature and seeing if null values are ever passed.  If nulls are never passed, you can remove the null check.  If you’re uncomfortable adding a dependency just to use the @NotNull annotation, you can apply the annotation temporarily, get IntelliJ IDEA to tell you if there are any problems, fix up if necessary and run all your tests to make sure everything works.  If it’s all good, you can remove the annotation, add a Javadoc comment (not as safe as the annotation, but still enormously useful for developers who call or maintain the method) and commit the updated code.
  • Exception handling. Returning null for an Exceptional case is very unfriendly, and unexpected from the caller’s point of view.  It can lead to NullPointerException further down the line, or at least more of the pervasive null checks (without a clear idea of why the value is null).  Throwing a descriptive Exception is much more valuable for those downstream.
  • Sometimes, it’s fine.  Field-level nulls (where it should be well understood what can be null and why) are examples where null may be perfectly acceptable.
This entry was posted in Tips & Tricks and tagged , , , , . Bookmark the permalink.

24 Responses to Code Smells: Null

  1. Ted Wise says:

    I don’t know what prompted this blog post, but I’m loving it. Please do lots more of these.

  2. Somebody says:

    I vastly prefer (x == null) ? if : else over Optional. Every single example of Optional I have ever seen requires more code (and that pesky extra Object reference of the Optional itself) to replace a simple null check. I do try to follow Checker Framework’s idea of non-annotated methods are default @NonNull and only return nulls from @Nullable annotated methods.

    • Trisha Gee says:

      The null check is definitely simpler code to using Optional. What it doesn’t really address though is the different meanings of null. In many code bases it’s perfectly fine, because it’s well understood what is null and why, but equally there are many projects (like the example in the blog) where it’s not at all clear if it should be null and why, and therefore what to do. Optional *can* be useful in these cases as it states more clearly what’s the expected return, but I agree that it’s not the prettiest syntax to work with.

      I have found NotNull/Nullable MUCH more useful than Optional in addressing the use of nulls.

  3. Great article, Trisha! Thank you.

    Another useful technique, especially when dealing with established classes and libraries, is to use a wrapper that never returns a null–it can throw exceptions as you suggest or return empty sets, etc. In that way, the nulls are contained and not scattered throughout the app. This can be particularly useful when refactoring legacy code.

    • Trisha Gee says:

      Yes that’s true. It’s not an approach I tried here because I wanted to constrain the solutions to a small part of the code, and I found that both the wrapper approach and even Optional (if used without discipline) can end up wandering around all over the code base and is a non-trivial refactoring to implement. Part of the the problem in my case, though, is that return types are often “Object”, but that’s the topic of another blog post…

  4. Iian says:

    Have you considered creating a class called Result to cover all of these scenarios? You could then subclass it with Success, Error, NotAuthorized, Rejected, or whatever you like, and then match on the type returned. The generic parameter T of course refers to the data you expect to return, but you could also add parameters for the Exception, a string message Description, etc.

    • Trisha Gee says:

      Yes, this is a really good solution too. As I mentioned in the reply above, this is something that may be a larger refactoring to implement (certainly in the quite, um, *interesting* methods in this project’s code base) and therefore I didn’t attempt it. It’s a really nice OO approach though, and lets you much better encapsulate the result and its state.

  5. Joep Lijnen says:

    Sean Parent has an excellent talk on better code, which is available on Youtube.

    He effectively argues that semantics are important for understanding programming code.
    It is when “Null” and “null pointers” are involved semantics are simply lost. There’s no meaningful definitions around anymore. Here’s that part of the talk where he addresses the problems of using null/zero/nothing: https://youtu.be/cK_kftBNgBc?t=26m39s

    I therefore disagree with “We have to come to terms with the fact that null means many things.” When something means many things, it has no meaning anymore; it’s entirely unclear.

    • Trisha Gee says:

      > When something means many things, it has no meaning anymore; it’s entirely unclear.

      I actually don’t think you are disagreeing with me though, since that’s exactly the point! The issue I’m trying to show, though, when I say we need to come to terms with null meaning many things is when you inherit someone else’s code that has evolved over time, and it’s not at all clear what the intent was behind the null, it could mean anything.

      Regardless of whether one thinks of it like me (“could mean anything”) or like you (“means nothing”), using null is definitely a smell and should ideally be replaced with something much more meaningful.

  6. Claes Mogren says:

    Great article, just one comment. I really don’t like null-collections and think they should always be avoided. It just makes everything easier and avoids a lot of unnecessary null-checks. I use Lombok a lot and having @Builder with @Default for collections removes a lot of boiler plate code.

    • Trisha Gee says:

      Null collections are definitely a problem too. The issue is that often one doesn’t have control over the code one is maintaining, nor the ability to choose which libraries to use to help you (hence my suggestions of removing the annotations library if you’re not free to check it in to production code).

      In a project where you can choose the design and architecture up front, where you and a disciplined team are on the same page about the approach to things like null, these issues should not happen. In an inherited, open source, long-maintained code base you often have to chip away at each little problem a bit at a time.

    • Trisha Gee says:

      This would solve some of the null problems in a project like my example project, but there’s definitely not one answer for all the nulls.

      Another thing that would help is prettier syntax for dealing with the problem, like Kotlin’s Elvis operator, but again this doesn’t force developers to have a disciplined approach to nulls, nor does it magically fix all the potentially null objects wandering loose around the application.

  7. Wilfried says:

    This is a great article indeed! But besides the technical I wanted to point out the hidden humour behind this phrase: Code Smells: Null
    Because it would make a perfect geek t shirt. The boss asks for Code Smells and the Developer replies: Null
    According to your enumeration this could mean:

    * Value was never initialised (whether accidentally or on purpose)
    ->What do you mean, Code Smells?!?

    * Value is not valid
    ->Oh yes, I once tested for Code Smells, but there were oh so many..

    * Value is not needed
    ->Pah, a genius never makes mistakes!

    * No such value exists
    ->ok,ok, I know I should check for Code Smells, but…

    * Something went terribly wrong and something that should be there is not
    ->Sorry, my code is so bad, the IDE crashes while checking for Code Smells

    …probably dozens of other things
    -> for sure… :-)

    Sorry, sorry, my post has null value, but I could not resist… :)

  8. Nathan Williams says:

    I like that Optional can remove the need for a temporary local variable on the calling side (to hold the result and branch on whether it’s null, like an Elvis operator), but shoving those blocks into lambda arguments can be painful if they assign to variables that are not effectively final.

    I’d be interested in a post discussing the situation with NotNull annotations. This is an area where standardization appears to have failed badly (https://stackoverflow.com/questions/4963300/which-notnull-java-annotation-should-i-use) and we’re stuck with an IDE-specific setting or something like Checker that is far more complicated, particularly if trying to define all aspects of your build in Gradle.

    • Trisha Gee says:

      I agree. I have found the NotNull annotations VERY useful, but I wish they were more widely used / more an accepted standard. It may be a failing of Java to allow nulls so freely, and the annotations may be a patch over a language flaw, but they are very helpful for the compiler, the IDE, and the developer.

  9. Thomas says:

    Great post! I especially like the mixture of theory and hands on refactoring.

  10. Alfredo Lopez says:

    I loved the article. Thanks a lot

Leave a Reply

Your email address will not be published. Required fields are marked *