Null pointers: an opportunity, not an exception – Code smells series

This post is part of a 10-week series by Dino Esposito (@despos) around a common theme: code smells and code structure.

Last week, we looked more closely at using the classical object-oriented concept of inheritance in our code base. This week, we will look at null and how it presents us with an opportunity to improve our code.

As inconsistent and immaterial as it may appear, the value null is not a value, and yet many programming languages insist on treating it as if it were. To survive, C# developers have to repeatedly check references to their classes for existence, though Java developers don’t fare much better.

Null checking is boring, and without a doubt every developer will find a way to forget one. It seems that the concept of null references exists thanks to Sir Tony Hoare, a prominent computer scientist, inventor of the Quicksort algorithm and the winner of the Turing Award in 1980. The concept of null was introduced in the Algol language, and later Sir Hoare called it “my billion-dollar mistake”. It is nice to recall how the Algol language was presented to the masses back in the 1960s:

“Here is a language so far ahead of its time, that it was not only an improvement on its predecessors, but also on nearly all its successors.”

The main issue with null references is that once the language allows references to be null, well, any reference can potentially be null. Hence, null checking is mandatory. But boring. And developers tend to forget it. And programs crash because of that.

For a number of practical and architectural reasons, there’s not much that established programming languages like C# and Java can do other than introduce workarounds and helper operators. New languages, however, free to build a type system from scratch, can do more. And Kotlin, in effect, did. In Kotlin there’s a clear distinction between nullable and non-nullable types. If you assign a null value to a non-nullable type you get a compilation error.

In C#, things are slightly different. Explicit nullability applies only to value types (through Nullable<T>, written T?), while any reference type can always take null as a value.
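To make the asymmetry concrete, here is a minimal sketch. It assumes a project where C# 8 nullable reference annotations are not enabled, and the NullabilityDemo class and its Describe method are hypothetical names used only for illustration.

```csharp
public static class NullabilityDemo
{
    // Hypothetical helper: builds a short description from two inputs
    // that may both be missing.
    public static string Describe(int? age, string name)
    {
        // A value type can hold null only in its explicit int? (Nullable<int>) form.
        string agePart = age.HasValue ? age.Value.ToString() : "unknown age";

        // A reference type such as string accepts null without any annotation,
        // so remembering the check is left to the developer.
        string namePart = name ?? "unnamed";

        return namePart + ", " + agePart;
    }
}
```

Passing a null to a plain int parameter would not compile, whereas nothing stops a caller from passing null for the string.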

In Kotlin, instead, every type is non-nullable unless it is explicitly made nullable. Kotlin also provides a syntax for safe calls that don’t throw a null-reference exception but apply some compensation logic that ultimately results in a no-op and a null value being returned.

For example, if a contact is created without a phone number, reading the number through the ?. safe call operator simply sets the phoneNumber variable to null, without any sort of exception.

The safe call operator, or safe navigation operator, has also been available in C# since version 6.0, but it doesn’t work exactly like the Kotlin operator. While Kotlin proceeds along the entire chain, the C# safe navigation operator stops at the first null reference encountered.
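On the C# side, a safe navigation chain could look like the sketch below. The Contact and Phone classes and the SafeNavigationDemo helper are assumptions made up for this illustration, not types from any library.

```csharp
public class Phone
{
    public string Number { get; set; }
}

public class Contact
{
    public Phone Phone { get; set; }
}

public static class SafeNavigationDemo
{
    // Hypothetical helper that reads a phone number through a chain of references.
    public static string NumberOf(Contact contact)
    {
        // If contact is null, or contact.Phone is null, the whole expression
        // evaluates to null instead of throwing a NullReferenceException.
        return contact?.Phone?.Number;
    }
}
```

The caller still receives a null and has to decide what it means, which is why the operator softens the problem rather than removing it.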

Safe call operators are one way to deal with null references to avoid annoying exceptions in C# classes. ReSharper also does a lot of work in order to detect possible occurrences of null-reference exceptions. Yet the original sin, the billion-dollar mistake, remains.

The root of the problem is that null is not a value and can’t refer to any domain-specific state. In the ubiquitous language of any business domain, there’s nothing like null, a non-value. There are, instead, special case values that correspond to what, in raw computer memory terms, is a null reference. The Special Case pattern (in many places labeled as the Null Object pattern) suggests that you never use null values. Whenever you’re tempted to do so, you return an instance of a special class (often a singleton) that represents an invalid or inconsistent state of the logical entity behind the class.

For example, a class that returns a Contact object will return a MissingContact instance instead of null or other odd values that may denote a special circumstance or an invalid state.
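One possible shape for this pattern is sketched below. Contact, MissingContact and its Instance singleton are the names used in this post; the IsMissing property, the ContactRepository class and its members are assumptions added only to make the sketch self-contained.

```csharp
using System.Collections.Generic;

public class Contact
{
    public Contact(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    public string FirstName { get; }
    public string LastName { get; }

    // Assumed convenience flag so callers can ask without a type test.
    public virtual bool IsMissing => false;
}

// Special case: stands in for "no such contact" wherever null would be returned.
public sealed class MissingContact : Contact
{
    public static readonly MissingContact Instance = new MissingContact();

    private MissingContact() : base("", "") { }

    public override bool IsMissing => true;
}

// Hypothetical in-memory repository, used only to show the return path.
public class ContactRepository
{
    private readonly Dictionary<int, Contact> _contacts = new Dictionary<int, Contact>();

    public void Add(int id, Contact contact) => _contacts[id] = contact;

    public Contact FindById(int id)
    {
        Contact contact;
        // Never hand null back to the caller: fall back to the special case.
        return _contacts.TryGetValue(id, out contact) ? contact : MissingContact.Instance;
    }
}
```

With this in place, code such as repository.FindById(42).FirstName cannot throw a NullReferenceException, whether or not contact 42 exists.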

Such an approach may lead to a proliferation of small and similar-looking classes: one for each possible exceptional situation to handle. One way to work around the problem is to use a single MissingContact class with a few parameters that can be set to whatever is helpful to figure out the special case. This could be error codes, messages, or exceptions.
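A single parameterized special case might be sketched as follows. The MissingReason enum and the Reason and Message properties are hypothetical choices; an error code or a captured exception would fit the same slot.

```csharp
// Hypothetical set of reasons a contact may be unavailable.
public enum MissingReason
{
    NotFound,
    Deleted,
    AccessDenied
}

public class Contact
{
    public Contact(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    public string FirstName { get; }
    public string LastName { get; }
}

// One special-case class covers every exceptional situation;
// the details travel as data instead of as separate subclasses.
public sealed class MissingContact : Contact
{
    public MissingContact(MissingReason reason, string message) : base("", "")
    {
        Reason = reason;
        Message = message;
    }

    public MissingReason Reason { get; }
    public string Message { get; }
}
```

A lookup method could then return new MissingContact(MissingReason.Deleted, "Contact 42 was deleted.") and the caller would still receive a Contact.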

The key to the success of the Special Case pattern is that the caller receives an instance of the same type it was expecting.

Now, what if something really weird and unexpected occurs? There might be cases (or just personal development preferences) in which you just want a call to fail. In these scenarios, it is better to throw a specific exception than to return null.
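For the fail-fast variant, a sketch could look like this. ContactNotFoundException, ContactRepository and GetById are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical domain-specific exception that carries the offending id.
public class ContactNotFoundException : Exception
{
    public ContactNotFoundException(int contactId)
        : base("No contact with id " + contactId + " exists.")
    {
        ContactId = contactId;
    }

    public int ContactId { get; }
}

public class Contact
{
    public Contact(string firstName)
    {
        FirstName = firstName;
    }

    public string FirstName { get; }
}

public class ContactRepository
{
    private readonly Dictionary<int, Contact> _contacts = new Dictionary<int, Contact>();

    public void Add(int id, Contact contact) => _contacts[id] = contact;

    public Contact GetById(int id)
    {
        Contact contact;
        if (!_contacts.TryGetValue(id, out contact))
        {
            // Fail immediately with a meaningful error rather than returning
            // null and letting a NullReferenceException surface later.
            throw new ContactNotFoundException(id);
        }
        return contact;
    }
}
```

Compared with a NullReferenceException raised somewhere downstream, the failure now happens at the lookup and says which contact was missing.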

In the end, null is a problem because it’s not a value but is perceived as such (kudos to Sir Tony Hoare for recognizing it so lucidly). At the same time, fixing the use of null in your code also represents a great opportunity to improve and refactor for better readability and a lesser chance of that dreaded NullReferenceException.

There’s just one week left in our series, so let’s prepare to yell an acronym when overengineering peeks around the corner. YAGUI! (Yes, that is indeed very close to YAGNI.)

Download ReSharper 2018.1.4 or Rider 2018.1.4 and give them a try. They can help spot and fix common code smells! Check our code analysis series for more tips and tricks on automatic code inspection with ReSharper and Rider.

15 Responses to Null pointers: an opportunity, not an exception – Code smells series

  1. Jean-Christophe Chalté says:

    The Special case pattern is interesting, as using the type system to leverage a business behaviour, making it more readable, is a powerful habit. But there is another code smell in your solution : a “MissingContact” should not inherit the “Contact” class, as a missing contact is not a contact. You cannot make a “missing duck” quack, nor you should be able to retrieve the first name of a contact that does not exists. It is not the “Contact” that could be a missing one (a contact should only exist in a valid state 😉 ), but more probably the result of the contact retrieval process that could either be an existing contact or a missing one. The class hierarchy could be focused around the “FindContact” process, maybe with a “FindContactResult” base class, with two implementations : a “SuccessfullContactFound” one that exposes a “Contact” property, and a “MissingContact” one that exposes nothing. With those, there are no invalid states possible anymore, and usage is easy with C# pattern matching.

    • Nicolas M says:

      I also thought it was a weird decision to inherit from ‘Contact’, a missing contact is _not_ a contact! A FindContactResult sounds like the right solution, and indeed with C# 7 pattern matching, this kind of design pattern becomes incredibly concise and readable.

    • Talking Sense says:

      There is no such code smell in the given example.

      A missing contact IS a contact in many (in some domains, all) conceivable situations. It may well be a perfectly valid state to not have a contact, for example.

      I have used this pattern with much success over decades. But exactly how varies.

      The point of using it is to avoid using null testing statements. Replacing a null test with a type test all over the place is to miss the point of the pattern entirely.

      You can of course still use pattern matching with an inherited type though.

      • Jean-Christophe Chalté says:

        Of course, if the business requires that a missing contact should behave the same as a “valid” one, then your approach is far better. The suggested modification could be used if the business behave differently between a valid and a missing contact. It is just (one) way to protect both the developer that created the method (all results are clearly stated, including edge cases, so if the result is badly used, it is usually not the implementation fault), and the developer that uses that method (it has to be aware of all edge cases, it knows that if he can uses a specific type result, then this result is well thought of, he knows that if he handle all edge cases, no runtime behaviour issues should arise). It is more than just ” if != null then” case.

  2. @JC

    In our case, we are always doing an abstraction via interface (there was discussion in previous post about when to use interfaces vs classes, in our case we always do interfaces for abstraction).
    As such, when required to implement this MissingContact, all methods throws an InvalidOperationException.
    But accessing properties via getter will give you valid details to be displayed if required in the UI or any report.

    @Author:
    public static MissingContact Instance = new MissingContact();
    In my case, we will be :
    class Contact : IContact
    {
    public static readonly IContact MissingContact= new MissingContact()
    }

    first, you’re missing the readonly attribute, second what is the point to have access to the MissingContact class ? no one should rely on this but rely on the MissingContact instance ?
    It means also in my case MissingContact class is private or nested type.

    • Jean-Christophe Chalté says:

      I see some drawbacks about using a class implementation for a “MissingContact”, and even more with methods that throw a runtime “InvalidOperationException” exception.

      – First, you rely on the user of your method to know that this method can return a specific instance called “MissingContact”. It is “implicit” and nothing in the method declaration (the method signature) explicitly expose this fact. Without consulting the implementation, the user has no way to know that, and you must then rely on two things : a documentation written somewhere, and the fact that your developer remember to handle edge case everywhere needed.

      – Then, at a more high level view, even if your developers handle those edge cases, your team is basically (re)coding the fact that your method can return X(eg. a contact instance) or Y(a missing contact instance) at multiple places. When you will change your method implementation (eg. by adding a new edge case, or removing an old one), you have to recode that information at every other “implementations” of that algorithm, without having any real compile time help.

      Using an abstract “blank” (without any behaviour or exposed methods) “FindContactResult” abstract class is not perfect, but it will make explicit that the developer has to handle a set of specific types of result (one for each implementations), and you can use compile type errors and warnings to handle deleted edge cases, or new unhandled edge case. In F#, it is even better, you can force your developers to always handle all edge cases by making all unhandled edge cases throws errors at build time, but as far as I know, this is not yet possible in C#.

      • Talking Sense says:

        This is the Null Object pattern. The operative word is “pattern”. A design pattern is NOT a code template. You can implement it however you like.

        Usually the “null object” class inherits from the main class (it’s often easier to provide sensible and consistent behaviour that way), but it is not essential.

        But the arguments about what implementation details are good ideas are entirely context sensitive – it depends on the situation. Sometimes your suggestions will be terrible ideas, sometimes not.

        • Jean-Christophe Chalté says:

          I totally agree with you on the fact that the implementation is context sensitive. In fact, a pragmatic approach is usually the better solution in software developement (and even that sentence is not always _the_ rule to apply :) ). All those posts (including the OP one) are only opinions.

          The idea behind what I suggested is that the difficulty in the code we create is most of the time in the expressivity of it. The code is a language that a team shares to create a set of rules. That language is composed of explicit terms (the specific language, method signatures, implementations, and at some degree, documentations, etc.) and implicit terms (usage of a specific pattern, meaning behind a specific term or a specific construction, eg. “an instance of a subclass of A is an A”). Implicit terms are built on different layers, the education, the community, and even the team during the software development process. That implicit should be taken into account but can be problematic to some (newcomers, foreign cultures, changing teams etc.).

          *In my opinion*, the more I code and lead technical teams, the more I am inclined to think that trying to be the most explicit in the code and to rely less on implicit terms is one way (not _the_ way 😉 ) to maintain a good quality of code. As usual, a pragmatic approach is necessary, a purely explicit developement is not ideal (and not doable), but if the underlying technology (in that case, C#) can give that expressivity without too much cost, I try to strive for it.

          Moreover, I have encountered many developers that have a wrong “intuitive” approach to the implicit terms, usually by oversimplifying the ideas behind it (eg. “a pattern is a rule”, “a class is the object it describes”, “an ORM entity is a business class”, “a ConcurrentDictionnary is thread safe” and so on), so even simple “shared” implicit terms can be troublesome sometimes. Explicit FTW ! … (…most of the time :) )

    • Talking Sense says:

      One word of advice. You say you always use interfaces for abstraction. If you always use one way of doing something, you will eventually be doing it wrong. Something to think about.

      Interfaces are very often abused as a construct…

      Also, see comments below about patterns.

  4. Talking Sense says:

    “Sir Tony” and “Sir Tony Hoare” are correct, but not “Sir Hoare”.

    For some reason you never use just a surname with “Sir”.

  5. Eugeniy says:

    I really don’t understand what is the problem about null reference and especially about NullReferenceException. I see it a useful tool. Exactly with NullReferenceException thrown automatically by environment we don’t need to check for null and throw something by ourselves)

    val phoneNumber = customer?.phone?.number

    Here I would ask a lot of questions, like why do we want to execute this method in case customer was null? Of course we can throw exception at method start in case of customer was null, but in our case it is also possible to change customer?.phone to customer.phone – and exception will be thrown automatically!

    • Daniel Morais says:

      I think this usage is a bit weird. Contact, in this context, should not allow a null value and should throw a NullReferenceException (or, in Kotlin, should not be a nullable type). So, something like this:
      val phoneNumber = customer.phone?.number
      where customer should have a value and can have a phone or not.

      BTW, I rather implement something like Optional to represent something like a value that can have a value or not (as something you see in F# and Scala) instead of a MissingContact, specially with the new pattern matching capabilities of C#. This way I don’t need to create a new class for every situation where should result in a Missing Value (But, this solution doesn’t prevent NullReferences…C# 8 are implementing some checking that can turn non nullable references the default)

  6. Brian Strelioff says:

    Will ReSharper play nice with C# non-nullable types once official? For example, how will the ReSharper attributes be treated if the underlying type system is null-aware? Will the attributes be flagged as redundant? Will there be a refactoring to migrate to/from attributes?
