
Join data items that want to go together – code smells series

This post is part of a 10-week series by Dino Esposito (@despos) around a common theme: code smells and code structure.

In the previous post of this series, we presented the value of using data containers instead of plain primitives to provide a representation of data that is closer to the real domain. We used URL and Email custom types to replace strings. Both URL and email addresses, usually identified with strings, are ultimately richer entities which deserve their own business-specific programming model.

This approach can – and should! – be taken further. Which is what we will be looking at in this blog post.

Now let us dive into another code smell: Data Clump!

Data Clump

Sometimes you find that a small set of data items – whether plain primitives, value types or even complex types – is constantly passed around together across different sections of your code base. Such a group of values is referred to as a “data clump”.

Data Clump is a good example of a code smell. It is not a problem per se, but it may indicate a deeper problem that, in the long run, could easily degenerate into technical debt.

As an example, suppose you are modeling an object that represents a booking.

public class Booking
{
    public Booking(int bookingId, int roomId, DateTime? from, DateTime? to)
    {
        BookingId = bookingId;
        RoomId = roomId;
        From = from;
        To = to;
    }

    public int BookingId { get; private set; }
    public int RoomId { get; private set; }
    public DateTime? From { get; private set; }
    public DateTime? To { get; private set; }
}

The constructor needs to take a couple of DateTime parameters to define the validity timeframe. Those two parameters will go together every time you create a new Booking in code, and probably every time bookings in a given timeframe are queried.
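To make the clump concrete, here is a minimal sketch of how the same pair of dates tends to reappear once bookings are queried. The BookingQueries class and its InTimeframe method are hypothetical names introduced only for illustration, and treating a null bound as open-ended is an assumption of this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Booking
{
    public Booking(int bookingId, int roomId, DateTime? from, DateTime? to)
    {
        BookingId = bookingId;
        RoomId = roomId;
        From = from;
        To = to;
    }

    public int BookingId { get; private set; }
    public int RoomId { get; private set; }
    public DateTime? From { get; private set; }
    public DateTime? To { get; private set; }
}

// Hypothetical helper, not part of the original example.
public static class BookingQueries
{
    // The same (from, to) pair travels together again: this is the clump.
    // Assumption: a null bound means "open-ended" on that side.
    public static IEnumerable<Booking> InTimeframe(
        IEnumerable<Booking> bookings, DateTime? from, DateTime? to)
    {
        return bookings.Where(b =>
            (to == null || b.From == null || b.From <= to) &&
            (from == null || b.To == null || b.To >= from));
    }
}
```

Every new method that deals with a timeframe would repeat those two parameters, along with whatever rules apply to them.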

While sticking to using two parameters every time is not a mistake in itself, using a new TimeInterval class makes the code easier to read – and to extend. The good news is ReSharper and Rider come with the Extract class refactoring to help us out!

Tip: Extract class can be invoked from the context menu, or by using the Refactor This action (Ctrl+Shift+R).

Refactor this - Extract class

The Extract Class refactoring moves just the highlighted members out of the Booking class and into a new, separate class.

public class TimeInterval
{
    public DateTime? From { get; set; }
    public DateTime? To { get; set; }
}

This new class can be further extended with a constructor, one or more factory methods, or be tailored in any way that makes sense for your business domain.

public class TimeInterval
{
    public TimeInterval(DateTime? from = null, DateTime? to = null)
    {
        From = from;
        To = to;
    }

    public DateTime? From { get; set; }
    public DateTime? To { get; set; }
}

You now have a new entity with its own personality, which plays a defined role in the business domain: the interval of time between two dates. Two loose DateTime values could still have conveyed the same information, but far less explicitly.

The Booking constructor may evolve as below:

public Booking(int bookingId, int roomId, DateTime? from, DateTime? to)
{
    BookingId = bookingId;
    RoomId = roomId;
    Interval = new TimeInterval(from, to);
}
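For this constructor to compile, Booking needs an Interval property in place of the former From and To pair. A minimal sketch of the resulting class follows, assuming the property is named Interval as in the constructor. The second constructor, which accepts a TimeInterval directly, is an optional addition of this sketch and not something the post prescribes.

```csharp
using System;

public class TimeInterval
{
    public TimeInterval(DateTime? from = null, DateTime? to = null)
    {
        From = from;
        To = to;
    }

    public DateTime? From { get; set; }
    public DateTime? To { get; set; }
}

public class Booking
{
    // Existing call sites that pass two dates keep working.
    public Booking(int bookingId, int roomId, DateTime? from, DateTime? to)
        : this(bookingId, roomId, new TimeInterval(from, to))
    {
    }

    // Assumed overload: callers that already hold a TimeInterval pass it as is.
    public Booking(int bookingId, int roomId, TimeInterval interval)
    {
        BookingId = bookingId;
        RoomId = roomId;
        Interval = interval;
    }

    public int BookingId { get; private set; }
    public int RoomId { get; private set; }

    // Replaces the separate From and To properties.
    public TimeInterval Interval { get; private set; }
}
```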

Addressing the Data Clump code smell by extracting specific classes also makes your code inherently more extensible, because what gets passed around is now a single wrapped-up object rather than a loose collection of simpler data items. A standalone object can also be extended at will with additional properties and methods when needed.
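As one possible illustration of that extensibility, the sketch below adds two query methods to TimeInterval. Both Contains and Overlaps are hypothetical additions, and treating a null bound as open-ended is an assumption of this sketch rather than a rule from the booking domain.

```csharp
using System;

public class TimeInterval
{
    public TimeInterval(DateTime? from = null, DateTime? to = null)
    {
        From = from;
        To = to;
    }

    public DateTime? From { get; set; }
    public DateTime? To { get; set; }

    // Hypothetical: true when the moment falls inside the interval (bounds inclusive).
    // Assumption: a null bound means the interval is open-ended on that side.
    public bool Contains(DateTime moment)
    {
        return (From == null || From <= moment) &&
               (To == null || moment <= To);
    }

    // Hypothetical: true when the two intervals share at least one moment.
    public bool Overlaps(TimeInterval other)
    {
        return (From == null || other.To == null || From <= other.To) &&
               (other.From == null || To == null || other.From <= To);
    }
}
```

With methods like these in place, code that asks for bookings in a given timeframe can take a single TimeInterval argument, and the rules for comparing dates live in one place.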

In our next post, let’s look at converting objects and readability. See you next week!

Download ReSharper 2018.1.2 or Rider 2018.1.2 and give them a try. They can help spot and fix common code smells! Check our code analysis series for more tips and tricks on automatic code inspection with ReSharper and Rider.
