The simple case of special string types – Code smells series

This post is part of a 10-week series by Dino Esposito (@despos) around a common theme: code smells and code structure.

Good to see you back! In our series about code smells, so far we have seen various theoretical cases. Today, we will look at an example that can be found in many code bases: special string types and the Primitive Obsession code smell.

Modern programming languages, including C# and Kotlin, make every type available as an object – regardless of its internal representation. Primitive types such as strings, numbers and booleans are often implemented internally as plain values and are boxed into objects when needed. Treated as objects, those primitive types offer an array of methods for developers to perform a wide range of operations. For example, the String type comes with methods for text operations such as trimming, counting, slicing, parsing and the like.

All these operations are specific to the type String and act on the collection of characters that actually make up the string object. A string object is only a generic container of characters, but sometimes those characters hold some semantics and require ad-hoc methods. Let’s look at an example of strings with a special meaning: URLs!

URLs

A URL is a string that identifies a network resource. Trimming a URL or determining its length is rarely a key operation; escaping the URL or making it relative matters far more. To easily manipulate a URL, properties like Port, Scheme and Segments are also quite useful.

Not coincidentally, in the .NET Framework, you deal with URLs through a dedicated class: System.Uri. While a URL is in essence only a string, the System.Uri class treats it as a special type, recognizing it has the status of a business concern.
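As a minimal sketch of what the dedicated type offers over a raw string, the snippet below reads a few System.Uri properties and makes one URL relative to another. The example.com addresses and the UriDemo class name are made up for illustration.

```csharp
using System;

public static class UriDemo
{
    public static void Main()
    {
        // Hypothetical URL, used only for illustration
        var uri = new Uri("https://www.example.com:8080/blog/posts/code-smells");

        Console.WriteLine(uri.Scheme);                       // https
        Console.WriteLine(uri.Host);                         // www.example.com
        Console.WriteLine(uri.Port);                         // 8080
        Console.WriteLine(string.Join(" | ", uri.Segments)); // / | blog/ | posts/ | code-smells

        // Making the URL relative to a base address
        var baseUri = new Uri("https://www.example.com:8080/blog/");
        Console.WriteLine(baseUri.MakeRelativeUri(uri));     // posts/code-smells

        // Escaping a value before placing it in a query string
        Console.WriteLine(Uri.EscapeDataString("a b&c"));    // a%20b%26c
    }
}
```

None of these operations would be a one-liner on a plain string, which is the point of having the dedicated type.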

Using dedicated classes to encapsulate specific amounts of data is a good way to vaporize the Primitive Obsession code smell.

Your code becomes more readable when small value objects are used instead of raw primitives.

Are email addresses just strings?

Another good example of primitive obsession is working with email addresses in your code. While the .NET Framework has a type for working with email addresses (System.Net.Mail.MailAddress), many of us just use a string, don’t we?
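For reference, here is a small sketch of what the built-in type already provides. The address is a made-up sample and MailAddressDemo is just a name chosen for this example.

```csharp
using System;
using System.Net.Mail;

public static class MailAddressDemo
{
    public static void Main()
    {
        // The constructor parses the string and throws FormatException if it is malformed
        var address = new MailAddress("user@server.com");

        Console.WriteLine(address.User); // user
        Console.WriteLine(address.Host); // server.com
    }
}
```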

Let’s see what it takes to extract the server name from an email address.

var email = "user@server.com";
 
// Extract the server name
var server = "";
int index = email.IndexOf("@", StringComparison.Ordinal);
if (index > 0 && index < email.Length)
    server = email.Substring(index + 1);

At the end of the code, the variable server contains the desired substring. It works, but can you really guess what the code does at first glance? Using primitive types instead of clearly definable business-entity classes is a smell! In addition, using similar chunks of primitive-based code leads straight to duplicated code.

Here’s how to solve it:

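using System;
using System.IO; // InvalidDataException lives here

public class Email
{
    public Email(string email)
    {
        // Reject null, empty or whitespace-only input up front
        if (string.IsNullOrWhiteSpace(email))
            throw new InvalidDataException();

        Address = email;
    }

    // Read-only: the address cannot change after construction
    public string Address { get; }

    // Returns the part after "@", or the whole address if no "@" is found
    public string GetServer()
    {
        var index = Address.IndexOf("@", StringComparison.Ordinal);
        return index > 0 && index < Address.Length
            ? Address.Substring(index + 1)
            : Address;
    }
}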

The same code we’ve seen above now takes the following two lines:

var email = new Email("user@server.com");
var server = email.GetServer();

Note that the GetServer method could easily be a property with the logic encapsulated in the getter. And for good measure, we may want to introduce a Server class which has a Name property, instead of returning a string here.

An immediate benefit is that all the operations that can be performed on a particular piece of data are no longer scattered around the code base but live side by side in the same logical container.
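One possible shape for that variation is sketched below. The Server class and the Server property are hypothetical and only follow the suggestion above; the lookup logic is the same as in GetServer.

```csharp
using System;
using System.IO;

// Hypothetical wrapper for the server part of an address
public class Server
{
    public Server(string name)
    {
        Name = name;
    }

    public string Name { get; }
}

public class Email
{
    public Email(string email)
    {
        if (string.IsNullOrWhiteSpace(email))
            throw new InvalidDataException();

        Address = email;
    }

    public string Address { get; }

    // Same lookup as GetServer, exposed through a getter
    public Server Server
    {
        get
        {
            var index = Address.IndexOf("@", StringComparison.Ordinal);
            var name = index > 0 ? Address.Substring(index + 1) : Address;
            return new Server(name);
        }
    }
}
```

With this in place, the calling code reads new Email("user@server.com").Server.Name.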

The “Value Object” pattern

The Email class in the sample code is a regular .NET class written to achieve a functional goal. Simple classes built around some semantic data are good candidates for the Value Object pattern. Domain-driven Design (DDD) pushes the use of this pattern as a way to more closely model a business domain.

A value object is a class that has no identity and is fully represented by the data it holds. Furthermore, for the same reason, the class is immutable. This means that the class will not offer any public member to alter the state. For example, an email address stored in an instance of the Email class can’t be changed. To work with a different email address, you need another instance of the Email class.

The Email class above is immutable, but not yet fully compliant with the Value Object pattern. To fully identify an instance of the Email class with the data it holds we need to alter the way the class is checked for equality. This requires overriding Equals and subsequently GetHashCode.

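public override bool Equals(object obj)
{
    var email = obj as Email;
    if (email == null)
        return false;

    // Two emails are equal when their addresses match, ignoring case
    return string.Equals(email.Address, Address, StringComparison.OrdinalIgnoreCase);
}

public override int GetHashCode()
{
    // Must ignore case too, so that equal instances always share a hash code
    return Address != null
        ? StringComparer.OrdinalIgnoreCase.GetHashCode(Address)
        : 0;
}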

The new implementation of Equals compares two instances by simply comparing the values of their Address properties, ignoring case. As a result, two distinct instances of Email are considered equal if they contain the same email address.
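To see the effect, here is a self-contained sketch that puts the pieces together and checks value equality, including inside a HashSet. It assumes the hash code ignores case just like Equals does, since equal objects have to produce equal hash codes. EqualityDemo is a name invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public class Email
{
    public Email(string email)
    {
        if (string.IsNullOrWhiteSpace(email))
            throw new InvalidDataException();

        Address = email;
    }

    public string Address { get; }

    public override bool Equals(object obj)
    {
        var email = obj as Email;
        if (email == null)
            return false;

        return string.Equals(email.Address, Address, StringComparison.OrdinalIgnoreCase);
    }

    // Assumption of this sketch: hash case-insensitively to stay consistent with Equals
    public override int GetHashCode()
    {
        return StringComparer.OrdinalIgnoreCase.GetHashCode(Address);
    }
}

public static class EqualityDemo
{
    public static void Main()
    {
        var first = new Email("user@server.com");
        var second = new Email("USER@server.com");

        Console.WriteLine(ReferenceEquals(first, second)); // False: two distinct instances
        Console.WriteLine(first.Equals(second));           // True: same value

        // Hash-based collections rely on Equals and GetHashCode agreeing
        var set = new HashSet<Email> { first, second };
        Console.WriteLine(set.Count);                      // 1
    }
}
```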

Tip: Equals and GetHashCode can be generated in ReSharper and Rider using the Generate action (Alt+Insert).

No, it’s not that obvious!

The most significant benefit we gain out of this refactoring is keeping the Email class much closer to the real-world idea of an email. The class has behavior-driven methods and has equality adapted to the context.

The approach of using small, simple and immutable classes to render small-but-significant pieces of data can be extended to a variety of other common cases – currency, address, temperature, quantity, and dates – even though a DateTime type already exists in the .NET Framework.
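As an illustration of one of those cases, here is a rough sketch of a temperature value object. The Temperature class, its factory methods and the choice to store Celsius internally are all assumptions made for this example, not something prescribed by the pattern.

```csharp
using System;

public sealed class Temperature : IEquatable<Temperature>
{
    // Private constructor: instances are created through the named factories
    private Temperature(double celsius)
    {
        Celsius = celsius;
    }

    // No setters: a different temperature requires a different instance
    public double Celsius { get; }

    public double Fahrenheit => Celsius * 9 / 5 + 32;

    public static Temperature FromCelsius(double celsius)
    {
        return new Temperature(celsius);
    }

    public static Temperature FromFahrenheit(double fahrenheit)
    {
        return new Temperature((fahrenheit - 32) * 5 / 9);
    }

    // Equality is based on the value held, not on object identity
    public bool Equals(Temperature other)
    {
        return other != null && Celsius.Equals(other.Celsius);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Temperature);
    }

    public override int GetHashCode()
    {
        return Celsius.GetHashCode();
    }
}
```

With this sketch, Temperature.FromCelsius(100) and Temperature.FromFahrenheit(212) compare as equal. A production version would probably want a tolerance for floating-point comparisons.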

Primitive obsession is an obsession you should get rid of. For the sake of your code.

Next week, we’ll take this approach a little bit further and tackle the Data Clump code smell. Stay tuned!

Download ReSharper 2018.1.2 or Rider 2018.1.2 and give them a try. They can help spot and fix common code smells! Check out our code analysis series for more tips and tricks on automatic code inspection with ReSharper and Rider.
