Tips & Tricks

Code Smells: Multi-Responsibility Methods

This is part of a series investigating code that looks suspicious (“code smells”), and exploring possible alternatives.

In the last article I introduced a large and complex method, validateQuery.  The focus of that article was how to simplify the method looking only at the “smell” of mutability, but there’s actually one very simple change that can chop the method nearly in half.

I’m afraid the time has come to post the method in its entirety, so I apologise for taking up so much screen real-estate. I’m going to mitigate this slightly by using the end result of our last refactoring, which saves us 20 lines of code or so:

static ValidatedField validateQuery(Class clazz, Mapper mapper, String propertyName, FilterOperator op,
                                    Object val, boolean validateNames, boolean validateTypes) {
    final ValidatedField validatedField = new ValidatedField(propertyName);
    if (!propertyName.startsWith("$")) {
        final String[] pathElements = propertyName.split("\\.");
        final List<String> databasePathElements = new ArrayList<>(asList(pathElements));
        if (clazz == null) {
            return validatedField;
        }

        MappedClass mc = mapper.getMappedClass(clazz);
        for (int i = 0; ; ) {
            final String fieldName = pathElements[i];
            boolean fieldIsArrayOperator = fieldName.equals("$");
            final Optional<MappedField> mf = getMappedField(fieldName, mc, databasePathElements,
                                                            i, propertyName, validateNames,
                                                            fieldIsArrayOperator);
            validatedField.mappedField = mf;

            i++;
            if (mf.isPresent() && mf.get().isMap()) {
                //skip the map key validation, and move to the next fieldName
                i++;
            }
            if (i >= pathElements.length) {
                break;
            }
            if (!fieldIsArrayOperator) {
                //catch people trying to search/update into @Reference/@Serialized fields
                if (validateNames && !canQueryPast(mf.get())) {
                    throw new ValidationException(format("Cannot use dot-notation past '%s' in '%s'; found while validating - %s", fieldName, mc.getClazz().getName(), propertyName));
                }
                if (!mf.isPresent() && mc.isInterface()) {
                    break;
                } else if (!mf.isPresent()) {
                    throw new ValidationException(format("The field '%s' could not be found in '%s'", propertyName, mc.getClazz().getName()));
                }
                //get the next MappedClass for the next field validation
                MappedField mappedField = mf.get();
                mc = mapper.getMappedClass((mappedField.isSingleValue()) ? mappedField.getType() : mappedField.getSubClass());
            }
        }
        //record new property string
        validatedField.databasePath = databasePathElements.stream().collect(joining("."));

        if (validateTypes && validatedField.mappedField.isPresent()) {
            MappedField mappedField = validatedField.mappedField.get();
            List<ValidationFailure> typeValidationFailures = new ArrayList<>();
            boolean compatibleForType = isCompatibleForOperator(mc, mappedField, mappedField.getType(), op, val, typeValidationFailures);
            List<ValidationFailure> subclassValidationFailures = new ArrayList<>();
            boolean compatibleForSubclass = isCompatibleForOperator(mc, mappedField, mappedField.getSubClass(), op, val, subclassValidationFailures);

            if ((mappedField.isSingleValue() && !compatibleForType)
                 || mappedField.isMultipleValues() && !(compatibleForSubclass || compatibleForType)) {
                if (LOG.isWarningEnabled()) {
                    LOG.warning(format("The type(s) for the query/update may be inconsistent; using an instance of type '%s' for the field '%s.%s' which is declared as '%s'", val.getClass().getName(), mappedField.getDeclaringClass().getName(), mappedField.getJavaFieldName(), mappedField.getType().getName()));
                    typeValidationFailures.addAll(subclassValidationFailures);
                    LOG.warning("Validation warnings: \n" + typeValidationFailures);
                }
            }
        }
    }
    return validatedField;
}

The smell that led me to this code in the first place was the multiple isPresent() checks on my Optional mf value, but I’m going to bypass that issue for now and see if there’s a quick win to simplify the code.

The Smell: Multi-Responsibility Method

A method this size is almost definitely doing more than One Thing.  The key is trying to work out which things it’s doing and how to classify those things.

One of my pet peeves is passing in boolean parameters.  These primitives are sometimes used to switch the behaviour of a method, which suggests that if the method is doing two different things, following two paths, under two circumstances, well, maybe it’s two methods.  If we investigate how the boolean parameters validateTypes and validateNames are used in this method, we may be able to find at least one code path which can be extracted into a separate method.

The validateNames flag is used twice in the first half of the method, in lines 63 and 79, for a simple check.  I suspect there’s a prettier way of achieving the same thing, but for now I’m going to ignore this value and move on to the other boolean.

Looking at validateTypes, we see this is used to turn on (or off) the entire second half of the method.  This looks like a good candidate to extract into its own method. Next I need to work out which values are needed for the first half of the method and which are needed for the second. I’ll start by looking at the parameters.

I already know validateNames is only needed by the first half so I can ignore that.  Next I’m simply going to highlight usages to see at a glance what I’m using.  To fit it all on the screen for the screenshots, I’m using code folding to hide the code I’m not so interested in.

First three parameters are only used in the first half of the method

The first three parameters are only used in the first half of the method

The first three parameters aren’t used in the second part of the method that validates types, they’re only used in the first half.

Parameters 4 and 5 are only used in the second half of the method

Parameters “op” and “val” are only used in the second half of the method

The fourth and fifth parameters aren’t used at all in the first half, they’re only used in the second half.

The fact that three parameters are used for one part of the method and two parameters are used for a different part makes it pretty clear that what we’re dealing with here is really two different bits of functionality that need two different sets of input.  This almost definitely deserves to be two distinct methods.

Step 1: Extract a validateTypes method

We want to see how easy it is to separate the second part into its own method without impacting any of the callers for the time being.  There’s a good chance this code needs values from earlier in the method.  Let’s do extract method and see what it looks like

Extract validateTypes

We end up with a validateTypes method that looks like:

    private static void validateTypes(FilterOperator op, Object val, boolean validateTypes, 
                                      ValidatedField validatedField, MappedClass mc) {
        if (validateTypes && validatedField.mappedField.isPresent()) {
            MappedField mappedField = validatedField.mappedField.get();
            List<ValidationFailure> typeValidationFailures = new ArrayList<>();
            boolean compatibleForType = isCompatibleForOperator(mc, mappedField, mappedField.getType(), op, val, typeValidationFailures);
            List<ValidationFailure> subclassValidationFailures = new ArrayList<>();
            boolean compatibleForSubclass = isCompatibleForOperator(mc, mappedField, mappedField.getSubClass(), op, val, subclassValidationFailures);

            if ((mappedField.isSingleValue() && !compatibleForType)
                || mappedField.isMultipleValues() && !(compatibleForSubclass || compatibleForType)) {

                if (LOG.isWarningEnabled()) {
                    LOG.warning(format("The type(s) for the query/update may be inconsistent; using an instance of type '%s' "
                                       + "for the field '%s.%s' which is declared as '%s'", val.getClass().getName(),
                                       mappedField.getDeclaringClass().getName(), mappedField.getJavaFieldName(), mappedField.getType().getName()
                                      ));
                    typeValidationFailures.addAll(subclassValidationFailures);
                    LOG.warning("Validation warnings: \n" + typeValidationFailures);
                }
            }
        }
    }

Not surprisingly, our new validateTypes() method needs op and val, the parameters we identified earlier, and the validateTypes parameter.  In addition, it also needs validatedField, which is effectively the output from the first half of the method, and a MappedClass mc.  It looks like mc is also an output of the earlier half of the method.

Step 2: Put all outputs from validateQuery into ValidatedField

The reason we introduced ValidatedField in the last article was to encapsulate multiple return values from validateQuery.  It seems logical, therefore, to add the MappedClass mc to this class. Again, we have the option of adding it to the MappedField instead (and I would certainly question why the MappedField doesn’t already have it), but we know from our earlier investigations that MappedField is sometimes null, and we expect MappedClass to always be present.

Add mappeedClass to ValidatedField

I’ve added a mappedClass field to the ValidatedQuery, and I used inline variable to use this new field everywhere I was using mc (which has automatically been removed). Next I can remove the MappedClass parameter from the validatedTypes method.

Remove MappedClass from validateTypes

This now looks like:

private static void validateTypes(FilterOperator op, Object val, boolean validateTypes, ValidatedField validatedField) {
    if (validateTypes && validatedField.mappedField.isPresent()) {
        MappedField mappedField = validatedField.mappedField.get();
        List<ValidationFailure> typeValidationFailures = new ArrayList<>();
        boolean compatibleForType = isCompatibleForOperator(validatedField.mappedClass, mappedField,
                                                            mappedField.getType(), op, val, typeValidationFailures);
        List<ValidationFailure> subclassValidationFailures = new ArrayList<>();
        boolean compatibleForSubclass = isCompatibleForOperator(validatedField.mappedClass, mappedField,
                                                                mappedField.getSubClass(), op, val, subclassValidationFailures);

        if ((mappedField.isSingleValue() && !compatibleForType)
            || mappedField.isMultipleValues() && !(compatibleForSubclass || compatibleForType)) {
            logWarningForMismatchedTypes(val, mappedField, typeValidationFailures, subclassValidationFailures);
        }
    }
}

(Sorry, I know it’s not the most readable code).

Step 3: Get the callers of validateQuery to call validateTypes as well

All the values that are passed in to this method are available to anything that called validateQuery, therefore we can push the call of validateTypes to the callers of validateQuery.  Firstly, we need to remove the private keyword from validateTypes so that it can be called from other classes in the package.

Then, we find all the callers of validateQuery using Alt + F7.
Find usages

From here we can get the callers to call validateTypes directly

Call validateTypes directly

Now all the call sites make this call themselves, we no longer need validateQuery to call it, we can just remove the call to validateTypes from the end of the method.

Remove call to validateTypes

When we changed the callers, we saw half of the places that call validateQuery don’t need to call validateTypes.  The other two cases we don’t know, so we wrapped the call inside an if statement, for example:

final ValidatedField validatedField = validateQuery(query.getEntityClass(), 
                                                    query.getDatastore().getMapper(),
                                                    fieldName, 
                                                    op,
                                                    value,
                                                    validateNames,
                                                    validateTypes);
if (validateTypes) {
    validateTypes(op, value, validateTypes, validatedField);
}

Now we’ve done that, IntellIJ IDEA is hinting that we may be able to make further simplifications:

07-validate-types-flag-is-always-true2

We’ve moved the check for validateType to the caller of the method (since this place often knows whether it needs to call the method or not).  Therefore we can remove the check from inside the validateTypes method, and remove the flag:

Remove boolean flag

By moving the call of validateTypes (and the decision of whether to call it) into the code that was calling validateQuery, this has simplified the validateTypes method a little, and put the responsibility of deciding whether to validate the types or not into the correct place.  Running the tests shows everything still works as before.

Step 4: Simplify validateQuery

We saw at the beginning that half the parameters to validateQuery weren’t needed for anything other than the section that validates the types.  Now we’ve moved that code into its own method, and we’re not even calling that method any more from inside validateQuery, we should be able to clean up some of the parameters.

Remove unused parameters

In Summary

Our validateQuery method is now down to 57 lines of code, from 75, which is a noticeable improvement (I can nearly fit it on the screen without scrolling).  More importantly, pushing validateTypes into its own method means we no longer need to understand an additional boolean flag, and some of our calling code is much simpler because it doesn’t need to call this method at all.

When we stumble across very long methods we normally want to break them into smaller ones.  In some cases, this process is easier than others.  If the method takes groups of parameters that don’t interact with each other, for example in our case where the first half of the parameters are used in the first half of the method, and the second half are used in the other section of code, there’s a pretty clear indication of where the method should be split.

In our example, this is made even clearer by the use of flags turning functionality on or off. The use of a flag parameter may indicate that a method is doing two different things, and therefore should be split into two different methods.  If the caller of a method can pass in a flag turning functionality on or off, that same caller can make the decision themselves (or avoid the call altogether).

In our case, the functionality was grouped together because the second part of the method relied on the output of the first. We solved this issue by adding another field to the new ValidatedField class we introduced in the last refactoring, precisely to deal with multiple return values. The fact that the method needs to return multiple return values at all might be yet another smell, but let’s tackle one problem at a time.

Symptoms:

  • Groups of parameters that are used in non-overlapping sections of code.  E.g. three parameters used in the first half of a method, two in the second half.
  • Boolean parameters used to turn whole blocks of code on or off.
  • One part of the method is used to calculate a value (or values) that are only used as “input” to a later block of code.

Possible Solutions:

  • Extract distinct blocks of code (e.g. code from within an if statement that’s checking a boolean parameter) into its own method, and push the call of this new method to the caller of the original method (see: Replace Parameter with Explicit Methods).
  • If the complex method with non-overlapping parameters is a constructor (or if you have two constructors with very different parameters), create two separate classes for the two separate contexts/sets of values (see: Extract Class).
image description