Expressing Intent vs Duplication

I had an interesting conversation with a colleague this afternoon. It centred around what was more important: expressing intent (which I was advocating) vs. removing duplication.

Some context: we were refactoring a poorly written unit test that was aimed at testing certain validation rules. The tests tended to be very similar: create an object by setting a million and one properties (with the one that was being tested for being mandatory commented out), try to validate it, and check the resulting error.

What we ended up with was a helper method that created a valid object. The individual tests, then, applied transformations that rendered the object invalid, and checked the errors. We were focusing on improving the existing tests, not changing them, and that may have contributed to the resulting discussion. 🙂

The majority of tests checked one property at a time. Some looked at multiple properties (don’t ask me why… I was “visiting” the project, so I don’t actually understand the code being tested). Due to the nature of the object being constructed (a HttpRequest), the properties were actually a map. What I argued was this: “The test is checking that property X is mandatory, and that the error text is this resource key. It should do those as two steps, like so:”

public void testFooIsRequired() {
  Map properties = createValidProperties();   // Actually, this line was in the setUp() method.
  properties.remove("foo");
  executeTestWithError(properties, "error.required");
}

Because most of the tests were one-property tests, Greg (my pair) was saying that we should extract a method out to remove the duplication. This meant that most tests looked like this:

public void testFooIsRequired() {
  assertRequiredProperty("foo", "error.required");
}

A handful of tests, however, had multiple properties, so they had to look like the first example above (just with multiple properties removed).

We ended up having an largely philosophical debate over which was important: removing the duplication or expressing the intent. I was strong for expressing the intent: we were applying a transformation, and then we were verifying the error message. The fact that most of the transformations were similar (remove a single property) was a co-incidence of the business rules being tested; more complex business rules could easily have produced more complex transformations. Merging one “special case” of only one property being removed didn’t look right to me. Unless I could describe all transformations easily, extracting the special case actually makes the code less expressive, to my mind.

The moral to this story is that, to my mind, expressiveness trumps duplication. When you are removing duplication (which, BTW, is a “Good Thing™”), it is important to ensure that you did mean to say the same thing twice. Sometimes the duplication is purely co-incidental and should probably stay that way. Other times, it is an indication that you should probably refactor what you have to put more emphasis on the differences.

Advertisements

Author: Robert Watkins

My name is Robert Watkins. I am a software developer and have been for over 18 years now. I currently work for people, but my opinions here are in no way endorsed by them (which is cool; their opinions aren’t endorsed by me either). My main professional interests are in Java development, using Agile methods, with a historical focus on building web based applications. I’m also a Mac-fan and love my iPhone, which I’m currently learning how to code for. I live and work in Brisbane, Australia, but I grew up in the Northern Territory, and still find Brisbane too cold (after 16 years here). I’m married, with two children and one cat. My politics are socialist in tendency, my religious affiliation is atheist (aka “none of the above”), my attitude is condescending and my moral standing is lying down.

6 thoughts on “Expressing Intent vs Duplication”

  1. Good post – I agree and I have something to add.

    Heavy duplication of code in tests (whether copy-pasted or factored out in a separate method or even worse – a separate class) is a symptom of a more fundamental problem:

    The code being tested is too coupled – it can’t be used/tested without a complex context.

    If you address this problem by making it less coupled, perhaps splitting it up in more fine grained units (classes and methods) – then you’ll see a reduced need for duplicated code in your tests too.

    And as a special bonus we also give you increased readability for free.

  2. I believe this is a false dichotomy. The challenge is to find some way to express the intent *and* remove the duplication. Acknowledging that we don’t have the skill to do that right now is fine (I have to do it all the time), but I think presenting the issue as a dichotomy sweeps the challenge under the carpet.

    Not claiming to have the answer, but would

    public void testFooIsRequired() {
    testForErrorWithOnlyOneInvalidProperty(“foo”, “error.required”);
    }

    be a step closer?

  3. Steve,

    If all the tests only had one invalid property, then yes, this would be better. However, some had multiple properties (or were testing for unusual combinations).

    With some tests having multiple properties, you end up need to either have multiple methods to do the test (which I found distasteful), or you pass the properties in as an array (which looks ugly to construct). In Java 5.0 with varargs, I can at least pass in the arguments without needing array syntax in the caller. However, that approach only works while I’m testing for required properties (one variant of a transformation). If I start testing for other illegal transformations (say, bad combinations of properties), then I end up having lots of different methods, and, to my mind, the intent becomes harder to follow.

    I felt that having the method explicitly perform the transformation, and then verify the error, was adequate. The duplication, while there, was fairly minimal and matched the corresponding duplication in the business requirements.

    What I would really like to do (and would be able to express in other languages, e.g. Smalltalk) would be to have this:

    public void testFooIsRequired() {
    testInvalidTransformation(someTransformation, “error.required”);
    }

    In Java, that’s annoying, because you end up need to write an anonymous inner class for the transformation.

    Another alternative would have been to hack the test case a little so that the test method merely applied the transformation, and the execution occured outside. (Think of using reflection to find the ‘transformXYZ’ methods, instead of the ‘testXYZ’ methods). This would have been the most expressive approach, but would have been overkill for the problem size (I think). (We could also have done the execution in the tearDown() method, but that would have just looked really odd).

    (BTW, Steve, your solution was exactly what my pair was proposing. 😉

  4. I should emphasise here that we got to this point by eliminating a large amount of duplication, while at the same time expressing our intent better. So, you’re right, Steve, the challenge is how to do both.

    Eventually, however, I feel you end up with a tension between the two. When you end up debating over fairly minimal differences (which is what Greg and I were doing), you’ve probably gone a long way from where you were.

    Hmmm… I think I need to write about co-incidental duplication sometime soon. 🙂

  5. As long as you are testing single mandatory fields, why not put them all in the same method and test agains an empty map? The test could be named something like:

    testSingleMandatoryFields()

  6. Except that the point here is that while _some_ tests have only one mandatory field, there are other tests that have multiple.

    Having one style for the “one-field” version and a different style for the “multi-field” version is inconsistent and a smell in its own right.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s