As I said earlier, I’ve been reading Michael Feathers’ _Working Effectively With Legacy Code_ recently (I’m taking my time with it; it’s a good book). He spends a lot of time on techniques for getting code under test, but very little on finding out whether (and to what extent) a method is under test in the first place. So here’s how I do it:
- Start with working tests.
- Enable assertions when running your tests (worth doing all the time!)
- Start the method with `assert false : "Forced test failure";`
- Run all your tests.
- The tests that break are the ones that exercise the method.
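As a minimal sketch of the tripwire (the class and method names here are invented for illustration), note that the `assert` only fires when the JVM runs with assertions enabled, e.g. `java -ea`:

```java
// Hypothetical class whose test coverage we want to map.
public class PriceCalculator {

    public int discountedPrice(int price, int percent) {
        // Temporary tripwire: every test that reaches this method now fails
        // with "Forced test failure". Delete this line once you've noted
        // which tests broke.
        assert false : "Forced test failure";
        return price - (price * percent) / 100;
    }
}
```

Run the suite, note the failures, then remove the `assert`. From the command line you need `-ea` (or `-enableassertions`); if I remember right, Maven’s Surefire plugin runs tests with assertions enabled by default.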
Having identified the tests, you can inspect them to see which (if any) is the right one to extend to cover the new or changed functionality. Most important of all, you can inspect them to see what kinds of changes you could make that wouldn’t cause test failures, because it’s exactly those holes in test coverage that let you introduce bugs.
Now, it’s easy to say that, with TDD, you end up with good correlation between code and tests. If that’s the case, you simply open up FooTest, find all the methods called testBarXYZ() and away you go. In practice, you don’t end up with that tight a correlation. Here’s why:
- First, you shouldn’t aim for a 1-1 correlation anyway. Your tests should focus on behaviour, and you’re better off grouping by behaviour than by class. Grouping by class is convenient but shouldn’t be a constraint. It’s not hard to come up with reasons to have more than one test class for a given class (repeated, but different, setup code being the main driver), and there may be many classes which don’t have direct tests.
- Second, even if you start with a 1-1 correlation, you don’t end up keeping it. When you refactor, it’s dangerous (and often not sensible) to move tests around. Thus, when you apply the Extract Method or Extract Class refactorings, you can easily end up with code units that aren’t directly tested. This is especially true with automated refactoring tools, but I’m sure it would happen if refactoring manually as well.
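To make the second point concrete, here is an invented example: after an Extract Method refactoring, the existing tests still pass through `total()`, but the extracted `taxOn()` has no test that names it directly, and the forced-failure trick is how you’d discover which tests reach it:

```java
// Hypothetical class after an Extract Method refactoring.
public class Invoice {
    private final int subtotal;

    public Invoice(int subtotal) {
        this.subtotal = subtotal;
    }

    public int total() {
        // The tests written against total() still pass unchanged...
        return subtotal + taxOn(subtotal);
    }

    // ...but this extracted helper is only tested indirectly. Planting
    // "assert false" here reveals exactly which tests exercise it.
    private int taxOn(int amount) {
        return amount / 5; // flat 20% tax, purely illustrative
    }
}
```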
A side effect of this technique is that it shows you how extensive the afferent coupling to the method really is; a huge number of test failures is often a code smell indicating some sort of God class (utility methods excepted, but then utility methods don’t change that much either).
(In the steps above, “all your tests” can also be a subset that runs fast.)