Thursday, April 12, 2012

Testing Better than the TSA

Yesterday I came across a posting titled “Testing like the TSA” by David at 37signals. He makes a case against over-testing, and argues that developers who fall in love with TDD have a tendency to overdo it in the beginning. Of course, when somebody learns or discovers a new tool, there's always the danger of using it too much – in fact, I'd say it's part of the learning process to find out where the new tool is applicable and where it isn't. But judging from the seven don'ts of testing David lists in his posting, my impression is that he should probably be doing more testing rather than less. Here are my comments:

1. Don’t aim for 100% coverage.
Sure, but then again, what's 100% test coverage? In any system there will be places where it doesn't make sense to write tests, because the code is trivial or because writing the tests would cost more energy than they could ever save you by catching a bug. Typical examples of the latter are interfaces, like the GUI or the main entry method from the command line. But at other times, 100% line coverage isn't enough, and even 100% branch coverage won't do, because it leaves some important paths through your system untested (more on that in the sketch below). And to be honest, I've found myself writing unit tests that check logging messages, because those messages were a vital part of the system.

I get suspicious when a system claims 100% test coverage, but the core classes should be pretty close to it. As a consequence, I do aim for 100% test coverage, even though I know I won't reach it. It's just like aiming for the bull's eye, even when you're not good at darts. If you're not aiming for 100% (or the bull's eye), then what are you aiming for, and what will the result be?
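To show what I mean about branch coverage, here's a minimal sketch (the Discount class and the amounts are made up, and I'm assuming JUnit 4): two tests are enough for 100% branch coverage, but only a third test exercises the path where both conditions interact.

// Discount.java -- two independent conditions.
public class Discount {

    public double apply(double amount, boolean loyalCustomer) {
        double result = amount;
        if (amount > 100.0) {
            result = result * 0.95;   // 5% volume rebate
        }
        if (loyalCustomer) {
            result = result - 10.0;   // flat loyalty rebate
        }
        return result;
    }
}

// DiscountTest.java
import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class DiscountTest {

    // These two tests already give 100% branch coverage:
    // each if-statement is taken once and skipped once.
    @Test
    public void largeOrderGetsVolumeRebate() {
        assertEquals(190.0, new Discount().apply(200.0, false), 0.001);
    }

    @Test
    public void loyalCustomerGetsFlatRebate() {
        assertEquals(40.0, new Discount().apply(50.0, true), 0.001);
    }

    // Only this test walks the path where both rebates apply,
    // which is exactly where an ordering or rounding bug would hide.
    @Test
    public void loyalCustomerWithLargeOrderGetsBothRebates() {
        assertEquals(180.0, new Discount().apply(200.0, true), 0.001);
    }
}

A coverage report would call this class fully covered after the first two tests; the third one is the one that actually protects you.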

2. Code-to-test ratios above 1:2 is a smell, above 1:3 is a stink.
That just doesn't make sense to me. I'd say a code-to-test ratio below 1:2 sounds very wrong. Just think of a single function point, such as a condition: you need at least one positive and one negative case to be able to say your code does what it's supposed to do. And that's really just the minimum. I think that if you're doing unit testing right, you'll probably end up with something between two and five unit tests per function point.
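To make the arithmetic concrete, here's a minimal sketch (a made-up PasswordPolicy class, JUnit 4 assumed): a single one-line condition already asks for a positive case, a negative case, and the boundary, so the test code outweighs the production code by far more than two to one without any padding.

// PasswordPolicy.java -- a single function point: one condition.
public class PasswordPolicy {

    public boolean isAcceptable(String password) {
        return password != null && password.length() >= 8;
    }
}

// PasswordPolicyTest.java -- three tests for that one condition.
import org.junit.Test;
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

public class PasswordPolicyTest {

    private final PasswordPolicy policy = new PasswordPolicy();

    @Test
    public void acceptsPasswordOfExactlyEightCharacters() {
        assertTrue(policy.isAcceptable("12345678"));
    }

    @Test
    public void rejectsPasswordOfSevenCharacters() {
        assertFalse(policy.isAcceptable("1234567"));
    }

    @Test
    public void rejectsMissingPassword() {
        assertFalse(policy.isAcceptable(null));
    }
}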

3. You’re probably doing it wrong if testing is taking more than 1/3 of your time.
4. You’re definitely doing it wrong if it’s taking up more than half.
That's probably right, but only because your test code should be much simpler than the system you're trying to implement.

5. Don’t test standard Active Record associations, validations, or scopes. Reserve integration testing for issues arising from the integration of separate elements (aka don’t integration test things that can be unit tested instead). 
I don't know about those Active Records (I'm a Java programmer), but I agree that integration testing should be about integration, not about things that a unit test could cover just as well (see the sketch below).
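As a rough Java analogue of "don't integration test what you can unit test" (Customer and its e-mail rule are made up, JUnit 4 assumed): if the validation rule lives in a plain class, a plain unit test covers it, and no database or framework has to be started.

// Customer.java -- the validation rule lives in a plain object.
public class Customer {

    private final String email;

    public Customer(String email) {
        this.email = email;
    }

    public boolean hasValidEmail() {
        return email != null && email.contains("@");
    }
}

// CustomerTest.java -- a plain unit test, no database round-trip needed.
import org.junit.Test;
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

public class CustomerTest {

    @Test
    public void acceptsAddressWithAtSign() {
        assertTrue(new Customer("jane@example.com").hasValidEmail());
    }

    @Test
    public void rejectsAddressWithoutAtSign() {
        assertFalse(new Customer("jane.example.com").hasValidEmail());
    }
}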

6. Don’t use Cucumber unless you live in the magic kingdom of non-programmers-writing-tests (and send me a bottle of fairy dust if you’re there!)
I agree on this one. I like the idea of Cucumber, and I've seen some great talks about it at conferences, but I've never seen it used in a real system and have no idea how I would ever be able to use it in any system at all.

7. Don’t force yourself to test-first every controller, model, and view (my ratio is typically 20% test-first, 80% test-after).
I do force myself, and I know it works for me, but I also know it doesn't work for everybody. I can live with that (some can't). As a consequence, though, my ratio is quite different. In a typical project, where I can write code exactly the way I want, I break the strict TDD rule of writing one test at a time in a different way than David does: I often find myself writing two or three unit tests before I start programming, especially when I start working on a new component or a new feature. Sometimes I need that second or third unit test to get the first one right, for example to decide what the component's interface should look like. I find that easier than getting it wrong the first time and having to refactor everything five minutes later. But I guess this makes my ratio something like 40% multiple tests first, 40% single test first, and 20% tests after. The reason I still write a number of unit tests after the code is that coverage reports and mutation testing often point me to branches and paths in my code that haven't been tested well enough.
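Here's a rough sketch of what those two or three tests written up front look like (RomanNumeral is a made-up component, JUnit 4 assumed): writing the tests before any real implementation already settles what the interface should look like, in this case a static method that takes an int and returns a String. The skeleton exists only so the tests compile and fail.

// RomanNumeralTest.java -- written first, before any real implementation.
import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class RomanNumeralTest {

    @Test
    public void convertsOne() {
        assertEquals("I", RomanNumeral.fromArabic(1));
    }

    @Test
    public void convertsFour() {
        assertEquals("IV", RomanNumeral.fromArabic(4));
    }

    @Test
    public void convertsNineteenEightyFour() {
        assertEquals("MCMLXXXIV", RomanNumeral.fromArabic(1984));
    }
}

// RomanNumeral.java -- just enough skeleton to make the tests compile;
// they fail first and then drive the implementation.
public class RomanNumeral {

    public static String fromArabic(int number) {
        throw new UnsupportedOperationException("not implemented yet");
    }
}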

I don't like testing TSA-style either, i.e. writing lots of tests just to reach a high coverage number. That's coverage theater, as David puts it, and it usually leads to nothing but a large number of brittle tests that break down after the first refactoring. But we should aim high when we write unit tests, and try to do better than the TSA. I'd rather put in another five minutes to get the tests right than spend an hour four months later trying to find out where a bug is hiding…