Difficulties With Test Metrics

The question of whether we should write automated test suites has largely been settled. We absolutely should write unit tests, and possibly integration and end-to-end tests as well. But as acceptance of this practice grew and adoption became more widespread, a follow-up question arose: how many tests should we write? How do we know when we’ve written enough? And thus was born the metric of code coverage.

The problem with code coverage is that while it gives managers a nice number to point at and say “this is how far along we are in constructing our test suite,” it doesn’t actually measure the thing we’re interested in. What we’d really like to know is whether we’ve formally defined and tested all of the requirements of our software. Code coverage does not answer this question.

If, for example, I have 80% code coverage from my test suite, it could mean that I’ve only tested 80% of my requirements, or it could mean that 20% of the code in my software is completely unnecessary. And even if I have 100% code coverage, I may still have untested requirements, because the covered lines may not have been executed with a sufficient range of possible values. I’ve also, up to now, ignored the implicit assumption that your tests actually verify results and aren’t just there to game the metric. Of course software developers would never attempt to game some meaningless metric to please their managers /s.
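To make that concrete, here’s a minimal sketch (the function, its requirement, and the tests are all hypothetical) of how a suite can reach 100% line coverage while leaving a requirement unverified or untested:

```python
def shipping_cost(weight_kg: float) -> float:
    """Hypothetical requirement: orders over 10 kg ship for a flat $20;
    lighter orders ship at $2 per kg."""
    if weight_kg > 10:
        return 20.0
    return 2.0 * weight_kg


def test_shipping_cost_hits_every_line():
    # Executes both branches, so a coverage tool reports 100%...
    shipping_cost(5)
    shipping_cost(15)
    # ...but with no assertions, it verifies nothing about the requirement.


def test_shipping_cost_with_assertions():
    assert shipping_cost(5) == 10.0
    assert shipping_cost(15) == 20.0
    # Still 100% coverage, yet the behaviour at exactly 10 kg (the boundary
    # the requirement hinges on) has never been specified or exercised.
```

Run either version under a coverage tool and the number looks perfect, which is exactly how little the number tells you on its own.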

So if code coverage isn’t a metric we should be particularly concerned with, is there a better one? Unfortunately, none that I know of. We may be able to write down a list of requirements and point to some of them and say “these aren’t tested,” but an additional difficulty comes from the nature of most software development. We are quite frequently discovering our requirements through trial and error. There are often requirements we don’t know about, and we must often develop the software as a way of discovering what it’s supposed to do.

It is for this reason that I am a big proponent of the practices of test-driven development (TDD) and pair programming. By adhering strictly to the TDD rule that I write no more code than is necessary to pass my current test suite, I know that I have satisfied only the requirements I know and care about so far; any missing requirement then shows up as a missing capability of the software, as sketched below. By involving domain experts in my pair programming sessions, I can be confident that my tests actually do specify and test the requirements.
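As a rough illustration of that rhythm (the pricing rule and names here are hypothetical, not from any particular project), each new requirement arrives as a failing test, and only then do I write the code to satisfy it:

```python
# Step 1: state the next requirement I care about as a test. It fails at
# first, because price_with_discount does not exist (or satisfy it) yet.
def test_loyal_customers_get_a_ten_percent_discount():
    assert price_with_discount(100.0, loyal=True) == 90.0


def test_other_customers_pay_full_price():
    assert price_with_discount(100.0, loyal=False) == 100.0


# Step 2: write only enough code to make the current tests pass.
def price_with_discount(price: float, loyal: bool) -> float:
    return price * 0.9 if loyal else price

# A requirement no one has written a test for yet (say, a cap on the total
# discount) is simply absent from the code, so the gap is visible as a
# missing capability rather than hiding as untested lines.
```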

This is why the clarity of your test suite code and the capabilities of your testing framework are so important. If I am new to a project, and the test suite is hard to read and understand, and the testing framework can’t nicely report the requirements to me, then I can’t tell what the requirements of the software actually are. And if I can’t tell what the requirements are, I’m going to have a hard time making modifications that conform to them.
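One way to get that readability, sketched here with a hypothetical password policy and hypothetical names, is to write test names that read as requirements, so a verbose test run doubles as a requirements report:

```python
def is_valid_password(candidate: str) -> bool:
    """Hypothetical implementation under test."""
    return len(candidate) >= 8 and any(ch.isdigit() for ch in candidate)


class TestPasswordPolicy:
    """Each test name states one requirement in plain language."""

    def test_rejects_passwords_shorter_than_eight_characters(self):
        assert not is_valid_password("shor1")

    def test_rejects_passwords_without_a_digit(self):
        assert not is_valid_password("nodigitshere")

    def test_accepts_passwords_meeting_every_rule(self):
        assert is_valid_password("longenough1")
```

Running the suite verbosely (for example, `pytest -v`) prints those names, giving a newcomer a readable list of what the software is currently specified to do.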

So spend less time worrying about code coverage, and more time worrying about requirements coverage and your test suite’s comprehensibility.
