TDD Myths

By Phlip Plumlee
May 29, 2010 | Comments: 3

While Extreme Programming remains a valuable template for project success, the decade of its adoption also saw the rise of many dilutions and derivatives. The waters that were once very clear are now muddied. For example, XP says "test first", not "write developer tests occasionally." While one can debate the merits of Test-Driven Development as a design technique, no rational excuse exists why a simple, failing test cannot precede each line of code. If you don't know what the test would be, how could you be ready to write that line?

This post debunks some common myths about TDD.

Should We Blindly Trust Developer Tests When Refactoring?

Software which cannot change is not worthy of the name "soft". Even if a forgiving customer can wait a year for a perfect design, that design still grows via incremental changes. As you learn from it, you will always find new features to add, and new designs to hold all features together.

So let's split the word "change" into two different activities: "growing" and "refactoring". Growing means adding new features while leaving the design the same; refactoring means improving the design while leaving the behavior the same. We should explicitly say, out loud to our pair programmer, "we are refactoring now," when switching to the second activity. Giving it a name helps us clearly switch our mode of behavior. We should switch frequently between growing and squeezing our designs.

  • Add details to increase knowledge
  • Remove details to increase wisdom --some zen koan

In refactoring mode, we hit our one test button, and hit our integration button, MOAR often than when growing. Yes, growing is still a little like hacking, and debugging with trace statements (or worse). Refactoring is not. You run your tests often, and you even integrate, between each tiny step of each refactoring recipe. If the tests break, you Undo your refactor and try again.
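
Here is a minimal sketch of one such tiny step, in Python (the function and test names are hypothetical, not from any real project): extract a duplicated expression, run the test, and keep the change only if the bar stays green.

    import unittest

    def order_total(prices, tax_rate):
        # Tiny refactor step: the subtotal used to be computed inline, twice.
        # Extracting it changes the design, not the behavior.
        subtotal = sum(prices)
        return subtotal + subtotal * tax_rate

    class OrderTotalTest(unittest.TestCase):
        def test_total_includes_tax(self):
            # This test pins the behavior; it must pass before and after the step.
            self.assertAlmostEqual(order_total([10.0, 5.0], 0.10), 16.5)

    if __name__ == "__main__":
        unittest.main()  # hit this after every tiny step; if it fails, Undo and retry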

It is teeth-clenchingly obvious that refactoring might add bugs. Hiding the name "refactoring" inside some other design technique, such as a miraculously bug-free UML diagram, does not fix the problem. So trust your tests as you would trust poisonous snakes with fangs, and constantly seek to improve them. A sucky design is a much worse problem, as the time required to add new features goes up.

Can Tests Catch All Bugs?

Of course not. Tests will never catch all problems that might distress a user, and they will never even catch all bugs inserted during refactoring. Testing cannot even tell you how often a given refactor step inserts a bug that the next refactor step then squelches. You know what you are doing - to the limit of your normal mortal memory - and your tests know what they are doing, even if they slavishly pin down some feature that's wrong.

Tests give you a hint that you are preserving each detail your team once cared about - "the shipping address defaults to the billing address"; "sort the mesh's normals before reducing its vertices"; "the network timeouts cannot cascade." That's more than enough for most mortal development, because most mortal development must refactor anyway, and so a little safety net goes a long way.
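
Pinning one of those details can be a one-assert test. A minimal sketch, assuming a hypothetical Order class (not code from this post):

    import unittest

    class Order:
        """Hypothetical order object, here only to illustrate a pinned detail."""
        def __init__(self, billing_address, shipping_address=None):
            self.billing_address = billing_address
            # The detail the team once cared about:
            self.shipping_address = shipping_address or billing_address

    class OrderDefaultsTest(unittest.TestCase):
        def test_shipping_address_defaults_to_billing_address(self):
            order = Order(billing_address="123 Main St")
            self.assertEqual(order.shipping_address, order.billing_address)

    if __name__ == "__main__":
        unittest.main()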

Further, some tests are "hyperactive". They fail even when a user would have seen no bug. For example, imagine some complex algorithm that transforms data through a series of steps. Tests should not require the algorithm by name, and they don't really care if the output is "better than" the input. They will specifically care, however, that each detail of the algorithm converts its data correctly. A given input X, for example, should process into a given output Y. Both X and Y appear in the assertions. Another test case will show that the next step converts Y to Z.

Now imagine you change the code and make a trivial mistake: instead of X->Y->Z, the algorithm produces X->B->Z. B is just some trivial transformation of Y that the user does not care about. But the tests do.
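
A sketch of such step-pinning tests, with hypothetical step names standing in for X->Y->Z: each assertion cares about the intermediate value, so a change that produces some B instead of Y fails the first test even if a user would never notice.

    import unittest

    def normalize(x):      # hypothetical first step: X -> Y
        return sorted(x)

    def deduplicate(y):    # hypothetical second step: Y -> Z
        result = []
        for item in y:
            if not result or result[-1] != item:
                result.append(item)
        return result

    class PipelineStepTests(unittest.TestCase):
        def test_normalize_converts_X_to_Y(self):
            # Pins the intermediate value Y, not just the final output Z.
            # If normalize starts producing some equally-harmless B, this
            # "hyperactive" assertion fails anyway.
            self.assertEqual(normalize([3, 1, 2, 2]), [1, 2, 2, 3])

        def test_deduplicate_converts_Y_to_Z(self):
            self.assertEqual(deduplicate([1, 2, 2, 3]), [1, 2, 3])

    if __name__ == "__main__":
        unittest.main()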

If your hyperactive test fails, revert your last change, and find a simpler change that achieves Y. Why? Because the research to determine whether B is really a bug is itself usually a waste of time. Revert to Y, and try a different change. Pure refactors should never alter algorithms. Measure twice and cut once.

Everyone knows that tests cannot exhaustively cover every path through a program. However, when the majority of tests safely cover the majority of features, the tests that overreach their goals can afford to be more exhaustive, while remaining very easy to write.

That's why lots of simple tests help force code to remain elegant.

How Can Test-First Improve Design?

Let's take a degenerate situation. I have a line of code, very deep in my classes, that relies on many variables, all of which must be set up correctly. If I want to write a new test case on it, how the heck can a test case which sets up all those variables, then calls all those class layers, improve the design of that line?

To understand how you got in this situation, pull the scenario inside out like a sock. Fine-grained tests force designs to decouple.

Suppose you wrote all your code, before that line, via simple & clear test cases, written first. Because each test has a simple design and simple setup, none of the production code can grow too complex to test trivially. New test cases should be modified clones of old test cases.

Yes, some situations are not greenfield. You must work with the code you've got. Test-firsting still applies incredible pressure to that design. Your test-side code, if it gets the refactors it needs, will grow custom setup steps and custom assertions that hide those complex details. Over time, they give you breathing room to start to squeeze on the design of that bad code, too.
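
A sketch of that test-side growth, with hypothetical names throughout: a custom setup step and a custom assertion hide the noisy legacy details, so each new test case stays a modified clone a few lines long.

    import unittest

    class LegacyCheckout:
        """Stand-in for awkward legacy code: it demands many arguments,
        all of which must be set up correctly. (Hypothetical.)"""
        def __init__(self, country, currency, tax_rate, prices, items):
            self.country = country
            self.currency = currency
            self.tax_rate = tax_rate
            self.prices = prices
            self.items = items

        def grand_total(self):
            subtotal = sum(self.prices[sku] * qty for sku, qty in self.items)
            return round(subtotal * (1 + self.tax_rate), 2)

    def make_checkout(items, tax_rate=0.0):
        """Custom setup step: hides the noisy constructor behind two arguments."""
        return LegacyCheckout(country="US", currency="USD", tax_rate=tax_rate,
                              prices={"SKU-1": 5.00, "SKU-2": 7.50}, items=items)

    class CheckoutTest(unittest.TestCase):
        def assert_total(self, checkout, expected):
            """Custom assertion: the one place that knows how to read the total."""
            self.assertAlmostEqual(checkout.grand_total(), expected)

        def test_single_item_total(self):
            self.assert_total(make_checkout([("SKU-1", 2)]), 10.00)

        def test_tax_applies_to_subtotal(self):
            # A modified clone of the test above, with different sample data.
            self.assert_total(make_checkout([("SKU-2", 2)], tax_rate=0.10), 16.50)

    if __name__ == "__main__":
        unittest.main()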

If you have a design goal, you should, perforce, be able to think of a series of tests and refactors that force that design to emerge. If you can't reach it, then you have successfully tested that it was not such a good goal! And if you don't have a design goal, then DRY code will be easy to change and hard to break, which is the goal of a good design, anyway.

If I Can't Test It, Can I Ship It?

Ask your clients. However, if your clients have experience comparing test-first projects to classical projects - with their schedule slips and high bug rates - they will probably agree not to ship that new feature, just this time.

You can always ship a version of the code committed before the bug was added. That's what version control is for.

Code is complex by nature, and so are the hardest bugs. A complex system can easily generate a bug, via "emergent behavior", that resists capture by tests. Don't ship that feature until tests tell you it's ready.

When you have no choice but to debug, you are stuck in "punitive debugging". You cannot revert (because someone integrated an important feature after the bug got inserted), and you cannot capture the bug with a test (because someone was too busy debugging to write nearby tests). A huge, test-free project could waste the first week of a punitive debugging session just identifying the failing module.

When you debug, to explore a procedure's implementation, you typically set up some input data, run the program, and examine some intermediate variables at a breakpoint. Then you turn off your IDE, and all that intricate awareness of the procedure's contracts disappears.

When you write a test case, to explore a procedure's implementation, you typically assemble some input data, activate a method, and assert some intermediate variables. When you turn off your IDE, all that intricate awareness of the procedure's contract REMAINS IN THE SYSTEM FOREVER.
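
A minimal sketch of the second case, with a hypothetical procedure under exploration: the same input data, activation, and checks you would have poked at in the debugger, written down where they stay.

    import unittest

    def parse_duration(text):
        """Hypothetical procedure under exploration: '2h30m' -> total minutes."""
        hours, _, minutes = text.partition("h")
        return int(hours) * 60 + int(minutes.rstrip("m") or 0)

    class ParseDurationTest(unittest.TestCase):
        def test_hours_and_minutes(self):
            # What you would otherwise inspect at a breakpoint, then lose.
            self.assertEqual(parse_duration("2h30m"), 150)

        def test_hours_only(self):
            self.assertEqual(parse_duration("2h"), 120)

    if __name__ == "__main__":
        unittest.main()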

The odds that you must debug that region again go way down, if you debugged it with a test case.

TDD replaces long horrible hours of debugging with short breezy minutes writing simple tests. Relentless testing forces you to acknowledge and fix any weaknesses in your tests. You will have time to work on them, and on other testing techniques, because the time spent debugging went down, and the time spent in punitive debugging went way down. You can always revert, and you can always write a new test case.

Should an Organization Require Testing?

This is like asking if regulators should regulate their industries. Regulations that force people to waste time with useless bookyuck? No. Regulations that reward positive results? Hell yes.

Imagine a manager and an executive hovering over your shoulder, asking when feature X will be ready. (Frequent Releases defray this problem, but it still happens.) Now suppose that feature X is only a few lines of code. But you are not writing feature X. You are not even writing its test. You are hacking in the kernels of your libraries, wasting time researching exactly how to write a meaningful test for X.

That corporate standard has your back. It's what resists your executive's hints (or, ahem, outright orders) to cut corners and add risk to a project. It's what helps your manager explain the situation to your executive.

And a huge base of developer tests supports Behavior-Driven Development tests. These put your civilian clients in direct control of your project's features.

How Can Test-First Reduce Debugging?

Easy. If the tests fail unexpectedly, or in an unexpected way...

revert until they pass.

That's all there is to it. Your Undo button, and your version control system, have turned into a magic button labeled "get me out of debugging hell". Some programmers would give an arm and a leg for an editor with a debugging button like that. And you get it by writing and running tests relentlessly.

TDD is much, much faster than debugger-driven development.

Conclusion

Software growth should be a process that makes your next test case easier, not harder, over time. You get that with a simple cycle...

  • Find your next line to write or change. Don't change it yet.
  • Find the test cases that call that code region, and clone one
    (change its sample data when you clone, just to prove you can).
  • Add an assertion that fails because the code has not changed yet.
  • When the test fails for the correct reason, add or change the line of code.
  • When the test passes, you are free.

You can refactor, to merge working code into a better design. You can integrate, deploy, or release, because all code changes were simple.
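
One turn of that cycle might look like this in Python (the discount rule and names are hypothetical, only to show the shape): clone a passing test, change its sample data, add an assertion that fails for the right reason, then change the one line that makes it pass.

    import unittest

    def discount(subtotal):
        # The changed line: orders of 100.00 or more now get 10% off.
        return subtotal * 0.9 if subtotal >= 100.00 else subtotal

    class DiscountTest(unittest.TestCase):
        def test_small_order_pays_full_price(self):
            # The old, already-passing test case.
            self.assertAlmostEqual(discount(40.00), 40.00)

        def test_large_order_gets_ten_percent_off(self):
            # The clone: new sample data, plus an assertion that failed
            # for the correct reason until the line above was changed.
            self.assertAlmostEqual(discount(100.00), 90.00)

    if __name__ == "__main__":
        unittest.main()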

The downside of the XP diaspora is a generation of programmers who misunderstand these principles, then complain about the misunderstandings. So maybe it's time for a new generation of XP teachers.


3 Comments

Good article. However, the most common complaint I've heard about TDD is that it becomes "think last development".

This article has a glaring example of why that claim is made. Your approach to a broken test is "REVERT!" and you state it in such a way as to imply that no alternative should be considered.

I understand the reasoning behind it. It's definitely the safest, fastest way to return to a passing state, but it is not the fastest way to develop software. After reverting all your changes, you now have to write that functionality all over again. Sometimes this is faster than a difficult debugging session. Sometimes it is not.

I am a firm supporter of TDD, but I don't believe in dogmatically following any practice at the expense of allowing myself rational thought.

Many TDD detractors never even write tests & run them often enough to be _able_ to revert. So at the slightest sign of trouble, out comes the debugger.

(Now imagine if our industry, instead of spending 30 years perfecting debuggers, had instead perfected TDD harnesses!)

On a greenfield project, using only well-factored libraries, your new code can be so clean that you might revert simply because a test run failed to perform the way you expected. Going one line at a time, you only lose a line.

In the middle ground, suppose you have a large project, and after each edit you run tests for your current module. Then when you integrate, tests for other modules fail. Yes, you now have more than a trivial amount of code changes at risk. Which one caused the error?

In this situation, I use

git diff > vive_la_difference.diff    # save every pending edit to a patch file
git reset --hard HEAD                 # then discard them all from the working tree

Then I read that diff file (in an editor that syntax-highlights it). I find the easiest, most trivial edit, such as changing a comment. I re-do that edit in the code, pass all tests, integrate, and remove the line from the diff file.

I keep going until (a) I get bored, or (b) all the edits are integrated, safely, in order from easy to hard.

It is a curiosity of programming that sometimes I won't even find the error: either I wander away from the diff and start developing again, or I enter the edits in a different order and everything just snaps into place.

Under stress, revert _more_, integrate _more_, run the complete test batch _a_little_more_, write new test cases _more_, and hit that one test button _more_.

All that dogma will free up your time (especially during test runs!) for this "rational thought" thing that so many have earnestly & painstakingly attempted to explain to me.

"There Must Be 50 Ways To Leave Your Debugger" --Joshua Kerievsky
