...you are doing it wrong. No matter what your version control system, your command to throw away a batch of changes is your best friend — if you leverage your test rig correctly to avoid debugging.
Let's collaborate today with our colleague Vu, as he works on the Frob module. The Frob is just one small part of a very large project, with many automated tests. Like everyone on our team, Vu scrupulously runs all those tests before committing his code changes, in one long integration test batch. (tl;dr? Already know this stuff? Skip the next section, and read the one after it.)
Incremental Testing

Vu runs the tests on the Frob module after nearly every edit. That way, if a test fails unexpectedly, he has the option to Undo his last change and try again. The tests also make an excellent platform for automated debugging, and for manual debugging with trace statements, especially when the tests help make such behaviors optional.

(Incidentally, if your incremental tests take longer than a few seconds to run, you have bigger problems than a failing test here and there. Fix the test performance first, to remove any impediment to testing at whim. If your test runner dabbles in aggressive high-end special effects, such as running your tests "in the cloud," cut these effects out of your incremental tests, and save them for your "soak tests", when you are not waiting for their output.)

Each time Vu changes some production code, or some test code, he hits a test button that saves every changed file, runs the tests, and reports the result. If the results include a failure, his test runner offers the option to navigate his editor directly to the failing line. Most failures are easy to fix, and some fixes require Vu to add new code, supporting new features.

So far so good; most tutorials for automated testing show this cycle working for small projects: for "Hello World" projects, and for newly hatched, computation-intensive projects such as geometry engines or language parsers.

Vu's project, however, lives in the real world. It's big and rambling, and many developers have worked on its many modules. And it lives within an ecology of other applications and online services. Any change to the Frob module could affect many other modules, no matter how "decoupled" they are. When the time comes to integrate, Vu must start the long integration test run, then take a short break.

And real-world projects are vulnerable to gremlins. Even if we design all of our modules to insulate them from other modules, a change to one module might change how it calls another.
If that module is sensitive to, say, the order we call its methods, or the exact encoding in our strings, or quantum tunneling between hardware circuits, or solar flares, then its behavior might change unexpectedly. If its tests, in turn, are sensitive to that change, the integration test run might break in a way that the incremental tests don't catch.

At this point, a programmer's years of experience using automated debuggers might lead them to respond to this programming challenge by whipping out their IDE, and thumping the code with it. And, plenty of times, that will work! If you find yourself applying the advice in this post one too many times, you should go in after the root problem. And your assertions should help you identify the problem, to start you on your way.

Automated tests should always use clever, verbose assertions that return as much diagnostic information as possible at fault time. Assertion diagnostics should compete with an automated debugger's "watch" systems! While tests should also be easy to write, and some assertion diagnostics might be clear as mud, the diagnostics often point directly to the solution. Dot your i-s, cross your t-s, and try the integration batch again.

Despite all these checks-n-balances, and best practices, in a real-world project with huge modules, incremental testing trades slow test runs for test runs with blind spots. At integration test time, a test case might still fail, incomprehensibly, irreproducibly, in a totally inconvenient place, such as a module everyone thought was working! Touching that module, even to debug it, just might make things worse. And, finally, we just might not feel like debugging!
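As one way to picture the advice above about verbose assertions, here is a minimal sketch of an assertion helper in plain shell. The name `assert_equal` and its report format are hypothetical, not taken from any particular test framework. It brackets both values so that an invisible difference, such as a trailing space, shows up in the failure report.

```shell
# assert_equal EXPECTED ACTUAL LABEL   (hypothetical helper, for illustration)
# Returns 0 when the two strings match. Otherwise it prints a verbose
# report to stderr and returns 1.
assert_equal() {
  if [ "$1" = "$2" ]; then
    return 0
  fi
  {
    printf 'FAIL: %s\n' "$3"
    # Brackets make leading and trailing whitespace visible.
    printf '  expected: [%s]\n' "$1"
    printf '  actual:   [%s]\n' "$2"
    # Lengths catch differences the eye might still miss.
    printf '  lengths:  expected=%s actual=%s\n' "${#1}" "${#2}"
  } >&2
  return 1
}
```

For example, `assert_equal "pass" "pass " "frob status"` returns a non-zero status and reports both bracketed values plus their lengths (4 and 5), which points straight at the stray space.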
Avoid Debugging

In general, debugging should not be your first response. Going in with overwhelming force to get rid of the problem, as quickly and safely as possible, should be your first response. First, Vu saves all his workbench changes into a backup file:
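    git diff > ../changes.diff

(Plain git diff records only unstaged changes to tracked files. If Vu has already staged some of his work, git diff HEAD captures the staged changes too.)

Your version control system might support a feature like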
git stash, to store changes for automated retrieval. Don't use it here, because it brings all your changes back together, including the ones that cause trouble. Next, Vu inspects the new
changes.diff file, to ensure he recognizes the changes it describes. Then Vu throws his code changes away:
    git reset --hard HEAD

That feeling of grim satisfaction, as our version control system reverts each change in each of our files, restoring them to the state where they (presumably) passed all their tests, should prepare us for the fun that comes next.

Vu is in this trouble because the "Grand Wazoo Test Run" failed: the integration test run that actually tests everything. We can't do anything else until we see that entire test run pass, because we don't know if our changes caused the problem, or if it was latent in the system waiting to appear, say, at this time of day. So, now that the code on our file system exactly matches the
HEAD of the code in our version control system, we must run the entire integration test run, again...

...and take a short break.

If this run fails, the project has a broken build, so Vu should inform our team about the problem. And, possibly, Vu would start debugging!

If the Grand Wazoo Test Run works, the next step is one of those "counterintuitive" things that generally makes sense only after you have tried (and suffered under) some alternatives. Vu's
changes.diff contains quite a few changes, in a few files. Some changes are merely cosmetic, such as improving indentation. Some are ancillary to the thrust of his current effort, such as new comments, or identifiers with new names. And some of those changes are important; they are the feature Vu was working on!

Vu does not start with those changes. Vu finds the simplest change in the file, even something as trivial as removing a trailing space. Vu doesn't start with the hardest change. And, for all we know, the trailing space could have caused the test failures!

Vu manually applies that change to his code. (And he marks
changes.diff with a star *, to show the change is recovered.) Now Vu runs the entire integration test batch, again, and integrates that one tiny stoopid change.

Yes, this hurts. We feel like we are just malingering, while we are not adding that feature we were supposed to add to the Frob module. This is still better than endless debugging!

And, as usual with any doctrinaire writings, rules are meant to be made, followed, and broken. Maybe your hardest change will turn out to be easy to integrate, and luck will shine upon you. It usually doesn't shine on me, hence my incredible laziness masquerading as caution.

Vu repeats this process, for each code change in
changes.diff, generally going from easy to hard. This process ensures that valuable cosmetic changes and incidental refactors go into the system first, regardless of the feature Vu was working on.

If Vu's changes were indeed harmful, at some point the situation will come to a head. By integrating a few lines at a time, at some point Vu might encounter the lines that caused the problem. And, because debugging a small change is much easier and safer than debugging a large one, extracting that small change is itself very valuable, despite the large number of "short breaks" this technique has caused us!

(Maybe Vu works for one of those smug Silicon Valley Workers' Paradise companies that provide free Yoga classes, gourmet food, and book lectures, to sponge up the free time their developers spend awaiting their test batches. Or maybe Vu reviews the code, and sketches other improvements into a scribbles file, while waiting.)

If Vu finds the line that actually caused the problem, he's free to debug, OR write the line a different way, OR fix the module that responded poorly to that line. With all the other changes from
changes.diff now "off his plate," he has the widest range of options to pick from. Even better, the line in the Frob module which caused the problem should be exposed for more automated tests, to capture that bug, neutralize it, and make a permanent example of it.

Alternatively, this technique might appease our gremlins, and their quantum fluctuations that caused the test flakiness. Maybe Vu can simply re-implement each change in
changes.diff, and this time they will all work. That has certainly happened to me before. The point is that stepping your code from fully-tested to fully-tested is much, much safer and easier than stepping into flakiness, and then trying to work your way back.

Alternatively, Vu might get bored of reading
changes.diff. After extracting some value from that file, and proving the build is not broken, Vu might just discard the file, and then try to implement his current engineering task, again, from scratch. Either way, he does so with much more awareness of what might work, and what might not. And he'll run the entire integration test batch a little more often!
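To help choose which change to re-apply first, here is a rough sketch that lists the hunks in a saved diff from smallest to largest. Treating "fewest changed lines" as a stand-in for "simplest change" is an assumption, as is the function name `list_hunks_easy_first`; the sketch also assumes git-style unified diff output and file names without spaces. It only reads the diff, so applying each change by hand remains Vu's job.

```shell
# list_hunks_easy_first FILE   (hypothetical helper, for illustration)
# Prints one line per hunk: "<changed-lines> <file> <old-range> <new-range>",
# sorted so that the smallest hunks come first.
list_hunks_easy_first() {
  awk '
    # Print the hunk collected so far, if any, then reset the counters.
    function emit_hunk() {
      if (hunk != "") print changed, file, hunk
      hunk = ""; changed = 0
    }
    # A new file section closes any open hunk.
    /^diff --git / { emit_hunk(); next }
    # File headers only count as headers while no hunk is open.
    /^\+\+\+ / { if (hunk == "") { file = $2; sub(/^b\//, "", file); next } }
    /^--- /    { if (hunk == "") next }
    # A hunk header starts a new hunk; remember its line ranges.
    /^@@ /     { emit_hunk(); hunk = $2 " " $3; next }
    # Inside a hunk, count every added or removed line.
    hunk != "" && /^[+-]/ { changed++ }
    END { emit_hunk() }
  ' "$1" | sort -n
}
```

Running `list_hunks_easy_first ../changes.diff` gives a starting order for the easy-to-hard loop described above: a small whitespace fix will sort ahead of the feature work.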