Low Budget Regression Testing for Developers

Principal Consultant, Beaver Creek Consulting

Having a regression test suite can increase developer productivity.  Verification that changes do not break existing functionality allows for confident, aggressive coding.  Unfortunately, one is often placed into a situation where there is no regression suite.  In this article, I describe tactics I've used to quickly create a regression suite designed to facilitate development.

Low Budget Objectives

The objectives for a low budget regression suite are:

  1. Requires a small amount of time to build and maintain,
  2. Runs quickly, and
  3. Provides a simple pass/fail result.

The overriding goal is to keep the level of effort low.  The optimal level of effort is low enough that you don't need to mention to your manager that you're building a regression suite; otherwise, you'll waste time defending the need to do so.

I run this type of test suite multiple times per day, right after I complete a small to medium-sized set of changes.  That way, when a failure is encountered, it is fairly obvious where the fault was introduced.  This also removes the need for detailed test reporting.  Because the last set of changes is known, a simple pass/fail indication is often sufficient to track down the problem.

There is a trade-off between execution time and coverage.  Designing for quick execution implies less than perfect coverage; finding the right balance between execution time and coverage is tricky.  The goal here is to provide immediate feedback for a series of small changes.  The proper balance with this goal in mind is heavily weighted towards quick execution times.

The Perfect World

In a perfect world, testing will be multi-layered and frequent.  A nice model is the continuous integration approach.  Unit tests are developed along with the application code.  Changes are committed frequently and the application is tested daily with an automated process.

For Java developers, JUnit is the standard for unit testing.  Another nice touch is a code coverage tool to ensure that sufficient unit tests have been written to engage all code branches.  A comprehensive approach will perform an automated testing cycle on a daily basis, including pulling the latest sources from a repository, building the application, performing the tests, and creating reports.  There are many tools available designed to automate these tasks.

The Real World

But alas, reality intrudes.  I usually find myself new to a project, asked to implement a significant new feature, and there are no test cases in sight.  Major refactoring might look like the right course of action.  However, without being confident about not breaking existing code, it takes a lot of nerve to aggressively attack a problem.  The right answer for me is to build a quick, low budget regression suite.  With a regression suite in place, I have the confidence to do the right thing and leave the code base cleaner when I'm done.

There are many reasons for not having test cases.  It is easy to assign blame in this area.  But let's face the facts; developing and maintaining a test suite can be expensive.  A project team is often constrained by available time and resources.  Implementing new features and fixing existing bugs is typically given more emphasis than developing project infrastructure.  This emphasis on features is for a good reason; features and bugs are much more apparent to users.  I also believe that many managers and customers fail to make the connection between a good testing program and high quality software.

The Elements of a Regression Test Suite

The quickest way to generate a test suite is to do black box testing.  A black box test presents a set of inputs to the system, captures the output, and compares the output to known expected values.  Figure 1 shows the major elements needed to put this plan into effect.

Figure 1. Elements and execution flow for a regression test suite.

The main software element needed is a test harness.  The test harness reads the test cases (step 1) and uses the contained information to invoke the application (step 2).  The test harness then collects the output from the application (step 3).  I call the output from the application the actual results.  The next step is for the test harness to compare the actual results to a known set of expected results (step 4).  The final step is to emit a test report (step 5).
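The five steps can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the function name run_suite, the dictionary layout for cases and expected results, and the use of str.upper as a stand-in for the application are all hypothetical and not taken from any particular project.

```python
from typing import Callable, Dict, List, Tuple


def run_suite(cases: Dict[str, str],
              invoke: Callable[[str], str],
              expected: Dict[str, str]) -> Tuple[bool, List[str]]:
    """Run every case and return (passed, names of failing cases)."""
    failures = []
    for name, test_input in cases.items():   # step 1: read a test case
        actual = invoke(test_input)           # steps 2-3: invoke the app, collect output
        if actual != expected.get(name):      # step 4: compare actual to expected
            failures.append(name)
    return (not failures, failures)           # step 5: data for the report


if __name__ == "__main__":
    # Hypothetical cases; str.upper stands in for the real application.
    cases = {"upper_hello": "hello", "upper_mixed": "MiXed"}
    expected = {"upper_hello": "HELLO", "upper_mixed": "MIXED"}
    passed, failures = run_suite(cases, str.upper, expected)
    print("PASS" if passed else "FAIL: " + ", ".join(failures))
```

Passing the application in as a callable keeps the loop independent of how the application is actually invoked.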

One item not shown is the pair of setup and tear-down phases.  In the best case, the test harness will be able to start with a clean system and load any required supporting elements (such as reference data in a database).  Even nicer is a test harness that cleans up after itself.
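One way to get both setup and clean-up is a context manager.  The sketch below assumes a scratch SQLite database and a made-up currency reference table; the function name clean_test_db and the table layout are hypothetical, and a real application would load whatever reference data it needs.

```python
import contextlib
import os
import shutil
import sqlite3
import tempfile


@contextlib.contextmanager
def clean_test_db(reference_rows):
    """Create a scratch SQLite database, load reference data, remove it afterwards."""
    workdir = tempfile.mkdtemp(prefix="regress_")
    db_path = os.path.join(workdir, "test.db")
    conn = sqlite3.connect(db_path)
    try:
        # Setup: a hypothetical reference table, loaded before any test runs.
        conn.execute("CREATE TABLE currency (code TEXT PRIMARY KEY, name TEXT)")
        conn.executemany("INSERT INTO currency VALUES (?, ?)", reference_rows)
        conn.commit()
        yield conn
    finally:
        # Tear-down: runs even when a test raises an exception.
        conn.close()
        shutil.rmtree(workdir, ignore_errors=True)


if __name__ == "__main__":
    with clean_test_db([("USD", "US Dollar"), ("EUR", "Euro")]) as conn:
        print(conn.execute("SELECT COUNT(*) FROM currency").fetchone()[0])
```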


The main way to keep this plan "low budget" is to Keep It Simple.  For the test case file, I like a nice simple text format that can be easily edited in a text editor.  XML is nice for this, but may involve more overhead when building your test harness.  I tend to pick a well known format such as the Java properties format or the Windows .ini format.  Several other well established text file formats can be found in The Art of Unix Programming.
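As an illustration of how little code an .ini style test case file needs, here is a minimal Python sketch using the standard configparser module.  The section names, the command and expected keys, and the myapp command line are all hypothetical; use whatever fields your harness requires.

```python
import configparser

# A hypothetical test case file: one section per test case.
SAMPLE = """\
[login_ok]
command = myapp --user alice --check
expected = login_ok.expected.txt

[login_bad_password]
command = myapp --user alice --password wrong
expected = login_bad_password.expected.txt
"""


def load_cases(text):
    """Return a list of (case name, settings dict) in file order."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return [(name, dict(parser[name])) for name in parser.sections()]


if __name__ == "__main__":
    for name, settings in load_cases(SAMPLE):
        print(name, "->", settings["command"])
```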

The use of text files applies to the expected and actual output files as well.  The comparison can then be done by a diff between the two files.  This avoids having to write your own code to make the comparison.  One common issue is that the actual and expected files may have fields that are expected to differ between runs (such as sequence numbers or timestamps) and need to be ignored in the comparison.  I recommend developing a regular expression based filter to blank out the fields in question.  The filter is then applied to both the expected and actual output files prior to comparing them with a diff program.
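A filter of this kind might look like the following Python sketch.  The two patterns (a "YYYY-MM-DD HH:MM:SS" timestamp and a number following "seq=") are assumptions chosen for illustration; the real patterns depend on what your application writes.  The standard difflib module stands in for an external diff program.

```python
import difflib
import re

# Hypothetical volatile fields: adjust the patterns to your own output.
VOLATILE = [
    (re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"), "<TIMESTAMP>"),
    (re.compile(r"(?<=seq=)\d+"), "<SEQ>"),
]


def mask(text):
    """Replace fields that legitimately change between runs with placeholders."""
    for pattern, placeholder in VOLATILE:
        text = pattern.sub(placeholder, text)
    return text


def filtered_diff(expected, actual):
    """Diff the two texts after masking both; an empty list means no differences."""
    return list(difflib.unified_diff(
        mask(expected).splitlines(), mask(actual).splitlines(),
        fromfile="expected", tofile="actual", lineterm=""))


if __name__ == "__main__":
    old = "2007-11-20 10:00:00 seq=1 total=5\n"
    new = "2007-11-21 11:30:05 seq=42 total=6\n"
    print("\n".join(filtered_diff(old, new)))
```

Applying the same mask to both sides means the stored expected file never has to be edited by hand.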

For development level testing, the status I want reported is whether or not I broke anything.  The test harness output can be as simple as a line count of the diff program's output.  If there are no differences, things are in good shape and coding can continue.  One can even do this by hand:

<mybox> diff actual.txt expected.txt | wc -l

Unix command line tools are well suited for this kind of text file processing.  When I am working on a Windows box, I install either Cygwin or UnixUtils for these tasks.  For examining the differences, WinMerge is a good choice.
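Where none of these tools are installed, the same pass/fail check can be approximated with a scripting language alone.  The following Python sketch uses the standard filecmp module; the function name check and the 0/1 exit status convention are my own assumptions.

```python
import filecmp
import sys


def check(actual_path, expected_path):
    """Return 0 if the two files have identical contents, 1 otherwise."""
    same = filecmp.cmp(actual_path, expected_path, shallow=False)
    print("PASS" if same else "FAIL")
    return 0 if same else 1


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python check.py actual.txt expected.txt
    sys.exit(check(sys.argv[1], sys.argv[2]))
```

Unlike the diff line count, this reports only whether the files differ, not where, so a diff viewer is still needed to investigate a failure.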

Preparing expected results can be difficult.  One approach is to craft what the output should be based on the test cases.  This can be a lot of work; consider the output of a web application where the expected results could be many pages of complex HTML.  In some cases, you might not know enough about the application to prepare the expected results.

The approach I take is to just go with the current state of the system.  Sure, there may be a few (or a lot) of bugs.  However, the goal is to avoid unexpected changes.  The quickest way to generate expected results is to run the test harness and capture the actual output.  Then, simply use this as the expected output going forward.
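One way to build this into a harness is to record the output the first time a test case runs and compare against it on every later run.  The Python sketch below assumes one expected file per test case in a directory; the function name check_or_record and the ".expected.txt" naming are hypothetical.

```python
import os


def check_or_record(name, actual, expected_dir):
    """Compare actual output to the stored baseline, recording it on the first run."""
    path = os.path.join(expected_dir, name + ".expected.txt")
    if not os.path.exists(path):
        # No baseline yet: accept the current behavior as the expected result.
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(actual)
        return "RECORDED"
    with open(path, encoding="utf-8") as fh:
        return "PASS" if fh.read() == actual else "FAIL"
```

When a change in output is intentional, deleting the stored file and re-running the test records the new baseline.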

The Test Harness

The test harness needs to process text files and interact with the application.  Scripting languages are a perfect fit for this task as they feature quick development and excellent text processing features.  Interacting with the application may involve command line invocations, creating HTTP requests, or perhaps calling stored procedures.  Most scripting languages will have libraries that make short work of invoking the application.
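For the command line case, a minimal Python sketch using the standard subprocess module is shown here.  The function name run_command and the 30 second timeout are assumptions, and the Python interpreter itself stands in for the application under test.  The capture_output option requires Python 3.7 or later.

```python
import subprocess
import sys


def run_command(argv, stdin_text="", timeout=30):
    """Run the application and return (exit code, stdout, stderr) as text."""
    result = subprocess.run(argv, input=stdin_text, capture_output=True,
                            text=True, timeout=timeout)
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Stand-in application: a one-line Python program that upper-cases stdin.
    app = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
    code, out, err = run_command(app, "hello")
    print(code, out.strip())
```

Capturing the exit code and standard error along with standard output lets the expected results cover failure cases too.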

My favorite scripting language is Perl.  Libraries for almost any task can be found on CPAN.  For Java developers, Groovy is a scripting language that will make the most of your existing Java skills.  Every developer should have scripting language expertise in their tool box.  I believe that the choice of language is not overly critical, so if you don't know a scripting language, just pick one from the list and learn it.


The main points to keep in mind are:

  1. Do develop a regression suite to increase development productivity and accuracy.
  2. Keep the test harness simple.
  3. Favor quick execution and development time over extensive coverage.
  4. Use text files for inputs and outputs.
  5. Implement the test harness in a scripting language.

Having a regression suite helps to ensure success in difficult situations.  I think you'll find the effort of developing a low budget regression suite to be well worth your time.

About the author

Marty is a Principal Consultant with Beaver Creek Consulting Corp.  Marty has 10 years of professional experience as a Software Developer and specializes in developing finance and accounting solutions.  Marty is a Sun Certified Java Programmer with expertise in J2EE and database systems.  He holds a Doctorate in Chemical Engineering from the University of Virginia.

This article first appeared at MrBool.com at: http://www.mrbool.com/articles/viewcomp.asp?comp=7197

Last Update: 20071120