Creating unit tests from scratch

As well as refactoring existing unit tests, I’ve also recently created some tests from scratch.  I realised that, while I have gone on at length about testing on this blog, including the ways in which I think tests can be well- or poorly-written, I haven’t talked about the process of writing them.

In case it’s useful, for instance if you’re just starting writing unit tests, I’ll give an overview of the process I seem to follow most of the time.  It’s not the only way to do things, so please don’t think I’m telling you how you should do it.  This will be in the context of C#, but I hope it will still be generally applicable.

Also, I don’t use Test-Driven Development.  Again, this isn’t because I think it’s wrong, just that it doesn’t work for me.  I find it forces me to chop my thoughts up into tiny pieces, which makes me trip over myself mentally.  If anyone is thinking “you can’t do agile without doing TDD” or “you can’t do software development well without doing TDD” I would urge them to consider what’s the end and what’s the means.

If I write good quality code that’s easy to change and test, that has good coverage of good quality tests, and I’m reasonably quick about it and my process is repeatable and sustainable, why does it matter what the process is that achieved it?  I do write tests for my code, often soon after the code and in the same commit, it’s just that I don’t use them to drive the writing of the code.  If TDD works for you, then I hope you continue using it – I’m not knocking it, just being honest about not using it myself.

Summary

I think it’s important to remember that unit tests are code too, so the strategies for writing the code under test are worth considering when writing the tests.  So I don’t write the fully-formed tests starting at the first line down to the last line.  Instead I repeatedly go from a simpler working version to a fuller working version, improving things with each version or iteration.

I don’t try to get from nothing to complete in one go, but in a series of manageable steps.

Photograph by Mike Peel (www.mikepeel.net), CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

Version 0

The aim of version 0 is to flush out the bare minimum of dependencies to get things to compile.  So I create a new test class in a new file, and name it XTest if I’m testing a class called X.  I then create a method without giving it a proper name – just a single letter will do – and in it create an instance of the class under test by calling its constructor.

This won’t compile (I don’t bother actually compiling, as I rely on my IDE putting red squiggly lines under errors).  So I add in arguments to the constructor call, which is how the class under test accepts its dependencies.  The arguments I use are variables, which I also create in the test class.  The variables’ types bring in namespace dependencies (using statements in C#).  By default I use mock rather than real versions of all the dependencies for speed and flexibility, and wrapping the variables’ types in Mock<> brings in another using statement for my mocking library (Moq).

I now have a file of unit test code that will compile, but it would almost certainly crash if I ran it.  I’ll fix this in version 1 – the first version with a runnable test.
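As a rough sketch of the shape of version 0, here is the same idea in Python with the standard unittest.mock library rather than C# and Moq, purely so that the example is self-contained.  OrderService, its two dependencies and the placeholder method name are all made up for illustration.  Python has no compile step, so the nearest equivalent of “it compiles” is that the file imports cleanly and the constructor call is wired up.

```python
import unittest
from unittest.mock import Mock


class OrderService:
    """Hypothetical class under test; dependencies arrive via the constructor."""

    def __init__(self, repository, clock):
        self.repository = repository
        self.clock = clock


class OrderServiceTest(unittest.TestCase):
    def setUp(self):
        # Mock versions of every dependency, held as variables on the test class.
        self.repository = Mock()
        self.clock = Mock()

    def test_x(self):
        # Placeholder name: the only goal so far is to construct the class under test.
        service = OrderService(self.repository, self.clock)
```

Nothing is asserted yet; the point is only that every dependency the constructor needs has been identified and supplied.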

Version 1

The aim of version 1 is to get the simplest test that runs and passes.  This is often an error test: success usually depends on everything being correct along a fairly long code path, whereas the first error can be returned after only a short one.

I’m not aiming for neat and tidy, with everything refactored and lovely.  It really is just the quickest way of getting some code path exercised.  It will help to bring in any dependencies to do with the test framework (the thing that asserts that things are equal, null etc.), and will start giving me a feel for what test data will be needed.

So I look through the code under test, pick a method that I want to test, and from that pick the simplest way I can get an error from that method.  The method I pick might be the simplest, the most valuable, or one that will help develop my understanding of how I should test the class.

I create the bare minimum to get this test to pass.  I create any mock methods I need for only this test, and also test data if I need it for this test.  This is all done inside the test method for now – I can refactor things later.  Soon I have a passing test.  As well as the flushing out of dependencies, and growing my understanding of how I’ll test the class, I must admit that seeing the green tick is a childish pleasure.

If I had done nothing but lay groundwork (mocks, test data generators etc.) for ages, not only might I have wasted a lot of effort going down the wrong path (which I might have discovered only when finishing off the tests at the end), I would have no green ticks for ages and then all of them at the end, rather than a steady stream of them as I worked.  Instant gratification FTW!
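A version 1 test might look something like the sketch below.  It is in Python rather than C# to keep it self-contained, and OrderService, get_status and the repository are hypothetical names chosen for illustration.  The test picks the shortest error path (a missing id) and sets up everything it needs inline.

```python
import unittest
from unittest.mock import Mock


class OrderService:
    """Hypothetical class under test."""

    def __init__(self, repository):
        self.repository = repository

    def get_status(self, order_id):
        # The shortest code path: reject bad input before touching any dependency.
        if order_id is None:
            raise ValueError("order_id is required")
        order = self.repository.find(order_id)
        return order.status


class OrderServiceTest(unittest.TestCase):
    def test_a(self):
        # Everything is created inside the test for now; refactoring comes later.
        repository = Mock()
        service = OrderService(repository)

        with self.assertRaises(ValueError):
            service.get_status(None)

        # The error is raised before the repository is ever used.
        repository.find.assert_not_called()
```

The test is untidy and badly named, but it runs, it passes, and it has pulled in the test framework and the mocking library.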

Version 2

Version 2 is where things start to get a bit tidier, and I start to set up a proper structure.

I have a test class that has an oddly named test method.  This test method’s test passes, but it is self-contained rather than relying on mocks etc. that have been factored out.

Here’s where I stop and think, i.e. I do a bit of designing.  I think about all the tests I want to write for the method under test.  Sometimes I write these down as single-line comments in the test class, and sometimes I write them down directly as empty test methods with a name that reflects their purpose.  I rename the test method from version 1 so it’s clearer.

Unless there’s only one method I want to test in the class under test, I also create a derived class for the method under test, so that all the test methods for it are together in their own class.  Things at the level of the class under test stay in the parent test class – the variables for the constructor etc.  This means all the test classes (for the different methods under test) will inherit them, so they don’t need repeating for each test class.  The test methods / single-line comments are moved down into this new test class.

There are no new tests introduced in version 2, but I now have things in their final structure, and I have a To Do list to work through.
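The structure at the end of version 2 could be sketched like this, again in Python rather than C# so that it stands alone.  All the names are hypothetical.  Marking the empty methods as skipped is an assumption of this sketch rather than part of the process described above; it makes the To Do items show up as skipped in the test report instead of as passes.

```python
import unittest
from unittest.mock import Mock


class OrderService:
    """Hypothetical class under test."""

    def __init__(self, repository):
        self.repository = repository

    def get_status(self, order_id):
        if order_id is None:
            raise ValueError("order_id is required")
        return self.repository.find(order_id).status


class OrderServiceTestBase(unittest.TestCase):
    """Parent test class: things at the level of the class under test."""

    def setUp(self):
        self.repository = Mock()
        self.service = OrderService(self.repository)


class GetStatusTests(OrderServiceTestBase):
    """Derived test class: every test for one method under test."""

    def test_missing_order_id_raises_error(self):
        # The version 1 test, renamed so that its purpose is clear.
        with self.assertRaises(ValueError):
            self.service.get_status(None)

    @unittest.skip("to do")
    def test_unknown_order_raises_error(self):
        pass

    @unittest.skip("to do")
    def test_known_order_returns_its_status(self):
        pass
```

A second method under test would get its own class derived from OrderServiceTestBase, inheriting the mocks and the constructed service without repeating them.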

Version 3 onwards

This is where things get a bit more free-form, and so hard to describe simply, as they depend a lot on the specifics of the method under test and the tests I’m writing for it.

Sometimes the test data generators, mocks etc. I need become obvious immediately, and sometimes I need to work through tests a bit first before I can tell what I need.  If they’re not immediately obvious, I write a test method that does all its work rather than depending on external methods.  I then write the next test method, and the pattern of similarities and differences between the two test methods helps me to spot where things can be extracted into helper methods, and where it makes more sense to leave things alone.

As I said in my previous article, I don’t aim for perfection when it comes to things like test data generators.  I try to balance simplicity / understandability on one hand and completeness on the other, rather than going for just completeness.  This means that sometimes I have test methods that can’t use test data generators etc. because their test data’s too weird.  That’s OK, if it means that the test class as a whole is more understandable.

I often write methods to run the code under test and then process the result in some way.  For instance, if the method under test returns some JSON, the RunAndCheck method might:

  1. Call the method under test and store the result in variable 1
  2. Assert that variable 1 is not null
  3. Deserialise the JSON into a proper type, e.g. an enumeration, and store the result in variable 2
  4. Assert that variable 2 is not null
  5. Assert that the value in variable 2 is the expected member of the enumeration.

This way if the method returns null, or returns something that can’t be deserialised into the enumeration, I get a helpful error rather than just the final comparison failing with a message such as “” != OrderComplete.
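The steps above can be sketched as follows, in Python rather than C# so that it is runnable on its own.  OrderState, get_order_state_json and the shape of the JSON are assumptions made for illustration.  In Python a failed conversion to an enumeration raises an exception rather than producing a null, so steps 3 and 4 collapse into one guarded conversion.

```python
import json
from enum import Enum


class OrderState(Enum):
    ORDER_PENDING = "OrderPending"
    ORDER_COMPLETE = "OrderComplete"


def get_order_state_json(order_id):
    """Hypothetical method under test: returns a JSON string."""
    return json.dumps({"state": "OrderComplete"})


def run_and_check(order_id, expected_state):
    # 1. Call the method under test and store the result.
    raw = get_order_state_json(order_id)
    # 2. Fail early, with a clear message, if nothing came back.
    assert raw is not None, "method under test returned None"
    # 3. and 4. Deserialise into a proper type, reporting clearly if that fails.
    try:
        state = OrderState(json.loads(raw)["state"])
    except (ValueError, KeyError, TypeError) as error:
        raise AssertionError(f"could not deserialise {raw!r}: {error}") from error
    # 5. Only now compare against the expected member of the enumeration.
    assert state == expected_state, f"{state} != {expected_state}"
    return state
```

Each test for this method then shrinks to a single call such as run_and_check(1, OrderState.ORDER_COMPLETE), and a failure says which step went wrong.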

Example

I won’t go into the details, because I don’t think they’ll be helpful, but here’s an example at a high level.  First, a description of the method under test.

Imagine that the method under test queries some data according to some criteria, and returns a list of 0 or more results.  There’s a mapping between each member of the list and the corresponding bit of the underlying data, e.g. if a bit of the data is NULL, set a flag to false in the output.  The list must be ordered by one of the properties of its members, e.g. the data is about people, and the results must be ordered by age.  The method is passed some parameters that shape the query and / or mapping, and some possible combinations of these produce an error, e.g. if objects are null.

The test methods I would write for this would probably be:

  • A test for each distinct important error path
  • A test for each important variation of the mapping – each test would return a single result that is checked.
  • A test for each important part of the query’s predicate – each test ignores one bit of test data (because it’s on the wrong side of some imaginary line defined by the part of the predicate I’m testing) and finds another bit of test data (it’s on the correct side of the line), and then returns one result.
  • A test for the order – it usually returns 3 items (so that there’s a beginning one, an end one, and one that’s neither).  This uses the simplest test data, simplest path through the predicate etc, and tests just the order.

Note that I’m deliberately leaving “important” undefined, as its meaning will depend on the context.

If there are any test data generators, I’ll make sure that they work nicely for the tests to do with the predicate as this is where they’re likely to be most useful.  If there are many errors, or there’s quite a bit of processing to get an easily-testable version of each error (as in the JSON example above) I might write a RunAndExpectError() method.  I might also write a RunAndExpectOneResult() method for the predicate tests.

I try to avoid hard-coded things in the tests as much as possible.  I don’t mean things like ids or strings in test data, but more things like checking the 7th item in a result list against the 6th, or that the Age field of the result has the value 42.  Unless the returned age is calculated, I define a constant for the age and use it in both the relevant bit of test data and the check against the result.  Not only does it make it easier to change in future, it makes it clearer that the reason why 42 shows up in the result is because 42 went into this part of the test data.
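A small sketch of the constant approach, in Python rather than C# for the sake of a self-contained example.  PersonRow, PersonResult and find_people are hypothetical stand-ins for the underlying data, the result type and the method under test.

```python
from dataclasses import dataclass

EXPECTED_AGE = 42  # Used both in the test data and in the check.


@dataclass
class PersonRow:
    """Hypothetical row of underlying data."""
    name: str
    age: int


@dataclass
class PersonResult:
    """Hypothetical result returned by the method under test."""
    name: str
    age: int


def find_people(rows):
    """Stand-in for the method under test: maps rows to results."""
    return [PersonResult(row.name, row.age) for row in rows]


def test_age_is_copied_from_the_data():
    rows = [PersonRow("Alice", EXPECTED_AGE)]
    results = find_people(rows)
    assert len(results) == 1
    # The same constant appears on both sides, so the link is explicit.
    assert results[0].age == EXPECTED_AGE
```

Changing the age later means changing one line, and a reader can see at a glance where the expected value comes from.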

Similarly, for checking the order, I don’t check that result[0].age <= result[1].age etc.  This is brittle, and also doesn’t express the meaning in its clearest form.  The important thing is that, no matter how short or long the list is, it’s ordered.  Not that element 0 <= element 1 etc.  So I write a loop of X over 0 to N-2, checking that element[X] <= element[X+1].  Checking each pair of elements means the list is checked as a whole.
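The pairwise loop might be sketched like this, in Python rather than C# so that it can be run as is.  The dictionaries with an "age" key are a made-up stand-in for whatever the method under test returns.

```python
def assert_ordered_by_age(results):
    """Check each adjacent pair, which checks the list as a whole."""
    # x runs from 0 to N-2, so lists of length 0 or 1 pass trivially.
    for x in range(len(results) - 1):
        current = results[x]["age"]
        following = results[x + 1]["age"]
        assert current <= following, (
            f"items {x} and {x + 1} are out of order: {current} > {following}"
        )


# Hypothetical results, as the method under test might return them.
people = [
    {"name": "Cy", "age": 20},
    {"name": "Bo", "age": 35},
    {"name": "Al", "age": 35},
]
assert_ordered_by_age(people)
```

The check does not care how many items come back, so it keeps working if the test data grows or shrinks.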

Parameterised tests

One thing I’ve missed out so far is when a series of tests form a simple list.  As in, they’re structurally the same, but something (like an input and/or an output) varies from one test to the next.  This is a bit of a special case in my experience, because often there’s too much difference between related tests.  If they are this similar, it usually helps to make this similarity explicit.  How you do this depends on your test framework, and whether it supports things like parameterised tests.

Even if it doesn’t, you can roll your own version of this by defining things like arrays or lists to hold the things that vary from test to test, then looping over them, calling the method under test with the inputs from each entry and checking its outputs against the expected values from the same entry.

If you do roll your own, I suggest that you include data from the lists or arrays in the assertions that check the world after the code under test has been run.  If you’re looping through a list of 10 things and the test fails, it can be tedious to work out which of the 10 things caused the failure.  If the assertion failure says “3 != 7, inactive customer” this will make your life easier.  (I’m assuming that the assertion would always print the “3 != 7” part, and you’ve added the “inactive customer” part from the list data.)
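A roll-your-own version could be sketched as below, in Python rather than C# so that it is self-contained.  discount_for and the customer data are invented for illustration.  Each entry in the list carries a description, which is appended to the assertion message.

```python
def discount_for(customer):
    """Hypothetical method under test."""
    if not customer["active"]:
        return 0
    return 10 if customer["years"] >= 5 else 5


CASES = [
    # (description, input, expected output)
    ("inactive customer", {"active": False, "years": 9}, 0),
    ("new customer", {"active": True, "years": 1}, 5),
    ("long-standing customer", {"active": True, "years": 5}, 10),
]


def test_discounts():
    for description, customer, expected in CASES:
        actual = discount_for(customer)
        # The description from the list identifies which entry failed.
        assert actual == expected, f"{actual} != {expected}, {description}"
```

One drawback of this simple loop is that it stops at the first failing entry, whereas a framework’s own parameterised tests usually report each entry separately.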

Summary

Remembering that test code is still code is something I find helpful when writing tests.  The techniques I use for code, such as iterative development, refactoring where it makes sense, and mixing up-front design with discovering the design as I go, can help me write its tests.  It might be that the details of what I present don’t work for you.  In which case: a) I’m not dictating anything, b) I hope you find something that works better for you.
