Tuesday, 1 November 2011

Concise, Elegant and Readable Unit Tests

Just recently I was pair programming with another developer and we were writing some unit tests for the code we were about to write (yes, we were doing real TDD!). During this exercise we came to a point where I didn't feel that the changes being made to the code met my measure of what makes a good test. As in all the best teams we had a quick discussion and decided to leave the code as it was (I save my real objections for times when it is more valuable). However, this got me thinking about what actually I was uncomfortable with, and hence the content of this post.

As a rule I like my code to have three major facets. I like it to be concise: there should be no more code than necessary and that code should be clear and understandable with minimum contextual information required to read it. It should be elegant: the code should solve the problem at hand in the cleanest and most idiomatic way. Also, it must be readable: clearly laid out, using appropriate names, following the style and conventions of the language it is written in and avoiding excessive complexity.

Now, the test in question was written in Groovy, but it could just as well be in any language as it was testing a fairly simple class that maps error information into a map that will be converted to Json. The test is:

@Test
void shouldGenerateAMapOfErrorInformation() {
    String item = "TestField"
    String message = "There was an error with the test field"
    Exception cause = new IllegalStateException()

    ErrorInfo info = new ErrorInfo()
    info.addError(item, message, cause)

    Map content = info.toJsonMap()
    assert content.errors.size() == 1
    assert content.errors[0].item == item
    assert content.errors[0].message == message
    assert content.errors[0].cause == cause.toString()
}

Now, to my mind this test is at least three lines too long - the three lines declaring variables that are then used to seed the test and in the asserts. These break my measure of conciseness by being unnecessary and adding to the amount of context needed in order to follow the test case. Also, the assertions of the map contents are less than elegant (and also not very concise), which further reduces the readability of the code. I'd much rather see this test case look more like this:

@Test
void shouldGenerateAMapOfErrorInformation() {
    ErrorInfo = new ErrorInfo()
    info.add("TestField", "There was an error with the test field", 
                 new IllegalStateException())

    Map content = info.toJsonMap()
    assert content.errors.size() == 1
    assert hasMatchingElements(contents.errors[0], 
          [ item : "TestField",
             message : "There was an error with the test field",
             cause : "class java.lang.IllegalStateException" ])
}

This test implementation is more concise. There are less lines of code and less context to keep track of. To my eye I find this much more elegant and far more readable.

"But, you've added duplication of the values used in the test", I hear you cry! That is true and it was totally intentional. Let me explain why…

Firstly, as we have already mentioned, duplicating some test values actually makes the code more concise and readable. You can look at either the test setup/execution or the assertions in isolation without needing any other context. Generally, I would agree that avoiding duplication in code is a good idea, but in cases such at this, the clarity and readability far out way the duplication of a couple of values!

Secondly, I like my tests to be very explicit about what they are expecting the result to be. For example, in the original test, we asserted that the cause string returned was the same as the result of calling toString() on the cause exception object. However, what would happen if the underlying implementation of that toString() method changed. Suddenly my code would be returning different Json and my tests would know nothing about it. This might event break my clients - not a good situation.

Thirdly, I like the intention of my code to be very clear, especially so in my tests. For example, I might create some specific, but strange, test data value to exercise a certain edge case. If I just assigned it to a variable, then it would be very easy for someone to come along, think it was an error and correct it. However, if the value was explicitly used to both configure/run the code and in the assertion then I would hope another developer might think more carefully as to my intent before making a correction in multiple places.

My final, and possibly most important reason for not using variables to both run the test and assert against is that this can mask errors. Consider the following test case:

@Test
void shouldCalculateBasketTotals() {
    Basket basket = new Basket()
    LineItem item1 = new LineItem("Some item", Price(23.88))
    LineItem item2 = new LineItem("Another item", Price(46.78))
 
    basket.add(item1)
    basket.add(item2)
 
    assert basket.vatExclusivePrice == 
                       item1.vatExclusivePrice + item2.vatExclusivePrice
    assert basket.vat == item1.vat + item2.vat
    assert basket.totalPrice == item1.price + item2.price
}

In this test, what would happen if the LineItem class had an unnoticed rounding error in its VAT calculation that was then replicated into the basket code? As we are asserting the basket values based on the fixture test data we may never notice as both sides of the assertion would report the same incorrectly rounded value. By being more explicit in our test code we not only create code that is more concise, elegant and readable but we also find more errors:

@Test
void shouldCalculateBasketTotals() {
    Basket basket = new Basket()
    LineItem item1 = new LineItem("Some item", Price(23.88))
    LineItem item2 = new LineItem("Another item", Price(46.78))
 
    basket.add(item1)
    basket.add(item2)
 
    assert basket.vatExclusivePrice == 58.88
    assert basket.vat == 11.78
    assert basket.totalPrice == 70.66
}

Finally, a couple more points that people might raise and my answers to them (just for completeness!):

Q: Can't we move the constant variables up to the class level and initialise them in a setup to keep the tests methods concise?
A: This makes an individual method contain less lines of code but fails my consiseness test as there is context outside of the test method that is required to understand what it does. You actually have to go and look elsewhere to find out what values your test is executing against. This is also less readable!

Q: Isn't is better to share test fixtures across test methods in order to keep the test class as a whole more concise?
A: This might be appropriate for some facets, such as any mocks that are used in the test or if there is a particularly large data set required for the test which is not part of the direct subject of the test. However, I'd rather see slightly longer individual test methods that are explicit in their test data and assertions, even if this means some duplication or a slightly longer overall test class.

No comments:

Post a Comment