15 January 2008

Better unit tests with ScalaCheck (and specs)

Writing unit tests can seem tedious sometimes.

Some people tell you: "Hey, don't write unit tests only! Do Test-Driven Development". You write the test first, then the code for it. This way:
  • you end up writing only the code that's necessary to deliver some concrete value for your customer/user
  • you drive the design of your system
  • you add frequent refactorings to the mixture to ensure your code stays clean
Or even better, "Do Behaviour-Driven Development!". With BDD, you get nice executable specifications for your system, which can almost read like English.

While I fully adhere to the above principles, I also think that there is a continuum between specifications and tests. And at the end of this continuum, it's all about testing that your software works. Even given silly inputs. Your unit tests should provide that kind of coverage.

And it's not that easy. A single line of code can go wrong in so many different ways. Try copying a file. Here comes ScalaCheck to the rescue!

Introducing ScalaCheck

Using ScalaCheck, you define:
  • properties which should always be true
  • random data to exercise the property
and ScalaCheck generates the test cases for you. Isn't it great?
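
To make this less abstract before we get to the real example, here is a minimal sketch of two properties on plain lists. It is written against the current ScalaCheck API (Prop.forAll), which may differ from the release this post was written with, and the object and property names are mine, purely for illustration.

```scala
import org.scalacheck.Prop
import org.scalacheck.Prop.forAll

// Hypothetical example object, not part of ScalaCheck or specs
object ReverseProperties {
  // Property: reversing a list twice gives back the original list
  val reverseTwice: Prop = forAll { (xs: List[Int]) =>
    xs.reverse.reverse == xs
  }

  // Property: concatenating two lists adds up their lengths
  val appendLength: Prop = forAll { (xs: List[Int], ys: List[Int]) =>
    (xs ++ ys).length == xs.length + ys.length
  }

  def main(args: Array[String]): Unit = {
    // check() generates random lists and reports the outcome on the console
    reverseTwice.check()
    appendLength.check()
  }
}
```

You only state what must hold; the lists themselves come from ScalaCheck's built-in generators.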

Let's take a concrete example to illustrate this, because I feel I almost lost my only reader here (thanks bro, you're a real brother).

If you want, you Can

Last week I started specifying and testing the famous Can class from the lift framework. The Can class is the Option class from the Scala library, on steroids. [To Scala newcomers: there are many good posts on Option, Maybe (in Haskell), Either and all this monad folklore but I will send you to a concrete example here].

Basically, a Can is either Empty (it contains nothing) or Full (it contains a value). This is a fairly common situation in software or elsewhere: the user with name "Smith" exists in the database (Full) or not (Empty), I got the power (Full) or I haven't (Empty).

When a Can is empty, it can be enhanced with an error message explaining why it is empty. In that case, it will be a Failure object.
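
If it helps to picture it, the three cases could be modelled roughly like this. This is a simplified sketch of my own and not lift's actual source: the field names, the isEmpty helper and the findUser function are assumptions made for the illustration.

```scala
// Hypothetical, simplified model of Can (not lift's real implementation)
object CanModel {
  sealed trait Can[+A] {
    // Both Empty and Failure count as "empty"; only Full holds a value
    def isEmpty: Boolean = this match {
      case Full(_) => false
      case _       => true
    }
  }
  final case class Full[+A](value: A) extends Can[A]
  case object Empty extends Can[Nothing]
  // An empty Can enhanced with a message, an optional exception and chained failures
  final case class Failure(msg: String, exception: Can[Throwable], chain: List[Failure])
      extends Can[Nothing]

  // Hypothetical lookup: Full when the user exists, Empty when not,
  // Failure when we can explain why there is nothing to return
  def findUser(db: Map[String, Int], name: String): Can[Int] =
    if (name.isEmpty) Failure("no name given", Empty, Nil)
    else db.get(name) match {
      case Some(id) => Full(id)
      case None     => Empty
    }
}
```

With this model, looking up "Smith" in Map("Smith" -> 1) gives Full(1), while an unknown name gives Empty.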

Now, if you want to test that an "equals" method works in all the different cases, you have to specify a lot of test cases:
  1. 2 Full objects which are equal
  2. 2 Full objects which are not equal
  3. 2 Empty objects which are equal
  4. 2 Empty objects which are not equal
  5. 2 Failure objects which are equal
  6. 2 Failure objects which are not equal
  7. A Full object and an Empty object (not equal)
  8. A Full object and a Failure object (not equal)
  9. A Failure object and an Empty object (not equal)
When I said it could be tedious... And I'm even simplifying the situation, since Failures can be chained, can optionally contain an Exception, etc.


Here is the solution, implemented using specs and ScalaCheck, with the support of Rickard Nilsson, author of the ScalaCheck project:

object CanUnit extends Specification with CanGen {
  "A Can equals method" should {
    "return true when comparing two identical Can messages" in {
      // Two Cans are equal exactly when they are of the same kind and their contents are equal
      val equality = (c1: Can[Int], c2: Can[Int]) => (c1, c2) match {
        case (Empty, Empty) => c1 == c2
        case (Full(x), Full(y)) => (c1 == c2) == (x == y)
        case (Failure(m1, e1, l1),
              Failure(m2, e2, l2)) => (c1 == c2) == ((m1, e1, l1) == (m2, e2, l2))
        case _ => c1 != c2
      }
      property(equality) must pass
    }
  }
}

How does it read?

"equality" is a function taking 2 Cans. Depending on the kind of Can, it says what calling the equals method on the Can class should amount to: for 2 Full Cans, for instance, the result should be the same as calling equals on their contents.

Create a "property" with this function and declare that the property must pass. That's all.

Well, you may want to have a look at what's generated. Add the display parameter:

import org.specs.matcher.ScalacheckParameters._
property(equality) must pass(display)

Then you should see in the console:

Tested: List(Arg(,Failure(cn,Full(net.liftweb.util.CanGen$$anon$0$UserException),List()),0),...
Tested: ...
Tested: ...
+ OK, passed 100 tests.

And if one test fails:

A Can equals method should
x return true when comparing two identical Can messages
A counter-example is 'Full(0)' (after 1 try) (CanUnit.scala line 21)

But you may have, at this point, the following nagging question: "Where does all this test Data come from?". Let's have a look below.

Generating data

Data generators are defined "implicitly". You define a function which is able to generate random data and you mark it as "implicit". When ScalaCheck needs to generate a given type of object, it looks for an implicit definition providing it. Like:

implicit def genCan[T](dummy: Arb[Can[T]])
                      (implicit a: Arb[T] => Arbitrary[T]) = new Arbitrary[Can[T]] {
  def getArbitrary = frequency(
    (3, value(Empty)),
    (3, arbitrary[T].map(Full[T])),
    (1, genFailureCan)
  )
}

This code says that generating a Can, optionally full of an element of type T, which has its own implicit Arbitrary generator, is like choosing between:
  • an Empty object, 3 times out of 7
  • an arbitrary object of type T, put in a Full object, 3 times out of 7
  • a Failure object (which has its own way of being generated via another function), 1 time out of 7
[The "dummy" parameter is here to help the Scala type inferencer, AFAIK. The world is not perfect, I know]
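
For readers on a recent ScalaCheck, here is roughly how the same 3/3/1 weighting could look with the current API (Gen.frequency, Gen.const, the Arbitrary(...) factory). It is a sketch under assumptions: the Can model is a cut-down one of my own where a Failure only carries a message, and genFailure, genCan and arbCan are illustrative names.

```scala
import org.scalacheck.{Arbitrary, Gen, Prop}
import org.scalacheck.Arbitrary.arbitrary
import org.scalacheck.Prop.forAll

// Hypothetical, simplified stand-in for lift's Can (Failure only carries a message)
object CanGenSketch {
  sealed trait Can[+A]
  final case class Full[+A](value: A) extends Can[A]
  case object Empty extends Can[Nothing]
  final case class Failure(msg: String) extends Can[Nothing]

  // Failure with a random alphabetic message
  val genFailure: Gen[Can[Nothing]] = Gen.alphaStr.map(m => Failure(m))

  // Empty 3 times out of 7, Full 3 times out of 7, Failure 1 time out of 7
  def genCan[T: Arbitrary]: Gen[Can[T]] = Gen.frequency[Can[T]](
    (3, Gen.const[Can[T]](Empty)),
    (3, arbitrary[T].map(t => Full(t))),
    (1, genFailure)
  )

  // Marking it implicit lets forAll find it for any Can[T] with an Arbitrary[T]
  implicit def arbCan[T: Arbitrary]: Arbitrary[Can[T]] = Arbitrary(genCan[T])

  // Same shape as the equality property above, on the simplified model
  val equality: Prop = forAll { (c1: Can[Int], c2: Can[Int]) =>
    (c1, c2) match {
      case (Empty, Empty)             => c1 == c2
      case (Full(x), Full(y))         => (c1 == c2) == (x == y)
      case (Failure(m1), Failure(m2)) => (c1 == c2) == (m1 == m2)
      case _                          => c1 != c2
    }
  }

  def main(args: Array[String]): Unit = equality.check()
}
```

The weights are relative, so (3, 3, 1) and (6, 6, 2) would describe the same distribution.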

Here is the Failure generator, which makes heavy use of ScalaCheck's predefined generation functions:

def genFailureCan: Gen[Failure] = for {
  msgLen    <- choose(0, 4)
  msg       <- vectorOf(msgLen, alphaChar)
  exception <- arbitrary[Can[Throwable]]
  chainLen  <- choose(1, 5)
  chain     <- frequency((1, vectorOf(chainLen, genFailureCan)), (3, value(Nil)))
} yield Failure(msg.mkString, exception, chain.toList)

In the above method,
  • choose returns a random Int within a range
  • vectorOf returns a collection of arbitrary objects, with a specified length
  • alphaChar returns an arbitrary alphabetic character
  • arbitrary[Can[Throwable]] returns an arbitrary Can, making all this highly recursive!
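
To see these building blocks on their own, here is a small sketch that generates only the message part. It uses the current ScalaCheck API, where I am assuming Gen.listOfN is the closest counterpart to the vectorOf used above; GenBasics, genLen and genMsg are illustrative names of mine.

```scala
import org.scalacheck.Gen

// Hypothetical example object showing the combinators in isolation
object GenBasics {
  // choose: an Int picked from an inclusive range
  val genLen: Gen[Int] = Gen.choose(0, 4)

  // listOfN + alphaChar: a list of exactly `len` random letters, joined into a String
  val genMsg: Gen[String] = for {
    len   <- genLen
    chars <- Gen.listOfN(len, Gen.alphaChar)
  } yield chars.mkString

  def main(args: Array[String]): Unit =
    // sample returns an Option, since a generator is allowed to fail to produce a value
    println(genMsg.sample)
}
```

Every generated message has between 0 and 4 characters, all of them letters.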

Random thoughts

I hope this sparked some interest in trying to use ScalaCheck and specs to define real thorough unit tests on your system.

The added value is similar to BDD: you will see "properties" emerge, and this gives you a better chance of producing rock-solid software.

From now on, you too can be a ScalaCheck man! (see lesson 4)


rickynils said...

In an upcoming ScalaCheck release, you can write your implicit genCan function like this instead:

implicit def genCan[T <: Arbitrary[T]]: Arbitrary[Can[T]] = ...

This is because the type inferencer was improved in Scala 2.6.1. However, I still lack a feature (Scala ticket #298), for making it perfect, so I haven't introduced the change yet.

/ Rickard

gnufied said...

You should probably have a mailing list for specs. I am using specs in my project and often stumble on a few problems.

Eric said...


Group email:

Waiting for you on the mailing-list!