18 February 2011

ScalaCheck generators for JSON

It has been more than 6 months without a single post because I've been focusing on the creation of specs2, which should be out in a few weeks (drop me a line if you want access to the preview).

Among the new features of specs2 there will be JSON matchers to help those of you handling JSON data. When developing those matchers I used ScalaCheck to test some of the utility functions I was using. I want to show here that writing custom data generators with ScalaCheck is really easy and almost follows the grammar for the data.

Here's the code:

/**
 * Generator of JSONType objects with a given tree depth
 */
import util.parsing.json._
import org.scalacheck._
import Gen._

trait JsonGen {

  implicit def arbitraryJsonType: Arbitrary[JSONType] =
    Arbitrary { sized(depth => jsonType(depth)) }

  /** generate either a JSONArray or a JSONObject */
  def jsonType(depth: Int): Gen[JSONType] = oneOf(jsonArray(depth), jsonObject(depth))

  /** generate a JSONArray */
  def jsonArray(depth: Int): Gen[JSONArray] = for {
    n    <- choose(1, 4)
    vals <- values(n, depth)
  } yield JSONArray(vals)

  /** generate a JSONObject */
  def jsonObject(depth: Int): Gen[JSONObject] = for {
    n    <- choose(1, 4)
    ks   <- keys(n)
    vals <- values(n, depth)
  } yield JSONObject(Map((ks zip vals):_*))

  /** generate a list of keys to be used in the map of a JSONObject */
  def keys(n: Int) = listOfN(n, oneOf("a", "b", "c"))

  /**
   * generate a list of values to be used in the map of a JSONObject or in the list
   * of a JSONArray.
   */
  def values(n: Int, depth: Int) = listOfN(n, value(depth))

  /**
   * generate a value to be used in the map of a JSONObject or in the list
   * of a JSONArray.
   */
  def value(depth: Int) =
    if (depth == 0)
      terminalType  // bottom of the tree: only terminal values are allowed
    else
      oneOf(jsonType(depth - 1), terminalType)

  /** generate a terminal value type */
  def terminalType = oneOf(1, 2, "m", "n", "o")
}
/** import the members of that object to use the implicit arbitrary[JSONType] */
object JsonGen extends JsonGen

Two things to notice in the code above:

  • The generators are recursively defined, which makes sense because the JSON data format is recursive. For example, a jsonArray contains values, each of which can be a terminalType or a jsonType. But a jsonType can itself be a jsonArray.

  • The top generator used in the definition of the Arbitrary[JSONType] is using a "sized" generator. This means that we can tweak the ScalaCheck parameters to use a specific "size" for our generated data. Here I've chosen to define "size" as being the depth of the generated JSON trees. This depth parameter is propagated to all generators down to the value generator. If depth is 0 when using that generator, this means that we have reached the bottom of the tree, so we need a "terminal" value. Otherwise we generate another JSON object with a decremented depth.
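To make the "size as depth" idea concrete, here is a minimal sketch of how the size could be pinned or capped. It uses a tiny stand-in tree type (Json, JArr, JLeaf) instead of JSONType so that it compiles on its own; those names, the DepthSketch object and the depthOf helper are assumptions made for illustration and are not part of the code above.

```scala
import org.scalacheck.{Gen, Test}
import org.scalacheck.Prop.forAll

// Hypothetical stand-in for JSONType, only here to keep the sketch self-contained
sealed trait Json
case class JArr(values: List[Json]) extends Json
case class JLeaf(value: String) extends Json

object DepthSketch {
  val leaf: Gen[Json] = Gen.oneOf("m", "n", "o").map(JLeaf(_))

  // same shape as `value` in the post: terminal at depth 0, otherwise recurse
  def value(depth: Int): Gen[Json] =
    if (depth <= 0) leaf
    else Gen.oneOf(jsonArr(depth - 1), leaf)

  def jsonArr(depth: Int): Gen[Json] = for {
    n    <- Gen.choose(1, 4)
    vals <- Gen.listOfN(n, value(depth))
  } yield JArr(vals)

  // ScalaCheck's "size" parameter is interpreted as the tree depth
  val sizedJson: Gen[Json] = Gen.sized(depth => jsonArr(depth))

  // number of nested arrays on the longest path (a leaf has depth 0)
  def depthOf(j: Json): Int = j match {
    case JLeaf(_) => 0
    case JArr(vs) => 1 + vs.map(depthOf).foldLeft(0)(_ max _)
  }

  // pin the size for a single generator...
  val shallow: Gen[Json] = Gen.resize(2, sizedJson)

  // ...or cap it for a whole property run through the test parameters
  val params = Test.Parameters.default.withMaxSize(5)
  val depthIsBounded = forAll(sizedJson)(j => depthOf(j) <= 6)
  def run(): Boolean = Test.check(params, depthIsBounded).passed
}
```

With this layout a generator built for depth d yields trees of at most d + 1 levels, since the outer array itself counts as one level. That is why the property checks against 6 when the maximum size is 5.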

You can certainly tweak the code above using other ScalaCheck generators to obtain more random trees (the ones generated above are too balanced) with a broader range of values, but this should get you started. It was definitely good enough for me as I spotted a bug in my code on the first run!
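As one possible direction for that tweaking, the sketch below picks a fresh random depth for every child and weights terminals with Gen.frequency, so that sibling branches end at different levels. It is only an illustration under assumptions: the Node, Arr, Obj and Leaf types and the UnbalancedSketch object are hypothetical stand-ins for the JSON classes, and the weights and value ranges are arbitrary.

```scala
import org.scalacheck.Gen

// Hypothetical stand-in for the JSON classes, to keep the sketch self-contained
sealed trait Node
case class Arr(values: List[Node]) extends Node
case class Obj(fields: Map[String, Node]) extends Node
case class Leaf(value: Any) extends Node

object UnbalancedSketch {
  // a broader range of terminal values: ints, strings and booleans
  val terminal: Gen[Node] = Gen.oneOf(
    Gen.choose(-1000, 1000).map(Leaf(_)),
    Gen.alphaStr.map(Leaf(_)),
    Gen.oneOf(true, false).map(Leaf(_))
  )

  // terminals are twice as likely as each container, so branches often stop early
  def node(depth: Int): Gen[Node] =
    if (depth <= 0) terminal
    else Gen.frequency(2 -> terminal, 1 -> arr(depth), 1 -> obj(depth))

  // each child draws its own depth below the parent's, which unbalances the tree
  def child(depth: Int): Gen[Node] =
    Gen.choose(0, math.max(0, depth - 1)).flatMap(node)

  def arr(depth: Int): Gen[Node] = for {
    n    <- Gen.choose(0, 4)
    vals <- Gen.listOfN(n, child(depth))
  } yield Arr(vals)

  def obj(depth: Int): Gen[Node] = for {
    n    <- Gen.choose(0, 4)
    ks   <- Gen.listOfN(n, Gen.identifier)
    vals <- Gen.listOfN(n, child(depth))
  } yield Obj(ks.zip(vals).toMap)

  // as before, the ScalaCheck size is read as the maximum depth
  val tree: Gen[Node] = Gen.sized(size => node(size))
}
```

Duplicate keys simply collapse when the pairs are turned into a Map, so a generated object may end up with fewer than n fields.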