09 June 2010

Random Hacks of Kindness

Pure kindness

My last coding week-end is worth a few notes. I participated in the Random Hacks of Kindness event in Sydney. The idea is to gather techies and have them code over the week-end on software projects that help organisations working on humanitarian crises, like the earthquake in Haiti.

I joined the event thinking that, for once, the code I would create would have a direct impact on people's lives! Not that working in banking, telecom, or testing tools has no benefit for anyone, but this is guaranteed to help.

First of all I want to say that the organization was flawless, with a very nice room at the University of New South Wales and all the network equipment you would expect: a proper wifi network, video and audio facilities, an additional meeting room, a nice terrace to appreciate the view around and get fresh inspiration/air.

It is also worth mentioning that food and (non-alcoholic) drinks were provided in abundance (maybe because the crowd was smaller than expected :-)).

Then I must say that one big success factor for the event was the incredible energy conveyed by Heather, a (completely jet-lagged :-)) Canadian girl who animated and coordinated the teams locally and across the continents. Special thanks also to Martin Bliemel (from the University) who co-organized and Tolmie for the videos and pictures! (including a video of me: if you crawl the links below well enough, you should find it!)

"Crises are chaos, so expect chaos"

One day before the beginning of the week-end the list of projects was published. I had a quick look, the list was huge. My first thought was: "wow, a lot of Python over there. How can I be effective in 2 days only?".

I also realized that most projects had very vague specifications; some looked more like brainstorming sessions.

Eventually, when I arrived on Saturday morning, I had really no idea where I could bring any meaningful contribution, so I just followed someone who had chosen one project and seemed really motivated by it: the PersonFinder project. It turned out that he eventually went on to something else, leaving me and 2 others continuing the project.

But those are the rules of the game. As an introduction to the week-end, Heather reminded us that crisis situations generate a lot of chaos, so it's only normal that we experience part of that during the week-end.

For me, that chaos took the form of a long "What am I supposed to do for God's sake?!!" during most of the Saturday (you can read that here).

We did a lot of research: what is already there? what's valuable? How could it be done? It was like a mini business plan for the day. Of course that's a bit frustrating for a software developer who wants to save the world by coding like a madman, but I kept thinking to myself that laying down the foundations of what could be a productive coding session for others, if not for me, was definitely worthwhile. So that became my objective for the week-end:
  1. define exactly what needs to be done
  2. prepare the environment (development, test)
  3. code at least a single, well-defined, small feature
Another way of saying it: reduce the chaos... Actually most of that became clear to me after I had a discussion with Alice, from the DC team, who had a well-defined idea of where we should go with the project. If only I had had that discussion to start with...

I learned a lot

Let's say it straight, I failed to deliver even a single small feature. But I learned a lot! On the "functional" side, I learned that:
  • there's a whole ecosystem of "crisis software" with platforms like Sahana (PHP, Python) or Ushahidi (PHP)
  • the integration of social web services is only in its infancy. Websites like pipl do not offer a public API for example. There is a Google Social Graph API, but it's still in the Labs.
  • there is available data on the web which is incredibly difficult to grab. For example, knowing my full name, you can easily get my personal address. But for that, you have to know in which country I live and which white page directory website to use. There's no generic query for that kind of search. Hmm, I'd be interested in knowing which "semantic web" initiative or magical pattern language will address that in the near future
  • Web 2.0-this, Social-that, this is a bit of chaos too. Quiz question: what's Google Wave? An email client, a wiki, a code-review tool, a forum? Not only do the frontiers and definitions blur, but the amount of possibly related data grows exponentially. Managing all that, including in "crisis" environments, will require a lot of "social" heuristics, mixing all kinds of CS and social knowledge
The technical side was very illuminating too, as I'm way behind the curve in terms of web technologies:
  • I learned how to install and deploy a Google Web App on the Google App Engine (GAE)
  • I realized that my Java skills were useful after all, with the GAE
  • I read about the Twitter search API (kudos, effective and well-documented)
  • I had a look at JSON as a data format
That may seem silly but for the first time, the whole notion of webservice started to make sense to me!

And now?

Well, the features are still not fully delivered. Thanks to the heroic work of the DC team, we've made some progress but nothing has yet been approved and reused. Can we do better next time? Sure we can:
  • we need to have "Product Owners". People who know what they *want* to be done and are able to specify it. Then they can validate the deliverables and can be a stable point of contact after the event is finished
  • we need more preparation on the projects: for example, what to install for development and testing, some hints about what could be used in the solution. I'm killing a bit of the fun and brainstorming aspect of the event here, but that leads to the next point
  • we need to make a clearer separation between:
  • "proof of concept" projects: show me this could work
  • "brainstorming" projects: what could be cool
  • "prototype" projects: cool + some code
  • "feature" projects: add a new functionality
  • "problem-solving" projects: fix something by using a brilliant algorithm for example
  • "specification" projects: prepare the problem statement for a "feature" project
I think that by making clear what the expectations and deliverables are for each type of project, we give the community a better chance to converge towards something eventually useful and durable.

My personal dilemma is that I would love to contribute more to what was started but I have a few open-source commitments that are taking priority over this at the moment. On the other hand I will be really happy to be involved next year in the preparation of the event so that everyone can get the maximum out of it. And that's because, as one of my colleagues said at work:
These projects have a spiritual meaning

28 May 2010

OSCON 2010

The slides

Those are the slides I used to present my talk:

Thank you

I want to add my thanks to:
  • Dean Wampler and Alex Payne for thinking about me to make a presentation
  • Edd Dumbill and Shirley Bailes for the organization
  • Scott Berkun for giving useful tips to make my talk "suck less"

13 May 2010

GuiceModules and TestComponents, a powerful combination

Important update!! [21/05/2010, see at the bottom]

Sorry everyone, that doesn't work

In this previous post, I was mentioning a way to inject the dependencies of a TestComponent without having to inject the component itself.

The objective was, if I pick up the example I was using, to have a TestComponent called "CustomerOrder". That TestComponent is responsible for setting up a proper Customer and its Order, so that they both can be injected into the TestCases needing them. But the additional requirement was not to have to inject the CustomerOrder component itself.

It turns out that what I was advising to do (create a binding of the CustomerOrder in the CustomerOrderModule) doesn't work at all!

And all my other attempts at solving this issue usually ended up in initialization errors where a TestComponent was relying on another for its setup, but that other one was not initialized.

The proper trick

It is actually simple and I don't understand why I haven't been able to come up with it in the first place.
The idea is to:
  • get a set of all the modules, including their transitive dependencies
  • create an injector for this set of modules
  • inject the modules themselves!

/**
 * Use all the bindings of this module, including the additional ones
 * to inject dependencies
 */
public <T> T inject(final T object) {
    // presumed body, missing from the original listing:
    injectModules();
    injector().injectMembers(object);
    return object;
}

/** @return an injector initialized with all the transitive bindings of this module */
private Injector injector() {
    if (injector == null) {
        injector = Guice.createInjector(getModules());
    }
    return injector;
}

/**
 * modules are injected one after the other following their dependencies order:
 * If module A depends on module B, then module B must be injected first
 */
private void injectModules() {
    final List<GuiceModule> sorted = new ArrayList<GuiceModule>(getModules());
    Collections.sort(sorted, new GuiceModuleComparator());
    for (final GuiceModule m : sorted) {
        injector().injectMembers(m); // presumed: the loop body is missing from the original listing
    }
}

Because the modules are now injected, if you add a dependency inside the module, that dependency will be initialized when the Module is injected. So nothing stops me from injecting the CustomerOrder TestComponent into its module so that anytime someone uses the module, the CustomerOrder TestComponent will be fully set up.

Now, how does the code above solve the initialization issues I've been mentioning?

Well, I guess that you noticed something in that code, the modules are sorted before they're injected. Indeed they are sorted following the "natural" pre-order of their transitive dependencies with the following Comparator:

private static class GuiceModuleComparator implements Comparator<GuiceModule> {
    public int compare(final GuiceModule o1, final GuiceModule o2) {
        if (o1.getModules().contains(o2))
            return 1;
        else if (o2.getModules().contains(o1))
            return -1;
        return 0;
    }
}

This sorting ensures that "low-level" modules will be injected first, then modules depending on them will be injected, respecting the actual dependencies.

So now I can write:

/** The bindings definitions */
public class CustomerOrderModule extends GuiceModule {
    // this injection will setup the Customer/Order before they're used anywhere
    @Inject CustomerOrder customerOrder;
    @Override protected void configure() {
        bind(Customer.class).toInstance(new Customer());
        bind(Order.class).toInstance(new Order());
    }
}

/**
 * The CustomerOrder TestComponent setting up a "typical"
 * Customer and its Order
 */
public class CustomerOrder extends TestComponent {
    @Inject Customer customer;
    @Inject Order order;
    @Override protected void before() {
        // set up the Customer and order instances here
    }
}

/** A test case using the Customer and its Order */
public class OrderingTestCase extends TestComponent {
    // the customer and order can be injected without having to
    // inject the CustomerOrder TestComponent!!
    @Inject Customer customer;
    @Inject Order order;

    @Test public void anOrderMustHaveACustomer() {
        assertEquals(customer, order.getCustomer());
    }
    @Override public GuiceModule module() {
        return new CustomerOrderModule();
    }
}

Please ask

That's all there is to it. I understand that this may be a bit difficult to grasp at first, but then you realize that you only have a few concepts:
  • a GuiceModule specifies bindings
  • GuiceModules can have dependencies on other GuiceModules
  • a TestComponent is either setting data (aka TestFixture) or is a TestCase
  • a TestComponent specifies its GuiceModule to define its required bindings

Give it a go, I've been using that for a while now on my project, and the flexibility we got for defining and reusing TestComponents has been really great. And if you have any trouble don't hesitate to ask, I'll be happy to provide some more complete code samples and explanations.

Important update!! [21/05/2010]

Live and learn, they say. Well the sorting of modules proposed above doesn't work either. What I really want is a topological sort. I'm glad that Cedric Beust posted this article once, explaining how he was dealing with the dependencies in TestNG, because that's essentially the same problem here.

So, for the courageous who want to follow the GuiceModules/TestComponent way, here is the sorting code you will need:

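/**
 * This sort algorithm is taken from
 * @return the list of sorted modules according to the Topological order
 */
public List<GuiceModule> sortedModules() {
    final List<GuiceModule> result = new ArrayList<GuiceModule>();
    final Map<GuiceModule, Boolean> visited = new HashMap<GuiceModule, Boolean>();
    for (final GuiceModule m : getModules())
        visited.put(m, false);
    for (final GuiceModule m : getModules())
        visit(m, result, visited);
    return new ArrayList<GuiceModule>(result);
}

/** visit a module and add it only if it comes after other modules */
private void visit(final GuiceModule m, final List<GuiceModule> result, final Map<GuiceModule, Boolean> visited) {
    if (!visited.get(m)) {
        visited.put(m, true);
        for (final GuiceModule other : getModules()) {
            // the comparison call is garbled in the original listing; presumably the
            // GuiceModuleComparator is used to find the modules that must come before m
            if (!m.equals(other) && !visited.get(other) && new GuiceModuleComparator().compare(other, m) < 0)
                visit(other, result, visited);
        }
        result.add(m); // presumed: this statement is missing from the original listing
    }
}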
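For readers who want to try the idea outside of Guice, here is a minimal, self-contained sketch of a depth-first topological sort. It is not the code from my project: modules are reduced to plain names, and the ModuleTopologicalSort class and its dependencies map are hypothetical, introduced only for illustration. It also hints at why a pairwise comparator is not enough: a comparator returning 0 for two unrelated modules does not define a consistent ordering, whereas visiting the dependencies of a module before the module itself always does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-alone sketch: modules are plain names, and `dependencies`
// maps each module to the modules it directly depends on.
class ModuleTopologicalSort {

    /** Returns the modules ordered so that every module comes after its dependencies. */
    static List<String> sort(Map<String, List<String>> dependencies) {
        Set<String> done = new LinkedHashSet<>();       // fully processed, in final order
        Set<String> inProgress = new LinkedHashSet<>(); // currently being visited
        for (String module : dependencies.keySet()) {
            visit(module, dependencies, done, inProgress);
        }
        return new ArrayList<>(done);
    }

    private static void visit(String module, Map<String, List<String>> dependencies,
                              Set<String> done, Set<String> inProgress) {
        if (done.contains(module)) {
            return;
        }
        if (!inProgress.add(module)) {
            throw new IllegalStateException("Cyclic dependency involving " + module);
        }
        // dependencies are visited (and therefore added) before the module itself
        for (String dependency : dependencies.getOrDefault(module, List.of())) {
            visit(dependency, dependencies, done, inProgress);
        }
        inProgress.remove(module);
        done.add(module);
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("ConfirmationProcessor", List.of("CustomerOrder"));
        deps.put("CustomerOrder", List.of("Factories"));
        deps.put("Factories", List.of("MockServer"));
        deps.put("MockServer", List.of());
        System.out.println(sort(deps));
    }
}
```

Running main lists the modules from the lowest-level one (MockServer) up to ConfirmationProcessor. The cycle check is an extra safety net of this sketch, not something the code above does.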

This is exactly the kind of moment when I don't regret reading blog posts as a morning routine!

06 May 2010

Mini-Parsers to the rescue

An example of curryfication

I'm implementing a reasonably complex algorithm at the moment with different transformation phases. One transformation is a "curryfication" of some terms to be able to transform some expressions like:

f(a, b, c)

into:

.(.(.(f, a), b), c)
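To make the transformation concrete, here is a small sketch in Java (my real code is in Scala and works on proper data types, not on strings). The CurrySketch class and its curry method are hypothetical names used only for this illustration: the function is applied to its first argument, the result to the second argument, and so on.

```java
import java.util.List;

// Hypothetical sketch: expressions are rendered directly as strings,
// using the ".(function, argument)" notation for an application.
class CurrySketch {

    /** Curries f(args...) into nested applications of one argument each. */
    static String curry(String function, List<String> arguments) {
        String result = function;
        for (String argument : arguments) {
            // apply what has been built so far to the next argument
            result = ".(" + result + ", " + argument + ")";
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(curry("f", List.of("a", "b", "c"))); // prints .(.(.(f, a), b), c)
    }
}
```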

Being test-compulsive, I want to be able to test that my transformation works. Unfortunately the data structures I'm dealing with are pretty verbose in my tests.

One of my spec examples was this:

"A composed expression, when curried" should {
  "be 2 successive applications of one parameter " +
  "when there are 2 parameters to the method expression" in {
    ComposedExp(MethodEx(method), const :: arb :: Nil).curryfy must_==
      Apply(Apply(Curry(method), Curry(const)), Curry(arb))
  }
}

But Apply(Apply(Curry(method), Curry(const)), Curry(arb)) is really less readable than .(.(method, const), arb). And this can only get worse on a more complex example.

With the help of a mini-parser

So I thought that I could just write a parser to recreate a "Curried" expression from a String.

A first look at the Scala 2.8.0 Scaladoc (filter with "parser") was a bit scary. Especially because my last parser exercise was really a long time ago now.

But no, parser combinators and especially small, specific parsers like the one I wrote, are really straightforward:

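import scala.util.parsing.combinator.JavaTokenParsers
import scala.util.parsing.input.CharSequenceReader

object CurriedParser extends JavaTokenParsers {
  // lazy vals, because the parsers refer to each other before they are initialized
  lazy val parser: Parser[Curried] = application | constant
  lazy val constant = ident ^^ { s => Curry(s) }
  lazy val application = (".(" ~> parser) ~ (", " ~> constant <~ ")") ^^ { case a ~ b =>
    Apply(a, b)
  }
  def fromString(s: String): Curried = parser.apply(new CharSequenceReader(s)).get
}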

In the snippet above I declare that:

  • my parser returns a Curried object

  • my parser is either an application or a constant (with the | operator)

  • a constant is a Java identifier (ident is a parser inherited from the JavaTokenParsers trait)

  • a constant should be transformed to a Curry object

  • an application is composed of two parts (separated by the central ~ operator). The first part is: some syntax (".(") and the result of something to parse recursively with the parser (i.e. an application or a constant). The second part is a constant, surrounded by some syntax (", " and ")"). Note that the syntactic elements are being discarded by using ~> and <~ instead of ~.

  • once an application is parsed, part 1 and part 2 are accessible as matchable object of the form a ~ b and this object can be used to build an Apply object

  • the fromString method simply passes a String to the parser and gets the result

I have to say that this parser is really rudimentary. It doesn't handle errors in the input text, there absolutely needs to be a space after the comma, and so on.

Yet it really fits my testing purpose for a minimum amount of development time.

Open question

I hope that this post can serve as an example to anyone new to Scala wanting to play with parsers and I leave an open question for senior Scala developers:

Is there a way to extend the case classes representing algebraic datatypes in Scala so that:
  • each case class has a proper toString representation (that's already the case and that's one benefit of case classes), but that representation can be overridden (for example to replace Apply(x, y) with .(x, y))

  • there is an implicit parser that is able to reconstruct the hierarchy of objects from its string representation

10 March 2010

We need an algebra for Guice modules

Or,... I missed something (which is still highly probable).

Anyway, I want to share today the small library I've created to ease the life of my co-workers when faced with the tedious task of writing unit tests.

I won't do that again and again (and again)

I love writing unit tests. Ah, the smell of the CPU burning cycles to pass all those little creatures to green! Yet, starting from a blank page can be really tedious.

If you're like me, working with some kind of major Client-Server system, with tons of business objects and a fair deal of interfaces, you may find that just starting writing the first line of test code is no piece of cake. You're likely to need:
  • some kind of "reference" data in your system, like a list of customers or a list of products
  • some live-like data, like current prices
  • some elaborate building of your "object under test", like a ConfirmationProcessor for orders confirmation
  • some infrastructure to make sure that you don't need to go to a real database (think "mocks" here)
  • some infrastructure to make sure that you don't need to access external webservices (think "mock" one more time)
Phew! I can pretty much guarantee that if you've gone through the trouble of coding all this machinery once, for your first test class, you will desperately want to reuse it for your next test class!

Injection is for everyone: interfaces, objects, mocks and, yes, tests

If I rephrase the requirements above a bit differently, what I really, really, want is the ability to create Test Components where:
  • one Test Component represents my Server, with all interfaces being mocks (those are RMI interfaces in my case)
  • one Test Component provides a set of factories to create business objects easily and make as if they were saved on the Server (this uses the previous Test Component, right?). Once initialized, this component makes sure that a minimum of default "reference" objects are already saved (a customer and a product for example)
  • one Test Component named CustomerOrder represents the "typical" customer order, with all necessary customer data, product selection,... This component also provides easy methods to change the product or the payment details, possibly using a DSL to do that concisely
  • one Test Component is for the ConfirmationProcessor, what you really want to test, an object which creates and sends confirmations to the customer for their orders. This component, for example, is using a mock for the DiscountCalculator because it is not relevant in the context of sending confirmations
It must be clear, from the description above, that those Test Components may have lots of dependencies between them. Instantiating all those components properly can rapidly become a nightmare, but it happens that we know exactly how to deal with this nowadays.

Different shapes and flavors are available for Dependency Injection and I went with my favorite: Google Guice. Guice is going to "inject" all the required objects for my test:
  • a mock Server
  • a set of factories
  • a typical customer order
  • the context of my test

Show me the code

First of all, I need to show what I mean by TestComponent:

/** something with a Guice module */
public interface HasGuiceModule {
    GuiceModule module();
}

/**
 * A simple test component using a Guice Module to inject its members
 * when invoked by JUnit through the @Before annotation.
 * If necessary, after the members injection, the subclass can use the localSetup method
 * to add more initialization (like creating an initial customer) or more
 * expectations (for mock objects)
 */
public abstract class TestComponent implements HasGuiceModule {
    // the method bodies are missing from the original listing; the calls below
    // are presumed from the description above
    @Before public void setup() {
        inject();
        localSetup();
        expectations();
    }

    private void inject() {
        // injects the members of this component with the bindings of module()
    }

    protected void localSetup() {}
    protected void expectations() {}
}
The component above is really JUnit-oriented but you could as easily add a main method which would call the setup.

Armed with this, I can create my first concrete Test Components, leaving out the module() method definition for now:
public class MockServer extends TestComponent {
    @Inject Connection connection;

    @Inject CustomerInterface customers;

    @Inject ProductInterface products;

    @Inject OrderInterface orders;

    // do all the wiring when the members have been injected with Guice
    protected void expectations() {
        // ...
    }

    public GuiceModule module() { /** to be defined later */ }
}

public class Factories extends TestComponent {
    // this server interface provides methods for individual objects: a customer
    // a payment method, a customer address,...
    @Inject CustomerInterface customers;

    // this factory provides easy to use methods to create
    // fully setup customers with delivery address, payment method,...
    // it uses the CustomerInterface interface to do so
    @Inject CustomerFactory customerFactory;

    public GuiceModule module() { /** to be defined later */ }
}

public class ConfirmationProcessorTest extends TestComponent {

    // the class under test
    @Inject ConfirmationProcessor processor;

    // the standard order, it uses the OrderFactory and CustomerFactory to save and
    // update the customer and order
    @Inject CustomerOrder customerOrder;

    @Test public void aConfirmationForAGoldCustomerMustDisplayStars() {
        // a DSL for describing customer types
        customerOrder.setCustomerType("Gold 5Y 4*");
        // ...
    }

    private void confirmationMustContain(String content) {
        // ...
    }

    public GuiceModule module() { /** to be defined later */ }
}
The code above shows a classic JUnit4 class, with one method annotated with @Test. The nice things to notice are:
  • the test data is injected in the form of a Test Component with a ready-to-use initial state
  • that Test Component provides convenient ways to refine the initial state with data relevant to the test objective

So what's Guicy here?

So far, so good, but you must definitely feel that I left out part of the actual magic here. What about this module() method? How are GuiceModules defined? And why a GuiceModule and not a plain Guice Module?

The module() method is very straightforward. It just returns a GuiceModule which describes
the bindings for the TestComponent:
public class ConfirmationProcessorTest extends TestComponent {
    public GuiceModule module() {
        return new ConfirmationProcessorModule();
    }
}
// and
public class ConfirmationProcessorModule extends GuiceModule {
    // bindings go here
}
So, what is a GuiceModule? Actually, a GuiceModule is mostly a regular Guice module. It allows you to describe the bindings for a specific TestComponent. However, it is refinable, test-friendly and composable with other modules.

Let's see how it works with several examples:
  • A simple module
    public class MockServerModule extends GuiceModule {
        @Override protected void configure() {
            // equivalent to bind(Customers.class).toInstance(mock(Customers.class))
            // this binding declares that in every TestComponent using this GuiceModule
            // the Customers interface will be a mock
        }
    }

  • A module dependent on another
    public class MockFactoriesModule extends GuiceModule {
        @Override protected void configure() {
            // in these bindings, we declare that the factories are
            // going to use an in-memory representation of the database
            // the job of a Mock database is to use the mock server interfaces
            // to make as if a saved object was always returned when required
        }
        @Override protected Set<GuiceModule> modules() {
            // add a MockServerModule to the list of dependent modules
            return modules(MockServerModule.class);
        }
    }

  • A module with dependencies and local mocks
    public class ConfirmationProcessorModule extends GuiceModule {
        @Override protected List<Class<?>> mocks() {
            return classes(DiscountCalculator.class);
        }
        @Override protected Set<GuiceModule> modules() {
            return modules(CustomerOrderModule.class);
        }
    }

  • A module refining another one and composed with other modules for a complex setup
    new ConfirmationModule() {
        @Override protected List<Class<?>> spies() {
            return classes(ConfirmationProcessor.class);
        }
    };

As you can see on those examples, there is a lot of flexibility to allow you to reuse your existing modules as much as possible:
  • by refining them with additional mocks / spies (overriding the mocks method)
  • by adding another module to form a larger one
  • by removing / replacing another module
One question however remains unanswered. The dependencies among modules could be rather hairy and knowing that you can't declare bindings more than once with Guice, how do you manage to avoid conflicts? You can't afford to add modules referencing one another and face the nightmare of digging out why a given module has been referenced twice.

Transitive dependencies and Set of GuiceModules

The answer to this issue is simple. It is coded in the GuiceModule#getModules() method:
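public Set<GuiceModule> getModules() {
    Set<GuiceModule> result = new HashSet<GuiceModule>();
    result.addAll(dependentModules); // a private list of added modules
    result.addAll(modules()); // the ones declared by the subclass

    // the loop bodies are missing from the original listing; presumably the
    // transitive dependencies are added, then the unwanted modules are removed
    for (GuiceModule m : modules()) {
        result.addAll(m.getModules());
    }
    for (GuiceModule m : modulesToRemove()) {
        result.remove(m);
    }
    return result;
}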

The GuiceModule#getModules() method adds or removes modules following all dependencies transitively (and recursively). The design is also kept deliberately simple here. A Set makes sure that 2 "same" modules are never returned, based on them being equal. And 2 modules are considered to be equal if they have the same class.

This forbids the possibility of parameterizing modules but I haven't found the need yet to do that. My guess is also that evolving the design to accommodate that situation shouldn't be too difficult (add a smarter equals method).
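Here is a minimal, Guice-free sketch of that idea, under my own simplifying assumptions: the SketchModule class and its three subclasses are hypothetical names, there is no module removal, and nothing is bound. It only shows how class-based equality plus a Set keeps a module from appearing twice, even when it is reachable through several paths.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for GuiceModule, without any Guice dependency.
abstract class SketchModule {

    /** Modules this module directly depends on (none by default). */
    protected List<SketchModule> modules() {
        return List.of();
    }

    /** The direct dependencies of this module plus all of theirs, each class at most once. */
    public Set<SketchModule> getModules() {
        Set<SketchModule> result = new LinkedHashSet<>();
        collect(this, result);
        return result;
    }

    private static void collect(SketchModule module, Set<SketchModule> result) {
        for (SketchModule dependency : module.modules()) {
            if (result.add(dependency)) { // false when an equal module is already there
                collect(dependency, result);
            }
        }
    }

    // Two modules are "the same" when they have the same class.
    @Override public boolean equals(Object other) {
        return other != null && getClass() == other.getClass();
    }

    @Override public int hashCode() {
        return getClass().hashCode();
    }

    public static void main(String[] args) {
        // ServerModule is reachable twice but is collected only once
        System.out.println(new OrderModule().getModules().size()); // prints 2
    }
}

class ServerModule extends SketchModule {}

class FactoriesModule extends SketchModule {
    @Override protected List<SketchModule> modules() {
        return List.of(new ServerModule());
    }
}

class OrderModule extends SketchModule {
    @Override protected List<SketchModule> modules() {
        return List.of(new FactoriesModule(), new ServerModule());
    }
}
```

A "smarter" equals would compare the module parameters as well as the class, which is where this sketch would need to change to support parameterized modules.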

Now my real question to the knowledgeable people around here is: is there a better way to do so with Guice?

I've found a way to override bindings from a module with bindings from another module. I've seen that you can create a hierarchy of injectors. I've seen that you can use scopes to specify the applicability of your bindings. But I haven't found anything like a simple "algebra" for modules so that they can be added, subtracted (or unioned / intersected if you prefer). Maybe this post title is provocative enough that I can get an enlightened answer!

Anyway, the supporting code I'm presenting here is just a few lines. All the heavylifting is done in Guice for dependencies resolution as well as in the Factories I've created to mock out the Server and create smart test data.

A Test Component declares lots of bindings, but in order to get the corresponding members injected, you need to either:
  • inject the component itself into your test
  • inherit from it
Let's take an example. You want to use the "standard" order built by the CustomerOrder TestComponent. You could write:
public class MyTestWithAnOrder extends TestComponent {
    // injecting the CustomerOrder builds an order which itself is injected
    // forgetting that line would end up with an "empty" order
    // (order.equals(new Order()))
    @Inject CustomerOrder customerOrder;
    @Inject Order order;
    public GuiceModule module() { return new CustomerOrderModule(); }
}
You can also write:
public class MyTestWithAnOrder extends CustomerOrder {
    // the order member can be inherited and properly setup by the superclass
    public GuiceModule module() { return new CustomerOrderModule(); }
}
But here's a trick. If you declare the following binding in the CustomerOrderModule:
public class CustomerOrderModule extends GuiceModule {
    @Override protected void configure() {
        bind(Order.class).toInstance(new Order());
        bind(CustomerOrder.class).toInstance(new CustomerOrder());
    }
}
// given that CustomerOrder looks like
public class CustomerOrder extends TestComponent {
    @Inject Customer customer;
    @Inject Order order;
    @Inject OrderFactory orderFactory;

    // this is where, as a post-construct step, the order object gets fully setup
    protected void localSetup() {
        // ...
    }
    public GuiceModule module() { return new CustomerOrderModule(); }
}
Then, you can just access the order dependency in your test without having to reference the CustomerOrder TestComponent:
public class MyTestWithAnOrder extends TestComponent {
    // it just works (tm)
    @Inject Order order;
    public GuiceModule module() { return new CustomerOrderModule(); }
}

The other tip I want to share is the addition of autoGet() methods to the Factories objects.

Let's say you need, for your test, to make as if a customer existed in the database. The usual thing to do is to use the test factory to create a new Customer object and use mocks to have it returned when someone calls getCustomer(id). But you may actually not really care about which customer it really is. So why not go one step further with an autoGet() method?

Calling autoGet() on the factory will just set-up the mock object representing the CustomerInterface on the Server so that every time a customer is requested with an id, you simply create one on the fly and return it! (see here to see how to do it with Mockito).
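To give an idea of the principle without any mocking library, here is a rough sketch in plain Java. Everything in it is hypothetical (the AutoGetSketch class, its Customer and CustomerInterface types and the autoGet signature are made up for the illustration); with Mockito the same effect is usually obtained by stubbing the call with an Answer instead of a fixed return value.

```java
import java.util.function.IntFunction;

// Hypothetical, Mockito-free sketch: every name below is made up for illustration.
class AutoGetSketch {

    static final class Customer {
        final int id;
        Customer(int id) { this.id = id; }
    }

    /** Stand-in for the server interface that the tests would normally mock. */
    interface CustomerInterface {
        Customer getCustomer(int id);
    }

    /** Returns a stub that builds a customer on the fly for whatever id is requested. */
    static CustomerInterface autoGet(IntFunction<Customer> factory) {
        return id -> factory.apply(id);
    }

    public static void main(String[] args) {
        CustomerInterface customers = autoGet(Customer::new);
        System.out.println(customers.getCustomer(42).id); // prints 42
    }
}
```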


I think the take-away points from this post are:
  • Test components are worth investing in
  • But then they have to be highly reusable
  • Guice is a great tool to manage dependencies
  • Dependencies are better off with some operations to combine them
And the last, last, important thing about having TestComponents and dependencies carefully outlined is that it forces you to think about your design. If you find yourself having too many components and objects to inject for a given test, it may very well be a sign of a potent code smell.