06 December 2006

Testing a complex API

We all know the value of testing, and especially the importance of unit testing. However, it is sometimes quite a challenge to properly test some pieces of code.

What about testing classes that access an API that uses complex data structures as input and output?

Testing the Eclipse Modeling Framework

One example of this is the Eclipse Modeling Framework, but I am sure you will find plenty of examples of your own.

This kind of API offers points of entry to modeling data structures such as "PackageRegistry". A PackageRegistry has a "getAllPackages" method returning Package objects.

Then, each Package object can be browsed for its content: Class objects, Diagram objects, nested Packages,...

Package, Class, Diagram,... are all interfaces to a complex data structure that can be navigated in order to generate code or create HTML reports, for instance. But how do you test this transformation code?

An "out-of-the-box" answer is to use mock objects to create the objects accessed through the API. A fine Mock object framework such as jmock is one of the best for that purpose.

However, setting up the mock structure is:
  1. extremely painful (as in "extracting-my-teeth-one-by-one")
  2. nearly unreadable
Here's what it looks like:

Mock packagesRegistry = mock(PackagesRegistry.class);
List packages = new ArrayList();
packagesRegistry.expects(once()).method("getAllPackages").will(returnValue(packages));

// now create one mock package ("package" is a reserved word in Java,
// hence the variable name)
Mock pack = mock(Package.class);
List classes = new ArrayList();
pack.stubs().method("getName").will(returnValue("pack1"));
pack.stubs().method("getClasses").will(returnValue(classes));

// and so on,...

This code is so awful that it is almost useless. We need a fresh approach for this kind of unit testing: a mocking DSL!

A mocking DSL

Well, it may be far-fetched to call that a DSL, but the idea is to use operation calls to represent the mock structure and the expected stubs (with a real-life code sample):

topPackage = (Package) create(
    mock(Package.class),
    stub("getFullName", "model"),
    stub("getOwnedMembers", elist(createClassOne())),
    stub("getNestedPackages", elist(create(
        mock(Package.class),
        stub("getFullName", "model.pack1"),
        stub("getOwnedDiagrams", elist(createDiagram())),
        stub("getOwnedMembers", elist()),
        stub("getNestedPackages", elist(create(
            mock(Package.class),
            stub("getFullName", "model.pack1.pack2"),
            stub("getNestedPackages", elist()))))))));

I'll let you imagine the same mock structure with "classical" declarations,...

Even here, this is not a revolutionary improvement:
  1. it is still quite verbose: more complex data structures may not be readable at first glance (but do you need them for unit testing?)
  2. instead of having "create", "stub", "mock" methods, "package", "klass", "method" may be closer to the domain. If you are going to test the complex API heavily, I would suggest doing so.
The MockBuilderTestCase class

The "create", "mock" and "stub" operations are offered by a MockBuilderTestCase class subclassing the MockObjectTestCase (see the jmock doc).

This is fairly simple code:
public class MockBuilderTestCase extends MockObjectTestCase {
    private Stack<Mock> mockStack = new Stack<Mock>();

    public Mock mock(Class klass) {
        mockStack.push(super.mock(klass));
        return currentMock();
    }

    private Mock currentMock() {
        return mockStack.peek();
    }

    protected Mock callOnce(String methodName, Object value) {
        currentMock().expects(once()).method(methodName).will(returnValue(value));
        return currentMock();
    }

    // the Mock... parameter merely collects the values returned by the nested
    // stub() calls; they have already been applied to the current mock
    protected Object create(Mock mock, Mock... stubs) {
        mockStack.pop();
        return mock.proxy();
    }

    protected Mock stub(String methodName, Object value) {
        currentMock().stubs().method(methodName).will(returnValue(value));
        return currentMock();
    }
}

As "mock" operations are encountered when java executes the topmost "create" function, a new mock object is created and pushed on a Stack. Then all subsequent "stub" operations are applied to that mock. Eventually, when the "create" operation has all its arguments evaluated, the mock is ready and can be removed from the stack.

I wish I could turn the "eager/lazy" knob for evaluating arguments in Java. I often find that using nested operation calls in Java can provide an easy internal DSL; however, having the operation arguments evaluated first is not always practical.
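
One workaround is to delay evaluation by hand with a thunk. Here is a sketch (the Thunk interface is mine, not from any library):

// The value is only computed when value() is called, not when the thunk
// is passed as an argument.
interface Thunk<T> {
    T value();
}

Thunk<Object> ownedMembers = new Thunk<Object>() {
    public Object value() {
        return elist(createClassOne());  // now built on demand
    }
};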

Don't forget the Law of Demeter!

One last tip for efficient and easy testing of classes that use a complex API: don't let them reach too deep into the API hierarchy.

That is, instead of allowing modelBrowser.getOperationNames() to be implemented like

List<String> result = new ArrayList<String>();
for (Class klass : getAllClasses()) {
    for (Operation operation : klass.getOperations()) {
        result.add(operation.getName());
    }
}

aim for delegation to classes dedicated to lower levels:

List<String> result = new ArrayList<String>();
for (Class klass : getAllClasses()) {
    result.addAll(classBrowser.getOperationNames(klass));
}
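
For completeness, a sketch of the delegate this implies (ClassBrowser is a hypothetical class dedicated to that lower level of the API):

public class ClassBrowser {
    // knows how to navigate a single Class from the modeling API
    public List<String> getOperationNames(Class klass) {
        List<String> result = new ArrayList<String>();
        for (Operation operation : klass.getOperations()) {
            result.add(operation.getName());
        }
        return result;
    }
}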

10 November 2006

Thinking twice about it

We are conducting an interesting exploration project these days.

The product I am working on uses UML models as a primary input. Those models are used for test generation, after setting various coverage criteria on the model elements.

Conceptually, the picture is quite clear: data in, data out. However, the devil is in the details,...

Having the same UML model referenced in 2 different tools leads to pretty complex synchronisation issues between the 2 data repositories:
  • when an operation is removed, you have to remove the corresponding configuration
  • when an operation is added, you have to set some kind of defaults for it
  • when an operation is renamed, well,... you're stuck! You can only treat it as a removal + addition
So working on the third generation of the product, I was challenged pretty hard by the development manager to find anything that would allow us to get rid of those synchronisation mechanisms. I was quite reluctant at first but decided to have a go at it.

Thinking twice about it

I started this exploration and it appears that it may be quite possible:
  • using the UML profile capabilities of the UML tool, it is possible to add more semantics to model elements and let users define properties on them
  • using UML 2.0 interactions allows us to represent quite nicely the operations that are targeted for test generation
Yes, this could work. But wait! Look at the pay-offs; we couldn't have hoped they would be so great:
  • the configuration is always coherent with the model
  • there's not a line of code needed to edit, visualize or store this configuration
  • you have a natural way to define initial states for test generation and to deal with UML elements
  • this approach can be reproduced with other UML modeling tools in the category we're targeting
  • you no longer have test generation projects to manage, because all the information is efficiently stored along with the modeling project!
Okay, okay the "No silver bullet" theorem still applies there. There are some obvious drawbacks:
  • we are constrained by the UML tool. The properties that are vital for -us- are displayed by -them-
  • we are, of course, a lot more dependent on the tool bugs (and of course I spotted some, even if I could find turnarounds)
  • there is no easy way to tell the user what to do to configure the test generation. It doesn't come right into his face (that's the biggest drawback to me - a UI is supposed to provide the maximum guidance to the next thing to do)
I don't know where this experiment will lead us but the lesson is nonetheless interesting:

Think about it twice: can you state your software requirements so that the solution is not included in the question?

"We need an application to configure the parameters on the model elements"

Otherwise this may lead to other requirements,...

"Please, synchronize the data for me,..."

10 October 2006

Writing on the walls

We had this wonderful session last week about "writing on the walls". This tutorial was held by Laurent Bossavit (read here for his contribution to the Agile movement).

The purpose of this tutorial was to teach the use of 3 simple agile tools:
  • the tasks board
  • the release board
  • the burn-down chart
As far as pedagogy is concerned, the tutorial was really great. In 90 minutes, we had planned 4 or 5 iterations, developed 3 (using dice to decrease the points on the cards!) and really experienced what it feels like to communicate the project status on the walls.

The tasks board

I have been practising the tasks board for some time now and already had this kind of revelation: holding the cards, displaying them, tearing them up, replacing them, writing your names (the pair) on them,...

This "physical", "concrete" approach to the work to be done makes it very real. It makes the people focus on something concrete, not on a line in an Excel spreadsheet or in an html page. Hence, when a card is moved from the "TODO" to the "DONE" column, you can see it's finished.

We found a common pattern for the 3 tools, during the tutorial. They deliver information in a very progressive way. The tasks board answers the questions:
  1. "how far are we done?". Glancing at the board gives you the info right away: GOOD -> many cards in the DONE column, BAD -> many cards in the TODO column (unless at the very beginning of the iteration of course,...)
  2. "how much exactly has been done?". Look at the expected/actual velocity
  3. "what has been done? / what remains to do". Reading the cards tells you what the developped features are
By the way, the exercise is a lot less futile when you have acceptance tests corresponding to each card, and they are green when the card is said to be done. But this is another story,...

The release board

This one is fairly simple. You just display the remaining cards of the milestone, sorted along the remaining iterations. In addition to the cards, you add the expected velocity of each iteration (and make it match what you know you can do!).

The interesting part about the release board is that it tells you a story. It is like a storyboard of what's going to happen in the project. If done properly, you should understand the plot, feel the tension between the characters and an overall sense of coherence, and... the happy ending, of course!

I also found that this is a very good tool for the client to communicate with the developers. Most of the time, developers are so engaged in their code that they end up lacking a vision for the overall product. The release board helps them get the broader picture and understand the customer's priorities.

The burn-down (or burn-up!) chart

This one is the typical tool of the manager and illustrates a very fundamental point about "writing on the walls". On this chart (in a "burn-down" fashion), you display the stock of points to be developed at the beginning of the milestone, and after each iteration, you add a new point on the chart, showing that this stock is decreasing. So you have a more or less declining slope, made of several segments.

Then, by interpolating roughly, you draw a line that crosses the X axis, giving you a bold estimate of the release date for the whole product.

Now, what is a burn-up chart? Well, reality is not very linear. Every now and then, new features or weird discoveries pop up, and your stock will increase. You can display things the other way around, showing that each iteration adds features and tries to reach a "ceiling". Then, you can easily raise the ceiling bar if necessary. You will also be able to get an estimate of how much the ceiling rises over a typical milestone and use this data for your next planning game (thanks Olivier for convincing me on that point!).

To sum it up, the burn charts give a graphical representation of the velocity of the team for a whole milestone and help convey the "are we done yet?" data over a large time scale.

Oh, by the way, what was the fundamental point?

The fundamental point

The fundamental point is not "cool, everyone sees the project progress".

The fundamental point is "cool, everyone cannot help but see the project progress". Developers face their responsibility to maintain the best velocity (mark this word: maintain,...). And managers have to acknowledge that it takes time to deliver software.

Information is not only displayed; it imposes itself on everybody.

14 September 2006

Interesting software vs Plumbing

What makes interesting software? What makes hard software? What makes fun software?

Customer affinity

It all started with Martin Fowler's post: CustomerAffinity. In this blog entry, Martin makes the point that "enterprise software" can be much more than "boring" and "shuffling data around". If you have enough interest in your customer (CustomerLove?), you can be pretty much delighted by finding the best ways to solve his business challenges.

So, as a programmer, even if you should be interested in frameworks (he calls that "plumbing"), you should be even more interested in the business that's supported.

Well, this has been more or less my opinion for years. I love to formalize and structure the fuzzy concepts that business people are toying with.

It is fascinating to see how human creativity can elaborate very complex things from very simple ideas. For instance, Thales is seemingly credited with the idea of buying the option to use olive presses during the next harvest. He bought the right to use something at some time in the future. Now, play with the words in italics and you will get all the subtle ways to define options.

Is it boring?

Fine, fine, but Ravi Mohan disagrees forcefully (But Martin, entreprise software IS boring). He argues that the most talented programmers would rather code a compiler than a loan-disbursing process.

So, "Computer science" and mathematics could be much more interesting than "enterprise software"? Reginald Braithwaite's entry about business programming provides some answers.

Not only can getting the concepts right be interesting (how do you define a "fraud suspicion"?), but it can also raise really interesting computing issues. He takes the example of a system for routing concrete trucks: what are the optimal paths if you take the traffic into consideration? What if an accident occurs?

And Reginald raises the bar higher: can you do it in real-time? For thousands of trucks?

So, what's interesting in software? We can get some hints from those blogs:
  • customer business (Martin's interest)
  • mathematical / algorithmic challenges (from Reginald's blog)
  • scalability / performance issues (Reginald again)
  • computer-related stuff: languages, compilers, protocols, graphics-intensive software,... (Ravi)
I would also add one source of interest that is rarely given much attention as such: maintenance. Bleech! Bad word; how can it be desirable? Well, when you code and design software day after day, you can also meet pretty tough challenges around maintenance. Is your code readable enough? How is your design conveyed? Can the software be plugged and unplugged gracefully? All these questions sometimes require specific talents that may not be found in the "hardcore programmers" set. So, let's add one more point to the list of what makes software interesting:
  • the "art" of software maintenance
Fun happens

So far, so good. There's even good news: you may have to deal with all of these in your next software project! The chances are thin, but software is such a continuum that you may one day have to implement software for hospital management:
  • what's the content and life of a medical file?
  • how do you allocate efficiently resources such as rooms, surgery equipment? (what if a big -think 'terrorist'- accident occurs?)
  • how do you do that for thousands of patients, in real time?
  • how do you get X-ray pictures from the equipment to a secured medical file?
  • how do you support a domain-specific language for an expert database?
Anyway, you will be bored somehow

Fun stuff, no doubt. So, what's REALLY boring? I'll tell you. This is what -I- call "plumbing":
  • software compatibility: "oh, version X.Y.Z is needed for this dll!"
  • environment variables: "BLOB_DIR was not set. How was I supposed to know that?!"
  • defining paths, installing stuff: "a new Windows update is there, you should restart your computer"
  • interoperability: "this very useful piece of java code is supposed to communicate with this VB crap,..."
  • fighting with your programming language to express simple ideas: "look! 6 lines of Java vs 1 line of Ruby"
  • lousy operating systems: "what?! cmd.exe cannot display ANSI codes? What year is this?"
  • fiddling with your UI framework to get what you want: "why on earth is this not resized?"
  • and yes: business software that's repeating itself, either in the concepts or in the implementation,...
In the end, I feel it is very important to track the "boring" parts of software. Either make them disappear, or become very good at dealing with them efficiently, or hire someone to do that for you (why would "boring" have a universal definition?)!

And,... allow yourself to play from time to time.

So relax and enjoy

This is why I appreciate the RubyQuiz. This week's quiz is about a compression algorithm. I had a chance to learn a completely new field for me, without pressure or any of the "boring" constraints. I had everything at hand with a Ruby interpreter (ok, not entirely true: I installed RadRails,...).

Another grid of evaluation for recruitment

Concluding thought. All those points can be considered when hiring someone:
  • comfortable with concepts?
  • has knowledge of the mathematics, algorithmics involved in programming?
  • knows how to deal with scalability/performance issues (both in software and hardware?)
  • is confident with parsing, ASTs, GUI, disk accesses?
  • can produce maintainable code?
  • can minimize the effects of boring activities? (knows how to install a full production system without spending hours on the web?)
Wow, this looks like a whole program that I can apply to myself! And I didn't even talk about project management and people,...

01 September 2006

Ruby Quizzzzzzz

After 3 weeks of vacation (a wonderful place in Greece), I felt I needed some kind of warm-up,...

So I took up the idea of answering the weekly quiz from the RubyQuiz site every week.

This week's quiz was especially fine since it was also simple enough to be used as an introduction to Ruby for our weekly Dojo at work (see here and here).

The subject of the quiz is to have a class that's able to transform days of the week into a compact, human-readable format:
  • [1, 4, 6] should be displayed 'Mon, Thu, Sat'
  • ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Sunday'] should be displayed 'Mon-Thu, Sun'
This looks simple but may be tricky to write properly,...

It was very interesting to take a look at the other submissions after coming up with my own solution:
  • I discovered there was a way to use the Date class to get standard day names
  • You can use infernal text-manipulation tricks to get the job done (short but unreadable to me)
  • No one used Sets or extended the Range class like I did
  • There were only a few submissions creating ranges from an array of successive integers in a readable way (this is where I struggled the most)
  • Not everybody submitted unit tests
  • I was the only one to submit specifications using RSpec
  • After writing down my own solution, I just modified it to take advantage of the "inject" method of Enumerable to make my code more concise (see the sketch below)
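
For illustration, a minimal sketch of that inject-based grouping (not my actual submission; it assumes the day numbers are sorted):

# Group sorted day numbers into ranges of consecutive days using inject.
def day_ranges(days)
  days.inject([]) do |ranges, day|
    if !ranges.empty? && ranges.last.last == day - 1
      ranges[-1] = (ranges.last.first..day)  # extend the current range
    else
      ranges << (day..day)                   # start a new range
    end
    ranges
  end
end

# day_ranges([1, 2, 3, 4, 7]) => [1..4, 7..7], i.e. 'Mon-Thu, Sun'
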
Well, I look forward to doing the next quiz, or even better,... submitting a quiz myself!

02 August 2006

Agility and legacy code refactoring

Let's say you have some bad legacy code (I have yet to see a codebase without any,...). When and how should you refactor/reengineer it?

In an agile context, there are conflicting forces related to that matter:
  • deliver as much value as possible
  • have a unit test for each line of code
  • refactor mercilessly
  • raise your velocity
Real cases from real life

Ok, this is very abstract, let's take some real cases:

1. You need to implement a cache to lower memory consumption (this is the consequence of a customer request) and you realize that the objects you are building are not properly built:

instance = new Model(xmlContent);

Creating a model shouldn't depend on a specific representation format. You'd better use a factory for that. This code should be refactored,... But is it the right time to do so?

2. You found the fix for a nasty bug. Very very nasty, ... But deeply buried in a horrible mess of interactions. You have to write a unit test to correct the guilty class. Easier said than done. A real unit test may require a lot of refactoring, just to untangle the class dependencies.

What should you do? Write an "integration" test? Take time to write a unit test, refactoring everything that's necessary?

3. You have to add a new import/export functionality. However, your persistence design is so bad that implementing the function looks like hell,...

Should you refactor your persistence design? That may halt the team for many iterations. Or should you implement the function, jumping through all those hoops, but doing it in a realistic time?

The books/consulting answer

I would like to point you to an easy answer from a well-known book or website, but I haven't found any. No, wait, there's one! The famous "consultant answer": "it depends,..."

Whose call is it?

Well, before offering an opinion on the subject, I would like to add some more questions: whose responsibility is it to answer? Is it the client's? Is it the team's? Should they bargain over refactoring actions around the velocity issue?

I think,... (drum roll,...) that this is,... (drum roll again,...) a development team issue. Let me list several reasons for that:
  • the team has the responsibility to improve its own velocity
  • the team has the best knowledge on how to do it
  • the team has to live with the code day after day and suffer from its sluggishness
  • really empowering the team also means letting it decide on how to improve its velocity
Now, what can be the policy for the 3 above cases?

Refactorings in the 3 cases

1. The "oh, this should be refactored" case

The pair-programmers should not refactor any code that doesn't get in their way. 3 reasons for that:
  • keep the velocity up,
  • don't risk breaking existing working code,
  • work on that code when really exercising it (so they don't run the risk of over-refactoring)
By the way, this is such a real case that we recently incurred a several-day development penalty for not following this rule (hence the idea for this blog entry,...)

2. The "nasty bug" case

This one is a "must do". You have to write a unit test for your fix, even if it is difficult. If you don't do that:
  • you take the risk of facing the bug again
  • you may write an "integration test" that will slow your test suite and may be very fragile
  • you open the door for your QA team to write more fragile validation scripts
  • you don't show the path to more testability and better design for the guilty class
Having said that, the refactoring can range from a complete reengineering of the guilty class, through merely isolating it, to the sole extraction of the guilty behavior into a special-purpose class.

In that real case, we chose to reduce the class dependencies as much as possible, just to be able to test the fix, and only that.

2. The "new functionality/bad velocity" case

This one calls for a trade-off (the famous "it depends").

On one hand, you cannot spend several iterations on your brand new this-time-I-will-get-it-right design. On the other hand, adding functionality on top of existing debt adds more debt (and we all know about compound interest).

The only duty here is to do at least something. You have to set the path for the reengineering of the offending design: implement a tiny part of the solution and communicate a new vision to the rest of the team.

Then, in any of the 3 refactoring cases, another pair can pick up the new design and push the codebase down the new path. The net result of this strategy is:
  • a velocity that will better reflect the codebase state
  • an investment against more debt if not an investment towards a better velocity
This is also a real case,... here we chose to prepare the path for persistence isolation by introducing several abstract classes above the base class of domain entities to isolate the persistence behavior. Then we have to move some entities to an abstract class with fewer dependencies.

Let the team clean the house

I think it is very important for the team to communicate a lot here. The first case calls for information: "refactor that the next time you have to develop/fix that code".

The second case needs communication around the new concepts introduced to enhance testability, especially if the extraction that has been done is a bit clumsy.

The third case, presenting a compromise and beginning some kind of pharaonic work around the existing code, obviously warrants a lot of discussion.

It's like house-cleaning,...
  • if you don't do it regularly, it can only get worse
  • you can fix a broken bulb more easily if you have ready-to-use tools and spare parts
  • building a new ground-source heating system is a long project, but it will be a nice replacement for your ever-broken old one. In the meantime, you can still add a new remote control, even if it would be easier on the new system.
  • you have more fun when you appreciate where you live!
[Update 03-08-2006]

Ian Roughley has written a very good article on a non-intrusive technique for writing tests for legacy code. Add some logging traces and a JUnit appender!

This is a really nice technique to use when you want to fix a bug and don't have time to break all dependencies to write a unit test (or you can't take the risk: see case n. 2).

You can also use it before refactoring some code, but in any case you'll have to get rid of those tests sooner or later (see my comments on the article) to replace them with "real" unit tests.
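
As a sketch of the idea (assuming log4j 1.2; the test lines at the end use hypothetical names):

import org.apache.log4j.AppenderSkeleton;
import org.apache.log4j.Logger;
import org.apache.log4j.spi.LoggingEvent;
import java.util.ArrayList;
import java.util.List;

// A test-only appender that captures log messages so a test can assert on them.
public class CapturingAppender extends AppenderSkeleton {
    private final List<String> messages = new ArrayList<String>();

    protected void append(LoggingEvent event) {
        messages.add(event.getRenderedMessage());
    }

    public List<String> getMessages() { return messages; }
    public boolean requiresLayout() { return false; }
    public void close() { }
}

// In a test:
// CapturingAppender appender = new CapturingAppender();
// Logger.getRootLogger().addAppender(appender);
// legacyService.process(order);  // hypothetical call into the legacy code
// assertTrue(appender.getMessages().contains("order processed"));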

20 July 2006

Unit testing with style

Writing unit tests looks like an art sometimes. You want to:
  • test your class functionalities
  • test it properly in isolation
  • convey the specification of the class as clearly as possible
Following those objectives, we ran into some issues today that are most certainly classical but not necessarily handled properly.

Setting up an input object hierarchy

Our tested class is a factory that takes an object hierarchy built by a parser and creates another hierarchy of objects.

To make an analogy with an XML configuration file, the first hierarchy is a set of XML nodes and attributes, the second one is a configuration object with specific composed options. The issue is: how do you build the input object hierarchy in a way that's readable and really conveys its expected structure?

The unreadable way would go like that (in Java):

AtomParameter attribute1 = new AtomParameter("attribute1");
AtomParameter attribute2 = new AtomParameter("attribute2");
ListParameter classList = new ListParameter();
classList.add(attribute1);
classList.add(attribute2);

// "class" is a reserved word in Java, hence "klass"
AtomParameter klass = new AtomParameter("class");
ListParameter modelList = new ListParameter();
modelList.add(klass);
modelList.add(classList);


And the readable way goes like:

Parameter parameter = list(atom("class"), list(atom("attribute1"), atom("attribute2")));

By implementing the proper "list" and "atom" functions (returning ListParameter/AtomParameter), the resulting code conveys the expected structure much more clearly. Unit tests as a software specification get more real!
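
A possible implementation of those helpers (a sketch; the constructors are assumed to match the snippet above):

private AtomParameter atom(String name) {
    return new AtomParameter(name);
}

private ListParameter list(Parameter... parameters) {
    ListParameter result = new ListParameter();
    for (Parameter parameter : parameters) {
        result.add(parameter);
    }
    return result;
}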

Testing in isolation

Not a new subject here: we use jMock to isolate our class. However, here's a pattern we've used a few times. Our factory class declares 2 static factory methods:


public static ModelFactory createFactory(Project project);
public static ModelFactory createFactory(Project project, Translator translator);


In that case, the first method uses a default Translator implementation and the second one allows for another Translator, which is going to be our mock object of course. The second method may look like a pure test-only method, but if you think about it, it is just extending the factory API towards more flexibility.
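
In code, the pattern might look like this (a sketch; DefaultTranslator and the constructor are assumed names, not our actual implementation):

public static ModelFactory createFactory(Project project) {
    // the default collaborator used in production
    return createFactory(project, new DefaultTranslator());
}

public static ModelFactory createFactory(Project project, Translator translator) {
    // the injection point used by the tests (with a mock Translator)
    return new ModelFactory(project, translator);
}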

By the way, I would also like to point out, in this entry, one of the difficulties with mock objects in general. There is no way to specify the protocol for using an interface in a central place. If operation1() must always be called before operation2() on an interface, the mock expectations won't really declare that. Moreover, the expected protocol cannot be enforced from any unit test where the interface is called.

Protocol errors will then be caught by integration tests (acceptance tests for instance). The creator of jMock, Nat Pryce, has attempted to get around this, but he ultimately thinks this is too complex to do in Java (see the pragprog mailing list archive).

Expressive assertions

JUnit has blessed us with quite a few assertXXX methods but I have recently followed the advice found in this article, and it really changed the way we use assertions.

For instance, testing our factory goes like this:


assertThat(aFactoryUsing(model1).createInstance(parameters), containsSlots("a", "b"));


  • assertThat is an assertion method from MockObjectTestCase
  • aFactoryUsing() is a method creating a properly set-up class to test
  • createInstance() is the tested method
  • containsSlots() is a method returning a custom Constraint object (from the jMock framework) and doing the necessary checks for the test
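
For the record, here is roughly what such a custom constraint can look like with jMock 1 (a sketch: the Instance type and its hasSlot() accessor are illustrative assumptions, not our real classes):

private Constraint containsSlots(final String... names) {
    return new Constraint() {
        public boolean eval(Object o) {
            Instance instance = (Instance) o;
            for (String name : names) {
                if (!instance.hasSlot(name)) {
                    return false;
                }
            }
            return true;
        }

        public StringBuffer describeTo(StringBuffer buffer) {
            return buffer.append("an instance containing slots ")
                         .append(java.util.Arrays.asList(names));
        }
    };
}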

The nice thing that comes for free from jMock is then the ability to compose custom and standard constraints. For instance:


and(containSlots("a", "b"), not(containSlots("c", "d")))


First-class code

At the end of the day, unit testing with style is not wasted time. If you consider your unit test code as first-class code that must be as factored and expressive as possible, you will really facilitate communication, design and maintenance (especially test code maintenance, which can otherwise be very, very, very expensive,...).

23 June 2006

Feedback from another Dojo

Today's Dojo was interesting, at least interesting enough to keep a trace of it.

The subject

I wrote a small Dojo subject a week ago that was selected by the group. The subject was to have a DSL to describe recipes and to be able to compute some data about them.

Let's take the easy, child-friendly yogurt recipe:
  • Pour one yogurt
  • Add 3 eggs
  • Add 100 ml of vegetable oil
  • Add 200g of sugar
  • Mix all with 300g of flour
  • Cook for 30' at 180° Celsius
The idea is to be able to compute:
  • the list of ingredients with their quantities
  • the cooking time
  • the preparation time
Of course, you can start with simple calculation rules and add more complex ones such as:
  • "Melting 50g of chocolate takes 30'' in the microwave"
You can also play with some grammar aspects:
  • synonyms: Add, Pour
  • complements: in a jar, with a mixer
  • ...
I had several objectives with this subject:
  • Make people collaborate on a quick design session
  • Implement a parser for a DSL
So, people decided to vote for this subject, and we chose to focus on the quick design session and then on some coding.

The Quick Design Session

This session had somebody at the board for 5 minutes, sketching a domain model. We quickly converged on the main classes: Recipe, Action, Ingredient, Quantity and thought about some kind of PreparationTimeEngine that was responsible for calculating preparation time depending on an action and its ingredients.

The acceptance test

We started coding an acceptance test for a very simple recipe: a yogurt with sugar!
  • Take 1 yogurt
  • Pour 10g of sugar
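
The target was something along these lines (a sketch with illustrative names, not the Dojo's actual code; createAction() is one of the factory methods we debated later):

public void testYogurtWithSugarIngredients() {
    Recipe recipe = new Recipe(
        createAction("Take", quantity(1), ingredient("yogurt")),
        createAction("Pour", quantity(10, "g"), ingredient("sugar")));

    assertEquals(2, recipe.getIngredients().size());
}
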
And programming the acceptance test took us the remaining 35' of the session,...

The retrospective

Yes, the result was not tremendous, and I am still thinking: "wow, 5 poor classes, 1 acceptance test = 1.5 hours, not great productivity".

Anyway, I liked this Dojo for what we learned:
  • :-) The quick design session pleased everybody because they really had the opportunity to share a common design. Something to do again "live", because this really accelerates the feeling of common ownership
  • :-) I liked the emphasis on getting the acceptance test as readable as possible and the hunt for local variables that were not necessary
  • :-) I liked the discussion around the necessity to have factory methods: createAction,...
  • :-) There's no better place to make the group jell around fun and quality code
  • :-\ Some pair co-pilots were relatively silent. Their role was taken up by the group
  • :-\ Yes, I still think the productivity was poor and I think this shows plenty of room for improvement for the group
  • :-? We should try to enforce the rule that the group is silent and only the pair talks. The others could take notes, observing the functioning of the pair
  • :-? I wondered about the use of Java for this kind of Dojo. Python or Ruby would be more appropriate and productive
  • :-? It is certainly better to prepare the Dojo subjects, so that the participants can get up to speed in a limited time. I had prepared a written spec with an acceptance test, but ready-to-run acceptance tests would have been better. I think this is even more true with some of the subjects we wanted to do: Bayesian filter for spam, Java plugin framework,...
Now, let's see how the next one goes.

22 June 2006

Responsibilities dilemma

When can you say "too much is too much"?

Let's say you have an object: Customer. Wow, how original! Anyway, please bear with me.

So, your Customer class has some "customer info" responsibilities, but it may also be a "billable customer" or a "prospective customer" (for cross-selling). Hey wait, it can also be a "persistent customer". And so on.

How do you manage all these responsibilities with a classical object-oriented language?

The Horror show

First of all, you could put all the code in the Customer class. Boo-oo! "OO for beginners" tells us it's bad: you end up with a 2000-line class.

or,... you could have only a Customer class containing the customer info and a BillManager that bills a customer, a SellManager that tries to sell him stuff,... No, I'm kidding, let's forget that one too.

or,... you could inherit from a PersistentObject class, derive BillableCustomer from Customer,...

Time to stop the horror show.

Delegation forever

Then you can put everything in the Customer, but have it delegate to other objects. For instance, customer.save() would delegate the real action to a CustomerRepository. This would be the same for other responsibilities.

What about testing? Oh yes, testing,... Ok, you can use your best-of-breed dependency injection framework to inject the proper objects behind finely crafted interfaces.

Right, but what about dependencies anyway!? Your Customer class is still not reusable without a whole bunch of interfaces! What if you don't want your customer to be billable at all?

Ok, you can set the persistence aside by having the Customer clients make calls to an appropriate factory after dealing with the customer. But what can you do with the billing responsibility (no, not the BillManager again,...)?

Object inheritance

Another way, suggested by the Streamlined Object Modeling book, is to use "Object Inheritance". You use the Actor-Role pattern and define 2 classes:
  • the actor: Customer class
  • the role: Billable customer class
A BillableCustomer will have the same interface as the Customer class and will delegate all customer info requests to it. We could say this is a kind of "external delegation", in contrast to the "internal delegation" mentioned before.

When discussing this pattern with a friend recently, I told him that I felt uneasy with one thing (besides the heavy use of delegation, which results in silly code in Java). It was strange for me to have 2 objects representing the same logical entity.

He told me to consider another way to see things: BillableCustomer would be a "decorator" of Customer. If I need a BillableCustomer, I write: new BillableCustomer(customer) and do my job with it. I really like this way of seeing things. There's still one drawback in my mind: if BillableCustomer needs to be persisted, you still have to consider 2 entities for the same logical object in the system.
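
A sketch of that decorator reading (the interfaces and method names are illustrative):

public class BillableCustomer implements Customer {
    private final Customer customer;

    public BillableCustomer(Customer customer) {
        this.customer = customer;
    }

    // customer info requests are delegated to the decorated object
    public String getName() { return customer.getName(); }

    // the role's own behavior lives here
    public void bill(Money amount) { /* billing logic */ }
}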

Dynamic languages to the rescue

In the end, what I would like to do is mix behaviour into an object. This is exactly what Ruby allows me to do: I have a Customer class and I can extend it with Billable responsibilities at runtime:

customer.extend(Billable)

This is the idea of "duck typing". You cannot rely on the object's type alone to know its capacities. If it walks like a duck and quacks like a duck, then it is a duck (even if it is really a swan,...)
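
A minimal sketch of mixing the role in at runtime (module and method names are illustrative):

module Billable
  def bill(amount)
    # billing behavior goes here
  end
end

customer = Customer.new("John")
customer.extend(Billable)  # this object, and only this one, gains the role
customer.bill(100)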

This is certainly closer to "object orientation" than any class-interface-inheritance pattern. If you think about it, objects are often thought of as humans during object analysis. And we, as humans, often learn and forget, gain abilities and lose abilities. Objects should be able to do the same.

A new persistence challenge

But "when is easy, too easy?".

Yes, when it comes to persistence, how do you handle something like that in Ruby? ActiveRecord was not designed with that kind of use in mind.

;-) I have no doubt that Rails fanatics could come up with a fairly elegant solution to that problem.

18 June 2006

Enlightenment, planning and convictions

Sometimes, learning new things is scary.

You've just learned something you find very important, acquired a new skill and you ask yourself:

"How can I have worked without understanding something so crucial during all this time?"

Well, another more useful question usually pops up in my head:

"Why was I not able to pick this up before?". This is certainly more useful because it opens up the possibility to pick it up faster next time!

Recently, I got the feeling that I learned very important lessons about the raison d'être of agile methodologies.

The cards board enlightenment

I remember standing in front of our cards board, contemplating the next iteration planning and the result of our last planning game. I was thinking: "now I know why I was not so effective as a project manager at planning the project's activity" (better late than never!)

Planning, yes but why?

I always had difficulties planning software activity, and was even beginning to think that development was something like the butterfly story introducing any chaos theory presentation: a beat of wings, and milestones fly away.

Many of my attempts to get more insightful plans failed. Here's a set of recurring issues:
  • how can the development planning stay in line with the global roadmap (especially if it is very "ambitious" but also a moving target)?
  • how do you plan bugs?
  • how can you make accurate predictions regarding the delivery date?
  • how do you fight developers' natural optimism ("well, 2 days should be ok for the parser")?
  • how do you plan non-development time (meetings, holidays, illness,...)?
Moreover, every "aggressive planning" I saw was the recipe for repeated failure. And the effects on the team are disastrous: "Whatever we do, we are late". But "late" relative to what? A release date set by a manager and approved by a developer muttering to himself "I'll do what I can". So what's the real use for a planning?

One manager's answer included: "setting a challenge to motivate the developers". This is the "aggressive planning" theory. A really counter-productive one, since it leads to the "We're always late" syndrome, which is soon followed by the "We're poor developers" consequence.

I had my own variant on aggressive planning: "Watch what you do and register time spent". I thought that doing this would drive individuals to be more conscious of "unproductive" time and get rid of it. I was wrong. A developer can chase "unproductive" time only if he has decided to, not if he has to fill in a time report precise to the hour (I have even read about minute-precise time reports, where bathroom time is to be specified!). And then, as a means, he can watch the time he spends.

So I figured out there must be better reasons for planning (instead of the "I develop to my best abilities, you pick the product when ready" approach).

One of the reasons is resources management. If time-to-market is a crucial issue for the product owner, then he may wonder "how many people can I add to the project to get it faster?". Once you keep in mind the Mythical Man-Month essay by Fred Brooks (and this funny anecdote), you can still try to add people to reach the project's optimal size.

One other reason is some kind of control over what is developed. Plan your activity, do the job. If, along the way, the plan is not respected, this may be interpreted as a warning. However, if the meaning of the warning is that your plan was not accurate, you have not gained much.

And maybe the most evident reason is the need to communicate dates to people outside of the project: customers, sales representatives, consultants, investors,...

Planning revisited

After our 6-month effort to implement the full agile practices stack (we already had continuous integration, unit testing, iterations), I now understand why this works:
  • you deliver software in a time-boxed manner and focus on prioritizing customer value, having both short-term control (the iteration) and mid-term control (the milestone)
  • you have a very simple measure for productivity: velocity. You globally measure the capacity of the team to deliver useful functionality. Meetings occur, bugs occur, obscure technical work occurs, but it is not counted as such
  • estimations are more reliable because of the planning game. Indeed, estimating a task is better done by a bunch of people having different experience and knowledge about it (I used to do that one-on-one with the "concerned" developer - which is not a relevant concept in eXtremeProgramming anymore)
  • estimations are also more reliable because of the frequency of the planning games (every 3 months)
By the way, I want to emphasize all the benefits of the planning game:
  • developers have the opportunity to understand precisely the customer objectives
  • the customer can get a better grip on development risks and difficulties
  • the team is really empowered and feels responsible for the overall project
Let the team manage itself

The last point is what struck me in front of the cards board: many brains are better than one.
  • the team is much more efficient at organizing the project than the project manager
  • the team is much more efficient at defining and estimating the tasks
  • the only way to get the maximum productivity is to trust the team members and let them set high-productivity standards. No management indicator can ever imply high productivity. You cannot measure commitment, smartness, collaboration
The consequences for a project manager seem pretty clear to me: you are only here to help the team manage itself.

How to do better next time?

Thinking about that, I concluded 3 things:
  • I should have experimented more. Some agile practices seem evident. Continuous integration is one of them (spend machine time instead of developer time). Some others are not so intuitive: why use physical cards to track development tasks? It is not until you hold one in hand that you can feel why
  • Empower the team. This is the many-brains effect. I also see some relation to Michel Crozier's theory regarding the fundamental liberty of any worker to gain power around him. This also means that team management can never be an easy path
  • I should live up to my convictions. For instance, I was not strong enough in emphasizing why continuous integration should always be green, no matter what (this one I learned by experiment)
Don't give up on your convictions

We all make mistakes and learn more or less painful lessons. However, when we learn something the hard way, we'd better not forget about it. What we learn becomes our convictions, and the remaining questions have to be experimented with.

I recently read a blog entry by an experienced developer analyzing the reasons behind his software projects' failures. The whole entry is worth reading and wondering about. But one specific thing was worth remembering for me right now:

And if you decide to make changes, have the courage to go 100% with your gut. I've failed more than once when I watered down my convictions in order to appease dissenters. The only thing worse than evangelizing change and failing is looking back and realized you might have succeeded if you'd held firm on your convictions. What a waste!

I won't be caught twice on that one.

19 May 2006

More analytics, less mechanics

The opening keynote of ICSTest Düsseldorf 06 was: "Tools: what we really need and how to get it?".

Mechanical tools

The speaker made an important distinction for testing tools: most of them are "mechanical tools".
  • Some of them are just a way to store data and to support a process. They are nothing more than a specialized database.
  • Other ones are dealing with the execution of test cases on different machines, with different environments. They are nothing more than a specialized batch file.
  • The last ones are experts at interacting with gui components to insert and retrieve values. They are nothing more than a specialized keyboard and mouse simulator.
Of course, this is an oversimplification and each of these products is not easy to get right:
  • Test management tools must deal effectively with concurrent modifications of shared data, they must provide nice edit and search capabilities, they must support every possible workflow
  • Distributed test execution tools must help with different hardware and different OSes: re-executing the same C script remotely on 2 different systems is not straightforward, for instance
  • Graphical test tools must allow the creation of test cases independently of the application-to-be and allow a flexible and evolving mapping between the test cases and the graphical components
But still, this is just dealing with plumbing (see my previous post,...). Any serious attempt at managing the complexity of testing today's systems should allow you to:
  • get objective information about your actual coverage
  • refactor test sequences and extract the common parts
  • propagate modifications of the specifications to the test plan
  • analyse test dependencies
  • generate test cases from the specifications
  • propose test strategies from the system typology
  • combine components test cases into system test cases
  • ...
Analytical tools

All these tools would be called "analytical tools". They support the human mind where it reaches its limits:
  • exploration of exponential number of possibilities
  • analysis of complex relationships
  • metrics-based decisions
  • exhaustive application of systematic rules
Why are there not more of those tools?

The "low-hanging fruit" rule

The answer is "the low-hanging fruit" rule.

Most of the time, the mechanical tools are easy to implement and bring immediate return on investment. This is why big companies such as Mercury or Borland focus on "process" tools and their integration. Moreover, those tools appeal to the common manager: they tend to give the impression that everything is under control.

You can use TestDirector and feel that you can really manage a fine test plan covering your requirements. But how do you know you've written the right test cases? How do you know you got the minimum number of them for the best coverage?...

By the way, there is a similar argument for CMM process improvement. You may set up reviews and yet do terrible ones, you may have source control configured and yet create unnecessary branches, and so on.

In the end, one of the talk's conclusions was that innovative companies should be supported, to get greater benefit from specialized analytical tasks. The answers won't come from the big corps.

Mercury and co. will provide mechanical tools to support your mechanics, innovative companies will provide analytical tools to support your brain!

18 May 2006

Back to language roots (and to plumbing once in a while)

I have been musing a lot about computer languages recently.

Parsing Ruby code

First of all, I needed to be able to analyze some Ruby classes and extract part of the operations' behavior. For instance, I needed to parse the following code:

if (p1 == 2 && p2 == 1)
  @attribute = 3
else
  @attribute = 4
end

I wanted to extract something like:

p1==2 && p2==1 'implies' @attribute = 3
and
not(p1==2 && p2==1) 'implies' @attribute = 4

So I started to write my own parser, which would take the Ruby class file and:
  • find the class definition
  • find the operation definition
  • parse the if-then-else expressions (which can be nested,...)
  • ...

There's got to be a better way!

A better way

Google was my guide to a super Ruby library: ParseTree (http://rubyforge.org/projects/parsetree).

ParseTree takes some Ruby code and returns a "sexp" (symbolic expression) that represents the program being parsed. For instance:

def example
1+1
end

becomes
[:defn, :example, [:args], [:call, [:lit, 1], :+, [:array, [:lit, 1]]]]

Then, the ParseTree library offers a SexpProcessor class that allows easy consumption of the sexp.

That's the theory. A programmer's usual practice is less shiny:
  • I also had to download RubyInline, a library that allows C code to be compiled and then called by Ruby code
  • I had to let RubyInline compile the ParseTree C code, which took me some hours to tweak right, from modifying part of the ParseTree C code to modifying the RubyInline compilation command so that it works on my Windows laptop (the ParseTree/RubyInline folks don't seem to be willing to live with Microsoft around). If you encounter the same difficulties, send me a mail and I'll try to help you
When I do that, I really feel like a computer plumber; there are so many more interesting things to do with a computer! Anyway.

Then I realized that the trip wasn't over: parts of the sexp had to be translated back to Ruby code again!

Back to the roots of programming

This is where I found (or more exactly refound) Paul Graham's articles on Lisp (http://www.paulgraham.com). I was really fascinated by the data <-> code equivalence offered by Lisp. The syntax is simplistic and the code is already expressed as a syntax tree!

The funny thing is that the first language I was taught in engineering school was Scheme, a Lisp dialect. At that time, I mostly saw the power of recursion, but not this idea of extending the language itself with macros, and so on.

One more funny thing before returning to Ruby: Lisp was not invented as a new language, but rather discovered, as an experiment to find an axiomatization of computation other than Turing machines (John McCarthy, 1957!, see Paul Graham's article).

From Ruby to sexp to Ruby again

Anyway, back to Ruby: the idea is to use another Ruby library, Ruby2Ruby (in the zenhacks gem), which should do the trick. I have not yet finished the round-trip experiment, but this should do it. The idea behind Ruby2Ruby is to implement most of the Ruby language as Ruby code, leaving only a few primitives translated to C. This provides some interesting lines of code in the library tests, check it out:

r2r2r2 = RubyToRubyToRuby.translate(RubyToRuby).sub("RubyToRuby","RubyToRubyToRuby")

Good luck with that!

A mini-language for acceptance testing

Today, I wanted to write acceptance tests for our generation algorithms. The trouble is that our algorithms explore a tree of possibilities based on the system behavior. What I would like to do is:
  • specify a pattern of possible behaviors
  • have some Java code generate the system behavior based on the pattern
Not clear? Let's say I have the following pattern:

[a*|b]*cd. This would mean that, if my system prints a, b, c or d letters when stimulated, then the printed string must follow the [a*|b]*cd pattern.
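
Note that checking a printed string against such a pattern is the easy half: java.util.regex can already do it, as this sketch shows (assuming the square brackets denote grouping). It is generating behaviors from the pattern that calls for a real parser.

import java.util.regex.Pattern;

// "[a*|b]*cd" transcribed to java.util.regex syntax:
Pattern behavior = Pattern.compile("(a*|b)*cd");

behavior.matcher("aaabacd").matches();  // true: an acceptable trace
behavior.matcher("cda").matches();      // false: a rejected trace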

So, rolling up my sleeves, I first tried to implement a parser for those types of expressions. Again, implementing my own parser? There's got to be a better way!

A better way, revisited

I thought about JavaCC and Antlr, and then I recalled an article about JParsec. JParsec is a port of the Parsec Haskell library. The main difference between JParsec and JavaCC or Antlr is that it is not a code generator: you do not feed it a grammar and get back a parser for your (mini?)language.

So I tried to define my mini-language parser with JParsec. Unfortunately, I got stuck by the lack of available documentation (the codehaus server was down all day long, and it still is). At the end of the day, it looked like I was back to plumbing again, having a library and using it blindly, by trial and error.

Perseverance, mini-languages are a must-have

I should collect writings on the qualities needed to be a good programmer. Perseverance seems to be one of those. It would be very tempting, in my situation, to give up on the JParsec trial and write my own parser. Or to give up on the whole mini-language idea and write ad-hoc acceptance tests.

However, I feel that perseverance here is important. Mastering the creation of mini-languages is such a powerful tool in your toolbelt.

Just as the best way to leverage assembly language was to create a programming language, the best way to leverage a programming language is to create mini-languages that are adapted to your domain.

Build up on your own language

Related to parsers and the use of mini-languages, I would add a concluding thought: every computation should be done within your programming language.

This is why I like Ruby and Rails.

This is why I don't like Java AOP: when you use Java AOP, the annotation syntax is specific, and you have an extra "weaving" pass in your development process.

In the end, this is where Lisp may be leading us (or only me?): parsers and code generators should be included in the basic toolkit of any decent programming language and every programmer should master them.

20 April 2006

Funny recruitment ad

We were writing a recruitment ad yesterday and were thinking about the kind of skills and experience we wanted for 2 senior developers.

My colleague showed me the web site of a guy presenting himself as Erlang code. This was really funny, and I had the idea of writing acceptance tests for the ad!

It went like that:

public class CandidateAcceptanceTest extends TestCase {
    Developer candidate;

    public void setUp() {
        candidate = new Developer();
    }

    public void testTechnicalSkills() {
        assertTrue(candidate.isJavaExpert());
        assertTrue(candidate.knowsJDO());
    }

    public void testMethodologySkills() {
        assertTrue(candidate.canTeachAgileMethodologies());
        // ... and so on
    }
}

We were even arguing about how to write the acceptance tests that would best describe the "optional" skills: the ones that were desired but not mandatory.

The result was indeed quite interesting because I thought: the guy (or gal) that will read this ad, understand it and like it should be on the right track for the job!

Dynamic languages are scary?

I noticed a post on the Ruby on Rails mailing list that caught my attention today:
> I somewhat agree with you, but I recently
> realized I don't even know *why* I feel that way. Is it because
> dynamic scripting languages, by being way less chatty than something
> like Java, somehow feel more fragile? I seriously wonder.
It was about implementing financial transaction systems with Rails.

This thought caught my attention since I have been programming a lot with Ruby in my free time recently (in an attempt to prototype some ideas that I found revolutionary for our product).

While programming, I often had the impression that I might accidentally misname a variable or a method and that my program would fail at run-time. No more compile-time safety. This is like jumping from a plane without knowing if your parachute will open at the right time. And this really made me feel uneasy.

Think about it: you program in Java with Eclipse or whatever advanced IDE. Every class name or method name can be safely completed, searched for, refactored,... This was a real advance for day-to-day programming. With Ruby, or Python, this is all gone. Scary, isn't it? How can you build large systems, knowing that a single mistyped name could bring the whole system down without you even knowing it right now!?

I was indeed scared! However I could also recognize that I had been able to program a quite complex infrastructure in Ruby that was doing the job! So, what are the keys to more confidence in dynamic languages?

The keys to gaining more confidence are interesting, since they are in any case the foundation of better development practices.

TDD virtuous circle

I have been practising TDD on this project, checking my coverage (with rcov) all along, and pausing long enough to consider refactoring each time the tests were green.

So the first key is "start the virtuous circle of using TDD". You can start using TDD simply because you're afraid of breaking anything, anytime. The added benefit is a Test-Driven Design.

Ruby is your best refactoring friend

The second key is "use Ruby's capabilities to refactor to the maximum". Ruby on Rails has really shown that "less is more" and that duplication can be driven down to the limit. Most of this magic is possible thanks to Ruby. Ruby helps factor code well beyond what can be done with Java (without AOP and all that metadata stuff). The added benefit is readable code that aims at representing the domain and nothing more.
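
As an illustrative sketch of that kind of factoring (the class and names are invented, not taken from my actual code), a single loop can generate a family of near-identical methods that Java would force you to write by hand:

Entry = Struct.new(:period, :amount)

class Report
  def initialize(entries); @entries = entries; end

  # generate daily_total, weekly_total and monthly_total in one loop
  [:daily, :weekly, :monthly].each do |period|
    define_method("#{period}_total") do
      @entries.select { |e| e.period == period }.
               inject(0) { |sum, e| sum + e.amount }
    end
  end
end

report = Report.new([Entry.new(:daily, 3), Entry.new(:daily, 4)])
puts report.daily_total   # => 7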

Good engineering is here to stay

The third key is "use your object and architecture skills". Whatever the language, if you design your system carefully, with encapsulation, delegation over inheritance, layers, patterns,... you should end up with a system where undesired effects cannot ripple too far. The added benefit is a maintainable and evolvable system.

Fragility turned into strength

So basically the lesson here is that the fragility of Ruby (et al.) practically forces you to sign up for sound development practices. You get more confidence in your development, and the added benefit is that you develop faster!

01 April 2006

Fitnesse and Rails : bridging the business/technology gap

So far I have been quite enthusiastic about two frameworks: Fitnesse and Rails.

Fitnesse: executable specifications for the developers

Fitnesse is a framework that allows you to create HTML specification pages in a wiki. Those specifications can include any form of text, images and HTML links. But they can also include tables that specify acceptance tests for your system.

Then, provided that a developer (or better: a pair of developers,...) has written a "fixture", that is, a Java class that implements the table, the HTML page is executable! Once executed, the page shows expected versus actual results, displayed with the famous green/red colors.

I see many advantages to this approach:
  • The HTML pages look like a classical specification to the analyst
  • Having the analyst specify expected values very precisely results in much better specifications
  • This makes the analyst and the developer discuss the requirements and their possible implementation very precisely (when creating the fixtures)
  • This forces the analyst and the developer to think about the testability of their system, which usually results in better designed systems
  • This is the start for a global Test-Driven approach to system development
There are still some points that could be improved:
  • The wiki markup language itself is weak and would really benefit from something like Confluence (if the Atlassian guys hear me,...)
  • The integration into a continuous build system is not standardized (with an ant task for instance)
  • The implementation of some concepts is not first-class (try to add something like ${myVar} in a table,... Fitnesse will try to evaluate it, even if it is in fact part of your specification)
Anyway, this is the first time I have come across specifications that are really useful and not out of date, which is quite uncommon in the industry.

Rails: a bliss

It is hard not to come up with glorious adjectives when speaking about Rails. I have never specifically been a web application developer who knows all the nitty-gritty details of web application development: URLs, sessions, security, HTML specifics, JavaScript,...

And Rails makes it so easy. Each time I asked myself "how can I do that?", I found an elegant solution in the Rails framework. And when I don't find one, I think: "I have to search harder,..."

Of course, part of the success of Rails is due to Ruby, which enables beautiful code with a minimum of syntax. For instance, I very easily refactored some test code by intercepting calls to my test methods in order to assert the same thing in several methods.
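
Here is a hedged sketch of what such an interception can look like (a reconstruction for the blog, not the actual project code): the method_added hook wraps every test method as it is defined, so that a shared assertion runs at the end of each one:

require 'test/unit'

class Widget
  def valid?; true; end
end

class WidgetTest < Test::Unit::TestCase
  # wrap each test_* method so the shared invariant check
  # runs at the end of every test
  def self.method_added(name)
    return unless name.to_s =~ /^test_/ && !@wrapping
    @wrapping = true
    original = instance_method(name)
    define_method(name) do
      original.bind(self).call
      assert(@widget.valid?, "shared invariant violated")
    end
    @wrapping = false
  end

  def setup; @widget = Widget.new; end

  def test_creation
    assert_not_nil(@widget)
  end
end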

There are 3 other reasons in my mind:
  • the famous "convention over configuration". Having nothing to do when implementing the default/most common behavior is productivity at its maximum
  • the dynamic aspect of the language: you don't have to redeploy to see the changes
  • the people who designed the framework really know about the specificities of web applications. And they did their utmost to leverage Ruby and the DRY principle (Don't Repeat Yourself) to reuse this knowledge
All this provides a very fast track for implementing business rules and applications.

I really wonder how successful Ruby and Rails are going to be in the next few years. Certainly Java won't die (the COBOL of the 21st century), but what I know for sure is that Ruby and Rails are really fun and productive to program with.

Those are two really strong arguments for people with square fingertips!

02 March 2006

Relationships as first-class citizens

Here is a link to a research project for having relationships as first-class citizens in Java:

http://www.cl.cam.ac.uk/%7Eaw345/talks/ecoop05.pdf

This is not a new idea; the article mentions another one dating back to 1987:
  • J. Rumbaugh. Relations as semantic constructs in object-oriented development. (OOPSLA 1987)
This is for sure a topic that will make its way into a programming language someday. However, the above article indicates that this is a tough subject and that many of the consequences of having relationships as first-class citizens are not yet mastered: inheritance, multiplicities, construction,...

I would bet that adding this kind of functionality via a mixin in Ruby could be a great way of getting the details right, in the manner of what has been done for ActiveRecord in Ruby on Rails.
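
As a deliberately naive sketch of what such a mixin might look like (my own illustration, not the paper's design; multiplicities, inverses and the other hard points are ignored here):

module Relationships
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # a "relation" macro in the spirit of ActiveRecord's has_many
    def relation(name)
      define_method(name) do
        @relations ||= {}
        @relations[name] ||= []
      end
    end
  end
end

class Author
  include Relationships
  relation :books
end

author = Author.new
author.books << "Refactoring"
puts author.books.inspect   # => ["Refactoring"]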

28 February 2006

Techno-thriller and change management

The oath is broken. I wanted to add a new entry in this blog at least every 15 days, and failed miserably for one month.

Let's make up for that with two light subjects in one entry.

What if we were trapped in the energy shortage?

Michael Crichton has just released (at least in France) a new book in which eco-warriors trigger massive natural catastrophes in order to make the world join their cause. I don't feel very sympathetic to what appear to be Mr. Crichton's views on the climate (from what I know of them); however, there is an issue I have been thinking about that would be interesting to write as a novel,... or maybe to take more seriously!

In two, three or four decades, we'll be short of oil. In 60 years or so, we'll be short of gas. Uranium won't last much longer.

Ok, let's develop new technologies, new energies,... But how do we do science, research, technology? With computers, labs, cyclotrons,... All of these are pretty energy-intensive (and water-intensive, but that's another debate). What if 60 or 80 years were not enough to find any viable solution to the energy crisis? Our scientists would be back to wood and stones, trying to find the super energy of the future? But how could they do that in a world that would certainly be a complete mess? You got it.

This could be interesting as a novel, but I would personally prefer a public announcement in the next 10 years saying that a major breakthrough has occurred in cold fusion,... Otherwise, how the hell would I keep computing until my fingers are crippled? I really plan on living 60 more years!

You can't change the world alone

I am currently learning an interesting lesson. Since my arrival at my current company, I have introduced several practices, both for development and for management:
  • TDD
  • pair-programming
  • feature-driven iterations
  • more participation from the group in development decisions (let people decide what they want to do, as long as what needs to be done gets done)
So far there have been some successes, and some areas could really be improved. For instance, our Java team was only moderately conscious of the absolute need for a permanently green unit-test bar:
  • in one month we had 426 builds (99 OK / 327 KO); the maximum of OK builds in a row was 12, the maximum of KO builds in a row was 126
  • the whole "unit" test suite takes 1 hour to run (for 1800 tests)
These two examples are clear signs that true TDD is not practised around here. Among the many causes, I would list my lack of success at making people change their practices.

The white knight

We recently hired one of the best developers I have ever met, so that he could take over some, if not all, of my activities as a tech project leader. I had a second objective in hiring him: I knew that he would be good at reinforcing the messages that I had been somewhat unsuccessful at conveying. He was my white knight,...

I really needed him: two are better than one, and a newcomer with a different background is even better. We know software development from different shops. Together, this guy and I are much stronger at convincing people that there are good practices that should become mandatory.

And we feel that this is not enough. The "planning game" can also be improved. So we are hiring an "agile consultant" to help us start the new release the right way. One of my personal objectives for this year is to get perfect consistency between our roadmap, our iterations and our day-to-day development. This is a difficult task. So we need our "white knight",...

Feel it with your guts

Having a white knight is great for triggering change. It can give legitimacy to your ideas and better ways of introducing them. But if you want good development practices to survive your presence (something I have been missing these past months), you need to make people feel them in their guts. They should suffer atrocious stomach aches each time they develop a piece of code without any test for it, for instance.

How would you do that? We plan to introduce a Dôjô (now I know why there are '^'s on the 'o's, thanks to my Japanese lessons,...) to make people practise TDD the extreme way. If the drug is good enough, they should be hooked forever.

[ah, good to be back at work after a really pleasant vacation week ;-)]


22 January 2006

The hard piece - solved

What can I do as a kata tonight? Oh yes, the unsolved problem from this week's Dojo, of course!

Well, it was not unsolvable at all; you just have to cool down and write things down step by step.

The solution

The idea I had that evening is that finding the position of the upper left corner of the ith subgrid in a grid should be no more difficult than finding the position of the ith square in a grid. Just treat the big grid as a small grid whose squares are the subgrids.

So here is the code (perhaps clearer than my previous "explanation"?):

# returns a square position (x, y) for the upper left corner
# of the "grid_nb"th subgrid in a grid
def get_subgrid grid_nb
  # treat the big grid as a (width/sub_width) x (height/sub_height)
  # grid whose squares are the subgrids
  square = Grid.new(@width / @sub_width, @height / @sub_height).get_square grid_nb
  x = 1 + @sub_width * (square.x_pos - 1)
  y = 1 + @sub_height * (square.y_pos - 1)
  Square.new x, y   # return the position as a Square, like get_square does
end

and

# returns a square position (x, y) for the "square_nb"th square in a grid
def get_square square_nb
  x = 1 + (square_nb - 1) % @width   # column, 1-based
  y = 1 + (square_nb - 1) / @width   # row, via integer division
  Square.new x, y
end

OK, and for the crazy ones, the formulas that precisely calculate the position (x, y) of the upper left corner of the "grid_nb"th subgrid in a (w * w) grid with (subw * subh) subgrids (for instance, you can have a 6x6 grid with six 2x3 subgrids):

x = 1 + subw * ((grid_nb - 1) % (w / subw))
y = 1 + subh * ((grid_nb - 1) / (w / subw))    (using integer division)
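
A quick sanity check of these formulas on the 6x6 example (a little script written just for this post):

w, subw, subh = 6, 2, 3
(1..6).each do |n|
  x = 1 + subw * ((n - 1) % (w / subw))
  y = 1 + subh * ((n - 1) / (w / subw))
  puts "subgrid #{n} starts at (#{x}, #{y})"
end
# => (1, 1), (3, 1), (5, 1), (1, 4), (3, 4), (5, 4)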

No wonder we couldn't get it right at once! Indeed, it took me slightly more than an hour to do it, with 5 tests and 31 assertions in Ruby.

TDD as an unconscious habit

Thinking about the discussion I had about TDD after the Dojo, I feel that TDD should really become an unconscious skill. You have to train yourself consciously in order to eventually master it without thinking about it. You know the classical learning circle:
  1. Something I don't know I don't know - unconscious lack of skill
  2. Something I know I don't know - conscious lack of skill
  3. Something I know I know - conscious knowledge
  4. Something I don't know I know - unconscious knowledge
Regarding TDD, the specific thing that should become unconscious is the exact rhythm of what you are doing (and the color of the bar while you are doing it):
  • defining test cases - green bar
  • implementing test cases - green bar
  • implementing code to pass the test cases - red bar
  • refactoring code - green bar
  • refactoring test cases - green bar
Practising in the Dojo with an attentive facilitator, or pair programming with a merciless mate, is a great way to acquire the habit.

19 January 2006

Back from the Dojo

Gureto!! ("Great" in Japanese)

The Dojo we had yesterday evening was very instructive.

The place

We were 11 people from various backgrounds, sitting around a computer and a keyboard hooked to a video projector.

The master and the rules

The evening was really well run by the MC. He told us the rules:
  • a recap from last Dojo session
  • a brainstorming for the subject
  • a vote
  • several "randoris" of 5 minutes each
  • pair-programming: the co-pilot becomes the programmer when the randori is over, and a new person becomes co-pilot
  • TDD is mandatory and should be followed by the book
  • watchers are only allowed to step in to ask for explanations or to point out that the current development is out of context
  • no more than 2 hours in a row (an hour and a half is common)
  • the aim is not to be productive but to learn as much as possible
The session

First of all, we had the choice of the language: either Java or Python (my coworker is also proficient in Prolog, but she was the only one). I voted for Python, first because I wanted to taste another dynamic language besides Ruby, but also because I feel this kind of language is pretty well suited to katas.

The Kata

The kata we chose is an idea I proposed from Cédric Beust's blog: solving a Sudoku grid.

First of all, we started from a list of tests that we wanted to pass. And we lowered our ambitions from that point on,... So we started with a 4x4 kids' Sudoku.

Our first objective was to find the solution when only one square was incomplete.

All in all, something like 16 unit tests were implemented and everything went at a smooth pace. Until the hard part!

The hard part

When you learn martial arts, there are times when you feel at ease with the movements and there are times when something seems unnatural, uneasy. Then you have to struggle with yourself to feel the right way to do it. Ok, enough for the metaphor.

Everybody in the room stumbled on one algorithm that we could not get straight. Even with such brilliant minds around the table. We wanted to implement a function that would return the upper left corner of the ith subgrid of the Sudoku.

This may sound easy for a 4x4 grid but, I don't know why, we were mad enough to try to solve it for the general case:
  • a suduku of NxN
  • with subgrids of WxH
We fiddled with /, %, +1, *W,... but nobody could either code it or explain the right algorithm to the others in the room.

I was pretty disappointed myself, as I was too tired to get it right, and I felt there was a really smart solution to the problem (something like a fractal approach).

So I think this will be the subject of an upcoming kata and I will post the solution if available.

Cooling down

Eating a delicious home-made tiramisu, we exchanged views on the experience. I took great pleasure in exercising TDD with a purist as MC (I will certainly blog about that later. For instance, why would you need to execute the tests when you have just written a test with a missing method? You know that it will fail!).

I think this kind of session is the real way of teaching the practice to a group of programmers.

We also had a discussion about the waterfall style of development, thanks to Laurent Bossavit (the MC), who told us a funny thing:
this article was an inspiration for the great "V cycle", which spread so widely afterwards,...

13 January 2006

Roman code kata: MCMLXXIV => 1974

I want to share a small code kata I did tonight.

The kata

Transform a Roman numeral, "MCMLXXIV", into a decimal number: 1974. This kata is taken from the French XP website (here).


What! Ruby is not perfect?!

Another example of DiD,... In my implementation, I wanted to get access to all the characters of a string: "CIV" to ["C", "I", "V"].

So I grabbed the Ruby doc and, after a few searches on the web, I realized that no nice method such as String#chars or String#each allows you to get the characters as an array (here's a ticket for that issue).

I had to resort to my_string.split(//), which is not exactly natural and expressive.
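
In the meantime, nothing prevents you from adding the missing method yourself (a small sketch; reopening String like this is, of course, a judgment call):

class String
  # the method I was looking for: "CIV".chars  # => ["C", "I", "V"]
  def chars
    split(//)
  end
end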

TDD of course

I used TDD to progress with my development:

assert_equal(1, r_to_i("I"))
then
assert_equal(2, r_to_i("II"))
...

and I was quite glad to get the right code fast enough. Ok, agreed, it is not rocket science. But small successes make a happy day.

Lessons learned

The interesting thing about code katas is taking the time to consider what has been done and how it has been done. Specifically, I wondered about the possible use of an injector (Ruby's inject) to reduce the code (which was a question on the web site):

# the lookup table the code relies on (Roman digit values)
T = {"I" => 1, "V" => 5, "X" => 10, "L" => 50,
     "C" => 100, "D" => 500, "M" => 1000}

def r_to_i r
  result = last = 0
  # walk the numeral from right to left: add a digit when it is >= the
  # previous one, subtract it otherwise (the C in MCMLXXIV, for instance)
  r.reverse.split(//).each { |n|
    T[n] >= last ? result += T[n] : result -= T[n]
    last = T[n]
  }
  result
end


[Funny thing: while copying and pasting the code onto the blog, I found several ways to reduce the code size. However, I feel a comment on the algorithm is necessary. It may not be understandable at first sight,...]

But I don't see how it is possible, since injectors repeatedly accumulate the same operation over the elements of a collection. Anyway, it was a good pretext to re-read the paragraph on the subject in Programming Ruby.
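
For completeness, here is a hedged sketch of how inject might still apply: make the accumulator a pair holding the running total and the previous value, instead of a single number (this reuses the T table from above):

def r_to_i_with_inject r
  r.reverse.split(//).inject([0, 0]) { |(result, last), n|
    value = T[n]
    [value >= last ? result + value : result - value, value]
  }.first
end

puts r_to_i_with_inject("MCMLXXIV")   # => 1974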

Ok, back to hiraganas now (I am currently learning Japanese,...)

11 January 2006

DiD : Devil in the Details,...

A new acronym that can be encountered in so many workplaces, sometimes even literally! (Bwah, ah, ah, ah,...)

Time for a follow-up. I was pretty satisfied with my idea of implementing associations as first-class objects in our object model. It truly serves its purpose: fine-grained management of the relationships between objects: deletion, updating, adding,...

However, I would also like to summarize all the difficulties we met along the way, which is not always paved with flowers:

Transient/Persistent objects

We have some objects that are built only for the duration of the application session. Those objects are displayed to the user but are not meant to be persisted. Yeah, right. But hey, any time you are associated with a business object, an underlying Observer-Observable relationship is built. It's fine until the business object is persisted and JDO kindly applies "persistence by reachability" to persist anything connected to it. Bang! A transient object gets persisted,...

Modification impacts outside the aggregate

With our framework, each time marketing adds a new ProductRelease to the Product, the Catalog can be notified. All this stays in the same hierarchy of objects. But what if a corresponding new ManufacturedItem should be created in the Factory?

Losing my memory

Objects are cool, objects are fine and RAM is cheap,... but not infinite! Adding new objects to manage the relationships also adds more data, and we had surprises with our memory consumption.

Redundant associations

Every relationship is managed as an association, even redundant associations that are only there for design reasons. As we also use our framework, in a very generic way, to create configuration files, we had never-ending instability each time a new association was added, even when the business did not evolve.

Caches

A variant of the previous one. How should a cache of objects be managed? As a composition between the object owning the cache and the cached objects? Can the cached objects appear in compositions even if they are indeed shared?


Tackling the issues one by one

Ah the details,... They can truly turn any "brilliant idea" into "useless crap". Our current answers to those issues are:
  • Transient/persistent: no obvious solution. Delete the objects when they are not used. Persist them temporarily if necessary. We are waiting for our vendor to implement JDO 2.0 before implementing an elegant solution.
  • Modifications outside the aggregate: use the ubiquitous EventBus
  • Losing my memory: some attributes on the associations were unnecessary, and removing them had a nice impact. However, since we have no way in Java to know who is using a specific object, we need to add extra pointers
  • Redundant associations / caches: this is a real case of "don't mix the business model with the design model". We redesigned our associations with more meaningful data. Is it a design association? Is the component object cached? And so on.
So far, so good: we haven't found a vicious little problem that could bring our framework down,... yet!

One more word: I wish we had done all that in Ruby! Take a look at the ActiveRecord framework:
class Firm < ActiveRecord::Base
  has_many :clients
  has_one :account
  belongs_to :conglomerate
end
or
class Account < ActiveRecord::Base
  composed_of :balance, :class_name => "Money",
              :mapping => %w(balance amount)
  composed_of :address,
              :mapping => [%w(address_street street), %w(address_city city)]
end
It seems so clean :-O