08 July 2008

My first ICFP contest, Part 1: ten days to learn Haskell

I wanted to do that since I read Tom's Moertel post on his participation to the ICFP contest 2001. I Told my wife: "One day, I'll participate to the ICFP contest" (head slightly tilted up with a brilliant and determined look in the eyes).

One day, I'll participate to the ICFP contest

Fortunately, that day came quick as I met, last month, other geeks in Tokyo ready to get into the race. I should say über-geeks when I realize the gaps I have in my knowledge of some fields of computing compared to them. System and network programming being only the emerged part of the iceberg.

I met them for the first time during the monthly meeting of the Tokyo Society for the Application of Currying (try to explain that to your grandma). That night we decided that we would form a team for the next ICFP contest. Then, we spent the next week discussing the team name, that was the difficult decision. The easy one was the language we would use. What is THE language of choice for functional hackers? Haskell of course! [flaming starts here]

But how do you learn Haskell in 10 days and hope to be able to contribute at least a little?

Learn Haskell in 10 days!

Actually, I've reading about Haskell for a while now, starting with the very enjoyable A History of Haskell: Being LazyWith Class. From there and working with Scala for the past year, I had understood the following:
  • the "functional" approach vs the "sequential" approach
  • immutable data structures
  • functions as first-class objects
  • lazy evaluation
  • thinking in types
  • thinking in higher-ranked types
  • Options / Maybe
  • List operations: map, folds,...
  • pattern matching
  • type classes / traits
This makes an impressive list of language features but I must say that Scala is a really "transitional language" for Java programmers towards Haskell. Riiiight. I "understand" it but how well can I program within 10 days?

The laundry list

Using a programming language always involves a wealth of down-to-earth considerations worth getting right as fast as possible. Here is my list and how I organized my environment:
  1. the compiler: I used GHC and the Hugs interpreter for Windows
  2. the editor: I used the Functional plugin for eclipse, providing editing, compilation, creation of an executable
  3. the librairies: I downloaded / compiled and installed QuickCheck2 and otherwise used the bundled libraries with GHC
  4. code coverage: hcp (not really indispensable for a beginning but a nice to have for a non-strict language. I was happy to eventually get it to work)
This is clearly non-optimal yet (and I wonder if it can be on Windows?) and I'll certainly explore:
  1. a better editor: considering what can be done in Java, it's pretty amazing that there is no high-class editor for Haskell. I'll try Yi of course, but it's still in its infancy. "Rename function" or "Rename data type" should be pretty accessible (provided enough open-source or company support, that is,...)
  2. a build/test/continuous integration system. The guys from Starling Software have their own system for that matter (QAM) and I haven't seen yet such a standard thing for Haskell.
  3. more libraries, libraries management: I still haven't done it on windows but I plan to install Cabal and download the whole Hackage.
  4. Switching to linux forever. I'll certainly cover this in part 2 but Windows seems less and less a decent alternative for hackers (I didn't say developers ;-) ).
Yes, all set-up! I can start my first tutorial!

No, I'd rather do a code dojo first

Why bother using a tutorial when I have a nice little programming exercise at hand? I did it in Java, I did it in Ruby, I did it in Scala, it was time to use Haskell.

Basically the question is: write a program to transform Roman number back and forth to Integers: unromanize("MCMLXXII") == 1972.

Here is my solution with Haskell:

-- definition of the roman letters with their numerical value
romansList = [("I", 1), ("IV",4), ("V", 5), ("IX", 9), ("X", 10), ("XL", 40),
("L", 50), ("XC", 100), ("C", 100), ("CD", 400), ("D", 500),
("CM", 900), ("M", 1000)]
-- numerical values only
values = map snd romansList
-- letters only
letters = map fst romansList

-- return the value corresponding to a letter
value :: String -> Int
value c = findWithDefault 1 c (fromList romansList)

-- return the letter corresponding to a value
letter :: Int -> String
letter i = findWithDefault "I" i reversed
where reversed = fromList(map switch romansList)
switch = (\(x, y) -> (y, x))

-- to get the value of a RomanString, start
-- from the end and accumulate the value of each letter
-- substract the value of the current letter if its value is
-- inferior to the previous one
-- ex: "XIV" -> 5 (for "V") - 1 (for "I") + 10 (for "X")
unromanize :: String -> Int
unromanize s = fst(foldr add (0, 0) s)

add :: Char -> (Int, Int) -> (Int, Int)
add c (total, lastElem) =
if (lastElem <= val) then (total + val, val)
else (total - val, val)
where val = value [c]
-- [c] is the Char 'c' as a String

-- return the RomanString corresponding to an Int value
-- start with an empty string and the value to convert
-- use the romanizeString function to compute the RomanString

romanize :: Int -> String
romanize i = fst $ romanizeString ("", i)

-- if the remainder is 0, there is no more conversion to do
-- otherwise, try to substract the biggest possible RomanChar from the remainder
-- this RomanChar is then appended to the current result
romanizeString :: (String, Int) -> (String, Int)
romanizeString result@(res, remain) =
if (remain == 0) then result
else romanizeString(res ++ (letter maxi), remain - maxi)
where maxi = maxValue remain
maxValue r = last (takeWhile (<= r) values)

And here is a simple QuickCheck properties that I wrote to check the conversion:
prop_unromanize_romanize :: Int -> Property
prop_unromanize_romanize x = x >= 0 ==> unromanize(romanize x) == x

First impressions

Working on that simple example (but also on string generators to check the reverse property romanize(unromanize) -- much more tricky) led me to the following observations:
  • programming in Haskell really "feels" pure and elegant. There is an economy of symbols which is quite remarkable for such an expressivity
  • not having variables feels like programming in a straight jacket when coming from an imperative background! But the feeling disappears quickly
  • the "functional/data" orientation makes a whole different mental model when you come from an object-oriented perspective. It looks like your objects, encapsulating their data and showing only their business logic have been turned inside out and methods carefully separated from data.
  • browsing the libraries feels like there a huge drive towards abstraction in Haskell. Nowhere else I've seen concepts decomposed so that Monoids would emerge and be reused
  • new language, new librairies. It really takes time but Haskell libraries are full of jewels which are really worth knowing inside out (the same can certainly be said from many languages)
  • although the Prelude contains a lot of useful functions, some facilities you may find in scripting languages are missing. We recently discussed the lack of a "split" operation for Strings on the Haskell mailing-list recently. I know that different people have different semantics for a "split" operation but not including any kind in the standard library feels like it's missing something
  • the compiler is your friend! It goes at great length explaining you why and where you failed to properly type your program
Go, go, go

After those 2 or 3 sessions of installation and programming I was very far from being able to code on my own for the contest, yet I would be able to pair-up and provide a vigilant pair of eyes on someone else code. Stay tuned for part 2: the contest!