11 April 2007

Pattern matching with Ruby

How do you know you got it all in your toolbox? Short answer: you never have it all. Long answer: you may add more, a lot more,...

I was reading one of Steve Yegge's old posts lately: "Choosing languages". This article introduced me to the pattern matching capabilities of a language such as Haskell. Wow,... Yes we can't do that in Ruby. But can't we really?

Indeed, this triggered my memory like a mantra: "if it's not nailed down, steal it... if it's not nailed down, steal it,..." Here's the article: If it's not nailed down, steal it. The author presents a "theft" in Ruby: the implementation of pattern matching.

So using the 'multi' gem, you are able to express the Fibonacci function as:
multi (:fib, 0) { 1 }
multi (:fib, 1) { 1 }
multi (:fib, Integer) { |n| fib(n-1) + fib(n-2) }
This may not be very impressive here, but later on in Steve's article, Steve tells us that pattern matching can really be a huge win when dealing with complex structured data. Here's an example from a contest in the Amazon's developer journal.

bad [[_,fs],[_,ds],[_,cs],[_,gs]] = (cs /= fs) && (cs == ds || cs == gs)

Here, the bad function gets a list as its argument. But instead of naming the list, it specifies a pattern that it expects the list to match: a list of 2-item sublists. It ignores the first element of each sublist, and assigns the second element to fs (farmer side), ds (dog side), etc. The variables then become available for use in the body of the function, after the "=" sign, where we check if if the chicken's alone with the dog or the grain.


For sure, writing the same thing in Java, or even in plain Ruby would be a pain. So I rolled up my sleeves and tried to enhance the 'multi' gem with the ability to use Arrays and variables in Arrays. So I allowed to write things such as:
multi(:foo, [String, Integer]) {|x, y| x; y}
multi(:foo, [:p1, :p2]) { |symbols| symbols[:p1]; symbols[:p2]}
multi(:foo, [:p1, [Integer, :p3]]) { |symbols, x| symbols[:p1]; symbols[:p3]; x}
multi(:foo, [:p1, [:_, :p2]]) { |symbols| symbols[:p1]; symbols[:p2]}
I also added some support for a 'multi' block, which is closer to a Haskell definition (though the 'let' name will be certainly shock a lisper,...):
multi(:foo) do
let(String) {|x| x.should eql "hello"}
let(String, String) {|x, y| x.should eql "hello"; y.should eql "world"}
My conclusion on that little experiment was:
  • The use of a "symbols" map is a bit awkward, even if it uses the symbols declared in the pattern. It looks like a trick compared to a Haskell definition
  • It is possible to reproduce Steve's examples in an easy way (I didn't say efficient!)
  • The limitations mentioned in the article regarding the use of 'multi' methods in classes still hold: they are not inherited. 'multi' is not the 'def' keyword
What I am really interested in is: will I use it? Steve says that once you've tasted Pattern matching, you start seeing it everywhere (design patterns, anyone?). I still don't see it in my daily java programming, nor in my nightly ruby programming.

Now, let's say I start seeing it ("Yes, I got it! I am in the Matrix!!"). Should I use it in my ruby code? And should I use the rest? Let's see:
Is it still Ruby? Will other people still understand what I write?

Hey, why not? I am truly amazed by the Ruby's capacities to be benefit from ideas from other languages or paradigms. Indeed, I think it still feels quite Ruby-like with functions and blocks, and those features allow me to write more concise code, at the right abstraction level.

However, capturing an animal and putting him in a zoo is not the same as watching him in its environment. Haskell, Lisp, Erlang and others are quite unique and they each provide different ways to blow your mind.

By the way, is it possible to do concurrency programming "a la" Erlang in Ruby? The challenge is opened,...


Unknown said...

Some time ago I've showed the concept of slightly another way for pattern-matching. With my "library" you example with let's should be rewritten as

def foo(*arg)
arg.pmatch(String){|x| x.should eql "hello"} ||
arg.pmatch(String, String) {|x, y| x.should eql "hello"; y.should eql "world"} ||

which a bit more verbose, but seems to be more "rubyish" (it doesn't hides function definition with "multi" syntax).

Charles said...

Note that the blocks saved by the "multi" method should be "instance_eval"'d against the original object at dispatch time.

Then there would be no problems accessing instance variables, and you could define the methods in the class body.