ICFP is an awesome conference where I can’t pretend I understand every presentation I’m attending but there is definitely lots of food for thought and some gems which makes it immensely worthwhile.

The gem I want to share here is a paper called “All sorts of permutations”.

This paper shows that you can use the `filterM`

function with a predicate like `a -> [True, False]`

and generate all the sublists of a list.

Similarly, you can evolve the `sort`

function into `sortM`

, taking a monadic comparison function: `a -> a -> m Bool`

. If the comparison function is `a -> a -> [True, False]`

the `sortM`

function will generate the *permutations* of a list.

The paper goes on exploring if `sortM`

generates *all* the permutations of a list or *exactly* the permutations of a list depending on the implementation of the `sortM`

function (insertion sort, merge sort, quicksort,…) and tries to set various restrictions on the comparison function to get there.

At the end of the talk Philip Wadler asked the following question (paraphrased):

This is a neat trick to generate permutations but why should I teach it to my students, considering that there are more ‘classical’ implementations for getting the permutations of a list?"

I think that it is worth showing students because this `sortM`

function gives us a very neat way of writing a QuickCheck generator for distinct elements!

It is indeed not uncommon to have to generate lists of distinct elements for testing applications. For example if you want to test a function scheduling different tasks and these tasks are supposed to be unique, but arriving in any order.

There is unfortunately no `permutations :: [a] -> Gen [a]`

combinator in QuickCheck. So how can you produce random Lists with distinct elements?

One way is to take an existing list of distinct elements and shuffle it to get a random permutation of those elements. This is exactly what `sortM`

gives us, we just need to use the right monad, which happens to be `Gen`

instead of `[]`

!

Let’s take an example. Say you want to generate all the permutations of `["Homer", "Marge", "Bart", "Lisa"]`

```
import Test.QuickCheck
-- An implementation of sortM
sortM :: Monad m => (a -> a -> m Bool) -> [a] -> m [a]
sortM _ [] = return []
sortM cmp (x : xs) = do
ys <- sortM cmp xs
insertM cmp x ys
insertM :: Monad m => (a -> a -> m Bool) -> a -> [a] -> m [a]
insertM _ x [] = return [x]
insertM cmp x yys@(y : ys) = do
b <- cmp x y
if b then return (x : yys)
else fmap (y:) (insertM cmp x ys)
-- A generator for lists of distincts "Simpsons"
type Simpson = String
simpsons = ["Homer", "Marge", "Bart", "Lisa"]
distinctSimpsons :: Gen [Simpson]
distinctSimpsons = sortM (\_ _ -> choose(True, False)) simpsons
```

Let’s test it:

```
λ> sample distinctSimpsons
["Homer","Lisa","Marge","Bart"]
["Homer","Marge","Bart","Lisa"]
["Homer","Marge","Lisa","Bart"]
["Homer","Marge","Bart","Lisa"]
["Bart","Homer","Marge","Lisa"]
["Marge","Homer","Bart","Lisa"]
["Marge","Homer","Lisa","Bart"]
["Marge","Homer","Lisa","Bart"]
["Marge","Bart","Lisa","Homer"]
["Homer","Marge","Bart","Lisa"]
["Bart","Homer","Lisa","Marge"]
```

I think this answers Philip’s question, this is a nice application of the `sortM`

“trick” which can be taught to students learning functional programming.

And what looks *really* promising is that this idea extends to the generation of other data structures. For example we can generate the partitions of a given set with `groupByM`

:

```
partitionM :: Monad m => (a -> m Bool) -> [a] -> m ([a], [a])
partitionM _ [] = pure ([], [])
partitionM p (x : rest) = do
b <- p x
(xs, ys) <- partitionM p rest
if b then pure (x:xs, ys)
else pure (xs, x:ys)
groupByM :: Monad m => (a -> a -> m Bool) -> [a] -> m [[a]]
groupByM _ [] = pure []
groupByM cmp (x : rest) = do
(xs, ys) <- partitionM (cmp x) rest
ps <- groupByM cmp ys
pure $ (x:xs):ps
```

Then we can get all the set partitions of `["Homer","Lisa","Marge","Bart"]`

```
λ> pl $ groupByM (\_ _ -> [True, False]) simpsons
[["Homer","Marge","Bart","Lisa"]]
[["Homer","Marge","Bart"],["Lisa"]]
[["Homer","Marge","Lisa"],["Bart"]]
[["Homer","Marge"],["Bart","Lisa"]]
[["Homer","Marge"],["Bart"],["Lisa"]]
[["Homer","Bart","Lisa"],["Marge"]]
[["Homer","Bart"],["Marge","Lisa"]]
[["Homer","Bart"],["Marge"],["Lisa"]]
[["Homer","Lisa"],["Marge","Bart"]]
[["Homer","Lisa"],["Marge"],["Bart"]]
[["Homer"],["Marge","Bart","Lisa"]]
[["Homer"],["Marge","Bart"],["Lisa"]]
[["Homer"],["Marge","Lisa"],["Bart"]]
[["Homer"],["Marge"],["Bart","Lisa"]]
[["Homer"],["Marge"],["Bart"],["Lisa"]]
```

or get some random partitions of that set:

```
λ> sample $ groupByM (\_ _ -> choose(True, False)) simpsons
[["Homer","Marge","Bart"],["Lisa"]]
[["Homer","Lisa"],["Marge","Bart"]]
[["Homer","Bart"],["Marge"],["Lisa"]]
[["Homer","Marge","Bart","Lisa"]]
[["Homer","Marge","Bart","Lisa"]]
[["Homer","Lisa"],["Marge"],["Bart"]]
[["Homer","Marge"],["Bart"],["Lisa"]]
[["Homer"],["Marge"],["Bart","Lisa"]]
[["Homer","Marge","Bart","Lisa"]]
[["Homer","Marge","Bart","Lisa"]]
[["Homer","Marge","Lisa"],["Bart"]]
```

Some partitions can be drawn twice but the generation should be uniform over the set of all possible partitions (it would be better to not take my word for it and prove it though :-)).

I suspect that this is more than a useful trick and that this deserves a more systematic treatment to understand all the data structures which we can generate by using the appropriate functions taking predicates or relations as parameters. But for now it is already useful as it is!

## No comments:

Post a Comment