Pages

05 December 2011

Pragmatic IO

This post is my exploration of the use of the IO type in a small Scala application.

A short IO introduction

Scala is a pragmatic language when you start learning Functional Programming (FP). Indeed there are some times when you can't apply the proper FP techniques just because you don't know them yet. In those cases you can still resort to variables and side-effects to your heart's content. One of these situations is IO (input/output).

Let's take an example: you need to save data in a database (not a rare thing :-)). The signature of your method will certainly look like that:

  /** @return the unique database id of the saved user */
  def save(user: User): Int

This method is not referentially transparent since it changes the state of the "outside world": if you call it twice, you'll get a different result. This kind of thing is very troubling for a Functional Programmer, that doesn't make him sleep well at night. And anyway, in a language like Haskell, where each function is pure you can't implement it. So how do you do?

The neat trick is to return an action saying "I'm going to save the user" instead of actually saving it:

  def save(user: User): IO[Int]

At first, it seems difficult to do anything from there because we just have an IO[Int], not a real Int. If we want to display the result on the screen, what do we do?

We use IO as a Monad, with the flatMap operation, to sequence the 2 actions, into a new one:

  /** create an IO action to print a line on the screen */
  def printLine(line: Any): IO[Unit] = println(line).pure[IO]

  /** save a user and print the new id on the screen */
                                           // or save(user) >>= printLine
  def saveAndPrint(user: User): IO[Unit] = save(user).flatMap(id => printLine(id))

Then finally when we want to execute the actions for real, we call unsafePerformIO in the main method:

  def main(args: Array[String]) {
    saveAndPrint(User("Eric")).unsafePerformIO
  }

That's more or less the Haskell way to do IO in Scala but nothing forces us to do so. The natural question which arises is: what should we do?

IO or not IO

Before starting this experiment, I fully re-read this thread on the Scala mailing-list, and the answer is not clear for many people apparently. Martin Odersky doesn't seemed convinced that this is the way to go. He thinks that there should be a more lightweight and polymorphic way to get effect checking and Bob Harper doesn't see the point.

Surely the best way to form my own judgement was to try it out by myself (as encouraged by Runar). One thing I know for sure is that, the moment I started printing out files in specs2, I got an uneasy feeling that something could/would go wrong. Since then,
IO has been on my radar of things to try out.

Luckily, these days, I'm developing an application which is just the perfect candidate for that. This application:

  • reads log files
  • stores the records in a MongoDB database
  • creates graphs to analyze the data (maximums, average, distribution of some variables)

Perfect but,... where do I start :-)?

Theory => Practice

That was indeed my first question. Even in a small Swing application like mine the effects were interleaved in many places:

  • when I opened the MongoDB connection, I was printing some messages on the application console to inform the user about the db name, the MongoDB version,...
  • the class responsible for creating the reports was fetching its own data, effectively doing IO
  • datastore queries are cached, and getting some records (an IO action) required to get other data (the date of the latest import, another IO action) to decide if we could reuse the cached ones instead

The other obvious question was: "where do I call unsafePerformIO?". This in an interactive application, I can't put just one
unsafePerformIO in the main method!

The last question was: "OMG, I started to put IO in a few places, now it's eating all of my application, what can I do?" :-))

Segregating the effects

Here's what I ended up doing. First of all let's draw a diagram of the main components of my application before refactoring:

   +-------+    extract/getReport       +------ IO +   store      +------ IO +
   + Gui   + <------------------------> + Process  + -----------> + Store    +
   +-------+     Query/LogReport        +----------+              +----------+
                                             | getReport               ^
                                             v                         |
                                        +------ IO +     getRecords    |
                                        + Reports  + ------------------+
                                        +----------+

Simple app, I told you.

The first issue was that the Reports class (actually a bit more than just one class) was fetching records and building reports at the same time. So if I decided to put IO types on the store I would drag them in any subsequent transformation (marked as IO on the diagram).

The next issue was that the Process class was also doing too much: reading files and storing records.

Those 2 things led me to this refactoring and "architectural" decision, IO annotations would only happen on the middle layer:

                                        +------- IO +            store
                               read     + Extractor + -----------------------------+
                               files    +-----------+          |                   |
                                             ^                 |                   |
                                             | extract         v                   v
   +-------+    extract/getReport       +----------+      +------ IO +        +----------+
   + Gui   + <------------------------> + Process  +      + Status   +        + Store    +
   +-------+     Query/LogReport        +----------+      +----------+        +----------+
                                             | getReport       ^                   ^
                                             v                 |                   |
                                        +------ IO +           |    getRecords     |
                                        + Reporter + ------------------------------+
                                        +----------+
                                             | createReport
                                             v
                                        +----------+
                                        + Reports  +
                                        +----------+
  • all the IO actions are being segregrated to special classes. For example extracting lines for log files creates a big composite IO action to read the lines, to store the lines as a record and to report the status of the extraction. This is handled by the Extractor class

  • the Status class handles all the printing on the the console. It provides methods like:

      /**
       * print a message with a timestamp, execute an action,
       * and print an end message with a timestamp also.
       *
       * This method used by the `Extractor`, `Reporter` and `Process` classes.
       */
       def statusIO[T](before: String, action: IO[T], after: String): IO[T]`
    
  • the Reports class only deals with the aggregation of data from records coming from the Store. However it is completely ignorant of this fact, so testing becomes easier (true story, that really helped!)

  • the Store does not have any IO type. In that sense my approach is not very pure, since nothing prevents "someone" from directly calling the store from the middle layer without encapsulating the call in an IO action. I might argue that this is pragmatic. Instead of sprinkling IO everywhere I started by defining a layer isolating the IO world (the data store, the file system) from the non-IO world (the reports, the GUI. Then, if I want, I can extend the IO marking the rest of the application if I want to. That being said I didn't do it because I didn't really see the added-value at that stage. A name for this approach could be "gradual effecting" (the equivalent of "gradual typing")

  • the Process class is not marked as IO because all the methods provided by that class are actually calling
    unsafePerformIO to really execute the actions. This is the only place in the application where this kind of call occurs and the GUI layer could be left unchanged

All of that was not too hard, but was not a walk in the park either. Some details of the implementation were hard to come up with.

Under the bonnet: syntactic tricks

First of all, what was easy?

"Monadic" sequencing

Sequencing IO actions is indeed easy. I've used the for comprehension:

  def statusIO[T](before: String, action: IO[T], after: String): IO[T] = {
    for {
      _      <- printTime(before)
      result <- action
      _      <- printTime(after)
    } yield result
  }
"Semi-column" sequencing

But also the >>=| operator in Scalaz to just sequence 2 actions when you don't care about the first one:

  def statusIO[T](before: String)(action: IO[T]): IO[T] = printTime(before) >>=| action
"Conditional" sequencing

And, for the fun of it, I implemented a "conditional flatMap operator", >>=?:

  action1 >>=? (condition, action2)

In the code above the action1 is executed and, if the condition is true, we execute action2 and forget about the result of action1, otherwise we keep the result of action1.

"if else" with a twist

Not completely related to IO I also liked the ?? operator. Say I want to execute an action only if a file exists:

   if (file.exists) createRecords(file)
   else             ()

The else line really feels ugly. This kind of "default value" should be inferred from somewhere. Indeed! This "default value" is the Zero of a given type in Scalaz, so it is possible to write:

   file.exists ?? createRecords(file)

It just works (tm) but it's worth knowing where that zero values really comes from:

  1. createRecords(file) returns IO[Int] (the last created id - that's a simplification of the real program)

  2. there is a way to create a Monoid from another Monoid + an Applicative:

    // from scalaz.Monoid
    /** A monoid for sequencing Applicative effects. */
    def liftMonoid[F[_], M](implicit m: Monoid[M], a: Applicative[F]): Monoid[F[M]] =
      new Monoid[F[M]] {
        val zero: F[M] = a.pure(m.zero)
        def append(x: F[M], y: => F[M]): F[M] =
          a.liftA2(x, y, (m1: M, m2: M) => m.append(m1, m2))
      }
    
  3. in this case IO has an Applicative, M is Int so it has a Monoid hence IO[Int] defines a Monoid where the zero is IO(0). I had to open up the debugger to check that precisely :-)

Under the bonnet: it just works,... not

The next piece of implementation I was really happy with was the "logarithmic reporting". This is a feature I implemented using vars at first, which I wanted to make pure in my quest for IO.

What I want is to extract log lines and notify the user when a bunch of lines have been imported (so that he doesn't get too bored). But I don't know how many lines there are in a given file. It could be 100 but it could be 30.000 or 100.000. So I thought that a "logarithmic" counter would be nice. With that counter, I notify the user every 100, 1000, 10.000, 100.000 lines.

The LogarithmicCounter works by creating a State object encapsulating 2 actions to do, one on each tick, one when a
level is reached:

    // create and store a line on each `tick`
    def onCount(line: String) = (i: Int) => createAndStoreRecord(line, creator)

    // and notify the user when we've reached a given level
    val onLevel = (i: Int) => printTime("imported from "+filePath+": "+i+" lines")

    /** @return a State[LogarithmicCounter, IO[Unit]] */
    val readLine = (line: String) => counter.asState(onLevel, onCount(line))

The readLine method is used in a traversal of the lines returned by a FileReader:

    (lines traverseState readLine): State[LogarithmicCounter, Stream[IO[Unit]]]

Pass it an initial counter and you get back a Stream of IO actions which you can then sequence to get an IO[Stream[Unit]]:

    // initial traversal with a function returning a `State`
    val traversed: State[LogarithmicCounter, Stream[IO[Unit]]] = lines traverseState readLine

    // feed in the initial values: count = 1, level = 100 to get the end `State`
    val traversedEndState: Stream[IO[Unit]] = traversed ! new LogarithmicCounter

    // finally get a global action which will execute a stream of
    // record-storing actions and printing actions
    val toStore: IO[Stream[Unit]] = traversedEndState.sequence

Some readers of this blog may recognize one usage of the Essence of the Iterator Pattern - EIP and that's indeed what it is (my first real use case, yes!). traverseState is just a way to use the more traditional traverse method but hiding the ugly type annotations.

This kind of type-directed development is nice. You add some type here and there and you let the compiler guide you to which method to apply in order to get the desired results:

  1. after the traversal I get back a State
  2. if I want to get the final state, the Stream of IO, I need to feed in some initial values, that's the '!' method
  3. if I want to get an action equivalent to the stream of actions, I need the sequence method which has exactly the type signature doing what I want

I was about to call it a day when I actually tried my application,... StackOverflow! What?!?

Trampolining to the rescue

It turns out that traversing a "big" sequence with a State is not so easy. First of all, State is an Applicative because it is also a Monad (please read the earlier EIP post for the details). So basically this amounts to chaining a lot of flatMap operations which blows up the stack.

Fortunately for me, Runar has implemented a generic solution for this kind of issue, like, less than 2 months ago! I leave you to his excellent post for a detailed explanation but the gist of it is to use continuations to describe computations and store them on the heap instead of letting calls happen on the stack. So instead of using State[S, A] I use StateT[Trampoline, S, A] where each computation (S, A) returned by the State monad is actually encapsulated in a Trampoline to be executed on the heap.

The application of this idea was not too easy at first and Runar helped me with a code snippet (thanks!). Eventually I managed to keep everything well hidden behind the traverseState function. The first thing I did was to "trampoline" the function passed to traverseState:

  /**
   * transform a function into its "trampolined" version
   * val f:             T => State[S, A]              = (t: T) => State[S, A]
   * val f_trampolined: T => StateT[Trampoline, S, A] = f.trampolined
   */
  implicit def liftToTrampoline[T, S, A](f: T => State[S, A]) = new LiftToTrampoline(f)

  class LiftToTrampoline[T, S, A](f: T => State[S, A]) {
    def trampolined = (t: T) => stateT((s: S) => suspend(st.apply(s)))
  }

So the traverseState function definition becomes:

  // with the full ugly type annotation
  def traverseState(f: T => State[S, B]) =
    seq.traverse[({type l[a]=StateT[Trampoline, S, a]})#l, B](f.trampolined)

However I can't leave things that like that because traverseState then returns a StateT[Trampoline, S, B] when the client of the function expects a State[S, B]. So I added an untrampolined method to recover a State from a "trampolined" one:

  /** @return a normal State from a "trampolined" one */
  implicit def fromTrampoline[S, A](st: StateT[Trampoline, S, A]) = new FromTrampoline(st)

  class FromTrampoline[S, A](st: StateT[Trampoline, S, A]) {
    def untrampolined: State[S, A] = state((s: S) => st(s).run)
  }

The end result is not so bad. The "trampoline" trick is hidden as an implementation detail and I don't get StackOverflows anymore. Really? Not really,...

The subtleties of foldRight and foldLeft

I was still getting a StackOverflow error but not in the same place as before (#&$^@!!!). It was in the traversal function itself, not in the chaining of flatMaps. The reason for that one was that the Traverse instance for a Stream in Scalaz is using foldRight (or foldr):

  implicit def StreamTraverse: Traverse[Stream] = new Traverse[Stream] {
    def traverse[F[_]: Applicative, A, B](f: A => F[B], as: Stream[A]): F[Stream[B]] =
      as.foldr[F[Stream[B]]](Stream.empty.pure) { (x, ys) =>
        implicitly[Apply[F]].apply(f(x) map ((a: B) => (b: Stream[B]) => a #:: b), ys)
      }
  }

and foldr is recursively intensive. It basically says: foldRight(n) = f(n, foldRight(n-1)) whereas foldl is implemented with a for loop and a variable to accumulate the result.

The workaround for this situation is simple: just provide a Traverse instance using foldLeft. But then you can wonder: "why is traverse even using foldRight in the first place?". The answer is in my next bug! After doing the modifications above I didn't get a SOE anymore but the output in the console was like:

  imported from test.log: 10000 lines [12:33:30]
  imported from test.log: 1000 lines  [12:33:30]
  imported from test.log: 100 lines   [12:33:30]

Cool, I have invented a Time Machine, Marty! That one left me puzzled for a while but I found the solution if not the explanation. The "left-folding" Traverse instance I had left in scope was being used by the sequence method to transform a Stream[IO] into an IO[Stream]. Changing that to the standard "right-folding" behaviour for a Stream traversal was ok. So there is a difference (meaning that something is not associative somewhere,...)

Conclusion

The main conclusion from this experiment is that tagging methods with IO made me really think about where are the effects of my application. It also encouraged functional programming techniques such as traverse, sequence and al.

I must however say that I was surprised on more than one account:

  • I stumbled upon a whole new class of bugs: non-execution. My effects were not executed improperly, they were not executed at all because I forgot to call unsafePerformIO!

  • I was not expecting to be needing an optimisation which had just been added to Scalaz, for something which I thought was a casual traversal

  • there are still some FP mysteries to me. For example I don't know yet how to traverse a full Stream

  • I also don't get why my messages were inverted on the console. I tried different experiments, with a Seq instead of a Stream, with Identity or Option as an Applicative instead of IO and I could only reproduce this specific behavior with Stream and IO

Anyway, I would say that it was overall worth using the IO type, at least for the architectural clarification and the better testing. It also brought back the warm, fuzzy feeling that things are under control. On the other hand refactoring to IO took me longer than expected and required more knowledge than just using vars and regular I/O. But that should really be accounted for in the 'investment for the future' column.

PS

There are certainly many stones left unturned in that post for programmers new to Functional Programming and Scalaz. I do apologize for that (this post is certainly long enough :-)) and I encourage those readers to ask questions on the Scalaz mailing-list.

12 comments:

James Roper said...

My knee jerk reaction to seeing this style of programming is to the use of the word unsafe. Is it really unsafe to perform the core function of my application? If my application manages users, the building of a monad that will create a user is not important. The important thing is actually creating a user. But to trigger that, I have to call a method called unsafe. Are functional programmers really that scared of side effects that they have to call the core function of their application unsafe? Calling a method named unsafe makes me very uncomfortable, like I'm using the infamous sun.misc.Unsafe (which, if you're not aware, allows you to do lots of lovely things like directly manage memory in the JVM and instantiate classes without calling their constructors). I think calling that class unsafe was very smart, it serves as a warning to prgrammers that what they are actually doing is unsafe. But talking to a database is not unsafe, the world won't end, and in fact it is the most important thing that an information system does. It just doesn't make sense to me.

Ben said...

@ James: The name unsafePerformIO comes from Haskell. When we expose ourselves to concepts from two very different programming languages (Java and Haskell), there is bound to be some overloading of terms that can cause some confusion to the uninitiated. Just try to remember that here 'unsafe' refers to the violation of referential transparency; not OS stability, etc. And though we call it 'unsafe', that does not mean we should never use it. Rather, functional programming is about pushing side effects to the boundaries of our code; not eliminating them.

Jesper said...

I'm very curious about this transformation: Did it make the code easier to understand (as in: closer to the domain problem you are working with), how much longer did it make the code, and finally, how hard was the performance/memory hit?

Richard Wallace said...

James,

As Ben said, the function name comes from Haskell but it is extremely accurate. When talking to an external system, isn't that pretty naturally an "unsafe" operation? You can't possibly know when writing the code if the external system will be available: it might be down, or if communicating over a network a link may be down, etc. So it is extremely unsafe.

But reaction you had is also exactly why it is named the way it is. You are meant to have the reaction, "I should not be using this function!" In Haskell, it is up to the environment to call the function.

It could be hidden and the name would not matter - in fact, you could happily use Haskell and never know that function exists. But it does exist and is made available for anyone to use because judicious use of it can be quite useful - as in the Haskell ByteString library, which performs orders of magnitudes better than regular byte arrays when dealing with network data.

The general heuristic in the Haskell community is that if your name is not Simon Peyton-Jones or Don Stewart, you should not be calling unsafePerformIO.

Since the Scala environment doesn't have any effect tracking system of it's own, you have to know about unsafePerformIO and manually call it. So the heuristic has to be a bit different and depends on your context. If developing an application from scratch - i.e. not using an application server or other crummy framework - I generally define

def main(args: Array[String]): Unit = mainIO(args) unsafePerformIO

def mainIO(args: Array[String]): IO[Unit] = ...

If using an application server or other crummy framework that you can't wrap in IO for some reason - servlet engines for example - then you define something akin to

def service(req: HttpServletRequest, res: HttpServletResponse) = serviceIO(req, res) unsafePerformIO

def serviceIOreq: HttpServletRequest, res: HttpServletResponse): IO[Unit] = ...

in your Servlet - although you probably also want to wrap the HttpServletRequest and HttpServletResponse in some API that tries to enforce RT.

Eric said...

@James, @Ben

I also do not find the name "unsafePerformIO" very elegant for something which has to be done as part of the normal operations of my system (even if I know where this name comes from).

Eric said...

@Richard what do Haskell people do when they want to execute IO in an interactive app or in a "servlet" library? Don't they have to use unsafePerformIO?

Runar said...

You call unsafePerformIO in only one place in your application. That minimizes the kitten body-count.

Eric said...

@Jesper I think that overall the "domain" code got a bit better because it didn't have to worry about fetching the records.

Also the code got a bit longer but more generic and reusable. For example, this exercise forced me to extract a "LogarithmicCounter" which I think can be reused in other contexts (and is testable on its own). I could possibly have done this refactoring without thinking much about the IO but, in hindsight, thinking "functionally" helped me see that `State` was a good interface between the function reading the file lines and the one doing something with them.

I didn't measure the performance hit, but I didn't see any. There might be one however due to the additional boxing of actions in objects and functions.

There are also 2 points which I regret not having being faced with because they are usually puzzling for anyone wanting to use IO: configuration and logging. If I ever have to do those I'll post a follow to show how it can be done on a real example.

Eric said...

@Runar how do you do that? In my app, I have several independent actions, like getReport and cleanDb. Both of them need an "unsafePerformIO".

Otherwise that means that my "service" layer can provide only one method called "doAction":

def doAction(a: Action) =
a match {
case Action("getReport") => getReport
case Action("cleanDb") => cleanDB
}.unsafePerformIO

and then that means that I can only have one type of return value.

Is there another way to call unsafePerformIO only once in the code?

siryc said...

@ Erik
Tony Morris has a talk about configuration...http://blog.tmorris.net/configuration-without-the-bugs-and-gymnastics/

Arya said...

Just wanted to say thanks for posting this.

Arya said...

Just wanted to say thanks for posting this.