If at first it doesn’t succeed, try something else
Today we start from a very common programmer need and we end up checking laws! With some considerations about software modularity.
I am currently implementing a command-line application in Haskell. This application needs a OAuthToken
to access some webservice. There are 2 ways to get a token:
- with an environment variable:
OAUTH_TOKEN
- with a call to another command-line tool,
ztoken
, which gives you a token based on your credentials
This translates to the following Haskell functions
getTokenFromEnvironment :: ExceptT GetTokenError IO OAuthToken
getTokenFromCommandLine :: ExceptT GetTokenError IO OAuthToken
Here I am using:
IO
as my base monad for interacting with the external worldExceptT GetTokenError
as a way to declare an error if I cannot get a token because either I don’t have anOAUTH_TOKEN
variable in my environment orztoken
does not accept my credentials. I leave other issues, like network access problems to exceptions in theIO
monad
data GetTokenError =
CommandLineError String |
EnvironmentError String
What I really want now is a way to combine those 2 calls into one
getToken :: ExceptT GetTokenError IO OAuthToken
The first thing which came to my mind was the Alternative
type class and in particular the <|>
operator:
(<|>) :: f a -> f a -> f a
This typeclass has an instance for ExceptT e
provided that there is a Monoid e
, meaning that I can accumulate errors. Actually what I just need is the Alt
typeclass which just requires a Semigroup e
. For this I need to adapt my error type a little bit:
data GetTokenError =
CommandLineError String |
EnvironmentError String |
RepeatedErrors GetTokenError GetTokenError
deriving (Eq, Show)
The RepeatedErrors
case gives us the possibility to accumulate errors in the case of repeated failed calls to retrieve an OAuthToken
. You could argue that it is not particularly well modelled because RepeatedErrors
stores errors more like a tree rather than a list. But let’s leave it at that for now.
The Semigroup
instance for GetTokenError
looks like this:
instance Semigroup GetTokenError where
e1 <> e2 = RepeatedErrors e1 e2
I can finally define:
getToken :: ExceptT GetTokenError IO OAuthToken
getToken =
getTokenFromEnvironment <|>
getTokenFromCommandLine
This is really nice because this is exactly what I want to express!
- get the token from the environment
- if that doesn’t work get it from the command line
More abstraction on the way
This looks all good but something is annoying. My functions are using ExceptT GetTokenError IO
which is a very concrete monad stack. I am going to make further calls to other monad stacks to make HTTP calls and I might have to lift
all over the place to align all the different stacks. This is even more annoying if I create a small library for getting tokens because I impose my stack choices on all clients of the API.
There is a way out of this, the monad transformers library (mtl
). In the mtl there are all sorts of typeclasses abstracting features provided by monad transformers. One of them is MonadError
:
class Monad m => MonadError e m | m -> e where
throwError :: e -> m a
catchError :: m a -> (e -> m a) -> m a
If a Monad
has a MonadError
instance then you can catch and throw errors. Nice. However there is a catch (no pun intended :-)). The | m -> e
part of MonadError
means that the Monad m
you are going to eventually select to support the MonadError
functionality is dependent on the error type e
.
Concretely this means that this type signature propagates to the top of the application:
getToken :: (MonadError GetTokenError m, MonadIO m) => m OAuthToken
Indeed, if I mix other calls to getToken
, for example
getPartitions :: (MonadError GetTokenError m, MonadIO m) => m [Partition]
getPartitions =
do token <- getToken
callService token partitionsRequest
Then the MonadError GetTokenError m
constraints stays because m
is tied to it forever. And if callService
declares its own error type, I won’t be able to mix both getToken
and callService
calls:
callService :: (MonadError HttpError m, MonadIO m) :: OauthToken -> Request a -> a
Maybe one solution would be to define getPartitions
as:
getPartitions :: (MonadError (GetTokenError Either ServiceError) m, MonadIO m) => m [Partition]
In a way we are back to the previous problem. We now have a “concrete stack” of errors where we just want “a” structure capable of holding both GetTokenError
and HttpError
.
Lenses to the rescue
A partial solution to the problem is given by lenses and is wonderfully explained by Georges Wilson in “next level mtl with classy optics” (I also recommend this gist by Nick Partridge):
-- create prisms for GetTokenError
makeClassyPrisms ''GetTokenError
-- now e is anything having a Prism allowing us to extract or inject a GetTokenError
getToken :: (MonadError e m, MonadIO m, AsGetTokenError e) => m OAuthToken
So it becomes kind of easier to mix calls having different error types:
getPartitions :: (MonadError e m, AsGetTokenError e, AsHttpError e, MonadIO m) => m [Partition]
This is probably a better way to mix different error types with MonadError
. We don’t get rid of the functional constraint though. The error types still “bubble-up” to the top and the method used to do the authentication is exposed to the clients of getPartitions
. So if we switch the authentication mechanism and get different error types we will have to change all the functions calling getPartitions
.
I think we need to really take care of this kind of situation because it makes software a lot harder to evolve when a small change has ripple effects across all the software layers.
Errors translation / encapsulation
What we can do is to define a new error type subsuming GetTokenError
and HttpError
:
data ServiceError =
AuthenticationError GetTokenError |
CallError HttpError
makeClassy ''ServiceError
instance AsGetTokenError ServiceError
instance AsHttpError ServiceError
And we need a bit of boilerplate to do the translation between getToken
and callService
. To be able to “liberate” those 2 functions from their MonadError e m, MonadIO m
constraints we can run them with the minimum stack which satisfies these constraints, this is ExceptT <error type> IO a
:
runOAuthToken :: (MonadError e m, AsCliError e, MonadIO m) => ExceptT GetTokenError IO a -> m a
runOAuthToken = runExceptTIO (review _AuthenticationError)
runCallService :: (MonadError e m, AsCliError e, MonadIO m) => ExceptT HttpError IO a -> m a
runCallService = runExceptTIO (review _HttpError)
runExceptTIO :: (MonadError e m, MonadIO m) => (f -> e) -> ExceptT f IO a -> m a
runExceptTIO mapError ioa =
do valueOrError <- liftIO (runExceptT ioa)
fromEitherM (throwError . mapError) valueOrError
(fromEitherM
comes from the from-sum
package)
Then we can make both calls and “unify” them under new constraints:
getPartitions :: (MonadError e m, AsServiceError e, MonadIO m) => m [Partition]
getPartitions =
do token <- runToken getToken
runService (callService token partitionsRequest)
Alternatives?
Unfortunately being more abstract with getToken
breaks the use of <|>
which was so convenient. One option for fixing it is to define a similar operator for mtl
classes:
(<|>) :: (MonadError e m, Semigroup e) => m a -> m a -> m a
(<|>) ma1 ma2 =
catchError ma1 (\e1 ->
catchError ma2 (\e2 ->
throwError (e1 <> e2)))
But this still doesn’t work given the way we have defined our errors as prisms over a general error type! We need to define a extension of <|>
which uses the appropriate Prism
to aggregate errors:
(<|?>) :: (MonadError e m, Semigroup o) => Prism' e o -> m a -> m a -> m a
(<|?>) p ma1 ma2 = catchError ma1 (\e1 ->
case e1 ^? p of
Nothing -> throwError e1
Just o1 ->
catchError ma2 (\e2 ->
case e2 ^? p of
Nothing -> throwError e1
Just o2 -> throwError (review p (o1 <> o2))))
And finally
getToken :: (MonadError e m, MonadIO m, AsGetTokenError e) => m OAuthToken
getToken =
(<|?>) _GetTokenError
getTokenFromEnvironment
getTokenFromCommandLine
This is not syntactically as nice as before but we can improve that:
infix 4 <!>
infix 3 <?>
data Alternating a = Alternating a a
(<!>) :: MonadError e m => m a -> m a -> Alternating (m a)
(<!>) = Alternating
(<?>) :: (MonadError e m, Semigroup o) => Prism' e o -> Alternating (m a) -> m a
(<?>) = error "left to the reader"
getToken =
_GetTokenError <?>
getTokenFromEnvironment <|>
getTokenFromCommandLine
What about laws?
This is not the happy end of the story. The Alternative
type class specifies that <|>
must be an associative operation. This means that we probably need to prove that our <|>
and its evil twin <|?>
are associative operators. What does that mean for instances of MonadError e m
? By the way what is even a MonadError e m
?
I was pretty shocked to realize that MonadError
doesn’t come up with any laws in Haskell! We need to turn to Purescript for this:
-- | - Left zero: `throwError e >>= f = throwError e`
-- | - Catch: `catchError (throwError e) f = f e`
-- | - Pure: `catchError (pure a) f = pure a`
And we probably need an additional law for our <|>
operator:
-- | - Catch associativity: `catchError (catchError ma f1) f2 =
-- catchError ma (\e -> catchError (f1 e) f2)`
Since many instances already exist for MonadError
the best we can do is to check if those laws hold for these instances.
I was at first very enthusiastic about that. There are 2 libraries in Haskell which can be used to retroactively find properties for a give API: quickspec
and speculate
. However I was unable to get any of them to compile / find laws. Since I had been shaving too many yaks on this application I decided to stop there. But I hope I will get some time to get those libraries to find laws.
Anyway, the best I could do was to check the laws for all the instances of MonadError
: EitherT
, ListT
, MaybeT
, ExceptT
and so on. I verified a bunch of them using QuickCheck and they seem to hold. Actually if that wasn’t the case I would be surprised and would probably question the implementation. I also proved that the laws were holding for the Either
instance of MonadError
. It goes like this, for the associativity of catch
:
-- this is what we want to prove
catchError (catchError m k2) k1 == catchError m (\\e -> catchError (k2 e) k1)
--> case 1: m = Left t
catchError (catchError (Left t) k2) k1 == catchError (Left t) (\\e -> catchError (k2 e) k1)
-- we apply on each side the definition of catchError for Either when the (m a)
-- value is a Left value which is to call the handler, CQFD
catchError (k2 t) k1 = catchError (k2 t) k1
--> case 2: m = Right a
catchError (catchError (Right a) k2) k1 == catchError (Right a) (\\e -> catchError (k2 e) k1)
-- we apply on each side the definition of catchError for Either when the (m a) value
-- is a Right value which is to leave it as it is, CQFD
catchError (Right a) k1 = Right a
-- we do it once more on the left side of the equation
Right a = Right a
I suspect that it is possible to prove the laws for other MonadError
instances, StateT
would probably be interesting.
What have we learned?
There are plenty of lessons for me in this exercise:
Haskell is full of interesting generalization,
<|>
is one of them and worth spotting in production codeHaskell libraries are not necessarily complete and can probably benefit from contributions
The modularity story of Haskell is not as obvious as what you might think. I had to think hard to find a satisfying solution and one huge problem is still unsolved. How can I test my full application code with a dummy authentication, which would always succeed? This will be the subject of a following post!
No comments:
Post a Comment