
24 August 2019

ICFP 2019

ICFP 2019 in Berlin is now over. It was a fantastic week packed with great content. I also particularly enjoyed the “Hallway track” this year and the numerous discussions I had around effects, Haskell, functional programming, and modularity.

You will find below my notes for some of the talks I went to. They will most likely be useful to me alone, but don’t hesitate to ask me for more details, or to head to the conference program to find links to the specific papers that have sparked your interest.

Pre-conference

Idris 2

Idris 2 is the reimplementation of the Idris language using

  1. Idris 1 as its implementation language
  2. a new type theory: QTT (Quantitative Type Theory)
  3. Chez Scheme to compile the QTT language

Outcomes

  1. implementing Idris in Idris shows whether the language scales to large programs (answer: not entirely, hence Idris 2). Some other “productivity” features have been added, for example “generate” to automatically split cases and find an implementation. And thanks to QTT more implementations can now be found.

  2. QTT is nice because it allows the safe erasure of arguments which are just proofs and not needed at run-time. Such an “erasable” proof is annotated with 0. Other arguments are dependent values which are needed at run-time, for pattern matching for example (annotated as 1). This is also a way to introduce linear typing: lmap : (1 a -> b) -> (1 xs : List a) -> List b (a linear map function). That’s still limited for now, there is no polymorphism for linearity annotations and no “borrow checker”, but Edwin Brady said he would love to work on that (and even safely support imperative programming with types!)

  3. The compilation time and the generated code are now competitive with GHC (except for higher-order functions, more work to do there) thanks to the compilation to Chez Scheme (other Schemes are also supported with different trade-offs, like the ease of creating an executable). In the future OCaml could be a good backend as well (in the ML workshop there was a talk on how to FFI Idris with OCaml to access an HTTP library)

Shifted names

You remember all the presentations on the lambda calculus talking about renaming and substitution and the difficulty of dealing with free variables to make sure that names don’t clash?

The usual answer is “say we introduce a ‘fresh’ name here”, meaning we need to keep some state around. “Shifted names” are a simple idea: before doing anything with something named x, put an index on all variables named x in a term and increment that index (so x becomes x0 the first time, x1 next and so on). This means that we can substitute variables for values, rename variables, introduce names, and so on without requiring any “freshness condition”. All those operations have been formalized and proven correct in Coq, which then helps when proving programs, like proving the correctness of a CPS transformation (because a proof might just have to establish that a variable named x is not the same as a variable named k, which Coq fortunately has no trouble believing).
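
To make the idea concrete, here is a toy Haskell rendering of the shifting operation as I understood it from the talk (the paper formalizes all of this in Coq; the data type and names below are my own):

-- A name is a base string plus a shift index: "x" is ("x", 0), then ("x", 1), ...
type Name = (String, Int)

data Term = Var Name | Lam Name Term | App Term Term

-- Before doing anything with a new variable named x, bump the index of
-- every variable named x already present in the term, so no clash can occur.
shift :: String -> Term -> Term
shift x = go
  where
    bump (y, i) | y == x    = (y, i + 1)
                | otherwise = (y, i)
    go (Var n)   = Var (bump n)
    go (Lam n b) = Lam (bump n) (go b)
    go (App f a) = App (go f) (go a)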

Deferring the details and deriving programs

How can we program imperatively in Agda (swap 2 variables, loop over a list to sum its elements,…) while keeping the proofs separate from the program so that the code stays easy to read?

Liam introduces an Applicative to delay proofs. Those proofs can be discharged from preconditions on the program, for example from the loop invariant i < length list when summing a list: then you know that it is safe to access an element of the list with list !! i inside the loop.

The code is available at https://github.com/liamoc/dddp

Cubes, cats, effects

I didn’t get much of that talk but it was entertaining nonetheless!

Conor McBride explains that the equality used by type theorists is usually quite boring because it starts from x = x. Not only is this quite obvious, but the real proof that those 2 labels ‘x’ are the same depends on the implementation of your language. As a consequence, the equality proofs that you get when reasoning about type systems are generally quite weak.

So the main idea here is to develop a notion of types where we know much better how a value of a given type was built. Such a value must come with a “path” from a type T0 to a type T1. Not only that, but we also track which part of the typing context was used to produce a given type judgment. Conor hopes that this will give us a much tighter way to write proofs, because some of the functions we use to build typing judgments will now be accompanied by corresponding proofs showing which type equalities hold. He makes a link with category theory in the sense that it is like getting our hands on a value proving that “the diagram commutes”.

Generic Enumerators

How to enumerate constrained data for testing? We can use type-indexed families to constrain data, but how can we enumerate them?

It is relatively easy to generically enumerate plain recursive types (think “sums of products”). We can go further by using dependent types and generate indexed families (types indexed by other types) by seeing an indexed family as a function from an index to the description of a data type. A generalized coproduct can then be enumerated by providing a selector and a list of enumerators.

This is a dependent pair because the chosen selector will trigger a different data type. The paper shows an example of how to generate sized trees (with a statically-known size). You need to supply an enumerator to enumerate all the ways to split the size l into n + m and the library does the rest.
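
To give a flavour of the approach, here is a small Haskell sketch (the paper works with dependent types and generic descriptions of data types; the depth-bounded representation and the names below are my own simplification):

-- an enumerator lists all values up to a given depth
newtype Enumerator a = Enumerator { enumerate :: Int -> [a] }

-- products and coproducts of enumerators
pairE :: Enumerator a -> Enumerator b -> Enumerator (a, b)
pairE (Enumerator f) (Enumerator g) =
  Enumerator (\d -> [ (x, y) | x <- f d, y <- g d ])

eitherE :: Enumerator a -> Enumerator b -> Enumerator (Either a b)
eitherE (Enumerator f) (Enumerator g) =
  Enumerator (\d -> map Left (f d) ++ map Right (g d))

-- an indexed family, seen as a function from the index to an enumerator:
-- here, lists of a statically-known length n
vectorE :: Enumerator a -> Int -> Enumerator [a]
vectorE _ 0 = Enumerator (\_ -> [[]])
vectorE e n = Enumerator (\d ->
  [ x : xs | x <- enumerate e d, xs <- enumerate (vectorE e (n - 1)) d ])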

Augmenting type signatures for program synthesis

This is a way to add more information to type signatures so that more programs can be synthesized. The application is to be able to program for more back-ends (CPUs, GPUs,…) by modelling libraries for those back-ends and automatically using the best of them in user code.

The author shows how to synthesize code for a C function void gemv(int m, int n, float *a, float *x, float *y) once a few properties on m, n, a, x, y are given. The program synthesizer then uses code templates, like a for loop or an if/then/else, to try to find code fulfilling the properties. And the user can use a small logic language with facts, conjunction, unification, negation,… to drive the heuristics for the synthesis.

They got results for rewriting some existing code to better libraries for ML, signal processing, string processing, etc…

FreezeML: complete and easy type inference for first-class polymorphism

Can we have something between ML and System F in terms of polymorphism and inference?

  • ML inference is decidable but doesn’t have first class polymorphism
  • that’s the opposite for System F

For example if you have singleton :: forall a . a -> [a], and id :: forall a . a -> a, what is the type of singleton id?

  • in ML: [a -> a] (id is instantiated).
  • in System F: [forall a . a -> a]

The idea? “Freeze” which instantiation you want with 2 operators, @ or $, to indicate whether you want to keep the “forall” polymorphism or not. This brings back complete type inference.

ICFP 2019 Day 1

Blockchains are functional

Manuel Chakravarty did a great presentation of blockchains and how functional programming and formal methods apply to that space. The general approach at IOHK is to go from formal specifications which are verified using Coq (plus Agda and Isabelle) to a direct translation to Haskell.

Since the resulting Haskell code is not performant and lacks concrete concerns like networking and persistence, they write the real implementation in Haskell separately and use property-based testing to check that implementation against the model.

The other important point is that they use a model of transactions called the UTxO (Unspent Transaction Output) model. That model is a more functional approach than the “account model” found in Ethereum (where you just model the state of wallets). Thanks to this model and a restricted functional language used for on-chain code, they can predict more precisely how much will have to be paid in fees for the execution of a contract, contrary to Ethereum again, where some contracts cannot be executed because they haven’t got enough “gas” (gas is a fee system to incentivise nodes in the system to validate transactions).
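
As a very rough illustration of the UTxO idea (my own simplification, not the actual Cardano/Plutus data types): a transaction consumes previously unspent outputs and produces new ones, and the ledger state is just the set of outputs not yet spent, with no mutable account state.

import qualified Data.Map as Map

data TxOut = TxOut { address :: String, amount :: Integer }
data TxIn  = TxIn  { sourceTxId :: String, outputIndex :: Int } deriving (Eq, Ord)
data Tx    = Tx    { txId :: String, inputs :: [TxIn], outputs :: [TxOut] }

-- the ledger state: all unspent outputs, keyed by the reference that can spend them
type UTxO = Map.Map TxIn TxOut

-- applying a transaction removes the outputs it spends and adds the ones it creates
applyTx :: Tx -> UTxO -> UTxO
applyTx tx utxo =
  let spent   = foldr Map.delete utxo (inputs tx)
      created = Map.fromList
        [ (TxIn (txId tx) i, out) | (i, out) <- zip [0 ..] (outputs tx) ]
  in  Map.union created spent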

Which language are they using for those validations? Plutus Core: it is based on System F-omega with recursion, a very well studied type system. That being said, they realized that going from the System F papers to a real implementation raised many interesting questions which they cover in some upcoming papers this year. They formalized the language in Agda and have an abstract interpreter for it.

The whole “smart contract” language is Plutus where you write Haskell programs, delimiting what needs to be executed on the chain with TemplateHaskell (enabling the generation of Plutus core code thanks to a GHC plugin).

It is still hard to program in Plutus for regular “application developers” so they have a DSL dedicated to financial contracts called Marlowe. Similar DSLs for other domains are in preparation.

Compiling with continuations, or without? Whatever.

https://www.cs.purdue.edu/homes/rompf/papers/cong-preprint201811.pdf

Should we use an intermediate language with continuations or not?

The idea of this paper is that it is possible to introduce a “control” operator which decides what gets translated to “2nd class continuations” when it makes sense. Experiments on subsets of Scala show that more labels and gotos are emitted this way, and more allocations end up on the stack instead of the heap.

Upcoming work: adding parametric polymorphism and first class delimited continuations in the user language (by building a type-and-effect system).

Lambda calculus with algebraic simplification for reduction parallelization by equational reasoning

How to parallelize the evaluation of complex expressions?

For example sums over an array, folds of a tree? A lambda calculus with “algebraic simplification” can help there. It consists of:

  • the normal lambda calculus + semi-ring operations (addition / multiplication)
  • a “delta abstraction” where the evaluation of the body can be simplified with the algebraic properties of + and * before passing arguments

This allows the parallel evaluation of functions and arguments. This even works for more complex expressions containing conditionals:

-- "breaking" sum
sum (x : xs) acc =
  if x < 0 then acc else sum xs (acc + x)

This function, applied to [5, 3, 1, 7, -6, 2] for example, will be split into two evaluations, one for [5, 3, 1] and one for [7, -6, 2]. The first list is then partially evaluated with a continuation, so that we end up trying to evaluate \f -> if ... else if 5 >= 0 && 3 >= 0 && 1 >= 0 then f (5 + 3 + 1). Then we can simplify 5 + 3 + 1 to 9.

The other important part of this work is a type system based on linearity conditions which says when it is “safe” to expand or factor expressions. Interestingly, the best expansions or factorings happen when variables (in polynomials for example) are used linearly.

Fairness in Responsive Parallelism

For cooperative scheduling of threads, a notion of fairness is developed, together with an algorithm that comes with a provable estimation of the expected execution time of high-priority tasks.

Narcissus: Correct-By-Construction Derivation of Decoders and Encoders from Binary Formats

Encoding and decoding binary formats is a source of many bugs, in particular in network stacks, and those bugs can be exploited by attackers. Some issues: formats are context-sensitive (parsing the rest of a packet can depend on a length field) and they are under-specified.

This framework allows the definition of non-deterministic specifications of formats in Coq and then generates encoders and decoders from that specification.

Interesting considerations:

  • an encoder can fail! Indeed encoders can have invariants which must be respected. If you pass a buffer to write in, the buffer might be too small, or the number of elements in a list can be too big.
  • a decoder might decode more formats than what the encoder can encode
  • encoders need to handle state to support things like the building of compression tables

Key features of Narcissus:

  • readability of the format specifications
  • they are extensible, context-sensitive
  • there can be multiple choices in encoding (to version the payload for example)
  • there’s an extension for checksums

This work was validated on many encoders for the TCP/IP stack (Ethernet, ARP, IPv4, TCP, UDP) and used in MirageOS (Coq can extract OCaml code). A new format like UDP is 10-20 lines of format + 10-20 lines of special data types. The performance loss is around 10% compared to manual code.

More work to come: certified extensions to extract code in other languages without an OCaml dependency.

Question: how hard would it be to apply this to protobuf? There is a paper on verifying protobuf (https://www.cs.purdue.edu/homes/bendy/Narcissus/protobuf.pdf)

Closure Conversion is Safe for Space

Formally verified compilers (like CakeML) cannot reason about intensional properties: is my generated program safe for space and time?

There are already some theorems for memory-managed languages. However they don’t cover closures, and there are notorious examples of V8 JavaScript closures leaking memory.

This work gives the first formal proof that closure conversion is safe for space, by adding a profiling semantics to both the source and the target language.

The main drawback of this approach IMO: the closure environment they use is proved to be safe but it is also not memory-efficient. On the other hand the closure environment chosen in V8 is unsafe (and fully mutable). This is a known issue with V8 and no one knows how to do it better yet.

Selective Applicative Functors

I’m a big fan of Andrey Mokhov’s work (for example what he has started with algebraic graphs). Andrey presents “Selective Applicative Functors” which are kind of between:

  • applicative functors where you have independent computations that you can combine
  • and monads, where you can have dependent, arbitrary computations: the result of one computation creates a new one.

Selective applicative functors are for the case where you want to start independent computations but can use the result of one of them if it finishes earlier and cancel the other. So on one hand the “shape” of the computations is known in advance, but you still have “conditional” computations. He dubs this super-power “speculative execution”. The many applications of selective functors can be found in the paper.
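
For reference, the core interface is tiny; here is a minimal Haskell sketch of it together with a derived conditional combinator (close to the paper’s definitions as I remember them, so treat the details with caution):

class Applicative f => Selective f where
  -- run the second computation only when the first one produces a Left
  select :: f (Either a b) -> f (a -> b) -> f b

-- run an effect only when a condition holds; the effect of the skipped
-- branch can be dropped (or, dually, both can be started speculatively)
whenS :: Selective f => f Bool -> f () -> f ()
whenS cond act = select selector (const <$> act)
  where
    selector = (\b -> if b then Left () else Right ()) <$> cond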

Coherence of Type Class Resolution

A very well rehearsed presentation showing how to prove the coherence of type class resolution in the absence of overlapping instances and orphan instances. In the presence of superclasses, can we be sure that the same function is always picked when it can come from 2 different type classes inheriting from the same parent? Does the resolution algorithm also work in the presence of the FlexibleContexts extension, where a constraint can locally fix some of the parameters of a multi-parameter type class to concrete types?

The answer seems intuitively “yes” but it turns out to be really hard to prove! https://arxiv.org/pdf/1907.00844.pdf

ICFP 2019 Day 2

Keynote: Rosette - solver-aided tools

Rosette is built on top of Racket and is a platform for implementing DSLs which can be translated to SMT constraints (and solved with Z3 for example). The trick they employ for this is a variant of “bounded model-checking” where they can efficiently explore, using symbolic variables, the possible executions of a program without incurring an exponential number of generated constraints (although a polynomial number can still be large!).

They have all sorts of applications, from languages used for education, to OS verification, type system proofs, and so on. They can show that implementations are correct (they found several bugs in a sub-module of Linux), find counter-examples, do program synthesis to fix programs, and so on.

Demystifying differentiable programming: shift/reset the penultimate backpropagator

When doing machine learning and computing gradient descent we need to compute derivatives of functions of many variables.

There are 2 ways of doing this: forward differentiation, which is simple but not very efficient when a single result depends on a large number of variables, and backward differentiation, which is more efficient in that case but more complicated because more intermediate results have to be kept around.
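
As a reminder of what forward differentiation looks like, here is the classic dual-number sketch in Haskell (this is not the paper’s Scala shift/reset implementation, just the textbook forward-mode technique):

-- a value paired with its derivative
data Dual = Dual { primal :: Double, tangent :: Double } deriving Show

instance Num Dual where
  Dual x dx + Dual y dy = Dual (x + y) (dx + dy)
  Dual x dx - Dual y dy = Dual (x - y) (dx - dy)
  Dual x dx * Dual y dy = Dual (x * y) (x * dy + dx * y)
  negate (Dual x dx)    = Dual (negate x) (negate dx)
  abs    (Dual x dx)    = Dual (abs x) (dx * signum x)
  signum (Dual x _)     = Dual (signum x) 0
  fromInteger n         = Dual (fromInteger n) 0

-- derivative of a one-variable function at a point
diff :: (Dual -> Dual) -> Double -> Double
diff f x = tangent (f (Dual x 1))

-- diff (\x -> x * x + 3 * x) 2 == 7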

It turns out that the structure of those computations is that of continuations. The paper shows how to use Scala continuations to write code which can be backward-differentiated automatically, using the shift/reset continuation operators for its implementation.

Efficient differentiable programming in a functional array-processing language

This talk is almost the opposite of the previous one! It takes a forward differentiation approach for a functional language (a subset of F# with efficient array operations) and shows that a whole lot of techniques can be used to optimise the generated code (sub-expression elimination, fusion, loop fission, partial evaluation,…) and be on par with the best imperative libraries.

BobKonf summer edition

Relative Functional Reactive Programming

This is a very promising idea which I could see replacing actors with something better typed. The idea is to extend functional reactive programming, which uses events and behaviours across time, to events and behaviours across space and time.

Then messages become “perceptions” which travel across space and time (until they reach a node) and a “dialog” between a client and a server becomes a list of perceptions. Using this modelling + CRDTs the talk shows that we can develop a full peer-to-peer todo application.

A library is currently being implemented on top of the reflex framework in Haskell (to show that it works for real :-)) but I would like to see how this could be integrated with Ivan Perez’s “Monadic streams”: http://www.cs.nott.ac.uk/~psxip1/papers/2016-HaskellSymposium-Perez-Barenz-Nilsson-FRPRefactored-short.pdf. There are also a few hard questions to solve, which are the subject of the presenter’s thesis, for example how do you “garbage-collect” events which are not necessary anymore (the current approach assumes infinite memory!).

Statistical testing of software

Can we really measure software quality? Two development processes, Cleanroom software engineering and Software reliability engineering, give us some answers:

  • Cleanroom aims to totally prevent bugs, not fix them. You start from a formal specification and the team is split between developers refining the spec and testers writing tests just to measure the quality of the product. The developers cannot test, nor even access a syntax checker!
  • Software reliability engineering: describe and quantify the expected use of your system to spend most of your resources on the cases which really matter.

This gives us some ideas for doing statistical testing:

  • model the usage of a product
  • what is really called? How often?
  • how often are calls related to each other?

From that you can create QuickCheck/Hedgehog generators which can vary their behaviour based on some current state. This means that we are effectively working on Markov chains (a state diagram containing the probabilities of firing transitions). And there is a lot of literature on how to cover Markov chains, how often we get to a given state, and so on.
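
A tiny QuickCheck sketch of the idea (the actions and probabilities below are made up): the next generated action depends on the current one, which is exactly a Markov chain over the usage model.

import Test.QuickCheck

-- hypothetical user actions for a small session model
data Action = Login | Browse | Buy | Logout deriving (Show, Eq)

-- transition probabilities depend on the current action (a Markov chain)
nextAction :: Action -> Gen Action
nextAction Login  = frequency [(8, pure Browse), (2, pure Logout)]
nextAction Browse = frequency [(5, pure Browse), (3, pure Buy), (2, pure Logout)]
nextAction Buy    = frequency [(6, pure Browse), (4, pure Logout)]
nextAction Logout = pure Login

-- generate a usage session of n steps, starting from Login
session :: Int -> Gen [Action]
session n = go n Login
  where
    go 0 _ = pure []
    go k a = do
      a' <- nextAction a
      (a' :) <$> go (k - 1) a'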

Then we can measure reliability by counting the number of tests passing before the first failure. Even better we can measure how many times an edge in the Markov chain graph is failing, so that failing paths which are not called very often will not count for much.

Liquidate your assets

This is an example of using Liquid Haskell with a “Tick” monad to count the number of times a resource is being accessed. From there we can statically enforce that some algorithms have specific bounds and check properties such as how long it should take to sort an already sorted list. Niki Vazou also showed how you can more or less interactively write proofs using Liquid Haskell and a small proof library.
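
The Tick monad itself is essentially a cost-counting writer monad; here is a toy version of my own, without the Liquid Haskell refinement annotations that make the bounds statically checkable:

-- a result together with the number of "ticks" it took to compute it
data Tick a = Tick { cost :: Int, result :: a }

instance Functor Tick where
  fmap f (Tick c a) = Tick c (f a)

instance Applicative Tick where
  pure = Tick 0
  Tick c f <*> Tick d a = Tick (c + d) (f a)

instance Monad Tick where
  Tick c a >>= f = let Tick d b = f a in Tick (c + d) b

-- count one step, e.g. each time two elements are compared
step :: a -> Tick a
step = Tick 1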

A functional reboot for deep learning

Any talk by Conal Elliott is worth attending (cf. “The essence of automatic differentiation” or “Compiling to categories”). This time Conal wants to reboot machine learning by removing all non-essential complexity and generalizing things where they are unnecessarily specialized.

For now this is more a statement of intent than an accomplished vision, but he already has some of the bricks thanks to his previous work on automatic differentiation and his Haskell plugin for “compiling to categories”.

Conal argues that modelling neural networks as graphs is wrong. This hides the fact that we are effectively working on functions and forces a sequencing of computations (in “layers”) which doesn’t have to be there. We also have much better ways to represent the search space we are exploring than forcing everything into multi-dimensional arrays. In particular, representable functors offer a much better way to represent our data, while still being isomorphic to arrays, which permits efficient computing on GPUs.
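
A small Haskell illustration of the “representable functor” point (my own toy example, not Conal’s code): a container with a fixed shape is isomorphic to a function from its positions to values, and that function view is often the nicer one to compute with.

-- a fixed-shape container: exactly two elements
data Pair a = Pair a a deriving Show

-- its positions can be indexed by Bool, so Pair a is isomorphic to Bool -> a
index :: Pair a -> (Bool -> a)
index (Pair x _) False = x
index (Pair _ y) True  = y

tabulate :: (Bool -> a) -> Pair a
tabulate f = Pair (f False) (f True)

-- operations can be written on the function view and transported back
mapPair :: (a -> b) -> Pair a -> Pair b
mapPair f = tabulate . (f .) . index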

He is asking for help to pursue his vision, both from the Haskell community (there is still work to do on his plugin) and from the ML community (to better understand some statistical aspects and ways of thinking of ML problems).

ICFP Day 3

Call by need is clairvoyant call by value

It is surprising to me that there are still things to say about the lambda calculus. In particular here we address the difficulty of dealing with the operational semantics of call by need.

Because call by need memoizes the function arguments once they are evaluated, a proper description of its semantics needs to maintain a heap for those values. This makes it hard to predict the runtime behaviour of programs, as any Haskeller debugging a memory leak knows. It turns out that it gets much easier with an evaluation strategy that behaves as if you had evaluated only the arguments which will be used later on. This is a form of non-determinism: what if we knew the result in the future? I don’t fully understand the details but it seems that it is like programming in a monad evaluating a tree of possibilities, eventually being able to say which path should be taken. Ultimately this simplifies proving properties of call-by-need languages.

Lambda: the ultimate sublanguage

What if you taught functional programming using a much simpler language than Haskell? That language is System F-omega, which is effectively the core of Haskell.

By using only this language Jeremy Yallop is teaching an 8-week course, progressively introducing the lambda calculus, polymorphism, modules, existential types,… One of the assignments for example presents the Leibniz equality, “2 things are the same if you cannot distinguish them”, and the Church equality, “2 things are equal if they can always be reduced to the same thing” (I don’t think I got this right), and uses the calculus to show that they are the same notion of equality (one gives you the other).
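
For reference, Leibniz equality can be written down directly in Haskell too; this is a sketch of the usual encoding (the course itself works in its own System F-omega calculus, not Haskell):

{-# LANGUAGE RankNTypes #-}

-- Leibniz equality: two types are equal if one can be substituted for the
-- other in any context f.
newtype Equal a b = Equal (forall f. f a -> f b)

refl :: Equal a a
refl = Equal id

-- transitivity is just composition of substitutions
trans :: Equal a b -> Equal b c -> Equal a c
trans (Equal f) (Equal g) = Equal (g . f)

-- use a proof to rewrite an f a into an f b
subst :: Equal a b -> f a -> f b
subst (Equal f) = f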

Most participants report that the course was pleasingly challenging and gave them a thorough understanding of the field. You can find the course online here: https://www.cl.cam.ac.uk/teaching/1617/L28/materials.html

Workshops

MiniKanren tutorial

MiniKanren is a “relational programming” library with only 4 operations. It started in Scheme but has now been ported to 15 languages.

It can be put to many uses, from unification in type systems, to program synthesis (it can find quines!), to better querying pharmaceutical data. This should probably be my first thought if I ever stumble on a problem requiring difficult unification in the wild.

Modular effects in Haskell through effect polymorphism and explicit dictionary applications - A new approach and the μVeriFast verifier as a case study

Last year Dominique Devriese proposed an extension to Haskell called DictionaryApplications. With this extension you can locally pass an explicit dictionary where a type class instance is required. More importantly, the paper proves that it is possible to have this feature without compromising coherence and global uniqueness of instances. How? By behaving as if you were creating a “wrapper” newtype on the fly and coercing it back and forth with the data type for which you want an instance.

Ok, that gives you super-powers, but how useful is this in practice? In order to show the usefulness of this feature Dominique has taken an existing OCaml application which was actually using lots of effects lumped together (reader, writer, non-determinism, state, etc…) and has rewritten it using typeclasses and dictionary applications.

The resulting code is a lot clearer and quite flexible, since it allows the local creation and passing of instances which can rely on IO for their creation. For example you can pass a State s effect around (to computations needing MonadState s) which is effectively implemented with a mutable ref. The paper goes further and even abstracts over the creation of such a “mutable cell” by making it an effect in itself.
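
Without the extension, the pattern can be approximated today by passing an explicit record of operations; the proposal essentially lets the compiler do this wrapping for genuine type class constraints. The names below are mine, not the paper’s:

import Data.IORef

-- a hand-written "dictionary" for a state effect
data StateDict s m = StateDict { getS :: m s, putS :: s -> m () }

-- a dictionary built from a mutable reference
ioRefState :: IORef s -> StateDict s IO
ioRefState ref = StateDict (readIORef ref) (writeIORef ref)

-- a computation written against the dictionary instead of MonadState
increment :: Monad m => StateDict Int m -> m Int
increment d = do
  n <- getS d
  putS d (n + 1)
  pure n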

Haskell Implementors Workshop

Haskell use and abuse

Haskell was used on a secret project at X, the Google incubator. They had 3 months to prove the viability of a system containing both hardware and software (that looked a lot like accelerating machine learning to me).

The result was 800 files, 200,000 lines of code, and more than 70 extensions used. What worked, what did not work?

  • the good: lens library to index complex data structures, a good data model representing the hardware, an array library with statically-typed shapes, smart people attracted to the project

  • the bad: one person full-time to integrate the project into the rest of the infrastructure of Google, too many programming styles, too many DSLs that only a few team members understand, ugly type errors, ugly type signatures (because dependent typing is too weak)

In passing they showed a trick. How do you pass parameters to a pattern synonym in Haskell? Use implicit parameters! (and one more extension,…)

Configuration, but without CPP

How to deal with #ifdef in many Haskell libraries?

Problems: it is hard to test (we rarely test all the configurations), it’s text-based (tooling is out), and it’s dynamically scoped so it is hard to write correctly.

A quick analysis shows it is used for:

  • not configurable: ghc version, base version, platform
  • configurable: package dependencies, user defined predicates

How to replace the configurable part?

Solutions: there are already plenty of solutions to this problem (around 100 papers)! One promising idea is “variational typing”: describe in a type the different possibilities for an expression:

  e :: D <Int, Char>
  e = D <5, 'a'>

For example the syntax for having GHCJS specific code would look like:

  GHCJS< liftIO $ ..., return Nothing>

This type system encodes static choices about a program; a choice applies to the whole program. A few rules can be formalized:

  • distribution: D<A -> B, C -> E> = D<A, C> -> D<B, E>
  • idempotence: D<A, A> = A
  • domination: D<D<A, B>, C> = D<A, C> (if you choose the left side, always choose the left side)

In terms of performance, typechecking should still be fast (in particular because, practically speaking, few people nest #ifdefs deeply).

There are some variants of this idea for module systems, for example: “Variation aware module system” where import and exports can depend on the variation.

HIE files in GHC 8.8

hiedb indexes all the source files. It gives fast access to documentation, types, jump-to-definition (local or not), find-usages, etc… (and there’s a Haskell API).

Future work:

  • show typeclass evidence (generate that into .hie files and make it searchable)
  • integrate with .hi files to keep only one format
  • capture more type information in the typed AST (some leaves are not currently typed)

Explicit dictionary application: from theory to practice

With explicit dictionary application we can:

  • remove the need to have *By functions like nubBy
  • avoid the creation of And, Or, … newtypes to get different monoids with foldMap
  • use an IORef s to implement a MonadState s instance
  • replace DerivingVia, for example use a function semigroupFromMonoid to implement a Semigroup instance

Potential problems:

  • global uniqueness of instances: this should not be allowed

insert :: Ord a => a -> Set a -> Set a
insert @{reverseOrd} 1 (insert 1 (insert 2 empty))
-- produces a set whose internal order is broken: [1, 2, 1]

  • coherence: we should pick the same Eq instance whether it comes directly from the Eq constraint or through Ord’s superclass

whichEq :: (Eq a, Ord a) => a -> Bool
whichEq x = x == x

Previous work showed that the use of representational roles allows the type checker to exclude problematic programs. This year a plan for a better implementation is proposed, but that proposal needs some (wo)man-power to be completed.

So the talk is also a big ask for help!

State of GHC

There’s a lot of activity in GHC land (in no particular order, and missing many more):

  • 8.8 (Aug 19)
  • visible kind application: g :: t @Type a -> ...
  • pattern signatures to bring some types into scope: Just (x :: a) -> [x, x] :: [a]
  • around 70 new contributors each year (for the past 6 years)
  • 10 PRs per day
  • 1400 new issues per year (not sure if that’s a good sign!)
  • linear types are coming!
  • standalone kind signatures
  • hole-fit plugin
  • pattern matching with better overlap/coverage checks
  • better compile-time performance
  • visualisations of GHC core

More things cooking:

  • class morphisms: proposal coming up
  • GHCJS: plugins, more like normal GHC, profiling
  • 8.10: better codegen
  • NoFieldSelectors proposal

GHC proposals:

  • 56 proposals accepted, some of them still need to be implemented
  • more proposals are accepted than rejected, thanks to the good discussions that happen before a proposal is submitted
  • but it’s not always good. For example ExtraCommas: 166 messages on the thread, unclear outcome, unhappy people

GHC devops:

  • 5 releases since last year
  • everything moved to gitlab
  • build on Hadrian
  • 8.8.1 this week-end
  • 8.10.1 release next February (small release)
  • 8.12.1 branch in May 2020
  • more platform tests (e.g. Alpine Linux), integer-simple is better tested -> releasing core libraries is more of a challenge
  • a Hackage repository to check ~150 packages against GHC head (head.hackage)
  • compiler performance is now tested in CI

If Hadrian goes well we can deprecate the make-based build in 8.12.

AND THAT’S IT!
