# coinduction

This is a topic I need to page back in, so I thought I’d pick it up today, even though my thinking on the topic is linked both with *observational type theory* and with *local time for recursion and corecursion*. Am I writing these posts in a feasible order? Let’s find out.

Firstly, what is coinduction? It’s a process for *generating* tree-like structures in which *some* paths *may* be *infinite*. And it’s named by duality to induction, which is a process for *consuming* tree-like structures in which *all* paths *must* be *finite*. What’s a good example of a coinductive structure? Not bloody *streams*, that’s for sure: they’re like picking natural numbers as an example of inductive datatypes — disingenuously simple in a way that must be overcome to achieve learning. So let’s not do that. I prefer the labelled, possibly infinite binary trees. The Strathclyde Haskell Enhancement (or rather, a text editor) lets me write

codata Tree x = Tree {tree :: (x, Maybe (Tree x, Tree x))}

Exploring such a tree, you can always ask “What’s here?”, and you will be presented with some number (zero or two) possible ways to move deeper into the tree. We can build trees which always let us go on forever.

calculateWilfully :: Tree (Integer, Integer) calculateWilfully = grow (1,1) where grow x@(n,d) = Tree (x, Just (grow (n, n + d), grow (n + d, d)))

We can build trees which sometimes stop.

huntDiag :: (Integer, Integer) -> Tree (Integer, Integer) huntDiag x@(a, b) = Tree (x, if a == b then Nothing else Just (huntDiag (a - b, b), huntDiag (a, b - a)))

Your basic primitive construction tool for coinductive data is to *unfold* a *coalgebra*. Let’s see what that amounts to here.

unfoldTree :: (s -> (x, Maybe (s, s))) -> s -> Tree x unfoldTree f s = Tree (x, case m of Nothing -> Nothing Just (ls, rs) -> Just (unfoldTree f ls, unfoldTree f rs)) where (x, m) = f s

The coalgebra, f, tells us how to grow one layer of structure from a *seed*. In the place for each subtree, we get the seed from which it should be grown. It’s hopefully easy to see that my two examples, above, are given by unfolding coalgebras whose seeds are pairs of integers — the corecursive calls happen just where the corresponding coalgebra needs to put a seed.

Note that the projection function, tree, is a coalgebra…

unfoldTree tree :: (x, Maybe (Tree x, Tree x)) -> Tree x

…with the same type as the coconstructor, Tree. It’s not hard to see, informally, that the two will do essentially the same job, in a handwavy sort of a way. But how to be precise…?

Meanwhile, in jolly old dishonest Haskell, we can write all sorts of thrilling (in that “is this safe?” sort of way) programs which fool about with coinductive data. For this example, I need streams and trees.

codata Stream x = x :> Stream x

Note that I reserve ‘stream’ for the *definitely* infinite sequences: those which may (but don’t have to) stop are ‘colists’. The example I’m thinking of is good old Gibbons-and-Jones breadth-first labelling. The task is to take a Tree () and deliver a Tree Integer, labelled with 0 at the top, 1 and 2 the next layer down, and so on. Step one, boggle at this.

spiral :: (Stream a -> (b, Stream a)) -> a -> b spiral f a = b where (b, as) = f (a :> as)

I call these ‘Oxford’ programs, because that’s where a lot of them get written: they’re hilariously circular in a somewhat underexplained manner. In the case of spiral, the type falls a long way short of explaining which f’s can possibly achieve a useful outcome: they’re rather special. But now, the main deal.

label :: Tree () -> Tree Integer label t = spiral (help t) 0 where help :: Tree () -> Stream Integer -> (Tree Integer, Stream Integer) help (Tree ((), m)) (i :> is0) = case m of Nothing -> (Tree (i, Nothing), (i + 1) :> is0) Just (lt, rt) -> let (lt', is1) = help lt is0 (rt', is2) = help rt is1 in (Tree (i, Just (lt', rt')), (i + 1) :> is2)

What’s going on? The labelling task is easy if we know which number we need to kick off each layer. So just imagine we already know that: we give ourselves a stream of ‘initialisers’ — integers telling us which numbers to use down the left spine of the tree. The head initialiser labels the top node, and the tail tells us how to label any subtrees which might exist. We give back not only the tree, but also the stream of ‘leftovers’ — the successors of the labels used down the right spine of the tree, ready to be used to label the next layer down. The spiral function ties the knot, using the leftovers stream as the tail of the initialisers. The program is all about how the data are used spatially, regardless of what happens when. Haskell’s laziness ensures that if there is away to flatten the data dependency graph into a temporal order, some such order will be found, and in this case, more by luck than the typing judgment, we do indeed win. It’s not remotely obvious how to render label as an unfold, because the key information we use when growing the tree is *threaded*, not just distributed from the root. We don’t learn the leftover labels to kick off the right subtree ‘until’ we have built the left.

So, how on earth are we to make this sort of stuff make sense in an honest language? How are we to reason about it in a proof language? The state of the art is kind of painful in several different ways. The moment you say ‘productivity checking’, you’re dead. Functions like spiral show us that productivity is only the tip of the *continuity* iceberg: it’s not enough for spiral’s argument to be a function which is productive on streams; the function needs the rather particular property that it does not inspect the tail of its input in the course of computing the head of its output.

In, *Productive Corecursion via Guarded Recursion*, Bob and I used a modal operator to enforce such continuity properties in a toy language. How do we integrate such a technology with type theory as we know it? One thing’s for sure, none of the currently popular languages based on type theory has anything to say about continuity, with the honourable and significant exception of Andreas Abel’s “sized types”, delivered as an extension in Agda. Certainly, Andreas can write breadth-first tree-labelling, but can he write the generic spiral? How easy is it to negotiate a variety of continuity disciplines, locally? His work is certainly part of the solution.

The reasoning problem is, of course, a manifestation of our old bugbear, *equality*. In honest type theory, we can equip a coinductive type with construction by unfolding (e.g., unfoldTree) and destruction by projection (e.g., tree). We do not need to take coconstructors as primitive, because they are derivable. Agda gives you the projection, but replaces the formal unfold by recursion-with-productivity-checking. Agda is, moreover, especially careful to forbid *dependent* case analysis of coinductive data (is Idris?), effectively ensuring that nondependent projection is the only way to take things apart. Indeed, the new ‘copattern’ notation capitalises on the idea of projection-only access by making you present codata in terms of their projections: all the examples in the relevant publications use streams, which is lucky, because streams have no coproduct structure to make you regret giving up case analysis.

What’s wrong with dependent case analysis? Nothing. But it’s incompatible with rubbish intensional propositional equality — *that’s* what’s wrong. The thing is, unfolds compute lazily: they only actually unfold when you attempt to project from them. If Haskell had copatterns, I’d have written

unfoldTree :: (s -> (x, Maybe (s, s))) -> s -> Tree x tree (unfoldTree f s) = (x, case m of Nothing -> Nothing Just (ls, rs) -> Just (unfoldTree f ls, unfoldTree f rs)) where (x, m) = f s

because that explains when and how to run the program. And nobody used a coconstructor.

In particular, computation does not give us that

Tree (tree t) = t -- makes no demand, does not compute

but if we do take a primitive coconstructor, we can have that

tree (Tree ts) = ts -- is a demand, can compute

The dependent case analysis principle says that all codata are given by the coconstructor (whether that’s a primitive, or derived by unfolding the projection). Excuse the notation, which is Haskell but not as we

know it.

treeCase :: forall (c :: Tree x -> *) -> (pi (ts :: (x, Maybe (Tree x, Tree x))) -> c (Tree ts)) -> pi (t :: Tree x) -> c t treeCase c p (Tree ts) = p ts

Coq lets us have that. But now I can *prove* the above law which does not compute. By case analysis, it holds if

Tree (tree (Tree ts)) = Tree ts

and that is indeed the case by reflexivity, because the tree in the middle annihilates the inner Tree.

Pick any particular ts, and we get closed proof of an equation which does not hold by computation alone. Indeed, the proof will compute to reflexivity, but reflexivity does not typecheck as a proof of the equation, because the equation doesn’t hold by computation alone. Subject reduction fails. Consistency doesn’t. But it’s still disturbing that typability is unstable in that way. It’s also annoying that we pay the cost without getting the whole of the benefit: although that equation between codata is provable, plenty of other simulations are not.

The trouble, of course, is that intensionally, treeCase isn’t telling the whole truth: not every coinductive structure is *implemented* by coconstructor; rather, every coinductive structure can be simulated up to observation by one that’s implemented by coconstructor. If propositional equality captured not the intensional equality of codata, but rather their bisimilarity, we could derive treeCase given only the nondependent tree.

treeCase c p t = fixup (p (tree t)) where fixup :: c (Tree (tree t)) -> c t fixup -- uses substitutivity of equality, given simulation Tree (tree t) = t

In *Let’s See How Things Unfold*, an invited talk abstract which got rather out of hand, I sketched how to get equality up to bisimulation in the context of an observational type theory. A similar story ought to work in homotopy type theory. I started writing a journal version (which would have been fully refereed, unlike my talk abstract), but whizz went the deadline and spike went the draft.

I think we’re in a bit of a mess. We don’t have the right reasoning tools. We don’t have compositional coprogramming tools. It might be clear, in principle, that coinduction (not general bloody recursion) is the way to make software whose job is to persist and respond to user demands, but there’s a serious need to re-examine both the foundations and the pragmatics of the whole coinduction business, before we can make that principled observation a sustainable reality.

## Trackbacks