## The story

I’ve made it to the final stage of my computer science PhD, you know the one where you start looking for jobs, get yourself a copy of Cracking the Coding Interview (affiliate link), and realise that you haven’t done much of the kind of programming that about half of the companies expect you to do at their job interviews (and at their job interviews only).

At some point in the book, it says “know how to implement these data structures
by heart: dynamically sized arrays, hash tables, […], **binary heaps**,
[…]”. It dawned on me that I remember the heap property and the heap
interface, but not how to implement it. I was horrified when I remembered
despite conceptually being a tree, binary heaps are implemented using arrays.
Despite having used Haskell as my primary language, decided to implement it in
Ruby—my prior primary language. Some time and indexing errors later, I got it
working. Then ported it to Haskell’s using the `ST`

monad.
After writing `STRef`

one too many times, I got that working too, but
it left much to be desired. “Save the trees” yelled my terminal!

Finally, I consulted Dr Chris Okasaki’s Purely Functional Data Structures (affiliate link). A leftist heap is one of the first data structures discussed in the book. It has better worst-case asymptotic complexity than the binary heap, is represented as a tree, and doesn’t need mutation. Great! I was pleased to have a heap under my belt that was much easier to remember and much more difficult to get its implementation wrong.

Staring at it for a while (and being bored while trying to find various substrings with various properties in linear time and constant space under an hour over the phone), I got a burning desire to encode the titular properties of a leftist heap using fancy types. Having listened to hundreds of people complain about the state of type-level programming in Haskell, I found the process to be rough around the edges, but functional (see what I did there).

## The spiel

This brings us to the post at hand. This post is a bit long, but the upside is there is something for everybody. Hopefully, some of the following piques your interest:

- leftist heaps as a purely functional alternative to array-based binary heaps,
- complexity analysis of operations on leftist heaps,
- a case study on the internalist approach to verifying data structures,
- a tutorial on most major features of type-level programming in Haskell,
- a commentary on the ergonomics of verification using fancy types in Haskell,
- and practical advice on avoiding pitfalls when using fancy types.

Beginners beware! Type-level programming can be daunting. It certainly was for me for a long time. I’ll attempt to explain fancy types from scratch. If you find yourself getting confused, it’s almost certainly my fault. Just let me know (contact details on my homepage) and I’ll clarify the post.

## The itinerary

Here are the sections and what to expect from them.

- A simple heap covers the generic heap interface through a typeclass and a trivial instance for it. Type-level features: associated type families;
- A leftist heap describes a data type for leftist heaps without using fancy types and discusses the asymptotic complexities of its operations;
- Terms, types, and kinds covers the basic entities in modern Haskell and how they relate to each other. Type-level features: data type promotion, kind polymorphism, and levity polymorphism;
- Verifying the leftist property explains the data type encoding the leftist property and the implementation of its property preserving operations. Type-level features: generalised algebraic data types, singletons, and existential types through the heap instance. We also reinvent type-level natural numbers and prove theorems about them using typed holes;
- Verifying the heap property encodes both the leftist and the heap properties into a data type. Most of this section is on the property preserving merge. Type-level features: closed type families and propositional equality. Additionally, it extends on theorem proving and use of existential types;
- Simulating heap operations tests the
functional equivalence of the heap implementations in this post. We use
QuickCheck to simulate evaluation of
*arbitrary*sequence of insertions and deletions. Type-level features: visible type-applications, explicit foralls, and scoped type variables; - Conclusion acknowledges people who made this post possible and reminds some take-aways.

The exposition of the code is fragmented and out of order, but a well-organised
version of the **SOURCE
CODE**
exists. We won’t use any libraries except
QuickCheck.

# A simple heap

A heap is a (conceptually) tree-based data structure used to quickly access and
maintain access to the maximum or the minimum of a collection of values. It
satisfies the *heap property*, that is (for a maximum heap) the label of a node
is bigger than or equal to that of its children. The following typeclass
summarises its common operations.

```
class Heap heap where
{-# MINIMAL
isEmpty, empty,
(singleton | insert),
(fromList | (singleton, merge)),
(insert | (merge, singleton)),
(merge | (decompose, insert)),
(decompose | (findMax, deleteMax))
#-}
type Elem heap
-- Predicates
isEmpty :: heap -> Bool
-- Access
findMax :: heap -> Maybe (Elem heap)
= fmap fst <$> decompose
findMax
-- Creation
empty :: heap
singleton :: Elem heap -> heap
= (`insert` empty)
singleton
fromList :: [ Elem heap ] -> heap
= -- O(n) for leftist heaps
fromList xs case go (map singleton xs) of
-> heap
[ heap ] -> empty
[ ] -> error "Fatal error. Did not converge to a single heap."
_ where
= []
go [] = [ x ]
go [ x ] : y : rest) = go (merge x y : go rest)
go (x
-- Modification
insert :: Elem heap -> heap -> heap
= merge (singleton x)
insert x
merge :: heap -> heap -> heap
`merge` heap2 =
heap1 case decompose heap1 of
Just (heapMax, heapRest) -> heapRest `merge` (heapMax `insert` heap2)
Nothing -> heap2
decompose :: heap -> Maybe (Elem heap, heap)
=
decompose heap case (findMax heap, deleteMax heap) of
Just heapMax, Just heapRest) -> Just (heapMax, heapRest)
(Nothing , Nothing ) -> Nothing
(Just _ , Nothing ) -> error
("Impossible happened. There is a max but the heap is empty."
Nothing , Just _ ) -> error
("Impossible happened. Heap is non-empty but there is a max."
deleteMax :: heap -> Maybe heap
= fmap snd <$> decompose deleteMax
```

This is a bit mouthful because many operations are inter-definable as reflected
by the `MINIMAL`

pragma.

The `Elem`

*type family* (enabled by `TypeFamilies`

extension)
associated with `Heap`

gives the type of elements for a particular
instance. This is nothing but a function from types of containers to types of
their elements. We could have equally used `MultiParamTypeClasses`

and
`FunctionalDependencies`

extensions to establish the same container-element
relationship. I chose a type family here because we will use type families in a
moment anyway and because I think `Elem heap`

has less cognitive
overhead than remembering functional dependencies between type variables.

Although `insert`

, `findMax`

and `deleteMax`

are
the most commonly used operations of `Heap`

, `merge`

is the
one that we care the most about. For all data structures we’ll use as heaps
today, implementing `isEmpty`

, `findMax`

,
`singleton`

, and `empty`

are trivial. Then with
`merge`

, we can implement `insert`

, `fromList`

,
`decompose`

, and `deleteMax`

. As we see in the next section,
implementing `merge`

and deriving the rest is not only optimal in
terms of productivity but also in terms of performance for leftist
heaps.

Before implementing this interface for a leftist heap, let’s look at a much simpler instance.

```
instance Ord a => Heap [ a ] where
type Elem [ a ] = a
= null
isEmpty
= []
empty
= xs
fromList xs
= (:)
insert
= (<>)
merge
= Nothing
decompose [] = Just (heapMax, left ++ tail right)
decompose xs where
= maximum xs
heapMax = span (/= heapMax) xs (left, right)
```

This may be the easiest heap implementation. Insertion is \(O(1)\), merging is \(O(n)\), conversion from a list is \(O(1)\), and decomposing (and subsequently finding and deleting the maximum) is \(O(n)\). If it wasn’t for that last \(O(n)\), this would have been a perfectly fine heap implementation, alas here we are.

This implementation is obviously correct, thus any other correct heap
implementation should be *functionally equivalent* to it. This means performing
the same operations on two empty heaps of different implementations should
result in two heaps with the same maximum. Hence, this simple heap
implementation is perfect for testing other implementations’
correctness.

# A leftist heap

Since we’ll go through the trouble of implementing leftist heaps multiple times, let’s spend a second on comparing it to array-based binary heaps.

Why bother with the leftist heap? It is persistent (hence better suited for multi-threaded computation), both conceptually and implementation-wise a tree, and more resilient against off-by-one errors. Why bother with the array-based binary heap? Better average case complexity of insertions; its operations are in place; and it probably performs better in practice because of good locality of reference (this is a hunch and I’d like to be proven wrong).

We can also look at their complexities more concretely. Leftist heaps have \(O(\lg{n})\) worst-case complexity for insertion and deleting the maximum, while maintaining \(O(1)\) complexity for finding the maximum. Building a heap out of a collection is \(O(n)\). So far we’re on par with binary heaps. But we can do one better. While merging binary heaps is \(O(n)\), it’s only \(O(\lg{n})\) for leftist heaps. In fact, this is why insertion and deletion are \(O(\lg{n})\).

## The data structure and its properties

A leftist heap is as a tree and we implement it as such.

`data LeftistHeap a = Leaf | Node a Int (LeftistHeap a) (LeftistHeap a)`

The tree is standard except for the `Int`

parameter. This is the
*rank* of the `Node`

, which is the least distance to a
`Leaf`

. The rank of a `Leaf`

is 0 and the rank of a
`Node`

is one more than the minimum of its children’s ranks.

Let’s briefly look at the relationship between the size of a tree and its rank.

A first question is how many elements there needs to be in the tree if its rank
is \(R\)? If the rank of a tree is \(R\), then it must be the case that each path
from the root has \(R\) `Node`

s, otherwise the rank of the tree would be
fewer. This means the tree has at least \(2^{R} - 1\) elements.

Then the followup question is, if a tree has \(N\) elements, what is its maximum rank? Well, we know that the rank imposes a lower bound on the tree size, so conversely, the tree size should impose a maximum on the rank. If \(R\) is the maximum rank, we have \(2^{R} - 1 \leq N \lt 2^{R + 1} - 1\), so \(R \leq \lg{(N + 1)} < R + 1\). Hence, \(\left\lfloor{\lg{(N + 1)}} \right\rfloor\) is the desired maximum.

The leftist heap has the *leftist property*. In short, the shortest path from
any node to a `Leaf`

must be the right-most one. Since each subtree in
a leftist heap is also a leftist heap, the rank of any left child is at least as
big as that of the right, hence the name.

How can we refine the earlier calculation about the maximum rank for leftist
heaps? The distance between the root and the right-most `Leaf`

is at
most \(\left\lfloor{\lg{(N + 1)}} \right\rfloor\) if the leftist heap has \(N\)
elements in it. This is the critical information we’ll use to derive the
complexity of the `merge`

operation.

Accessing the rank is handy, so let’s create a typeclass for it.

```
class HasRank a where
type RankType a
rank :: a -> RankType a
instance HasRank (LeftistHeap a) where
type RankType (LeftistHeap a) = Int
Leaf = 0
rank Node _ r _ _) = r rank (
```

Here is the `Heap`

instance for the `LeftistHeap`

. The
`Ord`

constraint is for the heap property. The element of a
`LeftistHeap a`

is `a`

. Its operations are implemented over
the next two sections.

```
instance Ord a => Heap (LeftistHeap a) where
type Elem (LeftistHeap a) = a
```

## Merging two heaps

Let’s tackle the most important operation head-on.

```
Leaf heap = heap
merge Leaf = heap
merge heap @(Node x _ left1 right1)
merge h1@(Node y _ left2 right2) =
h2if x > y
then mkNode x left1 (merge right1 h2)
else mkNode y left2 (merge right2 h1)
where
mkNode :: a -> LeftistHeap a -> LeftistHeap a -> LeftistHeap a
=
mkNode a heap1 heap2 if rank heap1 <= rank heap2
then Node a (rank heap1 + 1) heap2 heap1
else Node a (rank heap2 + 1) heap1 heap2
```

The base cases are simple as `Leaf`

acts as the identity element for
`merge`

.

In the inductive case, we walk over the right-most paths of the input heaps. You can see this in the recursive calls; they never touch the left children.

To preserve the heap property, we recurse on the right child of the argument heap with the bigger label.

To build a new node, we use `mkNode`

helper rather than
`Node`

constructor directly. The helper does two things. First, it
makes the child with the lowest rank the right child. Since the arguments to
`mkNode`

are leftist heaps themselves, this flip ensures the
right-most path to `Leaf`

is still the shortest. Second, it calculates
the new rank which is one more than the rank of the right child.

Now what is the complexity of this? At each recursive call we potentially do a flip, increase the rank, and construct a tree node, these are all constant time operations. So the question is the number of recursive calls. If the leftist heaps being merged have \(L\) and \(R\) elements inside, we know their right-most paths are at most length \(\left\lfloor lg{(L + 1)}\right\rfloor\) and \(\left\lfloor lg{(R + 1)}\right\rfloor\) respectively. Hence, we at most do \(\left\lfloor\lg{(L + 1)} + \lg{(R + 1)}\right\rfloor\) calls. So the overall complexity is \(O(\lg{(L \times R)})\) which is a subset of \(O(\lg{(L + R)})\) (can you see why?). In short, the merge operation is logarithmic in the size of the output.

This is not where the beauty of `merge`

ends. Recall that most leftist
heap elements live outside the right-most path. Then since we only recurse over
the right-most path, we never touch the trees where most elements live. We just
move them around. In a purely functional language, this means the output tree
does not have to allocate new memory for those trees, it can just share them
with the input heaps.

## Every other operation

The remaining operations needed to satisfy the typeclass are as follows.

```
Leaf = True
isEmpty = False
isEmpty _
= Leaf
empty
= Node x 1 Leaf Leaf
singleton x
Leaf = Nothing
decompose Node x _ left right) = Just (x, merge left right) decompose (
```

From `merge`

follows everything else. Maximum is maintained at the
root, so accessing it is easy. The `decompose`

operation returns the
maximum along with the rest of the heap with the maximum removed by merging the
two children of the root. Insertion (the default implementation) creates a
singleton heap out of the inserted label and merges it into the heap.

Since `merge`

has logarithmic complexity, so does `insert`

and `deleteMax`

. Since we store the maximum at the root,
`findMax`

runs in constant time.

Conversion from a list is more interesting. The obvious implementation is to fold over the list of elements and insert them into the heap, this turns out not to be the most efficient way. If we instead turn each element into a singleton heap and repeatedly merge two heaps at a time (with multiple passes) until one heap is left, conversion happens in linear time. The following default implementation does exactly that.

```
fromList :: [ Elem heap ] -> heap
= -- O(n) for leftist heaps
fromList xs case go (map singleton xs) of
-> heap
[ heap ] -> empty
[ ] -> error "Impossible. Did not converge to a single heap."
_ where
= []
go [] = [ x ]
go [ x ] : y : rest) = go (merge x y : go rest) go (x
```

Why does this run in linear time? Assume for simplicity that there are \(2^R\) elements. Then in the first pass, we do \(2^{R-1}\) \(O(\lg{2})\) operations. In the next pass, we do \(2^{R-2}\) \(O(\lg{4})\), then \(2^{R-3}\) \(O(\lg{8})\) operations and so on. So the overall complexity is \(O(\sum^{R}_{i = 1}{(\lg{2^i}) 2^{R-i}}\,)\) which is \(O(\sum^{R}_{i = 1}{i \; 2^{R-i}}\,)\) and that is \(O(2^{R})\). That is the number of elements we started with, so conversion from a list is done in linear time.

# Terms, types, and kinds

Before doing verification with fancy types, we need to understand terms, types, and kinds. Here’s the gist: all terms have types, all types have kinds, and there is no distinction between types and kinds since GHC 8.0, but terms and types (for now) occupy different realms.

For example, just as you can say `42 :: Int`

, you can also say `Int :: Type`

and `Type :: Type`

(`*`

is a deprecated
synonym of `Type`

; import `Data.Kind`

for `Type`

).
We can read these as “`42`

is an `Int`

”, “`Int`

is
a `Type`

”, and “`Type`

is a `Type`

” (yup, not a
typo).

Just as you can use `:type`

or `:t`

learn the type of a term in `ghci`

, you can
use `:k`

or `:kind`

to learn the kind of a type.

We now look at types and kinds in more detail. It may be too much information to soak in at once, but the broad-strokes should be enough for this post. For a broader overview of the subject, see Diogo Castro’s amazing blog post. More generally, one can get away without an in-depth understanding of these and still be able to verify data structures. But then we’d be relying on GHC to yell at us when certain extensions are missing and not understand why we’re being yelled at.

## Proofs and contradictions

Famously, Ludwig Wittgenstein wasn’t terribly concerned about inconsistencies in mathematics as most were, including Alan Turing. They even have a direct exchange on this subject. Surprisingly, Haskell’s type system seems to agree more with Wittgenstein than with Turing.

If `Type :: Type`

makes you uncomfortable, you’re right, it is
problematic and it leads to Russel’s
paradox. This is one reason
people don’t like type-level programming in Haskell. It means as a proof system,
Haskell’s type system is inconsistent. What that means is that *we don’t have
the ability to tell the truth*. The expectation, due to the Curry-Howard
correspondance, is
that if we have a type corresponding to some logical statement, a term for that
type (if it exists) is a proof. Inconsistency means, we can have terms that are
not valid proofs of the statement, but satisfy the type checker. In particular,
`Type :: Type`

leads to such menace.

That said, since Haskell already has `let x = x in x`

,
`undefined`

, and `error "QED"`

satisfying types of
propositions, we didn’t have the ability to tell the truth to start with. Hence,
we are not worse off. At least, this is the argument in Prof. Stephanie
Weirich’s paper as
well as the GHC
documentation.

One might think existing flaws don’t justify adding new ways to break a system.
Ordinarily, that’s right, but contradictions are infectious. As soon as there is
a little crack, it is difficult to contain. So the marginal harm done by `Type :: Type`

is less than expected.

There are *paraconsistent
logics* limiting the harm
done by inconsistencies, but they are not employed in type systems as far as I
know.

To sum up, Haskell proofs are partial. If a term (proof) corresponding to a type (proposition) compiles, one of two things happened. The term is a valid proof or its evaluation will diverge. By contrast, Agda and Idris proofs are always terminating and are thus valid proofs as long as the type checker says so (up to compiler bugs). Hence, despite the syntactic similarity, you should have more faith in the latter.

## Why is `Type`

a misnomer?

The kind `Type`

has a very confusing name. It should really be named
`LiftedType`

. Let’s understand why.

It has two important features. The term `undefined`

(or \(\bot\) in
academic papers) is a valid term for any type with kind `Type`

. This
makes `Type`

the kind of *lifted* types. Consequently, all of these types are
*inhabited*.

The GHC manual (until recently) called `Type`

“the kind of types with
values”. This is not true. If we enable the `MagicHash`

extension and import
`GHC.Prim`

, we get access to unlifted types such as `Int#`

.
`Int#`

definitely has values as witnessed by `42# :: Int#`

,
but when we query `:k Int# :: Type`

, we get an error saying “Expecting
a lifted type, but `Int#`

is unlifted”. So there are inhabited types
without kind `Type`

.

It is also wrong to say that `Type`

is the kind of types that
definitely has inhabitants. Once again the kind of `Int#`

is `TYPE 'IntRep`

and `Int#`

is the only type of that kind. We
already know it has inhabitants. In fact, in a sense `TYPE 'IntRep`

is
superior because `data Void`

creates a type `Void :: Type`

where the only inhabitants are degenerate such as `error "Oops!"`

and
`undefined`

. Neither of these are proper values. `TYPE 'IntRep`

can claim to be *a* kind of types that has non-degenerate
inhabitants.

As a final piece of evidence about why `Type`

is a bad name, you can
consult `GHC.Types`

which defines the kind `Type`

as `TYPE 'LiftedRep`

. Even GHC admits that `Type`

is more specific
than what the name implies.

So `Type`

is a bad name because of non-`Type`

types! We’ve
already seen `Int#`

,
let’s find some more.

`Type`

constructors

`Maybe`

takes a `Type`

and returns a `Type`

. How
about `Either`

? It takes two `Type`

s and returns a
`Type`

. You can say they are type-level functions and you wouldn’t be
wrong, but we can be more specific. We can say that `Maybe`

and
`Either`

construct `Type`

s just like `(:)`

and
`[]`

at the term level.

Are `Maybe`

and `Either`

types themselves? They are types
but not `Type`

s. Asking `ghci`

reveals that `Maybe`

has
kind `Type -> Type`

and `Either`

has kind `Type -> Type -> Type`

.

`Type -> Type`

is not the same thing as `Type`

, but (here is
the confusing part) `Type -> Type`

has the kind `Type`

. Get
your head around that! If you can’t, that’s fine. The intuition is that types
and kinds are one and the same, then so are the function arrow `(->)`

and kind arrow `(->)`

. A more concrete explanation will follow once we
cover levity polymorphism. One implication of `Type -> Type :: Type`

is that, it is inhabited. For example, `id :: Type -> Type`

type checks.

We have `Type`

s; we have things that construct `Type`

s; and
we have unlifted types such as `Int#`

. What else?

## Data type promotion

So far, we’ve only seen inhabited types. `Int`

and `Int#`

are obviously examples, but `Type -> Type`

is also inhabited since
that too has kind `Type`

.

Emphasising inhabitation as a property implies that there must be some uninhabited kinds. In fact, these are the pillars of theorem proving and property encoding in Haskell.

Consider the following `List`

declaration.

`data List a = Nil | Cons a (List a)`

In vanilla Haskell, this generates a type `List`

and two data
constructors `Nil`

and `Cons`

.

```
List :: Type -> Type
Nil :: List a
Cons :: a -> List a -> List a
```

With the `DataKinds`

extension, you also get the following.

```
'Nil :: List a
'Cons :: a -> List a -> List a
```

Despite looking pretty similar, these are different beasts. Since there is no
distinction between types and kinds, the type constructor `List`

is
also a kind constructor. Then, `'Nil`

and `'Cons`

are type
constructors, but they are not `Type`

constructors, they are `List a`

constructors! All promoted types are automatically uninhabited. So
there is no term `t`

with `t :: 'Cons Int 'Nil`

.

This promotion feature alone spawns multiple reasons why people do not like fancy types in Haskell:

The

`'`

prefix of promoted type-constructors is optional, but terms and types are completely separate. So when I type`Nil`

, GHC figures out whether it is a term or a type constructor depending on the context. In the absence of`'`

, we need to disambiguate ourselves.The built-in list type

`[a]`

is automatically promoted. This means there is`[]`

, the equivalent of`Nil`

. There is`[]`

, the type and kind constructor equivalent to`List`

. Then there is`'[]`

, the type constructor equivalent to`'Nil`

. Remember that`'`

is optional. So when I use`[]`

, we don’t know, if it is the type constructor`List`

or the type constructor`Nil`

. A similar situation occurs with tuples, where the term and the type share similar syntax.

Note that this is the improved state of affairs. Kinds and types used to be
separated and there was also a separate kind `[]`

with sort (the
classification of kinds) `BOX`

.

Nevertheless, promoted types are a blessing. They act as indices to other data types and help encoding various properties at type level. We come back to this while introducing GADTs.

## Kind polymorphism

Just as there are polymorphic types such as `[a] -> [a]`

, there are
also polymorphic kinds. In fact, `'Cons`

has kind `a -> List a -> List a`

where `a`

is a kind variable. We can see this in `ghci`

.

The kind variable `a`

can be `Type`

,

```
> :k 'Cons Int
'Cons Int :: List Type -> List Type
```

Or it can be the kind of a type constructor such as `Type -> Type`

```
> :k 'Cons Maybe
'Cons Maybe :: List (Type -> Type) -> List (Type -> Type)'
```

We can also use a promoted kind such as `List a`

, which results in
another kind polymorphic type.

```
> :k 'Cons 'Nil
'Cons 'Nil :: List (List a) -> List (List a)
```

## Levity polymorphism

The distinction between types with and without inhabitants stand on solid ground in GHC and leads to beautiful generalisations over types that have inhabitants. We now explore that.

This section consolidates the previous discussions of type habitation. If you’re solely interested in verification, you can skip it.

What is the kind of the `Type`

constructor `List`

?

```
> :k List
Type -> Type
```

The return kind makes sense, it’s a `Type`

constructor after all, but
why the input kind `Type`

? Since `Cons`

’s first parameter
has type `a`

, constructing a term `Cons x xs`

necessitates a
term `x :: a`

, hence `a`

must be a type with kind
`Type`

.

Hopefully, my rant about `Type`

being a misnomer made you doubt the
last statement. What about `Int#`

? Since `Int#`

has
inhabitants, by the reasoning above `a`

can also be `Int#`

.
More generally, we want `a`

to be a type that has a runtime
representation.

You remember `TYPE`

? The kind that spawns `Type`

and `TYPE 'IntRep`

. Let see what kind it has.

```
> :k TYPE
TYPE :: RuntimeRep -> Type
```

Aha! `TYPE`

constructs things that have runtime representations. So we
want the type variable of `List`

to have kind `TYPE rep`

, so
that it ranges over everything that has a runtime representation. This idea of
abstracting over runtime representations is called *levity polymorphism*.

But why doesn’t GHC infer that as the kind of `a`

? Let’s try declaring
a levity polymorphic `List`

explicitly.

`> data List (a :: TYPE rep) = Nil | Cons a (List a)`

```
A levity-polymorphic type is not allowed here:
Type: a
Kind: TYPE rep
```

The reason this doesn’t work and why GHC defaults `a`

to
`Type`

is because if we want to create a data type, we need to know
its runtime representation in advance to lay down the data while generating
code. For example, `Int#`

requires 32 bits but `Int`

requires a pointer to a thunk, hence 64 bits. Unless you know how big the data
is you can’t generate the code (at least not without introducing runtime code
generation or indirection which defeats the purpose of unlifted types).

More generally, the paper introducing levity polymorphism has the following maxim for its usage:

Never move or store a levity-polymorphic value.

This rules out making a function as simple as `id`

levity polymorphic
because it moves values.

This raises the question, what can be levity polymorphic? The classic example is
`error`

. It has type `String -> a`

, so `a`

needs
to be runtime representable. It neither stores nor moves whatever `a`

is. Hence, it can be and is levity polymorphic in GHC:

`error :: forall (rep :: RuntimeRep) (a :: TYPE rep). String -> a`

You need `-fprint-explicit-runtime-reps`

flag and the `+v`

option to `:t`

to get
the signature.

```
> :set -fprint-explicit-runtime-reps
> :t +v error
```

Let’s look at something more interesting. What is the kind of the function type
constructor `(->)`

?

`(->) :: forall {r :: RuntimeRep} {s :: RuntimeRep}. TYPE r -> TYPE s -> Type`

This shows why `Type -> Type`

has inhabitants. It is because
`(->)`

is a `Type`

constructor. More importantly, it shows
that when we write a function between lifted and unlifted data types, we are in
fact using the same arrow rather than syntactic magic.

## Inhabitable out of uninhabitable

What is the kind of the following data type?

`data MyProxy a = MkMyProxy`

If we ask `ghci`

, we get `Type -> Type`

again. However, this time
`a`

does not appear as a type parameter to the sole constructor of
`MyProxy`

, so there is no reason for it to have a runtime
representation. In principle, the type argument to `MyProxy`

can be
*anything*. This sounds kind polymorphic.

GHC, by default, assumes that the type variables of a type constructor have the
kind `Type`

even if they can be more generic. If you turn on the
`PolyKinds`

extension, GHC correctly infers the kind `k -> Type`

to
`Proxy`

, where `k`

is a kind variable.

This is nice because it is general, but also unmotivated at the moment because we haven’t yet made any use of poly-kindedness. Later, we define a poly-kinded equality type illustrating the utility of kind polymorphism.

## Summary

Haskell is slowly evolving into a practical language that unifies terms and types. We are not quite there yet and the gradual transition creates some interesting and tough-to-wrap-your-head-around language concepts. This section gives a bird’s-eye view of these concepts that we shall use as building blocks of useful type-level programming.

# Verifying the leftist property

Let’s encode the leftist property in a data type. That is, we will ensure that
each the rank of each right child of a node is less than or equal to the rank of
its left child. This necessitates access to ranks at the type-level. Previously,
we used `Int`

for ranks, but ranks are really just natural numbers.

We have two (main) options for type-level naturals:

- importing
`GHC.TypeLits`

, which uses compiler magic to define a`Nat`

kind - or defining a
`Nat`

kind inductively from scratch.

The advantage of (1) is it is efficient and we get to use numeric literals such
as `42`

. The advantage of (2) is that it is not compiler magic and we
get to see how theorem proving works in action. Hence, we’ll do (2).

If you reproduce this implementation using (1), you should probably use the
`singletons`

library to fake
dependent types and the fantastic GHC type-checker plugin
`ghc-typelits-natnormalise`

to use arithmetic properties of natural numbers. Because type-level naturals are
not inductively defined, we can’t do the kind of proofs that dependently-typed
languages are good at. Thus, we rely on external solvers.

Here’s the plan: reinvent natural numbers and \(\leq\); use those to define a data
type that makes non-leftist heaps illegal; attempt and fail to define
`merge`

; go prove some properties about natural numbers; and finally
succeed at implementing `merge`

. Ready? It will be fun; I promise.

## Natural numbers

We need the type-level natural numbers and \(\leq\) relation between them. Let’s start with naturals.

`data Nat = Z | S Nat deriving (Eq, Show)`

This gives us a type and a kind `Nat`

, data constructors `Z :: Nat`

and `S :: Nat -> Nat`

, and type constructors `'Z :: Nat -> Nat`

and `'S :: Nat -> Nat`

.

### Generalised algebraic data types

Type-level naturals were pretty easy. Next, we need to define the \(\leq\)
relation using generalised algebraic data types (GADTs) enabled via `GADTs`

extension. However, since \(\leq\) is a mean first example for GADTs, we start
with natural numbers that remember whether they are zero or not at the type
level.

GADTs provide an alternative syntax for declaring data types and the ability to
discriminate types based on constructors. The `AnotherNat`

data type
in GADT syntax below is exactly the same as `Nat`

.

```
data AnotherNat :: Type where
AZ :: AnotherNat
AS :: AnotherNat -> AnotherNat
```

The (boring) return types of constructors are now explicit. GADTs shine when the constructor choice affects the return type. Consider a data type for natural numbers encoding zeroness of a natural.

```
data Zeroness = Zero | NonZero
data TaggedNat :: Zeroness -> Type where
TZ :: TaggedNat 'Zero
TS :: TaggedNat a -> TaggedNat 'NonZero
```

This says if a term is constructed using `TZ`

, we have a
`'Zero`

natural. But if it is constructed with `TS`

instead,
regardless the natural number being succeeded, we have a `'NonZero`

natural number.

With this we can write a total function a `div :: TaggedNat a -> TaggedNat 'NonZero -> TaggedNat b`

and the compiler will complain every time we
try to divide by zero.

Now if we implement `div`

, it is tempting to be a good developer and
write the following case.

`div _ TZ = error "Impossible! The compiler ensures it!"`

Doing so prompts GHC to kindly inform us that this case cannot occur and thus is not needed. In fact, GHC wouldn’t even bother generating the code to raise the infamous “non-exhaustive patterns” exception because it knows the type checker wouldn’t let such an offence reach code generation. So not only do GADTs improve safety, but also, other things equal, lead to less code and consequently improve efficiency.

### Less than or equal to relation

Let’s finally define \(\leq\).The `TypeOperators`

extension is needed to use infix
operators at the type level.

```
data (<=) :: Nat -> Nat -> Type where
Base :: 'Z <= 'Z
Single :: n <= m -> n <= 'S m
Double :: n <= m -> 'S n <= 'S m
```

This defines a relation between two `Nat`

s. It records a series of
steps that gets us to the desired \(n \leq m\) starting from an indisputable fact.
The record of these steps is a proof of membership to this relation.

The indisputable fact is \(\vdash_{\mathit{PA}} 0 \leq 0\) (\(\vdash_{\mathit{PA}}\)
means the statement is provable with the Peano
axioms which constitutes the
everyday theory of arithmetic) encoded by the `Base`

constructor.
Then by applying a series of `Single`

s and `Double`

s, we try
to produce the desired inequality \(n \leq m\). These constructors encode the
following statements: \(\vdash_{\mathit{PA}} n \leq m \implies n \leq m + 1\), and
\(\vdash_{\mathit{PA}} n \leq m \implies n + 1 \leq m + 1\). Hopefully, neither
are controversial as our verification claims hinge on these being sound.

For example, to establish \(1 \leq 2\) we need to give a term of type `'S 'Z <= 'S ('S 'Z)`

.

```
oneLEQtwo :: 'S 'Z <= 'S ('S 'Z)
= Single $ Double $ Base oneLEQtwo
```

This proof can be read as “from \(0 \leq 0\), we can get to \(1 \leq 1\) by incrementing both sides; and from there, we can get to \(1 \leq 2\) by incrementing the right side.”.

Besides soundness, we also care about *completeness*. Is it the case that by
applying `Single`

s and `Double`

s to `Base`

, we can get to
all valid \(n \leq m\)? Yes, but we won’t prove it in this post. In fact there are
multitude of ways of proving a valid \(n \leq m\). As an example, let’s look
at another proof of `oneLEQtwo`

.

```
oneLEQtwo' :: 'S 'Z <= 'S ('S 'Z)
= Double $ Single $ Base oneLEQtwo'
```

You might have noticed that, the order of `Single`

s and
`Double`

s doesn’t matter. Then, a proof of \(n \leq m\) always starts
with `Base`

, and is followed by \(m - n\) `Single`

s and \(n\)
`Double`

s in any order.

### Singletons: faking dependent types

We can use the insight for reaching a valid \(n \leq m\) to recover \(n\) and \(m\) given an inequality. What would be the type of a function doing that? We could try the following.

```
recoverAttempt :: n <= m -> (n,m)
= undefined recoverAttempt
```

This doesn’t work because \(n\) and \(m\) have kind `Nat`

but the
`(,)`

type constructor has the kind signature `Type -> Type -> Type`

. So is it just the case that `(,)`

is not the right
kind of container? The problem runs deeper. Eventually we want access to
`n`

and `m`

at runtime, this implies there should be some
terms `t :: n`

and `r :: m`

. We know this won’t work because
`n`

and `m`

have kind `Nat`

and types of that kind
don’t have inhabitants.

In a dependently-typed language this wouldn’t be an issue because there is no distinction between terms and types, so every entity, let it be term or type, can survive compilation.

Sadly, Haskell is not there yet, so we need to fake it using *singletons*. The
idea is to create an indexed data type, so that there is exactly one term for
each indexing type. That is all a bit vague, let’s just do it for naturals.

```
data SNat :: Nat -> Type where
SZ :: SNat 'Z
SS :: SNat n -> SNat ('S n)
```

You see while the type `'Z`

has kind `Nat`

, the type `SNat 'Z`

has kind `Type`

and it has exactly one inhabitant:
`SZ`

. This correspondence is true for all naturals. Hence, we can use
the singleton type `SNat n :: Type`

as the term-level representative
for `n :: Nat`

.

Now, we can write the recovery function.

```
recover :: n <= m -> (SNat n, SNat m)
Base = (SZ, SZ)
recover Single nLEQsm) | (x,y) <- recover nLEQsm = ( x, SS y)
recover (Double nLEQm) | (x,y) <- recover nLEQm = (SS x, SS y) recover (
```

This function uses the inductive structure of `n <= m`

. We know from
their types that, `Single`

increments the right side of `<=`

and `Double`

increments both sides. We just turn them to explicit
increments to recover the singletons for `n`

and `m`

.

In fact, we can be sure that our implementation is correct. The type is so
specific that unless we use a degenerate term like `undefined`

, there
is simply no way of getting a buggy implementation to type check.

## Rank encoded leftist heaps

We have everything needed to encode the leftist property at the type level.
Since the leftist property involves comparing ranks of subheaps, we need access
to rank at type level. `Rank`

does that using a an `SNat`

.
We also define a helper to increment the rank for later use.

```
newtype Rank n = Rank { _unRank :: SNat n }
inc :: Rank rank -> Rank ('S rank)
Rank snat) = Rank (SS snat) inc (
```

Since heaps need to be indexed by their rank, we use a GADT.

```
data SafeHeap :: Nat -> Type -> Type where
Leaf' :: SafeHeap 'Z a
Node' :: a -> Rank ('S m) -- Node' data
-> SafeHeap n a -> SafeHeap m a -- Children
-> m <= n -- Leftist invariant
-> SafeHeap ('S m) a
```

This data type hopefully doesn’t look scary anymore. The `Leaf'`

constructor creates a `SafeHeap`

of rank 0. The `Node'`

constructor grows the heap only when we can show that the rank of the right
subheap is less than or equal to that of the left subheap. Further, the
resulting heap has rank one more than that of the right subheap.

What did we just do? We created a data type whose inhabitants are either vacuous or that it is a tree satisfying the leftist property. Let’s try some examples.

```
heap1 :: SafeHeap ('S 'Z) Int
= Leaf' heap1
```

This fails because the `Leaf'`

forces the rank to be `'Z`

instead of `'S 'Z`

as required by the signature.

```
heap2 :: SafeHeap ('S 'Z) Int
= Node' 42 ('SS 'SZ) Leaf' Leaf' Base heap2
```

This type checks because `Leaf'`

s has rank `'Z`

and
`Base`

proves `'Z <= 'Z`

.

```
heap3 :: SafeHeap ('S 'Z) Int
= Node' 42 ('SS 'SZ) heap2' Leaf' (Single Base) heap3
```

This also type checks because the right child has a lower rank (`'Z`

)
than the left child (`'S 'Z`

) and `Single Base`

proves `'Z <= 'S 'Z`

.

```
heap4 :: SafeHeap ('S 'Z) Int
= Node' 42 ('SS 'SZ) Leaf' heap2' ??? heap4
```

Unless we replace `???`

with a degenerate term, we won’t be able to
find a proof for `'S 'Z <= 'Z`

on the account of its being false. This
is precisely how the data type prevents us from violating the leftist property.

We just made terms that violate the leftist property illegal. Pretty cool, huh?

## Heap instance for SafeHeap

Making property violating terms illegal is one thing, defining operations on legal ones another.

The `Heap`

instance for `LeftistHeap`

was directly on that
data type. Consequently, the signatures of heap operations all involved
`LeftistHeap`

. This direct approach tends to fail with
property-encoding fancy types.

Say we tried to make `SafeHeap rank a`

an instance of
`Heap`

.

```
instance Ord a => Heap (SafeHeap rank a) where
type Elem (SafeHeap rank a) = a
```

We’re already in a bad place. This forces the type of `merge`

to be
`SafeHeap rank a -> SafeHeap rank a -> SafeHeap rank a`

. This type is
too restrictive! We want to be able to merge heaps of different ranks and to
produce a heap of rank not identical to the input heaps.

Let’s say that we gave up on the typeclass and decided to define all the
operations at the top level. Then we could give `merge`

the type
`SafeHeap rank1 a -> SafeHeap rank2 a -> SafeHeap (Fx rank1 rank2) a`

.
Here `Fx`

is some type-level function. This allows inputting heaps of
different ranks. We still need to figure out the rank of the output, but that
*depends* on the input heaps in their entirety not just their ranks. The word
“depend” is a red flag. Do we need to create singletons for
`SafeHeap`

s as well? This line of thinking will lead to too much
work.

Time to take a deep breath and think. What was our original goal? It was to
preserve the leftist property. Does that require knowing the effects of
operations on the rank of the heap at type level? No, not really. For `merge`

,
we want to input two `SafeHeap`

s of *some* rank and produce a
`SafeHeap`

of *some* rank. The fact that the output is a
`SafeHeap`

is enough to ensure the leftist property is preserved,
which is our goal.

So we want to perform the operations a data type that is indifferent to the rank
at the type-level just as the untyped version was. Existential types enabled via
`ExistentialQuantification`

can achieve this.

`data SomeSafeHeap label = forall rank. SSH (SafeHeap rank label)`

Despite writing `forall`

, what we mean is “within `SomeSafeHeap a`

,
**there exists** a `rank`

such that `SafeHeap rank a`

”.
Hence, the name existential types. Let’s try the `Heap`

instance
again.

```
instance Ord label => Heap (SomeSafeHeap label) where
type Elem (SomeSafeHeap label) = label
```

Now, `merge`

has type `SomeSafeHeap a -> SomeSafeHeap a -> SomeSafeHeap a`

which makes no assertions about ranks. Yet its inputs
and output contain `SafeHeap`

s and thus satisfy the leftist property.

**The key take away:** if you use fancy types, reach for the existential as soon
as possible.

This is not to say that you should never implement operations relating fancy types of inputs and outputs. If you succeed, it will give you more static guarantees! The question is whether they justify the effort.

The `singleton`

function for `SomeSafeHeap a`

is a trivial
example of this. We know that the singleton heap should have rank 1. Since rank
information is hidden behind an existential, there is nothing preventing us from
defining `singleton x`

to be `empty`

.

`= empty singleton x `

This type checks just fine. One way to improve the situation is to extract the
`SafeHeap ('S 'Z) a`

into its own definition and reduce the likelihood
of such a mistake.

```
= SSH singleton'
singleton x where
singleton' :: SafeHeap ('S 'Z) a
= Node' x (Rank (SS SZ)) Leaf' Leaf' Base singleton'
```

This is less formal verification and more trying to squeeze every bit of safety through self-discipline, but it did help me catch bugs more than once.

## Merging `SafeHeap`

s

Here’s the partial code for the `merge`

operation on safe heaps.

```
SSH Leaf') heap = heap
merge (SSH Leaf') = heap
merge heap (@(SSH (Node' x _ left1 right1 _))
merge heap1@(SSH (Node' y _ left2 right2 _)) =
heap2if x > y
then mkNode x (SSH left1) (merge (SSH right1) heap2)
else mkNode y (SSH left2) (merge (SSH right2) heap1)
where
mkNode :: a -> SomeSafeHeap a -> SomeSafeHeap a -> SomeSafeHeap a
SSH h1) (SSH h2) = _ mkNode a (
```

This bit is almost identical to the unverified version except for some
unwrapping and wrapping with `SSH`

. This makes sense because as we
pointed in the unverified version, `mkNode`

is where the leftist
property is preserved by placing the heap with the smaller rank to the right.

### Comparing without forgetting

In the unverified leftist heap’s `mkNode`

, the term-level
`(<=)`

operator decided on which child is placed to the right. It
seems all we need is to provide an analogous operator for `SNat`

s.
However, we now also need a proof that the rank of the right child is less than
the rank of the left child. Sadly, `(<=)`

determines the desired
property and forgets it immediately by returning a `Bool`

.

What we want here is *connexity*, that is given any \(n\) and \(m\), either \(n \leq m\) or \(m \leq n\). This holds for total
orders including \(\leq\) on natural
numbers. We can express this in Haskell easily.

```
lemConnexity :: SNat n -> SNat m -> Either (n <= m) (m <= n)
SZ y = Left (lemZLEQAll y)
lemConnexity SZ = Right (lemZLEQAll x) lemConnexity x
```

The base cases are simple, we need to show \(0 \leq m\) and \(0 \leq n\). A lemma
for \(0 \leq x\) for an arbitrary \(x\) would handle both cases. Let’s suppose
`lemZLEQAll :: SNat n -> Z' <= n`

for the moment.

This make-believe with lemmas is how top-down proofs work. To focus on the proof at hand, we postulate reasonable looking statements.

The inductive case is also simple and is a good opportunity to demonstrate doing
proofs with *typed holes*.

`SS x) (SS y) = _ lemConnexity (`

The `_`

symbol is a typed hole. It causes an error stating what type
of term needs to go in `_`

and the types of terms within the scope.
Incremental development with typed holes is helpful with fancy types because
when we pattern match on a GADT, the goal and the argument types automatically
get refined (especially if type-level functions are involved, more on that
later). Without the compiler telling you, it’s difficult to
keep track of the refined types in different branches.

Since `SNat`

is a GADT, pattern matching on `SS`

refines the
type of `_`

. The goal becomes `Either ('S n1 <= 'S n2) ('S n2 <= 'S n1)`

and the arguments become `x :: SNat n1`

and `y :: SNat n2`

. If we had `Either (n1 <= n2) (n2 <= n1)`

, we could probably make progress. Recursively calling
`lemConnexity`

gives us that.

```
SS x) (SS y) =
lemConnexity (case lemConnexity x y of
Left xLEQy -> _
Right yLEQx -> _
```

Now we have two typed holes. We need to build terms of `Either ('S n1 <= 'S n2) ('S n2 <= 'S n1)`

from `xLEQy :: n1 <= n2`

and from `yLEQX :: n2 <= n1`

independently. The `Double`

constructor
produces a new inequality from an existing one by incrementing both sides.
Combined with `Left`

or `Right`

depending on the case, we
reach the desired terms.

```
SS x) (SS y) =
lemConnexity (case lemConnexity x y of
Left xLEQy -> Left (Double xLEQy)
Right yLEQx -> Right (Double yLEQx)
```

We are now almost done with `lemConnexity`

. Earlier we postulated
`lemZLEQAll`

. That still needs proving.

```
lemZLEQAll :: SNat n -> 'Z <= n
SZ = Base
lemZLEQAll SS x) = Single (lemZLEQAll x) lemZLEQAll (
```

That’s it. We just proved some order theoretic properties.

The ergonomics of this process are lacking. Errors are not a productive way to interact with typed holes. Inspecting terms and their types available at a given hole without clutter would be useful, but GHC spits 50 lines per hole and very little of that is helpful. There isn’t much help with crafting the proof either.

By contrast, Idris and Agda provide editor integration to quickly inspect the context of holes, to search and fill in terms that satisfy the hole, and to refine the whole when given a partial expression.

Haskell has *valid hole fits* which are term suggestions for holes. They are
also verbose (but can be made less so with `-funclutter-valid-hole-fits`

flag)
and, in my experience, not very helpful because imported libraries often include
degenerate terms with polymorphic return types (like `error`

). Hence
valid hole fits always include them but are not what we want. For example, in
the source code of this project I’ve imported QuickCheck from early on. It
exports `discard :: forall a. a`

which gets suggested as a valid hole
fit to every single hole.

That said, with some discipline about making the lemmas small and properties simple, we can go a long distance as long as we are not trying to mechanise an entire field of mathematics.

If you are interested in this style of *type-driven development*, there is a
(w)hole book (affiliate link) on it by the Idris
language creator Dr Edwin Brady.

### Making nodes with proofs

By harnessing `lemConnexity`

, implementing `mkNode`

is a breeze.

```
mkNode :: a -> SomeSafeHeap a -> SomeSafeHeap a -> SomeSafeHeap a
SSH h1) (SSH h2) =
mkNode a (case lemConnexity (_unRank . rank $ h1) (_unRank . rank $ h2) of
Left r1LEQr2 -> SSH $ Node' a (inc $ rank h1) h2 h1 r1LEQr2
Right r2LEQr1 -> SSH $ Node' a (inc $ rank h2) h1 h2 r2LEQr1
```

The lemma tells us which heap has the lower rank (hence needs to be the right
child) as well as giving us a proof for it which is all that is needed to
construct a `Node'`

.

# Verifying the heap property

Let’s do it one last time with both the heap and the leftist properties.

So what is the heap property mathematically? For a node \(n\) and its children \(l\) and \(r\), it is the case that \(\mathit{label}(l) \leq \mathit{label}(n)\) and \(\mathit{label}(r) \leq \mathit{label}(n)\).

We now need labels in types as well as ranks. So far we used arbitrary
`Type`

s for labels. As a simplification, we decree that they must
effectively be `SNat`

s, so that we can build on our existing theory of
naturals.

## Rank and label encoded leftist heaps

To avoid confusion with ranks which also use `SNat`

s, we wrap the
label `SNat`

s.

`newtype Label n = Label { _unLabel :: SNat n }`

We don’t need anything else to declare the data type that makes non-heaps illegal.

```
data SaferHeap :: Nat -> Nat -> Type where
Leaf'' :: SaferHeap 'Z 'Z
Node'' :: Label a -> Rank ('S m) -- Node' data
-> SaferHeap n b -> SaferHeap m c -- Children
-> m <= n -- Leftist property
-> b <= a -> c <= a -- Heap property
-> SaferHeap ('S m) a
```

`SaferHeap`

looks a lot like `SafeHeap`

, except for
`Node''`

’s fancy-typed label and two inequalities to maintain the heap
property.

`Leaf''`

carries a label `'Z`

because every
`SaferHeap`

needs a type-level label. Using `'Z`

is a hack.
It is the least natural number, hence it can’t inhibit a `Node''`

construction whatever its label may be.

There are two immediate alternatives to this:

Using

`Maybe Nat`

rather than`Nat`

for the label. This requires modifying the heap property so that rather than`b <= a`

, we’d need “given`b`

is`'Just b'`

,`b' <= a`

” and similarly for`c <= a`

.Using

`Maybe Nat`

again, but instead of changing the heap property, we create three`Node''`

like constructors: one with both children having`'Just`

labels, one with only the left child having a`'Just`

label, and one with neither having`'Just`

labels (why don’t we need the fourth case?). This way, the heap property remains simple.

These may be cleaner than exploiting `'Z`

being the least element in a total
order. Both, however, complicate every function that needs to scrutinise a
`SaferHeap`

.

The next step is to wrap the data type in an existential just like last time.

`data SomeSaferHeap = forall rank label. SSH' (SaferHeap rank label)`

At this point, if we construct some `SaferHeap`

s, we’d get bored
quickly due to constructing tedious explicit proofs. Luckily we use existentials
and `Heap`

to interact with the data type.

## Heap instance for SaferHeap

Just as before the instance is for the existentially wrapped type.

```
instance Heap SomeSaferHeap where
type Elem SomeSaferHeap = Nat
```

`Elem`

for `SomeSaferHeap`

is interesting because we don’t
actually store `Nat`

s anywhere. So `insert`

requires a
conversion from `SNat`

to `Nat`

and `findMax`

from
`Nat`

to `SNat`

.

The `SNat`

to `Nat`

direction is easy.

```
demote :: SNat n -> Nat
SZ = Z
demote SS n) = S (demote n) demote (
```

But the opposite direction would have a signature `Nat -> SNat n`

.
Let’s try to write that.

```
promoteAttempt :: Nat -> SNat n
Z = SZ
promoteAttempt S n) | snat <- promoteAttempt n = SS snat promoteAttempt (
```

Well, this doesn’t type check because one branch returns `SNat 'Z`

and
the other `SNat ('S m)`

for some `m`

. Neither `'Z`

nor `'S`

unifies with `n`

in the signature.

This might initially sound like a bad reason because when recovering
`SNat`

s out of a `(<=)`

, the function signature is `n <= m -> (SNat n, SNat m)`

, so it seems the same unification problem should
occur there as well. The difference is that because the argument also contains
`n`

and `m`

and that `(<=)`

is a GADT, pattern
matching *locally* refines what `n`

and `m`

are. However,
pattern matching on `Nat`

in `promoteAttempt`

does not
refine the return type.

Since heap operations are performed on existentially wrapped types, we only need
the type-level `n`

in the function body and not in the type signature
for the operation. So we do not really care what `n`

is going to be so
long as there is *some* `n`

that we can embed in the heap. This
sounds like an existential. So the `promote`

returns an existentially
wrapped `SNat`

.

```
data SomeNat = forall n. SomeNat (SNat n)
promote :: Nat -> SomeNat
Z = SomeNat SZ
promote S n) | SomeNat snat <- promote n = SomeNat $ SS snat promote (
```

A good exercise is to implement `singleton`

for `SomeSaferHeap`

. It is
similar to that for `SomeSafeHeap`

, except for the use
`promote`

and the evidence for the heap property. See the source code
for the answer.

## Merging `SaferHeap`

s

The merge operation is, once again, all we care about. The overall structure is going to be the same, but we’ll need more lemmas and plumbing.

### Making nodes

This time let’s start from `mkNode`

. Here’s an attempt.

```
mkNode :: Label c
-> SomeSaferHeap -> SomeSaferHeap
-> SomeSaferHeap
SSH hA) (SSH hB)
mkNode vc (| rA <- rank hA, rB <- rank hB =
case lemConnexity (_unRank rA) (_unRank rB) of
Left arLEQbr -> SSH $ Node'' vc (inc rA) hB hA arLEQbr ??? ???
Right brLEQar -> SSH $ Node'' vc (inc rB) hA hB brLEQar ??? ???
```

This attempt fails due to a lack of evidence for the heap property.

We have the parent and the children labels, so we could decide on the evidence.
But this requires us to handle the case of `vc`

being smaller than one
of the children labels. This should be established in the body of
`merge`

if it is anything like the previous version. Let’s assume
`merge`

indeed passes down the evidence.

```
mkNode :: Label c
-> SomeSaferHeap -> SomeSaferHeap
-> a <= c -> b <= c
-> SomeSaferHeap
SSH hA) (SSH hB) aLEQc bLEQc
mkNode vc (| rA <- rank hA, rB <- rank hB =
case lemConnexity (_unRank rA) (_unRank rB) of
Left arLEQbr -> SSH $ Node'' vc (inc rA) hB hA arLEQbr bLEQc aLEQc
Right brLEQar -> SSH $ Node'' vc (inc rB) hA hB brLEQar aLEQc bLEQc
```

This doesn’t work either because we have an `a <= c`

for the first
`SomeSaferHeap`

, but that type hides `a`

. As far as the type
checker is concerned the rank of `hA`

has nothing to do with `Rank a`

.

It seems we’re hiding too much. Since `mkNode`

is not part of the
`Heap`

typeclass, perhaps we can use `SaferHeap`

instead of
`SomeSaferHeap`

in the signature of `mkNode`

.

```
mkNode :: Label c
-> SaferHeap r1 a -> SaferHeap r2 b
-> a <= c -> b <= c
-> SaferHeap ??? c
```

This type signature relates the input heaps to the evidence, but it also
requires handling ranks. There are three viable choices for `???`

:
`r3`

, `'S r1`

, and `'S r2`

. The first one runs
into the same problem as `promote`

, the calculated node rank won’t
unify with `r3`

. The last two can be made the work but it presupposes
that the call site already knows which heap is going to be the right child,
hence `mkNode`

would be pointless.

So `SomeSaferHeap`

hides too much and `SaferHeap`

hides too
little. What we need is something in the middle to hide the rank, but expose the
label. Once again existential types come to the rescue.

`data AlmostSomeSaferHeap label = forall rank. ASSH (SaferHeap rank label)`

With this, the `mkNode`

we need is not too different from that for
`SomeSafeHeap`

.

```
mkNode :: Label c
-> AlmostSomeSaferHeap a -> AlmostSomeSaferHeap b
-> a <= c -> b <= c
-> AlmostSomeSaferHeap c
ASSH hA) (ASSH hB) aLEQc bLEQc
mkNode vc (| rA <- rank hA, rB <- rank hB =
case lemConnexity (_unRank rA) (_unRank rB) of
Left arLEQbr -> ASSH $ Node'' vc (inc rA) hB hA arLEQbr bLEQc aLEQc
Right brLEQar -> ASSH $ Node'' vc (inc rB) hA hB brLEQar aLEQc bLEQc
```

### Merging nodes

We’d like to work with `AlmostSomeSaferHeap`

for the `merge`

implementation as well. In short, unless we do that the implementation doesn’t
go through, but it is going to take some time to explain exactly why. For now,
bear with me.

Let’s observe the use of the intermediary function `merge'`

and its
type signature.

```
SSH' h1) (SSH' h2) | ASSH mergedHeap <- merge' (ASSH h1) (ASSH h2) =
merge (SSH' mergedHeap
merge' :: AlmostSomeSaferHeap a -> AlmostSomeSaferHeap b
-> AlmostSomeSaferHeap (Max a b)
```

The type of `merge'`

will be very precise. The label of a merge result is a
function of the labels of the inputs. Particularly, it is the maximum of the
input labels.

This brings me to type-level functions. We already use type families within the
`Heap`

and `HasRank`

typeclasses, but those are both simple
maps of types. `Max`

is more sophisticated and uses recursion.

#### Type families

The type-level `Max`

function is defined using a closed type family
enabled by the `TypeFamilies`

extension.

```
type family Max (n :: Nat) (m :: Nat) :: Nat where
Max 'Z m = m
Max n 'Z = n
Max ('S n) ('S m) = 'S (Max n m)
```

This is analogous to the following term level `max`

function on
`Nat`

s.

```
max :: Nat -> Nat -> Nat
max Z m = m
max n Z = n
max (S n) (S m) = S (max n m)
```

You might be wondering why not just write that and get a promoted version of
`max`

just as we did with data types and kinds? It’s an excellent
question and this syntactic dichotomy is another reason why people don’t like
type-level programming in Haskell. In Idris and Agda, you can write one function
and use it for both terms and types.

The problem is multifaceted. For one thing, type families existed in GHC since 2007, whereas data-type promotion was added in 2012, and the mandate for moving term and type levels closer is fairly recent. Further, adding type-level computation into Haskell is an after-thought, so you need to retrofit the syntax. On top of that, the behaviour of type families is different than functions, the patterns of a type family can do unification whereas pattern matches of a function can’t.

For example, the following returns the type `Int`

if its arguments
unify and `Char`

otherwise.

```
type family Same a b where
Same a a = Int
Same _ _ = Char
sameInt :: Same [ a ] [ a ]
= 42
sameInt
sameChar :: Same (Maybe a) [ a ]
= 'c' sameChar
```

So although we can promote term-level functions to the type families
(`singletons`

library does this via Template Haskell), they are not exactly
equivalent because term-level variables act differently than their type-level
counterparts.

I don’t know what the current plan is, but since changing the behaviour of type families would break backwards compatibility, the way forward seems to be allowing unification at the term level or adding another extension that makes it illegal at the type-level. Since types heavily rely on unification, the former seems more likely to me, but I’m not an expert!

#### Type families vs GADTs

We could have encoded `Max`

as a GADT as well and similarly we could
have encoded `(<=)`

as a type family.

```
data AltMax :: Nat -> Nat -> Nat -> Type where
L :: AltMax n 'Z n
R :: AltMax 'Z m m
B :: AltMax m n r -> AltMax ('S m) ('S n) ('S r)
type family n <== m :: Bool where
'Z <== m = 'True
<== 'Z = 'False
n 'S n) <== ('S m) = n <== m (
```

These would have also worked. The proofs would have looked different, but they’d have worked. However, there are reasons to choose one over the other.

If you intend to do induction over your relation, then constructors are helpful, so GADTs get a point.

If what you have is a function, then the GADT encoding forces you to add another type variable for the result and the fact that the arguments determine the result get lost.

Conversely, if you have a relation that is not a function and you choose to use a type family, since there is no clear result variable, you need to return a type of kind

`Bool`

or its equivalent.With type families, when you learn more information about the type, the reduction happens automatically, whereas with GADTs you need to pattern match and pass the relation around.

More pragmatically, you might already have some type-level relations in your codebase and you might want to stay consistent with the related relations. Beyond consistency, this might allow you to reuse some lemmas by exploiting duality and/or generalising the lemmas with minor effort.

My choices in this post embody the first three principles.

#### Getting back to the merge

We are ready to look at the base cases for `merge'`

.

```
merge' :: AlmostSomeSaferHeap a -> AlmostSomeSaferHeap b
-> AlmostSomeSaferHeap (Max a b)
ASSH Leaf'') heap = heap
merge' (ASSH Leaf'') = heap merge' heap (
```

Pattern matches reveal the `Leaf''`

labels as `'Z`

to
the type checker. We need to show that `Max 'Z b`

and `Max a 'Z`

are `b`

and `a`

respectively. These are proved definitionally
because `'Z`

hits the base cases of `Max`

.

In the previous version, term-level `(<=)`

decided on the top
label and the subheap to recurse on. We’ve already seen that
`lemConnexity`

is the replacement we need for comparing
`SNat`

s. By mimicking the structure of the previous implementation, we
can provide a partial implementation.

```
ASSH hA@(Node'' vA@(Label sA) _ aLeft aRight _ lLEQa rLEQa))
merge' (ASSH hB@(Node'' vB@(Label sB) _ bLeft bRight _ lLEQb rLEQb)) =
(case lemConnexity sA sB of
Left aLEQb ->
let child1 = ASSH bLeft
= _
c1LEQp = merge' (ASSH bRight) (ASSH hA)
child2 = _
c2LEQp in mkNode vB child1 child2 c1LEQp c2LEQp
Right bLEQa ->
let child1 = ASSH aLeft
= _
c1LEQp = merge' (ASSH aRight) (ASSH hB)
child2 = _
c2LEQp in mkNode vA child1 child2 c1LEQp c2LEQp
```

I said at the beginning of `merge`

implementation that I’d explain why
we used `AlmostSomeSaferHeap`

as opposed to `SomeSaferHeap`

for the `merge'`

implementation. We can now see why. We need to pass
`child2`

to `mkNode`

and also produce it as a result of a
recursive call to `merge'`

, if `merge'`

used
`SomeSaferHeap`

, even if we extract a `SaferHeap`

out of
`child2`

, its label wouldn’t relate to the label used in
`c2LEQp`

. This is the same problem we ran into in `mkNode`

with too much hiding.

Let’s focus on the `Left`

branch first. The type checker complains
about the holes, but more importantly it complains about the `mkNode`

application. GHC says that it could not deduce `Max a b ~ b`

, where
`~`

means types are equal.

But of course! We have `aLEQb`

of type `a <= b`

, but the
type checker is too stupid to know that this implies `Max a b`

is just
`b`

. So we need to prove this. That brings us to *propositional
equality*.

#### Propositional equality

Let me introduce you to `(:~:)`

, the data type that tells the type
checker that two types are equal. It took me ages to get my head around it. So
you have trouble with it, keep using it until it clicks.

```
data (:~:) :: k -> k -> Type where -- Same as that in Data.Type.Equality
Refl :: a :~: a
```

It is a poly-kinded `Type`

constructor. Good that it is poly-kinded,
if it was restricted to `Type`

or `Nat`

that would restrict
which types we can show to be equal. This definition makes no assumptions.

If we pattern match on a term with type `a :~: b`

, the only case is
the `Refl`

constructor. Just like the previous pattern matches on
GADTs, this refines the typing context. In this case, it reveals `a`

and `b`

to
be the same (the signature for `Refl`

says so). Hence, the type
checker learns `a ~ b`

.

We need `Max n m ~ m`

given `n <= m`

, so we show `Max n m :~: m`

. Proceed by induction on `n <= m`

.

```
lemMaxOfLEQ :: n <= m -> Max n m :~: m
Base = Refl lemMaxOfLEQ
```

In the base case, the pattern match on `Base`

reveals both
`n`

and `m`

to be `'Z`

. So we need to show `'Z :~: 'Z`

. We can use `Refl`

by instantiating `a`

to
`'Z`

in `a :~: a`

behind the scenes.

Then comes the first inductive case with `Double`

constructor.

`Double xLEQy) | Refl <- lemMaxOfLEQ xLEQy = Refl lemMaxOfLEQ (`

Here, `xLEQy`

has type `n <= m`

and we need to show `Max ('S n) ('S m) :~: 'S m`

. By the inductive case of `Max`

’s
definition, the goal reduces to `'S (Max n m) :~: 'S m`

. Since
`xLEQy`

is smaller than the original argument, we can recursively call
`lemMaxOfLEQ`

to get a term of type `Max n m :~: m`

.
Pattern matching on that tells the compiler `Max n m ~ m`

, so the goal
reduces to `'S m ~ 'S m`

. Once again, `Refl`

trivially
proves this.

The last case was fairly mechanical and this is often the case. The final inductive case is more interesting.

`Single xLEQy) = _ lemMaxOfLEQ (`

The term `xLEQy`

has type `n <= m`

and we need to prove `Max n ('S m) :~: 'S m`

. Since we don’t know if `n`

is built
with `'S`

constructor (it could be `'Z`

), we don’t get an
automatic reduction of our goal like last time. We still have `xLEQy`

,
so we could apply `lemMaxOfLEQ`

recursively. That would get us `Max n m :~: m`

, but pattern matching on that doesn’t reduce the goal any
further.

The mechanical process got stuck. It’s time to take a step back and think.
Taking inspiration from the previous case, if we knew that `n`

was of
the form `'S k`

, our goal would reduce to `'S (Max k m) :~: 'S m`

. Then we could show `Mak k m :~: m`

and that would reduce
the overall goal to a measly `'S m ~ 'S m`

. To obtain `Max k m :~: m`

, we need a recursive call to `lemMaxOfLEQ`

with a term of
type `k <= m`

, but we only have `'S k <= m`

. Luckily we now
from grade school that \(\vdash_{\mathit{PA}} k + 1 \leq m \implies k \leq m\). So
all we need is a lemma with type `'S n <= m -> n <= m`

.

But we forgot something! This all hinges on `n`

being of the form `'S k`

, what if it isn’t? Well, then it must be `'Z`

, and `Max 'Z m :~: m`

reduces to `m :~: m`

, so we are good.

```
Single xLEQy) =
lemMaxOfLEQ (case fst $ recover xLEQy of
SZ -> Refl
SS _ | Refl <- lemMaxOfLEQ (lemDecLEQ xLEQy) -> Refl
```

Alright, we can prove our final lemma.

```
lemDecLEQ :: 'S n <= m -> n <= m
= uncurry go (recover snLEQm) snLEQm
lemDecLEQ snLEQm where
go :: SNat ('S n) -> SNat m -> 'S n <= m -> n <= m
SZ _ = error "Impossible case."
go _ SS _) (Single leq) = Single (lemDecLEQ leq)
go _ (SS SZ) y (Double _) = lemZLEQAll y
go (SS (SS _)) (SS _) (Double leq) = Double (lemDecLEQ leq) go (
```

There is nothing particularly difficult about this lemma apart from the number of arguments the induction involves, but it has some lessons.

Haskell doesn’t have a termination checker. This is a feature, but when we do proofs, it feels like walking barefoot right after breaking a glass. We can easily create an infinite loop that type checks but does not constitute a valid proof (the circular argument fallacy).

This proof is particularly vulnerable to accidental loops because we pattern match on three variables which makes it difficult to see that we are recursing on something smaller in every case.

We remedy this in `lemDecLEQ`

by making the recursive calls in the
body of `go`

to `lemDecLEQ`

rather than `go`

itself. This is
less efficient, but makes it easier to confirm that each recursive call is to
something strictly smaller than what we started with.

The other interesting bit is the base case of `go`

.

```
go :: SNat ('S n) -> SNat m -> 'S n <= m -> n <= m
SZ _ = error "Inaccessible case." go _
```

We use `error`

due to a deficiency of Haskell. When the second argument is
`SZ`

, there is no constructor of `(<=)`

that can make `'S n <= m`

, but we have a proof of this, namely the third argument. We can
see this by pattern matching on the third argument.

```
go :: SNat ('S n) -> SNat m -> 'S n <= m -> n <= m
SZ Base = undefined
go _ SZ Single{} = undefined
go _ SZ Double{} = undefined go _
```

If you compile this, GHC will give you pattern match inaccessible warnings combined with type errors about why these arguments can’t coexist together.

In Agda and Idris, you can syntactically make it an inaccessible case which the compiler can confirm or deny. In Idris, it looks like this:

`SZ _ impossible go _ `

We don’t have this in Haskell, but there is a stale proposal. What if we omit this case altogether? This leads to a non-exhaustive pattern match warning because only when we pattern match on the second argument GHC learns that the pattern is inaccessible.

Using `error`

as a way around is dangerous because we might prove a
lemma, tweak it slightly, and not realise that the inaccessible case is now
perfectly accessible. It would type check because we have `error "Inaccessible case"`

. This makes the proof incomplete at best.

#### Getting back to the merge again

Focusing on the `Left`

branch only, we make it known to the compiler
that `a <= b`

implies `Max a b ~ b`

using
`lemMaxOfLEQ`

.

```
ASSH hA@(Node'' vA@(Label sA) _ aLeft aRight _ lLEQa rLEQa))
merge' (ASSH hB@(Node'' vB@(Label sB) _ bLeft bRight _ lLEQb rLEQb)) =
(case lemConnexity sA sB of
Left aLEQb | Refl <- lemMaxOfLEQ aLEQb ->
let child1 = ASSH bLeft
= _
c1LEQp = merge' (ASSH bRight) (ASSH hA)
child2 = _
c2LEQp in mkNode vB child1 child2 c1LEQp c2LEQp
Right bLEQa -> ...
```

This gets rid of the type error due to the application of `mkNode`

.

Inspecting the errors for the holes should inform us about the terms we need.
Well, according to GHC, both holes demand type `t`

(distinct rigid
`t`

s). This is incredibly unhelpful and I don’t know why we get
`t`

.

Inlining the `let`

binding improves the situation.

`ASSH bLeft) (merge' (ASSH bRight) (ASSH hA)) _ _ mkNode vB (`

The first hole needs a term of type `l <= b`

and that’s exactly the
type of `lLEQb`

. The second hole needs `Max r a <= b`

, but
we do not yet have a term corresponding to this type. However, we have `rLEQb :: r <= b`

and `aLEQb :: a <= b`

. Since `Max`

is a
selective function (one that returns one of its arguments), the required result
holds we just need to turn it into a lemma.

```
lemDoubleLEQMax :: n <= l -> m <= l -> Max n m <= l
lemMaxSelective :: SNat n -> SNat m -> Either (Max n m :~: n) (Max n m :~: m)
```

The proofs are left as exercises and the solutions are in the source code.

We can now give the full definition of the `Left`

branch.

```
Left aLEQb | Refl <- lemMaxOfLEQ aLEQb ->
let child1 = ASSH bLeft
= lLEQb
c1LEQp = merge' (ASSH bRight) (ASSH hA)
child2 = lemDoubleLEQMax rLEQb aLEQb
c2LEQp in mkNode vB child1 child2 c1LEQp c2LEQp
```

The right branch is analogous to the left one, so you should be able to fill it
yourself. There is going to be a technicality requiring a simple lemma that is
not required in the `Left`

branch because of the way we set things up.
If you write the `Right`

as we did `Left`

, the type error
should give you a clue about what is needed. If you can’t figure it out, that
too is in the source.

#### Other operations?

We could implement the other methods, but they are all too simple. We are done! No more verification.

# Simulating heap operations

So after going through all this trouble to prove properties of our code, why bother testing?

We didn’t verify everything. Earlier we gave the example of implementing

`singleton`

with the wrong rank by ignoring the input. A common theme is to accidentally discard a subpart of a data structure or an input and use another part twice.Haskell’s type system is unsound. So we might have a fallacious proof somewhere that makes it look like the property holds while being buggy.

It gives me a chance to talk about another type-level feature: visible type applications.

I really want to use QuickCheck to run simulations on the heap.

It is worth noting that if Haskell had linear types (it soon will!), we could have addressed (1) by forcing inputs to be used.

## Generating actions

We’re only going to simulate insertion and deleting the min. Here’s the initial encoding of the actions.

`data Action a = Insert a | DeleteMax deriving Show`

Let’s give `Arbitrary`

instances for `Nat`

and
`Action`

to randomly generate sequences of `Action`

s.

```
instance Arbitrary Nat where
= fromInt . getNonNegative <$> arbitrary @(NonNegative Int)
arbitrary where
fromInt 0 = Z
fromInt n = S (fromInt (n - 1))
instance Arbitrary a => Arbitrary (Action a) where
= do
arbitrary <- arbitrary @Bool
shouldAddInsert if shouldAddInsert
then Insert <$> arbitrary
else pure DeleteMax
```

Somewhat arbitrarily we choose between a deletion and an insertion with \(50\%\)
probability. This may or may not be a realistic simulation, but it is something
easy to adjust. We can have multiple wrappers over `Action a`

such as
`DeleteHeavy a`

and `InsertionHeavy a`

to simulate different
scenarios.

We haven’t seen the `@`

symbols in the `Arbitrary`

instances
before. They are visible type applications enabled by `TypeApplications`

extension. In addition to drastically improving handling ambiguous types, they
allow us to learn more about type variables.

### There is \(\Lambda\) then there is \(\lambda\)

If someone asks for the explicitly-typed polymorphic lambda term for the identity function (as one does), we’d probably write \(\lambda x : \alpha. x\), where \(\alpha\) is a polymorphic type variable. We’d expect this function to be closed, that is to say it shouldn’t depend on the context. Indeed, our identity function looks unaffected by the term-level context because the only variable \(x\) is \(\lambda\)-bound. However, \(\alpha\) doesn’t seem to be bound by anything, hence it looks dependent on the context.

Appearances can be deceiving. If this term should be interpreted as Haskell
interprets terms and their signatures, it just abbreviates \(\Lambda \alpha : \mathit{Type}. \lambda x : \alpha. x\). All free type variables in signatures are
implicitly \(\Lambda\)-bound. The Haskell syntax is `forall`

and is only allowed
in signatures (with `ExplicitForAll`

enabled).

The following is the same identity function in Haskell but with explicit quantification.

```
id :: forall (a :: Type). a -> a
id x = x
```

Just as the binders are hidden behind the scenes, so are the applications. When
we apply `id`

to `42`

, the `Int`

gets passed to
the type-level function first. `TypeApplications`

enables syntax to do this
explicitly. We just pass the type with a `@`

prefix. For example in
`ghci`

, `:t id @Int`

yields `Int -> Int`

.

This works as an alternative to using `::`

when a type is ambiguous.
Often it lets us get away with fewer parentheses and looks cleaner in general.

What happens if there are multiple type variables? The applied type is unified with the first type variable. This is how term-level application works as well. This raises the question of type-variable ordering.

In the absence of an explicit `forall`

, the ordering of type variables
is up to GHC. We can query the order of type variables used by GHC via `:type +v`

and `-fprint-explicit-foralls`

flag. However, I think placing explicit
`forall`

s is good practice.

We look at type applications more interesting than those in the
`Arbitrary`

instances momentarily.

## Executing actions

We can now interpret the initial representation of our actions. Nothing exciting here.

```
execAction :: Heap heap => Action (Elem heap) -> heap -> Maybe heap
Insert x) = Just . (x `insert`)
execAction (DeleteMax = deleteMax
execAction
carryOutActions :: Heap heap => [ Action (Elem heap) ] -> Maybe heap
= foldlM (flip execAction) empty carryOutActions
```

## QuickChecking functional equivalence

It’s time to use QuickCheck. The desired property: given two data types
implementing `Heap`

and a series of actions, executing these actions
on the `empty`

of each should yield the same maximum.

```
sameMaxAfterActions :: forall heap heap'
. Heap heap => Heap heap'
=> Elem heap ~ Elem heap'
=> Eq (Elem heap)
=> [ Action (Elem heap) ] -> Bool
=
sameMaxAfterActions acts @heap acts == maxOfActions @heap' acts
maxOfActions where
maxOfActions :: forall h . Heap h
=> [ Action (Elem h) ] -> Maybe (Maybe (Elem h))
= fmap findMax . carryOutActions @h maxOfActions
```

The computation is not very interesting, but the way we direct it is. We use an
explicit `forall`

. This is not to fix the type variable ordering.
Since both `heap`

and `heap'`

are passed to the same
function, the ordering is irrelevant. `ScopedTypeVariables`

extension allows
`forall`

quantified type variables to be referenced in the function
body which we need to pick the implementation `maxOfActions`

should
use.

The type applications `@heap`

and `@heap'`

determine the
computation. This is not just disambiguation. If both `maxOfActions`

were
applied to `heap`

instead, we’d create a property that is trivially
satisfied.

We use another type application in the body of `maxOfActions`

. This is
for disambiguation. Without it, all `carryOutActions`

sees is `Elem h`

and `Elem`

isn’t injective. This means given `Elem h`

, we can’t know what `h`

is. For example, if I told you `Elem h`

was `Int`

, `h`

could be `LeftistHeap`

or
`SomeSafeHeap`

as both can have `Int`

labels. Hence, we use
a type application to tell `carryOutActions`

what `h`

is.

All there remains is to actually check the property between different implementations.

```
main :: IO ()
= do
main @(LeftistHeap Int) @[ Int ])
quickCheck (sameMaxAfterActions @(SomeSafeHeap Int) @[ Int ])
quickCheck (sameMaxAfterActions @SomeSaferHeap @[ Nat ])
quickCheck (sameMaxAfterActions
putStrLn ""
<- sample' (arbitrary @(Action Int))
sampleActions print sampleActions
print $ carryOutActions @[ Int ] sampleActions
```

Remember that the list-based heap implementation was our reference implementation. Using type applications, we test functional equivalence between the reference implementation and the untyped leftist heap, the leftist property verified leftist heap, and the leftist and heap property verified leftist heap.

Then I just sample some actions and see the result of carrying them out on the
terminal because I’m paranoid like that (the `Arbitrary`

instance
could also be buggy 🙃).

At this point, we can be reasonably sure that these implementations work (for insertion and deletion).

# Conclusion

After such a long post, I’ll keep the conclusion short and sweet. Here’s what we did in a gist:

- learnt about leftist heaps a purely functional replacement to array-based binary heaps;
- looked at major parts of Haskell’s type-level computation features;
- ran simulations to test functional equivalence of various implementations;
- did a commentary on Haskell as an interactive theorem prover.

What is the overall verdict on that last point? It’s not ideal at all, but it
works for simple data structures and properties. If we had used the `singletons`

library and type-checker plugins, we could have gone further quicker.

The often overlooked point is this: if we want to do verification natively in a language that is designed to build programs with an optimising compiler and a massive ecosystem, Haskell is the singular choice. I for one am looking forward to Haskell’s typed and bright future.

## Acknowledgements

I’d like to thank Dr Dominic Orchard and Lex van der Stoep for their comments on the drafts of this post and I appreciate Vilem Liepelt’s post-publication corrections and suggestions.

This post wouldn’t be possible without the heroic work of a vibrant research community and GHC implementers. They are too many to name exhaustively, but the following deserves a special round of applause: Dr Richard Eisenberg and Prof. Stephanie Weirich (closed type families, singletons), Dr James Cheney (GADTs), and Dr Hongwei Xi (also GADTs).

The code wouldn’t be as slick if it wasn’t for Prof. Weirich’s presentations on verifying red-black trees (alternative).