# Dependable Types 2: Correctness by Construction

Part two already, is it? Well, I suppose it’s about time to write some code! Last time, we covered the constructions of the lambda calculus, as well as De Bruijn indices. Today, we’re going to cover basically the same thing, but encoding everything we learnt in Idris! If you’re not familiar with Idris or dependently-typed programming, things are going to start getting… weird.

As you may remember, the ultimate goal is to represent the simply-typed lambda calculus. With that in mind, we should probably define some simple types:

``````data ProgramType
= PFunction ProgramType ProgramType
| PInt
``````

So, for now, we only have two types: functions (from one type to another), and integers. If you want to add more simple types, it’s as easy as you think!

Now that’s sorted, let’s have a go at building expressions with an abstract data type.

``````data Expression
= Variable Int
| Abstraction ProgramType Expression
| Application Expression Expression
``````

Just as the last article did, we’re saying there are essentially only three types of expressions:

• Variables, which we can reference by De Bruijn index (if zero-indexed, this is essentially how many variables have been introduced since the one I care about?)

• Abstractions, which introduce a new variable with a given type that can be referenced within the given body.

• Applications, which take the value on the right, and apply it to the function on the left.

This is a perfectly sensible way to represent the calculus, and feel free to write it an interpreter - we’ll certainly be writing one later in the series! I’d imagine you’ll have a type signature like this:

``````interpret : Expression -> Maybe Expression
``````

So, that’s… fine… but that `Maybe` is kind of ugly, right? Why does it need to be there? If you’ve had some experience with Haskell-like languages before, you’ve probably already had a suspicion or two:

• How do we know the index in every `Variable` is valid? What if it’s higher than the outermost variable’s index? What if it’s negative?!

• How do we know the left expression in `Application` is a function? How do we know that its input type matches the right expression’s type?

The more you stare at it, the more holes you can see, and that `Maybe` looks more and more inevitable. Today, friends, we’ll ask whether we could do… better.

Here’s where we officially break from Elm and PureScript. Still not from Haskell, providing you have the `GADTs` and `DataKinds` extensions enabled. Within Idris, we can have greater control of our data type using a clever notion called GADTs. We don’t have enough time in this post to explain them in depth, so let’s just say that they give us more control over the construction of our type’s values. If that means nothing to you, don’t worry! Here’s an example of the above ADT rewritten as a GADT:

``````data Expression : Type where
Variable
: Int
-> Expression

Abstraction
: ProgramType
-> Expression
-> Expression

Application
: Expression
-> Expression
``````

Great. So far, we’ve just got a more long-winded way of doing the same thing. What’s the big deal? Well, let’s have a go at solving that type mismatch issue: Let’s index our type by a ProgramType.

``````data Expression : ProgramType -> Type where
Abstraction
: (paramType : ProgramType)
-> Expression bodyType
-> Expression (PFunction paramType bodyType)

Application
: Expression (PFunction input output)
-> Expression input
-> Expression output

Variable : ... -- More on this later...
``````

Here’s where things get truly magical. In the first line, we say that `Expression` is a type constructor that takes a `ProgramType` (just like `Array` is a type constructor that takes a `Type` - `Array Int`, ```Array String```, and so on). This means, whenever we mention an `Expression`, we also mention its type - now, we can be sure that `Application`’s components have the right types!

This is going to look really weird to Haskell-like language users. `ProgramType` is the data type we defined earlier, and now we’re using it inside a type! This is what makes Idris, and dependent types, special: we use the same language to talk about types and values. We’ll see later that we can even pull things out of a type to use as values… and vice versa.

We can pat ourselves on the back: we have solved the type mismatch issue. However, you’ll notice I’ve tiptoed around `Variable`. Our issue was that we would need to bounds check the `Int`; how are we going to fix that?

## A small digression: `Elem`

Idris has a lovely little type called `Elem` that is written like this:

``````data Elem : a -> List a -> Type where
Here  : Elem x (x :: xs)
There : Elem x xs -> Elem x (y :: xs)
``````

Let’s break it down. `Elem` is a type indexed by two things: a value of some type `a`, and a list of values of type `a`. `Here` says, the first thing in the list is the value (`::` is how we write `cons` in Idris: ```x :: xs``` is a list where `x` is the item at the front, and `xs` is the rest of the items). `There` says, it isn’t the first thing, but it is in the list. Here are a few examples:

``````valid : Elem 1 [1, 2, 3]
valid = Here

alsoValid : Elem "az" ["oo", "ar", "az"]
alsoValid = There (There Here)

invalid : Elem 0 [1, 2, 3] -- Type error!

alsoInvalid : Elem 0 [1, 0]
alsoInvalid = Here -- Should be There Here!
``````

What we have is a type-level way of proving that something exists within a list. All well and good, but how does this help us?

## Third time lucky: doubly-indexed GADTs

Here’s where we solve all our problems. Firstly, let’s talk about variables. Every time we use `Abstraction`, we introduce a new variable with a given type into the “context” of our expression. How do you suppose we can keep track of the list of variables and their types, though?

``````Context : Type
Context = List ProgramType
``````

… Well, that was straightforward. Really sorry if that was a little underwhelming. Because we’re just describing expressions, we don’t need to store values at all - just knowing the types of the variables that are in context (think of this as “scope”) is enough to know whether our expression is valid!

As a little aside, this is one of the things I found a bit weird about Idris: because types and values use the same language, we can declare type aliases just as we declare values. We even give them type signatures!

Here comes the big reveal. This is the bit that blew my mind (which, I can tell you, was very distressing and inconvenient for poor Liam, who had just written it). As well as indexing our expressions by their type, we can also index them by the context they exist in:

``````data Expression
: (ctx : Context)
-> (ptype : ProgramType)
-> Type where

Abstraction
: (param : ProgramType)
-> (body  : Expression (param :: ctx) ptype)
-> Expression ctx (PFunction param ptype)

Application
: (func : Expression ctx (PFunction i o))
-> (arg  : Expression ctx i)
-> Expression ctx o

Variable -- One more minute...
``````

For the benefit of mobile users, these names have been abbreviated to stop line wrapping. If you would prefer a more verbose format, the code for this post uses long-hand names.

Wow! This is where we start to see the power of Idris. We now start by saying that `Expression` is a type indexed by a `Context` and a `ProgramType`, but we also give that `Context` a name - `ctx` - so we can use it later!

An `Abstraction` is made up of some `ProgramType` to represent the type of the input, and an `Expression` of some type to represent the output. What’s new here, however, is that `body`’s context must be the same as the output, but with a new variable introduced which matches the given parameter type! We are saying, “within an `Abstraction` body, there is one more variable available: the parameter”. Because the output type doesn’t contain that extra variable, we are completely encapsulating it within the body expression. We are saying that in the type. I can’t tell you how excited this makes me.

`Application` is actually… pretty much the same as it was before. All our `Context` is saying here is that they must exist within the same context, which is… well, what you’d expect. The point here is that we’re still carrying the context in the type so we can use it for `Abstraction` and `Variable`! Speaking of `Variable`… where is it?

``````Variable
: (ref : Elem ptype ctx)
-> Expression ctx ptype
``````

This, for me, is the most beautiful part of all. Now, instead of taking a number to represent a De Bruijn index, we take an `Elem`. We can see that this maps quite happily:

De Bruijn `Elem`
1 `Here`
2 `There Here`
3 `There (There Here)`
4 `There (There (There Here))`

… Well, you get the picture. The point is that, in order to construct a `Variable` expression, we have to prove that the variable is in the context:

``````good : Expression [] (PFunction PInt PInt)
good = Abstraction PInt (Variable Here)
``````

This is the identity function for `PInt`: create a function abstraction with a `PInt`-type parameter, and then return the last-introduced variable! What if we tried to write this:

``````bad : Expression [] (PFunction PInt PInt)
bad = Abstraction PInt (Variable (There Here))
``````

Now, we’re saying, “Get me the last-but-one-introduced variable”, but our context is empty - this variable doesn’t exist! Just when we think all hope is lost, up pops a compiler error:

``````When checking right hand side of second with expected type
Expression [] (PFunction PInt PInt)

When checking argument later to constructor Data.List.There:
Type mismatch between
Elem x (x :: xs) (Type of Here)
and
Elem PInt [] (Expected type)

Specifically:
Type mismatch between
PInt :: xs
and
[]
``````

This is an actual compiler error telling us exactly that: the context was empty, but we’ve tried to use a proof that `PInt` is in there! Because Idris wants to see `Elem PInt (x :: xs)`, but actually sees `Elem PInt []`, it knows that there has been a problem. We are now in a situation where we cannot write invalid expressions if we want our code to compile. Because we’ve indexed our type by both the program type and context of the expression, and used a GADT to constrain them, we know that they will always be what we expect! We can now write what we’ve always wanted to write:

``````eval : Expression -> Expression
``````

… but we’ll save that for next time!

This has been… a lot of information. A lot of information very quickly, but hopefully not too quickly. In any case, I’m sorry if this one takes a couple readthroughs. Idris is a very different beast to Elm, PureScript, and even Haskell: the ways that you can interact between types and values are strange and unfamiliar, but we’re starting to see why this newfound power is helpful. With the aid of indexed types and GADTs, we can produce types that simply don’t allow for invalid values. We have, with these features, made illegal states unrepresentable. This is also what we call correctness by construction: if I can build a value of type `Expression`, it must be valid.

If I’ve rushed through anything, or something seems suspicious, don’t hesitate to send me a tweet or leave an issue on this website’s repository! As always, there is code for this article, all of which can be found within `Expr.idr` and `ProgramType.idr`, and I encourage you to play around with it!

Otherwise, have fun, and I’ll see you next time!

Take care ♥