You're already familiar with "value-level" programming, whereby you manipulate values such as42 :: Int
or 'a' :: Char
. In languages like Haskell, Scala, and many others, type-level programming allows you to manipulate types like Int :: *
or Char :: *
where *
is the kind of a concrete type (Maybe a
or [a]
are concrete types, but not Maybe
or []
which have kind * -> *
).
Consider this function
foo :: Char -> Int
foo x = fromEnum x
Here foo
takes a value of type Char
and returns a new value of type Int
using the Enum
instance for Char
. This function manipulates values.
Now compare foo
to this type family, enabled with the TypeFamilies
language extension.
type family Foo (x :: *)
type instance Foo Char = Int
Here Foo
takes a type of kind *
and returns a new type of kind *
using the simple mapping Char -> Int
. This is a type level function that manipulates types.
This is a very simple example and you might wonder how this could possibly be useful. Using more powerful language tools, we can begin to encode proofs of the correctness of our code at the type level (for more on this, see the Curry-Howard correspondence).
A practical example is a red-black tree that uses type level programming to statically guarantee that the invariants of the tree hold.
A red-black tree has the following simple properties:
- A node is either red or black.
- The root is black.
- All leaves are black. (All leaves are same colour as the root.)
- Every red node must have two black child nodes. Every
path from a given node to any of its descendant leaves contains the
same number of black nodes.
We'll use DataKinds
and GADTs
, a very powerful type level programming combination.
{-# LANGUAGE DataKinds, GADTS, KindSignatures #-}
import GHC.TypeLits
First, some types to represent the colours.
data Colour = Red | Black -- promoted to types via DataKinds
this defines a new kind Colour
inhabited by two types: Red
and Black
. Note that there are no values (ignoring bottoms) inhabiting these types, but we aren't going to need them anyways.
The red-black tree nodes are represented by the following GADT
-- 'c' is the Colour of the node, either Red or Black
-- 'n' is the number of black child nodes, a type level Natural number
-- 'a' is the type of the values this node contains
data Node (c :: Colour) (n :: Nat) a where
-- all leaves are black
Leaf :: Node Black 1 a
-- black nodes can have children of either colour
B :: Node l n a -> a -> Node r n a -> Node Black (n + 1) a
-- red nodes can only have black children
R :: Node Black n a -> a -> Node Black n a -> Node Red n a
GADT
lets us express the Colour
of the R
and B
constructors directly in the types.
The root of the tree looks like this
data RedBlackTree a where
RBTree :: Node Black n a -> RedBlackTree a
Now it is impossible to create a well-typed RedBlackTree
that violates any of the 4 properties mentioned above.
- The first constraint is obviously true, there are only 2 types inhabiting
Colour
.
- From the definition of
RedBlackTree
the root is black.
- From the definition of the
Leaf
constructor, all leaves are black.
- From the definition of the
R
constructor, both it's children must
be Black
nodes. As well, the number of black child nodes of each subtree are equal (the same n
is used in the type of both left and right subtrees)
All these conditions are checked at compile time by GHC, meaning that we will never get a runtime exception from some misbehaving code invalidating our assumptions about a red-black tree. Importantly, there is no runtime cost associated with these extra benefits, all the work is done at compile time.