As far as I understand, recursive data types in Haskell correspond to initial algebras of endofunctors on the Hask category [1, 2]. For example:
- Natural numbers, data Nat = Zero | Succ Nat, correspond to the initial algebra of the endofunctor F(-) = 1 + (-).
- Lists, data List a = Nil | Cons a (List a), correspond to the initial algebra of the endofunctor F(A, -) = 1 + A × (-).
However, it's not clear to me what the endofunctor corresponding to the rose trees should be:
data Rose a = Node a (List (Rose a))
What confuses me is that there are two recursions: one for the rose tree and the other for the list. According to my calculations, I would get the following functor, but it doesn't seem right:
F(A, •, -) = A × (1 + (-) × (•))
Alternatively, rose trees can be defined as mutually recursive data types:
data Rose a = Node a (Forest a)
type Forest a = List (Rose a)
Do mutually recursive data types have an interpretation in category theory?
I would discourage talk of "the Hask Category" because it subconsciously conditions you against looking for other categorical structure in Haskell programming.
Indeed, rose trees can be seen as the fixpoint of an endofunctor on types-and-functions, a category which we might do better to call Type, now that Type is the type of types. If we give ourselves some of the usual functor kit...
newtype K a x = K a deriving Functor -- constant functor
newtype P f g x = P (f x, g x) deriving Functor -- products
...and fixpoints...
newtype FixF f = InF (f (FixF f))
...then we may take
type Rose a = FixF (P (K a) [])
pattern Node :: a -> [Rose a] -> Rose a
pattern Node a ars = InF (P (K a, ars))
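Assembled into one module, the pieces above can be checked with GHC. This is only a sketch, assuming the DeriveFunctor and PatternSynonyms extensions; the names cata, size and example are illustrative additions and not part of the kit above.

```haskell
{-# LANGUAGE DeriveFunctor, PatternSynonyms #-}

-- The functor kit from above.
newtype K a x = K a deriving Functor                -- constant functor
newtype P f g x = P (f x, g x) deriving Functor     -- products
newtype FixF f = InF (f (FixF f))                   -- fixpoints

type Rose a = FixF (P (K a) [])

pattern Node :: a -> [Rose a] -> Rose a
pattern Node a ars = InF (P (K a, ars))

-- Generic fold (catamorphism) over any fixpoint of a Functor.
-- Hypothetical helper, added for illustration.
cata :: Functor f => (f r -> r) -> FixF f -> r
cata alg (InF x) = alg (fmap (cata alg) x)

-- Count the nodes of a rose tree with an algebra for P (K a) [].
size :: Rose a -> Int
size = cata alg
  where
    alg (P (K _, ns)) = 1 + sum ns

example :: Rose Int
example = Node 1 [Node 2 [], Node 3 [Node 4 []]]
```

Here size example counts one node per Node constructor, which illustrates that the initial-algebra reading gives folds for free once [] is accepted as a Functor.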
The fact that [] is itself recursive does not prevent its use in the formation of recursive datatypes via FixF. To spell out the recursion explicitly, we have nested fixpoints, here with bound variable names chosen suggestively:
Rose a = μrose. a * (μlist. 1 + (rose * list))
Now, by the time we've arrived in the second fixpoint, we have a type formula
1 + (rose * list)
which is functorial (indeed, strictly positive) in both rose and list. One might say it is a Bifunctor, but that's unnecessary terminology: it's a functor from (Type, Type) to Type. You can make a Type -> Type functor by taking a fixpoint in the second component of the pair, and that's just what happened above.
The above definition of Rose loses an important property. It is not true that
Rose :: Type -> Type -- GHC might say this, but it's lying
merely that Rose x :: Type if x :: Type. In particular, Functor Rose is not a well typed constraint, which is a pity, as intuitively, rose trees ought to be functorial in the elements they store.
You can fix this by building Rose as itself being the fixpoint of a Bifunctor. So, in effect, by the time we get to lists, we have three type variables in scope, a, rose and list, and we have functoriality in all of them. You need a different fixpoint type constructor, and a different kit for building Bifunctor instances: for Rose, life gets easier because the a parameter is not used in the inner fixpoint, but in general, to define bifunctors as fixpoints requires trifunctors, and off we go!
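As a minimal sketch of that repair in the easy case, one can keep [] as a ready-made functor and take the fixpoint of a Bifunctor in its second argument. The names Fix2, RoseF, node and labels are made up for this illustration; Bifunctor is the class from Data.Bifunctor in base.

```haskell
import Data.Bifunctor (Bifunctor (..))

-- Fixpoint in the second argument of a bifunctor (hypothetical name).
-- Unlike a type synonym, Fix2 p really has kind Type -> Type.
newtype Fix2 p a = In2 (p a (Fix2 p a))

-- Functoriality in the remaining parameter comes from the Bifunctor.
instance Bifunctor p => Functor (Fix2 p) where
  fmap f (In2 x) = In2 (bimap f (fmap f) x)

-- One layer of a rose tree: a label and a list of recursive positions.
data RoseF a r = NodeF a [r]

instance Bifunctor RoseF where
  bimap f g (NodeF a rs) = NodeF (f a) (map g rs)

type Rose = Fix2 RoseF

-- Smart constructor, for convenience.
node :: a -> [Rose a] -> Rose a
node a rs = In2 (NodeF a rs)

-- Preorder list of labels, used here only to observe the result of fmap.
labels :: Rose a -> [a]
labels (In2 (NodeF a rs)) = a : concatMap labels rs
```

With this definition Functor Rose is a well-formed constraint, so fmap (+1) can be applied to a tree directly.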
This answer of mine shows how to fight the proliferation by showing how indexed types are closed under a fixpoint-of-functor construction. That's to say, work not in Type but in i -> Type (for the full variety of index types i) and you're ready for mutual recursion, GADTs, and so on.
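To make the indexed view concrete for the mutually recursive Rose/Forest pair, here is a hedged sketch working in Ix -> Type. It assumes the DataKinds, GADTs and KindSignatures extensions, and every name in it (Ix, RoseForestF, FixI, node, nil, cons, sizeI) is invented for this example rather than taken from the linked answer.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

import Data.Kind (Type)

-- The index says which of the two mutually defined types we mean.
data Ix = RoseI | ForestI

-- One layer of both types at once; r stands for the recursive positions.
data RoseForestF (a :: Type) (r :: Ix -> Type) (i :: Ix) where
  NodeF :: a -> r 'ForestI -> RoseForestF a r 'RoseI
  NilF  :: RoseForestF a r 'ForestI
  ConsF :: r 'RoseI -> r 'ForestI -> RoseForestF a r 'ForestI

-- Fixpoint of an endofunctor on Ix -> Type.
newtype FixI (f :: (Ix -> Type) -> Ix -> Type) (i :: Ix) = InI (f (FixI f) i)

type Rose a   = FixI (RoseForestF a) 'RoseI
type Forest a = FixI (RoseForestF a) 'ForestI

node :: a -> Forest a -> Rose a
node a f = InI (NodeF a f)

nil :: Forest a
nil = InI NilF

cons :: Rose a -> Forest a -> Forest a
cons r f = InI (ConsF r f)

-- Number of nodes, defined uniformly at every index.
sizeI :: FixI (RoseForestF a) i -> Int
sizeI (InI (NodeF _ f)) = 1 + sizeI f
sizeI (InI NilF)        = 0
sizeI (InI (ConsF r f)) = sizeI r + sizeI f
```

A single fixpoint over the indexed family yields both types together, which is one way to read "mutual fixpoints" categorically.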
So, zooming out, rose trees are given by mutual fixpoints, which have a perfectly sensible categorical account, provided you see which categories are actually at work.
This is not really an answer to the question you're asking, but perhaps interesting anyway. Note that with
Rose a = a * List (Rose a)
List a = 1 + a * List a
and the fact that * distributes over +, you have
Rose a
= {- definition of `Rose` -}
a * List (Rose a)
= {- definition of `List` -}
a * (1 + Rose a * List (Rose a))
= {- `*` distributes over `+` -}
a + a * Rose a * List (Rose a)
= {- `*` is commutative -}
a + Rose a * a * List (Rose a)
= {- definition of `Rose` -}
a + Rose a * Rose a
(the equality really denotes isomorphism). So you might as well have defined
Rose a = a + Rose a * Rose a
or in Haskell,
data Rose a = Leaf a | Bin (Rose a) (Rose a)
Which is to say, rose trees are isomorphic to ordinary (leaf-labelled) binary trees, which clearly form a normal initial algebra.
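The isomorphism read off from the calculation above can be written out as a pair of functions. This is a sketch with hypothetical names (BinTree, toBin, fromBin), using the built-in list type in place of List.

```haskell
data Rose a = Node a [Rose a]
  deriving (Eq, Show)

data BinTree a = Leaf a | Bin (BinTree a) (BinTree a)
  deriving (Eq, Show)

-- Rose a = a * List (Rose a): an empty child list gives the `a` summand,
-- a non-empty one splits into the first child and the rest of the tree.
toBin :: Rose a -> BinTree a
toBin (Node a [])       = Leaf a
toBin (Node a (r : rs)) = Bin (toBin r) (toBin (Node a rs))

-- Inverse direction: put the left subtree back as the first child.
fromBin :: BinTree a -> Rose a
fromBin (Leaf a)  = Node a []
fromBin (Bin l r) =
  case fromBin r of
    Node a rs -> Node a (fromBin l : rs)
```

Note that the label of a rose-tree node ends up at the rightmost leaf of the corresponding binary tree, mirroring the commutativity step in the derivation.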
As you noticed, the definition of the functor for Rose a is trickier due to the fact that the recursive occurrence of the type is fed into a List. The problem is that List is itself a recursive type obtained as a fixed point. List (Rose a) basically corresponds to an "arbitrary number of elements of Rose a", something that you cannot express with a signature of products and sums alone, hence the need for additional abstraction over these multiple recursive points.
A functor F A - : * -> * will not work, as we would need to find something such that
F A X ≃ A × (1 + X × List X)
F A X ≃ A × (1 + X × (1 + X × List X))
F A X ≃ A × (1 + X × (1 + X × (1 + X × List X)))
...
One way to do it is to just treat List as primitive. Then Rose a is just the fixed point of
RoseF A : * -> * = λ X . A × List X
Another, more interesting way is to follow the suggestion in the reference you posted, and notice that the type of Rose a can be generalized to abstract over the functor into which the recursive occurrence is fed:
GRose F A ≃ A × F (GRose F A)
Now GRose has type (* -> *) -> (* -> *), hence it is a higher-order functor mapping an endofunctor into another one. In our example, it would map the functor List into the type of rose trees.
Notice however that GRose is still recursive, so the above is actually stating an isomorphism rather than a solution to our problem. We can try to fix (wink wink) this by additionally abstracting over the recursive point
HRose G F A = A × F (G F A)
Notice that now HRose is a regular higher-order functor of type ((* -> *) -> (* -> *)) -> (* -> *) -> (* -> *), hence it maps higher-order functors into higher-order functors. Computing the least fixed point of HRose gives us
μ(HRose) F A ≃ A × F (μ(HRose) F A)
So if we put Rose ≡ μ(HRose) List, we get
Rose A ≃ A × List (Rose A)
which is exactly the defining equation for rose trees. You can find many further examples of the theory and practice of generic programming using fixed points over higher-order functors. For example, Bird and Paterson develop it in the context of nested datatypes (but the definitions clearly hold in general). They also show the systematic construction of folds over datatypes defined in such a way, as well as various laws.
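The construction can be transcribed into Haskell fairly directly. The following is a sketch rather than the formulation used by Bird and Paterson: HFix, HIn, node and flatten are names chosen here, the built-in [] stands in for List, and the explicit kind annotations assume the KindSignatures extension (FlexibleInstances is needed for the instance head).

```haskell
{-# LANGUAGE KindSignatures, FlexibleInstances #-}

import Data.Kind (Type)

-- HRose G F A = A × F (G F A)
newtype HRose (g :: (Type -> Type) -> Type -> Type)
              (f :: Type -> Type)
              (a :: Type)
  = HRose (a, f (g f a))

-- Least fixed point of a functor on higher-order functors (hypothetical name).
newtype HFix (h :: ((Type -> Type) -> Type -> Type) -> (Type -> Type) -> Type -> Type)
             (f :: Type -> Type)
             (a :: Type)
  = HIn (h (HFix h) f a)

-- Rose ≡ μ(HRose) List
type Rose = HFix HRose []

node :: a -> [Rose a] -> Rose a
node a rs = HIn (HRose (a, rs))

-- Functoriality in the element type, for any Functor fed into GRose.
instance Functor f => Functor (HFix HRose f) where
  fmap g (HIn (HRose (a, rs))) = HIn (HRose (g a, fmap (fmap g) rs))

-- Preorder list of labels, used to observe the structure.
flatten :: Rose a -> [a]
flatten (HIn (HRose (a, rs))) = a : concatMap flatten rs
```

Unfolding the newtypes shows that a value of Rose a is a pair of an a and a list of Rose a, matching the defining equation.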
You seem to understand how this is modelled
data List a = Nil | Cons a (List a)
by taking, for any given A, the initial algebra of the endofunctor F(A, -) = 1 + A × (-). Let's call this initial algebra L(A).
If we forget the morphism in L(A), we can say that L(A) is an object of our category. Better, L(-) is not only a mapping from objects to objects, but can be seen as an endofunctor.
Once L is seen as an endofunctor, the recursive type
data Rose a = Node a (List (Rose a))
is interpreted by taking, for any A, the initial algebra of the functor
G_A(X) = A * L(X)
which is a functor obtained by composing L and * (together with the constant functor at A and the diagonal functor).
Hence, the same approach works.
Source: https://stackoverflow.com/questions/45900358/initial-algebra-for-rose-trees