Confused on DataKinds extension

问题

I learn type programming of Haskell from Basic Type Level Programming in Haskell but when it introduce DataKinds extension, there are something seems confusing from the example:

{-# LANGUAGE DataKinds #-}
data Nat = Zero | Succ Nat

Now, Nat is promoted to Kind, it is okay. But how about Zero and Succ?

I try to get some information from GHCi, so I type:

:kind Zero

it gives

Zero :: Nat

that is okay, Zero is a type has kind Nat, right? and I try:

:type Zero

it still gives:

Zero :: Nat

that means Zero has type Nat, that is impossible, since Nat is a kind not type, right? Does Nat is both a type and a kind ??

And other confusing thing is, above blog also mentioned that, when create Nat kind, there are two new types: 'Zero and 'Succ are created automatically. When I try it from GHCi again:

:kind 'Zero

gives

'Zero :: Nat

and

:type 'Zero

gives

 Syntax error on 'Zero

Okay, it proves that 'Zero is a type. but what is the purpose of creating 'Zero and 'Succ'??

回答1:

In unextended Haskell, the declaration

data A = B

defines two new entities, one each at the computation and type level:

At the type level, a new base type named A (of kind *), and
At the computation level, a new base computation named B (of type A).

When you turn on DataKinds, the declaration

data A = B

now defines four new entities, one at the computation level, two at the type level, and one at the kind level:

At the kind level, a new base kind A,
At the type level, a new base type 'B (of kind A),
At the type level, a new base type A (of kind *), and
At the computation level, a new base computation B (of type A).

This is a strict superset of what we had before: the old (1) is now (3), and the old (2) is now (4).

These new entities explain the following interactions you described:

:type Zero
Zero :: Nat
:kind 'Zero
'Zero :: Nat
:type 'Zero
Syntax error on 'Zero

I think it's clear how it explains the first two. The last one it explains because 'Zero is a type-level thing -- you can't ask for the type of a type, only the type of a computation!

Now, in Haskell, every place where a name occurs, it is clear from the surrounding syntax whether that name is intended to be a computation-level name, a type-level name, or a kind-level name. For this reason, it is somewhat annoying to have to include the tick mark in 'B at the type level -- after all, the compiler knows we're at the type level and therefore can't be referring to the unlifted computation-level B. So for convenience, you are permitted to leave off the tick mark when that is unambiguous.

From now on, I will distinguish between the "back end" -- where there are only the four entities described above and which are always unambiguous -- and the "surface syntax", which is the stuff you type into a file and pass off to GHC that includes ambiguity but is more convenient. Using this terminology, in the surface syntax, one may write the following things, with the following meanings:

Surface syntax    Level           Back end
Name              computation     Name
Name              type            Name if that exists; 'Name otherwise
'Name             type            'Name
Name              kind            Name
---- all other combinations ----  error

This explains the first interaction you had (and the only one left unexplained above):

:kind Zero
Zero :: Nat

Because :kind must be applied to a type-level thing, the compiler knows the surface syntax name Zero must be a type-level thing. Since there is no type-level back end name Zero, it tries 'Zero instead, and gets a hit.

How can it be ambiguous? Well, notice above that we defined two new entities at the type level with one declaration. For simplicity of introduction, I named the new entities on the left- and right-hand side of the equation different things. But let's see what happens if we just tweak the declaration slightly:

data A = A

We still introduce four new back end entities:

Kind A,
Type 'A (of kind A),
Type A (of kind *), and
Computation A (of type A).

Whoops! There is now both an 'A and an A at the type level. If you leave off the tick mark in the surface syntax, it will use (3), and not (2) -- and you can explicitly choose (2) with the surface syntax 'A.

What's more, this doesn't have to happen all from a single declaration. One declaration may produce the ticked version and another the non-ticked version; for example

data A = B
data C = A

will introduce a type-level back end name A from the first declaration and a type-level back end name 'A from the second declaration.

回答2:

{-# LANGUAGE DataKinds #-} does not change what the data keyword usually does: it still creates a type Nat and two value constructors Zero and Succ. What this extension does do is, it allows you to use such types as if they were kinds, and values as if they were types.

Thus, if you use Nat in a type-level expressions, it'll just use it as a boring Haskell98 type. Only if you use it in an unabiguously kind-level expression, do you get the kind version.

This automatic lifting could sometimes cause name clashes, which is, I think, the reason for the ' syntax:

{-# LANGUAGE DataKinds #-}
data Nat = Zero | Succ Nat
data Zero

Now, Zero by itself is the plain (empty) data type Zero :: *, so

*Main> :k Zero
Zero :: *

In principle, thanks to DataKinds, Zero is now also a type lifted from the value constructor Zero :: Nat, but this is shadowed by the data Zero. Thus the quoting-syntax, which always refers to lifted types, never to directly defined ones:

*Main> :k 'Zero
'Zero :: Nat

来源：https://stackoverflow.com/questions/53812685/confused-on-datakinds-extension

标签

haskell

types

data-kinds