How should the general type of a “lemma” function be understood?

问题

Perhaps this is a stupid question. Here's a quote from the Hasochism paper:

One approach to resolving this issue is to encode lemmas, given by parameterised equations, as Haskell functions. In general, such lemmas may be encoded as functions of type:
∀ x1 ... xn. Natty x1 → ... → Natty xn → ((l ~ r) ⇒ t) → t

I thought I understood RankNTypes, but I can't make sense of the last part of this proposition. I'm reading it informally as "given a term which requires l ~ r, return that term". I'm sure this interpretation is wrong because it seems to lead to a circularity: we don't know l ~ r until the conclusion of the proof itself, so how can I be expected to provide as an assumption of the proof a term which requires that?

I would have expected an equality proof to have a type more like this:

Natty x1 → ... → Natty xn → l :~: r

Informally, "given a bunch of Nattys, return a proof of the proposition that l and r are equal" (using GHC's Data.Type.Equality). This makes far more sense to me, and seems to align with what you'd say in other dependently typed systems. I'm guessing it's equivalent to the version in the paper, but I'm struggling to mentally square away the two versions.

In short, I'm confused. I feel like I'm missing a key insight. How should I read the type ((l ~ r) => t) -> t?

回答1:

I'm reading it as "given a term which requires l ~ r, return that term"

It's "given a term whose type contains l, return that term with all ls being substituted by rs in the type" (or in the other direction r -> l). It's a very neat trick, that allows you to delegate all cong, trans, subst and similar stuff to GHC.

Here is an example:

{-# LANGUAGE GADTs, DataKinds, PolyKinds, TypeFamilies, TypeOperators, RankNTypes #-}

data Nat = Z | S Nat

data Natty n where
    Zy :: Natty Z
    Sy :: Natty n -> Natty (S n)

data Vec a n where
    Nil  :: Vec a Z
    Cons :: a -> Vec a n -> Vec a (S n)

type family (n :: Nat) :+ (m :: Nat) :: Nat where
    Z     :+ m = m
    (S n) :+ m = S (n :+ m)

assoc :: Natty n -> Natty m -> Natty p -> (((n :+ m) :+ p) ~ (n :+ (m :+ p)) => t) -> t
assoc  Zy     my py t = t
assoc (Sy ny) my py t = assoc ny my py t

coerce :: Natty n -> Natty m -> Natty p -> Vec a ((n :+ m) :+ p) -> Vec a (n :+ (m :+ p))
coerce ny my py xs = assoc ny my py xs

UPDATE

It's instructive to specialize assoc:

assoc' :: Natty n -> Natty m -> Natty p ->
            (((n :+ m) :+ p) ~ (n :+ (m :+ p)) => Vec a (n :+ (m :+ p)))
                                               -> Vec a (n :+ (m :+ p))
assoc'  Zy     my py t = t
assoc' (Sy ny) my py t = assoc ny my py t

coerce' :: Natty n -> Natty m -> Natty p -> Vec a ((n :+ m) :+ p) -> Vec a (n :+ (m :+ p))
coerce' ny my py xs = assoc' ny my py xs

Daniel Wagner explained what's going on in comments:

Or, to say it another way, you can read ((l ~ r) => t) -> t as, "given a term that is well typed assuming that l ~ r, return that same term from a context where we have proven l ~ r and discharged that assumption".

Let's elaborate the proving part.

In the assoc' Zy my py t = t case n is equal to Zy and hence we have

((Zy :+ m) :+ p) ~ (Zy :+ (m :+ p))

which reduces to

(m :+ p) ~ (m :+ p)

This is clearly identity and hence we can discharge that assumption and return t.

At each recursive step we maintain the

((n :+ m) :+ p) ~ (n :+ (m :+ p))

equation. So when assoc' (Sy ny) my py t = assoc ny my py t the equation becomes

((Sy n :+ m) :+ p) ~ (Sy n :+ (m :+ p))

which reduces to

 Sy ((n :+ m) :+ p) ~ Sy (n :+ (m :+ p))

due to the definition of (:+). And since constructors are injective

constructors_are_injective :: S n ~ S m => Vec a n -> Vec a m
constructors_are_injective xs = xs

the equation becomes

((n :+ m) :+ p) ~ (n :+ (m :+ p))

and we can call assoc' recursively.

Finally in the call of coerce' these two terms are unified:

 1. ((n :+ m) :+ p) ~ (n :+ (m :+ p)) => Vec a (n :+ (m :+ p))
 2.                                      Vec a ((n :+ m) :+ p)

Clearly Vec a ((n :+ m) :+ p) is Vec a (n :+ (m :+ p)) under the assumption that ((n :+ m) :+ p) ~ (n :+ (m :+ p)).

回答2:

I would have expected an equality proof to have a type more like this:
Natty x1 → ... → Natty xn → l :~: r

That's a reasonable alternative. In fact, it's logically equivalent to the one in the Hasochism paper:

{-# LANGUAGE GADTs, RankNTypes, TypeOperators, ScopedTypeVariables #-}
module Hasochism where

data l :~: r where
  Refl :: l :~: l

type Hasoc l r = forall t. (l ~ r => t) -> t

lemma1 :: forall l r. Hasoc l r -> l :~: r
lemma1 h = h Refl 

lemma2 :: forall l r. l :~: r -> Hasoc l r
lemma2 Refl t = t

In a sense, Hasoc l r is an impredicative encoding of the constraint l ~ r.

The Hasochistic variant is slightly easier to use than the :~: one, in that once you have e.g.

type family A a
-- ...
proof1 :: Proxy a -> Hasoc a (A a)
proof1 _ = -- ...

you can simply use it as in

use1 :: forall a. [a] -> [A a]
use1 t = proof1 (Proxy :: Proxy a) t

Instead, with

proof2 :: Proxy a -> a :~: A a
proof2 _ = -- ...

you would need

use2 :: forall a. [a] -> [A a]
use2 t = case proof2 (Proxy :: Proxy a) of Refl -> t

回答3:

We've had some excellent answers, but as the perpetrator, I thought I'd offer some remarks.

Yes, there are multiple equivalent presentations of these lemmas. The presentation I use is one of them, and the choice is largely a pragmatic one. These days (in a more recent codebase), I go as far as to define

-- Holds :: Constraint -> *
type Holds c = forall t . (c => t) -> t

This is an example of an eliminator type: it abstracts over what it delivers (the motive of the elimination) and it requires you to construct zero or more methods (one, here) of achieving the motive under more specific circumstances. The way to read it is backwards. It says

If you have a problem (to inhabit any motive type t), and nobody else can help, maybe you can make progress by assuming constraint c in your method.

Given that the language of constraints admits conjunction (aka tupling), we acquire the means to write lemmas of the form

lemma :: forall x1 .. xn. (p1[x1 .. xn],.. pm[x1 .. xn])        -- premises
                       => t1[x1 .. xn] -> .. tl[x1 .. xn]       -- targets
                       -> Holds (c1[x1 .. xn],.. ck[x1 .. xn])  -- conclusions

and it might even be that some constraint, a premise p or a conclusion c, has the form of an equation

l[x1 .. xn] ~ r[x1 .. cn]

Now, to deploy such a lemma, consider the problem of filling a hole

_ :: Problem

Refine this _ by the elimination lemma, specifying the targets. The motive comes from the problem at hand. The method (singular in the case of Holds) remains open.

lemma target1 .. targetl $ _

and the method hole will not have changed type

_ :: Problem

but GHC will know a bunch more stuff and thus be more likely to believe your solution.

Sometimes, there's a constraint-versus-data choice to make for what's a (constraint) premise and what's a (data) target. I tend to pick these to avoid ambiguity (Simon likes to guess the x1 .. xn, but sometimes needs a hint) and to facilitate proof by induction, which is much easier on targets (often the singletons for type level data) than on premises.

As to deployment, for equations, you can certainly switch to a datatype presentation and break out a case analysis

case dataLemma target1 .. targetl of Refl -> method

and indeed, if you equip yourself with the Dict existential

data Dict (c :: Constraint) :: * where
  Dict :: c => Dict c

you can do a bunch at once

case multiLemma blah blah blah of (Refl, Dict, Dict, Refl) -> method

but the eliminator form is more compact and readable when there is at most one method. Indeed, we can chain multiple lemmas without sliding ever rightward

lemma1 ..   $
...
lemmaj ..   $
method

If you have such an eliminator with two or more cases, I think it's often better to wrap it up as a GADT, so that usage sites helpfully tag each case with a constructor label.

Anyhow, yes, the point is to choose the presentation of the facts which most compactly enables us to extend the reach of GHC's constraint solving machinery so that more stuff just typechecks. If you're in a scrap with Simon, it's often a good strategy to explain yourself to Dimitrios next door.

来源：https://stackoverflow.com/questions/30224796/how-should-the-general-type-of-a-lemma-function-be-understood

标签

haskell

theorem-proving

dependent-type

higher-rank-types