Many statically typed languages have parametric polymorphism. For example in C# one can define:
T Foo(T x){ return x; }
In a call site
A few remarks to complement the two already-excellent answers.
First, one cannot say that forall
is lambda at the type-level because there already is a notion of lambda at the type level, and it is different from forall
. It appears in system F_omega, an extension of System F with type-level computation, that is useful to explain ML modules systems for example (F-ing modules, by Andreas Rossberg, Claudio Russo and Derek Dreyer, 2010).
In (a syntax for) System F_omega you can write for example:
type prod =
lambda (a : *). lambda (b : *).
forall (c : *). (a -> b -> c) -> c
This is a definition of the "type constructor" prod
, such as prod a b
is the type of the church-encoding of the product type (a, b)
. If there is computation at the type level, then you need to control it if you want to ensure termination of type-checking (otherwise you could define the type (lambda t. t t) (lambda t. t t)
. This is done by using a "type system at the type level", or a kind system. prod
would be of kind * -> * -> *
. Only the types at kind *
can be inhabited by values, types at higher-kind can only be applied at the type level. lambda (c : k) . ....
is a type-level abstraction that cannot be the type of a value, and may live at any kind of the form k -> ...
, while forall (c : k) . ....
classify values that are polymorphic in some type c : k
and is necessarily of ground kind *
.
Second, there is an important difference between the forall
of System F and the Pi-types of Martin-Löf type theory. In System F, polymorphic values do the same thing on all types. As a first approximation, you could say that a value of type forall a . a -> a
will (implicitly) take a type t
as input and return a value of type t -> t
. But that suggest that there may be some computation happening in the process, which is not the case. Morally, when you instantiate a value of type forall a. a -> a
into a value of type t -> t
, the value does not change. There are three (related) ways to think about it:
System F quantification has type erasure, you can forget about the types and you will still know what the dynamic semantic of the program is. When we use ML type inference to leave the polymorphism abstraction and instantiation implicit in our programs, we don't really let the inference engine "fill holes in our program", if you think of "program" as the dynamic object that will be run and compute.
A forall a . foo
is not a something that "produces an instance of foo
for each type a
, but a single type foo
that is "generic in an unknown type a
".
You can explain universal quantification as an infinite conjunction, but there is an uniformity condition that all conjuncts have the same structure, and in particular that their proofs are all alike.
By contrast, Pi-types in Martin-Löf type theory are really more like function types that take something and return something. That's one of the reason why they can easily be used not only to depend on types, but also to depend on terms (dependent types).
This has very important implications once you're concerned about the soundness of those formal theories. System F is impredicative (a forall-quantified type quantifies on all types, itself included), and the reason why it's still sound is this uniformity of universal quantification. While introducing non-parametric constructs is reasonable from a programmer's point of view (and we can still reason about parametricity in an generally-non-parametric language), it very quickly destroys the logical consistency of the underlying static reasoning system. Martin-Löf predicative theory is much simpler to prove correct and to extend in correct way.
For a high-level description of this uniformity/genericity aspect of System F, see Fruchart and Longo's 97 article Carnap's remarks on Impredicative Definitions and the Genericity Theorem. For a more technical study of System F failure in presence of non-parametric constructs, see Parametricity and variants of Girard's J operator by Robert Harper and John Mitchell (1999). Finally, for a description, from a language design point of view, on how to abandon global parametricity to introduce non-parametric constructs but still be able to locally discuss parametricity, see Non-Parametric Parametricity by George Neis, Derek Dreyer and Andreas Rossberg, 2011.
This discussion of the difference between "computational abstraction" and "uniform abstract" has been revived by the large amount of work on representing variable binders. A binding construction feels like an abstraction (and can be modeled by a lambda-abstraction in HOAS style) but has an uniform structure that makes it rather like a data skeleton than a family of results. This has been much discussed, for example in the LF community, "representational arrows" in Twelf, "positive arrows" in Licata&Harper's work, etc.
Recently there have been several people working on the related notion of "irrelevance" (lambda-abstractions where the result "does not depend" on the argument), but it's still not totally clear how closely this is related to parametric polymorphism. One example is the work of Nathan Mishra-Linger with Tim Sheard (eg. Erasure and Polymorphism in Pure Type Systems).