Why do Maybe/Optional types use a Just/Some type instead of the actual type?

Question

In Idris, the Maybe type is defined as follows:

data Maybe a = Just a | Nothing

It's defined similarly in Haskell:

data Maybe a = Nothing | Just a
    deriving (Eq, Ord)

Here's the ML version:

datatype 'a option = NONE | SOME of 'a

What are the benefits of using Just and Some?
Why not define the type without them?

For example:

data Maybe a = a | Nothing

Answer 1:

What would then be the difference between

Maybe a

and

Maybe (Maybe a)

With the proposed definition the two would collapse into the same type, yet there is supposed to be a difference between Nothing and Just Nothing.
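A minimal sketch of where the nesting matters, assuming a hypothetical settings table (the names settings and describe are made up for illustration). The Prelude's lookup already returns a Maybe, so looking up a value that is itself optional yields a Maybe (Maybe Int), and the three cases mean different things:

```haskell
-- Hypothetical settings table: a key may be absent, or present with no value.
settings :: [(String, Maybe Int)]
settings = [("timeout", Just 30), ("retries", Nothing)]

-- lookup :: Eq a => a -> [(a, b)] -> Maybe b, so the result here is
-- Maybe (Maybe Int). Nothing and Just Nothing are distinct outcomes.
describe :: String -> String
describe key = case lookup key settings of
  Nothing       -> key ++ ": no such key"
  Just Nothing  -> key ++ ": key present, value unset"
  Just (Just n) -> key ++ ": " ++ show n

main :: IO ()
main = mapM_ (putStrLn . describe) ["timeout", "retries", "verbose"]
```

If Just did not exist, "no such key" and "key present, value unset" could not be told apart.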

Answer 2:

The key problem of allowing any value to be null (the "billion-dollar mistake") is that interfaces receiving a value of a type T have no way to declare whether or not they can handle a null, and interfaces that provide one have no way to declare whether or not they might produce null. This means that all of the operations that are usable on T essentially "might not work" when you pass them a T, which is a pretty gaping hole in all of the guarantees supposedly provided by compile-time type-checks.

The Maybe/Optional solution to this is to say that the type T does not contain a null value. In languages that had this from the beginning, that is literally true; in languages that adopted an Optional type later without removing support for null, it is only a convention that requires discipline. So now all of the operations whose type says they accept a T should work when I pass them a T, regardless of where I got the T. (If you haven't managed to design to "make illegal states unrepresentable", there will of course be other reasons why an object can be in an invalid state and cause failure, but at least when you pass a T there will actually be something there.)

Sometimes we do need a value that can be "either a T or nothing". It's such a common case that pervasive null seemed like a good idea at the time, after all. Enter the Maybe T type. But to avoid falling back into exactly the same old trap, where I get a possibly-null T value and pass it to something that can't handle null, none of the operations on T may be usable on a Maybe T directly. Getting a type error from trying to do that is the entire point of the exercise. So my T values can't be directly members of Maybe T; I need to wrap them up inside a Maybe T, so that if I have a Maybe T I'm forced to write code that handles both cases (and only in the case where I actually have a T can I call operations that work on T).

Whether this makes a word like Just or Some appear in the source code, and whether or not it is actually implemented with additional boxing/indirection in memory (some languages do represent a Maybe T as a nullable pointer to T internally), is irrelevant. What matters is that the Just a case must be different from simply having an a value.
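The point about the type error can be sketched in Haskell as follows. The function names double and doubleOrZero are hypothetical, chosen only for this illustration:

```haskell
-- An operation that only makes sense when an Int is really there.
double :: Int -> Int
double n = n * 2

-- Writing `double mx` here would be rejected by the type checker, because
-- mx is a Maybe Int rather than an Int. Both cases must be handled first.
doubleOrZero :: Maybe Int -> Int
doubleOrZero mx = case mx of
  Just n  -> double n   -- only in this branch is an Int available
  Nothing -> 0          -- the fallback value is an arbitrary choice

main :: IO ()
main = do
  print (doubleOrZero (Just 21))
  print (doubleOrZero Nothing)
```

The choice of 0 as the fallback is only for the example; the relevant part is that the compiler will not let the Nothing case be forgotten by accident.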

Answer 3:

I am not sure it is correct to speak of "benefits" in this context. What you have here is just a consequence of the way types are implemented in Haskell and ML: basically, a Hindley-Milner type system with algebraic data types. This type system essentially assumes that every value belongs to a single type (putting aside Haskell's numeric tower and bottom, which are outside of this discussion). In other words, there is no subtyping, and that's for a reason: otherwise type inference would be undecidable.

When you define the type Maybe a, what you want is to adjoin a single additional value to the type denoted by a. But you can't do it directly: if you could, then every value of a would belong to two different types, the original a and Maybe a. Instead, a is embedded in a new type via a canonical injection, Just :: a -> Maybe a. In other words, Maybe a is isomorphic to the disjoint union of a and a one-element type holding Nothing, and a plain (untagged) union cannot be represented directly in an HM type system.

So I don't think arguments that such a distinction is beneficial are the right way to look at it: without the distinction, you would no longer have ML, Haskell, or any familiar HM-based system.
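As a small sketch of the "injection" and "isomorphic to a disjoint union" remarks: Just is an ordinary function of type a -> Maybe a, and Maybe a converts back and forth with Either () a without losing information. The helper names toEither and fromEither are made up for this example:

```haskell
-- Maybe a carries the same information as the tagged union Either () a.
toEither :: Maybe a -> Either () a
toEither Nothing  = Left ()
toEither (Just x) = Right x

fromEither :: Either () a -> Maybe a
fromEither (Left ()) = Nothing
fromEither (Right x) = Just x

main :: IO ()
main = do
  -- Just is used here as a plain function: the injection of Int into Maybe Int.
  print (map Just [1, 2, 3 :: Int])
  -- Round-tripping through Either () gives back the original value.
  print (fromEither (toEither (Just 'x')) == Just 'x')
  print (fromEither (toEither (Nothing :: Maybe Char)) == Nothing)
```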

Answer 4:

The problem is that if Maybe were defined the way you propose, i.e. data Maybe a = a | Nothing, there would be no way to differentiate a values from Maybe a values (or from Maybe (Maybe a) values, for that matter).

So you may ask: why do we need such a distinction? What are the benefits? To give you a concrete example, suppose that we have a SQL table with an integer NOT NULL column. We would represent that with an Int in Haskell. If we later changed the database schema to make the column optional by dropping the NOT NULL constraint, we would have to change the Haskell representation of the column to Maybe Int. The clear distinction between Int and Maybe Int would make it very easy to refactor our Haskell code to account for the new schema. The compiler would complain about things such as extracting a value from the database and treating it as an Int (it might not be an integer; it might be NULL).
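A rough sketch of that refactoring, assuming a hypothetical row type (UserRow, userName, userAge and ageNextYear are invented for illustration and do not come from any particular database library):

```haskell
-- Hypothetical row type for a table whose age column has become nullable.
data UserRow = UserRow
  { userName :: String
  , userAge  :: Maybe Int  -- was Int while the column was NOT NULL
  }

-- Under the old schema, `userAge row + 1` type-checked. With the new field
-- type it no longer does, so the NULL case has to be dealt with explicitly.
ageNextYear :: UserRow -> Maybe Int
ageNextYear row = fmap (+ 1) (userAge row)

main :: IO ()
main = do
  print (ageNextYear (UserRow "ann" (Just 41)))
  print (ageNextYear (UserRow "bob" Nothing))
```

Every use site that still treats userAge as a plain Int becomes a compile error, which gives a checklist of the places to update.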

Answer 5:

The benefit of the constructor (Just or Some) is that it provides a way to distinguish between the branches of the data type. That is, it avoids ambiguity.

For example, if we rely on type inference, the type of x in the following seems fairly straightforward: String.

x = "Hello"

However, if we allowed your definition of Maybe, how would we know whether x was a String, a Maybe String, a Maybe (Maybe String), and so on?

Also consider a data type with more than two cases:

data Tree a
  = Empty
  | Node (Tree a) (Tree a)
  | Leaf a

If we simply removed the constructors (other than Empty), following your suggestion for Maybe, we'd end up with:

data Tree a
  = Empty
  | (Tree a) (Tree a)
  | a

I hope you can see that the ambiguity gets even worse.
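For contrast, a short sketch of what the constructors buy in the original Tree definition: each pattern names the branch it handles, so there is nothing to guess. The function countLeaves and the sample tree are hypothetical additions for this example:

```haskell
data Tree a
  = Empty
  | Node (Tree a) (Tree a)
  | Leaf a

-- Each equation is selected by its constructor, so the three cases
-- cannot be confused with one another.
countLeaves :: Tree a -> Int
countLeaves Empty      = 0
countLeaves (Node l r) = countLeaves l + countLeaves r
countLeaves (Leaf _)   = 1

-- A small sample tree with three leaves and one empty subtree.
sample :: Tree Char
sample = Node (Node (Leaf 'a') Empty) (Node (Leaf 'b') (Leaf 'c'))

main :: IO ()
main = print (countLeaves sample)
```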

Answer 6:

Consider this rough equivalent of Maybe in C++/Java-ish pseudocode:

template<class T>
abstract class Maybe<T> { ... }

template<class T>
class Just<T> : Maybe<T> {
    // constructor
    Just<T> (T val) { ... }

    ...
}

template<class T>
class Nothing<T> : Maybe<T> {
    // constructor
    Nothing () { ... }

    ...
}

This is not specific to Maybe; the same encoding can be applied to any ADT. Now, what exactly would

data Maybe a = a | Nothing

translate to, assuming it were legal syntax?

If you were to write a switch statement to "pattern match" against types, what would you match against? The switch is on the TYPE, not the value. Something like this (not necessarily valid code):

switch (typeof (x)) {
    case Just<a> : ...
    case Nothing<a> : ...
    default : ... // Here you don't have any 'a' to get the inner type
}

Source: https://stackoverflow.com/questions/39582869/why-do-maybe-optional-types-use-a-just-some-type-instead-of-the-actual-type

Tags

haskell

types

idris

maybe