Thanks to some excellent answers here, I generally understand (clearly in a limited way) the purpose of Haskell's Maybe
and that its definition is
data Maybe a = Nothing | Just a
Haskell's algebraic data types are tagged unions. By design, when you combine two different types into another type, they have to have constructors to disambiguate them.
Your definition does not fit with how algebraic data types work.
data Maybe a = Nothing | a
There's no "tag" for the a case here. How would we tell a Maybe a apart from a normal, unwrapped a in your case?
Maybe has a Just constructor because it has to have a constructor by design.
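To make the ambiguity concrete, here is a minimal sketch of a case where the tag carries real information. The settings table and the describe function are hypothetical names made up for illustration; only lookup comes from the Prelude. Looking up a key whose value is itself optional gives a Maybe (Maybe Int), and the Just tag is what separates "no such key" from "key present but unset".

```haskell
-- Hypothetical settings table: a key may be absent, or present but unset.
settings :: [(String, Maybe Int)]
settings = [("timeout", Just 30), ("retries", Nothing)]

-- lookup returns Maybe (Maybe Int) here. The outer layer says whether the
-- key exists; the inner layer says whether it has a value.
describe :: String -> String
describe key = case lookup key settings of
  Nothing       -> key ++ ": no such key"
  Just Nothing  -> key ++ ": present but unset"
  Just (Just n) -> key ++ ": set to " ++ show n
```

With an untagged union, Nothing | (Nothing | Int) would presumably collapse to Nothing | Int, and the first two cases could no longer be told apart.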
Other languages do have union types which could work like what you imagine, but they would not be a good fit for Haskell. They play out differently in practice and tend to be somewhat error-prone.
There are some strong design reasons for preferring tagged unions to normal union types. They play well with type inference. Unions in real code often have a tag anyhow¹. And, from the point of view of elegance, tagged unions are a natural fit for the language because they are the dual of products (i.e., tuples and records). If you're curious, I wrote about this in a blog post introducing and motivating algebraic data types.
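The duality can be sketched in a few lines. Both, OneOf, pair, oneOf and label are illustrative names, not library definitions; they mirror the standard tuple and Either types. A product is built from two functions out of a common source and taken apart by projections, while a sum is built by injections (the constructors) and taken apart by two functions into a common target.

```haskell
-- A product holds an a AND a b.
data Both a b = Both a b

-- A sum holds an a OR a b; the constructors are the tags.
data OneOf a b = First a | Second b

-- Projections take a product apart.
first :: Both a b -> a
first (Both x _) = x

second :: Both a b -> b
second (Both _ y) = y

-- To build a product from c, supply c -> a and c -> b.
pair :: (c -> a) -> (c -> b) -> c -> Both a b
pair f g x = Both (f x) (g x)

-- Dually, to consume a sum into c, supply a -> c and b -> c.
oneOf :: (a -> c) -> (b -> c) -> OneOf a b -> c
oneOf f _ (First x)  = f x
oneOf _ g (Second y) = g y

-- The tags keep the two sides apart even when both have the same type.
label :: OneOf Int Int -> String
label = oneOf (\n -> "first " ++ show n) (\n -> "second " ++ show n)
```

Note that label can distinguish First 1 from Second 1 even though both sides are Int, which is exactly what an untagged union of Int with Int could not do.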
footnotes
¹ I've played with union types in two places: TypeScript and C. TypeScript compiles to JavaScript, which is dynamically typed, meaning it keeps track of the type of a value at runtime, which is basically a tag.
C doesn't, but in practice something like 90% of the uses of union types either have a tag or effectively emulate struct subtyping. One of my professors actually did an empirical study on how unions are used in real C code, but I don't remember offhand which paper it was in.