Or rather, why isn\'t (==)
usable on every data type? Why do we have to derive Eq
ourseleves? In other languages, such as Python, C++, and surely o
You may not want to derive Eq - you might want to write your own instance.
For example, imagine data in a binary tree data structure:
data Tree a = Branch (Tree a) (Tree a)
| Leaf a
You could have the same data in your Leaf
s, but balanced differently. Eg:
balanced = Branch (Branch (Leaf 1)
(Leaf 2))
(Branch (Leaf 3)
(Leaf 4))
unbalanced = Branch (Branch (Branch (Leaf 1)
(Leaf 2))
(Leaf 3))
(Leaf 4)
shuffled = Branch (Branch (Leaf 4)
(Leaf 2))
(Branch (Leaf 3)
(Leaf 1))
The fact that the data is stored in a tree may only be for efficiency of traversal, in which case you'd probably want to say that balanced == unbalanced
. You might even want to say that balanced == shuffled
.
In Python the default equality implementation compares identity, not value. This is useful for user-defined classes, which by default are mutable and do not have to have a well-defined notion of "value". But even in that setting, it is more normal to use the is
operator to directly compare identities for mutable objects.
With Haskell's immutability and sharing this notion of "identity" doesn't make much sense. If you could compare two terms by identity you could find out whether or not they are shared, but it's generally up to the implementation whether two terms that might be shared actually are shared, so such information shouldn't be able to affect the behaviour of the program (unless you like programs that change their behaviour under different compiler optimisation strategies).
So equality in Haskell is always value equality; it tells you whether two terms represent the same value (not necessarily whether they have equal structure; if you implement a set with an unordered list then two lists with different structure can represent the same set).
Almost all of the built in types are members of Eq
already; the big exception are function types. The only really sensible notion of value equality for functions is extensional equality (do they return the same output for every input). It's tempting to say we'll use that and let the compiler access a representation of the function definition to compute this, but unfortunately determining whether two arbitrary algorithms (here encoded in Haskell syntax) always produce the same output is a known uncomputable problem; if the compiler could actually do that it could solve the Halting Problem, and we wouldn't have to put up with the bottom value being a member of every type.
And unfortunately the fact that functions can't be members of Eq
means lots of other things can't be either; lists of integers can be compared for equality, but lists of functions can't, and the same goes for every other conatiner-ish type when it's containing functions. This also goes for ADTs that you write, unless there is a sensible notion of equality you can define for that type that doesn't depend on the equality of the contained functions (maybe the function is just a convenience in the implementation, and which function it is doesn't affect the value you're representing with ADT).
So, there are (1) types that are already members of Eq
, (2) types that can't be members of Eq
, (3) types that can be members of Eq
in an obvious way, (4) types that can be a member of Eq
but only in a non-obvious way, and (5) types that can be members of Eq
in an obvious way, but the programmer would prefer an alternative way. I think the way Haskell handles these cases is actually the right way. (1) and (2) don't require anything from you, and (4) and (5) are always going to require an explicit instance declaration. The only case where the compiler could help you out a little more is (3), where it could potentially save you 12 characters of typing (4 if you're already deriving
anything else).
I think that would be a pretty small win for the cost. The compiler would have to try to construct an instance of everything and presume that anything for which that fails isn't supposed to have an Eq
instance. At the moment if you want to derive an Eq
instance and accidentally write a type for which that doesn't work, the compiler tells you then and there that there's a problem. With the proposed "implicitly make everything Eq
that you can" strategy, this error would show up as an unexplained "no Eq
instance" error at the point that you go to use the assumed instance. It also means that if I'm thinking of the type as representing values for which the reasonable equality relation isn't simple structural equality (case (5) above; remember the set represented by an unordered list?), and I forget to write my own Eq
instance, then the compiler might automatically generate a wrong Eq
instance for me. I'd much rather be told "you haven't written an Eq
instance yet" when I go to use it than have the program compile and run with a bug introduced by the compiler!
Because the way that values are compared may be custom. For example, certain "fields" might be excluded from comparison.
Or consider a type representing a case-insensitive string. Such a type would not want to compare the Chars it contains for identity.
How do you compare functions? Or existential types? Or MVars?
There are incomparable types.
Edit: MVar is in Eq!
instance Eq (MVar a) where
(MVar mvar1#) == (MVar mvar2#) = sameMVar# mvar1# mvar2#
But it takes a magic primop to make it so.
I can't think of types that can't be compared.
let infiniteLoop = infiniteLoop
let iSolvedTheHaltingProblem f = f == infiniteLoop
-- Oops!
Consider the following Python example:
>>> 2 == 2
True
>> {} == {}
True
>>> set() == set()
True
>>> [1,2,3] == [1,2,3]
True
>>> (lambda x: x) == (lambda x: x)
False
False? o_O This of course makes sense if you realize that Python == compares pointer values, except when it doesn't.
>>> f = (lambda x: x)
>>> f == f
True
Haskell encourages ==
to always represent structural equality (and it always will if you use deriving Eq
. Since nobody really knows a completely sound and tractable way to declare for certain whether or not two functions are structurally equivalent, there is no Eq
instance for functions. By extension, any data structure that stores a function in it cannot be an instance of Eq
.