In my initial attempt at creating a disjoint set data structure I created a Point
data type with a parent
pointer to another Point
:
data Point a = Point
{ _value :: a
, _parent :: Point a
, _rank :: Int
}
To create a singleton set, a Point
is created that has itself as its parent (I believe this is called tying the knot):
makeSet' :: a -> Point a
makeSet' x = let p = Point x p 0 in p
Now, when I wanted to write findSet
(i.e. follow parent pointers until you find a Point
whose parent is itself) I hit a problem: Is it possible to check if this is the case? A naïve Eq
instance would of course loop infinitely – but is this check conceptually possible to write?
(I ended up using a Maybe Point
for the parent field, see my other question.)
No, what you are asking for is known in the Haskell world as referential identity: the idea that for two values of a certain type, you can check whether they are the same value in memory or two separate values that happen to have exactly the same properties.
For your example, you can ask yourself whether you would consider the following two values the same or not:
pl1 :: Point Int
pl1 = Point 0 (Point 0 pl1 1) 1
pl2 :: Point Int
pl2 = Point 0 pl2 1
Haskell considers both values completely equal. I.e. Haskell does not support referential identity. One of the reasons for this is that it would violate other features that Haskell does support. It is for example the case in Haskell that we can always replace a reference to a function by that function's implementation without changing the meaning (equational reasoning). For example if we take the implementation of pl2
: Point 0 pl2 1
and replace pl2
by its definition, we get Point 0 (Point 0 pl2 1) 1
, making pl2
's definition equivalent to pl1
's. This shows that Haskell cannot allow you to observe the difference between pl1
and pl2
without violating properties implied by equational reasoning.
You could use unsafe features like unsafePerformIO
(as suggested above) to work around the lack of referential identity in Haskell, but you should know that you are then breaking core principles of Haskell and you may observe weird bugs when GHC starts optimizing (e.g. inlining) your code. It is better to use a different representation of your data, e.g. the one you mentioned using a Maybe Point
value.
You could try to do this using StableName
(or StablePtr
) and unsafePerformIO
, but it seems like a worse idea than Maybe Point
for this case.
In order the observed the effect you're most likely interested in—pointer equality instead of value equality—you most likely will want to write your algorithm in the ST
monad. The ST
monad can be thought of kind of like "locally impure IO, globally pure API" though, by the nature of Union Find, you'll likely have to dip the entire lookup process into the impure section of your code.
Fortunately, monads still contain this impurity fairly well.
There is also already an implementation of Union Find in Haskell which uses three methods to achieve the kind of identity you're looking for: using the IO
monad, using IntSet
s, and using the ST
monad.
In Haskell data is immutable (except IO a
, IORef a
, ST a
, STRef a
, ..) and data is function.
So, any data is "singleton"
When you type
data C = C {a, b :: Int}
changeCa :: C -> Int -> C
changeCa c newa = c {a = newa}
You don't change a variable, you destroy old data and create a new one. You could try use links and pointers, but it is useless and complex
And last,
data Point a = Point {p :: Point a}
is a infinite list-like data = Point { p=Point { p=Point { p=Point { ....
来源:https://stackoverflow.com/questions/18077428/self-reference-in-data-structure-checking-for-equality