Is there a way to tell if a list in Haskell is infinite? The reason is that I don\'t want to apply functions such as length
to infinite lists.
Applying length
to unknown lists is generally a bad idea, both practically due to infinite lists, and conceptually because often it turns out that you don't actually care about the length anyway.
You said in a comment:
I'm very new to Haskell, so now, don't infinite structures make my programs very vulnerable?
Not really. While some of us wish there were better ways to distinguish between necessarily finite and necessarily infinite data, you're always safe when you create, process, and examine lazy structures incrementally. Computing the length is clearly not incremental, but checking to see if the length is above or below some cut-off value is, and very often that's all you wanted to do anyway!
A trivial case is testing for nonempty lists. isNonEmpty xs == length xs > 0
is a bad implementation because it examines an unbounded number of elements, when examining a single one would suffice! Compare this:
isNonEmpty [] = False
isNonEmpty (_:_) = True
Not only is this is safe to apply to an infinite list, it's also much more efficient on finite lists--it takes only constant time, instead of time linear in the length of the list. It's also how the standard library function null is implemented.
To generalize this for length testing relative to a cut-off, you'll obviously need to examine as much of the list as the length you're comparing to. We can do exactly this, and no more, using the standard library function drop
:
longerThan :: Int -> [a] -> Bool
longerThan n xs = isNonEmpty $ drop n xs
Given a length n
and a (possibly infinite) list xs
, this drops the first n
elements of xs
if they exist, then checks to see if the result is non-empty. Because drop
produces the empty list if n
is larger than the length of the list, this works correctly for all positive n
(alas, there's no non-negative integer type, e.g. natural numbers, in the standard libraries).
The key point here is that it's better in most cases to think of lists as iterative streams, not a simple data structure. When possible you want to do things like transform, accumulate, truncate, etc., and either produce another list as output or examine only a known-finite amount of the list, rather than trying to process the entire list in one go.
If you use this approach, not only will your functions work correctly on finite and infinite lists both, but they'll also benefit more from laziness and GHC's optimizer, and be likely to run faster and use less memory.
No - you may at best estimate. See the Halting Problem.
The Halting Problem was first proved unsolvable by assuming a Halting Oracle existed, then writing a function that did the opposite of what that oracle said would happen. Let's reproduce that here:
isInfinite :: [a] -> Bool
isInfinite ls = {- Magic! -}
Now, we want to make a list impossibleList
that does the opposite of what isInfinite
says it should. So, if impossibleList
is infinite, it is actually []
, and if it isn't infinite, it is something : impossibleList
.
-- using a string here so you can watch it explode in ghci
impossibleList :: [String]
impossibleList =
case isInfinite impossibleList of
True -> []
False -> "loop!" : impossibleList
Try this out yourself in ghci with isInfinite = const True
and isInfinite = const False
.
We don't need to solve the Halting Problem to call 'length' safely. We just need to be conservative; accept everything that has a finiteness proof, reject everything that doesn't (including many finite lists). This is exactly what type systems are for, so we use the following type (t is our element type, which we ignore):
terminatingLength :: (Finite a) => a t -> Int
terminatingLength = length . toList
The Finite class will only contains finite lists, so the type-checker will ensure we have a finite argument. membership of Finite will be our proof of finiteness. The "toList" function just turns Finite values into regular Haskell lists:
class Finite a where
toList :: a t -> [t]
Now what are our instances? We know that empty lists are finite, so we make a datatype to represent them:
-- Type-level version of "[]"
data Nil a = Nil
instance Finite Nil where
toList Nil = []
If we 'cons' an element on to a finite list, we get a finite list (eg. "x:xs" is finite if "xs" is finite):
-- Type-level version of ":"
data Cons v a = Cons a (v a)
-- A finite tail implies a finite Cons
instance (Finite a) => Finite (Cons a) where
toList (Cons h t) = h : toList t -- Simple tail recursion
Anyone calling our terminatingLength function must now prove that their list is finite, otherwise their code won't compile. This hasn't removed the Halting Problem issue, but we have shifted it to compile-time rather than run-time. The compiler may hang while trying to determine membership of Finite, but that's better than having a production program hang when it's given some unexpected data.
A word of caution: Haskell's 'ad-hoc' polymorphism allows pretty much arbitrary instances of Finite to be declared at other points in the code, and terminatingLength will accept these as finiteness proofs even if they're not. This isn't too bad though; if someone's trying to bypass the safety mechanisms of your code, they get the errors they deserve ;)
isInfinite x = length x `seq` False
There is also the possibility of separating finite and infinite lists by design and use distinct types for them.
Unfortunately Haskell (unlike Agda for example) doesn't allow you to enforce that a data structure is always finite, you can employ techniques of total functional programming to ensure that.
And you can declare infinite lists (AKA streams) as
data Stream a = Stream a (Stream a)
which doesn't have any way how to terminate a sequence (it's basically a list without []
).