I have a Haskell program which takes a file as an input and convert it into a binary search tree.
import System.IO
data Tree a = EmptyBST | Node a (Tree a)
You'd be able to see the problem right away if you supplied fileRead
with a type signature. Let's figure out the type annotation that GHC will internally assign to fileRead
:
fileRead = do file <- readFile "tree.txt"
let t = lstToTree $ map read $ words file
return t
lstToTree :: Ord a => [a] -> Tree a
, and read
always returns a member of the Read
typeclass. So t :: (Read a, Ord a) => Tree a
. The concrete type depends on the contents of the file.
return
wraps its argument in a monad, so return t
has the type Ord a, Read a => IO (Tree a)
. Since return t
is the final statement in the do
block, it becomes the return type of fileRead
, so
fileRead :: (Read a, Ord a) => IO (Tree a)
So fileRead
is a Tree
wrapped in an IO
, and you can't pass it directly into ins
because it expects a Tree
on its own. You can't take the Tree
out of the IO
, but you can 'lift' the function ins
into the IO
monad.
Control.Monad exports liftM :: Monad m => (a -> r) -> (m a -> m r). It accepts a regular function, and turns it into one that acts on monads like IO
. It's actually a synonym for fmap (in the standard Prelude), since all monads are functors. So this code, roughly equivalent to @us202's, takes the result of fileRead
, inserts 5
, and gives you back the result wrapped in an IO
.
liftM (ins 5) fileRead
-- or --
fmap (ins 5) fileRead
I'd recommend the fmap
version. This code only makes use of the fact that IO
is a functor, so using liftM
implies to the reader that you might need it to be a monad too.
'Lifting' is the general technique for using pure functions on values wrapped in monads or functors. If you're unfamiliar with lifting (or if you're confused by monads and functors in general), I heartily recommend chapters 11-13 of Learn You A Haskell.
PS. Note that the last two lines of fileRead
should probably be combined, since return
doesn't really do anything:
fileRead :: (Read a, Ord a) => IO (Tree a)
fileRead = do file <- readFile "tree.txt"
return $ lstToTree $ map read $ words file
Or, since it's a short enough function, you could do away with do
notation altogether and use fmap
again:
fileRead :: (Read a, Ord a) => IO (Tree a)
fileRead = fmap (lstToTree . map read . words) (readFile "tree.txt")
Edit in response to your comment:
Haskell is deliberately designed to keep code that performs IO separate from regular code. There's a very good philosophical reason for this: most Haskell functions are "pure" - that is, their output depends only on the input, just like functions in maths. You can run a pure function a million times and you'll always get the same result. We like pure functions because they can't accidentally break other parts of your program, they permit laziness, and they allow the compiler to aggressively optimise your code for you.
Of course, in the real world we need a little bit of impurity. IO code like getLine
can't possibly be pure (and a program that doesn't do IO is useless!). The result of getLine
depends on what the user typed: you could run getLine
a million times and get a different string every time. Haskell leverages the type system to label impure code with the type IO
.
Here's the crux of the matter: if you use a pure function on data obtained impurely then the result is still impure, because the outcome depends on what the user did. So the whole calculation belongs in the IO
monad. When you want to bring a pure function into IO
you have to lift it, either explicitly (using fmap
) or implicitly (with do
notation).
This is a really common pattern in Haskell - look at my version of fileRead
above. I've used fmap
to operate on impure IO
data with a pure function.