Haskell IO and closing files

走了就别回头了 2021-01-30 21:21

When I open a file for reading in Haskell, I've found that I can't use the contents of the file after closing it. For example, this program will print the contents of a file:

6 Answers
  • 2021-01-30 21:25

    The explanation is rather long to be included here. Forgive me for dispensing a short tip only: you need to read about "semi-closed file handles" and "unsafeInterleaveIO".

    In short: this behaviour is a design compromise between semantic clarity and lazy evaluation. You should either postpone hClose until you are absolutely sure you won't be doing anything more with the file content (for example, call it in an error handler), or use something other than hGetContents to read the file contents non-lazily.
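
    A minimal sketch of the first option, assuming a hypothetical helper countLines and a sample file demo.txt (neither is from the question): every use of the lazily read contents happens inside the withFile callback, so the handle is closed only after the result has been computed.

```haskell
import System.IO

-- Count the lines of a file. The contents are consumed entirely inside
-- the withFile callback, so the handle is closed only afterwards.
countLines :: FilePath -> IO Int
countLines path =
  withFile path ReadMode $ \h -> do
    contents <- hGetContents h
    -- ($!) forces the count before withFile closes the handle
    return $! length (lines contents)

main :: IO ()
main = do
  writeFile "demo.txt" "one\ntwo\nthree\n"  -- hypothetical sample file
  n <- countLines "demo.txt"
  print n
```

    Without the ($!), the count would still be an unevaluated thunk when the handle is closed, which is the same trap described in the question.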

  • 2021-01-30 21:30

    [Update: Prelude.readFile causes problems as described below, but switching over to using Data.ByteString's versions of everything works: I no longer get the exception.]

    Haskell newbie here, but currently I don't buy the claim that "readFile is strict, and closes the file when it's done":

    go fname = do
       putStrLn "reading"
       body <- readFile fname
       let body' = "foo" ++ body ++ "bar"
       putStrLn body' -- comment this out to get a runtime exception.
       putStrLn "writing"
       writeFile fname body'
       return ()
    

    That works as it stands on the file that I was testing with, but if you comment out the putStrLn then apparently the writeFile fails. (Interesting how lame Haskell exception messages are, lacking line numbers etc.?)

    Test> go "Foo.hs"
    reading
    writing
    Exception: Foo.hs: openFile: permission denied (Permission denied)
    Test> 
    

    ?!?!?
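
    The ByteString variant mentioned in the update at the top of this answer might look roughly like this. It is a sketch rather than the exact code I ran: wrapFile and demo.txt are made-up names, and Data.ByteString.Char8 is used only so that pack can build the "foo" and "bar" pieces.

```haskell
import qualified Data.ByteString.Char8 as B

-- Wrap the contents of a file in "foo" ... "bar" and write the result
-- back to the same file.
wrapFile :: FilePath -> IO ()
wrapFile fname = do
  body <- B.readFile fname   -- strict: the whole file is read and the handle closed
  let body' = B.concat [B.pack "foo", body, B.pack "bar"]
  B.writeFile fname body'    -- no handle on fname is still open at this point

main :: IO ()
main = do
  B.writeFile "demo.txt" (B.pack "hello")  -- hypothetical sample file
  wrapFile "demo.txt"
  B.readFile "demo.txt" >>= B.putStrLn
```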

  • 2021-01-30 21:38

    This is because hGetContents doesn't do anything yet: it's lazy I/O. Only when you use the resulting string is the file actually read (or the part of it that is needed). If you want to force it to be read, you can compute its length, and use the seq function to force the length to be evaluated. Lazy I/O can be cool, but it can also be confusing.

    For more information, see the part about lazy I/O in Real World Haskell, for example.
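
    A small sketch of the length-and-seq trick, assuming a hypothetical helper readFileStrictly and a sample file demo.txt:

```haskell
import System.IO

-- Read a whole file as a String and close the handle before returning.
readFileStrictly :: FilePath -> IO String
readFileStrictly path = do
  h <- openFile path ReadMode
  contents <- hGetContents h
  -- length has to walk the entire list, so forcing it reads the whole file
  length contents `seq` hClose h
  return contents

main :: IO ()
main = do
  writeFile "demo.txt" "blahblahblah\n"  -- hypothetical sample file
  s <- readFileStrictly "demo.txt"
  putStr s
```

    Forcing contents itself with seq would not be enough here, since that only evaluates the first cons cell of the string.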

  • 2021-01-30 21:46

    If you want to keep your IO lazy, but to do it safely so that errors such as this don't occur, use a package designed for this such as safe-lazy-io. (However, safe-lazy-io doesn't support bytestring I/O.)

  • 2021-01-30 21:50

    As previously noted, hGetContents is lazy. readFile is strict, and closes the file when it's done:

    main = do contents <- readFile "foo"
              putStr contents
    

    yields the following in Hugs

    > main
    blahblahblah
    

    where foo is

    blahblahblah
    

    Interestingly, seq will only guarantee that some portion of the input is read, not all of it, because it evaluates contents only to weak head normal form (the first cons cell):

    import System.IO

    main = do inFile <- openFile "foo" ReadMode
              contents <- hGetContents $! inFile
              contents `seq` hClose inFile  -- seq forces only the first cons cell of contents
              putStr contents
    

    yields

    > main
    b
    

    A good resource is: Making Haskell programs faster and smaller: hGetContents, hClose, readFile

  • 2021-01-30 21:51

    As others have stated, it is because of lazy evaluation. The handle is semi-closed after hGetContents, and is closed automatically once all the data has been read. Both hGetContents and readFile are lazy in this way. When you have problems with handles being kept open, you typically just force the read. Here's the easy way:

    import System.IO
    import Control.Parallel.Strategies (rnf)
    -- rnf means "reduce to normal form"
    -- (newer library versions export rnf from Control.DeepSeq in the deepseq package)
    main = do inFile <- openFile "foo" ReadMode
              contents <- hGetContents inFile
              rnf contents `seq` hClose inFile -- force the whole file to be read, then close
              putStr contents
    

    These days, however, Strings are rarely used for file I/O anymore. The newer way is to use Data.ByteString (available on Hackage), and Data.ByteString.Lazy when you want lazy reads.

    import qualified Data.ByteString as Str
    
    main = do contents <- Str.readFile "foo"
              -- readFile is strict, so the entire string is read here
              Str.putStr contents
    

    ByteStrings are the way to go for big strings (like file contents). They are much faster and more memory efficient than String (= [Char]).

    Notes:

    I imported rnf from Control.Parallel.Strategies only for convenience. You could write something like it yourself pretty easily:

      forceList :: [a] -> ()
      forceList [] = ()
      forceList (_:xs) = forceList xs
    

    This just forces a traversal of the spine (not the values) of the list, which would have the effect of reading the whole file.
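
    To see the difference between forcing the first cell, the spine, and the elements, here is a pure sketch that needs no file at all. The name forceSpine is made up for this example; it is the same function as forceList above.

```haskell
-- Walk the list's cons cells without ever inspecting the elements.
forceSpine :: [a] -> ()
forceSpine []     = ()
forceSpine (_:xs) = forceSpine xs

main :: IO ()
main = do
  -- The elements are never evaluated, so undefined entries are harmless.
  print (forceSpine [undefined, undefined :: Int])
  -- seq stops at the first cons cell, so the undefined tail is never touched.
  print (((1 :: Int) : undefined) `seq` "ok")
```

    With a String read by hGetContents, walking the spine is what pulls every character from the file, whereas a bare seq stops after the first cell.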

    Lazy I/O is increasingly considered evil by experts; I recommend using strict bytestrings for most file I/O for the time being. There are a few solutions in the oven that attempt to bring back composable incremental reads, the most promising of which is Oleg's "Iteratee".
