Question
I have the following code, which works fine unless the file contains UTF-8 characters:
module Main where
import Ref
main = do
    text <- getLine
    theInput <- readFile text
    writeFile ("a"++text) (unlist . proc . lines $ theInput)
With UTF-8 characters I get this:
hGetContents: invalid argument (invalid byte sequence)
Since the file I'm working with has UTF-8 characters, I would like to handle this exception so that I can reuse the functions imported from Ref, if possible.
Is there a way to read a UTF-8 file as IO String so I can reuse the functions from my Ref module? What modifications should I make to my code? Thanks in advance.
Here are the declarations of the functions from my Ref module:
unlist :: [String] -> String
proc :: [String] -> [String]
From the Prelude:
lines :: String -> [String]
Answer 1:
This can be done with just GHC's basic System.IO module (which extends the one in the standard), although you'll then have to use more functions:
module Main where
import Ref
import System.IO
main = do
    text <- getLine
    inputHandle <- openFile text ReadMode
    hSetEncoding inputHandle utf8          -- decode the input as UTF-8
    theInput <- hGetContents inputHandle
    outputHandle <- openFile ("a"++text) WriteMode
    hSetEncoding outputHandle utf8         -- encode the output as UTF-8
    hPutStr outputHandle (unlist . proc . lines $ theInput)
    hClose outputHandle -- closing flushes the buffered output to disk
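A variation on the code above, as a minimal sketch: the two helpers below (readFileUtf8 and writeFileUtf8 are hypothetical names, not part of System.IO) wrap the same hSetEncoding calls in withFile, so each handle is closed even if an exception is raised. Because hGetContents is lazy, the sketch forces the whole string before the handle is closed, which assumes the file fits in memory.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper: read a whole file, decoding it as UTF-8.
readFileUtf8 :: FilePath -> IO String
readFileUtf8 path = withFile path ReadMode $ \h -> do
    hSetEncoding h utf8
    contents <- hGetContents h
    -- hGetContents is lazy: force the full string now, because withFile
    -- closes the handle as soon as this block returns.
    _ <- evaluate (length contents)
    return contents

-- Hypothetical helper: write a String to a file, encoding it as UTF-8.
writeFileUtf8 :: FilePath -> String -> IO ()
writeFileUtf8 path str = withFile path WriteMode $ \h -> do
    hSetEncoding h utf8
    hPutStr h str
```

With these in scope, the body of main would reduce to theInput <- readFileUtf8 text followed by writeFileUtf8 ("a"++text) (unlist . proc . lines $ theInput).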
Answer 2:
Thanks for the answers, but I found the solution myself. The file I was working with actually had this encoding:
ISO-8859 text, with CR line terminators
For my Haskell code to work with that file, it should have this encoding instead:
UTF-8 Unicode text, with CR line terminators
You can check a file's encoding with the file utility, like this:
$ file filename
To change the file's encoding, follow the instructions from this link.
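The conversion can also be sketched in Haskell itself rather than with an external tool. The helper below (convertLatin1ToUtf8 is a hypothetical name) assumes the input is specifically ISO-8859-1, which System.IO exposes as latin1. The file utility only reports "ISO-8859", so that assumption should be checked for any given file.

```haskell
import System.IO

-- Hypothetical helper: re-encode a file from ISO-8859-1 (latin1) to UTF-8.
-- Assumption: the input really is ISO-8859-1, not another ISO-8859 variant.
-- The source and destination paths must be different files.
convertLatin1ToUtf8 :: FilePath -> FilePath -> IO ()
convertLatin1ToUtf8 src dst =
    withFile src ReadMode $ \hIn ->
    withFile dst WriteMode $ \hOut -> do
        hSetEncoding hIn latin1    -- decode input bytes as ISO-8859-1
        hSetEncoding hOut utf8     -- encode output characters as UTF-8
        contents <- hGetContents hIn
        -- hPutStr consumes the lazy input fully while both handles are open.
        hPutStr hOut contents
```

Alternatively, setting latin1 on the input handle of the original program should let it read such a file directly, without converting it first.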
Answer 3:
Use System.IO.Encoding (from the encoding package).
The lack of Unicode support is a well-known problem with the standard Haskell IO library.
{-# LANGUAGE ImplicitParams #-}
module Main where
import Prelude hiding (readFile, getLine, writeFile)
import Ref
import System.IO.Encoding
import Data.Encoding.UTF8
main = do
    let ?enc = UTF8   -- implicit parameter: the encoding used by the IO functions below
    text <- getLine
    theInput <- readFile text
    writeFile ("a" ++ text) (unlist . proc . lines $ theInput)
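If adding a dependency is not desirable, a similar effect can be sketched with GHC's own GHC.IO.Encoding module: setLocaleEncoding changes the default encoding for handles opened afterwards, so the Prelude's readFile and writeFile can stay as they are. This is GHC-specific, and transformFileUtf8 is a hypothetical name used only for illustration.

```haskell
import GHC.IO.Encoding (setLocaleEncoding, utf8)

-- Hypothetical helper: apply a pure String transformation to a UTF-8 file,
-- writing the result to a different file.
transformFileUtf8 :: (String -> String) -> FilePath -> FilePath -> IO ()
transformFileUtf8 f src dst = do
    -- Files opened after this call default to UTF-8,
    -- regardless of the system locale.
    setLocaleEncoding utf8
    contents <- readFile src
    writeFile dst (f contents)
```

In the question's program this would correspond to transformFileUtf8 (unlist . proc . lines) text ("a" ++ text).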
Source: https://stackoverflow.com/questions/33444796/read-file-with-utf-8-in-haskell-as-io-string