As part of a programming challenge, I need to read a sequence of space-separated integers (on a single line) from stdin and print the sum of those integers to stdout.
Read is slow
Fast read, from this answer, will bring you down to 5.5 seconds.
import Numeric
fastRead :: String -> Int
fastRead s = case readDec s of [(n, "")] -> n
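Note that fastRead as written accepts only unsigned decimal numbers and dies with a pattern-match failure on anything else. A minimal sketch of a variant that also accepts a leading minus sign and reports bad tokens explicitly; the names fastReadSigned, readUnsigned and sumLine are hypothetical and not part of the linked answer:

```haskell
import Numeric (readDec)

-- Hypothetical variant of fastRead: accepts an optional leading '-'.
fastReadSigned :: String -> Int
fastReadSigned ('-' : digits) = negate (readUnsigned digits)
fastReadSigned digits = readUnsigned digits

-- Parse an unsigned decimal, failing with a descriptive error
-- instead of an incomplete-pattern crash.
readUnsigned :: String -> Int
readUnsigned s = case readDec s of
  [(n, "")] -> n
  _ -> error ("not a decimal integer: " ++ show s)

-- Sum all space-separated integers on one line.
sumLine :: String -> Int
sumLine = sum . map fastReadSigned . words
```

With these definitions, something like main = getLine >>= print . sumLine would handle the single-line input. It still goes through String, so it does not address the linked-list overhead.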
Strings are Linked Lists
In Haskell the String type is a linked list of characters. Use a packed representation instead: bytestring if you really only want ASCII, though Text is also very fast and supports Unicode. As shown in this answer, the performance should then be neck and neck.
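As a rough illustration of the Text route, here is a sketch that parses with Data.Text.Read from the text package. The helper names sumInts and parseInt are made up for this example, and no timing is claimed for it:

```haskell
import qualified Data.Text as T
import qualified Data.Text.Read as TR

-- Sum all whitespace-separated integers in a Text value.
sumInts :: T.Text -> Int
sumInts = sum . map parseInt . T.words
  where
    -- 'TR.signed TR.decimal' accepts an optional sign followed by digits
    -- and returns the parsed number together with the unconsumed rest.
    parseInt t = case TR.signed TR.decimal t of
      Right (n, rest) | T.null rest -> n
      _ -> error ("not an integer: " ++ T.unpack t)
```

To use it on stdin, read the input with getContents from Data.Text.IO and pass the result to sumInts.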
read is slow. For bulk parsing, use bytestring or text primitives, or attoparsec.
I did some benchmarking. Your original version ran in 23.9 secs on my computer. The version below ran in 0.35 secs:
import qualified Data.ByteString.Char8 as B
import Control.Applicative
import Data.Maybe
import Data.List
import Data.Char
main = print . sum =<< getIntList
getIntList :: IO [Int]
getIntList =
  map (fst . fromJust . B.readInt) . B.words <$> B.readFile "test.txt"
By specializing the parser to your test.txt file, I could get the runtime down to 0.26 sec:
getIntList :: IO [Int]
getIntList =
  unfoldr (B.readInt . B.dropWhile (==' ')) <$> B.readFile "test.txt"
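The unfoldr version works because B.readInt returns the parsed number together with the rest of the input, which becomes the seed for the next step, and the list ends when B.readInt returns Nothing. Since the question reads from stdin rather than test.txt, the same idea can be sketched as pure functions over a ByteString. The names parseInts and sumInts are hypothetical, and skipping all whitespace (not just spaces) is my assumption so that a trailing newline is tolerated:

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Char (isSpace)
import Data.List (foldl', unfoldr)

-- Repeatedly skip whitespace and read one Int; 'B.readInt' yields
-- Just (n, rest) on success, and Nothing ends the unfold.
parseInts :: B.ByteString -> [Int]
parseInts = unfoldr (B.readInt . B.dropWhile isSpace)

-- Strict left fold so the sum does not build up thunks.
sumInts :: B.ByteString -> Int
sumInts = foldl' (+) 0 . parseInts
```

For stdin, main = B.getContents >>= print . sumInts should be enough. Parsing stops silently at the first token that is not an integer, so malformed input gives a partial sum rather than an error.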
I would venture to guess that a big part of your problem is actually words. When you map read . words, what you're actually doing is this:
This is a fairly ridiculous way to proceed. I believe you can even do better using something horrible like reads, but it would make more sense to use something like ReadP. You can also try fancier sorts of things like stream-based parsing; I don't know if that will help much or not.
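For reference, a ReadP parser for this input might look roughly like the sketch below. It parses the whole line in one pass instead of splitting into words first. The names intP, intsP and parseInts are invented for this example, and I have not measured whether it is any faster:

```haskell
import Data.Char (digitToInt, isDigit)
import Data.List (foldl')
import Text.ParserCombinators.ReadP
  (ReadP, char, eof, many, munch1, option, readP_to_S, skipSpaces)

-- One integer: optional '-' followed by one or more digits.
intP :: ReadP Int
intP = do
  applySign <- option id (negate <$ char '-')
  digits <- munch1 isDigit
  -- Accumulate the digits by hand rather than calling 'read'.
  pure (applySign (foldl' (\acc c -> acc * 10 + digitToInt c) 0 digits))

-- Zero or more integers separated by whitespace, up to end of input.
intsP :: ReadP [Int]
intsP = skipSpaces *> many (intP <* skipSpaces) <* eof

-- Run the parser; Nothing means the input was not a list of integers.
parseInts :: String -> Maybe [Int]
parseInts s = case readP_to_S intsP s of
  [(ns, "")] -> Just ns
  _ -> Nothing
```

Something like fmap sum . parseInts then gives the sum, or Nothing on malformed input. This still consumes a String, so the packed representations mentioned in the other answers are likely to matter more for raw speed.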