Basic I/O performance in Haskell

别跟我提以往 2021-01-03 08:31

Another microbenchmark: Why is this "loop" (compiled with ghc -O2 -fllvm, 7.4.1, Linux 64bit 3.2 kernel, redirected to /dev/null)



        
3 Answers
  • 2021-01-03 08:41

    The standard Haskell way to hand giant bytestrings over to the operating system is to use a builder monoid.

    import Data.ByteString.Lazy.Builder  -- requires bytestring-0.10.x
    import Data.ByteString.Lazy.Builder.ASCII -- omit for bytestring-0.10.2.x
    import Data.Monoid
    import System.IO
    
    main :: IO ()
    main = hPutBuilder stdout $ build [0..100000000::Int]
    
    -- Render each Int in decimal followed by a newline, as a single Builder.
    build :: [Int] -> Builder
    build = foldr add_line mempty
       where add_line n b = intDec n <> charUtf8 '\n' <> b
    

    which gives me:

     $ time ./printbuilder >> /dev/null
     real   0m7.032s
     user   0m6.603s
     sys    0m0.398s
    

    in contrast to the Haskell approach you used:

    $ time ./print >> /dev/null
    real    1m0.143s
    user    0m58.349s
    sys 0m1.032s
    

    That is, it's child's play to do roughly eight and a half times better than mapM_ print (about 7 seconds versus 60), contra Daniel Fischer's surprising defeatism. Everything you need to know is here: http://hackage.haskell.org/packages/archive/bytestring/0.10.2.0/doc/html/Data-ByteString-Builder.html I won't compare it with your C, since my results were much slower than Daniel's and n.m.'s, so I figure something was going wrong.

    Edit: Made the imports consistent with all versions of bytestring-0.10.x. It occurred to me that the following might be clearer, as the Builder equivalent of unlines . map show:

    main = hPutBuilder stdout $ unlines_ $ map intDec [0..100000000::Int]
     where unlines_ = mconcat . map (<> charUtf8 '\n')
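
    The claim that this is the Builder equivalent of unlines . map show can be checked on a small range. The following is a minimal sketch, not part of the original answer: it assumes a recent GHC and a bytestring version that exposes the module as Data.ByteString.Builder (the module the documentation link above refers to), and the names unlinesInts, viaBuilder and viaString are hypothetical helpers introduced only for this comparison.

    ```haskell
    import Data.ByteString.Builder (Builder, charUtf8, intDec, toLazyByteString)
    import qualified Data.ByteString.Lazy.Char8 as L

    -- Builder counterpart of `unlines . map show` for Ints (hypothetical name).
    unlinesInts :: [Int] -> Builder
    unlinesInts = mconcat . map (\n -> intDec n <> charUtf8 '\n')

    -- Render the numbers through the Builder.
    viaBuilder :: [Int] -> L.ByteString
    viaBuilder = toLazyByteString . unlinesInts

    -- Render the same numbers through String, then pack into bytes.
    viaString :: [Int] -> L.ByteString
    viaString = L.pack . unlines . map show

    -- Compare both renderings on a small range that includes negatives.
    main :: IO ()
    main = print (viaBuilder [-3 .. 1000] == viaString [-3 .. 1000])
    ```

    Both paths produce ASCII decimal digits, so the comparison is byte for byte; the difference between them is only in how the bytes are produced.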
    
  • 2021-01-03 08:44

    On my (rather slow and outdated) machine the results are:

    $ time haskell-test > haskell-out.txt
    real    1m57.497s
    user    1m47.759s
    sys     0m9.369s
    $ time c-test > c-out.txt
    real    7m28.792s
    user    1m9.072s
    sys     6m13.923s
    $ diff haskell-out.txt c-out.txt
    $
    

    (I have fixed the list so that both C and Haskell start with 0).

    Yes, you read this right: Haskell is almost four times faster than C in wall-clock time. Or rather, normally buffered Haskell is faster than C that uses unbuffered write(2) syscalls.

    (When measuring output to /dev/null instead of a real disk file, C is about 1.5 times faster, but who cares about /dev/null performance?)
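
    To see how much of the difference comes from buffering alone, the buffering mode of stdout can be switched on the Haskell side. This is only a sketch for experimentation, not the code measured above: chooseMode is a hypothetical helper, the command-line argument names are made up, and the upper bound of one million is an arbitrary smaller value chosen so that the unbuffered run stays short.

    ```haskell
    import System.Environment (getArgs)
    import System.IO

    -- Hypothetical helper: pick a buffering mode from the first argument.
    -- "none" and "line" are made-up argument names; anything else
    -- selects block buffering with an implementation-chosen buffer size.
    chooseMode :: [String] -> BufferMode
    chooseMode ("none" : _) = NoBuffering
    chooseMode ("line" : _) = LineBuffering
    chooseMode _            = BlockBuffering Nothing

    main :: IO ()
    main = do
      args <- getArgs
      hSetBuffering stdout (chooseMode args)
      -- Arbitrary, smaller range than in the measurements above.
      mapM_ print [0 :: Int .. 1000000]
    ```

    Running the compiled program under time with and without the none argument, once redirected to a file and once to /dev/null, gives a rough idea of how much the buffering mode matters on a given machine.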

    Technical data: Intel E2140 CPU, 2 cores, 1.6 GHz, 1M cache, Gentoo Linux, gcc 4.6.1, GHC 7.6.1.

  • 2021-01-03 08:50

    Okay, on my box the C code, compiled with gcc -O3, takes about 21.5 seconds to run, and the original Haskell code about 56 seconds. So not a factor of 5, but a bit above 2.5.

    The first nontrivial difference is that

    mapM_ print [1..100000000]
    

    uses Integers, which is a bit slower because it involves an upfront check and then works with boxed Ints, whereas the Show instance of Int does the conversion work on unboxed Int#s.

    Adding a type signature, so that the Haskell code works on Ints,

    mapM_ print [1 :: Int .. 100000000]
    

    brings the time down to 47 seconds, a bit above twice the time the C code takes.

    Now, another big difference is that show produces a linked list of Char and doesn't just fill a contiguous buffer of bytes. That is slower too.

    Then that linked list of Chars is used to fill a byte buffer that then is written to the stdout handle.

    So the Haskell code does more, and more complicated, things than the C code, and it's not surprising that it takes longer.

    Admittedly, it would be desirable to have an easy way to output such things more directly (and hence faster). However, the proper way to handle it is to use a more suitable algorithm (that applies to C too). A simple change to

    putStr . unlines $ map show [0 :: Int .. 100000000]
    

    almost halves the time taken, and if one wants it really fast, one uses the faster ByteString I/O and builds the output efficiently as exemplified in applicative's answer.
