Python faster than compiled Haskell?

爱一瞬间的悲伤 2020-11-29 23:16

I have a simple script written in both Python and Haskell. It reads a file with 1,000,000 newline separated integers, parses that file into a list of integers, quick sorts i

7 Answers
  • 2020-11-29 23:30

    I noticed a problem that nobody else seems to have mentioned; both your Haskell and your Python code have it. (Please tell me if the compiler's optimizations already take care of it; I know nothing about optimizations.) I will demonstrate it in Haskell. In your code you define the lesser and greater lists like this:

    where lesser = filter (<p) xs
          greater = filter (>=p) xs
    

    This is bad because you compare each element of xs with p twice: once to decide whether it goes in the lesser list, and again to decide whether it goes in the greater list. In theory (I haven't checked timings) this makes your sort use twice as many comparisons, which is wasteful. Instead, you should write a function that splits a list into two lists using a predicate, in such a way that

    split f xs
    

    is equivalent to

    (filter f xs, filter (not.f) xs)
    

    With this kind of function you only need to test each element of the list once to know which side of the tuple it belongs in.
    Okay, let's do it:

    where
        split :: (a -> Bool) -> [a] -> ([a], [a])
        split _ [] = ([],[])
        split f (x:xs)
            |f x       = let (a,b) = split f xs in (x:a,b)
            |otherwise = let (a,b) = split f xs in (a,x:b)
    

    Now let's replace the lesser/greater definitions with

    let (lesser, greater) = split (p>) xs in (insert function here)
    

    Full code (here the helper is named splitf):

    quicksort :: Ord a => [a] -> [a]
    quicksort []     = []
    quicksort (p:xs) =
        let (lesser, greater) = splitf (p>) xs
        in (quicksort lesser) ++ [p] ++ (quicksort greater)
        where
            splitf :: (a -> Bool) -> [a] -> ([a], [a])
            splitf _ [] = ([],[])
            splitf f (x:xs)
                |f x       = let (a,b) = splitf f xs in (x:a,b)
                |otherwise = let (a,b) = splitf f xs in (a,x:b)
    

    For some reason I couldn't write the greater/lesser part in the where clause, so I had to write it in a let clause. Also, if it is not tail-recursive, let me know and fix it for me (I don't yet fully understand how tail recursion works).

    Now you should do the same for the Python code. I don't know Python, so I can't do it for you.
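
    The same single-pass split can be sketched in Python roughly as follows. The asker's Python script is not reproduced in this thread, so the names `partition` and `quicksort` below are illustrative assumptions rather than the original code:

```python
def partition(pred, xs):
    """Split xs into (matching, non_matching), calling pred once per element."""
    yes, no = [], []
    for x in xs:
        # Each element is tested exactly once and appended to one side.
        (yes if pred(x) else no).append(x)
    return yes, no


def quicksort(xs):
    """List-based quicksort that uses the first element as the pivot."""
    if not xs:
        return []
    p, rest = xs[0], xs[1:]
    lesser, greater = partition(lambda x: x < p, rest)
    return quicksort(lesser) + [p] + quicksort(greater)
```

    Like the list-based Haskell version, this still allocates new lists at every level, and on already-sorted input the first-element pivot recurses once per element, which can exceed Python's default recursion limit.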

    EDIT: It turns out there is already such a function in Data.List, called partition. Its existence suggests that this kind of function is genuinely useful. This shrinks the code to:

    import Data.List (partition)

    quicksort :: Ord a => [a] -> [a]
    quicksort []     = []
    quicksort (p:xs) =
        let (lesser, greater) = partition (p>) xs
        in (quicksort lesser) ++ [p] ++ (quicksort greater)
    
  • 2020-11-29 23:38

    This is after the fact, but I think most of the trouble is in how the Haskell is written. The following module is pretty primitive -- one should probably use builders, and certainly avoid the ridiculous round trip via String for showing -- but it is simple and did distinctly better than PyPy running kindall's improved Python, and better than the 2- and 4-second Haskell modules elsewhere on this page. (It surprised me how much they were using lists, so I made a couple more turns of the crank.)

    $ time aa.hs        real    0m0.709s
    $ time pypy aa.py   real    0m1.818s
    $ time python aa.py real    0m3.103s
    

    I'm using the sort recommended for unboxed vectors from vector-algorithms. The use of Data.Vector.Unboxed in some form is clearly now the standard, naive way of doing this sort of thing -- it's the new Data.List (for Int, Double, etc.) Everything but the sort is irritating IO management, which could I think still be massively improved, on the write end in particular. The reading and sorting together take about 0.2 sec as you can see from asking it to print what's at a bunch of indexes instead of writing to file, so twice as much time is spent writing as in anything else. If the pypy is spending most of its time using timsort or whatever, then it looks like the sorting itself is surely massively better in Haskell, and just as simple -- if you can just get your hands on the darned vector...

    I'm not sure why there aren't convenient functions around for reading and writing vectors of unboxed things from natural formats -- if there were, this would be three lines long and would avoid String and be much faster, but maybe I just haven't seen them.

    import qualified Data.ByteString.Lazy.Char8 as BL
    import qualified Data.ByteString.Char8 as B
    import qualified Data.Vector.Unboxed.Mutable as M
    import qualified Data.Vector.Unboxed as V
    import Data.Vector.Algorithms.Radix 
    import System.IO
    
    main  = do  unsorted <- fmap toInts (BL.readFile "data")
                vec <- V.thaw unsorted
                sorted <- sort vec >> V.freeze vec
                withFile "sorted" WriteMode $ \handle ->
                   V.mapM_ (writeLine handle) sorted
    
    writeLine :: Handle -> Int -> IO ()
    writeLine h int = B.hPut h $ B.pack (show int ++ "\n")
    
    toInts :: BL.ByteString -> V.Vector Int
    toInts bs = V.unfoldr oneInt (BL.cons ' ' bs) 
    
    oneInt :: BL.ByteString -> Maybe (Int, BL.ByteString)
    oneInt bs = if BL.null bs then Nothing else 
                   let bstail = BL.tail bs
                   in if BL.null bstail then Nothing else BL.readInt bstail
    
  • 2020-11-29 23:39

    In short, don't use read. Replace read with a function like this:

    import Numeric
    
    fastRead :: String -> Int
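    fastRead s = case readDec s of
        [(n, "")] -> n
        -- Fail loudly on malformed input instead of hitting a pattern-match error.
        _         -> error ("fastRead: not a decimal integer: " ++ show s)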
    

    I get a pretty fair speedup:

    ~/programming% time ./test.slow
    ./test.slow  9.82s user 0.06s system 99% cpu 9.901 total
    ~/programming% time ./test.fast
    ./test.fast  6.99s user 0.05s system 99% cpu 7.064 total
    ~/programming% time ./test.bytestring
    ./test.bytestring  4.94s user 0.06s system 99% cpu 5.026 total
    

    Just for fun, the above results include a version that uses ByteString (and hence fails the "ready for the 21st century" test by totally ignoring the problem of file encodings) for ULTIMATE BARE-METAL SPEED. It also has a few other differences; for example, it ships out to the standard library's sort function. The full code is below.

    import qualified Data.ByteString as BS
    import Data.Attoparsec.ByteString.Char8
    import Control.Applicative
    import Data.List
    
    parser = many (decimal <* char '\n')
    
    reallyParse p bs = case parse p bs of
        Partial f -> f BS.empty
        v -> v
    
    main = do
        numbers <- BS.readFile "data"
        case reallyParse parser numbers of
            Done t r | BS.null t -> writeFile "sorted" . unlines . map show . sort $ r
            _ -> error "could not parse the whole input"
    
  • 2020-11-29 23:40

    Python is really optimized for this sort of thing. I suspect that Haskell isn't. Here's a similar question that provides some very good answers.

  • 2020-11-29 23:43

    To follow up on @kindall's interesting answer: those timings depend on the Python and Haskell implementations you use, on the hardware configuration on which you run the tests, and on the algorithm implementation you write in each language.

    Nevertheless, we can try to get some useful hints about the relative performance of one language implementation compared to another, or of one language compared to another. A well-known algorithm like quicksort is a good place to start.

    To illustrate a python/python comparison, I just tested your script on CPython 2.7.3 and PyPy 1.8 on the same machine:

    • CPython: ~8s
    • PyPy: ~2.5s

    This shows that there can be considerable room for improvement in a language implementation; perhaps GHC is simply not compiling your particular Haskell code as well as it could. If you are looking for speed in Python, also consider switching to PyPy, if the rest of your code allows it.
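
    To repeat this kind of comparison on your own machine, a small timing helper is enough. This is only a sketch: it assumes Python 3 (for `time.perf_counter`), and the helper name `time_call` is made up for illustration:

```python
import time


def time_call(fn, *args, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls of fn(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        # Keep the fastest run to reduce noise from other processes.
        best = min(best, time.perf_counter() - start)
    return best
```

    Calling something like `time_call(sorted, data)` with the same input under both CPython and PyPy gives figures that are comparable with each other, though not with timings taken on other hardware.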

  • 2020-11-29 23:45

    The Original Haskell Code

    There are two issues with the Haskell version:

    • You're using string IO, which builds linked lists of characters
    • You're using a non-quicksort that looks like quicksort.
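
    To make the second point concrete: a quicksort in the usual sense partitions an array in place by swapping elements, rather than building new lists at every level. A rough Python sketch of that idea follows (Python is used only for illustration; the function name and the middle-element pivot choice are assumptions, not code from the question):

```python
def quicksort_in_place(a, lo=0, hi=None):
    """Sort list `a` in place using Hoare-style partitioning."""
    if hi is None:
        hi = len(a) - 1
    while lo < hi:
        pivot = a[(lo + hi) // 2]
        i, j = lo, hi
        while i <= j:
            # Skip elements that are already on the correct side.
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        # Recurse into the smaller part and loop on the larger one,
        # so the recursion depth stays logarithmic.
        if j - lo < hi - i:
            quicksort_in_place(a, lo, j)
            lo = i
        else:
            quicksort_in_place(a, i, hi)
            hi = j
```

    No intermediate lists are allocated here, which is the property the list-based version gives up.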

    This program takes 18.7 seconds to run on my Intel Core2 2.5 GHz laptop. (GHC 7.4 using -O2)

    Daniel's ByteString Version

    This is much improved, but notice it still uses the inefficient built-in merge sort.

    His version takes 8.1 seconds (and doesn't handle negative numbers, but that's more of a non-issue for this exploration).

    Note

    From here on this answer uses the following packages: vector, attoparsec, text and vector-algorithms. Also notice that kindall's version using timsort takes 2.8 seconds on my machine (edit: and 2 seconds using PyPy).

    A Text Version

    I ripped off Daniel's version, translated it to Text (so it handles various encodings) and added better sorting using a mutable Vector in an ST monad:

    import Data.Attoparsec.Text.Lazy
    import qualified Data.Text.Lazy as T
    import qualified Data.Text.Lazy.IO as TIO
    import qualified Data.Vector.Unboxed as V
    import qualified Data.Vector.Algorithms.Intro as I
    import Control.Applicative
    import Control.Monad.ST
    import System.Environment (getArgs)
    
    parser = many (decimal <* char '\n')
    
    main = do
        numbers <- TIO.readFile =<< fmap head getArgs
        case parse parser numbers of
            Done t r | T.null t -> writeFile "sorted" . unlines
                                                      . map show . vsort $ r
            x -> error $ Prelude.take 40 (show x)
    
    vsort :: [Int] -> [Int]
    vsort l = runST $ do
            let v = V.fromList l
            m <- V.unsafeThaw v
            I.sort m
            v' <- V.unsafeFreeze m
            return (V.toList v')
    

    This runs in 4 seconds (and also doesn't handle negatives).

    Return to the Bytestring

    So now that we know we can make a more general program that's faster, what about making the ASCII-only version fast? No problem!

    import qualified Data.ByteString.Lazy.Char8 as BS
    import Data.Attoparsec.ByteString.Lazy (parse,  Result(..))
    import Data.Attoparsec.ByteString.Char8 (decimal, char)
    import Control.Applicative ((<*), many)
    import qualified Data.Vector.Unboxed as V
    import qualified Data.Vector.Algorithms.Intro as I
    import Control.Monad.ST
    
    
    parser = many (decimal <* char '\n')
    
    main = do
        numbers <- BS.readFile "rands"
        case parse parser numbers of
            Done t r | BS.null t -> writeFile "sorted" . unlines
                                                       . map show . vsort $ r
    
    vsort :: [Int] -> [Int]
    vsort l = runST $ do
            let v = V.fromList l
            m <- V.unsafeThaw v
            I.sort m
            v' <- V.unsafeFreeze m
            return (V.toList v')
    

    This runs in 2.3 seconds.

    Producing a Test File

    Just in case anyone's curious, my test file was produced by:

    import Control.Monad.CryptoRandom
    import Crypto.Random
    main = do
      g <- newGenIO :: IO SystemRandom
      let rs = Prelude.take (2^20) (map abs (crandoms g) :: [Int])
      writeFile "rands" (unlines $ map show rs)
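
    For anyone reproducing the Python timings without a Haskell toolchain, a comparable file can be produced with a sketch like the one below. It swaps in Python's ordinary `random` module for the cryptographic generator used above, and the function name and parameters are made up for illustration:

```python
import random


def write_random_ints(path, count=2**20, upper=2**63, seed=None):
    """Write `count` non-negative random integers below `upper` to `path`, one per line."""
    rng = random.Random(seed)
    with open(path, "w") as f:
        f.writelines(f"{rng.randrange(upper)}\n" for _ in range(count))
```

    Passing a fixed `seed` makes the file reproducible between runs, which the Haskell generator above does not attempt.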
    

    If you're wondering why vsort isn't packaged in some easier form on Hackage... so am I.
