Why does Haskell use mergesort instead of quicksort?

执念已碎 2021-01-31 01:38

In Wikibooks' Haskell, there is the following claim:

Data.List offers a sort function for sorting lists. It does not use quicksort; rather, it uses an ef

6 answers
  •  说谎 (OP)
    2021-01-31 01:50

    Many of the arguments for why Quicksort is not used in Haskell seem plausible. However, Quicksort is at least no slower than Mergesort on random input. Based on the implementation given in Richard Bird's book, Thinking Functionally with Haskell, I wrote a 3-way Quicksort:

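    -- 3-way quicksort: split around the pivot x into less (us), equal (ws), greater (vs).
    tqsort :: Ord a => [a] -> [a]
    tqsort [] = []
    tqsort (x:xs) = sortp xs [] [x] []
      where
        -- Input exhausted: sort the two outer partitions and concatenate.
        sortp [] us ws vs     = tqsort us ++ ws ++ tqsort vs
        -- Otherwise push y onto the accumulator chosen by comparing it with the pivot.
        sortp (y:ys) us ws vs =
          case compare y x of
            LT -> sortp ys (y:us) ws vs
            GT -> sortp ys us ws (y:vs)
            _  -> sortp ys us (y:ws) vs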
    

    I benchmarked a few cases, e.g., lists of size 10^4 containing Int values between 0 and 10^3 or 10^4, and so on. The result is that the 3-way Quicksort, and even Bird's version, beat GHC's Mergesort: roughly 1.x to 3.x times faster, depending on the shape of the data (many repetitions? very sparse?). The following statistics were generated by criterion:

    benchmarking Data.List.sort/Diverse/10^5
    time                 223.0 ms   (217.0 ms .. 228.8 ms)
                         1.000 R²   (1.000 R² .. 1.000 R²)
    mean                 226.4 ms   (224.5 ms .. 228.3 ms)
    std dev              2.591 ms   (1.824 ms .. 3.354 ms)
    variance introduced by outliers: 14% (moderately inflated)
    
    benchmarking 3-way Quicksort/Diverse/10^5
    time                 91.45 ms   (86.13 ms .. 98.14 ms)
                         0.996 R²   (0.993 R² .. 0.999 R²)
    mean                 96.65 ms   (94.48 ms .. 98.91 ms)
    std dev              3.665 ms   (2.775 ms .. 4.554 ms)
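
Before trusting timings like these, it is worth checking that the function under test really sorts. The block below is only a sketch: it repeats tqsort so that it compiles on its own, and the names pseudoRandoms and agreesWithSort, the seed, and the LCG constants are assumptions made for illustration, not part of the original benchmark setup.

```haskell
import Data.List (sort)

-- The 3-way quicksort from the answer, repeated so this block is self-contained.
tqsort :: Ord a => [a] -> [a]
tqsort [] = []
tqsort (x:xs) = sortp xs [] [x] []
  where
    sortp [] us ws vs     = tqsort us ++ ws ++ tqsort vs
    sortp (y:ys) us ws vs =
      case compare y x of
        LT -> sortp ys (y:us) ws vs
        GT -> sortp ys us ws (y:vs)
        _  -> sortp ys us (y:ws) vs

-- Hypothetical helper: n pseudo-random Ints in [0, bound), drawn from a simple
-- linear congruential generator. The constants are an assumption; any
-- reasonable generator would do. Assumes a non-negative seed and 64-bit Int.
pseudoRandoms :: Int -> Int -> Int -> [Int]
pseudoRandoms seed bound n =
  map (\s -> (s `div` 65536) `mod` bound) (take n (tail (iterate step seed)))
  where
    step s = (s * 1103515245 + 12345) `mod` 2147483648

-- Hypothetical check: tqsort must produce the same list as Data.List.sort.
agreesWithSort :: [Int] -> Bool
agreesWithSort xs = tqsort xs == sort xs
```

In GHCi, `agreesWithSort (pseudoRandoms 42 1000 10000)` mirrors the "size 10^4, values below 10^3" case described above; changing the bound gives sparser or more repetitive data.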
    

    However, Haskell 98/2010 states another requirement for sort: it must be stable. The typical Quicksort implementation using Data.List.partition is stable, but the one above isn't, because its accumulators collect elements in reverse order.
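
One way to restore stability, sketched here under my own assumptions, is to keep the same three accumulators but reverse them before recursing, and to put the pivot in front of the elements equal to it. The name stableTqsortBy is hypothetical, and this is not necessarily the version referred to elsewhere in this thread.

```haskell
-- Hypothetical stable variant of the 3-way quicksort, parameterised over the
-- comparison so that stability is observable on keyed data.
stableTqsortBy :: (a -> a -> Ordering) -> [a] -> [a]
stableTqsortBy _   []     = []
stableTqsortBy cmp (x:xs) = go xs [] [] []
  where
    -- The accumulators are built in reverse input order, so each is reversed
    -- once before use. The pivot x came first in the input, so it goes ahead
    -- of every element that compares equal to it.
    go [] us ws vs =
      stableTqsortBy cmp (reverse us)
        ++ (x : reverse ws)
        ++ stableTqsortBy cmp (reverse vs)
    go (y:ys) us ws vs =
      case cmp y x of
        LT -> go ys (y:us) ws vs
        GT -> go ys us ws (y:vs)
        EQ -> go ys us (y:ws) vs
```

Sorting pairs by their first component only, e.g. `stableTqsortBy (\p q -> compare (fst p) (fst q))`, leaves pairs with equal keys in their original relative order. The extra reverse calls add a linear pass per partition, so the cost relative to tqsort would need to be measured rather than assumed.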


    Later addition: the stable 3-way Quicksort mentioned in the comments seems to be as fast as tqsort here.
