Efficient heaps in purely functional languages

后端 未结 9 1867
被撕碎了的回忆
被撕碎了的回忆 2021-01-30 13:42

As an exercise in Haskell, I\'m trying to implement heapsort. The heap is usually implemented as an array in imperative languages, but this would be hugely inefficient in purely

相关标签:
9条回答
  • 2021-01-30 13:52

    Just like in efficient Quicksort algorithms written in Haskell, you need to use monads (state transformers) to do stuff in-place.

    0 讨论(0)
  • 2021-01-30 13:55

    As an exercise in Haskell, I implemented an imperative heapsort with the ST Monad.

    {-# LANGUAGE ScopedTypeVariables #-}
    
    import Control.Monad (forM, forM_)
    import Control.Monad.ST (ST, runST)
    import Data.Array.MArray (newListArray, readArray, writeArray)
    import Data.Array.ST (STArray)
    import Data.STRef (newSTRef, readSTRef, writeSTRef)
    
    heapSort :: forall a. Ord a => [a] -> [a]
    heapSort list = runST $ do
      let n = length list
      heap <- newListArray (1, n) list :: ST s (STArray s Int a)
      heapSizeRef <- newSTRef n
      let
        heapifyDown pos = do
          val <- readArray heap pos
          heapSize <- readSTRef heapSizeRef
          let children = filter (<= heapSize) [pos*2, pos*2+1]      
          childrenVals <- forM children $ \i -> do
            childVal <- readArray heap i
            return (childVal, i)
          let (minChildVal, minChildIdx) = minimum childrenVals
          if null children || val < minChildVal
            then return ()
            else do
              writeArray heap pos minChildVal
              writeArray heap minChildIdx val
              heapifyDown minChildIdx
        lastParent = n `div` 2
      forM_ [lastParent,lastParent-1..1] heapifyDown
      forM [n,n-1..1] $ \i -> do
        top <- readArray heap 1
        val <- readArray heap i
        writeArray heap 1 val
        writeSTRef heapSizeRef (i-1)
        heapifyDown 1
        return top
    

    btw I contest that if it's not purely functional then there is no point in doing so in Haskell. I think my toy implementation is much nicer than what one would achieve in C++ with templates, passing around stuff to the inner functions.

    0 讨论(0)
  • 2021-01-30 13:56

    You could also use the ST monad, which allows you to write imperative code but expose a purely functional interface safely.

    0 讨论(0)
  • 2021-01-30 13:58

    Arrays in Haskell aren't as hugely inefficient as you might think, but typical practice in Haskell would probably be to implement this using ordinary data types, like this:

    data Heap a = Empty | Heap a (Heap a) (Heap a)
    fromList :: Ord a => [a] -> Heap a
    toSortedList :: Ord a => Heap a -> [a]
    heapSort = toSortedList . fromList
    

    If I were solving this problem, I might start by stuffing the list elements into an array, making it easier to index them for heap creation.

    import Data.Array
    fromList xs = heapify 0 where
      size = length xs
      elems = listArray (0, size - 1) xs :: Array Int a
      heapify n = ...
    

    If you're using a binary max heap, you might want to keep track of the size of the heap as you remove elements so you can find the bottom right element in O(log N) time. You could also take a look at other types of heaps that aren't typically implemented using arrays, like binomial heaps and fibonacci heaps.

    A final note on array performance: in Haskell there's a tradeoff between using static arrays and using mutable arrays. With static arrays, you have to create new copies of the arrays when you change the elements. With mutable arrays, the garbage collector has a hard time keeping different generations of objects separated. Try implementing the heapsort using an STArray and see how you like it.

    0 讨论(0)
  • 2021-01-30 14:02

    There are a number of Haskell heap implementations in an appendix to Okasaki's Purely Functional Data Structures. (The source code can be downloaded at the link. The book itself is well worth reading.) None of them are binary heaps, per se, but the "leftist" heap is very similar. It has O(log n) insertion, removal, and merge operations. There are also more complicated data structures like skew heaps, binomial heaps, and splay heaps which have better performance.

    0 讨论(0)
  • 2021-01-30 14:08

    Jon Fairbairn posted a functional heapsort to the Haskell Cafe mailing list back in 1997:

    http://www.mail-archive.com/haskell@haskell.org/msg01788.html

    I reproduce it below, reformatted to fit this space. I've also slightly simplified the code of merge_heap.

    I'm surprised treefold isn't in the standard prelude since it's so useful. Translated from the version I wrote in Ponder in October 1992 -- Jon Fairbairn

    module Treefold where
    
    -- treefold (*) z [a,b,c,d,e,f] = (((a*b)*(c*d))*(e*f))
    treefold f zero [] = zero
    treefold f zero [x] = x
    treefold f zero (a:b:l) = treefold f zero (f a b : pairfold l)
        where 
            pairfold (x:y:rest) = f x y : pairfold rest
            pairfold l = l -- here l will have fewer than 2 elements
    
    
    module Heapsort where
    import Treefold
    
    data Heap a = Nil | Node a [Heap a]
    heapify x = Node x []
    
    heapsort :: Ord a => [a] -> [a]    
    heapsort = flatten_heap . merge_heaps . map heapify    
        where 
            merge_heaps :: Ord a => [Heap a] -> Heap a
            merge_heaps = treefold merge_heap Nil
    
            flatten_heap Nil = []
            flatten_heap (Node x heaps) = x:flatten_heap (merge_heaps heaps)
    
            merge_heap heap Nil = heap
            merge_heap node_a@(Node a heaps_a) node_b@(Node b heaps_b)
                | a < b = Node a (node_b: heaps_a)
                | otherwise = Node b (node_a: heaps_b)
    
    0 讨论(0)
提交回复
热议问题