How do I write a parallel reduction using strategies in Haskell?

不思量自难忘° 2021-01-02 23:46

In high-performance computing, sums, products, etc. are often calculated using a "parallel reduction" that takes n elements and completes in O(log n) time.
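
For illustration, a purely sequential sketch of the kind of reduction tree being described: each round combines neighbouring pairs, so an associative operator over n elements needs roughly log2 n rounds.

    -- Illustration only: combine elements pairwise, round by round, e.g.
    -- [1,2,3,4,5,6,7,8] -> [3,7,11,15] -> [10,26] -> [36]
    reduceTree :: (a -> a -> a) -> [a] -> a
    reduceTree _ [x] = x
    reduceTree f xs  = reduceTree f (pairUp xs)
      where
        pairUp (a:b:rest) = f a b : pairUp rest
        pairUp rest       = rest   -- leftover element (empty input is not handled, like foldr1)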

3 Answers
  • 2021-01-03 00:03

    I don't think a list is the right data type for this. Because it's just a linked list, the data will necessarily be accessed sequentially. Although you can evaluate the items in parallel, you won't gain much in the reduction step. If you really need a List, I think the best function would be just

    parFold f = foldl1' f . withStrategy (parList rseq)
    

    or maybe

    parFold f = foldl1' f . withStrategy (parBuffer 5 rseq)
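
    Either version relies on the parallel package. A minimal, self-contained way to try the second one (the file name, example list, and RTS flags here are illustrative, not part of the answer) might look like this; the sparks only actually run in parallel when the program is compiled with -threaded and run with +RTS -N:

    import Control.Parallel.Strategies (parBuffer, rseq, withStrategy)
    import Data.List (foldl1')

    parFold :: (a -> a -> a) -> [a] -> a
    parFold f = foldl1' f . withStrategy (parBuffer 5 rseq)

    main :: IO ()
    main = print (parFold (+) [1 .. 1000000 :: Integer])

    -- Build:  ghc -O2 -threaded -rtsopts ParFold.hs
    -- Run:    ./ParFold +RTS -N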
    

    If the reduction step is complex, you might get a gain by subdividing the list like this:

    parReduce f = foldl' f mempty . reducedList . chunkList . withStrategy (parList rseq)
     where
      chunkList []   = []   -- stop once the input is exhausted
      chunkList list = let (l,ls) = splitAt 1000 list in l : chunkList ls
      reducedList = parMap rseq (foldl' f mempty)
    

    I've taken the liberty of assuming your data is a Monoid so that mempty is available; if that isn't possible, you can either replace mempty with your own identity value or, in the worst case, use foldl1'.

    There are two combinators from Control.Parallel.Strategies in use here: parList evaluates all items of the input list in parallel, chunkList then divides the list into chunks of 1000 elements, and each of those chunks is reduced in parallel by parMap.
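
    As a usage sketch of the Monoid assumption (the Sum wrapper and the million-element list are just an illustration, with the parReduce above in scope), summing a list of numbers could look like:

    import Data.Monoid (Sum (..))

    -- mempty is Sum 0 here, so each 1000-element chunk is folded from the
    -- monoid's identity and the per-chunk results are combined with (<>).
    total :: Integer
    total = getSum (parReduce (<>) (map Sum [1 .. 1000000]))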

    You might also try

    parReduce2 f = foldl' f mempty . reducedList . chunkList
     where
      chunkList []   = []   -- stop once the input is exhausted
      chunkList list = let (l,ls) = splitAt 1000 list in l : chunkList ls
      reducedList = parMap rseq (foldl' f mempty)
    

    Depending on exactly how the work is distributed, one of these may be more efficient than the others.

    If you can use a data structure with good support for indexing, though (Array, Vector, Map, etc.), then you can do binary subdivisions for the reduction step, which will probably be better overall, as sketched below.
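
    Here is one possible sketch of such a binary-subdivision reduction over Data.Vector from the vector package, using par and pseq from Control.Parallel; it assumes f is associative with z as an identity element, and the 1024 threshold is an arbitrary tuning choice:

    import Control.Parallel (par, pseq)
    import qualified Data.Vector as V

    parReduceVec :: (a -> a -> a) -> a -> V.Vector a -> a
    parReduceVec f z v
      | V.length v <= 1024 = V.foldl' f z v                 -- small piece: fold sequentially
      | otherwise          = right `par` (left `pseq` f left right)
      where
        (l, r) = V.splitAt (V.length v `div` 2) v           -- O(1) split
        left   = parReduceVec f z l
        right  = parReduceVec f z r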

  • 2021-01-03 00:13

    Not sure what your parFold function is supposed to do. If that is intended to be a parallel version of foldr or foldl, I think its definition is wrong.

    parFold :: (a -> a -> a) -> [a] -> a
    
    -- fold right in Haskell (takes 3 arguments)
    foldr :: (a -> b -> b) -> b -> [a] -> b
    

    A fold applies the same function to each element of the list and accumulates the result of each application. Coming up with a parallel version of it, I guess, would require that the function applications to the elements be done in parallel - a bit like what parList does.

    par_foldr :: (NFData a, NFData b) => (a -> b -> b) -> b -> [a] -> b
    par_foldr f z []     = z
    par_foldr f z (x:xs) = res `using` \_ -> rseq x' `par` rdeepseq res
      where x' = par_foldr f z xs
            res = x `f` x'
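
    Assuming the par_foldr above sits in a module with the imports it needs (Control.DeepSeq for NFData, Control.Parallel for par, and Control.Parallel.Strategies for using, rseq and rdeepseq), a quick way to exercise it might be:

    main :: IO ()
    main = print (par_foldr (+) 0 ([1 .. 100000] :: [Int]))
    -- ghc -O2 -threaded -rtsopts ParFoldr.hs && ./ParFoldr +RTS -N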
    
  • 2021-01-03 00:20

    This seems like a good start:

    parFold :: (a -> a -> a) -> [a] -> a
    parFold f = go
      where
      strategy = parList rseq
    
      go []  = error "parFold: empty list"   -- avoid looping forever on an empty list
      go [x] = x
      go xs  = go (reduce xs `using` strategy)
    
      reduce (x:y:xs) = f x y : reduce xs
      reduce list     = list   -- empty or singleton list
    

    It works, but the parallelism is not so great. Replacing parList with something like parListChunk 1000 rseq helps a bit, but the speedup is still limited to under 1.5x on an 8-core machine.
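
    For reference, a sketch of that chunked variant (the name parFoldChunk and the chunk size are just illustrative choices):

    import Control.Parallel.Strategies (parListChunk, rseq, using)

    parFoldChunk :: (a -> a -> a) -> [a] -> a
    parFoldChunk f = go
      where
      strategy = parListChunk 1000 rseq   -- one spark per 1000-element chunk

      go []  = error "parFoldChunk: empty list"
      go [x] = x
      go xs  = go (reduce xs `using` strategy)

      reduce (x:y:rest) = f x y : reduce rest
      reduce list       = list   -- empty or singleton list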
