In high-performance computing, sums, products, etc. are often calculated using a "parallel reduction" that takes n elements and completes in O(log n) time.
I don't think a list is the right data type for this. Because it's just a linked list, the data will necessarily be accessed sequentially. Although you can evaluate the items in parallel, you won't gain much in the reduction step. If you really need a List, I think the best function would be just
parFold f = foldl1' f . withStrategy (parList rseq)
or maybe
parFold f = foldl1' f . withStrategy (parBuffer 5 rseq)
If the reduction step is complex, you might get a gain by subdividing the list like this:
parReduce f = foldl' f mempty . reducedList . chunkList . withStrategy (parList rseq)
  where
    -- stop at the end of the list; otherwise an empty tail yields [] chunks forever
    chunkList [] = []
    chunkList list = let (l,ls) = splitAt 1000 list in l : chunkList ls
    reducedList = parMap rseq (foldl' f mempty)
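I've taken the liberty of assuming your data is a Monoid so that mempty is available. For the chunked result to match a plain fold, f has to be associative with mempty as its identity. If a Monoid instance isn't possible, you can either replace mempty with your own identity value or, in the worst case, use foldl1'.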
Two functions from Control.Parallel.Strategies are in use here. parList evaluates all items of the list in parallel. After that, chunkList divides the list into chunks of 1000 elements. Each of those chunks is then reduced in parallel by parMap.
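To make the pipeline described above concrete, here is a minimal self-contained sketch. It assumes the parallel package is installed, uses mappend as the reduction function, and takes the chunk size as a parameter. The names chunksOfN, parReduceN and example are made up for illustration and are not part of any library.

```haskell
import Control.Parallel.Strategies (parList, parMap, rseq, withStrategy)
import Data.List (foldl')
import Data.Monoid (Sum (..))

-- Split a list into chunks of at most n elements (assumes n > 0).
chunksOfN :: Int -> [a] -> [[a]]
chunksOfN _ [] = []
chunksOfN n xs = let (c, rest) = splitAt n xs in c : chunksOfN n rest

-- Hypothetical variant of parReduce with the chunk size as a parameter.
-- Elements are sparked with parList, each chunk is folded in parallel
-- with parMap, and the per-chunk results are combined sequentially.
parReduceN :: Monoid a => Int -> [a] -> a
parReduceN n =
  foldl' mappend mempty
    . parMap rseq (foldl' mappend mempty)
    . chunksOfN n
    . withStrategy (parList rseq)

-- Example: sum the numbers 1 to 100000 in chunks of 1000.
example :: Int
example = getSum (parReduceN 1000 (map Sum [1 .. 100000]))
```

To see any speedup, the program would need to be built with -threaded and run with +RTS -N. Note that rseq only evaluates to weak head normal form, which is enough for numbers but not for nested structures.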
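You might also try skipping the initial parList pass, so that elements are only evaluated as each chunk is reduced:
parReduce2 f = foldl' f mempty . reducedList . chunkList
  where
    -- stop at the end of the list so chunking terminates
    chunkList [] = []
    chunkList list = let (l,ls) = splitAt 1000 list in l : chunkList ls
    reducedList = parMap rseq (foldl' f mempty)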
Depending on exactly how the work is distributed, one of these may be more efficient than the others.
If you can use a data structure that has good support for indexing (Array, Vector, Map, etc.), though, you can do binary subdivisions for the reduction step, which will probably be better overall.
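As a rough sketch of what binary subdivision could look like, the following uses Data.Array and the Eval monad from Control.Parallel.Strategies. The names treeReduce, threshold and example, and the cutoff of 1000, are assumptions chosen for illustration. The array must be non-empty and f must be associative.

```haskell
import Control.Parallel.Strategies (rpar, rseq, runEval)
import Data.Array (Array, bounds, listArray, (!))
import Data.List (foldl')

-- Hypothetical cutoff: index ranges narrower than this are folded sequentially.
threshold :: Int
threshold = 1000

-- Reduce a non-empty array with an associative function by splitting the
-- index range in half and evaluating the two halves in parallel.
treeReduce :: (a -> a -> a) -> Array Int a -> a
treeReduce f arr = go lo hi
  where
    (lo, hi) = bounds arr
    go l h
      | h - l < threshold =
          -- small range: plain left fold, preserving element order
          foldl' f (arr ! l) [arr ! i | i <- [l + 1 .. h]]
      | otherwise = runEval $ do
          let m = l + (h - l) `div` 2
          left  <- rpar (go l m)        -- spark the left half
          right <- rseq (go (m + 1) h)  -- evaluate the right half here
          _ <- rseq left                -- wait for the left half
          return (f left right)

-- Example: sum the numbers 1 to 100000.
example :: Int
example = treeReduce (+) (listArray (0, 99999) [1 .. 100000])
```

Because the split works on index bounds rather than by walking a list, dividing the work costs nothing beyond the arithmetic, and the recursion depth is logarithmic in the number of elements. The best threshold depends on how expensive f is, so it is worth measuring.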