Question
I read this other post about an F# version of this algorithm. I found it very elegant and tried to combine some of the ideas from the answers.
Although I optimized it to make fewer checks (checking only the numbers around multiples of 6) and to leave out unnecessary caching, it is still painfully slow. Calculating the 10,000th prime already takes more than 5 minutes. Using the imperative approach, I can test all 31-bit integers in not much more time.
So my question is whether I am missing something that makes all this so slow. For example, in another post someone speculated that LazyList may use locking. Does anyone have an idea?
As StackOverflow's rules say not to post new questions as answers, I feel I have to start a new topic for this.
Here's the code:
#r "FSharp.PowerPack.dll"
open Microsoft.FSharp.Collections
let squareLimit = System.Int32.MaxValue |> float32 |> sqrt |> int
let around6 = LazyList.unfold (fun (candidate, (plus, next)) ->
if candidate > System.Int32.MaxValue - plus then
None
else
Some(candidate, (candidate + plus, (next, plus)))
) (5, (2, 4))
let (|SeqCons|SeqNil|) s =
if Seq.isEmpty s then SeqNil
else SeqCons(Seq.head s, Seq.skip 1 s)
let rec lazyDifference l1 l2 =
if Seq.isEmpty l2 then l1 else
match l1, l2 with
| LazyList.Cons(x, xs), SeqCons(y, ys) ->
if x < y then
LazyList.consDelayed x (fun () -> lazyDifference xs l2)
elif x = y then
lazyDifference xs ys
else
lazyDifference l1 ys
| _ -> LazyList.empty
let lazyPrimes =
let rec loop = function
| LazyList.Cons(p, xs) as ll ->
if p > squareLimit then
ll
else
let increment = p <<< 1
let square = p * p
let remaining = lazyDifference xs {square..increment..System.Int32.MaxValue}
LazyList.consDelayed p (fun () -> loop remaining)
| _ -> LazyList.empty
loop (LazyList.cons 2 (LazyList.cons 3 around6))
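For reference, the 10,000th-prime timing mentioned above comes from indexing into the lazy list, along these lines (a minimal sketch; LazyList implements seq, and Seq.item is assumed to be available in your FSharp.Core -- older versions call it Seq.nth):

// Zero-based indexing, so the 10,000th prime sits at index 9999.
let prime10000 = lazyPrimes |> Seq.item 9999
printfn "10,000th prime: %d" prime10000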
Answer 1:
If you are calling Seq.skip anywhere, then there's about a 99% chance that you have an O(N^2) algorithm. For nearly every elegant functional lazy Project Euler solution involving sequences, you want to use LazyList, not Seq. (See Juliet's comment link for more discussion.)
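To make the point concrete, here is one way the question's lazyDifference could avoid Seq.skip entirely; this is a sketch (not the asker's code, helper name is mine) in which the composite stream is converted to a LazyList once, so every decomposition is O(1):

// Both arguments are LazyLists (ascending); no (|SeqCons|SeqNil|), no Seq.skip.
let rec lazyDifference' l1 l2 =
    match l1, l2 with
    | LazyList.Cons(x, xs), LazyList.Cons(y, ys) ->
        if x < y then LazyList.consDelayed x (fun () -> lazyDifference' xs l2)
        elif x = y then lazyDifference' xs ys      // x is a composite: drop it
        else lazyDifference' l1 ys                 // skip composites below x
    | _, LazyList.Nil -> l1                        // no composites left to remove
    | _ -> LazyList.empty

// At the call site, wrap the range once:
//   lazyDifference' xs (LazyList.ofSeq { square .. increment .. System.Int32.MaxValue })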
Answer 2:
Even if you succeed in taming the strange quadratic F# sequences design issues, there are still algorithmic improvements ahead. You are working in a (...((x-a)-b)-...) manner here. x, i.e. around6, is getting buried deeper and deeper, yet it is the most frequently producing sequence. Transform it into an (x-(a+b+...)) scheme -- or even use a tree structure there -- to gain an improvement in time complexity (sorry, that page is in Haskell). This actually gets very close to the complexity of the imperative sieve, although it is still much slower than the baseline C++ code.
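One building block that transformation needs, sketched here with the PowerPack LazyList used in the question (the helper name is mine, and folding the per-prime streams together still has to be done lazily, as on the linked page):

// Ordered union of two ascending streams, keeping one copy of duplicates.
// With this, the multiples of all sieving primes can be merged into a single
// composites stream, and around6 is diminished by it once: x - (a+b+...).
let rec lazyUnion l1 l2 =
    match l1, l2 with
    | LazyList.Cons(x, xs), LazyList.Cons(y, ys) ->
        if x < y then LazyList.consDelayed x (fun () -> lazyUnion xs l2)
        elif x > y then LazyList.consDelayed y (fun () -> lazyUnion l1 ys)
        else LazyList.consDelayed x (fun () -> lazyUnion xs ys)
    | LazyList.Nil, _ -> l2
    | _, LazyList.Nil -> l1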
Measuring local empirical orders of growth as O(n^a), with a = log(t_2/t_1) / log(n_2/n_1) (in n primes produced), the ideal n log(n) log(log(n)) translates into O(n^1.12) .. O(n^1.085) behaviour on the n = 10^5..10^7 range. A simple C++ baseline imperative code achieves O(n^1.45 .. 1.18 .. 1.14), while tree-merging code, as well as priority-queue based code, both exhibit a steady O(n^1.20)
behaviour, more or less. Of course C++ is ~50..20..15 times faster, but that's mostly just a "constant factor". :)
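That formula is easy to apply to two of your own measurements; a small sketch (the timings in the example are made up, purely to show the arithmetic):

// Local empirical order of growth: fit t = c * n^a between two data points
// (n1 primes produced in t1 seconds, n2 primes produced in t2 seconds).
let orderOfGrowth (n1: float, t1: float) (n2: float, t2: float) =
    log (t2 / t1) / log (n2 / n1)

// e.g. 2.0 s at n = 10^5 and 26.4 s at n = 10^6 gives a ~ 1.12
printfn "a = %.2f" (orderOfGrowth (1e5, 2.0) (1e6, 26.4))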
Source: https://stackoverflow.com/questions/6469982/getting-functional-sieve-of-eratosthenes-fast