Question
I read this other post about an F# version of this algorithm. I found it very elegant and tried to combine some of the ideas from the answers.
Although I optimized it to make fewer checks (checking only the numbers around multiples of 6) and to leave out unnecessary caching, it is still painfully slow. Calculating the 10,000th prime already takes more than 5 minutes. Using the imperative approach, I can test all 31-bit integers in not much more time.
So my question is whether I am missing something that makes all this so slow. For example, in another post someone speculated that LazyList may use locking. Does anyone have an idea?
As StackOverflow's rules say not to post new questions as answers, I feel I have to start a new topic for this.
Here's the code:
#r "FSharp.PowerPack.dll"
open Microsoft.FSharp.Collections
let squareLimit = System.Int32.MaxValue |> float32 |> sqrt |> int
let around6 = LazyList.unfold (fun (candidate, (plus, next)) ->
if candidate > System.Int32.MaxValue - plus then
None
else
Some(candidate, (candidate + plus, (next, plus)))
) (5, (2, 4))
let (|SeqCons|SeqNil|) s =
if Seq.isEmpty s then SeqNil
else SeqCons(Seq.head s, Seq.skip 1 s)
let rec lazyDifference l1 l2 =
if Seq.isEmpty l2 then l1 else
match l1, l2 with
| LazyList.Cons(x, xs), SeqCons(y, ys) ->
if x < y then
LazyList.consDelayed x (fun () -> lazyDifference xs l2)
elif x = y then
lazyDifference xs ys
else
lazyDifference l1 ys
| _ -> LazyList.empty
let lazyPrimes =
let rec loop = function
| LazyList.Cons(p, xs) as ll ->
if p > squareLimit then
ll
else
let increment = p <<< 1
let square = p * p
let remaining = lazyDifference xs {square..increment..System.Int32.MaxValue}
LazyList.consDelayed p (fun () -> loop remaining)
| _ -> LazyList.empty
loop (LazyList.cons 2 (LazyList.cons 3 around6))
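For reference, the 10,000th-prime timing mentioned above comes from indexing into the lazy list, along these lines (a minimal sketch; LazyList implements seq, and Seq.item is assumed to be available in your FSharp.Core -- older versions call it Seq.nth):

// Zero-based indexing, so the 10,000th prime sits at index 9999.
let prime10000 = lazyPrimes |> Seq.item 9999
printfn "10,000th prime: %d" prime10000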
Answer 1:
If you are calling Seq.skip anywhere, then there's about a 99% chance that you have an O(N^2) algorithm. For nearly every elegant functional lazy Project Euler solution involving sequences, you want to use LazyList, not Seq. (See Juliet's comment link for more discussion.)
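To make the point concrete, here is one way the question's lazyDifference could avoid Seq.skip entirely; this is a sketch (not the asker's code, helper name is mine) in which the composite stream is converted to a LazyList once, so every decomposition is O(1):

// Both arguments are LazyLists (ascending); no (|SeqCons|SeqNil|), no Seq.skip.
let rec lazyDifference' l1 l2 =
    match l1, l2 with
    | LazyList.Cons(x, xs), LazyList.Cons(y, ys) ->
        if x < y then LazyList.consDelayed x (fun () -> lazyDifference' xs l2)
        elif x = y then lazyDifference' xs ys      // x is a composite: drop it
        else lazyDifference' l1 ys                 // skip composites below x
    | _, LazyList.Nil -> l1                        // no composites left to remove
    | _ -> LazyList.empty

// At the call site, wrap the range once:
//   lazyDifference' xs (LazyList.ofSeq { square .. increment .. System.Int32.MaxValue })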
Answer 2:
Even if you succeed in taming the strange quadratic F# sequences design issues, there are still algorithmic improvements ahead. You are working in a (...((x-a)-b)-...) manner here. x, i.e. around6, is getting buried deeper and deeper, yet it is the most frequently producing sequence. Transform it into an (x-(a+b+...)) scheme -- or even use a tree structure there -- to gain an improvement in time complexity (sorry, that page is in Haskell). This actually gets very close to the complexity of the imperative sieve, although it is still much slower than the baseline C++ code.
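One building block that transformation needs, sketched here with the PowerPack LazyList used in the question (the helper name is mine, and folding the per-prime streams together still has to be done lazily, as on the linked page):

// Ordered union of two ascending streams, keeping one copy of duplicates.
// With this, the multiples of all sieving primes can be merged into a single
// composites stream, and around6 is diminished by it once: x - (a+b+...).
let rec lazyUnion l1 l2 =
    match l1, l2 with
    | LazyList.Cons(x, xs), LazyList.Cons(y, ys) ->
        if x < y then LazyList.consDelayed x (fun () -> lazyUnion xs l2)
        elif x > y then LazyList.consDelayed y (fun () -> lazyUnion l1 ys)
        else LazyList.consDelayed x (fun () -> lazyUnion xs ys)
    | LazyList.Nil, _ -> l2
    | _, LazyList.Nil -> l1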
Measuring local empirical orders of growth as O(n^a), with a = log(t_2/t_1) / log(n_2/n_1) (in n primes produced), the ideal n log(n) log(log(n)) translates into O(n^1.12) .. O(n^1.085) behaviour on the n = 10^5..10^7 range. A simple C++ baseline imperative code achieves O(n^1.45 .. 1.18 .. 1.14), while tree-merging code, as well as priority-queue based code, both exhibit a steady O(n^1.20)
behaviour, more or less. Of course C++ is ~50..20..15 times faster, but that's mostly just a "constant factor". :)
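That formula is easy to apply to two of your own measurements; a small sketch (the timings in the example are made up, purely to show the arithmetic):

// Local empirical order of growth: fit t = c * n^a between two data points
// (n1 primes produced in t1 seconds, n2 primes produced in t2 seconds).
let orderOfGrowth (n1: float, t1: float) (n2: float, t2: float) =
    log (t2 / t1) / log (n2 / n1)

// e.g. 2.0 s at n = 10^5 and 26.4 s at n = 10^6 gives a ~ 1.12
printfn "a = %.2f" (orderOfGrowth (1e5, 2.0) (1e6, 26.4))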
Source: https://stackoverflow.com/questions/6469982/getting-functional-sieve-of-eratosthenes-fast