Haskell --> F#: Turner's Sieve

◇◆丶佛笑我妖孽 提交于 2019-11-30 05:17:38

If you want to calculate things like merges/differences of infinite lists like Haskell does, the LazyList type (found inside the F# power pack) springs to mind.

It makes for very verbose code, like the translation below:

#r "FSharp.PowerPack.dll"

//A lazy stream of numbers, starting from x
let rec numsFrom x = LazyList.consDelayed x (fun () -> numsFrom (x+1))

//subtracts L2 from L1, where L1 and L2 are both sorted(!) lazy lists
let rec lazyDiff L1 L2 =
    match L1,L2 with
        | LazyList.Cons(x1,xs1),LazyList.Cons(x2,xs2) when x1<x2 ->
            LazyList.consDelayed x1 (fun () -> lazyDiff xs1 L2)
        | LazyList.Cons(x1,xs1),LazyList.Cons(x2,xs2) when x1=x2 ->
            lazyDiff xs1 L2
        | LazyList.Cons(x1,xs1),LazyList.Cons(x2,xs2) when x1>x2 ->
            lazyDiff L1 xs2
        | _ -> failwith "Should not occur with infinite lists!"

let rec euler = function
    | LazyList.Cons(p,xs) as LL ->
        let remaining = lazyDiff xs (LazyList.map ((*) p) LL)
        LazyList.consDelayed p (fun () -> euler remaining)
    | _ -> failwith "Should be unpossible with infinite lists!"

let primes = euler (numsFrom 2)

with

> primes |> Seq.take 15 |> Seq.toList;;
val it : int list = [2; 3; 5; 7; 11; 13; 17; 19; 23; 29; 31; 37; 41; 43; 47]

Note that I added two failwith clauses to keep the compiler from complaining about an incomplete match, even if we know that all lists in the calculation are (lazily) infinite.

You can do it with seq. And as you got minus done, euler itself is same as in Haskell.

let rec minus xs ys =
  seq {
    match Seq.isEmpty xs, Seq.isEmpty ys with
    | true,_ | _,true -> yield! xs
    | _ ->
       let x,y = Seq.head xs, Seq.head ys
       let xs',ys' = Seq.skip 1 xs, Seq.skip 1 ys
       match compare x y with
       | 0 -> yield! minus xs' ys'
       | 1 -> yield! minus xs ys'
       | _ -> yield x; yield! minus xs' ys
  }

let rec euler s =
  seq {
    let p = Seq.head s
    yield p
    yield! minus (Seq.skip 1 s) (Seq.map ((*) p) s) |> euler
  }

let primes = Seq.initInfinite ((+) 2) |> euler

With sequences, you get a lot more competitive by persisting the sequence:

let rec euler s =
    seq {
        let s = Seq.cache s
        let p = Seq.head s
        yield p
        yield! minus (Seq.skip 1 s) (Seq.map ((*) p) s) |> euler
    }

First, it must be said that the Euler's sieve for prime numbers is not a "improved version of the Sieve of Eratosthenes" as its performance in every sense is much worse than any version of the Sieve of Eratosthenes: Haskell Wiki on Prime Number algorithms - Euler

Next, it should be said that @cfem's code using LazyList's is a faithful although verbose translation of the version of Euler's sieve that you gave, although it lacks the slight improvement of only sieving odd numbers as per the link above.

It should be noted that there isn't really any point in implementing the Euler sieve, as it is more complex and slower than finding primes by Trial Division Optimized (TDO) as to only doing divisions by found primes up to the square root of the candidate number tested for prime as per: Haskell Wiki on Prime Number algorithms - TDO.

This Trial Division Optimized (TDO) sieve can be implemented in F# using LazyList's (with a reference to FSharp.PowerPack.dll) as:

let primesTDO() =
  let rec oddprimes =
    let rec oddprimesFrom n =
      if oddprimes |> Seq.takeWhile (fun p -> p * p <= n) |> Seq.forall (fun p -> (n % p) <> 0)
      then LazyList.consDelayed n (fun() -> oddprimesFrom (n + 2))
      else oddprimesFrom (n + 2)
    LazyList.consDelayed 3 (fun() -> oddprimesFrom 5)
  LazyList.consDelayed 2 (fun () -> oddprimes)

It can be implemented using sequences in the same form as:

let primesTDOS() =
  let rec oddprimes =
    let rec oddprimesFrom n =
      if oddprimes |> Seq.takeWhile (fun p -> p * p <= n) |> Seq.forall (fun p -> (n % p) <> 0)
      then seq { yield n; yield! oddprimesFrom (n + 2) }
      else oddprimesFrom (n + 2)
    seq { yield 3; yield! (oddprimesFrom 5) } |> Seq.cache
  seq { yield 2; yield! oddprimes }

The sequence version is slightly faster than the LazyList version because it avoids some overhead in calling through since LazyList's are based on cached sequences. Both use an internal object which represents a cached copy of the primes found so far, automatically cached in the case of LazyList's, and by the Seq.cache in the case of sequences. Either can find the first 100,000 primes in about two seconds.

Now, the Euler sieve can have the odd number sieving optimization and be expressed using LazyList's as the following, with one match case eliminated due to knowing that the input list parameters are infinite and the compare match simplified, as well I've added an operator '^' to make the code more readable:

let primesE = //uses LazyList's from referenced FSharp.PowerPack.dll version 4.0.0.1
  let inline (^) a ll = LazyList.cons a (LazyList.delayed ll) //a consd function for readability
  let rec eulers xs =
    //subtracts ys from xs, where xs and ys are both sorted(!) infinite lazy lists
    let rec (-) xs ys =
      let x,xs',ys' = LazyList.head xs,LazyList.tail xs,LazyList.tail ys
      match compare x ( LazyList.head ys) with
        | 0 -> xs' - ys' // x == y
        | 1 -> xs - ys' // x > y
        | _ -> x^(fun()->(xs' - ys)) // must be x < y
    let p = LazyList.head xs
    p^(fun() -> (((LazyList.tail xs) - (LazyList.map ((*) p) xs)) |> eulers))
  let rec everyothernumFrom x = x^(fun() -> (everyothernumFrom (x + 2)))
  2^(fun() -> ((everyothernumFrom 3) |> eulers))

However, it must be noted that the time to calculate the 1899th prime (16381) is about 0.2 and 0.16 seconds for the primesTDO and primesTDOS, respectively, while it is about 2.5 seconds using this primesE for a terrible performance for the Euler sieve at over ten times the time even for this small range. In addition to terrible performance, primeE cannot even calculate primes to 3000 due do even worse memory utilization as it records a rapidly increasing number of deferred execution functions with increasing found primes.

Note that one must be careful in repeated timing of the code as written since the LazyList is a value and has built-in memorization of previously found elements, thus a second identical scan will take close to zero time; for timing purposes it might be better to make the PrimeE a function as in PrimeE() so the work starts from the beginning each time.

In summary, the Euler sieve implemented in any language including F# is only an interesting intellectual exercise and has no practical use as it is much slower and hogs memory much worse than just about every other reasonably optimized sieve.

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!