Optimize prime number finder and CPU speed checker in R

Question

I have the following code, which finds as many prime numbers as it can in 10 seconds:

prime_nums = function (){
    ptm <- proc.time()

    p_nums = c(2)
    counter = 2
    while (TRUE){
        isPRIME = FALSE
        counter = counter +1
        for(n in p_nums) {
            if(n > sqrt(counter)){ isPRIME=TRUE; break; }
            if(counter %% n == 0){ isPRIME = FALSE; break;}
        }
        if(isPRIME) { p_nums[length(p_nums)+1]=counter ; cat("",counter,";")}
        if((proc.time()[3]-ptm[3]) > 10) break; 
    }
}

However, it relies on explicit loops, which are generally discouraged in R. How can I optimize this code to make it as fast as possible?

EDIT: I found the following code to be the fastest:

prime_nums_fastest = function (){
    ptm <- proc.time()

    p_nums = c(2L,3L,5L,7L)
    counter = 7L

    while (TRUE){
        isPRIME = FALSE
        counter = counter +2L       
        loc = 4*sqrt(counter)/log(counter,2)   

        isPRIME = !any(0 == (counter %% p_nums[1:loc]))

        if(isPRIME) { p_nums[length(p_nums)+1]=counter }
        if((proc.time()[3]-ptm[3]) > 10) break;
    }
    print(p_nums)
}

The first few small primes are hard-coded to keep the code simple. Using 2*sqrt... or even 3*sqrt... for the loc parameter lets non-primes through. With loc, significantly fewer primes need to be checked than when indexing with 1:sqrt(counter).
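A quick empirical check of the loc heuristic, as a sketch rather than a proof: a composite counter is always caught if every prime up to sqrt(counter) is tested, so it is enough that loc is at least the number of such primes. The helper sieve and the range up to 1e6 are my own assumptions for illustration, not part of the code above.

```r
# Reference sieve of Eratosthenes (hypothetical helper, only used for checking).
sieve <- function(limit) {
    is_p <- rep(TRUE, limit)
    is_p[1] <- FALSE
    for (i in 2:floor(sqrt(limit))) {
        if (is_p[i]) is_p[seq(i * i, limit, by = i)] <- FALSE
    }
    which(is_p)
}

small_primes <- sieve(1000L)              # all primes up to sqrt(1e6)
counters <- seq(9, 1e6 - 1, by = 2)       # the odd candidates the loop visits

# Number of primes <= sqrt(counter), i.e. how many must be tested at minimum.
needed <- findInterval(sqrt(counters), small_primes)

# Number of primes actually tested for each multiplier (1:loc truncates loc).
loc4 <- floor(4 * sqrt(counters) / log(counters, 2))
loc3 <- floor(3 * sqrt(counters) / log(counters, 2))

all(loc4 >= needed)                       # TRUE if multiplier 4 is enough here
counters[which(loc3 < needed)[1]]         # first candidate where 3 falls short
```

With the multiplier 3, the first shortfall is at 49: loc truncates to 3, so only 2, 3 and 5 are tested and 49 is reported as prime, which matches the observation above.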

Answer 1:

Get rid of the cat command. That's expensive. With it in place, I get to 384239. Returning the vector of primes instead gets me to 471617, a significant improvement.

Changing n > sqrt(counter) to n*n > counter gets me to 477163, a small improvement.

Changing p_nums and counter to be of type integer gets me to 514859, another small improvement. This is achieved by modifying the lines where these are defined and adjusted:

p_nums = c(2L)
counter = 2L
# ... and inside the loop:
  counter = counter +1L
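Putting the three changes together (no cat, n*n instead of sqrt, integer types), the loop version might look like the sketch below. The function name prime_nums_for and the seconds parameter are hypothetical additions for illustration; the original hard-codes 10 seconds.

```r
# Sketch only: combines the suggestions above into one function.
# `seconds` is an assumed parameter so the time budget can be shortened.
prime_nums_for <- function(seconds = 10) {
    ptm <- proc.time()

    p_nums <- c(2L)
    counter <- 2L
    while (TRUE) {
        isPRIME <- FALSE
        counter <- counter + 1L
        for (n in p_nums) {
            # n*n > counter replaces n > sqrt(counter); no divisor was found
            if (n * n > counter) { isPRIME <- TRUE; break }
            # a divisor was found, so counter is composite
            if (counter %% n == 0L) break
        }
        if (isPRIME) p_nums[length(p_nums) + 1L] <- counter
        if ((proc.time()[3] - ptm[3]) > seconds) break
    }
    p_nums  # return the vector instead of printing each prime with cat
}

primes <- prime_nums_for(0.2)
head(primes)
```

Note that n * n is integer arithmetic here, so it would overflow to NA once n exceeds 46340; the candidates reached in 10 seconds stay far below that.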

Note that you can vectorize the loop which determines that a value is prime, with code such as this:

isPRIME = !any(0 == (counter %% p_nums[1:sqrt(counter)]))

Using that instead of the for loop gets me to 451249, a significant regression (still with no cat and with integer arithmetic). This is because R does not have lazy list evaluation, so the modulus is computed for every selected value before any of the results is tested against 0, whereas the for loop can stop at the first divisor it finds. This is an advantage of for in this case.
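To make the difference concrete, here is a small sketch that counts how many modulus operations each approach performs for a single candidate. The helper count_mods_loop and the hard-coded p_nums are illustrative assumptions. Note also that 1:sqrt(counter) selects the first floor(sqrt(counter)) primes, which is more than the primes that are at most sqrt(counter).

```r
p_nums <- c(2L, 3L, 5L, 7L, 11L, 13L, 17L, 19L, 23L, 29L, 31L, 37L)

# Modulus operations done by the for loop, which exits early.
count_mods_loop <- function(counter, p_nums) {
    mods <- 0L
    for (n in p_nums) {
        if (n * n > counter) break        # no divisor up to sqrt(counter)
        mods <- mods + 1L
        if (counter %% n == 0L) break     # divisor found, stop immediately
    }
    mods
}

# Modulus operations done by the vectorized form: one per selected prime.
count_mods_vec <- function(counter, p_nums) {
    length(p_nums[1:sqrt(counter)])
}

count_mods_loop(101L, p_nums)   # prime: tests 2, 3, 5, 7
count_mods_vec(101L, p_nums)    # tests the first 10 primes
count_mods_loop(100L, p_nums)   # even: stops after testing 2
count_mods_vec(100L, p_nums)    # still tests the first 10 primes
```

The vectorized call does each modulus faster, but for most candidates it does many more of them, which is consistent with the regression measured above.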

Source: https://stackoverflow.com/questions/24467936/optimize-prime-number-finder-and-cpu-speed-checker-in-r

Tags

time

primes