问题
Any suggestions as to how I can get this program to work for n = 1 trillion (aside from making upgrades / buying a new computer)?
The error is as follows: after building, the program being executing (the command line style output window pops up) and then quickly closes out, and I get the following error "ProjectPrimes.exe has stopped working (Windows is looking for a solution to this problem." I suspect this has to do with memory issues as I first encountered it at n = 20 million but that was before I chose to malloc/free the 'sieve' array (i.e., my 'sieve' array is the big array of dimensions n x 1 and with each element consisting of a 1 or 0).
The program takes ~35 seconds to run through the first 300 million integers (16,252,325 primes) so it's okay but nothing spectacular. As I'd mentioned, the goal is to be able to generate primes below 1 trillion so am a long ways off...
If relevant, here are my machine specs (in case the goal happens to be unreasonable on this machine): 2.40ghz i5, 4gb RAM, 64bit Windows 7.
An overview of methodology, for those who aren't familiar: we use the Sieve of Sundaram approach. Without getting into the proof, we first eliminate all odd non-primes below an integer "n" using the sieve function: [2*(i+j+2*i*j)+1 | i <- [1..n/2], j <- [i..an optimized upper bound]]. We then cross off the even numbers (excluding two, of course). Which leaves us with the primes.
Why does the prime function return (a pointer to an array containing) the complete set of primes below n? Well, the goal is to be able to identify both (i) the number of primes below n as well as (ii) list the primes below n. This is also why I chose to pass a pointer for the count of primes below n as an argument.
Here's the not-so-exciting 'main' function:
int main() {
long ceiling = 300*1000*1000;
long *numPrimes;
long *primes;
primes = primesToSS(ceiling+1, numPrimes);
printf("\n\nThere are %d primes below %d.\n\n",*numPrimes,ceiling);
free(primes);
return 0;
}
And here's the meat:
//n represents the ceiling, i.e., the integer below which we will generate primes
//cnt* is a pointer which will point the number of primes below n
long* primesToSS( long n, long* cnt ) {
//initialize sieve by setting all elements equal to 1 (except for 0 and 1)
long *sieve = malloc(n*sizeof(long));
initArray(sieve, n, 1);
sieve[0] = 0; sieve[1] = 0;
//eliminate all odd composite numbers
for (int i = 1; i <= n/2; ++i)
for (int j = i; j <= (n-2*i)/(2*(2*i+1)); ++j)
sieve[ 2*(i+j+2*i*j)+1 ] = 0;
//eliminate all even numbers greater than two
//and count total number of primes below n
long numPrimes = 1;
for (int i = 3; i < n; ++i) {
if (i % 2 == 0) sieve[i] = 0;
numPrimes += sieve[i];
}
*cnt = numPrimes;
//create array of primes
long *primes = malloc(numPrimes*sizeof(int));
long counter = 0;
for (int i = 0; i < n; ++i) {
if (sieve[i] == 1) {
primes[counter] = i;
counter++; }
}
free(sieve);
return primes;
}
void initArray( int* arr, int len, int n ) {
for( int i = 0; i < len; ++i) arr[i] = n; }
回答1:
Let's make some observations:
- As people have mentioned in the comments and answers, you only need 1 bit to store whether a number is a prime. Therefore, you can pack the data for 8 numbers into 1 byte. Implement your own bitset data structure to make the code cleaner.
- Also mentioned in the comments, since all prime numbers larger than 2 are odd numbers, there is no need to store data for even numbers. This allow you to pack 16 numbers into 1 byte.
- Following the idea of wheel factorization, you can further pack 24 numbers into 1 bytes, if you store only the bits for numbers congruent to 1 or 5 modulo 6. Higher modulo will save slightly more space (with diminishing return), but the logic to access the bitset data structure may slow down the algorithm.
- For the program to work with 1 trillion (1012) numbers, as you sieve, you only need to collect the list of prime numbers less than 1 million (106). Larger numbers are unnecessary, since the other factor in the compound numbers less than 1 trillion would already have been covered by the list.
Packing numbers at 16 numbers/byte (the most basic setup you should do), you would require ~58 GiB of RAM. However, there is no need to allocate for the whole range. Just allocate for a smaller range of a few hundred million to a few billion numbers (try to vary the number for highest performance), which translate to at most a few hundred MiB, then use the list of prime number less than 1 million to sieve the ranges.
This step can be done in parallel - you only need to share the list of prime numbers less than 1 million.
By the way, this technique is called segmented sieve.
回答2:
long *numPrimes;
...
primes = primesToSS(ceiling+1, numPrimes);
printf("\n\nThere are %d primes below %d.\n\n",*numPrimes,ceiling);
As Blastfurnace noted, you're dereferencing (and storing via) an uninitialized pointer. This should be
long numPrimes;
...
primes = primesToSS(ceiling+1, &numPrimes);
printf("\n\nThere are %d primes below %d.\n\n", numPrimes, ceiling);
There may well be other errors ... especially not checking for malloc failure. Oh, here's one:
long *sieve;
initArray(sieve, n, 1);
...
void initArray( int* arr, int len, int n ) {
for( int i = 0; i < len; ++i) arr[i] = n; }
int
and long
are different types. And here you do it again:
long *primes = malloc(numPrimes*sizeof(int));
Best practice is to use sizeof the element, not the type, which avoids this sort of thing:
long *primes = malloc(numPrimes * sizeof *primes);
Check your code for other obvious mistakes.
Style note: putting a } at the end of a line is poor practice and begging for a bug when you add another line later.
回答3:
Some memory considerations: For the first 1 trillion primes, you need 1 trillion/8 bytes if you represent every number with a single bit. That's 125000000000 = 119209 MB = 116 GB of RAM.
While you could ask your OS to create a 200GB SWAP partition, the performance will probably be bad. If you want to keep the same algorithm, then you need to look at memory mapped files.
That will still be very slow but at least you'll be able to allocate an array which is big enough.
The next step will then be to look at different prime algorithms or you need to research memory compression algorithms. In your case, sparse matrices won't work well since your start matrix is all 1 and you eventually remove almost all elements.
来源:https://stackoverflow.com/questions/26211275/generate-primes-from-1-to-n-crashing-for-n-300-million