I read an article (1.5 years old http://www.drdobbs.com/parallel/cache-friendly-code-solving-manycores-ne/240012736) which talks about cache performance and size of data. They
I don't get constant time. I modified your code a bit to make it simpler. My times are much lower than yours. I'm not sure why. The large times at the beginning make sense because there are only a few values to write to so it's a dependency chain. The L2 Cache ends at 256k/4=64k. Notice how the values start rising between size=32768 and 65536.
//GCC -O3 Intel(R) Xeon(R) CPU E5-1620 0 @ 3.60GHz
Size=1, Iterations=1073741824, Time=187.18 ms
Size=2, Iterations=536870912, Time=113.47 ms
Size=4, Iterations=268435456, Time=50.53 ms
Size=8, Iterations=134217728, Time=25.02 ms
Size=16, Iterations=67108864, Time=25.61 ms
Size=32, Iterations=33554432, Time=24.08 ms
Size=64, Iterations=16777216, Time=22.69 ms
Size=128, Iterations=8388608, Time=22.03 ms
Size=256, Iterations=4194304, Time=19.98 ms
Size=512, Iterations=2097152, Time=17.09 ms
Size=1024, Iterations=1048576, Time=15.66 ms
Size=2048, Iterations=524288, Time=14.94 ms
Size=4096, Iterations=262144, Time=14.58 ms
Size=8192, Iterations=131072, Time=14.40 ms
Size=16384, Iterations=65536, Time=14.63 ms
Size=32768, Iterations=32768, Time=14.75 ms
Size=65536, Iterations=16384, Time=18.58 ms
Size=131072, Iterations=8192, Time=20.51 ms
Size=262144, Iterations=4096, Time=21.18 ms
Size=524288, Iterations=2048, Time=21.26 ms
Size=1048576, Iterations=1024, Time=21.22 ms
Size=2097152, Iterations=512, Time=22.17 ms
Size=4194304, Iterations=256, Time=38.01 ms
Size=8388608, Iterations=128, Time=38.63 ms
Size=16777216, Iterations=64, Time=38.09 ms
Size=33554432, Iterations=32, Time=38.54 ms
Size=67108864, Iterations=16, Time=39.11 ms
Size=134217728, Iterations=8, Time=39.96 ms
Size=268435456, Iterations=4, Time=42.15 ms
Size=536870912, Iterations=2, Time=46.39 ms
The code:
#include
#include
static void test_function(int n, int iterations)
{
int *array = new int[n];
for (int i = 0; i < iterations; i++)
for (int x = 0; x < n; x++)
array[x]++;
delete[] array;
}
int main() {
for(int i=0, n=1, iterations=1073741824; i<30; i++, n*=2, iterations/=2) {
double dtime;
dtime = omp_get_wtime();
test_function(n, iterations);
dtime = omp_get_wtime() - dtime;
printf("Size=%d, Iterations=%d, Time=%.3f\n", n, iterations, dtime);
}
}