Is a big loop within a small loop always faster than a small loop within a big one?

Submitted by 一个人想着一个人 on 2020-08-02 08:23:08

Question


I just read this post, and wonder if we can draw the conclusion that a big loop within a small loop must always run faster than a small loop within a big one, no matter what the code does inside the nested loop? Take an example.

int m, n; 
m = 1000000;
n = 10;

Snippet A

for (int i = 0; i < n; i++)
    for (int j = 0; j < m; j++)
    {
        DoSomething();
    }

Snippet B

for (int j = 0; j < m; j++)
    for (int i = 0; i < n; i++)
    {
        DoSomething();
    }

Can we say that, no matter what DoSomething() actually does, snippet A always runs faster than snippet B?

UPDATE
As pointed out by @stackmate, I want to split this question into two cases:

  1. The code inside the nested loop is DoSomething(), which does not depend on the variables i and j. What is the performance difference?

  2. The code inside the nested loop is DoSomething(i, j), which depends on the variables i and j. What is the performance difference?


Answer 1:


There is no single answer to your question. What decides which version is faster is the work you do inside the loops. For example, say you are adding two arrays element by element and storing the result in a third array:

Code 1:
for(int i = 0; i < 1000; i++)
{
    for(int j = 0; j < 1000000; j++)
         C[i][j] = A[i][j] + B[i][j];
}

Code 2:
for(int i = 0; i < 1000000; i++)
{
    for(int j = 0; j < 1000; j++)
         C[j][i] = A[j][i] + B[j][i];
}

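Code 1 will be much faster than Code 2. The reason is the cache: assuming row-major arrays, as in C and C++, the inner loop of Code 1 walks consecutive memory addresses, while the inner loop of Code 2 jumps a whole row between accesses. Take a look at this question for more details. The answers are superbly informative, and there is no point in me explaining the concept of cache again here.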
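To try the effect described above on your own machine, here is a minimal C++ sketch. It assumes a matrix stored row-major in a single std::vector; the names sumMatrix and timeSumMicros are hypothetical helpers, not part of the original answer. Both traversal orders compute the same sum, so any difference in the measured time comes from the access pattern alone.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: sums a rows x cols matrix stored row-major in one
// contiguous vector. rowMajorOrder selects which index the inner loop walks.
long long sumMatrix(const std::vector<int>& data, std::size_t rows,
                    std::size_t cols, bool rowMajorOrder) {
    assert(data.size() == rows * cols);
    long long total = 0;
    if (rowMajorOrder) {
        // Inner loop visits adjacent elements (consecutive addresses).
        for (std::size_t i = 0; i < rows; ++i)
            for (std::size_t j = 0; j < cols; ++j)
                total += data[i * cols + j];
    } else {
        // Inner loop jumps cols elements per step (strided access).
        for (std::size_t j = 0; j < cols; ++j)
            for (std::size_t i = 0; i < rows; ++i)
                total += data[i * cols + j];
    }
    return total;
}

// Hypothetical helper: wall-clock time of one traversal, in microseconds.
// The computed sum is written to result so the work is not discarded.
long long timeSumMicros(const std::vector<int>& data, std::size_t rows,
                        std::size_t cols, bool rowMajorOrder,
                        long long& result) {
    auto start = std::chrono::steady_clock::now();
    result = sumMatrix(data, rows, cols, rowMajorOrder);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start)
        .count();
}
```

Calling timeSumMicros twice on a large matrix (for example several thousand rows and columns), once with rowMajorOrder set to true and once with false, gives a rough comparison. The numbers depend on the hardware, the compiler, and the optimization level, so treat them as indicative only.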




Answer 2:


@Cool_Coder already covered one main reason why loop order can matter: memory access patterns that result in better cache hit rates.

Another scenario, where having the smaller loop on the inside can be beneficial, is that the inner loop could be unrolled. Particularly if the trip count of the smaller loop is really small and fixed, the compiler may unroll the inner loop when it judges that to be beneficial. The resulting code then has only one loop instead of two nested loops, with fewer branches.

If you have a situation like this in highly performance critical code, you need to try both, and benchmark carefully. If the code is not very performance critical, it's probably not worth worrying about.
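As an illustration of what unrolling means here, the following C++ sketch writes both forms by hand. It assumes a fixed inner trip count of 4 (the constant kInner and both function names are made up for this example); whether a real compiler performs this transformation depends on the compiler and its settings.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kInner = 4;  // assumed small, fixed inner trip count
static_assert(kInner == 4, "sumUnrolled below is written for exactly 4 items");

// Nested form: the small fixed-size loop is on the inside.
long long sumNested(const std::vector<std::array<int, kInner>>& rows) {
    long long total = 0;
    for (std::size_t j = 0; j < rows.size(); ++j)
        for (std::size_t i = 0; i < kInner; ++i)
            total += rows[j][i];
    return total;
}

// Hand-unrolled form: one loop, no inner counter and no inner branch.
// This is roughly the shape an optimizer may produce from sumNested.
long long sumUnrolled(const std::vector<std::array<int, kInner>>& rows) {
    long long total = 0;
    for (std::size_t j = 0; j < rows.size(); ++j) {
        total += rows[j][0];
        total += rows[j][1];
        total += rows[j][2];
        total += rows[j][3];
    }
    return total;
}
```

Both functions return the same value for the same input. Timing them against each other at different optimization levels is one way to check whether the compiler already unrolls the nested version for you.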




Answer 3:


An observation: suppose you have read operations inside the nested loop, like the following

for (int i = 0; i < n; i++)
{
    a = aList.get(i);
    for (int j = 0; j < m; j++)
    {
        b = bList.get(j);
        DoSomething(a, b);
    }
}

Then having n < m results in fewer get operations than n > m. With n=1 and m=2 there are three read operations, while with n=2 and m=1 there are four. When n > m, there are more repeated read operations. Put another way, the number of reads is n + n*m. The product n*m stays the same whether n > m or n < m, but the first addend changes.

(I assumed no compiler optimizations and ignored caching behavior)
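The count n + n*m can be checked with a small C++ sketch. The function countReads is a hypothetical stand-in that only counts where aList.get(i) and bList.get(j) would be called; it performs no real reads.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical counter: returns how many get operations the nested loop
// performs for an outer trip count n and an inner trip count m.
std::size_t countReads(std::size_t n, std::size_t m) {
    std::size_t reads = 0;
    for (std::size_t i = 0; i < n; ++i) {
        ++reads;  // stands in for aList.get(i)
        for (std::size_t j = 0; j < m; ++j) {
            ++reads;  // stands in for bList.get(j)
        }
    }
    assert(reads == n + n * m);  // matches the formula in the answer
    return reads;
}
```

For example, countReads(1, 2) gives 3 and countReads(2, 1) gives 4, matching the numbers above.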




Answer 4:


It depends.

If you do any work between the two for statements (inside the outer loop but before the inner one), then the version whose outer loop iterates m times will take longer, simply because that extra work runs every time before the inner loop is entered.

If not, then

n * m = m * n

Andy T's comment is also correct. When the outer loop is the longer one, the initialization of the inner loop happens more often. If this is C++, though, I'd expect the compiler to optimize this away, so benchmarking your code would probably yield the same result for both loops.
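To make the bookkeeping difference concrete, here is a minimal C++ sketch that counts loop overhead instead of timing it. LoopCounts and countLoopWork are hypothetical names introduced for this illustration, and the counts describe the loops as written in the source, before any compiler optimization.

```cpp
#include <cassert>

// Hypothetical tally of the work done by a pair of nested loops.
struct LoopCounts {
    unsigned long long bodyRuns = 0;     // executions of the loop body
    unsigned long long innerInits = 0;   // executions of "int j = 0"
    unsigned long long innerChecks = 0;  // evaluations of "j < inner"
};

// Counts the bookkeeping for an outer loop of `outer` iterations wrapped
// around an inner loop of `inner` iterations.
LoopCounts countLoopWork(unsigned long long outer, unsigned long long inner) {
    LoopCounts c;
    for (unsigned long long i = 0; i < outer; ++i) {
        ++c.innerInits;  // the inner loop is set up once per outer iteration
        unsigned long long j = 0;
        while (true) {
            ++c.innerChecks;  // the condition also runs on the final, failing test
            if (!(j < inner)) break;
            ++c.bodyRuns;  // stands in for DoSomething()
            ++j;
        }
    }
    assert(c.bodyRuns == outer * inner);
    return c;
}
```

With outer = 10 and inner = 1000, the body runs 10000 times with 10 inner initializations and 10010 condition checks. Swapping the two keeps the 10000 body runs but gives 1000 initializations and 11000 checks. Whether that difference is measurable depends on how expensive the body is and on what the optimizer does.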




Answer 5:


In Snippet A, control switches back to the outer for loop n = 10 times, but in Snippet B it switches back m = 1000000 times. These extra switches between the two for loops are what make Snippet A faster than Snippet B.




Answer 6:


for(int i = 0; i < 1000; i++)
{
    for(int j = 0;

In the code above, j is initialized 1000 times.

for(int i = 0; i < 1000000; i++)
{
    for(int j = 0;

In this second version, j is initialized 1000000 times, which clearly shows that the second version has extra overhead.



Source: https://stackoverflow.com/questions/23914350/a-big-loop-within-a-small-loop-always-faster-than-a-small-loop-within-a-big-one
