#include <vector>

std::vector<double> C(4);
for (int i = 0; i < 1000; ++i)
    for (int j = 0; j < 2000; ++j)
    {
        C[0] = 1.0;
        C[1] = 1.0;
        C[2] = 1.0;
        C[3] = 1.0;
    }
The second way allocates new memory each time (in your case 1000*2000 times), each allocation requesting a fresh block on the heap (although the allocator may happen to hand back the same location again). Memory allocation takes much longer than just modifying the values contained in already allocated memory.
The first way allocates one array once and just modifies the values in it. Compilers sometimes optimize the repeated allocation away, but that isn't guaranteed, so don't leave it to the compiler: if you can choose to allocate less memory, or less often, do it yourself as the programmer.
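To make the difference concrete, here is a minimal, self-contained sketch comparing the two placements (the function names and the timing harness are my own illustrative assumptions; the 1000*2000 counts come from the question):

#include <chrono>
#include <iostream>
#include <vector>

// Way one (hypothetical name): one allocation, reused across all iterations.
void fill_outside() {
    std::vector<double> C(4);              // single heap allocation
    for (int i = 0; i < 1000; ++i)
        for (int j = 0; j < 2000; ++j) {
            C[0] = C[1] = C[2] = C[3] = 1.0;
        }
}

// Way two (hypothetical name): a fresh vector per iteration.
void fill_inside() {
    for (int i = 0; i < 1000; ++i)
        for (int j = 0; j < 2000; ++j) {
            std::vector<double> C(4);      // 1000*2000 heap allocations
            C[0] = C[1] = C[2] = C[3] = 1.0;
        }
}

int main() {
    // Note: an optimizing compiler may elide work whose result is unused,
    // so treat the numbers as a rough illustration only.
    auto ms = [](auto f) {
        auto t0 = std::chrono::steady_clock::now();
        f();
        auto t1 = std::chrono::steady_clock::now();
        return std::chrono::duration<double, std::milli>(t1 - t0).count();
    };
    std::cout << "declared outside: " << ms(fill_outside) << " ms\n";
    std::cout << "declared inside:  " << ms(fill_inside) << " ms\n";
}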
Is it completely wrong to keep variables local in a loop whenever possible? I was under the (perhaps false) impression that this would provide optimization opportunities for the compiler.
No, that's a good rule of thumb. But it is only a rule of thumb. Minimizing the scope of a variable gives the compiler more freedom for register allocation and other optimizations, and at least as importantly, it generally yields more readable code. But it also depends on repeated creation/destruction being cheap, or being optimized away entirely. That is often the case... But not always.
So as you've discovered, sometimes it's a bad idea.
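For a built-in type, repeated creation/destruction is exactly the cheap case; a minimal sketch (the arithmetic is purely illustrative):

double total = 0.0;
for (int i = 0; i < 1000; ++i) {
    double t = i * 0.5;   // trivial type: no constructor or destructor runs,
                          // and the compiler can keep t in a register
    total += t;           // t's scope ends with the iteration
}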
The first thing you should do is to make sure the design is OK; in this case, that indeed implies it's better to define the variable in the loop.
Only if you have real performance problems should you optimize your code (if the compiler hasn't already done it for you), e.g. by moving the variable declaration outside the loop.
The problem is heap activity. Replace std::vector<double> C(4); with std::array<double, 4> C; and it should no longer matter where you place the variable.
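A sketch of that change, based on the snippet from the question (std::array lives on the stack, so no heap traffic occurs no matter where it is declared):

#include <array>

for (int i = 0; i < 1000; ++i)
    for (int j = 0; j < 2000; ++j) {
        std::array<double, 4> C;   // stack storage; default-init does no work
        C[0] = 1.0;
        C[1] = 1.0;
        C[2] = 1.0;
        C[3] = 1.0;
    }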
Creation of the vector is expensive in this case because it allocates an array of size 4 on the heap each time it is constructed.
If you know the size of the 'local' vector upfront, you may as well use an automatic array:
for (int i = 0; i != 2000; ++i) {
    double C[4]; // no initialization
    C[0] = 1.0;
    // ...
}
This way you avoid the cost of the heap allocation entirely.
I was under the (perhaps false) impression that this would provide optimization opportunities for the compiler.
This is probably true for built-in types such as int or double.
The issue here is that you are using std::vector, which needs to run its constructor on entering the loop body and its destructor when leaving it. Since both of these are non-trivial, the compiler cannot optimise them away; doing so could make your program incorrect.
As a counter-example, imagine what such an optimisation would do if you used a file object instead of a vector.
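A sketch of that counter-example (the file name is made up): here the constructor and destructor have observable side effects, opening and closing the file, so the compiler is not allowed to hoist them out of the loop.

#include <fstream>

for (int i = 0; i < 1000; ++i) {
    std::ofstream log("trace.txt", std::ios::app);  // constructor opens the file
    log << i << '\n';
}   // destructor flushes and closes it on every iteration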