The following code gives different output when running the release inside Visual Studio, and running the release outside Visual Studio. I\'m using Visual Studio 2008 and targeti
It is a JIT optimizer bug. It is unrolling the inner loop but not updating the oVec.y value properly:
for (oVec.x = 0; oVec.x < 2; oVec.x++) {
0000000a xor esi,esi ; oVec.x = 0
for (oVec.y = 0; oVec.y < 2; oVec.y++) {
0000000c mov edi,2 ; oVec.y = 2, WRONG!
oDoesSomething.Do(oVec);
00000011 push edi
00000012 push esi
00000013 mov ecx,ebx
00000015 call dword ptr ds:[00170210h] ; first unrolled call
0000001b push edi ; WRONG! does not increment oVec.y
0000001c push esi
0000001d mov ecx,ebx
0000001f call dword ptr ds:[00170210h] ; second unrolled call
for (oVec.x = 0; oVec.x < 2; oVec.x++) {
00000025 inc esi
00000026 cmp esi,2
00000029 jl 0000000C
The bug disappears when you let oVec.y increment to 4, that's too many calls to unroll.
One workaround is this:
for (int x = 0; x < 2; x++) {
for (int y = 0; y < 2; y++) {
oDoesSomething.Do(new IntVec(x, y));
}
}
UPDATE: re-checked in August 2012, this bug was fixed in the version 4.0.30319 jitter. But is still present in the v2.0.50727 jitter. It seems unlikely they'll fix this in the old version after this long.
I copied your code into a new Console App.
So it is the x86 JIT incorrectly generating the code. Have deleted my original text about reordering of loops etc. A few other answers on here have confirmed that the JIT is unwinding the loop incorrectly when on x86.
To fix the problem you can change the declaration of IntVec to a class and it works in all flavours.
Think this needs to go on MS Connect....
-1 to Microsoft!
I believe this is in a genuine JIT compilation bug. I would report it to Microsoft and see what they say. Interestingly, I found that the x64 JIT does not have the same problem.
Here is my reading of the x86 JIT.
// save context
00000000 push ebp
00000001 mov ebp,esp
00000003 push edi
00000004 push esi
00000005 push ebx
// put oDoesSomething pointer in ebx
00000006 mov ebx,ecx
// zero out edi, this will store oVec.y
00000008 xor edi,edi
// zero out esi, this will store oVec.x
0000000a xor esi,esi
// NOTE: the inner loop is unrolled here.
// set oVec.y to 2
0000000c mov edi,2
// call oDoesSomething.Do(oVec) -- y is always 2!?!
00000011 push edi
00000012 push esi
00000013 mov ecx,ebx
00000015 call dword ptr ds:[002F0010h]
// call oDoesSomething.Do(oVec) -- y is always 2?!?!
0000001b push edi
0000001c push esi
0000001d mov ecx,ebx
0000001f call dword ptr ds:[002F0010h]
// increment oVec.x
00000025 inc esi
// loop back to 0000000C if oVec.x < 2
00000026 cmp esi,2
00000029 jl 0000000C
// restore context and return
0000002b pop ebx
0000002c pop esi
0000002d pop edi
0000002e pop ebp
0000002f ret
This looks like an optimization gone bad to me...