I optimized an extension method to compare two streams for equality (byte-for-byte) - knowing that this is a hot method I tried to optimize it as far as possible (the streams ca
for (var i = 0; i < (len / 8) + 1; i++)
The debugger generally has a hard time with this union; it can't display the array contents when I try it. But the core problem is no doubt the +1 in the for() end expression: it indexes the array beyond its last element when len is divisible by 8. The runtime cannot catch this mistake because overlapping the arrays gives the Length property a bogus value. What happens next is undefined behavior, since you are reading bytes that are not part of the array. A workaround is to make the array 7 bytes longer.
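As a rough illustration of the bounds problem, here is a minimal sketch that compares only the complete 8-byte words and then handles the leftover bytes one at a time. It is not the code from the question: the names ChunkCompare and BytesEqual are hypothetical, it reads words with BitConverter.ToInt64 instead of the overlapped-array union, and it assumes len does not exceed either array's length.

```csharp
using System;

static class ChunkCompare
{
    // Hypothetical helper: compares the first len bytes of a and b.
    // Assumes len <= a.Length and len <= b.Length.
    public static bool BytesEqual(byte[] a, byte[] b, int len)
    {
        int words = len / 8; // complete 8-byte words only, no "+ 1"
        for (int i = 0; i < words; i++)
        {
            // Each read covers bytes i*8 .. i*8+7, which always lie inside the first len bytes.
            if (BitConverter.ToInt64(a, i * 8) != BitConverter.ToInt64(b, i * 8))
                return false;
        }
        // The remaining 0..7 tail bytes are compared individually.
        for (int i = words * 8; i < len; i++)
        {
            if (a[i] != b[i]) return false;
        }
        return true;
    }
}
```

Because the word loop stops at len / 8, no read can touch memory past the requested length, so no padding of the array is needed.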
This kind of code is not exactly an optimization. Reading and comparing uint64 values on a 32-bit machine is expensive, especially when the array isn't aligned correctly, and the odds of that are about 50%. A better mousetrap is to use the C runtime memcmp() function, available on any Windows machine:
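// Requires: using System.Runtime.InteropServices;
// memcmp is a cdecl function, so the calling convention is stated explicitly.
[DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
private static extern int memcmp(byte[] arr1, byte[] arr2, int cnt);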
And use it like this:
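int len;
while ((len = target.Read(arr1, 0, 4096)) != 0) {
    // Caveat: Read() may return fewer bytes than requested, so this length
    // check assumes both streams deliver chunks of the same size.
    if (compareTo.Read(arr2, 0, 4096) != len) return false;
    if (memcmp(arr1, arr2, len) != 0) return false;
}
// target is exhausted; the streams are equal only if compareTo is too.
return compareTo.ReadByte() == -1;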
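One thing to watch with a chunked loop like this: Stream.Read is allowed to return fewer bytes than requested, so two equal streams can hand back chunks of different sizes. A minimal sketch of one way around that, assuming hypothetical names StreamCompare, ReadFully and ContentEquals, is to fill each buffer completely before comparing. The comparison here is a plain managed loop to keep the sketch portable; on Windows the memcmp() import could be used at that point instead.

```csharp
using System.IO;

static class StreamCompare
{
    // Hypothetical helper: keeps calling Read until count bytes arrive or the stream ends.
    static int ReadFully(Stream s, byte[] buffer, int count)
    {
        int total = 0;
        while (total < count)
        {
            int n = s.Read(buffer, total, count - total);
            if (n == 0) break; // end of stream
            total += n;
        }
        return total;
    }

    public static bool ContentEquals(Stream target, Stream compareTo)
    {
        const int Size = 4096;
        var arr1 = new byte[Size];
        var arr2 = new byte[Size];
        while (true)
        {
            int len1 = ReadFully(target, arr1, Size);
            int len2 = ReadFully(compareTo, arr2, Size);
            if (len1 != len2) return false; // one stream ended before the other
            if (len1 == 0) return true;     // both ended at the same point
            for (int i = 0; i < len1; i++)  // memcmp(arr1, arr2, len1) could replace this loop
            {
                if (arr1[i] != arr2[i]) return false;
            }
        }
    }
}
```

With full buffers on both sides, a length mismatch can only mean that one stream really is shorter than the other.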
Do compare the perf of this with a plain for() loop that compares bytes. The ultimate throttle here is the memory bus bandwidth.
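For that comparison, a rough timing harness could look like the sketch below. The names CompareBench, LoopEquals and Time are hypothetical, and a Stopwatch loop like this only gives a coarse picture (no warm-up runs, no statistics), so treat the numbers as indicative at best.

```csharp
using System;
using System.Diagnostics;

static class CompareBench
{
    // Baseline: plain byte-by-byte comparison of the first len bytes.
    public static bool LoopEquals(byte[] a, byte[] b, int len)
    {
        for (int i = 0; i < len; i++)
        {
            if (a[i] != b[i]) return false;
        }
        return true;
    }

    // Hypothetical harness: runs one comparer repeatedly and returns elapsed milliseconds.
    public static long Time(Func<byte[], byte[], int, bool> equals, byte[] a, byte[] b, int iterations)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            equals(a, b, a.Length);
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

Calling CompareBench.Time(CompareBench.LoopEquals, arr1, arr2, 100000) gives the baseline; passing a lambda such as (x, y, n) => memcmp(x, y, n) == 0 times the P/Invoke version on the same buffers.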
Problems like this usually come down to how optimizations work. This line of code may well appear to execute because both return false clauses are combined into one set of instructions at the lower level. Another possible cause is an architecture that allows conditional execution, where certain instructions are hit in the debugger but their results are never committed to registers at the architecture level.
Verify that the code works in debug mode first. Then, once you are convinced the outcome is the same as in release mode, look at the underlying instructions to work out which compiler optimization is at play.