Why are AND instructions generated?

一整个雨季 2020-12-25 11:51

For code such as this:

int res = 0;
for (int i = 0; i < 32; i++)
{
    res += 1 << i;
}

This code is generated (release mode, no d

3 Answers
  • 2020-12-25 12:38

    x64 cores already apply the 5 bit mask to the shift amount. From the Intel Processor manual, volume 2B page 4-362:

    The destination operand can be a register or a memory location. The count operand can be an immediate value or the CL register. The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). A special opcode encoding is provided for a count of 1.

    So that's machine code that isn't necessary. Unfortunately, the C# compiler cannot make any assumptions about the processor's behavior and must apply C# language rules. It must also generate IL whose behavior is specified in the CLI specification. Ecma-335, Partition III, chapter 3.58 says this about the SHL opcode:

    The shl instruction shifts value (int32, int64 or native int) left by the number of bits specified by shiftAmount. shiftAmount is of type int32 or native int. The return value is unspecified if shiftAmount is greater than or equal to the width of value.

    Unspecified is the rub here. Bolting specified behavior on top of unspecified implementation details produces the unnecessary code. Technically the jitter could optimize the AND away, although that's tricky because it doesn't know the language rule. Any language that specifies no masking will have a hard time generating proper IL. You can post to connect.microsoft.com to get the jitter team's view on the matter.

  • 2020-12-25 12:45

    The and is already present in the CIL code emitted by the C# compiler:

        IL_0009: ldc.i4.s 31
        IL_000b: and
        IL_000c: shl
    

    The spec for the CIL shl instruction says:

    The return value is unspecified if shiftAmount is greater than or equal to the size of value.

    The C# spec, however, defines the 32-bit shift to take the shift count mod 32:

    When the type of x is int or uint, the shift count is given by the low-order five bits of count. In other words, the shift count is computed from count & 0x1F.
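
    The rule quoted above can be checked with a small program. This is a minimal sketch: `ShiftCountDemo` and `ShiftLeftExplicit` are hypothetical names made up for illustration, and the `long` line relies on the same section of the C# spec, which uses the low-order six bits of the count for `long` and `ulong`.

    ```csharp
    using System;

    static class ShiftCountDemo
    {
        // Hypothetical helper: writes out the mask that C# implies for an int shift.
        public static int ShiftLeftExplicit(int x, int count) => x << (count & 0x1F);

        static void Main()
        {
            int count = 33; // a variable, so the shift count is not a literal

            // 33 & 0x1F == 1, so this behaves like 1 << 1.
            Console.WriteLine(1 << count);                  // prints 2

            // Same result with the mask spelled out.
            Console.WriteLine(ShiftLeftExplicit(1, count)); // prints 2

            // For long, six bits of the count are kept, so 33 is used unchanged.
            Console.WriteLine(1L << count);                 // prints 8589934592
        }
    }
    ```

    The explicit `& 0x1F` in the helper corresponds to the `ldc.i4.s 31` / `and` pair that the compiler emits ahead of `shl`.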

    In this situation, the C# compiler can’t really do much better than emit an explicit and operation. Best you can hope for is that the JITter will notice this and optimize away the redundant and, but that takes time, and the speed of JIT is pretty important. So consider this the price paid for a JIT-based system.

    The real question, I guess, is why the CIL specifies the shl instruction that way, when C# and x86 both specify the truncating behaviour. That I do not know, but I speculate that it’s important for the CIL spec to avoid specifying a behaviour that may JIT to something expensive on some instruction sets. At the same time, it’s important for C# to have as few undefined behaviours as possible, because people invariably end up using such undefined behaviours until the next version of the compiler/framework/OS/whatever changes them, breaking the code.

  • 2020-12-25 12:49

    The C# compiler has to insert these AND instructions while generating intermediate (machine-independent) code, because the C# left shift operator is required to use only the 5 least significant bits of the shift count.

    While generating x86 code, an optimizing compiler could drop these unneeded instructions. Apparently, though, the JIT skips this optimization (probably because it cannot afford to spend much time on analysis).
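
    To see that the AND changes nothing for the loop in the question, here is a rough sketch comparing the original loop with one that writes the mask by hand. `LoopDemo`, `SumImplicit` and `SumExplicit` are hypothetical names, and the `unchecked` blocks are an assumption added so the wrap-around on the last iteration does not depend on compiler settings.

    ```csharp
    using System;

    static class LoopDemo
    {
        // The loop from the question: the compiler adds "& 31" to the count in IL.
        public static int SumImplicit()
        {
            int res = 0;
            for (int i = 0; i < 32; i++)
            {
                unchecked { res += 1 << i; }
            }
            return res;
        }

        // The same loop with the mask written out in source.
        public static int SumExplicit()
        {
            int res = 0;
            for (int i = 0; i < 32; i++)
            {
                unchecked { res += 1 << (i & 31); }
            }
            return res;
        }

        static void Main()
        {
            // All 32 bits end up set, which is -1 as a signed int.
            Console.WriteLine(SumImplicit()); // prints -1
            Console.WriteLine(SumExplicit()); // prints -1
        }
    }
    ```

    Because `i` never leaves the range 0..31 here, the mask is redundant for this loop, but proving that requires range analysis on the loop variable.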
