Question
For an integer that is never expected to take negative values, one could use either unsigned int or int. From a compiler perspective, or purely in terms of CPU cycles, is there any difference on x86_64?
Answer 1:
It depends. It might go either way, depending on what you are doing with that int as well as on the properties of the underlying hardware.
An obvious example in unsigned int's favor would be the integer division operation. In C/C++ integer division is supposed to round towards zero. The x86 idiv instruction does round that way, but it is slow, so compilers replace division by a constant with various "optimized" sequences (shifts, etc.), and an arithmetic right shift rounds towards negative infinity. So, in order to satisfy the standard's requirements, the compiler is forced to adjust the result of a signed integer division with additional machine instructions. In the case of unsigned integer division this problem does not arise, which is why division by a constant is generally faster for unsigned types than for signed types.
For example, consider this simple expression
rand() / 2
The code generated for this expression by the MSVC compiler will generally look as follows
call rand
cdq
sub eax,edx
sar eax,1
Note that instead of a single shift instruction (sar) we are seeing a whole bunch of instructions here, i.e. our sar is preceded by two extra instructions (cdq and sub). These extra instructions are there just to "adjust" the division in order to force it to generate the "correct" (from the C language point of view) result: cdq fills edx with the sign of eax (0 or -1), and sub then adds 1 to negative values before the shift. Note that the compiler does not know that your value will always be positive, so it has to generate these instructions always, unconditionally. For non-negative values they never do anything useful, thus wasting CPU cycles.
Now take a look at the code for
(unsigned) rand() / 2
It is just
call rand
shr eax,1
In this case a single shift did the trick, giving us shorter and faster code (for the division alone).
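As a rough C sketch of what the two instruction sequences above compute: the helper names are made up for illustration, and the signed version assumes that >> on a negative int is an arithmetic shift, which the C standard leaves implementation-defined but mainstream x86 compilers provide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mirrors cdq / sub eax,edx / sar eax,1.
   Assumes >> on a negative value is an arithmetic shift. */
static int32_t div2_signed_by_shift(int32_t x) {
    int32_t sign = -(int32_t)(x < 0); /* 0 or -1, like edx after cdq */
    return (x - sign) >> 1;           /* add 1 to negatives, then shift */
}

/* Hypothetical helper: mirrors the single shr eax,1. */
static uint32_t div2_unsigned_by_shift(uint32_t x) {
    return x >> 1;
}

/* Self-check: both helpers agree with the / operator. */
static void check_div2(void) {
    assert(div2_signed_by_shift(7) == 7 / 2);
    assert(div2_signed_by_shift(-7) == -7 / 2);  /* -3: rounds towards zero */
    assert((-7 >> 1) == -4);                     /* bare shift rounds down (assumed arithmetic) */
    assert(div2_unsigned_by_shift(7u) == 7u / 2u);
}
```

Calling check_div2() from a test program exercises both paths; the adjustment only changes the result for negative odd inputs.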
On the other hand, when you are mixing integer arithmetic and x87 FPU floating-point arithmetic, signed integer types might work faster, since the FPU instruction set contains dedicated instructions for loading/storing signed integer values (fild/fistp), but has no instructions for unsigned integer values.
To illustrate this one can use the following simple function
double zero() { return rand(); }
The generated code will generally be very simple
call rand
mov dword ptr [esp],eax
fild dword ptr [esp]
But if we change our function to
double zero() { return (unsigned) rand(); }
the generated code will change to
call rand
test eax,eax
mov dword ptr [esp],eax
fild dword ptr [esp]
jge zero+17h
fadd qword ptr [__real@41f0000000000000 (4020F8h)]
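This code is noticeably larger because the FPU instruction set does not work with unsigned integer types, so extra adjustments are necessary after loading an unsigned value (which is what that conditional fadd does): fild interprets the 32 bits as signed, so when the sign bit is set the code adds 2^32 (the constant 41f0000000000000 is 4294967296.0 as a double).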
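The same adjustment can be written out in C. This is only a sketch of what the generated code does, with a made-up function name; it assumes that converting an out-of-range uint32_t to int32_t wraps (implementation-defined before C23, but the behavior of mainstream compilers).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: load the bits as a signed value (like fild),
   then add 2^32 if that value came out negative (like the conditional fadd). */
static double u32_to_double_like_x87(uint32_t u) {
    int32_t s = (int32_t)u;  /* assumed to wrap for u > INT32_MAX */
    double d = (double)s;    /* signed load */
    if (s < 0) {
        d += 4294967296.0;   /* 2^32 correction */
    }
    return d;
}

/* Self-check on the boundary values. */
static void check_u32_to_double(void) {
    assert(u32_to_double_like_x87(0u) == 0.0);
    assert(u32_to_double_like_x87(0x7FFFFFFFu) == 2147483647.0);
    assert(u32_to_double_like_x87(0x80000000u) == 2147483648.0);
    assert(u32_to_double_like_x87(0xFFFFFFFFu) == 4294967295.0);
}
```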
There are other contexts and examples that can be used to demonstrate that it works either way. So, again, it all depends. But generally, all this will not matter in the big picture of your program's performance. I generally prefer to use unsigned types to represent unsigned quantities. In my code 99% of integer types are unsigned. But I do it for purely conceptual reasons, not for any performance gains.
Answer 2:
Signed types are inherently more optimizable in most cases because signed overflow is undefined behavior in C, so the compiler can ignore the possibility of overflow and simplify/rearrange arithmetic in whatever ways it sees fit. On the other hand, unsigned types are inherently safer because the result is always well-defined (it wraps around, even if that is not what you naively think it should be).
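A small illustration of that difference, with hypothetical function names: for int, a compiler is permitted (not required) to fold the comparison below to a constant, while for unsigned it has to honor wraparound.

```c
#include <assert.h>
#include <limits.h>

/* Signed: overflow is undefined, so a compiler may simplify this to "return 1".
   Must not be called with INT_MAX. */
static int signed_next_is_greater(int x) {
    return x + 1 > x;
}

/* Unsigned: x + 1u wraps to 0 at UINT_MAX, so the comparison must be evaluated. */
static int unsigned_next_is_greater(unsigned x) {
    return x + 1u > x;
}

/* Self-check using only well-defined inputs. */
static void check_overflow_rules(void) {
    assert(signed_next_is_greater(41) == 1);
    assert(unsigned_next_is_greater(41u) == 1);
    assert(unsigned_next_is_greater(UINT_MAX) == 0); /* wrapped to 0 */
}
```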
The one case where unsigned types optimize better is when you're writing division/remainder by a power of two. For unsigned types this translates directly to a bit shift and a bitwise and. For signed types, unless the compiler can establish that the value is known to be non-negative, it must generate extra code to compensate for the off-by-one issue with negative numbers (according to C, -3/2 is -1, whereas rounding down, which is what an arithmetic shift does, gives -2).
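A minimal sketch of the equivalence for a divisor of 8, using made-up helper names. The last assertion assumes two's complement representation of int, which holds on x86.

```c
#include <assert.h>

/* Hypothetical helpers: unsigned division and remainder by 8
   written as the shift and mask a compiler would emit. */
static unsigned div8(unsigned u) { return u >> 3; }
static unsigned mod8(unsigned u) { return u & 7u; }

static void check_pow2(void) {
    assert(div8(29u) == 29u / 8u);
    assert(mod8(29u) == 29u % 8u);
    /* Signed values do not map so directly: */
    assert(-3 / 2 == -1);   /* C rounds towards zero */
    assert(-3 % 2 == -1);   /* remainder takes the sign of the dividend */
    assert((-3 & 1) == 1);  /* a bare mask gives a different answer (two's complement assumed) */
}
```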
Answer 3:
It will almost certainly make no difference. Occasionally the compiler can play games with the signedness of types in order to shave a couple of cycles, but to be honest the change is probably negligible overall.
For example, suppose you have an int x and want to write:
if(x >= 10 && x < 200) { /* ... */ }
You (or better yet, the compiler) can transform this a little to do one less comparison:
if((unsigned int)(x - 10) < 190) { /* ... */ }
This assumes that int is represented in two's complement, so that if (x - 10) is less than 0 it becomes a huge value when viewed as an unsigned int. For example, on a typical x86 system, (unsigned int)-1 == 0xffffffff, which is clearly bigger than the 190 being tested.
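A quick way to convince yourself that the two forms agree is to compare them over a range of inputs. The sketch below uses hypothetical function names and differs slightly from the expression above: it converts x to unsigned before subtracting, so the subtraction itself cannot overflow for x near INT_MIN.

```c
#include <assert.h>
#include <limits.h>

/* The straightforward form: two comparisons. */
static int in_range_two_compares(int x) {
    return x >= 10 && x < 200;
}

/* The single-comparison form. Converting first keeps the subtraction
   in unsigned arithmetic, where wraparound is well-defined. */
static int in_range_one_compare(int x) {
    return (unsigned int)x - 10u < 190u;
}

/* Returns 1 if both forms agree for every x in [first, last]. */
static int range_checks_agree(int first, int last) {
    for (int x = first; x <= last; ++x) {
        if (in_range_two_compares(x) != in_range_one_compare(x)) {
            return 0;
        }
    }
    return 1;
}

static void check_range(void) {
    assert(range_checks_agree(-1000, 1000));
    assert(in_range_one_compare(INT_MIN) == 0);
    assert(in_range_one_compare(INT_MAX) == 0);
}
```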
This is micro-optimization at best and is best left to the compiler. Instead, you should write code that expresses what you mean and, if it is too slow, profile and decide where it really is necessary to get clever.
Answer 4:
I don't imagine it would make much difference in terms of CPU or the compiler. One possible case would be if it enabled the compiler to know that the number would never be negative and optimize away code.
However it IS useful to a human reading your code so they know the domain of the variable in question.
Answer 5:
From the ALU's point of view, adding (or whatever) signed or unsigned values doesn't make any difference, since they're both represented by a group of bits. 0100 + 1011 is always 1111, but you choose whether that is 4 + (-5) = -1 or 4 + 11 = 15.
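The 4-bit example can be reproduced in C by masking to a nibble. The helpers below are hypothetical and only illustrate that one bit pattern has two readings.

```c
#include <assert.h>

/* Hypothetical helper: 4-bit addition, keeping only the low nibble. */
static unsigned add4(unsigned a, unsigned b) {
    return (a + b) & 0xFu;
}

/* Hypothetical helper: read a nibble as a 4-bit two's complement value. */
static int as_signed4(unsigned nibble) {
    return nibble >= 8u ? (int)nibble - 16 : (int)nibble;
}

static void check_nibbles(void) {
    unsigned sum = add4(0x4u, 0xBu); /* 0100 + 1011 */
    assert(sum == 0xFu);             /* 1111 either way */
    assert(sum == 15u);              /* unsigned reading: 4 + 11 = 15 */
    assert(as_signed4(0xBu) == -5);  /* signed reading of 1011 */
    assert(as_signed4(sum) == -1);   /* signed reading: 4 + (-5) = -1 */
}
```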
So I agree with @Mark, you should choose the best data-type to help others understand your code.
Source: https://stackoverflow.com/questions/4889707/compiler-optimization-on-marking-an-int-unsigned