Until recently, I'd considered the decision by most systems implementors/vendors to keep plain int
32-bit even on 64-bit machines a sort of expedient wart. With mo
While I don't personally write code like this, I'll bet that it's out there in more than one place... and of course it'll break if you change the size of int.
    int i, x = getInput();
    for (i = 0; i < 32; i++)
    {
        if (x & (1 << i))
        {
            // Do something
        }
    }
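As a rough sketch of how the loop above could avoid hard-coding 32, the width can be derived from the type itself. The helper name count_set_bits is made up for illustration, and the width computation assumes unsigned int has no padding bits, which holds on mainstream platforms but is not guaranteed by the standard.

```c
#include <limits.h>

/* Hypothetical helper: visits every bit of x without assuming a 32-bit int. */
int count_set_bits(unsigned int x)
{
    /* Bits in the object representation; assumed equal to the value width
       (no padding bits), which is true on mainstream platforms. */
    const unsigned int width = (unsigned int)(sizeof x * CHAR_BIT);
    int count = 0;
    unsigned int i;

    for (i = 0; i < width; i++) {
        /* 1u keeps the shift unsigned, so the top bit can be tested
           without shifting into a sign bit. */
        if (x & (1u << i)) {
            count++; /* stands in for "do something" */
        }
    }
    return count;
}
```

Written this way, the loop keeps working whether int is 16, 32, or 64 bits wide.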
Not particularly. int is 64 bits wide on some 64-bit architectures (though not x64).
The standard does not actually guarantee that int is 32 bits: it only has to be at least 16 bits wide, while (u)int32_t, where provided, is exactly 32 bits.
Now, if you depend on int being the same size as ptrdiff_t, your code may break.
Remember, C does not even guarantee that the machine is a binary machine.
Well, it's not like this story is all new. By "most computers" I assume you mean desktop computers. There has already been a transition from 16-bit to 32-bit int. Is there anything at all that says the same progression won't happen this time?
There's one code idiom that would break if ints were 64-bits, and I see it often enough that I think it could be called reasonable:
((val & 0x80000000) != 0)
This is commonly found in code that checks error codes. Many error code standards (like Windows' HRESULT) use bit 31 to represent an error, and code will sometimes check for that error either by testing bit 31 or by checking whether the error code is a negative number.
Microsoft's macros for testing HRESULT use both methods - and I'm sure there's a ton of code out there that does similar without using the SDK macros. If MS had moved to ILP64, this would be one area that caused porting headaches that are completely avoided with the LLP64 model (or the LP64 model).
Note: if you're not familiar with terms like "ILP64", please see the mini-glossary at the end of the answer.
I'm pretty sure there's a lot of code (not necessarily Windows-oriented) out there that uses plain-old-int to hold error codes, assuming that those ints are 32 bits in size. And I bet there's a lot of code with that error status scheme that also uses both kinds of checks (< 0 and bit 31 being set) and which would break if moved to an ILP64 platform. These checks could be made to continue to work correctly either way if the error codes were carefully constructed so that sign-extension took place, but again, many such systems I've seen construct the error values by or-ing together a bunch of bitfields.
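To make the mismatch concrete, here is a minimal sketch in which int64_t stands in for what plain int would be on an ILP64 platform. All names (ERR_BIT31, make_error, the failed_by_* helpers) are invented for illustration and are not the SDK macros. Note also that converting an out-of-range value to int32_t is implementation-defined, although mainstream compilers wrap it to a negative number.

```c
#include <stdbool.h>
#include <stdint.h>

#define ERR_BIT31 0x80000000u /* hypothetical failure flag in bit 31 */

/* Builds an error value by or-ing fields together, with no sign-extension. */
uint32_t make_error(uint32_t detail) { return ERR_BIT31 | detail; }

/* Check 1: test bit 31 directly. */
bool failed_by_bit(int64_t code) { return (code & ERR_BIT31) != 0; }

/* Check 2: test the sign, with the code held in a 32-bit int. */
bool failed_by_sign32(int32_t code) { return code < 0; }

/* Check 2 again, with the code held in a 64-bit int (the ILP64 case).
   Bit 31 is no longer the sign bit, so the value looks positive. */
bool failed_by_sign64(int64_t code) { return code < 0; }
```

With a value such as make_error(5), the bit test and the 32-bit sign test agree that it is a failure, while the 64-bit sign test reports success, so the two styles of check silently diverge.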
Anyway, I don't think this is an unsolvable problem by any means, but I do think it's a fairly common coding practice that would cause a lot of code to require fixing up if moved to an ILP64 platform.
Note that I also don't think this was one of the foremost reasons for Microsoft to choose the LLP64 model (I think that decision was largely driven by binary data compatibility between 32-bit and 64-bit processes, as mentioned in MSDN and on Raymond Chen's blog).
Mini-Glossary for the 64-bit Platform Programming Model terminology:
ILP64: int, long, pointers are 64-bits
LP64: long and pointers are 64-bits, int is 32-bits (used by many (most?) Unix platforms)
LLP64: long long and pointers are 64-bits, int and long remain 32-bits (used on Win64)
For more information on 64-bit programming models, see "64-bit Programming Models: Why LP64?"
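The glossary above can also be read as a small lookup on three sizes. The following is only a sketch: the enum and the function classify_data_model are hypothetical names, sizes are given in bytes under the assumption of 8-bit bytes, and ILP32 (the usual 32-bit model) is added for completeness even though the glossary does not list it.

```c
#include <stddef.h>

/* Hypothetical labels for the programming models discussed above. */
enum data_model { MODEL_OTHER, MODEL_ILP32, MODEL_LLP64, MODEL_LP64, MODEL_ILP64 };

/* Classifies a platform from sizeof(int), sizeof(long) and sizeof(void *),
   all in bytes and assuming 8-bit bytes. */
enum data_model classify_data_model(size_t int_size, size_t long_size,
                                    size_t ptr_size)
{
    if (ptr_size == 8) {
        if (int_size == 8 && long_size == 8) return MODEL_ILP64;
        if (int_size == 4 && long_size == 8) return MODEL_LP64;
        if (int_size == 4 && long_size == 4) return MODEL_LLP64;
    }
    if (ptr_size == 4 && int_size == 4 && long_size == 4) {
        return MODEL_ILP32;
    }
    return MODEL_OTHER;
}
```

Calling classify_data_model(sizeof(int), sizeof(long), sizeof(void *)) reports the model of whatever platform the code is compiled for.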
DEC Alpha and OSF/1 Unix was one of the first 64-bit versions of Unix, and it used 64-bit integers - an ILP64 architecture (meaning int, long and pointers were all 64-bit quantities). It caused lots of problems.
One issue I've not seen mentioned - which is why I'm answering at all after so long - is that if you have a 64-bit int, what size do you use for short? Both 16 bits (the classical, change nothing approach) and 32 bits (the radical 'well, a short should be half as long as an int' approach) will present some problems.
With the C99 <stdint.h> and <inttypes.h> headers, you can code to fixed size integers - if you choose to ignore machines with 36-bit or 60-bit integers (which is at least quasi-legitimate). However, most code is not written using those types, and there are typically deep-seated and largely hidden (but fundamentally flawed) assumptions in the code that will be upset if the model departs from the existing variations.
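For what coding to the fixed-size types looks like in practice, here is a small sketch. The function format_record and its fields are made up for illustration; the PRId32 and PRIu64 macros come from <inttypes.h> and expand to the right printf conversion for each type, whatever the platform's underlying int and long sizes are.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical record formatter: the field widths are pinned by the types,
   not by the platform's choice of int and long. Returns the number of
   characters the full output needs, as snprintf does. */
int format_record(char *buf, size_t len, int32_t id, uint64_t size_bytes)
{
    return snprintf(buf, len, "id=%" PRId32 " size=%" PRIu64, id, size_bytes);
}
```

Using "%d" or "%lu" here instead would be correct on some of the models above and wrong on others, which is the kind of hidden assumption the paragraph describes.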
Notice Microsoft's ultra-conservative LLP64 model for 64-bit Windows. That was chosen because too much old code would break if the 32-bit model was changed. However, code that had been ported to ILP64 or LP64 architectures was not immediately portable to LLP64 because of the differences. Conspiracy theorists would probably say it was deliberately chosen to make it more difficult for code written for 64-bit Unix to be ported to 64-bit Windows. In practice, I doubt whether that was more than a happy (for Microsoft) side-effect; the 32-bit Windows code had to be revised a lot to make use of the LLP64 model too.
With modern C99 fixed-size types (int32_t and uint32_t, etc.) the need for there to be a standard integer type of each size 8, 16, 32, and 64 mostly disappears,
C99 has fixed-sized typeDEFs, not fixed-size types. The native C integer types are still char, short, int, long, and long long. They are still relevant.
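One way to see that the typedefs are just aliases is to ask the compiler which native type sits behind them. This sketch uses _Generic, which is a C11 feature rather than C99, and the macro and function names are invented for illustration. It assumes the typedefs map to standard integer types, which is the case on mainstream compilers but not required by the standard.

```c
#include <stdint.h>

/* Hypothetical macro: maps an expression to a small ID for its native type.
   Returns 0 if the type is not one of the listed standard signed types. */
#define NATIVE_TYPE_ID(x) _Generic((x), \
    signed char: 1,                     \
    short: 2,                           \
    int: 3,                             \
    long: 4,                            \
    long long: 5,                       \
    default: 0)

/* Which native type is int32_t an alias for on this platform? */
int native_id_of_int32(void) { int32_t v = 0; return NATIVE_TYPE_ID(v); }

/* And int64_t? Typically long on LP64 and long long on LLP64. */
int native_id_of_int64(void) { int64_t v = 0; return NATIVE_TYPE_ID(v); }
```

No branch for int32_t or int64_t is needed, because no such distinct types exist; the selection lands on whichever native type the typedef names.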
The problem with ILP64 is that it has a great mismatch between C types and C99 typedefs.
From 64-Bit Programming Models: Why LP64?:
Unfortunately, the ILP64 model does not provide a natural way to describe 32-bit data types, and must resort to non-portable constructs such as __int32 to describe such types. This is likely to cause practical problems in producing code which can run on both 32 and 64 bit platforms without #ifdef constructions. It has been possible to port large quantities of code to LP64 models without the need to make such changes, while maintaining the investment made in data sets, even in cases where the typing information was not made externally visible by the application.