Why would uint32_t be preferred rather than uint_fast32_t?

没有蜡笔的小新 2021-01-31 00:58

It seems that uint32_t is much more prevalent than uint_fast32_t (I realise this is anecdotal evidence). That seems counter-intuitive to me, though.

11 Answers
  • 2021-01-31 01:25

    One reason is that unsigned int is already "fastest" without the need for any special typedefs or the need to include something. So, if you need it fast, just use the fundamental int or unsigned int type.
    While the standard does not explicitly guarantee that it is fastest, it indirectly does so by stating "Plain ints have the natural size suggested by the architecture of the execution environment" in 3.9.1. In other words, int (or its unsigned counterpart) is what the processor is most comfortable with.

    Now of course, you don't know what size unsigned int might be. You only know it is at least as large as short (and I seem to remember that short must be at least 16 bits, although I can't find that in the standard now!). Usually it's just plain simply 4 bytes, but it could in theory be larger, or in extreme cases, even smaller. (I originally wrote that I had never personally encountered an architecture where this was the case, not even on 8-bit computers in the 1980s; it turns out my memory failed me, as int was very clearly 16 bits back then.)

    The C++ standard doesn't bother to specify what the <cstdint> types are or what they guarantee, it merely mentions "same as in C".

    uint32_t, per the C standard, guarantees that you get exactly 32 bits. Not anything different, none less and no padding bits. Sometimes this is exactly what you need, and thus it is very valuable.

    uint_least32_t guarantees that whatever the size is, it cannot be smaller than 32 bits (but it could very well be larger). Sometimes, but much more rarely than an exact width or "don't care", this is what you want.

    Lastly, uint_fast32_t is somewhat superfluous in my opinion, except for documentation-of-intent purposes. The C standard states "designates an integer type that is usually fastest" (note the word "usually") and explicitly mentions that it need not be fastest for all purposes. In other words, uint_fast32_t is just about the same as uint_least32_t, which is usually fastest too, only with no guarantee given (but there is no guarantee either way).

    Since most of the time you either don't care about the exact size or you want exactly 32 (or 64, sometimes 16) bits, and since the "don't care" unsigned int type is fastest anyway, this explains why uint_fast32_t isn't so frequently used.
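
    The guarantees described above can be written down as compile-time checks. This is only a minimal sketch, assuming a C11 compiler and a platform where uint32_t exists; the BITS_OF macro and the two helper functions are hypothetical names introduced here for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* uint32_t, uint_least32_t, uint_fast32_t */

/* Hypothetical helper: size of a type's object representation in bits. */
#define BITS_OF(T) ((unsigned)(sizeof(T) * CHAR_BIT))

/* Exact width: holds wherever uint32_t exists at all. */
static_assert(UINT32_MAX == 4294967295u, "uint32_t has exactly 32 value bits");

/* Minimum widths: the only portable guarantees for the other two types. */
static_assert(UINT_LEAST32_MAX >= 4294967295u, "uint_least32_t has at least 32 bits");
static_assert(UINT_FAST32_MAX >= UINT_LEAST32_MAX, "uint_fast32_t is at least as wide as uint_least32_t");

/* Report what this particular implementation chose. */
unsigned least32_bits(void) { return BITS_OF(uint_least32_t); }
unsigned fast32_bits(void)  { return BITS_OF(uint_fast32_t); }
```

    On a typical desktop target least32_bits() reports 32, while fast32_bits() may report 32 or 64 depending on the C library, which is exactly the "at least" part of the guarantee.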

  • 2021-01-31 01:26

    Why do many people use uint32_t rather than uint32_fast_t?

    Silly answer:

    • There is no standard type uint32_fast_t, the correct spelling is uint_fast32_t.

    Practical answer:

    • Many people actually use uint32_t or int32_t for their precise semantics, exactly 32 bits with unsigned wrap around arithmetic (uint32_t) or 2's complement representation (int32_t). The xxx_fast32_t types may be larger and thus inappropriate to store to binary files, use in packed arrays and structures, or send over a network. Furthermore, they may not even be faster.

    Pragmatic answer:

    • Many people just don't know (or simply don't care) about uint_fast32_t, as demonstrated in comments and answers, and probably assume plain unsigned int to have the same semantics, although many current architectures still have 16-bit ints and some rare museum samples have other strange int sizes smaller than 32 bits.

    UX answer:

    • Although possibly faster than uint32_t, uint_fast32_t is slower to use: it takes longer to type, especially accounting for looking up spelling and semantics in the C documentation ;-)

    Elegance matters (obviously opinion-based):

    • uint32_t looks bad enough that many programmers prefer to define their own u32 or uint32 type... From this perspective, uint_fast32_t looks clumsy beyond repair. No surprise it sits on the bench with its friends uint_least32_t and such.
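
    The "precise semantics" point from the practical answer can be illustrated with a small sketch. The function names are hypothetical; the only assumption is that uint32_t exists on the target.

```c
#include <assert.h>
#include <stdint.h>

static_assert(UINT_FAST32_MAX >= UINT32_MAX, "the fast type is never narrower than 32 bits");

/* uint32_t arithmetic is modulo 2^32 on every platform that has the type. */
uint32_t next_exact(uint32_t x) { return (uint32_t)(x + 1u); }

/* uint_fast32_t wraps at UINT_FAST32_MAX, which may be 2^32-1 or larger,
   so the result for x == UINT32_MAX depends on the implementation. */
uint_fast32_t next_fast(uint_fast32_t x) { return x + 1u; }

/* To get portable modulo-2^32 behavior from the fast type, mask explicitly. */
uint_fast32_t next_fast_masked(uint_fast32_t x)
{
    return (x + 1u) & UINT32_C(0xFFFFFFFF);
}
```

    Code such as hashes or checksums that relies on wrapping at 32 bits needs the masked form if it uses uint_fast32_t, which removes much of the convenience.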
  • 2021-01-31 01:28

    To my understanding, int was initially supposed to be a "native" integer type with the additional guarantee that it should be at least 16 bits in size, something that was considered a "reasonable" size back then.

    When 32-bit platforms became more common, we can say that "reasonable" size has changed to 32 bits:

    • Modern Windows uses 32-bit int on all platforms.
    • POSIX guarantees that int is at least 32 bits.
    • C# and Java have an int type which is guaranteed to be exactly 32 bits.

    But when 64-bit platforms became the norm, no one expanded int to be a 64-bit integer because of:

    • Portability: a lot of code depends on int being 32 bit in size.
    • Memory consumption: doubling memory usage for every int might be unreasonable for most cases, as in most cases numbers in use are much smaller than 2 billion.

    Now, why would you prefer uint32_t to uint_fast32_t? For the same reason that languages such as C# and Java always use fixed-size integers: programmers do not write code thinking about the possible sizes of different types; they write for one platform and test the code on that platform. Most code implicitly depends on specific sizes of data types. And this is why uint32_t is a better choice for most cases: it does not allow any ambiguity regarding its behavior.

    Moreover, is uint_fast32_t really the fastest type on a platform with a size equal to or greater than 32 bits? Not really. Consider this code compiled by GCC for x86_64 on Windows:

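    #include <stdint.h>

    extern uint64_t get(void);   /* defined in another translation unit */

    /* Adds a 64-bit argument to the value returned by get(). */
    uint64_t sum(uint64_t value)
    {
        return value + get();
    }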
    

    Generated assembly looks like this:

    push   %rbx
    sub    $0x20,%rsp
    mov    %rcx,%rbx
    callq  d <sum+0xd>
    add    %rbx,%rax
    add    $0x20,%rsp
    pop    %rbx
    retq
    

    Now if you change get()'s return value to uint_fast32_t (which is 4 bytes on Windows x86_64) you get this:

    push   %rbx
    sub    $0x20,%rsp
    mov    %rcx,%rbx
    callq  d <sum+0xd>
    mov    %eax,%eax        ; <-- additional instruction
    add    %rbx,%rax
    add    $0x20,%rsp
    pop    %rbx
    retq
    

    Notice how the generated code is almost the same, except for the additional mov %eax,%eax instruction after the function call, which zero-extends the 32-bit value into a 64-bit value.

    There is no such issue if you only use 32-bit values, but you will probably be using those with size_t variables (array sizes probably?) and those are 64 bits on x86_64. On Linux uint_fast32_t is 8 bytes, so the situation is different.

    Many programmers use int when they need to return a small value (let's say in the range [-32,32]). This would work perfectly if int were the platform's native integer size, but since it is not on 64-bit platforms, another type which matches the platform's native type is a better choice (unless it is frequently used with other integers of smaller size).

    Basically, regardless of what the standard says, uint_fast32_t is broken on some implementations anyway. If you care about the additional instruction generated in some places, you should define your own "native" integer type. Or you can use size_t for this purpose, as it will usually match the native size (I am not including old and obscure platforms like the 8086, only platforms that can run Windows, Linux etc.).
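
    One way to read the "define your own native integer type" suggestion is sketched below. The names native_uint and count_nonzero are hypothetical, and the choice of size_t as the underlying type is the assumption stated above (it matches the register width on mainstream Windows and Linux targets, but the standard does not promise that).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical alias: an unsigned type assumed to match the native register width. */
typedef size_t native_uint;

static_assert((native_uint)-1 > 0, "native_uint must be unsigned");

/* Index and counter have the same width as size_t, so mixing them with
   array sizes requires no widening conversions. */
native_uint count_nonzero(const uint8_t *data, size_t len)
{
    native_uint count = 0;
    for (size_t i = 0; i < len; ++i) {
        count += (data[i] != 0);
    }
    return count;
}
```

    Whether this actually saves instructions depends on the compiler and target, so it is worth checking the generated assembly before adopting such an alias.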


    Another sign that int was supposed to be a native integer type is the "integer promotion rule". Most CPUs can only perform operations on native-size values, so a 32-bit CPU usually can only do 32-bit additions, subtractions etc. (Intel CPUs are an exception here). Integer types of other sizes are supported only through load and store instructions. For example, an 8-bit value is loaded with the appropriate "load 8-bit signed" or "load 8-bit unsigned" instruction, which expands the value to 32 bits after the load. Without the integer promotion rule, C compilers would have to add a little more code for expressions that use types smaller than the native type. Unfortunately, this does not hold anymore on 64-bit architectures, as compilers now have to emit additional instructions in some cases (as was shown above).
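
    The integer promotion rule mentioned above can be observed directly. This is a small sketch with hypothetical function names; it assumes nothing beyond a C11 compiler that provides uint8_t.

```c
#include <assert.h>
#include <stdint.h>

/* Both operands are promoted to int before the addition,
   so the expression itself has type int. */
static_assert(sizeof((uint8_t)1 + (uint8_t)1) == sizeof(int),
              "uint8_t + uint8_t is computed as int");

/* The sum is computed in int, so it does not wrap at 256. */
int add_promoted(uint8_t a, uint8_t b) { return a + b; }

/* Reduction modulo 256 only happens on conversion back to uint8_t. */
uint8_t add_truncated(uint8_t a, uint8_t b) { return (uint8_t)(a + b); }
```

    The explicit conversion in add_truncated is the "little bit more code" a compiler would otherwise have to emit after every narrow operation.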

  • 2021-01-31 01:29

    From the viewpoint of correctness and ease of coding, uint32_t has many advantages over uint_fast32_t in particular because of the more precisely defined size and arithmetic semantics, as many users above have pointed out.

    What has perhaps been missed is that the one supposed advantage of uint_fast32_t, that it can be faster, just never materialized in any meaningful way. Most of the 64-bit processors that have dominated the 64-bit era (x86-64 and AArch64 mostly) evolved from 32-bit architectures and have fast 32-bit native operations even in 64-bit mode. So uint_fast32_t offers no speed advantage over uint32_t on those platforms.

    Even if some of the "also ran" platforms like POWER, MIPS64 and SPARC only offer 64-bit ALU operations, the vast majority of interesting 32-bit operations can be done just fine on 64-bit registers: the bottom 32 bits will have the desired results (and all mainstream platforms at least allow you to load/store 32 bits). Left shift is the main problematic one, but even that can be optimized in many cases by value/range tracking optimizations in the compiler.

    I doubt the occasional slightly slower left shift or 32x32 -> 64 multiplication is going to outweigh double the memory use for such values, in all but the most obscure applications.

    Finally, I'll note that while the tradeoff has largely been characterized as "memory use and vectorization potential" (in favor of uint32_t) versus instruction count/speed (in favor of uint_fast32_t) - even that isn't clear to me. Yes, on some platforms you'll need additional instructions for some 32-bit operations, but you'll also save some instructions because:

    • Using a smaller type often allows the compiler to cleverly combine adjacent operations by using one 64-bit operation to accomplish two 32-bit ones. This type of "poor man's vectorization" is not uncommon. For example, creating a constant struct two32{ uint32_t a, b; } in rax, such as two32{1, 2}, can be optimized into a single mov rax, 0x200000001, while the 64-bit version needs two instructions. In principle this should also be possible for adjacent arithmetic operations (same operation, different operand), but I haven't seen it in practice.
    • Lower "memory use" also often leads to fewer instructions, even if memory or cache footprint isn't a problem, because whenever structures or arrays of this type are copied, you get twice the bang for your buck per register copied.
    • Smaller data types often take better advantage of modern calling conventions like the SysV ABI, which pack structure data efficiently into registers. For example, you can return up to a 16-byte structure in registers rdx:rax. For a function returning a structure with 4 uint32_t values (initialized from a constant), that translates into

      ret_constant32():
          movabs  rax, 8589934593
          movabs  rdx, 17179869187
          ret
      

      The same structure with 4 64-bit uint_fast32_t values needs a register move and four stores to memory to do the same thing (and the caller will probably have to read the values back from memory after the return):

      ret_constant64():
          mov     rax, rdi
          mov     QWORD PTR [rdi], 1
          mov     QWORD PTR [rdi+8], 2
          mov     QWORD PTR [rdi+16], 3
          mov     QWORD PTR [rdi+24], 4
          ret
      

      Similarly, when passing structure arguments, 32-bit values are packed about twice as densely into the registers available for parameters, so it makes it less likely that you'll run out of register arguments and have to spill to the stack [1].

    • Even if you choose to use uint_fast32_t for places where "speed matters", you'll often also have places where you need a fixed-size type: for example, when passing values for external output, reading external input, as part of your ABI, as part of a structure that needs a specific layout, or because you smartly use uint32_t for large aggregations of values to save on memory footprint. In the places where your uint_fast32_t and uint32_t types need to interface, you might find (in addition to the development complexity) unnecessary sign extensions or other size-mismatch related code. Compilers do an OK job of optimizing this away in many cases, but it is still not unusual to see this in optimized output when mixing types of different sizes.
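
    The structure-returning comparison in the calling-convention bullet above can be reproduced with source along these lines. It is a sketch, not the exact code behind the listings: the struct names and pack_pair are hypothetical, uint64_t stands in for a 64-bit uint_fast32_t, and the size checks reflect mainstream ABIs without padding rather than anything the C standard promises.

```c
#include <assert.h>
#include <stdint.h>

struct four32 { uint32_t a, b, c, d; };   /* 16 bytes: fits in rdx:rax under SysV */
struct four64 { uint64_t a, b, c, d; };   /* 32 bytes: returned through memory */

static_assert(sizeof(struct four32) == 16, "expected on mainstream ABIs");
static_assert(sizeof(struct four64) == 32, "expected on mainstream ABIs");

struct four32 ret_constant32(void)
{
    struct four32 r = {1, 2, 3, 4};
    return r;
}

struct four64 ret_constant64(void)
{
    struct four64 r = {1, 2, 3, 4};
    return r;
}

/* How two adjacent 32-bit members share one 64-bit register on a
   little-endian target: the first member occupies the low half. */
uint64_t pack_pair(uint32_t lo, uint32_t hi)
{
    return (uint64_t)lo | ((uint64_t)hi << 32);
}
```

    pack_pair(1, 2) and pack_pair(3, 4) give the two movabs constants that appear in the ret_constant32 listing.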

    You can play with some of the examples above and more on godbolt.


    [1] To be clear, the convention of packing structures tightly into registers isn't always a clear win for smaller values. It does mean that the smaller values may have to be "extracted" before they can be used. For example, a simple function that returns the sum of the two structure members needs a mov rax, rdi; shr rax, 32; add edi, eax, while for the 64-bit version each argument gets its own register and just needs a single add or lea. Still, if you accept that the "tightly pack structures while passing" design makes sense overall, then smaller values will take more advantage of this feature.

  • 2021-01-31 01:29

    To give a direct answer: I think the real reason why uint32_t is used over uint_fast32_t or uint_least32_t is simply that it is easier to type, and, due to being shorter, much nicer to read: If you make structs with some types, and some of them are uint_fast32_t or similar, then it's often hard to align them nicely with int or bool or other types in C, which are quite short (case in point: char vs. character). I of course cannot back this up with hard data, but the other answers can only guess at the reason as well.

    As for technical reasons to prefer uint32_t, I don't think there are any - when you absolutely need an exact 32 bit unsigned int, then this type is your only standardised choice. In almost all other cases, the other variants are technically preferable - specifically, uint_fast32_t if you are concerned about speed, and uint_least32_t if you are concerned about storage space. Using uint32_t in either of these cases risks not being able to compile as the type is not required to exist.
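
    Since uint32_t is optional, code that must compile everywhere can test for it. The following is a minimal sketch under that assumption; u32_storage, U32_IS_EXACT and u32_add are hypothetical names, and the check relies on the C rule that UINT32_MAX is defined only when uint32_t is provided.

```c
#include <assert.h>
#include <stdint.h>

#ifdef UINT32_MAX
typedef uint32_t u32_storage;        /* exact-width type is available */
#define U32_IS_EXACT 1
#else
typedef uint_least32_t u32_storage;  /* fallback: at least 32 bits, maybe more */
#define U32_IS_EXACT 0
#endif

static_assert((u32_storage)-1 >= 0xFFFFFFFFu, "u32_storage holds at least 32 bits");

/* Mask the result so it is reduced modulo 2^32 whichever type was chosen. */
u32_storage u32_add(u32_storage a, u32_storage b)
{
    return (u32_storage)((a + b) & UINT32_C(0xFFFFFFFF));
}
```

    On platforms with an exact 32-bit type the mask is redundant and compilers typically remove it.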

    In practice, uint32_t and related types exist on all current platforms, except some very rare (nowadays) DSPs or joke implementations, so there is little actual risk in using the exact type. Similarly, while you can run into speed penalties with the fixed-width types, they are (on modern CPUs) not crippling anymore.

    Which is why, I think, the shorter type simply wins out in most cases, due to programmer laziness.

  • 2021-01-31 01:30

    I have not seen evidence that uint32_t is used for its range. Instead, most of the time that I've seen uint32_t used, it is to hold exactly 4 octets of data in various algorithms, with guaranteed wraparound and shift semantics!

    There are also other reasons to use uint32_t instead of uint_fast32_t. Often it is that it will provide a stable ABI. Additionally, the memory usage can be known accurately. This very much offsets whatever the speed gain would be from uint_fast32_t, whenever that type would be distinct from uint32_t.
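
    A typical example of the "wraparound and shift semantics" in question is a 32-bit rotate, as used in many hash and cipher algorithms. This is a generic sketch with a hypothetical function name, assuming a target where int is at most 32 bits wide so that uint32_t operands stay unsigned.

```c
#include <assert.h>
#include <stdint.h>

static_assert(UINT32_MAX == 0xFFFFFFFFu, "uint32_t arithmetic is modulo 2^32");

/* Rotate x left by r bits. Bits shifted out of the top are discarded by the
   32-bit type, then brought back in at the bottom by the right shift.
   Both shift counts are masked so that r == 0 is well defined. */
uint32_t rotl32(uint32_t x, unsigned r)
{
    r &= 31u;
    return (uint32_t)((x << r) | (x >> ((32u - r) & 31u)));
}
```

    With uint_fast32_t on an implementation where it is 64 bits wide, the left shift would keep the high bits instead of dropping them, so the same expression would no longer be a rotate.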

    For values < 65536, there is already a handy type: it is called unsigned int (unsigned short is required to have at least that range as well, but unsigned int is of the native word size). For values < 4294967296, there is another, called unsigned long.


    And lastly, people do not use uint_fast32_t because it is annoyingly long to type and easy to mistype :D
