ANSI-C: maximum number of characters printing a decimal int

醉梦人生 2021-02-19 10:17

I'd like to know if there is an easy way of determining the maximum number of characters needed to print a decimal int.

I know <limits.h> contains limits such as INT_MAX, but nothing that gives the length of their decimal representations.

8 Answers
  • 2021-02-19 10:25

    You can calculate the number of digits using log base 10. On my system, computing the ceiling of log base 2 from the bit representation of the number didn't provide any significant gain in speed. The floor of log base 10, plus 1, gives the number of digits; I add 2 more to account for the sign and the terminating null character.

    #include <limits.h>
    #include <stdio.h>
    #include <math.h>
    
    int main(void){
      printf("%d %d\n", INT_MAX, (int)floor(log10(INT_MAX)) + 3);
    
      return 0;
    }
    

    Also note that an int can be 2 or 4 bytes (2 only on old systems), so you could compute this upper bound once for your platform and use it in your program.

  • 2021-02-19 10:26

    The simplest, canonical, and arguably most portable way is to ask snprintf() how much space would be required:

    char sbuf[2];
    int ndigits;
    
    ndigits = snprintf(sbuf, (size_t) 1, "%lld", (long long) INT_MIN);
    

    slightly less portable perhaps, using intmax_t and %jd:

    ndigits = snprintf(sbuf, (size_t) 1, "%jd", (intmax_t) INT_MIN);
    

    One could consider that too expensive to do at runtime, but it works for any value, not just the MIN/MAX values of any integer type.

    You could of course also just directly calculate the number of digits that a given integer would require to be expressed in Base 10 notation with a simple recursive function:

    unsigned int
    numCharsB10(intmax_t n)
    {
            if (n < 0)
                    return numCharsB10((n == INTMAX_MIN) ? INTMAX_MAX : -n) + 1;
            if (n < 10)
                    return 1;
    
            return 1 + numCharsB10(n / 10);
    }
    

    but that of course also requires CPU at runtime, even when inlined, though perhaps a little less than snprintf() does.

    Though @R.'s answer above is more or less wrong, it is on the right track. Here is the correct derivation of some very widely tested and highly portable macros that implement the calculation at compile time using sizeof(), starting from a slight correction of @R.'s initial wording:

    First we can easily see (or show) that sizeof(int) is the log base 2 of UINT_MAX + 1 divided by the number of bits represented by one unit of sizeof() (CHAR_BIT, i.e. 8 on virtually all systems):

    sizeof(int) == log2(UINT_MAX + 1) / CHAR_BIT

    because UINT_MAX is of course just 2 ^ (sizeof(int) * CHAR_BIT) - 1, and log2(x) is the inverse of 2^x.

    We can use the identity "logb(x) = log(x) / log(b)" (where log() is the natural logarithm) to find logarithms of other bases. For example, you could compute the "log base 2" of "x" using:

    log2(x) = log(x) / log(2)

    and also:

    log10(x) = log(x) / log(10)

    So, we can deduce that:

    log10(v) = log2(v) / log2(10)

    Now what we want in the end is the log base 10 of UINT_MAX, so since log2(10) is approximately 3, and since we know from above what log2() is in terms of sizeof(), we can say that log10(UINT_MAX) is approximately:

    log10(2^(sizeof(int)*8)) ~= (sizeof(int) * 8) / 3

    That's not perfect though, especially since what we really want is the ceiling value, but with some minor adjustment to account for the integer rounding of log2(10) to 3, we can get what we need by first adding one to the log2 term, then subtracting 1 from the result for any larger-sized integer, resulting in this "good-enough" expression:

    #if 0
    #define __MAX_B10STRLEN_FOR_UNSIGNED_TYPE(t) \
        ((((sizeof(t) * CHAR_BIT) + 1) / 3) - ((sizeof(t) > 2) ? 1 : 0))
    #endif
    

    Even better we can multiply our first log2() term by 1/log2(10) (multiplying by the reciprocal of the divisor is the same as dividing by the divisor), and doing so makes it possible to find a better integer approximation. I most recently (re?)encountered this suggestion while reading Sean Anderson's bithacks: http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10

    To do this with integer math to the best approximation possible, we need to find the ideal ratio representing our reciprocal. This can be found by searching for the smallest fractional part of multiplying our desired value of 1/log2(10) by successive powers of 2, within some reasonable range of powers of 2, such as with the following little AWK script:

        awk 'BEGIN {
                minf=1.0
        }
        END {
                for (i = 1; i <= 31; i++) {
                        a = 1.0 / (log(10) / log(2)) * 2^i
                        if (a > (2^32 / 32))
                                break;
                        n = int(a)
                        f = a - (n * 1.0)
                        if (f < minf) {
                                minf = f
                                minn = n
                                bits = i
                        }
                        # printf("a=%f, n=%d, f=%f, i=%d\n", a, n, f, i)
                }
                printf("%d + %f / %d, bits=%d\n", minn, minf, 2^bits, bits)
        }' < /dev/null
    
        1233 + 0.018862 / 4096, bits=12
    

    So we can get a good integer approximation of multiplying our log2(v) value by 1/log2(10) by multiplying it by 1233 followed by a right-shift of 12 (2^12 is 4096 of course):

    log10(UINT_MAX) ~= ((sizeof(int) * 8) + 1) * 1233 >> 12

    and, together with adding one to do the equivalent of finding the ceiling value, that gets rid of the need to fiddle with odd values:

    #define __MAX_B10STRLEN_FOR_UNSIGNED_TYPE(t) \
        (((((sizeof(t) * CHAR_BIT)) * 1233) >> 12) + 1)
    
    /*
     * for signed types we need room for the sign, except for int64_t
     */
    #define __MAX_B10STRLEN_FOR_SIGNED_TYPE(t) \
        (__MAX_B10STRLEN_FOR_UNSIGNED_TYPE(t) + ((sizeof(t) == 8) ? 0 : 1))
    
    /*
     * NOTE: this gives a warning (for unsigned types of int and larger) saying
     * "comparison of unsigned expression < 0 is always false", and of course it
     * is, but that's what we want to know (if indeed type 't' is unsigned)!
     */
    #define __MAX_B10STRLEN_FOR_INT_TYPE(t)                     \
        (((t) -1 < 0) ? __MAX_B10STRLEN_FOR_SIGNED_TYPE(t)      \
                      : __MAX_B10STRLEN_FOR_UNSIGNED_TYPE(t))
    

    The compiler will normally evaluate at compile time the expression that __MAX_B10STRLEN_FOR_INT_TYPE() expands to. Of course the macro always calculates the maximum space required by a given type of integer, not the exact space required by a particular integer value.

  • 2021-02-19 10:31

    The maximum number of decimal digits d of a signed or unsigned integer x of b bits matches the number of decimal digits of the number 2^b. For signed numbers, an extra character must be added for the sign.

    The number of decimal digits of x can be calculated as log_10(x), rounded up.

    Therefore, the maximum number of decimal digits of x will be log_10(2^b) = b * log_10(2) = b * 0.301029995663981, rounded up.

    If s is the size in bytes (given by the sizeof operator) of a certain type of integer used to store x, its size b in bits will be b = s * 8. So, the maximum number of decimal digits d will be (s * 8) * 0.301029995663981, rounded up. Rounding up will consist of truncating (converting to an integer), and adding 1.

    Of course, 1 must be added to all these constants to make room for the final 0 byte (see IntegerString in the following example).

    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>
    
    #define COMMON_LOG_OF_2 0.301029995663981
    
    #define MAX_DECIMAL_DIGITS_UCHAR        ((unsigned) (sizeof (unsigned char     ) * 8 * COMMON_LOG_OF_2) + 1)
    #define MAX_DECIMAL_DIGITS_USHORT       ((unsigned) (sizeof (unsigned short    ) * 8 * COMMON_LOG_OF_2) + 1)
    #define MAX_DECIMAL_DIGITS_UINT         ((unsigned) (sizeof (unsigned int      ) * 8 * COMMON_LOG_OF_2) + 1)
    #define MAX_DECIMAL_DIGITS_ULONG        ((unsigned) (sizeof (unsigned long     ) * 8 * COMMON_LOG_OF_2) + 1)
    #define MAX_DECIMAL_DIGITS_ULONGLONG    ((unsigned) (sizeof (unsigned long long) * 8 * COMMON_LOG_OF_2) + 1)
    #define MAX_DECIMAL_DIGITS_UINT128      ((unsigned) (sizeof (unsigned __int128 ) * 8 * COMMON_LOG_OF_2) + 1)
    
    #define MAX_DECIMAL_DIGITS_CHAR         (1 + MAX_DECIMAL_DIGITS_UCHAR    )
    #define MAX_DECIMAL_DIGITS_SHORT        (1 + MAX_DECIMAL_DIGITS_USHORT   )
    #define MAX_DECIMAL_DIGITS_INT          (1 + MAX_DECIMAL_DIGITS_UINT     )
    #define MAX_DECIMAL_DIGITS_LONG         (1 + MAX_DECIMAL_DIGITS_ULONG    )
    #define MAX_DECIMAL_DIGITS_LONGLONG     (1 + MAX_DECIMAL_DIGITS_ULONGLONG)
    #define MAX_DECIMAL_DIGITS_INT128       (1 + MAX_DECIMAL_DIGITS_UINT128  )
    
    int main (void)
    {
        char IntegerString[MAX_DECIMAL_DIGITS_INT + 1];
    
        printf ("MAX_DECIMAL_DIGITS_UCHAR     = %2u\n",MAX_DECIMAL_DIGITS_UCHAR    );
        printf ("MAX_DECIMAL_DIGITS_USHORT    = %2u\n",MAX_DECIMAL_DIGITS_USHORT   );
        printf ("MAX_DECIMAL_DIGITS_UINT      = %2u\n",MAX_DECIMAL_DIGITS_UINT     );
        printf ("MAX_DECIMAL_DIGITS_ULONG     = %2u\n",MAX_DECIMAL_DIGITS_ULONG    );
        printf ("MAX_DECIMAL_DIGITS_ULONGLONG = %2u\n",MAX_DECIMAL_DIGITS_ULONGLONG);
        printf ("MAX_DECIMAL_DIGITS_UINT128   = %2u\n",MAX_DECIMAL_DIGITS_UINT128  );
    
        printf ("MAX_DECIMAL_DIGITS_CHAR      = %2u\n",MAX_DECIMAL_DIGITS_CHAR     );
        printf ("MAX_DECIMAL_DIGITS_SHORT     = %2u\n",MAX_DECIMAL_DIGITS_SHORT    );
        printf ("MAX_DECIMAL_DIGITS_INT       = %2u\n",MAX_DECIMAL_DIGITS_INT      );
        printf ("MAX_DECIMAL_DIGITS_LONG      = %2u\n",MAX_DECIMAL_DIGITS_LONG     );
        printf ("MAX_DECIMAL_DIGITS_LONGLONG  = %2u\n",MAX_DECIMAL_DIGITS_LONGLONG );
        printf ("MAX_DECIMAL_DIGITS_INT128    = %2u\n",MAX_DECIMAL_DIGITS_INT128   );
    
        sprintf (IntegerString,"%d",INT_MAX);
        printf ("INT_MAX       = %d\n",INT_MAX);
        printf ("IntegerString = %s\n",IntegerString);
    
        sprintf (IntegerString,"%d",INT_MIN);
        printf ("INT_MIN       = %d\n",INT_MIN);
        printf ("IntegerString = %s\n",IntegerString);
    
        return EXIT_SUCCESS;
    }
    

    EDIT:

    Unfortunately, the use of floating point may cause problems when the expressions are evaluated as constants. I have modified them to use log10(2) scaled by 2^11 (the integer 616); combined with the factor of 8 bits per byte, the division by 2^11 becomes a right shift by 8, so the compiler performs all calculations on integers:

    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>
    
    #define LOG2_x_2_11 616 // log10(2) * 2^11
    
    #define MAX_DECIMAL_DIGITS_UCHAR        (((sizeof (unsigned char     ) * LOG2_x_2_11) >> 8) + 1)
    #define MAX_DECIMAL_DIGITS_USHORT       (((sizeof (unsigned short    ) * LOG2_x_2_11) >> 8) + 1)
    #define MAX_DECIMAL_DIGITS_UINT         (((sizeof (unsigned int      ) * LOG2_x_2_11) >> 8) + 1)
    #define MAX_DECIMAL_DIGITS_ULONG        (((sizeof (unsigned long     ) * LOG2_x_2_11) >> 8) + 1)
    #define MAX_DECIMAL_DIGITS_ULONGLONG    (((sizeof (unsigned long long) * LOG2_x_2_11) >> 8) + 1)
    #define MAX_DECIMAL_DIGITS_UINT128      (((sizeof (unsigned __int128 ) * LOG2_x_2_11) >> 8) + 1)
    
    #define MAX_DECIMAL_DIGITS_CHAR     (1 + MAX_DECIMAL_DIGITS_UCHAR    )
    #define MAX_DECIMAL_DIGITS_SHORT    (1 + MAX_DECIMAL_DIGITS_USHORT   )
    #define MAX_DECIMAL_DIGITS_INT      (1 + MAX_DECIMAL_DIGITS_UINT     )
    #define MAX_DECIMAL_DIGITS_LONG     (1 + MAX_DECIMAL_DIGITS_ULONG    )
    #define MAX_DECIMAL_DIGITS_LONGLONG (1 + MAX_DECIMAL_DIGITS_ULONGLONG)
    #define MAX_DECIMAL_DIGITS_INT128   (1 + MAX_DECIMAL_DIGITS_UINT128  )
    
    int main (void)
    {
        char IntegerString[MAX_DECIMAL_DIGITS_INT + 1];
    
        printf ("MAX_DECIMAL_DIGITS_UCHAR     = %2zu\n",MAX_DECIMAL_DIGITS_UCHAR    );
        printf ("MAX_DECIMAL_DIGITS_USHORT    = %2zu\n",MAX_DECIMAL_DIGITS_USHORT   );
        printf ("MAX_DECIMAL_DIGITS_UINT      = %2zu\n",MAX_DECIMAL_DIGITS_UINT     );
        printf ("MAX_DECIMAL_DIGITS_ULONG     = %2zu\n",MAX_DECIMAL_DIGITS_ULONG    );
        printf ("MAX_DECIMAL_DIGITS_ULONGLONG = %2zu\n",MAX_DECIMAL_DIGITS_ULONGLONG);
        printf ("MAX_DECIMAL_DIGITS_UINT128   = %2zu\n",MAX_DECIMAL_DIGITS_UINT128  );
    
        printf ("MAX_DECIMAL_DIGITS_CHAR      = %2zu\n",MAX_DECIMAL_DIGITS_CHAR     );
        printf ("MAX_DECIMAL_DIGITS_SHORT     = %2zu\n",MAX_DECIMAL_DIGITS_SHORT    );
        printf ("MAX_DECIMAL_DIGITS_INT       = %2zu\n",MAX_DECIMAL_DIGITS_INT      );
        printf ("MAX_DECIMAL_DIGITS_LONG      = %2zu\n",MAX_DECIMAL_DIGITS_LONG     );
        printf ("MAX_DECIMAL_DIGITS_LONGLONG  = %2zu\n",MAX_DECIMAL_DIGITS_LONGLONG );
        printf ("MAX_DECIMAL_DIGITS_INT128    = %2zu\n",MAX_DECIMAL_DIGITS_INT128   );
    
        sprintf (IntegerString,"%d",INT_MAX);
        printf ("INT_MAX       = %d\n",INT_MAX);
        printf ("IntegerString = %s\n",IntegerString);
    
        sprintf (IntegerString,"%d",INT_MIN);
        printf ("INT_MIN       = %d\n",INT_MIN);
        printf ("IntegerString = %s\n",IntegerString);
    
        return EXIT_SUCCESS;
    }
    
  • 2021-02-19 10:31

    In C++14 and later (std::max is not constexpr until C++14), you can do the following:

    #include <algorithm>     // std::max (constexpr since C++14)
    #include <cstddef>       // size_t
    #include <limits>        // std::numeric_limits
    #include <type_traits>   // std::is_signed

    namespace details {
        template<typename T>
        constexpr size_t max_to_string_length_impl(T value) {
            return (value >= 0 && value < 10) ? 1                            // [0..9] -> 1
                : (std::is_signed<T>::value && value < 0 && value > -10) ? 2 // [-9..-1] -> 2
                : 1 + max_to_string_length_impl(value / 10);                 // ..-10] [10.. -> recursion
        }
    }
    
    template<typename T>
    constexpr size_t max_to_string_length() { 
        return std::max(
            details::max_to_string_length_impl(std::numeric_limits<T>::max()),
            details::max_to_string_length_impl(std::numeric_limits<T>::min())); 
    }
    
  • 2021-02-19 10:35

    I don't know if there is any trick to do what you want in plain ANSI C, but in C++ you can easily use template metaprogramming to do:

    #include    <iostream>
    #include    <limits>
    #include    <climits>
    
    template< typename T, unsigned long N = INT_MAX >
    class   MaxLen
    {
    public:
        enum
        {
            StringLen = MaxLen< T, N / 10 >::StringLen + 1
        };
    };
    
    template< typename T >
    class   MaxLen< T, 0 >
    {
    public:
        enum
        {
            StringLen = 1
        };
    };
    

    And you can call it from your pure-C code creating an additional C++ function like this:

    extern "C"
    int int_str_max( )
    {
        return  MaxLen< int >::StringLen;
    }
    

    This has a ZERO execution time overhead and calculates the exact space needed.


    You can test the above templates with something like:

    int main( )
    {
        std::cout << "Max: " << std::numeric_limits< short >::max( ) << std::endl;
        std::cout << "Digits: " << std::numeric_limits< short >::digits10 << std::endl;
        std::cout << "A \"short\" is " << sizeof( short ) << " bytes." << std::endl
            << "A string large enough to fit any \"short\" is "
            << MaxLen< short, SHRT_MAX >::StringLen << " bytes wide." << std::endl;

        std::cout << "Max: " << std::numeric_limits< int >::max( ) << std::endl;
        std::cout << "Digits: " << std::numeric_limits< int >::digits10 << std::endl;
        std::cout << "An \"int\" is " << sizeof( int ) << " bytes." << std::endl
            << "A string large enough to fit any \"int\" is "
            << MaxLen< int >::StringLen << " bytes wide." << std::endl;

        std::cout << "Max: " << std::numeric_limits< long >::max( ) << std::endl;
        std::cout << "Digits: " << std::numeric_limits< long >::digits10 << std::endl;
        std::cout << "A \"long\" is " << sizeof( long ) << " bytes." << std::endl
            << "A string large enough to fit any \"long\" is "
            << MaxLen< long, LONG_MAX >::StringLen << " bytes wide." << std::endl;

        return 0;
    }
    

    The output is:

    Max: 32767
    Digits: 4
    A "short" is 2 bytes.
    A string large enough to fit any "short" is 6 bytes wide.
    Max: 2147483647
    Digits: 9
    An "int" is 4 bytes.
    A string large enough to fit any "int" is 11 bytes wide.
    Max: 9223372036854775807
    Digits: 18
    A "long" is 8 bytes.
    A string large enough to fit any "long" is 20 bytes wide.
    
    • Note the slightly different values from std::numeric_limits< T >::digits10 and MaxLen< T, N >::StringLen: the former does not count the highest-order digit, since that digit cannot always reach '9'. Of course you can use it and simply add two, if you don't mind wasting a single byte in some cases.

    EDIT:

    Some may find it weird to include <climits>. If you can count on C++11, you won't need it, and you gain additional simplicity:

    #include    <iostream>
    #include    <limits>
    
    template< typename T, unsigned long N = std::numeric_limits< T >::max( ) >
    class   MaxLen
    {
    public:
        enum
        {
            StringLen = MaxLen< T, N / 10 >::StringLen + 1
        };
    };
    
    template< typename T >
    class   MaxLen< T, 0 >
    {
    public:
        enum
        {
            StringLen = 1
        };
    };
    

    Now you can use

    MaxLen< short >::StringLen
    

    instead of

    MaxLen< short, SHRT_MAX >::StringLen
    

    Nice, isn't it?

  • 2021-02-19 10:37

    Here's the C version:

    #include <limits.h>
    
    #define xstr(s) str(s)
    #define str(s) #s
    #define INT_STR_MAX sizeof(xstr(INT_MAX))
    
    char buffer[INT_STR_MAX];
    

    Then:

    $ gcc -E -o str.cpp str.c
    $ grep buffer str.cpp
    char buffer[sizeof("2147483647")];
    
    $ gcc -S -o str.S str.c
    $ grep buffer str.S
        .comm   buffer,11,1
    