How do you convert char numbers to decimal and back or convert ASCII 'A'-'Z'/'a'-'z' to letter offsets 0 for 'A'/'a' …?

无人共我 2020-12-07 05:20

If you have a char that is in the range '0' to '9', how do you convert it to an int value of 0 to 9?

And then how do you convert it back?

Also given letters

4 answers
  • 2020-12-07 05:25

    The basic char encoding specified by C++ makes converting to and from '0' - '9' easy.

    C++ specifies:

    In both the source and execution basic character sets, the value of each character after 0 in the above list of decimal digits shall be one greater than the value of the previous.

    This means that, whatever the integral value of '0', the integral value of '1' is '0' + 1, the integral value of '2' is '0' + 2, and so on. Using this information and the basic rules of arithmetic you can convert from char to int and back easily:

    char c = ...; // some value in the range '0' - '9'
    int int_value = c - '0';
    
    // int_value is in the range 0 - 9
    char c2 = '0' + int_value;
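
    The two conversions above assume the input is already in range. A minimal sketch of range-checked wrappers follows; the names digit_value and digit_char, and the -1 and '\0' failure values, are illustrative choices rather than part of the answer. It relies only on the digit guarantee quoted above, so the static_assert checks hold for any conforming character set (C++17 is assumed for the message-less static_assert).

```cpp
// Hypothetical helper: returns 0-9 for '0'-'9', or -1 for any other character.
constexpr int digit_value(char c) {
    return ('0' <= c && c <= '9') ? c - '0' : -1;
}

// Hypothetical helper: returns '0'-'9' for 0-9, or '\0' for any other value.
constexpr char digit_char(int n) {
    return (0 <= n && n <= 9) ? static_cast<char>('0' + n) : '\0';
}

// Compile-time usage examples.
static_assert(digit_value('7') == 7);
static_assert(digit_char(7) == '7');
static_assert(digit_value('x') == -1);   // not a digit
static_assert(digit_char(10) == '\0');   // out of range
```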
    

    Portably converting the letters 'a' to 'z' to numbers from 0 to 25 is not as easy, because C++ does not specify that the values of these letters are consecutive. In ASCII they are consecutive, and you can write code that relies on that, similar to the code above for '0' - '9'. (These days ASCII, or an ASCII-compatible encoding, is used almost everywhere.)

    Portable code would instead use a lookup table or specific checks for each character:

    char int_to_char[] = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'};
    
    int char_to_int[CHAR_MAX + 1] = {}; // CHAR_MAX is defined in <climits>
    
    for (int i=0; i<sizeof(int_to_char); ++i) {
      char_to_int[int_to_char[i]] = i;
    }
    
    // convert a lowercase char letter to a number in the range 0 - 25:
    int i = char_to_int['d'];
    
    // convert an int in the range 0 - 25 to a char
    char c = int_to_char[25];
    

    In C99 you can initialize the char_to_int[] data directly with designated initializers, without a loop.

    int char_to_int[] = {['a'] = 0, ['b'] = 1, ['c'] = 2, ['d'] = 3, ['e'] = 4, ['f'] = 5, ['g'] = 6, ['h'] = 7, ['i'] = 8, ['j'] = 9, ['k'] = 10, ['l'] = 11, ['m'] = 12, ['n'] = 13, ['o'] = 14, ['p'] = 15, ['q'] = 16, ['r'] = 17, ['s'] = 18, ['t'] = 19, ['u'] = 20, ['v'] = 21, ['w'] = 22, ['x'] = 23, ['y'] = 24, ['z'] = 25};
    

    Array designators are not part of standard C++ (C++20 designated initializers cover only class members), but C++ compilers that also support C99 may accept this in C++ as an extension.
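
    Where the extension is not available, the same table can be built at compile time in standard C++. The following is a minimal sketch, assuming C++17; kLetters, make_char_to_int and alpha_to_int are hypothetical names, and -1 is used here (instead of 0) to mark characters that are not lowercase letters. It does not depend on the letters being consecutive.

```cpp
#include <array>
#include <climits>

// The 26 lowercase letters in order; the index of each one is its offset.
constexpr char kLetters[] = "abcdefghijklmnopqrstuvwxyz";

// Builds the reverse table once, at compile time (requires C++17).
constexpr std::array<int, UCHAR_MAX + 1> make_char_to_int() {
    std::array<int, UCHAR_MAX + 1> table{};
    for (auto& entry : table) entry = -1;   // -1 marks "not a lowercase letter"
    for (int i = 0; kLetters[i] != '\0'; ++i) {
        table[static_cast<unsigned char>(kLetters[i])] = i;
    }
    return table;
}

constexpr auto char_to_int = make_char_to_int();

// Returns 0-25 for 'a'-'z', or -1 for anything else.
constexpr int alpha_to_int(char c) {
    return char_to_int[static_cast<unsigned char>(c)];
}

static_assert(alpha_to_int('a') == 0);
static_assert(alpha_to_int('z') == 25);
static_assert(alpha_to_int('A') == -1);
```

    Indexing through unsigned char keeps the lookup in bounds even when plain char is signed.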


    Here's a complete program that generates random values to use in these conversions. It uses C++, plus the C99 designated initialization extension.

    #include <cassert>
    
    int digit_char_to_int(char c) {
      assert('0' <= c && c <= '9');
      return c - '0';
    }
    
    char int_to_digit_char(int i) {
      assert(0 <= i && i <= 9);
      return '0' + i;
    }
    
    int alpha_char_to_int(char c) {
      static constexpr int char_to_int[] = {['a'] = 0, ['b'] = 1, ['c'] = 2, ['d'] = 3, ['e'] = 4, ['f'] = 5, ['g'] = 6, ['h'] = 7, ['i'] = 8, ['j'] = 9, ['k'] = 10, ['l'] = 11, ['m'] = 12, ['n'] = 13, ['o'] = 14, ['p'] = 15, ['q'] = 16, ['r'] = 17, ['s'] = 18, ['t'] = 19, ['u'] = 20, ['v'] = 21, ['w'] = 22, ['x'] = 23, ['y'] = 24, ['z'] = 25};
    
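      // valid indices run from 0 to size - 1, so the upper bound must be strict
      assert(0 <= c && static_cast<unsigned char>(c) < sizeof(char_to_int)/sizeof(*char_to_int));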
      int i = char_to_int[c];
      assert(i != 0 || c == 'a');
      return i;
    }
    
    char int_to_alpha_char(int i) {
      static constexpr char int_to_char[] = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'};
    
      assert(0 <= i && i <= 25);
      return int_to_char[i];
    }
    
    #include <random>
    #include <iostream>
    
    int main() {
      std::random_device r;
      std::seed_seq seed{r(), r(), r(), r(), r(), r(), r(), r()};
      std::mt19937 m(seed);
    
      std::uniform_int_distribution<int> digits{0, 9};
      std::uniform_int_distribution<int> letters{0, 25};
    
      for (int i=0; i<20; ++i) {
        int a = digits(m);
        char b = int_to_digit_char(a);
        int c = digit_char_to_int(b);
    
        std::cout << a << " -> '" << b << "' -> " << c << '\n';
      }
    
      for (int i=0; i<20; ++i) {
        int a = letters(m);
        char b = int_to_alpha_char(a);
        int c = alpha_char_to_int(b);
    
        std::cout << a << " -> '" << b << "' -> " << c << '\n';
      }
    
    }
    
  • 2020-12-07 05:35

    There are two main ways to do this conversion: lookup tables and arithmetic.

    All ASCII values in this answer are given in decimal notation.

    Note that in ASCII: '0' is 48, 'A' is 65, and 'a' is 97

    Lookup:

    In the lookup version you keep an array of char that maps each int to its character, and a second array of int, indexed by character, that maps back.

    To both validate the input and get the corresponding value when mapping char to int:

    • 0 is a sentinel value meaning "not mapped" (out of range)
    • every stored result is therefore one more than the real value

    unsigned char is used so that a char with a negative signed value still indexes the table correctly.

    While C allows the notation { ['A'] = 1, ['B'] = 2,… }; standard C++ does not, so the following generic code can be used to fill the lookup tables:

    void fill_lookups(unsigned char * from_table, int from_size, int * to_table)
    {
         for (int i = 0; i < from_size; ++i)
         {
             to_table[from_table[i]]=i+1; // add one to support 0 as "out of range"
         }
    }
    
    unsigned char int_to_char[]={ '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
    unsigned char int_to_lower[]={'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
                         'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't',
                         'u', 'v', 'w', 'x', 'y', 'z'};
    unsigned char int_to_upper[]={'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J',
                         'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T',
                         'U', 'V', 'W', 'X', 'Y', 'Z'};
    
    // UCHAR_MAX is defined in <limits.h> (<climits> in C++)
    int char_to_int[UCHAR_MAX+2] = {};       // This will return 0 for non digits
    int letter_to_offset[UCHAR_MAX+2] = {};  // This will return 0 for non alpha
    
    fill_lookups(int_to_char, sizeof(int_to_char), char_to_int);
    fill_lookups(int_to_lower, sizeof(int_to_lower), letter_to_offset);
    fill_lookups(int_to_upper, sizeof(int_to_upper), letter_to_offset);
    
    // Helper function to check in range and always reduce in range lookups by 1
    int to_int(int * table, unsigned char c, bool * in_range)
    {
       int ret = table[c];
       if (ret)
       {
           *in_range=(1==1); // for C/C++ true
           --ret;
       }
       else
       {
           *in_range=(0==1); // for C/C++ false
       }
    
       return ret;
    }
    
    bool in_range;  // always true in these cases
    int a=to_int(char_to_int, '7', &in_range); // a is now 7
    char b=int_to_char[7]; // b is now '7'    
    int c=to_int(letter_to_offset, 'C', &in_range); // c=2
    int d=to_int(letter_to_offset, 'c', &in_range); // d=2
    char e=int_to_upper[2]; // e='C'
    char f=int_to_lower[2]; // f='c'
    

    This works, and it may make sense if validation or other lookups are needed, but in general a better way to do this is with arithmetic.

    Arithmetic (the letter conversions assume ASCII)

    The examples below assume the values have already been validated to be in the correct range. (C-style casts are used so the code works in both C and C++.)

    Note that '0'-'9' are guaranteed to be consecutive in C and C++.

    In ASCII, 'A'-'Z' and 'a'-'z' are not only consecutive, but 'A' % 32 and 'a' % 32 are both 1.

    int a='7'-'0';         // a is now 7 in ASCII: 55-48=7
    
    char b=(char)7+'0';    // b is now '7' in ASCII: 7 + 48
    
    int c='C' % 32 - 1;    // c is now 2 in ASCII : 67 % 32 = 3, and 3 - 1 = 2
    

    -or- where we know it is uppercase

    int c='C'-'A';         // c is now 2 in ASCII : 67 - 65 = 2
    
    
    int d='c' % 32 - 1;    // d is now 2 in ASCII : 99 % 32 = 3, and 3 - 1 = 2
    

    -or- where we know it is lowercase

    int d='c'-'a';         // d is now 2 in ASCII : 99 - 97 = 2
    
    char e=(char)2 + 'A';  // e is 'C' in ASCII : 65 + 2 = 67
    char f=(char)2 + 'a';  // f is 'c' in ASCII : 97 + 2 = 99
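
    The % 32 trick handles both cases with one expression, but only for characters that really are letters. A minimal sketch that adds the validation follows, assuming ASCII and C++17; letter_offset and offset_to_letter are hypothetical names, and -1 and '\0' are illustrative failure values.

```cpp
// ASCII assumed: returns 0-25 for a letter of either case, or -1 otherwise.
constexpr int letter_offset(char c) {
    return (('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z')) ? c % 32 - 1 : -1;
}

// ASCII assumed: returns the letter for an offset of 0-25, or '\0' otherwise.
constexpr char offset_to_letter(int n, bool upper) {
    return (0 <= n && n <= 25) ? static_cast<char>((upper ? 'A' : 'a') + n) : '\0';
}

// These checks hold on ASCII platforms.
static_assert(letter_offset('C') == 2);    // 67 % 32 = 3, and 3 - 1 = 2
static_assert(letter_offset('c') == 2);    // 99 % 32 = 3, and 3 - 1 = 2
static_assert(letter_offset('7') == -1);   // not a letter
static_assert(offset_to_letter(2, true) == 'C');
static_assert(offset_to_letter(2, false) == 'c');
```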
    
  • 2020-12-07 05:38

    If I understand correctly, you want to do this:

    #include <ctype.h>    /* for toupper */
    
    int digit_from_char(char c) {
        return c - '0';
    }
    
    char char_from_digit(int d) {
        return d + '0';
    }
    
    int letter_from_char(char c) {
        /* the cast avoids undefined behavior when c has a negative value */
        return toupper((unsigned char)c) - 'A';
    }
    
    char char_from_letter(int l) {
        return l + 'A';
    }
    
  • 2020-12-07 05:47

    If you know a character c is either a letter or a digit, you can just do:

    int cton( char c )
    {
      if( 'a' <= c ) return c-'a';
      if( 'A' <= c ) return c-'A';
      return c-'0';
    }
    

    Add whatever error checking on c is needed.
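
    One possible form of that error checking is sketched below, assuming ASCII and C++14 or later (C++17 for the message-less static_assert); cton_checked is a hypothetical name and -1 is an illustrative failure value. Each range gets an upper bound as well, so characters that fall between or after the ranges are rejected instead of being mapped to a number.

```cpp
// ASCII assumed: maps 'a'-'z' and 'A'-'Z' to 0-25, '0'-'9' to 0-9, anything else to -1.
constexpr int cton_checked(char c) {
    if ('a' <= c && c <= 'z') return c - 'a';
    if ('A' <= c && c <= 'Z') return c - 'A';
    if ('0' <= c && c <= '9') return c - '0';
    return -1;
}

// These checks hold on ASCII platforms.
static_assert(cton_checked('k') == 10);
static_assert(cton_checked('K') == 10);
static_assert(cton_checked('5') == 5);
static_assert(cton_checked('{') == -1);   // the unchecked version would return 26 here
```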

    To convert an integer n back to a char, just do '0'+n if you want a digit, 'A'+n if you want an uppercase letter, and 'a'+n if you want lowercase.

    Note: This works for ASCII (which the question is tagged with). See Pete's informative comment, however.
