What's the fastest way to convert hex to integer in C++?

后端 未结 6 593
甜味超标
甜味超标 2021-02-01 08:52

I\'m trying to convert a hex char to integer as fast as possible.

This is only one line: int x = atoi(hex.c_str);

Is there a faster way

6条回答
  •  野的像风
    2021-02-01 09:29

    Proposed Solutions that Render Faster than the OP's if-else:

    • Unordered Map Lookup Table

    Provided that your input strings are always hex numbers you could define a lookup table as an unordered_map:

    std::unordered_map table {
    {'0', 0}, {'1', 1}, {'2', 2},
    {'3', 3}, {'4', 4}, {'5', 5},
    {'6', 6}, {'7', 7}, {'8', 8},
    {'9', 9}, {'a', 10}, {'A', 10},
    {'b', 11}, {'B', 11}, {'c', 12},
    {'C', 12}, {'d', 13}, {'D', 13},
    {'e', 14}, {'E', 14}, {'f', 15},
    {'F', 15}, {'x', 0}, {'X', 0}};
    
    int hextoint(char number) {
      return table[(std::size_t)number];
    }
    
    • Lookup Table as user constexpr literal (C++14)

    Or if you want something more faster instead of an unordered_map you could use the new C++14 facilities with user literal types and define your table as a literal type at compile time:

    struct Table {
      long long tab[128];
      constexpr Table() : tab {} {
        tab['1'] = 1;
        tab['2'] = 2;
        tab['3'] = 3;
        tab['4'] = 4;
        tab['5'] = 5;
        tab['6'] = 6;
        tab['7'] = 7;
        tab['8'] = 8;
        tab['9'] = 9;
        tab['a'] = 10;
        tab['A'] = 10;
        tab['b'] = 11;
        tab['B'] = 11;
        tab['c'] = 12;
        tab['C'] = 12;
        tab['d'] = 13;
        tab['D'] = 13;
        tab['e'] = 14;
        tab['E'] = 14;
        tab['f'] = 15;
        tab['F'] = 15;
      }
      constexpr long long operator[](char const idx) const { return tab[(std::size_t) idx]; } 
    } constexpr table;
    
    constexpr int hextoint(char number) {
      return table[(std::size_t)number];
    }
    

    Live Demo

    Benchmarks:

    I ran benchmarks with the code written by Nikos Athanasiou that was posted recently on isocpp.org as a proposed method for C++ micro-benchmarking.

    The algorithms that were compared are:

    1. OP's original if-else:

    long long hextoint3(char number) {
      if(number == '0') return 0;
      if(number == '1') return 1;
      if(number == '2') return 2;
      if(number == '3') return 3;
      if(number == '4') return 4;
      if(number == '5') return 5;
      if(number == '6') return 6;
      if(number == '7') return 7;
      if(number == '8') return 8;
      if(number == '9') return 9;
      if(number == 'a' || number == 'A') return 10;
      if(number == 'b' || number == 'B') return 11;
      if(number == 'c' || number == 'C') return 12;
      if(number == 'd' || number == 'D') return 13;
      if(number == 'e' || number == 'E') return 14;
      if(number == 'f' || number == 'F') return 15;
      return 0;
    }
    

    2. Compact if-else, proposed by Christophe:

    long long hextoint(char number) {
      if (number >= '0' && number <= '9') return number - '0';
      else if (number >= 'a' && number <= 'f') return number - 'a' + 0x0a;
      else if (number >= 'A' && number <= 'F') return number - 'A' + 0X0a;
      else return 0;
    }
    

    3. Corrected ternary operator version that handles also capital letter inputs, proposed by g24l:

    long long hextoint(char in) {
      int const x = in;
      return (x <= 57)? x - 48 : (x <= 70)? (x - 65) + 0x0a : (x - 97) + 0x0a;
    }
    

    4. Lookup Table (unordered_map):

    long long hextoint(char number) {
      return table[(std::size_t)number];
    }
    

    where table is the unordered map shown previously.

    5. Lookup Table (user constexpr literal):

    long long hextoint(char number) {
      return table[(std::size_t)number];
    }
    

    Where table is user defined literal as shown above.

    Experimental Settings

    I defined a function that transforms an input hex string to an integer:

    long long hexstrtoint(std::string const &str, long long(*f)(char)) {
      long long ret = 0;
      for(int j(1), i(str.size() - 1); i >= 0; --i, j *= 16) {
        ret += (j * f(str[i]));
      }
      return ret;
    }
    

    I also defined a function that populates a vector of strings with random hex strings:

    std::vector
    populate_vec(int const N) {
      random_device rd;
      mt19937 eng{ rd() };
      uniform_int_distribution distr(0, std::numeric_limits::max() - 1);
      std::vector out(N);
      for(int i(0); i < N; ++i) {
        out[i] = int_to_hex(distr(eng));
      }
      return out;
    }
    

    I created vectors populated with 50000, 100000, 150000, 200000 and 250000 random hex strings respectively. Then for each algorithm I run 100 experiments and averaged the time results.

    Compiler was GCC version 5.2 with optimization option -O3.

    Results:

    Discussion

    From the results we can conclude that for these experimental settings the proposed table method out-performs all the other methods. The if-else method is by far the worst where as the unordered_map although it wins the if-else method it is significantly slower than the other proposed methods.

    CODE

    Edit:

    Results for method proposed by stgatilov, with bitwise operations:

    long long hextoint(char x) {
        int b = uint8_t(x);
        int maskLetter = (('9' - b) >> 31);
        int maskSmall = (('Z' - b) >> 31);
        int offset = '0' + (maskLetter & int('A' - '0' - 10)) + (maskSmall & int('a' - 'A'));
        return b - offset;
    }
    

    Edit:

    I also tested the original code from g24l against the table method:

    long long hextoint(char in) {
      long long const x = in;
      return x < 58? x - 48 : x - 87;
    }
    

    Note that this method doesn't handle capital letters A, B, C, D, E and F.

    Results:

    Still the table method renders faster.

提交回复
热议问题