Converting a hex string to a byte array

时光取名叫无心 2020-11-22 12:10

What is the best way to convert a variable-length hex string, e.g. "01A1", to a byte array containing that data?

i.e., converting this:

st         


        
19 answers
  • 2020-11-22 13:07

    I would use a standard function like sscanf to read the string into an unsigned integer; the bytes you need are then already in memory. On a big-endian machine you could simply memcpy the integer's memory out, starting from the first non-zero byte. You can't safely assume big-endian byte order in general, though, so use bit masking and shifting to extract the bytes instead. Note that this only handles strings of up to eight hex digits (assuming a 32-bit unsigned int) and drops leading zero bytes.

    const char* src = "01A1";
    char hexArray[256] = {0};
    int hexLength = 0;
    
    // read in the string
    unsigned int hex = 0;
    sscanf(src, "%x", &hex);
    
    // write it out
    for (unsigned int mask = 0xff000000, bitPos=24; mask; mask>>=8, bitPos-=8) {
        unsigned int currByte = hex & mask;
        if (currByte || hexLength) {
            hexArray[hexLength++] = currByte>>bitPos;
        }
    }
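
    The mask-and-shift step above could be wrapped in a reusable function. This is only a minimal sketch: the name hexToBytesViaSscanf and the std::vector return type are my own choices, not part of the answer, and it assumes the value fits in 32 bits (at most eight hex digits). Like the original, it skips leading zero bytes.

```cpp
#include <cstdio>
#include <vector>

// Hypothetical helper (name is an assumption, not from the answer above).
// Parses at most eight hex digits with sscanf and returns the bytes of the
// value, most significant first, skipping leading zero bytes.
std::vector<unsigned char> hexToBytesViaSscanf(const char* src) {
    std::vector<unsigned char> bytes;
    unsigned int value = 0;
    if (std::sscanf(src, "%x", &value) != 1) {
        return bytes;  // nothing parsed: return an empty vector
    }
    // Walk from the most significant byte of the low 32 bits down to the least.
    for (int shift = 24; shift >= 0; shift -= 8) {
        const unsigned char b =
            static_cast<unsigned char>((value >> shift) & 0xffu);
        // Drop zero bytes until the first non-zero byte has been stored.
        if (b != 0 || !bytes.empty()) {
            bytes.push_back(b);
        }
    }
    return bytes;
}
```

    Calling hexToBytesViaSscanf("01A1") should give the two bytes 0x01 and 0xA1, independent of the machine's byte order.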
    
  • 2020-11-22 13:07

    I found this question, but the accepted answer didn't look like a C++ way of solving the task to me (that doesn't make it a bad answer; I'm just explaining the motivation for adding this one). I recalled this nice answer and decided to implement something similar. Here is the complete code of what I ended up with (it also works for std::wstring):

    #include <cctype>
    #include <cstdlib>
    
    #include <algorithm>
    #include <iostream>
    #include <iterator>
    #include <ostream>
    #include <stdexcept>
    #include <string>
    #include <vector>
    
    template <typename OutputIt>
    class hex_ostream_iterator :
        public std::iterator<std::output_iterator_tag, void, void, void, void>
    {
        OutputIt out;
        int digitCount;
        int number;
    
    public:
        hex_ostream_iterator(OutputIt out) : out(out), digitCount(0), number(0)
        {
        }
    
        hex_ostream_iterator<OutputIt> &
        operator=(char c)
        {
            number = (number << 4) | char2int(c);
            digitCount++;
    
            if (digitCount == 2) {
                digitCount = 0;
                *out++ = number;
                number = 0;
            }
            return *this;
        }
    
        hex_ostream_iterator<OutputIt> &
        operator*()
        {
            return *this;
        }
    
        hex_ostream_iterator<OutputIt> &
        operator++()
        {
            return *this;
        }
    
        hex_ostream_iterator<OutputIt> &
        operator++(int)
        {
            return *this;
        }
    
    private:
        int
        char2int(char c)
        {
            static const std::string HEX_CHARS = "0123456789abcdef";
    
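            // cast first: passing a negative char to std::tolower is undefined
            const char lowerC = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));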
            const std::string::size_type pos = HEX_CHARS.find_first_of(lowerC);
            if (pos == std::string::npos) {
                throw std::runtime_error(std::string("Not a hex digit: ") + c);
            }
            return pos;
        }
    };
    
    template <typename OutputIt>
    hex_ostream_iterator<OutputIt>
    hex_iterator(OutputIt out)
    {
        return hex_ostream_iterator<OutputIt>(out);
    }
    
    template <typename InputIt, typename OutputIt>
    hex_ostream_iterator<OutputIt>
    from_hex_string(InputIt first, InputIt last, OutputIt out)
    {
        if (std::distance(first, last) % 2 == 1) {
            *out = '0';
            ++out;
        }
        return std::copy(first, last, out);
    }
    
    int
    main(int argc, char *argv[])
    {
        if (argc != 2) {
            std::cout << "Usage: " << argv[0] << " hexstring" << std::endl;
            return EXIT_FAILURE;
        }
    
        const std::string input = argv[1];
        std::vector<unsigned char> bytes;
        from_hex_string(input.begin(), input.end(),
                        hex_iterator(std::back_inserter(bytes)));
    
        typedef std::ostream_iterator<unsigned char> osit;
        std::copy(bytes.begin(), bytes.end(), osit(std::cout));
    
        return EXIT_SUCCESS;
    }
    

    And the output of ./hex2bytes 61a062a063 | hexdump -C:

    00000000  61 a0 62 a0 63                                    |a.b.c|
    00000005
    

    And of ./hex2bytes 6a062a063 | hexdump -C (note odd number of characters):

    00000000  06 a0 62 a0 63                                    |..b.c|
    00000005
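
    One caveat with the code above: std::iterator was deprecated in C++17, so newer compilers may warn about the base class. A possible variant declares the iterator member types directly. This is just a sketch of the same pair-collecting idea, not the original code: the names hex_output_iterator and decode_hex are made up for illustration, and the digit lookup uses plain range comparisons, which assumes the letters a-f are contiguous in the character set (true for ASCII).

```cpp
#include <algorithm>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Sketch of the pair-collecting output iterator without std::iterator.
template <typename OutputIt>
class hex_output_iterator {
public:
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = void;
    using pointer = void;
    using reference = void;

    explicit hex_output_iterator(OutputIt out) : out_(out) {}

    // Each assigned character is one hex digit; every second digit
    // completes a byte, which is forwarded to the wrapped iterator.
    hex_output_iterator& operator=(char c) {
        number_ = (number_ << 4) | digit_value(c);
        if (++digit_count_ == 2) {
            *out_++ = static_cast<unsigned char>(number_);
            digit_count_ = 0;
            number_ = 0;
        }
        return *this;
    }

    hex_output_iterator& operator*() { return *this; }
    hex_output_iterator& operator++() { return *this; }
    hex_output_iterator& operator++(int) { return *this; }

private:
    static int digit_value(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        throw std::runtime_error(std::string("Not a hex digit: ") + c);
    }

    OutputIt out_;
    int digit_count_ = 0;
    int number_ = 0;
};

// Hypothetical convenience wrapper: decodes a whole string, padding an
// odd-length input with a leading '0' as from_hex_string does above.
std::vector<unsigned char> decode_hex(const std::string& input) {
    std::vector<unsigned char> bytes;
    hex_output_iterator<std::back_insert_iterator<std::vector<unsigned char>>>
        out(std::back_inserter(bytes));
    if (input.size() % 2 == 1) {
        *out = '0';
        ++out;
    }
    std::copy(input.begin(), input.end(), out);
    return bytes;
}
```

    With this sketch, decode_hex("6a062a063") should produce the same five bytes as the odd-length hexdump above: 06 a0 62 a0 63.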
    
  • 2020-11-22 13:08

    The difficulty with a hex-to-char conversion is that hex digits work in pairs, for example 3132 or A0FF, so an even number of hex digits is usually assumed. However, an odd number of digits can be perfectly valid, as in 332 and AFF, which should be read as 0332 and 0AFF.

    I propose an improvement to Niels Keurentjes's hex2bin() function. First we count the number of valid hex digits. Since we have to count anyway, let's also check the buffer size:

    void hex2bin(const char* src, char* target, size_t size_target)
    {
        size_t countdgts = 0;    // count the leading hex digits
        for (const char *p = src; isxdigit((unsigned char)*p); p++)
            countdgts++;
        // one byte per digit pair, plus one for the terminating null
        if ((countdgts + 1) / 2 + 1 > size_target)
            throw std::length_error("Risk of buffer overflow");
    

    By the way, to use isxdigit() you'll have to #include <cctype>, and std::length_error needs <stdexcept>.
    Once we know how many digits there are, we can determine whether the first one is a high digit (even count, only full pairs) or a low digit (odd count, so the first digit has no partner).

    bool ishi = !(countdgts%2);         
    

    Then we can loop digit by digit, combining each pair with a left shift (<<) and a bitwise OR, and toggling the 'high' indicator at each iteration:

        for (*target=0; isxdigit((unsigned char)*src); ishi = !ishi)  {
            char tmp = char2int(*src++);    // hex digit in the 4 lower bits
            if (ishi)
                *target = (tmp << 4);   // high: shift by 4
            else *target++ |= tmp;      // low: complete the current byte
        }
        *target=0;    // null-terminate the target (if desired)
    }
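
    Put together, the fragments above might look like the following. Treat it as a sketch rather than the definitive version: char2int() is not shown in this answer, so the one below is a hypothetical stand-in, and returning the number of bytes written and throwing std::length_error are my own assumptions.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for the char2int() helper the answer relies on.
// Assumes c has already been checked with isxdigit().
static int char2int(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return c - 'A' + 10;
}

// Converts the leading run of hex digits in src into bytes in target.
// An odd digit count is treated as if a leading '0' were present.
// Returns the number of bytes written, not counting the trailing null.
std::size_t hex2bin(const char* src, char* target, std::size_t size_target) {
    std::size_t countdgts = 0;  // number of leading hex digits
    for (const char* p = src; std::isxdigit(static_cast<unsigned char>(*p)); ++p) {
        ++countdgts;
    }
    // One byte per digit pair, plus one for the terminating null.
    if ((countdgts + 1) / 2 + 1 > size_target) {
        throw std::length_error("Risk of buffer overflow");
    }

    bool ishi = (countdgts % 2 == 0);  // even count: first digit is a high digit
    char* out = target;
    *out = 0;
    for (std::size_t i = 0; i < countdgts; ++i, ishi = !ishi) {
        const int tmp = char2int(src[i]);
        if (ishi) {
            *out = static_cast<char>(tmp << 4);  // high digit: start a new byte
        } else {
            *out++ |= static_cast<char>(tmp);    // low digit: complete the byte
        }
    }
    *out = 0;  // null-terminate the target
    return static_cast<std::size_t>(out - target);
}
```

    For "332" this should write the two bytes 0x03 and 0x32, and for "A0FF" the bytes 0xA0 and 0xFF.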
    
  • 2020-11-22 13:09

    You said "variable length." Just how variable do you mean?

    For hex strings that fit into an unsigned long, I have always liked the C function strtoul. To make it parse hex, pass 16 as the radix value.

    Code might look like:

    #include <cstdlib>
    #include <string>
    
    std::string str = "01a1";
    unsigned long val = std::strtoul(str.c_str(), nullptr, 16);  // val == 0x01a1
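
    To get from the unsigned long to an actual byte array, something like the following could work. It is a rough sketch with a made-up name (hexToBytesViaStrtoul). Since strtoul on its own also accepts leading whitespace, a sign and a 0x prefix, the sketch checks every character with isxdigit first, and it takes the byte count from the number of digits so that leading zeros are kept.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper (name is an assumption). Converts a hex string that
// fits in an unsigned long into bytes, most significant first.
// Returns false if the string is empty, contains a non-hex character,
// or has more digits than an unsigned long can hold.
bool hexToBytesViaStrtoul(const std::string& str,
                          std::vector<unsigned char>& out) {
    const std::size_t maxDigits = sizeof(unsigned long) * 2;
    if (str.empty() || str.size() > maxDigits) {
        return false;
    }
    for (char c : str) {
        if (!std::isxdigit(static_cast<unsigned char>(c))) {
            return false;
        }
    }

    const unsigned long val = std::strtoul(str.c_str(), nullptr, 16);

    // One byte per pair of digits; an odd digit count rounds up.
    const std::size_t nBytes = (str.size() + 1) / 2;
    out.clear();
    for (std::size_t i = nBytes; i-- > 0;) {
        out.push_back(static_cast<unsigned char>((val >> (8 * i)) & 0xffu));
    }
    return true;
}
```

    For "01a1" this should fill the vector with 0x01 and 0xa1. Anything longer than an unsigned long needs one of the digit-by-digit approaches from the other answers.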
    
  • 2020-11-22 13:09

    Somebody mentioned using sscanf to do this, but didn't say how. This is how. It's useful because it also works in old versions of C and C++, and even in most embedded C or C++ toolchains for microcontrollers, provided the library supports the hh length modifier (C99 or later).

    When converted to bytes, the hex-string in this example resolves to the ASCII text "Hello there!" which is then printed.

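    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        const char hexdata[] = "48656c6c6f20746865726521";
        unsigned char bytedata[20] = {0};   /* zero-filled, so the result stays null-terminated */
        size_t nbytes = strlen(hexdata) / 2;
        for (size_t j = 0; j < nbytes && j < sizeof(bytedata) - 1; j++) {
            /* %2hhx reads exactly two hex digits into one unsigned char */
            sscanf(hexdata + j * 2, "%2hhx", bytedata + j);
        }
        printf("%s -> %s\n", hexdata, (const char *)bytedata);
        return 0;
    }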
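
    The loop above trusts its input. Note that sscanf is fairly forgiving: with a two-character field width it will happily convert "4g" as 0x04, and it skips leading whitespace. If the hex string comes from outside, a checked variant might look like this sketch (hexToBytesChecked is a hypothetical name, and the -1 error return is my own convention):

```cpp
#include <cctype>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper (name and error convention are assumptions).
// Decodes hex into out using sscanf, one byte per two digits.
// Returns the number of bytes written, or -1 if the input has an odd
// length, contains a non-hex character, or does not fit in the buffer.
long hexToBytesChecked(const char* hex, unsigned char* out, std::size_t outSize) {
    const std::size_t len = std::strlen(hex);
    if (len % 2 != 0 || len / 2 > outSize) {
        return -1;
    }
    for (std::size_t j = 0; j < len / 2; ++j) {
        const unsigned char hi = static_cast<unsigned char>(hex[2 * j]);
        const unsigned char lo = static_cast<unsigned char>(hex[2 * j + 1]);
        // Check both digits up front, because sscanf alone would accept
        // input such as "4g" or " 4".
        if (!std::isxdigit(hi) || !std::isxdigit(lo)) {
            return -1;
        }
        if (std::sscanf(hex + 2 * j, "%2hhx", &out[j]) != 1) {
            return -1;
        }
    }
    return static_cast<long>(len / 2);
}
```

    Decoding "48656c6c6f" into a 16-byte buffer should return 5 and leave the bytes of "Hello" at the start of the buffer. Unlike the example above, this variant does not null-terminate the output.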
    
  • 2020-11-22 13:10
    #include <iostream>
    
    using byte = unsigned char;
    
    static int charToInt(char c) {
        if (c >= '0' && c <= '9') {
            return c - '0';
        }
        if (c >= 'A' && c <= 'F') {
            return c - 'A' + 10;
        }
        if (c >= 'a' && c <= 'f') {
            return c - 'a' + 10;
        }
        return -1;
    }
    
    // Decodes specified HEX string to bytes array. Specified nBytes is length of bytes
    // array. Returns -1 if fails to decode any of bytes. Returns number of bytes decoded
    // on success. Maximum number of bytes decoded will be equal to nBytes. It is assumed
    // that specified string is '\0' terminated.
    int hexStringToBytes(const char* str, byte* bytes, int nBytes) {
        int nDecoded {0};
        for (int i {0}; str[i] != '\0' && nDecoded < nBytes; i += 2, nDecoded += 1) {
            if (str[i + 1] != '\0') {
                int m {charToInt(str[i])};
                int n {charToInt(str[i + 1])};
                if (m != -1 && n != -1) {
                    bytes[nDecoded] = (m << 4) | n;
                } else {
                    return -1;
                }
            } else {
                return -1;
            }
        }
        return nDecoded;
    }
    
    int main(int argc, char* argv[]) {
        if (argc < 2) {
            return 1;
        }
    
        byte bytes[0x100];
        int ret {hexStringToBytes(argv[1], bytes, 0x100)};
        if (ret < 0) {
            return 1;
        }
        std::cout << "number of bytes: " << ret << "\n" << std::hex;
        for (int i {0}; i < ret; ++i) {
            if (bytes[i] < 0x10) {
                std::cout << "0";
            }
            std::cout << (bytes[i] & 0xff);
        }
        std::cout << "\n";
    
        return 0;
    }
    