Check if a string is hexadecimal

说谎 2020-12-13 06:14

I know the easiest way is using a regular expression, but I wonder if there are other ways to do this check.

Why do I need this? I am writing a Python script that re

11 Answers
  • 2020-12-13 06:36

    I know the OP mentioned regular expressions, but I wanted to contribute such a solution for completeness' sake:

    import re

    def is_hex(s):
        # "+" so that strings of any length match, not just a single character
        return re.fullmatch(r"[0-9a-fA-F]+", s or "") is not None
    

    Performance

    In order to evaluate the performance of the different solutions proposed here, I used Python's timeit module. The input strings are generated randomly for three different lengths, 10, 100, 1000:

    import random
    s = ''.join(random.choice('0123456789abcdef') for _ in range(10))
    

    Levon's solutions:

    # int(s, 16)
      10: 0.257451018987922
     100: 0.40081690801889636
    1000: 1.8926858339982573
    
    # all(_ in string.hexdigits for _ in s)
      10:  1.2884491360164247
     100: 10.047717947978526
    1000: 94.35805322701344
    

    Other answers are variations of these two. Using a regular expression:

    # re.fullmatch(r'^[0-9a-fA-F]$', s or '')
      10: 0.725040541990893
     100: 0.7184272820013575
    1000: 0.7190397029917222
    

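    Picking the right solution thus depends on the length of the input string and whether exceptions can be handled safely. In these measurements the regular expression is much faster on large strings (and it never raises an exception), while int() is the winner for shorter strings. One caveat: the pattern as timed, ^[0-9a-fA-F]$, can only match a one-character string, so on longer inputs it fails right after the first character, which is why its timings do not grow with the input length.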
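
    The measurements above could be reproduced with a small timeit harness along these lines. This is only a sketch: the helper names by_int, by_all, by_regex and benchmark are made up for illustration, the number of repetitions is an arbitrary choice, and the regular expression uses "+" so that it scans the whole string.

```python
import random
import re
import string
import timeit


def by_int(s):
    # Let int() parse the string as base 16.
    try:
        int(s, 16)
        return True
    except ValueError:
        return False


def by_all(s):
    # Check every character against string.hexdigits.
    return all(c in string.hexdigits for c in s)


def by_regex(s):
    # Pattern passed as a string on every call, as in the answer.
    return re.fullmatch(r"[0-9a-fA-F]+", s) is not None


def benchmark(lengths=(10, 100, 1000), number=1000):
    """Return {(function name, length): seconds} for each checker."""
    results = {}
    for n in lengths:
        s = "".join(random.choice("0123456789abcdef") for _ in range(n))
        for fn in (by_int, by_all, by_regex):
            results[(fn.__name__, n)] = timeit.timeit(lambda: fn(s), number=number)
    return results


for (name, n), seconds in benchmark().items():
    print(f"{name:>8} {n:>5}: {seconds:.6f}")
```

    Absolute numbers will differ from machine to machine, so only the relative ordering is worth comparing.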

  • 2020-12-13 06:36

    If you want a plain True or False result in Python, I would use eumero's is_hex method rather than Levon's method one. The following code contains a gotcha:

    if int(input_string, 16):  # gotcha: tests the parsed value, not whether parsing succeeded
        print('it is hex')
    else:
        print('it is not hex')
    

    It incorrectly reports the string '00' as not hex because zero evaluates to False.
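
    One way around this, sketched below, is to rely on whether int() raises instead of on the truthiness of its result. The name is_hex_string is hypothetical and only used for this illustration.

```python
def is_hex_string(input_string):
    """Return True if int() can parse input_string as base 16."""
    try:
        int(input_string, 16)  # the parsed value is ignored, so 0 is fine
    except ValueError:
        return False
    return True


print(is_hex_string('00'))  # True
print(is_hex_string('zz'))  # False
```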

  • 2020-12-13 06:39

    Since all the regular expression timings above are about the same, I would guess that most of the time went into turning the pattern string into a regular expression. Below are the timings I got with a pre-compiled regular expression.

    int_hex  
    0.000800 ms 10  
    0.001300 ms 100  
    0.008200 ms 1000  
    
    all_hex  
    0.003500 ms 10  
    0.015200 ms 100  
    0.112000 ms 1000  
    
    fullmatch_hex  
    0.001800 ms 10  
    0.001200 ms 100  
    0.005500 ms 1000
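
    The code behind these numbers is not shown, so the following is only a guess at what a pre-compiled variant might look like. The function name fullmatch_hex is taken from the table above, but its body and the constant HEX_PATTERN are assumptions.

```python
import re

# Compiled once at import time; "+" requires at least one hex digit.
HEX_PATTERN = re.compile(r"[0-9a-fA-F]+")


def fullmatch_hex(s):
    # "s or ''" turns None into an empty string, which does not match.
    return HEX_PATTERN.fullmatch(s or "") is not None
```

    Note that the re module keeps its own cache of recently compiled patterns, so the saving comes mostly from skipping that cache lookup on each call.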
    
  • 2020-12-13 06:46

    (1) Using int() works nicely for this, and Python does all the checking for you :)

    int('00480065006C006C006F00200077006F0072006C00640021', 16)
    6896377547970387516320582441726837832153446723333914657L
    

    will work. In case of failure you will receive a ValueError exception.

    Short example:

    int('af', 16)
    175
    
    int('ah', 16)
     ...
    ValueError: invalid literal for int() with base 16: 'ah'
    

    (2) An alternative would be to traverse the data and make sure all characters fall within the range of 0..9 and a-f/A-F. string.hexdigits ('0123456789abcdefABCDEF') is useful for this as it contains both upper and lower case digits.

    import string
    all(c in string.hexdigits for c in s)
    

    will return either True or False based on the validity of your data in string s.

    Short example:

    s = 'af'
    all(c in string.hexdigits for c in s)
    True
    
    s = 'ah'
    all(c in string.hexdigits for c in s)
    False
    

    Notes:

    As @ScottGriffiths notes correctly in a comment below, the int() approach accepts a string that starts with 0x, while the character-by-character check rejects it. Also, checking against a set of characters is faster than checking against a string of characters, but it is doubtful this will matter with short SMS strings, unless you process many (many!) of them in sequence, in which case you could convert string.hexdigits to a set with set(string.hexdigits).
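
    Both notes can be illustrated with a short sketch. The names HEX_DIGITS, all_hex_digits and int_parses are hypothetical, and a frozenset is used here simply because the set never changes.

```python
import string

# Built once, so each membership test is a hash lookup.
HEX_DIGITS = frozenset(string.hexdigits)


def all_hex_digits(s):
    return all(c in HEX_DIGITS for c in s)


def int_parses(s):
    try:
        int(s, 16)
        return True
    except ValueError:
        return False


for sample in ("af", "0xaf", "ah"):
    print(sample, int_parses(sample), all_hex_digits(sample))
# af True True
# 0xaf True False
# ah False False
```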

  • 2020-12-13 06:46

    You can:

    1. test whether the string contains only hexadecimal digits (0…9,A…F)
    2. try to convert the string to integer and see whether it fails.

    Here is the code:

    import string

    # Option 1: every character must be a hexadecimal digit
    def is_hex(s):
        hex_digits = set(string.hexdigits)
        # if s is long, then it is faster to check against a set
        return all(c in hex_digits for c in s)

    # Option 2: let int() do the parsing
    def is_hex(s):
        try:
            int(s, 16)
            return True
        except ValueError:
            return False
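
    The two approaches do not accept exactly the same inputs, which may matter depending on where the strings come from. The sketch below shows a few inputs on which they disagree. The names is_hex_digits, is_hex_int and is_hex_strict are hypothetical, and is_hex_strict is just one possible stricter variant.

```python
import string


def is_hex_digits(s):
    return all(c in string.hexdigits for c in s)


def is_hex_int(s):
    try:
        int(s, 16)
        return True
    except ValueError:
        return False


def is_hex_strict(s):
    # Stricter variant: at least one character, hex digits only.
    return len(s) > 0 and is_hex_digits(s)


for sample in ("", " ff ", "-ff"):
    print(repr(sample), is_hex_digits(sample), is_hex_int(sample), is_hex_strict(sample))
# '' True False False      (all() of an empty sequence is True)
# ' ff ' False True False  (int() ignores surrounding whitespace)
# '-ff' False True False   (int() accepts a sign)
```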
    