How do I check if a string is a number (float)?

暗喜 2020-11-21 05:16

What is the best possible way to check if a string can be represented as a number in Python?

The function I currently have is:

def is_numb         


        
30 Answers
  • 2020-11-21 05:49

    Just Mimic C#

    In C# there are two different functions that handle parsing of scalar values:

    • float.Parse()
    • float.TryParse()

    float.parse():

    def parse(string):
        try:
            return float(string)
        except Exception:
            raise TypeError
    

    Note: If you're wondering why I changed the exception to a TypeError, here's the documentation.

    float.try_parse():

    def try_parse(string, fail=None):
        try:
            return float(string)
        except Exception:
            return fail
    

    Note: You don't want to return the boolean 'False' because that's still a value type. None is better because it indicates failure. Of course, if you want something different you can change the fail parameter to whatever you want.

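    To add 'parse()' and 'try_parse()' to float itself you would need to monkeypatch the 'float' class. Be aware that CPython does not allow attributes to be set on built-in types, so the assignments below raise a TypeError there; the pattern only works on classes defined in Python (for example, a subclass of float).

    If you want to respect pre-existing functions, the code should be something like:

    def monkey_patch():
        # Only add the helpers if nothing by that name already exists.
        # Note: fails on CPython, where built-in types such as float are immutable.
        if not hasattr(float, 'parse'):
            float.parse = parse
        if not hasattr(float, 'try_parse'):
            float.try_parse = try_parse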
    

    SideNote: I personally prefer to call it Monkey Punching because it feels like I'm abusing the language when I do this but YMMV.

    Usage:

    float.parse('giggity')     # raises TypeError
    float.parse('54.3')        # returns the scalar value 54.3
    float.try_parse('twank')   # returns None
    float.try_parse('32.2')    # returns the scalar value 32.2
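
Because CPython rejects attribute assignment on built-in types, one way to get the same two entry points is a subclass. The following is a minimal sketch under that assumption; the class name `ParsingFloat` is hypothetical and not part of the original answer.

```python
class ParsingFloat(float):
    """Hypothetical float subclass that carries the two helpers as class methods."""

    @classmethod
    def parse(cls, string):
        # Mirror the answer's choice of TypeError for any failed conversion.
        try:
            return cls(string)
        except Exception:
            raise TypeError(f"cannot parse {string!r} as a float") from None

    @classmethod
    def try_parse(cls, string, fail=None):
        # Return the caller-supplied fallback instead of raising.
        try:
            return cls(string)
        except Exception:
            return fail


print(ParsingFloat.parse('54.3'))
print(ParsingFloat.try_parse('twank'))
```

The values returned are instances of the subclass, which behave like ordinary floats in arithmetic and comparisons.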
    

    And the great Sage Pythonas said to the Holy See Sharpisus, "Anything you can do I can do better; I can do anything better than you."

  • 2020-11-21 05:51

    Try this.

    def is_number(var):
        try:
            float(var)  # raises if var cannot be parsed as a number
            return True
        except (TypeError, ValueError):
            return False
    
  • 2020-11-21 05:52

    "Which, not only is ugly and slow"

    I'd dispute both.

    A regex or other string parsing method would be uglier and slower.

    I'm not sure that anything much could be faster than the above. It calls the function and returns. try/except doesn't introduce much overhead because the most common exception is caught without an extensive search of stack frames.

    The issue is that any numeric conversion function has two kinds of results:

    • A number, if the number is valid
    • A status code (e.g., via errno) or exception to show that no valid number could be parsed.

    C (as an example) hacks around this a number of ways. Python lays it out clearly and explicitly.

    I think your code for doing this is perfect.
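
The two kinds of results described above can be put side by side in a short sketch. The helper name `parse_with_status` is hypothetical; it wraps float() to imitate a status-code interface, while float() itself is the exception-based interface Python provides.

```python
def parse_with_status(text):
    """Status-code style: return (ok, value) instead of raising."""
    try:
        return True, float(text)
    except ValueError:
        return False, None


# Status-code style: the caller must remember to check the flag.
ok, value = parse_with_status('4.5')
if ok:
    print('parsed', value)

# Exception style: failure cannot be silently ignored.
try:
    value = float('not a number')
except ValueError:
    print('no valid number could be parsed')
```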

  • 2020-11-21 05:52

    How about this:

    '3.14'.replace('.','',1).isdigit()
    

    which will return True only if there is one or no '.' in the string of digits.

    '3.14.5'.replace('.','',1).isdigit()
    

    will return False.

    Edit: I just saw another comment. Adding a .replace(badstuff,'',maxnum_badstuff) for other cases can be done. If you are passing salt and not arbitrary condiments (ref: xkcd #974), this will do fine :P
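
To see where this check and float() part ways, here is a small sketch that wraps the expression in a function (the name `is_plain_decimal` is hypothetical) and runs it over a few inputs. Signed values and exponent notation are rejected by the check even though float() accepts them.

```python
def is_plain_decimal(text):
    """The replace/isdigit check from above, wrapped in a function."""
    return text.replace('.', '', 1).isdigit()


for sample in ['3.14', '3.14.5', '.5', '-3.14', '1e5', '']:
    print(repr(sample), is_plain_decimal(sample))
```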

  • 2020-11-21 05:52

    For strings of non-numbers, try: except: is actually slower than regular expressions. For strings of valid numbers, regex is slower. So, the appropriate method depends on your input.

    If you find that you are in a performance bind, you can use a new third-party module called fastnumbers that provides a function called isfloat. Full disclosure, I am the author. I have included its results in the timings below.


    from __future__ import print_function
    import timeit
    
    prep_base = '''\
    x = 'invalid'
    y = '5402'
    z = '4.754e3'
    '''
    
    prep_try_method = '''\
    def is_number_try(val):
        try:
            float(val)
            return True
        except ValueError:
            return False
    
    '''
    
    prep_re_method = '''\
    import re
    float_match = re.compile(r'[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?$').match
    def is_number_re(val):
        return bool(float_match(val))
    
    '''
    
    fn_method = '''\
    from fastnumbers import isfloat
    
    '''
    
    print('Try with non-number strings', timeit.timeit('is_number_try(x)',
        prep_base + prep_try_method), 'seconds')
    print('Try with integer strings', timeit.timeit('is_number_try(y)',
        prep_base + prep_try_method), 'seconds')
    print('Try with float strings', timeit.timeit('is_number_try(z)',
        prep_base + prep_try_method), 'seconds')
    print()
    print('Regex with non-number strings', timeit.timeit('is_number_re(x)',
        prep_base + prep_re_method), 'seconds')
    print('Regex with integer strings', timeit.timeit('is_number_re(y)',
        prep_base + prep_re_method), 'seconds')
    print('Regex with float strings', timeit.timeit('is_number_re(z)',
        prep_base + prep_re_method), 'seconds')
    print()
    print('fastnumbers with non-number strings', timeit.timeit('isfloat(x)',
        prep_base + fn_method), 'seconds')
    print('fastnumbers with integer strings', timeit.timeit('isfloat(y)',
        prep_base + fn_method), 'seconds')
    print('fastnumbers with float strings', timeit.timeit('isfloat(z)',
        prep_base + fn_method), 'seconds')
    print()
    

    Try with non-number strings 2.39108395576 seconds
    Try with integer strings 0.375686168671 seconds
    Try with float strings 0.369210958481 seconds
    
    Regex with non-number strings 0.748660802841 seconds
    Regex with integer strings 1.02021503448 seconds
    Regex with float strings 1.08564686775 seconds
    
    fastnumbers with non-number strings 0.174362897873 seconds
    fastnumbers with integer strings 0.179651021957 seconds
    fastnumbers with float strings 0.20222902298 seconds
    

    As you can see:

    • try: except: is fast for numeric input but very slow for invalid input
    • regex is very efficient when the input is invalid
    • fastnumbers wins in both cases
  • 2020-11-21 05:52

    I think your solution is fine, but there is a correct regexp implementation.

    There does seem to be a lot of regexp hate in these answers, which I think is unjustified: regexps can be reasonably clean, correct, and fast. It really depends on what you're trying to do. The original question was how to "check if a string can be represented as a number (float)" (as per your title). Presumably you want to use the numeric/float value once you've checked that it's valid, in which case your try/except makes a lot of sense. But if, for some reason, you only want to validate that a string is a number, then a regex also works, though it's hard to get right. I think most of the regex answers so far, for example, do not properly parse strings without an integer part (such as ".7"), which is a float as far as Python is concerned. That's slightly tricky to check for in a single regex where the fractional portion is not required. I've included two regexes to show this.

    It does raise the interesting question of what a "number" is. Do you include "inf", which is valid as a float in Python? Or do you include numbers that are "numbers" but maybe can't be represented in Python (such as numbers that are larger than the float max)?

    There are also ambiguities in how you parse numbers. For example, what about "--20"? Is this a "number"? Is this a legal way to represent "20"? Python will let you do "var = --20" and set it to 20 (though really this is because it treats it as an expression), but float("--20") does not work.
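
These edge cases are easy to probe directly. The sketch below assumes you want to treat "inf", "nan", and out-of-range values as not being numbers; the helper name `is_finite_number` is hypothetical, and whether that policy is right depends on your application.

```python
import math


def is_finite_number(text):
    """Stricter check: float() must accept text and the result must be finite."""
    try:
        return math.isfinite(float(text))
    except ValueError:
        return False


# float() accepts 'inf' and 'nan', turns the too-large '1e400' into inf,
# and raises ValueError for '--20'.
for sample in ['inf', 'nan', '1e400', '--20', '20']:
    print(repr(sample), is_finite_number(sample))
```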

    Anyway, without more info, here's a regex that I believe covers all the ints and floats as Python parses them.

    import re

    # Doesn't properly handle floats missing the integer part, such as ".7"
    SIMPLE_FLOAT_REGEXP = re.compile(r'^[-+]?[0-9]+\.?[0-9]+([eE][-+]?[0-9]+)?$')
    # Example "-12.34E+56"      # sign (-)
                                #     integer (12)
                                #           mantissa (34)
                                #                    exponent (E+56)

    # Handles floats with or without an integer part
    # (but not a bare trailing dot such as "42.", nor "inf"/"nan")
    FLOAT_REGEXP = re.compile(r'^[-+]?([0-9]+|[0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?$')
    # Example "-12.34E+56"      # sign (-)
                                #     integer (12)
                                #           OR
                                #             int/mantissa (12.34)
                                #                            exponent (E+56)

    def is_float(text):
      # Parameter is named 'text' rather than 'str' to avoid shadowing the built-in
      return bool(FLOAT_REGEXP.match(text))
    

    Some example test values:

    True  <- +42
    True  <- +42.42
    False <- +42.42.22
    True  <- +42.42e22
    True  <- +42.42E-22
    False <- +42.42e-22.8
    True  <- .42
    False <- 42nope
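
One way to check a regex like this is to run it against float() on the same inputs and flag any disagreement. The sketch below repeats FLOAT_REGEXP from above so it is self-contained; the helper `float_accepts` and the two extra samples ("42." and "inf") are additions for illustration, not part of the original test list.

```python
import re

FLOAT_REGEXP = re.compile(r'^[-+]?([0-9]+|[0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?$')


def float_accepts(text):
    """Reference answer: does Python's own float() parse this string?"""
    try:
        float(text)
        return True
    except ValueError:
        return False


samples = ['+42', '+42.42', '+42.42.22', '+42.42e22', '+42.42E-22',
           '+42.42e-22.8', '.42', '42nope', '42.', 'inf']

for text in samples:
    by_regex = bool(FLOAT_REGEXP.match(text))
    by_float = float_accepts(text)
    flag = '' if by_regex == by_float else '  <- differs'
    print(f'{text!r}: regex={by_regex} float={by_float}{flag}')
```

The two extra samples are the ones where the regex and float() disagree, which shows how narrow or wide the definition of "number" ends up being.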
    

    Running the benchmarking code in @ron-reiter's answer shows that this regex is actually faster than the normal regex and is much faster at handling bad values than the exception, which makes some sense. Results:

    check_regexp with good floats: 18.001921
    check_regexp with bad floats: 17.861423
    check_regexp with strings: 17.558862
    check_correct_regexp with good floats: 11.04428
    check_correct_regexp with bad floats: 8.71211
    check_correct_regexp with strings: 8.144161
    check_replace with good floats: 6.020597
    check_replace with bad floats: 5.343049
    check_replace with strings: 5.091642
    check_exception with good floats: 5.201605
    check_exception with bad floats: 23.921864
    check_exception with strings: 23.755481
    