How do I check if a string is a number (float)?

暗喜 2020-11-21 05:16

What is the best possible way to check if a string can be represented as a number in Python?

The function I currently have is:

def is_numb         


        
30 Answers
  •  星月不相逢
    2020-11-21 05:52

    I think your solution is fine, but a correct regexp implementation is also possible.

    There seems to be a lot of regexp hate in these answers, which I think is unjustified: regexps can be reasonably clean, correct, and fast. It really depends on what you're trying to do. The original question was how to "check if a string can be represented as a number (float)" (as per your title). Presumably you want to use the numeric value once you've checked that it's valid, in which case your try/except makes a lot of sense.

    But if, for some reason, you only want to validate that a string is a number, a regex also works fine, although it is hard to get right. For example, I think most of the regex answers so far do not properly parse strings without an integer part (such as ".7"), which is a float as far as Python is concerned. That is slightly tricky to check for in a single regex where the fractional portion is not required. I've included two regexes to show this.
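For reference, the try/except approach mentioned above usually looks something like the sketch below. The question's own function is truncated on this page, so the name `is_number_try` and its body are assumptions, not the asker's actual code.

```python
def is_number_try(s):
    """Return True if float() accepts the string s (hypothetical helper)."""
    try:
        float(s)
        return True
    except ValueError:
        # float() raises ValueError for strings it cannot parse
        return False


print(is_number_try("-12.34E+56"))  # True
print(is_number_try(".7"))          # True
print(is_number_try("42nope"))      # False
```

If the value is needed afterwards, returning the parsed float instead of a bool avoids parsing the string twice.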

    It does raise the interesting question of what a "number" is. Do you include "inf", which is valid as a float in Python? Do you include numbers that are "numbers" but can't be represented in Python (such as numbers larger than the float max)?
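To make these edge cases concrete, here is a small sketch of how `float()` treats them, plus one possible way to reject non-finite results. The helper name `is_finite_float` is hypothetical, and whether "inf" should count as a number is a design choice that depends on your use case.

```python
import math


def is_finite_float(s):
    """True only if float() accepts s and the result is finite (hypothetical helper)."""
    try:
        return math.isfinite(float(s))
    except ValueError:
        return False


print(float("inf"))              # float() accepts "inf"
print(float("1e400"))            # too large for a float, so it parses as infinity
print(is_finite_float("1e400"))  # False
print(is_finite_float("1.5"))    # True
```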

    There are also ambiguities in how you parse numbers. For example, what about "--20"? Is this a "number"? Is this a legal way to represent "20"? Python will let you write "var = --20" and set it to 20 (though really this is because it treats it as an expression), but float("--20") raises a ValueError.
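The difference between the expression and the string can be seen directly. This is a minimal sketch; the helper `float_accepts` is a hypothetical name used only for illustration.

```python
# As source code, --20 is the expression -(-20), so it evaluates to 20.
var = --20
print(var)  # 20


def float_accepts(s):
    """Return True if float() can parse the string s (hypothetical helper)."""
    try:
        float(s)
        return True
    except ValueError:
        return False


# As a string, "--20" is not a valid float literal.
print(float_accepts("--20"))  # False
print(float_accepts("-20"))   # True
```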

    Anyway, without more info, here's a regex that I believe covers the usual decimal ints and floats as Python parses them.

    import re

    # Simple version: doesn't handle floats missing the integer part, such as ".7"
    SIMPLE_FLOAT_REGEXP = re.compile(r'^[-+]?[0-9]+\.?[0-9]+([eE][-+]?[0-9]+)?$')
    # Example "-12.34E+56":
    #   sign      (-)
    #   integer   (12)
    #   mantissa  (34)
    #   exponent  (E+56)

    # Handles floats with or without an integer part
    # (but not a trailing dot such as "12.", nor "inf"/"nan", which float() accepts)
    FLOAT_REGEXP = re.compile(r'^[-+]?([0-9]+|[0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?$')
    # Example "-12.34E+56":
    #   sign          (-)
    #   integer       (12)
    #     OR
    #   int/mantissa  (12.34)
    #   exponent      (E+56)

    def is_float(s):
        # match() returns a match object or None; bool() maps that to True/False
        return bool(FLOAT_REGEXP.match(s))
    

    Some example test values:

    True  <- +42
    True  <- +42.42
    False <- +42.42.22
    True  <- +42.42e22
    True  <- +42.42E-22
    False <- +42.42e-22.8
    True  <- .42
    False <- 42nope
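The values above can be reproduced with a loop like the following sketch. It redefines `FLOAT_REGEXP` so that it runs on its own. The last two inputs ("12." and "inf") are additions that were not in the original test values; they show where the regex and `float()` disagree. The helper `float_accepts` is a hypothetical name.

```python
import re

FLOAT_REGEXP = re.compile(r'^[-+]?([0-9]+|[0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?$')


def is_float(s):
    """Regex-based check."""
    return bool(FLOAT_REGEXP.match(s))


def float_accepts(s):
    """Reference check: does float() itself parse s? (hypothetical helper)"""
    try:
        float(s)
        return True
    except ValueError:
        return False


samples = ["+42", "+42.42", "+42.42.22", "+42.42e22", "+42.42E-22",
           "+42.42e-22.8", ".42", "42nope",
           "12.", "inf"]  # the last two are extra cases, not from the original list

for s in samples:
    print(f"{str(is_float(s)):5} <- {s}  (float() accepts: {float_accepts(s)})")
```

For "12." and "inf" the regex returns False while `float()` accepts the string, so the regex is stricter than Python's own parser in those cases.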
    

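    Running the benchmarking code from @ron-reiter's answer shows that this regex (check_correct_regexp) is actually faster than the normal regex (check_regexp), and much faster than the exception approach (check_exception) on bad values. That makes some sense, presumably because raising and catching an exception is comparatively expensive. On good floats, the exception approach is still the fastest. Results: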

    check_regexp with good floats: 18.001921
    check_regexp with bad floats: 17.861423
    check_regexp with strings: 17.558862
    check_correct_regexp with good floats: 11.04428
    check_correct_regexp with bad floats: 8.71211
    check_correct_regexp with strings: 8.144161
    check_replace with good floats: 6.020597
    check_replace with bad floats: 5.343049
    check_replace with strings: 5.091642
    check_exception with good floats: 5.201605
    check_exception with bad floats: 23.921864
    check_exception with strings: 23.755481
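The benchmark code itself is not reproduced here. A comparable measurement could be sketched with `timeit` as below. The function names mirror two of the labels in the results, but their bodies, the sample inputs, and the iteration count are all assumptions, so the absolute numbers will not match the ones above.

```python
import re
import timeit

FLOAT_REGEXP = re.compile(r'^[-+]?([0-9]+|[0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?$')


def check_correct_regexp(s):
    # Assumed body: validate with the regex only
    return bool(FLOAT_REGEXP.match(s))


def check_exception(s):
    # Assumed body: let float() decide and catch the failure
    try:
        float(s)
        return True
    except ValueError:
        return False


def time_check(func, value, number=100_000):
    """Seconds taken to call func(value) `number` times."""
    return timeit.timeit(lambda: func(value), number=number)


# Sample inputs are illustrative, not the ones used in the original benchmark
cases = (("good floats", "1.5e10"), ("bad floats", "1.5.5"), ("strings", "abc"))

for func in (check_correct_regexp, check_exception):
    for label, value in cases:
        print(f"{func.__name__} with {label}: {time_check(func, value):.6f}")
```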
    
