How do I check if a string is a number (float)?

后端未结

关注

 30  3987

What is the best possible way to check if a string can be represented as a number in Python?

The function I currently have right now is:

def is_numb


                      
              相关标签:


      
      
        
          30条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  北荒        
                
              
                            
                2020-11-21 05:49
              
            
            
                                                                       
Just Mimic C#

In C# there are two different functions that handle parsing of scalar values:


Float.Parse()
Float.TryParse()


float.parse():

def parse(string):
    try:
        return float(string)
    except Exception:
        throw TypeError


Note: If you're wondering why I changed the exception to a TypeError, here's the documentation.

float.try_parse():

def try_parse(string, fail=None):
    try:
        return float(string)
    except Exception:
        return fail;


Note: You don't want to return the boolean 'False' because that's still a value type. None is better because it indicates failure. Of course, if you want something different you can change the fail parameter to whatever you want.

To extend float to include the 'parse()' and 'try_parse()' you'll need to monkeypatch the 'float' class to add these methods.

If you want respect pre-existing functions the code should be something like:

def monkey_patch():
    if(!hasattr(float, 'parse')):
        float.parse = parse
    if(!hasattr(float, 'try_parse')):
        float.try_parse = try_parse


SideNote: I personally prefer to call it Monkey Punching because it feels like I'm abusing the language when I do this but YMMV.

Usage:

float.parse('giggity') // throws TypeException
float.parse('54.3') // returns the scalar value 54.3
float.tryParse('twank') // returns None
float.tryParse('32.2') // returns the scalar value 32.2


And the great Sage Pythonas said to the Holy See Sharpisus, "Anything you can do I can do better; I can do anything better than you."
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  暗喜        
                
              
                            
                2020-11-21 05:51
              
            
            
                                                                       
Try this.

 def is_number(var):
    try:
       if var == int(var):
            return True
    except Exception:
        return False

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  有刺的猬        
                
              
                            
                2020-11-21 05:52
              
            
            
                                                                       

  Which, not only is ugly and slow


I'd dispute both.

A regex or other string parsing method would be uglier and slower.  

I'm not sure that anything much could be faster than the above.  It calls the function and returns.  Try/Catch doesn't introduce much overhead because the most common exception is caught without an extensive search of stack frames.

The issue is that any numeric conversion function has two kinds of results


A number, if the number is valid
A status code (e.g., via errno) or exception to show that no valid number could be parsed.


C (as an example) hacks around this a number of ways.  Python lays it out clearly and explicitly.

I think your code for doing this is perfect.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  -上瘾入骨i        
                
              
                            
                2020-11-21 05:52
              
            
            
                                                                       
how about this:

'3.14'.replace('.','',1).isdigit()


which will return true only if there is one or no '.' in the string of digits.

'3.14.5'.replace('.','',1).isdigit()


will return false

edit: just saw another comment ...
adding a .replace(badstuff,'',maxnum_badstuff) for other cases can be done. if you are passing salt and not arbitrary condiments (ref:xkcd#974) this will do fine :P
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  春和景丽        
                
              
                            
                2020-11-21 05:52
              
            
            
                                                                       
For strings of non-numbers, try: except: is actually slower than regular expressions.  For strings of valid numbers, regex is slower.  So, the appropriate method depends on your input. 

If you find that you are in a performance bind, you can use a new third-party module called fastnumbers that provides a function called isfloat.  Full disclosure, I am the author.  I have included its results in the timings below.



from __future__ import print_function
import timeit

prep_base = '''\
x = 'invalid'
y = '5402'
z = '4.754e3'
'''

prep_try_method = '''\
def is_number_try(val):
    try:
        float(val)
        return True
    except ValueError:
        return False

'''

prep_re_method = '''\
import re
float_match = re.compile(r'[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?$').match
def is_number_re(val):
    return bool(float_match(val))

'''

fn_method = '''\
from fastnumbers import isfloat

'''

print('Try with non-number strings', timeit.timeit('is_number_try(x)',
    prep_base + prep_try_method), 'seconds')
print('Try with integer strings', timeit.timeit('is_number_try(y)',
    prep_base + prep_try_method), 'seconds')
print('Try with float strings', timeit.timeit('is_number_try(z)',
    prep_base + prep_try_method), 'seconds')
print()
print('Regex with non-number strings', timeit.timeit('is_number_re(x)',
    prep_base + prep_re_method), 'seconds')
print('Regex with integer strings', timeit.timeit('is_number_re(y)',
    prep_base + prep_re_method), 'seconds')
print('Regex with float strings', timeit.timeit('is_number_re(z)',
    prep_base + prep_re_method), 'seconds')
print()
print('fastnumbers with non-number strings', timeit.timeit('isfloat(x)',
    prep_base + 'from fastnumbers import isfloat'), 'seconds')
print('fastnumbers with integer strings', timeit.timeit('isfloat(y)',
    prep_base + 'from fastnumbers import isfloat'), 'seconds')
print('fastnumbers with float strings', timeit.timeit('isfloat(z)',
    prep_base + 'from fastnumbers import isfloat'), 'seconds')
print()




Try with non-number strings 2.39108395576 seconds
Try with integer strings 0.375686168671 seconds
Try with float strings 0.369210958481 seconds

Regex with non-number strings 0.748660802841 seconds
Regex with integer strings 1.02021503448 seconds
Regex with float strings 1.08564686775 seconds

fastnumbers with non-number strings 0.174362897873 seconds
fastnumbers with integer strings 0.179651021957 seconds
fastnumbers with float strings 0.20222902298 seconds


As you can see


try: except: was fast for numeric input but very slow for an invalid input
regex is very efficient when the input is invalid
fastnumbers wins in both cases

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  星月不相逢        
                
              
                            
                2020-11-21 05:52
              
            
            
                                                                       
I think your solution is fine, but there is a correct regexp implementation.

There does seem to be a lot of regexp hate towards these answers which I think is unjustified, regexps can be reasonably clean and correct and fast.  It really depends on what you're trying to do.  The original question was how can you "check if a string can be represented as a number (float)" (as per your title).  Presumably you would want to use the numeric/float value once you've checked that it's valid, in which case your try/except makes a lot of sense.  But if, for some reason, you just want to validate that a string is a number then a regex also works fine, but it's hard to get correct.  I think most of the regex answers so far, for example, do not properly parse strings without an integer part (such as ".7") which is a float as far as python is concerned.  And that's slightly tricky to check for in a single regex where the fractional portion is not required.  I've included two regex to show this.

It does raise the interesting question as to what a "number" is.  Do you include "inf" which is valid as a float in python?  Or do you include numbers that are "numbers" but maybe can't be represented in python (such as numbers that are larger than the float max).

There's also ambiguities in how you parse numbers.  For example, what about "--20"?  Is this a "number"?  Is this a legal way to represent "20"?  Python will let you do "var = --20" and set it to 20 (though really this is because it treats it as an expression), but float("--20") does not work.

Anyways, without more info, here's a regex that I believe covers all the ints and floats as python parses them.

# Doesn't properly handle floats missing the integer part, such as ".7"
SIMPLE_FLOAT_REGEXP = re.compile(r'^[-+]?[0-9]+\.?[0-9]+([eE][-+]?[0-9]+)?$')
# Example "-12.34E+56"      # sign (-)
                            #     integer (12)
                            #           mantissa (34)
                            #                    exponent (E+56)

# Should handle all floats
FLOAT_REGEXP = re.compile(r'^[-+]?([0-9]+|[0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?$')
# Example "-12.34E+56"      # sign (-)
                            #     integer (12)
                            #           OR
                            #             int/mantissa (12.34)
                            #                            exponent (E+56)

def is_float(str):
  return True if FLOAT_REGEXP.match(str) else False


Some example test values:

True  <- +42
True  <- +42.42
False <- +42.42.22
True  <- +42.42e22
True  <- +42.42E-22
False <- +42.42e-22.8
True  <- .42
False <- 42nope


Running the benchmarking code in @ron-reiter's answer shows that this regex is actually faster than the normal regex and is much faster at handling bad values than the exception, which makes some sense.  Results:

check_regexp with good floats: 18.001921
check_regexp with bad floats: 17.861423
check_regexp with strings: 17.558862
check_correct_regexp with good floats: 11.04428
check_correct_regexp with bad floats: 8.71211
check_correct_regexp with strings: 8.144161
check_replace with good floats: 6.020597
check_replace with bad floats: 5.343049
check_replace with strings: 5.091642
check_exception with good floats: 5.201605
check_exception with bad floats: 23.921864
check_exception with strings: 23.755481

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     上一页
1
2
3
4
5
下一页
           
           
        
                                  
        
        
          
            
            
              
              
            
    


自定义标题
段落格式
字体
字号
代码语言
点击上传
x
                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复