How do I check if a string is a number (float)?

暗喜 2020-11-21 05:16

What is the best possible way to check if a string can be represented as a number in Python?

The function I currently have is:

def is_numb         


        
30 Answers
  •  不思量自难忘°
    2020-11-21 05:33

    I wanted to see which method is fastest. Overall, the check_replace function gave the best and most consistent results. On Python 2.7, check_exception was the fastest, but only when no exception was raised: its success path is cheap, but the overhead of raising and catching an exception is quite large.

    Please note that checking for a successful cast is the only method that is accurate. For example, the string form of the following value is accepted by check_exception, but the other two test functions return False for this valid float:

    huge_number = float('1e+100')
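
    To make the accuracy gap concrete, here is a minimal sketch that runs a few sample strings through all three checks. The function bodies mirror the ones in the benchmark; the samples list is an assumption chosen only for illustration.

```python
import re

def check_regexp(x):
    # Digits, an optional dot, then digits; no sign or exponent support.
    return re.compile(r"^\d*\.?\d*$").match(x) is not None

def check_replace(x):
    # Drop at most one dot, then require every remaining character to be a digit.
    return x.replace('.', '', 1).isdigit()

def check_exception(s):
    # Let float() decide; it raises ValueError for strings it cannot parse.
    try:
        float(s)
        return True
    except ValueError:
        return False

# Illustrative inputs (assumed, not taken from the benchmark data).
samples = ['1e+100', '-1.5', 'inf', '1.5', 'abc', '']

for s in samples:
    print('%-8r regexp=%-5s replace=%-5s exception=%s' % (
        s, check_regexp(s), check_replace(s), check_exception(s)))
```

    With these inputs, '1e+100', '-1.5' and 'inf' are accepted only by check_exception, while the empty string is accepted only by check_regexp.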
    

    Here is the benchmark code:

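    import re, random, string
    # default_timer exists on Python 2 and 3; time.clock() was removed in Python 3.8
    from timeit import default_timer
    
    ITERATIONS = 10000000
    
    class Timer:
        """Context manager that stores the elapsed seconds in .interval."""
        def __enter__(self):
            self.start = default_timer()
            return self
        def __exit__(self, *args):
            self.end = default_timer()
            self.interval = self.end - self.start
    
    def check_regexp(x):
        # Raw string avoids invalid-escape warnings for \d and \.
        return re.compile(r"^\d*\.?\d*$").match(x) is not None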
    
    def check_replace(x):
        return x.replace('.','',1).isdigit()
    
    def check_exception(s):
        try:
            float(s)
            return True
        except ValueError:
            return False
    
    to_check = [check_regexp, check_replace, check_exception]
    
    print('preparing data...')
    good_numbers = [
        str(random.random() / random.random()) 
        for x in range(ITERATIONS)]
    
    bad_numbers = ['.' + x for x in good_numbers]
    
    strings = [
        ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(random.randint(1,10)))
        for x in range(ITERATIONS)]
    
    print('running test...')
    for func in to_check:
        with Timer() as t:
            for x in good_numbers:
                res = func(x)
        print('%s with good floats: %s' % (func.__name__, t.interval))
        with Timer() as t:
            for x in bad_numbers:
                res = func(x)
        print('%s with bad floats: %s' % (func.__name__, t.interval))
        with Timer() as t:
            for x in strings:
                res = func(x)
        print('%s with strings: %s' % (func.__name__, t.interval))
    

    Here are the results (in seconds, for 10,000,000 calls each) with Python 2.7.10 on a 2017 MacBook Pro 13:

    check_regexp with good floats: 12.688639
    check_regexp with bad floats: 11.624862
    check_regexp with strings: 11.349414
    check_replace with good floats: 4.419841
    check_replace with bad floats: 4.294909
    check_replace with strings: 4.086358
    check_exception with good floats: 3.276668
    check_exception with bad floats: 13.843092
    check_exception with strings: 15.786169
    

    Here are the results with Python 3.6.5 on a 2017 MacBook Pro 13:

    check_regexp with good floats: 13.472906000000009
    check_regexp with bad floats: 12.977665000000016
    check_regexp with strings: 12.417542999999995
    check_replace with good floats: 6.011045999999993
    check_replace with bad floats: 4.849356
    check_replace with strings: 4.282754000000011
    check_exception with good floats: 6.039081999999979
    check_exception with bad floats: 9.322753000000006
    check_exception with strings: 9.952595000000002
    

    Here are the results with PyPy 2.7.13 on a 2017 MacBook Pro 13:

    check_regexp with good floats: 2.693217
    check_regexp with bad floats: 2.744819
    check_regexp with strings: 2.532414
    check_replace with good floats: 0.604367
    check_replace with bad floats: 0.538169
    check_replace with strings: 0.598664
    check_exception with good floats: 1.944103
    check_exception with bad floats: 2.449182
    check_exception with strings: 2.200056
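
    The cost of the exception path can also be checked on a much smaller scale with the standard timeit module. This is only a sketch: time_calls is a hypothetical helper, the two inputs and the call count are arbitrary assumptions, and absolute timings will differ by machine and interpreter.

```python
import timeit

def check_exception(s):
    # Same cast-based check as in the benchmark.
    try:
        float(s)
        return True
    except ValueError:
        return False

def time_calls(value, number=100000):
    """Hypothetical helper: seconds taken by `number` calls on one input."""
    return timeit.timeit(lambda: check_exception(value), number=number)

ok_time = time_calls('3.14')    # float() succeeds, no exception raised
bad_time = time_calls('.3.14')  # float() raises ValueError on every call

print('valid float:   %.4f s' % ok_time)
print('invalid float: %.4f s' % bad_time)
```

    Comparing the two printed numbers shows how much of the total time goes to raising and catching ValueError on a given interpreter.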
    
