Strip all non-numeric characters (except for “.”) from a string in Python

死守一世寂寞 2020-12-07 22:26

I've got a pretty good working snippet of code, but I was wondering if anyone has any better suggestions on how to do this:

val = ''.join([c for c in val

6 Answers
  •  囚心锁ツ
    2020-12-07 23:04

    If the set of allowed characters were larger, using a set as below might be faster. With only eleven allowed characters, this is a bit slower than a.py (the list-comprehension version).

    dec = set('1234567890.')
    
    a = '27893jkasnf8u2qrtq2ntkjh8934yt8.298222rwagasjkijw'
    for i in xrange(1000000):
        ''.join(ch for ch in a if ch in dec)

    At least on my system, you can save a tiny bit of time (and memory if your string were long enough to matter) by using a generator expression instead of a list comprehension in a.py:

    a = '27893jkasnf8u2qrtq2ntkjh8934yt8.298222rwagasjkijw'
    for i in xrange(1000000):
        ''.join(c for c in a if c in '1234567890.')

    Oh, and here's the fastest way I've found by far on this test string (much faster than regex) if you are doing this many, many times and are willing to put up with the overhead of building a couple of character tables.

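    # Python 2 only: identity table of all 256 byte characters
    chrs = ''.join(chr(i) for i in xrange(256))
    # every character that is not a digit or '.'
    deletable = ''.join(ch for ch in chrs if ch not in '1234567890.')
    
    a = '27893jkasnf8u2qrtq2ntkjh8934yt8.298222rwagasjkijw'
    for i in xrange(1000000):
        # two-argument str.translate(table, deletechars) drops the deletable characters
        a.translate(chrs, deletable)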

    On my system, that runs in ~1.0 seconds where the regex b.py runs in ~4.3 seconds.
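
The snippets in this answer are Python 2 (xrange and the two-argument str.translate). For anyone on Python 3, here is a rough sketch of the same three ideas: set filtering, a translate-based delete, and a regex. The function names are made up for illustration, the regex is only an assumed stand-in for b.py (which is not shown here), and the translate variant assumes ASCII-only input. Timings will differ from the numbers above, so measure on your own system.

```python
import re
import timeit

ALLOWED = "1234567890."
ALLOWED_SET = frozenset(ALLOWED)

# Python 3 counterpart of the character tables above: every byte value
# that is not a digit or "." is marked for deletion.
DELETABLE = bytes(b for b in range(256) if b not in ALLOWED.encode("ascii"))

# Assumed stand-in for the regex variant; b.py itself is not shown.
NON_NUMERIC = re.compile(r"[^0-9.]")


def strip_with_set(text):
    """Keep only the characters found in the allowed set."""
    return "".join(ch for ch in text if ch in ALLOWED_SET)


def strip_with_translate(text):
    """Delete unwanted bytes with bytes.translate (assumes ASCII-only input)."""
    return text.encode("ascii").translate(None, DELETABLE).decode("ascii")


def strip_with_regex(text):
    """Remove every character that is not a digit or "."."""
    return NON_NUMERIC.sub("", text)


if __name__ == "__main__":
    a = "27893jkasnf8u2qrtq2ntkjh8934yt8.298222rwagasjkijw"
    for func in (strip_with_set, strip_with_translate, strip_with_regex):
        seconds = timeit.timeit(lambda: func(a), number=100_000)
        print(f"{func.__name__}: {seconds:.3f}s")
```

All three functions return the same string for the test input. strip_with_translate raises UnicodeEncodeError on non-ASCII text, so the set or regex version is the safer choice if the input is not known to be ASCII.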
