How to extract numbers from a string in Python?

星月不相逢 2020-11-21 05:19

I would like to extract all the numbers contained in a string. Which is better suited for the purpose: regular expressions or the isdigit() method?

Example:

17 answers
  • 2020-11-21 06:00
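    # Extract numbers from a garbage string: keep only the characters that can
    # appear in a float literal and turn everything else into a separator.
    # Caveat: a stray '.', '-' or 'e' left on its own makes float() raise ValueError.
    s = '12//n,_@#$%3.14kjlw0xdadfackvj1.6e-19&*ghn334'
    newstr = ''.join((ch if ch in '0123456789.-e' else ' ') for ch in s)
    listOfNumbers = [float(i) for i in newstr.split()]
    print(listOfNumbers)
    # Output: [12.0, 3.14, 0.0, 1.6e-19, 334.0]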
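
The character filter above works on this input, but any leftover 'e', '-' or '.' that is not part of a number would make float() fail. A minimal regex sketch of the same idea follows. The pattern and the name extract_floats are my own assumptions, not part of the answer: the pattern looks for an optional sign, digits with an optional fraction, and an optional exponent.

```python
import re

# Assumed pattern (not from the original answer): optional sign,
# digits with an optional fraction (or a bare fraction), optional exponent.
NUMBER_RE = re.compile(r'[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?')

def extract_floats(text):
    """Return every number-like substring of text, converted to float."""
    return [float(tok) for tok in NUMBER_RE.findall(text)]

s = '12//n,_@#$%3.14kjlw0xdadfackvj1.6e-19&*ghn334'
print(extract_floats(s))          # [12.0, 3.14, 0.0, 1.6e-19, 334.0]
print(extract_floats('see-e 5'))  # [5.0]; the stray '-' and 'e' are ignored
```

Like the original, this reads the leading 0 of '0xdadfack' as a plain zero rather than as a hex literal.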
    
  • 2020-11-21 06:00

    For phone numbers you can simply exclude all non-digit characters with \D in regex:

    import re
    
    phone_number = '(619) 459-3635'
    phone_number = re.sub(r"\D", "", phone_number)
    print(phone_number)
    
  • 2020-11-21 06:01

    I was looking for a way to strip formatting masks from strings, specifically from Brazilian phone numbers. This post did not answer that directly, but it inspired me. This is my solution:

    >>> phone_number = '+55(11)8715-9877'
    >>> ''.join([n for n in phone_number if n.isdigit()])
    '551187159877'
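
The regex and isdigit() approaches to phone numbers in this thread give the same result on ordinary input, but they are not strictly equivalent. A small sketch of the difference follows. The helper names are hypothetical, and the third variant, restricted to ASCII digits, is my own addition for comparison.

```python
import re

def digits_regex(text):
    # \D removes everything that is not a Unicode decimal digit.
    return re.sub(r'\D', '', text)

def digits_isdigit(text):
    # str.isdigit() is also True for characters such as superscript digits.
    return ''.join(ch for ch in text if ch.isdigit())

def digits_ascii(text):
    # Strictest variant: keep only the ten ASCII digits.
    return ''.join(ch for ch in text if ch in '0123456789')

phone = '+55(11)8715-9877'
print(digits_regex(phone))    # 551187159877
print(digits_isdigit(phone))  # 551187159877

odd = 'm\u00b2: 42'           # contains SUPERSCRIPT TWO
print(digits_regex(odd))      # 42
print(digits_isdigit(odd))    # superscript two is kept, followed by 42
print(digits_ascii(odd))      # 42
```

For phone numbers typed on a normal keyboard the choice hardly matters. The difference only shows up when the result is passed on to int(), which rejects superscript digits.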
    
  • 2020-11-21 06:06

    Using a regex, as below, is one way:

    import re

    lines = "hello 12 hi 89"
    output = []
    # Match only words made up entirely of digits.
    # (r'^\d+\.?\d*$' would also accept decimals such as 3.14.)
    repl_str = re.compile(r'^\d+$')
    for word in lines.split():
        match = repl_str.search(word)
        if match:
            output.append(float(match.group()))
    print(output)  # [12.0, 89.0]
    

    With findall:

    re.findall(r'\d+', "hello 12 hi 89")

    ['12', '89']
    

    re.findall(r'\b\d+\b', "hello 12 hi 89 33F AC 777")

    ['12', '89', '777']
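
To tie this back to the question, here is a small side-by-side sketch of the regex and isdigit() routes to a list of integers. The function names are hypothetical. The two agree on whitespace-separated input and diverge once punctuation touches the numbers.

```python
import re

def ints_regex(text):
    # \b...\b keeps only digit runs that stand alone as whole words.
    return [int(tok) for tok in re.findall(r'\b\d+\b', text)]

def ints_isdigit(text):
    # Split on whitespace and keep the words that consist only of digits.
    return [int(word) for word in text.split() if word.isdigit()]

sample = "hello 12 hi 89 33F AC 777"
print(ints_regex(sample))    # [12, 89, 777]
print(ints_isdigit(sample))  # [12, 89, 777]

punctuated = "total: 12, then 89."
print(ints_regex(punctuated))    # [12, 89]
print(ints_isdigit(punctuated))  # []; '12,' and '89.' are not all digits
```

Neither variant handles decimals as written: the regex version would split '3.14' into 3 and 14, and the isdigit() version would drop it.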
    
  • 2020-11-21 06:07

    @jmnas, I liked your answer, but it didn't find floats. I'm working on a script to parse code going to a CNC mill and needed to find both X and Y dimensions, which can be integers or floats, so I adapted your code to the following. It finds ints and floats, both positive and negative. It still doesn't find hex-formatted values. You could add "x" and "A" through "F" to the num_char tuple to collect strings like '0x23AC', but float() does not accept hex strings, so those would have to be converted with int(num, 16) instead.

    s = 'hello X42 I\'m a Y-32.35 string Z30'
    xy = ("X", "Y")
    num_char = (".", "+", "-")
    
    l = []
    
    tokens = s.split()
    for token in tokens:
    
        if token.startswith(xy):
            num = ""
            for char in token:
                # print(char)
                if char.isdigit() or (char in num_char):
                    num = num + char
    
            try:
                l.append(float(num))
            except ValueError:
                pass
    
    print(l)
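
For comparison, the same X/Y extraction can be sketched with a single regex. The pattern and the name extract_axes are my own assumptions. Like the loop above, the sketch assumes the axis words are separated by whitespace, and it additionally returns the axis letter with each value.

```python
import re

# Assumed pattern: an X or Y at the start of a word, followed by a signed
# integer or decimal number.
AXIS_RE = re.compile(r'\b([XY])([-+]?(?:\d+\.?\d*|\.\d+))')

def extract_axes(line):
    """Return (axis, value) pairs for every X or Y word in line."""
    return [(axis, float(value)) for axis, value in AXIS_RE.findall(line)]

s = "hello X42 I'm a Y-32.35 string Z30"
print(extract_axes(s))  # [('X', 42.0), ('Y', -32.35)]
```

Because of the leading \b, words packed together without spaces (for example 'G1X42') would not be matched. Drop the \b if the input is formatted that way.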
    