How to extract numbers from a string in Python?


I would like to extract all the numbers contained in a string. Which is better suited for the purpose: regular expressions or the isdigit() method?

Example:


17 answers

小蘑菇

2020-11-21 05:53
To catch numbers written in different formats, it helps to query with several patterns.

Set up a pattern for each number format of interest:

(finds commas) 12,300 or 12,300.00

'[\d]+[.,\d]+'

(finds floats) 0.123 or .123

'[\d]*[.][\d]+'

(finds integers) 123

'[\d]+'

Combine them with the pipe ( | ) into one pattern with multiple alternatives.

(Note: put the complex patterns first; otherwise the simple patterns will match chunks of a complex number before the complex pattern can match it in full.)
```
p = r'[\d]+[.,\d]+|[\d]*[.][\d]+|[\d]+'  # raw string keeps the backslashes literal
```
Below, we first confirm that a match exists with re.search(), then iterate over the catches with re.finditer(). Finally, we print each catch, using bracket notation (catch[0]) to select the full matched text from the match object.
```
import re

s = 'he33llo 42 I\'m a 32 string 30 444.4 12,001'

if re.search(p, s) is not None:
    for catch in re.finditer(p, s):
        print(catch[0])  # catch is a match object; catch[0] is the matched text
Output:
```
33
42
32
30
444.4
12,001
```
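The catches above are still strings. As a minimal sketch of turning them into numbers, the block below assumes that a comma is a thousands separator (which is not true in every locale); `to_number` is a hypothetical helper name, not part of the answer.

```python
import re

# Same combined pattern as above: comma-grouped numbers, then floats, then integers.
p = r'[\d]+[.,\d]+|[\d]*[.][\d]+|[\d]+'

def to_number(text):
    """Hypothetical helper: convert one catch to an int or a float.

    Assumes ',' is a thousands separator, which is an assumption about the input.
    """
    cleaned = text.replace(',', '')
    return float(cleaned) if '.' in cleaned else int(cleaned)

s = "he33llo 42 I'm a 32 string 30 444.4 12,001"
numbers = [to_number(m[0]) for m in re.finditer(p, s)]
print(numbers)  # [33, 42, 32, 30, 444.4, 12001]
```

A catch with more than one decimal point, such as 1.2.3, would still raise ValueError here, so real input may need extra validation.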

慢半拍i

2020-11-21 05:54

The best option I found is below. It extracts a single number from a string by dropping every character that is not a digit.

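```
def extract_nbr(input_str):
    # Treat None and the empty string as "no number".
    if input_str is None or input_str == '':
        return 0

    out_number = ''
    for ele in input_str:
        if ele.isdigit():
            out_number += ele  # keep digits only; everything else is dropped
    # float('') raises ValueError, so return 0 when no digit was found.
    return float(out_number) if out_number else 0
```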
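One consequence of this approach is that all digits in the string are joined into one number, so separate numbers get merged and a decimal point is lost. The sketch below illustrates that with a compact equivalent; `digits_only` is a hypothetical name used only for this illustration.

```python
def digits_only(input_str):
    """Hypothetical compact equivalent: keep every digit, drop everything else."""
    digits = ''.join(ch for ch in (input_str or '') if ch.isdigit())
    return float(digits) if digits else 0

print(digits_only('he33llo 42'))  # 3342.0, the two numbers are merged
print(digits_only('12.5'))        # 125.0, the decimal point is discarded
print(digits_only('abc'))         # 0
```

If the string may hold more than one number, one of the other answers is probably a better fit.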


臣服心动

2020-11-21 05:55
I am amazed to see that no one has yet mentioned itertools.groupby as an alternative way to achieve this.

You can use itertools.groupby() together with str.isdigit() to extract numbers from a string:
```
from itertools import groupby
my_str = "hello 12 hi 89"

l = [int(''.join(i)) for is_digit, i in groupby(my_str, str.isdigit) if is_digit]
```
The value held by l will be:
```
[12, 89]
```
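To see why this works, it may help to look at the runs that groupby produces before the filter is applied. This is only an illustrative sketch; the name `runs` is not part of the answer.

```python
from itertools import groupby

my_str = "hello 12 hi 89"

# groupby yields one (key, group) pair per run of consecutive characters
# that share the same key; here the key is the result of str.isdigit.
runs = [(is_digit, ''.join(chars)) for is_digit, chars in groupby(my_str, str.isdigit)]
print(runs)
# [(False, 'hello '), (True, '12'), (False, ' hi '), (True, '89')]
```

Keeping only the runs whose key is True and passing them to int() gives the list shown above.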
PS: This is just for illustration, to show that groupby can also be used to achieve this. It is not the recommended solution. If you want to achieve this, you should use the accepted answer by fmark, which is based on a list comprehension with str.isdigit as a filter.

清歌不尽

2020-11-21 05:57

This answer also handles the case where the number in the string is a float.

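```
def get_first_nbr_from_str(input_str):
    '''
    :param input_str: string that contains digits and words
    :return: the first number extracted from input_str, as a float
    demo:
    'ab324.23.123xyz': 324.23
    '.5abc44': 0.5
    '''
    # Reject empty input and anything that is not a string.
    if not input_str or not isinstance(input_str, str):
        return 0
    out_number = ''
    for ele in input_str:
        if (ele == '.' and '.' not in out_number) or ele.isdigit():
            out_number += ele
        elif out_number:
            break  # the first run of digits and dots has ended
    # float('') and float('.') raise ValueError, so fall back to 0.
    return float(out_number) if out_number.strip('.') else 0
```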
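For comparison, here is a hedged sketch of a regular-expression variant that returns the first integer or float it finds. `first_number` is a hypothetical name, and the behavior is not identical to the loop above in every edge case (for example, it skips a stray period that is not followed by a digit).

```python
import re

def first_number(input_str):
    """Hypothetical regex-based variant: first integer or float, else 0."""
    # Try the float form first so that 324.23 is not cut short at 324.
    match = re.search(r'\d*\.\d+|\d+', input_str or '')
    return float(match[0]) if match else 0

print(first_number('ab324.23.123xyz'))  # 324.23
print(first_number('.5abc44'))          # 0.5
print(first_number('no digits here.'))  # 0
```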


执念已碎

2020-11-21 05:57
I am adding this answer because no one has posted one that uses exception handling, and because this also works for floats:
```
a = []
line = "abcd 1234 efgh 56.78 ij"
for word in line.split():
    try:
        a.append(float(word))
    except ValueError:
        pass
print(a)
```
Output:
```
[1234.0, 56.78]
```
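Two things to keep in mind with this approach: a number with punctuation attached (such as "1234,") is rejected by float(), while the words "nan" and "inf" are accepted. The sketch below is one possible way to handle both, assuming that trailing punctuation should be ignored and that non-finite values are unwanted.

```python
import math

a = []
line = "abcd 1234, efgh 56.78 nan inf ij"
for word in line.split():
    try:
        # Drop punctuation that commonly trails a number in prose.
        value = float(word.rstrip('.,;:!?'))
    except ValueError:
        continue
    # float() also parses the words "nan" and "inf"; skip those.
    if math.isfinite(value):
        a.append(value)
print(a)  # [1234.0, 56.78]
```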
不思量自难忘°

2020-11-21 05:59
I'd use a regexp:
```
>>> import re
>>> re.findall(r'\d+', 'hello 42 I\'m a 32 string 30')
['42', '32', '30']
```
This would also match 42 from bla42bla. If you only want numbers delimited by word boundaries (space, period, comma), you can use \b:
```
>>> re.findall(r'\b\d+\b', 'he33llo 42 I\'m a 32 string 30')
['42', '32', '30']
```
To end up with a list of numbers instead of a list of strings:
```
>>> [int(s) for s in re.findall(r'\b\d+\b', 'he33llo 42 I\'m a 32 string 30')]
[42, 32, 30]
```
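Because a period counts as a word boundary, the integer pattern splits a decimal such as 3.14 into two catches. The following sketch shows that effect and one possible pattern with an optional fractional part; the sample text and variable names are made up for illustration.

```python
import re

text = 'he33llo 42 and 3.14 or 30'

# With the integer-only pattern, \b also matches on both sides of the
# decimal point, so 3.14 is split into two catches.
integers = re.findall(r'\b\d+\b', text)
print(integers)  # ['42', '3', '14', '30']

# Allowing an optional fractional part keeps the decimal together.
decimals = re.findall(r'\b\d+(?:\.\d+)?\b', text)
print(decimals)                      # ['42', '3.14', '30']
print([float(x) for x in decimals])  # [42.0, 3.14, 30.0]
```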
