Regex for getting all digits in a string after a character

倖福魔咒の posted on 2020-02-15 08:01:07

Question


I am trying to parse the following string and return all digits after the last square bracket:

C9: Title of object (foo, bar) [ch1, CH12,c03,4]

So the result should be:

1,12,03,4

The string and digits will change. The important thing is to get the digits after the last '[', regardless of what characters (if any) precede them. (I need this in Python, so no atomic groups either!) I have tried everything I can think of, including:

 \[.*?(\d) = matches '1' only
 \[.*(\d) = matches '4' only
 \[*?(\d) = matches also include the '9' from the beginning

etc

Any help is greatly appreciated!

EDIT: I also need to do this without using str.split().


Answer 1:


Instead, you can find all runs of digits in the substring after the last [ bracket:

>>> import re
>>> s = 'C9: Title of object (fo[ 123o, bar) [ch1, CH12,c03,4]'
>>> # Get substring after the last '['.
>>> target_string = s.rsplit('[', 1)[1]
>>>
>>> re.findall(r'\d+', target_string)
['1', '12', '03', '4']
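
One caveat with the snippet above: s.rsplit('[', 1)[1] raises IndexError when the string contains no '[' at all. A minimal sketch of a guarded version follows, swapping in str.rpartition for rsplit; the function name digits_after_last_bracket is hypothetical, and returning an empty list for a missing bracket is an assumption about the desired behaviour.

```python
import re

def digits_after_last_bracket(s):
    """Return the digit runs found after the last '[' in s.

    Assumption: a string without any '[' yields an empty list.
    """
    # rpartition splits at the LAST occurrence of the separator;
    # sep is '' when the separator is not present.
    head, sep, tail = s.rpartition('[')
    if not sep:
        return []
    return re.findall(r'\d+', tail)

print(digits_after_last_bracket('C9: Title of object (fo[ 123o, bar) [ch1, CH12,c03,4]'))
```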

If you can't use split, this one works using a look-ahead assertion:

>>> s = 'C9: Title of object (fo[ 123o, bar) [ch1, CH12,c03,4]'
>>> re.findall(r'\d+(?=[^[]+$)', s)
['1', '12', '03', '4']

This finds every run of digits that is followed only by non-[ characters up to the end of the string, i.e. every run of digits located after the last [.
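
The look-ahead approach can be wrapped in a small helper. The sketch below is a slight variant, not the exact pattern from the answer: it uses * instead of + inside the look-ahead, so that a digit run sitting at the very end of the string is not cut short (with +, at least one character must follow the match), and it escapes the [ inside the character class. The names AFTER_LAST_BRACKET and digits_via_lookahead are hypothetical. Note that if the string contains no '[' at all, every run of digits in it matches.

```python
import re

# \d+          : a run of digits
# (?=[^\[]*$)  : followed only by non-'[' characters up to the end of the string
AFTER_LAST_BRACKET = re.compile(r'\d+(?=[^\[]*$)')

def digits_via_lookahead(s):
    """Return all digit runs that have no '[' anywhere after them."""
    return AFTER_LAST_BRACKET.findall(s)

print(digits_via_lookahead('C9: Title of object (fo[ 123o, bar) [ch1, CH12,c03,4]'))
# ['1', '12', '03', '4']
```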




Answer 2:


It may help to make the quantifiers non-greedy (lazy) with ?. For example:

\[.*?(\d*?),.*?(\d*?),.*?(\d*?),.*?(\d*?)\]

Here's how it works (from https://regex101.com/r/jP7hM3/1):

"\[.*?(\d*?),.*?(\d*?),.*?(\d*?),.*?(\d*?)\]"
\[ matches the character [ literally
.*? matches any character (except newline)
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
1st Capturing group (\d*?)
\d*? match a digit [0-9]
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
, matches the character , literally
.*? matches any character (except newline)
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
2nd Capturing group (\d*?)
\d*? match a digit [0-9]
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
, matches the character , literally
.*? matches any character (except newline)
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
3rd Capturing group (\d*?)
\d*? match a digit [0-9]
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
, matches the character , literally
.*? matches any character (except newline)
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
4th Capturing group (\d*?)
\d*? match a digit [0-9]
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
\] matches the character ] literally

Although I have to agree with the others: this is a regex solution, but it's not a very Pythonic one.
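
For reference, here is a minimal sketch of how this pattern could be applied in Python with re.search. The names FOUR_ITEMS and four_bracket_numbers are hypothetical. Because the pattern spells out four capturing groups separated by three commas, it only matches a bracketed list with exactly that shape, and it anchors on the first '[' it can match from rather than the last one.

```python
import re

# Four lazy capturing groups, one per comma-separated item inside [...]
FOUR_ITEMS = re.compile(r'\[.*?(\d*?),.*?(\d*?),.*?(\d*?),.*?(\d*?)\]')

def four_bracket_numbers(s):
    """Return the four captured digit runs, or [] if the pattern does not match."""
    m = FOUR_ITEMS.search(s)
    return list(m.groups()) if m else []

print(four_bracket_numbers('C9: Title of object (foo, bar) [ch1, CH12,c03,4]'))
# ['1', '12', '03', '4']
```

A list with a different number of items, such as '[a1, b2, c3]', does not match at all, which is why the findall-based approaches generalize better when the input varies.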



Source: https://stackoverflow.com/questions/34338341/regex-for-getting-all-digits-in-a-string-after-a-character
