RegEx only matches last item

Submitted by 半腔热情 on 2019-12-13 00:45:00

Question


I am trying to write my own syntax highlighter in Sublime. I think it uses Python-based regular expressions. I just want to match all tokens in a line like:

description str.bla, str.blub, str.yeah, str.no

My regular expression looks like:

regex = "(description) (str\\.[\\w\\d]+)(,\\s*(str\\.[\\w\\d]+))*"

Now I expect 1 match in group 1 ("description"), 1 match in group 2 ("str.bla"), and 3 matches in group 4 ("str.blub", "str.yeah", "str.no")

but I have only 1 match in my last group ("str.no"). What's going on there?

Thanks a lot!


Answer 1:


When you have a repeated capture group (e.g. (a)* or (a)+), each repetition overwrites the previous capture, so the group contains only the last match.

So, if I have the regex:

(123\d)+

And the string:

123412351236

You will find that the capture group will contain only 1236.
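
As a quick check of the behaviour described above, here is a minimal sketch using Python's standard `re` module. It assumes the engine behaves like Python's, as the question guesses; the variable names are only illustrative.

```python
import re

# A repeated capture group: every repetition overwrites the previous capture.
m = re.fullmatch(r"(123\d)+", "123412351236")

last_capture = m.group(1)  # only the final repetition survives
whole_match = m.group(0)   # the full matched text is still available

print(last_capture)  # 1236
print(whole_match)   # 123412351236
```

The overall match still covers the whole string; only the numbered group is limited to its final repetition.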

I don't know any way around this (besides hard coding the number of subgroups to capture), but you can try capturing the whole group like so:

regex = "(description) (str\\.[\\w\\d]+)((?:,\\s*(?:str\\.[\\w\\d]+))*)"

Which should give you

['description', 'str.bla', ', str.blub, str.yeah, str.no']

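Note how the elements are grouped: you have 3 items in the list, and the last one is a single string holding all the remaining comma-separated items (not a nested list). You can split that string in a second step if you need the individual tokens.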
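
If you can post-process the match in Python, one possible way to get every token is to capture the tail as a single group and then run a second pass over it. This is only a sketch under that assumption; it does not apply inside a Sublime syntax definition, where no second pass is available. The names `line`, `pattern`, `rest`, and `tokens` are made up for illustration, and `\w+` is used in place of `[\w\d]+` because `\w` already matches digits.

```python
import re

line = "description str.bla, str.blub, str.yeah, str.no"

# Group 3 wraps the whole repeated part, so nothing is overwritten.
pattern = r"(description) (str\.\w+)((?:,\s*str\.\w+)*)"

m = re.match(pattern, line)
keyword, first, rest = m.groups()
# rest is one string: ", str.blub, str.yeah, str.no"

# Second pass: pull the individual tokens out of the captured tail.
tokens = [first] + re.findall(r"str\.\w+", rest)

print(keyword)  # description
print(tokens)   # ['str.bla', 'str.blub', 'str.yeah', 'str.no']
```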




Answer 2:


Try this:

regex = "(description) (str\\.[\\w\\d]+)((?:,\\s*(?:str\\.[\\w\\d]+))*)"


Source: https://stackoverflow.com/questions/18211809/regex-only-matches-last-item
