Question
I am trying to write my own syntax highlighter in Sublime. I think it uses Python-based regular expressions. I just want to match all tokens in a row like:
description str.bla, str.blub, str.yeah, str.no
My regular expression looks like:
regex = "(description) (str\\.[\\w\\d]+)(,\\s*(str\\.[\\w\\d]+))*"
Now I expect 1 match in group 1 ("description"), 1 match in group 2 ("str.bla"), and 3 matches in group 4 ("str.blub", "str.yeah", "str.no"), but I get only 1 match in my last group ("str.no"). What's going on there?
Thanks a lot!
Answer 1:
When you have a repeated capture group (e.g. (a)* or (a)+, etc.), the capture group will contain only the last match.
So, if I have the regex:
(123\d)+
And the string:
123412351236
You will find that the capture group will contain only 1236.
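To see this behaviour for yourself, here is a minimal sketch using Python's `re` module. It assumes Python semantics, as the question does; the variable names are only for illustration. Each repetition overwrites the group, so only the final one is left, while `re.findall` with the unrepeated pattern returns every occurrence.

```python
import re

text = "123412351236"

# The group is repeated three times, but each repetition overwrites
# the previous capture, so only the final one is kept.
m = re.fullmatch(r"(123\d)+", text)
last_capture = m.group(1)

# Without the repetition, findall returns every occurrence.
all_repeats = re.findall(r"123\d", text)

print(last_capture)  # 1236
print(all_repeats)   # ['1234', '1235', '1236']
```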
I don't know any way around this (besides hard coding the number of subgroups to capture), but you can try capturing the whole group like so:
regex = "(description) (str\\.[\\w\\d]+)((?:,\\s*(?:str\\.[\\w\\d]+))*)"
Which should give you
['description', 'str.bla', ', str.blub, str.yeah, str.no']
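Note how the elements are grouped: you have 3 items in the list, and the last one is a single string holding all the remaining comma-separated items, not a separate capture per item. You would still have to split that string yourself to get the individual tokens.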
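As a rough sketch of how the remaining tokens could be pulled out of that third group, the snippet below runs the regex with Python's `re` module and then applies a second pass with `re.findall`. The second pattern and the names `line`, `rest`, `others`, and `tokens` are assumptions for illustration. This works in a Python script, and a syntax definition in Sublime may not offer an equivalent second pass.

```python
import re

line = "description str.bla, str.blub, str.yeah, str.no"
regex = "(description) (str\\.[\\w\\d]+)((?:,\\s*(?:str\\.[\\w\\d]+))*)"

m = re.match(regex, line)
keyword, first, rest = m.groups()
# rest is one string: ', str.blub, str.yeah, str.no'

# Second pass: extract each remaining token from the captured tail.
# \w already matches digits, so [\w\d] can be shortened to \w here.
others = re.findall(r"str\.\w+", rest)
tokens = [first] + others

print(keyword)  # description
print(tokens)   # ['str.bla', 'str.blub', 'str.yeah', 'str.no']
```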
Answer 2:
Try this:
regex = "(description) (str\\.[\\w\\d]+)((?:,\\s*(?:str\\.[\\w\\d]+))*)"
Source: https://stackoverflow.com/questions/18211809/regex-only-matches-last-item