Alright, I've read the tutorials and scrambled my head too much to see clearly now.
I'm trying to capture parameters and their type info from a function.
Generally, you'd need two steps to get all data.
First, match/validate the whole function:
function\((?<parameters>((\/\*[a-zA-Z]+\*\/)?[0-9a-zA-Z_$]+,?)*)\)
Note that you now have a parameters group containing the whole parameter list. You can apply part of the pattern again to get each parameter as its own match, or in this case, simply split on , (a comma).
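As a rough illustration of the two steps, here is a minimal Python sketch. It is an adaptation, not the pattern exactly as written above: Python spells named groups `(?P<name>...)`, and the names `FUNCTION_RE`, `PARAM_RE` and `parse_parameters` are made up for this example.

```python
import re

# Step 1: validate the whole signature and capture the raw parameter list.
# Python writes named groups as (?P<name>...) rather than (?<name>...).
FUNCTION_RE = re.compile(
    r"function\((?P<parameters>(?:(?:/\*[a-zA-Z]+\*/)?[0-9a-zA-Z_$]+,?)*)\)"
)

# Step 2: split one parameter into its optional type and its name.
PARAM_RE = re.compile(r"(?:/\*(?P<type>[a-zA-Z]+)\*/)?(?P<param>[0-9a-zA-Z_$]+)")


def parse_parameters(source):
    """Return a list of (param, type_or_None) pairs, or None if nothing matches."""
    match = FUNCTION_RE.search(source)
    if match is None:
        return None
    pairs = []
    for piece in match.group("parameters").split(","):
        if not piece:  # empty parameter list or a trailing comma
            continue
        # Every piece was already validated by step 1, so this always matches.
        m = PARAM_RE.fullmatch(piece)
        pairs.append((m.group("param"), m.group("type")))
    return pairs


print(parse_parameters("function(/*string*/a,b,/*int*/c)"))
# [('a', 'string'), ('b', None), ('c', 'int')]
```

Splitting on the comma is safe here only because neither the type nor the name can contain one.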
If you're using .Net, by any chance, you're in luck. .Net keeps a full record of all captures of each group, so you can use the collection:
match.Groups["param"].Captures
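Some notes:
To keep empty types among the captures, the type group can wrap an optional body so that it always participates: (?<type>(\/\*[a-zA-Z]+\*\/)?)
You don't need to escape / in a .Net pattern: it has no special meaning there (C#/.Net doesn't have regex delimiters).
Here's an example of using the captures. Again, the main point is maintaining the relation between type and param: you want to capture empty types, so you don't lose count.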
Pattern:
function
\(
(?:
(?:
/\*(?<type>[a-zA-Z]+)\*/ # type within /* */
| # or
(?<type>) # capture an empty type.
)
(?<param>
[0-9a-zA-Z_$]+
)
(?:,|(?=\s*\))) # mandatory comma, unless before the last ')'
)*
\)
Code:
Match match = Regex.Match(s, pattern, RegexOptions.IgnorePatternWhitespace);
CaptureCollection types = match.Groups["type"].Captures;
CaptureCollection parameters = match.Groups["param"].Captures;
for (int i = 0; i < parameters.Count; i++)
{
    string parameter = parameters[i].Value;
    string type = types[i].Value;
    if (String.IsNullOrEmpty(type))
        type = "NO TYPE";
    Console.WriteLine("Parameter: {0}, Type: {1}", parameter, type);
}
It's been a while since this question was active, but I think I finally found an answer.
I think I was in the same situation as you, but with PHP. An answer in another post works really well, using the \K and \G escapes from PCRE. See Alan Moore's answer here: PHP Regular Expression - Repeating Match of a Group
My issue was trying to pull out all the cell values in a table, where each row contained a 6-digit number, twenty 1- or 2-digit numbers, and an unrelated 1- or 2-digit number. The solution was:
<tr class="[^"]*">\s+<td>(\d{6})<\/td>|\G<\/td>[^<>]*+<td>\K\d{1,6}|<td>(\d{1,2})<\/td>
Very nice solution if I do say so myself!
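For anyone without PCRE: Python's built-in re module has neither \G nor \K, but the same "continue exactly where the last match ended" idea can be approximated with `Pattern.match(text, pos)`, which only matches at `pos`, and a capture group in place of \K. This is just a sketch of the idea, not a port of the pattern above; the sample row, `ROW_START`, `NEXT_CELL` and `row_cells` are all invented for illustration.

```python
import re

# Hypothetical sample row; the real table markup is not shown in this post.
html = '<tr class="odd"> <td>123456</td><td>1</td><td>22</td><td>3</td></tr>'

# Finds the start of a row and its leading 6-digit cell.
ROW_START = re.compile(r'<tr class="[^"]*">\s+<td>(\d{6})</td>')
# Matches one following cell; only ever tried at the previous end position.
NEXT_CELL = re.compile(r'[^<>]*<td>(\d{1,6})</td>')


def row_cells(text):
    """Return the cell values of the first matching row, in order."""
    start = ROW_START.search(text)
    if start is None:
        return []
    cells = [start.group(1)]
    pos = start.end()
    while True:
        # match() with a pos argument is anchored at pos, much like \G.
        cell = NEXT_CELL.match(text, pos)
        if cell is None:
            break
        cells.append(cell.group(1))  # the group plays the role of \K
        pos = cell.end()
    return cells


print(row_cells(html))
# ['123456', '1', '22', '3']
```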
The page you referenced mentioned using ?: for non-capture, then surrounding the repeating capture in its own group. I am guessing they are suggesting something like this: function\(((?:(\/\*(?<type>[a-zA-Z]+)\*\/)?(?<param>[0-9a-zA-Z_$]+),?)*)\)
I like to use http://gskinner.com/RegExr/ to test my expressions, but it won't show repeated captures. In non-.NET languages, you may have to loop through the results in whatever return structure you get back to see the values.
Sorry I couldn't test more thoroughly...
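To show what looping through the results can look like outside .NET, here is a small Python sketch. In Python's re module a repeated group keeps only its last iteration, so the sketch captures the whole list with the outer group and then walks it with `finditer`. The pattern is adapted to Python's `(?P<name>...)` syntax, and `INNER`, `OUTER_RE`, `INNER_RE` and `captures` are illustrative names, not anything from the linked page.

```python
import re

# One parameter: optional /*type*/ comment, a name, then an optional comma.
INNER = r"(?:/\*(?P<type>[a-zA-Z]+)\*/)?(?P<param>[0-9a-zA-Z_$]+),?"

# Group 1 wraps the repetition, so it holds the complete parameter list.
OUTER_RE = re.compile(r"function\(((?:" + INNER + r")*)\)")
INNER_RE = re.compile(INNER)


def captures(source):
    """Return (param, type) pairs, using "NO TYPE" when the type is missing."""
    match = OUTER_RE.search(source)
    if match is None:
        return []
    return [
        (m.group("param"), m.group("type") or "NO TYPE")
        for m in INNER_RE.finditer(match.group(1))
    ]


# The repeated groups on their own only remember the final parameter.
last = OUTER_RE.search("function(/*string*/a,/*int*/b)")
print(last.group("param"), last.group("type"))  # b int

print(captures("function(/*string*/a,b,/*int*/c)"))
# [('a', 'string'), ('b', 'NO TYPE'), ('c', 'int')]
```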