I have a regex containing various sub-groups which are connected through an or condition:
([[:alpha:]]+)|([[:digit:]]+)
When I match the string
Not directly.
With the std::regex library, the std::match_results class takes care of the sub-matches.
Its member function std::match_results::size tells you the number of sub-matches.
Ex:
std::string str( "one two three four five" );
std::regex rx( "(\\w+)(\\w+)(\\w+)(\\w+)(\\w+)" );
std::match_results< std::string::const_iterator > mr;
std::regex_search( str, mr, rx );
std::cout << mr.size() << '\n'; // 6
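Here the output is 6, not 5, because the whole match is counted as well. You can access each sub-match with the .str( number ) method or with operator[].
Sub-matches are numbered from left to right, in the order of their opening parentheses. Note that size() counts every group in the pattern, whether or not it took part in the match. So to figure out which group was matched, looking at size() alone is not enough: you also have to check the matched member of each sub-match, for example mr[ 1 ].matched.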
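For the pattern from the question, size() is 3 for any successful match, whichever branch of the | was taken. A minimal sketch of one way to find the branch is to test the matched flag of each group. The helper name which_alternative is made up for this illustration; std::sub_match::matched is the standard member.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper (not part of the standard library): returns the index
// of the first capture group that took part in the match, or 0 if the
// search failed.
std::size_t which_alternative( const std::string& str ){
    static const std::regex rx( "([[:alpha:]]+)|([[:digit:]]+)" );
    std::match_results< std::string::const_iterator > mr;
    if( !std::regex_search( str, mr, rx ) ){
        return 0; // no match at all
    }
    // Index 0 is the whole match, so the groups start at index 1.
    for( std::size_t index = 1; index < mr.size(); ++index ){
        if( mr[ index ].matched ){
            return index;
        }
    }
    return 0;
}
```

With this sketch, which_alternative( "abc" ) gives 1 (the alpha group) and which_alternative( "123" ) gives 2 (the digit group).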
If you change the rx to "(\\w+)(\\d+)(\\w+)"
then the search fails, because the string contains no digits, and the size is 0.
If you change the rx to "(\\w+).+"
then the size is 2. That means you have one successful whole match and one sub-match.
Ex:
std::string str( "one two three four five" );
std::regex rx( "(\\w+).+" );
std::match_results< std::string::const_iterator > mr;
std::regex_search( str, mr, rx );
std::cout << mr.str( 1 ) << '\n'; // one
std::cout << mr[ 1 ] << '\n'; // one
The output for both is: one
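The size values mentioned so far (0, 2 and 6) can be reproduced with one small helper. This is only a sketch: the name match_size is hypothetical, and it also shows that the return value of std::regex_search is the direct way to tell whether there was a match.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: runs a search and reports mr.size().
std::size_t match_size( const std::string& str, const std::string& pattern ){
    const std::regex rx( pattern );
    std::match_results< std::string::const_iterator > mr;
    const bool found = std::regex_search( str, mr, rx );
    // A failed search leaves mr empty, so size() is 0 in that case.
    // A successful search gives the number of groups in the pattern plus 1.
    return found ? mr.size() : 0;
}
```

For the string "one two three four five", match_size gives 0 for "(\\w+)(\\d+)(\\w+)", 2 for "(\\w+).+" and 6 for the pattern with five groups.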
If you want to print only the sub-matches, you can use a simple loop whose index starts from 1 instead of 0:
Ex:
std::string str( "one two three four five" );
std::regex rx( "(\\w+) \\w+ (\\w+) \\w+ (\\w+)" );
std::match_results< std::string::const_iterator > mr;
std::regex_search( str, mr, rx );
for( std::size_t index = 1; index < mr.size(); ++index ){
std::cout << mr[ index ] << '\n';
}
the output is:
one
three
five
If by "determine which of the sub-patterns matched" you mean specifying which sub-match the search should return, then the answer is yes: you can do that with std::regex_token_iterator.
Ex: (iterate over the second sub-match of each match)
std::string str( "How are you today ? I am fine . How about you ?" );
std::regex rx( "(\\w+) (\\w+) ?" );
std::regex_token_iterator< std::string::const_iterator > first( str.begin(), str.end(), rx, 2 ), last;
while( first != last ){
std::cout << first->str() << '\n';
++first;
}
The last argument is 2: ( str.begin(), str.end(), rx, 2 )
It means you want only the second sub-match of each match. So the output is:
are
today
am
about
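The last argument does not have to be a single number. std::regex_token_iterator also has a constructor that takes a std::vector< int > of sub-match indices, and it then yields the listed sub-matches of each match in turn. A small sketch, assuming the same pattern as above (the function name collect_sub_matches is made up for illustration):

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects sub-match 1 and sub-match 2 of every match.
std::vector< std::string > collect_sub_matches( const std::string& str ){
    const std::regex rx( "(\\w+) (\\w+) ?" );
    const std::vector< int > groups{ 1, 2 }; // which sub-matches to return
    std::vector< std::string > result;
    std::regex_token_iterator< std::string::const_iterator > first( str.begin(), str.end(), rx, groups ), last;
    while( first != last ){
        result.push_back( first->str() );
        ++first;
    }
    return result;
}
```

For the string "How are you today ? I am fine . How about you ?" this collects: How, are, you, today, I, am, How, about.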