C++ regex: Which group matched?

暖寄归人 2021-01-28 23:43

I have a regex containing several sub-groups that are joined by alternation (|):

([[:alpha:]]+)|([[:digit:]]+)

When I match the string

1 Answer
  • 2021-01-29 00:20

    Not directly.

    With the std::regex library, the std::match_results class holds the sub-matches. Its size() method (std::match_results::size) tells you how many entries it holds: one for the whole match plus one for each sub-match.

    Ex:

    std::string str( "one two three four five" );
    std::regex rx( "(\\w+)(\\w+)(\\w+)(\\w+)(\\w+)" ); // needs 5 adjacent word characters, so it matches "three"
    std::match_results< std::string::const_iterator > mr;
    
    std::regex_search( str, mr, rx );
    
    std::cout << mr.size() << '\n'; // 6  
    

    Here the output is 6, not 5, because the whole match is counted as well. You can access each entry with the .str( number ) method or with operator[].

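    Because sub-matches are numbered from left to right, once you know the result of size() you can go through them by index and work out which group matched: a group that did not take part in the match has its matched member set to false and yields an empty string.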
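For the alternation in the question, one way to find the participating group is to test the matched flag of each sub-match in turn. The following is only a minimal sketch; the helper name matched_group is hypothetical and not part of the standard library:

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: returns the 1-based index of the first capture group
// that took part in the match, or 0 if the search fails.
std::size_t matched_group( const std::string& str, const std::regex& rx )
{
    std::smatch mr;
    if( !std::regex_search( str, mr, rx ) )
        return 0;

    // Index 0 is the whole match, so the capture groups start at 1.
    for( std::size_t index = 1; index < mr.size(); ++index ){
        if( mr[ index ].matched )
            return index;
    }
    return 0; // the search succeeded, but no group participated
}
```

With rx built from "([[:alpha:]]+)|([[:digit:]]+)", calling matched_group( "abc", rx ) should give 1 and matched_group( "123", rx ) should give 2.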

    If you change rx to "(\\w+)(\\d+)(\\w+)", the search fails because the string contains no digits, so size() is 0.

    If you change rx to "(\\w+).+", size() is 2. That means you have one entry for the whole successful match and one for the sub-match.
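The three size() values mentioned so far can be checked with a small wrapper. This is a sketch under the assumption that only the count is of interest; the name result_size is hypothetical:

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: runs regex_search and reports how many entries the
// match_results object holds afterwards (0 when the search fails).
std::size_t result_size( const std::string& str, const std::string& pattern )
{
    std::regex rx( pattern );
    std::smatch mr;
    std::regex_search( str, mr, rx );
    return mr.size();
}
```

Note that size() depends only on whether the search succeeded and on how many groups the pattern contains, so on its own it does not say which alternative matched.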

    Ex:

    std::string str( "one two three four five" );
    std::regex rx( "(\\w+).+" );
    std::match_results< std::string::const_iterator > mr;
    
    std::regex_search( str, mr, rx );
    
    std::cout << mr.str( 1 ) << '\n'; // one
    std::cout << mr[ 1 ] << '\n';     // one
    

    The output for both is: one

    If you want to print only the sub-matches, you can use a simple loop whose index starts from 1 rather than 0, since index 0 is the whole match.

    Ex:

    std::string str( "one two three four five" );
    std::regex rx( "(\\w+) \\w+ (\\w+) \\w+ (\\w+)" );
    std::match_results< std::string::const_iterator > mr;
    
    std::regex_search( str, mr, rx );
    
    for( std::size_t index = 1; index < mr.size(); ++index ){
        std::cout << mr[ index ] << '\n';
    }
    

    The output is:

    one
    three
    five  
    

    If by "determine which of the sub-patterns matched" you mean specifying which sub-match the search should return, then the answer is yes: std::regex_token_iterator lets you do that.

    Ex: (iterate over the second sub-match of each match)

    std::string str( "How are you today ? I am fine . How about you ?" );
    std::regex rx( "(\\w+) (\\w+) ?" ); // two words separated by a space, plus an optional trailing space
    
    std::regex_token_iterator< std::string::const_iterator > first( str.begin(), str.end(), rx, 2 ), last;
    
    while( first != last ){
        std::cout << first->str() << '\n';
        ++first;
    } 
    

    The last argument in ( str.begin(), str.end(), rx, 2 ) is 2, which means you want only the second sub-match of each match. So the output is:

    are
    today
    am
    about
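The same idea can be wrapped so that the group number is a parameter. This is a minimal sketch; the helper name collect_group is hypothetical, and it uses std::sregex_token_iterator, the standard alias for the std::string::const_iterator specialization:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects sub-match number `group` from every
// non-overlapping match of rx in str.
std::vector< std::string > collect_group( const std::string& str, const std::regex& rx, int group )
{
    std::vector< std::string > out;
    std::sregex_token_iterator first( str.begin(), str.end(), rx, group ), last;

    for( ; first != last; ++first )
        out.push_back( first->str() );

    return out;
}
```

Passing 0 as the group selects the whole match, and -1 selects the text between matches, which makes the same iterator usable for splitting.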
    