Is there a way to have a capture repeat an arbitrary number of times in a regex?

深忆病人 2021-01-21 08:00

I'm using the C++ tr1::regex with the ECMA regex grammar. What I'm trying to do is parse a header and return the values associated with each item in the header.

Header:

3 Answers
  • 2021-01-21 08:05

    No, there is not.

  • 2021-01-21 08:06

    The problem is that the desired solution insists on using capture groups. C++ provides std::regex_token_iterator, which handles this case more cleanly (C++11 example):

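    #include <iostream>
    #include <regex>
    #include <string>
    
    int main() {
        // Optionally skip a leading "-Numbers" tag, then capture one run of digits.
        std::regex e(R"((?:^-Numbers)?\s*(\d+))");
    
        std::string input;
    
        while (std::getline(std::cin, input)) {
            // Submatch 1 selects the captured digits; match_continuous makes each
            // match start exactly where the previous one ended.
            std::regex_token_iterator<std::string::iterator> a{
                input.begin(), input.end(),
                e, 1,
                std::regex_constants::match_continuous
            };
    
            // A default-constructed iterator marks the end of the sequence.
            std::regex_token_iterator<std::string::iterator> end;
            while (a != end) {
                std::cout << *a << " - ";
                ++a;
            }
            std::cout << '\n';
        }
    
        return 0;
    }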
    

    https://wandbox.org/permlink/TzVEqykXP1eYdo1c

  • 2021-01-21 08:31

    I was about to ask this exact same question, and I kind of found a solution.

    Let's say you have an arbitrary number of words you want to capture.

    "there are four lights"

    and

    "captain picard is the bomb"

    You might think that the solution is:

    /((\w+)\s?)+/
    

    But this matches the whole input string and keeps only the last repetition of each capture group, so the earlier words are lost.
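
    Since the question is about C++, here is a minimal sketch of that behaviour, assuming a C++11 std::regex with the default ECMAScript grammar. The helper name last_inner_capture is hypothetical and only serves the illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns what the inner group (\w+) holds after a
// full match of ((\w+)\s?)+ against `text`, or "" if there is no match.
std::string last_inner_capture(const std::string& text) {
    static const std::regex re(R"(((\w+)\s?)+)");
    std::smatch m;
    if (!std::regex_match(text, m, re)) {
        return "";
    }
    // m[0] is the whole match, m[1] the outer group, m[2] the inner group.
    // Each repetition overwrites the previous capture, so only the word
    // from the final repetition is left in m[2].
    return m[2].str();
}
```

    Calling last_inner_capture("there are four lights") should return only "lights"; the three earlier words are not available through the match object.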

    What you can do is use the "g" switch.

    So, an example in Perl:

    use strict;
    use warnings;
    
    my $str1 = "there are four lights";
    my $str2 = "captain picard is the bomb";
    
    foreach ( $str1, $str2 ) {
        my @a = ( $_ =~ /(\w+)\s?/g );
        print "captured groups are: " . join( "|", @a ) . "\n";
    }
    

    Output is:

    captured groups are: there|are|four|lights
    captured groups are: captain|picard|is|the|bomb
    

    So, there is a solution if your language of choice supports an equivalent of "g" (and I guess most do...).
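
    For the C++ case in the question, the closest counterpart to "g" is probably std::sregex_iterator, which walks over every match in turn. This is a sketch under the assumption that C++11 std::regex is available; collect_words is a hypothetical helper name, and the pattern is the one from the Perl example.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects capture group 1 from every successive
// match of (\w+)\s? in `text`, similar to Perl's /g in list context.
std::vector<std::string> collect_words(const std::string& text) {
    static const std::regex re(R"((\w+)\s?)");
    std::vector<std::string> words;
    // A default-constructed sregex_iterator acts as the end marker.
    for (std::sregex_iterator it(text.begin(), text.end(), re), end;
         it != end; ++it) {
        words.push_back((*it)[1].str());
    }
    return words;
}
```

    With the two sample strings above, collect_words should yield the same word lists as the Perl output. The string passed in has to outlive the loop, because the iterator refers to its characters rather than copying them.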

    Hope this helps someone who was in the same position as me!

    S
