I'm using the C++ tr1::regex with the ECMA regex grammar. What I'm trying to do is parse a header and return values associated with each item in the header.
Header:
No, there is not.
The problem is that the desired solution insists on using capture groups, and a repeated capture group keeps only its last match. C++ provides regex_token_iterator
to handle this in a better way (C++11 example):
#include <iostream>
#include <string>
#include <regex>
using namespace std;

int main() {
    // Optional "-Numbers" prefix at the start of the line, then one number per match.
    std::regex e(R"((?:^-Numbers)?\s*(\d+))");
    string input;
    while (getline(cin, input)) {
        // Submatch 1 selects the (\d+) group; match_continuous makes each
        // match start exactly where the previous one ended.
        std::regex_token_iterator<std::string::iterator> a{
            input.begin(), input.end(),
            e, 1,
            regex_constants::match_continuous
        };
        std::regex_token_iterator<std::string::iterator> end;
        while (a != end) {
            cout << *a << " - ";
            ++a;
        }
        cout << '\n';
    }
    return 0;
}
https://wandbox.org/permlink/TzVEqykXP1eYdo1c
I was about to ask this exact same question, and I kind of found a solution.
Let's say you have an arbitrary number of words you want to capture.
"there are four lights"
and
"captain picard is the bomb"
You might think that the solution is:
/((\w+)\s?)+/
But this gives you only the match for the whole input string plus the last value captured by each group, because every repetition overwrites the previous capture.
What you can do is use the "g" switch.
So, an example in Perl:
use strict;
use warnings;
my $str1 = "there are four lights";
my $str2 = "captain picard is the bomb";
foreach ( $str1, $str2 ) {
    my @a = ( $_ =~ /(\w+)\s?/g );
    print "captured groups are: " . join( "|", @a ) . "\n";
}
Output is:
captured groups are: there|are|four|lights
captured groups are: captain|picard|is|the|bomb
So, there is a solution if your language of choice supports an equivalent of "g" (and I guess most do...).
Hope this helps someone who was in the same position as me!
S