Question
I am trying to find all floating-point numbers (possibly in exponential form, with or without a -/+ prefix). For example, the following are valid formats: -1.2 +1.2 .2 -3 3E4 -3e5 e-5
The source text contains several numbers separated by spaces or commas. I need to use a regular expression to:
- tell whether there is any invalid number (e.g. in 1.2 3.2 s3, s3 is not a valid one)
- list every single valid number
I have no idea how to get (1) done, but for (2) I am using boost::regex and the following code:
// Wide strings need the L prefix on their literals.
wstring strre(L"[-+]?\\b[0-9]*\\.?[0-9]+(?:[eE][-+]?[0-9]+)?\\b");
wstring src(L"1.2 -3.4 3.2 3 2 1e-3 3e3");
boost::wregex regexp(strre);
boost::match_results<std::wstring::const_iterator> what;
// One regex_search call finds at most one match; what[i] holds its sub-expressions.
regex_search(src, what, regexp, boost::match_continuous);
wcout << "RE: " << strre << endl << endl;
wcout << "SOURCE: [" << src << "]" << endl;
for (size_t i = 0; i < what.size(); i++)
    wcout << "OUTPUT: [" << wstring(what[i].first, what[i].second) << "]" << endl;
But this code only shows me the first number (1.2). I also tried boost::match_all and boost::match_default, with the same result.
ADDITIONAL INFO: Hi all, let's not worry about the double-backslash issue; it is expressed correctly in my code (in my test code, I read the string from text rather than from a string literal). Anyway, I modified the code as follows:
wstring strre(L"[-+]?\\b[0-9]*\\.?[0-9]+(?:[eE][-+]?[0-9]+)?\\b");
boost::wregex regexp(strre);
boost::match_results<std::wstring::const_iterator> what;
wcout << "RE: " << strre << endl << endl;
// Stop as soon as no further match is found, so what[0] is never read unmatched.
while (regex_search(src, what, regexp, boost::match_default))
{
    wcout << "SOURCE: [" << src << "]" << endl;
    wcout << "OUTPUT: [" << wstring(what[0].first, what[0].second) << "]" << endl;
    // Continue with the text after the current match.
    src = what.suffix().str();
}
Now it correctly shows every single number, but I have to run regex_search several times because it gives only one number at a time. I just don't understand why regex_search won't give me all the results. Is there any way to run the search once and get all the results back?
Answer 1:
You normally have to double-escape backslashes in a C++ string literal. So your "\." turns into just . and you would need it to be "\\.", etc. Similarly, your "\b" becomes not a word boundary but rather a literal backspace! Fix it the same way: "\\b".
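To make the escaping point concrete, here is a minimal sketch that compares the two spellings. The variable names are made up for illustration.

```cpp
#include <string>

// "\b" is a single character: the backspace control code (0x08).
const std::string single_backslash_b = "\b";

// "\\b" is two characters: a backslash followed by 'b', which is what
// the regex engine needs to see in order to read it as a word boundary.
const std::string double_backslash_b = "\\b";
```

Checking `size()` on each string shows the difference: the first holds one character, the second holds two.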
Also, where’s the doc for the regex class you pass strre to? Are you sure it understands the regex dialect you are using?
Apparently the new C++ standard (C++11) has raw string literals, written R"(...)". These work like `backticked` strings in Go, or like 'single-quoted' strings or /patterns/ in Perl. See this answer for details.
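As a rough illustration, assuming a C++11 compiler, the pattern from the question can be written both ways. The two variable names are hypothetical; the strings they hold are identical.

```cpp
#include <string>

// Ordinary wide literal: every regex backslash has to be doubled.
const std::wstring escaped_pattern =
    L"[-+]?\\b[0-9]*\\.?[0-9]+(?:[eE][-+]?[0-9]+)?\\b";

// Raw wide literal (LR"(...)"): backslashes are taken verbatim.
const std::wstring raw_pattern =
    LR"([-+]?\b[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?\b)";
```

The raw form is easier to compare against a pattern copied from documentation, since nothing needs re-escaping.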
EDIT
Here’s a somewhat fancier pattern for detecting floating-point literals, but which uses no backslashes:
[+-]?(?=[.]?[0-9])[0-9]*(?:[.][0-9]*)?(?:[Ee][+-]?[0-9]+)?
Note that it does require lookaheads, which EREs don’t support. You should probably use the PCRE library, which does. Broken down, that’s
[+-]? # optional leading sign
(?=[.]?[0-9]) # lookahead for a digit, maybe with an intervening dot
[0-9]* # maybe some digits
(?:[.][0-9]*)? # maybe a (dot plus maybe some digits)
(?:[Ee][+-]?[0-9]+)? # maybe an exponent, which may have a sign and must have digits
Pattern courtesy of Perl’s Regexp::Common library.
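For the two tasks in the question, one possible approach is a regex iterator to list every match in a single pass, plus a whole-token match to flag invalid entries. The sketch below uses std::regex from C++11 instead of Boost so that it is self-contained; Boost.Regex provides similarly named iterator types, but that mapping is an assumption here rather than something tested. The names kFloat, find_numbers and find_invalid_tokens are made up for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// The lookahead-based pattern from above, as a raw wide string literal.
static const std::wregex kFloat(
    LR"([+-]?(?=[.]?[0-9])[0-9]*(?:[.][0-9]*)?(?:[Ee][+-]?[0-9]+)?)");

// Collect every substring of src that matches the pattern, left to right.
std::vector<std::wstring> find_numbers(const std::wstring& src) {
    std::vector<std::wstring> found;
    std::wsregex_iterator it(src.begin(), src.end(), kFloat), end;
    for (; it != end; ++it)
        found.push_back(it->str());
    return found;
}

// Split src on runs of spaces/commas and return the tokens that are not
// numbers in their entirety (regex_match must consume the whole token).
std::vector<std::wstring> find_invalid_tokens(const std::wstring& src) {
    static const std::wregex separators(L"[ ,]+");
    std::vector<std::wstring> invalid;
    std::wsregex_token_iterator it(src.begin(), src.end(), separators, -1), end;
    for (; it != end; ++it) {
        const std::wstring token = it->str();
        if (!token.empty() && !std::regex_match(token, kFloat))
            invalid.push_back(token);
    }
    return invalid;
}
```

Searching alone is not enough for validation: on the input 1.2 3.2 s3, find_numbers still reports 3 from inside s3, whereas find_invalid_tokens reports s3 because the token as a whole does not match.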
Source: https://stackoverflow.com/questions/9853950/this-regular-expression-fail-to-parse-all-valid-floating-numbers