std::regex, to match begin/end of string

In JS regular expressions, the symbols ^ and $ designate the start and end of the string. And only with the /m modifier (multiline

4 Answers

隐瞒了意图╮

2020-12-31 06:22
The following code snippet matches email addresses that start with one or more letters [a-z], followed by zero or more dots, then zero or more letters a-z, and end with "@gmail.com". Matching is case-insensitive because of the icase flag. I tested it.
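```
#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    // ^ and $ anchor the pattern to the start and end of the input
    string reg = "^[a-z]+\\.*[a-z]*@gmail\\.com$";
    regex reg1(reg, regex_constants::icase);
    string email;
    cin >> email;
    if (regex_search(email, reg1))
        cout << "match\n";  // illustrative body; the original snippet stopped at the if
}
```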
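
As a hedged alternative sketch: std::regex_match succeeds only when the pattern covers the entire input, so it gives start-to-end anchoring without depending on how ^ and $ are interpreted. The helper name is_gmail_address is hypothetical; the pattern is the one from the snippet above with the anchors dropped.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the whole string is letters, optional dots,
// optional letters, then "@gmail.com" (case-insensitive).
bool is_gmail_address(const std::string& email) {
    static const std::regex pattern("[a-z]+\\.*[a-z]*@gmail\\.com",
                                    std::regex_constants::icase);
    // regex_match requires the pattern to match the entire input,
    // so no ^ or $ is needed here.
    return std::regex_match(email, pattern);
}
```

Calling is_gmail_address("john.doe@gmail.com") should return true, while a string with extra text before or after the address should be rejected.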
梦毁少年i

2020-12-31 06:23

You can emulate Perl/Python/PCRE \A, which matches at beginning of string but not after a newline, with the JavaScript regex ^(?<!(.|\n)), which translates to English as "match the beginning of a line which has no preceding character".

You can emulate Perl/Python/PCRE \z, which matches only at end-of-string, using (?!(.|\n))$. To get the effect of \Z, which matches only at end-of-string but allows a single newline just before that end-of-string, just add an optional newline: \n?(?!(.|\n))$.
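
A minimal C++ sketch of the \z emulation, for illustration only. It uses just the negative lookahead (?!(.|\n)), which the ECMAScript grammar of std::regex accepts; the lookbehind form used for \A is not available there. The helper name trailing_digits is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the run of digits that ends exactly at the end
// of the input, or an empty string if there is none.
std::string trailing_digits(const std::string& text) {
    // (?!(.|\n)) succeeds only where no character at all follows, that is,
    // at the very end of the input, however $ happens to be interpreted.
    static const std::regex pattern("\\d+(?!(.|\\n))");
    std::smatch m;
    return std::regex_search(text, m, pattern) ? m.str() : std::string();
}
```

For "1\n2\n3" this yields "3". For "1\n2\n" it yields an empty string, because the last digit is followed by a newline.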

悲哀的现实

2020-12-31 06:39
TL;DR
- MSVC: the ^ and $ already match start and end of lines
- C++17: use std::regex_constants::multiline option
- Other compilers only match start of string with ^ and end of string with $, with no possibility of redefining that behavior.

In all std::regex implementations other than MSVC and before C++17, ^ and $ match the beginning and end of the string, not of a line. For example, the regex ^\d+$ does not find any match in "1\n2\n3". When you add alternations (see below), there are 3 matches.

However, in MSVC and C++17, the ^ and $ may match start/end of the line.

C++17

Use the std::regex_constants::multiline option.
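
A short sketch of how the flag might be used, assuming a standard library that actually implements the C++17 multiline option (support varies between implementations, so treat this as illustrative). The helper name digit_lines is hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every line that consists only of digits.
std::vector<std::string> digit_lines(const std::string& text) {
    // multiline (C++17) asks ^ and $ to match at line boundaries as well.
    const std::regex pattern("^\\d+$", std::regex_constants::ECMAScript |
                                           std::regex_constants::multiline);
    std::vector<std::string> lines;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it) {
        lines.push_back(it->str());
    }
    return lines;
}
```

With the input "1\n2\n3", the helper is expected to return the three lines "1", "2" and "3".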

MSVC compiler

In a C++ project in Visual Studio, the following
```
#include <iostream>
#include <regex>
#include <string>

int main() {
    std::regex r("^\\d+$");
    std::string st("1\n2\n3");
    // Iterate over every match of r in st
    for (std::sregex_iterator i = std::sregex_iterator(st.begin(), st.end(), r);
        i != std::sregex_iterator();
        ++i)
    {
        std::smatch m = *i;
        std::cout << "Match value: " << m.str() << " at Position " << m.position() << '\n';
    }
}
```
will output
```
Match value: 1 at Position 0
Match value: 2 at Position 2
Match value: 3 at Position 4
```
Workarounds that work across C++ compilers

There is no universal option in std::regex to make the anchors match start/end of the line across all compilers. You need to emulate it with alternations:
```
^ -> (^|\n)
$ -> (?=\n|$)
```
Note that $ can be "emulated" fully with (?=\n|$) (where you may add more line terminator symbols or symbol sequences, like (?=\r?\n|\r|$)), but with ^, you cannot find a 100% workaround.

Since there is no lookbehind support, you might have to adjust other parts of your regex pattern because of (^|\n), for example by using capturing groups more often than you would with lookbehind support.
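
The alternation workaround can be sketched as follows. This is a minimal illustration and not a complete line-anchoring solution; the helper name digit_lines_portable is hypothetical. Because (^|\n) may consume the preceding newline, the wanted text is read from a capturing group instead of from the whole match.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every line that consists only of digits,
// without relying on ^ and $ matching at line boundaries.
std::vector<std::string> digit_lines_portable(const std::string& text) {
    // (^|\n) stands in for a line-start ^, (?=\n|$) for a line-end $.
    // Group 1 may hold the preceding newline, so the digits are in group 2.
    static const std::regex pattern("(^|\\n)(\\d+)(?=\\n|$)");
    std::vector<std::string> lines;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it) {
        lines.push_back((*it)[2].str());
    }
    return lines;
}
```

Note that the match position reported by the iterator then points at the newline for every line but the first, which is one of the adjustments the missing lookbehind forces on the surrounding code.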
粉色の甜心

2020-12-31 06:39
By default, ECMAScript mode already treats ^ as both beginning-of-input and beginning-of-line, and $ as both end-of-input and end-of-line. There is no way to make them match only beginning or end-of-input, but it is possible to make them match only beginning or end-of-line:

When invoking std::regex_match, std::regex_search, or std::regex_replace, there is an argument of type std::regex_constants::match_flag_type that defaults to std::regex_constants::match_default.
- To specify that ^ matches only beginning-of-line, specify std::regex_constants::match_not_bol
- To specify that $ matches only end-of-line, specify std::regex_constants::match_not_eol
- As these values are bitflags, to specify both, simply bitwise-or them together (std::regex_constants::match_not_bol | std::regex_constants::match_not_eol)
- Note that beginning-of-input can be implied without using ^ and regardless of the presence of std::regex_constants::match_not_bol by specifying std::regex_constants::match_continuous
This is explained well in the ECMAScript grammar documentation on cppreference.com, which I highly recommend over cplusplus.com in general.
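
A small sketch of passing these match flags, under the assumption that only the uncontroversial part of their behavior is relied on: match_not_bol stops ^ from matching at the first position of the input, match_not_eol stops $ from matching at the last, and match_continuous requires the match to begin at the first character. Both helper names are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when re matches starting exactly at the first
// character of text, without needing ^ in the pattern.
bool matches_at_start(const std::string& text, const std::regex& re) {
    return std::regex_search(text, re, std::regex_constants::match_continuous);
}

// Hypothetical helper: searches text while ^ may not match at the very
// beginning of the input and $ may not match at the very end.
bool search_without_input_anchors(const std::string& text, const std::regex& re) {
    const auto flags = std::regex_constants::match_not_bol |
                       std::regex_constants::match_not_eol;
    return std::regex_search(text, re, flags);
}
```

For example, search_without_input_anchors("abc", std::regex("^a")) is expected to return false, because the only place ^ could match is the start of the input.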

Caveat: I've tested with MSVC, Clang + libc++, and Clang + libstdc++, and only MSVC has the correct behavior at present.