In JS regular expressions, the symbols ^ and $ designate the start and end of the string. Only with the /m modifier (multiline mode) do they match the start and end of each line.
The following code snippet matches email addresses that start with one or more letters a-z, followed by zero or more dots, then zero or more letters a-z, and end with "@gmail.com". The match is case-insensitive. I tested it.
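#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main() {
    string reg = "^[a-z]+\\.*[a-z]*@gmail\\.com$";
    regex reg1(reg, regex_constants::icase);  // icase: ignore letter case
    string email;
    cin >> email;
    if (regex_search(email, reg1))
        cout << "match\n";  // placeholder action for a valid address
}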
You can emulate Perl/Python/PCRE \A, which matches at the beginning of the string but not after a newline, with the JavaScript regex ^(?<!(.|\n)), which translates to English as "match the beginning of a line which has no preceding character".
You can emulate Perl/Python/PCRE \z, which matches only at end-of-string, using (?!(.|\n))$. To get the effect of \Z, which matches only at end-of-string but allows a single newline just before that end-of-string, just add an optional newline: \n?(?!(.|\n))$.
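The same lookahead idea carries over to C++, since the ECMAScript grammar used by std::regex supports lookahead. The following is only a minimal sketch under some assumptions: the helper name find_digits_at_absolute_end is hypothetical, [\s\S] is swapped in for (.|\n) so that \r is covered too, and the $ is dropped because the lookahead alone already pins the match to the very end of the input.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the position of a run of digits that ends at
// the absolute end of `text`, or -1 if there is no such run.
long find_digits_at_absolute_end(const std::string& text) {
    // (?![\s\S]) succeeds only when no character at all follows,
    // which is the \z behavior regardless of how $ treats newlines.
    static const std::regex re(R"(\d+(?![\s\S]))");
    std::smatch m;
    if (std::regex_search(text, m, re)) {
        return static_cast<long>(m.position(0));
    }
    return -1;
}
```

Called on "1\n2\n3", this should report only the final "3"; a trailing newline after the last digit prevents a match.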
TL;DR
MSVC: ^ and $ already match the start and end of lines.
C++17: use the std::regex_constants::multiline option.
Other compilers: ^ matches only the start of the string and $ only the end of the string, with no possibility to redefine their behavior.
In all std::regex implementations other than MSVC, and before C++17, ^ and $ match the beginning and end of the string, not of a line. For example, the regex ^\d+$ finds no match in "1\n2\n3". When you add alternations (see below), there are 3 matches.
However, in MSVC and C++17, ^ and $ may match the start/end of a line.
C++17
Use the std::regex_constants::multiline option.
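As a minimal sketch of that option, assuming a standard library that actually implements the C++17 flag (recent libstdc++ and libc++ do; older releases may not), the helper below collects every line consisting only of digits. The name match_lines is hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every match of ^\d+$ with the C++17
// multiline flag set, so ^ and $ also match at line boundaries.
std::vector<std::string> match_lines(const std::string& text) {
    const std::regex re("^\\d+$",
                        std::regex_constants::ECMAScript |
                        std::regex_constants::multiline);
    std::vector<std::string> out;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end;
         it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

On an implementation that supports the flag, match_lines("1\n2\n3") should yield the three lines "1", "2" and "3".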
MSVC compiler
In a C++ project in Visual Studio, the following
std::regex r("^\\d+$");
std::string st("1\n2\n3");
for (std::sregex_iterator i = std::sregex_iterator(st.begin(), st.end(), r);
     i != std::sregex_iterator();
     ++i)
{
    std::smatch m = *i;
    std::cout << "Match value: " << m.str() << " at Position " << m.position() << '\n';
}
will output
Match value: 1 at Position 0
Match value: 2 at Position 2
Match value: 3 at Position 4
Workarounds that work across C++ compilers
There is no universal option in std::regex to make the anchors match start/end of the line across all compilers. You need to emulate it with alternations:
^ -> (^|\n)
$ -> (?=\n|$)
Note that $ can be "emulated" fully with (?=\n|$) (where you may add more line terminator symbols or symbol sequences, like (?=\r?\n|\r|$)), but for ^ there is no 100% workaround.
Since there is no lookbehind support, (^|\n) consumes the newline it matches, so you might have to adjust other parts of your regex pattern, for example by using capturing groups more often than you would with lookbehind support.
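The two substitutions can be sketched as follows. This is an illustrative example rather than a definitive implementation; the helper name digits_per_line is hypothetical, and only \n is treated as a line terminator.

```cpp
#include <regex>
#include <string>
#include <vector>

// Emulates line anchors without relying on any multiline support:
//   ^ -> (^|\n)    consumes the preceding newline, hence the extra group
//   $ -> (?=\n|$)  lookahead, consumes nothing
std::vector<std::string> digits_per_line(const std::string& text) {
    static const std::regex re("(^|\\n)(\\d+)(?=\\n|$)");
    std::vector<std::string> out;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end;
         it != end; ++it) {
        out.push_back((*it)[2].str());  // group 2 holds the digits only
    }
    return out;
}
```

Because (^|\n) is part of the overall match, the wanted text has to be read from capture group 2 rather than from the whole match, which is the kind of pattern adjustment described above.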
By default, ECMAScript mode already treats ^ as both beginning-of-input and beginning-of-line, and $ as both end-of-input and end-of-line. There is no way to make them match only beginning or end-of-input, but it is possible to make them match only beginning or end-of-line:
When invoking std::regex_match, std::regex_search, or std::regex_replace, there is an argument of type std::regex_constants::match_flag_type that defaults to std::regex_constants::match_default.
To make ^ match only beginning-of-line, specify std::regex_constants::match_not_bol
To make $ match only end-of-line, specify std::regex_constants::match_not_eol
To specify both, combine the flags with bitwise OR (std::regex_constants::match_not_bol | std::regex_constants::match_not_eol)
Beginning-of-input can be required without using ^, and regardless of the presence of std::regex_constants::match_not_bol, by specifying std::regex_constants::match_continuous
This is explained well in the ECMAScript grammar documentation on cppreference.com, which I highly recommend over cplusplus.com in general.
Caveat: I've tested with MSVC, Clang + libc++, and Clang + libstdc++, and only MSVC has the correct behavior at present.
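The match flags described above can be tried out with a small sketch like the one below. The helper name found is hypothetical, and the cases in the usage note are limited to single-line inputs, where the standard library implementations agree; behavior on inputs containing newlines varies, as the caveat says.

```cpp
#include <regex>
#include <string>

namespace rc = std::regex_constants;

// Hypothetical helper: reports whether `pattern` is found in `text`
// when std::regex_search is called with the given match flags.
bool found(const std::string& text, const std::string& pattern,
           rc::match_flag_type flags = rc::match_default) {
    const std::regex re(pattern);
    return std::regex_search(text, re, flags);
}
```

For instance, found("abc", "^abc") succeeds, while found("abc", "^abc", rc::match_not_bol) fails because the first character is no longer treated as a beginning of line. Likewise, found("xabc", "abc", rc::match_continuous) fails because the match must begin at the first character, even though the pattern contains no ^.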