Visual Studio regex_iterator Bug?

眉间皱痕 submitted on 2019-12-10 12:54:51

Question


I'm on Visual Studio 2013 and I'm seeing what I think is a bug. Can someone confirm?

string foo{ "A\nB\rC\n\r" };
vector<string> bar;

for (sregex_iterator i(foo.cbegin(), foo.cend(), regex("(.*)[\n\r]{1,2}")); i != sregex_iterator(); ++i){
    bar.push_back(i->operator[](1).str());
}

This code hits a Debug Assertion in the Visual Studio regex library:

regex_iterator orphaned

If I define the regex outside the for-loop it's fine:

string foo{ "A\nB\rC\n\r" };
vector<string> bar;
regex bug("(.*)[\n\r]{1,2}");

for (sregex_iterator i(foo.cbegin(), foo.cend(), bug); i != sregex_iterator(); ++i){
    bar.push_back(i->operator[](1).str());
}

Alternatively this works fine in a transform as shown in this question:

string foo{ "A\nB\rC\n\r" };
vector<string> bar;

// This puts {"A", "B", "C"} into bar
transform(sregex_iterator(foo.cbegin(), foo.cend(), regex("(.*)[\n\r]{1,2}")), sregex_iterator(), back_inserter(bar), [](const smatch& i){ return i[1].str(); });

Can someone confirm this is a bug?


Answer 1:


In C++11 you are allowed to bind a temporary regex to const regex&. The iterator stores a pointer to that regex, so using the iterator after the temporary has been destroyed is undefined behavior. In your for-loop the temporary is destroyed at the end of the init-statement, before the first comparison or dereference. This is a defect in the specification rather than an error in the implementation, although Visual Studio catches it with a debug assert.

sregex_iterator i(foo.cbegin(), foo.cend(), regex("(.*)[\n\r]{1,2}"))
                                            ^^^^^
                                            temporary

The following deleted overload was added in C++14 to prevent this case (from cppreference):

regex_iterator(BidirIt, BidirIt,
           const regex_type&&,
           std::regex_constants::match_flag_type =
           std::regex_constants::match_default) = delete;       (since C++14)

and it says:

The overload 2 is not allowed to be called with a temporary regex, since the returned iterator would be immediately invalidated.
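The effect of that deleted overload can be observed at compile time. The following is a minimal sketch, assuming a standard library that implements the LWG 2332 resolution; the names StrIt, accepts_lvalue_regex and accepts_temporary_regex are made up for illustration and appear nowhere in the standard or in the question.

```cpp
#include <regex>
#include <string>
#include <type_traits>

// Hypothetical alias for the iterator type used in the question.
using StrIt = std::string::const_iterator;

// An lvalue regex selects the const regex_type& constructor, which is usable.
constexpr bool accepts_lvalue_regex =
    std::is_constructible<std::sregex_iterator, StrIt, StrIt, const std::regex&>::value;

// An rvalue (temporary) regex prefers the const regex_type&& overload.
// Where that overload is deleted, construction is ill-formed and this is false.
constexpr bool accepts_temporary_regex =
    std::is_constructible<std::sregex_iterator, StrIt, StrIt, std::regex>::value;

#if __cplusplus >= 201402L
// Only checked when the compiler reports C++14 or later.
static_assert(accepts_lvalue_regex, "a named regex should be accepted");
static_assert(!accepts_temporary_regex, "a temporary regex should be rejected");
#endif
```

On a pure C++11 library without the fix, accepts_temporary_regex would be expected to be true, which is exactly the situation that lets the first example compile.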

So this is not a Visual Studio bug: it implements the C++11 standard, and the problem was only addressed later through a defect report. Both clang and gcc with -std=c++14 or later reject your first and third examples with a compile error. Visual Studio only started supporting some C++14 features in VS 2015:

[...]and initial support for certain C++14 features.[...]

We can see that LWG defect 2332: regex_iterator/regex_token_iterator should forbid temporary regexes deals with this:

Users can write "for(sregex_iterator i(s.begin(), s.end(), regex("meow")), end; i != end; ++i)", binding a temporary regex to const regex& and storing a pointer to it. This will compile silently, triggering undefined behavior at runtime. We now have the technology to prevent this from compiling, like how reference_wrapper refuses to bind to temporaries.

As T.C. points out, the last example you show is actually OK: even though you are binding a temporary, it lives until the end of the full-expression, which covers the entire transform call.
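For the loop form, the safe pattern is the one from your second snippet: give the regex a name so that it outlives every iterator built from it. A minimal sketch, where collect_lines and line_re are hypothetical names chosen for this example:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns capture group 1 of every match in text.
std::vector<std::string> collect_lines(const std::string& text) {
    // Named regex object: it stays alive for the whole loop, so the
    // pointer stored inside the iterator never dangles.
    const std::regex line_re("(.*)[\n\r]{1,2}");

    std::vector<std::string> lines;
    for (std::sregex_iterator it(text.cbegin(), text.cend(), line_re), end;
         it != end; ++it) {
        lines.push_back((*it)[1].str());
    }
    return lines;
}
```

Calling collect_lines on the string from the question, "A\nB\rC\n\r", yields {"A", "B", "C"}, the same result the transform version produces.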




Answer 2:


No, this is not a bug. See LWG 2329 regex_match()/regex_search() with match_results should forbid temporary strings, and the closely related LWG 2332, which covers temporary regexes. This construct exhibits undefined behavior since it binds a temporary regex to const regex& and stores a pointer to it.

Also see C++14 STL Features, Fixes, And Breaking Changes In Visual Studio 14 CTP1 where this is listed as a fix.



Source: https://stackoverflow.com/questions/29895747/visual-studio-regex-iterator-bug
