Question
I am writing the loading procedure for my application and it involves reading data from a file and creating an appropriate object with appropriate properties.
The file consists of sequential entries (separated by a newline) in the following format:
=== OBJECT TYPE ===
<Property 1>: Value1
<Property 2>: Value2
=== END OBJECT TYPE ===
Where the values are often strings which may consist of arbitrary characters, new-lines, etc.
I want to create a std::regex which can match this format and allow me to use std::regex_iterator to read each of the objects from the file in turn.
However, I am having trouble creating a regex which matches this type of format. I have looked at the ECMAScript syntax and created my regex in the following way, but it does not match the string in my test application:
const std::regex regexTest( "=== ([^=]+) ===\\n([.\\n]*)\\n=== END \\1 ===" );
And when using this in the following test application, it fails to match the regex to the string:
#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string testString = "=== TEST ===\n<Random Example>:This is a =test=\n<Another Example>:Another Test||\n=== END TEST ===";
    std::cout << testString << std::endl;
    // This is the pattern that fails to match.
    const std::regex regexTest( "=== ([^=]+) ===\\n([.\\n]*)\\n=== END \\1 ===" );
    std::smatch regexMatch;
    if( std::regex_match( testString, regexMatch, regexTest ) )
    {
        std::cout << "Prefix: \"" << regexMatch[1] << "\"" << std::endl;
        std::cout << "Main Body: \"" << regexMatch[2] << "\"" << std::endl;
    }
    return 0;
}
Answer 1:
Your problem is simpler than it looks. This:
const std::regex regexTest( "=== ([^=]+) ===\\n((?:.|\\n)*)\\n=== END \\1 ===" );
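worked perfectly on clang++/libc++. The original pattern fails because . is not a wildcard inside [] brackets in ECMAScript regexes: there it is a literal dot, so [.\n]* only matches runs of dots and newlines. Remember to use a while loop with regex_search instead of an if with regex_match if you want to look for more than one instance of the regex inside the string!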
Answer 2:
Try one of the following:
Lazy quantifiers:
=== (.+?) ===\\n([\\s\\S]*?)\\n=== END \\1 ===
Negated classes and negative lookaheads:
=== ((?:[^ ]+| (?!===))+) ===\\n((?:[^\\n]+|\\n(?!=== END \\1 ===))*)
POSIX:
=== (.+?) ===\n((.|\n)*?)\n=== END [^=]+? ===
Source: https://stackoverflow.com/questions/17133296/ecmascript-regex-for-a-multilined-string