C++ vs .NET regex performance

一整个雨季 2021-01-14 01:23

Prompted by a comment from Konrad Rudolph on a related question, I wrote the following program to benchmark regular expression performance in F#:

open System         


        
1 answer
  • 2021-01-14 02:06

    These benchmarks aren't really comparable -- C++ and .NET implement completely different regular expression languages (ECMAScript vs. Perl) and are powered by completely different regular expression engines. .NET (to my understanding) is benefiting from the GRETA project here, which produced an absolutely fantastic regular expression engine that has been tuned for years. The C++ std::regex, in comparison, is a recent addition (at least on MSVC++, which I'm assuming you're using given the nonstandard types __int64 and friends).
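
To make the "different languages" point concrete, here is a minimal sketch that checks whether a pattern is accepted by std::regex under its default (modified ECMAScript) grammar. The helper name `compiles_as_ecmascript` is made up for illustration. Lookbehind, which Perl-style engines such as .NET's support, is not part of that grammar, so a pattern that is valid on the .NET side can be rejected outright on the C++ side.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original post): returns true if the
// pattern is accepted by std::regex's default ECMAScript grammar.
bool compiles_as_ecmascript(const std::string& pattern) {
    try {
        // std::regex::ECMAScript is the default; spelled out for clarity.
        std::regex re(pattern, std::regex::ECMAScript);
        (void)re;  // only construction matters here
        return true;
    } catch (const std::regex_error&) {
        // Thrown when the pattern is not valid in the selected grammar.
        return false;
    }
}
```

Calling `compiles_as_ecmascript("a(?=b)")` (lookahead) should succeed, while `compiles_as_ecmascript("(?<=a)b")` (lookbehind) should report failure, so the same benchmark pattern may not even be expressible identically in both engines.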

    You can see how GRETA did vs. boost::regex, the more mature library that std::regex was modeled on, here (though that test was done on Visual Studio 2003).

    You should also keep in mind that regex performance is highly dependent on your source string and on your regex. Some regex engines spend a lot of time parsing the regex so they can go faster through the source text; that tradeoff makes sense only if you are scanning lots of text. Other engines scan quickly but pay a relatively high cost for each match they produce (so the number of matches has an effect). There are huge numbers of tradeoffs here; a single pattern/input pair is not going to tell the whole story.
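
One way to see those tradeoffs is to time pattern construction separately from scanning, and to record how many matches were produced. The sketch below does that for std::regex; `RegexTiming`, `time_regex` and `repeat_text` are illustrative names, not part of the original benchmark, and a serious measurement would repeat each run many times and vary both the pattern and the input.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical result type: construction and scanning are timed separately.
struct RegexTiming {
    std::chrono::nanoseconds compile_time;
    std::chrono::nanoseconds scan_time;
    std::size_t match_count;
};

// Builds a test input by repeating `unit` `count` times.
std::string repeat_text(const std::string& unit, std::size_t count) {
    assert(!unit.empty());
    std::string out;
    out.reserve(unit.size() * count);
    for (std::size_t i = 0; i < count; ++i) out += unit;
    return out;
}

// Times one construction of the regex and one full scan of `text`.
RegexTiming time_regex(const std::string& pattern, const std::string& text) {
    using clock = std::chrono::steady_clock;
    using std::chrono::duration_cast;
    using std::chrono::nanoseconds;

    const auto t0 = clock::now();
    const std::regex re(pattern);  // parse/compile cost is paid here
    const auto t1 = clock::now();

    // Walk every non-overlapping match; the cost depends on the engine,
    // the pattern, the text, and the number of matches.
    const std::sregex_iterator first(text.begin(), text.end(), re);
    const std::sregex_iterator last;
    const auto matches = static_cast<std::size_t>(std::distance(first, last));
    const auto t2 = clock::now();

    return {duration_cast<nanoseconds>(t1 - t0),
            duration_cast<nanoseconds>(t2 - t1),
            matches};
}
```

For example, `time_regex("[0-9]+", repeat_text("abc 123 ", 1000))` reports the construction time, the scan time, and the match count for a match-heavy input; swapping in a pattern that never matches, or a much shorter text, shifts the balance between the two timings.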

    So to answer your question more explicitly: this kind of variation is normal across regex engines, be they compiled or interpreted. Looking at boost's tests above, the fastest and slowest implementations often differed by a factor of hundreds -- 17x isn't all that strange depending on your use case.
