I want to loop through a set of strings. On each string I want to loop through a set of regular expressions to determine which expressions match the string I'm on. Howe
When you're after implementation details, and when the source code is available, the best way to tell is to simply look at it. :)
The short answer is: not exactly.
The optimization in the .NET regex implementation is a Boyer-Moore string search, used as the first phase of matching when possible. Take a look at the source code for the gory details.
From the code itself:
// The RegexBoyerMoore object precomputes the Boyer-Moore
// tables for fast string scanning. These tables allow
// you to scan for the first occurance of a string within
// a large body of text without examining every character.
// The performance of the heuristic depends on the actual
// string and the text being searched, but usually, the longer
// the string that is being searched for, the fewer characters
// need to be examined.
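To make the idea in that comment concrete, here is a minimal sketch of the simplified Horspool variant of Boyer-Moore. It is not the RegexBoyerMoore code from the framework; the class name `HorspoolSketch` and the method `IndexOf` are hypothetical and exist only for illustration. It shows why a longer search string usually means fewer characters examined: on a mismatch the window can jump by up to the full pattern length.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: a simplified Boyer-Moore-Horspool search,
// not the RegexBoyerMoore implementation from .NET.
public static class HorspoolSketch
{
    // Returns the index of the first occurrence of pattern in text, or -1.
    public static int IndexOf(string text, string pattern)
    {
        if (pattern.Length == 0) return 0;
        // A text shorter than the pattern can never contain it.
        if (text.Length < pattern.Length) return -1;

        int m = pattern.Length;

        // Precomputed shift table: for every pattern character except the
        // last one, the distance from its rightmost occurrence to the end.
        var shift = new Dictionary<char, int>();
        for (int i = 0; i < m - 1; i++)
            shift[pattern[i]] = m - 1 - i;

        int pos = 0;
        while (pos <= text.Length - m)
        {
            // Compare the current window against the pattern, right to left.
            int j = m - 1;
            while (j >= 0 && text[pos + j] == pattern[j]) j--;
            if (j < 0) return pos;

            // Shift based on the text character under the window's last
            // position; characters not in the pattern skip the whole window.
            char last = text[pos + m - 1];
            pos += shift.TryGetValue(last, out int s) ? s : m;
        }
        return -1;
    }

    public static void Main()
    {
        Console.WriteLine(IndexOf("the quick brown fox", "brown")); // prints 10
        Console.WriteLine(IndexOf("fox", "brown"));                 // prints -1
    }
}
```

In the first call only three window positions (0, 5 and 10) are ever inspected, because the characters found at the end of the first two windows do not occur in "brown".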
The Boyer-Moore phase requires an anchoring prefix, which is searched for by this function, whose comment says:
/*
* This is the one of the only two functions that should be called from outside.
* It takes a RegexTree and computes the set of chars that can start it.
*/
The matching algorithm contains code which returns a no-match result immediately if the input string is shorter than the computed prefix.
Note that it also looks for anchors and optimizes for them, of course.
I did not find a minimum-length optimization in the code, but I admit I didn't read it thoroughly (gotta do that one day). I do know that other regex implementations perform this kind of optimization (PCRE comes to mind). Anyway, the .NET implementation has its own way of optimizing things, and you should rely on that.
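For what it's worth, here is a minimal sketch of the loop described in the question that simply leans on the engine's own optimizations. The class `MatchAll`, the method `FindMatches`, and the sample strings and patterns are all hypothetical. The one deliberate choice is constructing each `Regex` once, outside the loop over the strings, so whatever the engine precomputes per pattern is not redone for every input.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Illustrative only: names and sample data are made up for this sketch.
public static class MatchAll
{
    // Returns, for each input string, the list of patterns that match it.
    public static Dictionary<string, List<string>> FindMatches(
        IEnumerable<string> inputs, IEnumerable<string> patterns)
    {
        // Build every Regex once, before looping over the strings.
        var regexes = new List<Regex>();
        foreach (var p in patterns)
            regexes.Add(new Regex(p, RegexOptions.CultureInvariant));

        var result = new Dictionary<string, List<string>>();
        foreach (var input in inputs)
        {
            var hits = new List<string>();
            foreach (var re in regexes)
            {
                // Regex.ToString() returns the pattern the Regex was built from.
                if (re.IsMatch(input)) hits.Add(re.ToString());
            }
            result[input] = hits;
        }
        return result;
    }

    public static void Main()
    {
        var inputs = new[] { "error: disk full", "ok" };
        var patterns = new[] { @"^error:", @"\d+", "full$" };

        foreach (var pair in FindMatches(inputs, patterns))
            Console.WriteLine(pair.Key + " -> " + string.Join(", ", pair.Value));
    }
}
```

If profiling later shows the regex work dominating, `RegexOptions.Compiled` is an option to try on the constructor, at the cost of slower construction.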