I was asked today whether there is a library that takes a list of strings and computes the most efficient regex matching only those strings. I think it's an NP-complete problem by i
The Perl regex compiler builds a branching trie data structure out of patterns with parts in common across alternatives:
$ perl -Mre=debug -ce '"whatever" =~ /appserver1\.domain\.tld|appserver2\.domain\.tld|appserver3\.domain\.tld/'
Compiling REx "appserver1\.domain\.tld|appserver2\.domain\.tld|appserver3\."...
Final program:
1: EXACT (5)
5: TRIEC-EXACT[123] (25)
<1.domain.tld>
<2.domain.tld>
<3.domain.tld>
25: END (0)
anchored "appserver" at 0 (checking anchored) minlen 21
-e syntax OK
Freeing REx: "appserver1\.domain\.tld|appserver2\.domain\.tld|appserver3\."...
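To get a similar prefix-merged pattern outside Perl, the same idea can be sketched by hand: build a trie from the strings, then walk it to emit a regex. The Python below is a minimal sketch, not what the Perl compiler does internally. The names `build_trie`, `trie_to_regex` and `regex_for` are made up for illustration. It only merges common prefixes (like the TRIEC node above, it leaves the shared `.domain.tld` suffix repeated), so it makes no claim to produce the most efficient regex.

```python
import re


def build_trie(words):
    """Build a nested-dict trie; the empty-string key marks the end of a word."""
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}  # end-of-word marker
    return trie


def trie_to_regex(node):
    """Recursively turn a trie node into a regex fragment."""
    # A node holding only the end marker contributes nothing further.
    if "" in node and len(node) == 1:
        return ""
    # If a word ends here but longer words continue, the rest is optional.
    optional = "" in node
    branches = [
        re.escape(ch) + trie_to_regex(node[ch])
        for ch in sorted(k for k in node if k != "")
    ]
    # A single mandatory branch needs no grouping.
    if len(branches) == 1 and not optional:
        return branches[0]
    body = "(?:" + "|".join(branches) + ")"
    return body + "?" if optional else body


def regex_for(words):
    """Return a regex source string matching exactly the given words."""
    if not words:
        raise ValueError("need at least one string")
    return trie_to_regex(build_trie(words))


if __name__ == "__main__":
    hosts = [
        "appserver1.domain.tld",
        "appserver2.domain.tld",
        "appserver3.domain.tld",
    ]
    print(regex_for(hosts))
    # prints: appserver(?:1\.domain\.tld|2\.domain\.tld|3\.domain\.tld)
```

Use the result with `re.fullmatch` (or wrap it in anchors yourself) so that only the listed strings match; the sketch does not add `^` or `$`.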