I have a list of about 120 thousand English words (basically every word in the language).
I need a regular expression that would allow searching through these words using the wildcard characters * and ?.
This is what I use:
String wildcardToRegex(String wildcardString) {
    // The 12 is arbitrary, you may adjust it to fit your needs depending
    // on how many special characters you expect in a single pattern.
    StringBuilder sb = new StringBuilder(wildcardString.length() + 12);
    sb.append('^');
    for (int i = 0; i < wildcardString.length(); ++i) {
        char c = wildcardString.charAt(i);
        if (c == '*') {
            sb.append(".*");
        } else if (c == '?') {
            sb.append('.');
        } else if ("\\.[]{}()+-^$|".indexOf(c) >= 0) {
            sb.append('\\');
            sb.append(c);
        } else {
            sb.append(c);
        }
    }
    sb.append('$');
    return sb.toString();
}
Special character list from https://stackoverflow.com/a/26228852/1808989.
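For the word-list use case in the question, the method above could be applied roughly as follows. This is only a sketch: the class name WildcardSearch, the search helper, and the sample words are made up for illustration, and case-insensitive matching is an assumption rather than something the question asks for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class WildcardSearch {
    // Same conversion as above: * becomes .*, ? becomes ., metacharacters are escaped.
    static String wildcardToRegex(String wildcardString) {
        StringBuilder sb = new StringBuilder(wildcardString.length() + 12);
        sb.append('^');
        for (int i = 0; i < wildcardString.length(); ++i) {
            char c = wildcardString.charAt(i);
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else if ("\\.[]{}()+-^$|".indexOf(c) >= 0) {
                sb.append('\\').append(c);
            } else {
                sb.append(c);
            }
        }
        sb.append('$');
        return sb.toString();
    }

    // Hypothetical helper: returns the words that match the wildcard pattern.
    static List<String> search(List<String> words, String wildcard) {
        // Compile once and reuse the Pattern instead of recompiling per word.
        Pattern p = Pattern.compile(wildcardToRegex(wildcard), Pattern.CASE_INSENSITIVE);
        List<String> hits = new ArrayList<>();
        for (String w : words) {
            if (p.matcher(w).matches()) {
                hits.add(w);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("cat", "cot", "coat", "cart", "dog");
        System.out.println(search(words, "c?t")); // [cat, cot]
        System.out.println(search(words, "c*t")); // [cat, cot, coat, cart]
    }
}
```

Compiling the pattern once matters with 120 thousand words, since building a new regex for every word would repeat the same work each time.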
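function matchWild(wild, name) {
    if (wild === '*') return true;
    // Escape regex metacharacters first (everything except the wildcards * and ?),
    // so the backslashes added here are not escaped a second time later.
    var pattern = wild.replace(/[.+^${}()|[\]\\]/g, '\\$&');
    pattern = pattern.replace(/\?/g, '.');   // ? matches exactly one character
    pattern = pattern.replace(/\*/g, '.*');  // * matches zero or more characters
    // Anchor the pattern so the whole name has to match, not just a substring.
    var re = new RegExp('^' + pattern + '$', 'i');
    return re.test(name);
}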
Replace "?" with "." and "*" with ".*".
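A minimal sketch of those two substitutions in Java, assuming the pattern contains only letters plus * and ? (reasonable for a plain word list, but any other regex metacharacter in the pattern would not be escaped). The class and method names are hypothetical.

```java
class SimpleWildcard {
    // Hypothetical helper; assumes the pattern holds only letters, * and ?.
    static boolean matches(String wildcard, String word) {
        // Replace ? first, then *, so the dots added for ? are left untouched.
        String regex = wildcard.replace("?", ".").replace("*", ".*");
        // String.matches requires the whole word to match, so ^ and $ are not needed.
        return word.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(matches("h?llo", "hello"));          // true
        System.out.println(matches("un*able", "unbelievable")); // true
        System.out.println(matches("h?llo", "hallow"));         // false
    }
}
```

String.matches recompiles the regex on every call, so for a large list it is likely better to compile a java.util.regex.Pattern once and reuse it.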