In Javascript I have defined a regular expression and now a user is typing in a string. I want to tell him if his string still could match the RegExp if he continues typing
Very interesting question. In my quick search, I didn't find anything predefined (not even in Perl) that solves this problem.
EDIT: Ouch, it seems Java has something similar called hitEnd() -- see Alan M's answer. What hitEnd() does is say that the result of match() (either true or false) might be modified by additional input. The book 'Mastering Regular Expressions" says it's not very reliable though (not sure why, page 392 not available in google books).
Depending on what features of regular expressions you use, a quick hack like writing some sort of prefixes of your regexp:
e.g. for a+a*b+c your prefixes will be:
a+ a+a* a+a*b+ a+a*b+c
and try to match any of them with your string might work. This quick hack is made difficult if you use the choice operator, if you use the range operator {n,m} or back-references.
That being said, I think the good solution is to slightly modify the matching algorithm.
The matching algorithm normally employed is a backtracking algorithm (which works well in practice, even if worst case behavior is exponential). This algorithm successfully terminates whenever it has reached the end of the regexp (even if not the entire string was consumed). What you need to do is to modify the termination condition such that it also terminates successfully when it has consumed all of the input.
That being said, you'd probably have to actually implement the algorithm in JavaScript. Hopefully this will become part of libraries such as Jquery.
For more references and theory on the algorithm, check this article out:
http://swtch.com/~rsc/regexp/regexp1.html
(even if it makes a case against the backtracking algorithm and suggests a FA based algorithm (but the FA cannot handle back-references)).
I think your best bet here is to make your Regex prefix-proof. For the example you gave, /a*b/
, I think you could probably use /a*b?/.test(userinput)
. For more complex patterns this could get increasingly difficult, but I still think it can be done by nesting each subexpression in a series of optional quantifiers (?
). For instance:
/a*bcd*e/
The prefix regex could be:
/a*(b(c(d*e?)?)?)?/
Its a little messy, but will solve your problem rather well I think.
People seem to be splitting evenly on how they interpret this question, so I'll demonstrate the concept with a Java example.
import java.util.regex.*;
public class Test
{
public static void main(String[] args) throws Exception
{
tryMatch("^a*b+$", "a", "ab", "abc");
}
public static void tryMatch(String regex, String... targets)
{
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher("");
System.out.printf("%nregex: %s%n", regex);
System.out.printf("target | matches() | hitEnd()%n");
for (String str : targets)
{
m.reset(str);
System.out.printf("%-6s | %-9B | %-9B%n",
str, m.matches(), m.hitEnd());
}
}
}
output:
regex: ^a*b+$
target | matches() | hitEnd()
a | FALSE | TRUE
ab | TRUE | TRUE
abc | FALSE | FALSE
Target string "a" doesn't match because the regex requires at least one b
, but it could be the prefix of a successful match, so hitEnd()
returns true
. String "ab" has all that's required for a match, but it would also match if we added more b
's to the end, so hitEnd()
still returns true
. With "abc" the match attempt fails before it reaches the end of the target string, so the regex couldn't match any string that starts with "abc".
As far as I know, Javascript doesn't have anything like Java's hitEnd()
method, but it might be possible to fake it. If anyone knows how, it'll be that Flagrant Badass, Steven Levithan.
First you define your regular expression as: var re = new RegExp(/^(regexp here)$/);
on the onKeypress event, you check the regexp like this:
text.match(regexp) - where the text is the string entered.
Is this clear?
One way of doing this could be to hook to the onKeyUp event of a text box and .test the text against the regular expression. My assumption is of course that you want to do a regular expression matching. I'm not sure if this is exactly what you need, in fact your code:
"a".isPrefixOf( re ); // true
will never match since it's required to also have a subsequent "b" character (you may want to modify the regular expression). For instance, this code will test against any string matching this format:
a-n(n)-b
Here is the code, save it as a page and load it in your browser:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="it">
<body>
<input type="text" size="20" id="txtData" onkeyup="showResult()" />
<div id="dvResult" />
</body>
</html>
<script type="text/javascript">
//<![CDATA[
theRegExp = /^a\-\d{1,2}\-b$/;
function isPrefixOf( aText, aRegExp )
{
return aRegExp.test( aText );
}
function showResult()
{
res = document.getElementById( "dvResult" );
res.innerHTML = isPrefixOf( document.getElementById( "txtData" ).value, theRegExp ) ? "Correct" : "Bad input";
}
//]]>
</script>