I\'d like to find the longest repeating string within a string, implemented in JavaScript and using a regular-expression based approach.
I have an PHP implementation
Javascript matches only return the first match -- you have to loop in order to find multiple results. A little testing shows this gets the expected results:
function maxRepeat(input) {
var reg = /(?=((.+)(?:.*?\2)+))/g;
var sub = ""; //somewhere to stick temp results
var maxstr = ""; // our maximum length repeated string
reg.lastIndex = 0; // because reg previously existed, we may need to reset this
sub = reg.exec(input); // find the first repeated string
while (!(sub == null)){
if ((!(sub == null)) && (sub[2].length > maxstr.length)){
maxstr = sub[2];
}
sub = reg.exec(input);
reg.lastIndex++; // start searching from the next position
}
return maxstr;
}
// I'm logging to console for convenience
console.log(maxRepeat("aabcd")); //aa
console.log(maxRepeat("inputinput")); //input
console.log(maxRepeat("7inputinput")); //input
console.log(maxRepeat("inputinput7")); //input
console.log(maxRepeat("7inputinput7")); //input
console.log(maxRepeat("xxabcdyy")); //x
console.log(maxRepeat("XXinputinputYY")); //input
Note that for "xxabcdyy" you only get "x" back, as it returns the first string of maximum length.
It seems JS regexes are a bit weird. I don't have a complete answer, but here's what I found.
Although I thought they did the same thing re.exec() and "string".match(re) behave differently. Exec seems to only return the first match it finds, whereas match seems to return all of them (using /g in both cases).
On the other hand, exec seems to work correctly with ?= in the regex whereas match returns all empty strings. Removing the ?= leaves us with
re = /((.+)(?:.*?\2)+)/g
Using that
"XXinputinputYY".match(re);
returns
["XX", "inputinput", "YY"]
whereas
re.exec("XXinputinputYY");
returns
["XX", "XX", "X"]
So at least with match you get inputinput as one of your values. Obviously, this neither pulls out the longest, nor removes the redundancy, but maybe it helps nonetheless.
One other thing, I tested in firebug's console which threw an error about not supporting $1, so maybe there's something in the $ vars worth looking at.