Question
I am building a very basic profanity filter that I only want to apply to some fields in my application (fullName, userDescription) on the server side.
Does anyone have experience with a profanity filter in production? I only want it to match whole words:
'ass hello' <- match
'asster' <- NOT match
Below is my current code, but for some reason it returns true and false in succession.
var badWords = [ 'ass', 'whore', 'slut' ]
  , check = new RegExp(badWords.join('|'), 'gi');
function filterString(string) {
  return check.test(string);
}
filterString('ass'); // Returns true / false in succession.
How can I fix this "in succession" bug?
Answer 1:
Because the regex has the g flag, the test method sets its lastIndex property to the position just after the current match, so that further invocations will match further occurrences (if there are any). When no match is found, lastIndex is reset to 0.
check.lastIndex // 0 (init)
filterString('ass'); // true
check.lastIndex // 3
filterString('ass'); // false
check.lastIndex // now 0 again
So you will need to reset it manually in your filterString function if you don't recreate the RegExp each time:
function filterString(string) {
  check.lastIndex = 0;
  return check.test(string);
}
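An alternative to resetting lastIndex by hand is to drop the g flag, since test only advances lastIndex on global (or sticky) regexes. A minimal sketch, reusing the word list from the question; the names statelessCheck and containsBadWord are made up for illustration and are not part of the original code:

```javascript
// Word list taken from the question.
const badWords = ['ass', 'whore', 'slut'];

// No 'g' flag: test() always starts at index 0 and leaves lastIndex alone.
const statelessCheck = new RegExp(badWords.join('|'), 'i');

// Hypothetical helper name, not from the original post.
function containsBadWord(text) {
  return statelessCheck.test(text);
}

console.log(containsBadWord('ass'));   // true
console.log(containsBadWord('ass'));   // true again, no alternation
console.log(statelessCheck.lastIndex); // 0
```

Note that this version still matches substrings, so 'asster' is reported as a match; it only addresses the "in succession" behaviour.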
Btw, to match only full words (like "ass", but not "asster"), you should wrap your matches in word boundaries like WTK suggested, i.e.
var check = new RegExp("\\b(?:"+badWords.join('|')+")\\b", 'gi');
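The whole-word approach can be sketched end to end as follows. Two things here are assumptions added for illustration rather than part of the answer: the helper names escapeRegExp and hasBadWord, and the escaping step, which only matters if a list entry ever contains regex metacharacters.

```javascript
// Word list taken from the question.
const badWords = ['ass', 'whore', 'slut'];

// Hypothetical helper: escape regex metacharacters in a list entry.
function escapeRegExp(word) {
  return word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// '\\b' in the string becomes \b (word boundary) in the regex.
// The non-capturing group makes both boundaries apply to every alternative.
const wordCheck = new RegExp(
  '\\b(?:' + badWords.map(escapeRegExp).join('|') + ')\\b',
  'i' // no 'g' flag, so there is no lastIndex state to reset
);

// Hypothetical helper name, not from the original post.
function hasBadWord(text) {
  return wordCheck.test(text);
}

console.log(hasBadWord('ass hello')); // true
console.log(hasBadWord('asster'));    // false
console.log(hasBadWord('massive'));   // false
```

Keep in mind that \b only treats ASCII letters, digits and the underscore as word characters, so this sketch is best suited to plain English word lists.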
Answer 2:
You are matching via a substring comparison. Your regex needs to be modified to match whole words instead.
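Answer 3:
How about a fixed regexp:
// Note the doubled backslashes and the group around the alternation; no 'g' flag, so test() keeps no state.
check = new RegExp('(^|\\b)(?:'+badWords.join('|')+')($|\\b)', 'i');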
check.test('ass') // true
check.test('suckass') // false
check.test('mass of whore') // true
check.test('massive') // false
check.test('slut is massive') // true
I'm using \b here to match a word boundary (along with ^ and $ for the start or end of the whole string).
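One detail worth keeping in mind when building such a pattern from strings: inside a JavaScript string literal, \b is the backspace character, so the word-boundary token has to be written as \\b. A small sketch of the difference, using illustrative variable names (words, broken, fixed) that are not from the original post:

```javascript
// In a string literal, '\b' is a single backspace character (U+0008).
console.log('\b' === '\u0008'); // true
console.log('\\b'.length);      // 2: a backslash followed by 'b'

const words = ['ass', 'whore', 'slut'];

// Single backslash: the pattern contains backspace characters, not boundaries.
const broken = new RegExp('\b(?:' + words.join('|') + ')\b', 'i');
// Double backslash: the pattern contains real \b word-boundary tokens.
const fixed = new RegExp('\\b(?:' + words.join('|') + ')\\b', 'i');

console.log(broken.test('ass'));          // false
console.log(fixed.test('ass'));           // true
console.log(fixed.test('mass of whore')); // true, via "whore"
console.log(fixed.test('massive'));       // false
```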
Source: https://stackoverflow.com/questions/12798961/javascript-profanity-match-not-replace