Since I can't use preg_match (UTF-8 support is somehow broken; it works locally but breaks in production), I want to find another way to match words against a blacklist. Problem is, I
Assuming you could do some pre-processing, you could replace all punctuation marks with white space, put everything in lowercase, and then either:
use strpos, with something like strpos($string, ' badword ') (haystack first, needle second), in a while loop to keep iterating through your entire document.
So if you were trying the first option, it would look something like this (untested pseudo code):
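// $text is the body of text to process. Pad both ends with a space so that
// a word at the very start or very end of the document can still match.
$document = ' ' . $text . ' ';
// Replace punctuation with spaces (extend the list as needed).
$document = str_replace(str_split('!@#$%^&*(),./'), ' ', $document);
// Lowercase; mb_strtolower assumes the mbstring extension is available.
$document = mb_strtolower($document, 'UTF-8');
$badWords = [/* blacklisted words, in lowercase */];
foreach ($badWords as $word)
{
    $needle = ' ' . $word . ' ';
    // strpos returns false when there is no match, and 0 is a valid position.
    $badWordIndex = strpos($document, $needle);
    while ($badWordIndex !== false)
    {
        // handle the match at $badWordIndex here
        // then continue searching just past the current match
        $badWordIndex = strpos($document, $needle, $badWordIndex + 1);
    }
}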
EDIT: As per @jonhopkins' suggestion, adding a white space at the end should cater for the scenario where the wanted word is at the end of the document and is not followed by a punctuation mark.
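For reference, the approach above could be wrapped up roughly like this. It is only a sketch: findBadWords is a hypothetical helper name, the separator list is the one from the pseudo code plus newlines and tabs (an assumption, extend it for your data), and it falls back to strtolower when mbstring is missing, which will not lowercase non-ASCII letters.

```php
<?php
// Hypothetical helper: returns an array mapping each blacklisted word that
// occurs in $text to the byte offsets of its matches in the normalized text.
function findBadWords(string $text, array $badWords): array
{
    // Lowercase with mbstring when available; strtolower is only a fallback.
    $toLower = function (string $s): string {
        return function_exists('mb_strtolower')
            ? mb_strtolower($s, 'UTF-8')
            : strtolower($s);
    };

    // Characters treated as word separators (assumed list, extend as needed).
    $separators = array_merge(str_split('!@#$%^&*(),./'), ["\n", "\r", "\t"]);

    // Replace separators with spaces and pad both ends, so every word is
    // surrounded by spaces, including the first and the last one.
    $document = $toLower(' ' . str_replace($separators, ' ', $text) . ' ');

    $found = [];
    foreach ($badWords as $word) {
        $needle = ' ' . $toLower($word) . ' ';
        $offset = 0;
        // strpos returns false on no match; position 0 is a valid match.
        while (($pos = strpos($document, $needle, $offset)) !== false) {
            $found[$word][] = $pos;
            // Step one byte forward so a match that shares its trailing
            // space with the next word is still found.
            $offset = $pos + 1;
        }
    }
    return $found;
}

$hits = findBadWords('Darn, this is a darn shame. Darning is fine.', ['darn']);
print_r(array_keys($hits));
```

Because the needle is wrapped in spaces, "Darning" is not reported as a match for "darn". strpos works on bytes, which is fine for UTF-8 input as long as the document and the blacklist use the same encoding.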