match whole word only without regex

既然无缘 2021-01-26 07:27

Since I can't use preg_match (UTF-8 support is somehow broken; it works locally but breaks in production), I want to find another way to match a word against a blacklist. Problem is, I

4 answers
  •  小鲜肉 (OP)
    2021-01-26 08:01

    Assuming you can do some pre-processing, you could replace all punctuation marks with spaces, convert everything to lowercase, and then either:

    • Use strpos with a space-padded needle, e.g. strpos($string, ' badword ') (haystack first, needle second), in a while loop to keep iterating through the entire document;
    • Split the string on spaces and compare each word against your list of bad words.
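The second option can be sketched roughly as follows. This is a minimal sketch, not part of the original answer: the function name findBadWords and the separator list are assumptions made for illustration, and strtolower only lowercases ASCII reliably, so non-ASCII blacklists may need mb_strtolower if the mbstring extension is available.

```php
<?php
// Hypothetical helper (illustration only): returns the blacklisted words that
// occur as whole words in $text, without using any regex function.
function findBadWords(string $text, array $badWords): array
{
    // Assumed separator list: some ASCII punctuation plus newlines and tabs.
    $separators = array_merge(str_split('!@#$%^&*(),./?;:"'), ["\n", "\r", "\t"]);

    // Replace separators with spaces, then lowercase (ASCII-only lowercasing).
    $normalized = strtolower(str_replace($separators, ' ', $text));

    // Split on spaces; 'strlen' drops the empty strings left by runs of spaces
    // while keeping a literal "0" word.
    $words = array_filter(explode(' ', $normalized), 'strlen');

    // Keep only the words that are on the (lowercased) blacklist, once each.
    $blacklist = array_map('strtolower', $badWords);
    return array_values(array_unique(array_intersect($words, $blacklist)));
}

// "badwords" is not reported because only whole words are compared.
print_r(findBadWords('This is a badword, but "badwords" is not. BADWORD!', ['badword', 'evil']));
```

Because each word is compared as a whole, no space padding is needed here; the trade-off is that the whole document is split into an array up front.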

    So if you were trying the first option, it would look something like this (untested):

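    $text = 'body of text to process';
    // Pad both ends so the first and last words are surrounded by spaces too.
    $document = ' ' . $text . ' ';
    // Turn punctuation into spaces (extend the list as needed), then lowercase.
    $document = str_replace(str_split('!@#$%^&*(),./'), ' ', $document);
    $document = strtolower($document); // byte-based; mb_strtolower() is an option for UTF-8 if mbstring is available
    $badWords = ['badword']; // placeholder blacklist, all lowercase
    foreach ($badWords as $word) {
        $needle = ' ' . $word . ' ';
        // strpos takes the haystack first and returns false when nothing is found.
        $index = strpos($document, $needle);
        while ($index !== false) {
            // Whole-word match found at $index; handle it here.
            $index = strpos($document, $needle, $index + 1);
        }
    }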
    

    EDIT: As per @jonhopkins's suggestion, adding a space at the end of the document should cater for the scenario where the wanted word is at the very end of the document and is not followed by a punctuation mark.
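To see why the padding matters, here is a small sketch of the strpos approach wrapped in a function. The name containsWholeWord and the separator list are assumptions made for illustration, not something from the original answer; it pads both ends of the text, so a word at the very start is covered as well as one at the very end.

```php
<?php
// Hypothetical helper (illustration only): true if $word occurs in $text as a
// whole word, using strpos on a space-padded, normalized copy of the text.
function containsWholeWord(string $text, string $word): bool
{
    // Assumed separator list: some ASCII punctuation plus newlines and tabs.
    $separators = array_merge(str_split('!@#$%^&*(),./?;:"'), ["\n", "\r", "\t"]);

    // Pad both ends so a word at the start or end is still surrounded by spaces.
    $haystack = ' ' . strtolower(str_replace($separators, ' ', $text)) . ' ';

    // strpos returns false (not 0) when the needle is absent, so compare strictly.
    return strpos($haystack, ' ' . strtolower($word) . ' ') !== false;
}

var_dump(containsWholeWord('ends with badword', 'badword'));  // bool(true)
var_dump(containsWholeWord('badword starts it', 'badword'));  // bool(true)
var_dump(containsWholeWord('plenty of badwords', 'badword')); // bool(false)
```

Without the padding, the first two calls would find nothing, because the needle ' badword ' requires a space on each side.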
