Filter a set of bad words out of a PHP array

你的背包 2021-01-16 03:42

I have a PHP array of about 20,000 names. I need to filter it and remove any name that contains the word job, freelance, or project.

5 Answers
  • 2021-01-16 04:24

    A regular expression is not really necessary here; it'd likely be faster to use a few stripos calls. (Performance matters at this scale because the search runs once for each of the 20,000 names.)

    With array_filter, which keeps only the elements for which the callback returns true:

    $data1 = array_filter($data1, function($el) {
            return stripos($el, 'job') === FALSE
                && stripos($el, 'freelance') === FALSE
                && stripos($el, 'project') === FALSE;
    });
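
    One detail worth knowing when using this result: array_filter preserves the original keys, so the filtered array can have gaps in its numeric indexes. A minimal sketch, assuming made-up sample names, that reindexes with array_values:

    ```php
    <?php
    // Sample data is illustrative, not taken from the real 20,000-name list.
    $data1 = array('JoomlaFreelance', 'cleanname', 'PhillyWebJobs', 'othername');

    $filtered = array_filter($data1, function ($el) {
        return stripos($el, 'job') === false
            && stripos($el, 'freelance') === false
            && stripos($el, 'project') === false;
    });

    // $filtered still uses the original keys 1 and 3.
    // array_values() renumbers the surviving names from 0.
    $reindexed = array_values($filtered);
    ```

    Reindexing only matters if later code relies on consecutive keys, for example a for loop over 0..count-1 or a json_encode call that should emit a list.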
    

    Here's a more extensible / maintainable version, where the list of bad words can be loaded from an array rather than having to be explicitly denoted in the code:

    $data1 = array_filter($data1, function($el) {
            $bad_words = array('job', 'freelance', 'project');
            $word_okay = true;
    
            foreach ( $bad_words as $bad_word ) {
                if ( stripos($el, $bad_word) !== FALSE ) {
                    $word_okay = false;
                    break;
                }
            }
    
            return $word_okay;
    });
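
    If the word list really does come from elsewhere (a config file, a database), it can be passed into the closure with use instead of being declared inside it. A rough sketch, where filter_names is a hypothetical helper name and not part of PHP:

    ```php
    <?php
    // Hypothetical helper: returns the names that contain none of $badWords.
    function filter_names(array $names, array $badWords): array
    {
        $kept = array_filter($names, function ($name) use ($badWords) {
            foreach ($badWords as $badWord) {
                if (stripos($name, $badWord) !== false) {
                    return false; // contains a bad word, so drop this name
                }
            }
            return true; // no bad word found, keep it
        });

        // Renumber keys from 0 so callers get a plain list.
        return array_values($kept);
    }

    $clean = filter_names(
        array('JoomlaFreelance', 'PhillyWebJobs', 'web2project', 'cleanname'),
        array('job', 'freelance', 'project')
    );
    ```

    Returning early from the closure replaces the $word_okay flag and the break, but the behavior is the same.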
    
  • 2021-01-16 04:30

    Use of the preg_match() function and some regular expressions should do the trick; this is what I came up with and it worked fine on my end:

    <?php
        $data1=array('JoomlaFreelance','PhillyWebJobs','web2project','cleanname');
        $cleanArray=array();
        $badWords='/(job|freelance|project)/i';
        foreach($data1 as $name) {
            if(!preg_match($badWords,$name)) {
                $cleanArray[]=$name;
            }
        }
        // The separator goes first; the reversed argument order was removed in PHP 8.
        echo implode(',', $cleanArray);
    ?>
    

    Which returned:

    cleanname
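
    The same loop can probably be collapsed into a single call. This is a different technique from the foreach above: preg_grep with the PREG_GREP_INVERT flag returns the elements that do not match the pattern. A short sketch using the same sample data:

    ```php
    <?php
    $data1 = array('JoomlaFreelance', 'PhillyWebJobs', 'web2project', 'cleanname');
    $badWords = '/(job|freelance|project)/i';

    // PREG_GREP_INVERT keeps only the entries that do NOT match the pattern.
    // preg_grep preserves keys, so array_values() renumbers them from 0.
    $cleanArray = array_values(preg_grep($badWords, $data1, PREG_GREP_INVERT));
    ```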
    
  • 2021-01-16 04:36

    Personally, I would do something like this:

    $badWords = ['job', 'freelance', 'project'];
    $names = ['JoomlaFreelance', 'PhillyWebJobs', 'web2project', 'cleanname'];
    
    // Escape characters with special meaning in regular expressions.
    $quotedBadWords = array_map(function($word) {
        return preg_quote($word, '/');
    }, $badWords);
    
    // Create the regular expression.
    $badWordsRegex = implode('|', $quotedBadWords);
    
    // Filter out any names that match the bad words.
    $cleanNames = array_filter($names, function($name) use ($badWordsRegex) {
        // preg_match returns 1 on a match, 0 on no match, and false only on error.
        return preg_match('/' . $badWordsRegex . '/i', $name) === 0;
    });
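
    To see why the preg_quote step matters, here is a small sketch with hypothetical bad words that contain regex metacharacters (c++ and node.js are made up for illustration). Without quoting, the + and . would be read as regex syntax rather than literal characters:

    ```php
    <?php
    // Hypothetical bad words containing regex metacharacters.
    $badWords = array('c++', 'node.js');

    $quoted = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $badWords);

    // Build the full pattern once, outside the filter callback.
    $pattern = '/' . implode('|', $quoted) . '/i';

    $names = array('LearnC++Fast', 'nodeXjs', 'cleanname');

    // Keep a name only when preg_match reports no match (0).
    $clean = array_values(array_filter($names, function ($name) use ($pattern) {
        return preg_match($pattern, $name) === 0;
    }));
    ```

    With quoting, only LearnC++Fast is removed. nodeXjs survives because the dot is treated as a literal dot, not as "any character".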
    
  • 2021-01-16 04:42

    I'd be inclined to use the array_filter function and change the regex so that it does not require word boundaries, which lets it match the bad words inside longer names:

    $data1 = array('Phillyfreelance' , 'PhillyWebJobs', 'web2project', 'cleanname');
    
    $cleanArray = array_filter($data1, function($w) { 
         return !preg_match('~(freelance|project|job)~i', $w); 
    });
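
    A quick comparison of the two variants on one of the sample names, as a sketch of what the word-boundary anchors change:

    ```php
    <?php
    $name = 'PhillyWebJobs';

    // With \b anchors, "job" must stand alone as a word, so this name is not caught.
    $withBoundaries = preg_match('~\b(freelance|project|job)\b~i', $name);   // 0

    // Without anchors, "Job" is found inside the longer name.
    $withoutBoundaries = preg_match('~(freelance|project|job)~i', $name);    // 1
    ```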
    
  • 2021-01-16 04:46

    This should be what you want:

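    $cleanArray = array();
    foreach ($data1 as $name) {
        // Keep the name only if it contains none of the bad words.
        if (!preg_match('/(freelance|job|project)/i', $name)) {
            $cleanArray[] = $name;
        }
    }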
    