Regex to replace a given word with a space on either side or none at all

后悔当初 2021-01-21 18:48

I am working with some code in PHP that grabs the referrer data from a search engine, giving me the query that the user entered.

I would then like to remove certain stop

4 Answers
  • 2021-01-21 19:22

    Try it this way

    // (?<!\w) = not preceded by a word character, (?!\w) = not followed by one
    $keywords = preg_replace( '/(?<!\w)(?:for|sale)(?!\w)/', '', $keywords );
    
  • 2021-01-21 19:23

    While Armel's answer will work, it does not perform optimally. Yes, your desired output requires word boundaries and probably case-insensitive matching, but:

    1. Word boundaries gain nothing from being wrapped in parentheses.
    2. Calling preg_replace() once per element of the blacklist array is inefficient: the regex engine has to make a separate pass over the full string for every keyword.

    I recommend building a single regex pattern that checks for all keywords in one pass over the string. To generate that pattern dynamically, implode the blacklist array with | (pipes), which express alternation ("OR") in regex. Wrapping the pipe-delimited keywords in a non-capturing group ((?:...)) makes the word boundaries (\b) apply to every keyword in the blacklist array.

    Code:

    $string = "Each person wants peaches for themselves forever";
    $blacklist = array("for", "each");
    // if you might have non-letter characters that have special meaning to the regex engine
    //$blacklist = array_map(function($v){return preg_quote($v, '/');}, $blacklist);
    //print_r($blacklist);
    echo "Without wordboundaries:\n";
    var_export(preg_replace('/' . implode('|', $blacklist) . '/i', '', $string));
    
    echo "\n\n---\n";
    echo "With wordboundaries:\n";
    var_export(preg_replace('/\b(?:' . implode('|', $blacklist) . ')\b/i', '', $string));
    
    echo "\n\n---\n";
    echo "With wordboundaries and consecutive space mop up:\n";
    var_export(trim(preg_replace(array('/\b(?:' . implode('|', $blacklist) . ')\b/i', '/ \K +/'), '', $string)));
    

    Output:

    Without wordboundaries:
    ' person wants pes  themselves ever'
    
    ---
    With wordboundaries:
    ' person wants peaches  themselves forever'
    
    ---
    With wordboundaries and consecutive space mop up:
    'person wants peaches themselves forever'
    

    P.S. / \K +/ is the second pattern passed to preg_replace(), so the string is scanned a second time, this time for two or more consecutive spaces. \K means "restart the full-string match from here"; effectively, it releases the previously matched space. The one or more spaces that follow are then matched and replaced with an empty string.
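
The single-pattern approach above can be wrapped in a small helper. What follows is only a minimal sketch, assuming PHP 7 or later and stop words that begin and end with word characters (otherwise \b will not sit where you expect). The function name removeStopWords is hypothetical, and it swaps the / \K +/ mop-up for a plainer /\s+/ collapse, which also normalizes tabs and newlines.

```php
<?php
// Hypothetical helper (not part of the answer above): removes whole-word
// stop words, case-insensitively, then tidies the leftover whitespace.
function removeStopWords(string $text, array $stopWords): string
{
    // With no stop words the alternation would be empty, so bail out early.
    if ($stopWords === []) {
        return trim($text);
    }

    // Escape each word so characters such as "." or "+" are matched literally.
    $escaped = array_map(
        function ($word) {
            return preg_quote($word, '/');
        },
        $stopWords
    );

    // One pattern for all words: \b(?:for|each)\b with the "i" flag.
    $pattern = '/\b(?:' . implode('|', $escaped) . ')\b/i';
    $stripped = preg_replace($pattern, '', $text);

    // Collapse every run of whitespace to one space, then trim both ends.
    return trim(preg_replace('/\s+/', ' ', $stripped));
}

echo removeStopWords('Each person wants peaches for themselves forever', ['for', 'each']), "\n";
// prints: person wants peaches themselves forever
```

Escaping with preg_quote() is the step that is commented out in the demo code; it only matters when the blacklist may contain regex metacharacters.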

  • 2021-01-21 19:36

    You can use word boundaries for this

    $keywords = preg_replace('/\bfor\b/', '', $keywords);
    

    or with multiple words

    $keywords = preg_replace('/\b(?:for|sale)\b/', '', $keywords);
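
To see what \b does at the edges of a word, here is a small sketch that runs the pattern above over a few sample strings. The inputs are made up for illustration and are not taken from the question.

```php
<?php
// Illustrative inputs (assumed): stop words with a space on either side,
// on their own, glued to other letters, and next to punctuation.
$samples = ['cars for sale', 'for', 'forsale', 'before sunrise', 'sale!'];

$results = [];
foreach ($samples as $sample) {
    $results[$sample] = preg_replace('/\b(?:for|sale)\b/', '', $sample);
    // var_export() quotes the strings so leftover spaces stay visible.
    echo var_export($sample, true), ' => ', var_export($results[$sample], true), "\n";
}
```

The words are removed whether they are surrounded by spaces, stand alone, or touch punctuation, while "forsale" and "before" are left intact because no word boundary separates the stop word from its neighbouring letters. The surrounding spaces themselves are not removed, so 'cars for sale' becomes 'cars  ' with two trailing spaces; trim or collapse them afterwards if that matters.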
    
  • 2021-01-21 19:40
    $keywords = "...";
    $stopWords = array("for","sale");
    foreach($stopWords as $stopWord){
        $keywords = preg_replace("/(\b)$stopWord(\b)/", "", $keywords);
    }
    