Negative look ahead not working as expected

∥☆過路亽.° submitted on 2021-02-17 01:55:12

Question


I have a bizarre situation where positive lookahead works as expected but negative lookahead doesn't. Please take a look at the following code:

<?php

$tweet = "RT @Startup_Collab: @RiseOfRest is headed to OMA & LNK to #showcase our emerging #startup ecosystem. Learn more! https://example.net #Riseof…";

$patterns=array(
    '/#\w+(?=…$)/',
);

$tweet = preg_replace_callback($patterns,function($m)
{
    switch($m[0][0])
    {
        case "#":
            return strtoupper($m[0]);
        break;
    }
},$tweet);


echo $tweet;

I want to match any hashtag not followed by …$ and uppercase it (in reality it will be wrapped in a link with an href, but for simplicity's sake just uppercase it for now).

These are regexes with their corresponding outputs:

'/#\w+(?=…$)/' Match any hashtag ending with …$ and upper-case it, works as expected:

RT @Startup_Collab: @RiseOfRest is headed to OMA & LNK to #showcase our emerging #startup ecosystem. Learn more! https://example.net #RISEOF…

'/#\w+(?!…$)/' Match any hashtag NOT ending with …$ and upper-case it, does not work, all hashtags are uppercased:

RT @Startup_Collab: @RiseOfRest is headed to OMA & LNK to #SHOWCASE our emerging #STARTUP ecosystem. Learn more! https://example.net #RISEOf…

Thank you very much for any help, suggestions, ideas and patience.

-- Angel


Answer 1:


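That happens because of backtracking. When the lookahead fails after the whole hashtag, the engine gives back one character from \w+ and tries again, so it ends up matching only part of the hashtag (#Riseo, which is followed by f rather than …). Use a possessive quantifier to avoid backtracking into the \w+ subpattern: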

/#\w++(?!…$)/
    ^^

See the regex demo

Now one or more word characters are matched possessively, and the (?!…$) negative lookahead is checked only once, after all of those word characters have been consumed. If the lookahead fails, the engine cannot backtrack into \w++, so the match attempt at that # fails as a whole.
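
To make the difference concrete, here is a minimal sketch that runs both patterns from the question over the same tweet with preg_match_all and collects what each one matches. The variable names $greedy and $possessive are only illustrative, and the file is assumed to be saved as UTF-8 so that … in the pattern and in the text are the same bytes.

```php
<?php
// Sample text taken from the question; the trailing "…" marks a truncated hashtag.
$tweet = "RT @Startup_Collab: @RiseOfRest is headed to OMA & LNK to #showcase our emerging #startup ecosystem. Learn more! https://example.net #Riseof…";

// Greedy \w+ can give characters back, so after failing at "#Riseof"
// the engine retries at "#Riseo", where the lookahead succeeds.
preg_match_all('/#\w+(?!…$)/', $tweet, $greedy);

// Possessive \w++ never gives characters back, so the truncated
// hashtag is not matched at all.
preg_match_all('/#\w++(?!…$)/', $tweet, $possessive);

print_r($greedy[0]);     // contains "#showcase", "#startup" and the partial "#Riseo"
print_r($possessive[0]); // contains only "#showcase" and "#startup"
```

The partial match #Riseo in the first result is exactly what produced #RISEOf… in the question's output.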

See more on possessive quantifiers here.
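
In PCRE, a possessive quantifier is shorthand for an atomic group, so \w++ can also be written as (?>\w+). The sketch below is not part of the original answer: it applies both spellings through preg_replace_callback, as in the question, and checks that they agree. The helper name upperHashtags is hypothetical, and the tweet is shortened from the question's text.

```php
<?php
// Shortened version of the tweet from the question (assumed UTF-8 source file).
$tweet = "to #showcase our emerging #startup ecosystem. Learn more! #Riseof…";

// Hypothetical helper: uppercases every hashtag matched by $pattern.
function upperHashtags(string $pattern, string $text): string
{
    $result = preg_replace_callback($pattern, function (array $m): string {
        return strtoupper($m[0]);
    }, $text);

    // preg_replace_callback returns null on error; fall back to the input.
    return $result ?? $text;
}

// Possessive quantifier, as suggested in the answer.
$withPossessive = upperHashtags('/#\w++(?!…$)/', $tweet);

// Equivalent atomic group spelling of the same pattern.
$withAtomic = upperHashtags('/#(?>\w+)(?!…$)/', $tweet);

echo $withPossessive, PHP_EOL;
echo $withAtomic, PHP_EOL;
```

Both calls should leave the truncated #Riseof… untouched while uppercasing the complete hashtags; which spelling to use is a matter of readability.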



Source: https://stackoverflow.com/questions/39709879/negative-look-ahead-not-working-as-expected
