Preg_Replace and UTF8

清酒与你 2021-01-05 22:55

I'm enhancing our video search page to highlight the search term(s) in the results. Because a user can enter judas priest and a video has Judas Priest

3 Answers
  • 2021-01-05 23:43

    Not sure what your problem is stemming from, but I just put together this little test case:

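    <?php

    // Upper-case search term containing a non-ASCII character (Č).
    $uc = "SREČA";

    // Make the mb_* functions treat strings as UTF-8.
    mb_internal_encoding('UTF-8');
    echo $uc."\n";
    $lc = mb_strtolower($uc);
    echo $lc."\n";

    // /u enables UTF-8 mode and /i makes the match case-insensitive;
    // passing '/' to preg_quote() also escapes the pattern delimiter.
    echo preg_replace("/\b(".preg_quote($uc, '/').")\b/ui", "<span class='test'>$1</span>", "test:".$lc." end test");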
    

    Its output on my machine:

    SREČA
    sreča
    test:<span class='test'>sreča</span> end test
    

    Seems to be working properly?

  • 2021-01-05 23:51

    I feel really stupid right about now, but the problem wasn't with the preg_* functions at all. Before highlighting, I first checked whether the search term occurs in the string at all using stripos. That function is not multi-byte safe, so it returned false whenever non-ASCII characters in the text differed in case from the search term, and preg_replace was never even called.

    So the lesson here is to always use the multi-byte versions of string functions (such as mb_stripos) when you are working with UTF-8 strings.

  • 2021-01-05 23:53

    If I'm not mistaken, preg_match uses the current locale. Try setting the locale to the language these characters belong to. You will probably need a UTF-8 based locale too. If your page mixes languages, you may be able to find a generic international locale that works.

    See also: http://www.phpwact.org/php/i18n/utf-8

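The pitfall described in the second answer (a byte-oriented stripos guard that silently skips the preg_replace call) can be sketched roughly as follows. This is only an illustration under assumptions: the helper name highlight_term and the sample text are hypothetical and do not come from the original posts, and the mbstring extension is assumed to be loaded.

```php
<?php
// Hypothetical helper (not from the original posts): wraps each
// case-insensitive, whole-word occurrence of $term in a <span>.
function highlight_term(string $text, string $term): string
{
    // mb_stripos() compares characters rather than bytes, so a term
    // such as "SREČA" is found in text containing "sreča".
    if (mb_stripos($text, $term, 0, 'UTF-8') === false) {
        return $text; // term not present, nothing to highlight
    }

    // Pass the delimiter to preg_quote() so a "/" inside the term is escaped.
    $pattern = '/\b(' . preg_quote($term, '/') . ')\b/ui';

    // preg_replace() returns null on error (for example invalid UTF-8),
    // in which case the text is returned unchanged.
    return preg_replace($pattern, "<span class='test'>$1</span>", $text) ?? $text;
}

$text = 'test: sreča end test'; // sample input, assumed for illustration

var_dump(stripos($text, 'SREČA'));                // byte-oriented guard
var_dump(mb_stripos($text, 'SREČA', 0, 'UTF-8')); // character-aware guard
echo highlight_term($text, 'SREČA'), "\n";
```

On a typical setup the byte-oriented stripos call is expected to report no match here, while the mb_stripos call finds the term, which is why the guard in front of preg_replace should be the multi-byte one. Since the /ui pattern already matches case-insensitively on its own, the guard could also be dropped entirely.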