php preg_grep and umlaut/accent

拥有回忆 提交于 2019-12-04 13:40:09

This is not possible with a regular expression pattern only. It is not because you can not tell the regex engine to match all "e" and similars. However, it is possible to first normalize the input data (both the array as well as the search input) and then search the normalized data but return the results for the non-normalized data.

In the following example I use transliteration to do this kind of normalization, I guess that is what you're looking for:

$data = ['Napoléon', 'Café'];

$result = array_translit_search('le', $data);
print_r($result);

$result = array_translit_search('leó', $data);
print_r($result);

The exemplary output is:

Array
(
    [0] => Napoléon
)
Array
(
    [0] => Napoléon
)

The search function itself is rather straight forward as written above, transliterating the inputs, doing the preg_grep and then returning the original intputs matches:

/**
 * @param string $search
 * @param array $data
 * @return array
 */
function array_translit_search($search, array $data) {

    $transliterator = Transliterator::create('ASCII-Latin', Transliterator::REVERSE);
    $normalize      = function ($string) use ($transliterator) {

        return $transliterator->transliterate($string);
    };

    $dataTrans   = array_map($normalize, $data);
    $searchTrans = $normalize($search);
    $pattern     = sprintf('/%s/i', preg_quote($searchTrans));
    $result      = preg_grep($pattern, $dataTrans);
    return array_intersect_key($data, $result);
}

This code requires the Transliterator from the Intl extension, you can replace it with any other similar transliteration or translation function.

I can not suggest to use str_replace here btw., if you need to fall-back to a translation table, use strtr instead. That is what you're looking for. But I prefer a library that brings the translation with it's own, especially if it's the Intl lib, you normally can't beat it.

You can write something like this:

$data = array('Napoléon','Café');
// do something with your input, but for testing purposes it will be simply as you wrote in your example
$input = 'le';

foreach($data as $var) {
  if (preg_match("/".str_replace(array("é"....), array("e"....), $input)."/i", str_replace(array("é"....), array("e"....), $var))) 
    //do something as there is a match
}

Actually you even don't need regex in this case, simple strpos will be enough.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!