How to remove diacritics from text?

余生分开走 2020-11-29 04:56

I am making a Swedish website, and the Swedish letters are å, ä, and ö.

I need to make a string entered by a user URL-safe with PHP.

Basically, need to

9 answers
  • 2020-11-29 05:34

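    You don't need fancy regexps to filter the Swedish characters; just use the strtr function to "translate" them. Use the array form, because the three-argument form maps byte by byte and corrupts multibyte UTF-8 characters:

    $your_URL = "www.mäåö.com";
    // Array form: each key is replaced as a whole string, so UTF-8 letters stay intact.
    $good_URL = strtr($your_URL, ["ä" => "a", "å" => "a", "ö" => "o", "ë" => "e" /* etc... */]);
    echo $good_URL;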
    

    ->output: www.maao.com :)
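
    To get from there to a URL-safe string, one option is to follow the translation with a replace of everything that is not an ASCII letter, digit, or underscore. This is only a minimal sketch: the function name swedish_to_url_safe and the choice of underscore as the filler are assumptions for illustration, and the file has to be saved as UTF-8.

    ```php
    <?php
    // Hypothetical helper (name and mapping are illustrative, not from the thread).
    function swedish_to_url_safe(string $text): string
    {
        // Array form of strtr() replaces whole substrings, so multibyte letters are safe.
        $map = [
            'å' => 'a', 'ä' => 'a', 'ö' => 'o',
            'Å' => 'A', 'Ä' => 'A', 'Ö' => 'O',
        ];
        $ascii = strtr($text, $map);

        // Anything that is still not an ASCII letter, digit, or underscore becomes "_".
        return preg_replace('/[^A-Za-z0-9_]/', '_', $ascii);
    }

    echo swedish_to_url_safe('Räksmörgås på bordet'), "\n"; // Raksmorgas_pa_bordet
    ```

    Characters missing from the map are not transliterated; they end up as underscores, one per byte.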

  • 2020-11-29 05:35

    This should be useful; it handles almost all cases.

    function Unaccent($string)
    {
        return preg_replace('~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml|caron);~i', '$1', htmlentities($string, ENT_COMPAT, 'UTF-8'));
    }
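
    A quick usage sketch of the function above (repeated here so the snippet runs on its own). It also shows one caveat worth knowing: characters that htmlentities() encodes but the pattern does not match stay entity-encoded in the result.

    ```php
    <?php
    // Same function as in the answer, copied so this snippet is self-contained.
    function Unaccent($string)
    {
        return preg_replace('~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml|caron);~i', '$1', htmlentities($string, ENT_COMPAT, 'UTF-8'));
    }

    echo Unaccent('Räksmörgås'), "\n";  // Raksmorgas
    echo Unaccent('Tom & Jerry'), "\n"; // Tom &amp; Jerry
    ```

    So for a URL-safe string, the output still needs a second pass that deals with the leftover entities and spaces.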
    
  • 2020-11-29 05:39

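    "and all Swedish should be converted like this:

    'å' to 'a' and 'ä' to 'a' and 'ö' to 'o' (just remove the dots above)."

    Use normalizer_normalize() with Normalizer::FORM_D to decompose each accented letter into a base letter plus combining marks, then strip the marks with preg_replace() and the pattern /\p{Mn}/u. The default form only composes characters and leaves the diacritics in place. This needs the intl extension.

    "The rest should become underscores as I said."

    Use preg_replace() with the pattern /\W/ (in other words, any character that is not a letter, digit, or underscore) to replace them with underscores.

    The final result should look like:

    $data = preg_replace('/\W/', '_', preg_replace('/\p{Mn}/u', '', normalizer_normalize($data, Normalizer::FORM_D)));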
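
    Broken into steps, the same idea might look like the sketch below. It assumes the intl extension is loaded and that the input is valid UTF-8; the function name to_url_safe is made up for illustration. Note that normalizer_normalize() only removes diacritics when the text is first decomposed (Normalizer::FORM_D) and the combining marks are then stripped.

    ```php
    <?php
    // Hypothetical helper; requires the intl extension for normalizer_normalize().
    function to_url_safe(string $data): string
    {
        // Decompose: "å" becomes "a" followed by U+030A (combining ring above).
        $decomposed = normalizer_normalize($data, Normalizer::FORM_D);

        // Drop the combining marks (Unicode category Mn), keeping the base letters.
        $stripped = preg_replace('/\p{Mn}/u', '', $decomposed);

        // Replace every remaining non-word character with an underscore.
        return preg_replace('/\W/', '_', $stripped);
    }

    echo to_url_safe('Smörgåsbord är gott!'), "\n"; // Smorgasbord_ar_gott_
    ```

    Unlike a hand-written mapping, this handles any letter that decomposes into a base letter plus marks, but letters without such a decomposition (ø or ß, for example) are not transliterated and become underscores.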
    