PHP replacing special characters like à->a, è->e

隐瞒了意图╮ 2020-11-27 14:20

I have a PHP file, signup.php, which saves the content of a form (in form.php) to a MySQL database. The problem arises when I want to reformat the input content. I want to…

8 Answers
  • 2020-11-27 14:57
    function correctedText($txt=''){
      $ss = str_split($txt);
        
      for($i=0; $i<count($ss); $i++){
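        $asciiNumber = ord($ss[$i]); // byte value of a single character
        
        // Values above 127 are not ASCII: the ranges below are ISO-8859-1 (Latin-1)
        // codes, listed in the DEC column at https://www.ascii-code.com.
        // This only works when $txt is single-byte Latin-1, not UTF-8.
        
        // only capital letters are handled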
        if($asciiNumber >= 192 && $asciiNumber <= 197)$ss[$i] = 'A';
        elseif($asciiNumber == 198)$ss[$i] = 'AE';
        elseif($asciiNumber == 199)$ss[$i] = 'C';
        elseif($asciiNumber >= 200 && $asciiNumber <= 203)$ss[$i] = 'E';
        elseif($asciiNumber >= 204 && $asciiNumber <= 207)$ss[$i] = 'I';
        elseif($asciiNumber == 209)$ss[$i] = 'N';
        elseif($asciiNumber >= 210 && $asciiNumber <= 214)$ss[$i] = 'O';
        elseif($asciiNumber == 216)$ss[$i] = 'O';
        elseif($asciiNumber >= 217 && $asciiNumber <= 220)$ss[$i] = 'U';
        elseif($asciiNumber == 221)$ss[$i] = 'Y';
      }
        
      $txt = implode('', $ss);
        
      return $txt;
    }
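
The function above reads one byte at a time with ord(), so it only behaves as intended when $txt is single-byte ISO-8859-1. For UTF-8 input, a minimal sketch of the same capital-letter mapping could use strtr() with a lookup table. The function name replaceAccentedCapitals and the $map array are illustrative assumptions, not part of the answer, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Sketch: same capital-letter mapping as above, but keyed on whole UTF-8 characters.
function replaceAccentedCapitals(string $txt): string
{
    // Hypothetical lookup table covering the Latin-1 capitals handled above.
    $map = [
        'À' => 'A', 'Á' => 'A', 'Â' => 'A', 'Ã' => 'A', 'Ä' => 'A', 'Å' => 'A',
        'Æ' => 'AE',
        'Ç' => 'C',
        'È' => 'E', 'É' => 'E', 'Ê' => 'E', 'Ë' => 'E',
        'Ì' => 'I', 'Í' => 'I', 'Î' => 'I', 'Ï' => 'I',
        'Ñ' => 'N',
        'Ò' => 'O', 'Ó' => 'O', 'Ô' => 'O', 'Õ' => 'O', 'Ö' => 'O', 'Ø' => 'O',
        'Ù' => 'U', 'Ú' => 'U', 'Û' => 'U', 'Ü' => 'U',
        'Ý' => 'Y',
    ];

    // strtr() with an array replaces each key with its value in a single pass.
    return strtr($txt, $map);
}

echo replaceAccentedCapitals('ÀÉÎÕÜ Æ'), "\n"; // prints "AEIOU AE"
```

Because the keys are complete multibyte sequences, the lookup does not split a UTF-8 character in the middle.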
    
  • 2020-11-27 14:58

    The string $chain is in the same character encoding as the characters in the array - it's possible, even likely, that the $first_name string is in a different encoding, and so those characters don't match. You might want to try using the multibyte string functions instead.

    Try mb_convert_encoding. You might also want to try "HTML-ENTITIES" as the to_encoding parameter; then you don't need to worry about how the characters get converted, because the result is very predictable. (Note that PHP 8.2 deprecates the HTML-ENTITIES pseudo-encoding in mbstring.)

    Assuming your input to this script is in UTF-8, probably not a bad place to start...

    $first_name = mb_convert_encoding($first_name, "HTML-ENTITIES", "UTF-8"); 
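
Once accented letters are entities, their names are predictable enough to strip down to the base letter. The following is a minimal sketch of that idea, assuming UTF-8 input and using the core htmlentities() function rather than mbstring; the function name entitiesToAscii and the list of entity suffixes are illustrative assumptions and do not cover every named entity.

```php
<?php
// Sketch: turn accented letters into named entities, then keep only the base letter(s).
function entitiesToAscii(string $text): string
{
    // "à" becomes "&agrave;", "Æ" becomes "&AElig;", "&" becomes "&amp;".
    $encoded = htmlentities($text, ENT_QUOTES, 'UTF-8');

    // Reduce entities such as &agrave; or &AElig; to their leading letter(s).
    $stripped = preg_replace(
        '~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|ring|slash|tilde|uml);~i',
        '$1',
        $encoded
    );

    // Restore everything that was not an accented letter, e.g. &amp; back to &.
    return html_entity_decode($stripped, ENT_QUOTES, 'UTF-8');
}

echo entitiesToAscii('à la crème, señor & Æther'), "\n"; // prints "a la creme, senor & AEther"
```

Characters that have no named entity with one of the listed suffixes pass through unchanged.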
    
  • 2020-11-27 14:58

    CodeIgniter way:

    $this->load->helper('text');
    
    $string = convert_accented_characters($string);
    

    This function uses a companion config file application/config/foreign_chars.php to define the to and from array for transliteration.

    https://www.codeigniter.com/user_guide/helpers/text_helper.html#ascii_to_entities
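
For readers not using CodeIgniter, the idea of a "to and from" array can be sketched in plain PHP. This is only an approximation in the style of such a config file: the $foreignCharacters entries and the convertAccented function are hypothetical and are not copied from the framework.

```php
<?php
// Hypothetical map in the style of a foreign_chars config: regex pattern => ASCII replacement.
$foreignCharacters = [
    '/à|á|â|ã|å/' => 'a',
    '/ä|æ/'       => 'ae',
    '/ç/'         => 'c',
    '/è|é|ê|ë/'   => 'e',
    '/ì|í|î|ï/'   => 'i',
    '/ñ/'         => 'n',
    '/ò|ó|ô|õ|ø/' => 'o',
    '/ù|ú|û/'     => 'u',
];

// Apply every pattern in the map to the string in one preg_replace() call.
function convertAccented(string $str, array $map): string
{
    // preg_replace() returns null on error, so fall back to the input in that case.
    return preg_replace(array_keys($map), array_values($map), $str) ?? $str;
}

echo convertAccented('àè señor', $foreignCharacters), "\n"; // prints "ae senor"
```

Keeping the map in its own array (or file) means new characters can be added without touching the function.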

  • 2020-11-27 15:03

    I wish I had found this thread sooner. The function I made (which took me way too long) is below:

    function CheckLetters($field){
        $letters = [
            0 => "a à á â ä æ ã å ā",
            1 => "c ç ć č",
            2 => "e é è ê ë ę ė ē",
            3 => "i ī į í ì ï î",
            4 => "l ł",
            5 => "n ñ ń",
            6 => "o ō ø œ õ ó ò ö ô",
            7 => "s ß ś š",
            8 => "u ū ú ù ü û",
            9 => "w ŵ",
            10 => "y ŷ ÿ",
            11 => "z ź ž ż",
        ];
        foreach ($letters as $values){
            // The first entry is the ASCII replacement; the rest are its accented variants.
            $variants = explode(" ", $values);
            $newValue = array_shift($variants);
            // str_replace() replaces every occurrence of every variant in one call,
            // so no loop, regex, or by-reference iteration is needed.
            $field = str_replace($variants, $newValue, $field);
        }
        return $field;
    }
    
  • 2020-11-27 15:06

    As of PHP >= 5.4.0 (requires the intl extension):

    $translatedString = transliterator_transliterate('Any-Latin; Latin-ASCII; [\u0080-\u7fff] remove', $string);
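
Since transliterator_transliterate() comes from the intl extension and returns false on failure, a guarded wrapper may be useful. This is a minimal sketch: the function name toAscii is an assumption, it uses the shorter rule 'Any-Latin; Latin-ASCII' rather than the full rule above, and it simply returns the input unchanged when intl is not available.

```php
<?php
// Sketch: transliterate to ASCII when intl is available, otherwise leave the text alone.
function toAscii(string $text): string
{
    if (!function_exists('transliterator_transliterate')) {
        return $text; // intl extension not loaded
    }

    $result = transliterator_transliterate('Any-Latin; Latin-ASCII', $text);

    // false signals a failed transliteration; fall back to the original text.
    return $result === false ? $text : $result;
}

echo toAscii('àè'), "\n"; // with intl loaded this prints "ae"
```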
    
  • 2020-11-27 15:07

    There's a much easier way to do this, using iconv. From the user notes, this seems to be what you want to do: character transliteration.

    // PHP.net User notes
    <?php
        $string = "ʿABBĀSĀBĀD";
    
        echo iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $string);
        // output: [nothing, and you get a notice]
    
        echo iconv('UTF-8', 'ISO-8859-1//IGNORE', $string);
        // output: ABBSBD
    
        echo iconv('UTF-8', 'ISO-8859-1//TRANSLIT//IGNORE', $string);
        // output: ABBASABAD
        // Yay! That's what I wanted!
    ?>
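
As the first call in the notes shows, iconv() can fail and return false, and what //TRANSLIT produces depends on the iconv implementation and locale of the system. A minimal sketch of a guarded call follows; the name safeTranslit and the choice to return the input unchanged on failure are assumptions.

```php
<?php
// Sketch: attempt an ASCII transliteration and fall back to the input if iconv() fails.
function safeTranslit(string $text): string
{
    // @ suppresses the notice or warning that iconv() raises on failure.
    $result = @iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $text);

    return $result === false ? $text : $result;
}

// The exact output for accented input varies between systems.
echo safeTranslit('ʿABBĀSĀBĀD'), "\n";
```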
    

    Be very careful with your character encodings, so that you keep the same encoding at every stage of the process: front end, form submission, and the source files themselves. Before PHP 5.4, functions such as htmlentities() defaulted to ISO-8859-1; PHP 5.4 changed that default to UTF-8 (finally!).
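
One way to keep the encoding consistent is to normalize form input to UTF-8 before doing any replacement. This is a minimal sketch that relies on the mbstring extension; the function name ensureUtf8 and the ISO-8859-1 fallback are assumptions about where non-UTF-8 input is likely to come from.

```php
<?php
// Sketch: return valid UTF-8, converting from an assumed legacy encoding when needed.
function ensureUtf8(string $input, string $fallbackEncoding = 'ISO-8859-1'): string
{
    // Already valid UTF-8: nothing to do.
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input;
    }

    // Otherwise treat the bytes as the fallback encoding and convert them.
    return mb_convert_encoding($input, 'UTF-8', $fallbackEncoding);
}

// "\xE0" is "à" in ISO-8859-1; the result is the two-byte UTF-8 form "\xC3\xA0".
echo bin2hex(ensureUtf8("\xE0")), "\n"; // prints "c3a0"
```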

    There are a couple of functions you can play around with for ideas. The first is from CakePHP's Inflector class, called slug:

    public static function slug($string, $replacement = '_') {
        $quotedReplacement = preg_quote($replacement, '/');
    
        $merge = array(
            '/[^\s\p{Ll}\p{Lm}\p{Lo}\p{Lt}\p{Lu}\p{Nd}]/mu' => ' ',
            '/\\s+/' => $replacement,
            sprintf('/^[%s]+|[%s]+$/', $quotedReplacement, $quotedReplacement) => '',
        );
    
        $map = self::$_transliteration + $merge;
        return preg_replace(array_keys($map), array_values($map), $string);
    }
    

    It depends on a self::$_transliteration array, which is similar to what you were doing in your question. You can see the source for Inflector on GitHub.

    Another is a function I use personally, which comes from here.

    function slugify($text,$strict = false) {
        $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
        // replace non letter or digits by -
        $text = preg_replace('~[^\\pL\d.]+~u', '-', $text);
    
        // trim
        $text = trim($text, '-');
        setlocale(LC_CTYPE, 'en_GB.utf8');
        // transliterate
        if (function_exists('iconv')) {
            $text = iconv('utf-8', 'us-ascii//TRANSLIT', $text);
        }
    
        // lowercase
        $text = strtolower($text);
        // remove unwanted characters
        $text = preg_replace('~[^-\w.]+~', '', $text);
        if (empty($text)) {
            return 'empty_$';
        }
        if ($strict) {
            $text = str_replace(".", "_", $text);
        }
        return $text;
    }
    

    What those functions do is transliterate and create 'slugs' from arbitrary text input, which is a very very useful thing to have in your toolchest when making web apps. Hope this helps!
