Remove non-utf8 characters from string

心在旅途 2020-11-22 11:56

I'm having a problem removing non-UTF-8 characters from a string; they are not displayed properly. The characters look like this: 0x97 0x61 0x6C 0x6F (hex representation).
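
For reference, the sample bytes can be checked as follows. This is a minimal sketch, not part of the original question: it assumes PHP 7 or later and relies on `preg_match()` with the `u` modifier returning `false` when the subject is not valid UTF-8. The variable names are illustrative.

```php
<?php
// The four bytes from the question: 0x97 0x61 0x6C 0x6F.
$bytes = "\x97\x61\x6C\x6F";

// With the /u modifier, preg_match() returns false for a subject that is not valid UTF-8.
$isValidUtf8 = preg_match('//u', $bytes) === 1;

var_dump($isValidUtf8);       // bool(false)
var_dump(bin2hex($bytes));    // string(8) "97616c6f"
var_dump(substr($bytes, 1));  // string(3) "alo"
```

The first byte, 0x97, is a continuation byte with no lead byte before it, so the string is not well-formed UTF-8; the remaining three bytes are plain ASCII ("alo").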

18 answers
  •  隐瞒了意图╮
    2020-11-22 12:50

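    // Matches one well-formed UTF-8 sequence at a time (byte-wise, no /u modifier).
    static $preg = <<<'END'
    %(
    [\x09\x0A\x0D\x20-\x7E]              # ASCII: tab, LF, CR, printable
    | [\xC2-\xDF][\x80-\xBF]             # 2-byte sequences
    | \xE0[\xA0-\xBF][\x80-\xBF]         # 3-byte, excluding overlongs
    | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # 3-byte, straight
    | \xED[\x80-\x9F][\x80-\xBF]         # 3-byte, excluding surrogates
    | \xF0[\x90-\xBF][\x80-\xBF]{2}      # 4-byte, planes 1-3
    | [\xF1-\xF3][\x80-\xBF]{3}          # 4-byte, planes 4-15
    | \xF4[\x80-\x8F][\x80-\xBF]{2}      # 4-byte, plane 16
    )%xs
    END;
    // Keep only the matched sequences; every byte that matched nothing is dropped.
    if (preg_match_all($preg, $string, $match)) {
        $string = implode('', $match[0]);
    } else {
        $string = '';
    }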
    

    It works on our service.
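
The snippet above can be wrapped in a function and tried on the bytes from the question. This is only a sketch: the function name `stripInvalidUtf8` is hypothetical, the pattern is copied unchanged from the answer, and PHP 7 or later is assumed for the type declarations.

```php
<?php
// Hypothetical helper; the pattern is the one from the answer above.
function stripInvalidUtf8(string $string): string
{
    // Nowdoc body and closing marker are kept at column 0 so this also parses before PHP 7.3.
    static $preg = <<<'END'
%(
[\x09\x0A\x0D\x20-\x7E]
| [\xC2-\xDF][\x80-\xBF]
| \xE0[\xA0-\xBF][\x80-\xBF]
| [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}
| \xED[\x80-\x9F][\x80-\xBF]
| \xF0[\x90-\xBF][\x80-\xBF]{2}
| [\xF1-\xF3][\x80-\xBF]{3}
| \xF4[\x80-\x8F][\x80-\xBF]{2}
)%xs
END;

    // Collect every well-formed sequence; bytes that match nothing are skipped.
    if (preg_match_all($preg, $string, $match)) {
        return implode('', $match[0]);
    }
    return '';
}

// The bytes from the question: the stray 0x97 is removed, the ASCII letters stay.
var_dump(stripInvalidUtf8("\x97\x61\x6C\x6F"));  // string(3) "alo"

// A valid two-byte sequence (0xC3 0xA9) passes through unchanged.
var_dump(stripInvalidUtf8("h\xC3\xA9llo") === "h\xC3\xA9llo");  // bool(true)
```

Note that the ASCII branch of the pattern only allows tab, LF, CR and 0x20 to 0x7E, so other control characters such as NUL or DEL are removed as well, even though they are valid UTF-8.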
