Regex to remove non-alphanumeric characters from UTF-8 strings

傲寒 2021-01-11 11:36

How can I remove characters such as punctuation, commas, and dashes from a string in a multibyte-safe manner?

I will be working with input from many different languag

4 answers
  •  伪装坚强ぢ
    2021-01-11 12:18

    I used this:

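    // Replace each run of characters that are neither letters nor numbers with one space.
    $clean = preg_replace( "/[^\p{L}\p{N}]+/u", " ", $raw );
    // Collapse any remaining runs of Unicode separators into a single space.
    $clean = preg_replace( "/\p{Z}{2,}/u", " ", $clean );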
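
A minimal sketch of the same idea wrapped in a function, in case it helps. The name `strip_non_alphanumeric` is hypothetical and not part of the answer above. Two things here are assumptions on my part: `\p{M}` is added to the class so that combining marks (decomposed accents, Indic vowel signs) stay attached to their letters, and the result is trimmed. Note that `|` has no alternation meaning inside a character class, so the properties are simply listed side by side. The file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper for illustration; not from the original answer.
function strip_non_alphanumeric(string $raw): string
{
    // Replace every run of characters that are not letters (\p{L}),
    // combining marks (\p{M}), or numbers (\p{N}) with a single space.
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    $clean = preg_replace('/[^\p{L}\p{M}\p{N}]+/u', ' ', $raw);

    // preg_replace() returns null on failure, for example when the
    // subject is not valid UTF-8.
    if ($clean === null) {
        throw new RuntimeException('preg_replace failed, code ' . preg_last_error());
    }

    // Drop the leading or trailing space left by punctuation at the edges.
    return trim($clean);
}

echo strip_non_alphanumeric("Hello, wörld! 你好，世界 -- a|b"), "\n";
// prints: Hello wörld 你好 世界 a b
```

Because each run is already replaced by exactly one space, a second pass to collapse separators is not needed in this variant.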
    
