How can I delete special characters?

爱一瞬间的悲伤 2021-02-02 06:18

I'm practicing with Ruby and regex to delete certain unwanted characters. For example:

input = input.gsub(/<\/?[^>]*>/, '')

and fo

5 answers
  •  南方客 (OP)
    2021-02-02 06:21

    First of all, I think it might be easier to define what constitutes "correct input" and remove everything else. For example:

    input = input.gsub(/[^0-9A-Za-z]/, '')
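
To make the effect of that whitelist concrete, here is a minimal sketch. The helper name `keep_alphanumeric` is made up for illustration; the regex is the one from the line above.

```ruby
# Hypothetical helper wrapping the whitelist regex shown above:
# everything that is not an ASCII letter or digit is removed.
def keep_alphanumeric(text)
  text.gsub(/[^0-9A-Za-z]/, '')
end

puts keep_alphanumeric("Hello, World! 123")  # prints HelloWorld123
puts keep_alphanumeric("<b>Ruby™</b>")       # prints bRubyb
```

Note that spaces and punctuation are dropped too, and that tag names survive as plain letters, so any HTML tags probably need to be stripped before this step.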
    

    If that's not what you want (for example, you need to support non-Latin alphabets), then I think you should make a list of the glyphs you want to remove (like ™ or ☻) and remove them one by one, since it's hard to programmatically distinguish a Chinese or Arabic character from a pictograph.
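
The glyph-list idea could be sketched roughly as follows. The constant `UNWANTED_GLYPHS` and both helper names are hypothetical, and the list only contains the two glyphs mentioned above. The second helper is an alternative that is not part of the suggestion above: Ruby's regex engine supports Unicode property classes such as `\p{L}` (letters) and `\p{N}` (numbers), which may be enough in some cases, though it will also drop punctuation and combining marks.

```ruby
# Illustrative blacklist; extend it with the glyphs you actually want to drop.
UNWANTED_GLYPHS = ["™", "☻"].freeze

# Hypothetical helper: remove every glyph in the list, leave the rest alone.
def remove_glyphs(text, glyphs = UNWANTED_GLYPHS)
  text.gsub(Regexp.union(glyphs), '')
end

# Hypothetical alternative: keep Unicode letters, numbers and whitespace only.
def keep_letters_and_digits(text)
  text.gsub(/[^\p{L}\p{N}\s]/, '')
end

puts remove_glyphs("Ruby™ 你好☻")            # prints Ruby 你好
puts keep_letters_and_digits("Ruby™ 你好☻")  # prints Ruby 你好
```

The blacklist is predictable but has to be maintained by hand; the property-class version needs no list but is only as good as the categories fit your data.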

    Finally, you might want to normalize your input by converting to or from HTML escape sequences.
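
One way to do that normalization, assuming the `cgi` library that has long shipped with Ruby is available, is `CGI.unescapeHTML` and `CGI.escapeHTML`. This is only a sketch; the sample string is made up.

```ruby
require 'cgi'

escaped = "Tom &amp; Jerry &lt;b&gt;"

# Convert HTML escape sequences to the characters they stand for.
plain = CGI.unescapeHTML(escaped)
puts plain      # prints Tom & Jerry <b>

# Convert back when the text has to be embedded in HTML again.
reescaped = CGI.escapeHTML(plain)
puts reescaped  # prints Tom &amp; Jerry &lt;b&gt;
```

Unescaping first means a later character filter sees the real characters rather than fragments like `amp`.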
