How can I delete special characters?

爱一瞬间的悲伤 2021-02-02 06:18

I'm practicing with Ruby and regex to delete certain unwanted characters. For example:

input = input.gsub(/<\/?[^>]*>/, '')

and fo

5 Answers
  • 2021-02-02 06:21

    First of all, I think it might be easier to define what constitutes "correct input" and remove everything else. For example:

    input = input.gsub(/[^0-9A-Za-z]/, '')
    

    If that's not what you want (for example, you need to support non-Latin alphabets), then I think you should make a list of the glyphs you want to remove (like ™ or ☻) and remove them one by one, since it's hard to distinguish programmatically between a Chinese or Arabic character and a pictograph.
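
    As a possible middle ground between the whitelist and the per-glyph list, Ruby 1.9+ regexes support Unicode property classes such as \p{L} (letters) and \p{N} (numbers). The sketch below assumes a UTF-8 string and that "letters and digits from any script" is what you want to keep; the sample string and variable names are made up for illustration.

```ruby
# Sketch: keep letters and digits from any script, drop everything else
# (symbols, pictographs, punctuation, whitespace).
# \p{L} matches any Unicode letter, \p{N} any Unicode number.
sample = "aøb™☻中文1"
letters_and_digits = sample.gsub(/[^\p{L}\p{N}]/, '')
puts letters_and_digits
```

    Whether this fits depends on the input: it also removes spaces and punctuation, so you may need to add those to the character class.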

    Finally, you might want to normalize your input by converting to or from HTML escape sequences.

  • 2021-02-02 06:25

    An easier way to do this, inspired by Can Berk Güder's answer:

    In order to delete special characters:

    input = input.gsub(/\W/, '')
    

    In order to keep word characters:

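    input = input.scan(/\w/).join
    

    Either way, input ends up the same. Note that scan returns an array of matches, so it has to be joined back into a string. You can try the patterns at http://rubular.com/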
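
    A minimal sketch comparing the two approaches on a made-up sample string. It also shows that \w counts the underscore as a word character, so underscores survive both versions.

```ruby
# Sample input is hypothetical; any string works.
input = "foo_bar! 42%"

# Approach 1: delete every non-word character.
removed = input.gsub(/\W/, '')

# Approach 2: collect every word character, then join the array into a string.
kept = input.scan(/\w/).join

puts removed
puts kept
puts removed == kept
```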

  • 2021-02-02 06:40

    You can match all the characters you want, and then join them together, like this:

    original = "aøbæcå"
    stripped = original.scan(/[a-zA-Z]/).join
    puts stripped
    

    which outputs "abc"
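
    If a regex isn't needed, String#delete with a negated character set may be a shorter alternative. This is a sketch of a different technique than the scan-and-join above, not a replacement for it.

```ruby
original = "aøbæcå"

# A leading ^ negates the set: delete every character that is NOT a-z or A-Z.
stripped = original.delete("^a-zA-Z")
puts stripped
```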

  • 2021-02-02 06:45

    If you just wanted ASCII characters, then you can use:

    original = "aøbauhrhræoeuacå"
    cleaned = ""
    # Append only bytes in the ASCII range (0..127); the bytes of multibyte characters are skipped.
    original.each_byte { |x| cleaned << x unless x > 127 }
    cleaned   # => "abauhrhroeuac"
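
    The same filtering can probably be written as a single gsub. This is a sketch assuming a UTF-8 string; it works per character rather than per byte, but the result should match the loop above.

```ruby
original = "aøbauhrhræoeuacå"

# Remove every character outside the 7-bit ASCII range.
cleaned = original.gsub(/[^\x00-\x7F]/, '')
puts cleaned

# String#ascii_only? confirms nothing non-ASCII is left.
puts cleaned.ascii_only?
```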
    
  • 2021-02-02 06:46

    You can use parameterize, which comes from ActiveSupport (part of Rails) rather than core Ruby:

    '@!#$%^&*()111'.parameterize
     => "111" 
    