Regular expression to restrict Extended ASCII character set

粉色の甜心 2021-01-23 16:09

I have a multilingual application that creates XML files, but Extended ASCII characters from 168 to 254 (¿⌐¬½¼¡«»░▓│┤╡╢╖╕╣║╗╜╛┐└┴┬├) are not supported in XML tags, so I would like

3 Answers
  • 2021-01-23 16:42

    Instead, you can use a range inside a character class to exclude a specific range of characters by their hex codes:

    [^\xA8-\xFE]
    

    The regex above matches any single character except those in the given range. \xA8 and \xFE are the hex codes for the endpoints of the range you posted, [168, 254].
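
As a rough illustration of the negated range above, here is a minimal Python sketch. Python is used only for illustration, and the helper names `is_clean` and `strip_restricted` are hypothetical. One caveat: in a Unicode-aware engine, `\xA8-\xFE` denotes the code points U+00A8 to U+00FE, so it covers characters such as ¿, ½ and «, but not box-drawing glyphs such as ░ (U+2591), which would have to be handled separately.

```python
import re

# Negated class from the answer: any character outside U+00A8..U+00FE.
ALLOWED_ONLY = re.compile(r"[^\xA8-\xFE]*")
# Same range without the negation: the characters to reject or remove.
RESTRICTED = re.compile(r"[\xA8-\xFE]")


def is_clean(text: str) -> bool:
    """Return True if every character lies outside the restricted range."""
    return ALLOWED_ONLY.fullmatch(text) is not None


def strip_restricted(text: str) -> str:
    """Remove every character that falls inside the restricted range."""
    return RESTRICTED.sub("", text)
```

Validation (`is_clean`) rejects the input outright, while `strip_restricted` silently drops the offending characters; which one fits depends on whether users should be told about the problem.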

  • 2021-01-23 16:57

    Although @Oded's suggestion was applicable, I used the following solution:

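    Imports System.Text.RegularExpressions

    Dim filteredInput As String

    ' Character class listing every symbol that should be stripped from the input.
    Private Const XML_RESTRICTED_CHARACTERS As String = "[☺☻♥♦♣♠•◘○◙♂♀♪♫☼►◄↕‼¶§▬↨↑↓→←∟↔▲▼#$%&()*+,-./:;<=>?@[\]^_`¢£¥₧ƒªº¿⌐¬½¼¡«»░▒▓│┤╡╢╖╕╣║╗╝╜╛┐└┴┬├─┼╞╟╚╔╩╦╠═╬╧╨╤╥╙╘╒╓╫╪┘┌█▄▌▐▀αßΓπΣσµτΦΩδ∞φε∩≡±≥≤⌠⌡÷≈°∙·√ⁿ²■""}{]"

    ' Note: ToLower() also lowercases the input before the restricted characters are removed.
    filteredInput = Regex.Replace(strInput.ToLower(), XML_RESTRICTED_CHARACTERS, "")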
    
  • 2021-01-23 16:59

    "The second option was to create a string of all symbols from 168 to 254 and check whether the input contains any of them, but I am not sure if it is a reliable and accurate solution."

    Yes, this is a reliable and accurate solution. It is also more lightweight than regular expressions.
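
The containment check described above could be sketched as follows in Python. This assumes the 168 to 254 range refers to code page 437, whose glyphs match the ones listed in the question; the names `RESTRICTED` and `contains_restricted` are hypothetical.

```python
# Build the restricted set by decoding bytes 168..254 as code page 437
# (an assumption based on the glyphs shown in the question).
RESTRICTED = frozenset(bytes(range(168, 255)).decode("cp437"))


def contains_restricted(text: str) -> bool:
    """Return True if text contains at least one restricted character."""
    return any(ch in RESTRICTED for ch in text)
```

A set lookup keeps each per-character test constant time, so the check scales linearly with the input length and needs no regex engine.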
