How do I remove emoji from a string

你的背包 2020-11-28 10:44

My problem is to remove emoji from a string with a regex, without also removing CJK (Chinese, Japanese, Korean) characters. I tried this regex:

REGEX =         


        
10 Answers
  • 2020-11-28 11:27

    One more alternative

    "Scheiße! I hate emoji
  • 2020-11-28 11:28

    This very short regex covers all emoji on getemoji.com so far:

    [\u{1F300}-\u{1F5FF}\u{1F1E6}-\u{1F1FF}\u{2700}-\u{27BF}\u{1F900}-\u{1F9FF}\u{1F600}-\u{1F64F}\u{1F680}-\u{1F6FF}\u{2600}-\u{26FF}]
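
As a minimal sketch of how a character class like this might be applied in Ruby: the constant `EMOJI_RANGES` and the helper `strip_emoji` are hypothetical names, not part of the answer. The ranges are written without `|` separators, because inside a character class `|` is a literal pipe rather than alternation.

```ruby
# Hypothetical names; the ranges are the ones listed in the answer.
EMOJI_RANGES = /[\u{1F300}-\u{1F5FF}\u{1F1E6}-\u{1F1FF}\u{2700}-\u{27BF}\u{1F900}-\u{1F9FF}\u{1F600}-\u{1F64F}\u{1F680}-\u{1F6FF}\u{2600}-\u{26FF}]/

# Delete every character that falls in one of the listed ranges.
def strip_emoji(text)
  text.gsub(EMOJI_RANGES, '')
end

# CJK characters lie outside all of the ranges, so they are kept.
puts strip_emoji("日本語 ok \u{1F600}\u{2764}")
# A literal pipe is left alone because it is not part of the class.
puts strip_emoji("a|b")
```

Listing ranges explicitly keeps everything else intact, but the list has to be extended by hand when new emoji blocks are added.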
    
  • 2020-11-28 11:28

    Careful: the answer from Aray has some side effects.

    "-".gsub(/[^\p{L}\s]+/, '').squeeze(' ').strip
    => ""
    

    even though this is supposed to be a plain minus sign (-).
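
The side effect goes beyond the minus sign. A rough sketch, assuming the same `/[^\p{L}\s]+/` pattern quoted above (the names `LETTERS_ONLY` and `letters_only` are hypothetical): digits and punctuation are removed along with the emoji, while CJK letters survive.

```ruby
# Matches every run of characters that are neither letters (\p{L})
# nor whitespace, so digits and punctuation are matched too.
LETTERS_ONLY = /[^\p{L}\s]+/

def letters_only(text)
  # squeeze(' ') collapses the double spaces left behind by removals.
  text.gsub(LETTERS_ONLY, '').squeeze(' ').strip
end

p letters_only("-")                # the lone minus sign disappears
p letters_only("Room 101, 3-5pm")  # digits and punctuation are lost
p letters_only("你好 \u{1F600}")   # CJK letters are kept, the emoji is removed
```

Whether this is acceptable depends on the input; for text that contains numbers or punctuation, a pattern that targets emoji ranges directly is probably the safer choice.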

  • 2020-11-28 11:30
    REGEX = /[^\u{1F600}-\u{1F6FF}\s]/
    

    or

    REGEX = /[\u{1F600}-\u{1F6FF}\s]/
    REGEX = /[\u{1F600}-\u{1F6FF}]/
    REGEX = /[^\u{1F600}-\u{1F6FF}]/
    

    because your original regex seems to indicate that you are trying to match everything that is not an emoji and not whitespace, and I don't know why you would want to do that.

    Also:

    • the emoji are in 1F300-1F6FF rather than 1F600-1F6FF; you may want to change that

    • if you want to remove all astral characters (for example, because you are dealing with software that doesn't support all of Unicode), you should use 10000-10FFFF.
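
A small sketch of the astral-plane option, with hypothetical names `ASTRAL` and `strip_astral`. One caveat worth keeping in mind for the original question: some rare CJK ideographs (for example U+20000, in CJK Unified Ideographs Extension B) are astral characters too, so this pattern removes them along with the emoji.

```ruby
# Every code point above the Basic Multilingual Plane.
ASTRAL = /[\u{10000}-\u{10FFFF}]/

def strip_astral(text)
  text.gsub(ASTRAL, '')
end

p strip_astral("中文 \u{1F600}")  # common CJK characters are kept, the emoji is removed
p strip_astral("\u{20000}")       # a rare astral CJK ideograph is removed as well
```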

    EDIT: You almost certainly want REGEX = /[\u{1F600}-\u{1F6FF}]/ or similar. Your original regex matched everything that is not whitespace and not in the range 0-\u1F6F. Since spaces are whitespace and English letters are in the range 0-\u1F6F, while Chinese characters are in neither, the regex matched the Chinese characters and removed them.
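
To make the difference concrete, here is a sketch that contrasts the two behaviours. `AS_DESCRIBED` is a hypothetical pattern built only from the description in the EDIT note (not whitespace, and not in the range 0-\u1F6F); it is not the asker's exact regex. `SUGGESTED` is the pattern recommended in the note.

```ruby
# Hypothetical pattern built from the description only:
# match anything that is not whitespace and not in '0'..U+1F6F.
AS_DESCRIBED = /[^0-\u1F6F\s]/
# The pattern suggested in the EDIT note.
SUGGESTED = /[\u{1F600}-\u{1F6FF}]/

text = "hello 中文 \u{1F600}"

# Chinese characters are above U+1F6F, so they are matched and removed.
p text.gsub(AS_DESCRIBED, '')
# Only the emoji range is matched, so the Chinese characters stay.
p text.gsub(SUGGESTED, '')
```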
