C# - Regular expression to find a surrogate pair of a Unicode code point in any string?
Question

I am trying to parse a message that may contain emojis. An example of a message that could be received looks like this:

{"type":"chat","msg":"UserName:\u00a0\ud83d\ude0b \n"}

What should match is \u00a0 as a single character, and \ud83d\ude0b as a pair. I have a regex that can pull out individual codes, but it does not match pairs, so it cannot capture a full emoji:

\\u[a-z0-9]{4}

Is there a clean way to account for any number of emojis in a sentence, so that I can replace each surrogate pair using the function I already have? Thanks!
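
One possible approach, as a minimal sketch rather than a definitive answer: put the surrogate-pair alternative first in the pattern, so that a high surrogate escape (\ud800 to \udbff) immediately followed by a low surrogate escape (\udc00 to \udfff) is consumed as one match, and fall back to a single \uXXXX escape otherwise. This assumes the message is still raw text containing literal backslash-u sequences. The names EscapeMatcher, EscapePattern, and ReplaceEscapes are hypothetical, and the replacer delegate stands in for the replacement function mentioned in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeMatcher
{
    // First alternative: high surrogate escape (D800-DBFF) followed directly
    // by a low surrogate escape (DC00-DFFF), matched together as one unit.
    // Second alternative: any single \uXXXX escape with four hex digits.
    private static readonly Regex EscapePattern = new Regex(
        @"\\u[dD][89abAB][0-9a-fA-F]{2}\\u[dD][c-fC-F][0-9a-fA-F]{2}|\\u[0-9a-fA-F]{4}",
        RegexOptions.Compiled);

    // Hypothetical helper: passes each matched escape (single or pair)
    // to the caller-supplied replacer and substitutes its return value.
    public static string ReplaceEscapes(string input, Func<string, string> replacer)
    {
        return EscapePattern.Replace(input, m => replacer(m.Value));
    }

    public static void Main()
    {
        // Verbatim string, so the backslashes stay literal as in the raw message.
        string raw = @"{""type"":""chat"",""msg"":""UserName:\u00a0\ud83d\ude0b \n""}";

        // Lists \u00a0 as one match and \ud83d\ude0b as one match; \n is skipped.
        foreach (Match m in EscapePattern.Matches(raw))
        {
            Console.WriteLine(m.Value);
        }

        // Example replacement: wrap each matched escape in square brackets.
        Console.WriteLine(ReplaceEscapes(raw, s => "[" + s + "]"));
    }
}
```

Unlike [a-z0-9], the character classes here are restricted to hex digits and accept both upper and lower case. If the message is valid JSON, deserializing it with a JSON parser would decode these escapes into real characters, in which case matching on the escaped text may not be needed at all.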