Does anyone know how Facebook encodes emoji with high-surrogate pairs in the Graph API?
Low surrogate pairs seem fine. For example, ❤️ (HEAVY BLACK HEART, though it
Answering my own question though most of the credit belongs to @bobince for showing me the way in the comments above.
The answer is that Facebook encodes emoji using the "Google" encoding as seen on this Unicode table.
I have created a ruby gem called emojivert that can convert from one encoding to another, including from "Google" to "Unified". It is based on another existing project called rails-emoji.
So the failing example above would be fixed by doing:
string = ActiveSupport::JSON.decode('"\udbba\udf59"')
> "