In Python, these three commands print the same emoji:
print "\xF0\x9F\x8C\x80"
See "Unicode Literals in Python Source Code":
In Python source code, Unicode literals are written as strings prefixed with the ‘u’ or ‘U’ character: u'abcdefghijk'. Specific code points can be written using the \u escape sequence, which is followed by four hex digits giving the code point. The \U escape sequence is similar, but expects 8 hex digits, not 4.
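As a rough illustration of the two escape forms, here is a minimal sketch. It assumes Python 3, where every str literal is already Unicode and the u prefix is optional; the variable names are only for illustration.

```python
# Assumes Python 3: str literals are Unicode, so no u prefix is needed.
bmp_char = "\u00e9"         # \u takes exactly four hex digits: U+00E9
astral_char = "\U0001F300"  # \U takes exactly eight hex digits: U+1F300

print(hex(ord(bmp_char)))     # 0xe9
print(hex(ord(astral_char)))  # 0x1f300
print(len(astral_char))       # 1 (a single code point in Python 3)
```

A code point above U+FFFF does not fit in four hex digits, which is why the emoji needs the \U form.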
In [1]: "\xF0\x9F\x8C\x80".decode('utf-8')
Out[1]: u'\U0001f300'
In [2]: u'\U0001F300'.encode('utf-8')
Out[2]: '\xf0\x9f\x8c\x80'
In [3]: u'\ud83c\udf00'.encode('utf-8')
Out[3]: '\xf0\x9f\x8c\x80'
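The session above is Python 2. A hedged Python 3 sketch of the same round trip follows; it assumes Python 3, where the byte string needs a b prefix and a lone surrogate pair is not encoded to UTF-8 by default. The helper to_surrogate_pair is a hypothetical name, not a standard library function, and it only shows where \ud83c\udf00 comes from.

```python
# Assumes Python 3. Bytes literals need the b prefix.
raw = b"\xF0\x9F\x8C\x80"
text = raw.decode("utf-8")           # UTF-8 bytes -> one code point
assert text == "\U0001F300"
assert text.encode("utf-8") == raw   # and back again

def to_surrogate_pair(code_point):
    """Hypothetical helper: split a code point above U+FFFF into UTF-16 surrogates."""
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low

high, low = to_surrogate_pair(0x1F300)
print(hex(high), hex(low))  # 0xd83c 0xdf00

# In Python 3 the surrogate pair is two separate code points. Passing it
# through UTF-16 with the surrogatepass handler merges it into U+1F300.
pair = "\ud83c\udf00"
combined = pair.encode("utf-16", "surrogatepass").decode("utf-16")
assert combined == text
```

In Python 3, calling pair.encode('utf-8') directly raises UnicodeEncodeError, so Out[3] above should not be expected there.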
\uhhhh --> Unicode character with 16-bit hex value
\Uhhhhhhhh --> Unicode character with 32-bit hex value
In Unicode escapes, the first form gives four hex digits to encode a 2-byte (16-bit) character code point, and the second gives eight hex digits for a 4-byte (32-bit) code point. Byte strings support only hex escapes for encoded text and other forms of byte-based data.
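To make the byte string side concrete, here is a small sketch, again assuming Python 3. It writes encoded text with \x hex escapes and shows that the result is a sequence of byte values until it is decoded; the names are illustrative only.

```python
# Assumes Python 3. Each \x escape in a bytes literal is one byte value.
encoded = b"\xc3\xa9"            # UTF-8 encoding of U+00E9
print(list(encoded))             # [195, 169]

decoded = encoded.decode("utf-8")
print(decoded == "\u00e9")       # True
print(len(encoded), len(decoded))  # 2 1
```

The two-byte length here is a property of the UTF-8 encoding, not of the four-digit \u escape, which only names the code point.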