Please clarify for me: how does UTF-16 work? I am a little confused, considering these points:
The Wikipedia article on UTF-16 seems to be a good intro:
UTF-16 (16-bit Unicode Transformation Format) is a character encoding for Unicode capable of encoding 1,112,064 numbers (called code points) in the Unicode code space from 0 to 0x10FFFF. It produces a variable-length result of either one or two 16-bit code units per code point.
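To make the "one or two 16-bit code units" part concrete, here is a minimal Python sketch of the encoding step. The function name `utf16_code_units` is made up for illustration and is not from any library. The figure 1,112,064 is the 1,114,112 values in the range 0 to 0x10FFFF minus the 2,048 values from 0xD800 to 0xDFFF, which are reserved as surrogates. Code points below 0x10000 map to a single code unit, and everything above is split into a pair of surrogates.

```python
def utf16_code_units(code_point):
    """Return the UTF-16 code units (as a list of ints) for one code point.

    Illustrative helper, not a library function.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode code space")
    if 0xD800 <= code_point <= 0xDFFF:
        # Reserved for surrogates, so not encodable as characters.
        raise ValueError("surrogate code points cannot be encoded")

    if code_point < 0x10000:
        # Basic Multilingual Plane: the code unit is the code point itself.
        return [code_point]

    # Supplementary planes: subtract 0x10000 to get a 20-bit value,
    # then split it into two 10-bit halves.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # low (trail) surrogate
    return [high, low]


print([hex(u) for u in utf16_code_units(0x41)])      # 'A', one code unit
print([hex(u) for u in utf16_code_units(0x1F600)])   # emoji, two code units
```

You can cross-check the result against Python's built-in codec, for example `"\U0001F600".encode("utf-16-be")`, which should give the same two code units as big-endian bytes.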