I am a little new to Unicode. Many other languages handle it very nicely internally, but in C we have something like wchar_t (a 2-byte unsigned short on Windows, though its size is implementation-defined) and many more. char