I have the character "ö". If I look in this UTF-8 table I see it has the hex value F6. If I look in the Unicode table I see that "ö" has the indices E
unsigned cha_latin2utf8(unsigned char *dst, unsigned cha)
{
    if (cha < 0x80) { *dst = cha; return 1; }
    /* all 11 bit codepoints (0x0 -- 0x7ff)
    ** fit within a 2byte utf8 char
    ** firstbyte = 110 +xxxxx := 0xc0 + (cha>>6) MSB
    ** second = 10 +xxxxxx := 0x80 + (cha& 63) LSB
    */
    *dst++ = 0xc0 | ((cha >> 6) & 0x1f); /* 2+1+5 bits */
    *dst++ = 0x80 | (cha & 0x3f); /* 1+1+6 bits */
    return 2; /* number of bytes produced */
}
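To see the bit layout from the comment at work on a single value, here is a minimal sketch that splits one code point into its two bytes. The helper names `lead_byte` and `trail_byte` are made up for illustration and are not part of the function above.

```c
/* Hypothetical helpers, for illustration only: the two halves of the
 * 2-byte encoding described in the comment above. */
static unsigned lead_byte(unsigned cp)
{
    return 0xc0 | ((cp >> 6) & 0x1f);   /* 110xxxxx: upper 5 bits */
}

static unsigned trail_byte(unsigned cp)
{
    return 0x80 | (cp & 0x3f);          /* 10xxxxxx: lower 6 bits */
}

/* For 0xF6 (binary 1111 0110):
 *   0xF6 >> 6   = 0x03  ->  0xC0 | 0x03 = 0xC3
 *   0xF6 & 0x3F = 0x36  ->  0x80 | 0x36 = 0xB6
 */
```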
To test it:
#include <stdio.h>

int main (void)
{
    unsigned char buff[12]; /* unsigned, to match the dst parameter */

    cha_latin2utf8(buff, 0xf6);
    fprintf(stdout, "%02x %02x\n"
        , (unsigned) buff[0] & 0xff
        , (unsigned) buff[1] & 0xff );
    return 0;
}
The result:
c3 b6
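If you need to convert a whole string rather than one character, the same function can be called in a loop. The following is only a sketch under some assumptions: the input is a NUL-terminated Latin-1 string, and the helper name `latin1_str_to_utf8`, its parameters, and its error convention (returning `(size_t)-1` when the output buffer is too small) are invented here, not taken from any library. The per-character encoder is repeated so the block compiles on its own.

```c
#include <stddef.h>

/* Per-character encoder, repeated from above. */
static unsigned cha_latin2utf8(unsigned char *dst, unsigned cha)
{
    if (cha < 0x80) { *dst = cha; return 1; }
    *dst++ = 0xc0 | ((cha >> 6) & 0x1f);
    *dst++ = 0x80 | (cha & 0x3f);
    return 2;
}

/* Hypothetical helper: converts a NUL-terminated Latin-1 string to UTF-8.
 * Returns the number of bytes written, not counting the terminating NUL,
 * or (size_t)-1 if dst (of size dstsize) is too small. */
static size_t latin1_str_to_utf8(unsigned char *dst, size_t dstsize,
                                 const unsigned char *src)
{
    size_t used = 0;

    for (; *src; src++) {
        size_t need = (*src < 0x80) ? 1 : 2;
        /* keep one byte in reserve for the terminating NUL */
        if (used + need + 1 > dstsize) return (size_t)-1;
        used += cha_latin2utf8(dst + used, *src);
    }
    if (used + 1 > dstsize) return (size_t)-1;
    dst[used] = '\0';
    return used;
}
```

Since every Latin-1 byte becomes at most two UTF-8 bytes, an output buffer of twice the input length plus one byte for the terminator is always large enough.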