I need to read Unicode characters from a file. The only thing I need to do with them is extract their Unicode numbers (code points). I am running on Windows XP, using Code::Blocks with MinGW.
Well, the code in your question only reads the first character of your file, so you will have to implement some kind of looping construct in order to process the whole contents of that file.
Now, fgetwc() is returning 255 (0xFF) for three reasons:
You're not taking the byte-order mark of the file into account, so you end up reading it instead of the actual file contents,
You're not specifying a translation mode flag in the mode argument to _wfopen(), so it defaults to text and fgetwc() accordingly tries to read a multibyte character instead of a wide character,
0xFF (the first byte of a little-endian UTF-16 BOM) is probably not a lead byte in your program's current code page, so fgetwc() returns it without further processing.
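To make the three points above concrete, here is a minimal sketch of one way to get the code points out. It assumes the file is little-endian UTF-16 (which the 0xFF first byte suggests, but I can't confirm that from the question). It opens the file in binary mode, skips the BOM if there is one, and assembles the 16-bit code units by hand instead of relying on fgetwc(), so it does not depend on the size of wchar_t or on the current code page. The function names read_unit_le, next_code_point and open_utf16le are made up for this example; they are not part of any library.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical helper: reads one little-endian 16-bit code unit.
   Returns 1 on success, 0 at end of file (or on a truncated unit). */
int read_unit_le(FILE *fp, uint16_t *unit)
{
    int lo = fgetc(fp);
    int hi = fgetc(fp);
    if (lo == EOF || hi == EOF)
        return 0;
    *unit = (uint16_t)(lo | (hi << 8));
    return 1;
}

/* Hypothetical helper: reads the next Unicode code point, combining a
   surrogate pair into a single value above U+FFFF.
   Returns 1 on success, 0 at end of file or on a malformed pair. */
int next_code_point(FILE *fp, uint32_t *cp)
{
    uint16_t u, low;
    if (!read_unit_le(fp, &u))
        return 0;
    if (u >= 0xD800 && u <= 0xDBFF) {            /* high surrogate */
        if (!read_unit_le(fp, &low) || low < 0xDC00 || low > 0xDFFF)
            return 0;                            /* missing low surrogate */
        *cp = 0x10000u + (((uint32_t)(u - 0xD800)) << 10)
                       + (uint32_t)(low - 0xDC00);
        return 1;
    }
    if (u >= 0xDC00 && u <= 0xDFFF)
        return 0;                                /* stray low surrogate */
    *cp = u;
    return 1;
}

/* Hypothetical helper: opens the file in binary mode ("rb") so no
   text-mode translation happens, then skips a leading UTF-16LE BOM
   (bytes FF FE) if present. Returns NULL if the file cannot be opened. */
FILE *open_utf16le(const char *path)
{
    FILE *fp = fopen(path, "rb");
    uint16_t bom;
    if (fp == NULL)
        return NULL;
    if (!read_unit_le(fp, &bom) || bom != 0xFEFF)
        rewind(fp);   /* no BOM: start again from the first byte */
    return fp;
}
```

To process the whole file, call open_utf16le() once and then call next_code_point() in a loop until it returns 0, printing or storing each value as you go. If you would rather stay with the wide-character functions, the equivalent change on Windows should be to pass a binary mode string such as L"rb" to _wfopen() and discard the first character if it is 0xFEFF, but I have not tested that on your MinGW setup.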