Question
I'm trying to create a wide-character file using MinGW C on Windows, but the wide characters seem to be omitted. My code:
const wchar_t* str = L"příšerně žluťoučký kůň úpěl ďábelské ódy";
FILE* fd = fopen("file.txt","w");
// FILE* fd = _wfopen(L"demo.txgs",L"w"); // attempt to open wide file doesn't help
fwide(fd,1); // attempt to force wide mode, doesn't help
fwprintf(fd,L"%ls",str);
// fputws(p,fd); // stops output after writing "p" (1B file size)
fclose(fd);
File contents
píern luouký k úpl ábelské ódy
The file size is 30B, so the wide chars really are missing. How can I get them written to the file?
As @chqrlie suggests in the comments: the result of
fwrite(str, 1, sizeof(L"příšerně žluťoučký kůň úpěl ďábelské ódy"), fd);
is 82 (I guess 2*30 + 2*10 (omitted chars) + 2 (wide trailing zero)).
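The arithmetic above can be checked with a small sketch. It assumes a 2-byte wchar_t, as on Windows (on most Unix-like systems wchar_t is 4 bytes, so sizeof would give a different number). The names count_above_latin1 and sample_text are made up for this illustration, and the string is spelled with universal character names so the result does not depend on the source file encoding.

```c
#include <stddef.h>
#include <wchar.h>

/* The string from the question, written with \u escapes. */
const wchar_t sample_text[] =
    L"p\u0159\u00ed\u0161ern\u011b \u017elu\u0165ou\u010dk\u00fd "
    L"k\u016f\u0148 \u00fap\u011bl \u010f\u00e1belsk\u00e9 \u00f3dy";

/* Hypothetical helper: counts the characters that do not fit into a
   single ISO-8859-1 byte, i.e. whose code point is above 0xFF. */
size_t count_above_latin1(const wchar_t *s)
{
    size_t n = 0;
    for (; *s != L'\0'; ++s)
        if ((unsigned long)*s > 0xFFul)
            ++n;
    return n;
}
```

Here wcslen(sample_text) is 40 and count_above_latin1(sample_text) is 10, which leaves the 30 characters that reached the file. With a 2-byte wchar_t, (40 + 1) * 2 gives the 82 bytes reported by fwrite.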
It might also be useful to quote from here:
The external representation of wide characters in files are multibyte characters: These are obtained as if wcrtomb was called to convert each wide character (using the stream's internal mbstate_t object).
This explains why the ISO-8859-1 chars are written as single bytes in the file, but I don't know how to use this information to solve my problem. When doing the opposite task (reading multibyte UTF-8 into wide chars), I failed to get mbtowc working and ended up using the Windows API's MultiByteToWideChar.
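The quoted behavior can be probed directly. This is a minimal sketch, assuming the program has not called setlocale, so the default "C" locale is active; count_convertible is a hypothetical helper name. Which characters convert depends on the C runtime: the output in the question suggests that the Windows runtime accepts code points up to 0xFF as single bytes, while other C libraries may reject everything beyond ASCII.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: returns how many characters of s can be converted
   by wcrtomb in the current locale. Characters that wcrtomb rejects
   (return value (size_t)-1) are skipped and not counted. */
size_t count_convertible(const wchar_t *s)
{
    size_t ok = 0;
    for (; *s != L'\0'; ++s) {
        char buf[MB_LEN_MAX];
        mbstate_t st;
        memset(&st, 0, sizeof st);   /* start from the initial conversion state */
        if (wcrtomb(buf, *s, &st) != (size_t)-1)
            ++ok;
    }
    return ok;
}
```

Called on the string from the question, a result of 30 would be consistent with the 30-byte file: the characters that wcrtomb cannot represent in the active locale are the ones that go missing.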
Answer 1:
I figured this out. The internal use of wcrtomb (mentioned in the details of my question) requires a setlocale call, but that call fails with UTF-8 on Windows. So I used the Windows API instead:
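char output[100]; // not wchar_t: UTF-8 is written byte by byte
int len = WideCharToMultiByte(CP_UTF8,0,str,-1,NULL,0,NULL,NULL); // required size in bytes, including the trailing zero
if(len>0 && len<=(int)sizeof(output)) { // don't clamp len: with a buffer that is too small the conversion fails
  WideCharToMultiByte(CP_UTF8,0,str,-1,output,len,NULL,NULL);
  fputs(output,fd);
}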
And voila! The file is 56B long, with the expected UTF-8 contents:
příšerně žluťoučký kůň úpěl ďábelské ódy
I hope this saves other Windows coders some nerves.
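For readers who cannot call WideCharToMultiByte, the same conversion can be sketched by hand. This is an alternative, not the method used in the answer above: wide_to_utf8 is a hypothetical function, and it assumes well-formed input (no unpaired surrogates, no code points above 0x10FFFF). It combines UTF-16 surrogate pairs, which matters when wchar_t is 16 bits wide as on Windows.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: encodes the wide string s as UTF-8 into out, which
   has room for cap bytes, and terminates it with a zero byte. Returns the
   number of bytes written (not counting the terminator), or (size_t)-1 if
   the buffer is too small. */
size_t wide_to_utf8(const wchar_t *s, char *out, size_t cap)
{
    size_t n = 0;
    if (cap == 0)
        return (size_t)-1;
    while (*s != L'\0') {
        unsigned long cp = (unsigned long)*s++;
        size_t need;
        /* A high surrogate followed by a low surrogate forms one code point. */
        if (cp >= 0xD800ul && cp <= 0xDBFFul) {
            unsigned long lo = (unsigned long)*s;
            if (lo >= 0xDC00ul && lo <= 0xDFFFul) {
                cp = 0x10000ul + ((cp - 0xD800ul) << 10) + (lo - 0xDC00ul);
                ++s;
            }
        }
        need = cp < 0x80ul ? 1 : cp < 0x800ul ? 2 : cp < 0x10000ul ? 3 : 4;
        if (n + need >= cap)          /* keep one byte for the terminator */
            return (size_t)-1;
        if (need == 1) {
            out[n++] = (char)cp;
        } else if (need == 2) {
            out[n++] = (char)(0xC0ul | (cp >> 6));
            out[n++] = (char)(0x80ul | (cp & 0x3Ful));
        } else if (need == 3) {
            out[n++] = (char)(0xE0ul | (cp >> 12));
            out[n++] = (char)(0x80ul | ((cp >> 6) & 0x3Ful));
            out[n++] = (char)(0x80ul | (cp & 0x3Ful));
        } else {
            out[n++] = (char)(0xF0ul | (cp >> 18));
            out[n++] = (char)(0x80ul | ((cp >> 12) & 0x3Ful));
            out[n++] = (char)(0x80ul | ((cp >> 6) & 0x3Ful));
            out[n++] = (char)(0x80ul | (cp & 0x3Ful));
        }
    }
    out[n] = '\0';
    return n;
}
```

The result can be passed to fputs exactly like output in the answer above. For the string from the question, 24 ASCII characters and 16 two-byte characters add up to the same 56 bytes.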
Answer 2:
I am not a Windows user, but you might try this:
const wchar_t *str = L"příšerně žluťoučký kůň úpěl ďábelské ódy";
FILE *fd = fopen("file.txt", "w,ccs=UTF-8");
fwprintf(fd, L"%ls", str);
fclose(fd);
I got this idea from this question: How do I write a UTF-8 encoded string to a file in windows, in C++
Source: https://stackoverflow.com/questions/35928843/fwprintf-omits-wide-chars