fwprintf omits wide chars

Submitted by Deadly on 2020-01-04 05:49:10

Question


I'm trying to create a file containing wide chars using MinGW C on Windows, but the wide chars seem to be omitted. My code:

const wchar_t* str = L"příšerně žluťoučký kůň úpěl ďábelské ódy";
FILE* fd = fopen("file.txt","w");
// FILE* fd = _wfopen(L"demo.txgs",L"w"); // attempt to open wide file doesn't help
fwide(fd,1); // attempt to force wide mode, doesn't help
fwprintf(fd,L"%ls",str);
// fputws(p,fd); // stops output after writing "p" (1B file size)
fclose(fd);

File contents

píern luouký k úpl ábelské ódy

The file size is 30 B, so the wide chars really are missing. How can I get the program to write them?

As @chqrlie suggests in the comments: the result of

fwrite(str, 1, sizeof(L"příšerně žluťoučký kůň úpěl ďábelské ódy"), fd);

is 82 (I guess 2*30 + 2*10 (omitted chars) + 2 (wide trailing zero)).
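The figure of 82 can be cross-checked without touching a file: the literal holds 40 wide characters plus one terminating L'\0', and wchar_t is 2 bytes wide with MinGW on Windows. The following is only a sketch of that arithmetic; the helper name wide_literal_bytes is made up for illustration, and on systems where wchar_t is 4 bytes (most Linux toolchains) the same string gives 164 instead.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: number of bytes a wide string occupies including its
   terminating L'\0', i.e. what sizeof reports for the matching literal. */
size_t wide_literal_bytes(const wchar_t *s)
{
    return (wcslen(s) + 1) * sizeof(wchar_t);
}
```

For the string in the question this is 41 * sizeof(wchar_t), which is 82 when sizeof(wchar_t) is 2.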

It also might be useful to quote from here

The external representation of wide characters in files are multibyte characters: These are obtained as if wcrtomb was called to convert each wide character (using the stream's internal mbstate_t object).

This explains why the ISO-8859-1 chars are single bytes in the file, but I don't know how to use this information to solve my problem. When doing the opposite task (reading multibyte UTF-8 into wide chars), I could not get mbtowc to work and ended up using winAPI's MultiByteToWideChar.
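One way to see the quoted rule in action is to call wcrtomb directly and count the characters it rejects under the currently active locale. This is a diagnostic sketch only: count_unconvertible is an illustrative name, not a library function, and the result depends on the C runtime and on whatever setlocale has (or has not) selected, so no particular count is claimed for the string in the question.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: counts the wide characters in s that wcrtomb cannot
   convert under the locale that is active when the function is called. */
size_t count_unconvertible(const wchar_t *s)
{
    size_t failed = 0;
    for (; *s != L'\0'; ++s) {
        char buf[MB_LEN_MAX];              /* large enough for any one character */
        mbstate_t state;
        memset(&state, 0, sizeof state);   /* fresh conversion state per character */
        if (wcrtomb(buf, *s, &state) == (size_t)-1)
            ++failed;                      /* no representation in this locale */
    }
    return failed;
}
```

A non-zero result for a string means the stream's own conversion will hit characters it cannot represent, which is consistent with characters going missing from the output file.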


Answer 1:


I figured this out. The internal use of wcrtomb (mentioned in the details of my question) requires a setlocale call, but that call fails with UTF-8 on Windows. So I used the winAPI instead:

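char output[100]; // UTF-8 bytes, not wchar_t
// With a NULL buffer the call only returns the required size in bytes, including the terminating zero
int len = WideCharToMultiByte(CP_UTF8,0,str,-1,NULL,0,NULL,NULL);
if(len>0 && len<=(int)sizeof(output)) { // a buffer that is too small makes the conversion fail
    WideCharToMultiByte(CP_UTF8,0,str,-1,output,len,NULL,NULL);
    fputs(output,fd);
}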

And voilà! The file is 56 B long with the expected UTF-8 contents:

příšerně žluťoučký kůň úpěl ďábelské ódy

I hope this will spare Windows coders some frustration.
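For readers who want to check the 56-byte figure without the winAPI, the same conversion can be hand-rolled. The following is a sketch, not the method used in this answer: wide_to_utf8 is a hypothetical helper, and it assumes that wchar_t holds UTF-16 code units (as on Windows) or UTF-32 code points (as on most Unix-like systems).

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper, not a winAPI or CRT function: encodes the wide string
   src as UTF-8 into dst, whose capacity cap includes the terminating '\0'.
   Returns the number of bytes written without the terminator, or (size_t)-1
   if dst is too small or src holds an unpaired surrogate or a value above
   U+10FFFF. */
size_t wide_to_utf8(const wchar_t *src, char *dst, size_t cap)
{
    size_t n = 0;
    if (cap == 0) return (size_t)-1;
    while (*src != L'\0') {
        unsigned long cp = (unsigned long)*src++;
        if (cp >= 0xD800UL && cp <= 0xDBFFUL) {
            /* high surrogate: only expected when wchar_t is 16 bits wide */
            unsigned long lo = (unsigned long)*src;
            if (lo < 0xDC00UL || lo > 0xDFFFUL) return (size_t)-1;
            cp = 0x10000UL + ((cp - 0xD800UL) << 10) + (lo - 0xDC00UL);
            ++src;
        } else if (cp >= 0xDC00UL && cp <= 0xDFFFUL) {
            return (size_t)-1;             /* low surrogate without a high one */
        }
        if (cp > 0x10FFFFUL) return (size_t)-1;

        size_t need = cp < 0x80UL ? 1 : cp < 0x800UL ? 2 : cp < 0x10000UL ? 3 : 4;
        if (n + need + 1 > cap) return (size_t)-1;   /* keep room for '\0' */

        if (need == 1) {
            dst[n++] = (char)cp;
        } else if (need == 2) {
            dst[n++] = (char)(0xC0 | (cp >> 6));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            dst[n++] = (char)(0xE0 | (cp >> 12));
            dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[n++] = (char)(0xF0 | (cp >> 18));
            dst[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    dst[n] = '\0';
    return n;
}
```

The string from the question has 24 ASCII characters and 16 characters that take two bytes each in UTF-8, so the helper returns 24 + 2*16 = 56 for it, matching the file size above. The result can be written with fputs(output,fd) in the same way.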




Answer 2:


I am not a Windows user, but you might try this:

const wchar_t *str = L"příšerně žluťoučký kůň úpěl ďábelské ódy";
FILE *fd = fopen("file.txt", "w,ccs=UTF-8");
fwprintf(fd, L"%ls", str);
fclose(fd);

I got this idea from this question: How do I write a UTF-8 encoded string to a file in windows, in C++



Source: https://stackoverflow.com/questions/35928843/fwprintf-omits-wide-chars
