swprintf chokes on characters outside 8-bit range

Question

This happens on OS X, though I suspect it applies to any UNIX-y OS. I have two strings that look like this:

const wchar_t *test1 = (const wchar_t *)"\x44\x00\x00\x00\x73\x00\x00\x00\x00\x00\x00\x00";
const wchar_t *test2 = (const wchar_t *)"\x44\x00\x00\x00\x19\x20\x00\x00\x73\x00\x00\x00\x00\x00\x00\x00";

In the debugger, test1 looks like "Ds" and test2 looks like "D’s" (with the curly apostrophe, U+2019). I then call this code:

wchar_t buf1[100], buf2[100];
int ret1 = swprintf(buf1, 100, L"%ls", test1);
int ret2 = swprintf(buf2, 100, L"%ls", test2);

The first swprintf call works fine. The second one returns -1 (and the buffer is unchanged).
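To narrow down a failure like this, it can help to look at errno right after the call. The sketch below wraps the same "%ls" call in a helper; try_format is a hypothetical name used only for illustration, not part of the original code. On the BSD-derived C library that OS X uses, I would expect the failing case to report EILSEQ (illegal byte sequence), but that is an assumption worth checking on the actual system.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: format src with L"%ls" into dst and capture errno.
 * Returns whatever swprintf returned; *err is errno on failure, 0 otherwise. */
int try_format(const wchar_t *src, wchar_t *dst, size_t cap, int *err)
{
    assert(src != NULL && dst != NULL && err != NULL && cap > 0);

    dst[0] = L'\0';   /* make a failed call easy to spot in the debugger */
    errno = 0;        /* clear stale errors before the call */

    int ret = swprintf(dst, cap, L"%ls", src);

    *err = (ret < 0) ? errno : 0;
    return ret;
}
```

Calling try_format(L"D\x2019s", buf2, 100, &err) before any setlocale call and then comparing err against EILSEQ shows whether the failure is a character-conversion problem rather than a buffer-size problem.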

I'm guessing the problem has something to do with locales but googling around didn't provide me with anything useful. This is the simplest way to reproduce the problem I'm seeing. What I'm really interested in is vswprintf(), but I assume that's closely related.

Why does swprintf choke on a Unicode character that is outside the 8-bit range? Is there any way to work around this?

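Answer 1:

Try explicitly setting the locale to UTF-8 before calling swprintf. A program starts in the "C" locale, where swprintf on OS X apparently cannot convert characters such as U+2019.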

setlocale(LC_CTYPE, "UTF-8");
...
const wchar_t* test2 = L"D\x2019s";
int ret2 = swprintf(buf2, 100, L"%ls", test2);
...
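A fuller sketch of the same idea follows, including a vswprintf-based wrapper, since that is the function the question is really about. The names use_utf8_ctype and format_wide are hypothetical, and the list of locale names is an assumption: "UTF-8" is accepted on OS X, while other systems usually need a name such as "en_US.UTF-8" or "C.UTF-8".

```c
#include <assert.h>
#include <locale.h>
#include <stdarg.h>
#include <stddef.h>
#include <wchar.h>

/* Try a few common UTF-8 locale names for LC_CTYPE (assumed list).
 * Returns the name that was accepted, or NULL if none was. */
const char *use_utf8_ctype(void)
{
    static const char *const candidates[] = {
        "UTF-8",        /* OS X / BSD style */
        "en_US.UTF-8",  /* common on Linux */
        "C.UTF-8"       /* available on many newer systems */
    };

    for (size_t i = 0; i < sizeof candidates / sizeof candidates[0]; ++i) {
        if (setlocale(LC_CTYPE, candidates[i]) != NULL)
            return candidates[i];
    }
    return NULL;
}

/* Hypothetical printf-style wrapper that forwards to vswprintf.
 * Returns the number of wide characters written, or a negative value. */
int format_wide(wchar_t *dst, size_t cap, const wchar_t *fmt, ...)
{
    assert(dst != NULL && fmt != NULL && cap > 0);

    va_list args;
    va_start(args, fmt);
    int ret = vswprintf(dst, cap, fmt, args);
    va_end(args);
    return ret;
}
```

Call use_utf8_ctype() once at startup, then format_wide(buf2, 100, L"%ls", test2) should return 3 for the "D’s" string. If use_utf8_ctype() returns NULL, no UTF-8 locale was found under those names and the original failure may persist.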

Source: https://stackoverflow.com/questions/3085751/swprintf-chokes-on-characters-outside-8-bit-range

Tags

macos

unicode

wchar-t
