C CSV API for unicode

天大地大妈咪最大 提交于 2019-12-11 06:24:45

问题


I need a C API for manipulating CSV data that can work with unicode. I am aware of libcsv (sourceforge.net/projects/libcsv), but I don't think that will work for unicode (please correct me if I'm wrong) because don't see wchar_t being used.

Please advise.


回答1:


It looks like libcsv does not use the C string functions to do its work, so it almost works out of the box, in spite of its mbcs/ws ignorance. It treats the string as an array of bytes with an explicit length. This might mostly work for certain wide character encodings that pad out ASCII bytes to fill the width (so newline might be encoded as "\0\n" and space as "\0 "). You could also encode your wide data as UTF-8, which should make things a bit easier. But both approaches might founder on the way libcsv identifies space and line terminator tokens: it expects you to tell it on a byte-to-byte basis whether it's looking at a space or terminator, which doesn't allow for multibyte space/term encodings. You could fix this by modifying the library to pass a pointer into the string and the length left in the string to its space/term test functions, which would be pretty straightforward.



来源:https://stackoverflow.com/questions/7758761/c-csv-api-for-unicode

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!