C++ Unicode file I/O

广开言路 2021-02-06 16:10

I need a file I/O library that gives my program a UTF-16 (little-endian) interface but can handle files in other encodings, mainly ASCII (input only), UTF-8, UTF-16, UTF-32/uc

5 Answers
  • 2021-02-06 16:46

    The problem you see comes from the linefeed conversion. Sadly, it is performed at the byte level (after the code conversion) and is not aware of the encoding. In other words, you have to disable the automatic conversion (by opening the file in binary mode, with the "b" flag) and, if you want 0A00 to be expanded to 0D000A00, you'll have to do it yourself.
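
    As a minimal sketch of "doing it yourself", assuming the text is already held as UTF-16 code units: the helper names below (expand_newlines, to_utf16le_bytes, write_utf16le_file) are hypothetical and not part of any library mentioned in this thread. The point is only that the LF to CR LF expansion happens on whole code units before the bytes are written in binary mode.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: expand each lone LF code unit to CR LF. It works on
// whole UTF-16 code units, so a 0x0A byte that is only half of some other
// character is never touched.
std::u16string expand_newlines(const std::u16string& text) {
    std::u16string out;
    out.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char16_t c = text[i];
        if (c == u'\n' && (i == 0 || text[i - 1] != u'\r')) {
            out.push_back(u'\r');  // insert CR only when it is not already there
        }
        out.push_back(c);
    }
    return out;
}

// Hypothetical helper: serialize code units as little-endian bytes (low byte first).
std::vector<unsigned char> to_utf16le_bytes(const std::u16string& text) {
    std::vector<unsigned char> bytes;
    bytes.reserve(text.size() * 2);
    for (char16_t c : text) {
        bytes.push_back(static_cast<unsigned char>(c & 0xFF));
        bytes.push_back(static_cast<unsigned char>((c >> 8) & 0xFF));
    }
    return bytes;
}

// Hypothetical helper: write in binary mode so the runtime performs no
// newline translation of its own.
bool write_utf16le_file(const std::string& path, const std::u16string& text) {
    std::ofstream file(path, std::ios::binary);
    if (!file) return false;
    const std::vector<unsigned char> bytes = to_utf16le_bytes(expand_newlines(text));
    file.write(reinterpret_cast<const char*>(bytes.data()),
               static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(file);
}
```

    With these helpers, the two-line text "A\nB" is serialized as 41 00 0D 00 0A 00 42 00, which is the 0D000A00 sequence described above.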

    You mention that you'd prefer a C++ wide-stream interface, so I'll outline what I did to achieve that in our software:

    • Write a std::codecvt facet using an ICU UConverter to perform the conversions.
    • Use a std::wfstream to open the file.
    • imbue() the wfstream with your custom codecvt facet.
    • Open the wfstream with the binary flag, to turn off the automatic (and erroneous) linefeed conversion.
    • Write a "WNewlineFilter" to perform linefeed conversion on wchars, taking inspiration from boost::iostreams::newline_filter.
    • Use a boost::iostreams::filtering_wstream to tie the wfstream and the WNewlineFilter together as a stream.
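
    The answer does not show its WNewlineFilter, so the following is only a sketch of the kind of logic such a filter might apply on the read side, written as a plain function rather than as a Boost.Iostreams filter. The name normalize_newlines is hypothetical; it assumes the codecvt facet has already turned the file's bytes into wide characters.

```cpp
#include <string>

// Hypothetical stand-in for the read side of a wide-character newline filter:
// CR LF pairs and lone CRs both become a single LF. Because it runs on wide
// characters (after decoding), it cannot split a multi-byte code unit.
std::wstring normalize_newlines(const std::wstring& text) {
    std::wstring out;
    out.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == L'\r') {
            out.push_back(L'\n');
            // Skip the LF of a CR LF pair so it is not emitted twice.
            if (i + 1 < text.size() && text[i + 1] == L'\n') {
                ++i;
            }
        } else {
            out.push_back(text[i]);
        }
    }
    return out;
}
```

    In a real filter the same state machine would have to run incrementally, since a CR can be the last character of one buffer and its LF the first character of the next.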
  • 2021-02-06 16:50

    You can try the iconv (libiconv) library.
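
    For illustration, a minimal sketch of a UTF-8 to UTF-16LE conversion with iconv. It assumes a glibc-style iconv() that takes char** for the input buffer and accepts the encoding names "UTF-8" and "UTF-16LE"; other platforms may differ and may need to link against libiconv explicitly. The wrapper name utf8_to_utf16le is hypothetical.

```cpp
#include <iconv.h>

#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical wrapper: convert UTF-8 bytes to UTF-16LE bytes with iconv.
std::string utf8_to_utf16le(const std::string& input) {
    if (input.empty()) return std::string();

    iconv_t cd = iconv_open("UTF-16LE", "UTF-8");  // arguments are (tocode, fromcode)
    if (cd == reinterpret_cast<iconv_t>(-1)) {
        throw std::runtime_error("iconv_open failed");
    }

    // iconv() advances the pointers it is given, so work on mutable copies.
    // UTF-16 needs at most two output bytes per UTF-8 input byte.
    std::vector<char> in(input.begin(), input.end());
    std::vector<char> out(input.size() * 2);
    char* in_ptr = in.data();
    char* out_ptr = out.data();
    std::size_t in_left = in.size();
    std::size_t out_left = out.size();

    const std::size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (rc == static_cast<std::size_t>(-1)) {
        throw std::runtime_error("iconv failed (invalid or truncated input)");
    }
    return std::string(out.data(), out.size() - out_left);
}
```

    Note that iconv only converts encodings; newline handling and byte-order-mark detection would still be up to the caller.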

  • 2021-02-06 16:54

    I think the problems come from the 0D 0A 00 line breaks. You could check whether other line breaks work, such as \r\n, or LF or CR alone (the best bet would be \r, I suppose).

    EDIT: It seems 0D 00 0A 00 is what you want, so you can try:

    // \15 and \12 are octal escapes for CR (0x0D) and LF (0x0A), i.e. the same as \r\n.
    std::wstring str = L"Hello World in UTF-16!\15\12Another line.\15\12";
    
  • 2021-02-06 16:56

    I successfully worked with the EZUTF library posted on CodeProject: High Performance Unicode Text File I/O Routines for C++

  • 2021-02-06 17:10

    UTF8-CPP gives you conversion between UTF-8, UTF-16 and UTF-32. It is a very nice, lightweight library.
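
    To show what such a conversion involves, here is a hand-written sketch of UTF-8 to UTF-16. It is not UTF8-CPP's API or implementation; the function name utf8_to_utf16 is hypothetical, and the validation is deliberately simple (it rejects bad lead bytes, bad continuation bytes, truncation, surrogates and out-of-range values, but does not reject overlong encodings).

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode UTF-8 and re-encode as UTF-16 code units.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1) {
            throw std::runtime_error("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) {
                throw std::runtime_error("invalid UTF-8 continuation byte");
            }
            cp = (cp << 6) | (b & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            throw std::runtime_error("code point not representable in UTF-16");
        }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a high/low surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

    A maintained library covers the cases this sketch skips, which is the main reason to prefer one over hand-rolled code.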

    About ICU, here are some comments by the UTF8-CPP creator:

    ICU Library. It is very powerful, complete, feature-rich, mature, and widely used. Also big, intrusive, non-generic, and doesn't play well with the Standard Library. I definitely recommend looking at ICU even if you don't plan to use it.

    :)
