How to write a std::string to a UTF-8 text file

盖世英雄少女心 2020-12-01 01:04

I just want to write a few simple lines to a text file in C++, but I want them to be encoded in UTF-8. What is the easiest way to do so?

9 answers
  • 2020-12-01 01:57

    My preference is to convert to and from a std::u32string and work with code points internally, then convert to UTF-8 when writing out to a file, using these converting iterators I put on GitHub.

    #include <utf/utf.h>
    
    int main()
    {
        using namespace utf;
    
        u32string u32_text = U"ɦΈ˪˪ʘ";
        // do stuff with string
        // convert to utf8 string
        utf32_to_utf8_iterator<u32string::iterator> pos(u32_text.begin());
        utf32_to_utf8_iterator<u32string::iterator> end(u32_text.end());
    
        u8string u8_text(pos, end);
    
        // write out utf8 to file.
        // ...
    }
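
    If you would rather not depend on an extra library, the UTF-32 to UTF-8 step can be sketched by hand in standard C++. This is a minimal sketch, not the code from the library above: utf32_to_utf8 is a hypothetical helper name, and it assumes every element is a valid Unicode scalar value (no surrogates, nothing above U+10FFFF).

    ```cpp
    #include <string>

    // Hypothetical helper (not part of the utf library used above).
    // Encodes each UTF-32 code point as 1 to 4 UTF-8 bytes.
    // Assumption: the input holds only valid Unicode scalar values.
    std::string utf32_to_utf8(const std::u32string& in)
    {
        std::string out;
        for (char32_t cp : in) {
            if (cp < 0x80) {                        // 1 byte: 0xxxxxxx
                out += static_cast<char>(cp);
            } else if (cp < 0x800) {                // 2 bytes: 110xxxxx 10xxxxxx
                out += static_cast<char>(0xC0 | (cp >> 6));
                out += static_cast<char>(0x80 | (cp & 0x3F));
            } else if (cp < 0x10000) {              // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
                out += static_cast<char>(0xE0 | (cp >> 12));
                out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
                out += static_cast<char>(0x80 | (cp & 0x3F));
            } else {                                // 4 bytes: 11110xxx then three 10xxxxxx
                out += static_cast<char>(0xF0 | (cp >> 18));
                out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
                out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
                out += static_cast<char>(0x80 | (cp & 0x3F));
            }
        }
        return out;
    }
    ```

    The returned std::string holds raw UTF-8 bytes, so it can be written to a file like any other std::string.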
    
  • 2020-12-01 01:59

    The only way UTF-8 affects std::string is that size(), length(), and all the indices are measured in bytes, not characters.

    And, as sbi points out, incrementing a std::string iterator steps forward by one byte, not by one character, so it can end up pointing into the middle of a multibyte UTF-8 sequence. The standard library provides no UTF-8-aware iterator, but several are available online.
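
    To make the byte versus character distinction concrete, here is a small sketch that counts code points by skipping UTF-8 continuation bytes (those of the form 10xxxxxx). The name utf8_code_points is made up for illustration, and the function assumes the string already contains well-formed UTF-8.

    ```cpp
    #include <cstddef>
    #include <string>

    // Hypothetical helper: counts code points in a UTF-8 encoded std::string.
    // Assumption: `s` is well-formed UTF-8; no validation is done.
    std::size_t utf8_code_points(const std::string& s)
    {
        std::size_t count = 0;
        for (unsigned char c : s) {
            // Continuation bytes look like 10xxxxxx; every other byte starts a code point.
            if ((c & 0xC0) != 0x80) {
                ++count;
            }
        }
        return count;
    }
    ```

    For a string such as "h\xC3\xA9llo" (the word "héllo" in UTF-8), size() reports 6 bytes while this function reports 5 code points.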

    If you remember that, you can put UTF-8 into std::string, write it to a file, etc. all in the usual way (by which I mean the way you'd use a std::string without UTF-8 inside).

    You may want to start your file with the UTF-8 byte order mark (the bytes EF BB BF) so that other programs will know it is UTF-8. The mark is optional, and some tools do not expect it.
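
    Putting this together, writing the file could look roughly like the sketch below. write_utf8_file and its with_bom parameter are names invented for this example, and the function assumes the std::string already holds UTF-8 bytes; it does no conversion itself.

    ```cpp
    #include <fstream>
    #include <string>

    // Hypothetical helper: writes `text` to `path` byte for byte.
    // Assumption: `text` already contains UTF-8 encoded data.
    // Returns false if the file cannot be opened or the write fails.
    bool write_utf8_file(const std::string& path, const std::string& text,
                         bool with_bom = false)
    {
        // Binary mode stops the stream from translating '\n' on Windows.
        std::ofstream out(path, std::ios::binary);
        if (!out) {
            return false;
        }
        if (with_bom) {
            out.write("\xEF\xBB\xBF", 3);  // optional UTF-8 byte order mark
        }
        out.write(text.data(), static_cast<std::streamsize>(text.size()));
        return static_cast<bool>(out.flush());
    }
    ```

    A call such as write_utf8_file("out.txt", text, true) would emit the byte order mark first; leaving the last argument out writes the bytes with no mark.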

  • 2020-12-01 02:02

    libiconv is a great library for all your encoding and decoding needs.

    If you are on Windows, you can call WideCharToMultiByte and pass CP_UTF8 as the code page to get UTF-8 output.
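
    For the libiconv route, a conversion to UTF-8 might be sketched as follows. The helper name to_utf8 is hypothetical, and the output buffer size (four bytes per input byte) is an assumption that is generous for common source encodings rather than a proven bound. Some older iconv implementations declare the input pointer as const char**, in which case the const_cast is not needed.

    ```cpp
    #include <iconv.h>

    #include <cstddef>
    #include <stdexcept>
    #include <string>

    // Hypothetical helper: converts `input` from the encoding named `from` to UTF-8.
    // Assumption: 4 output bytes per input byte is enough room for the result.
    std::string to_utf8(const std::string& input, const char* from)
    {
        iconv_t cd = iconv_open("UTF-8", from);  // arguments are (to, from)
        if (cd == (iconv_t)-1) {
            throw std::runtime_error("iconv_open failed");
        }

        std::string output(input.size() * 4 + 4, '\0');
        char* in_ptr = const_cast<char*>(input.data());
        std::size_t in_left = input.size();
        char* out_ptr = &output[0];
        std::size_t out_left = output.size();

        // iconv advances the pointers and decrements the remaining byte counts.
        std::size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
        iconv_close(cd);
        if (rc == (std::size_t)-1) {
            throw std::runtime_error("iconv conversion failed");
        }

        output.resize(output.size() - out_left);  // keep only the bytes written
        return output;
    }
    ```

    For example, to_utf8("caf\xE9", "ISO-8859-1") turns the single Latin-1 byte E9 into the two UTF-8 bytes C3 A9. On platforms where iconv is not part of the C library, the program has to be linked with -liconv.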
