How to write a std::string to a UTF-8 text file

盖世英雄少女心 2020-12-01 01:04

I just want to write a few simple lines to a text file in C++, but I want them encoded as UTF-8. What is the simplest way to do that?

9 answers
  • 2020-12-01 01:40

    If by "simple" you mean ASCII, there is no need to do any encoding, since characters with an ASCII value of 127 or less are the same in UTF-8.
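As a minimal sketch of that point: a std::string is just a sequence of bytes, so if it already holds ASCII (or any valid UTF-8), writing those bytes unchanged produces a UTF-8 file. The helper name `write_utf8_file` is made up for this illustration and is not part of any library.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes the bytes of `text` to `path` unchanged.
// If `text` already holds UTF-8 (pure ASCII always qualifies), the
// resulting file is valid UTF-8. No encoding conversion is performed.
// Returns false if the file could not be opened or written.
bool write_utf8_file(const std::string& path, const std::string& text) {
    // Binary mode keeps the runtime from translating '\n'
    // (for example to "\r\n" on Windows).
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    out.write(text.data(), static_cast<std::streamsize>(text.size()));
    return static_cast<bool>(out.flush());
}
```

A call such as `write_utf8_file("out.txt", "first line\nsecond line\n")` would then be enough for ASCII-only content. No byte order mark is written, which is the usual choice for UTF-8.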

  • 2020-12-01 01:44

    Use Glib::ustring from glibmm.

    It is the only widespread UTF-8 string container (AFAIK). Although it is character (not byte) based, it has the same method signatures as std::string, so the port should be a simple search and replace (just make sure that your data is valid UTF-8 before loading it into a ustring).
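For the "make sure that your data is valid UTF-8" step, here is a rough standard-library sketch of such a check. The function name `is_valid_utf8` is an assumption for illustration and is not a glibmm API; glibmm has its own validation facilities, which are not shown here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `s` is well-formed UTF-8, i.e. it has
// no stray continuation bytes, no truncated or overlong sequences, no
// surrogates, and nothing above U+10FFFF.
bool is_valid_utf8(const std::string& s) {
    std::size_t i = 0;
    const std::size_t n = s.size();
    while (i < n) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        std::size_t extra;  // number of continuation bytes expected
        char32_t cp;        // code point being assembled
        char32_t min;       // smallest code point allowed for this length
        if (b < 0x80)                { extra = 0; cp = b;        min = 0; }
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; min = 0x10000; }
        else return false;  // stray continuation byte or invalid lead byte
        if (n - i - 1 < extra) return false;  // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min) return false;                      // overlong encoding
        if (cp > 0x10FFFF) return false;                 // beyond Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate
        i += extra + 1;
    }
    return true;
}
```

Running a check like this before handing bytes to a UTF-8 string class means bad input is rejected up front rather than surfacing later as an error inside the container.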

  • 2020-12-01 01:46

    UTF-8 is a multibyte encoding, so strings in it are awkward to work with, and I think it is a bad idea. Use plain Unicode instead.

    In my opinion, the best choice is ordinary ASCII char text with a suitable code page. You only need Unicode if you mix more than two character sets (languages) in a single text.

    That is a rather rare case. In most cases two character sets are enough, and for that common case you can use ASCII chars rather than Unicode.

    Multibyte encodings such as UTF-8 only pay off for traditional Chinese, Arabic, or some hieroglyphic text. That is a very, very rare case!

    I don't think many people need that. So never use UTF-8! Avoiding it spares you the headache of manipulating such strings.

  • 2020-12-01 01:48

    There is a nice, tiny library for working with UTF-8 from C++: utfcpp

  • 2020-12-01 01:55

    What is the easiest and simple way to do so?

    The most intuitive, and thus easiest, way to handle UTF-8 in C++ is surely a drop-in replacement for std::string. As the internet still lacks one, I implemented the functionality myself:

    tinyutf8 (EDIT: now on GitHub).

    This library provides a very lightweight drop-in replacement for std::string (or std::u32string if you will, because you iterate over code points rather than chars). It strikes a balance between fast access and small memory consumption while being very robust. This robustness to 'invalid' UTF-8 sequences makes it (nearly completely) compatible with ANSI (0-255).

    Hope this helps!

  • 2020-12-01 01:56
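    #include <QByteArray>
    #include <QString>
    #include <string>

    std::wstring text = L"Привет";
    QString qstr = QString::fromStdWString(text);  // wide string -> QString
    QByteArray byteArray(qstr.toUtf8());           // encode the text as UTF-8 bytes
    std::string str_std(byteArray.constData(), byteArray.length());  // UTF-8 bytes, ready to write to a file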
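If Qt is not available, the same conversion step can be sketched with the standard library alone. The version below assumes the text is held as UTF-32 code points in a std::u32string rather than a std::wstring (whose element width differs between platforms). The names `append_utf8` and `to_utf8` are invented for this illustration.

```cpp
#include <string>

// Hypothetical helper: appends the UTF-8 encoding of one code point to `out`.
// Surrogates and values above U+10FFFF are replaced by U+FFFD.
void append_utf8(std::string& out, char32_t cp) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) cp = 0xFFFD;
    if (cp < 0x80) {                 // 1 byte:  0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {         // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {       // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                         // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: encodes a whole UTF-32 string as UTF-8 bytes.
std::string to_utf8(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) append_utf8(out, cp);
    return out;
}
```

The resulting std::string holds plain UTF-8 bytes, so it can be written to a file with an ordinary std::ofstream.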
    