Read Unicode file with special characters using std::wifstream

有刺的猬 2021-01-03 16:24

In a Linux environment, I have a piece of code for reading Unicode files, similar to the one shown below.

However, special characters (like the Danish letters æ, ø and å) are no

1 answer
  • 2021-01-03 16:51

    You have to use the imbue() method to tell wifstream that the file is encoded as UTF-16, and let it consume the BOM for you. You do not have to seekg() past the BOM manually. For example:

    #include <fstream>
    #include <string>
    #include <locale>
    #include <codecvt>
    
    // open as a byte stream
    std::wifstream wif("myfile.txt", std::ios::binary);
    if (wif.is_open())
    {
        // apply BOM-sensitive UTF-16 facet
        wif.imbue(std::locale(wif.getloc(), new std::codecvt_utf16<wchar_t, 0x10ffff, std::consume_header>));
    
        std::wstring wline;
        while (std::getline(wif, wline))
        {
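            // copy the decoded line, dropping any embedded NUL characters
            std::wstring convert;
            for (auto c : wline)
            {
                if (c != L'\0')
                    convert += c;
            }
            // convert now holds the cleaned line; use it here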
        }
    
        wif.close();
    }
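
To try the approach end to end, the following is a minimal sketch rather than a definitive implementation. The helper names `write_sample_utf16be` and `read_utf16_lines` are hypothetical and do not come from the answer. The sample file is written big-endian with a BOM, which is an assumption made for the illustration; files produced on Windows are often little-endian instead. Note also that `<codecvt>` has been deprecated since C++17, although common standard libraries still ship it.

```cpp
#include <cassert>
#include <codecvt>
#include <fstream>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: writes the line "æøå" as big-endian UTF-16 with a BOM.
// The bytes are spelled out so the result does not depend on the compiler's
// source or execution character set.
void write_sample_utf16be(const std::string& path)
{
    const unsigned char bytes[] = {
        0xFE, 0xFF,   // byte order mark (big-endian)
        0x00, 0xE6,   // U+00E6 æ
        0x00, 0xF8,   // U+00F8 ø
        0x00, 0xE5,   // U+00E5 å
        0x00, 0x0A    // U+000A line feed
    };
    std::ofstream out(path, std::ios::binary);
    assert(out && "sample file could not be created");
    out.write(reinterpret_cast<const char*>(bytes), sizeof bytes);
}

// Hypothetical helper: reads all lines of a UTF-16 file into wide strings,
// using the same facet as in the answer above.
std::vector<std::wstring> read_utf16_lines(const std::string& path)
{
    std::vector<std::wstring> lines;

    // open as a byte stream so no newline translation touches the data
    std::wifstream wif(path, std::ios::binary);
    if (!wif.is_open())
        return lines;   // empty result if the file cannot be opened

    // install the UTF-16 facet before the first read; it consumes the BOM
    wif.imbue(std::locale(wif.getloc(),
        new std::codecvt_utf16<wchar_t, 0x10ffff, std::consume_header>));

    std::wstring wline;
    while (std::getline(wif, wline))
        lines.push_back(wline);

    return lines;
}
```

Calling `write_sample_utf16be("sample.txt")` and then `read_utf16_lines("sample.txt")` should give a single line holding the three code points U+00E6, U+00F8 and U+00E5, with no BOM character at the front.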
    