In a Linux environment, I have a piece of code for reading Unicode files, similar to the one shown below.
However, special characters (like the Danish letters æ, ø and å) are no
You have to use the imbue() method to tell the wifstream
that the file is encoded as UTF-16, and let the facet consume the BOM for you (that is what std::consume_header does). You do not have to seekg()
past the BOM manually. For example:
#include <fstream>
#include <string>
#include <locale>
#include <codecvt>
// open as a byte stream
std::wifstream wif("myfile.txt", std::ios::binary);
if (wif.is_open())
{
    // apply BOM-sensitive UTF-16 facet
    wif.imbue(std::locale(wif.getloc(), new std::codecvt_utf16<wchar_t, 0x10ffff, std::consume_header>));

    std::wstring wline;
    while (std::getline(wif, wline))
    {
        std::wstring convert;
        for (auto c : wline)
        {
            if (c != L'\0')
                convert += c;
        }
    }

    wif.close();
}
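
To try the same approach end to end, here is a minimal sketch that wraps it in a function. The names read_utf16_lines and write_sample_utf16le and the file name utf16_sample.txt are hypothetical, introduced only for illustration. The sketch assumes a standard library that still ships <codecvt> (the header is deprecated since C++17). It imbues the facet before opening the file, and it leaves out the L'\0' filtering loop on the assumption that the stray nulls come from reading UTF-16 bytes without a decoding facet. It also strips a trailing carriage return, since binary mode performs no newline translation for files with CRLF line endings.

```cpp
#include <codecvt>   // std::codecvt_utf16 (deprecated since C++17)
#include <fstream>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): read a UTF-16 file into lines.
std::vector<std::wstring> read_utf16_lines(const std::string& path)
{
    std::vector<std::wstring> lines;

    std::wifstream wif;
    // Imbue before open(), so the facet is installed before any byte is read.
    wif.imbue(std::locale(wif.getloc(),
        new std::codecvt_utf16<wchar_t, 0x10ffff, std::consume_header>));
    wif.open(path, std::ios::binary);
    if (!wif.is_open())
        return lines;   // empty result if the file cannot be opened

    std::wstring wline;
    while (std::getline(wif, wline))
    {
        // Binary mode does no newline translation: drop the CR of a CRLF pair.
        if (!wline.empty() && wline.back() == L'\r')
            wline.pop_back();
        lines.push_back(wline);
    }
    return lines;
}

// Hypothetical helper: write the line "æøå" as UTF-16LE with a BOM, byte by byte.
void write_sample_utf16le(const std::string& path)
{
    const unsigned char bytes[] = {
        0xFF, 0xFE,   // BOM, little-endian
        0xE6, 0x00,   // U+00E6 æ
        0xF8, 0x00,   // U+00F8 ø
        0xE5, 0x00,   // U+00E5 å
        0x0A, 0x00    // U+000A line feed
    };
    std::ofstream out(path, std::ios::binary);
    out.write(reinterpret_cast<const char*>(bytes), sizeof bytes);
}
```

Calling write_sample_utf16le("utf16_sample.txt") followed by read_utf16_lines("utf16_sample.txt") should yield a single line of three wide characters (æ, ø, å), with the BOM consumed by the facet and not present in the text.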