I use wchar_t for internal strings and UTF-8 for storage in files. I need to use STL to input/output text to screen and also do it by using full Lithua
get FILE* or integer file handle from a std::basic_*fstream?
Answered elsewhere.
You can't make STL work directly with UTF-8. The basic reason is that STL indirectly forbids multi-char characters: each character has to be exactly one char or wchar_t.
Microsoft actually breaks the standard with their UTF-16 encoding, so maybe you can get some inspiration there.
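To make the multi-char point concrete, here is a minimal sketch (the helper name count_code_points is made up for illustration). It counts UTF-8 code points by skipping continuation bytes, which shows that std::string::size() reports char elements rather than characters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in a UTF-8 encoded string.
// Continuation bytes have the bit pattern 10xxxxxx and are not counted.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        unsigned char byte = static_cast<unsigned char>(utf8[i]);
        if ((byte & 0xC0) != 0x80) ++n;
    }
    return n;
}
```

For the string "ąfl" stored as UTF-8 (bytes C4 85 66 6C), size() gives 4 while the helper gives 3, because "ą" occupies two char elements.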
Use the std::codecvt facet template to perform the conversion.
You may use the standard std::codecvt_byname, or a non-standard codecvt_facet implementation.
#include <iostream>
#include <locale>
using namespace std;
typedef codecvt<wchar_t, char, mbstate_t> Cvt;
int main()
{
    // Replace the codecvt facet of the global locale with the UTF-8 one.
    locale utf8locale(locale(), new codecvt_byname<wchar_t, char, mbstate_t> ("en_US.UTF-8"));
    wcout.imbue(utf8locale);
    wcout << L"Hello, wide to multibyte world!" << endl;
}
Beware that on some platforms codecvt_byname can only perform conversions for locales that are installed on the system.
The easiest way would be to do the conversion to UTF-8 yourself before trying to output. You might get some inspiration from this question: UTF8 to/from wide char conversion in STL
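As a rough sketch of what doing the conversion yourself could look like (the function name to_utf8 is made up here, and this is not the code from the linked question), assuming wchar_t holds UTF-16 on Windows or UTF-32 elsewhere:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encodes a wide string as UTF-8.
// Unpaired surrogates are not rejected; a stricter version would check them.
std::string to_utf8(const std::wstring& ws) {
    std::string out;
    for (std::size_t i = 0; i < ws.size(); ++i) {
        std::uint32_t cp = static_cast<std::uint32_t>(ws[i]);
        // Combine a UTF-16 surrogate pair into a single code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < ws.size()) {
            std::uint32_t lo = static_cast<std::uint32_t>(ws[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }
        if (cp < 0x80) {                      // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {              // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {            // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                              // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The result is a plain std::string of bytes, so it can be written with an ordinary narrow stream opened in binary mode, e.g. out << to_utf8(text).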
Well, after some testing I figured out that FILE is accepted for _iobuf (in the w*fstream constructor). So, the following code does what I need.
#include <iostream>
#include <fstream>
#include <io.h>
#include <fcntl.h>
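int main ()
{
    {   // For writing
        FILE* fp;
        if (_wfopen_s (&fp, L"utf-8_out_test.txt", L"w") != 0) return 1;
        _setmode (_fileno (fp), _O_U8TEXT);
        std::wofstream fs (fp); // Microsoft extension: stream constructed from a FILE*
        fs << L"ąfl";
        fs.flush ();            // push pending output to the FILE* before closing it
        fclose (fp);
    }
    {   // And reading
        FILE* fp;
        if (_wfopen_s (&fp, L"utf-8_in_test.txt", L"r") != 0) return 1;
        _setmode (_fileno (fp), _O_U8TEXT);
        std::wifstream fs (fp);
        wchar_t array[6];
        fs.getline (array, 5);
        std::wcout << array << std::endl; // For debug
        fclose (fp);
    }
    return 0;
}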
This sample reads and writes legit UTF-8 files (without BOM) on Windows when compiled with Visual Studio 2008.
Can someone give any comments about portability? Improvements?