Question
RapidXML is one of the available libraries for parsing XML in C++. To get a node's value, we can use something like:
node->first_node("xmlnode")->value()
This call returns a char*. Is there any way to read the value as Unicode so I can assign it to a WCHAR or wstring variable?
Answer 1:
From the manual:
RapidXml is character type agnostic, and can work both with narrow and wide characters. Current version does not fully support UTF-16 or UTF-32, so use of wide characters is somewhat incapacitated. However, it should succesfully parse wchar_t strings containing UTF-16 or UTF-32 if endianness of the data matches that of the machine.
So I just use the following:
#include <rapidxml/rapidxml.hpp>
typedef rapidxml::xml_node<wchar_t> const * xml_node_cptr;
typedef rapidxml::xml_node<wchar_t> * xml_node_ptr;
typedef rapidxml::xml_attribute<wchar_t> const * xml_attribute_cptr;
typedef rapidxml::xml_attribute<wchar_t> * xml_attribute_ptr;
typedef rapidxml::xml_document<wchar_t> xml_doc;
Note that if you do this, all parameters will be wchar_t, so the argument to first_node() also needs to be a wchar_t string, i.e.:
node->first_node(L"xmlnode")->value()
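Answer 2:
Here you need to convert the std::string to a std::wstring. The standard library can do this (note that std::wstring_convert and <codecvt> are deprecated since C++17):
#include <string>
#include <locale>  // std::wstring_convert
#include <codecvt> // std::codecvt_utf8_utf16
std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
// Convert a UTF-8 std::string to a std::wstring
std::string strSample;
std::wstring wstrValue = converter.from_bytes(strSample);
// Convert a std::wstring back to a UTF-8 std::string
std::wstring wstrSample;
std::string strValue = converter.to_bytes(wstrSample);
Hope this helps.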
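Since value() returns a raw char*, it may be convenient to wrap the conversion in a small helper. The following is only a sketch: utf8_to_wide is a hypothetical name (it is not part of RapidXML or the standard library), and it assumes the XML text is UTF-8. It also assumes a platform where wchar_t holds UTF-16 code units, as on Windows; where wchar_t is 32 bits wide, characters outside the Basic Multilingual Plane would end up as surrogate pairs.

```cpp
#include <codecvt>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a RapidXML API): decodes a zero-terminated UTF-8
// string into a std::wstring. Returns false, leaving `out` empty, when the
// input is not valid UTF-8, instead of letting std::range_error propagate.
bool utf8_to_wide(const char* utf8, std::wstring& out)
{
    out.clear();
    if (utf8 == nullptr)
    {
        return true; // treat a null pointer as an empty string
    }

    // Deprecated since C++17, but still the only conversion facility
    // that the standard library itself provides.
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
    try
    {
        out = converter.from_bytes(utf8);
        return true;
    }
    catch (const std::range_error&)
    {
        out.clear();
        return false;
    }
}
```

With a narrow-character document, this could be called as utf8_to_wide(node->first_node("xmlnode")->value(), wstrValue), checking the return value before using wstrValue.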
Answer 3:
Another solution is to use the functions given in: http://msmvps.com/blogs/gdicanio/archive/2010/01/04/conversion-between-unicode-utf-16-and-utf-8-in-c-win32.aspx
CStringW ConvertUTF8ToUTF16( __in const CHAR * pszTextUTF8 )
{
    //
    // Special case of NULL or empty input string
    //
    if ( (pszTextUTF8 == NULL) || (*pszTextUTF8 == '\0') )
    {
        // Return empty string
        return L"";
    }

    //
    // Consider CHAR's count corresponding to total input string length,
    // including end-of-string (\0) character
    //
    const size_t cchUTF8Max = INT_MAX - 1;
    size_t cchUTF8;
    HRESULT hr = ::StringCchLengthA( pszTextUTF8, cchUTF8Max, &cchUTF8 );
    if ( FAILED( hr ) )
    {
        AtlThrow( hr );
    }

    // Consider also terminating \0
    ++cchUTF8;

    // Convert to 'int' for use with MultiByteToWideChar API
    int cbUTF8 = static_cast<int>( cchUTF8 );

    //
    // Get size of destination UTF-16 buffer, in WCHAR's
    //
    int cchUTF16 = ::MultiByteToWideChar(
        CP_UTF8,                // convert from UTF-8
        MB_ERR_INVALID_CHARS,   // error on invalid chars
        pszTextUTF8,            // source UTF-8 string
        cbUTF8,                 // total length of source UTF-8 string,
                                // in CHAR's (= bytes), including end-of-string \0
        NULL,                   // unused - no conversion done in this step
        0                       // request size of destination buffer, in WCHAR's
        );
    ATLASSERT( cchUTF16 != 0 );
    if ( cchUTF16 == 0 )
    {
        AtlThrowLastWin32();
    }

    //
    // Allocate destination buffer to store UTF-16 string
    //
    CStringW strUTF16;
    WCHAR * pszUTF16 = strUTF16.GetBuffer( cchUTF16 );

    //
    // Do the conversion from UTF-8 to UTF-16
    //
    int result = ::MultiByteToWideChar(
        CP_UTF8,                // convert from UTF-8
        MB_ERR_INVALID_CHARS,   // error on invalid chars
        pszTextUTF8,            // source UTF-8 string
        cbUTF8,                 // total length of source UTF-8 string,
                                // in CHAR's (= bytes), including end-of-string \0
        pszUTF16,               // destination buffer
        cchUTF16                // size of destination buffer, in WCHAR's
        );
    ATLASSERT( result != 0 );
    if ( result == 0 )
    {
        AtlThrowLastWin32();
    }

    // Release internal CString buffer
    strUTF16.ReleaseBuffer();

    // Return resulting UTF16 string
    return strUTF16;
}
Source: https://stackoverflow.com/questions/18514242/how-to-read-unicode-xml-values-with-rapidxml