Question
All,
this is my code:
//declare string pointer
BSTR markup;
//initialize markup to some well formed XML <-
//declare and initialize XML Document
MSXML2::IXMLDOMDocument2Ptr pXMLDoc;
HRESULT hr;
hr = pXMLDoc.CreateInstance(__uuidof(MSXML2::DOMDocument40));
pXMLDoc->async = VARIANT_FALSE;
pXMLDoc->validateOnParse = VARIANT_TRUE;
pXMLDoc->preserveWhiteSpace = VARIANT_TRUE;
//load markup into XML document
VARIANT_BOOL vtBoolResult = pXMLDoc->loadXML(markup);
//do some changes to the XML file<-
//get back string from XML doc
markup = pXMLDoc->Getxml(); //<-- this retrieves RUBBISH
At this point my string is mangled (just a few Chinese characters at the start, then rubbish). It looks like an encoding issue.
I also tried the following:
_bstr_t superMarkup = _bstr_t(markup);
//did my stuff
superMarkup = pXMLDoc->Getxml();
markup = superMarkup;
but I am still getting the same result.
Even if I call Getxml() without changing anything in the XML document, I still get rubbish.
At this point, if I try to assign the mangled pointer to another pointer, it throws an error:
Attempted to read or write protected memory. This is often an indication that other memory is corrupted.
Any suggestions?
EDIT1:
I found out this is related to the size of the XML string. If it happens with a given XML string and I reduce its size (keeping the same schema), it works fine. Does MSXML2::DOMDocument40 have a size limitation? In detail, it happens when I have more than 16407 characters. With even one character more, Getxml() retrieves RUBBISH; if the length is <= 16407, everything works fine.
EDIT2:
Roddy was right: I had missed that _bstr_t is a class ...
Does this ring any bells?
Cheers
Answer 1:
Try replacing
BSTR markup;
with
_bstr_t markup;
BSTR is pretty much a dumb pointer, and I think the value returned by Getxml() is a temporary that is converted to a BSTR and then destroyed by the time you get to see it. _bstr_t wraps that with some smart-pointer goodness...
Note: Your "superMarkup" attempt did NOT do what I suggested. Again, BSTR is just a pointer and doesn't "own" what it points to; _bstr_t, on the other hand, does. I think your Getxml() call is returning a _bstr_t, which is then destroyed as it goes out of scope, leaving your BSTR pointing to memory that is no longer valid.
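To make the lifetime problem concrete, here is a minimal portable sketch. It does not use MSXML or _bstr_t at all: OwnedWideString, get_xml and live_buffers are hypothetical stand-ins invented for illustration, on the assumption that Getxml() returns an owning object by value. The real _bstr_t is reference-counted, whereas this stand-in simply deep-copies. The counter shows that the buffer is already freed once only the raw pointer has been kept.

```cpp
#include <cassert>
#include <cwchar>
#include <string>

// Counts buffers that are currently allocated (illustration only).
static int live_buffers = 0;

// Hypothetical stand-in for _bstr_t: owns a heap buffer, frees it on destruction.
class OwnedWideString {
public:
    explicit OwnedWideString(const wchar_t* text)
        : data_(new wchar_t[std::wcslen(text) + 1]) {
        std::wcscpy(data_, text);
        ++live_buffers;
    }
    OwnedWideString(const OwnedWideString& other) : OwnedWideString(other.data_) {}
    OwnedWideString& operator=(const OwnedWideString&) = delete;
    ~OwnedWideString() {
        delete[] data_;
        --live_buffers;
    }
    // Implicit conversion to a raw pointer that does not transfer ownership.
    operator const wchar_t*() const { return data_; }

private:
    wchar_t* data_;
};

// Stand-in for pXMLDoc->Getxml(): returns the owning object by value.
OwnedWideString get_xml() { return OwnedWideString(L"<XML></XML>"); }

// Pattern from the question: only the raw pointer is kept, so the temporary
// owner is destroyed at the end of the statement. Returns the live buffer count.
int live_after_raw_assignment() {
    const wchar_t* raw = get_xml();  // temporary dies here; raw now dangles
    (void)raw;                       // deliberately never dereferenced
    return live_buffers;
}

// Pattern from this answer: keep the owner alive while the pointer is used.
std::wstring copy_via_owner() {
    OwnedWideString owner = get_xml();
    assert(live_buffers == 1);       // the buffer is still allocated
    const wchar_t* raw = owner;      // valid for as long as owner lives
    return std::wstring(raw);
}
```

Calling live_after_raw_assignment() reports that no buffer is left alive behind the raw pointer, while copy_via_owner() can safely read the text because the owner outlives the read.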
Answer 2:
Ok, I think Patrick is right. I took your code and made a quick ATL EXE project named getxmltest. I added this line after the #include directives:
#import "MSXML3.DLL"
I removed the post-build event that registers the component, because I don't want to expose any component from the EXE; I only want all the ATL headers and libs already referenced. Then I added the following code to _tWinMain:
extern "C" int WINAPI _tWinMain(HINSTANCE /*hInstance*/, HINSTANCE /*hPrevInstance*/,
                                LPTSTR /*lpCmdLine*/, int nShowCmd)
{
    CoInitialize(NULL);
    {
        //declare string pointer
        _bstr_t markup;
        //initialize markup to some well formed XML <-
        //declare and initialize XML Document
        MSXML2::IXMLDOMDocument2Ptr pXMLDoc;
        HRESULT hr = pXMLDoc.CreateInstance(__uuidof(MSXML2::DOMDocument));
        pXMLDoc->async = VARIANT_FALSE;
        pXMLDoc->validateOnParse = VARIANT_TRUE;
        pXMLDoc->preserveWhiteSpace = VARIANT_TRUE;
        //load markup into XML document
        VARIANT_BOOL vtBoolResult = pXMLDoc->loadXML(L"<XML></XML>");
        //do some changes to the XML file<-
        //get back string from XML doc
        markup = pXMLDoc->Getxml(); //<-- this retrieves RUBBISH (not anymore...)
        ATLTRACE("%S", (BSTR)markup.GetBSTR());
    }
    CoUninitialize();
    return _AtlModule.WinMain(nShowCmd);
}
The resulting trace lines were the following...
'getxmltest.exe': Loaded 'C:\Windows\winsxs\x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.6001.18000_none_5cdbaa5a083979cc\comctl32.dll'
<XML></XML>
'getxmltest.exe': Unloaded 'C:\Windows\SysWOW64\msxml3.dll'
The program '[6040] getxmltest.exe: Native' has exited with code 0 (0x0).
Here we can see the string we entered initially. I didn't add any more logic to the code because I thought this was enough to display the resulting XML after processing it with the MSXML engine. Obviously you can do some more testing with this code and see what happens next.
Answer 3:
I'm not proficient with this particular XML library, however:
Something to note here is that the original question overwrote the variable 'markup' when it retrieved the result. Many XML parsers return pointers into the initial input (i.e. markup), so when you replace it with the output, you also delete the input to the XML parser.
It seems possible that this process would invalidate the string that you just received. You will notice that Eugenio Miró does not make this mistake in his example, as his input (the literal passed to loadXML) is separate from the variable that receives the output (markup).
A quick test you might like to do is to change
//get back string from XML doc
markup = pXMLDoc->Getxml(); //<-- this retrieves RUBBISH
to
//get back string from XML doc
BSTR output = pXMLDoc->Getxml(); //<-- perhaps this doesn't
and see if that makes a difference.
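As a portable illustration of the general hazard described here (not of MSXML itself, whose behaviour I have not checked), the sketch below uses a made-up in-situ lookup. TagView, first_tag and first_tag_copy are hypothetical names; the point is only that a result which borrows from the input must be copied into its own storage before the input variable is reused.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical in-situ result: points into the caller's buffer, owns nothing.
struct TagView {
    const char* data;
    std::size_t size;
};

// Finds the first element name in xml without copying any characters.
TagView first_tag(const std::string& xml) {
    std::size_t open = xml.find('<');
    if (open == std::string::npos) return {nullptr, 0};
    std::size_t end = xml.find_first_of(" />", open + 1);
    if (end == std::string::npos) return {nullptr, 0};
    return {xml.data() + open + 1, end - open - 1};
}

// Copies the borrowed characters into an owning string, so the result stays
// valid even if the caller later overwrites or frees the input.
std::string first_tag_copy(const std::string& xml) {
    TagView view = first_tag(xml);
    assert(view.size <= xml.size());
    return view.data ? std::string(view.data, view.size) : std::string();
}
```

With this shape, reassigning the input after calling first_tag_copy is harmless, whereas holding on to a TagView across such a reassignment would leave it pointing at freed memory.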
Answer 4:
This is the code I wrote before with a little modification which adds 20000 'child' elements :) and it works well.
extern "C" int WINAPI _tWinMain(HINSTANCE /*hInstance*/, HINSTANCE /*hPrevInstance*/,
                                LPTSTR /*lpCmdLine*/, int nShowCmd)
{
    CoInitialize(NULL);
    {
        //declare string pointer
        _bstr_t markup;
        //initialize markup to some well formed XML <-
        //declare and initialize XML Document
        try {
            MSXML2::IXMLDOMDocument2Ptr pXMLDoc;
            HRESULT hr = pXMLDoc.CreateInstance(__uuidof(MSXML2::DOMDocument));
            pXMLDoc->async = VARIANT_FALSE;
            pXMLDoc->validateOnParse = VARIANT_TRUE;
            pXMLDoc->preserveWhiteSpace = VARIANT_TRUE;
            //load markup into XML document
            VARIANT_BOOL vtBoolResult = pXMLDoc->loadXML(L"<XML></XML>");
            for (int i = 0; i < 20000; i++) {
                MSXML2::IXMLDOMNodePtr node = pXMLDoc->createNode(_variant_t("element"), _bstr_t("child"), _bstr_t(""));
                if (node)
                    pXMLDoc->documentElement->appendChild(node);
            }
            //do some changes to the XML file<-
            //get back string from XML doc
            markup = pXMLDoc->Getxml(); //<-- th
            ATLTRACE("XML lenght = %d, xml=%S\n", markup.length(), (BSTR)markup.GetBSTR());
        } catch(_com_error e) {
            ATLTRACE("error = %S\n", (BSTR)e.ErrorMessage());
        }
    }
    CoUninitialize();
    return _AtlModule.WinMain(nShowCmd);
}
In the debugger, however, this produces an output line cut off at 1024 characters; you could easily print the XML to standard output instead if you wish. This is the output I get so far:
'getxmltest.exe': Loaded 'C:\Windows\winsxs\x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.6001.18000_none_5cdbaa5a083979cc\comctl32.dll'
XML lenght = 160013, xml=<XML><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><'getxmltest.exe': Unloaded 'C:\Windows\SysWOW64\msxml3.dll'
The program '[4884] getxmltest.exe: Native' has exited with code 0 (0x0).
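If you want the whole string in the debugger rather than only the first part, one option is to split it into pieces before tracing. This is a minimal sketch, assuming the cut-off really is a fixed limit of about 1024 characters per trace call as observed above; split_for_trace is a hypothetical helper, not part of ATL or MSXML, and the chunk size is up to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits text into consecutive pieces of at most max_len characters, so that
// each piece fits into a trace call with a fixed-size buffer.
std::vector<std::wstring> split_for_trace(const std::wstring& text, std::size_t max_len) {
    assert(max_len > 0);  // a zero chunk size would never make progress
    std::vector<std::wstring> chunks;
    for (std::size_t pos = 0; pos < text.size(); pos += max_len) {
        // substr clamps the count, so the last piece may be shorter.
        chunks.push_back(text.substr(pos, max_len));
    }
    return chunks;
}
```

Each returned piece could then be traced the same way markup is traced above, using a chunk size safely below the limit (for example 1000).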
Source: https://stackoverflow.com/questions/324168/msxml2ixmldomdocument2ptr-getxml-messing-up-my-string