MSXML2::IXMLDOMDocument2Ptr->GetXML() messing up my string!

Posted by 浪子不回头ぞ on 2019-12-12 23:12:43

Question


All,

this is my code

//declare string pointer
BSTR markup;

//initialize markup to some well formed XML <-

//declare and initialize XML Document
MSXML2::IXMLDOMDocument2Ptr pXMLDoc;
HRESULT hr;
hr = pXMLDoc.CreateInstance(__uuidof(MSXML2::DOMDocument40));
pXMLDoc->async = VARIANT_FALSE;
pXMLDoc->validateOnParse = VARIANT_TRUE;
pXMLDoc->preserveWhiteSpace = VARIANT_TRUE;    

//load markup into XML document
VARIANT_BOOL vtBoolResult = pXMLDoc->loadXML(markup);

//do some changes to the XML file<-

//get back string from XML doc
markup = pXMLDoc->Getxml(); //<-- this retrieves RUBBISH

At this point my string is mangled (just a few Chinese characters at the start, then rubbish). It looks like an encoding issue.

I also tried the following:

_bstr_t superMarkup = _bstr_t(markup);

//did my stuff

superMarkup = pXMLDoc->Getxml();

markup = superMarkup; 

but still I am getting the same result.

Even if I call GetXML() without changing anything in the xml document I still get rubbish.

At this point, if I try to assign the mangled pointer to another pointer, it throws an error:

Attempted to restore write protected memory. this is often an indication that other memory is corrupted.

Any suggestion?

EDIT1:

I found out that this depends on the size of the XML string. If it happens with a given XML string and I reduce its size (keeping the same schema), it works fine. Does MSXML2::DOMDocument40 have a size limit? In detail, it happens when I have more than 16407 characters: with one more character Getxml() retrieves RUBBISH, and with 16407 or fewer everything works fine.

EDIT2:

Roddy was right - I was missing that _bstr_t is a class ...

Does this ring a bell?

Cheers


Answer 1:


Try replacing

 BSTR markup;

with

 _bstr_t markup;

BSTR is pretty much a dumb pointer, and I think the return value of Getxml() is being converted to a temporary, which is destroyed by the time you get to look at it. _bstr_t wraps the pointer with some smart-pointer goodness...

Note: Your "superMarkup" attempt did NOT do what I suggested, because it still copies the result back into the raw BSTR. Again, a BSTR is just a pointer and doesn't "own" what it points to. A _bstr_t, on the other hand, does. I think your Getxml() call returns a _bstr_t, which is deleted as it goes out of scope, leaving your BSTR pointing to memory that is no longer valid.
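
The ownership problem described in this answer can be illustrated without MSXML. The following is only a portable sketch of the idea, not the real COM types: OwnedWide is a hypothetical stand-in for _bstr_t (the real class is reference counted, this one simply copies), and get_xml() is a hypothetical stand-in for pXMLDoc->Getxml(). It shows why a raw pointer taken from a returned temporary dangles, while a pointer taken from a named owning object stays valid.

```cpp
#include <cwchar>
#include <string>
#include <utility>

// Hypothetical stand-in for _bstr_t (NOT the real class): it owns a heap
// copy of a wide string and frees it in its destructor.
class OwnedWide {
public:
    explicit OwnedWide(const wchar_t* text)
        : data_(new wchar_t[std::wcslen(text) + 1]) {
        std::wcscpy(data_, text);
    }
    OwnedWide(const OwnedWide& other) : OwnedWide(other.data_) {}
    OwnedWide& operator=(const OwnedWide& other) {
        if (this != &other) {
            OwnedWide copy(other);          // copy-and-swap
            std::swap(data_, copy.data_);
        }
        return *this;
    }
    ~OwnedWide() { delete[] data_; }

    // Implicit conversion to a raw pointer, similar in spirit to the
    // conversion that lets a _bstr_t be assigned to a BSTR.
    operator const wchar_t*() const { return data_; }

private:
    wchar_t* data_;
};

// Hypothetical stand-in for pXMLDoc->Getxml(): returns an owner by value.
OwnedWide get_xml() {
    return OwnedWide(L"<XML></XML>");
}

// Safe pattern: keep the owning object alive while the pointer is in use.
std::wstring read_safely() {
    OwnedWide owner = get_xml();    // owner keeps the buffer alive
    const wchar_t* view = owner;    // valid for as long as owner lives
    return std::wstring(view);      // copy out before owner is destroyed
}

// Unsafe pattern (deliberately left commented out, it is undefined behavior):
//   const wchar_t* raw = get_xml();  // the temporary dies at the ';'
//   std::wstring broken(raw);        // reads memory that was already freed
```

In the question's code the same thing happens when the result of Getxml() is stored in a BSTR: the pointer survives, but the object that owned the memory does not.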




Answer 2:


OK, I think Patrick is right. I took your code and made a quick ATL EXE project named getxmltest. I added this line after the #include directives:

#import "MSXML3.DLL"

I removed the post-build event that registers the component, because I don't want to expose any component from the EXE; I only want the ATL headers and libs already referenced. Then I added the following code to _tWinMain:

extern "C" int WINAPI _tWinMain(HINSTANCE /*hInstance*/, HINSTANCE /*hPrevInstance*/, 
                                LPTSTR /*lpCmdLine*/, int nShowCmd)
{
    CoInitialize(NULL);
    {
        //declare string pointer
        _bstr_t                     markup;
        //initialize markup to some well formed XML <-
        //declare and initialize XML Document
        MSXML2::IXMLDOMDocument2Ptr pXMLDoc;
        HRESULT                     hr              = pXMLDoc.CreateInstance(__uuidof(MSXML2::DOMDocument));

        pXMLDoc->async              = VARIANT_FALSE;
        pXMLDoc->validateOnParse    = VARIANT_TRUE;
        pXMLDoc->preserveWhiteSpace = VARIANT_TRUE;    

        //load markup into XML document
        VARIANT_BOOL                vtBoolResult    = pXMLDoc->loadXML(L"<XML></XML>");

        //do some changes to the XML file<-
        //get back string from XML doc
        markup = pXMLDoc->Getxml(); //<-- this retrieves RUBBISH (not anymore...)
        ATLTRACE("%S", (BSTR)markup.GetBSTR());
    }
    CoUninitialize();
    return _AtlModule.WinMain(nShowCmd);
}

The resulting trace lines were the following...

'getxmltest.exe': Loaded 'C:\Windows\winsxs\x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.6001.18000_none_5cdbaa5a083979cc\comctl32.dll'
<XML></XML>
'getxmltest.exe': Unloaded 'C:\Windows\SysWOW64\msxml3.dll'
The program '[6040] getxmltest.exe: Native' has exited with code 0 (0x0).

Here we can see the string we entered initially. I didn't add any logic to the code because I thought this was enough to display the resulting XML after processing it with the MSXML engine. Obviously you can do some more testing using this code and see what happens next.




Answer 3:


I'm not proficient with this particular xml library, however:

Something to note here is that the original question overwrote the variable 'markup' as it retrieved the result. Many XML parsers return pointers into the initial input (i.e. markup), so when you replace it with the output, you also delete the input to the XML parser.

It seems possible that this process would invalidate the string that you just received. You will notice that Eugenio Miró does not make this mistake in his example, as he allocates a different variable to hold the input (pXMLDoc).

A quick test you might like to do is to change

//get back string from XML doc
markup = pXMLDoc->Getxml(); //<-- this retrieves RUBBISH

to

//get back string from XML doc
BSTR output = pXMLDoc->Getxml(); //<-- perhaps this doesn't

and see if that makes a difference.




Answer 4:


This is the code I wrote before with a little modification which adds 20000 'child' elements :) and it works well.

extern "C" int WINAPI _tWinMain(HINSTANCE /*hInstance*/, HINSTANCE /*hPrevInstance*/, 
                                LPTSTR /*lpCmdLine*/, int nShowCmd)
{
    CoInitialize(NULL);
    {
        //declare string pointer
        _bstr_t                           markup;
        //initialize markup to some well formed XML <-
        //declare and initialize XML Document
        try {
            MSXML2::IXMLDOMDocument2Ptr   pXMLDoc;
            HRESULT                       hr              = pXMLDoc.CreateInstance(__uuidof(MSXML2::DOMDocument));    
            pXMLDoc->async                = VARIANT_FALSE;
            pXMLDoc->validateOnParse      = VARIANT_TRUE;
            pXMLDoc->preserveWhiteSpace   = VARIANT_TRUE;    

            //load markup into XML document
            VARIANT_BOOL                  vtBoolResult    = pXMLDoc->loadXML(L"<XML></XML>");

            for (int i = 0; i < 20000; i++) {
                MSXML2::IXMLDOMNodePtr    node            = pXMLDoc->createNode(_variant_t("element"), _bstr_t("child"), _bstr_t(""));

                if (node)
                    pXMLDoc->documentElement->appendChild(node);
            }

            //do some changes to the XML file<-
            //get back string from XML doc
            markup = pXMLDoc->Getxml(); //<-- th
            ATLTRACE("XML lenght = %d, xml=%S\n", markup.length(), (BSTR)markup.GetBSTR());
        } catch (const _com_error& e) {
            ATLTRACE("error = %S\n", (BSTR)e.ErrorMessage());
        }
    }
    CoUninitialize();
    return _AtlModule.WinMain(nShowCmd);
}

In the debugger this produces an output line that is cut off at 1024 characters, but you could easily print the XML to standard output instead if you wish. This is the output I get so far:

'getxmltest.exe': Loaded 'C:\Windows\winsxs\x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.6001.18000_none_5cdbaa5a083979cc\comctl32.dll'
XML lenght = 160013, xml=<XML><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><child/><'getxmltest.exe': Unloaded 'C:\Windows\SysWOW64\msxml3.dll'
The program '[4884] getxmltest.exe: Native' has exited with code 0 (0x0).
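
The trace line above is cut off long before the end of the document. One way to see the whole string is to emit it in pieces. The sketch below is a portable illustration with hypothetical helper names (split_for_trace and make_sample_xml are not part of ATL or MSXML), and it assumes a per-line limit of about 1024 characters, as the truncated output suggests.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split text into pieces of at most max_len characters
// so that each piece fits on one trace line. Returns no pieces if max_len
// is zero.
std::vector<std::wstring> split_for_trace(const std::wstring& text,
                                          std::size_t max_len) {
    std::vector<std::wstring> pieces;
    if (max_len == 0) {
        return pieces;
    }
    for (std::size_t pos = 0; pos < text.size(); pos += max_len) {
        pieces.push_back(text.substr(pos, max_len));
    }
    return pieces;
}

// Hypothetical helper: build a document shaped like the one in this answer,
// a root element with `count` empty child elements (no trailing newline).
std::wstring make_sample_xml(std::size_t count) {
    std::wstring xml = L"<XML>";
    for (std::size_t i = 0; i < count; ++i) {
        xml += L"<child/>";
    }
    xml += L"</XML>";
    return xml;
}
```

Each piece could then be passed to ATLTRACE in turn, or written to std::wcout, so that no single call exceeds the assumed limit.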


Source: https://stackoverflow.com/questions/324168/msxml2ixmldomdocument2ptr-getxml-messing-up-my-string
