问题
Hope you all are doing well.
Previously I asked a question how to import a XML file to SQL Server thanks to all you responses.
As my source file come with heavy amount of data, I am trying to load by SSIS. Below are the steps I followed:
- Imported XML by BulkLoad to a XML type column
- Created XSD Schema out of that XML file in SQL Server
- Now in SSIS, used XML Source and provided the XML schema for mapping to OLEDB destination.
But the execution failed saying
"Error: 0xC02090E7 at Load XML, XML Source 1: The component "XML Source" (1) was unable to read the XML data. DTD is prohibited in this XML document."
BOL says SSIS doesn't support DTD, and there is no way we can avoid DTD in my source file.
Please anyone help me to resolve this issue.
Here is my XML file:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE _line_feed [
<!ELEMENT FeedTime (#PCDATA)>
<!ELEMENT lastContest (#PCDATA)>
<!ELEMENT lastGame (#PCDATA)>
<!ELEMENT contest_maximum (#PCDATA)>
<!ELEMENT contestantnum (#PCDATA)>
<!ELEMENT description (#PCDATA)>
<!ELEMENT event (event_datetimeGMT, gamenumber, sporttype, league, contest_maximum?, description?, (participants | periods | total)+)>
<!ELEMENT event_datetimeGMT (#PCDATA)>
<!ELEMENT gamenumber (#PCDATA)>
<!ELEMENT league (#PCDATA)>
<!ELEMENT odds (moneyline_value, to_base?)>
<!ELEMENT over_adjust (#PCDATA)>
<!ELEMENT participants (participant*)>
<!ELEMENT participant (participant_name, contestantnum, rotnum, visiting_home_draw?, odds?, pitcher?)>
<!ELEMENT participant_name (#PCDATA)>
<!ELEMENT periods (period*)>
<!ELEMENT period (period_number, period_description, periodcutoff_datetimeGMT, period_status, period_update, spread_maximum?, moneyline_maximum?, total_maximum?, moneyline?, spread?, total?)>
<!ELEMENT period_number (#PCDATA)>
<!ELEMENT period_description (#PCDATA)>
<!ELEMENT period_status (#PCDATA)>
<!ELEMENT period_update (#PCDATA)>
<!ELEMENT periodcutoff_datetimeGMT (#PCDATA)>
<!ELEMENT _line_feed (FeedTime, lastContest, lastGame, events)>
<!ELEMENT events (event*)>
<!ELEMENT pitcher (#PCDATA)>
<!ELEMENT rotnum (#PCDATA)>
<!ELEMENT sporttype (#PCDATA)>
<!ELEMENT moneyline (moneyline_visiting, moneyline_home, moneyline_draw?)>
<!ELEMENT moneyline_value (#PCDATA)>
<!ELEMENT moneyline_visiting (#PCDATA)>
<!ELEMENT moneyline_home (#PCDATA)>
<!ELEMENT moneyline_draw (#PCDATA)>
<!ELEMENT moneyline_maximum (#PCDATA)>
<!ELEMENT spread (spread_visiting, spread_adjust_visiting, spread_home, spread_adjust_home)>
<!ELEMENT spread_adjust_home (#PCDATA)>
<!ELEMENT spread_adjust_visiting (#PCDATA)>
<!ELEMENT spread_home (#PCDATA)>
<!ELEMENT spread_maximum (#PCDATA)>
<!ELEMENT spread_visiting (#PCDATA)>
<!ELEMENT to_base (#PCDATA)>
<!ELEMENT total (total_points, over_adjust?, under_adjust?, units?)>
<!ELEMENT total_maximum (#PCDATA)>
<!ELEMENT total_points (#PCDATA)>
<!ELEMENT under_adjust (#PCDATA)>
<!ELEMENT units (#PCDATA)>
<!ELEMENT visiting_home_draw (#PCDATA)>
]>
<_line_feed>
<FeedTime>1279783821193</FeedTime>
<lastContest>4118567</lastContest>
<lastGame>58681915</lastGame>
<events>
<event>
<event_datetimeGMT>2010-07-22 20:05</event_datetimeGMT>
<gamenumber>174201668</gamenumber>
<sporttype>Tennis</sporttype>
<league>M Atlanta 16</league>
<participants>
<participant>
<participant_name>A. Roddick</participant_name>
<contestantnum>4333</contestantnum>
<rotnum>4333</rotnum>
<visiting_home_draw>Visiting</visiting_home_draw>
</participant>
<participant>
<participant_name>R. Ram</participant_name>
<contestantnum>4334</contestantnum>
<rotnum>4334</rotnum>
<visiting_home_draw>Home</visiting_home_draw>
</participant>
</participants>
<periods>
<period>
<period_number>0</period_number>
<period_description>Game</period_description>
<periodcutoff_datetimeGMT>2010-07-22 20:05</periodcutoff_datetimeGMT>
<period_status>I</period_status>
<period_update>open</period_update>
<spread_maximum>500</spread_maximum>
<moneyline_maximum>1500</moneyline_maximum>
<total_maximum>500</total_maximum>
<moneyline>
<moneyline_visiting>-1850</moneyline_visiting>
<moneyline_home>1290</moneyline_home>
</moneyline>
</period>
<period>
<period_number>0</period_number>
<period_description>Game</period_description>
<periodcutoff_datetimeGMT>2010-07-22 20:05</periodcutoff_datetimeGMT>
<period_status>O</period_status>
<period_update>open</period_update>
<spread_maximum>500</spread_maximum>
<moneyline_maximum>2000</moneyline_maximum>
<total_maximum>500</total_maximum>
<spread>
<spread_visiting>-5.5</spread_visiting>
<spread_adjust_visiting>-124</spread_adjust_visiting>
<spread_home>5.5</spread_home>
<spread_adjust_home>106</spread_adjust_home>
</spread>
<total>
<total_points>19.5</total_points>
<over_adjust>113</over_adjust>
<under_adjust>-132</under_adjust>
</total>
</period>
<period>
<period_number>1</period_number>
<period_description>1st Set</period_description>
<periodcutoff_datetimeGMT>2010-07-22 20:05</periodcutoff_datetimeGMT>
<period_status>I</period_status>
<period_update>open</period_update>
<spread_maximum>5000</spread_maximum>
<moneyline_maximum>250</moneyline_maximum>
<total_maximum>5000</total_maximum>
<moneyline>
<moneyline_visiting>-675</moneyline_visiting>
<moneyline_home>497</moneyline_home>
</moneyline>
</period>
</periods>
</event>
<event>
<event_datetimeGMT>2010-07-22 20:05</event_datetimeGMT>
<gamenumber>174263209</gamenumber>
<sporttype>Tennis</sporttype>
<league>M Atlanta 16</league>
<participants>
<participant>
<participant_name>I. Marchenko</participant_name>
<contestantnum>4335</contestantnum>
<rotnum>4335</rotnum>
<visiting_home_draw>Visiting</visiting_home_draw>
</participant>
<participant>
<participant_name>X. Malisse</participant_name>
<contestantnum>4336</contestantnum>
<rotnum>4336</rotnum>
<visiting_home_draw>Home</visiting_home_draw>
</participant>
</participants>
<periods>
<period>
<period_number>0</period_number>
<period_description>Game</period_description>
<periodcutoff_datetimeGMT>2010-07-22 20:05</periodcutoff_datetimeGMT>
<period_status>O</period_status>
<period_update>open</period_update>
<spread_maximum>500</spread_maximum>
<moneyline_maximum>2000</moneyline_maximum>
<total_maximum>500</total_maximum>
<moneyline>
<moneyline_visiting>139</moneyline_visiting>
<moneyline_home>-151</moneyline_home>
</moneyline>
</period>
<period>
<period_number>0</period_number>
<period_description>Game</period_description>
<periodcutoff_datetimeGMT>2010-07-22 20:05</periodcutoff_datetimeGMT>
<period_status>O</period_status>
<period_update>open</period_update>
<spread_maximum>500</spread_maximum>
<moneyline_maximum>2000</moneyline_maximum>
<total_maximum>500</total_maximum>
<spread>
<spread_visiting>2</spread_visiting>
<spread_adjust_visiting>100</spread_adjust_visiting>
<spread_home>-2</spread_home>
<spread_adjust_home>-117</spread_adjust_home>
</spread>
<total>
<total_points>22.5</total_points>
<over_adjust>-108</over_adjust>
<under_adjust>-108</under_adjust>
</total>
</period>
<period>
<period_number>1</period_number>
<period_description>1st Set</period_description>
<periodcutoff_datetimeGMT>2010-07-22 20:05</periodcutoff_datetimeGMT>
<period_status>O</period_status>
<period_update>open</period_update>
<spread_maximum>5000</spread_maximum>
<moneyline_maximum>500</moneyline_maximum>
<total_maximum>5000</total_maximum>
<moneyline>
<moneyline_visiting>115</moneyline_visiting>
<moneyline_home>-134</moneyline_home>
</moneyline>
</period>
</periods>
</event>
<event>
<event_datetimeGMT>2010-07-22 21:30</event_datetimeGMT>
<gamenumber>174271178</gamenumber>
<sporttype>Tennis</sporttype>
<league>M Atlanta 16</league>
<participants>
<participant>
<participant_name>K. Andersoõn</participant_name>
<contestantnum>4341</contestantnum>
<rotnum>4341</rotnum>
<visiting_home_draw>Visiting</visiting_home_draw>
</participant>
<participant>
<participant_name>D. Young</participant_name>
<contestantnum>4342</contestantnum>
<rotnum>4342</rotnum>
<visiting_home_draw>Home</visiting_home_draw>
</participant>
</participants>
<periods>
<period>
<period_number>0</period_number>
<period_description>Game</period_description>
<periodcutoff_datetimeGMT>2010-07-22 21:30</periodcutoff_datetimeGMT>
<period_status>O</period_status>
<period_update>open</period_update>
<spread_maximum>500</spread_maximum>
<moneyline_maximum>2000</moneyline_maximum>
<total_maximum>500</total_maximum>
<moneyline>
<moneyline_visiting>-148</moneyline_visiting>
<moneyline_home>136</moneyline_home>
</moneyline>
</period>
<period>
<period_number>0</period_number>
<period_description>Game</period_description>
<periodcutoff_datetimeGMT>2010-07-22 21:30</periodcutoff_datetimeGMT>
<period_status>O</period_status>
<period_update>open</period_update>
<spread_maximum>500</spread_maximum>
<moneyline_maximum>2000</moneyline_maximum>
<total_maximum>500</total_maximum>
<spread>
<spread_visiting>-2</spread_visiting>
<spread_adjust_visiting>-121</spread_adjust_visiting>
<spread_home>2</spread_home>
<spread_adjust_home>104</spread_adjust_home>
</spread>
<total>
<total_points>22.5</total_points>
<over_adjust>-111</over_adjust>
<under_adjust>-105</under_adjust>
</total>
</period>
<period>
<period_number>1</period_number>
<period_description>1st Set</period_description>
<periodcutoff_datetimeGMT>2010-07-22 21:30</periodcutoff_datetimeGMT>
<period_status>O</period_status>
<period_update>open</period_update>
<spread_maximum>5000</spread_maximum>
<moneyline_maximum>500</moneyline_maximum>
<total_maximum>5000</total_maximum>
<moneyline>
<moneyline_visiting>-126</moneyline_visiting>
<moneyline_home>108</moneyline_home>
</moneyline>
</period>
</periods>
</event>
...
...
</events>
</_line_feed>
XSD schema file used:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="_line_feed">
<xs:complexType>
<xs:sequence>
<xs:element name="FeedTime" type="xs:unsignedLong" />
<xs:element name="lastContest" type="xs:unsignedInt" />
<xs:element name="lastGame" type="xs:unsignedInt" />
<xs:element name="events">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="event">
<xs:complexType>
<xs:sequence>
<xs:element name="event_datetimeGMT" type="xs:string" />
<xs:element name="gamenumber" type="xs:unsignedInt" />
<xs:element name="sporttype" type="xs:string" />
<xs:element name="league" type="xs:string" />
<xs:element minOccurs="0" name="contest_maximum" type="xs:unsignedShort" />
<xs:element minOccurs="0" name="description" type="xs:string" />
<xs:element name="participants">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="participant">
<xs:complexType>
<xs:sequence>
<xs:element name="participant_name" type="xs:string" />
<xs:element name="contestantnum" type="xs:unsignedInt" />
<xs:element name="rotnum" type="xs:unsignedShort" />
<xs:element minOccurs="0" name="odds">
<xs:complexType>
<xs:sequence>
<xs:element name="moneyline_value" type="xs:string" />
<xs:element name="to_base" type="xs:string" />
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element minOccurs="0" name="visiting_home_draw" type="xs:string" />
<xs:element minOccurs="0" name="pitcher" type="xs:string" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element minOccurs="0" name="total">
<xs:complexType>
<xs:sequence>
<xs:element name="total_points" type="xs:decimal" />
<xs:element name="units" type="xs:string" />
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element minOccurs="0" name="periods">
<xs:complexType>
<xs:sequence minOccurs="0">
<xs:element maxOccurs="unbounded" name="period">
<xs:complexType>
<xs:sequence>
<xs:element name="period_number" type="xs:unsignedByte" />
<xs:element name="period_description" type="xs:string" />
<xs:element name="periodcutoff_datetimeGMT" type="xs:string" />
<xs:element name="period_status" type="xs:string" />
<xs:element name="period_update" type="xs:string" />
<xs:element name="spread_maximum" type="xs:unsignedShort" />
<xs:element name="moneyline_maximum" type="xs:unsignedShort" />
<xs:element name="total_maximum" type="xs:unsignedShort" />
<xs:element minOccurs="0" name="moneyline">
<xs:complexType>
<xs:sequence>
<xs:element name="moneyline_visiting" type="xs:short" />
<xs:element name="moneyline_home" type="xs:short" />
<xs:element minOccurs="0" name="moneyline_draw" type="xs:unsignedShort" />
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element minOccurs="0" name="spread">
<xs:complexType>
<xs:sequence>
<xs:element name="spread_visiting" type="xs:decimal" />
<xs:element name="spread_adjust_visiting" type="xs:short" />
<xs:element name="spread_home" type="xs:decimal" />
<xs:element name="spread_adjust_home" type="xs:short" />
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element minOccurs="0" name="total">
<xs:complexType>
<xs:sequence>
<xs:element name="total_points" type="xs:decimal" />
<xs:element name="over_adjust" type="xs:short" />
<xs:element name="under_adjust" type="xs:short" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Awaiting your kind response. Also let me know if need some other details to understand the scenario.
Thanks, Prashant
回答1:
Have you tried loading the raw XML into a XML data type in a database table and then converting the data into database tables via stored procedures? I generally prefer this method so that I can preserve the original XML code in case something changes in the schema that I am unaware of at the time of the import such as the source system adding new nodes to the file. If you go with this approach, then you could use sp_xml_preparedocument and OPENXML to convert the data. Please note OPENXML supports DTD to infer the output data types.
回答2:
In SSRS, this error can happen if there's a memory shortage. Don't know whether that goes for SSIS as well. (But you mention the large file explicitly, so perhaps it's relevant. You could try with a smaller file).
This link indicates you'll have to make your own workaround. Perhaps with some preprocessing of the XML file?
来源:https://stackoverflow.com/questions/3350514/load-xml-by-ssis-dtd-is-prohibited-in-this-xml-document-error