XSD: XML files pass validation, but so do XSLs and XSDs

Submitted by ぃ、小莉子 on 2019-12-29 02:02:07

Question


We have an XSD schema that validates a data file. It declares the root element of the document, followed by the complexTypes that describe its structure. The schema has an empty target namespace, so document nodes are not supposed to be namespace-qualified.

Recently someone mistakenly sent an XSL template in place of an XML data file. The XSL passed validation without any problem and was therefore passed on to the XSLT processor. The result was essentially the free-form text found in the validated XSL.

We then sent all sorts of XML documents to the validator (like, various XSD schemas and XSL templates), and they all passed validation.

We tried different validation methods (XPathNavigator.CheckValidity and XmlDocument.Validate), with no difference.

What is happening here? Is our validation schema happy to pass any document whose root node is in a namespace other than the one the schema describes? How do we prevent that?

EDIT

Validation code (version 1):

Dim data As XPathDocument
....
If Not data.CreateNavigator.CheckValidity(ValidationSchemaSet, AddressOf vh.ValidationHandler) Then
    result = "Validation failed." & ControlChars.NewLine & String.Join(ControlChars.NewLine, vh.Messages.ToArray)
    Return False
End If

where vh is:

Private Class VHandler
    Public Messages As New List(Of String)

    Public Sub ValidationHandler(ByVal sender As Object, ByVal e As ValidationEventArgs)
        If e.Severity = XmlSeverityType.Error Then
            Messages.Add(e.Message)
        End If
    End Sub
End Class

XSD schema:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:include schemaLocation="CarrierLabel_Type_1.xsd" />
  <xs:include schemaLocation="CarrierLabel_Type_2.xsd" />
  <xs:include schemaLocation="CarrierLabel_Type_3.xsd" />

  <!-- Schema definition -->
  <xs:element name="PrintJob" type="printJobType" />


  <!-- Types declaration -->
  <xs:simpleType name="nonEmptyString">
    <xs:restriction base="xs:string">
      <xs:minLength value="1"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:complexType name="printJobType">
    <xs:sequence minOccurs="1" maxOccurs="unbounded">
      <xs:choice>
        <xs:element name="CarrierLabel_type_1" type="CarrierLabel_type_1" />
        <xs:element name="CarrierLabel_type_2" type="CarrierLabel_type_2" />
        <xs:element name="CarrierLabel_type_3" type="CarrierLabel_type_3" />
      </xs:choice>
    </xs:sequence>

    <xs:attribute name="printer" type="nonEmptyString" use="required" />
    <xs:attribute name="res" type="xs:positiveInteger" use="required" />
  </xs:complexType>

</xs:schema>

Should (and will) pass:

<?xml version='1.0' encoding='utf-8'?>
<PrintJob printer="printer_1" res="200">
  <CarrierLabel_type_1>
    <print_job_id>123456</print_job_id>
    <notes></notes>
    <labels_count>1</labels_count>
    <cases_indicator>2xCASE</cases_indicator>
  </CarrierLabel_type_1>
  <CarrierLabel_type_2>
    <next_location>Go there now!</next_location>
  </CarrierLabel_type_2>
</PrintJob>

Should not pass, but WILL PASS AS VALID DATA:

<?xml version='1.0' encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="WrongLabel">
    <xsl:param name="context"/>
    <xsl:param name="res"/>
    WRONG LABEL
  </xsl:template>

</xsl:stylesheet>

Answer 1:


XML schemas really validate elements within a namespace, not documents. There's no XML Schema rule that says the top-level element of the instance document must be in a specific namespace. This fits the general idea that a namespace is its own little world, and it prevents me from writing a schema in my namespace that will invalidate documents in yours. If an element's not in my namespace, it's none of my business.

This means that when validating instance documents, you have to check that the top-level element of the document you're validating is in a namespace that your application accepts. In your application, that means no namespace at all, since the schema has no target namespace.
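As a minimal sketch of that pre-check, assuming the document has been loaded into an XmlDocument: the element name PrintJob and the empty namespace are taken from the schema in the question, while the module and function names are hypothetical.

```vb
Imports System.Xml

Module RootCheck
    ' Returns True only when the document element is an unqualified <PrintJob>.
    ' "PrintJob" and the empty namespace come from the schema in the question;
    ' HasExpectedRoot is a hypothetical helper name.
    Function HasExpectedRoot(ByVal doc As XmlDocument) As Boolean
        Dim root As XmlElement = doc.DocumentElement
        If root Is Nothing Then Return False

        ' NamespaceURI is an empty string for elements in no namespace.
        Return root.LocalName = "PrintJob" AndAlso root.NamespaceURI = String.Empty
    End Function
End Module
```

For the stylesheet in the question, the root element has the local name stylesheet and the namespace http://www.w3.org/1999/XSL/Transform, so a check like this would reject it before schema validation even runs.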




Answer 2:


Without having seen any code, my guess is that your validation sets the ValidationType on the XmlReaderSettings object, but you are either not wiring up the ValidationEventHandler to catch validation errors or not doing anything with the validation events it raises.

Even with XmlDocument.Validate, you need to wire up this ValidationEventHandler.

See MSDN here.
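A sketch of the wiring this answer refers to, under a few assumptions: the schema set is already compiled, the data comes from a file path, and the names ReaderValidation, MessageCollector and CollectValidationMessages are hypothetical. It also turns on XmlSchemaValidationFlags.ReportValidationWarnings and records warnings as well as errors. As far as I know, an element with no matching declaration is reported only as a warning, which the handler in the question discards.

```vb
Imports System.Collections.Generic
Imports System.Xml
Imports System.Xml.Schema

Module ReaderValidation
    ' Records every validation event, warnings included (hypothetical helper).
    Private Class MessageCollector
        Public Messages As New List(Of String)

        Public Sub OnValidation(ByVal sender As Object, ByVal e As ValidationEventArgs)
            Messages.Add(e.Severity.ToString() & ": " & e.Message)
        End Sub
    End Class

    ' Validates the file at path against schemas and returns all messages raised.
    Function CollectValidationMessages(ByVal path As String, ByVal schemas As XmlSchemaSet) As List(Of String)
        Dim collector As New MessageCollector()

        Dim settings As New XmlReaderSettings()
        settings.ValidationType = ValidationType.Schema
        settings.Schemas = schemas
        ' Ask the validating reader to report warnings in addition to errors.
        settings.ValidationFlags = settings.ValidationFlags Or XmlSchemaValidationFlags.ReportValidationWarnings
        AddHandler settings.ValidationEventHandler, AddressOf collector.OnValidation

        Using reader As XmlReader = XmlReader.Create(path, settings)
            While reader.Read()
                ' Reading drives validation; events fire as nodes are consumed.
            End While
        End Using

        Return collector.Messages
    End Function
End Module
```

An empty result from a helper like this is a stronger signal than an empty error list alone, because warnings are no longer dropped.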




Answer 3:


My understanding is that XML Schema (XSD) does not give any way of requiring that the root node of a document be a certain element. The only way to do that is to restrict the elements defined at "global level" to just one element. Is it possible that your validation code is importing the schema for XSLT, so that an XSLT document validates because the XSLT elements have been defined at global level?




Answer 4:


Right.

It turns out that validation has three possible results, not two: valid, invalid and unknown. So the Boolean return value of the CheckValidity function is somewhat surprising.

If the root node of the document is not described by the schema, the document passes validation without errors and no validation error events occur, but the root node receives the "unknown" status (XmlSchemaValidity.NotKnown). For our purposes, this is a failure. So we also need to check the XmlNode.SchemaInfo.Validity member of the root node.

I wish the documentation for the Validate() method were a bit clearer on that.
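The extra check described in this answer might look roughly like this. It is a sketch that assumes the XmlDocument.Validate route rather than XPathNavigator.CheckValidity; the names StrictValidation, ErrorCounter and IsStrictlyValid are hypothetical.

```vb
Imports System.Xml
Imports System.Xml.Schema

Module StrictValidation
    ' Counts validation errors, in the spirit of the VHandler class from the question.
    Private Class ErrorCounter
        Public Count As Integer = 0

        Public Sub OnValidation(ByVal sender As Object, ByVal e As ValidationEventArgs)
            If e.Severity = XmlSeverityType.Error Then Count += 1
        End Sub
    End Class

    ' Returns True only when validation raised no errors AND the root element
    ' was actually matched against a declaration in the schema set.
    Function IsStrictlyValid(ByVal doc As XmlDocument, ByVal schemas As XmlSchemaSet) As Boolean
        Dim counter As New ErrorCounter()
        doc.Schemas = schemas
        doc.Validate(AddressOf counter.OnValidation)

        If counter.Count > 0 Then Return False

        ' Treat NotKnown (no declaration found for the root) as a failure.
        Return doc.DocumentElement.SchemaInfo.Validity = XmlSchemaValidity.Valid
    End Function
End Module
```

Comparing against XmlSchemaValidity.Valid, rather than checking for Invalid, is what turns the "unknown" outcome into a rejection.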



Source: https://stackoverflow.com/questions/2110049/xsd-xml-files-pass-validation-but-so-do-xsls-and-xsds
