Content model ambiguity in a schema

Submitted by 本秂侑毒 on 2019-12-20 03:21:43

Question


Maybe I've been staring at this problem for too long, maybe there isn't an answer; either way I'm here now.

I'm trying to permit a set of possible combinations in an XSD, but I can't seem to find an approach that doesn't result in ambiguity.

Quick regexy representation:

foo+ ( bar baz* | bar? baz+ qux* )
  • foo is required (one-or-more)
  • If bar exists, baz is optional (zero-or-more)
  • If baz exists, bar is optional (zero-or-one) and qux is optional (zero-or-more)
  • qux cannot exist if baz does not exist

Ambiguity arises given foo bar baz.
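To make that concrete, here is a minimal sketch that encodes each child element as one letter (b = bar, z = baz, q = qux; this encoding and the helper name `matching_branches` are my own assumptions, not part of the schema) and reports which branch of the choice accepts a given sequence. The shared foo+ prefix is left out because it plays no part in the conflict.

```python
import re

# One letter per element: b = bar, z = baz, q = qux.
# The leading foo+ is common to both branches, so it is omitted here.
BRANCH_1 = re.compile(r"bz*")     # bar baz*
BRANCH_2 = re.compile(r"b?z+q*")  # bar? baz+ qux*

ENCODING = {"bar": "b", "baz": "z", "qux": "q"}


def matching_branches(children):
    """Return the labels of every branch that accepts the whole child sequence."""
    encoded = "".join(ENCODING[name] for name in children)
    branches = (("branch 1", BRANCH_1), ("branch 2", BRANCH_2))
    return [label for label, pattern in branches if pattern.fullmatch(encoded)]


# Both branches accept "bar baz", which is the ambiguity described above.
print(matching_branches(["bar", "baz"]))
```

Note that a schema validator is stricter than this check suggests: as I understand the Unique Particle Attribution rule, the two bar particles already conflict as soon as bar is read, whether or not both branches would go on to accept the rest of the sequence.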

Ambiguous XSD document:

<xs:element name="parent">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="foo" minOccurs="1" maxOccurs="unbounded" />
            <xs:choice>
                <xs:sequence>
                    <xs:element name="bar" minOccurs="1" maxOccurs="1" />
                    <xs:element name="baz" minOccurs="0" maxOccurs="unbounded" />
                </xs:sequence>
                <xs:sequence>
                    <xs:element name="bar" minOccurs="0" maxOccurs="1" />
                    <xs:element name="baz" minOccurs="1" maxOccurs="unbounded" />
                    <xs:element name="qux" minOccurs="0" maxOccurs="unbounded" />
                </xs:sequence>
            </xs:choice>
        </xs:sequence>
    </xs:complexType>
</xs:element>

Screen capture for good measure: (image not preserved)


Now, I'm beginning to realize that perhaps this is simply a constraint of the XSD content model. The reason for ambiguity is obvious; the solution not so.

Can anyone see a way to permit this, whether by re-ordering the elements or through some schema design pattern that avoids ambiguous scenarios like this?

The conditional dependency between bar and baz is clearly the problem, but I can't think of any other way to do this.

Thanks so much in advance folks.


Edit: Currently reading "Schema Component Constraint: Unique Particle Attribution" in an attempt to find a loop-hole. Any other suggested reading welcome.


Answer 1:


IIRC there is a theorem in computer science that says every finite state machine can be made deterministic, and a content model like this one is just a regular expression, so start with the hypothesis that an unambiguous rewrite is possible. However, the unambiguous grammar can sometimes be hideously complex.

I think a good approach to handling this is to draw the "railroad diagram" of the grammar, that is, the finite state machine with its transitions. Then when you find a state in this machine that has two transitions labelled with the same symbol, you need to construct a new state that accepts both those transitions, and so on. In the CS literature this algorithm is called "determinization".
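As a rough illustration of that procedure, the sketch below hand-encodes a nondeterministic machine for foo+ ( bar baz* | bar? baz+ qux* ) and determinizes it by merging the targets of same-symbol transitions into set-valued states. The state names (s0, s1, a1, b1, b2, b3) and the function names are assumptions made up for this example; the machine is my own reading of the content model, not something taken from the question.

```python
from collections import deque

ALPHABET = ("foo", "bar", "baz", "qux")

# Hypothetical NFA for: foo+ ( bar baz* | bar? baz+ qux* )
# a1 = inside branch 1; b1, b2, b3 = inside branch 2.
NFA = {
    ("s0", "foo"): {"s1"},
    ("s1", "foo"): {"s1"},
    ("s1", "bar"): {"a1", "b1"},  # two transitions on the same symbol
    ("s1", "baz"): {"b2"},
    ("a1", "baz"): {"a1"},
    ("b1", "baz"): {"b2"},
    ("b2", "baz"): {"b2"},
    ("b2", "qux"): {"b3"},
    ("b3", "qux"): {"b3"},
}
NFA_ACCEPTING = {"a1", "b2", "b3"}


def determinize(nfa, start, accepting):
    """Build a DFA whose states are frozensets of NFA states."""
    start_set = frozenset([start])
    dfa, seen, queue = {}, {start_set}, deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in ALPHABET:
            # Merge every NFA target reachable on this symbol into one state.
            target = frozenset(
                t for s in current for t in nfa.get((s, symbol), ())
            )
            if not target:
                continue
            dfa[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    dfa_accepting = {state for state in seen if state & accepting}
    return start_set, dfa, dfa_accepting, seen


def accepts(start, dfa, dfa_accepting, tokens):
    """Run the DFA over a list of element names."""
    state = start
    for token in tokens:
        state = dfa.get((state, token))
        if state is None:
            return False
    return state in dfa_accepting


START, DFA, DFA_ACCEPTING, DFA_STATES = determinize(NFA, "s0", NFA_ACCEPTING)
print(accepts(START, DFA, DFA_ACCEPTING, ["foo", "bar", "baz"]))
```

In the resulting machine the two bar transitions out of s1 collapse into a single transition to the combined state {a1, b1}, which is the merging step described above. Turning that machine back into a content model is a separate step that this sketch does not attempt.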

Another approach which is perhaps easier to explain without a whiteboard is to start by factoring out what is common between the two branches of your choice. When you hit the first element in the content, it has to be either a bar or a baz. So write two choices, one starting with bar and one with baz.

As far as I can see, your content model is equivalent to the unambiguous model

(bar, (baz+, qux*)?) | (baz+, qux*)

but I would check that carefully...
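One way to do that check is by brute force. The sketch below encodes bar, baz and qux as the letters b, z and q (an encoding assumed for this example), compares the original and rewritten models on every sequence up to a fixed length, and also carries one possible XSD rendering of the rewritten model as a string. The XSD is my own transcription, parsed here only for well-formedness; it has not been run through a schema validator.

```python
import itertools
import re
import xml.etree.ElementTree as ET

# b = bar, z = baz, q = qux; the shared foo+ prefix is omitted.
ORIGINAL = re.compile(r"bz*|b?z+q*")        # bar baz* | bar? baz+ qux*
REWRITTEN = re.compile(r"b(z+q*)?|z+q*")    # (bar, (baz+, qux*)?) | (baz+, qux*)


def find_mismatches(max_len=6):
    """List every sequence up to max_len on which the two models disagree."""
    mismatches = []
    for length in range(max_len + 1):
        for combo in itertools.product("bzq", repeat=length):
            text = "".join(combo)
            if bool(ORIGINAL.fullmatch(text)) != bool(REWRITTEN.fullmatch(text)):
                mismatches.append(text)
    return mismatches


# Assumed XSD rendering of the rewritten model (not validator-tested).
REWRITTEN_XSD = """\
<xs:element name="parent" xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="foo" minOccurs="1" maxOccurs="unbounded" />
            <xs:choice>
                <xs:sequence>
                    <xs:element name="bar" />
                    <xs:sequence minOccurs="0">
                        <xs:element name="baz" maxOccurs="unbounded" />
                        <xs:element name="qux" minOccurs="0" maxOccurs="unbounded" />
                    </xs:sequence>
                </xs:sequence>
                <xs:sequence>
                    <xs:element name="baz" maxOccurs="unbounded" />
                    <xs:element name="qux" minOccurs="0" maxOccurs="unbounded" />
                </xs:sequence>
            </xs:choice>
        </xs:sequence>
    </xs:complexType>
</xs:element>
"""

print(find_mismatches())                  # sequences where the models differ
print(ET.fromstring(REWRITTEN_XSD).tag)   # parses only if the XML is well-formed
```

An exhaustive comparison up to a bounded length is evidence rather than proof, but for a model this small it covers every path through both expressions. In the XSD string, the two choice branches begin with different elements (bar and baz), which is the factoring the answer describes.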



Source: https://stackoverflow.com/questions/10594643/content-model-ambiguity-in-a-schema
