Can an XSD schema validate encoding, e.g. UTF-8?

六眼飞鱼酱① submitted on 2019-12-31 02:39:11

Question


Using a schema, is there any simple way to validate the encoding of an XML message?

Assume the first line of the XML is not trustworthy, i.e. ignore <?xml version="1.0" encoding="UTF-8" ?>.


Answer 1:


No. A schema can't dictate the encoding, except through the binary data element types, and even that encoding is still encapsulated by the higher-level encoding of the document itself. This makes sense once you realize that a schema is supposed to describe the information, not the transport format. The XML specification is what dictates document transport details, such as how information is represented and stored in the most general sense. A schema is for validating that the data stored as XML meets the syntactic constraints agreed between parties. The prolog (the first line you mention), as defined in the XML spec, is the construct whose presence (or absence) tells a compliant XML reader how the document is encoded. Encoding is simply the agreement between the endpoints on how to represent the correct Unicode code points, and it is the XML specification, not the schema, that specifies how this agreement is reached.
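Since the schema cannot see the raw bytes, any encoding check has to happen before or alongside parsing. The following is only a minimal sketch of one such check, assuming the message is available as raw bytes; the helper name `decodes_as` is hypothetical and not part of any XML library.

```python
def decodes_as(data: bytes, encoding: str = "utf-8") -> bool:
    """Return True if `data` is a valid byte sequence in `encoding`.

    Hypothetical helper: it only tests byte-level validity and does not
    parse or schema-validate the XML in any way.
    """
    try:
        data.decode(encoding, errors="strict")
    except (UnicodeDecodeError, LookupError):
        # UnicodeDecodeError: invalid bytes; LookupError: unknown codec name.
        return False
    return True


utf8_message = "<a>é</a>".encode("utf-8")
latin1_message = "<a>é</a>".encode("latin-1")

print(decodes_as(utf8_message))    # valid UTF-8
print(decodes_as(latin1_message))  # a lone 0xE9 byte is not valid UTF-8
```

A check like this can only reject byte sequences that are invalid in the expected encoding. It cannot prove intent: pure ASCII content is valid UTF-8, and single-byte encodings such as Latin-1 accept every possible byte sequence.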

If you are interested, this is the relevant section of the XML 1.1 specification on how this agreement is reached and, more interestingly, how a compliant reader can 'guess' the encoding well enough to read the prolog and obtain the actual encoding attribute: http://www.w3.org/TR/xml11/#sec-guessing
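To illustrate the kind of guessing that section describes, here is a simplified sketch. It handles only byte order marks, the UTF-16 layout of `<?` without a BOM, and an ASCII-compatible declaration; it omits other cases such as BOM-less UTF-32 and EBCDIC. The function name `sniff_xml_encoding` is an assumption made for this example.

```python
import codecs
import re

# Byte order marks, with UTF-32 checked before UTF-16 because the
# UTF-32LE mark (FF FE 00 00) begins with the UTF-16LE mark (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
]

# Encoding pseudo-attribute of an XML declaration in an ASCII-compatible encoding.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML document from its first bytes (simplified)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # No BOM: the layout of "<?" reveals BOM-less UTF-16.
    if data.startswith(b"\x00<\x00?"):
        return "UTF-16BE"
    if data.startswith(b"<\x00?\x00"):
        return "UTF-16LE"
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii").upper()
    # No BOM and no declaration: fall back to UTF-8.
    return "UTF-8"


print(sniff_xml_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'))
print(sniff_xml_encoding(codecs.BOM_UTF8 + b"<a/>"))
```

Note that when there is no BOM the result still depends on what the declaration claims, which is exactly the part the question treats as untrustworthy. A sniffed value is therefore best treated as a hint to compare against the actual bytes, not as proof.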



Source: https://stackoverflow.com/questions/4412343/can-xsd-schema-validate-encoding-e-g-utf-8-possible
