How to check if string is a valid XML element name?

失恋的感觉 2020-11-27 06:56

I need a regex or a function in PHP that checks whether a string is a valid XML element name.

From w3schools:

XML elements must follow these

9 answers
  • If you are using the .NET Framework, try XmlConvert.VerifyName. It returns the name unchanged if it is valid and throws an XmlException otherwise. Alternatively, use XmlConvert.EncodeName to convert an invalid name into a valid one by escaping the offending characters.

  • 2020-11-27 07:22

    If you want to create valid XML, use the DOM extension. That way you don't have to bother with a regex at all: if you pass an invalid name to the DOMElement constructor, it throws a DOMException.

    function isValidXmlName($name)
    {
        try {
            new DOMElement($name);
            return TRUE;
        } catch(DOMException $e) {
            return FALSE;
        }
    }
    

    This will give

    var_dump( isValidXmlName('foo') );      // true   valid localName
    var_dump( isValidXmlName(':foo') );     // true   valid localName
    var_dump( isValidXmlName(':b:c') );     // true   valid localName
    var_dump( isValidXmlName('b:c') );      // false  assumes QName
    

    and is likely good enough for what you want to do.

    Pedantic note 1

    Note the distinction between localName and QName. ext/dom assumes you are using a namespaced element if there is a prefix before the colon, which adds constraints on how the name may be formed. Technically, b:c is still a valid name under the XML 1.0 Name production, because the colon is a NameStartChar and every NameStartChar is also a NameChar. If you want to accept such names, change the function to

    function isValidXmlName($name)
    {
        try {
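            new DOMElement(
                $name,
                null,
                // A prefix before the colon needs a namespace URI (placeholder here);
                // '' means "no namespace" and avoids passing null to a string parameter.
                strpos($name, ':') >= 1 ? 'http://example.com' : ''
            );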
            return TRUE;
        } catch(DOMException $e) {
            return FALSE;
        }
    }
    

    Pedantic note 2

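    Note that element names may start with "xml". The XML specification only reserves such names for standardization; it does not make them ill-formed. W3Schools (which is not affiliated with the W3C) apparently got this part wrong (wouldn't be the first time). If you really want to exclude names starting with xml, add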

    if(stripos($name, 'xml') === 0) return false;
    

    before the try/catch.
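
    For illustration, here is a minimal sketch of how the two notes might fit together in one function. The name isValidXmlElementName and the $rejectXmlPrefix flag are hypothetical, chosen only for this example, and http://example.com is the same placeholder namespace used above.

    ```php
    <?php
    // Hypothetical helper combining both pedantic notes; the function name
    // and the $rejectXmlPrefix flag are illustrative, not part of ext/dom.
    function isValidXmlElementName($name, $rejectXmlPrefix = false)
    {
        // Optional stricter rule: refuse names starting with "xml" in any case.
        if ($rejectXmlPrefix && stripos($name, 'xml') === 0) {
            return false;
        }
        try {
            // Supply a placeholder namespace only when a prefix precedes the
            // colon; '' means "no namespace".
            new DOMElement(
                $name,
                null,
                strpos($name, ':') >= 1 ? 'http://example.com' : ''
            );
            return true;
        } catch (DOMException $e) {
            return false;
        }
    }

    var_dump(isValidXmlElementName('xmlfoo', true)); // bool(false), stopped by the prefix check
    ```

    Keeping the prefix check behind a flag leaves the default behaviour aligned with what the DOM extension itself accepts.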

  • 2020-11-27 07:23

    How about

    /\A(?!XML)[a-z][\w-]*\z/i
    

    Usage:

    if (preg_match('/\A(?!XML)[a-z][\w-]*\z/i', $subject)) {
        # valid name
    } else {
        # invalid name
    }
    

    Explanation:

    \A  Beginning of the string
    (?!XML)  Negative lookahead (assert that the name does not start with "XML")
    [a-z]  Match one ASCII letter (no digit or punctuation at the start)
    [\w-]*  Match any number of letters, digits, underscores, or hyphens
    \z  End of the string, so trailing invalid characters are rejected
    /i  Make the whole thing case-insensitive
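
    The pattern above only covers ASCII names. If you want something closer to the Name production in the XML 1.0 specification (NameStartChar followed by any number of NameChar), a sketch could look like the following. The character ranges are transcribed from my reading of the Fifth Edition grammar, and the function name isXmlNameByGrammar is hypothetical. Note that this version accepts colons anywhere and does not reject names starting with "xml".

    ```php
    <?php
    // Sketch of the XML 1.0 (Fifth Edition) Name production as a PCRE pattern.
    // isXmlNameByGrammar is an illustrative name, not a library function.
    function isXmlNameByGrammar(string $name): bool
    {
        // NameStartChar: letters, "_", ":" and the listed Unicode ranges.
        $start = 'A-Z_a-z:\x{C0}-\x{D6}\x{D8}-\x{F6}\x{F8}-\x{2FF}'
               . '\x{370}-\x{37D}\x{37F}-\x{1FFF}\x{200C}-\x{200D}'
               . '\x{2070}-\x{218F}\x{2C00}-\x{2FEF}\x{3001}-\x{D7FF}'
               . '\x{F900}-\x{FDCF}\x{FDF0}-\x{FFFD}\x{10000}-\x{EFFFF}';

        // NameChar: everything in NameStartChar plus ".", digits, "-",
        // the middle dot and two ranges of combining characters.
        $rest = $start . '.0-9\x{B7}\x{300}-\x{36F}\x{203F}-\x{2040}\-';

        // The /u modifier treats pattern and subject as UTF-8; preg_match
        // returns false on invalid UTF-8, which also counts as "not a name".
        return preg_match('/\A[' . $start . '][' . $rest . ']*\z/u', $name) === 1;
    }

    var_dump(isXmlNameByGrammar('foo'));     // bool(true)
    var_dump(isXmlNameByGrammar('b:c'));     // bool(true)
    var_dump(isXmlNameByGrammar('1foo'));    // bool(false)
    var_dump(isXmlNameByGrammar('foo bar')); // bool(false)
    ```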
    