How can I tell that a file is SVG without using a magic number?

伪装坚强ぢ 2021-02-20 08:22

An SVG file is basically an XML file, so I could use the string "<?xml" (or its hex representation, '3c 3f 78 6d 6c') as a

2 Answers
  •  梦谈多话
    2021-02-20 09:01

    XML is not required to start with the "<?xml" preamble, so testing for that prefix is not a good detection technique, not to mention that it would identify every XML document as SVG. A decent detection method, and one that is really easy to implement, is to use a real XML parser to test that the file parses as XML and that its top-level element is svg:

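    # cElementTree was removed in Python 3.9; since Python 3.3 plain
    # ElementTree uses the C accelerator automatically.
    import xml.etree.ElementTree as et
    
    def is_svg(filename):
        """Return True if the file's root element is <svg> in the SVG namespace."""
        tag = None
        # Binary mode lets the parser honor the document's own encoding
        # declaration and avoids UnicodeDecodeError on non-text files.
        with open(filename, "rb") as f:
            try:
                # The first 'start' event is the root element, so stop there.
                for event, el in et.iterparse(f, ('start',)):
                    tag = el.tag
                    break
            except et.ParseError:
                # Not parseable as XML, so not SVG.
                pass
        return tag == '{http://www.w3.org/2000/svg}svg'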
    

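    Using ElementTree's C implementation, which is built on expat, keeps the detection efficient: timeit shows that an SVG file was detected as such in ~200μs, and a non-SVG file in ~35μs. (On older Python versions the C implementation was imported explicitly as xml.etree.cElementTree; since Python 3.3 plain xml.etree.ElementTree uses it automatically, and cElementTree was removed in Python 3.9.) The iterparse API lets the parser forgo building the whole element tree (the module name notwithstanding) and read only the initial portion of the document, regardless of total file size.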
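
    To make the comparison between the prefix test and the parser test concrete, here is a minimal sketch that runs both on a few in-memory samples. The names has_xml_preamble, root_is_svg and SAMPLES are hypothetical and introduced only for illustration; the sketch works on bytes rather than a filename so that it is self-contained, and it is not meant as a definitive implementation.

```python
import io
import xml.etree.ElementTree as ET

# Namespace-qualified tag that ElementTree reports for an SVG root element.
SVG_ROOT = "{http://www.w3.org/2000/svg}svg"


def has_xml_preamble(data: bytes) -> bool:
    """Naive magic-number style test: does the data start with '<?xml'?"""
    return data.startswith(b"<?xml")


def root_is_svg(data: bytes) -> bool:
    """Parser-based test: is the first element an <svg> in the SVG namespace?"""
    try:
        for _event, element in ET.iterparse(io.BytesIO(data), events=("start",)):
            # The first 'start' event is the root element; decide immediately.
            return element.tag == SVG_ROOT
    except ET.ParseError:
        pass  # not parseable as XML
    return False


# Hypothetical sample inputs (assumptions for this sketch, not from the question).
SAMPLES = {
    "svg_no_preamble": b'<svg xmlns="http://www.w3.org/2000/svg" width="1" height="1"/>',
    "svg_with_preamble": b'<?xml version="1.0" encoding="UTF-8"?>\n'
                         b'<svg xmlns="http://www.w3.org/2000/svg"/>',
    "plain_xml": b'<?xml version="1.0"?>\n<note><to>reader</to></note>',
    "not_xml": b"\x89PNG\r\n\x1a\n not XML at all",
}

if __name__ == "__main__":
    for name, data in SAMPLES.items():
        print(f"{name:18} prefix={has_xml_preamble(data)!s:5} parser={root_is_svg(data)}")
```

    The prefix test misses the SVG that has no preamble and accepts the non-SVG XML document, while the parser test classifies all four samples as expected. Note that an svg element without the SVG namespace declaration is rejected by this check, just as it is by is_svg above.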
