How to create/write a simple XML parser from scratch?

你的背包 2021-02-01 15:44

Rather than code samples, I want to know the simplified, basic steps in plain English.

How is a good par

6 answers
  •  孤城傲影
    2021-02-01 16:39

    For an event-based parser, the user needs to pass in some callbacks (startNode(name, attrs), endNode(name) and someText(txt), most likely through an interface), and the parser calls them as needed while it passes over the file.
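    To make the callback idea concrete, here is a minimal sketch of what such an interface could look like. The method names come from the description above; the exact signatures and the RecordingHandler class are assumptions made for illustration only.

    ```d
    // Hypothetical interface: method names follow the answer,
    // the parameter types are an assumption.
    interface EventParser {
        void startNode(string name, string[string] attrs); // called for <name ...>
        void endNode(string name);                          // called for </name>
        void someText(string txt);                          // called for text between tags
    }

    // Illustrative handler (not part of the answer): records each event as a string.
    class RecordingHandler : EventParser {
        string[] events;
        void startNode(string name, string[string] attrs) { events ~= "start " ~ name; }
        void endNode(string name) { events ~= "end " ~ name; }
        void someText(string txt) { events ~= "text " ~ txt; }
    }

    void main() {
        auto h = new RecordingHandler();
        // A parser would issue these calls while reading <greeting lang="en">hello</greeting>.
        h.startNode("greeting", ["lang": "en"]);
        h.someText("hello");
        h.endNode("greeting");
        assert(h.events == ["start greeting", "text hello", "end greeting"]);
    }
    ```

    The handler never sees the raw file, only the sequence of events, which is what lets different consumers (a printer, a DOM builder, a validator) share one parser.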

    The parser has a while loop that alternates between reading up to the next < and reading up to the next >, converting what it reads into the parameter types the callbacks expect.

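    import std.array : split;
    import std.stdio : File;
    import std.string : chomp, chompPrefix, strip;

    void parse(EventParser p, File file){
        string str;
        while((str = file.readln('<')).length != 0){
            // not using a rewritable buffer, to take advantage of slicing,
            // but converting to an implementation with a rewritable buffer is quick
            string txt = str.chomp("<");
            if(txt.length > 0) p.someText(txt);

            str = file.readln('>');
            if(str.length == 0) break; // end of file: no tag follows the text
            str = str.chomp(">").strip();

            // detect </name> and <name/>, then drop the slash so it does not end up in the name
            bool closing = str.length > 0 && str[0] == '/';
            bool selfClosing = !closing && str.length > 0 && str[$-1] == '/';
            if(closing) str = str[1..$];
            if(selfClosing) str = str[0..$-1];

            // split str into name and attrs (naive: breaks if an attribute value contains whitespace)
            auto parts = str.split();
            if(parts.length == 0) continue; // empty tag such as <>: nothing to report
            string name = parts[0];
            string[string] attrs;
            foreach(attribute; parts[1..$]){
                auto splitAttr = attribute.split("=");
                if(splitAttr.length != 2) continue; // skip anything that is not name=value
                attrs[splitAttr[0]] = splitAttr[1].chompPrefix("\"").chomp("\""); // drop double quotes
            }

            if(closing) p.endNode(name);
            else {
                p.startNode(name, attrs);
                if(selfClosing) p.endNode(name); // self-closing tag
            }
        }
    }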
    

    You can build a DOM parser on top of an event-based parser. The basic functionality each node needs is getChildren, getParent, getName and getAttributes (with setters for use while building ;) ).
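    As a rough sketch of the node types this implies: the class names DOMNode, RootNode, ElementNode and TextNode are the ones used in this answer's code, but their fields, constructors and the "#root" / "#text" placeholder names are assumptions. Public fields stand in for the getters and setters to keep it short.

    ```d
    // Hypothetical node hierarchy; only the class names come from the answer.
    class DOMNode {
        DOMNode parent;          // getParent
        DOMNode[] children;      // getChildren
        string name;             // getName
        string[string] attrs;    // getAttributes

        this(DOMNode parent, string name, string[string] attrs = null) {
            this.parent = parent;
            this.name = name;
            this.attrs = attrs;
        }

        // Adds a child and makes sure its parent link points back here.
        void appendChild(DOMNode child) {
            child.parent = this;
            children ~= child;
        }
    }

    class RootNode : DOMNode {
        this() { super(null, "#root"); }
    }

    class ElementNode : DOMNode {
        this(DOMNode parent, string name, string[string] attrs) { super(parent, name, attrs); }
    }

    class TextNode : DOMNode {
        string text;
        this(string text) { super(null, "#text"); this.text = text; }
    }

    void main() {
        auto root = new RootNode();
        auto a = new ElementNode(root, "a", ["href": "x"]);
        root.appendChild(a);
        a.appendChild(new TextNode("hi"));
        assert(root.children.length == 1);
        assert(a.parent is root);
        assert(a.children[0].parent is a);
    }
    ```

    Keeping a parent link on every node is what makes it cheap to step back up the tree when a closing tag arrives.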

    The object for the DOM parser, using the methods described above:

    class DOMEventParser : EventParser{
        DOMNode root;    // kept so the tree can be retrieved after parsing
        DOMNode current; // node that is currently open
        this(){
            root = new RootNode();
            current = root;
        }
        override void startNode(string name, string[string] attrs){
            DOMNode tmp = new ElementNode(current, name, attrs);
            current.appendChild(tmp);
            current = tmp;
        }
        override void endNode(string name){
            assert(name == current.name); // closing tag must match the open element
            current = current.parent;
        }
        override void someText(string txt){
            current.appendChild(new TextNode(txt));
        }
    }
    

    When parsing ends, the RootNode holds the root element of the DOM tree as its child.

    Note: I didn't put any verification code in there to ensure the XML is correct.
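    One simple check that could be layered on top is tag balancing. The sketch below is an assumption about how that might look, not something from the code above: BalanceChecker is a hypothetical handler with the same three callbacks that keeps a stack of open tag names and flags any closing tag that does not match the most recently opened one.

    ```d
    // Hypothetical checker: tracks open tags on a stack and reports mismatches.
    class BalanceChecker {
        private string[] open;   // names of the elements that are currently open
        private bool ok = true;  // becomes false on the first mismatch

        void startNode(string name, string[string] attrs) { open ~= name; }

        void endNode(string name) {
            if (open.length == 0 || open[$ - 1] != name) { ok = false; return; }
            open = open[0 .. $ - 1]; // pop the matching tag
        }

        void someText(string txt) {} // text does not affect balancing

        // True only if every closing tag matched and nothing is left open.
        bool wellFormed() { return ok && open.length == 0; }
    }

    void main() {
        auto good = new BalanceChecker();
        good.startNode("a", null);
        good.startNode("b", null);
        good.endNode("b");
        good.endNode("a");
        assert(good.wellFormed());

        auto bad = new BalanceChecker();
        bad.startNode("a", null);
        bad.endNode("b"); // closes a tag that was never opened
        assert(!bad.wellFormed());
    }
    ```

    This only covers nesting; a real parser would also have to deal with comments, processing instructions, CDATA sections and entities.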

    Edit: the attribute parsing has a bug in it; instead of splitting on whitespace, a regex is a better fit for that.
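    A regex-based version could look roughly like this. parseAttrs is a hypothetical helper name, and the pattern is deliberately simple: it accepts only word-character attribute names (so no prefixes like xml:lang) and does not decode entities such as &amp;.

    ```d
    import std.regex : matchAll, regex;

    // Hypothetical helper: extracts name="value" or name='value' pairs from the
    // text that follows the tag name, so values may contain whitespace.
    string[string] parseAttrs(string s) {
        string[string] attrs;
        auto re = regex(`(\w+)\s*=\s*("[^"]*"|'[^']*')`);
        foreach (m; matchAll(s, re)) {
            attrs[m[1]] = m[2][1 .. $ - 1]; // drop the surrounding quotes
        }
        return attrs;
    }

    void main() {
        auto attrs = parseAttrs(`title="hello world" id='x1'`);
        assert(attrs["title"] == "hello world");
        assert(attrs["id"] == "x1");
        assert(attrs.length == 2);
    }
    ```

    Matching the quoted value as a whole is what fixes the whitespace problem: the space inside "hello world" never gets a chance to split the attribute.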
