How to create/write a simple XML parser from scratch?
Rather than code samples, I want to know what the simplified, basic steps are, in plain English.
How is a good par
The first thing in the document should be the XML declaration, which is the start of the prolog. It states the XML version, the encoding, and whether the file is standalone. It opens with <?xml and closes with ?>.
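As a rough illustration of reading that opening line, here is a minimal Python sketch. The names parse_declaration, DECL_RE and ATTR_RE are made up for this example, and it assumes the declaration, if present, sits at the very start of the text with no byte-order mark or leading whitespace.

```python
import re

# Hypothetical helper names; not part of any library.
# Matches "<?xml ... ?>" only at the start of the text.
DECL_RE = re.compile(r"<\?xml\s+(.*?)\?>", re.DOTALL)
# Matches name="value" or name='value' pairs inside the declaration.
ATTR_RE = re.compile(r"""(\w+)\s*=\s*(["'])(.*?)\2""")


def parse_declaration(text):
    """Return (attributes, offset just past the declaration).

    If the text does not start with a declaration, return ({}, 0)
    so the caller can carry on parsing from the beginning.
    """
    match = DECL_RE.match(text)
    if match is None:
        return {}, 0
    attrs = {name: value for name, _quote, value in ATTR_RE.findall(match.group(1))}
    return attrs, match.end()


doc = '<?xml version="1.0" encoding="UTF-8"?><root/>'
attrs, rest_at = parse_declaration(doc)
print(attrs)          # {'version': '1.0', 'encoding': 'UTF-8'}
print(doc[rest_at:])  # <root/>
```

Returning the offset along with the attributes lets the next step resume scanning right after the declaration.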
After the prolog, there are tags with metadata. The special tags, like comments, doctypes, and element declarations, start with <!. Processing instructions start with <?. Nested tags are possible here: in a DTD-style XML document, the <!DOCTYPE tag can contain <!ELEMENT and <!ATTLIST tags. See Wikipedia for a thorough example.
There should be exactly one top-level element. It's the only top-level tag that doesn't start with <! or <?. There may be more metadata tags after the top-level element; process those first.
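The "exactly one top-level element" rule can be checked by tracking nesting depth. The sketch below is only an illustration: count_top_level_elements and TAG_RE are hypothetical names, and the regular expression assumes no > inside attribute values and no internal DTD subset.

```python
import re

# Hypothetical tokenizer: comments, <? ... ?> tags, other <! ... > tags,
# end tags, then start or empty-element tags. Order matters, because
# the first alternative that matches wins.
TAG_RE = re.compile(r"<!--.*?-->|<\?.*?\?>|<![^>]*>|</[^>]*>|<[^>]*>", re.DOTALL)


def count_top_level_elements(text):
    """Count elements that open at nesting depth 0."""
    depth = 0
    roots = 0
    for match in TAG_RE.finditer(text):
        tag = match.group(0)
        if tag.startswith("<!") or tag.startswith("<?"):
            continue  # metadata tag, not an element
        if tag.startswith("</"):
            depth -= 1  # end tag closes the current element
        elif tag.endswith("/>"):
            if depth == 0:
                roots += 1  # empty element sitting at top level
        else:
            if depth == 0:
                roots += 1  # start tag opening a new top-level element
            depth += 1
    return roots


print(count_top_level_elements("<?xml version='1.0'?><root><a/></root><!-- end -->"))  # 1
print(count_top_level_elements("<a/><b/>"))  # 2
```

A result other than 1 means the document is not well-formed by this rule.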
For the explicit parsing: first identify the tags (they all start with <), then determine what kind of tag each one is and what its closing delimiter looks like.
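A minimal Python sketch of that identify-then-classify loop follows. The names classify_tag and scan_tags are invented for this example. It assumes attribute values contain no >, and it does not handle CDATA sections or a DOCTYPE with nested declarations, which would need a smarter search for the closing delimiter.

```python
def classify_tag(text, pos):
    """Return (kind, opener, closer) for the tag starting at text[pos] == '<'."""
    if text.startswith("<!--", pos):
        return "comment", "<!--", "-->"
    if text.startswith("<?", pos):
        # The XML declaration shares this syntax and is reported
        # under the same kind in this sketch.
        return "processing-instruction", "<?", "?>"
    if text.startswith("<!", pos):
        return "declaration", "<!", ">"
    if text.startswith("</", pos):
        return "end-tag", "</", ">"
    return "start-tag", "<", ">"


def scan_tags(text):
    """Yield (kind, raw_tag) for every tag in text, in document order."""
    pos = 0
    while True:
        start = text.find("<", pos)
        if start == -1:
            return
        kind, opener, closer = classify_tag(text, start)
        # Search for the closer only after the opener, so "<?" and "?>"
        # can never overlap.
        end = text.find(closer, start + len(opener))
        if end == -1:
            raise ValueError(f"unclosed {kind} at offset {start}")
        end += len(closer)
        tag = text[start:end]
        if kind == "start-tag" and tag.endswith("/>"):
            kind = "empty-element"
        yield kind, tag
        pos = end


for kind, tag in scan_tags("<!-- note --><root><item/></root>"):
    print(kind, tag)
# comment <!-- note -->
# start-tag <root>
# empty-element <item/>
# end-tag </root>
```

Text between tags is skipped here; a fuller parser would collect it as character data for the enclosing element.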