Is it possible, and what tools could be used, to parse an HTML document from a string or from a file and then construct a DOM tree so that a developer can walk the tree throu
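HTML Parser seems to support conversion from HTML to XML. Once you have well-formed XML, you can build a DOM tree with Java's standard XML APIs (for example, javax.xml.parsers.DocumentBuilder) and walk it through the org.w3c.dom interfaces.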
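
As a rough illustration of the second step only, here is a minimal sketch that builds a DOM tree from a string and walks it depth-first using nothing but the JDK. It assumes the HTML has already been converted to well-formed XML (the conversion itself is not shown, since I have not checked HTML Parser's API). The class name `DomWalkSketch` and the methods `parse` and `walk` are made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class; not part of any library.
public class DomWalkSketch {

    // Parse a well-formed XML/XHTML string into a W3C DOM Document.
    // Assumption: the input is already valid XML (e.g. output of an HTML-to-XML converter).
    public static Document parse(String xhtml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xhtml)));
    }

    // Depth-first walk: record each element's name, indented two spaces per level.
    // Text nodes and other non-element nodes are visited but not recorded.
    public static void walk(Node node, int depth, List<String> out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < depth; i++) {
                line.append("  ");
            }
            out.add(line.append(node.getNodeName()).toString());
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), depth + 1, out);
        }
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<html><body><p>Hello <b>DOM</b></p></body></html>";
        Document doc = parse(xhtml);
        List<String> lines = new ArrayList<>();
        walk(doc.getDocumentElement(), 0, lines);
        for (String line : lines) {
            System.out.println(line);
        }
    }
}
```

To read from a file instead of a string, `DocumentBuilder` also has a `parse(File)` overload, so the rest of the walk stays the same. Note that this parser rejects input that is not well-formed, which is why the HTML-to-XML conversion has to come first.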