I have a collection of HTML documents for which I need to parse the contents of the tags in the
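JTidy should provide a good starting point for this: it cleans up malformed HTML and exposes the result as a W3C DOM tree that you can query by tag name.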
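
As a rough illustration of that approach, here is a minimal sketch that parses one HTML document with JTidy and collects the text inside every element with a given tag name. It assumes the JTidy jar is on the classpath. The class name TagContents, the helper textOfTags, and the "title" tag used in main are placeholders chosen for the example, since the question does not say which tags are needed. Text is read from child text nodes because JTidy's DOM implementation may not support the newer getTextContent() method.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.tidy.Tidy;

// Hypothetical example class; the name is not from the original post.
public class TagContents {

    /**
     * Parses the HTML stream with JTidy and returns the text found
     * directly inside each element named tagName (nested markup is skipped).
     */
    static List<String> textOfTags(InputStream html, String tagName) {
        Tidy tidy = new Tidy();
        tidy.setQuiet(true);          // suppress the summary output
        tidy.setShowWarnings(false);  // suppress per-document warnings

        // Passing null as the output stream skips writing the tidied markup.
        Document doc = tidy.parseDOM(html, null);

        List<String> result = new ArrayList<>();
        NodeList elements = doc.getElementsByTagName(tagName);
        for (int i = 0; i < elements.getLength(); i++) {
            StringBuilder text = new StringBuilder();
            NodeList children = elements.item(i).getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                // Only direct text children are collected.
                if (child.getNodeType() == Node.TEXT_NODE) {
                    text.append(child.getNodeValue());
                }
            }
            result.add(text.toString().trim());
        }
        return result;
    }

    public static void main(String[] args) {
        // "title" is an assumed tag name, used here only for illustration.
        String html = "<html><head><title>Example page</title></head>"
                + "<body><p>Hello</p></body></html>";
        InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
        System.out.println(textOfTags(in, "title"));
    }
}
```

For a whole collection of documents, the same helper could be called once per file with a FileInputStream. If the tags of interest contain nested markup, the inner loop would need to walk the subtree recursively rather than look only at direct children.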