I have used LWP and HTML::TreeBuilder with Perl and have found them very useful.
LWP (short for libwww-perl) lets you connect to websites and fetch their HTML for scraping. The module is available on CPAN, and the O'Reilly book on it (Perl & LWP) seems to be available online.
TreeBuilder allows you to construct a tree from the HTML; documentation and source are available on its CPAN page, "HTML::TreeBuilder - Parser that builds a HTML syntax tree".
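
Here is a minimal sketch of how the two can fit together, assuming both modules are installed from CPAN. The sample markup and the fetch_html helper name are made up for illustration. The helper is defined but not called, so the example runs without network access.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: fetch a page with LWP and return its HTML.
# Not called below, so the sketch runs offline.
sub fetch_html {
    my ($url) = @_;
    require LWP::UserAgent;    # loaded only when a fetch is requested
    my $ua       = LWP::UserAgent->new(timeout => 10);
    my $response = $ua->get($url);
    die "Request failed: " . $response->status_line . "\n"
        unless $response->is_success;
    return $response->decoded_content;
}

# Made-up sample markup standing in for a fetched page.
my $html = <<'HTML';
<html><head><title>Sample page</title></head>
<body>
<a href="/first">First</a>
<a href="/second">Second</a>
</body></html>
HTML

# Build the syntax tree from the HTML string.
my $tree = HTML::TreeBuilder->new_from_content($html);

# In scalar context look_down returns the first matching element.
my $title_el = $tree->look_down(_tag => 'title');
my $title    = $title_el ? $title_el->as_text : '';

# In list context look_down returns every matching element.
my @links = map { $_->attr('href') } $tree->look_down(_tag => 'a');

print "Title: $title\n";
print "Link: $_\n" for @links;

# Free the tree explicitly when finished with it.
$tree->delete;
```

To try it on a real page, replace the sample markup with something like `my $html = fetch_html($url);`, where $url is whatever address you want to scrape.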
This approach might still leave too much heavy lifting to do by hand, though. I have not yet looked at the Mechanize module suggested in another answer, so I may well do that.