Here\'s the deal. Is there a way to have strings tokenized in a line based on multiple regexes?
One example:
I have to get all href tags, their corresponding
If your problem has anything at all to do with web scraping, I recommend looking at Web::Scraper , which provides easy element selection via XPath respectively CSS selectors. I have a (German) talk on Web::Scraper , but if you run it through babelfish or just look at the code samples, that can help you to get a quick overview of the syntax.
Hand-parsing HTML is onerous and won't give you much over using one of the premade HTML parsers. If your HTML is of very limited variation, you can get by by using clever regular expressions, but if you're already breaking out hard-core parser tools, it sounds as if your HTML is far more regular than what is sane to parse with regular expressions.