I want to crawl the page and check for the hyperlinks in that respective page and also follow those hyperlinks and capture data from the page
There is a client side approach for this, using Firefox Greasemonkey extention. with Greasemonkey you can create scripts to be executed each time you open specified urls.
here an example:
if you have urls like these:
http://www.example.com/products/pages/1
http://www.example.com/products/pages/2
then you can use something like this to open all pages containing product list(execute this manually)
var j = 0;
for(var i=1;i<5;i++)
{
setTimeout(function(){
j = j + 1;
window.open('http://www.example.com/products/pages/ + j, '_blank');
}, 15000 * i);
}
then you can create a script to open all products in new window for each product list page and include this url in Greasemonkey for that.
http://www.example.com/products/pages/*
and then a script for each product page to extract data and call a webservice passing data and close window and so on.