Is it possible to write a web crawler in JavaScript?

深忆病人 2021-02-01 07:48

I want to crawl a page, find the hyperlinks on it, follow those hyperlinks, and capture data from the pages they lead to.

11 answers
  •  轻奢々 (original poster)
    2021-02-01 08:38

    There is a client-side approach to this, using the Greasemonkey extension for Firefox. With Greasemonkey you can create scripts that run each time you open specified URLs.

    Here is an example.

    If you have URLs like these:

    http://www.example.com/products/pages/1

    http://www.example.com/products/pages/2

    then you can use something like this to open all the pages containing product lists (run it manually):

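    // Open product list pages 1 to 4, one every 15 seconds.
    // The browser may need to be told to allow pop-ups for this site.
    for (let i = 1; i < 5; i++) {
      setTimeout(function () {
        // `let` gives each callback its own copy of i, so no extra counter is needed.
        window.open('http://www.example.com/products/pages/' + i, '_blank');
      }, 15000 * i);
    }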

    Then you can create a script that, on each product list page, opens every product in a new window. Register this URL pattern in Greasemonkey as that script's include rule:

    http://www.example.com/products/pages/*
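
A list-page script along these lines might look like the sketch below. It is only an illustration: the product URL shape `/products/item/<number>` and the helper name `collectProductLinks` are assumptions, not something taken from a real site, and pop-ups must be allowed for `window.open` to work.

```javascript
// ==UserScript==
// @name     Open product pages (sketch)
// @include  http://www.example.com/products/pages/*
// @grant    none
// ==/UserScript==

// Assumed shape of a product URL; adjust to the real site.
const PRODUCT_URL = /\/products\/item\/\d+$/;

// Return the unique hrefs among `anchors` that match `pattern`.
// `anchors` can be a NodeList or any iterable of objects with an `href`.
function collectProductLinks(anchors, pattern) {
  const seen = new Set();
  for (const a of anchors) {
    if (pattern.test(a.href)) {
      seen.add(a.href);
    }
  }
  return [...seen];
}

// Only touch the DOM when running inside a browser page.
if (typeof document !== 'undefined') {
  const links = collectProductLinks(
    document.querySelectorAll('a[href]'),
    PRODUCT_URL
  );
  // Stagger the windows, as in the loop above, to avoid opening them all at once.
  links.forEach((url, i) => {
    setTimeout(() => window.open(url, '_blank'), 15000 * (i + 1));
  });
}
```

Keeping the link filtering in its own function makes it possible to check the pattern against a few sample URLs before letting the script open any windows.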

    Finally, write a script for the product pages that extracts the data, passes it to a web service, closes the window, and so on.
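
As a rough sketch of that last step, the script below reads a couple of fields from a product page, posts them as JSON, and closes the window. Everything site-specific here is assumed: the include pattern, the `h1` and `.price` selectors, the `extractProduct` helper, and the collector endpoint `http://localhost:3000/products`. The collector would also need to accept cross-origin requests for the `fetch` call to succeed.

```javascript
// ==UserScript==
// @name     Extract product data (sketch)
// @include  http://www.example.com/products/item/*
// @grant    none
// ==/UserScript==

// Hypothetical web service that stores the scraped records.
const COLLECTOR_URL = 'http://localhost:3000/products';

// Build a plain record from a document-like object.
// The selectors are placeholders for whatever the real page uses.
function extractProduct(doc, url) {
  const text = (selector) => {
    const el = doc.querySelector(selector);
    return el ? el.textContent.trim() : null;
  };
  return { url, title: text('h1'), price: text('.price') };
}

// Only run the upload when inside a browser page.
if (typeof document !== 'undefined') {
  fetch(COLLECTOR_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(extractProduct(document, location.href)),
  })
    .catch((err) => console.error('upload failed', err))
    // window.close() is only honoured for windows that a script opened.
    .finally(() => window.close());
}
```

Closing the window only after the request settles keeps the number of open tabs bounded while the list-page script continues to open new ones.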
