How to extract JavaScript links in an HTML document?

Submitted by 匆匆过客 on 2021-01-29 06:11:20

Question


I am writing a small web spider for a website which uses a lot of JavaScript for links:

<htmlTag onclick="someFunction();">Click here</htmlTag>

where the function looks like:

function someFunction() {
  var _url;
  ...
  // _url constructed, maybe with reference to a value in the HTML doc
  // and/or a value passed as argument(s) to this function
  ...
  window.location.href = _url;
}

What is the best way of evaluating this function server-side so I can construct the value for _url?


Answer 1:


You could also use env.js and Rhino to actually evaluate the JavaScript in the HTML and detect changes to the location object after manually firing a click event.
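As a rough illustration of the same idea (run the handler, then inspect the location object), here is a minimal sketch that swaps env.js and Rhino for Node.js's built-in vm module with hand-written stubs. The page script, the `resolveOnclick` helper, the `fields` lookup table, and the `/items/...` URL shape are all hypothetical stand-ins, not taken from the site in question. A real page would need a much fuller DOM stub, which is what env.js provides.

```javascript
const vm = require('vm');

// Hypothetical page script, standing in for the site's real someFunction().
const pageScript = `
  function someFunction(id) {
    var _url = '/items/' + id + '?ref=' + document.getElementById('ref').value;
    window.location.href = _url;
  }
`;

// Evaluate an onclick handler against stubbed window/document objects and
// return whatever URL it assigned to the location (null if none was set).
function resolveOnclick(script, handler, fields) {
  const sandbox = {
    location: { href: null },
    document: {
      // Minimal stub: element values are looked up in a plain object.
      getElementById: (id) => ({ value: fields[id] }),
    },
  };
  sandbox.window = sandbox; // so window.location and bare location agree
  vm.createContext(sandbox);
  vm.runInContext(script, sandbox, { timeout: 1000 });  // define the functions
  vm.runInContext(handler, sandbox, { timeout: 1000 }); // "fire" the click
  // Handle both `location.href = url` and `window.location = url`.
  const loc = sandbox.location;
  return typeof loc === 'string' ? loc : loc.href;
}

const url = resolveOnclick(pageScript, 'someFunction(42);', { ref: 'home' });
console.log(url); // prints /items/42?ref=home
```

Note that the vm module isolates globals but is not a security boundary, so scripts from untrusted sites should still be run in a separate, locked-down process.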




Answer 2:


Not exactly sure what you're trying to accomplish.

If you need to send these values to the server for processing, Ajax would be your best option.




Answer 3:


This is likely to be messy, and it depends on several factors:

  1. Where is the link stored? Inside the element, in a JavaScript variable, etc.
  2. Is the JavaScript function always your own?

One approach that could do the trick is to simply parse your HTML and use a regex to catch HTTP links where the onclick="someFunction();" attribute is present.
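The first step of that approach, finding the onclick attributes, might look something like the sketch below. The `extractOnclickHandlers` name and the sample markup are made up for illustration, and the pattern assumes double-quoted attribute values only. A regex is brittle against real-world HTML, so a proper HTML parser is the safer choice for anything beyond simple pages.

```javascript
// Collect the text of every double-quoted onclick attribute in an HTML string.
function extractOnclickHandlers(html) {
  const pattern = /\bonclick\s*=\s*"([^"]*)"/gi;
  return Array.from(html.matchAll(pattern), (match) => match[1]);
}

const sample = '<a onclick="someFunction();">Click here</a><p>text</p>';
console.log(extractOnclickHandlers(sample)); // prints [ 'someFunction();' ]
```

Each extracted handler string still has to be resolved to a URL, either by matching URL literals inside the function body or by evaluating the function.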




Answer 4:


If you need server-side processing, you need to either:

  1. Do the processing before the content is delivered to the user, and include its output in the response, or
  2. Use something like AJAX to make a new request back to the server


Source: https://stackoverflow.com/questions/897980/how-to-extract-javascript-links-in-an-html-document
