Converting window.open(hyperlink) JavaScript code to a pure absolute URL with Java

Submitted by 风流意气都作罢 on 2019-12-02 16:16:18

Question


I am using the Java Jsoup library to extract some hyperlinks from a website:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

Document doc = Jsoup.connect("http://www.saudisale.com/SS_a_mpg.aspx").get();

// For each table, print the onClick attribute of the first input that has one
for (Element table : doc.select("table")) {
    System.out.println(table.select("tbody").select("tr").select("td").select("input").attr("onClick"));
}

Sample output:

window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://ads.saudisale.com/dyaralez.html ','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://ads.saudisale.com/dyaralez.html ','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://ads.saudisale.com/dalel.html','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://ads.saudisale.com/dalel.html','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('SS_a_car.aspx?carid=37240','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('SS_a_car.aspx?carid=37240','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');

Since Jsoup does not execute JavaScript, I have to write some Java code myself to convert each window.open(hyperlink) call into an absolute hyperlink.

For example, the following JavaScript output has to be converted

window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','channelmode=1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1')

To: http://saudisale.com/arPrivatePage.aspx?id=21871638

and

window.open('SS_a_car.aspx?carid=37149','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1'); 

To: http://www.saudisale.com/SS_a_car.aspx?carid=37149

Could someone show me how to accomplish this in Java?


Answer 1:


Use a regex. This will do what you want:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');";

// Capture the first quoted argument of window.open(...), ignoring whitespace around it
String regex = "window\\.open\\(\\s*['\"]\\s*(.*?)\\s*['\"]";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    // Group 1 holds the URL itself
    String output = matcher.group(1);
    System.out.println(output);
}

The last two URLs in your output are relative, so you have to convert them to absolute URLs by resolving them against the URL of the page they came from.
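A minimal sketch of that conversion, assuming the page URL from the question is the right base and using only java.net.URI from the standard library. The class name ResolveExample and the helper toAbsolute are made up for illustration:

```java
import java.net.URI;

public class ResolveExample {
    // Resolves a possibly relative link against the URL of the page it was found on.
    // Absolute links are returned unchanged by URI.resolve.
    static String toAbsolute(String pageUrl, String link) {
        return URI.create(pageUrl).resolve(link.trim()).toString();
    }

    public static void main(String[] args) {
        // Assumption: this is the page the onClick values were scraped from
        String page = "http://www.saudisale.com/SS_a_mpg.aspx";
        // prints http://www.saudisale.com/SS_a_car.aspx?carid=37149
        System.out.println(toAbsolute(page, "SS_a_car.aspx?carid=37149"));
        // prints http://ads.saudisale.com/dalel.html
        System.out.println(toAbsolute(page, "http://ads.saudisale.com/dalel.html"));
    }
}
```

Note that URI.create throws IllegalArgumentException if the link contains characters that are illegal in a URI (for example an unencoded space in the middle), so real scraped input may need extra cleanup.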




Answer 2:


For relative URLs I used this code. It works fine.

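import java.net.URL;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input2 = "window.open('SS_a_car.aspx?carid=37149','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1')";

// Base URL used to resolve relative links (new URL(...) can throw MalformedURLException)
URL baseURL = new URL("http://saudisale.com/");

// Capture the first quoted argument of window.open(...)
String regex = "window\\.open\\(\\s*['\"]\\s*(.*?)\\s*['\"]";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input2);
while (matcher.find()) {
    String output = matcher.group(1);
    // Resolve the extracted link against the base URL
    URL url = new URL(baseURL, output);
    System.out.println(url);
}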
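Since the sample output in the question contains both absolute and relative links, as well as duplicates, the two steps can be combined. The following is only a sketch under stated assumptions: the class WindowOpenLinks and the method extract are hypothetical names, the page URL is taken from the question, and the regex assumes the first argument of window.open is a quoted string with no quote characters inside it.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WindowOpenLinks {
    // Captures the first quoted argument of window.open(...), without surrounding whitespace
    private static final Pattern WINDOW_OPEN =
            Pattern.compile("window\\.open\\(\\s*['\"]\\s*(.*?)\\s*['\"]");

    // Returns the distinct absolute URLs found in the given onClick values, in first-seen order
    static List<String> extract(String pageUrl, List<String> onClickValues) {
        URI base = URI.create(pageUrl);
        Set<String> links = new LinkedHashSet<>();
        for (String onClick : onClickValues) {
            Matcher m = WINDOW_OPEN.matcher(onClick);
            while (m.find()) {
                // resolve() leaves absolute links unchanged and completes relative ones
                links.add(base.resolve(m.group(1)).toString());
            }
        }
        return new ArrayList<>(links);
    }

    public static void main(String[] args) {
        List<String> onClickValues = Arrays.asList(
                "window.open('http://ads.saudisale.com/dyaralez.html ','_blank','scrollbars=1');",
                "window.open('SS_a_car.aspx?carid=37240','_blank','scrollbars=1');",
                "window.open('SS_a_car.aspx?carid=37240','_blank','scrollbars=1');");
        for (String link : extract("http://www.saudisale.com/SS_a_mpg.aspx", onClickValues)) {
            System.out.println(link);
        }
    }
}
```

Using a LinkedHashSet is a design choice for this sketch: it drops the repeated entries that nested tables produce while keeping the links in page order. Values without a window.open call, such as the empty string Jsoup returns when no onClick attribute exists, simply yield no links.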


Source: https://stackoverflow.com/questions/29326901/converting-window-openhyperlink-javascript-code-to-pure-absolute-url-with-java
