I am using the Java Jsoup library to extract some hyperlinks from a website.
Document doc = Jsoup.connect("http://www.saudisale.com/SS_a_mpg.aspx").get();
Elements
For relative URLs I used this code. It works fine.
String input2 = "window.open('SS_a_car.aspx?carid=37149','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1')";
URL baseURL = new URL("http://saudisale.com/");
String regex = "window\\.open\\(['\"]*(.*?)(\\s*['\"]*,.*?)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input2);
while (matcher.find()) {
    // Group 1 holds the first argument of window.open(...)
    String output = matcher.group(1);
    URL url = new URL(baseURL, output);
    System.out.println(url);
}
Use a regex. This will do what you want:
String input = "window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');";
String regex = "window\\.open\\(['\"]*(.*?)(\\s*['\"]*,.*?)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    // Group 1 holds the first argument of window.open(...)
    String output = matcher.group(1);
    System.out.println(output);
}
Your last two URLs are relative, so you have to convert them to absolute URLs as described here.
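For what it's worth, here is a minimal sketch that handles the absolute and the relative case in one place. It is only one way to do it, under a few assumptions: the class name WindowOpenLinks and the method extractUrl are made up for illustration, the regex is a slightly stricter variant of the one above (it requires the URL to be quoted), and java.net.URI.resolve is used instead of the URL constructor. Note that URI.resolve throws IllegalArgumentException if the extracted string is not a valid URI (for example, if it contains spaces).

```java
import java.net.URI;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WindowOpenLinks {
    // Captures the first quoted argument of window.open(...).
    // The dot is escaped so it only matches a literal '.'.
    private static final Pattern WINDOW_OPEN =
            Pattern.compile("window\\.open\\(\\s*['\"]([^'\"]*)['\"]");

    // Hypothetical helper: returns the absolute URL of the first
    // window.open call found in the snippet, or empty if there is none.
    static Optional<String> extractUrl(String script, URI base) {
        Matcher matcher = WINDOW_OPEN.matcher(script);
        if (!matcher.find()) {
            return Optional.empty();
        }
        // resolve() returns absolute URLs unchanged and joins
        // relative ones onto the base.
        return Optional.of(base.resolve(matcher.group(1).trim()).toString());
    }

    public static void main(String[] args) {
        URI base = URI.create("http://saudisale.com/");
        String relative = "window.open('SS_a_car.aspx?carid=37149','_blank','scrollbars=1')";
        String absolute = "window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','scrollbars=1');";

        System.out.println(extractUrl(relative, base).orElse("no match"));
        System.out.println(extractUrl(absolute, base).orElse("no match"));
    }
}
```

With the two sample strings from this thread, both calls should print a full http://saudisale.com/... URL, so you don't need separate code paths for relative and absolute links.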