Question
HTML source (note that it uses the Lazy Load jQuery plugin):
1) When I run the code below, it fetches all image URLs from the website:
Elements images=document.select("img[src~=(?i)\\.(png|jpe?g|gif)]");
2) But when I restrict the selector to a class, it fails:
Elements images=document.select("div.newscat img[src~=(?i)\\.(png|jpe?g|gif)]");
I then copy the URLs with this loop (in the second case it throws an OutOfBoundsException):
for (int i = 0; i < images.size(); i++) {
    imageUrl[i] = images.get(i).attr("src");
}
Could lazy loading be the problem? If so, how can I solve it?
Answer 1:
Finally, thanks to the question "android: how can i scrap images (in url ) using jsoup?(Image tag contain attribute "data-original" which is url of image)",
I found a workaround by changing
Elements images = document.select("div.newscat img[src~=(?i)\\.(png|jpe?g|gif)]");
for (int i = 0; i < images.size(); i++) {
    imageUrl[i] = images.get(i).attr("src");
}
to
Elements images = document.select("div.newscat").select("img");
for (int i = 0; i < images.size(); i++) {
    imageUrl[i] = images.get(i).attr("data-original");
}
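
A slightly more defensive variant of this workaround might look like the sketch below. It assumes that the lazy-load plugin keeps the real URL in data-original and leaves only a placeholder in src, which is what the linked question's title suggests; the page's actual markup is not shown here, so the HTML in main is hypothetical. The class name LazyImageUrls and the method extract are made up for illustration. Collecting into a List instead of a pre-sized array also avoids depending on the array length matching the number of selected elements.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper, not part of jsoup.
final class LazyImageUrls {

    // Collect image URLs inside div.newscat, preferring the lazy-load
    // attribute "data-original" and falling back to "src".
    static List<String> extract(Document document) {
        List<String> urls = new ArrayList<>();
        for (Element img : document.select("div.newscat img")) {
            // attr() returns an empty string when the attribute is absent.
            String url = img.attr("data-original");
            if (url.isEmpty()) {
                url = img.attr("src");
            }
            if (!url.isEmpty()) {
                urls.add(url);
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        // Assumed markup standing in for the real page.
        String html = "<div class=\"newscat\">"
                + "<img src=\"placeholder\" data-original=\"http://example.com/a.jpg\">"
                + "<img src=\"http://example.com/b.png\">"
                + "</div>"
                + "<img src=\"http://example.com/outside.gif\">";
        System.out.println(extract(Jsoup.parse(html)));
    }
}
```

With the assumed markup, only the two images inside div.newscat are returned, and the first one resolves to its data-original value rather than the placeholder.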
Source: https://stackoverflow.com/questions/36373775/cannot-fetch-image-url-defined-with-data-original-inside-specic-class-jsoup