Cannot fetch image URL (defined with data-original) inside specific class (Jsoup)

Submitted by 半城伤御伤魂 on 2020-01-05 12:11:47

Question


HTML source (note that it uses the lazy load jQuery plugin):

1) When I run the code below, it fetches all image URLs from the website:

    Elements images=document.select("img[src~=(?i)\\.(png|jpe?g|gif)]");

2) But when I restrict the selector to a specific class, as below, it fails:

    Elements images=document.select("div.newscat img[src~=(?i)\\.(png|jpe?g|gif)]");

I then use the following loop (in the second case it throws an OutOfBoundsException):

    for (int i = 0; i < images.size(); i++) {
        imageUrl[i] = images.get(i).attr("src");
    }

Could lazy loading be the problem? If so, how can I solve it?


Answer 1:


Thanks to the question 'android: how can i scrap images (in url ) using jsoup?(Image tag contain attribute "data-original" which is url of image)', I finally found a workaround. I changed

    Elements images=document.select("div.newscat img[src~=(?i)\\.(png|jpe?g|gif)]");
    for (int i = 0; i < images.size(); i++) {
        imageUrl[i] = images.get(i).attr("src");
    }

to

    Elements images=document.select("div.newscat").select("img");
    for (int i = 0; i < images.size(); i++) {
        imageUrl[i] = images.get(i).attr("data-original");
    }
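
A likely explanation, offered as an assumption because the page's HTML is not shown here: with lazy loading, the src attribute of each image inside div.newscat holds only a placeholder, and the real address sits in data-original, so a filter on src matches nothing useful there. The sketch below is one way to make the workaround a little more defensive. It prefers data-original, falls back to src, and collects into a List instead of a fixed-size array. The class name LazyImageUrls, the method extractImageUrls, the sample markup, and the base URL https://example.com/ are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class LazyImageUrls {

    // Collects image URLs from <img> tags inside div.newscat.
    // Prefers the lazy-load attribute data-original and falls back to src.
    static List<String> extractImageUrls(String html, String baseUri) {
        Document document = Jsoup.parse(html, baseUri);
        List<String> urls = new ArrayList<>();
        for (Element img : document.select("div.newscat img")) {
            String attr = img.hasAttr("data-original") ? "data-original" : "src";
            // absUrl resolves relative paths against baseUri and returns ""
            // when the attribute is missing or cannot be resolved.
            String url = img.absUrl(attr);
            if (!url.isEmpty()) {
                urls.add(url);
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        // Hypothetical markup standing in for the page in the question.
        String html = "<div class=\"newscat\">"
                + "<img src=\"placeholder.gif\" data-original=\"/img/a.jpg\">"
                + "<img src=\"/img/b.png\">"
                + "</div>"
                + "<img src=\"/img/outside.png\">";
        for (String url : extractImageUrls(html, "https://example.com/")) {
            System.out.println(url);
        }
    }
}
```

Because the result is a List, its size always matches the number of images actually found, which avoids indexing into an array that was sized for a different selector.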


Source: https://stackoverflow.com/questions/36373775/cannot-fetch-image-url-defined-with-data-original-inside-specic-class-jsoup
