How to find and extract the “main” image of a website

北荒 2021-02-03 12:29

I need help tackling a problem. I need a program which, given a site, finds and extracts the "main" picture, i.e. the one which represents the site. (To say it is the bigg

5 answers
  •  故里飘歌
    2021-02-03 12:49

    Another solution would be to extract the meta tags for social media sharing first (the Open Graph, Twitter Card, and itemprop image tags). If they are present, you are in luck; otherwise you can still try the other solutions.

    If you are using jsoup, the code would look like this:

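        // `document` is an already parsed org.jsoup.nodes.Document.
        // Each value is null when the corresponding tag is missing.

        // Open Graph image
        String imageUrlOpenGraph = document.select("meta[property=og:image]").stream()
                .findFirst()
                .map(element -> element.attr("content").trim())
                .orElse(null);

        // Twitter Card image
        String imageUrlTwitter = document.select("meta[name=twitter:image]").stream()
                .findFirst()
                .map(element -> element.attr("content").trim())
                .orElse(null);

        // itemprop="image" microdata, as used for Google+ snippets
        String imageUrlGooglePlus = document.select("meta[itemprop=image]").stream()
                .findFirst()
                .map(element -> element.attr("content").trim())
                .orElse(null);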
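
    The three lookups each produce a string that may be null or empty, so something still has to choose between them. A minimal sketch of one way to do that is below. The class name MainImagePicker, the helper firstUsable, the priority order (Open Graph, then Twitter, then itemprop), and the example.com URL are all assumptions made for illustration and are not part of the original answer.

```java
import java.util.Arrays;
import java.util.Optional;

public class MainImagePicker {

    /**
     * Returns the first candidate that is neither null nor blank,
     * trimmed, in the order the candidates were passed in.
     * (Hypothetical helper, not part of jsoup.)
     */
    static Optional<String> firstUsable(String... candidates) {
        return Arrays.stream(candidates)
                .filter(c -> c != null && !c.trim().isEmpty())
                .map(String::trim)
                .findFirst();
    }

    public static void main(String[] args) {
        // Stand-ins for the values extracted from the meta tags;
        // here only the itemprop lookup found something.
        String imageUrlOpenGraph = null;
        String imageUrlTwitter = "";
        String imageUrlGooglePlus = "https://example.com/cover.png";

        Optional<String> mainImage =
                firstUsable(imageUrlOpenGraph, imageUrlTwitter, imageUrlGooglePlus);

        // An empty Optional means no meta tag was usable, which is the
        // point at which one would try the other solutions.
        System.out.println(mainImage.orElse("no meta image found"));
        // prints "https://example.com/cover.png"
    }
}
```

    Returning an Optional rather than null keeps the "no meta image" case explicit, so the caller can decide when to fall back to a different heuristic.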
    
