Jsoup: getting background image paths from CSS

臣服心动 2021-01-19 07:23

I am looking for all of the images on a given website.

For this purpose I need to find the ones that are referenced within the CSS, for example:

   .gk-crop {
           


        
3 Answers
  • 2021-01-19 07:28

    Jsoup doesn't parse CSS files.

    Have a look at this to see what Jsoup is responsible for.

    You need a separate CSS parser to extract URLs from CSS files. Have a look at this

  • 2021-01-19 07:31

    Just like Niranjan mentioned, Jsoup is for parsing HTML (and XML), not CSS. If you really need to extract images from CSS, you will need either a third-party library or a simple regex that grabs URLs from the CSS file; it's still plain text, after all. A regex is not a flexible solution to your problem, but it is the fastest one to put together :)
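
A minimal sketch of the regex approach, assuming the CSS text has already been downloaded as a string. The class name `CssUrlExtractor`, the method `extractUrls`, the sample rules, and the `example.com` stylesheet address are all hypothetical and used only for illustration. The pattern is a heuristic rather than a real CSS parser, so it can also match `url(...)` inside comments and will miss unusual escaping. Relative paths are resolved against the stylesheet's own URL, since that is what they are relative to.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CssUrlExtractor {

    // Matches url(...) with optional single or double quotes around the value.
    // Group 2 holds the raw URL text. Heuristic only, not a CSS parser.
    private static final Pattern CSS_URL = Pattern.compile(
            "url\\(\\s*(['\"]?)(.*?)\\1\\s*\\)", Pattern.CASE_INSENSITIVE);

    /** Returns every url(...) value in the CSS text, resolved against the stylesheet URL. */
    static List<String> extractUrls(String css, String stylesheetUrl) {
        URI base = URI.create(stylesheetUrl);
        List<String> urls = new ArrayList<>();
        Matcher m = CSS_URL.matcher(css);
        while (m.find()) {
            String raw = m.group(2).trim();
            if (raw.isEmpty() || raw.startsWith("data:")) {
                continue; // skip empty values and inline data URIs
            }
            try {
                urls.add(base.resolve(raw).toString());
            } catch (IllegalArgumentException ex) {
                // Not a syntactically valid URI (e.g. unescaped spaces); skip it.
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        // Hypothetical stylesheet content and location.
        String css = ".hero { background: url('../img/hero.png') no-repeat; }\n"
                   + ".logo { background-image: url(logo.jpg); }";
        for (String u : extractUrls(css, "http://example.com/css/site.css")) {
            System.out.println(u);
        }
        // Prints:
        // http://example.com/img/hero.png
        // http://example.com/css/logo.jpg
    }
}
```

The same method could be fed the contents of each linked stylesheet, or the text of inline `style` attributes, using the page URL as the base in the latter case.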

  • 2021-01-19 07:42

    If you want the URLs of all the images on a page, you can select all the image tags and then get their absolute URLs.

    Example:

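    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    String url = "http://www.bbc.co.uk";
    // Fetch and parse the page (throws IOException on network failure)
    Document doc = Jsoup.connect(url).get();

    // Select every <img> element in the document
    Elements titles = doc.select("img");

    for (Element e : titles) {
        // absUrl resolves a relative src against the page's base URI
        System.out.println(e.absUrl("src"));
    }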
    

    which grabs all the <img> elements and prints their absolute URLs, such as

    http://sa.bbc.co.uk/bbc/bbc/s?name=SET-COUNTER&pal_route=index&ml_name=barlesque&app_type=web&language=en-GB&ml_version=0.16.1&pal_webapp=wwhp&blq_s=3.5&blq_r=3.5&blq_v=default-worldwide
    http://static.bbci.co.uk/frameworks/barlesque/2.50.2/desktop/3.5/img/blq-blocks_grey_alpha.png
    http://static.bbci.co.uk/frameworks/barlesque/2.50.2/desktop/3.5/img/blq-search_grey_alpha.png
    http://news.bbcimg.co.uk/media/images/69139000/jpg/_69139104_69139103.jpg
    http://news.bbcimg.co.uk/media/images/69134000/jpg/_69134575_waynerooney1.jpg
    

    If you only want the .jpg files, narrow the selector to src values that end in .jpg:

    Elements titles = doc.select("img[src$=.jpg]");
    

    which results in only the .jpg URLs being selected. Because $= is a suffix match, a src with a query string after .jpg will not match.
