BeautifulSoup HTML getting src link

慢半拍i 2021-01-07 02:23

I'm making a small web crawler using Python 3.5.1 and the requests module, which downloads all comics from a specific website. I'm experimenting with one page. I parse the page

1 Answer
  • I would do it in one go using a CSS selector:

    for img in soup.select("a.img-link img[src]"):
        print(img["src"])
    

    Here, we select every img element that has a src attribute and is located under an a element with the img-link class. It prints:

    http://2.p.mpcdn.net/352582/687224/1.jpg
    http://2.p.mpcdn.net/352582/687224/2.jpg
    http://2.p.mpcdn.net/352582/687224/3.jpg
    http://2.p.mpcdn.net/352582/687224/4.jpg
    ...
    http://2.p.mpcdn.net/352582/687224/20.jpg
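
To see what the selector keeps and what it skips, here is a minimal sketch that runs it against inline markup instead of the live page. The HTML in `SAMPLE_HTML` and the example.com URLs are hypothetical stand-ins, not the real site's structure.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real comic page (an assumption).
SAMPLE_HTML = """
<div>
  <a class="img-link" href="/page/1"><img src="http://example.com/1.jpg"></a>
  <a class="img-link" href="/page/2"><img src="http://example.com/2.jpg"></a>
  <a class="img-link" href="/page/3"><img alt="no source"></a>
  <a class="other" href="/ad"><img src="http://example.com/ad.jpg"></a>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# "a.img-link img[src]" matches <img> tags that have a src attribute
# and sit somewhere inside an <a> whose class list contains "img-link".
selected_srcs = [img["src"] for img in soup.select("a.img-link img[src]")]
print(selected_srcs)
```

The third image is skipped because it has no src attribute, and the fourth because its parent link does not carry the img-link class.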
    

    If you still want to use find_all(), you would have to nest the calls:

    for link in soup.find_all("a", class_="img-link"):
        for img in link.find_all("img", src=True):  # searching for img with src attribute
            print(img["src"])
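
The nested find_all() approach can also be wrapped in a small helper that returns the URLs instead of printing them. This is only a sketch: the function name `image_sources` and the sample markup are made up for illustration.

```python
from bs4 import BeautifulSoup


def image_sources(html):
    """Return the src of every <img> nested inside an <a class="img-link">."""
    soup = BeautifulSoup(html, "html.parser")
    sources = []
    # Outer loop: every link carrying the img-link class.
    for link in soup.find_all("a", class_="img-link"):
        # Inner loop: find_all() searches all descendants by default,
        # and src=True keeps only tags that have a src attribute.
        for img in link.find_all("img", src=True):
            sources.append(img["src"])
    return sources


# Hypothetical markup: one image inside an img-link anchor, one outside it.
sample = '<a class="img-link"><span><img src="a.jpg"></span></a><img src="b.jpg">'
nested_srcs = image_sources(sample)
print(nested_srcs)
```

Only a.jpg is collected here, since b.jpg is not a descendant of an img-link anchor.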
    