Getting the nth element using BeautifulSoup

生来不讨喜 2021-01-31 04:09

From a large table, I want to read rows 5, 10, 15, 20, and so on using BeautifulSoup. How do I do this? Is findNextSibling with an incrementing counter the way to go?

5 answers
  •  清酒与你
    2021-01-31 04:24

    Here's how you could scrape every 5th distribution link on this Wikipedia page with gazpacho:

    from gazpacho import Soup
    
    url = "https://en.wikipedia.org/wiki/List_of_probability_distributions"
    soup = Soup.get(url)
    
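    # attribute matching is partial by default, so this finds every <a> whose href contains "distribution"
    a_tags = soup.find("a", {"href": "distribution"})
    links = ["https://en.wikipedia.org" + a.attrs["href"] for a in a_tags]
    
    links[4::5]  # every 5th link: start at index 4 (the 5th item), then step by 5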
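
Since the question asks about BeautifulSoup specifically, here is a minimal sketch of the same slicing idea with bs4. The inline HTML table is a made-up stand-in for the real page, and the parser choice ("html.parser") is an assumption; the point is only that find_all returns a list-like result, so no findNextSibling counter is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the large table: 20 rows, numbered 1..20.
html = "<table>" + "".join(
    f"<tr><td>row {i}</td></tr>" for i in range(1, 21)
) + "</table>"

soup = BeautifulSoup(html, "html.parser")

# find_all returns a list-like ResultSet, so ordinary slicing applies.
rows = soup.find("table").find_all("tr")

# Index 4 is the 5th row; a step of 5 then gives rows 5, 10, 15, 20.
every_fifth = rows[4::5]

texts = [row.get_text(strip=True) for row in every_fifth]
print(texts)  # ['row 5', 'row 10', 'row 15', 'row 20']
```

If the table has a header row that should not be counted, shifting the start index by one (rows[5::5]) should do it.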
    
