Using findAll in BS4 to create list

Submitted by 天大地大妈咪最大 on 2019-12-24 05:48:32

Question


I'll start by saying I'm fairly new to Python. I've been working on a Slack bot recently, and here's where I'm at so far.

source = requests.get(url).content
soup = BeautifulSoup(source, 'html.parser')
price = soup.findAll("a", {"class":"pricing"})["quantity"]

Here is the HTML code I am trying to scrape.

<a class="pricing" saleprice="240.00" quantity="1" added="2017-01-01"> S </a>
<a class="pricing" saleprice="21.00" quantity="5" added="2017-03-14"> M </a>
<a class="pricing" saleprice="139.00" quantity="19" added="2017-06-21"> L </a>

When I only use soup.find(), I'm able to find the first quantity value but I need all of them within a list. I looked into using a different library like lxml instead of bs4 but didn't have any luck with that either. Any help is really appreciated as I've already spent a long time on this.


Answer 1:


The findAll method (an older alias of find_all) returns a ResultSet, which is a list of bs4 Tag elements, so you can't look up an attribute on the result itself. However, you can read the attribute from each item in that iterable with a simple list comprehension.

price = [a.get("quantity") for a in soup.findAll("a", {"class":"pricing"})] 

Note that it's best to use get when accessing attributes, because it returns None (or a default value that you supply) if the key does not exist in the attrs dictionary, whereas a["quantity"] raises a KeyError.
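To make the behaviour of get concrete, here is a minimal sketch that runs against the HTML from the question. The fourth link (size XL, with no quantity attribute) is a hypothetical addition that is not in the original markup; it is only there to show what happens when the attribute is missing. The variable names are illustrative as well.

```python
from bs4 import BeautifulSoup

# HTML from the question, plus one hypothetical tag (XL) that has no
# "quantity" attribute, added only to demonstrate the missing-key case.
html = """
<a class="pricing" saleprice="240.00" quantity="1" added="2017-01-01"> S </a>
<a class="pricing" saleprice="21.00" quantity="5" added="2017-03-14"> M </a>
<a class="pricing" saleprice="139.00" quantity="19" added="2017-06-21"> L </a>
<a class="pricing" saleprice="99.00" added="2017-07-01"> XL </a>
"""

soup = BeautifulSoup(html, "html.parser")
links = soup.find_all("a", {"class": "pricing"})

# get returns None when the attribute is absent.
quantities = [a.get("quantity") for a in links]
print(quantities)  # ['1', '5', '19', None]

# A second argument to get is used as the default instead of None.
quantities_with_default = [a.get("quantity", "0") for a in links]
print(quantities_with_default)  # ['1', '5', '19', '0']
```

Attribute values come back as strings, so convert them with int() if you need numbers.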

As pointed out by Jon Clements, you could filter on both 'class' and 'quantity' if you don't want your list to contain None items when some tags have no 'quantity' attribute.

price = [a["quantity"] for a in soup.find_all("a", {"class":"pricing", "quantity":True})] 
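As a rough illustration of that filter, the sketch below again uses the question's HTML plus one hypothetical tag without a quantity attribute (the XL link is an assumption added for the demo, not part of the original page). Passing True as the value for an attribute matches any tag that has that attribute, whatever its value, so indexing with a["quantity"] is safe afterwards.

```python
from bs4 import BeautifulSoup

# HTML from the question, plus one hypothetical tag (XL) without "quantity".
html = """
<a class="pricing" saleprice="240.00" quantity="1" added="2017-01-01"> S </a>
<a class="pricing" saleprice="21.00" quantity="5" added="2017-03-14"> M </a>
<a class="pricing" saleprice="139.00" quantity="19" added="2017-06-21"> L </a>
<a class="pricing" saleprice="99.00" added="2017-07-01"> XL </a>
"""

soup = BeautifulSoup(html, "html.parser")

# "quantity": True keeps only tags that actually carry the attribute,
# so the XL link is skipped and a["quantity"] cannot raise KeyError.
in_stock = soup.find_all("a", {"class": "pricing", "quantity": True})

# Attribute values are strings; convert to int for arithmetic.
price = [int(a["quantity"]) for a in in_stock]
print(price)  # [1, 5, 19]

# The tag text can be collected the same way, e.g. the size labels.
sizes = [a.get_text(strip=True) for a in in_stock]
print(sizes)  # ['S', 'M', 'L']
```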


Source: https://stackoverflow.com/questions/45410774/using-findall-in-bs4-to-create-list
